Stellar Characterization of Keck HIRES Spectra with The Cannon

Malena Rice; John M. Brewer

doi:10.3847/1538-4357/ab9f96

1. Introduction

High-precision spectroscopic stellar characterization is critical to understanding the host environments of planetary systems. The diversity of observed exoplanetary systems suggests a wide range of system properties influenced by a correspondingly wide range of formation environments. Furthermore, several of the most prominent current methods of studying exoplanets rely upon indirect measurements, probing how planets gravitationally perturb their host stars (radial velocity measurements; e.g., Cumming 2004; Lovis & Fischer 2010; Butler et al. 2017) and/or alter the time-series photometry of their host star (transit and phase curve measurements; e.g., Southworth 2008; Borucki et al. 2009; Cowan et al. 2013; Esteves et al. 2013; Haswell 2010; Morello et al. 2014). To appropriately disentangle the properties of planets from their host stars' signals, and to interpret the relationship between these planets and their formation environments, it is necessary to robustly determine the properties of the host stars within these systems (Brewer et al. 2018).

Several existing catalogs report the derived properties of stars based on different spectroscopic surveys (e.g., Hinkel et al. 2014 and references therein). One example, the Spectroscopic Properties of Cool Stars catalog (SPOCS; Valenti & Fischer 2005), analyzed nearly 2000 Keck High Resolution Echelle Spectrometer (HIRES) spectra of over 1000 F, G, and K dwarfs that were obtained as part of the Keck, Lick, and AAT planet search programs (Cumming et al. 1999; Fischer et al. 1999; Butler et al. 2003; Marcy et al. 2004, 2005). The precise stellar parameters and five elemental abundances (Fe, Si, Ti, Na, Ni) obtained in this survey demonstrated for the first time the positive correlation between the frequency of close-in giant planets and host star metallicity (Fischer & Valenti 2005), providing a landmark constraint toward a more cohesive understanding of planet formation (e.g., Robinson et al. 2006).

The bulk of the spectra analyzed in Fischer & Valenti (2005) were obtained with the Keck HIRES spectrograph (Vogt et al. 1994) prior to a detector upgrade that took place in August 2004. The newer three-chip detector, once installed, proved advantageous; it allowed for more extensive spectral analyses, including higher-precision gravity measurements (Brewer et al. 2015) and abundances for 15 elements (Brewer et al. 2016; Brewer & Fischer 2018) obtained using the stellar modeling program Spectroscopy Made Easy (SME; Valenti & Piskunov 1996). These improved parameters enabled the measurement of more precise masses and radii for observed stars and their accompanying planets (Brewer et al. 2016).

SME incorporates empirical atomic and molecular line data to develop physically motivated synthetic spectra, which can then be compared to stellar data to fit for parameters of interest. However, the computational expense of SME becomes prohibitively high for large stellar samples, since each individual star typically takes ∼14 hours to model in SME. Furthermore, the analysis techniques used to develop uniform catalogs in Brewer et al. (2016) and Brewer & Fischer (2018) rely on the extended wavelength coverage of the newer three-chip detector and, as a result, cannot be applied to the older (pre-2004) spectra.

Keck HIRES includes an iodine cell used to extract the radial velocity signals of planets orbiting stars (Butler et al. 1996). For each observed star, a "template" spectrum of only the star is obtained without the iodine cell in place. To measure the reflex motion of the star, the same star is then observed through the iodine cell, imprinting a rest-frame iodine spectrum onto the stellar spectrum. Planets produce a radial velocity shift in the star, manifest in the observed spectra as a slight offset of the stellar spectrum relative to the iodine lines. The template spectrum of the star at different radial velocity offsets is convolved with a reference spectrum of the iodine cell and an instrumental profile in order to determine the observed radial velocity. The iodine-free template spectra can also be used to deduce properties of the observed stars.

Not all stars observed before 2004 had another template spectrum taken afterwards, and many stars were dropped early on from radial velocity surveys if, after a few observations, they were found to have a root-mean-square scatter below the precision of the spectrograph used at the time (∼3 m s^–1). Although those stars were deemed "planetless" based on the absence of high-amplitude signals, much-improved spectrographs in the modern era currently reach precisions an order of magnitude lower than these prior surveys (e.g., the EXtreme PREcision Spectrograph (EXPRES; Jurgenson et al. 2016; Petersburg et al. 2020); the Echelle SPectrograph for Rocky Exoplanets Search and Stable Spectroscopic Observations (ESPRESSO; Pepe et al. 2014); and the upcoming NN-explore Exoplanet Investigations with Doppler spectroscopy spectrograph (NEID; Schwab et al. 2016)), meaning that many of these stars are again targets of interest in current planet searches.

Fortunately, new spectral analysis techniques can be used to address both problems described above: the computational expense of synthetic spectral models, as well as the dependence of these models upon a specific wavelength coverage. The Cannon (Ness et al. 2015; Casey et al. 2016) is a supervised learning algorithm that determines stellar labels by identifying correlations on a pixel-by-pixel basis. By "learning" the properties of a uniformly classified data set of stars, The Cannon can efficiently and accurately transfer these learned correlations to a new set of stars spanning a similar parameter space.

Previous studies have applied The Cannon to obtain stellar parameters and abundances using spectra from the Apache Point Observatory Galactic Evolution Experiment (APOGEE) as part of the Sloan Digital Sky Survey (SDSS; e.g., Ness et al. 2016; Abolfathi et al. 2018; Holtzman et al. 2018); from the Galactic Archaeology with HERMES (GALAH) survey (e.g., Buder et al. 2018; Kos et al. 2018); from the RAdial Velocity Experiment (RAVE; Casey et al. 2017), and from the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST; Ho et al. 2017), among others. Behmard et al. (2019) also completed an analysis of 141 cool stars observed with Keck HIRES, focusing on the subset of K and M stars with T_eff < 5200 K, to estimate precise effective temperatures (T_eff), stellar radii (R_*), and metallicities ([Fe/H]). We complete a more extended study here, applying the full Brewer et al. (2016) SPOCS catalog to develop a new model predicting 18 stellar labels for dwarfs and subgiants spanning T_eff = 4700–6674 K.

In this paper, we first train The Cannon using the Brewer et al. (2016) SPOCS catalog to produce a model which rapidly and reliably retrieves 18 precisely determined stellar labels ($\mathrm{log}\,g$, T_eff, $v\sin i$, and 15 elemental abundances: C, N, O, Na, Mg, Al, Si, Ca, Ti, V, Cr, Mn, Fe, Ni, and Y) for input post-2004 Keck HIRES stellar spectra. This model is made available as a tool for public use and is applicable to all current and future Keck HIRES spectra taken since the 2004 detector upgrade. While it covers a considerably smaller temperature range than SpecMatch-Emp (Yee et al. 2017), an empirical grid-based code designed to characterize HIRES spectra with T_eff ≈ 3000–7000 K, our model returns 18 precisely determined stellar labels in comparison with the three returned by SpecMatch-Emp (T_eff, R_*, and [Fe/H]).

We then interpolate the SPOCS spectra to the pre-2004 detector's wavelength range and again apply The Cannon to develop a homogeneously analyzed catalog of these 18 stellar labels for 477 archival Keck stars, using overlapping stars observed in both the pre- and post-2004 samples as our test set. We demonstrate that, given a reliably labeled training set, The Cannon can be used to efficiently obtain high-precision stellar parameters from large-scale spectroscopic surveys, with a combined speed and accuracy unattainable for more time-intensive, single-object stellar classification methods.

2. Methods: The Cannon

A detailed overview of the methods applied in The Cannon can be found in Ness et al. (2015) and Casey et al. (2016). We briefly review these methods here, and we refer the reader to those papers for a more in-depth description of the code. In this work, we apply the version of The Cannon described in Casey et al. (2016).

In short, The Cannon develops a generative spectral model, described by coefficient vectors θ_j corresponding to each modeled parameter at each pixel j, to characterize the relationship between flux and label values in each pixel. This model is trained using an input data set with known labels spanning the same parameter space as the stars that are characterized. The coefficient vector of each pixel is determined by minimizing the summed log likelihood function

$\begin{eqnarray}&&{\theta }_{j},{s}_{j}^{2}\leftarrow \mathop{\theta ,s}\limits^{\mathrm{argmin}}\left[\displaystyle \sum _{n=0}^{N-1}\mathrm{ln}p({y}_{{jn}}| \theta ,{l}_{n},{s}^{2})\right],\end{eqnarray} \tag{ 1 }$

where the pixel's log likelihood is given by

$\begin{eqnarray}&&\mathrm{ln}p({y}_{{jn}}| \theta ,{l}_{n},{s}^{2})=\displaystyle \frac{{\left[{y}_{{jn}}-\theta \cdot f({l}_{n})\right]}^{2}}{{s}^{2}+{\sigma }_{{jn}}^{2}}+\mathrm{ln}({s}^{2}+{\sigma }_{{jn}}^{2}).\end{eqnarray} \tag{ 2 }$

Here, l_n is a vector containing the star's labels (in our case, 18 labels per star), y_jn is the flux at a given pixel j and stellar spectrum n, and f(l_n) is a vectorizing function that determines the form of our model (in our case, a quadratic polynomial with all cross-terms included). The noise is characterized by ${s}^{2}+{\sigma }_{{jn}}^{2}$ , where σ_jn encapsulates the reported instrumental and Poisson uncertainty, while s provides the intrinsic scatter of the model.
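The per-pixel training objective can be illustrated with a short sketch. This is not the code used in this work; the function names `quadratic_vectorizer` and `pixel_objective` are hypothetical, and the sketch simply evaluates the right-hand side of Equation (2), summed over training stars, as the quantity to be minimized. Any rescaling of the labels before vectorizing is omitted here.

```python
import itertools
import numpy as np

def quadratic_vectorizer(labels):
    """Return f(l) = [1, l_i, l_i * l_k for i <= k] for one star's labels."""
    labels = np.asarray(labels, dtype=float)
    pairs = itertools.combinations_with_replacement(range(labels.size), 2)
    cross = [labels[i] * labels[k] for i, k in pairs]
    return np.concatenate(([1.0], labels, cross))

def pixel_objective(theta, s2, flux, sigma2, design):
    """Right-hand side of Equation (2), summed over the N training stars.

    flux, sigma2 : shape (N,), fluxes and reported variances at one pixel j
    design       : shape (N, T), rows are f(l_n)
    theta        : shape (T,), trial coefficient vector for this pixel
    s2           : trial intrinsic scatter (variance) for this pixel
    """
    resid = flux - design @ theta
    var = s2 + sigma2
    return np.sum(resid**2 / var + np.log(var))

# Number of coefficients per pixel for 18 labels: 1 + 18 + 18 * 19 / 2
n_terms = quadratic_vectorizer(np.zeros(18)).size
```

With 18 labels, the quadratic form with all cross-terms has 190 coefficients per pixel, so the training set must contain comfortably more stars than that for the fit to be well constrained.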

Ultimately, this produces a generative model that describes the probability density function of flux at each wavelength as a function of stellar labels. Once the coefficient vectors have been obtained through supervised learning with a set of pre-labeled input stars, the generative model can be applied to a new set of spectra to transfer stellar labels based on the trained model's vectorizing function and coefficients. This step is accomplished by optimizing Equation (3), which sums over all j pixels in the spectrum, to find the label vector l for each test star m:

$\begin{eqnarray}&&{l}_{m}\leftarrow \mathop{l}\limits^{\mathrm{argmin}}\left[\displaystyle \sum _{j=0}^{J-1}\mathrm{ln}p({y}_{{jm}}| {\theta }_{j},l,{s}_{j}^{2})\right].\end{eqnarray} \tag{ 3 }$

In this way, The Cannon "learns" the characteristics of the stars that it is trained on in order to efficiently transfer labels to a new set of stars with similar properties. Ness et al. (2015) demonstrated that The Cannon provides robust results for low signal-to-noise spectra when trained upon higher-resolution spectra.
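The label-transfer step of Equation (3) can be sketched with a toy one-label model. The brute-force grid search below stands in for a proper optimizer, and every name and number in it (`label_objective`, the synthetic coefficients, the grid) is an illustrative assumption rather than part of The Cannon.

```python
import numpy as np

def label_objective(label, theta, s2, flux, sigma2):
    """Objective of Equation (3) for a one-label quadratic model.

    theta  : shape (J, 3), per-pixel coefficients of [1, l, l^2]
    s2     : shape (J,), per-pixel intrinsic scatter from training
    flux   : shape (J,), test-star flux
    sigma2 : shape (J,), test-star flux variance
    """
    f = np.array([1.0, label, label**2])
    var = s2 + sigma2
    return np.sum((flux - theta @ f) ** 2 / var + np.log(var))

# Synthetic "trained" model and a noiseless test spectrum (toy values).
rng = np.random.default_rng(0)
J = 200
theta = np.column_stack([np.ones(J),
                         rng.normal(0.0, 0.05, J),
                         rng.normal(0.0, 0.01, J)])
s2 = np.full(J, 1e-6)
sigma2 = np.full(J, 1e-4)
true_label = 0.3
flux = theta @ np.array([1.0, true_label, true_label**2])

# Evaluate the objective on a grid of trial labels and keep the minimum.
grid = np.linspace(-1.0, 1.0, 201)
scores = [label_objective(l, theta, s2, flux, sigma2) for l in grid]
best_label = grid[int(np.argmin(scores))]
```

Because θ_j and s_j² are held fixed at test time, the logarithmic term does not depend on the trial label, so the minimization reduces to a weighted least-squares fit over pixels.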

While the training and validation step of this label transfer process can be time-intensive, a well-characterized model, once trained, can be easily saved and applied to new data sets for rapid characterization. We employ this property of The Cannon to develop a new open-source model applicable to current and future Keck HIRES stellar spectra in Section 4.

3. Data Selection and Processing

Throughout this work, we use the SPOCS data set for training and model validation testing. Our pre-labeled data set includes ∼3800 HIRES spectra of ∼1600 objects, with labels from Brewer et al. (2016) obtained using the SME software combined with atomic and molecular line data from the Vienna Atomic Line Database 3 (VALD-3; see Brewer et al. 2016 for a comprehensive set of SPOCS line list contributors). All stars were observed with HIRES in the red configuration, with the iodine cell out and R ∼ 70,000. From this original sample, we removed all spectra with one or more of the following properties:

1. labeled with "NGC" (deep sky objects; not individual stars)
2. flagged as "bad"
3. signal-to-noise ratio (S/N) < 100.

Together, these cuts reduced our pre-labeled sample to 1202 stars with 2018 spectra, where the distribution of stellar parameters in our final sample is displayed in Figure 1. To optimize our inputs, we selected only the highest-S/N spectrum from each star, resulting in 1202 total spectra.

**Figure 1.** Distribution of our final sample of 1202 stars, colored by metallicity.

An automated version of the data reduction pipeline used to reduce the SPOCS data set is available for public use online (Marcy & Butler 1992; Butler et al. 1996; Howard et al. 2010).⁴ Because the Keck HIRES instrument is an echelle spectrograph, each of our sample spectra contains 16 separate echelle orders. Each echelle order initially included a blaze function convolved with the instrumental response function, leading to an underlying continuum upon which the spectral features of the observed star were imprinted. To deduce the shape of the continuum, each spectrum was individually fit with iterative polynomials using the algorithm described in Valenti & Fischer (2005). These continuum fits were then divided out of the corresponding spectra to obtain a set of continuum-normalized spectra with baseline flux set to unity.

Our selected spectra were initially slightly shifted relative to each other in wavelength space due to the varying line-of-sight velocities of stars within our sample, leading to slight shifts in the wavelength solutions. To account for this effect, we interpolated all spectra to the same wavelength grid for a one-to-one comparison of each pixel across the spectra. This grid was determined by finding the maximum wavelength range spanned by all spectra in our sample and keeping the total number of data points in each spectrum the same, and we carried out this process independently for each echelle order. Each final spectrum included 16 echelle orders each with 4021 pixels, resulting in a total of 64,336 data points per star.
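The resampling step for a single echelle order can be sketched as follows. The sketch assumes that the shared grid covers the wavelength range common to all spectra, that it is evenly spaced, and that linear interpolation is adequate; none of these details is specified above, and the function names are hypothetical.

```python
import numpy as np

def common_grid(wavelengths, n_pixels):
    """Evenly spaced grid over the range covered by every spectrum.

    wavelengths : list of 1D increasing arrays, one per star (one order).
    """
    lo = max(w[0] for w in wavelengths)
    hi = min(w[-1] for w in wavelengths)
    return np.linspace(lo, hi, n_pixels)

def resample(wavelengths, fluxes, n_pixels):
    """Linearly interpolate every spectrum onto the shared grid."""
    grid = common_grid(wavelengths, n_pixels)
    resampled = np.array([np.interp(grid, w, f)
                          for w, f in zip(wavelengths, fluxes)])
    return grid, resampled
```

Applied independently to each of the 16 orders with `n_pixels=4021`, this yields 16 × 4021 = 64,336 flux values per star on a pixel grid shared by the whole sample.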

4. Developing a Model: Current and Future Keck HIRES Spectra

Keck's current HIRES spectrograph has been in use since 2004 to search for and study extrasolar planets. Paired with the 10 m Keck I telescope, HIRES is a powerful tool to probe dim stars, such as many Kepler planet hosts, that are prohibitively faint for study with other telescopes.

Many stars that were not part of the original SPOCS catalog have been and continue to be observed with HIRES. Thus, a reliable model to extract stellar properties from HIRES spectra is crucial. We describe here our methods in developing a new, open-source model that rapidly delivers 18 stellar labels, including T_eff, $v\sin i$, log g, and 15 elemental abundances (C, N, O, Na, Mg, Al, Si, Ca, Ti, V, Cr, Mn, Fe, Ni, and Y).

4.1. Model Selection Framework

Throughout our model testing phase, we used three different 80%/20% train/test splits to check our model performance. These splits were randomly selected at the beginning of our testing, and we used the same divisions at each progressive test step for a direct comparison between models. We ran three tests at each step to verify that any observed differences in performance were due to generalizable changes in the model performance, rather than stochastic variations in the selected test/train samples.
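One way to fix three reusable 80%/20% divisions is sketched below. The seed, the rounding of the test-set size, and the function name `make_splits` are assumptions made for illustration; only the random selection and the reuse of the same splits are taken from the description above.

```python
import numpy as np

def make_splits(n_stars, n_splits=3, test_fraction=0.2, seed=0):
    """Return a list of (train_indices, test_indices) pairs.

    A fixed seed makes the same divisions available at every test step.
    """
    rng = np.random.default_rng(seed)
    n_test = int(round(test_fraction * n_stars))
    splits = []
    for _ in range(n_splits):
        order = rng.permutation(n_stars)
        test = np.sort(order[:n_test])
        train = np.sort(order[n_test:])
        splits.append((train, test))
    return splits

splits = make_splits(1202)  # 1202 stars in the final sample
```

For 1202 stars this gives 962 training and 240 test stars in each split.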

After training our model on the 80% training set, we benchmarked its performance by (1) checking the model's ability to recover the input training set labels and (2) cross-validating using our 20% test set with known "true" labels. The first of these benchmarks was used only to verify that the model was performing correctly, while we report all results based on the second benchmark, which provides an independent check for our model performance.

While exploring various configurations to optimize our model performance, we ran tests on individual echelle orders as well as the combined 16 orders. Our reasons for this were twofold.

First, testing with a limited wavelength range is far less computationally expensive than with the full range and, as a result, we were able to complete a more extensive analysis in the single-echelle-order case. This informed our more computationally intensive tests that included all 16 echelle orders, allowing us to more quickly optimize our models and to consider a wider range of possible adjustments.

Second, some orders contain particularly important spectroscopic lines—for example, the gravity-sensitive Mg Ib triplet at 5183, 5172, and 5167 Å and the forbidden O line at 6300 Å—and should therefore perform particularly well to extract associated parameters. Beyond producing a model useful for the characterization of Keck HIRES spectra, we were interested to determine from which wavelength ranges The Cannon obtained the most useful information and with what precision a smaller wavelength range with more concentrated information could determine our stellar parameters of interest. Thus, we optimized both a single-echelle-order model and a model including all 16 echelle orders. We report our results in both cases but make only the all-orders model publicly available due to its improved performance over the single-order model.

Our metric for model performance is a χ² test in which we minimize the function

$\begin{eqnarray}&&{\chi }^{2}=\displaystyle \frac{1}{N}\displaystyle \sum _{i=1}^{N}\displaystyle \frac{{\left({x}_{i}-{E}_{i}\right)}^{2}}{| {E}_{i}| }.\end{eqnarray} \tag{ 4 }$

Here, N is the number of values in the sample, and E_i and x_i are the expected and predicted values, respectively, for the parameter at each step in the summation. By dividing each squared error by the absolute expected value of the parameter, we normalize our performance metric to avoid biases from the unequal scales of each label. Our adopted χ² thus measures a modified percent deviation from the expected value of each label. We determine the average χ² across our three models after implementing each new adjustment and compare that value with the previous best-performing χ² to determine which adjustments to implement in our final model.
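Equation (4) translates directly into a few lines of code. The function name `normalized_chi2` is hypothetical, and the sketch does not address expected values at or near zero, for which the denominator vanishes; how such cases are handled is not stated here.

```python
import numpy as np

def normalized_chi2(predicted, expected):
    """Equation (4): mean of squared errors scaled by |expected value|."""
    predicted = np.asarray(predicted, dtype=float)
    expected = np.asarray(expected, dtype=float)
    return np.mean((predicted - expected) ** 2 / np.abs(expected))

# Toy example with two values on very different scales
# (a temperature-like and a gravity-like quantity).
example = normalized_chi2([5100.0, 4.4], [5000.0, 4.0])
```

In the toy example the two terms are 100²/5000 = 2.0 and 0.4²/4.0 = 0.04, so the metric evaluates to 1.02.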

4.2. Outlier Removal

From initial testing, we discovered that there existed several spectra with outlier stellar labels within our train/test sample. As a result, we ran tests in which we removed outliers using several different thresholds in search of a threshold that improved the accuracy of our model without removing enough spectra to degrade the model's performance. Where q₁ and q₃ represent the first and third quartile of the data, respectively, the interquartile range (IQR) is given by

$\begin{eqnarray}&&\mathrm{IQR}=| {q}_{3}-{q}_{1}| .\end{eqnarray} \tag{ 5 }$

We define an outlier as a value that falls more than x_O · IQR below q₁ or above q₃, where x_O determines the stringency of our requirement for a data point to be classified as an outlier. We tested three different values x_O = 1.5, 3, and 10 (resulting in 59, 10, and 1 outlier stars removed from the sample, respectively), as well as the case in which no outliers are removed, to determine the optimal x_O value.
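The outlier criterion can be sketched as below. The sketch assumes that the quartiles are computed separately for each label and that a star is removed if any one of its labels is an outlier; both choices, along with the function name `outlier_stars`, are assumptions for illustration.

```python
import numpy as np

def outlier_stars(labels, x_o):
    """Flag stars with any label beyond x_o * IQR outside the quartiles.

    labels : shape (n_stars, n_labels)
    x_o    : stringency factor (e.g., 1.5, 3, or 10)
    """
    labels = np.asarray(labels, dtype=float)
    q1, q3 = np.percentile(labels, [25, 75], axis=0)
    iqr = np.abs(q3 - q1)  # Equation (5), per label
    low = labels < q1 - x_o * iqr
    high = labels > q3 + x_o * iqr
    return np.any(low | high, axis=1)

# Toy single-label sample with one extreme value.
toy = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [100.0]])
flag_strict = outlier_stars(toy, 1.5)
flag_loose = outlier_stars(toy, 50.0)
```

Larger x_O values flag fewer stars, consistent with the decreasing outlier counts quoted above.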

We found that, in the all-orders case, we obtained the lowest χ² with x_O = 10. Our best-performing single-order model was order 10, spanning the wavelength range 5355–5445 Å, with no outliers removed. In Table 1, we compare the total number of atomic lines returned by VALD-3 in the best- and worst-performing single-order wavelength ranges. While the number of atomic lines is not a perfect metric to compare the information content of different wavelength orders due to the varying strength of atomic lines, as well as the presence of molecular lines, a zeroth-order comparison between these two wavelength orders reveals that our best-performing wavelength order contains multiple known atomic lines associated with each element. In contrast, this is not the case for our worst-performing wavelength order, which contains no known Na lines and thus may perform particularly poorly in returning the [Na/H] label.

Table 1. Number of Atomic Lines for Each Analyzed Element in Our Best-performing Single Wavelength Order (5355–5445 Å) Compared with Our Worst-performing Single Wavelength Order (6312–6418 Å)

Element	5355–5445 Å	6312–6418 Å
Fe	1414	1253
C	69	53
N	12	32
O	46	31
Na	3	0
Mg	8	8
Al	12	24
Si	56	72
Ca	134	340
Ti	224	208
V	380	320
Cr	654	683
Mn	385	395
Ni	486	582
Y	81	64


In general, each individual order provided systematically better results with no outliers removed than with x_O = 3 or x_O = 10, although the case with x_O = 1.5 provided similar results. We chose to move forward in testing with the top-performing order as a representative wavelength range that performs well on its own, noting that the stochastic variation in performance due to the random train/test split is larger than the margin of improvement obtained from using this order rather than the second-best-performing order.

4.3. Tuning the Model

Next, we consider a range of possible model adjustments to determine an optimal configuration for our final model. To cover a breadth of model configurations, we use the single, best-performing individual wavelength order found in Section 4.2 for initial testing purposes (order 10; 5355–5445 Å, with no outliers removed). Once we have run these tests on an individual order, we use the results to inform further testing with our full wavelength range.

In the following subsections, we test different approaches to continuum normalization (Section 4.3.1), telluric contamination (Section 4.3.2), label censoring (Section 4.3.3), and regularization (Section 4.3.4). We run three configurations of every test setup, each with a different randomly drawn train/test split, to ensure that our results are generalizable across samples spanning a similar range of labels.

We caution that the hyperparameters selected to optimize our best-performing single-order model do not necessarily translate to the best possible model when applied to all of our echelle orders together. Furthermore, we progressively build upon our model adjustments, accepting or rejecting changes to our base model in a set order. An exhaustive search for the single best-performing model would include all possible permutations of these model adjustments and would test all of these models with all echelle orders included. However, the computational expense of this exercise would be prohibitive with potentially diminishing returns. As a result, we operate under the assumptions that (1) a model that performs well for a single echelle order will also perform well with all orders included and (2) altering the order in which we apply model adjustments would not result in substantial improvements in our best-performing model.

4.3.1. Data-driven Continuum Renormalization

The Cannon accepts continuum-normalized spectra with flux baseline set to unity as its inputs and, as a result, the manner in which the continuum normalization is applied also affects the performance of The Cannon. With the goal of improving our continuum-fitting procedure while reducing the signal-to-noise dependence of our model, we tested the effects of applying data-driven continuum renormalization methods when optimizing our training setup.

We completed this process by finding the "true" continuum pixels in a data-driven manner. These pixels, each with a corresponding wavelength value, act as part of the continuum in that they vary minimally with changes in the stellar label values and return flux values close to unity in the spectral model's baseline spectrum, defined as the zeroth-order coefficient vector returned by The Cannon.

We first trained The Cannon using our initial continuum normalization, and we used this model to identify the pixels that varied the least with our four dominant stellar labels: log g, [Fe/H], T_eff, and $v\sin i$ . We selected some percentage N% of pixels with coefficients closest to zero for each label, then determined the set of overlapping pixels across all four labels. Finally, we applied a cut removing all pixels from this set that lay outside 1.5% of unity in the spectral model's baseline spectrum. To explore several possible model configurations, we used four different thresholds for pixel selection: N = 50, 60, 70, and 80. For these thresholds, the final percentage of pixels identified as "true" continuum pixels ranged from ∼3%–6%, ∼9%–18%, ∼11%–21%, and ∼20%–23%, respectively, with variations arising from random differences in our three train/test sets.
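The pixel selection just described can be sketched as follows. The sketch assumes that the relevant coefficients are the linear terms of the trained model and that column 0 of the coefficient array holds the baseline spectrum; the array layout and the function name `continuum_pixels` are hypothetical.

```python
import numpy as np

def continuum_pixels(theta, label_columns, percent, tolerance=0.015):
    """Select data-driven continuum pixels from trained coefficients.

    theta         : shape (J, T), column 0 is the baseline spectrum
    label_columns : columns holding the linear coefficients of the
                    dominant labels (e.g., log g, [Fe/H], T_eff, v sin i)
    percent       : N, the percentage of pixels kept per label
    tolerance     : allowed baseline deviation from unity (1.5%)
    """
    n_pixels = theta.shape[0]
    n_keep = int(round(percent / 100 * n_pixels))
    selected = np.ones(n_pixels, dtype=bool)
    for col in label_columns:
        # Pixels whose coefficient for this label is closest to zero.
        closest = np.argsort(np.abs(theta[:, col]))[:n_keep]
        in_label = np.zeros(n_pixels, dtype=bool)
        in_label[closest] = True
        selected &= in_label  # keep only pixels common to all labels
    near_unity = np.abs(theta[:, 0] - 1.0) <= tolerance
    return selected & near_unity
```

Because the final set is an intersection across labels followed by a baseline cut, the surviving fraction is much smaller than N%, which matches the quoted ranges of a few percent to roughly 23%.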

We applied two continuum renormalization schemes to each of these four cases to check for improvement in our model results: a polynomial continuum fit as well as a continuum fit composed of summed sine and cosine functions. To select the polynomial order used to fit each spectrum, we tested polynomial fits with n, the number of free parameters, ranging from n = 1 to 10. For each spectrum, we chose the n value corresponding to the lowest reduced χ². We applied the fitting function described in Casey et al. (2016) for our summed sin/cos renormalization tests.
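The polynomial branch of this procedure can be sketched as below, choosing the number of free parameters by the lowest reduced χ². The rescaling of the wavelength axis, the synthetic data, and the function name `best_polynomial` are assumptions for illustration; in practice the fit would be restricted to the selected continuum pixels.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def best_polynomial(wave, flux, sigma, max_params=10):
    """Fit polynomials with 1..max_params free parameters and return
    (reduced chi^2, n_params, coefficients) for the best fit."""
    # Rescale wavelengths to keep the fit well conditioned.
    x = (wave - wave.mean()) / (wave.max() - wave.min())
    best = None
    for n_params in range(1, max_params + 1):
        dof = wave.size - n_params
        if dof <= 0:
            break
        coeffs = P.polyfit(x, flux, n_params - 1, w=1.0 / sigma)
        model = P.polyval(x, coeffs)
        red_chi2 = np.sum(((flux - model) / sigma) ** 2) / dof
        if best is None or red_chi2 < best[0]:
            best = (red_chi2, n_params, coeffs)
    return best

# Synthetic continuum with a gentle quadratic trend plus noise.
rng = np.random.default_rng(1)
wave = np.linspace(5355.0, 5445.0, 300)
x_demo = (wave - wave.mean()) / (wave.max() - wave.min())
sigma = np.full(wave.size, 1e-3)
flux = 1.0 + 0.02 * x_demo + 0.05 * x_demo**2 + rng.normal(0.0, 1e-3, wave.size)
red_chi2, n_params, coeffs = best_polynomial(wave, flux, sigma)
```

Selecting on reduced χ² rather than raw χ² penalizes additional free parameters through the degrees of freedom, although with many pixels that penalty is mild.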

As in previous tests, we ran each case in three iterations and used average values from these iterations to quantify the model performance. Thus, we ran a total of 24 train/test models in this section: two functional forms (polynomial and sin/cos) for each of the four pixel selection thresholds, and three iterations of each combination.

Because our continuum renormalizations do not use all pixels in each spectrum—only the roughly 3%–23% that are selected as "true" continuum pixels—the edges of our spectra are not generally included within the fits. As a result, we renormalized only the pixels between the minimum and maximum "continuum" pixels, setting our data-driven continuum fits equal to unity outside of these bounds. Figure 2 shows sample fits for the N = 70 case, with the polynomial fit shown in green and the sin/cos fit in purple. Generally, as in Figure 2, the two fitting methods closely trace each other and deviate most in wavelength ranges with few identified continuum pixels.

**Figure 2.** Sample N = 70 continuum renormalization fit over the spectrum of K0 star HD 22072 shown in blue. Here, the polynomial fit is shown in green while the sin/cos fit is in purple. Continuum pixels are denoted by black markers. The top panel shows the full spectrum over echelle order 10, while the bottom panel zooms in for a clearer comparison between continuum fits.

We found that, for both the polynomial and sin/cos renormalization, the N = 70 case produced the lowest reduced χ² value, with polynomial renormalization providing the best results. Both of these cases showed improvements over our original test case with no continuum renormalization, while N = 50, 60, and 80 each produced slightly degraded results. Thus, we chose to adopt the N = 70 implementation with polynomial continuum renormalization in our continued single-order tests moving forward.

Based on the promising results of this test, we also applied data-driven continuum renormalization to our full model with x_O = 10, testing the N = 70 case with both the sin/cos and polynomial renormalizations. Using the same three 80%/20% splits as our pre-tuned models for a direct comparison, we again found that both renormalization schemes improved our results, and the polynomial renormalization provided the best results. As a result, we chose to include a polynomial continuum renormalization in our final version of the model and in continued tests.

4.3.2. Telluric Masking

Telluric lines are spectral imprints of the Earth's atmosphere superimposed onto all spectra taken by ground-based telescopes. The presence and variation of these lines over time can produce noise in a spectrum that is difficult to disentangle from the astrophysical signal of interest. Thus, our next step in improving our model performance is to mask out telluric lines to avoid introducing false correlations into our model.

In each spectrum, the locations of telluric lines remain stationary while the stellar lines are shifted to their rest-frame wavelengths using a barycentric correction and the radial velocity of the host star. As a result, the locations of the telluric lines do not perfectly align in every spectrum. To account for this effect, we determined the locations of all known tellurics in each spectrum and created a corresponding mask for each. Telluric masks were created by selecting pixels below 99% of the continuum in the National Solar Observatory (NSO) solar atlas telluric spectrum (Wallace et al. 2011) and rescaling the masks to the resolution of our spectra.

We then combined these masks to create a final, uniform mask applied to all spectra. We visualize the resulting mask in Figure 3, where masked pixels are denoted with black markers below the spectrum, while the unmasked pixels are displayed above the spectrum for comparison.
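The construction of a uniform telluric mask can be sketched as follows. Several details are assumptions: the per-star masks are combined as a union, the shift uses the non-relativistic Doppler formula with a single net velocity per star, and the atlas is interpolated onto the stellar grid rather than degraded to the instrumental resolution. The function names are hypothetical.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def telluric_mask(rest_wave, atlas_wave, atlas_trans, velocity_kms,
                  threshold=0.99):
    """Pixels on the rest-frame grid that coincide with telluric lines.

    velocity_kms is the net shift applied to bring the stellar lines to
    the rest frame; telluric features move by the same amount.
    """
    shifted = atlas_wave / (1.0 + velocity_kms / C_KMS)
    trans = np.interp(rest_wave, shifted, atlas_trans)
    return trans < threshold  # below 99% of the continuum

def combined_mask(rest_wave, atlas_wave, atlas_trans, velocities_kms):
    """Union of the per-star masks: True where any star is contaminated."""
    masks = [telluric_mask(rest_wave, atlas_wave, atlas_trans, v)
             for v in velocities_kms]
    return np.any(masks, axis=0)

# Toy atlas with a single narrow absorption line at 5400 Angstroms.
atlas_wave = np.linspace(5390.0, 5410.0, 4001)
atlas_trans = 1.0 - 0.5 * np.exp(-0.5 * ((atlas_wave - 5400.0) / 0.05) ** 2)
rest_wave = np.linspace(5395.0, 5405.0, 1001)
single = telluric_mask(rest_wave, atlas_wave, atlas_trans, 0.0)
union = combined_mask(rest_wave, atlas_wave, atlas_trans, [0.0, 30.0])
```

Under the union assumption, a single telluric line masks a wider wavelength interval than in any individual spectrum, which is one reason a large fraction of pixels can end up masked.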

The top panel of Figure 3 displays the telluric mask applied to all 16 wavelength orders placed side-by-side, while the bottom panel zooms in to our best-performing single order. Every echelle order is shown in a different color, and the sample spectrum has been continuum-renormalized using the methods described in Section 4.3.1. With this method, we found that 40,218 of the full 64,336 pixels remained unmasked after telluric masking.

In our single best-performing echelle order alone, 1419 pixels of the full 4021 were masked. Despite this substantial masking, which removed roughly 35% of pixels from the model, our model's performance improved due to the removal of confusion from telluric signals. As a result, we continued to implement telluric masking in our ongoing single-order tests.

Because of the nonuniform distribution of telluric lines across echelle orders, testing the performance of a single telluric-masked order is insufficient to determine the overall effect of telluric masking on the performance of The Cannon. Thus, we also trained The Cannon on our three train/test splits using the full telluric-masked spectrum, with all 16 orders input together. Building upon our previous best case with x_O = 10 and polynomial renormalization, we found that The Cannon returned further improved results with the implemented telluric masking in place for all 16 orders. Thus, we continued to use this telluric masking in ongoing testing and in our final model configuration.

4.3.3. Censoring

Censoring allows the user to select which individual labels contribute to the model's flux in each pixel, providing a method to incorporate prior knowledge of known features that correlate with each label. We use a data-driven approach to apply censoring within our models in a similar manner to our continuum pixel selection implemented in Section 4.3.1. This allows us to circumvent problems arising from the use of individual element line lists since, as illustrated in Ting et al. (2019), abundances may have complex correlations due to the presence of molecules in stellar atmospheres. By applying our data-driven methods, we remain agnostic to the cause of the observed correlations, instead focusing only on the ability of our model to reproduce these features.

To test different thresholds, we first trained our model with all pixels included. Then, for each label, we selected the top (1) 5%, (2) 15%, (3) 50%, (4) 85%, and (5) 95% of nonzero pixels—pixels that were not masked out as telluric lines—with coefficient values furthest from zero, indicating strong variations in flux with changes to that label's value. These maximally varying pixels are most directly impacted by the corresponding stellar label and should thus generally correspond to relevant stellar lines or features. We retrained the model, allowing each label to contribute flux to only its selected, most highly varying pixels, then applied the trained model to our test set to check its performance.
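The per-label pixel selection can be sketched as below. The sketch assumes that the ranking uses the absolute value of each label's linear coefficient and that the percentage is taken relative to the pixels that survive telluric masking; the array layout and the function name `censor_masks` are hypothetical.

```python
import numpy as np

def censor_masks(theta_linear, usable, top_percent):
    """Build one censoring mask per label.

    theta_linear : shape (J, L), linear coefficient of each label per pixel
    usable       : shape (J,), True for pixels not removed as tellurics
    top_percent  : percentage of usable pixels each label may use
    Returns an (L, J) boolean array, True where a label may contribute.
    """
    n_pixels, n_labels = theta_linear.shape
    usable_idx = np.flatnonzero(usable)
    n_keep = int(round(top_percent / 100 * usable_idx.size))
    masks = np.zeros((n_labels, n_pixels), dtype=bool)
    for k in range(n_labels):
        strength = np.abs(theta_linear[usable_idx, k])
        # Usable pixels with coefficients furthest from zero.
        top = usable_idx[np.argsort(strength)[::-1][:n_keep]]
        masks[k, top] = True
    return masks
```

The resulting boolean array has one row per label, so each label can be restricted to its own set of pixels when the model is retrained.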

We completed this process for two different cases: one in which we applied censoring only to the primary four labels describing a star (log g, T_eff, $v\sin i$ , and [Fe/H]) and another in which we censored all 18 labels. Sample pixel selections in the four-label case are depicted in Figure 4, which shows the 85% and 15% most highly varying pixels as the top and bottom "unmasked" row of each color, respectively. Conversely, all pixels in Figure 4 that are not "unmasked" are labeled as "masked" below the spectrum to more clearly visualize the distribution of pixels included and excluded. With censoring implemented, each label has its own independent mask such that all labels contribute flux to only the pixels with which they vary the most.

**Figure 4.** Censored wavelengths for sample star HD 22072, selected for the primary four stellar labels: [Fe/H] (green), log g (blue), T_eff (violet), and $v\sin i$ (purple). The unmasked pixels corresponding to each label are shown above the spectrum, and the masked, unused pixels are below. The spectrum shown has been continuum-renormalized with N = 70, the best-performing single-order continuum renormalization. Masks for each label are provided in pairs, where the upper line in each color corresponds to the 85% mask, while the lower line corresponds to the 15% mask. "Unmasked" pixels are included in the analysis for that label, while "masked" pixels are excluded.

We again completed our testing process with three train/test splits for each case to reduce the effect of stochastic variations resulting from different randomly selected train/test samples. We ultimately ran 30 total single-order train/test iterations: three iterations for each of the five censoring thresholds applied to the two choices in the number of labels censored.

Our single-order models performed best with little to no censoring, and censoring only the four primary labels produced consistently more accurate results than censoring all 18. However, our best-performing single-order censoring run—our 95% case with four labels censored—performed slightly worse than our model with no censoring implemented. We tested this case with all orders included as well, and found that, as in the single-order case, our results degraded slightly. Furthermore, censoring within The Cannon causes a substantial increase in the model training time. We concluded that the loss of information from removing even 5% of pixels from each label's training set was greater than the gain from censoring in our model, and we elected to use a less time-intensive version of our model with no censoring for our final model training.

4.3.4. L1 Regularization

Lastly, we explored the use of L1 regularization, or lasso regression, to enforce sparsity within our models. In practice, this means that we include a penalty term in our cost function which scales with the summed absolute value of coefficients for all labels. Models are then "encouraged" to take on a simpler form in which coefficients tend toward zero values, and the severity of the penalty term determines the simplicity enforced for the model. This penalty term is denoted in Equation (6) as β, and the strength of the regularization is set by the parameter Λ:

$$\beta = \Lambda \sum_{q=1}^{Q-1} |\theta_q| \tag{6}$$

We sum over the components of the Q-element coefficient vector θ, excluding the zeroth term, which provides the baseline spectrum of the model. Thus, the full model with regularization is given by

$$\theta_j, s_j^2 \leftarrow \underset{\theta, s}{\mathrm{argmin}} \left[ \sum_{n=0}^{N-1} \ln p(y_{jn} \mid \theta, l_n, s^2) + \beta \right]. \tag{7}$$
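As a small numerical illustration of the penalty term in Equation (6), the sketch below evaluates β for a single pixel's coefficient vector at several values of Λ. The function name `l1_penalty` and the coefficient values are hypothetical and chosen only for the example; the sketch evaluates the penalty alone and does not perform the optimization in Equation (7).

```python
def l1_penalty(theta, lam):
    """Evaluate beta = lam * sum(|theta_q|) for q = 1, ..., Q - 1."""
    # theta[0] is the baseline spectrum term and is excluded from the penalty.
    return lam * sum(abs(t) for t in theta[1:])


# Hypothetical coefficient vector for one pixel (baseline term first).
theta = [0.98, 0.5, -0.25, 0.0, 0.125]
for lam in (0, 1, 10, 100, 1000, 10000):
    print(lam, l1_penalty(theta, lam))
```

Because the penalty grows linearly with Λ, larger values of Λ make any nonzero coefficient increasingly costly, which is what pushes the fitted coefficients toward zero.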

Because our model is high-dimensional, with 18 different parameters, a sparser model may prevent over-fitting and thus lead to improved performance on the test set. We tested for this possibility by implementing regularization within our single-order model, with test values Λ = 1, 10, 100, 1000, and 10,000. The effect of each Λ value on the distribution of coefficient strengths in one of our three trained models is shown in Figure 5.

**Figure 5.** Cumulative fractional sparsity at each tested regularization value in one of our three test cases. At each θ_min value along the x-axis, the total fraction of coefficients with values smaller than θ_min is given for each of our test cases Λ = 1, 10, 100, 1000, and 10,000, as well as the case with no regularization incorporated (Λ = 0). All cumulative distributions bottom out at fractional sparsity 0.353 because this is the fraction of pixels set to zero by our telluric mask.

We found that these enforcements of regularization substantially increased the model training time, and higher regularization values degraded the accuracy of our test set label recovery in all cases except Λ = 1, where we found a slight improvement over the Λ = 0 case. However, this improvement was minor (χ² = 3.72 by comparison with χ² = 3.74), and applying Λ = 1 to the all-orders model increased the projected training time by a factor of over 300, making the model much less flexible and more tedious to retrain. Furthermore, Figure 5 shows that Λ = 1 only marginally increases the sparsity of the model, resulting in minimal changes from the Λ = 0 case. As a result, we chose not to incorporate regularization in our final model configurations.
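The cumulative fractional sparsity plotted in Figure 5 can be computed along the following lines. This is a sketch under stated assumptions: the function name `fractional_sparsity` and the coefficient values are hypothetical, and we assume that "smaller than θ_min" refers to the absolute value of each coefficient.

```python
def fractional_sparsity(coeffs, theta_min):
    """Fraction of coefficients whose absolute value is below theta_min."""
    return sum(1 for c in coeffs if abs(c) < theta_min) / len(coeffs)


# Hypothetical coefficients; the three exact zeros stand in for masked pixels.
coeffs = [0.0, 0.0, 1e-4, -3e-3, 0.02, -0.5, 0.0, 0.2]
for theta_min in (1e-6, 1e-2, 1.0):
    print(theta_min, fractional_sparsity(coeffs, theta_min))
```

In this toy example the curve bottoms out at 3/8, the fraction of exactly zero coefficients, in the same way that the distributions in Figure 5 bottom out at the fraction of pixels zeroed by the telluric mask.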

4.4. Final Model Configurations

Our top-performing model configurations in both the single-order and all-orders cases are summarized in Table 2. Both models use a polynomial continuum renormalization and include telluric masking, and neither includes censoring. The primary differences between these model configurations are the lack of regularization and the removal of a single outlier data point in the all-orders configuration, using x_O = 10.

Table 2. Optimized Training Configuration for Our Trials, Developed to Classify Post-2004 Keck HIRES Spectra

Parameter	Single Order	All Orders
x_O	none	10
Renormalization	N = 70, polynomial	N = 70, polynomial
Telluric masking	included	included
Censoring	none	none
Λ	1	0

Note. The single order spans wavelength range 5355–5445 Å.


To understand the performance of our model, it is informative to consider how individual spectral features are reflected in the corresponding, relevant pixel coefficients. For example, Figure 6 shows an observed HIRES solar spectrum—obtained for calibration by observing the bright asteroid Vesta—and three coefficient vectors of our final, all-orders model (θ_Mg, θ_logg, ${\theta }_{v\sin i}$ ) in the vicinity of the Mg Ib triplet. Deviations from the zero baseline of each coefficient vector signify that our model finds a correlation or anticorrelation between the flux at that pixel and the parameter value weighted by that coefficient. Therefore, more weight is placed on pixels further from the baseline during the label transfer process.

**Figure 6.** Comparison of the solar spectrum (top) with three coefficient vectors (θ_Mg, ${\theta }_{v\sin i}$, θ_{log g}) over the same wavelength segment. The θ_{log g} and θ_Mg coefficient vectors are vertically displaced to baselines of y = 0.6 and y = 1.0, respectively, for visual clarity; all coefficient baselines are demarcated with solid gray horizontal lines. We focus on the region directly surrounding the Mg Ib triplet to show how the primary pixel correlations deduced by The Cannon correspond to spectral features at the same wavelengths. While θ_Mg directly correlates with the cores of the three Mg lines, which provide information about the Mg abundance, θ_{log g} is more directly affected by the wings of the lines, which provide a metric for the star's surface gravity. Intermediate-depth lines are most heavily weighted by ${\theta }_{v\sin i}$, since these lines are typically neither saturated nor prone to blending together with the baseline. These features are independently identified by The Cannon through its training process, demonstrating that the correlations identified by the model correspond directly to known physical features.

The Cannon appropriately infers that the pixel at the core of each Mg Ib triplet line includes substantial information content about the Mg abundance of the star, as shown by dips in θ_Mg at these pixels. Conversely, θ_{log g} approaches zero at each of these line cores and deviates further from zero at the wings of each line, reflecting the physical phenomenon of line broadening with increased surface gravity. Information about the stellar spin can be gleaned from all spectral lines, as reflected in ${\theta }_{v\sin i}$; however, our model most heavily weights intermediate-depth lines. This is likely because deeper lines may be saturated, while shallower lines blend together and are washed out by noise, making intermediate-depth lines the most informative for determining $v\sin i$. As a whole, Figure 6 reflects that the most prominent correlations identified by The Cannon correspond directly to known physical features, improving confidence in our model's results.

We depict the test results from our final model configuration in Figure 7. The dotted, diagonal line in each panel corresponds to a perfect recovery of the expected label with The Cannon, while deviations from this line indicate scatter in the results. The scatter in results provides a per-label representative 1σ uncertainty estimate for our model. Our best-performing all-orders model returns average χ² = 5.89 across our three train/test splits.
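The per-label summary statistics quoted for each panel (mean, median, and standard deviation of the offset from a perfect recovery) can be computed as in the sketch below. The function name `residual_summary` and the T_eff values are hypothetical, and the use of the sample standard deviation is an assumption; the text does not specify which estimator was used.

```python
import statistics


def residual_summary(predicted, expected):
    """Mean, median, and sample standard deviation of predicted - expected."""
    residuals = [p - e for p, e in zip(predicted, expected)]
    return (
        statistics.mean(residuals),
        statistics.median(residuals),
        statistics.stdev(residuals),
    )


# Hypothetical T_eff values (K): labels from the model and from the catalog.
teff_model = [5800, 5650, 6100, 4950]
teff_catalog = [5770, 5700, 6050, 5000]
mu, med, sigma = residual_summary(teff_model, teff_catalog)
print(mu, med, round(sigma, 1))
```

The standard deviation of these residuals is the quantity that plays the role of the representative 1σ uncertainty for that label.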

**Figure 7.** Post-2004 test results for all parameters with all echelle orders incorporated. In each panel, the mean μ, median m, and standard deviation σ from a perfect guess ($|x_i - E_i| = 0$) are provided in the top left. To most clearly visualize the bulk of our results, we exclude three outlier data points with SPOCS labels [Al/H] < −0.66 from the [Al/H] panel. We also do not include these [Al/H] outliers in our reported "reliable range" (see Table 3) or in the calculation of parameters in the top left.

To best visualize the overall model performance, we removed three outliers from the [Al/H] panel, each of which lies far to the left of the panel shown in Figure 7. The outliers are all at the bottom end of the distribution of predicted [Al/H] values, and the poor estimates returned for these three stars are likely a result of the sparsity of comparable stars in the SPOCS catalog with low [Al/H]. This behavior is expected, since by design The Cannon performs best on spectra similar to the training set and is not expected to extrapolate beyond what the model has been taught. Estimated labels outside of the range of our training set are unreliable and must be treated with caution.

The parameter space spanned by SPOCS is provided for reference in Table 3, where we report only the reliable range of [Al/H] values ([Al/H] ≥ −0.66) without outliers. We also include in these "reliable ranges" only the range of stellar parameters reliably recovered by our pipeline, meaning that these reported ranges are, in some cases, slightly smaller than the full range spanned by our training set. We note that $v\sin i$ has a cutoff at zero in the SPOCS database, resulting in the observed pileup at low $v\sin i$ in the top right panel of Figure 7. We do not impose an analogous condition with The Cannon.

Table 3. Summary of Our Post-2004 Keck HIRES Model Performance for Each Parameter, Including 1σ Scatter in Each Label and the Parameter Space Spanned by Our Training Data Set over Which Results Are Considered Reliable

Label	σ_1order	σ_full	Reliable Range
T_eff (K)	77	56	4700–6674
log g (cm s⁻²)	0.13	0.09	2.70–4.83
$v\sin i$ (km s⁻¹)	0.85	0.87	0.0044–18.71
[C/H]	0.09	0.05	−0.60–0.64
[N/H]	0.08	0.08	−0.86–0.84
[O/H]	0.07	0.07	−0.36–0.77
[Na/H]	0.09	0.05	−1.09–0.78
[Mg/H]	0.06	0.04	−0.70–0.54
[Al/H]	0.05	0.04	−0.66–0.58
[Si/H]	0.07	0.03	−0.65–0.57
[Ca/H]	0.04	0.03	−0.73–0.54
[Ti/H]	0.05	0.04	−0.71–0.52
[V/H]	0.07	0.06	−0.85–0.46
[Cr/H]	0.06	0.04	−1.07–0.52
[Mn/H]	0.10	0.05	−1.40–0.66
[Fe/H]	0.05	0.03	−0.99–0.57
[Ni/H]	0.07	0.04	−0.97–0.63
[Y/H]	0.07	0.08	−0.87–1.35

Note. Scatter in our best-performing single-order and all-orders results is reported as σ_1order and σ_full, respectively.


We find that The Cannon reliably returns the expected stellar labels, with scatter, listed in Table 3, typically lower than but comparable to the scatter between different catalogs providing spectroscopic parameter estimates. For example, the Hypatia catalog finds roughly 0.1–0.2 dex scatter between catalogs for each elemental abundance (Hinkel et al. 2014). The uncertainty in each stellar label returned by The Cannon is typically a factor of a few higher than that of the input SPOCS uncertainties (see Table 6 in Brewer et al. 2016). This makes sense, since our results cannot be more precise than the labels on which they are trained.

Our full model also returns lower scatter in T_eff and [Fe/H] than that obtained by SpecMatch, which reaches accuracies of 70 K in T_eff and 0.12 dex in [Fe/H] for stellar types K4 (T_eff ∼ 4600 K) and later (Yee et al. 2017). Furthermore, the scatter in temperature that we find is approximately equivalent to that of the combined infrared flux temperature measurements across different analyses (González Hernández & Bonifacio 2009; Casagrande et al. 2010; Brewer et al. 2016).

This model is somewhat computationally expensive to train (∼80 minutes for our best-performing all-orders configuration); however, once trained, it takes only about 3 seconds per spectrum to extract the 18 parameters of interest. This makes it a particularly powerful tool for large samples of stars, since the up-front model training procedure only needs to be performed once. We make our final model, which employs the top-performing all-orders configuration trained on all 1201 vetted stars in our x_O = 10 sample, publicly available at https://github.com/malenarice/keckspec. All other files required to run the code are also provided.

5. Application to Pre-2004 Spectra

With our optimized model for current Keck spectra at hand, we then developed a framework to classify archival, pre-2004 data. Our archival data set includes 831 Keck spectra, each continuum-normalized in the same manner as the SPOCS data set, obtained from 810 different stars prior to Keck's detector upgrade. We set aside the eight stars in our sample with multiple spectra available for a separate analysis of the scatter in results obtained from The Cannon, described in Section 5.2.

There are 337 single-spectrum stars in our pre-2004 archival data set that were also observed after 2004 and are accordingly included in the SPOCS data set. We use this overlapping sample as our test set to check and optimize model performance; we train on post-2004 spectra of the 865 stars that were not observed prior to 2004, then test our results using the pre-2004 spectra of our 337 test set stars. By construction, therefore, we no longer randomly sample our train/test sets from the same larger pool of spectra. As a result, we completed only one iteration for each test case in this section. Ultimately, we applied our optimized model to report newly obtained stellar labels for the remaining 473 single-spectrum stars, as well as four multi-spectrum stars that were not characterized in Brewer et al. (2016).

All spectra needed to be recalibrated prior to training and testing with The Cannon due to structural differences between the pre- and post-2004 Keck HIRES detectors. Each echelle order of the pre-2004 spectra includes either 2047 or 2048 pixels, rather than the 4021 pixels per echelle order sampled in our post-2004 training set. In addition, the pre-2004 echelle orders do not span exactly the same wavelength ranges as the post-2004 echelle orders.

For each echelle order, we found the broadest wavelength range covered by all available spectra by directly comparing pre- and post-2004 HIRES echelle orders with the maximum wavelength overlap. We then interpolated our training set, as well as all archival spectra, onto a 2048-pixel scale spanning this wavelength range for a uniform comparison across samples. From this process, we obtained 12 overlapping echelle orders covered by both the pre- and post-2004 HIRES spectra, for a total of 24,576 pixels modeled for each star.
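The resampling step can be illustrated with the sketch below, which finds the wavelength range shared by two versions of an echelle order and interpolates a spectrum onto a fixed number of pixels across that range. The function names, the evenly spaced output grid, and the use of linear interpolation are assumptions made for this example; the text above does not specify the interpolation scheme.

```python
from bisect import bisect_right


def common_range(wave_a, wave_b):
    """Wavelength range covered by both ascending wavelength arrays."""
    return max(wave_a[0], wave_b[0]), min(wave_a[-1], wave_b[-1])


def resample(wave, flux, lo, hi, n_pix=2048):
    """Linearly interpolate flux onto n_pix evenly spaced wavelengths in [lo, hi]."""
    step = (hi - lo) / (n_pix - 1)
    resampled = []
    for k in range(n_pix):
        w = lo + k * step
        # Index of the left-hand neighbor, clamped so that j + 1 stays valid.
        j = min(max(bisect_right(wave, w) - 1, 0), len(wave) - 2)
        t = (w - wave[j]) / (wave[j + 1] - wave[j])
        resampled.append(flux[j] + t * (flux[j + 1] - flux[j]))
    return resampled


# Toy example: a finely sampled order and a coarser one with a narrower range.
wave_post = [5000.0 + 0.5 * i for i in range(11)]
flux_post = [2.0 * w for w in wave_post]
wave_pre = [5001.0 + 1.0 * i for i in range(4)]
lo, hi = common_range(wave_post, wave_pre)
flux_on_grid = resample(wave_post, flux_post, lo, hi, n_pix=4)
print(lo, hi, flux_on_grid)
```

Applying the same grid to both the training spectra and the archival spectra, order by order, is what yields the 12 × 2048 = 24,576 pixels per star quoted above.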

5.1. Extrapolating the Model to Pre-2004 Spectra

Our goal in this section is to find a best-performing model that incorporates all 12 echelle orders in order to extract new labels for the 477 unlabeled stars. Due to the differing systematics across spectrographs, it is not necessarily true that the same optimization found in Section 4 will provide the best results in our interpolated model. We therefore repeated the model tuning steps described in Section 4.3 to find an optimal configuration for our interpolated model. We refer the interested reader to the Appendix for a detailed discussion of this process, which closely parallels that described in Section 4.

Our best-performing all-orders model built in this way—by progressively accepting or rejecting each potential alteration one by one—is characterized by outlier removal with x_O = 3 (10 stars removed; seven from the training set and three from the test set), a sin/cos continuum renormalization with N = 70, and no telluric masking, censoring, or regularization. However, when we compared this model with our post-2004 optimization (see Table 2), we found that we obtained the best results when applying the post-2004 configuration to the interpolated spectra. Our final model is therefore trained with the same all-orders optimized hyperparameters described in Table 2. With x_O = 10, the performance of this model was verified using 337 test set stars and 864 training set stars. Our best-performing all-orders model returns χ² = 4.37.

We visually display the performance of our model in Figure 8 and report our final 1σ uncertainties and reliable ranges in Table 4. Because it is trained using the same sample of stars, our pre-2004 model spans the same parameter space as our post-2004 model. The model performs remarkably well given that it has been trained using data from a different spectrograph from the test set. We demonstrate that, even with an interpolated set of spectra taken using a different instrument, our model returns the expected labels with high fidelity.

Table 4. Summary of Our Pre-2004 Keck HIRES Model Performance for Each Parameter, Including 1σ Scatter in Each Label (σ_full), as Well as the Parameter Space Spanned by This Model

Label	σ_full	Reliable Range
T_eff (K)	42	4700–6674
log g (cm s⁻²)	0.05	2.70–4.83
$v\sin i$ (km s⁻¹)	0.98	0.0044–18.71
[C/H]	0.05	−0.60–0.64
[N/H]	0.09	−0.86–0.84
[O/H]	0.07	−0.36–0.77
[Na/H]	0.06	−1.09–0.78
[Mg/H]	0.03	−0.70–0.54
[Al/H]	0.05	−0.66–0.58
[Si/H]	0.03	−0.65–0.57
[Ca/H]	0.03	−0.73–0.54
[Ti/H]	0.03	−0.71–0.52
[V/H]	0.04	−0.85–0.46
[Cr/H]	0.03	−1.07–0.52
[Mn/H]	0.05	−1.40–0.66
[Fe/H]	0.02	−0.99–0.57
[Ni/H]	0.03	−0.97–0.63
[Y/H]	0.07	−0.87–1.35

Note. Because our sample of stars is the same as in the post-2004 model, the reliable range remains unchanged.


After optimizing the model hyperparameters, we re-trained our final model on the full x_O = 10 SPOCS data set with a total of 1201 stars. We then applied this model to our set of 473 unlabeled single-spectrum stars to obtain all 18 labels for each star in the pre-2004 data set, reported in Table 5. We searched these returned labels for outliers that do not fall within the parameter ranges of our training set and that therefore may be unreliable. In total, 128 of our 473 stars had at least one predicted parameter that fell outside of these training set ranges. We flag these stars in the rightmost column of Table 5, where "y" indicates that a star has at least one predicted parameter outside of the reliable range.
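The flagging step can be sketched as a simple range check. For brevity, the example below checks only three of the 18 labels, using the reliable ranges from Table 4 and the corresponding values for three stars from Table 5; the function name `out_of_range_labels`, the dictionary keys, and the treatment of the range bounds as inclusive are assumptions made for the example.

```python
# Reliable ranges for three of the 18 labels, copied from Table 4.
RELIABLE_RANGE = {
    "teff": (4700.0, 6674.0),
    "logg": (2.70, 4.83),
    "vsini": (0.0044, 18.71),
}


def out_of_range_labels(labels):
    """Names of labels that fall outside their reliable range."""
    return [
        name
        for name, (lo, hi) in RELIABLE_RANGE.items()
        if not lo <= labels[name] <= hi
    ]


# T_eff, log g, and v sin i for three stars, copied from Table 5.
stars = {
    "HD 98744": {"teff": 6157, "logg": 3.98, "vsini": 2.15},
    "HD 230409": {"teff": 5388, "logg": 4.67, "vsini": -0.19},
    "HD 204587": {"teff": 4556, "logg": 4.46, "vsini": 1.31},
}
flags = {name: "y" if out_of_range_labels(lab) else "" for name, lab in stars.items()}
print(flags)
```

In this subset, HD 230409 is flagged because of its negative $v\sin i$ and HD 204587 because its T_eff lies below 4700 K, consistent with the flags in Table 5; a full check would apply the same test to all 18 labels.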

Table 5. Stellar Labels Returned by Our Trained Model for Archival, Pre-2004 Keck HIRES Spectra

Star	T_eff	log g	$v\sin i$	[C/H]	[N/H]	[O/H]	[Na/H]	[Mg/H]	[Al/H]	[Si/H]	[Ca/H]	[Ti/H]	[V/H]	[Cr/H]	[Mn/H]	[Fe/H]	[Ni/H]	[Y/H]	Flagged?
HD 98744	6157	3.98	2.15	−0.24	−0.09	0.01	−0.16	−0.19	−0.35	−0.19	−0.10	−0.10	−0.16	−0.15	−0.39	−0.16	−0.24	−0.12	⋯
HD 18144	5514	4.47	0.49	0.05	−0.02	0.11	0.04	0.05	0.07	0.04	0.06	0.05	0.07	0.07	0.07	0.07	0.04	0.05	⋯
HD 91204	5935	4.32	2.17	0.21	0.18	0.26	0.27	0.21	0.27	0.21	0.23	0.22	0.19	0.25	0.28	0.24	0.25	0.19	⋯
HD 230409	5388	4.67	−0.19	−0.49	−0.82	−0.14	−0.78	−0.58	−0.57	−0.55	−0.64	−0.54	−0.62	−0.86	−1.13	−0.81	−0.77	−0.66	y
HD 150437	5742	4.21	2.67	0.20	0.45	0.21	0.49	0.24	0.27	0.26	0.25	0.23	0.20	0.26	0.37	0.27	0.35	0.27	⋯
HD 25825	6020	4.43	6.93	0.11	0.15	0.13	0.04	0.11	0.06	0.13	0.18	0.16	0.17	0.18	0.12	0.19	0.12	0.25	⋯
HD 43587	6194	4.63	14.22	0.25	−0.17	0.35	0.05	0.22	0.04	0.17	0.37	0.31	0.32	0.34	0.27	0.31	0.19	0.30	⋯
HD 213575	5710	4.28	0.58	−0.05	−0.13	0.09	−0.12	−0.03	0.04	−0.07	−0.08	0.01	−0.04	−0.16	−0.27	−0.14	−0.13	−0.15	⋯
HD 139324	5889	4.23	1.32	0.10	0.07	0.19	0.13	0.10	0.10	0.10	0.13	0.10	0.13	0.13	0.17	0.14	0.14	0.11	⋯
HD 188510	5926	4.62	−2.47	−0.54	−0.54	−0.41	−0.82	−0.64	−0.73	−0.65	−0.71	−0.60	−0.67	−0.85	−1.11	−0.77	−0.80	−0.84	y
HD 157172	5481	4.49	2.46	0.10	0.23	0.10	0.23	0.13	0.10	0.13	0.12	0.12	0.13	0.14	0.20	0.14	0.16	0.08	⋯
HD 98618	5861	4.44	0.65	0.04	0.01	0.06	0.04	0.06	0.08	0.04	0.06	0.05	0.08	0.04	0.04	0.05	0.05	0.03	⋯
HD 7727	6080	4.34	2.84	0.06	0.11	0.16	0.06	0.07	0.04	0.08	0.13	0.09	0.09	0.12	0.08	0.12	0.07	0.10	⋯
HD 202108	5732	4.56	0.53	−0.18	−0.30	−0.10	−0.27	−0.19	−0.22	−0.19	−0.16	−0.17	−0.16	−0.19	−0.32	−0.20	−0.27	−0.17	⋯
HD 13825	5711	4.38	0.46	0.15	0.24	0.14	0.30	0.18	0.23	0.18	0.19	0.17	0.19	0.18	0.25	0.17	0.20	0.10	⋯
HD 152792	5738	4.05	0.56	−0.25	−0.30	−0.14	−0.30	−0.24	−0.27	−0.28	−0.20	−0.20	−0.25	−0.30	−0.47	−0.27	−0.31	−0.20	⋯
HD 204587	4556	4.46	1.31	0.31	−0.29	0.22	0.15	−0.06	0.01	0.10	0.10	−0.03	−0.07	−0.14	−0.08	−0.02	0.00	−0.22	y
HD 139457	6084	4.10	0.64	−0.28	−0.25	−0.11	−0.37	−0.26	−0.38	−0.29	−0.23	−0.18	−0.25	−0.33	−0.64	−0.32	−0.39	−0.36	⋯
HD 104067	4884	4.49	1.39	0.14	−0.09	0.12	0.08	0.06	0.09	0.10	0.12	0.09	0.10	0.06	0.08	0.10	0.07	−0.04	⋯
HD 106116	5657	4.35	−0.36	0.08	0.09	0.12	0.12	0.11	0.15	0.09	0.12	0.10	0.12	0.12	0.16	0.13	0.12	0.11	y
HD 208801	4834	3.48	1.35	0.10	0.20	0.20	0.09	0.11	0.25	0.01	0.11	0.18	0.10	0.09	0.19	0.12	0.14	−0.02	⋯
HD 120066	5830	4.14	1.20	0.01	0.04	0.13	0.02	0.10	0.13	0.07	0.13	0.12	0.10	0.08	0.04	0.10	0.09	0.15	⋯
HD 88218	5849	4.10	0.33	−0.10	−0.14	0.03	−0.14	−0.10	−0.08	−0.13	−0.07	−0.07	−0.10	−0.12	−0.23	−0.11	−0.13	−0.07	⋯
HD 141103	6213	4.11	3.83	−0.23	−0.04	−0.02	−0.18	−0.16	−0.34	−0.17	−0.12	−0.08	−0.11	−0.17	−0.42	−0.16	−0.25	−0.22	⋯
HD 213628	5580	4.50	0.46	0.00	−0.06	0.01	−0.05	0.02	0.05	−0.00	0.01	0.02	0.05	0.00	−0.01	0.01	0.00	−0.05	⋯
HD 122303	4972	4.17	−2.87	0.10	−0.22	−0.27	−0.23	−0.05	0.56	−0.22	−0.09	0.05	−0.13	−0.08	0.03	−0.09	0.00	0.14	y
HD 181655	5674	4.44	3.60	0.01	−0.11	0.06	−0.04	0.01	0.04	0.02	0.08	0.03	0.08	0.06	0.01	0.06	−0.00	0.11	⋯
HD 47127	5611	4.34	0.38	0.07	0.05	0.14	0.07	0.09	0.14	0.07	0.10	0.10	0.10	0.09	0.09	0.09	0.08	0.06	⋯
HD 144253	4790	4.25	8.48	−0.13	−0.24	−0.27	0.03	−0.16	0.03	0.07	0.07	−0.06	−0.03	−0.08	0.01	−0.01	−0.03	−0.30	⋯
HD 90125	4816	2.99	0.60	−0.25	−0.14	−0.03	−0.29	−0.16	−0.08	−0.29	−0.11	−0.07	−0.17	−0.17	−0.12	−0.10	−0.12	0.02	⋯
HD 150554	6080	4.42	2.58	0.07	0.04	0.06	0.05	0.04	−0.04	0.05	0.05	0.03	0.03	0.08	0.04	0.08	0.04	0.14	⋯
HD 9331	5580	4.27	1.00	0.13	0.08	0.22	0.14	0.20	0.22	0.18	0.20	0.17	0.21	0.18	0.22	0.19	0.20	0.19	⋯
HD 183650	5654	4.13	1.10	0.23	0.36	0.21	0.45	0.23	0.33	0.25	0.28	0.26	0.23	0.25	0.34	0.26	0.31	0.19	⋯
HIP 84099	5161	4.30	−1.79	0.10	−0.56	−0.30	−0.39	−0.11	0.22	−0.22	−0.18	−0.01	−0.33	−0.16	−0.17	−0.18	−0.15	−0.17	y

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.


Figure 9 illustrates the distribution of stellar parameters returned for our full set of 473 single-spectrum stars that were not included in the SPOCS data set. Regions outside of the reliable parameter space (see Table 4) are shaded in light red. The distribution of all predicted stellar labels is in light green, and the corresponding distribution of only stars for which all labels fall in the "reliable" parameter space is overlaid in dark green.

5.2. Scatter in Results: Stars with Multiple Spectra

Eight stars in our archival data set have more than one spectrum available, and we use a subset of these for a separate check on the precision of parameters reported for our model.

All stars with multiple archival spectra available are listed in Table 6, along with the total number of spectra available for each star. The first four stars in Table 6 are included in our training set, while the last four are not. Therefore, only the last four stars (HIP 76901, HD 207740, HD 92222A, and HD 92222B) are included in Table 5, where we report results for all spectra of each star for thoroughness. We use the first four stars, for which we have SPOCS labels, to study the performance of our model.

Table 6. Stars in Our pre-2004 Data Set with Multiple Archival Spectra Available

Star Name	# of Spectra
HD 178911B	4
HD 141937	2
HD 11964A	3
HD 212291	2
HIP 76901	3
HD 207740	3
HD 92222A	2
HD 92222B	2

Note. The first four stars in the table are also included in the labeled SPOCS data set, while the last four are not.


In Figure 10, we show the spread in results for HD 178911B, HD 141937, HD 11964A, and HD 212291, our sample stars with multiple archival spectra available and with known SPOCS labels, in order to display the precision of our model (how consistent its predictions are with each other) as well as its accuracy (how these predictions compare with the corresponding SPOCS values). Each star is represented by a color given in the legend, and estimates measured from different spectra of the same star are connected with a line. Values are reported relative to the "correct" SPOCS labels. Thus, points that lie to the right of the zero line (shown as a vertical dashed black line) are overestimated relative to the SPOCS labels, while points to the left of the zero line are underestimated. All points within the shaded gray regions fall within 1σ of the corresponding SPOCS value, and stars are vertically separated for visual clarity.

We find that most, but not all, of our labels from The Cannon fall within 1σ of the "correct" SPOCS label. For a more conservative uncertainty estimate, therefore, these 1σ uncertainties may be multiplied by a factor of 1.5 to 2. The typical scatter in spectral properties of an individual star is fairly small, and labels returned by different observations of the same star are generally consistent with each other within our error bars. Stars may also have some intrinsic variability such that labels will not necessarily stay exactly the same over time.
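The effect of inflating the uncertainties can be checked by counting how many estimates fall within a scaled 1σ band, as in the sketch below. The offsets are hypothetical values chosen for illustration, σ = 42 K is the T_eff scatter from Table 4, and the function name `fraction_within` is an assumption.

```python
def fraction_within(offsets, sigma, factor=1.0):
    """Fraction of offsets (model minus catalog) within factor * sigma of zero."""
    return sum(1 for d in offsets if abs(d) <= factor * sigma) / len(offsets)


sigma_teff = 42.0  # 1-sigma scatter in T_eff (K) from Table 4
# Hypothetical T_eff offsets (K) for repeated spectra of labeled stars.
offsets = [-30.0, 12.0, 55.0, -70.0, 40.0, 5.0]
for factor in (1.0, 1.5, 2.0):
    print(factor, fraction_within(offsets, sigma_teff, factor))
```

For these illustrative offsets, four of six estimates fall within 1σ, five within 1.5σ, and all six within 2σ.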

6. Potential Biases and Systematics

After quantifying the overall performance of The Cannon with our pre-labeled SPOCS data set, we also searched for trends in the model results that could be indicative of systematic biases in the labels returned by SME. We report these trends and their potential origins and, where possible, we propose methods to eliminate these trends in future analyses. We chose to complete this process using an early version of our model, with no tuning implemented, to ensure that any observed trends result from our input labels, rather than adjustments in our model setup. Accordingly, all trends described in this section are based on results from our x_O = 10 all-orders post-2004 model with no additional tuning.

Throughout this section, we show that The Cannon can be used to draw out systematic patterns across an input catalog. This demonstrates that The Cannon may, more broadly, serve as a useful tool to search and correct for systematic trends across a given stellar catalog.

6.1. Metallicity Correlations

While examining the offset between our test results and corresponding SPOCS labels, we observed a clear gradient with metallicity in most abundance estimates, as shown in Figure 11. Figure 11 displays [Mg/H] as a representative example, demonstrating that elemental abundances deduced by The Cannon tend to be overestimated at low metallicity and underestimated at high metallicity relative to the corresponding SPOCS abundances. As shown in Figure 11, the model tends to perform well at solar metallicity; however, our results deviate further from those in the SPOCS catalog for stars with metallicity further from solar. This match at solar metallicity is likely a result of pre-processing in Brewer et al. (2016) that calibrated the VALD-3 line list relative to solar using the NSO solar flux atlas (Wallace et al. 2011). We include a linear regression of all points in Figure 11, as well as the baseline at zero corresponding to a "perfect" parameter recovery, for reference.

**Figure 11.** [Mg/H] from the SPOCS catalog as a function of the difference between the [Mg/H] labels predicted by The Cannon and those of the input SPOCS sample. A linear fit to the data is shown in black, with slope m and y-intercept b, to quantify this downward trend. The statistical uncertainty of each point, obtained for the full population based on scatter in test results with The Cannon and the SPOCS label uncertainties reported in Brewer et al. (2016), is provided at the bottom left.

This systematic offset, consistent across abundances, most likely results from a systematic bias in the input data set, since a generative model returns parameters analogous to those that it accepts as an input, including learning any biases inherent in that input model. In particular, the anticorrelation between [Fe/H] and all other elemental abundances suggests a bias in the value of [M/H].

In theory, [M/H] represents the summed abundance of all metals as compared to the solar value and should, therefore, trace [Fe/H] with a slight offset to account for all other metals. In SME, [M/H] is estimated from [Fe/H] and used to choose a model atmosphere grid to build a spectral model. Abundances are then determined by modeling lines using radiative transfer through that atmosphere with the current model parameters. If the parameters do not change substantially, the same atmosphere may be used to sample a range of possible abundances. As a result, the value of [M/H] is not forced to be in agreement with the value of [Fe/H] in our input data set.

To further explore this bias, we studied the correlation of [M/H] with [Fe/H] in our input data set. In Figure 12, we show [Fe/H] as a function of the offset between [M/H] and [Fe/H]. There is a clear downward slope in this offset value, best fit by a line with slope m = −0.086 and y-intercept b = −0.013. The coolest stars in the sample follow a steeper downwards slope, while the hotter stars, which dominate the sample, follow a shallower slope.
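The linear fit described above can be illustrated with a minimal sketch. The data below are synthetic stand-ins generated from the quoted slope and intercept, not the SPOCS values; the noise level, sample size, [Fe/H] range, and the choice to regress the offset ([M/H] − [Fe/H]) on [Fe/H] are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the input catalog (NOT the SPOCS data).
feh = rng.uniform(-1.0, 0.5, size=500)            # assumed [Fe/H] range
m_true, b_true = -0.086, -0.013                   # slope and intercept quoted in the text
scatter = 0.02                                    # assumed scatter in dex
offset = m_true * feh + b_true + rng.normal(0.0, scatter, size=feh.size)

# Degree-1 least-squares fit; np.polyfit returns the highest power first.
m_fit, b_fit = np.polyfit(feh, offset, deg=1)

# A nonzero intercept means the fitted line does not pass through
# zero offset at solar metallicity ([Fe/H] = 0).
print(f"slope = {m_fit:.3f}, intercept = {b_fit:.3f}")
```

With real catalog columns in place of `feh` and `offset`, the same two-parameter fit could also be repeated on temperature-selected subsamples to compare the slopes of the cool and hot stars.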

Interestingly, the best-fitting line that we found does not produce a perfect match at solar metallicity. We conclude that the observed offset is likely due to degeneracies between T_eff and log g within the model atmosphere selection grid in SME. Although [M/H] is not a parameter in our model with The Cannon, it tends toward solar values in our results based on the anticorrelation observed in Figure 11. The offset in [Mg/H] calculated from The Cannon and SPOCS roughly tracks the trend of [M/H] with [Fe/H] from the SPOCS data set, showing that The Cannon reproduces this inherited pattern.

The systematic trend in [M/H] with [Fe/H] is a problem inherent to our input data set which, in turn, produces a bias in our results from The Cannon, as shown in Figure 11. In the process of determining labels with SME, a stellar atmosphere model is selected from a coarse grid with steps of 250 K in temperature, 0.5 dex in log g, and 0.1 dex in [M/H] for values −0.3 to +0.3, or 0.5 dex outside. The presence of this systematic problem indicates that a reanalysis of these spectra in SME with a finer atmospheric grid—or a different atmospheric grid altogether—may be warranted.
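To make the coarseness of this grid concrete, the sketch below snaps a set of stellar parameters to the nearest node of a grid with the step sizes quoted above. This is an illustration only: the grid bounds, the node lists, and the nearest-node rule are assumptions, and SME's actual atmosphere selection may differ (for example, by interpolating between nodes).

```python
import numpy as np

# Hypothetical grid nodes; only the step sizes come from the text.
TEFF_NODES = np.arange(3500, 7001, 250)     # 250 K steps (bounds assumed)
LOGG_NODES = np.arange(0.0, 5.01, 0.5)      # 0.5 dex steps (bounds assumed)
# 0.1 dex steps for -0.3..+0.3 and 0.5 dex steps outside (outer bounds assumed).
MH_NODES = np.array([-1.0, -0.5, -0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3, 0.5])


def nearest_node(value, nodes):
    """Return the grid node closest to `value`."""
    nodes = np.asarray(nodes, dtype=float)
    return float(nodes[np.argmin(np.abs(nodes - value))])


def snap_to_grid(teff, logg, mh):
    """Snap (T_eff, log g, [M/H]) to the nearest node on each axis."""
    return (
        nearest_node(teff, TEFF_NODES),
        nearest_node(logg, LOGG_NODES),
        nearest_node(mh, MH_NODES),
    )


# A star with [M/H] = -0.42 falls between the -0.5 and -0.3 nodes,
# so the selected atmosphere can be off by close to 0.1 dex.
print(snap_to_grid(5777.0, 4.44, -0.42))
```

Under these assumptions, stars outside the finely sampled range −0.3 to +0.3 can be assigned an atmosphere that differs from their true [M/H] by up to 0.25 dex, which is one way a coarse grid could imprint a metallicity-dependent pattern on the derived labels.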

6.2. Systematics in T_eff

We also explored potential systematics that may be present in the distribution of T_eff values across our stellar sample. Figure 13 displays the SPOCS T_eff as a function of the offset between the model and input (SPOCS) T_eff values. The top panel of Figure 13 shows an increase in the dispersion of offsets at high log g and at high T_eff. The lowest log g values are clustered at the coolest stars, reflecting the inclusion of sub-giants and a few giant stars in the sample. We find no clear trend in the scatter of T_eff with metallicity in the lower panel of Figure 13.

7. Additional Applications

The methods explored in this work have a wide range of potential applications beyond the scope of this paper. The model that we have developed may be applied to any current and future Keck HIRES spectra of individual stars to obtain not only the four primary stellar labels, but also all 15 elemental abundances determined in Brewer et al. (2016). With the abundance of new planets around bright stars being discovered by, for example, the Transiting Exoplanet Survey Satellite (Ricker et al. 2014; Huang et al. 2018; Wang et al. 2019), the number of promising targets for follow-up radial velocity observations is regularly increasing. A uniform, large-scale statistical analysis of these host star properties would allow for detailed population studies of the growing sample of known planets.

Once trained, the saved model can be quickly loaded and takes seconds to classify a spectrum, meaning that it is possible to determine all 18 labels with minimal delay after spectra have been obtained, reduced, and normalized. Thus, our model provides a powerful method to rapidly and precisely determine the properties of stars in the northern hemisphere. This may open new doors to study, for example, correlations between host star properties and planet size, composition, multiplicity, and other properties. Furthermore, it may allow for efficient stellar characterization soon after observing to quickly obtain stellar parameters and inform ongoing observations.

In theory, this model could be further applied to spectra obtained from other telescopes and instruments with an overlapping wavelength coverage. We demonstrated in this work that, by interpolating our spectra to a new wavelength grid, we reliably recovered the properties of stars observed with Keck's older, pre-2004 HIRES detector—a separate instrument with different systematics as compared to the newer, current detector. We have also completed preliminary tests extending this concept, showing in K. Worku et al. (2020, in preparation) that our model can successfully recover the primary stellar labels from spectra obtained with the Automated Planet Finder (Vogt et al. 2014). Our findings suggest that, with further refinement, it may be possible to obtain all 18 labels from non-HIRES spectra using interpolated versions of our model, though potentially with higher uncertainties due to the differing systematics across instruments.
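The resampling step mentioned above can be sketched as follows. The text does not specify the interpolation scheme, so linear interpolation with `numpy.interp` is an assumption here, and the function name `resample_spectrum` is hypothetical.

```python
import numpy as np


def resample_spectrum(wave_in, flux_in, wave_out, fill=np.nan):
    """Linearly interpolate a spectrum onto a new wavelength grid.

    Pixels of `wave_out` outside the input coverage are set to `fill`
    rather than extrapolated.
    """
    wave_in = np.asarray(wave_in, dtype=float)
    flux_in = np.asarray(flux_in, dtype=float)
    order = np.argsort(wave_in)  # np.interp requires increasing x values
    return np.interp(wave_out, wave_in[order], flux_in[order],
                     left=fill, right=fill)


# Toy example: three input pixels resampled onto an offset grid.
resampled = resample_spectrum([1.0, 2.0, 3.0], [10.0, 20.0, 30.0],
                              np.array([1.5, 2.5, 4.0]))
```

Flagging out-of-range pixels with NaN rather than extrapolating keeps wavelengths that one instrument does not cover from silently entering the fit.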

8. Conclusions

Throughout this work, we demonstrated applications of The Cannon to obtain 18 stellar labels from Keck HIRES stellar spectra using the SPOCS catalog as a training set. We explored several methods to optimize the model's performance, including outlier removal, data-driven continuum renormalization, telluric masking, label censoring, and L1 regularization. The primary outcomes of this work are as follows.

1. We developed and tested a novel, efficient open-source tool that takes current (post-2004) Keck HIRES spectra as its inputs and outputs 18 stellar labels, including T_eff, log g, $v\sin i$, and 15 stellar abundances: C, N, O, Na, Mg, Al, Si, Ca, Ti, V, Cr, Mn, Fe, Ni, and Y. The corresponding uncertainties for each parameter, which are comparable to the scatter in values across catalogs, are described in Table 3.
2. We demonstrated that an interpolated model trained on the SPOCS catalog can return accurate stellar parameters for spectra spanning a similar parameter space and wavelength range, but obtained from a separate spectrograph.
3. We applied our interpolated, re-optimized model to create a catalog of 18 stellar labels for 477 stars observed with Keck HIRES prior to its 2004 detector upgrade. These archival spectra could not be processed uniformly with the rest of the SPOCS sample with the SME program due to the older detector's more limited wavelength range. Our results are provided in Table 5 and can be found in full in the online version of this paper.

In addition to quickly delivering stellar properties for individual stars, the high precision and robustness of parameters obtained with The Cannon make it a particularly powerful tool for population studies of stars. Studies comparing these stellar properties to trends in system architecture hold great potential to reveal a more comprehensive understanding of planetary systems and their underlying correlations. Our code's capability to rapidly determine stellar parameters from individual spectra makes it possible to efficiently and uniformly analyze large samples of stars, rendering such studies much more computationally tractable than in the past. Applications to a broader range of stellar spectra may further extend this work and provide a more holistic view of the relationship between stars' properties and their surrounding environments.

We thank Andy Casey, Melissa Ness, and Debra Fischer for helpful conversations over the course of this work. M.R. is supported by the National Science Foundation Graduate Research Fellowship Program under grant No. DGE-1752134. This work has made use of the VALD database, operated at Uppsala University, the Institute of Astronomy RAS in Moscow, and the University of Vienna. The authors wish to recognize and acknowledge the very significant cultural role and reverence that the summit of Maunakea has always had within the indigenous Hawaiian community. We are most fortunate to have the opportunity to conduct observations from this mountain.

Software: numpy (Oliphant 2006; Walt et al. 2011), matplotlib (Hunter 2007), The Cannon (Ness et al. 2015; Casey et al. 2016).

Appendix: Pre-2004 Model Optimization

Here we detail the process of model optimization used to obtain our best-fitting model in Section 5.1. We note that our final model adopts the hyperparameters described in Table 2, rather than the final model obtained in this Appendix. However, the same wavelength ranges and telluric mask described in this section are also used in the final model.

A.1. Outlier Removal

We repeated the outlier removal procedure discussed in Section 4.2, again testing our model performance with no outliers removed and with x_O = 1.5, 3, and 10. As before, we ran extensive tests with a single wavelength order and used our results to inform a narrower set of tests for our full model that includes all 12 wavelength orders. With this approach, we were able to examine a wider range of models before the associated computation time became prohibitive.

We found that echelle order 6, spanning 5366–5432 Å, performed best in these tests with x_O = 1.5. We accordingly adopted this configuration as our representative single-order base test case moving forward. Unsurprisingly, this order fully overlaps with the best-performing order from our post-2004 tests, indicating the high information content of this wavelength range.

The x_O = 3 case returned the lowest χ² value in our interpolated, pre-2004 model with all 12 echelle orders included. For both our single-order and all-orders interpolated models, we obtained the lowest χ² value when implementing a more liberal outlier removal criterion than in the previous model optimized for current Keck HIRES data. This may reflect the sparser training set used in this section. The training set used for pre-2004 model testing contains significantly fewer stars (865 before outlier removal) than the post-2004 case, where 80% of the pre-labeled vetted SPOCS stars were used for training (961 before outlier removal). The removal of outliers reduces the parameter space over which a model can be considered reliable; however, it also decreases the chance that the edges of the parameter space, where stars are poorly sampled, are falsely included within the reliable parameter range.

A.2. Data-driven Continuum Renormalization

To explore several possible model configurations, we again used four different thresholds for our data-driven continuum pixel selection: N = 50, 60, 70, and 80. As in Section 4.3.1, we first trained our model to find the N% of pixels with coefficients closest to zero for each of our four primary labels. We then selected the pixels that both overlapped between these sets and fell within 1.5% of the continuum baseline.
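The selection described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and array layout are hypothetical, and "within 1.5% of the continuum baseline" is interpreted here as a model baseline flux within 0.015 of unity.

```python
import numpy as np


def select_continuum_pixels(linear_coeffs, baseline_flux,
                            n_percent=70, flux_tol=0.015):
    """Return a boolean mask of candidate continuum pixels.

    linear_coeffs : (n_labels, n_pixels) first-order model coefficients
                    for the primary labels
    baseline_flux : (n_pixels,) baseline spectrum, normalized so that
                    the continuum sits near 1
    """
    abs_c = np.abs(np.asarray(linear_coeffs, dtype=float))
    # Per-label cut: keep the N% of pixels with coefficients closest to zero.
    thresh = np.percentile(abs_c, n_percent, axis=1, keepdims=True)
    near_zero = abs_c <= thresh
    # Keep only the pixels that pass the cut for every label.
    overlap = near_zero.all(axis=0)
    # Require the baseline flux to lie close to the continuum level.
    flat = np.abs(np.asarray(baseline_flux, dtype=float) - 1.0) <= flux_tol
    return overlap & flat
```

Because the final mask is an intersection across labels combined with a flux cut, the fraction of selected pixels is generally well below N%, in line with the much smaller percentages reported for the single-order fit.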

For per-label pixel cuts at the 50th, 60th, 70th, and 80th percentile, the final percentage of pixels identified as "true" continuum pixels in our single-order fit was roughly 7%, 12%, 16%, and 20%, respectively, with some variation occurring from spectrum to spectrum. A sample spectrum of G0 star HD 36130 is shown in Figure A1 with N = 50 which, for this spectrum, results in 7.6% of all pixels being selected as continuum pixels. Figure A1 displays both the selected continuum pixels and the two corresponding fits for comparison. As in Section 4.3.1, and as illustrated by Figure A1, the two functional forms—sin/cos and polynomial fits—typically provide similar results.

**Figure A1.** Sample continuum renormalization fit over the spectrum of HD 36130, shown in blue, for the best-fitting single-order wavelength range of our pre-2004 Keck HIRES data. As in Figure 2, the polynomial fit is shown in green and the sin/cos fit is in purple. Continuum pixels are denoted by black markers, where here we show the N = 50 case.

We found that both the polynomial and sin/cos renormalization functions consistently improved our single-order fit for all N values. The best-performing single-order model, which we adopted for our ongoing testing, used the N = 50 threshold with a sin/cos renormalization.
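The two functional forms compared here can be sketched with a linear least-squares fit restricted to the selected continuum pixels. The exact basis, the number of terms, and the period of the sin/cos series are not given in this section, so the choices below (three harmonics, a period of twice the order's wavelength span, a cubic polynomial) are assumptions for illustration.

```python
import numpy as np


def sincos_design(wave, wave0, length, order):
    """Design matrix with columns 1, sin(k*phi), cos(k*phi) for k = 1..order."""
    phase = 2.0 * np.pi * (wave - wave0) / length
    cols = [np.ones_like(wave)]
    for k in range(1, order + 1):
        cols += [np.sin(k * phase), np.cos(k * phase)]
    return np.column_stack(cols)


def fit_continuum(wave, flux, cont_mask, kind="sincos", order=3):
    """Fit the continuum on masked pixels and evaluate it on all pixels."""
    wave0 = wave.min()
    span = wave.max() - wave0
    if kind == "sincos":
        length = 2.0 * span  # assumed period of the series
        a_fit = sincos_design(wave[cont_mask], wave0, length, order)
        coeffs, *_ = np.linalg.lstsq(a_fit, flux[cont_mask], rcond=None)
        return sincos_design(wave, wave0, length, order) @ coeffs
    if kind == "poly":
        x = (wave - wave0) / span  # rescale to [0, 1] for numerical stability
        poly = np.polynomial.Polynomial.fit(x[cont_mask], flux[cont_mask], deg=order)
        return poly(x)
    raise ValueError(f"unknown continuum form: {kind}")
```

Dividing the observed flux by the returned continuum gives the renormalized spectrum. For a smooth, slowly varying baseline both forms should give similar results, which is consistent with the behavior reported above.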

To apply these results to our full model with all 12 orders, we again tested the N = 50 case with both a sin/cos and polynomial renormalization. We found that both renormalization schemes degraded our results with all orders included. However, upon closer examination of the individual per-order renormalization fits, we found that the N = 50 threshold resulted in a very sparse and potentially unreliable continuum fit in several echelle orders. To determine whether the inclusion of more "continuum" pixels would improve our fit, we also tested the N = 70 case, which provided only marginally less reliable results in the single-order case and which resulted in our best fits in Section 4.3.1.

As suspected, the N = 70 case did improve our results for both a polynomial and sin/cos fit with all echelle orders incorporated. This suggests that, while order 6 (5366–5432 Å) performs best with N = 50, sufficiently few pixels are selected in other echelle orders with N = 50 that the continuum fit is not consistently reliable across orders. With N = 70, a substantially larger number of pixels is incorporated into the continuum fit, leading to a better approximation of the underlying baseline. While both fits improved our results, we obtained the lowest χ² value with a sin/cos renormalization applied to all echelle orders. As a result, we implemented sin/cos renormalizations with N = 50 in our single-order models and with N = 70 in our all-orders models moving forward.

A.3. Telluric Masking

We also applied telluric masking in our model configuration tests. First, we created a new telluric mask by finding the locations of telluric lines in each spectrum from our pre-2004 training set and creating a mask for each spectrum. The telluric lines do not match up exactly in every spectrum due to the differing barycentric corrections and radial velocities of each observed star. Thus, we combined all of our individual masks to create one master mask that we applied to all spectra. This mask is visualized in Figure A2 with a sample spectrum from HD 36130 shown for reference.
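Combining the per-spectrum masks into a master mask amounts to a pixel-wise logical OR, as in the sketch below. It assumes that all spectra share a common pixel grid and that masked pixels are excluded by setting their inverse variance to zero; both the function names and the zero-weight convention are assumptions rather than details taken from the text.

```python
import numpy as np


def master_telluric_mask(per_spectrum_masks):
    """Union of per-spectrum telluric masks (True = telluric pixel).

    per_spectrum_masks : (n_spectra, n_pixels) boolean array on a
                         common pixel grid
    """
    masks = np.asarray(per_spectrum_masks, dtype=bool)
    return masks.any(axis=0)


def apply_mask(ivar, mask):
    """Give masked pixels zero weight by zeroing their inverse variance."""
    out = np.array(ivar, dtype=float, copy=True)
    out[mask] = 0.0
    return out


# Toy example: telluric lines land on different pixels in each spectrum.
master = master_telluric_mask([[True, False, False],
                               [False, False, True]])
weights = apply_mask([2.0, 2.0, 2.0], master)
```

Because the union grows with every spectrum whose telluric lines fall on slightly different pixels, a master mask of this kind can exclude a large fraction of an order.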

**Figure A2.** Top: full, continuum-renormalized spectrum of sample star HD 36130, showing the 12 overlapping wavelength regions shared across our pre- and post-2004 spectra. The portion of the spectrum corresponding to the lower panel is highlighted in gray. Bottom: zoom-in of only echelle order 6, ranging from 5366–5432 Å. Black markers denote the telluric pixels ("masked pixels") at the bottom of each panel, as well as the non-telluric pixels ("unmasked pixels") at the top of each panel.

With our telluric mask in place, 648 of the 2048 pixels in our single, best-performing echelle order were excluded from our fit, with the masked pixels shown in Figure A2. Despite the loss of information, we found that this masking slightly improved our results. This reflects the tradeoff between removing noise and removing signal with substantial masking implemented. We continued to apply telluric masking in our ongoing single-order tests to account for the net improvement observed in our label recovery.

In our all-orders tests, we instead found that telluric masking degraded our results, and we chose not to include it within our final model as a result. This suggests that, in our interpolated model with all orders included, telluric masking reduces the signal more than it reduces the noise in our model, leading to poorer performance overall.

A.4. Censoring

As in Section 4.3.3, we also applied censoring at the 5%, 15%, 50%, 85%, and 95% levels, meaning that, for example, only the most highly varying 5% of all nonzero pixels for each label were used when fitting at the 5% censoring level. We ran two sets of tests in which we censored (1) all labels or (2) only the primary four labels (T_eff, log g, $v\sin i$, and [Fe/H]), resulting in a total of 10 test cases. Two of these cases (85% and 15%, each with only the primary four stellar labels censored) are illustrated in Figure A3, which depicts the masked/unmasked pixel locations for each case.
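A per-label censoring mask of this kind can be sketched as follows. Here "most highly varying" is interpreted as the largest absolute model coefficient for that label, and "nonzero pixels" as pixels with a nonzero coefficient; both readings, along with the function name, are assumptions.

```python
import numpy as np


def censor_mask(label_coeffs, keep_percent):
    """Return True for pixels that are used when fitting this label.

    label_coeffs : (n_pixels,) model coefficients for a single label
    keep_percent : percentage of nonzero pixels to keep (e.g., 5, 15, 85)
    """
    abs_c = np.abs(np.asarray(label_coeffs, dtype=float))
    nonzero = abs_c > 0
    if not nonzero.any():
        return np.zeros(abs_c.shape, dtype=bool)
    # Keep the top `keep_percent` of nonzero pixels by |coefficient|.
    cutoff = np.percentile(abs_c[nonzero], 100.0 - keep_percent)
    return nonzero & (abs_c >= cutoff)
```

At the 5% level only the few pixels most sensitive to a label survive, whereas the 85% and 95% levels discard only the least informative pixels.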

**Figure A3.** Sample censored wavelengths for sample star HD 36130, selected for the primary four stellar labels: [Fe/H] (green), log g (blue), T_eff (violet), and $v\sin i$ (purple). The unmasked pixels corresponding to each label are shown above the spectrum, and the masked, unused pixels are below. Masks for each label are provided in pairs, where the upper line in each color corresponds to the 85% mask, while the lower line corresponds to the 15% mask. "Unmasked" pixels are included in the analysis for that label, while "masked" pixels are excluded.

Ultimately, we found that both the 85% and 95% test cases with four labels censored improved our single-order model results. The 85% test case performed slightly better and we therefore used this case moving forward. In general, heavier censoring—using smaller samples of pixels to fit each label—led to less reliable results than lighter censoring.

We then tested these two best-performing cases in our all-orders model to determine whether they would produce improvements in our label recovery. We found that, with all orders included in the model, censoring four labels at either the 85% or 95% level provided no substantial improvements to our model performance. This reflects the tradeoff between removing noisy pixels and eventually removing pixels that provide useful information. Thus, we did not incorporate censoring into our final, all-orders model and instead elected to use it only in our single-order model.

A.5. L1 Regularization

Lastly, we applied regularization to our best-fitting individual order with Λ = 1, 10, 100, 1000, and 10,000, resulting in label density distributions almost identical to those in Figure 5. We found that lower regularization values generally provided better results than higher ones, but that any of our tested regularization values degraded the model results relative to the case with no regularization. We chose not to include regularization in either our single-order or all-orders model configuration.

While we did not find that the tested regularization values led to an improved χ² value, this does not imply that no values of regularization would improve our results. Of our tested Λ values, we obtained the best results with Λ = 100. This suggests that, if any Λ value exists that would improve our model results, it is likely between Λ = 10 and Λ = 1000. Behmard et al. (2019) also tested regularization values on a grid spanning Λ = 10⁻⁶ to Λ = 10² and found that no tested Λ values improved their model results. Given that our test set results already had low scatter and that additional benefits from fine-tuning would likely be only marginal, we found that it was not practical for our purposes to sample a finer grid of possible values.
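For reference, the role of Λ can be illustrated with a simplified per-pixel objective: a weighted sum of squared residuals plus an L1 penalty on the model coefficients. This is a sketch under stated assumptions rather than the objective used in this work; the full objective may include additional terms (such as a per-pixel scatter term) that are omitted here, and leaving the baseline coefficient unpenalized is also an assumption.

```python
import numpy as np


def l1_objective(theta, design, flux, ivar, lam):
    """Simplified single-pixel training objective with L1 regularization.

    theta  : (n_terms,) model coefficients for this pixel
    design : (n_stars, n_terms) label design matrix (first column of ones)
    flux   : (n_stars,) normalized flux at this pixel
    ivar   : (n_stars,) inverse variances at this pixel
    lam    : regularization strength (the Lambda of the text)
    """
    resid = flux - design @ theta
    chi2 = np.sum(ivar * resid**2)
    # Penalize all coefficients except the baseline term theta[0].
    penalty = lam * np.sum(np.abs(theta[1:]))
    return chi2 + penalty
```

With Λ = 0 the objective reduces to the weighted χ² alone. Larger Λ values push more coefficients toward zero, which can remove useful information when the unregularized model already shows low scatter.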

A.6. Final Model Configuration

Our best-performing pre-2004 model configurations for both the single-order and all-orders cases are provided in Table A1. Both the best-fitting single-order and all-orders models are characterized by strict outlier thresholds, removing several stars from the training/test sets. This improves our model performance with the tradeoff that our model spans a smaller parameter space and cannot be applied to as wide a range of stars. Our final all-orders model obtained in this section, with x_O = 3, ultimately includes 334 test set stars and 858 training set stars, returning χ² = 4.79. We emphasize that the final configuration used to obtain the catalog in Table 5 is not this model, but rather one that applies the same hyperparameters as the optimized post-2004 model. Our final model does, however, use the same telluric mask and the same wavelength ranges described throughout this Appendix.

Table A1. Optimized Training Configurations for Our Models, Developed to Classify Pre-2004 Keck HIRES spectra

	Single Order	All Orders
x_O	1.5	3
Renormalization	N = 50, sin/cos	N = 70, sin/cos
Telluric masking	included	not included
Censoring	85%, 4 labels	none
Λ	0	0

Note. The single-order run spans wavelength range 5366–5432 Å. We note that our final model does not use this configuration, since the hyperparameters found in our analysis of current Keck spectra provided further improved results.
