Statistics of the Chemical Composition of Solar Analog Stars and Links to Planet Formation

Jacob Nibauer; Eric J. Baxter; Bhuvnesh Jain; Jennifer L. Van Saders; Rachael L. Beaton; Johanna K. Teske

doi:10.3847/1538-4357/abd0f1

1. Introduction

High precision spectroscopy reveals that the Sun may have an unusual elemental composition compared to the majority of nearby solar analogs. In particular, the Sun is found to have a lower than average ratio of refractory-to-volatile elements, as first discovered by Meléndez et al. (2009). In that work, the chemical composition of 11 stars nearly identical to the Sun was determined using a line-by-line, strictly differential spectral analysis that allowed for ∼0.01 dex precision on individual abundance measurements. The depletion of refractory elements in the Sun was found to correlate with each element's 50% condensation temperature (T_c), the temperature at which 50% of the element condenses from gaseous to solid phase under protoplanetary disk conditions (Lodders 2003). The T_c trend has spurred extensive debate, with some studies able to verify the anomalous solar composition while others were unable to detect similar depletion patterns across various samples of stars (Ramírez & Asplund 2009; Gonzalez et al. 2010; Ramírez et al. 2011; Schuler et al. 2011b, 2015; Meléndez et al. 2012; González Hernández et al. 2013b; Liu et al. 2014, 2020; Tucci Maia et al. 2014; Nissen 2015; Teske et al. 2016; Bedell et al. 2018; Nissen et al. 2020).

Notably, Ramírez & Asplund (2009) found evidence of a bimodality in the abundance versus T_c slope distribution at super-solar metallicities, with some stars displaying solar-like depletions while others appear comparatively enriched in refractory elements relative to the more volatile elements. Various mechanisms for the depletion trends have been proposed, the most intriguing of which suggests that missing refractory material could be locked up in rocky planets. This explanation was initially proposed by Meléndez et al. (2009) and later supported by Chambers (2010), who demonstrated that the Sun's refractory deficit corresponds to roughly four Earth masses of terrestrial material. Other explanations include the possibility of refractory enrichment of stars by planet engulfment (Ramírez et al. 2011; Spina et al. 2015; Oh et al. 2018; Church et al. 2020), or interactions within the parent molecular cloud leading to the observed refractory depletion (Gustafsson et al. 2010). More recently, Booth & Owen (2020) utilized evolutionary models for protoplanetary disks surrounding young stars to show that the depletion trend might emerge from a gap in the disk created by a forming giant planet. Under this hypothesis, the gap forms a pressure trap that impedes the accretion of refractory material onto the host star, thereby resulting in the observed deficiency of refractory material. Galactic chemical evolution (GCE) and consequently stellar age have also been proposed as a possible explanation for these trends (Adibekyan et al. 2014; Spina et al. 2016). In a recent work, however, Bedell et al. (2018) recovered a solar depletion trend after correcting for GCE effects in a sample of 79 Sun-like stars.

The standard approach for quantifying trends in the abundance versus T_c plane involves performing linear fits to the abundance data for each star to derive estimates of the slope and intercept of the abundance trend, which are then compared between stars. This approach requires the determination of elemental abundances to high precision, since the magnitude of the refractory deficit is of order 0.05 dex (Meléndez et al. 2009). Additionally, in order to reduce systematic uncertainties stemming from GCE and potentially inaccurate stellar models, most studies focus entirely on solar-like stars, or on stars in binary systems, thus severely limiting sample sizes. Consequently, while previous works have obtained high precision, they have been limited to smaller samples with ∼20–100 stars.
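The per-star linear fit described above can be sketched as follows. This is a minimal illustration rather than the procedure of any cited study: the function name `fit_tc_trend` is hypothetical, the condensation temperatures are approximate values for Si, Mg, Ni, Ca, and Al that should be checked against Lodders (2003), and the example star is invented.

```python
import numpy as np

# Approximate 50% condensation temperatures in K for Si, Mg, Ni, Ca, Al.
# Illustrative values only; verify against Lodders (2003) before use.
T_C = np.array([1310.0, 1336.0, 1353.0, 1517.0, 1653.0])

def fit_tc_trend(abund, sigma, t_c=T_C):
    """Weighted least-squares fit of [X/Fe] = m * T_c + b for one star.

    Returns (m, b, sigma_m), where sigma_m is the formal 1-sigma
    uncertainty on the slope. Fitting about the weighted mean
    temperature decouples the slope from the intercept.
    """
    w = 1.0 / sigma**2
    t0 = np.sum(w * t_c) / np.sum(w)          # weighted pivot temperature
    x = t_c - t0
    m = np.sum(w * x * abund) / np.sum(w * x**2)
    b_pivot = np.sum(w * abund) / np.sum(w)   # fitted abundance at the pivot
    sigma_m = 1.0 / np.sqrt(np.sum(w * x**2))
    return m, b_pivot - m * t0, sigma_m

# Hypothetical noiseless star with slope 1e-4 dex/K and 0.01 dex errors.
abund = 1e-4 * T_C - 0.14
sigma = np.full(T_C.size, 0.01)
m, b, sigma_m = fit_tc_trend(abund, sigma)
```

In this toy configuration, 0.01 dex uncertainties on five refractory elements give a formal slope uncertainty of roughly 3 × 10⁻⁵ dex K⁻¹, which illustrates why individual-star fits demand such high abundance precision.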

Several authors have suggested the importance of increasing sample sizes to build up the statistics for the presence or absence of depletion trends across many stars (e.g., Meléndez et al. 2009; Ramírez et al. 2014; Schuler et al. 2015). While elemental abundance measurements from high-resolution spectroscopy are available for hundreds of thousands of stars from surveys such as the Apache Point Observatory Galactic Evolution Experiment (APOGEE; Majewski et al. 2017) or the Galactic Archaeology with HERMES survey (GALAH; Martell et al. 2017), the precision of these abundance measurements is generally below the threshold required to detect subtle depletion trends for individual stars using the methods previously described.

In this work, we attempt to measure depletion trends across many stars using a novel statistical approach. We utilize abundance data from the 16th data release of the Sloan Digital Sky Survey (SDSS), primarily relying on the second iteration of the APOGEE survey (APOGEE-2; Ahumada et al. 2020; Jönsson et al. 2020). While the uncertainties on elemental abundance data from APOGEE-2 are generally higher than in the previously mentioned studies, we combine measurements across stars using a likelihood-based approach to place constraints on the overall distribution of elemental abundances in solar analogs, rather than relying on the precision measurements of individual stars. In addition to applying our analysis to large samples of stars from APOGEE-2, we also apply our analysis to publicly available line-by-line abundance measurements from Bedell et al. (2018) for a sample of 79 Sun-like stars, finding roughly consistent results.

We have previously used a similar formalism to detect circumstellar debris disks in data from the Planck cosmic microwave background survey, constraining parameters such as the fraction of stars with debris disks (Nibauer et al. 2020). The present work employs a similar modeling scheme, enabling us to place constraints on the fraction of stars with elemental depletions.

The paper is organized as follows. In Section 2, we describe the data from APOGEE-2 and Bedell et al. (2018) used in our analysis; in Section 3, we describe our analysis pipeline and modeling approach; in Section 4, we present our results on data from APOGEE-2 and Bedell et al. (2018), and we conclude in Section 5.

2. Data

2.1. APOGEE

We utilize data from the 16th data release of the SDSS, primarily relying on the second iteration of APOGEE (APOGEE-2, hereafter referred to as APOGEE; Blanton et al. 2017; Majewski et al. 2017; Ahumada et al. 2020; Jönsson et al. 2020). APOGEE has produced H-band spectra of over 450,000 stars, allowing for the determination of stellar parameters and chemical abundances to trace the history and evolution of the Milky Way. APOGEE includes data taken from both hemispheres, using the Sloan Foundation 2.5 m telescope at the Apache Point Observatory and the Irénée du Pont telescope at Las Campanas Observatory (Bowen & Vaughan 1973; Gunn et al. 2006), with targeting procedures described in Zasowski et al. (2013, 2017). In particular, our study relies on data obtained in Contributed Programs with APOGEE-2S (F. Santana et al. 2020, in preparation) and in the Bright Time Extension of APOGEE-2N (R. Beaton et al. 2020, in preparation).

Stellar parameters and chemical abundances are determined by the APOGEE Stellar Parameter and Chemical Abundance Pipeline (ASPCAP; García Pérez et al. 2016; Holtzman et al. 2018; Jönsson et al. 2020), based on the FERRE code for spectroscopic analysis (Allende Prieto et al. 2006). The ASPCAP pipeline first determines fundamental stellar parameters over the entire spectral range. Then, adopting the best-fit parameters, it performs a sequential fit for each elemental abundance, using limited spectral windows matched to the element of interest. This procedure is repeated for each star, with a line list enabling the determination of abundances for up to 26 species (Shetrone et al. 2015; V. Smith et al. 2020, in preparation). The reliability and precision of the abundance determinations vary widely across stars and elements. We therefore make use of a limited sample in order to minimize the effects of systematic errors (see Sections 2.1.1 and 2.1.2). Detailed discussions of all elements included in ASPCAP DR16 and further information on the ASPCAP pipeline can be found in Jönsson et al. (2020).

2.1.1. Star Selection

We select solar analogs by roughly following the criteria used in Gonzalez et al. (2010) and Bedell et al. (2018). In particular, we include stars with a distance of < 350 pc, ∣[Fe/H]∣ < 0.1 dex, $\mathrm{log}g$ within 0.1 dex of solar, and T_eff within 195 K of solar. Distances are determined using parallax measurements from Gaia DR2 (Gaia Collaboration et al. 2018), while all other stellar parameters are from the calibrated values produced by the ASPCAP pipeline. The APOGEE-2 ASPCAP pipeline provides a series of bitmask flags to indicate stars with potential issues. Accordingly, we select stars with ASPCAPFLAG = 0, and make exclusive use of the X_FE tagged columns since these are only populated for the most reliable spectra. The final population of ∼1800 stars is illustrated by the orange points in Figure 1, where the dashed black lines depict solar values. Our selection criteria firmly place our sample on the main sequence and sufficiently far from the subgiant branch to ensure that we are sampling dwarf-type stars. Over this $\mathrm{log}g$ -T_eff range, a single calibration is used for the stellar parameters (Jönsson et al. 2020).
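The cuts above can be expressed as a simple boolean mask. The sketch below is illustrative only: the function name, the argument names, and the adopted solar reference values (T_eff = 5777 K and log g = 4.44 dex) are assumptions that are not specified in the text, and the additional requirement that the X_FE columns be populated is omitted.

```python
import numpy as np

# Assumed solar reference values (not given in the text).
TEFF_SUN = 5777.0   # K
LOGG_SUN = 4.44     # dex

def select_solar_analogs(dist_pc, fe_h, logg, teff, aspcapflag):
    """Boolean mask implementing the distance, [Fe/H], log g, T_eff,
    and ASPCAPFLAG cuts quoted in the selection criteria."""
    return (
        (dist_pc < 350.0)
        & (np.abs(fe_h) < 0.1)
        & (np.abs(logg - LOGG_SUN) < 0.1)
        & (np.abs(teff - TEFF_SUN) < 195.0)
        & (aspcapflag == 0)
    )

# Three hypothetical stars: a solar analog, a distant star, a flagged star.
mask = select_solar_analogs(
    dist_pc=np.array([100.0, 900.0, 50.0]),
    fe_h=np.array([0.02, 0.00, -0.05]),
    logg=np.array([4.40, 4.44, 4.50]),
    teff=np.array([5800.0, 5777.0, 5700.0]),
    aspcapflag=np.array([0, 0, 8]),
)
```

Only the first hypothetical star passes every cut; the second fails the distance limit and the third carries a nonzero flag.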

**Figure 1.** Top: Kiel diagram of the APOGEE sample. Orange points indicate the population of Sun-like stars used in our analysis of APOGEE data, as described in Section 2.1.1. Horizontal and vertical dashed lines are placed at solar values for reference. Bottom: color–magnitude diagram with parallax, color, and magnitude from Gaia DR2. Orange points correspond to the same set of stars in the top panel, while blue points represent all APOGEE-2 targeted stars within 350 pc. Horizontal and vertical lines correspond to solar values for M_G and BP − RP, respectively.

A small fraction of our selected stars scatter in the direction of high BP − RP and M_G (see orange points in Figure 1, bottom panel), likely a result of line-of-sight reddening, which has not been corrected for in this diagram.⁸ We find that removing these stars from our analysis has no significant impact on our results.

2.1.2. APOGEE Abundances and Element Selection

In order to maximize our ability to detect subtle elemental depletion patterns, we exclude the lower T_c (volatile) elements that are not expected to fall along the steepest portion of the depletion trend (Meléndez et al. 2009; Chambers 2010; Bedell et al. 2018). The distinction between refractory and volatile elements has been roughly defined at T_c ∼ 900 K, where elements with condensation temperatures greater than this threshold have been found to fall along the steepest portion of the abundance versus T_c trend (Meléndez et al. 2009; Ramírez & Asplund 2009; Bedell et al. 2018).

APOGEE provides chemical abundance measurements for up to 26 chemical species, spanning a wide range of condensation temperatures. Some species have significantly higher precision abundance determinations than others, in part due to the difficulty in measuring elemental compositions when there are few clean spectral features for a given species. To achieve the highest degree of precision possible in our study, we use a subset of high T_c elements with the most reliable abundance determinations. Based on the discussions presented in Jönsson et al. (2020), Si, Mg, Ni, Ca, and Al are deemed the five most precisely determined elemental abundances with high T_c, for dwarf stars in particular. We therefore place constraints on T_c trends with these elements using APOGEE data in Section 4.1.

Because we will compare our study to external results, it is important to understand how the APOGEE data are calibrated. APOGEE works with stars spanning a wide range of stellar parameters, and has thus tuned its synthetic stellar atmospheres to both the solar spectrum (a G dwarf) and to Arcturus (a K giant) (see, e.g., Shetrone et al. 2015; Smith et al. 2020, in preparation). Internal uncertainties are calibrated using serendipitous duplicate observations of the same stars that were reduced independently (Jönsson et al. 2020; Poovelil et al. 2020). Stellar parameters are calibrated against external estimates, with $\mathrm{log}g$ calibrated to seismic measurements and T_eff calibrated against photometric temperatures (for details see Jönsson et al. 2020). The APOGEE abundances are not adjusted after the calibration of the stellar parameters, but are instead calibrated by the determination of a zero-point using stars in the solar neighborhood (defined as those stars within 500 pc). This calibration sample uses stars of all stellar types and not just solar-type stars. Jönsson et al. (2020) provide the residuals for the solar spectrum reflected off of the asteroid Vesta as a means of assessing the offsets between the APOGEE calibration and calibrations directly to the solar spectrum.

The APOGEE calibration methods are motivated by numerous results in the literature suggesting that stars near solar abundance in the solar neighborhood have mean [X/Fe] = 0 (see Jönsson et al. 2020 and references therein). It should be emphasized that this calibration is simply a zero-point offset applied to each element, meaning that intrinsic scatter in the abundances is unaffected. Additionally, the size of the offset is small for the five elements selected in this work, with the largest zero-point shift of 0.043 dex for Al and Ni (see Jönsson et al. 2020, their Table 4). The result is that the APOGEE abundances are explicitly calibrated to the solar neighborhood, which makes the absolute abundance scale "solar-like."⁹

Uncertainties on the APOGEE abundances are also derived statistically, based on a sample of ∼15,000 stars with duplicate observations that were processed independently and are then separated into samples representative of ranges of stellar type by T_eff and $\mathrm{log}g$ . Lastly, in our study we use the values in the "named tags," and as described in Jönsson et al. (2020), these tags are only populated if the underlying spectral data and the APOGEE methods applied to those stellar types are trustworthy.

2.2. Data from Bedell et al. (2018) and Spina et al. (2018)

In addition to our analysis on APOGEE data, we utilize abundance measurements from Bedell et al. (2018) and Spina et al. (2018) for a sample of 79 Sun-like stars with similar T_eff, $\mathrm{log}g$ , and [Fe/H] to our APOGEE sample. These stars were observed with the high-resolution High Accuracy Radial velocity Planet Searcher spectrograph (Mayor et al. 2003), and abundances were obtained using the strictly differential line-by-line technique described in Bedell et al. (2014). This enables high signal-to-noise abundance determinations for more than 20 elements, allowing us to place constraints on subtle depletion trends in the data. For a more detailed description of these data, we refer readers to Bedell et al. (2018) and Spina et al. (2018).

3. Analysis

3.1. Modeling

We now develop a model for the stellar elemental abundance measurements discussed above. Our primary goal in modeling the abundances is to determine whether there is evidence for two populations of stars: one that is depleted in high T_c refractory elements relative to the more volatile elements, and another that is comparatively less deficient in high T_c elements relative to the volatiles.

Motivated both by simplicity and by several previous works (e.g., Meléndez et al. 2009; Ramírez & Asplund 2009; Bedell et al. 2018), we assume that the elemental abundance value for an individual star is a linear function of the element condensation temperature, T_c, over the narrow range of T_c values considered. This assumption is made for both the depleted and non-depleted populations. In other words, we assume

$\begin{eqnarray}&&{d}_{j}^{\mathrm{obs}}={d}_{j}^{\mathrm{true}}+{N}_{j}={{mT}}_{c,j}+b+{N}_{j},\end{eqnarray} \tag{ 1 }$

where ${d}_{j}^{\mathrm{obs}}$ is the abundance measurement for the jth element, ${d}_{j}^{\mathrm{true}}$ is the true abundance value for that element, N_j represents a noise term to be discussed later, and m and b are parameters of the model. The model in Equation (1) is likely to be a good approximation as long as the set of elements considered is strictly refractory and therefore does not span a very large range in T_c (Bedell et al. 2018; Oh et al. 2018; Hinkel & Unterborn 2018).

Since we are attempting to model the abundance distribution as the sum of two populations of stars, we adopt a mixture model for describing the measurements from all stars. We refer to the fraction of stars with depleted abundances (i.e., with lower abundances of refractory elements relative to the more volatile elements) as f, so the fraction of stars without depletion is 1 − f. The values of m and b for each star are assumed to be drawn from two distributions corresponding to the depleted and not-depleted model components. For simplicity, we assume that the distributions of m and b values for both components are Gaussians:

$\begin{eqnarray}&&{P}^{{\rm{D}}}(\{m,b\})={ \mathcal N }(\{m,b\}| \mu ={\mu }_{\{m,b\}}^{{\rm{D}}},\sigma ={\sigma }_{\{m,b\}}^{{\rm{D}}}),\end{eqnarray} \tag{ 2 }$

where D stands for depleted, and we use the notation {A, B} to represent a choice of either A or B. We note that the formalism for not-depleted (ND) stars appears exactly the same as for D stars, up to a superscript (D is replaced with ND for the not-depleted stars in Equation (2)).

Once m and b are drawn for a star, all that remains to connect the model to the data is a description of the noise term in Equation (1). We refer to the observed abundance measurement for the ith star and jth element as ${d}_{{ij}}^{\mathrm{obs}}$ (i.e., [X/Fe] measured in dex). We write the noise for the ith star and jth element as the sum of two noise terms:

$\begin{eqnarray}&&{N}_{{ij}}={N}_{\mathrm{intr},j}+{N}_{\mathrm{inst},{ij}},\end{eqnarray} \tag{ 3 }$

where ${N}_{\mathrm{intr},j}$ and N_inst,ij represent intrinsic scatter and instrumental scatter in the abundance measurements, respectively. The intrinsic scatter for each element is assumed to be constant across all stars, while the instrumental scatter varies element to element and star to star. For the intrinsic scatter, we treat the dispersion as a free parameter of the model denoted by ${\sigma }_{\mathrm{intr},j}$, which can differ between depleted and not-depleted stars (denoted by ${\sigma }_{\mathrm{intr},j}^{{\rm{D}}}$ and ${\sigma }_{\mathrm{intr},j}^{\mathrm{ND}}$, respectively). For the instrumental scatter, we simply adopt the uncertainties reported by APOGEE (or Bedell et al. 2018 in Section 4.2.2), σ_inst,ij.

A directed acyclic graph representing either the depleted or not-depleted model components of the hierarchical model is shown in Figure 2. This model allows us to write the likelihood for the data as

$\begin{eqnarray}&&\begin{array}{l}{P}^{{\rm{D}}}\left({d}_{{ij}}^{\mathrm{obs}}| {\theta }^{{\rm{D}}}\right)\,=\displaystyle \int d({d}_{{ij}}^{\mathrm{true}}){dm}\,{db}\,P\left({d}_{{ij}}^{\mathrm{obs}}| {d}_{{ij}}^{\mathrm{true}},{\sigma }_{\mathrm{inst},{ij}}\right)\,\times \,P\left({d}_{{ij}}^{\mathrm{true}}| m,b,{\sigma }_{\mathrm{intr},j}^{{\rm{D}}}\right){P}^{{\rm{D}}}(m| {\theta }^{{\rm{D}}}){P}^{{\rm{D}}}(b| {\theta }^{{\rm{D}}}),\end{array}\end{eqnarray} \tag{ 4 }$

where we have marginalized over the (unobservable) values of d^true, and θ^D represents parameters describing the distributions (i.e., ${\mu }_{\{m,b\}}^{{\rm{D}}}$ and ${\sigma }_{\{m,b\}}^{{\rm{D}}}$ ). We note again that Equation (4) is the same for not-depleted stars, up to the D superscript (D is replaced with ND for the not-depleted case).
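As a worked step, note that every factor in Equation (4) is Gaussian. Assuming that m and b are drawn independently (as Equation (2) implies) and that both noise terms have zero mean, the integral can be done analytically: the observed abundance is a sum of independent Gaussian variables, so the means add and the variances add. Under those assumptions, the marginal likelihood for a single star and element would reduce to

$\begin{eqnarray}&&{P}^{{\rm{D}}}\left({d}_{{ij}}^{\mathrm{obs}}| {\theta }^{{\rm{D}}}\right)={ \mathcal N }\left({d}_{{ij}}^{\mathrm{obs}}\,\Big| \,\mu ={\mu }_{m}^{{\rm{D}}}{T}_{c,j}+{\mu }_{b}^{{\rm{D}}},\ {\sigma }^{2}={\left({\sigma }_{m}^{{\rm{D}}}{T}_{c,j}\right)}^{2}+{\left({\sigma }_{b}^{{\rm{D}}}\right)}^{2}+{\left({\sigma }_{\mathrm{intr},j}^{{\rm{D}}}\right)}^{2}+{\sigma }_{\mathrm{inst},{ij}}^{2}\right).\end{eqnarray}$

This reduction is a sketch that follows from the form of Equation (4) as written; it is not a statement of how the integral was evaluated in practice. The same expression holds for the not-depleted component with D replaced by ND.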

**Figure 2.** Directed acyclic graph representing the hierarchical mixture model described in Section 3.1. Note that the architecture of the model is the same for both mixture components (D vs. ND), with the only difference being separate parameters for all μ_{m,b}, σ_{m,b}, and ${\sigma }_{\mathrm{intr},j}$.

Assuming the measurement of each element is statistically independent, we may write the likelihood for the ith star as a product over the element index j:

$\begin{eqnarray}\begin{array}{rcl}{{ \mathcal L }}_{i}({d}_{i}^{\mathrm{obs}}| \theta ) & = & f\displaystyle \prod _{j}{P}^{{\rm{D}}}({d}_{{ij}}^{\mathrm{obs}}| {\theta }^{{\rm{D}}})\\ & & +(1-f)\displaystyle \prod _{j}{P}^{\mathrm{ND}}({d}_{{ij}}^{\mathrm{obs}}| {\theta }^{\mathrm{ND}}),\end{array}\end{eqnarray} \tag{ 5 }$

where θ represents the complete set of model parameters, and ${d}_{i}^{\mathrm{obs}}$ represents the abundance data for the ith star. Next, by assuming the measurement of each star is statistically independent, we may write the likelihood for the data as a product over all stars:

$\begin{eqnarray}&&{ \mathcal L }\left(\{{d}_{{ij}}^{\mathrm{obs}}\}| \theta \right)=\displaystyle \prod _{i}{{ \mathcal L }}_{i}({d}_{i}^{\mathrm{obs}}| \theta ).\end{eqnarray} \tag{ 6 }$

We adopt uniform priors on the parameters in our model, so the posterior on these parameters is simply proportional to the likelihood in Equation (6).
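A minimal sketch of the mixture likelihood in Equations (5) and (6) is given below. It assumes that each per-element factor of Equation (4) is a Gaussian whose mean is μ_m T_c,j + μ_b and whose variance is the sum of the slope, intercept, intrinsic, and instrumental variances, which holds if m and b are independent Gaussians. The function and argument names are hypothetical, and the example inputs are invented.

```python
import numpy as np

def log_gauss(x, mean, var):
    """Log of a normal density with the given mean and variance."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def component_loglike(d_obs, sig_inst, t_c, mu_m, mu_b, sig_m, sig_b, sig_intr):
    """Sum over elements j of log P(d_ij | theta) for one component.

    d_obs, sig_inst: arrays of shape (n_stars, n_elements).
    t_c, sig_intr:   arrays of shape (n_elements,).
    Uses the closed-form Gaussian marginal (an assumption of this sketch).
    """
    mean = mu_m * t_c + mu_b
    var = (sig_m * t_c) ** 2 + sig_b**2 + sig_intr**2 + sig_inst**2
    return log_gauss(d_obs, mean, var).sum(axis=1)

def log_likelihood(d_obs, sig_inst, t_c, f, theta_d, theta_nd):
    """Log of Equation (6): sum over stars of the log of Equation (5)."""
    ll_d = component_loglike(d_obs, sig_inst, t_c, **theta_d)
    ll_nd = component_loglike(d_obs, sig_inst, t_c, **theta_nd)
    with np.errstate(divide="ignore"):   # allow f = 0 or f = 1 exactly
        per_star = np.logaddexp(np.log(f) + ll_d, np.log1p(-f) + ll_nd)
    return per_star.sum()

# Invented inputs: 3 stars, 5 elements, identical D and ND components.
t_c = np.array([1310.0, 1336.0, 1353.0, 1517.0, 1653.0])
d_obs = np.zeros((3, 5))
sig_inst = np.full((3, 5), 0.03)
theta = dict(mu_m=0.0, mu_b=0.0, sig_m=1e-5, sig_b=0.01,
             sig_intr=np.full(5, 0.02))
ll_mix = log_likelihood(d_obs, sig_inst, t_c, 0.5, theta, theta)
ll_one = component_loglike(d_obs, sig_inst, t_c, **theta).sum()
```

Working in log space with `np.logaddexp` avoids underflow when the product over elements becomes very small. When the two components share identical parameters, the mixture likelihood equals the single-component likelihood for any f, which gives a quick consistency check.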

3.2. Outlier Rejection

We assume that the distributions of m and b are Gaussian. In order to ensure that the model fit is not being driven by departures from our Gaussian assumption, particularly in the tails of the abundance distributions, we employ outlier rejection. In particular, we calculate the standard deviation of the abundance measurements for each element, $\sigma \left([{\rm{X}}/\mathrm{Fe}]\right)$ , after the selections described in Sections 2.1.1 and 2.1.2. We adopt a fiducial outlier rejection of 2σ, such that any star with $| [{\rm{X}}/\mathrm{Fe}]-\langle [{\rm{X}}/\mathrm{Fe}]\rangle | \gt 2\sigma \left([{\rm{X}}/\mathrm{Fe}]\right)$ is removed from the analysis. Since the expected depletion trend is of the order ∼0.05 dex, the signal of interest should not be driven by the tails of abundance distributions regardless. Outlier rejection removes roughly 200 stars from our Sun-like sample, with 1556 remaining. We note that on simulated data, we recover the correct input parameters with 2σ outlier rejection.
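The rejection step can be sketched as a single-pass clip. The code below assumes that a star is removed if any one of its elements falls outside the threshold and that the clip is not iterated; neither detail is stated explicitly in the text, and the function name is hypothetical.

```python
import numpy as np

def sigma_clip_stars(x_fe, n_sigma=2.0):
    """Return a boolean mask of stars kept after a single-pass clip.

    x_fe: array of shape (n_stars, n_elements) holding [X/Fe] in dex.
    A star is dropped if any of its elements lies more than n_sigma
    standard deviations from that element's sample mean.
    """
    mean = x_fe.mean(axis=0)
    std = x_fe.std(axis=0)
    return np.all(np.abs(x_fe - mean) <= n_sigma * std, axis=1)

# Invented data: ten stars, two elements, one discrepant measurement.
x_fe = np.zeros((10, 2))
x_fe[0, 1] = 1.0
keep = sigma_clip_stars(x_fe)
```

In this toy case only the first star is rejected, because its second element lies 3σ from the sample mean.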

We explore variations on the 2σ outlier rejection in Section 3.3 for simulated data and in Appendix D.1 for actual APOGEE data. In both cases, we find that model constraints are robust to various rejection choices overall.

3.3. Analysis of Simulated Data

We test our analysis pipelines and evaluate the extent to which we are able to recover input parameters with realistic levels of noise using simulated data. We generate mock data by drawing from the distributions described in Section 3.1, using the real APOGEE uncertainty estimates. Mock abundances are simulated for the five elements used in the data analysis. We assume ∼2000 stars, meant to roughly match the APOGEE sample discussed in Section 2.1.1.

The true parameter values describing the distributions of m and b are chosen based on the distribution of abundance versus T_c slopes provided by Bedell et al. (2018) for a Sun-like sample. The intrinsic dispersion for each mock element (given by ${\sigma }_{\mathrm{intr},j}$) is chosen to fall between 0.02 and 0.05 dex for both depleted and not-depleted stars, since this is roughly the range of standard deviations for a typical chemical abundance distribution from our APOGEE solar analog sample. Our fiducial parameter choice is f = 0.5, based on the large population of stars near solar abundances in Ramírez & Asplund (2009) and Bedell et al. (2018).

The results on the simulated data are presented in Appendix B. We find that for five elements with uncertainties sampled directly from APOGEE data, all input parameters are recovered without bias. Additionally, when simulating data with only a single population of stars and APOGEE abundance uncertainties, the analysis correctly recovers f = 0.
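The mock-data generation described in this section can be sketched as below. All parameter values are invented for illustration and are not the inputs used in Appendix B; the function name is hypothetical, and a constant instrumental uncertainty stands in for the real APOGEE uncertainty estimates.

```python
import numpy as np

def simulate_abundances(n_stars, t_c, f, theta_d, theta_nd, sig_inst, rng):
    """Draw mock [X/Fe] values from the two-component model.

    theta_* are dicts with keys mu_m, sig_m, mu_b, sig_b, sig_intr.
    sig_inst has shape (n_stars, n_elements).
    Returns (d_obs, is_depleted).
    """
    is_depleted = rng.random(n_stars) < f
    d_obs = np.empty((n_stars, t_c.size))
    for mask, th in ((is_depleted, theta_d), (~is_depleted, theta_nd)):
        n = int(mask.sum())
        # One slope and one intercept per star, shared across elements.
        m = rng.normal(th["mu_m"], th["sig_m"], size=(n, 1))
        b = rng.normal(th["mu_b"], th["sig_b"], size=(n, 1))
        intr = rng.normal(0.0, th["sig_intr"], size=(n, t_c.size))
        d_obs[mask] = m * t_c + b + intr
    d_obs += rng.normal(0.0, sig_inst)   # instrumental scatter
    return d_obs, is_depleted

rng = np.random.default_rng(0)
t_c = np.array([1310.0, 1336.0, 1353.0, 1517.0, 1653.0])  # approximate, K
theta_d = dict(mu_m=0.0, sig_m=5e-5, mu_b=0.0, sig_b=0.02,
               sig_intr=np.full(5, 0.03))
theta_nd = dict(mu_m=3e-4, sig_m=5e-5, mu_b=-0.4, sig_b=0.02,
                sig_intr=np.full(5, 0.03))
sig_inst = np.full((2000, 5), 0.03)
d_obs, is_depleted = simulate_abundances(
    2000, t_c, 0.5, theta_d, theta_nd, sig_inst, rng)
```

Fitting the model to `d_obs` and comparing the recovered parameters with the inputs is the kind of recovery test this section describes.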

We test our sensitivity to the assumptions of Gaussian distributions in Appendix B.1, and find our analysis to be generally robust to small departures from Gaussianity.

4. Results

4.1. Abundances from APOGEE

We now present the results of applying the statistical methodology developed in Section 3 to APOGEE data for the selection of stars described in Section 2, using the five elements Si, Mg, Ni, Ca, and Al.

In Figure 3, we illustrate our constraints on the elemental abundance of solar analogs as a function of element condensation temperature, T_c. Because APOGEE calibrates to solar neighborhood stars (see Section 2.1), we also plot abundances from the APOGEE solar reference spectrum (purple), which is measured in reflected sunlight from the asteroid Vesta (Jönsson et al. 2020). A linear fit to these points is illustrated by the purple line. We note that the solar reference spectrum is roughly consistent with an abundance versus T_c slope and intercept of zero (horizontal dashed line in Figure 3), indicating that the APOGEE solar neighborhood calibration is not unusually different from calibrations relative to the Sun.¹⁰ In Section 5.3, we discuss the potential impact of the APOGEE ASPCAP calibrations on our results and comparisons to other studies.

**Figure 3.** Model constraints (68% CL) from APOGEE data for the elements Si, Mg, Ni, Ca, and Al. The [X/Fe]–T_c trend is illustrated by plotting a realization of the model at each step in the MCMC parameter chain. Blue and red correspond to not-depleted and depleted, respectively. Dashed vertical lines are placed at the condensation temperature of each element, while the colored dashed lines depict the best-fit mean trend for D and ND stars (i.e., μ_mT_c + μ_b). The APOGEE solar reference spectrum, measured in reflected sunlight from the asteroid Vesta, is shown in purple.

We find the mean slope of the abundance versus T_c relation for not-depleted (ND) and depleted (D) stars to be significantly different, with

$\begin{eqnarray}&&{\mu }_{m}=\left\{\begin{array}{l}3.47{\pm }_{0.358}^{0.269}\times {10}^{-4}\,\mathrm{dex}\,{{\rm{K}}}^{-1}\,\mathrm{for}\,\mathrm{ND},\\ 1.65{\pm }_{0.307}^{0.448}\times {10}^{-5}\,\mathrm{dex}\,{{\rm{K}}}^{-1}\,\mathrm{for}\,{\rm{D}}\end{array}\right.\end{eqnarray} \tag{ 7 }$

at 68% confidence. We find that the D band in Figure 3 is not in fact below the ND band for the entire range of T_c. While we are primarily interested in the slope of the abundance versus T_c relation (from which the ND population is refractory enriched relative to the more volatile elements), the absolute abundances of both populations are potentially complicated by ASPCAP calibrations, as discussed in Sections 2.1 and 5. Our constraints place an abundance versus T_c slope of zero (i.e., the Sun's pattern) at roughly 1.2σ from the D mean, toward the more refractory-depleted tail of the slope distribution. We note that due to degeneracies between μ_m and μ_b, the marginalized uncertainty on μ_m does not capture the full picture. Instead, it is more instructive to evaluate the ND and D abundances as a function of T_c, as shown in Figure 3. Additionally, in Appendix E we provide contour plots of the model posterior.

We find evidence of two populations of stars: one with near zero (solar) abundance versus T_c slopes (D), and another with significantly steeper slopes (ND) indicating a higher ratio of refractory-to-volatile (i.e., high T_c to low T_c) abundances compared to the Sun. In Figure 4, we plot constraints on the fraction (f) of stars drawn from the depleted model component. When incorporating only statistical uncertainty, we find $f=0.84{\pm }_{0.03}^{0.01}$ at 68% confidence. In order to evaluate the robustness of our fit and provide a systematic error bar on f, we repeat the analysis multiple times, removing each element in turn. The labels on the y-axis of Figure 4 specify the element removed for each constraint, with "None" corresponding to the case of no elements removed (i.e., Si, Mg, Ni, Ca, and Al all included). In order to derive a systematic error bar from the six constraints illustrated in Figure 4, we add the probability density functions on f from the Markov chain Monte Carlo (MCMC) parameter chains across all six trials, and determine the 68% confidence interval on the resulting distribution. This assumes that the true result could be any of the six trials illustrated in Figure 4. Our constraint on the fraction of stars drawn from the depleted distribution is then

$\begin{eqnarray}&&f=0.84{\pm }_{0.17}^{0.05}\end{eqnarray} \tag{ 8 }$

at 68% confidence.
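The combination of trials described above can be sketched by pooling posterior samples with equal weight per trial, which is equivalent to adding normalized probability density functions. The sketch assumes a central (equal-tailed) 68% interval; the interval definition actually used is not stated, the function name is hypothetical, and the sample values are invented.

```python
import numpy as np

def pooled_interval(chains, level=0.68):
    """Pool posterior samples of f from several runs with equal weight.

    Each chain is truncated to the shortest length so that every trial
    contributes equally. Returns (median, lower, upper), where the
    bounds enclose the central `level` fraction of pooled samples.
    """
    n = min(len(c) for c in chains)
    pooled = np.concatenate([np.asarray(c)[:n] for c in chains])
    lo, med, hi = np.quantile(
        pooled, [0.5 - level / 2, 0.5, 0.5 + level / 2])
    return med, lo, hi

# Six invented "chains", one per trial, each concentrated at one value.
chains = [np.full(100, v) for v in (0.70, 0.80, 0.84, 0.85, 0.86, 0.88)]
med, lo, hi = pooled_interval(chains)
```

Because the pooled distribution inherits the spread between trials, the resulting interval is wider than that of any single chain, which is how the element-removal tests translate into a systematic error bar.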

**Figure 4.** Constraints on the fraction (f) of stars in the depleted population. In order to evaluate the robustness of our constraint on f, we remove each element in turn and repeat the analysis. The labels on the y-axis specify which element is removed for each constraint, with "None" corresponding to the case for which none of the five elements are removed. The red band corresponds to our estimate of the combined statistical and systematic error (68% CL).

The magnitudes of the mean abundance versus T_c slopes are generally within the expected range of previous observations, with most studies reporting [X/Fe] versus T_c slopes between ∼−1 × 10⁻⁴ and ∼3 × 10⁻⁴ dex K⁻¹ for a generally wider range of condensation temperatures (Meléndez et al. 2009; Ramírez & Asplund 2009; Adibekyan et al. 2014; Nissen 2015; Bedell et al. 2018). A rough comparison of the mean abundance versus T_c slope constraints in Equation (7) can be made with Ramírez & Asplund (2009), who applied a Kolmogorov–Smirnov test to the abundance versus T_c slopes of ∼60 super-solar metallicity stars, finding evidence of a bimodality at 2.7σ. Their test assumes two Gaussian distributions centered on m = −0.5 × 10⁻⁴ and 0.9 × 10⁻⁴ dex K⁻¹, respectively. We perform a more detailed comparison with previous works in Section 4.2.

We emphasize that the parameter f describes the fraction of stars drawn from the depleted population. Several previous authors have instead considered the fraction of stars that have a lower refractory-to-volatile element ratio compared to the Sun (i.e., stars with [X/Fe] versus T_c slopes ≤ 0), finding that ∼20% of solar analogs are more depleted than the Sun (Ramírez & Asplund 2009; Gonzalez et al. 2010; Schuler et al. 2011b; Bedell et al. 2018). Our constraints on the depleted fraction f suggest that the majority of solar analogs have an abundance versus T_c relation similar to the Sun. The two findings are consistent: in particular, by simulating data with underlying parameters governed by our model constraints, we find that less than 15% of the full sample of stars have abundance versus T_c trends as refractory depleted as the Sun. While previous studies have interpreted this result as the Sun being unusual in its elemental composition relative to the majority of solar analogs (e.g., Meléndez et al. 2009), our constraints place the Sun close to the middle of a distribution of stars with similar abundance versus T_c relations (the depleted population). We defer a more detailed discussion of our interpretation of these results to Section 5.

We perform a detailed analysis of the goodness of fit of our model in Appendix C, where we employ posterior predictive methods. We find that for the set of elements presented in this section, the model provides a good fit to the data. We also note that the bimodal model (i.e., f ≠ 0) is preferred at high significance on APOGEE data over the model with f = 0 and all other parameters free.

4.2. Comparison to Bedell et al. (2018)

We now compare our results derived from APOGEE data to high precision measurements from Bedell et al. (2018).

4.2.1. Distribution of Abundance versus T_c Slopes

We generate elemental abundance data with underlying parameters governed by our model constraints on APOGEE in Section 4.1. Since we wish to compare the resulting abundance versus T_c trends to the literature (i.e., Bedell et al. 2018), we adopt a scatter of ∼0.02 dex on all simulated abundances, propagated through to the distribution of abundance versus T_c slopes. This scatter is meant to emulate the uncertainty in abundance determinations from generally higher precision surveys, such as Meléndez et al. (2009) and Bedell et al. (2018), thereby enabling a more direct comparison.

The resulting distribution of abundance versus T_c slopes is illustrated by the blue band in Figure 5, where we also include the distribution of T_c > 1300 K slopes from Bedell et al. (2018) for a smaller sample of 79 Sun-like stars with similar T_eff, $\mathrm{log}\,g$, and metallicity to our APOGEE sample (green). Note that the Bedell et al. (2018) distribution is derived by performing a linear fit to the abundance data for each star, as is common in the literature. We limit the Bedell et al. (2018) abundance data to the 21 elements with T_c > 1300 K, since this enables a more direct comparison to our constraints on APOGEE (with ${T}_{c,\min }\sim 1300\,{\rm{K}}$). Additionally, Bedell et al. (2018) found that including moderately volatile elements with T_c between 900 and ∼1300 K can bias the abundance versus T_c slope.

**Figure 5.** Distribution of [X/Fe] vs. T_c slopes from model constraints on APOGEE data (blue). The width of the blue band as a function of [X/Fe] vs. T_c slope is derived from the 68% confidence interval on the simulated APOGEE data, which has an added abundance uncertainty of ∼0.02 dex meant to emulate errors in other works (see Section 4.2). Blue and red vertical dashed lines are placed at the mean slope values (μ_m) for ND and D stars, respectively. The vertical dashed black line is placed at zero, representing the solar abundance trend. We also include the distribution of slopes from Bedell et al. (2018) in green, derived from a linear fit to the abundance data (vs. T_c) for a sample of 79 stars and 21 high T_c elements. Both distributions (APOGEE and Bedell et al. 2018) are normalized to have equal areas, such that the y-axis can be interpreted as a probability density.

We find good agreement between constraints from our APOGEE sample discussed in Section 4.1 and results from Bedell et al. (2018), given that the Bedell et al. (2018) data includes generally higher precision abundance measurements of many more elements determined from far fewer stars. Furthermore, we find that when adopting a minimal abundance uncertainty of ∼0.02 dex and simulating data with model parameters governed by our APOGEE constraints, the fraction of stars with [X/Fe] versus T_c slopes less than zero (i.e., stars which are more refractory depleted than the Sun relative to the lower T_c elements) is roughly 30%. Note that this quantity can be roughly interpreted as the integral under the blue band in Figure 5, to the left of the black vertical dashed line at solar abundances. This is somewhat higher than the fraction of stars with negative [X/Fe] versus T_c slopes determined by Bedell et al. (2018), namely 19% for abundances measured relative to hydrogen.
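The simulate-and-refit procedure described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in this work. The mixture means and fraction are borrowed from Equations (9) and (10) purely for concreteness, while the intrinsic slope scatter (`sigma_m`), the zero intercepts, the function name `simulate_slopes`, and the approximate condensation temperatures in `T_C` are all assumptions made for the example.

```python
import numpy as np

# Approximate 50% condensation temperatures (K) for Si, Mg, Ni, Ca, Al.
# Illustrative values only; see Lodders (2003) for the adopted ones.
T_C = np.array([1310.0, 1336.0, 1353.0, 1517.0, 1653.0])


def simulate_slopes(n_stars, f, mu_d, mu_nd, sigma_m, noise, rng):
    """Draw true slopes from a two-Gaussian mixture, add abundance noise,
    and refit a straight line to each simulated star."""
    is_depleted = rng.random(n_stars) < f
    true_slopes = np.where(
        is_depleted,
        rng.normal(mu_d, sigma_m, n_stars),
        rng.normal(mu_nd, sigma_m, n_stars),
    )
    # Center T_c so the (assumed zero) intercept is defined at the mean T_c.
    x = T_C - T_C.mean()
    abundances = true_slopes[:, None] * x[None, :]
    abundances += rng.normal(0.0, noise, abundances.shape)
    # Least-squares slope per star; this closed form holds because sum(x) = 0.
    return abundances @ x / np.sum(x**2)


rng = np.random.default_rng(0)
slopes = simulate_slopes(
    n_stars=20000,
    f=0.56,          # Equation (10)
    mu_d=2.92e-5,    # Equation (9), D component (dex/K)
    mu_nd=1.55e-4,   # Equation (9), ND component (dex/K)
    sigma_m=5e-5,    # assumed intrinsic scatter (dex/K), not from the text
    noise=0.02,      # abundance scatter in dex, as adopted in Section 4.2.1
    rng=rng,
)
frac_negative = np.mean(slopes < 0.0)
print(f"fraction of simulated stars with slope < 0: {frac_negative:.2f}")
```

The fraction printed here depends strongly on the assumed intrinsic scatter and on the adopted abundance noise, which is why the comparison in the text is made at a fixed scatter of ∼0.02 dex.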

Other studies report a more frequent occurrence of refractory depletions relative to the more volatile elements for various samples, thereby suggesting that the Sun's depletion pattern is not entirely uncommon. Ramírez & Asplund (2009), for instance, found that the fraction of stars with refractory-to-volatile element ratios as depleted as the Sun is in the range of ∼15% for sub-solar metallicities to ≳50% for super-solar metallicities. Additionally, Gonzalez et al. (2010) found a significantly larger population of stars with such depletions (∼50%) compared to Bedell et al. (2018), though their sample extends beyond solar analogs and specifically targets stars with planets regardless of stellar type. Evidently, larger sample sizes with high precision abundance measurements will be helpful to place more stringent constraints on the rarity of the solar depletion trend across various samples of stars.

4.2.2. Model Constraints on Data from Bedell et al. (2018)

In this section, we present results from our model using high precision elemental abundance measurements of 79 solar analogs from Bedell et al. (2018) and Spina et al. (2018) described in Section 2.2. The sample includes a population similar to the APOGEE stars described in Section 2.1.1, with effective temperatures (T_eff) generally within 100 K of solar, surface gravities ($\mathrm{log}\,g$) within 0.1 dex of solar, and metallicities (taken as [Fe/H]) within 0.1 dex of solar. While there are far fewer stars compared to our APOGEE sample, the number of available elements is significantly greater and the signal to noise for each abundance measurement generally higher.

In order to allow for a more direct comparison to our constraints on APOGEE data in Section 4.1, we use abundance measurements from Bedell et al. (2018) relative to iron. Additionally, we include the 21 elements with T_c > 1300 K to more closely match the range of condensation temperatures used on APOGEE data. This also ensures the exclusion of moderately volatile elements (i.e., 900 K ≲ T_c ≲ 1300 K), which in many cases Bedell et al. (2018) found to bias the refractory abundance versus T_c slope.

We choose not to employ outlier rejection in this section given the small sample size relative to wider surveys such as APOGEE. We believe this to be well motivated, since the line-by-line and strictly differential abundances from Bedell et al. (2018) are less likely to have significant sources of systematic error within their sample (Bedell et al. 2014). However, this does not ensure that systematics between abundance data from APOGEE and Bedell et al. (2018) are necessarily reduced, since the use of varying stellar models, line lists, and generally different methodologies can complicate a straightforward comparison between the two studies.

Our primary results for the 79 solar analogs and 21 elements are shown in Figure 6, where the abundance pattern is plotted in blue for ND stars and red for D stars. As before, vertical dashed lines are placed at the condensation temperature of each element. We find the mean slopes for ND and D stars to be significantly different, with

$\begin{eqnarray}&&{\mu }_{m}=\left\{\begin{array}{l}{1.55}_{-0.20}^{+0.33}\times {10}^{-4}\,\mathrm{dex}\,{{\rm{K}}}^{-1}\,\mathrm{for}\,\mathrm{ND},\\ {2.92}_{-0.74}^{+0.72}\times {10}^{-5}\,\mathrm{dex}\,{{\rm{K}}}^{-1}\,\mathrm{for}\,{\rm{D}}\end{array}\right.\end{eqnarray} \tag{ 9 }$

at 68% confidence. Our constraint on the fraction (f) of stars drawn from the depleted slope distribution for this sample is

$\begin{eqnarray}&&f=0.56\pm 0.06,\end{eqnarray} \tag{ 10 }$

also at 68% confidence. A systematic error bar is not provided on this constraint as was done in Section 4.1, since the Bedell et al. (2018) sample includes many more elements and constraints are found to be generally robust to small variations on which elements are included.

**Figure 6.** Constraints on the depletion pattern for high precision abundance data from Bedell et al. (2018) for a selection of 21 elements with T_c > 1300 K. Dashed vertical lines are placed at the condensation temperature of each element, while the colored dashed lines depict the best-fit mean depletion trend for D and ND (i.e., μ_mT_c + μ_b).

The mean slopes are again in general agreement with measurements of individual stars (e.g., Meléndez et al. 2009; Ramírez & Asplund 2009; Maldonado et al. 2015; Bedell et al. 2018), with a larger scatter compared to our constraints on APOGEE data in Section 4.1, which is due to a combination of both higher statistical uncertainty and a preference for larger intrinsic scatter. In addition, we provide contour plots of the model posterior for the main parameters of interest in Appendix E.

Differences in the intrinsic scatter on the slopes (σ_m) and intercepts (σ_b) between our APOGEE constraints and the Bedell et al. (2018) constraints are likely driven by the inclusion of a significantly larger number of elements in the Bedell et al. (2018) sample. Furthermore, inaccuracies in the reported elemental abundance uncertainties can have the propagated effect of inflating or deflating the model fit for σ_m and σ_b, since these parameters are meant to encode intrinsic scatter rather than measurement uncertainties. Small differences in intercepts and the absolute slopes are also to be expected, since APOGEE calibrates to a "solar-like" scale whereas the Bedell et al. (2018) measurements are conducted line by line, strictly differential to the Sun. Nonetheless, we find similar abundance versus T_c trends for both the D and ND model components between the two data sets.

A unique feature of our analysis compared to previous studies is our ability to place a constraint on the fraction (f) of stars drawn from the depleted slope distribution. For the APOGEE and Bedell et al. (2018) samples, our constraints on this fraction are expressed in Equations (8) and (10), respectively. We find that when incorporating the systematic error bar provided on the APOGEE result in Equation (8), the Bedell et al. (2018) constraint is consistent to within ∼1σ.

As we find on APOGEE data, while the fraction of stars drawn from the depleted distribution is significant (∼50%), the number of stars with depletion trends as extreme (i.e., as refractory depleted relative to the volatiles) as the Sun is comparatively smaller (∼20%). Perhaps unsurprisingly, this is consistent with Bedell et al. (2018), who found that 19% of stars have refractory-to-volatile ratios as refractory depleted as the Sun using abundances measured relative to hydrogen.

4.3. Solar Depletion Trend and the Average Solar Analog

Several studies have demonstrated that the Sun appears depleted in refractory elements relative to the majority of nearby solar analogs, with these results being extended to other planet host stars in some cases (Meléndez et al. 2009; Ramírez & Asplund 2009; Gonzalez et al. 2010; Ramírez et al. 2010; Schuler et al. 2011b; Liu et al. 2016, 2020; Bedell et al. 2018). In this section, we closely follow the methodology of Bedell et al. (2018) and calculate the abundance trend of the "average" solar analog compared to the Sun using our constraints on all model parameters derived from our APOGEE sample. In particular, the sample average is taken in linear space with the number density of atoms, so that the mean abundance ratio of element X to iron for a population of N stars is

$\begin{eqnarray}&&{\left\langle \left[\displaystyle \frac{{\rm{X}}}{\mathrm{Fe}}\right]\right\rangle }_{\mathrm{Stars}}={\mathrm{log}}_{10}\left(\displaystyle \frac{1}{N}\displaystyle \sum _{n=1}^{N}{10}^{{\left[\tfrac{{\rm{X}}}{\mathrm{Fe}}\right]}_{n}}\right).\end{eqnarray} \tag{ 11 }$

In order to derive the average abundance ratios from our results on APOGEE data, we simulate data with underlying parameters governed by our model constraints, computing Equation (11) at the relevant condensation temperatures corresponding to each element. The resulting average abundance trend is illustrated in Figure 7. The width of the blue error band is derived from the 68% CL on the average abundance as a function of T_c. For comparison, we also include linear fits to the average elemental abundance measurements from Bedell et al. (2018) for a sample of 79 Sun-like stars in red. The green line depicts a linear fit to the same data, but limited to the five elements Si, Mg, Ni, Ca, and Al (i.e., those used on APOGEE data). We include two y-axes in Figure 7 to emphasize that the abundances from Bedell et al. (2018) and APOGEE are measured on slightly different scales. In particular, abundances from Bedell et al. (2018) are measured relative to the Sun ([X/Fe]_Sun), while abundances from APOGEE are measured relative to the solar neighborhood ([X/Fe]_S.N.). Consequently, the offset in Figure 7 between our APOGEE results and those from Bedell et al. (2018) is likely driven by differences in the abundance scales.
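As a small worked illustration of the linear-space average in Equation (11), the sketch below computes the mean abundance ratio for an array of [X/Fe] values in dex. The function name `mean_abundance_linear` and the toy input are assumptions introduced for the example only.

```python
import numpy as np


def mean_abundance_linear(x_fe):
    """Average [X/Fe] over stars in linear (number-ratio) space.

    x_fe : array of shape (n_stars,) or (n_stars, n_elements), in dex.
    Returns log10 of the mean of 10**[X/Fe] taken along the star axis.
    """
    x_fe = np.asarray(x_fe, dtype=float)
    return np.log10(np.mean(10.0 ** x_fe, axis=0))


# Toy sample of three stars, symmetric about zero in dex.
sample = np.array([0.0, 0.1, -0.1])
linear_mean = mean_abundance_linear(sample)
print(linear_mean, sample.mean())
```

Because the average is taken before the logarithm, the result for this symmetric toy sample lies slightly above the plain mean of the dex values (which is zero), so stars with enhanced abundances carry somewhat more weight than in a simple average of [X/Fe].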

**Figure 7.** The abundance pattern of the average solar analog compared to the Sun, using our constraints from APOGEE data (purple). Red and green points depict high precision abundance determinations of 79 solar analogs from Bedell et al. (2018). We fit linear least squares trends to all 25 elements plotted (red) and the five elements we study on APOGEE data (green). We reproduce the finding that the Sun is depleted in refractory elements relative to the more volatile elements.

We find our constraints are consistent with the Bedell et al. (2018) measurements up to an offset term. Furthermore, we recover the previously observed depletion trend, with the Sun appearing increasingly deficient in the high T_c refractory elements relative to the lower T_c, more volatile elements.

We note that the result presented in this section is indeed consistent with our discussion in Section 4.1, where we find that the Sun itself belongs to a majority distribution of stars with similar abundance versus T_c trends. The existence of a second population of stars (the ND component), with abundance trends that are refractory enhanced relative to the more volatile elements, drives the elemental abundances of the average solar analog to higher values as a function of T_c. This leads to the upward sloping result in Figure 7. Consequently, our constraints are consistent with the claim that the Sun appears increasingly deficient in higher T_c elements relative to the average solar analog, though our interpretation of this result differs. We discuss our interpretation of these results more thoroughly in Section 5.

5. Discussion

5.1. Summary

We have developed a novel statistical approach to probe stellar elemental abundances as a function of element condensation temperature, T_c. Using a two component mixture model, we fit the abundance measurements of roughly 1700 stars from APOGEE. By combining constraints from many stars, we find evidence for two populations of stars: one population has an abundance versus T_c relation roughly in line with the Sun, while the other population has an enhanced abundance of refractory elements with high T_c. We refer to these groups as the depleted and not-depleted populations, respectively. The key advantage of our approach is that it can be applied to data sets for which the individual abundance measurements may have low signal to noise. By combining constraints from multiple stars at the likelihood level, we can obtain a high significance constraint from a large sample of individually low signal-to-noise measurements.
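The idea of combining constraints from many stars at the likelihood level can be illustrated with a deliberately simplified sketch. It is not the model of Section 3.1, which also describes the intercepts. Here each star is reduced to a single measured slope with a known uncertainty, and the population is a two-Gaussian mixture in slope only. The function names, the toy parameter values, and the choice of units (slopes in 10⁻⁴ dex K⁻¹) are assumptions made for the example.

```python
import numpy as np


def log_normal_pdf(x, mean, var):
    """Log of a Gaussian density with the given mean and variance."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


def mixture_log_likelihood(params, slopes, slope_err):
    """Log-likelihood of per-star slopes under a two-Gaussian mixture.

    params = (f, mu_d, sigma_d, mu_nd, sigma_nd), where f is the fraction
    of stars drawn from the depleted (D) component.
    """
    f, mu_d, sigma_d, mu_nd, sigma_nd = params
    if not (0.0 < f < 1.0) or sigma_d <= 0.0 or sigma_nd <= 0.0:
        return -np.inf
    # Measurement variance adds to the intrinsic variance of each component.
    log_d = np.log(f) + log_normal_pdf(slopes, mu_d, sigma_d**2 + slope_err**2)
    log_nd = np.log1p(-f) + log_normal_pdf(
        slopes, mu_nd, sigma_nd**2 + slope_err**2
    )
    # Sum over stars of log(f * N_D + (1 - f) * N_ND), evaluated stably.
    return np.sum(np.logaddexp(log_d, log_nd))


# Toy data: 80% of stars in the D component (assumed values throughout).
rng = np.random.default_rng(1)
n = 2000
in_d = rng.random(n) < 0.8
err = np.full(n, 0.2)
true_slopes = np.where(in_d, rng.normal(0.0, 0.3, n), rng.normal(2.0, 0.3, n))
observed = true_slopes + rng.normal(0.0, err)

ll_true = mixture_log_likelihood((0.8, 0.0, 0.3, 2.0, 0.3), observed, err)
ll_swapped = mixture_log_likelihood((0.2, 0.0, 0.3, 2.0, 0.3), observed, err)
print(ll_true > ll_swapped)
```

Each star contributes one term to the sum, so individually weak measurements accumulate into a sharp constraint on the population parameters as the sample grows. In practice the log-likelihood would be passed to an optimizer or a posterior sampler rather than evaluated at two hand-picked points.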

We find that the pattern of refractory element abundances seen in the Sun is not uncommon. Indeed, the fraction of depleted stars (of which the Sun is a member) is found to be $f={84}_{-17}^{+5} \% $ (including both statistical and systematic error) in the APOGEE sample. The distribution of abundance versus T_c slopes inferred from our analysis is shown with the blue band in Figure 5.

These results can be compared to a recent analysis of similar abundance versus T_c trends by Bedell et al. (2018). In contrast to our analysis, Bedell et al. (2018) used a much smaller sample of stars (79 solar analogs), with more accurately measured abundances, and for a wider range of elements. Our results appear to be in qualitative agreement with those of Bedell et al. (2018), as seen in Figures 5 and 7. When we apply our analysis methodology directly to the Bedell et al. (2018) measurements, we also find fairly consistent results, as discussed in Section 4.2.2.

However, our interpretation of these results differs somewhat from that of Bedell et al. (2018). Bedell et al. (2018) noted that the Sun shows a greater depletion of refractory elements compared to roughly 80% of stars. We find a similar result in our analysis; this fraction is roughly equivalent to the integral under the blue band in Figure 5 below the black dashed line. However, our interpretation is not that the Sun is unusually depleted in refractory elements, but rather that it sits close to the middle of a distribution of similar, refractory-depleted stars that make up a large portion of all stars.

5.2. Implications for Abundance of Small Planets

Meléndez et al. (2009) were the first to claim that the Sun has a deficiency in refractory elements relative to volatiles compared to other Sun-like stars. This result has since fostered an ongoing debate, with some works verifying the Sun's unusual abundance pattern and others refuting it (Ramírez & Asplund 2009; Adibekyan et al. 2014; Nissen 2015; Bedell et al. 2018). While we find evidence that the Sun shows a greater depletion than the average solar-like star, as noted above, our analysis suggests that the Sun may actually be a part of a dominant population of stars that show similar trends.

If we adopt the rocky planet hypothesis proposed by Meléndez et al. (2009), our results can be interpreted as separating out stars with rocky planets from those without, with roughly 80% belonging to the former group. Some arguments against the rocky planet hypothesis stem from the fact that few stars seem to have refractory depletion trends as extreme as the Sun, while far more stars likely host small rocky planets (Schuler et al. 2011a; Burke et al. 2015; Booth & Owen 2020). Our results suggest that the Sun actually falls into a dominant population of refractory-depleted stars.

One can further speculate that the placement of a star in the distribution of abundance versus T_c slopes is determined not solely by its rocky planet hosting status, but may also be related to giant planet formation. Recently, Booth & Owen (2020) argued this to be the case, demonstrating that a forming giant planet can create a pressure trap in the protoplanetary disk, thereby inhibiting refractory material from accreting onto the host star. They argue that rocky Earth-like planets are not the likely culprits of the solar depletion pattern, since rocky planets are far more common (Lineweaver & Grether 2003; Youdin 2011; Kopparapu 2013; Hsu et al. 2019). As discussed above, our interpretation of the depleted population does not support this argument, given the large population of stars with near solar abundances. Their results also suggest that there is not enough refractory material locked up in the Earth to account for the observed solar depletion trend, making the giant planet hypothesis more appealing.

Adopting the planet hypothesis for refractory depletions, our approach can in principle address both the rocky planet explanation proposed by Meléndez et al. (2009) and the giant planet hypothesis previously discussed (Booth & Owen 2020). Because the detection rate of giant planets is low (∼15%; Wittenmyer et al. 2016, 2020; Fernandes et al. 2019), some stars may reside toward the more depleted tail of the abundance versus T_c slope distribution. In particular, stars like the Sun, with multiple terrestrial and gas giant planets, may have excess depletions compared to the typical depleted star if both types of planets are associated with depletion. We find that the Sun is in fact over 1σ depleted relative to the mean of the depleted population. Significant additional work is needed to explore and validate these correlations.

Regardless of the hypothesis one adopts, we suggest that future works studying subtle elemental depletion patterns in various samples of stars more rigorously test for the presence of bimodalities in the distribution of abundance versus T_c slopes. Doing so provides a more complete picture of where the Sun falls in its elemental composition relative to other stars, paving the way to a more thorough understanding of the planet-star connection.

5.3. Caveats

GCE effects are not corrected for in this analysis, and could lead to elemental abundance trends with T_c (Adibekyan et al. 2014; Spina et al. 2016). Bedell et al. (2018) addressed this by fitting age trends to the abundances of solar analogs in their sample as a function of inferred stellar age, thereby indirectly probing GCE. They found that when subtracting off the inferred GCE fit from all abundances, the Sun still appears increasingly depleted in higher T_c elements relative to the average solar analog, suggesting that GCE alone is not driving the refractory depletion trends. Larger samples that include more elements with similar chemical origins will help constrain the extent to which GCE could be responsible for elemental abundance trends with T_c.
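The kind of age detrending described above can be sketched as follows. This is a schematic stand-in, not the procedure of Bedell et al. (2018): it assumes a single straight-line trend of abundance with age for one element and references the correction to an assumed solar age of 4.6 Gyr. The function name `remove_age_trend` and the toy data are hypothetical.

```python
import numpy as np

SOLAR_AGE_GYR = 4.6  # assumed reference age for the correction


def remove_age_trend(ages_gyr, x_fe):
    """Fit a linear [X/Fe]-age trend and shift each star to the solar age.

    Returns the corrected abundances and the fitted slope (dex per Gyr).
    """
    ages_gyr = np.asarray(ages_gyr, dtype=float)
    x_fe = np.asarray(x_fe, dtype=float)
    # np.polyfit returns coefficients from highest to lowest degree.
    slope, _intercept = np.polyfit(ages_gyr, x_fe, deg=1)
    corrected = x_fe - slope * (ages_gyr - SOLAR_AGE_GYR)
    return corrected, slope


# Noise-free toy data following an assumed trend of 0.01 dex per Gyr.
ages = np.array([1.0, 3.0, 4.6, 7.0, 9.0])
x_fe = 0.01 * (ages - SOLAR_AGE_GYR) + 0.02
corrected, slope = remove_age_trend(ages, x_fe)
print(slope, corrected)
```

After the correction, any residual trend of abundance with T_c across elements cannot be attributed to the fitted age dependence, which is the logic behind the test described in the text.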

With respect to our analysis of APOGEE data, binary star systems can adversely affect chemical abundance measurements either by having changed the evolution of the star targeted by APOGEE, or, if spectral features from both stars can be identified in the spectrum, by changing the spectral features being measured. The APOGEE observation strategy was designed to be sensitive to close binaries (see discussion in Majewski et al. 2017). By modeling the multi-epoch radial velocities of APOGEE DR16 stars, Price-Whelan et al. (2020) generated a catalog of roughly 20,000 close binary companions, enabling us to crosscheck our sample for possible binary contamination.¹¹ We find that ∼2% of stars in our final sample are potential binaries from the Price-Whelan et al. analysis. Since both D and ND components prefer fractions greater than 10%, binary star systems are unlikely to be the driving force behind the observed shifting bimodalities. Indeed, after removing the potential binaries from our sample, constraints remain virtually unchanged. An alternate strategy presented in El-Badry et al. (2018) models the spectra themselves, rather than using RV variations to identify likely binaries. This approach is sensitive to a slightly different population of binaries; the binary fractions, however, remain small compared to the D and ND signals in our work.

Due to calibration differences between the APOGEE elemental abundances and other works, it is not necessarily valid to compare the absolute abundance measurements from APOGEE to those derived from a strictly differential analysis measured relative to the Sun (i.e., as in Bedell et al. 2018). In Section 4.1, however, we find that the APOGEE solar reference spectrum measured in reflected sunlight from the asteroid Vesta (Jönsson et al. 2020) is roughly consistent with an abundance versus T_c slope and intercept of zero on the APOGEE scale. Moreover, the solar pattern is consistent with the depleted population. This is further validated by our results in Section 4.2.1, where we find that our constraints from APOGEE yield similar elemental abundance patterns to those from Bedell et al. (2018), albeit with some offsets (with respect to the not-depleted abundance versus T_c slopes and intercepts, in particular). Consequently, while calibration offsets could have the effect of shifting absolute elemental abundance trends, we do not believe this to be the dominant factor in the trends constrained by the APOGEE abundance data. Furthermore, because APOGEE calibrations are simply applied as a zero-point offset per element to all stars, the presence or absence of distinct populations in the distribution of elemental abundances is unaffected, as are the relative slopes between the depleted and not-depleted model components.

In order to achieve a high level of precision, our analysis on APOGEE data is limited to the five elements Si, Mg, Ni, Ca, and Al, spanning a range of condensation temperatures from ∼1300–1600 K. Consequently, our constraints should be interpreted as applying only to this range of T_c, since some stellar abundance trends are better described by piecewise linear functions when incorporating elements spanning a wider range of condensation temperature (Meléndez et al. 2009; Saffe et al. 2016; Bedell et al. 2018). Increasing the number of high precision, high T_c elements will be useful for future studies of elemental depletion patterns.

Our constraints from APOGEE presented in Section 4.1 are potentially sensitive to systematic uncertainties stemming from the ASPCAP pipeline. While we have chosen elements with small calibration zero-point shifts to the mean abundances in the solar neighborhood (≲0.04 dex), inaccuracies in abundance determinations could bias model constraints. In particular, we find evidence of two populations of stars; one with roughly solar abundances and another which is increasingly enhanced in refractory elements as a function of T_c, indicating a departure from the solar neighborhood. Because APOGEE calibrates to the solar neighborhood, stars that depart from this trend (i.e., the not-depleted population) could be overcorrected for in the ASPCAP pipeline. We believe this to be the underlying reason why the not-depleted constraints on APOGEE prefer a relatively low intercept and high slope in Figure 3 compared to our results from Bedell et al. (2018) in Figure 6. Regardless, the relative abundance trends between depleted and not-depleted populations in both results are similar.

Relevant to the results of this work, if the true uncertainties on the APOGEE abundances are higher than reported, the intrinsic scatter model parameters (i.e., σ_m, σ_b, etc.) will account for both intrinsic dispersion and noise fluctuations. Our model constraints on all intrinsic scatter parameters therefore rely on the accuracy of the APOGEE provided uncertainties, and those provided by Bedell et al. (2018) (and Spina et al. 2018 for [Fe/H]) in Section 4.2.2. We note, however, that inaccuracies in abundance uncertainties are not found to bias the main model parameters of interest on simulated data (f, μ_m, μ_b), since these parameters represent population means. For further details on how the abundances and their associated uncertainties used in this analysis are derived, we direct readers to Jönsson et al. (2020), Spina et al. (2018), and Bedell et al. (2018).

Our modeling scheme described in Section 3.1 assumes that the distributions of abundance versus T_c slopes and intercepts are Gaussian. For this reason, we impose outlier clipping on the abundance data to ensure that departures from Gaussianity in the distribution tails are not driving the model fit. We adopt a conservative outlier rejection of 2σ on the distribution of abundances (see Section 3.2 for details), and find that our main results are robust to variations on this threshold (Appendix D.1). While the goodness of fit and tests for robustness suggest that our finding of two populations of stars is legitimate, larger sample sizes with more elements will help to further determine the validity of this claim.
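A generic form of such outlier clipping is sketched below. The exact procedure is the one described in Section 3.2. This example assumes a single, non-iterative pass in which a star is kept only if every one of its abundances lies within a chosen number of standard deviations of the sample mean for that element. The function name `clip_outliers` and the toy data are assumptions.

```python
import numpy as np


def clip_outliers(abundances, n_sigma=2.0):
    """Single-pass sigma clipping on an (n_stars, n_elements) array.

    A star is retained only if all of its abundances fall within
    n_sigma standard deviations of the per-element sample mean.
    Returns the retained rows and the boolean mask that selects them.
    """
    abundances = np.asarray(abundances, dtype=float)
    center = abundances.mean(axis=0)
    spread = abundances.std(axis=0)
    keep = np.all(np.abs(abundances - center) <= n_sigma * spread, axis=1)
    return abundances[keep], keep


# Toy sample: 20 well-behaved stars plus one star with a discrepant element.
rng = np.random.default_rng(3)
toy = rng.normal(0.0, 0.01, size=(21, 5))
toy[-1, 2] = 5.0
clipped, keep = clip_outliers(toy, n_sigma=2.0)
print(clipped.shape, keep[-1])
```

Varying `n_sigma` and repeating the fit is one simple way to check that the inferred populations are not driven by the tails of the abundance distributions.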

Lastly, while it is true that the abundances in ASPCAP are determined independently, families of elements are not completely independent in that the underlying spectral parameters include abundance terms for [M/H] and [α/M] by which individual abundance measurements are constrained (García Pérez et al. 2016). The elements used in this analysis include two elements from the metal family (Ni, Al) and three elements from the α-family (Si, Mg, Ca). However, these classifications are not correlated with T_c and are unlikely to drive the T_c slopes. Moreover, the calibration to the solar neighborhood is performed element by element and should take into account systematics in the individual element values.

5.4. Future Directions

Our model enables the use of large, automated spectroscopic surveys with many stars to place constraints on the distribution of elemental abundances and subtle trends with condensation temperature. Similar studies could also be conducted using data from the GALAH Survey or Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST; Buder et al. 2018; Yao et al. 2019).

Our sample is limited to a population of solar analog stars with similar stellar parameters to the Sun. Few analyses have explored beyond solar-like samples in order to achieve high precision abundance measurements and reduce stellar model-driven biases (Bedell et al. 2018). Gonzalez et al. (2010), however, extended their sample to a wider range of effective temperatures and metallicities (taken as [Fe/H]), and found that super-solar metallicity stars with planets have the most refractory-depleted abundance trends. Ramírez & Asplund (2009) also found intriguing behavior at super-solar metallicities, with evidence of a bimodality in the distribution of [X/Fe] versus T_c slopes at 2.7σ. These findings are particularly interesting, given that the frequency of planets is widely thought to increase with stellar metallicity (Fischer & Valenti 2005; Wang & Fischer 2015). Because our analysis relies on large samples of stars, it is well suited to using splits of the data to explore trends in depletion patterns with varying stellar properties.

Previous studies (e.g., Gonzalez et al. 2010) have split their sample into stars with detected planets and stars without detected planets, evaluating the abundance versus T_c trend for each group. Gonzalez et al. (2010) found that stars with planets tend to have more refractory-depleted abundance trends than stars without detected planets, though other works have found no such correlation (González Hernández et al. 2013a; Schuler et al. 2015). Our analysis can be applied to stars with planets and without planets separately to directly test the connection of abundance trends with planet hosting status. Applying such analyses to large data sets from APOGEE, GALAH, and LAMOST will help contribute to our understanding of the nature of elemental depletion patterns with condensation temperature, and possible links to planet formation.

We are grateful to Megan Bedell and Cullen Blake for helpful comments on an earlier draft of the paper. We also thank Gary Bernstein, Neal Dalal, Mike Jarvis, Marco Raveri, Matthew Belyakov, Sarah Kane, and Eli Wiston for helpful discussions. We are grateful to Sten Hasselquist, Christian Hayes, and Matthew Shetrone for their invaluable insights regarding APOGEE data and APOGEE data analysis. J.N. has been supported in part by the Center for Undergraduate Research at the University of Pennsylvania. Further support for this work was provided by NASA through Hubble Fellowship grant #51386.01 awarded to R.L.B. by the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., for NASA, under contract NAS 5-26555.

Facilities: du Pont (APOGEE), Sloan (APOGEE), Gaia, NASA Exoplanet Archive.

Appendix A: Star Selection and Abundance Distributions

In this appendix we provide additional visualizations of the star selection described in Section 2.1.1 and the resulting distribution of elemental abundances as described in Section 2.1.2.

In Figure A1 (top panel), we provide a Mollweide projection of our star sample in Galactocentric coordinates. Stars surviving our baseline cuts are plotted as orange points, while the rest of the APOGEE-2 sample is shown as blue points. Figure A1 demonstrates that our sample is not drawn from a single substructure or subprogram in APOGEE, which reduces, if not eliminates, concern over sample bias. The majority of our stars are at high Galactic latitude (∣b∣ > 10°). This occurs because the APOGEE color–magnitude targeting algorithms are more likely to select solar-type dwarfs out of the Galactic plane, where de-reddened color limits are relaxed to target bluer stars (Zasowski et al. 2013, 2017). Moreover, scientific programs within the APOGEE survey that specifically target dwarfs use fields out of the Galactic plane, like the Kepler field and the Transiting Exoplanet Survey Satellite Continuous Viewing Zones (R. Beaton et al. 2020, in preparation; F. Santana et al. 2020, in preparation).

**Figure A1.** Top: Mollweide projection of the APOGEE sample in Galactocentric coordinates. Orange points indicate the population of stars used in our analysis, described in Section 2.1.1. Bottom: APOGEE abundance ratios relative to iron plotted against [Fe/H] for the population of Sun-like stars used in this work (described in Section 2.1.1). For this sample of stars, these distributions show no evidence of the α-bimodality known to exist in the Galaxy.

The Galaxy contains two clear chemical populations, separated by their behavior in the [α/Fe] versus [Fe/H] diagram into an α-high and an α-low population. The relative number of stars in each population varies as a function of location in the Galaxy, as shown vividly by Hayden et al. (2015, their Figure 4). The two populations, however, merge in the solar neighborhood to more-or-less solar-like values. Weinberg et al. (2019) specifically investigated chemical evolution trends in the two sequences and found that the chemical evolution within the two α-sequences appears to proceed in a similar fashion (with exceptions for elements with more complicated yields); stated differently, the overall chemical trends within the populations are similar, but the α-abundance is offset. Weinberg et al. (2019, their Figure 7) summarize the nucleosynthetic pathways for the elements, from which it follows that the elements used in our analysis (Si, Mg, Ni, Ca, and Al) all have some contribution from core-collapse supernovae, which drive α-element abundances at early times. Given that our model looks for the presence of two populations and that most of our elements have some core-collapse supernova origin, it is prudent to verify that our sample shows no α-bimodality that could in turn be driving the depletion signal.

In Figure A1 (bottom panel), we plot abundances relative to iron against [Fe/H] for the five elements from APOGEE data used in this analysis. Points are only plotted for stars passing our analysis cuts. We do not find clear evidence of multiple modes for any of the elements in Figure A1, indicating that our model fit should not be driven by two populations of stars with vastly different chemical origins. There are slight trends for Mg and Ni over the full range of [Fe/H], but these trends are not likely to drive the conclusions found in the main text.

Appendix B: Analysis of Simulated Data

Constraints on simulated chemical abundance data for five elements and 2000 stars with uncertainties sampled directly from APOGEE are provided in the top panel of Figure A2. A more complete description of simulation procedures is provided in Section 3.3. Red and blue lines correspond to the true parameter values for depleted (D) and not-depleted (ND) stars, respectively. We note that the abundance versus T_c intercept in Figure A2 is defined at T_c = 1300 K rather than zero in order to reduce degeneracy with μ_m. We find that our analysis pipeline is successful in recovering the true parameters on simulated data for various choices of input parameters.
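The effect of moving the intercept from T_c = 0 to a pivot temperature can be illustrated with an ordinary least-squares line, as in the hedged sketch below. It is not the hierarchical model of Section 3.1: the function name, the equal per-element error `sigma`, and the condensation temperatures are illustrative assumptions chosen only to show how the slope-intercept correlation changes with the pivot.

```python
import numpy as np

def slope_intercept_correlation(t_c, pivot, sigma=0.03):
    """Correlation between the least-squares slope m and intercept b
    for the line y = m * (t_c - pivot) + b with equal errors sigma."""
    x = np.asarray(t_c, dtype=float) - pivot
    design = np.column_stack([x, np.ones_like(x)])
    # Parameter covariance of a linear least-squares fit
    cov = sigma**2 * np.linalg.inv(design.T @ design)
    return cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])

# Hypothetical condensation temperatures in kelvin (illustrative values only)
t_c = np.array([1300.0, 1340.0, 1350.0, 1520.0, 1650.0])

rho_zero = slope_intercept_correlation(t_c, pivot=0.0)      # intercept at T_c = 0
rho_pivot = slope_intercept_correlation(t_c, pivot=1300.0)  # intercept at 1300 K
print(f"pivot = 0 K:    rho = {rho_zero:.3f}")
print(f"pivot = 1300 K: rho = {rho_pivot:.3f}")
```

In this toy setting the slope and intercept are almost perfectly anticorrelated when the intercept is defined at zero, and noticeably less so when it is defined at 1300 K, close to the sampled temperatures.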

**Figure A2.** Top: results on simulated data for five elements with the same condensation temperatures used on actual data in Section 4.1. ND (blue) and D (red) correspond to not-depleted and depleted stars, respectively. The blue and red vertical dashed lines represent the input parameters for the respective groups. Bottom: results on simulated data with a Student's t-distribution truth for various outlier rejection choices. Note that our model assumes normality, motivating the use of outlier rejection on actual data. Input parameters are given by the vertical dashed lines for depleted (red) and not-depleted (blue) stars. We omit constraints on the intrinsic scatter parameters (${\sigma }_{\mathrm{intr},j}$) due to the high degree of degeneracy with the degree of freedom parameter ν.

B.1. Departures from Gaussian Assumption

In order to evaluate how reasonable departures from our Gaussian assumption impact model constraints, we generate abundance data from the Student's t distribution. The Student's t distribution is symmetric and bell shaped like the normal distribution, but can have significantly heavier tails depending on the number of degrees of freedom (ν) that one assumes. This means that the distribution is more likely to produce values farther from the mean, providing us with a useful test of our outlier rejection scheme.

We simulate data by generating mock elemental abundances from the Student's t distribution with ν = 2 degrees of freedom, giving significant weight to the tails of the abundance distributions. We suppose that all abundance versus T_c slopes and intercepts are normally distributed as in Section 3.3, with only the resulting distribution of abundances departing from Gaussian.
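A minimal sketch of such a mock-data generator, for a single population, might look like the following. The function name, the 1300 K pivot, the `scatter` scale, and all numerical inputs are illustrative assumptions and not the configuration used in this work; the actual simulations follow Section 3.3 and include both the D and ND components.

```python
import numpy as np

def simulate_heavy_tailed_abundances(t_c, n_stars, mu_m, sigma_m, mu_b, sigma_b,
                                     scatter, nu=2.0, pivot=1300.0, seed=0):
    """Draw mock [X/Fe] values with Gaussian slopes and intercepts
    but Student's t (heavy-tailed) scatter about each star's trend."""
    rng = np.random.default_rng(seed)
    t_c = np.asarray(t_c, dtype=float)
    # Per-star slope and intercept remain normally distributed
    slopes = rng.normal(mu_m, sigma_m, size=n_stars)
    intercepts = rng.normal(mu_b, sigma_b, size=n_stars)
    trend = slopes[:, None] * (t_c[None, :] - pivot) + intercepts[:, None]
    # Only the scatter about the trend departs from Gaussian
    noise = scatter * rng.standard_t(nu, size=trend.shape)
    return trend + noise  # shape: (n_stars, n_elements)

# Hypothetical inputs (temperatures in kelvin, abundances in dex)
t_c = [1300.0, 1340.0, 1350.0, 1520.0, 1650.0]
mock = simulate_heavy_tailed_abundances(t_c, n_stars=2000, mu_m=1e-4, sigma_m=5e-5,
                                        mu_b=0.0, sigma_b=0.02, scatter=0.02)
print(mock.shape)
```

With ν = 2 the variance of the scatter term is formally infinite, so a handful of mock stars land far from the mean trend; these are the objects an outlier rejection step is meant to catch.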

Results for this test are presented in the bottom panel of Figure A2, with rejection thresholds ranging from 1.5σ to 3σ. We omit constraints on the intrinsic scatter parameters ${\sigma }_{\mathrm{intr},j}$ due to the high degree of degeneracy with the Student's t parameter ν. For this reason, parameter constraints on ${\sigma }_{\mathrm{intr},j}$ are not emphasized in this analysis, since a poor fit to the intrinsic scatter on both simulated and actual data will become evident in the goodness-of-fit test described in Appendix C.

We find the most significant bias in σ_b for less conservative outlier rejections (i.e., for ≥ 2.5σ), but note that constraints are generally robust across all rejection choices. We therefore adopt a rejection threshold of 2σ on APOGEE data for our fiducial result, since this choice provides a conservative rejection with minimal bias in the key parameters of interest (f, μ_m, μ_b for depleted (red) and not-depleted (blue) stars). We explore variations on the fiducial outlier rejection threshold on actual data in Appendix D.

Appendix C: Testing the Goodness of Fit

In order to assess the quality of the model fit, we make use of a posterior predictive check, described as follows. Let θ_i be the vector of model parameters corresponding to the ith step of the MCMC parameter chain after appropriate burn-in. For each step in the parameter chain, we simulate data according to the modeling outlined in Section 3.1, with θ_i as the underlying model parameters. We then histogram the simulated data with the same bins as the real, observed data from APOGEE. The result is that we are able to compare the simulated data to the observed data for each step in the chain, while also making the same comparison between two simulated realizations of data.

In order to quantify differences between the simulated data realizations and the actual data, we make use of the reduced-χ² (denoted ${\chi }_{\nu }^{2}$ ) test statistic. Let ${n}_{i}^{\mathrm{obs}}$ be the counts of the observed data in bin i for element X, while ${n}_{i}^{\mathrm{sim}}$ represents the same quantity but for simulated data. We then define the reduced-χ² between the data and the model as

$${\chi }_{\nu ,\mathrm{Data}-\mathrm{Model}}^{2}\equiv \frac{1}{N-1}\sum _{i}\frac{{\left({n}_{i}^{\mathrm{obs}}-{n}_{i}^{\mathrm{sim}}\right)}^{2}}{{n}_{i}^{\mathrm{sim}}}, \tag{C1}$$

where N is the number of stars and ${n}_{i}^{\mathrm{sim}}$ is assumed to be Poisson distributed. We also compute the same statistic between two simulated realizations of the model with underlying parameters θ_i. In this case, the reduced-χ² is

$${\chi }_{\nu ,\mathrm{Model}-\mathrm{Model}}^{2}\equiv \frac{1}{N-1}\sum _{i}\frac{{\left({n}_{i}^{\mathrm{sim}* }-{n}_{i}^{\mathrm{sim}}\right)}^{2}}{{n}_{i}^{\mathrm{sim}}}, \tag{C2}$$

where ${n}_{i}^{\mathrm{sim}* }$ are the histogram counts for an additional realization of the model with underlying parameters θ_i.

The general logic is that if the model provides a decent fit to the data, the distributions of ${\chi }_{\nu ,\mathrm{Data}\ -\ \mathrm{Model}}^{2}$ and ${\chi }_{\nu ,\mathrm{Model}\ -\ \mathrm{Model}}^{2}$ should be similar, with ${\chi }_{\nu ,\mathrm{Data}\ -\ \mathrm{Model}}^{2}$ occasionally indicating a better fit than ${\chi }_{\nu ,\mathrm{Model}\ -\ \mathrm{Model}}^{2}$ .
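The two statistics in Equations (C1) and (C2) can be sketched for a single element and a single chain step as follows. This is an illustration under stated assumptions, not the code used in this work: `reduced_chi2`, `predictive_pair`, and `toy_simulate` are hypothetical names, the one-component Gaussian stands in for the full model of Section 3.1, and skipping empty simulated bins is our own choice to avoid division by zero.

```python
import numpy as np

def reduced_chi2(n_ref, n_sim, n_stars):
    """Equations (C1)/(C2): sum_i (n_ref_i - n_sim_i)^2 / n_sim_i, divided by N - 1."""
    n_ref = np.asarray(n_ref, dtype=float)
    n_sim = np.asarray(n_sim, dtype=float)
    occupied = n_sim > 0  # assumption: skip empty simulated bins
    terms = (n_ref[occupied] - n_sim[occupied]) ** 2 / n_sim[occupied]
    return terms.sum() / (n_stars - 1)

def predictive_pair(observed, simulate, theta, bin_edges, rng):
    """Return (chi2_data_model, chi2_model_model) for one chain step theta."""
    n_stars = observed.size
    # Observed and simulated data share the same histogram bins
    n_obs, _ = np.histogram(observed, bins=bin_edges)
    n_sim, _ = np.histogram(simulate(theta, n_stars, rng), bins=bin_edges)
    n_sim_star, _ = np.histogram(simulate(theta, n_stars, rng), bins=bin_edges)
    return (reduced_chi2(n_obs, n_sim, n_stars),
            reduced_chi2(n_sim_star, n_sim, n_stars))

def toy_simulate(theta, n_stars, rng):
    """Hypothetical stand-in model: a single Gaussian with theta = (mean, sd)."""
    mean, sd = theta
    return rng.normal(mean, sd, size=n_stars)

rng = np.random.default_rng(1)
observed = rng.normal(0.0, 0.05, size=1700)   # toy "observed" abundances [dex]
bin_edges = np.linspace(-0.2, 0.2, 21)
chi2_dm, chi2_mm = predictive_pair(observed, toy_simulate, (0.0, 0.05), bin_edges, rng)
print(chi2_dm, chi2_mm)
```

Repeating `predictive_pair` for every retained step of the chain yields the two distributions that are compared in the goodness-of-fit test.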

C.1. Goodness of Fit: Data from APOGEE

We provide the results of the posterior predictive test in Figure C1 (top panel), where we plot ${\chi }_{\nu ,\mathrm{Data}\ -\ \mathrm{Model}}^{2}$ in blue and ${\chi }_{\nu ,\mathrm{Model}\ -\ \mathrm{Model}}^{2}$ in red for each element. The p-value in this context does not match the frequentist definition, but is defined as the fraction of times ${\chi }_{\nu ,\mathrm{Data}\ -\ \mathrm{Model}}^{2}$ is less than ${\chi }_{\nu ,\mathrm{Model}\ -\ \mathrm{Model}}^{2}$. Thus, "perfect" p-values are expected to fall around 0.5. With the exception of Ni, all p-values are greater than 0.10, indicating a generally strong fit. The overall p-value, determined by summing over all elements, is 0.06. We find that when removing Ni abundances and running the analysis again, parameter constraints remain consistent with the case of all five elements included. Furthermore, the overall p-value with Ni removed is p = 0.32, indicating an excellent fit to the data. While the overall p-value is significantly better after removing Ni from the analysis, the consistency of model constraints with and without Ni indicates that Ni alone is not driving the model fit in Section 4.1.
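The p-value defined above reduces to a simple fraction over chain steps, as in the hedged sketch below. The function name and the array layout (rows as chain steps, columns as elements) are assumptions for illustration, and the overall p-value is computed here by summing the χ² values over elements at each step, which is our reading of "summing over all elements".

```python
import numpy as np

def predictive_p_value(chi2_data_model, chi2_model_model):
    """Fraction of chain steps with chi2(Data-Model) < chi2(Model-Model).

    1D inputs give a per-element p-value. 2D inputs (steps x elements)
    are summed over elements first, giving an overall p-value.
    """
    d = np.asarray(chi2_data_model, dtype=float)
    m = np.asarray(chi2_model_model, dtype=float)
    if d.shape != m.shape:
        raise ValueError("inputs must have matching shapes")
    if d.ndim == 2:
        d, m = d.sum(axis=1), m.sum(axis=1)
    return float(np.mean(d < m))

# Toy values for four chain steps of a single element
p_single = predictive_p_value([1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0])
# Toy values for two chain steps and two elements
p_overall = predictive_p_value([[1.0, 1.0], [3.0, 3.0]], [[2.0, 2.0], [1.0, 1.0]])
print(p_single, p_overall)
```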

**Figure C1.** Top panel: posterior predictive distributions for the reduced χ² between the data and model (blue) and two simulated realizations of the model (red). The p-value is defined as the fraction of times the χ² between the data and the model is less than the same metric between the two simulated realizations of the model. Middle panel: visualization of the model fit to data, where the x-axis is the elemental abundance [dex] and the y-axis is a histogram count. The black histogram shows the distribution of [X/Fe] for APOGEE data, while the green histograms depict the 68% CL from simulated data with underlying parameters governed by the MCMC chains. Bottom panel: same as the middle panel, but with the two components (ND and D) plotted separately. Note that the y-axis is plotted on a logarithmic scale.

The middle panel of Figure C1 provides a visualization of the model fit to APOGEE data for the five elements Si, Mg, Ni, Ca, and Al. The black curve is a histogram of the abundance data for each element over our solar analog sample, while the green band is derived from our model constraints. We note that generating the green band in Figure C1 makes use of all model parameters, providing a useful visualization of the fit. Meanwhile, the strength of the fit is quantified in the top panel.

The bottom panel of Figure C1 provides a further visualization of our model constraints presented in Section 4.1. In particular, we plot the depleted and not-depleted model components separately in Figure C1, over the distribution of elemental abundances (in black) from our solar analog sample. The mean of the depleted population can be seen to remain roughly consistent with [X/Fe] = 0 across the board, while the ND component shifts from left to right for higher T_c elements. We note that the amplitudes of the D and ND components in this figure are not scaled by the parameter f (or 1 − f) in order to place the two distributions on the same scale.

C.2. Goodness of Fit: Data from Bedell et al. (2018)

In this appendix we evaluate the goodness of fit of the model constraints on abundance data from Bedell et al. (2018) of 79 stars and 21 elements. Given the large number of elements in this section, we report an overall p-value in place of individual element p-values. We find that the model provides an extremely good fit to the data, with an overall p-value of 0.27.

Appendix D: Robustness of APOGEE Constraints

In this appendix we test the robustness of our fit to APOGEE data for the five elements Si, Mg, Ni, Ca, and Al.

D.1. Variations on Outlier Rejection

In Section 3.2 we discussed the outlier rejection scheme used on APOGEE data; here we present variations on the rejection threshold. Our fiducial result presented in Section 4.1 adopts a 2σ rejection threshold, such that any star with $| [{\rm{X}}/\mathrm{Fe}]-\langle [{\rm{X}}/\mathrm{Fe}]\rangle | \gt 2\sigma \left([{\rm{X}}/\mathrm{Fe}]\right)$ is removed from the analysis (see Section 3.2). In Figure E1, we provide model constraints on the main parameters of interest for various rejection thresholds ranging from 1.5σ to 3σ. Our constraints are stable under variations of the rejection threshold, indicating a robust fit to the data.
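The rejection criterion can be sketched as a single-pass clip on each element, as below. This is an illustration and not the code used in this work: the function name is hypothetical, and we assume that a star is removed if it exceeds the threshold in any one element and that the mean and standard deviation are computed once from the full sample (no iteration). Section 3.2 remains the reference for the actual scheme.

```python
import numpy as np

def reject_outliers(abundances, threshold=2.0):
    """Keep stars with |[X/Fe] - <[X/Fe]>| <= threshold * sigma([X/Fe]) in every element.

    abundances: array of shape (n_stars, n_elements).
    Returns the clipped array and the boolean mask of retained stars.
    """
    x = np.asarray(abundances, dtype=float)
    mean = x.mean(axis=0)   # per-element sample mean
    std = x.std(axis=0)     # per-element sample standard deviation
    keep = np.all(np.abs(x - mean) <= threshold * std, axis=1)
    return x[keep], keep

# Toy sample: nine identical stars plus one star that is extreme in the first element
toy = np.zeros((10, 2))
toy[-1, 0] = 10.0
clipped, keep = reject_outliers(toy, threshold=2.0)
print(clipped.shape, keep)
```

Rerunning the fit on `clipped` for thresholds between 1.5 and 3 is the kind of variation summarized in Figure E1.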

**Figure E1.** Results on APOGEE data for various outlier rejection thresholds. Red and blue points correspond to D and ND model components, respectively. We find that for variations around the 2σ rejection threshold adopted in Section 4.1, parameter constraints are robust.

D.2. Element Exclusion Test

A more stringent test of the model constraints presented in Section 4.1 consists of removing each element in turn and evaluating the resulting fit across multiple trials. Constraints on f for this test are presented in Figure 4 and discussed in Section 4.1. In this appendix, we provide an illustration of the depletion patterns for each trial, shown in Figure E2. Filled and unfilled bands correspond to not-depleted and depleted model components, respectively. The legend in Figure E2 refers to the element removed, with "None" representing the case for which all five elements are included (i.e., our main result in Figure 3). The inset plot illustrates the best-fit mean abundance trend (i.e., μ_mT_c + μ_b).

**Figure E2.** Results on APOGEE data for the element exclusion test described in Appendix D.2. The legend specifies the excluded element, with the case of "None" corresponding to no elements removed (i.e., Si, Mg, Ni, Ca, and Al all included). Solid and dashed bands correspond to the ND and D model components, respectively. The inset plot illustrates the best-fit mean abundance trends for both model components, with the form μ_mT_c + μ_b.

Constraints on the mean slopes and intercepts are robust, with ∼1σ shifts to be expected. The most significant differences stem from the standard deviation parameters σ_m and σ_b, as can be seen by the varying bandwidths in Figure E2. Overall, however, the depletion patterns across all presented trials are similar, indicating that our analysis is robust to excluding a single element from our main results in Section 4.1.

Appendix E: Posterior on Model Parameters

The posteriors on model parameters are illustrated in Figure E3 for results on data from APOGEE and Bedell et al. (2018) in blue and green, respectively. Model parameters are subscripted with D and ND for depleted and not-depleted stars, respectively. Constraints on ${\sigma }_{\mathrm{intr},j}$ for Si and Mg are included in Figure E3 (labeled σ_Si,{D,ND}, σ_Mg,{D,ND}), though we omit the other ${\sigma }_{\mathrm{intr},j}$ constraints since these parameters are not particularly meaningful to our present analysis.

**Figure E3.** Posterior on model parameters derived from APOGEE data (blue) and data from Bedell et al. (2018) (green). Given the large number of parameters, we plot intrinsic scatter constraints for only Si and Al in each case. We note that the mean intercept parameter, μ_b, is defined at T_c = 1300 K rather than zero in order to reduce degeneracy with the mean slope parameter, μ_m.

We remind readers that some offsets in slope and intercept are to be expected, since APOGEE calibrates to solar neighborhood stars whereas Bedell et al. (2018) calibrates directly to the Sun (see Sections 2.1 and 5.3 for details). Furthermore, our APOGEE sample includes roughly 1700 stars with five elemental abundances each, while the abundance data from Bedell et al. (2018) contain many more elements for far fewer stars. We note that when limiting the abundance data from Bedell et al. (2018) to the five overlapping elements used in our analysis of APOGEE data (Si, Mg, Ni, Ca, and Al), the resulting constraints have a higher degree of overlap. We postpone the inclusion of more high T_c elements in our APOGEE sample to a future work, since this analysis already relies upon high precision chemical abundances based on detailed discussions in Jönsson et al. (2020).