The APOGEE-2 Survey of the Orion Star-forming Complex. II. Six-dimensional Structure

, , , , , , , , , , , , , , , , , , , and

Published 2018 August 9 © 2018. The American Astronomical Society. All rights reserved.
, , Citation Marina Kounkel et al 2018 AJ 156 84 DOI 10.3847/1538-3881/aad1f1

Download Article PDF
DownloadArticle ePub

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

1538-3881/156/3/84

Abstract

We present an analysis of spectroscopic and astrometric data from APOGEE-2 and Gaia DR2 to identify structures toward the Orion Complex. By applying a hierarchical clustering algorithm to the six-dimensional stellar data, we identify spatially and/or kinematically distinct groups of young stellar objects with ages ranging from 1 to 12 Myr. We also investigate the star-forming history within the Orion Complex and identify peculiar subclusters. With this method we reconstruct the older populations in the regions that are currently largely devoid of molecular gas, such as Orion C (which includes the σ Ori cluster) and Orion D (the population that traces Ori OB1a, OB1b, and Orion X). We report on the distances, kinematics, and ages of the groups within the Complex. The Orion D group is in the process of expanding. On the other hand, Orion B is still in the process of contraction. In λ Ori the proper motions are consistent with a radial expansion due to an explosion from a supernova; the traceback age from the expansion exceeds the age of the youngest stars formed near the outer edges of the region, and their formation would have been triggered when they were halfway from the cluster center to their current positions. We also present a comparison between the parallax and proper-motion solutions obtained by Gaia DR2 and those obtained toward star-forming regions by the Very Long Baseline Array.

Export citation and abstract BibTeX RIS

1. Introduction

Star formation in the Orion Complex has taken place over an extended (>10 Myr) period of time, and over a large (>100 pc) volume of space, containing multiple stellar populations. Blaauw (1964) originally identified the Orion OB1 association at 199° < l < 210° and −12° < b < −21° (Figure 1). The current epoch of star formation is confined to the Orion A and B molecular clouds, which contain massive clusters such as the Orion Nebula Cluster (ONC), NGC 2024, and NGC 2068. The stars in these clouds have typical ages of ∼1–3 Myr (Levine et al. 2006; Flaherty & Muzerolle 2008; Muench et al. 2008; Da Rio et al. 2010). Several other populations of young stars that have already dissipated their molecular gas and dust can also be identified. The Orion OB1b region roughly traces the belt stars, with stellar ages of ∼5 Myr (Briceño et al. 2005; Bally 2008; Caballero & Solano 2008). A massive cluster centered at σ Ori is found toward Ori OB1b, although it is comparatively younger than OB1b itself with an age of ∼3 Myr, and the relationship between σ Ori and Ori OB1b is still ill-defined (Jeffries et al. 2006; Sherry et al. 2008; Peña Ramírez et al. 2012; Hernández et al. 2014). Ori OB1a has typical stellar ages of ∼7–10 Myr; the most well-studied group in this region is a cluster near 25 Ori (Briceño et al. 2007; Downes et al. 2014), although many of the properties of this subassociation, including its extent or membership, still remain highly uncertain. North of Ori OB1 is the λ Ori cluster, which has an age of ∼6 Myr (Dolan & Mathieu 2001; Bayo et al. 2011). A supernova occurred near the center of the cluster disassociating nearby molecular gas and sweeping up the remaining material into a ring of dust and gas with a radius of 4°–5° (Dolan & Mathieu 2002; Mathieu 2008; Hernández et al. 2010; Bayo et al. 2011).

Figure 1.

Figure 1. Wide-field image of Orion with the identification of the prominent star-forming regions. Background image is astrophotography, courtesy of Rogelio Bernal Andreo.

Standard image High-resolution image

Although the Orion Complex is the best-studied large-scale star-forming region, much of its six-dimensional (6D) structure remains uncertain, compromising our understanding of its star formation history. Located at the average distance of ∼400 pc toward the galactic anticenter, it was too far away to have precise parallax or proper-motion (PM) estimates by Hipparcos (de Zeeuw et al. 1999). In lieu of direct distance measurements, other studies have used radial velocity (RV), or photometric fitting in order to distinguish structures along the line of sight. Jeffries et al. (2006) observed two distinct RV distributions toward σ Ori, separated by 7 km s−1. They suggested that two populations account for this difference in RVs, one that consists of bona fide members of σ Ori, and one that is a foreground population that increases in concentration toward the north of σ Ori, consisting of members of the Ori OB1ab subassociations. Other RV surveys in the region were conducted by Maxted et al. (2008), Sacco et al. (2008), and Hernández et al. (2014). Caballero & Solano (2008) and Kubiak et al. (2017) have also tried to characterize the population of stars toward Ori OB1b using only photometry.

Alves & Bouy (2012) and Bouy et al. (2014) used photometric observations to argue for an extended foreground population toward the ONC. These findings were supported by Pillitteri et al. (2013). However, those conclusions have been questioned by Da Rio et al. (2016), Fang et al. (2017), and Kounkel et al. (2017a).

The first large-scale RV studies with subkilometer-per-second resolution toward ONC were conducted by Fűrész et al. (2008), followed by work from Tobin et al. (2009) and Kounkel et al. (2016) to identify a peculiar blueshifted stellar population relative to the molecular gas distribution.

Da Rio et al. (2016, 2017) conducted the first APOGEE survey of Orion A as an SDSS-III Ancillary Science program. Radial velocities from this program allowed the confirmation of new members and the identification of substructures in position–position–velocity space. Stellar parameters inferred from the spectra also provided age diagnostics that agreed well with other methods (e.g., H-R diagram placement, disk excess, and association with extinction). The APOGEE Orion data were also used in additional studies of the structure and kinematics of the Orion region. Hacar et al. (2016, 2018) studied the distribution of stars and gas in the Orion A molecular cloud to identify elongated strings that could have formed as a result of global cloud collapse into individual filaments. Stutz & Gould (2016) used the APOGEE Orion data in a paper analyzing a mechanism where the region's velocity dispersion reflects the rate at which protostellar cores decouple from the large-scale magnetic field.

A few studies of PMs in the region have also been conducted; however, the resulting measurements were either not precise enough to place strong conclusions on the dynamics in the Orion regions (Zacharias et al. 2013) or very limited in scope (Dzib et al. 2017). Extensive analysis has also been focused on the region's temporal structure, in terms of both deriving ages for the individual regions (see above) and searching for an age gradient within a given cluster (e.g., Beccari et al. 2017).

Recently, Kounkel et al. (2017b) used the Very Long Baseline Array (VLBA) to measure parallaxes of 27 nonthermally emitting young stellar objects (YSOs) and create a 3D model of the region, although this study was confined only to the Orion A and B molecular clouds. However, with the small sample size it was only possible to investigate the overall orientation of the clouds, but not the presence of any substructure along the line of sight. Zari et al. (2017) recently used Gaia DR1 data to analyze the distribution of young stars in the Orion Complex and identify overdensities corresponding to young clusters. However, the Gaia DR1 data only contained five-dimensional (5D) astrometric solutions for the brightest stars, and the uncertainties in distances did not allow Zari et al. (2017) to conclusively resolve the populations from each other.

In this paper we present an analysis of structure and star formation history in the extended Orion Complex as determined from stellar positions and kinematics (both PM and RV) from Gaia DR2 5D astrometric solutions and APOGEE-2 near-infrared (NIR) spectra. In Section 2 we discuss the data involved in this study. In Section 3 we describe the hierarchical clustering algorithm used for the identification of stellar groups, including an assessment of its performance and limitations. In Section 4 we show the identified groups and discuss their properties. In Section 5 we conclude our results.

2. Data

2.1. APOGEE

2.1.1. Data Products

NIR high-resolution spectral observations of stars toward the Orion Complex were conducted with the Apache Point Observatory Galactic Evolution Experiment (APOGEE) spectrograph, mounted on the 2.5 m Sloan Digital Sky Survey (SDSS) telescope (Gunn et al. 2006; Blanton et al. 2017). This instrument covers the spectral range of 1.51–1.7 μm with R ∼ 22,500 (Wilson et al. 2010; Majewski et al. 2017), and it is capable of observing up to 300 targets simultaneously in a field of view of 1fdg5. A total of 8991 unique targets have been observed toward the Orion Complex (4259 toward the Orion B/Ori OB1ab region, 2991 toward the Orion A molecular cloud, and 1741 toward λ Ori), most of which were observed repeatedly over the course of two to three epochs. These targets were observed either as part of the SDSS-III APOGEE IN-SYNC survey (with the targets selected primarily on the basis of the previous identification of stars as YSOs; Da Rio et al. 2016, 2017) or as part of the APOGEE-2 Young Cluster Survey (with sources identified based on the observed IR excess, optical variability, or other indicators of membership; more details pertaining to the targeting of the survey are presented in Zasowski et al. 2017; Cottle et al. 2018). Previously, 43% of the data used in this work have been included as a part of SDSS DR14 (Abolfathi et al. 2018), and 30% in DR12 (Alam et al. 2015).

In addition to the stars in Orion, in the discussion of the data we also include sources observed by APOGEE and APOGEE-2 toward other star-forming regions and young clusters. We also analyze spectra from Kepler 21 and NGC 188 fields to test the stability of the stellar parameters we extract from the APOGEE spectra, as both fields have been targeted by APOGEE for more than 20 epochs. However, the exact properties of the targets and the stellar population in these regions are beyond the scope of this work, and we defer the discussion of the non-Orion sources to future papers.

Unfortunately, the gradient in RV remains for the sources with Teff < 3000 K in the correlation, which offsets RVs by up to 3 km s−1 at 2400 K; however, this affects only a small fraction of the sources in Orion (∼150 stars, which are typically hotter than 2600 K). A more comprehensive analysis of stellar RVs across all the star-forming regions (particularly those that are nearby and for which the spectra for the cooler sources are available) would be needed in order to model this gradient in the future.

2.1.2. Stellar Parameters

All the spectra are originally processed by the APOGEE Stellar Parameter and Chemical Abundances (ASPCAP; García Pérez et al. 2016) pipeline, providing various stellar properties, including the abundances (Holtzman et al. 2015). However, ASPCAP is primarily optimized for giants; given that the evolved YSOs targeted toward Orion are primarily dwarfs, this could introduce potential biases or systematic errors in the derived parameters. It is also a known issue that the uncertainties of various quantities (e.g., RVs) may be significantly underestimated, or that there may be a correlation in RVs for sources with Teff < 3000 K (Nidever et al. 2015).

To account for these and other issues, we processed the spectra using the pipeline developed by Cottaar et al. (2014), which is better suited for the analysis of spectra of YSOs. It fits to each spectrum the effective temperature (Teff), surface gravity ($\mathrm{log}\,g$), RV, rotational velocity ($v\,\sin \,i$), and veiling (rH), using a synthetic grid and interpolating between the grid points. A Markov chain Monte Carlo (MCMC) simulation is performed at the end of the fit to measure the uncertainties in all the parameters and validated against the scatter in each parameter across multiple epochs of spectroscopic observations. Average parameters are determined for each star as a weighted average of the parameters measured for each epoch, as presented in Table 1.

Table 1.  Properties Derived from the APOGEE Spectra toward the Stars in Orion

α δ 2MASS ID Gaia DR2 Nepoch Teff $\mathrm{log}\,g$ $\overline{{RV}}$ ${F}_{\mathrm{bol}}$ AV Θ Flaga Group
(J2000) (J2000)   ID   (K) (dex) (km s−1) (10−10 erg s−1 cm−1) (mag) (μas)    
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13)
85.1387100 −10.148064 2M05403328-1008530 3011876641301608832 1 4575 ± 46 3.360 ± 0.113 −12.379 ± 0.390 2.37 ± 0.10 2.40 ± 0.25 20.13 ± 0.59 0 field
85.5587769 −9.938928 2M05421410-0956201 3011907015310374528 1 4645 ± 31 4.326 ± 0.075 21.491 ± 0.528 6.51 ± 0.43 4.05 ± 0.20 32.39 ± 1.15 2 l1647-2
85.8625565 −9.993770 2M05432701-0959375 3011892137543646080 1 5388 ± 68 4.710 ± 0.092 20.863 ± 0.560 6.62 ± 1.49 2.05 ± 1.00 24.27 ± 2.81 2 l1647-2

Note.

a0 = photosphere; 1 = photosphere with poor goodness of fit; 2 = photosphere up to H band.

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Download table as:  DataTypeset image

Table 2.  Astrometric Solutions from Gaia Utilized in the Analysis

α δ Gaia DR2 π μα μδ Group
(J2000) (J2000)   (mas) (mas yr−1) (mas yr−1)  
(1) (2) (3) (4) (5) (6) (7)
85.2329562 −10.639077 3010342680846510208 2.262 ± 0.093 0.768 ± 0.170 −0.601 ± 0.158 l1647-3
87.9103488 −10.637480 3010909277228047744 2.385 ± 0.075 −0.358 ± 0.126 3.808 ± 0.111 field
86.0242576 −10.557282 3011793525094619008 2.109 ± 0.025 0.179 ± 0.045 −1.225 ± 0.043 l1647-2

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Download table as:  DataTypeset image

To test the performance of the pipeline, we processed the APOGEE spectra of the stars observed toward the Pleiades with various synthetic grids, namely, the Coelho (Coelho et al. 2005), BT-Settl (Allard et al. 2012), and PHOENIX (Husser et al. 2013) spectral libraries, all of which were restricted just to the solar metallicities. A comparison was also made to the parameters previously obtained for the sources in the original IN-SYNC survey (Cottaar et al. 2014), which used the BT-Settl grid in spectral fitting. All the grids produced largely comparable solutions, although there were some systematic differences, most likely due to the underlying assumptions in the models (Figure 2). The most notable difference is in the absolute value of $\mathrm{log}\,g$: while the correlation between all models is largely linear, there is an offset of 0.5 dex between the solutions produced with PHOENIX and BT-Settl grids and an offset of 1.5 dex between the PHOENIX grid and the ASPCAP catalog. There are some differences in the absolute calibration of Teff. The choice of grid has little to no effect on the determination of RV and $v\,\sin \,i$. We ultimately adopted the PHOENIX grid for our spectral analysis, as it produced the most self-consistent solutions and covers the widest range in Teff (2300–15,000 K).

Figure 2.

Figure 2. Comparison of Teff and $\mathrm{log}\,g$ between the correlations produced with various grids. The first row shows a comparison with the results from Cottaar et al. (2014) and Da Rio et al. (2016), the second row is restricted to sources in the Pleiades (plateau is from the Coelho grid edges), and the bottom row includes sources that were observed by APOGEE toward the observed star-forming regions that have both Teff and $\mathrm{log}\,g$ produced in the ASPCAP catalog (which limits the sample primarily to the background red giants).

Standard image High-resolution image

While the pipeline does try to interpolate between grid points, it is less successful in some regimes than others (mainly with respect to Teff), and particular values for the parameters are preferred to the exclusion of others. The most prominent zones of avoidance (nodding) for the PHOENIX grid occur at Teff ∼ 3900, 5000, 5200, and 5700 K (Figure 3). A similar nodding effect is present in correlations with all grids, although it appears to be most frequent in the BT-Settl grid and weakest in the Coelho spectral library. Correlations with the PHOENIX grid also show a peculiar absence of sources at Teff ∼ 3500 K, and to a lesser degree at Teff ∼ 5000 K, although this gap does not follow the grid edges. The BT-Settl grid does not appear to show this gap (resulting in a dip deviating from the line of equity in Figure 2), and the presence or absence in the Coelho and ASPCAP grids is inconclusive owing to the limited temperature range. Comparison with optical surveys (e.g., Fang et al. 2018) suggests that most likely this 3500 K gap is due to the PHOENIX model producing NIR spectra that are not entirely representative of the real stars in this temperature regime.

Figure 3.

Figure 3. Distribution of Teff and $\mathrm{log}\,g$ for the sources correlated with the PHOENIX grid. Yellow circles show the entire sample observed by APOGEE toward all the young clusters (including sources in star-forming regions other than Orion, as well as the contaminating field population); only sources toward the Orion Complex that pass the imposed membership cuts are shown in blue.

Standard image High-resolution image

Veiling is a property that is only applicable to the stars that are accreting, and while a number of sources in the sample are disk-bearing accretors, a majority of them are not. However, when present, veiling will weaken spectral lines. Since the spectra were fit with only solar-metallicity models, the pipeline uses veiling to address any discrepancies in the absolute line depths. At low Teff, the veiling parameter does systematically rise up to 0.3–0.5, even though most sources are expected to have veiling ∼0 (however, the accreting sources with high veiling can be identified above the systematic values). We correlated a small fraction of the stars (∼200 sources in Orion) with a larger PHOENIX grid that does include the effects of metallicity. The systematic effect in veiling at low Teff for nonaccretors is significantly reduced to only ∼0.1. The correspondence in Teff between models with and without metallicity variations is mostly one-to-one (although there is some scatter beyond 5000 K). Comparing the reduction where metallicity was and was not allowed to vary, the surface gravities are consistent for sources with $\mathrm{log}\,g$ > 4, but offsets grow for lower $\mathrm{log}\,g$. Below $\mathrm{log}\,g$ ∼ 3, the parameters derived assuming solar metallicity produced $\mathrm{log}\,g$ about 1 dex lower than those where metallicity is allowed to vary (primarily affecting the location of the red giant branch). The effects of nodding and the gaps described above remain unchanged.

The pipeline is capable of two modes of operation: fitting a single epoch of a given star at a time, or processing all the epochs to simultaneously fit a single Teff, $\mathrm{log}\,g$, and $v\,\sin \,i$. However, with a large number of epochs, the latter mode produces increasingly unreliable solutions for the quantities that are allowed to vary between epochs, such as RV and veiling. We tested the pipeline on sources in the NGC 188 and Kepler 21 regions with 20+ epochs. Almost all sources had RVs of several epochs with signal-to-noise ratio (S/N) > 20 that were artificially scattered to values far outside the mean RV, compared to the solutions produced in the single-epoch fits. This effect was also present in sources with as few as three epochs. In extreme cases, an excessive noise from a single low-S/N spectrum may affect the invariant parameters as well. For this reason the parameters presented in this paper were derived by fitting each visit spectrum individually and then computing a weighted average of the visit-specific parameters, as the parameters from the individual visits have a much greater reliability. This is contrary to the approach taken in the IN-SYNC analyses, which adopted simultaneous fits that can result in spurious RVs. Because of this, identifications of spectroscopic binaries made on the basis of the original IN-SYNC multiepoch RVs may therefore be unreliable.

2.1.3. Uncertainties

The formal uncertainties σ in all the parameters are calculated by $\sigma =3{\sigma }_{\mathrm{MCMC}}\sqrt{{\chi }_{f}^{2}}$, where σMCMC is the variance in the MCMC output and ${\chi }_{f}^{2}$ is the reduced ${\chi }^{2}$ of the best fit. These calculations are similar to the approach by Cottaar et al. (2014) and provides a good agreement to the scatter in the data. If multiple observations of the same source are available, weighted averages (and the weighted uncertainties in those averages) are computed for all parameters from spectra with S/N > 20. It should be noted that these σ incorporate only the uncertainties in the fit, and not the systematic differences between various models.

For all sources with more than four epochs, we compute a degree of scatter ξ = Δ/σ, where Δ is defined as the absolute difference between the weighted average of a parameter and the value of that parameter at a given epoch (Figure 4). The distribution of ξ can be approximated by a Lorentzian 1/(γ(1 + ξ/γ)2) with a width γ = 0.4, and it is comparable for all parameters (the ξRV distribution does have a long tail, however, largely due to the presence of spectroscopic binaries). It is also largely invariant from the number of epochs available, or from which cluster a source originated. Approximately 80% of the scatter is within 1σ of the weighted average, ∼90% is within 2σ, and ∼95% is within 3σ. The median single-epoch uncertainties in all the parameters are ${\bar{\sigma }}_{\mathrm{RV}}$ = 0.4 km s−1 , ${\bar{\sigma }}_{\mathrm{log}g}$ = 0.1 dex, ${\bar{\sigma }}_{{T}_{\mathrm{eff}}}$ = 50 K, ${\bar{\sigma }}_{v\sin i}$ = 1.7 km s−1, and ${\bar{\sigma }}_{\mathrm{veil}}$ = 0.035. These uncertainties should be considered as a lower limit since they do not take into account potential added systematic offsets in the parameters.

Figure 4.

Figure 4. Top panel: distribution of ξ values in the entire sample for all parameters (normalized to 1 at ξ = 0.1). The black line shows the Lorentzian distribution with γ = 0.4. Middle panel: fraction of sources with ξ < 1 within Teff bins of 1000 K. Corrections for σ at Teff > 6250 K are applied. Bottom panel: same as above, but for $\mathrm{log}\,g$ bins of 0.5 dex.

Standard image High-resolution image

There does not appear to be a significant variation in the distribution of ξ of any parameters as a function of temperature or surface gravity at Teff < 6250 K and $\mathrm{log}\,g$ > 1 (Figure 4, although RVs of giants with $\mathrm{log}\,g$ < 3.5 may be affected by jitter). Since stars with high Teff have only a few spectral features that do not change significantly between different templates, σRV, σvsini, σveil, and σlogg appear to be underestimated by a factor of 2, and ${\sigma }_{{T}_{\mathrm{eff}}}$ by a factor of 8 in sources with Teff > 6250 K. We correct the uncertainties by the corresponding factors. The uncertainties for parameters with $\mathrm{log}\,g$ < 1 are somewhat more difficult to correct owing to a very low number of the affected sources; however, all of them are on the red giant branch, for which more precise fits are available in the ASPCAP catalog.

2.1.4. Stellar Radius via Spectral Energy Distributions (SEDs)

The observed APOGEE spectra have been fitted by the method discussed by Stassun & Torres (2016) and Stassun et al. (2017), in which a star's angular radius, Θ, can be determined empirically through the stellar bolometric flux, ${F}_{\mathrm{bol}}$, and effective temperature, Teff, according to

Equation (1)

where σSB is the Stefan–Boltzmann constant.

${F}_{\mathrm{bol}}$ is determined empirically by fitting stellar atmosphere models to the star's observed SED, assembled from archival broadband photometry over as large a span of wavelength as possible, preferably from the ultraviolet to the mid-infrared. Teff is measured from the APOGEE spectra, and thus the determination of ${F}_{\mathrm{bol}}$ from the SED involves only the extinction (AV) and an overall normalization as the two free parameters.

We adopt the broadband photometry available in the literature spanning the wavelength range of approximately 0.15–22 μm. As demonstrated in Stassun et al. (2017), with this wavelength coverage for the constructed SEDs, the resulting ${F}_{\mathrm{bol}}$ are generally determined with an accuracy of a few percent. For stars with Teff uncertainties of ≲1%, the ${F}_{\mathrm{bol}}$ uncertainty is dominated by the SED goodness of fit. With the exception of a few outliers, it was shown that one can achieve an uncertainty on ${F}_{\mathrm{bol}}$ of at most 6% for ${\chi }_{\nu }^{2}\leqslant 10$, with 95% of the sample having an ${F}_{\mathrm{bol}}$ uncertainty of less than 5%.

Here we apply this methodology to stars for which the estimated veiling in the H band was less than 0.5, suggesting little to no active accretion. Most of the stars should therefore be diskless, and the SED fit should be representative of the bare stellar photosphere. However, a number of the stars were found to exhibit significant excess emission in the WISE infrared bands, and in some cases in the Two Micron All Sky Survey (2MASS) KS band as well. Therefore, we ran the SED fitting procedure a second time on these stars, now excluding the 2MASS KS and the WISE photometry. Out of 8991 stars observed toward Orion, 6850 could be fit effectively as pure photospheres with the full wavelength range up to 22 μm, 298 had somewhat poorer goodness of fit, and 663 can be fit as photospheres up to H band, excluding longer-wavelength photometry. The remaining 1180 sources could not be fit owing to either high veiling or large infrared excess.

2.1.5. Sample

The sources observed toward the Orion Complex with APOGEE had substantial contamination from the foreground and background field stars. To minimize this contamination in the analysis throughout the paper, we excluded sources with RV < 10 km s−1 or RV > 40 km s−1, or those that fail the cuts in parallax, PMs, or color imposed in Section 2.2, if the data from Gaia are available (Figure 5).

Figure 5.

Figure 5. Distribution of sources that satisfy the membership cuts toward the Orion Complex. The green symbols show sources that are uniquely detected by Gaia, the red symbols those that are uniquely detected in the APOGEE footprint, and the blue symbols those that are common between both surveys.

Standard image High-resolution image

2.2. Gaia

Gaia has recently produced its second data release (Gaia Collaboration et al. 2018), which contains astrometric solutions of parallax and PMs for 1.3 billion stars with G < 21 mag, significantly improving on the precision of previously available estimates of these parameters from Gaia DR1 and Hipparcos.

Previously, parallax and PMs have been measured for a number of young stars with the VLBA (Melis et al. 2014; Kounkel et al. 2017b; Ortiz-León et al. 2017a, 2017b; Dzib et al. 2018; Galli et al. 2018). The precision of these observations is comparable to that of Gaia DR2. Currently, a direct comparison of the measured distances between two surveys can be performed for 55 stars.

In general, there does appear to be good agreement between the measurements for single stars, as well as spectroscopic and long-period binaries (Figure 6), i.e., the sources that do not strongly deviate from an approximation of a linear PM. The astrometric solutions between the two surveys can be described with

The systematic offset between parallax measurements is consistent with the zero-point offset of −0.03 mas reported by Lindegren et al. (2018). The large offset in μδ is driven by a few sources that may be affected by multiplicity that has not been apparent in the time frame of the VLBA-only observations; this offset is consistent with 0 for the remaining sources. As Gaia DR2 still does not have a prescription for astrometric binaries, the systems with a period of a few to a few dozen years present a considerable source of uncertainty that significantly increases the scatter in all parameters. In future data releases, this is expected to be improved.

Figure 6.

Figure 6. Comparison of the astrometric solutions between Gaia and VLBA observations. Blue symbols show the sources that did not show deviation from linearity in the VLBA observations (single stars, spectroscopic binaries, binaries with very long periods), green symbols show the sources that were found to be accelerating over the course of the VLBA observations, and red symbols are the astrometric binaries with periods up to ∼15 yr. The right column shows the difference between two measurements vs. their combined uncertainty.

Standard image High-resolution image

To analyze the structure within the Orion Complex, we selected the sources in Gaia DR2 with a position on the sky of 75° < α < 90° and −11° < δ < 15°, parallax 2 mas < π < 5 mas, and PMs −4 mas yr−1 < μα < 4 mas yr−1 and −4 mas yr−1 < μα < 4 mas yr−1 (Table 2). The ranges were chosen from the visual examination of the data to include all the overdensities that are associated with Orion. We excluded the astrometric measurements for the sources that had σπ > 0.1 mas or σμ > 0.2 mas in order to retain only the sources with precise measurements. Furthermore, we discarded the sources that do not satisfy

where MG = G + 5 − log(1000/π) in order to minimize the contamination from the main-sequence and red giant stars (Figure 7).

Figure 7.

Figure 7. Color–magnitude diagram for sources observed with APOGEE and those that satisfy positional and kinematical cuts in the Gaia catalog without the APOGEE counterparts. The blue sources show those that pass the color cuts to identify the members of the Orion Complex; the yellow sources are the ones that have been rejected.

Standard image High-resolution image
Figure 8.

Figure 8. Examples of the generated synthetic population (that remains after all the cuts, described in the Appendix), both single and multicluster runs. Circles show the cluster members; diamonds are the field population. Filled symbols correspond to the sources for which membership was accurately determined; open symbols correspond to either false positives (diamonds) or false negatives (circles). Colors are used to distinguish between multiple clusters in the same population. Only projection onto the sky is shown.

Standard image High-resolution image

Throughout the paper, all the calculations are done in the parallax space. Conversion to distances occurs only when we report the averages and to obtain the stellar ages. When converting from parallax to the distance space, we correct for 0.03 mas offset and assume a systematic error of 0.02 mas (Lindegren et al. 2018).

2.3. Age Derivation

We estimate the stellar ages using several different methods. One set of estimates is purely photometric. For YSOs without a reliable parallax measurement, we calculate MG by assigning the average distance of their corresponding group (See Section 4). These calculations of MG ignore the effect of extinction: outside of the youngest regions that are still associated with the molecular gas, representing most of the footprint of the Orion Complex, the extinction is expected to be relatively small (AV ∼ 0). We then interpolate MG and [GB − GR] over the PARSEC-COLIBRI grid of isochrones (Marigo et al. 2017). Despite the latest bandpass definitions, there appears to be a systematic offset between the data and the grid; to correct for it, we offset the grid's MG by +0.2 mag and [GB − GR] by +0.15 mag. We note that the absolute calibration of the resulting ages may be imprecise, but the relative ages across the Complex should be largely consistent outside of the regions strongly affected by extinction. We refer to the ages derived with this technique as AgeCMD.

In addition to the above calculations, we also obtained ages for the sources observed with APOGEE, working with the Teff obtained from the spectra (see Section 2.1.2) and ${L}_{\mathrm{bol}}$. We work with the extinction values computed as described in Section 2.1.4, as well as those estimated comparing the observed G − J and GBP − GRP colors from Gaia DR2 and 2MASS catalogs with the intrinsic colors obtained by interpolating Teff in a modified version of Table A5 from Kenyon & Hartmann (1995) to allow for the new bandpasses and then transforming the color excesses to AV using the coefficients CG = 0.9145, ${C}_{{G}_{\mathrm{BP}}}=1.039$, and ${C}_{{G}_{\mathrm{RP}}}$ = 0.601 and a relation of ${A}_{V}=2.283{(({G}_{\mathrm{BP}}-{G}_{\mathrm{RP}})-({G}_{\mathrm{BP}}-{G}_{\mathrm{RP}})}_{o})$. With these extinctions and using the parallaxes from Gaia DR2, we estimated the absolute MG and MH magnitudes, which were converted to ${L}_{\mathrm{bol}}$ by applying the bolometric corrections from Kenyon & Hartmann (1995). To obtain the ages, we interpolated ${L}_{\mathrm{bol}}$ and Teff into the PARSEC-COLIBRI isochrones, as explained in Suárez et al. (2017). When possible, ages obtained from both bands were averaged, and we refer to them as AgeHR. In contrast to AgeCMD, because of the dependence on Teff from APOGEE, AgeHR values are not available for the entirety of stars that are identified as members of the Orion Complex. Where they are available, however, they are somewhat more reliable for the regions that are still associated with the molecular gas.

Finally, for the sources observed with APOGEE, we also interpolate the spectroscopically determined Teff and $\mathrm{log}\,g$ values directly to estimate the ages. It is difficult to interpolate over these parameters with the PARSEC grid owing to strong nonlinear offsets between the grid and the data. But the relative distribution of ages for the various populations that we can infer from the spectroscopic parameters is consistent with the photometric distribution.

We do not report age measurements for the individual stars owing to the multitude of the utilized methods and the scatter in the measurements, which we plan to study and minimize in future work, but we discuss the population-averaged ages (both AgeCMD and AgeHR) in Section 4. In most cases, both are comparable, with the main difference originating from a somewhat different sample size.

3. Clustering Algorithm

A number of techniques are available for identifying clustering in multidimensional data. Three main approaches that are commonly used are nonparametric hierarchical clustering, k-means, and parametric mixture modeling. The first method relies on computing a distance matrix between all sources and then grouping sources with a chosen algorithm together with a distance below a certain threshold. k-means requires an assumption regarding the number of groups present and iteratively partitions the data until the centroid within each partition does not change significantly between steps. The mixture modeling fits a population as a collection of parametric distributions, usually Gaussian, and gives a probability to each source of belonging to a particular group based on this distribution. Here the number of clusters emerges as a parameter of the model.

A number of algorithms are available that utilize these techniques. Each one has its own advantages and limitations, pertaining to the ability to reject nonclustered field sources that are not part of any group (such as foreground or background stars); ability to separate populations that are closer to each other than their respective sizes; ability to identify structures of different sizes, shapes, and densities; and dependence on the initial assumptions pertaining to the structure within the data. A review of various methodologies is presented by Everitt et al. (2011) and Feigelson & Babu (2012). We chose "average linkage" hierarchical clustering that gives structures intermediate between elongated (single linkage) and compact (complete linkage).

A major issue that affects most of these algorithms is the difficulty in processing sources with incomplete data (such as for sources that have RV but not μ/π and vice versa), making them impractical to use in this analysis. A common statistical treatment in these cases is to either exclude these sources from a clustering analysis entirely or fill in the missing values with sensible estimates through imputation (e.g., Gelman & Hill 2006). While both techniques can work, it is possible for them to introduce strong biases (Wagstaff & Laidler 2005).

In this section we describe a hierarchical clustering procedure for which we developed a prescription for the missing data. We use this code to identify distant structures in the positional and kinematical data provided by Gaia and APOGEE toward the Orion Complex. We then test this code on a synthetic population of stars with approximate precision of the APOGEE and the Gaia DR2 to determine how reliably it is able to recover clusters. Prior to the release of DR2, the code was also tested with Gaia DR1 (Gaia Collaboration et al. 2016) and HSOY (Altmann et al. 2017), although due to the significant improvement in the data quality and quantity of DR2, these tests are not described in the text.

Throughout this discussion, all the velocity components are used in the local standard of rest (LSR) reference frame, and we will use vr, vα, and vδ to explicitly refer to RVLSR, μα,LSR and μδ,LSR. However, multiple definitions of LSR exist, and both are used in the text. One of them can be applied to all three dimensions of motion by first converting them to the galactic reference frame (Johnson & Soderblom 1987; Perryman & ESA 1997), subtracting the stellar motion (Schönrich et al. 2010), and converting back to the equatorial reference frame. If one or two of the dimensions of motion are unavailable, they are set to zero for the purposes of the conversion, but they do not have any bearing on the remaining dimensions. The other definition of LSR is used to determine just vr, from assuming the solar motion of 20 km s−1 toward α = 18h, δ = 30° (in B1900 reference frame; Gordon 1976). The former definition is used for the analysis of structure and kinematics in Orion, while the latter is used for comparisons of the stellar motion to that of the molecular gas, as it is more commonly used in radio astronomy. In Orion, both definitions produce vr comparable to within 1 km s−1.

3.1. Methodology

Hierarchical clustering algorithms require one or more rules (or "constraints") to define where to "cut the tree" so that a small number of physically reasonable clusters are produced. These rules can be algorithmic, as with the choice of critical minimum cluster population in the widely used single-linkage clustering procedure of Gutermuth et al. (2009). But the constraints can be based on disciplinary knowledge external to the data set under study. We choose to set four constraints: maximum cluster size, maximum velocity range, an outlier rejection we call "reach," and the minimum number of stars. We set the values of these constraints based on a detailed simulation study described in Section 3.2. In statistical parlance, this is an example of "constrained" or "semi-supervised" clustering (Basu et al. 2008).

We identify structures using the 6D data (α, δ, π, vα, vδ, vr), first standardizing them by scaling each dimension using the standard deviation of the values within it to avoid a mixture of units between positional and kinematical components and to give all dimensions an equal weight. Then, we compute the Euclidean pairwise distance matrix (L2 norm) between all the sources using the IDL routine distance_measure. Groups that are formed with this metric are invariant to rotations and translations in the p-space (Duda et al. 2001).

Given that astrometric and spectroscopic catalogs do not have a complete agreement between the sources that are included, APOGEE-only sources have only three dimensions of data (α, δ, vr), and Gaia-only sources have five dimensions (α, δ, π, vα, vδ). 6D distances involving these sources become undefined. To incorporate all the available data, we compute distance matrices on all three permutations (3D, 5D, and 6D) of the catalogs separately. All three matrices have the same size to simplify their merger later on even though certain elements of each matrix may not be defined in cases where data are incomplete. Nonetheless, even though all variables have been normalized, the measures of distance in each matrix are not compatible: those that were measured from the data with the larger number of the available dimensions are systematically larger (e.g., by $\sqrt{5/3}$). Additionally, the difference in source selections may further bias the measured distances. To correct for this, each matrix is then normalized by the 2nth-smallest element in the matrix, where n is the number of sources that have complete data in a given number of dimensions. This is an approximation of the median linking length of a constructed dendrogram for each permutation. The distance matrices are merged into a single distance matrix so that a unified clustering calculation can be performed, keeping the value from the permutation with the largest number of available dimensions for each pair of stars.

Then, a modified version of the IDL routine cluster_tree is used to identify structures in the data using the computed distance matrix, using the pairwise average linkage. Note that average linkage gives more compact and reliable clusters than the more commonly used single-linkage ("friends-of-friends") algorithm (Everitt et al. 2011). By default, it continues to run until every single source in a catalog is joined into a single group. To minimize excessive merging of unrelated structures or contamination from field stars, we imposed constraints that prevent a star from joining into a group if (1) the resulting group would be bigger than 4° in diameter (∼20–30 pc at the relevant distance range), (2) the absolute velocity difference (in any of the three dimensions) is bigger than 9 km s−1, and (3) after a group has begun growing, it cannot accept any new stars if the linking length necessary to do so relative to the median linking length binding all stars to a cluster is greater than the threshold of 1.2 (cluster reach). At the end of the process, each star has a unique group assignment, and groups with fewer than 10 members are rejected. All these parameters are chosen empirically through the testing of performance on a synthetic data set, which is described below.

3.2. Synthetic Tests

We applied the average linkage clustering algorithm on the synthetic population described in the Appendix. This population consists of a uniform distribution of field stars into which is embedded a randomly generated cluster with parameters such as distance, characteristic cluster velocity, age, and a number of stars that follow the normal distribution of a given velocity dispersion and the characteristic size. These parameters are varied in a manner that is representative of the ranges appropriate for the young stars toward Orion (Figure 8). While a single cluster cannot compare to the real population in terms of complexity, it can be used to test the performance of the algorithm in the various regimes, and the recovery properties of a single cluster are comparable to the recovery properties of multiple clusters in a single population.

For the purposes of clarity, the term "cluster" will refer to the generated population of YSOs, and the term "group" will refer to the output of the algorithm. To test the performance of the algorithm, we analyzed results for 1000 random clusters (each with a unique field contamination) processed by the clustering algorithm under a variety of input parameters (e.g., maximum group size, maximum group velocity). We evaluated the success of the clustering algorithm by computing the following for the resulting groups:

  • 1.  
    False-positive fraction (FP) is defined as a number of field stars falsely identified as group members relative to the total number of grouped stars.
  • 2.  
    True-positive fraction (TP) is defined as the number of true cluster members in a group relative to the total number of true cluster stars.
  • 3.  
    If FP > 0.5, then the group is considered as fake. In a given set of permutations, a single TP and a single FP are calculated, adding the values for all the real groups (i.e., FP < 0.5) found.
  • 4.  
    Split cluster fraction (SC) is defined as $\sum {N}_{\mathrm{real}}/{N}_{\det }-1$, where $\sum {N}_{\mathrm{real}}$ is the total number of real groups identified in all runs and Ndet is the number of runs in which at least one group was detected (which may be <1000, as in some configurations of clusters no groups can be identified).
  • 5.  
    Similarly, fake cluster fraction (FC) is defined as $\tfrac{\sum {N}_{\mathrm{fake}}}{{N}_{\det }}$, where $\sum {N}_{\mathrm{fake}}$ is the total number of fake groups identified in all runs.
  • 6.  
    Missing cluster fraction (MC) refers to the fraction of runs without a single group detection relative to the total number of permutations, i.e., $1-\tfrac{{N}_{\det }}{1000}$.

Prior to determining the exact set of conditions that would be appropriate to use to truncate the branches of the cluster tree, we explored the parameter space, varying one property of the algorithm at a time, in order to consider the effect they had (Figure 9). The goal of this exercise was to minimize the contamination while maximizing the fraction of true cluster members recovered by the algorithm to determine the appropriate thresholds for all the parameters.

Figure 9.

Figure 9. Synthetic cluster recovery performance. Panels (a)–(d) refer to the parameters of the algorithm, and the calculations were performed on the same set of 1000 clusters. The vertical line shows the default parameters used. Panels (e)–(i) refer to the synthetic population properties.

Standard image High-resolution image

Cluster reach (tolerance relative to the median linking length in a group to reject outliers) has relatively little effect on all of the parameters. As it relaxes, contamination increases as well (FP increases by 2% from varying the reach parameter from 1 to 2; FC increases by 4%), and clusters are more likely to be split into several groups (MC increase of 5% over the same range), while having small gains in TP (TP increases by 2.8%).

Varying a maximum allowed size of a recovered group has the largest effect on TP and MC: if it is too restricting, so as not to encompass the entire cluster, cluster members are more likely to be missed. On the other hand, contamination levels rise relatively modestly with more tolerant thresholds. Varying the maximum size from 2° to 4° increases FP by 2%, increases TP by 10%, and decreases SC by 8%. This parameter was fixed to 4°, which was the extent of the simulation, and which is almost large enough to encompass an entire molecular cloud in the Orion Complex.

The maximum velocity cut has a qualitatively similar effect to the maximum size; however, relaxing the threshold beyond 9 km s−1 increases the contamination significantly, with FC rising by 14% at 12 km s−1. Finally, the critical number of stars—defined as the minimum number of stars that can be considered as a group at the end of the output—also has a significant effect. Allowing smaller groups increases TP through producing smaller groups that split from the main cluster, while also exponentially increasing FC.

With the default parameters listed in Section 3.1 applied to the synthetic data set, TP = 79.5% ± 0.5% (27,648 out of 34,784 cluster members identified), FP = 6.20% ± 0.14% (1830 out of 29,524 grouped stars were from the field), SC = 27.8% ± 1.7% ($\sum {N}_{\mathrm{real}}=1247;$ Ndet = 976), FC = 2.9% ± 0.5% ($\sum {N}_{\mathrm{fake}}$ = 28), and MC = 2.4% ± 0.5%.

Next, we tested the effect of the specific cluster properties on the recovery. Holding one cluster parameter at a time fixed to a specific value and allowing others to vary in the manner described in the Appendix, we generated 1000 clusters for each iteration. As the number of stars in a cluster becomes larger, the cluster is increasingly more likely to split into several smaller groups: clusters with 75 stars have produced multiple groups in 85% of iterations. While many of these splits occur along one dimension, some of the resultant groups are significant. While clusters were randomly generated, some correlations between the positional and kinematic components may occur, and multiple groups may portray this correlation. Sometimes, a split may occur just through a single dimension, and a closer examination of the output is necessary to confirm its significance. On the other hand, clusters with fewer than 30 stars are less likely to be recovered in the first place: almost 10% of clusters with 20 stars have not produced any groups.

Cluster distance does not have a significant effect on any of the tests, nor does the cluster age up to 15 Myr, as clusters older than that would require different photometric cuts, and neglecting any effects of that would increase the dispersion of the older clusters beyond the random generation. Compact clusters with small velocity dispersion are more likely to be recovered, with TP > 80% for the typical cluster radii less than 2.5 pc (with the Gaussian distribution of stars in a cluster, a total cluster size is >10 pc), and for velocity dispersion less than 2 km s−1. Those with velocity dispersion larger than 1.5 km s−1 are likely to run into the edges of the maximum allowed velocity range and be split into multiple groups, with the increase of SC from ∼20% up to 60% at 2.5 km s−1. It should be noted, however, that in the Orion Complex the ONC is the most massive cluster, the velocity dispersion of which is ∼2.5 km s−1 (Kounkel et al. 2016); in other regions dispersion on the order of 1 km s−1 is not uncommon (e.g., Briceño et al. 2007; Nishimura et al. 2015).

Then, we considered the effect of the field contamination, by increasing the number of field stars by a multiple factor to what has been previously adopted in the Appendix. If the field population is underestimated in the generation of the synthetic population (or, if the photometric cuts are modified to include YSOs older than 15 Myr), then TP does decrease somewhat (dropping to ∼74% if the field population is 3 × 4.5 × 104 stars), with a slight increase in FP (up to 11% at the same level), but the main difference is the significant rise in FC (almost every run produces one or more fake groups), which requires additional vetting.

Finally, a test was performed for the confusion in assignment of the stars in a multicluster population. Two clusters were placed at a random position in the sky in a field population covering 8 × 8 deg2. Other properties were allowed to vary as above. A group was identified as being associated with a particular cluster if more than 2/3 of all stars identified in the group originated from that cluster. In a run consisting of 10,000 simulations, we recorded the fraction of stars from a neighboring cluster that made it into a wrong group, which was 0.6% ± 0.01%. We also tracked the fraction of the permutations in which at least one of the identified groups contained a mix of stars such that no clear cluster assignment could take place (with at least one-third of sources in a group originating from one cluster, and one-third from the second cluster). This fraction was somewhat higher at 2.04% ± 0.14%, where 60% of the affected groups were split off from the main groups associated with each cluster, 20% had confident grouping of one of the clusters and confusion regarding the assignment of the second one owing to the contamination from the first, and in 10% of the cases both clusters are merged into a single group. Unfortunately, due to a multitude of the dimensions in which parameters could vary, it is difficult to provide the specific conditions of the minimum distance between two stellar populations before confusion would occur.

4. Results

In this section we present the results of the clustering algorithm (Figure 10), which was applied to 10,248 stars, of which 890 came solely from the APOGEE sample, 7081 were unique to Gaia, and 2277 appeared in both data sets with complete 6D characterization. In total, we identified 190 groups throughout the Orion Complex; their average properties are listed in Table 3. These groups were manually combined into five larger structures, which are described below (Figures 1113), and when feasible cataloged according to the closest major object near them, such as a star visible to the naked eye, known cluster, or molecular gas region.

Figure 10.
Start interaction
Standard image High-resolution image Figure data file
Figure 10.

Figure 10. Left: structure of the Orion Complex as traced by the characteristic sizes of the identified groups. Colors correspond to the different populations: Orion A (green), Orion B (orange), Orion C (cyan), Orion D (red), and λ Ori (blue). Right: typical kinematics of the groups. Dashed lines show the PM of those groups in which no RV information is available. The sizes of the points at the beginning of the lines correlate to the number of stars in a group. By default, LSR kinematics are displayed; buttons convert the kinematics to the reference frame of a given population. An interactive figure is available in the online version.

Start interaction
Standard image High-resolution image Figure data file
Figure 11.

Figure 11. Distribution of stars identified as members Orion A (green), Orion B (orange), Orion C (cyan), Orion D (red), and λ Ori (blue). In the left panel, black circles show the position of the major bright stars in Orion.

Standard image High-resolution image
Figure 12.

Figure 12. PMs of the stars identified as members of the Complex, relative to the average (LSR) PM in each structure. The black arrow shows the subtracted average motion, which has a magnitude of ∼3.1, 3.4, 3.7, 2.9, and 3.4 mas yr−1 in Orion A, Orion B, Orion C, Orion D, and λ Ori, respectively. The length of the vectors corresponds to the motion of over 1.2 Myr. Note that the scale is not consistent across all panels. Black circles show the position of the major bright stars in Orion; in the bottom right panel the bigger circle is λ Ori.

Standard image High-resolution image
Figure 13.

Figure 13. Estimation of stellar ages for the sources in the Orion Complex. Left: AgeHR; right: AgeCMD.

Standard image High-resolution image

Table 3.  Identified Groups and Their Properties

Cluster ID $\overline{\alpha }$ $\overline{\delta }$ $\overline{{v}_{r}}$ a $\overline{{v}_{\alpha }}$ a $\overline{{v}_{\delta }}$ a $\overline{{RV}}$ b $\overline{{\mu }_{\alpha }}$ b $\overline{{\mu }_{\delta }}$ b $\overline{\pi }$ $\overline{{\mathrm{Age}}_{\mathrm{CMD}}}$ $\overline{{\mathrm{Age}}_{\mathrm{HR}}}$ σα σδ ${\sigma }_{{v}_{r}}$ ${\sigma }_{{v}_{\alpha }}$ ${\sigma }_{{v}_{\delta }}$ σπ ${\sigma }_{{\mathrm{Age}}_{\mathrm{CMD}}}$ ${\sigma }_{{\mathrm{Age}}_{\mathrm{HR}}}$ Ningroup NAPOGEE
  (J2000) (J2000) (km s−1) (mas yr−1) (mas yr−1) (km s−1) (mas yr−1) (mas yr−1) (mas) (Myr) (Myr) (deg) (deg) (km s−1) (mas yr−1) (mas yr−1) (mas) (Myr) (Myr)    
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13) (14) (15) (16) (17) (18) (19) (20) (21) (22)
l1641N-1 83.70 −6.76 7.42 0.36 2.58 24.71 0.92 −0.18 2.572 3.5 2.2 0.54 0.67 1.44 0.29 0.35 0.086 1.9 1.6 104 29
l1641N-2 84.61 −6.97 4.24 −0.53 1.99 21.67 −0.19 −0.54 2.436 0.8 2.1 0.47 0.51 1.60 0.37 0.23 0.084 1.1 1.1 25 15
l1641N-3 83.85 −6.45 3.79 0.18 3.38 21.09 0.61 0.61 2.539 3.0 5.2 0.37 0.65 2.85 0.18 0.16 0.061 1.9 1.1 16 6
l1641N-4 83.89 −6.20 9.21 0.81 3.33 26.48 1.34 0.72 2.503 2.0 0.7 0.09 0.22 0.19 0.72 0.38 0.196 1.6 1.9 11 11

Notes.

aLSR. bHeliocentric.

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

Download table as:  DataTypeset image

There are six groups identified as spurious detections, as they are unlikely to correspond to any larger structure; they are also listed in the table, for completeness.

In many cases, the identified groups may not necessarily correspond to distinct subclusters. In some cases, the split may occur only along one principle axis with the artificial split smaller than the dispersion in the cluster (e.g., in NGC 2024 groups are separated in vr by ∼0.5 km s−1, with a total dispersion velocity of 1.3 km s−1). This is most apparent in the massive clusters: the ONC alone is associated with more than 30 groups. This occurs because the algorithm is calibrated to recover less numerous, less concentrated populations that occur in the other parts of the Complex. There may be a few clustering algorithms that might have an improved performance when confronted with such vast differences in scale through a mixture of several techniques (e.g., Zhang et al. 1996; Karypis et al. 1999). Here, however, we use the identified groups to trace the distribution of the larger structure and look for strong significant deviations that are attributable to real subclusters. Some confusion may also occur when the sources from two separate large structures may have some overlap in one of the dimensions, if they have incomplete data in other dimensions. For example, there are a few groups associated with Orion D where only one or two sources have RV information, which is maybe more consistent with the Orion C population. This affects a relatively small fraction of sources, and when reporting on the averages we require at least three measurements in a given dimension.

4.1. Orion C and D

In the observations of RV toward σ Ori, Jeffries et al. (2006) have observed two distinct velocity components separated by 7 km s−1, one of which was interpreted to originate from the population of stars associated with σ Ori, whereas the other component is attributed to Ori OB1ab. We obtain similar results in the RV observations with APOGEE. The two components appear to extend northwest significantly beyond the area originally covered by Jeffries et al. (2006). The component centered at vr ∼ 3–5 km s−1 (Figure 14.1) traces the population of YSOs surrounding the belt stars, as well as the OB stars such as η Ori, 22 Ori, 25 Ori, and ψ2 Ori. This population includes stars associated with Orion OB1a and OB1b. It stretches further south beyond the area covered by the APOGEE footprint toward Rigel, which includes a few of the Orion outlying clouds (Alcalá et al. 2008; Biazzo et al. 2014), and some authors recently referred to the region as Orion X (Bouy & Alves 2015; Zari et al. 2017). On the other hand, the component centered at vr ∼ 13 km s−1 (Figures 11, 14.2) consists of the σ Ori cluster, and it stretches diagonally toward an area east of 25 Ori, concluding northeast of ψ2 Ori, largely excluding any other notably bright OB stars. The two populations are also found at different distances; this has also been recently found by Briceno et al. (2018). This demonstrates that σ Ori originated from a different cloud than the rest of the OB1ab regions. We extend the naming convention of the Orion A and B molecular clouds, and we will refer to this progenitor as the (former) Orion C molecular cloud. Collectively, we will also refer to the entire region encompassing OB1a, OB1b, and Ori X as Orion D. We note that the original definition of the OB1 subassociations is largely based on the area on the sky—not necessarily representative of the underlying structure. As two populations are largely superimposed onto each other, caution must be exercised in comparison with the literature, especially in the vicinity of OB1b. The motivation behind grouping it with Ori D is due to the tighter correlation of the foreground population with the belt stars, but many of the stars that have been associated with OB1b are also associated with the Orion C region.

Figure 14.

Figure 14.

Structure and kinematics of all of the identified groups toward the Orion Complex that show the projection of the individual measurements in all six dimensions. Orientation of the uncertainties in parallax and PMs includes the correlations from Gaia. Other panels include the distribution Teff and $\mathrm{log}\,g$ color–magnitude diagram, and a 3D rendering of the structure. Colors are randomly assigned to distinguish the groups from each other in all panels. Orion D is shown here; the complete figureset (5 images) is available in the online journal. (The complete figure set (5 images) is available.)

Standard image High-resolution image

At the present day, the gas from Orion C has been almost completely dissipated. There does appear to be a clear age gradient in the stars along the filamentary structure from α = 85°, δ = −3° to α = 82°, δ = 3° (Figure 13). Group C-North is the oldest, having a wide distribution in ages around a median of ∼7.5 Myr; C-Central has an equally broad distribution, with a median of ∼5.5 Myr, but also a peak at ∼3 Myr; and the σ Ori cluster is the youngest population, with an age of 1.9 ± 1.6 Myr. Both AgeHR and AgeCMD show a very similar distribution. Correcting for systematics, we measure the weighted average distances of 406 ± 4 pc for σ Ori, 413 ± 4 pc for Ori C-C, and 416 ± 4 pc for Ori C-N. Orion C also appears to be moving uniformly radially; however, it is expanding along the plane of the sky, with σ Ori and Ori C-N moving away in opposite directions from Ori C-C.

On the other hand, it is difficult to conclusively determine whether all parts of the Orion D population share the same progenitor or not, as the population containing them is largely continuous and mostly dispersed with few spatial kinematical differences. (In the case of OB1a and OB1b, this contradicts what has been previously observed by Briceño et al. [2007]; however, the sample of stars that they used as representative members of OB1b actually originated from the Orion C population.) Orion D has a comparable size in the sky to the entirety of Orion A, B, and C put together, and in another few megayears it is possible that those A, B, and C regions may become largely indistinguishable from each other. On the other hand, it is possible that Orion D is all part of a single expanding population, and most of the PM vectors are consistent with the process of expansion, with them dispersing most likely as a result of the loss of the molecular gas (e.g., Tutukov 1978).

Orion D has very little molecular gas remaining. Projected in part onto Orion B, the tail end of D corresponds to the very diffuse gas that can be seen in Figure 15 that has a gradient in vr from 6 to 3 km s−1. L1622 also is most likely associated with Orion D (see Section 4.3). A few diffuse clouds are found in the southwest, namely, L1616, L1634, and IC 2118. While most of the population is largely dissipated, some concentrated clusters are still apparent, such as 25 Ori at 354 ± 3 pc (AgeCMD = 6.2 ± 2.3 Myr, AgeHR = 7.4 ± 2.0 Myr), consistent with Briceno et al. 2018), ψ2 Ori at 347 ± 3 pc (AgeCMD = 5.5 ± 1.7 Myr, AgeHR = 4.9 ± 1.5 Myr), η Ori at 347 ± 3 pc (AgeCMD = 5.7 ± 1.4 Myr, AgeHR = 6.4 ± 1.9 Myr), OB1b at 357 ± 3 pc (AgeCMD = 3.7 ± 1.8 Myr, AgeHR = 4.1 ± 1.6 Myr), and L1616 at 360 ± 3 pc (AgeCMD = 5.0 ± 1.7 Myr). The southern portion does deviate in distance from the rest of Ori D—the population that is associated with IC 2118 is located at 291 ± 2 pc (AgeCMD = 7.9 ± 2.5 Myr).

Figure 15.

Figure 15. RVs of stars toward the Orion B molecular cloud. Blue symbols show measurements from APOGEE, red symbols those uniquely detected by Kounkel et al. (2017c). The grayscale background shows the 13CO molecular gas map from Nishimura et al. (2015).

Standard image High-resolution image

A small group of stars are found north of κ Ori with a distance similar to IC 2118 at 302 ± 2.5 pc. This group has not been cataloged previously, although a few stars we identify have been previously confirmed as YSOs by Alcalá et al. (2000). We will refer to this group as Orion Y. Further tests will be necessary to confirm the properties and membership of this group. This group is spatially discontinuous from Ori D and older (AgeCMD = 10.5 ± 2.5 Myr). However, it does have similar PMs; therefore, we group it together with Orion D on all the plots.

Pillitteri et al. (2017) have also observed a nearby group of stars surrounding V1818 Ori for which they estimated a distance of 270 pc. It is not clear whether it is associated with Orion Y, as we do not recover it, and nearly all sources identified as YSOs by Pillitteri et al. (2017) have π < 2 mas.

One group identified by the algorithm is kinematically peculiar. 25 Ori-2 (it was recently identified by Briceno et al. 2018, under the name of HR 1833) is kinematically distinct by almost 10 km s−1 in vr and 1.8 mas yr−1 in vα from the main group. It also appears to be significantly older than 25 Ori, with AgeCMD = 15.1 ± 3.4 Myr (AgeHR = 12.9 ± 2.8 Myr). Its kinematics are similar to those of Orion C; however, its distance is more consistent with the Orion D population. Since it is closer than Orion C but its vr is redshifted, it is unlikely that these stars have originated in Orion C; instead, they may be part of an earlier wave of star formation in Orion D.

4.2. Orion A

The clustering algorithm used in this work has a relatively poor performance when it is applied on very numerous and extended stellar populations with a large velocity dispersion, creating many largely artificial distinctions between separate regions of the same cluster. As such, the ONC creates a particular challenge to disentangle in terms of its true internal structure (Figure 14.3). The high degree of extinction toward the cluster exacerbates the problem further. The ONC has been the subject of a number of studies to analyze its structure using methods that may be more suited to such an environment, including the work done by Hillenbrand & Hartmann (1998), Kuhn et al. (2014), Megeath et al. (2016), and Hacar et al. (2016), using 2D and 3D data. Here we present the identified groups mainly as a method to trace the extremes of the stellar population in a 6D space, with the caveat that the groups themselves may not be physical in nature, particularly compared to the results observed in the other regions in this paper.

In previous RV studies toward the ONC, a population of stars were identified that are blueshifted to the molecular gas from which these stars have formed, although no obvious correlations have been found with any of the stellar properties or the position of those stars on the sky (Tobin et al. 2009; Da Rio et al. 2016; Kounkel et al. 2016). Unfortunately, stellar PM information cannot be compared directly to the kinematics of the molecular gas. With the addition of the astrometric solutions from Gaia, the mystery of the blueshifted population remains, showing no correlation with the distribution of distances or PMs when applied to the optical observations from Kounkel et al. (2016). However, it is notable that the APOGEE sample does show significantly better agreement between the stellar RVs and those of the molecular gas.

In general, regarding the 3D structure of Orion A, there is a good agreement with the model from Kounkel et al. (2017b). Using Gaia astrometry, the average parallax toward the ONC is 2.540 ± 0.001 mas; corrected for systematics, this results in a distance of 389 ± 3 pc. The southern end of L1641 is found at π = 2.364 ± 0.004 mas, or 417 ± 4 pc, with relatively smooth continuity between them, and L1647 is found at π = 2.227 ± 0.006 mas, or 443 ± 5 pc.

Similarly good agreement is found for the individual measurements of kinematics (however, it should be noted that there was an error in conversion from μα,δ to vα,δ in Kounkel et al. 2017b), but now it is possible to reconstruct the full map of the PMs along the cloud. The motions in the ONC appear to be mostly random, with slight preference for expansion near the outer edges. PMs in L1641 are preferentially oriented perpendicular to the filament in the plane of the sky (most of the grouped sources are located near the northern edge of the molecular cloud, and their PMs are oriented toward the gas); in L1647, stars are primarily moving away from the main filament in the plane of the sky.

We estimate AgeHR = 1.6 ± 1.5 Myr (AgeCMD = 3.2 ± 2.2 Myr, which is an overestimate due to extinction). This is consistent with the estimates by Getman et al. (2014). Similar distributions are present in the northern and southern parts of L1641, with ages of 2.0 ± 1.5 Myr and 1.9 ± 1.4 Myr (AgeCMD = 2.4 ± 1.8 Myr and 2.1 ± 1.7 Myr), respectively, consistent with the measurements from Hsu et al. (2013). L1647 is the youngest region, with AgeCMD = 1.3 ± 1.3 and AgeHR = 1.9 ± 0.9 Myr.

There is one group toward the ONC that is kinematically different from the rest (ONC-22), as it is offset from the main population in vα by 2.6 mas yr−1. Unfortunately, no RV measurements are available, but it does not significantly deviate in other dimensions, though it may be found closer to the back of the cluster.

Particular interest has been given in the past to NGC 1980, a population that exhibits relatively little extinction compared to the rest of Orion A. Several studies have suggested that this region is more evolved compared to the ONC and that it may be located somewhat in the foreground (Alves & Bouy 2012; Pillitteri et al. 2013; Bouy et al. 2014). However, there has also been some doubt to these claims (Da Rio et al. 2016; Fang et al. 2017; Kounkel et al. 2017a). We do find a diffuse distribution of YSOs in the foreground as part of the Ori D structure, and some of it does coincide spatially with Orion A. It does not appear to be particularly concentrated near the NGC 1980 region, and only ∼8% of the total that is found toward it (∼30 stars) is likely to be associated with the foreground population, but it is possible that they may have contributed to the confusion surrounding the issue.

Confusion with Orion D toward Orion A is most notable toward the north of the cluster, near NGC 1981, which is another population that has been previously observed to be older than the rest of the ONC (e.g., Maia et al. 2010). There are a number of groups that are observed to have a distance more consistent with that of Orion D, at 357 ± 3 pc, with AgeCMD = 5.0 ± 1.8 Myr and AgeHR = 2.8 ± 1.7 Myr. Most of these groups do not have reliable vr measurements, but those that do have a good agreement with that of the ONC—somewhat surprising considering the peculiar RV structure of Orion A north of δ < −5°, which is not representative of Orion D. While we associate these groups with Orion D, this assignment may not necessarily be accurate.

4.3. Orion B

Previously Kounkel et al. (2017c) studied the RV structure of the Orion B molecular cloud. They identified a peculiar RV structure toward the NGC 2024 region that is largely asymmetric from the RV structure of the molecular gas. Although due to a high degree of extinction toward it, their optical sample contained relatively few sources. With the expanded sample of the APOGEE observations, the agreement between the kinematics of the stars and the molecular gas is significantly improved. The blueshifted clump can be resolved into a kinematically distinct population that is centered at vr ∼ 6 km s−1 (Figure 15), which appears to be associated with Ori D. On the other hand, the redshifted clump is traced by the sources that originate from σ Ori (see Section 4.1).

Toward the NGC 2068 cluster the agreement between the stellar and molecular kinematics remains good. There is some hint of a slight excess of stars somewhat redshifted relative to the gas, but it is difficult to conclusively state how significant it is.

We do recover a population that is located east of L1617, in the gas-free region close to the edge of the molecular cloud. We will refer to this population as the Orion B-North group. All three clusters are found at comparable distances of 404 ± 5 pc, 417 ± 5 pc, and 403 ± 4 pc for the Ori B-N group, NGC 2068, and NGC 2024, respectively (Figure 14.4). There is a deviation between the distances measured by Gaia and those measured by Kounkel et al. (2017b) toward these regions; this could be attributed to the small sample size and multiplicity. In terms of PMs, both Ori B-N and NGC 2068 appear to be moving toward NGC 2024.

L1622 is found outside of the footprint covered by APOGEE, and we do not recover it owing to a limited number of sources in Gaia associated with it because of the high extinction. From seven Class II stars that have been identified by Megeath et al. (2012) as YSOs in L1622, we measure average astrometric parameters of π = 2.871 ± 0.026 mas (d = 345 ± 5.5 pc), vα = 5.343 ± 0.045 mas yr−1, and vα = 4.271 ± 0.040 mas yr−1. Combined with the typical vr = 1.17 km s−1 (Kun et al. 2008), we conclude that L1622 is not associated with the Orion B molecular cloud, but may possibly have been related to Orion D. On the other hand, while L1622 and L1617 appear to be projected in a similar area of the sky, it is probable that the two are unrelated. While no parallaxes are available for the sources associated with the molecular gas in L1617, it is more likely to be a part of Orion B given its vr (Reipurth et al. 2008).

We estimate AgeHR of 1.1 ± 1.0 Myr and 1.0 ± 0.5 Myr (AgeCMD of 1.9 ± 2.0 Myr and 1.0 ± 1.4 Myr) for NGC 2024 and NGC 2068, respectively. Ori B-N group has AgeCMD = 2.2 ± 1.0 Myr.

4.4.  $\lambda $ Ori

Most of the molecular gas toward λ Ori has already been dissipated by a supernova; however, a couple of clouds within the ionized bubble remain, namely, B30 in the northwest, and B35 in the southeast. YSOs observed toward this region are primarily located alongside a filamentary structure stretching between these clouds. Stellar RVs are largely consistent with the RVs of the molecular gas, although a direct comparison is no longer possible (Figure 16).

Figure 16.

Figure 16. RV measurements of stars toward λ Ori. The grayscale background shows the CO molecular gas map from Dame et al. (2001).

Standard image High-resolution image

In the cluster centered on λ Ori, though, there do appear to be two distinct kinematical components (Figure 14.5): the main one at vr ∼ 12 km s−1, and a secondary subcluster (λ Ori-2) at vr ∼ 14 km s−1. The separation is also apparent in PMs, but not in parallax (average distance to the cluster of 404 ± 4 pc) or in the spatial position on the sky. It appears that λ Ori-2 favors somewhat older ages than λ Ori-1, with AgeHR = 3.7 ± 1.0 Myr versus 5.1 ± 1.1 Myr (AgeCMD = 3.5 ± 0.9 Myr vs. 4.4 ± 1.4 Myr). A few groups (λ Ori-3 and λ Ori−4) also have somewhat peculiar PMs, although they lack a reliable number of RV measurements. YSOs near B30 (located at a distance of 396 ± 4 pc) are younger, 2.4 ± 1.3 Myr, with a near-uniform distribution of ages from 2 to 5 Myr, with the northernmost sources (δ > 12°) near the cloud having AgeCMD = 1.2 ± 0.8 Myr (AgeHR = 2.1 ± 0.8 Myr). These YSOs are clumped preferentially near the inner edge of the molecular cloud, with a common vr ∼ 10 km s−1. A similar distribution is found toward B35, with a typical age of 2.6 ± 1.3 Myr (distance of 397 ± 4 pc), with the easternmost sources becoming somewhat preferentially younger with an age of AgeCMD = 1.8 ± 1.5 Myr (AgeHR = 2.0 ± 1.0 Myr).

A supernova was produced near the cluster center a few megayears ago. Nearly all PMs larger than 1 mas yr−1 relative to the average cluster motion are moving radially away from λ Ori, and the farther the sources are away from the center of the cluster, the faster they tend to move (Figure 17). This is consistent with a single trigger expansion, which has a travel time from the cluster center of ∼4.8 Myr. On the other hand, sources within 1fdg5 from the cluster appear to be mostly relaxed.

Figure 17.

Figure 17. Magnitude of the PM vector of stars toward λ Ori (corrected for the average motion of the cluster) vs. the offset between the position angle of the PMs and the position angle of the star relative to the center of the cluster. Sources are color-coded according to their distance from the center of the cluster.

Standard image High-resolution image

Based on their age, B30 and B35 would have formed halfway from λ Ori to their current position and would not have been accelerated owing to the sudden mass loss of the molecular gas from the cluster. The fastest-moving stars have PMs of up to 6 km s−1 relative to the average motion of the cluster. Considering that

  • 1.  
    observationally, kinematics of YSOs usually closely mirror kinematics of the gas from which they form,
  • 2.  
    it is easier for an expanding shock wave to influence the velocity of the molecular gas compared to the already-forming YSOs, and
  • 3.  
    outlying stars are significantly younger than those in the cluster center,

it is possible that the formation of a number of those stars has been triggered by the supernova, similarly to the scenario described by Mathieu (2008). However, it should be noted that from simulations it appears to be difficult to distinguish between stars whose formation has been truly triggered and those that are in the process of formation regardless of any feedback (Dale et al. 2015). Further modeling of the cluster dynamics will be needed to demonstrate the role that triggering has played in the formation of the outlying stars.

There may be additional structure near the outskirts of λ Ori, which was suggested by Zari et al. (2017; e.g., L1588), although we do not recover it.

5. Conclusions

  • 1.  
    For the first time, we map the full extent of the population of YSOs found in the regions toward the portions of the Orion Complex that are currently devoid of molecular gas and trace their 3D kinematics. Using these data, we can now reconstruct the properties of the dispersed molecular clouds that produced them, and they represent an example of potential evolution of their younger counterparts.
  • 2.  
    Most notably, we identify two separate populations toward the Ori OB1ab subassociations that are coherently separated in RV and distance space. These two populations are projected on top of one another, both spanning several deg2. One of them, to which we refer as Orion C, has three distinct epochs of star formation, ranging from 2 to 10 Myr. This population is the progenitor of the σ Ori cluster. The other, which we refer to as Orion D, spans the full extent of OB1ab, tracing most of the brightest OB stars in the region, and extends further south toward Rigel, encompassing the Orion X region. We also recover a previously uncataloged population north of κ Ori, which we refer to as Orion Y.
  • 3.  
    We identify several peculiar groups that may represent kinematically distinct subgroups. Particularly notable are those located within λ Ori, ONC, and 25 Ori.
  • 4.  
    We measure average distances of 386 ± 3 pc for the ONC, extending up to 443 ± 4 at the southern end of Orion A, 407 ± 4 pc for Orion B, 412 ± 4 pc for Orion C, 350 ± 3 pc for Orion D, and 400 ± 4 pc for λ Ori, and we also trace the kinematics and the distribution of ages within them. Large structures that are devoid of molecular gas are preferentially expanding (although further investigation will be necessary to confirm whether individual clusters in these structures are bound or not). In λ Ori the expansion is largely radial and attributable to a supernova explosion, which could be modeled in the future. In Orion A, the kinematics are preferentially perpendicular to the cloud, and in Orion B it appears that various groups are preferentially moving toward NGC 2024.
  • 5.  
    Together these data represent a major step forward in terms of understanding the dynamical evolution within the young star-forming regions. This will be instrumental for the future modeling efforts of the assembly and dispersal of molecular clouds and the stars that form within them, as well as determining whether any of the subclusters will remain bound and analyzing the difference in the kinematical population as a function of age.

M.K. and K.C. acknowledge support provided by the NSF through grant AST-1449476 and from the Research Corporation via a Time Domain Astrophysics Scialog award (no. 24217). J.H. acknowledges support from programs UNAM-DGAPA-PAPIIT IN103017, Mexico. A.R.-L. acknowledges partial financial support provided by the FONDECYT REGULAR project 1170476. K.P.R. acknowledges CONICYT PAI Concurso Nacional de Inserción en la Academia, Convocatoria 2016 Folio PAI79160052. Support for J.B. is provided by the Ministry for the Economy, Development and Tourism, Programa Iniciativa Cientica Milenio grant IC120009, awarded to the Millenni. D.A.G.-H. and O.Z. acknowledge support provided by the Spanish Ministry of Economy and Competitiveness (MINECO) under grant AYA-2017-88254-P. Funding for the Sloan Digital Sky Survey IV has been provided by the Alfred P. Sloan Foundation, the U.S. Department of Energy Office of Science, and the Participating Institutions. SDSS-IV acknowledges support and resources from the Center for High-Performance Computing at the University of Utah. The SDSS website is http://www.sdss.org. SDSS-IV is managed by the Astrophysical Research Consortium for the Participating Institutions of the SDSS Collaboration, including the Brazilian Participation Group, the Carnegie Institution for Science, Carnegie Mellon University, the Chilean Participation Group, the French Participation Group, Harvard-Smithsonian Center for Astrophysics, Instituto de Astrofísica de Canarias, Johns Hopkins University, Kavli Institute for the Physics and Mathematics of the Universe (IPMU)/University of Tokyo, Lawrence Berkeley National Laboratory, Leibniz Institut für Astrophysik Potsdam (AIP), Max-Planck-Institut für Astronomie (MPIA Heidelberg), Max-Planck-Institut für Astrophysik (MPA Garching), Max-Planck-Institut für Extraterrestrische Physik (MPE), National Astronomical Observatories of China, New Mexico State University, New York University, University of Notre Dame, Observatário Nacional/MCTI, The Ohio State University, Pennsylvania State University, Shanghai Astronomical Observatory, United Kingdom Participation Group, Universidad Nacional Autónoma de México, University of Arizona, University of Colorado Boulder, University of Oxford, University of Portsmouth, University of Utah, University of Virginia, University of Washington, University of Wisconsin, Vanderbilt University, and Yale University. This work has made use of data from the European Space Agency (ESA) mission Gaia (https://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement.

Software: IN-SYNC pipeline (Cottaar et al. 2014), PHOENIX (Husser et al. 2013), PARSEC-COLIBRI (Marigo et al. 2017).

Appendix: Synthetic Population Properties

We generated a synthetic population of stars with properties matched to those observed toward the Orion Complex. This population includes synthetic field stars, as well as a cluster with a random age, distance, and 3D mean velocity. A random number of cluster members are generated with peculiar position and velocities, drawn from normal distributions corresponding to the cluster characteristic radius and velocity dispersion. Both the velocity dispersion and the characteristic size are allowed to vary in three dimensions independently of each other, so as to allow filamentary structures to form. All the parameters are drawn from a uniform distribution, with the ranges corresponding to those listed in the Table 4. This cluster is embedded into a field population, in which each field star has a random position within a 4 × 4 deg2 area on the sky, a distance of up to 1 kpc (to account for the Lutz–Kelker bias; Lutz & Kelker 1973), an age from 0.1 to 12 Gyr, and a 3D velocity drawn from a normal distribution centered at 0 km s−1 with a 25 km s−1 velocity dispersion (e.g., Rix & Bovy 2013). The masses for both the cluster and field stars are drawn from the initial mass function (IMF) from Muench et al. (2002). The various properties, such as Teff, $\mathrm{log}\,g$, MG, ${M}_{{G}_{B}}$, ${M}_{{G}_{R}}$, and MH band luminosities, are interpolated from the PARSEC-COLIBRI isochrones in accordance with the previously assigned masses and ages (Marigo et al. 2017).

Table 4.  Properties of the Synthetic Clusters

Property Min Max
Number of stars 15 75
Distance (pc) 300 450
Median velocity (vr, vα, vδ km s−1) −20 20
Cluster radius (pc) 0.4 5.0
Velocity dispersion (km s−1) 0.3 2.5
Age (Myr) 3 15

Download table as:  ASCIITypeset image

We assume an average stellar density of 0.09 M pc−3 (Kipper et al. 2018). The volume of space encompassed by the field stars in the simulation is ∼1.63 × 106 pc3. With the resulting average stellar mass of 0.3 M, we fixed the total number of field stars to 4.5 × 104.

Observational properties were then computed for the synthetic stars, with limits applied to simulate the impact of the APOGEE and Gaia detection limits. The positional and kinematical parameters are converted to the observable properties (vr, vα, vδ, π), and the apparent magnitudes G and H are computed. Stars fainter than G > 20 have their astrometric parameters discarded, and those with H > 13 have no spectral information. If a source fails the brightness test in both parameters, it is rejected from the sample. Noise is then applied to all the measurements:

  • 1.  
    G > 15: σπ = 0.04 mas, σμ = 0.06 mas yr−1.
  • 2.  
    G = 17: σπ = 0.1 mas, σμ = 0.2 mas yr−1.
  • 3.  
    G = 20: σπ = 0.7 mas, σπ = 1.2 mas yr−1 (Katz & Brown 2017).
  • 4.  
    Absolute magnitude MG is recomputed to incorporate the uncertainty in distance.
  • 5.  
    G = 13 mag: σG = 0.001 mag, ${\sigma }_{{G}_{B}-{G}_{R}}$ = 0.4 mag.
  • 6.  
    G = 20 mag: σG = 0.02 mag, ${\sigma }_{{G}_{B}-{G}_{R}}$ = 0.3 mag.
  • 7.  
    σv = 0.25 km s−1 for Teff < 7000 K (vr for sources with Teff > 7000 K are rejected, as their measurements are too uncertain).

In between the magnitude bins, the value of the uncertainties is interpolated. The sample is then limited only to those sources with 2 mas < π < 3.5 mas, −25 km s−1 < v < 25 km s−1, and −20 mas yr−1 < vα,δ < 20 mas yr−1 (if such parameters are defined), which includes only ∼3500 field stars. Furthermore, photometric cuts

are applied to exclude the field main-sequence stars and red giants (Figure 18). After this, only ∼285 noncluster sources remain in the sample. For populations of up to 15 Myr, the YSO rejection rate is <1%. These photometric cuts are somewhat different from those described in Section 2, due to the systematic differences between the assumptions made regarding the isochrones and the real observations; however, qualitatively these cuts should yield similar results.

Figure 18.

Figure 18. Color–luminosity (left) and Teff$\mathrm{log}\,g$ (right) diagrams for the synthetic population, generated to replicate the observed distribution of parameters from APOGEE and Gaia; details given in the text. YSOs shown have ages of up to 15 Myr; field stars have ages of over 100 Myr. Dark lines show the cuts applied in the magnitude space to filter the main-sequence and giant stars; open symbols correspond to the photometrically rejected sources. The arrow corresponds to a total extinction of 1 AV.

Standard image High-resolution image

It should be noted that this synthetic population ignores effects of extinction (which may confuse the separation between cluster and field stars, as well as render many stars too faint to have astrometric parameters with Gaia), or variability. However, the extinction should have a significant effect primarily on the youngest and most well-defined clusters, and it is not expected to have a strong effect on the tests in Section 3.2 beyond the initial sample selection. The effect of multiplicity is also neglected; orbital motion does affect the kinematics measured for an individual star in a system, accelerating it to velocities where it may appear to be unrelated to a cluster. Binaries with a period of a few years are absent from the astrometric solutions in Gaia DR2. Additionally, while close unresolved binaries with equal masses could double the apparent flux of a system, observationally only a small fraction of all sources would be affected by it. Finally, other selection biases are also ignored, such as the fiber collision within densely packed clusters or the targeting criteria that would affect the completeness of the APOGEE observations (Cottle et al. 2018).

Please wait… references are loading.
10.3847/1538-3881/aad1f1