Untangling the Galaxy. II. Structure within 3 kpc

Marina Kounkel; Kevin Covey; Keivan G. Stassun

doi:10.3847/1538-3881/abc0e6

1. Introduction

The unprecedented precision and sensitivity of Gaia DR2 (Gaia Collaboration et al. 2018) have resulted in a significantly improved understanding of the structure and kinematics of the Galaxy. Among the topics that have benefited the most from Gaia is the identification of clusters, associations, and comoving groups, as it is now possible not only to robustly identify members of such populations based on their position in the phase space (e.g., Cantat-Gaudin et al. 2018, 2019; Castro-Ginard et al. 2019, 2020; Liu & Pang 2019; Sim et al. 2019), but also to analyze their position on the H-R diagram.

Previously, in Kounkel & Covey (2019, hereafter Paper I), we systematically clustered Gaia DR2 data within $| b| \lt 30^\circ$ and parallax π > 1 mas using HDBSCAN (Campello et al. 2013; McInnes et al. 2017) to identify 1640 populations containing in total 288,370 stars. Furthermore, we estimated the ages of these structures via isochrone fitting.

Approximately half of the stars in these populations are found in coherent extended string-like populations that have a typical length of ∼200 pc and width of ∼10 pc. They are ubiquitous among young populations: most of the sources younger than 300 Myr are found in such strings. Given their age, these structures are not likely to be a result of tidal stretching of previously compact clusters, as the tidal tails take several hundreds of megayears to stretch out to similar distances (Ernst et al. 2011; Röser & Schilbach 2019; Röser et al. 2019). Thus, the number of long strings found at ages significantly younger than 100 Myr suggests a different origin. Rather, the strings are likely to be primordial, preserving the shape of filamentary giant molecular clouds from which the stars have formed. Some such strings have also been found by Beccari et al. (2020), Jerabkova et al. (2019), Meingast et al. (2019), and Sim et al. (2019).

As the stars from these populations slowly dissolve into the field, after ∼300 Myr, most of the stars in the low-density regions of the strings would no longer be clusterable, typically leaving behind the densest, most cluster-like isolated and compact groups of stars.

The identified populations in Paper I exceed the spatial coverage of previously known open clusters (e.g., Dias et al. 2002; Kharchenko et al. 2013; Cantat-Gaudin et al. 2018), making the catalog more effective at analyzing the distribution of a number of properties, from stellar to galactic, as a function of age. Examining the populations younger than 100 Myr, we found that the strings are preferentially oriented parallel to one another and perpendicular to the Local Arm. It is possible that these identified strings are stellar analogs to various gaseous spurs or feathers that have been identified in observations of other galaxies (e.g., Schinnerer et al. 2017). At the ages of 8–8.7 dex, the overall stream that can be traced by the superposition of the strings has shifted. While the strings were still oriented parallel to one another, the stream was tilted by ∼60° relative to the Local Arm, with a very sharp transition in age between these two modes. Furthermore, yet another transition has been observed at the age of ∼8.7 dex, revealing two different streams with a very different morphology. The interpretation of this has been that the strings trace the spiral arms as they existed when the stars inside these populations were forming, and that the spiral arms are transient: dissolving, twisting, and re-forming in a similar region of the Galaxy with a different underlying shape. The timescale for this transition would be on the order of a Galactic orbit, or a few hundred Myr.

However, the catalog in Paper I extended only to 1 kpc and included only the Local Arm. Although astrometric precision does drop at larger distances, Gaia DR2 does allow the catalog to be extended further, increasing the census of clustered structures, as well as the census of stars with age estimates. In this work, we build on previous efforts to identify additional structures out to π > 0.2 mas, toward the detection limit for clustering imposed by the extinction. In Section 2, we discuss the clustering technique to identify structures and derive the limits of the clustering approach as a function of age due to extinction. In the Appendix, we develop an algorithm to derive structure properties. In Section 3, we show and discuss the three-dimensional distribution of the structures in the context of the Galactic structure. Finally, in Section 5, we discuss the results.

2. Methods

In this section, we describe the process of selecting the sample for the analysis. Section 2.1 discusses extending the clustering analysis originally presented in Paper I to larger distances through a homogeneous search for statistically meaningful overdensities in five-dimensional phase space and deriving the average properties of the stars in those identified groups. Section 2.2 describes the validation of the identified populations through independent confirmations of derived ages, as well as identifying the highest confidence sample that can be separated from the randomly drawn field stars based on their collective photometry (the analysis in Section 3 is limited to this highest confidence sample). In Section 2.3, we examine various biases as well as estimate the completeness limits in distance along the various lines of sight as a function of age.

2.1. Clustering

The original clustering of Gaia DR2 data (Gaia Collaboration et al. 2018) in Paper I was done in several onion-like layers (processing the entire catalog up to a given distance and merging it with a catalog that extends even farther, joining overlapping structures) to preserve the same sensitivity in the nearby populations as in the more distant ones. The clustering was done in 5D space (l, b, μ_α, μ_δ, and π) in HDBSCAN (Campello et al. 2013; McInnes et al. 2017).

Extending the sample beyond 1 kpc, we used the same general approach. All the data quality cuts from Paper I (Section 2) were retained in this work, as were the cuts in the plane of the sky ( $| b| \lt 30^\circ$ ) and velocity ( $| {v}_{\mathrm{lsr}}^{\alpha ,\delta }| \lt 60$ km s⁻¹).

Three new layers were added: one extending up to π > 0.6 mas, one to π > 0.3 mas, and the final one extending up to π > 0.2 mas. Note that all the sources within the solar neighborhood were preserved between different layers, i.e., the π > 0.6 mas slice retains all of the sources in the catalog with 0.6 < π < 1000 mas, and, similarly, the π > 0.3 mas slice retains all the sources with $0.3\lt \pi \lt 1000$ mas, to ensure overlap of the identified structures. As the data volume became too large due to an increasing number of sources, each one of these layers was further split into chunks of l ranges: 0°–65°, 60°–125°, ... 300°–5° for the π > 0.6 mas sample, and 0°–35°, 30°–65° ... 330°–5° for the π > 0.2 and 0.3 mas samples. These individual chunks were clustered separately using the "leaf" algorithm, with a minimum sample of 25 and a minimum cluster size of 40 stars. The "leaf" algorithm is better suited for identifying more granular populations, selecting leaf nodes from the minimum spanning tree. On the other hand, the default "excess of mass" algorithm tends to perform poorly on Gaia data, merging together most stars in the solar neighborhood into just a single population. The sample size controls how conservative the clustering is in considering overdensities of stars and rejecting the noise, which helps set the characteristic scale of the identified populations. Finally, the cluster size further rejects the overdensities that have fewer sources than the set threshold. See Paper I and the HDBSCAN manual³ for a more complete description of the process.
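The chunking and clustering settings described above can be sketched as follows. This is a minimal illustration, not the pipeline used for the catalog: the helper names (`longitude_chunks`, `in_chunk`, `cluster_chunk`) are hypothetical, the 5° overlap is inferred from the quoted ranges, and the text does not say whether the five dimensions are rescaled before clustering, so none is applied here. The `hdbscan` package is imported lazily so the chunking helpers run without it.

```python
"""Sketch: overlapping longitude chunks fed to HDBSCAN one at a time."""

# Clustering settings quoted in the text.
HDBSCAN_KWARGS = dict(
    min_samples=25,
    min_cluster_size=40,
    cluster_selection_method="leaf",
)


def longitude_chunks(step_deg, overlap_deg=5):
    """Return (start, end) longitude ranges in degrees, wrapping at 360."""
    return [
        (start, (start + step_deg + overlap_deg) % 360)
        for start in range(0, 360, step_deg)
    ]


def in_chunk(l_deg, chunk):
    """True if Galactic longitude l_deg falls inside the (start, end) chunk."""
    start, end = chunk
    l_deg = l_deg % 360
    if start < end:
        return start <= l_deg <= end
    # The last chunk wraps through l = 0 (e.g. 300 to 5 degrees).
    return l_deg >= start or l_deg <= end


def cluster_chunk(features):
    """Cluster one chunk of stars and return one label per star (-1 = noise).

    `features` is an (N, 5) array with columns l, b, pm_ra, pm_dec, parallax.
    Requires the third-party `hdbscan` package (assumed to be installed).
    """
    import hdbscan

    return hdbscan.HDBSCAN(**HDBSCAN_KWARGS).fit_predict(features)


chunks_06 = longitude_chunks(60)  # layer with parallax > 0.6 mas
chunks_02 = longitude_chunks(30)  # layers with parallax > 0.3 and > 0.2 mas
print(chunks_06)
```

A star near a seam, for example at l = 62°, belongs to both the 0°–65° and the 60°–125° chunks, which is what allows the outputs to be stitched afterwards.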

The resulting outputs were stitched along the seams and joined into the merged catalog from Paper I. In some cases, some pairs of groups identified in Paper I were joined together if there were newly added stars that illustrated a common origin between them. The properties of the populations scale nonlinearly with distance (e.g., more distant groups are smaller on the sky, have smaller and more uncertain parallaxes and proper motions, and there are more of them in that volume of space in comparison to those groups that are more nearby). Thus, the layers extending up to different distances have different "tuning" for characteristic scales of the identified structures; as such, those layers that include more distant stars have a sparser recovery of the stars in the more nearby populations. By and large, however, different slices trace the same underlying structure within the overlapping volume of space, even if the sensitivity to characteristic density may be somewhat different (see Paper I for full discussion).

In contrast, different slices in l within the same distance limit have mostly negligible differences for the stars in the overlap area, as they are "tuned" to the same characteristic scale, and joining them together is mostly a trivial process. We further note that within each distance limit the performance by HDBSCAN identifies all overdensities in phase space that satisfy minimum sample and minimum cluster size. It is not intrinsically biased for or against populations in a given age range (as age, or anything regarding photometry of the stars, is not provided to it) outside of the fact that older populations tend to be more dispersed (both spatially and kinematically) and harder to identify. Nor is it intrinsically biased for or against any particular line of sight outside of astrophysically significant limitations such as extinction (Section 2.3). Although some inhomogeneity in sensitivity may occur right along the seams of different slices, overall the identified groups trace mostly the uniform census of the overdensities in the volume of space we examine.

A total of ∼82 million stars have been analyzed by HDBSCAN. The final catalog of clustered structures consists of 987,376 stars in 8292 groups, of which 6671 are new from Paper I. Their catalog is available in Table 1. Their stellar properties, such as age, average extinction, and average distance, are derived using deep learning with the Auriga neural network that is described in detail in the Appendix and presented in Table 2.

Table 1. Clustered Sources

Gaia DR2 ID	α (deg)	δ (deg)	Theia Group ID
2172342682494385024	320.57220379	52.07733956	1
2170224782576556672	315.65202897	52.20035062	1
2168760576695182208	315.71659886	50.22889438	1

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.


Table 2. Structure Parameters

Theia ID	Common Name	String?	Distinct from field?	Age (dex)	A_V (mag)	Dist. (pc)	N_*	l (deg)	b (deg)
1	LDN_988e	F	Y	6.48 ± 0.08	1.16 ± 0.20	655 ± 30	194	90.9216	2.7943
2	Chameleon_I	F	Y	6.48 ± 0.06	1.48 ± 0.31	210 ± 7.8	193	297.0931	−14.9898
3		F	Y	6.32 ± 0.08	2.42 ± 0.27	873 ± 53	367	84.7852	−0.1183

μ_l^a (mas yr⁻¹)	μ_b^a (mas yr⁻¹)	v_r (km s⁻¹)	X (pc)	Y (pc)	Z (pc)	U^a,b (km s⁻¹)	V^a,b (km s⁻¹)	W^a,b (km s⁻¹)
−5.88	−1.13	−9.25	−10.52	654.4	31.9	0.83	−8.07	−3.96
−2.88	−2.01	2.40	92.5	−180.9	−54.4	3.20	−2.86	−2.56
−5.33	0.88	−0.06	79.3	869.1	−1.8	−1.59	−0.79	3.64

Notes.

^aIn the local standard of rest. ^bCorrected for the bulk rotation, assuming solar velocity of 220 km s⁻¹, and the distance to the galactic center of 8.15 kpc (Reid et al. 2019).

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.
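The X, Y, and Z columns of Table 2 are consistent with a heliocentric Cartesian frame in which X points toward the Galactic center, Y along the direction of Galactic rotation, and Z toward the north Galactic pole. The sketch below assumes that convention (it is inferred from the tabulated rows rather than stated in the text) and uses a hypothetical helper name, `galactic_to_cartesian`.

```python
import math


def galactic_to_cartesian(l_deg, b_deg, dist_pc):
    """Convert Galactic longitude/latitude (deg) and distance (pc) to X, Y, Z (pc).

    Assumed convention: heliocentric, X toward the Galactic center (l = 0),
    Y toward l = 90 deg, Z toward the north Galactic pole.
    """
    l = math.radians(l_deg)
    b = math.radians(b_deg)
    x = dist_pc * math.cos(b) * math.cos(l)
    y = dist_pc * math.cos(b) * math.sin(l)
    z = dist_pc * math.sin(b)
    return x, y, z


# First row of Table 2 (Theia 1): l = 90.9216 deg, b = 2.7943 deg, 655 pc.
x, y, z = galactic_to_cartesian(90.9216, 2.7943, 655.0)
print(f"X = {x:.1f} pc, Y = {y:.1f} pc, Z = {z:.1f} pc")
```

For Theia 1 this gives values close to the tabulated X = −10.52 pc, Y = 654.4 pc, and Z = 31.9 pc; small differences are expected because the tabulated distance is rounded.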


Effort has been made to further identify strings in the increased sample. As in Paper I, strings were identified by searching for large axial elongation of the population in l compared to b, μ_l, and μ_b. Due to the lower precision in π, examination of that dimension was done only to ensure continuity. However, due to the decreased size on the sky of the strings at larger distances, coupled with more uncertain parallaxes and proper motions, they become significantly more difficult to disentangle from the more compact and more isolated groups. As such, while a few new extended string-like structures have been flagged in this work, their census, compared to the 1 kpc sample, is by no means complete. Furthermore, due to larger uncertainties in distances beyond 1 kpc, conversion to the three-dimensional Cartesian coordinates may be imprecise. In total, out of ∼8000 identified structures presented in the paper, only ∼400 have been classified as strings. Of these, only ∼60 strings are located at distances >1 kpc. For this reason, as well as degraded confidence in distance, we do not base the analysis presented in this paper on strings. Instead, we focus only on the average position and velocity of all of the identified structures.

The interactive 3D plot with selectable age range is presented in Figure 1. The static version shows only the examples of the typical patterns seen in the data, showing ranges of ages similar (albeit not exact) to those discussed at various points in the text.

**Figure 1.** Three-dimensional distribution of the identified structures, color-coded by their ages. Left panels show the face-on view, right—edge on. Top row shows age ranges of 7–7.8 dex, middle—8.3–8.8 dex, and bottom—9–9.5 dex. Solid dots are the average position of the stars in the structures; their sizes correlate to a log of the number of sources with ${M}_{G}\lt 4$ . Thick lines on the top panel show the trace of the strings along their spine. Thinner lines show the typical kinematics of the groups over next 5 Myr, in the local standard of rest, corrected for the circular velocity of 220 km s⁻¹. Transparent dots on the face-on view show the areas where the clustering is incomplete due to extinction. Black lines in the top panel show the location of the Perseus, Sagittarius, and Scutum Arms from Reid et al. (2019); the olive line shows the Radcliffe Wave (Alves et al. 2020), which is similar to the position of the Local Arm. In the bottom two panels, the outlines show the rough locations of the overdensities likely corresponding to the ancient spiral arms (left), as well as the Heliocentric Stream (right). Interactive version with full selection of the age ranges and layer control is available. Rasterized movie of the interactive plot is available at http://mkounkel.com/mw3d/mw3dv2.mp4.

2.2. Contamination

As in Paper I, some caution is needed regarding both the contamination from unrelated field stars that happen to coincide in phase space with real comoving groups and fake groupings that only appear to be comoving. At larger distances, with more uncertain parallaxes and proper motions, contamination is of even greater concern than in the solar neighborhood.

For each star in the catalog, we assign randomly drawn field counterparts located at comparable b, π, and that have comparable spatial A_V. We then evaluate the significance of each group relative to this random catalog based on any of the three following categories.

1. First, we compare the distribution of fluxes in each of the three Gaia bands between the populations to determine if all three have been drawn from different distributions with >3σ significance based on the Kolmogorov–Smirnov test (other two-sample uni- or multivariate tests, such as Anderson–Darling or Cramér's V tests, produce qualitatively comparable outputs). This accounts for 1297 populations, of which 546 are unique to this category and do not overlap with the other two criteria. This tests whether the identified populations have a comparable distribution of fluxes between high-mass and low-mass stars, i.e., whether the mass function is consistent with the field or not, and how much scatter there is.
2. We also derive the "age" of the randomly drawn groups with Auriga to test if the age of the groups in the catalog is different from the nonphysical groups on a >3σ level, considering the reported uncertainties. This accounts for 2149 populations, of which 1522 are unique to this category. This examines the overall shape of the H-R diagram in comparison to the field, whether there are prominent and rare pre-main-sequence stars, or if there is a lack of early-type stars in the environments where they may be expected to be found with some regularity.
3. Furthermore, we consider all groups that have counterparts to previously known populations (e.g., Cantat-Gaudin et al. 2018, 2020) as real, independent of other metrics. This amounts to 783 populations (of which 226 are unique to this category). These groups are highlighted in Table 2 if they have an alternate name that has previously been used in the literature. The closest matching counterparts have been assessed through cross-matching the membership catalogs, identifying the population with the greatest number of matches, and verifying that the overall properties of the identified group are consistent with the properties of the cross-matched population. Sometimes it is impossible to identify a direct one-to-one match, as with the Orion Complex, which in this paper corresponds to a single structure but contains multiple clusters that have been independently cataloged in other works. The inverse lookup of the clusters presented by Cantat-Gaudin et al. (2018) to the identified structures is listed in Table A1.
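The first criterion can be illustrated with a plain two-sample Kolmogorov–Smirnov comparison applied band by band. This is only a sketch under stated assumptions: the text does not say how the 3σ level is mapped onto a p-value, so the two-sided Gaussian value of 0.0027 is assumed here, the p-value uses a standard asymptotic approximation, and the function names and band keys are hypothetical.

```python
import math

P_3SIGMA = 0.0027  # assumed: two-sided Gaussian tail probability beyond 3 sigma


def ks_statistic(a, b):
    """Largest absolute difference between the empirical CDFs of a and b."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both CDFs past every value equal to x (handles ties).
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue(d, n, m):
    """Asymptotic two-sample KS p-value with a small-sample correction term."""
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        return 1.0  # the alternating series equals 1 to high precision here
    total = 0.0
    for k in range(1, 101):
        term = 2.0 * (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        total += term
        if abs(term) < 1e-12:
            break
    return min(max(total, 0.0), 1.0)


def distinct_from_field(group, field):
    """True only if every band differs from the field at the assumed 3 sigma level.

    `group` and `field` map a band name (e.g. "G", "BP", "RP") to a list of fluxes.
    """
    for band in group:
        d = ks_statistic(group[band], field[band])
        if ks_pvalue(d, len(group[band]), len(field[band])) >= P_3SIGMA:
            return False
    return True
```

Requiring all three bands to pass mirrors the wording of the criterion; a looser variant could accept a group when any single band differs.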

In total, 3165 out of 8293 populations belong to the higher confidence sample. Based on this analysis, the groups tend to be most robust if they are younger than 100 Myr (as such pre-main-sequence stars are relatively rare in the field, and they occupy a distinct space on the H-R diagram) or if they are massive, containing >100 stars (compared with the minimum clustering size of 40 stars), although these criteria are not comprehensive. At larger distances, due to a lack of detectable pre-main-sequence stars, ages of younger populations may be more uncertain (Figure 2, top left). On the other hand, some of the older populations, due to their complete lack of early-type stars, with the turnoff occurring at increasingly lower mass, may stand out more clearly if the environment around them has early-type stars in larger numbers. Nonetheless, most of the populations with the apparent age of ∼8.6 dex are suppressed in the highest confidence sample, as this is a typical age at which most diffuse populations get dispersed into the field (Paper I), replenishing it with the early-type stars of this age; this is often returned as the typical "age" of the randomly drawn field groups (Figure 2, top right). We note that the precise typical "age" generally depends on the galactic latitude, with typically older "ages" at higher $| b|$ . Thus, for example, a group identified by HDBSCAN with an age of ∼8.6 dex would be distinct at higher $| b|$ , while a group with an age of >9 dex would be more distinct at lower $| b|$ .

**Figure 2.** Top left: ages and corresponding uncertainty of the identified populations as a function of distance. Top right: the overall distribution of ages for the full and the highest confidence samples. Bottom left: dispersion in Z-axis for all of the identified populations as a function of age. Bottom right: velocity dispersion in Z-axis as a function of age.

We include all of the identified structures in tables and in Figure 1, as even fake groupings could be informative with regard to the kinematical structure of the Galaxy in bulk, even though some of the individual membership may not be. Nonetheless, we restrict the analysis in the subsequent sections to this highest confidence sample. Although the groups that satisfy the above criteria correspond to less than half of the total sample, we note that there is no systematic difference in the spatial distribution of the groups as a function of age between these groups versus the full sample; thus, the analysis presented in Section 3 is robust against contamination. (See interactive version of Figure 1 and static version in Figure 3.)

**Figure 3.** Same as Figure 1, but restricted only to groups that have an H-R diagram different from randomly drawn field stars at >3σ significance (i.e., most distinct from the field) or that correspond to previously known populations.

If some of the groups in the sample are fake, consisting of physically unrelated stars that happen to be comoving, then the estimated ages from the isochrone would not necessarily be completely accurate for every single star in that population. Nonetheless, as clustering does not introduce any bias with mass, it would still be possible to distinguish between populations that are on average older or younger than one another, resulting in a representative age of the stars that are found in that phase space. Preliminary analysis confirms that the stars with T_eff > 4000 K for which rotational periods can be measured tend to have smooth and continuous evolution in their periods as a function of age estimated by Auriga, as is expected through gyrochronological relations (e.g., compared to the works of Angus et al. 2019). Angus et al. (2020) use the velocity dispersion in Z as a proxy of age in comparison to the gyrochrones, showing that, as expected, dispersion increases at the older age bins. Although in clustering the data together we do not pass on any information regarding the stellar age explicitly, HDBSCAN can discriminate in assigning stars to a given group based on their velocity. As in Paper I, we also see the velocity dispersion and the scale height of the disk increase as a function of age (Figure 2, bottom). Thus, the agreement between the ages of the groups we measure compared to the trends we see relative to the gyrochrones is not necessarily surprising. We defer the full discussion of gyrochronological analysis to future works in the series.

The H-R diagram of these two types of groups similarly shows a good agreement of the ages compared to the expected distribution (Figure 4). The vast majority of the populations do show good agreement with the isochrones. Validation of the performance is presented in Appendix A.3. We recover well the ages of structures for which previous estimates were available, although some groups do have some contamination from the field in the form of clear outliers (e.g., the presence of an early-type star in an otherwise old population, or a few sources on the red giant branch in an otherwise young population). However, that stellar contamination is regularly significant, either in the groups that can be distinguished from the field based on the H-R diagram or those that cannot. In estimating the ages, Auriga focuses on the overall distribution of stars (ratio of high-mass to lower-mass stars, ratio of red giants to main-sequence stars, and a number of other correlations that are present in the data); thus, it is able to disregard such stellar contamination in deriving the likeliest age, similar to the age that could be estimated through visual examination. And, although we cannot exclude the possibility that there are a number of older or younger stars among the lower-mass stars hidden in the H-R diagram, the estimate should be appropriate for the bulk of the stars in the group.

The identified groups trace the most typical position–velocity components of the solar neighborhood, which allows for a comprehensive look at the Galactic structure. Of great interest in particular are the systematic gaps found in the spatial distribution of the populations as a function of age. Whether the populations consist of bona fide comoving groups or fake groupings, the presence of such gaps implies that a particular region of the phase space is unlikely to be inhabited by the stars of a given age range, even when allowed to pick and choose stars at random.

As the sample of sources with measured radial velocity increases, it should be possible to not only discriminate real groups more robustly, but to also clean up the membership of the real groupings from the contamination.

2.3. Completeness

Selection based on the quality of Gaia data translates to a rough cut of $G\lt 18$ mag. As older populations have a turnoff point that occurs at increasingly fainter magnitudes, it becomes more difficult to detect them at larger distances, both due to a larger distance modulus and due to the buildup of extinction along the line of sight. As can be seen in Figure 1, there is a gap along the Galactic plane. The interactive version of this figure demonstrates that the gap appears at ages >8 dex and becomes increasingly more pronounced at the older ages. Extinction is largely responsible for this gap, as other old populations are found at similar distances at higher galactic latitudes.

Several studies have been conducted in the past to analyze the 3D distribution of the dust in the solar neighborhood based on Gaia DR2, such as Chen et al. (2019) and Green et al. (2019); however, both of them have some limitations of the lines of sight, with the former completing the analysis within $| b| \lt 10^\circ$ , and the latter, due to their reliance on the Pan-STARRS data, to δ > −47.5°.

The Gaia collaboration has released early estimates of extinction for a number of stars (Andrae et al. 2018). Although individual measurements may be uncertain, it is possible to average them in bulk, as (outside of the young embedded objects) extinction depends only on three parameters: l, b, and π. We used a convolutional neural network with an architecture similar to Auriga that was trained on Gaia DR2 A_G values, with the training sample of the top 3 million stars ordered by random_index with $\pi \gt 0.1$ mas. The advantage of a neural network in this case is that it is not necessary either to assume a functional form for the extinction along the line of sight or to force the computation to be based on a particular grid pattern. Using a transformation of A_G = 0.85926 A_V (as listed on the PARSEC isochrones web interface; Marigo et al. 2017), we predict A_V at the given spatial coordinates. These A_V values are consistent to within 0.3 mag with the map of Green et al. (2019) over the applicable volume.

Through examining the absolute magnitude of 40th faintest star in each group as a function of age, we roughly parameterize the detection threshold of

$$D+{A}_{G}\lt 26.2549-1.46189\times t, \tag{1}$$

where D is the distance modulus and t is the age from 8.1 to 10 dex. As the evolution of massive stars has only a small effect on the shape of the H-R diagram for populations younger than that, we adopt the same threshold for groups at <8.1 dex as that at 8.1 dex (Figure 5). This threshold is only an approximation, and there are a few groups that can still be found beyond it. Nonetheless, it can serve as an appropriate bound in Figure 1 to highlight the regions in the 3D volume where the sample is expected to be largely complete at a given age range.
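Equation (1) can be evaluated directly. The sketch below assumes the usual definition of the distance modulus, D = 5 log10(d / 10 pc), and holds the threshold fixed below 8.1 dex as described above; the function names are hypothetical, and ages above 10 dex lie outside the stated range of the fit.

```python
import math


def distance_modulus(dist_pc):
    """Distance modulus D = 5 log10(d / 10 pc)."""
    return 5.0 * math.log10(dist_pc / 10.0)


def detection_threshold(log_age):
    """Right-hand side of Equation (1); ages below 8.1 dex use the 8.1 dex value."""
    t = max(log_age, 8.1)
    return 26.2549 - 1.46189 * t


def is_within_completeness(dist_pc, a_g, log_age):
    """True if D + A_G falls below the approximate detection threshold."""
    return distance_modulus(dist_pc) + a_g < detection_threshold(log_age)


# Example: a 9.0 dex population behind A_G = 1 mag of extinction.
for dist in (1000.0, 3000.0):
    print(dist, is_within_completeness(dist, 1.0, 9.0))
```

With these example inputs, the group at 1 kpc falls inside the threshold while the one at 3 kpc falls outside it, which illustrates how quickly the accessible volume shrinks for older populations.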

**Figure 5.** Combination of extinction correction and the distance modulus of the detected structures as a function of age, with the threshold estimated by Equation (1).

The completeness estimate above does not take into account the mass of the cluster. As only the massive stars would be detected at larger distances, using the same threshold of 40 stars as a minimum would result in a lack of detections of low-mass populations farther away (Figure 6). All of the detected populations, regardless of the distance, do contain sources with extinction-corrected ${M}_{G}\lt 4$ . Thus, counting sources brighter than this threshold can serve as an age-dependent proxy for mass. The completeness volume estimate is appropriate for ${N}_{({M}_{G}\lt 4)}\gtrapprox 40$ . Figure 7 shows the three-dimensional distribution of only the populations that meet this cut, excluding nearby lower-mass populations, for a more homogeneous view.
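The mass proxy can be written down in a few lines. This sketch assumes that the extinction-corrected absolute magnitude is M_G = G − D − A_G, with a single distance and A_G per group, and treats the approximate cut as N ≥ 40; both the function names and that reading of the cut are assumptions made for illustration.

```python
import math


def absolute_magnitude(g_mag, dist_pc, a_g):
    """Extinction-corrected M_G = G - 5 log10(d / 10 pc) - A_G."""
    return g_mag - 5.0 * math.log10(dist_pc / 10.0) - a_g


def count_bright(g_mags, dist_pc, a_g, limit=4.0):
    """Number of group members with extinction-corrected M_G below `limit`."""
    return sum(1 for g in g_mags if absolute_magnitude(g, dist_pc, a_g) < limit)


def meets_mass_cut(g_mags, dist_pc, a_g, minimum=40):
    """True if the group has at least `minimum` members with M_G < 4."""
    return count_bright(g_mags, dist_pc, a_g) >= minimum


# Toy group at 1 kpc with A_G = 0.5 mag: 45 bright members and 30 faint ones.
toy_group = [13.0] * 45 + [15.0] * 30
print(count_bright(toy_group, 1000.0, 0.5), meets_mass_cut(toy_group, 1000.0, 0.5))
```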

**Figure 6.** Correlation in distance of the population vs. the number of stars with ${M}_{G}\lt 4$ , as a function of age.

**Figure 7.** Same as Figure 1, but restricted only to sources with ${N}_{({M}_{G}\lt 4)}\gt 40$ to show a more uniform sensitivity in distance.

Some caution should be exercised about the "edge" effect. The search for the structures is limited in the parallax space to 0.2 mas (ignoring the aforementioned biases), and the maximum uncertainty on the parallax is 0.1 mas. If a particular population has members with the resulting parallax measurement smaller than the limit, they would be excluded, and the resulting average distance for a population would appear to be closer than it really is, similarly to the Lutz–Kelker and Malmquist biases (Malmquist 1922; Lutz & Kelker 1973; Luri et al. 2018). This effect is simulated in Figure 8.
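The edge effect can be reproduced with a small Monte Carlo experiment. The sketch below is not the simulation behind Figure 8: it assumes Gaussian parallax errors with every star at the maximum uncertainty of 0.1 mas, and it takes the average distance of a population to be the inverse of the mean parallax of the retained members, which is an assumption made for illustration. The function name and the seed are arbitrary.

```python
import random


def apparent_mean_distance(true_dist_pc, n_stars=20000, sigma_mas=0.1,
                           cut_mas=0.2, seed=42):
    """Mean distance (pc) recovered after discarding members with parallax < cut.

    Assumes Gaussian parallax errors and distance = 1000 / mean parallax (mas).
    """
    rng = random.Random(seed)
    true_parallax = 1000.0 / true_dist_pc
    observed = (rng.gauss(true_parallax, sigma_mas) for _ in range(n_stars))
    kept = [p for p in observed if p > cut_mas]
    return 1000.0 / (sum(kept) / len(kept))


for dist in (1000.0, 3000.0, 4000.0, 5000.0):
    print(f"true {dist:.0f} pc -> apparent {apparent_mean_distance(dist):.0f} pc")
```

Populations well inside the limit are recovered at essentially their true distance, while those near or beyond 5 kpc appear systematically closer because only the members scattered to larger parallaxes survive the cut.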

**Figure 8.** The resulting average distance of the population with the parallax cutoff of 0.2 mas and the uncertainty in parallax of 0.1 mas.

Extinction moving the effective completeness limit closer is not expected to have as stringent an effect in this matter: if a population is not intrinsically embedded in a cloud, sources with smaller apparent parallaxes from the scatter due to uncertainties would not necessarily be excluded in clustering, even if this makes them appear projected behind the above detection threshold. However, extinction does have a role in shaping the recovered populations at large distances: as only a few lines of sight remain, these lines of sight would create an overdensity of sources compared to their surroundings, and a clustering algorithm may group sources together more easily than it otherwise would (see Section 3.2 for discussion). In the face-on projection in Figure 1, this creates a number of groups ordered directly behind one another, protruding radially, like fingers. This can also be seen in the clusters in the sample from Castro-Ginard et al. (2020). Although the fraction of real populations compared to random groupings along these lines of sight is likely comparable to what is found in the less extincted areas, we further caution that these sources need independent vetting in follow-up studies. Furthermore, we note that their membership may be incomplete in such a way that would affect their mean position.

Gaia DR2 has a known systematic offset in parallaxes that could be as large as ∼80 μas (Stassun & Torres 2018). In estimating the distances with Auriga, this offset is partly taken into account (Appendix A.3). Although there may be some not previously characterized second-order effects that might influence the magnitude of the systematic offset on individual stars (such as, e.g., from color), this should not have a strong effect on the result presented in this work, as it only amounts to a systematic and mostly uniform scaling of the detected structures in distance with respect to the Sun and does not strongly affect the location of the structures relative to one another.

3. Results: 3D Structure and Evolution of the Milky Way

3.1. Populations Younger than 100 Myr

In contrast to Paper I, with the sample extending up to 3 kpc, it is now possible to reach both the Sagittarius and the Perseus spiral arms. The separation between the arms (in their current form) persists up to the age of 7.8 dex.
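Ages throughout this section are quoted in dex, i.e., as log10 of the age in years. As a convenience for the reader, the following minimal sketch converts between the two; the function name is ours and is purely illustrative.

```python
def log_age_to_myr(log_age_dex):
    """Convert log10(age / yr) into an age in Myr."""
    return 10.0 ** (log_age_dex - 6.0)


# Age boundaries that recur in the text.
for log_age in (7.0, 7.2, 7.4, 7.7, 7.8, 8.0, 9.0):
    print(f"{log_age:.1f} dex -> {log_age_to_myr(log_age):8.1f} Myr")
```

With this convention, 7.8 dex corresponds to roughly 63 Myr and 8 dex to 100 Myr.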

3.1.1. Sagittarius Arm

Examining the young populations at 900 < X < 2500 pc, most likely associated with the Sagittarius Arm, shows that the arm undergoes a progressive evolution of its position: 60 Myr ago, it was more than 500 pc closer to the solar position than at the current epoch (Figure 9). The trace of this arm from Reid et al. (2019) shows a better agreement with the population with an age of 7.8 dex than those that have an age of 7 dex. It is most clearly shown in the bottom panel of Figure 9. Examining the difference between the X position of the populations that can be associated with the arm and the X position of the trace (at the corresponding Y position of each particular group), the groups younger than 20 Myr appear to be the most distant, on average, whereas those that are older than 60 Myr are the most nearby. The difference for the different age bins persists even with a more restrictive selection on quality of the identified groups (i.e., using only the groups with small uncertainty in age and those that have statistically significant differences in their H-R diagram relative to the field). As the star formation rate over the last 100 Myr remained more or less constant in the region of space currently associated with the arm (Figure 10), this displacement is unlikely to be caused by the contamination from a separate and unrelated structure, although we cannot rule out a possibility of the Sagittarius Arm supplanting another arm-like structure by forming in a similar place while the latter disappeared.
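The offset described above, the X position of a group minus the X position of the arm trace at the same Y, could be computed along the following lines. This is only a sketch: it assumes the trace is available as a list of sampled (X, Y) points and interpolates it linearly in Y, and all coordinates below are hypothetical placeholders rather than values from Reid et al. (2019) or from our catalog.

```python
import numpy as np


def x_offset_from_trace(x_group, y_group, trace_x, trace_y):
    """Return X_group - X_trace(Y_group), interpolating the trace linearly in Y."""
    trace_x = np.asarray(trace_x, dtype=float)
    trace_y = np.asarray(trace_y, dtype=float)
    order = np.argsort(trace_y)  # np.interp needs increasing sample points
    x_trace_at_y = np.interp(y_group, trace_y[order], trace_x[order])
    return np.asarray(x_group, dtype=float) - x_trace_at_y


# Hypothetical arm trace, sampled at three points (pc).
trace_y = np.array([-2000.0, 0.0, 2000.0])
trace_x = np.array([1800.0, 1500.0, 1200.0])

# Hypothetical group positions (pc).
x_group = np.array([2100.0, 1400.0])
y_group = np.array([-1000.0, 1000.0])

offsets = x_offset_from_trace(x_group, y_group, trace_x, trace_y)
print(offsets)  # positive values lie at larger X than the trace
```

Binning such offsets by age would then give distributions comparable in spirit to those in the bottom-left panel of Figure 9.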

**Figure 9.** Slices from Figure 3 (highest confidence sample), but zooming in at the young populations toward the Sagittarius Arm, at the age ranges of 7–7.2, 7.2–7.4, 7.4–7.7, and 7.7–8 dex. Black lines show the location of the Sagittarius and Scutum Arms from Reid et al. (2019). Bottom-left plot shows the kernel density estimate of the distributions of the populations that can be associated with the Sagittarius Arm relative to the spiral arm trace from Reid et al. (2019) as a function of age, limited to the highest confidence sample. Bottom-middle and -right plots show the distribution of velocities as a function of age for this sample.

**Figure 10.** Distribution of ages of the populations in the spiral arms, defining the Perseus Arm at $-3000\lt X\lt -1000$ pc, Local Arm at $-1000\lt X\,\lt 1000$ pc, and the Sagittarius Arm at $1000\lt X\lt 3000$ pc, for the highest confidence sample.

The trace of the spiral arms from Reid et al. (2019) is global for the Galaxy based on various molecular masers, and there is substantial scatter in the position of those masers on smaller scales. However, the offset between the trace of the arm and the youngest populations of stars exceeds the reported width of 270 pc.

The displacement of the Sagittarius Arm is not an artifact due to the variable extinction limit of the Gaia catalog. This limit, which we calculate and account for in our analysis, is not substantially different between the Sagittarius and Perseus Arms when averaged across the entire arm in the volume of space of this study, with both arms having a comparable A_V distribution. Therefore, extinction would not explain the difference in the evolution of structure as a function of age between these two populations. Similarly, as the brightest, high-mass members of 20 and 60 Myr stellar populations are not significantly different, our sensitivity should not be strongly age dependent in the presence of a consistent magnitude+extinction limit.

If in the estimation of distances we have underestimated the systematic offset in parallax from Gaia DR2, everything would shift inwards, with the younger populations being located closer to the trace of the arm. The relative separation in distance between them and the older populations would persist, however, pushing the ∼60 Myr populations away from the trace and closer toward the Sun. However, we note that there is no significant difference in position between the trace of the Perseus Arm and the identified stellar populations toward it at any age bin within the past 100 Myr. As both the Sagittarius and Perseus Arms are close to being equidistant from us and the extinction along the line of sight in their vicinity is comparable, if there were significant systematics affecting either our ability to identify groups at these age ranges as a function of distance or to estimate distances in general, we would expect these systematics to be observed toward both arms. Nonetheless, it is not clear why there is such a disagreement between the trace from Reid et al. (2019), which was derived from masers presumably associated with young star-forming regions, and the actual populations younger than 20 Myr.

Based on the difference in position of the young populations over the last 100 Myr, we estimate the speed of motion of the spiral arm to be ∼8–10 km s⁻¹, consistent with estimates of the density wave propagation from Dobbs & Baba (2014). This true spatial velocity is not to be confused with the pattern speed of 28.2 km s⁻¹ kpc⁻¹ from Dias et al. (2019), as the latter refers to the angular frequency of rotation, uniform throughout the disk, that is necessary to preserve the spiral arms, assuming they are a standing wave. It does not refer to a linear velocity, nor does it take into account the temporal evolution of the spiral arms that is commonly seen in simulations (e.g., Li & Gnedin 2019).
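As a rough check on the order of magnitude, the mean speed implied by a displacement of 500 pc over 60 Myr (the figures quoted in Section 3.1.1) can be worked out directly. The sketch below is only unit arithmetic, not the procedure used to derive the estimate above; the constants are standard conversion factors and the function name is illustrative.

```python
KM_PER_PC = 3.0857e13   # kilometers in one parsec
S_PER_MYR = 3.15576e13  # seconds in 1 Myr (Julian years)


def mean_speed_km_s(displacement_pc, elapsed_myr):
    """Mean linear speed, in km/s, for a displacement in pc over a time in Myr."""
    return displacement_pc * KM_PER_PC / (elapsed_myr * S_PER_MYR)


speed = mean_speed_km_s(500.0, 60.0)
print(f"{speed:.1f} km/s")  # approximately 8 km/s
```

Since the displacement is more than 500 pc, this value is a lower bound, in line with the ∼8–10 km s⁻¹ quoted above.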

3.1.2. Local Arm

Along the Local Arm, a recent study by Alves et al. (2020) has found a long, coherent 2.7 kpc molecular gas structure, the Radcliffe wave, which is displaced out of the Galactic plane with a maximum amplitude of 160 pc. The formation of this filament may be responsible for the apparent tilt of local star-forming regions relative to the plane that has long been identified as the Gould Belt. Our clustering recovers stellar populations all along this wave. Almost all of the stellar groups that appear associated with it are younger than 12 Myr (7.1 dex), and it is no longer apparent at ages older than 15 Myr (7.2 dex). Thus, this formation within the Local Arm is a relatively recent phenomenon.

There has been some debate in the literature whether the regions of the Gould Belt/Radcliffe wave that are displaced from the Galactic plane to high Z (ripples) are caused by a specific event, such as a collision with a high-velocity H ii cloud (Comeron & Torra 1994), impact with dark matter (Bekki 2009), or a series of supernova eruptions warping the disk locally (Pöppel & Marronetti 2000). Such scenarios present the conditions in the solar neighborhood as somewhat unique in causing star formation there to occur at comparatively high altitude relative to what is commonly found elsewhere.

The apparent ripples in the Local Arm relative to the plane (independently of the Radcliffe wave, as it is neither a standing nor a traveling wave with respect to time) do appear to persist up to ages of 50 Myr (7.7 dex), possibly as old as 100 Myr (8 dex), after which the vertical scale height of the disk exceeds the amplitude of the ripples inside the Local Arm. This significantly exceeds the lifetime of the populations that are associated with the wave. Thus, the origin of the ripples either precedes the formation of the wave or had to have occurred multiple times throughout these last 100 Myr. Some ripples can also be observed in the Sagittarius Arm (Figure 11), as previously noted by Alfaro et al. (1992). Thus, any mechanism that would cause such ripples is unlikely to be unique or particularly rare.

**Figure 11.** Distribution of the height from the Galactic plane as a trace of the ripple of the disk, using the highest confidence sample. Top: populations younger than 7.8 dex, middle: 8.3–8.8 dex, bottom: populations older than 9 dex. Note the difference in the scale height between panels.

A rather curious feature of the Radcliffe wave is that despite its youth, very few strings are associated with it, both within 1 kpc, where their census should be largely complete, and outside of it. Rather, most of the structures recovered along it tend to be compact and isolated. In Paper I, we found that the vast majority of stars younger than ∼8 dex tend to be a part of extended strings that are oriented preferentially perpendicular to the Local Arm. Thus, the Radcliffe wave may represent a different mode of star formation than most other young stellar populations. Alternatively, it is possible that strings will later develop, as molecular gas continues to accrete in the vicinity of these regions. Star formation in a particular region could persist for ∼10 Myr (e.g., the Orion Complex; Kounkel et al. 2018), with the molecular gas still infalling to form clouds in one part of it, even while the gas is fully consumed and/or dispersed in a different part of the same region. Given that all of the populations are very young and, indeed, still associated with molecular gas, their assembly may still be ongoing.

It is also notable that the Radcliffe wave (and, arguably, to some extent any other stellar population along the Local Arm younger than 30 Myr, or 7.5 dex) does not extend all the way toward the completeness edge imposed by the extinction, which, at these ages, allows us to peer up to almost 3 kpc in any direction. Rather, the wave is truncated outside of $-1000\,\lesssim Y\lesssim +1500$ pc. Similarly, although the Sagittarius Arm reaches the edge of completeness along the X-axis, few young stellar populations are found outside of $-1000\lesssim Y\lesssim +1000$ pc. Finally, very few groups younger than 25 Myr (7.4 dex) have been recovered along the Perseus Arm within the completeness limit, although it does have a greater presence at somewhat older ages (Figures 10 and 12).

**Figure 12.** Slices from Figure 3 (highest confidence sample), but zooming in at the young populations toward the Perseus Arm, at the age ranges of 7–7.3, 7.3–7.5, 7.5–7.7, and 7.7–7.9 dex. The black line shows the location of the Perseus Arm from Reid et al. (2019).

3.2. Populations Older than 100 Myr

Unfortunately, the resolution of large-scale structure is not as precise as what is offered by the 1 kpc sample. The few strings of stars that have been identified continue to show a correlation similar to that in the previous work: these extended populations tend to form perpendicular to the spiral arms. However, their catalog is not complete even at the younger ages, and no new strings were added at ages greater than 8.5 dex. The uncertainty in parallax also becomes significant beyond 1 kpc, making it difficult to conclusively analyze their geometry, compared to the sample from Paper I. Furthermore, due to a turnoff that occurs at increasingly fainter magnitude in the older populations, much of the Galactic midplane beyond 1 kpc is truncated due to extinction, leaving behind only a few select lines of sight along which the structure could be recovered (Figure 13), producing the apparent "fingers" that can be seen in, e.g., Figure 1. Although it is possible to recover groupings at larger distances at higher galactic latitudes, they may not be as reliable tracers of the overall structure.

**Figure 13.** Distribution of the clustered populations older than 9 dex located at distances larger than 750 pc, plotted on top of Gaia DR2 color map (courtesy of ESA/Gaia/DPAC, Gaia Collaboration et al. 2018), to highlight the areas of large extinction. Note that extinction can clear out certain lines of sight, and the "Fingers of God" that are discussed in Section 2.3 can be attributed to the extinction patterns. The heliocentric stream is highlighted in yellow. White crosses show the smoothed median b position of the remaining sources as a function of l, and the horizontal line shows the location of the Galactic plane. The full sample is plotted; this underlying distribution of stars is also consistent in the highest confidence sample only.

Even considering a single average position of the identified populations, without incorporating the strings, the overall distribution of overdensities is consistent with Paper I. The Local Arm in the current form is no longer apparent in the population older than 7.8 dex. At ∼8 dex, the distribution of density of the populations is mostly uniform, lacking any gaps that could be attributable to the spiral arms. However, an overdensity in the distribution of identified groups that may be related to an ancient spiral arm at a similar position but with different orientation is apparent at ∼8.2–8.8 dex. At these ages, there may be older analogs to the Perseus and Sagittarius Arms as well.

These overdensities, which may be remnant spiral arms, are labeled in Figure 1, and, unaided, all three are most apparent in Figure 7, which excludes lower-mass populations. Figure 3 particularly shows the orientation of the older counterpart to the Local Arm within 1 kpc. This particular feature can also be recovered by analyzing the 3D distribution at this age range in the catalog of known clusters from Cantat-Gaudin et al. (2018, 2020).

At ages of 7.9–8.1 dex, there are no apparent overdensities that correspond to spiral arms, with a very sharp transition period.

Similarly, at ages older than 8.8 dex, two different overdensities are apparent. The relative orientation of these structures results in a substantial deficit of old populations in the direction of the Galactic anticenter (Figures 1, 13), even within the volume of space in which we should be complete at these ages. Although there is some difference in the recovery of groups by HDBSCAN in layers found at different distances along the seam lines, such biases should not affect the populations outside of these seams. Furthermore, the gap separating the two different structures is robust even if the sample is restricted to the highest confidence group (Figure 3). Although this gap has not been seen prior to Paper I, such as in the distribution of previously known clusters (e.g., Cantat-Gaudin et al. 2018), few clusters are known in this volume of space at these age bins to be able to fully analyze their density distribution. Furthermore, the lack of old clusters in the solar neighborhood (which in part can be explained by the gap we observe) has long since been known (Oort 1958).

Improvements in astrometry from future Gaia data releases, as well as ground-based spectroscopic follow-up of the stars in the identified populations, would be beneficial for confirming that these overdensities are indeed related to the ancient spiral arms and do not occur for any other reason. We should note, however, that in tracing the position of clusters in galactic simulations over time (e.g., Li & Gnedin 2019), it should indeed be possible to recover diffuse ancient spiral arms that have formed even up to the oldest epochs of the simulations (up to 1 Gyr for this particular one). We defer the full comparison of observations to the simulations to follow-up works.

Some rippling of the galactic disk may be apparent for populations older than 9 dex (Figures 11, 13). Examining the median b position of the stars at these ages as a function of l shows that their distribution may be tilted relative to the galactic plane by as much as 10° (Figure 13). It is broadly consistent with Romero-Gómez et al. (2019)—although asymmetries in the Z distribution in that work are found on much larger scales, there is a similar trend of populations found below the Galactic plane being predominantly located in the $-Y$ directions, and those that are above being located in the $+Y$ direction. No such ripples are apparent in the populations of intermediate age (8–9 dex).

3.3. Heliocentric Stream

Although extinction does shape a substantial portion of substructure that is seen in the distribution of the old populations, there is one feature that does stand apart from the bulk of the identified old groups, and it does not clearly trace the edge of the dusty clouds along the line of sight (Figure 13). This feature resembles a stream of some kind. It is deconvolved into dozens of different groups that do not show a strong relative kinematic coherence, even though it does appear to be continuous in the plane of the sky. It is most apparent toward the Galactic center, elevated at b ∼ 25° away from it at the maximum separation, possibly continuing toward b ∼ −25° near the Galactic anticenter, apparently forming a great circle in the sky. A part of this stream is also present in the 1 kpc sample from Paper I. In the 3D plot in Figure 1, this feature appears to form a plane centered at the Sun, reaching upwards of 600 pc from the plane at the furthest point.

Almost all of the identified groups associated with this heliocentric stream have measured ages of >8.8 dex. Such a distribution of ages would not be surprising at these distances at b ∼ 25°, as they are dominated by the thick disk and halo stars (Rix & Bovy 2013). The stream does persist as a distinct feature in the highest confidence sample, but it is difficult to say how statistically significant the coherence in ages of the groups along the stream is. Furthermore, although there are a number of remnants of tidally stretched globular clusters and dwarf galaxies that exist as extended streams, it is difficult to imagine why any such stream would be centered at the solar position, making it more likely that it is artificial. On the other hand, its position has no correlation to the Gaia scanning law that would explain the reason behind the systematic preference in the coherence of velocities along this specific plane. Further investigation would be necessary to determine the true nature of the heliocentric stream.

4. Discussion

In this section, we address several points of concern that could be raised regarding the sample that could affect our results.

4.1. Orientation of Strings

As in Paper I, we have found that the young strings prefer an orientation that is close to perpendicular to the Local Arm (Figure 14, left), which has implications for their origins. Although there may be some evidence for this behavior toward the Perseus and Sagittarius Arms as well, it is difficult to state conclusively due to their highly incomplete census. There are observational effects that could in principle mimic this behavior, such as "Fingers of God" pointed radially along the observer's line of sight due to parallax/distance errors. However, as shown in Figure 14 (right), the tangential orientation of strings in the plane of the sky is slightly preferred within the closest few hundred pc (as they are on average ∼200 pc long, this is to be expected). Beyond that, up to 1 kpc, averaging across all ages, there is no preferred orientation of strings relative to the observer, only to the other strings in a similar age bin. Beyond 1 kpc, there is a preference for radial orientation. It is difficult to say if it is due to greater uncertainty in distance or due to them being oriented perpendicular to the Sagittarius and Perseus spiral arms. As we are able to recover strings of only populations $\lesssim 100\,\mathrm{Myr}$, this would also be the preferred orientation in the latter scenario. In a future data release from Gaia, with an improvement in astrometric measurements (Gaia EDR3 is expected to improve the parallaxes by 20%), it would be possible to better disambiguate the orientation of the more distant strings from the parallax uncertainty.

4.2. In Situ String Formation

Our findings suggest that, over the last 100 Myr, each successive generation of stars formed inside the region associated with the Sagittarius Arm has formed progressively closer to the Galactic center. It should be noted, however, that there is no strong systematic trend in the velocity vector corrected for the average Galactic rotation in any particular direction, along U, V, or W (Figure 9). The average U may be 2–5 km s⁻¹ toward the Galactic anticenter in several age bins. If we assumed that all of the generations of stars have formed where the current epoch of star formation is, then this is the direction in which they would need to migrate over time. However, the populations in the 40–60 Myr age bin show the average U ∼ 3 km s⁻¹ in the opposite direction, toward the Galactic center, despite being located, on average, between the 20–40 and 60–80 Myr populations. Furthermore, the overall velocity distributions are broad. For these reasons, deprojecting the position of the populations backwards in time (with velocities corrected for the Galactic rotation and local standard of rest) would not result in them being more compact, or appearing to originate from a more similar position. Thus, this change in position cannot necessarily be attributed to the young stellar populations inside the arm moving or migrating significantly away from their initial position after their formation. As populations form, they remain roughly comoving in the reference frame of the Galaxy. However, as they evolve and the velocity dispersion increases, they do scatter away from their position of birth, and the direction is random. Furthermore, many of them lack the speed to travel the necessary distances in the given period of time. It is impossible to reproduce the current distribution with stellar dynamics alone. Rather, the motion is most likely attributable to the density wave itself. The spiral arm density wave then triggers the molecular cloud formation, with clouds continuously forming at a slightly different position with respect to the Galactic center in the volume of space covered by the data. From these clouds, the young populations with which we trace the spiral arms then form. If there is any motion with respect to the Y-axis as well, it is difficult to conclusively measure due to the overall shape of the spiral arms.
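To illustrate why the measured velocities fall short, the distance that a population would cover at a constant 2–5 km s⁻¹ can be compared with the displacement of more than 500 pc over 60 Myr reported in Section 3.1.1. The sketch below assumes straight-line motion at constant speed, which is a deliberate simplification and not an orbit integration; the function name is illustrative.

```python
KM_PER_PC = 3.0857e13   # kilometers in one parsec
S_PER_MYR = 3.15576e13  # seconds in 1 Myr (Julian years)


def drift_distance_pc(speed_km_s, elapsed_myr):
    """Distance in pc covered at a constant speed in km/s over a time in Myr."""
    return speed_km_s * elapsed_myr * S_PER_MYR / KM_PER_PC


required_pc = 500.0  # lower bound on the displacement over 60 Myr
for u in (2.0, 5.0):
    d = drift_distance_pc(u, 60.0)
    print(f"U = {u:.0f} km/s over 60 Myr: {d:.0f} pc "
          f"({d / required_pc:.0%} of the required displacement)")
```

Under these assumptions the populations would drift by roughly 120–310 pc, well short of the observed shift, even before accounting for the 40–60 Myr bin moving in the opposite direction.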

We note that there is no similar displacement of either the Local or Perseus Arms. The corotation radius, at which stars are comoving with the spiral arms, is expected to be comparable to the solar radius (Dias et al. 2019); thus, a lack of evolution in the position of the Local Arm may be expected. However, the separation from the Perseus Arm to the corotation radius is comparable to the separation from the Sagittarius Arm. Thus, the change in position of only one of them but not the other is noteworthy.

5. Conclusions

In this paper, we present a sample of 987,376 stars that can be clustered in the Gaia DR2 catalog to reveal 8292 comoving groups. Although individual membership would require further vetting, particularly in the more distant populations, their bulk allows for a detailed look at the temporal evolution of the structure of the Galaxy.

Such analysis is only possible if the ages of all the stellar populations can be derived in a robust and uniform manner. For that purpose, we developed a neural network, Auriga, that predicts population properties based on the input Gaia and 2MASS photometry of the members. The advantage of a neural network compared to the traditional isochrone fitting is the speed (the entire sample is processed simultaneously instead of one population at a time, deriving parameters for the entire catalog in this work in only a few minutes). It also does not require initial estimates for the parameters of any of the populations, which also significantly automates the process. Finally, outside of constructing the initial training set, deriving ages using neural networks is agnostic to any potential differences between the synthetic isochrones and the real data beyond what is necessary to produce an initial training set. As more and more coherent stellar populations are found in the Gaia catalog, far exceeding the number of clusters that can be carefully visually inspected, neural networks like Auriga are ideally suited for deriving parameters of these populations.

This work builds on Paper I, expanding the distance reach from 1 kpc up to the boundary defined by the extinction map in the magnitude-limited sample. This boundary is age dependent: the youngest (<8 dex) populations are complete up to the distance >3 kpc (Figure 1). Older populations lack bright high-mass stars and thus it is more difficult to recover them in areas of high extinction. No old (>9 dex) structures are found beyond 1 kpc along the Galactic plane, although they can extend up to 2 kpc at higher galactic latitudes.

Analyzing the highest confidence sample and accounting for the completeness limits, we find that:

1. The Local Arm in the current epoch of star formation has a finite size, not extending to the completeness limits of the survey.
2. The nearby portion of the Perseus Arm has experienced a lull in the star formation activity over the last 25 Myr. Most of the populations along it have an age of 25–60 Myr.
3. Unlike the Perseus Arm, the position of which has remained stable over time, the Sagittarius Arm has been continuously shifting closer toward the Galactic Center in the last 100 Myr, having been displaced by more than 500 pc during that time. This corresponds to the velocity of ∼8–10 km s⁻¹. This effect cannot be accounted for by stellar migration; rather, it appears to be an evolution of the position of the arm itself.
4. A number of the youngest populations in the Local Arm are positionally consistent with the Radcliffe wave, a recently discovered 2.7 kpc gaseous structure that oscillates from the Galactic plane with an amplitude of 160 pc. Many such populations along it tend to be less extended and more compact than what is typically observed in other young populations.
5. The ripple in the disk is not limited to just the Radcliffe wave, but exists among both the young (<8 dex) and the old (>9 dex) populations.
6. Similarly to Paper I, we find evidence of the Local as well as the Sagittarius and Perseus Arms becoming distinct ∼100 Myr ago. At the older age bins, there are other overdensities and gaps in the distribution of the identified populations that may be attributed to the older spiral arms.
7. The relative position of these overdensities resulted in a current apparent lack of populations older than 1 Gyr in the solar neighborhood or toward the galactic anticenter.
8. There is a peculiar heliocentric stream, elevated at ∼25° above the Galactic plane, extending to ∼1–2 kpc in distance. It is unclear what the origin of this stream may be.

We thank Tristan Cantat-Gaudin and Luke Bouma for wonderful discussions. M.K. and K.C. acknowledge support provided by the NSF through grant AST-1449476, and from the Research Corporation via a Time Domain Astrophysics Scialog award (#24217). This work has made use of data from the European Space Agency (ESA) mission Gaia (https://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement.

Software: TOPCAT (Taylor 2005), HDBSCAN (McInnes et al. 2017), PyTorch (Paszke et al. 2017), GalPy (Bovy 2015).

Appendix: Auriga Neural Network for Structure Parameter Determination

A.1. Background

With the discovery of increasingly larger numbers of star clusters and associations with Gaia, it becomes increasingly important to use isochrone fitting to determine their ages to fully understand the dynamical evolution of the Galaxy. Unfortunately, the increased sample size makes commonly used techniques quite impractical. Some studies frequently rely on "chi-by-eye" estimates (and thus may be prone to produce erroneous and nonoptimal measurements), as few robust fitters are available. The ones that are available may be strongly dependent on the initial guesses of the fitted parameters and the corresponding choice of priors. They may struggle to converge, may be prone to overfitting and underestimating the uncertainties, may be slow to run, and may require careful vetting of final results. Finally, the isochrones themselves may not necessarily offer a perfect representation of the actual stellar photometry; for example, the low-mass stars may appear to be inflated compared to the isochrones (Jackson et al. 2018). Although these systematic discrepancies can be managed in the fitting process, they could add an external source of error. Neither approach is optimal when dealing with a large number of populations in a manner that could be considered uniform.

To date, the two largest and most commonly used catalogs of cluster parameters are Dias et al. (2002) and Kharchenko et al. (2013, hereafter MWSC). For a number of clusters, the parameters are copied between two catalogs verbatim. For the ones that were not duplicated, the dispersion in measured ages is 0.57 dex.

More recently, other studies have been conducted to rederive cluster ages in bulk. Liu & Pang (2019) have developed an autonomous pipeline for that purpose, estimating ages and distances (but not A_V) for 2443 clusters; however, it often obtains incorrect ages, forcing many to the lower limit of the isochrones used, <10 Myr, even when the clusters themselves appear to be evolved. Sim et al. (2019) have also performed isochrone fitting for 665 stellar groups, of which 188 reliably overlap with previously known clusters. Unfortunately, they did not provide membership lists used in their work nor their fitting algorithm. Furthermore, Bossini et al. (2019) have used BASE-9 (Robinson et al. 2016), deriving ages, distances, and extinctions (and corresponding uncertainties) for 269 clusters, producing a rather clean, albeit somewhat limited, sample. The consistency in these three aforementioned works is within 0.3 dex—a factor of 2 better than Dias et al. (2002) and MWSC.

In Paper I, we have attempted to use neural networks in order to derive parameters of stellar populations. An advantage of deep learning is that a fully trained neural network is very fast, capable of processing large volumes of data in a very short period of time in a uniform and repeatable manner, memorizing and constructing a function for the complete data model, instead of fitting each individual object separately. However, it requires a large training set that spans the full parameter range across which it would be possible to interpolate, and this training set has to be representative of the data in question. In Paper I, we used the MWSC catalog supplemented by synthetic cluster photometry, but that proved to be insufficient, producing acceptable predictions of age and A_V in less than half of all stellar populations, and still requiring the use of a fitter on top of it. That exercise did result in a large sample of stars that, in conjunction with the above catalogs, could now be used as a training sample in another improved iteration of the neural network.

A.2. Sample and Training

To assemble the training sample, we took the catalog from Paper I. We dereddened it using the A_V estimates from that work, using the transformations of ${A}_{{BP}}/{A}_{V}=1.068$, ${A}_{{RP}}/{A}_{V}=0.652$, ${A}_{G}/{A}_{V}=0.859$, ${A}_{J}/{A}_{V}=0.288$, ${A}_{H}/{A}_{V}=0.178$, ${A}_{K}/{A}_{V}=0.117$ (Marigo et al. 2017). Furthermore, we converted the apparent magnitudes to absolute using Gaia parallaxes.
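The two steps above can be sketched as follows. The extinction ratios are the ones quoted in the text; the conversion to absolute magnitude assumes a simple parallax inversion (d = 1000/π pc for π in mas), which is our simplification for illustration and may differ from the actual treatment of the parallaxes. Function and variable names are illustrative.

```python
import numpy as np

# A_band / A_V ratios quoted in the text (Marigo et al. 2017).
EXTINCTION_RATIO = {
    "BP": 1.068, "RP": 0.652, "G": 0.859,
    "J": 0.288, "H": 0.178, "K": 0.117,
}


def deredden(mag, band, a_v):
    """Remove extinction from an apparent magnitude in the given band."""
    return mag - EXTINCTION_RATIO[band] * a_v


def absolute_magnitude(mag, parallax_mas):
    """Absolute magnitude from apparent magnitude and parallax in mas.

    Assumes d = 1000 / parallax (pc), i.e., M = m + 5 log10(parallax) - 10.
    """
    return mag + 5.0 * np.log10(parallax_mas) - 10.0


# Hypothetical star: G = 12 mag, A_V = 1 mag, parallax = 2 mas (500 pc).
g0 = deredden(12.0, "G", 1.0)
m_g = absolute_magnitude(g0, 2.0)
print(f"G0 = {g0:.3f} mag, M_G = {m_g:.3f} mag")
```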

Because all of the sources in this catalog are nearby and therefore have smaller relative flux errors, it is better suited to flux manipulation than other catalogs. We grouped all stars in age bins of 0.1 dex, mixing different populations together. Each bin was then placed at a random distance of up to 20 kpc and reddened by up to 10 mag in A_V, and noise was applied to all the measurements. Sources with a resulting $G\gt 18$ mag were discarded (this propagates to the other bands as well, so that the faint magnitude limit of the synthetic population is consistent with that of the unaltered data), and 250 of the remaining stars were randomly chosen to create a set. This exercise was repeated 1000 times for each age bin, resulting in x random artificially constructed populations. All the data were arranged into a 7 × 250 tensor for each population, containing the G, G_BP, G_RP, J, H, and K magnitudes, as well as the parallax, of all 250 stars, ordered by G magnitude, and all the inputs were normalized in such a way that they fall within the range from −0.5 to 0.5.
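The construction of one artificial population can be sketched as below. This is an illustration under stated assumptions rather than the procedure actually used: the uniform sampling of distance and A_V, the noise amplitudes, the normalization ranges, the clipping, and the reuse of stars when fewer than 250 survive the magnitude cut are all hypothetical choices, as are the function and variable names.

```python
import math
import random

BANDS = ["G", "BP", "RP", "J", "H", "K"]
EXTINCTION_RATIO = {"G": 0.859, "BP": 1.068, "RP": 0.652,
                    "J": 0.288, "H": 0.178, "K": 0.117}
# Hypothetical normalization ranges (not taken from the paper).
MAG_RANGE = (-5.0, 30.0)   # mag
PLX_RANGE = (-1.0, 100.0)  # mas


def scale(value, lo, hi):
    """Map [lo, hi] onto [-0.5, 0.5], clipping values outside the range."""
    clipped = min(max(value, lo), hi)
    return (clipped - lo) / (hi - lo) - 0.5


def make_population(stars, rng, n_stars=250, max_dist_pc=20000.0, max_av=10.0,
                    g_limit=18.0, mag_sigma=0.01, plx_sigma=0.05):
    """Build one 7 x n_stars input from dereddened absolute magnitudes.

    `stars` is a list of dicts keyed by BANDS. Returns (tensor, dist, a_v),
    or None if no star survives the G-band magnitude cut.
    """
    dist = rng.uniform(10.0, max_dist_pc)   # assumption: uniform in distance
    a_v = rng.uniform(0.0, max_av)          # assumption: uniform in A_V
    dist_mod = 5.0 * math.log10(dist) - 5.0
    rows = []
    for star in stars:
        mags = [star[b] + dist_mod + EXTINCTION_RATIO[b] * a_v
                + rng.gauss(0.0, mag_sigma) for b in BANDS]
        if mags[0] > g_limit:               # discard sources with G > 18 mag
            continue
        plx = 1000.0 / dist + rng.gauss(0.0, plx_sigma)
        rows.append(mags + [plx])
    if not rows:
        return None
    if len(rows) >= n_stars:
        picked = rng.sample(rows, n_stars)
    else:                                   # assumption: reuse stars if too few
        picked = [rng.choice(rows) for _ in range(n_stars)]
    picked.sort(key=lambda r: r[0])         # order by G magnitude
    tensor = [[scale(r[i], *MAG_RANGE) for r in picked] for i in range(6)]
    tensor.append([scale(r[6], *PLX_RANGE) for r in picked])
    return tensor, dist, a_v
```

Fixed ranges are used for the normalization here, rather than a per-population minimum and maximum, so that the distance information carried by the apparent magnitudes and parallaxes is not removed by the scaling.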

This approach is preferable to using isochrones to generate synthetic cluster photometry, as it does not introduce systematic offsets in flux from incompatibilities between the model and the data. The differences between the real photometry and synthetic photometry may be slight, and a human would still recognize both as representing a population of similar age. Nonetheless, due to how neural networks interpret the data, purely synthetic photometry can add odd artifacts in the training process if the goal is to model real data.

The augmented sample is essential for the data model produced by the neural network to achieve better convergence compared to a sample produced only from real clusters. However, this sample still ends up being too "clean," and the reddening laws that have been used may be imprecise for wider bands without additional corrections in temperature (Anders et al. 2019). Thus, although this sample helps the network to learn the underlying trends, the artificial sample was supplemented with photometry from real clusters to teach it what actual data look like. We crossmatched the catalog from Cantat-Gaudin et al. (2018, hereafter CG18) with the cluster parameters from Bossini et al. (2019), Liu & Pang (2019), and Kharchenko et al. (2013), and added the resulting set to the unmodified catalog from Paper I. For the parameters, distances from Kharchenko et al. (2013) were replaced with the distances from CG18. Any clusters younger than 10 Myr in Liu & Pang (2019) were excluded, and as this catalog does not include extinction, this parameter was taken from MWSC.

Similarly to the artificially constructed populations, 250 stars from each real cluster were chosen randomly. Although the photometry was mostly unaltered, all the inputs were scattered by the corresponding flux and parallax uncertainties. If a particular cluster has fewer than 250 objects, the same star can be drawn multiple times, with slightly different fluxes from the errors, and each cluster was represented 20 times. Of the real clusters (including all of their realizations), 15% were set aside for validation purposes and were not included in training.
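A minimal sketch of this resampling is given below. It is illustrative only: the function names are hypothetical, the scatter is applied as Gaussian noise directly on each tabulated quantity (a simplification of scattering in flux), and the validation split is assumed to be made per cluster so that all realizations of a cluster stay on the same side.

```python
import random


def realizations(cluster, rng, n_stars=250, n_real=20):
    """Draw n_real noisy realizations of one cluster.

    `cluster` is a list of (values, errors) pairs, one per star, where both
    entries are lists of the 7 input quantities and their uncertainties.
    """
    out = []
    for _ in range(n_real):
        if len(cluster) >= n_stars:
            drawn = rng.sample(cluster, n_stars)
        else:
            # Small clusters: the same star may be drawn several times.
            drawn = [rng.choice(cluster) for _ in range(n_stars)]
        # Each draw is scattered independently by its own uncertainties.
        scattered = [[rng.gauss(v, e) for v, e in zip(values, errors)]
                     for values, errors in drawn]
        scattered.sort(key=lambda row: row[0])   # order by G magnitude
        out.append(scattered)
    return out


def split_clusters(names, rng, val_fraction=0.15):
    """Hold out a fraction of clusters (with all realizations) for validation."""
    shuffled = list(names)
    rng.shuffle(shuffled)
    n_val = round(val_fraction * len(shuffled))
    return shuffled[n_val:], shuffled[:n_val]
```

Splitting by cluster rather than by realization avoids the validation set containing noisy copies of clusters that the network has already seen.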

We note that when the network was trained on just the artificial populations or just on the real clusters, the performance was worse than in the combination of the two. The real data did not have the necessary bulk and were missing prominent parts of the parameter space, while the artificial data had minute inconsistencies with the real photometry. By combining the two, it was possible to overcome limitations of either data set and achieve a more general solution.

The neural network, which we refer to as Auriga, was constructed in PyTorch (Paszke et al. 2017); it predicts age, A_V, and the logarithm of distance. We experimented with various model architectures. The architecture that yielded the best loss was based on the mnist model;⁴ it resulted in a 20% better convergence compared to the model in Paper I, or compared to any other model we explored. The training used stochastic gradient descent to minimize the mean squared error loss, with the Adam optimizer and a learning rate of 1e−4, stopping when the loss in the real cluster data had converged.
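To make the optimization scheme concrete, the sketch below applies the Adam update rule to a mean squared error loss and stops once the validation loss no longer improves. It is a toy stand-in, not Auriga: a two-parameter linear model replaces the network, the update is written out in plain Python instead of using PyTorch, and the batch size, patience, and tolerance are hypothetical. Only the loss, the optimizer, and the default learning rate of 1e−4 follow the text.

```python
import math
import random


def mse(w, b, data):
    """Mean squared error of the linear model y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(train_set, val_set, rng, lr=1e-4, batch_size=16, max_epochs=1000,
          patience=20, tol=1e-6, beta1=0.9, beta2=0.999, eps=1e-8):
    """Mini-batch Adam on the MSE loss with early stopping on validation loss."""
    params = [0.0, 0.0]            # w, b
    m = [0.0, 0.0]                 # first-moment estimates
    v = [0.0, 0.0]                 # second-moment estimates
    step = 0
    best, stale = float("inf"), 0
    data = list(train_set)
    for _ in range(max_epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grad = [0.0, 0.0]      # gradient of the batch MSE
            for x, y in batch:
                err = params[0] * x + params[1] - y
                grad[0] += 2.0 * err * x / len(batch)
                grad[1] += 2.0 * err / len(batch)
            step += 1
            for i in range(2):     # bias-corrected Adam update
                m[i] = beta1 * m[i] + (1 - beta1) * grad[i]
                v[i] = beta2 * v[i] + (1 - beta2) * grad[i] ** 2
                m_hat = m[i] / (1 - beta1 ** step)
                v_hat = v[i] / (1 - beta2 ** step)
                params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
        val_loss = mse(params[0], params[1], val_set)
        if val_loss < best - tol:
            best, stale = val_loss, 0
        else:
            stale += 1
            if stale >= patience:  # validation loss has stopped improving
                break
    return params[0], params[1], best
```

In practice the same structure would be expressed with a PyTorch optimizer and loss function; the explicit version is shown only to expose the update and the stopping criterion.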

Neural networks themselves do not generate uncertainties in the predicted parameters. However, it is possible to use the method described in Olney et al. (2020) to estimate them. The inputs that were scattered by the errors are statistically comparable, but the neural network treats each realization of the same data set as independent, producing a slightly different solution for each one. This makes it possible to measure the scatter between them.
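As a sketch of this idea, the predictions for all realizations of one population can be reduced to a central value and a scatter per parameter. The use of the mean and the sample standard deviation is an assumption made here for illustration, and the function name and the numbers in the example are hypothetical.

```python
import statistics


def summarize(predictions):
    """Reduce per-realization predictions to (mean, scatter) per parameter.

    `predictions` is a list of (age, a_v, log_dist) tuples, one for each
    noisy realization of the same population.
    """
    columns = list(zip(*predictions))
    return [(statistics.mean(c), statistics.stdev(c)) for c in columns]


# Three hypothetical realizations of one population.
example = [(8.0, 0.5, 2.0), (8.2, 0.7, 2.0), (8.4, 0.9, 2.0)]
age, a_v, log_dist = summarize(example)
```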

A.3. Validation

The trained Auriga model is made available on GitHub,⁵ and the current version is deposited in Zenodo (Kounkel 2020). The predicted parameters for clusters from CG18 are presented in Table A1.

Table A1. Parameters of Clusters from Cantat-Gaudin et al. (2018)^a

Cluster	Theia ID^b	Age (dex)	A_V (mag)	Dist. (pc)
ASCC_10	557	8.27 ± 0.11	0.80 ± 0.05	664.1 ± 15.4
ASCC_101	445	8.60 ± 0.06	0.21 ± 0.04	394.6 ± 7.1
ASCC_105	544	8.17 ± 0.10	0.56 ± 0.07	563.5 ± 17.6

Notes.

^a Includes clusters from Cantat-Gaudin et al. (2019) and Castro-Ginard et al. (2019, 2020).
^b Refers to the closest matched populations in Table 2. Note that multiple clusters can be associated with the same underlying population.

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.


The comparison of these predictions to the fits presented in other papers is shown in Figure A1 for age, Figure A2 for extinction, and Figure A3 for distance. Although, as expected, there is a somewhat better convergence on the training sample than on the test sample that the network has not seen or learned from, the two tend to achieve comparable performance.

**Figure A3.** Comparison of distances derived by Auriga to those from other studies.

The reported uncertainties reproduce the differences between the parameters derived in this work and those in other surveys to within a factor of 2–3, depending on the underlying precision of the survey and the parameter in question. We note that the corresponding errors (most notably in age) in these surveys are typically not provided; including them would account for some of the remaining scatter. Thus, the uncertainties do largely appear to be representative of the underlying error distribution in the data.

There are slight systematic deviations between our work and some data sets. We underestimate the age of the old clusters and overestimate the age of the young clusters by ∼0.1 dex, with the split between the two occurring at an age of ∼120 Myr, the transition age at which all of the low-mass stars would have already arrived on the main sequence and the high-mass stars would not yet have evolved away from it in bulk. Because the best tracers of age in the young and the old populations are very different, the fitting of their isochrones tends to require different strategies. The slight resulting "S" curve is apparent in most panels in Figure A1, to various degrees. We note that the ages from Sim et al. (2019) show the most consistent linear correlation with our ages, without this discontinuity.

In extinction, there is a systematic offset of 0.18 mag between the A_V derived in this work and that found by Bossini et al. (2019) and Paper I. There is no bulk zero-point offset in A_V relative to MWSC and Dias et al. (2002), although the scatter is much larger.

The distance to a population can generally be computed in two ways: through averaging the parallaxes of the individual stars and through calculating the distance modulus from their photometry. Auriga considers the weights from both the photometry and the parallax in its interpolation. However, Gaia DR2 parallaxes are systematically too small, with quoted values of this offset ranging from −29 μas to −82 μas (Lindegren et al. 2018; Stassun & Torres 2018; Bobylev 2019; Xu et al. 2019). Some population studies do take this offset into account and some do not, but these offsets become increasingly important the farther away the cluster is located.
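The two distance estimates, and the growing impact of a parallax zero-point offset with distance, can be illustrated with the short sketch below. The function names are hypothetical, the parallax is inverted directly (which is only adequate for precise, positive parallaxes), and the −29 μas value is simply the smallest of the offsets quoted above.

```python
import math


def distance_from_parallax(parallax_mas, zero_point_mas=0.0):
    """Distance in pc from a parallax in mas.

    `zero_point_mas` is the systematic offset of the measured parallaxes
    (negative for Gaia DR2); the corrected parallax is observed - zero point.
    """
    corrected = parallax_mas - zero_point_mas
    if corrected <= 0:
        raise ValueError("simple inversion needs a positive parallax")
    return 1000.0 / corrected


def distance_from_modulus(apparent_mag, absolute_mag, extinction=0.0):
    """Distance in pc from the distance modulus, m - M - A = 5 log10(d) - 5."""
    return 10.0 ** ((apparent_mag - absolute_mag - extinction + 5.0) / 5.0)


def fractional_bias(true_dist_pc, zero_point_mas):
    """Relative distance error if the zero-point offset is left uncorrected."""
    observed = 1000.0 / true_dist_pc + zero_point_mas
    return distance_from_parallax(observed) / true_dist_pc - 1.0


# Uncorrected bias for an offset of -29 microarcseconds at three distances.
biases = {d: fractional_bias(d, -0.029) for d in (200.0, 1000.0, 5000.0)}
```

Because the offset is a fixed amount in parallax, its fractional effect on the inferred distance grows roughly in proportion to the distance itself.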

As a result, there is a zero-point offset between the distances derived here and some of the other works, namely −16 μas relative to Liu & Pang (2019) and −26 μas relative to CG18. There is no offset in comparison to Bossini et al. (2019) or Sim et al. (2019). This holds consistently for populations located between 200 and 5000 pc. Beyond 5000 pc, the offset for the latter increases to −70 μas. We note that when the native distances were kept in the MWSC training subsample instead of being replaced with the ones from CG18, the result tends to be a more uniform offset of −70 μas at all distances, although, as that work predates Gaia, this also results in a much greater uncertainty and scatter in the predicted distance. For the populations located closer than 200 pc, Auriga tends to overestimate their distance by ∼20 pc; as only a few clusters are found in that space, such an overshoot is probably related to edge effects of the interpolation.
