The kinematics of young stellar population in the W5 region of the Cassiopeia OB6 association: implication on the formation process of stellar associations

The star-forming region W5 is a major part of the Cassiopeia OB6 association. Its internal structure and kinematics may provide hints of the star formation process in this region. Here, we present a kinematic study of young stars in W5 using the Gaia data and our radial velocity data. A total 490 out of 2,000 young stars are confirmed as members. Their spatial distribution shows that W5 is highly substructured. We identify a total of eight groups using the k-means clustering algorithm. There are three dense groups in the cavities of H II bubbles, and the other five sparse groups are distributed at the ridge of the bubbles. The three dense groups have almost the same ages (5 Myr) and show a pattern of expansion. The scale of their expansion is not large enough to account for the overall structure of W5. The three northern groups are, in fact, 3 Myr younger than the dense groups, which indicates the independent star formation events. Only one group of them shows the signature of feedback-driven star formation as its members move away from the eastern dense group. The other two groups might have formed in a spontaneous way. On the other hand, the properties of two southern groups are not understood as those of a coeval population. Their origins can be explained by dynamical ejection of stars and multiple star formation. Our results suggest that the substructures in W5 formed through multiple star-forming events in a giant molecular cloud.


INTRODUCTION
Star formation takes place on a few parsecs to several hundreds of parsecs scales in a hierarchical way (Elmegreen et al. 2000).Stellar associations are the superb laboratories to study star formation process on such different spatial scales as they are the prime starforming sites distributed along the spiral arm structure in the host galaxies (Battinelli et al. 1996;Lada & Lada 2003;Gouliermis 2018).OB associations are particularly interesting stellar systems because they contain a Corresponding author: Beomdu Lim blim@kongju.ac.kr number of massive stars (Ambartsumian 1947), which are rare in the solar neighborhood.OB associations are, in general, composed of a single or multiple stellar clusters and a distributed stellar population (Blaauw 1964;Koenig et al. 2008a;Lim et al. 2019Lim et al. , 2020)).This internal structure may be closely associated with their formation processes.
Expansion of stellar clusters has been steadily detected in many associations (Kuhn et al. 2019;Lim et al. 2019Lim et al. , 2020Lim et al. , 2021)).These findings seem to be the key features to understand the unboundedness of associations according to a classical model for the dynamical evolution of embedded clusters after rapid gas expulsion (Tutukov 1978;Hills 1980;Lada et al. 1984;Kroupa et al. 2001;Banerjee & Kroupa 2013, 2015).Based on the observational data, Lim et al. (2020) suggested that the young stellar population distributed over 20 pc in the W4 region of the Cassiopeia OB6 association originates from escaping stars from the central open cluster IC 1805.
However, cluster expansion alone cannot explain the origin of substructures commonly found in stellar associations.Such substructures are composed of stellar groups (or subclusters) (Kuhn et al. 2014) that are kinematically distinct (Lim et al. 2019(Lim et al. , 2021;;Lim et al. 2022).The formation of substructures can naturally be explained by star formation along filaments in almost all turbulent clouds (André 2015).A range of gas densities leads to different levels of star formation efficiencies.High-density regions are the sites of cluster formation (Bonnell et al. 2011;Kruijssen 2012).Gas clumps have different sizes and velocity dispersions depending on their virial states, which is observed as the so-called size-line width relation (Larson 1981).There are attempts to detect this signature from substructures in stellar associations (Lim et al. 2019;Ward et al. 2020).
Since Elmegreen & Lada (1977) proposed the socalled collect and collapse scenario, a number of observational studies have reported the signatures of feedbackdriven star formation, such as the morphological relationship between remaining gas structures and young stellar objects (YSOs), and their age sequences (Fukuda et al. 2002;Sicilia-Aguilar et al. 2004;Zavagno et al. 2007; Koenig et al. 2008a;Lim et al. 2014b, etc.).Recently, the physical causality between the first and the second generations of stars was assessed by using gas and stellar kinematics (Lim et al. 2018(Lim et al. , 2021)).Meanwhile, a series of theoretical work showed that feedback from massive stars predominantly suppresses subsequent star formation by dispersing remaining clouds (Dale et al. 2012(Dale et al. , 2013;;Dale et al. 2015).This result is supported by recent observations (Yi et al. 2018(Yi et al. , 2021)).The cores in the λ Orionis cloud exposed to a massive O-type star have higher temperatures, lower densities, lower masses, smaller sizes, and lower detection rates of dense gas tracers (N 2 H + , HCO + , and H 13 CO + ) than those in the adjacent star-forming clouds Orion A and B, which implies the former cloud has less favorable conditions for core formation than the others.Therefore, further observational studies are required to test the collect and collapse scenario.
Early studies of the bright-rimmed cloud IC 1848A (W5A/S201) at the border of the giant H II region suggested that star formation in the cloud had been triggered by the expansion of the H II region (Loren & Wootten 1978;Thronson et al. 1980).Wilking et al. (1984) also found another possible site of feedback-driven star formation at the northern cloud (W5NW).The doublepeaked 13 CO (J = 1 − 0) line they observed was interpreted as a result of the passage of a shock driven by the ionization front.Later, it was found that a number of YSOs and cometary nebulae were distributed along the H II bubble (Karr & Martin 2003;Koenig et al. 2008a,b).In addition, YSOs far away from the ionizing sources tend to be at an earlier evolutionary stage of protostars (Koenig et al. 2008a).These results were interpreted by feedback-driven star formation models (Elmegreen & Lada 1977;Sandford et al. 1982).
The presence of multiple clusters, distributed stellar population, and the young stars distributed along the border of H II regions suggest that this SFR might have been formed through multiple processes.The absence of kinematic information has hindered our understanding of its formation process.However, the parallax and proper motion (PM) data obtained from the Gaia mission (Gaia Collaboration et al. 2016) along with radial velocities (RVs) allow us to evaluate the membership of young stars and further investigate their kinematic properties.In this study, we aim to understand the formation process of this SFR.Data that we used are described in Section 2. In Section 3, the scheme of genuine members is addressed.We present the results of this study in Section 4 and discuss the star formation process within W5 in Section 5. Finally, our results are summarized in Section 6 along with our conclusions.

Selection of member candidates
Most OB associations are distributed along the Galactic plane (Wright 2020), and therefore a large number of field interlopers are observed together in the same field of view.The member selection is a procedure of crucial importance to obtain reliable results as emphasized by our previous observational studies (e.g., Lim et al. 2020Lim et al. , 2021;;Lim et al. 2022).We selected members through two steps.First, the candidates of young star members were identified using several spectrophotometric criteria.Second, the final member candidates can be selected in the parallax and PM cuts.
We first gathered four different catalogues.Massive O-and B-type stars found in SFRs are probable member candidates because of their short lifetime, especially for O-type stars.We obtained the lists of such O-and B-type stars from several databases of MK classification (Wenger et al. 2000;Reed 2003;Skiff 2009;Maíz Apellániz et al. 2013).A catalogue of 192 O-and B-type stars in W5 region was created after some duplicates were removed.Koenig et al. (2008b) published a list of 17,771 infrared sources distributed over W5 region.We took only 2,062 sources showing infrared excess.Later, a catalogue of 408 YSO candidates was released by Koenig & Allen (2011).This catalogue contains the spectral types and Hα equivalent widths of the stars.We considered stars with Hα equivalent widths smaller than −10 Å and 0 Å as Hα emission stars and candidates, respectively.The last catalogue contains a total of 567 members in W5 West selected using U BV I and Hα photometry (Lim et al. 2014a).
We cross-matched the four catalogues to create a master catalogue of member candidates.All O-and B-type stars were found in the catalogue of Koenig et al. (2008b) except one.A total 564 out of 567 member candidates from Lim et al. (2014a) have the infrared counterparts.Among the three candidates without infrared counterparts, two candidates are Hα emission stars, and the other one is an early-type star.Since they are highly probable members, we added these four sources to the master catalogue.All the YSO candidates from Koenig & Allen (2011) were included in the infrared source list of Koenig et al. (2008b).The master catalogue contains a total of 2376 member candidates, of which 2000 have counterparts in the catalogue of Gaia Early Data Release 3 (EDR3; Gaia Collaboration et al. 2021).
The parallaxes of Gaia EDR3 have zero-point offsets as a function of magnitude, color, and ecliptic latitude (Lindegren et al. 2021).We corrected such offsets for the parallaxes of individual member candidates using the public Python code (Lindegren et al. 2021; https://gitlab.com/iccub/public/gaiadr3zeropoint).In the catalogue of member candidates, we did not use stars with negative parallaxes or close companion (duplication flag = 1 or RUWE > 1.4) and stars without astrometric parameters in analysis.Figure 1 displays the colormagnitude diagram (CMD) of the member candidates.

Radial velocities
We performed multi-object spectroscopic observations of 273 YSO candidates on September 3, October 30 in 2020, and October 22, 27, and 29 in 2021 using highresolution (R ∼ 34, 000) multi-object spectrograph Hectochelle (Szentgyorgyi et al. 2011) on the 6.5m telescope of the MMT observatory.All the spectra were taken with the RV31 filter that covers a spectral range of 5150 to 5300 Å in a 2 × 2 binning mode.For one observation setup, several tens of fibers were assigned to the YSO candidates, and the others were directed toward blank sky to obtain sky spectra.The exposure time for each frame was set to 35 minutes.A minimum of three frames were taken for the same observation setup to eliminate cosmic rays and achieve as high a signal-to-noise ratio as possible.For calibration, dome flat and ThAr lamp spectra were also obtained just before and after the target observation.

Lim et al.
We reduced the raw mosaic frames using the IRAF1 /MSCRED packages following standard reduction procedures.One-dimensional spectra were subsequently extracted from the reduced frames using the dofiber task in the IRAF/SPECRED package.Target spectra were then flattened using dome flat spectra.The solutions for the wavelength calibration obtained from ThAr spectra were applied to both target and sky spectra.Some spectra were affected by the scattered light because our observations were conducted under bright sky condition.The scattered light was unevenly illuminated over the field of view (1 • in diameter), resulting in a spatial variation of sky levels.Hence, we constructed a map of sky levels for a given setup following the procedure used in our previous study (see Lim et al. 2021 for detail).Target spectra were subtracted by sky spectra scaled at given target positions.The sky-subtracted spectra for the same target were then combined into a single spectrum.Finally, all target spectra were normalized by using continuum levels traced from a cubic spline interpolation.We rejected the spectra of 115 targets from subsequent analysis.Among them, the spectra of 108 targets have signals close to the sky background levels, and therefore the signals of these spectra were insufficient to measure RVs.The spectra of six targets were dominated by continuum, and that of the other one is dominated by emission lines.
We measured the RVs of the rest 158 YSO candidates using a cross-correlation technique.Synthetic stellar spectra for the solar abundance and log g = 4 were generated in a wide temperature range of 3,500 to 10,000 K using SPECTRUM v2.76 (Gray & Corbally 1994) 2 based on a grid of the ODFNEW model atmospheres (Castelli & Kurucz 2004).These synthetic spectra were used as template spectra.We derived the cross-correlation functions between the synthetic spectra and the observed spectra of the YSO candidates with xcsao task in the RVSAO package (Kurtz & Mink 1998).The velocities at the strongest correlation peaks were adopted as the RVs of given YSO candidates.The errors on RVs were estimated using the equation as below (Kurtz & Mink 1998): where w, h, and σ a represent the full widths at halfmaximum of cross-correlation functions, their amplitudes, and the root mean square of antisymmetric components, respectively.Rapidly rotating stars, in general, have large uncertainties in RVs because they have large full widths at half-maximum of cross-correlation functions.Also, the RV errors exponentially increase as the r-statistics of cross-correlation functions (r = h/ √ 2σ a ; Tonry & Davis 1979) decrease.Indeed, it was confirmed that the cross-correlation functions with rstatistics smaller than 6 yield very large errors of RVs (> 5 km s −1 ).We thus excluded some RV measurements where the r-statistics of cross-correlation functions is less than 6.The median error of the RVs is about 1.2 km s −1 .The RVs of YSO candidates were then converted to velocities in the local standard of rest frame using the IRAF/RVCORRECT task.

MEMBER SELECTION
We selected the member candidates based on the spectrophotometric properties of young stars.However, there may be a number of nonmembers in the catalogue of member candidates (see Lim et al. 2020Lim et al. , 2021;;Lim et al. 2022).For instance, the two bright infrared sources (G RP < 10 mag and G BP − G RP > 3) in Figure 1 are probably asymptotic giant branch stars in the Galactic disk because they are too bright to be the pre-mainsequence members of this SFR.It is, thus, necessary to filter out additional nonmembers using the Gaia parallaxes and PMs of stars.In order to exclude stars with very large measurement errors in parallax and PM, we used stars that are brighter than 18 mag in G RP band and have parallaxes greater than three times the associated errors.Note that a total of 863 candidates are fainter than 18 mag.The left panel of Figure 2 displays the parallax distribution of the member candidates.Most member candidates have parallaxes smaller than 1 mas (d > 1 kpc).The distances determined from previous studies range from 1.7 to 2.3 kpc (Sharpless 1955;Johnson et al. 1961;Becker & Fenkart 1971;Georgelin & Georgelin 1976;Moffat 1972;Loktin et al. 2001;Chauhan et al. 2011;Lim et al. 2014a).However, we considered candidates between 1.0 kpc and 3.5 kpc to contain as many probable members as possible.Their PMs distribution is shown in the right panel of Figure 2.
The PM distribution shows the strong concentration of member candidates around (µ α cos δ, µ δ ) = (0 mas yr −1 , 0 mas yr −1 ).Most of them may be genuine members.In order to remove PM outliers, the statistical clipping method described in Lim et al. (2022) was applied for the member candidates.
We excluded some member candidates with PMs larger than five times the standard deviation (5σ) from the mean PMs.This criterion allows us to select some walkaway stars.The mean and standard deviation values were redetermined using the remaining member candidates.This iterative process was performed until the statistical values reached constant values.
Figure 3 displays the CMD of the selected members.However, the bright infrared source (G RP = 7.1 and G BP −G RP = 3.2) was still selected as a member.As we   In order to compute the reliable distance to W5, we used members with parallaxes larger than 10 times the associated errors.The bin sizes of 0.2 kpc and 1.2 km s −1 were used to obtain the distance distribution and RV distribution, respectively.The red curves represents the best-fit Gaussian distributions.
mentioned, this star may be an asymptotic giant branch star with similar kinematics to those of W5 members at almost the same distance.We excluded this star in our member list.A total of 490 candidates were finally selected as members.The members are listed in Table 1.Member candidates that we selected using optical and infrared data are active stars with warm circumstellar disks.There are, in fact, a number of diskless YSOs in this SFR.Therefore, it is necessary to determine whether or not our member sample is representative for statistical analysis.
Such diskless YSOs without infrared excess emission have been identified using X-ray data in many previous studies (Getman et al. 2005;Flaccomio et al. 2006;Townsley et al. 2011;Caramazza et al. 2012, etc).However, any extensive X-ray survey has not yet been per- formed for W5.Several hundreds of X-ray sources only in the eastern edge of W5 East (AFGL 4029) were detected (Townsley et al. 2019).We found a total of 257 X-ray counterparts in the Gaia EDR3 catalogue (Gaia Collaboration et al. 2021), of which 56 were genuine members brighter than 18 mag in G RP according to our member selection criteria.The member catalogue of this study contains 15 out of 56 X-ray sources.The PMs of members in the catalogue were compared with those of the 56 members with X-ray emission.As a result, they have median PM (−0.110 mas yr −1 , −0.055 mas yr −1 ) similar to that of the X-ray members (−0.122 mas yr −1 , −0.092 mas yr −1 ).Hence, we confirmed that the members selected in this study are representative sample of young stellar population in W5.We present the spatial distribution of members in Figure 4. W5 region has a high level of substructures.There are several groups of stars with high surface density and a distributed stellar population.The identification of stellar groups is addressed in a later section in detail.Figure 5 shows the distance and RV distributions of members.The distances of individual members were obtained from the inversion of the zero-point-corrected Gaia parallaxes (Gaia Collaboration et al. 2021;Lindegren et al. 2021).These two distributions were fit to Gaussian distributions, respectively.We obtained the distance to W5 to be 2.1 ± 0.1 (s.d.) kpc and its systemic RV to be −37.8±3.3 km s −1 from the center values of the best-fit Gaussian distributions.RV data within three times the standard deviation from the mean RV were used to minimize the contribution of close binaries. [mag] [mag] [ 4. RESULTS

Substructure
Our previous studies have shown that stellar groups constituting the substructure in associations are spatially and kinematically distinct (Lim et al. 2019(Lim et al. , 2020(Lim et al. , 2021;;Lim et al. 2022).This means that they are individually different physical systems.We identified stellar groups in W5 by means of the unsupervised machine learning algorithm k-means clustering (Lloyd 1982).This algorithm finds a set of groups that have the smallest variance of each group (i.e., most compact groups) at a given number of groups.We used four-dimensional parameters as input data: R.A., decl., µ α cos δ, and µ δ .To find the optimized number of groups, we tested a number of groups from 1 to 20 and then computed the inertia value that is a sum of squared distances of stars to the centroid of their nearest group.
Figure 6 displays the variation of the inertia values with respect to the number of stellar groups.Adopting a small number of groups prevents many real stellar groups from being identified, while adopting a large number of groups can lead to an overestimation of the number of genuine groups.In the figure, the location where an abrupt change of the inertia value occurs is referred to as the elbow.The elbow value is useful to determine the optimal number of groups.It is assumed that the variation of the inertia values approximates the combination of two straight lines (red lines in the figure).These two lines were obtained using a least square method for the number of groups ranging from 2 to 9 and 10 to 19, respectively.The elbow value was determined as the intersection of two straight lines.Finally, we adopted eight groups that constitute the substructure in W5.The identified groups are plotted by different colors in Figure 7.These stellar groups were named according to R.A. order (A to H).
The groups C (green), D (blue), and F (purple) are stellar clusters denser than the other groups in W5.The two former groups are located at W5 West, while the latter ones are centered at the W5 East.Early-type stars in these three groups may be the main ionizing sources of W5.There are five sparse groups of stars around these clusters.The groups A, E, and H are located at the border of the northern H II bubbles, and the other two groups B and G are found in the southern part of the H II regions.We summarize the properties of individual groups in Table 2.
In order to quantify the structural properties of the eight groups in W5, the minimum spanning tree (MST) technique were applied to the eight groups.The MST technique is to find the minimum length of the edges to connect all data points, which is often used to measure the degree of mass segregation (Allison et al. 2009).In this work, we measured a dimensionless parameter Λ defined as the ratio of standard deviation of length of edges to the mean length of edges (Hong et al. 2017).The larger Λ means that there is a substructure in a group, such as core, clumps or filamentary structures.On the other hand, the lower Λ means that there is no structural trend in the group and, for example, the Λ becomes ∼ 0.46 when the group follows a uniform random distribution.The result of MST analysis can be found in Table 2.Although the MST results of individual groups rely on the membership determination of host groups by the clustering algorithm, the results clearly show a trend that dense groups show larger Λ, and sparse groups show lower Λ.Especially, the MST results for three groups (B, E, and G) show that there is no significant structure.
We DBSCAN finds groups based on the densities of data points.A main parameter is a radius ( ) in consideration of neighboring data points.The value from 3.5 to 3.0 identifies 7 to 10 groups.When is 3.5, DBSCAN identified two major groups in the east and west (D+C and F+H in Figure 7).As decreases, these groups tend to be split.DBSCAN could not identify sparse groups (A, B, E, and G) in all cases; these group members were identified as noises.
DBSCAN is limited to identify groups with similar density, and therefore we implemented hierarchical HDBSCAN (Campello et al. 2013).Unlike the DB-SCAN algorithm, HDBSCAN can identify clusters with various density.A major parameter is the minimum sample size (n), which is a number of neighboring points to be considered as cores.Larger n provides more conservative clustering results.We tested n from 5 to 20.When Similar to DBSCAN, HDBSCAN identified denser major groups in east and west (C, D, and H+F).These groups were split when using smaller n.Two sparse groups B and G were identified when using n = 5.Other sparse groups A and E were not identified in any cases.We then tested a hierarchical clustering algorithm, Agglomerative Clustering.The major parameter is the number of groups.This algorithm was tested for the number of groups (n groups ) from 4 to 14.When n groups =7, six groups (A, B, C, D, G, and H) were identified and the E+F group was identified as a single group.Larger n groups values result in splitting groups such as A, D, and H. Smaller n groups tends to combine groups; when n groups =4, group B and G are combined to C and F, respectively.While details are different, major results (east, west, and southern sparse groups) are similar to those obtained from the k-means clustering.
All clustering algorithms that we used, in common, properly identified the dense groups although the memberships of stars at the boundaries of host groups is slightly different.The sparse groups were regarded as noise for the DBSCAN and HDBSCAN algorithms while the Agglomerative Clustering and k-means clustering algorithms identified them as real groups.Therefore, we should cautiously adopt the clustering results because the internal structures in SFRs are, in fact, very complex and related to their formation processes.Further information is required to determine if they are real physical systems.The ages and kinematics of group members may provide additional constraints on the identities of individual groups.Since some sparse groups seem to be real groups associated with remaining clouds, adopting the result from the k-means clustering is suitable for the purpose of this study.

Relative Ages
The star formation history in W5 can be inferred from the age distribution of stellar groups.The representative ages of individual groups are, in general, estimated from comparison of the overall features of CMDs with stellar evolutionary models.Especially, the luminosity of the main-sequence turn-on (MSTO) point (pre-mainsequence to main-sequence) is sensitive to the age of a given stellar group.
Figure 8 displays the distance-corrected CMDs of individual groups with theoretical isochrones.The age of the group C (the open cluster IC 1848, green dots) is known to be about 1-5 Myr (Moffat 1972;Karr & Martin 2003;Koenig & Allen 2011;Lim et al. 2014a).We estimated the age of this cluster fitting the MESA isochrones considering the effects of stellar rotation (Choi et al. 2016;Dotter 2016) to the CMD of the cluster.The distance modulus (DM) of 11.6 mag (2.1 kpc) was applied to the 5 Myr isochrone.The minimum extinction A V of 1.86 mag was adopted from our previous study (Lim et al. 2014a).This reddened isochrone fits well to the ridge of main-sequence as well as the luminosity of the MSTO.Therefore, we adopt 5 Myr as the age of this group.There are a number of stars brighter than the 5 Myr isochrone at given colors.These stars may be highly reddened stars (see the blue curve in the figure).
We compared the MSTO and the overall features of the CMDs of individual stellar groups with that of the group C. The CMD morphology of the two dense groups D (blue) and F(purple) is very similar to that of the group C, and therefore the three dense groups would have similar age (5 Myr).The members of the sparse group A, E, and H are brighter and redder than those of the group C. We plotted the 2 Myr isochrones reddened by 3.00 mag, which passes through the middle of their CMDs.The color spread from the isochrones implies the presence of differential reddening across each group.In fact, they are located at the border of the H II region where large amounts of gas remain.This result indicates that there is an age difference between the northern sparse groups and dense groups.Therefore, the group division by the k-means clustering is meaningful.
On the other hand, the representative ages of the two southern groups B (red) and G (pink) are unclear as their MSTO is not well defined.The overall morphology of their CMDs are somewhat different from those of the other groups.There is no star between 2 and 4 mag in G RP − DM .The stars brighter than 2 mag close to the 2 Myr isochrone, while the faint stars seem to be older (> 5 Myr) than the bright stars.We speculated that these two groups may not be the groups of coeval stars with the same origin.4) and ( 5

Kinematics
W5 has systemic PMs of −0.273 mas yr −1 and −0.333 mas yr −1 along R.A. and decl., respectively.We investigated the kinematics of individual groups.The tangential velocities (V R.A. and V decl. ) were computed from the PMs multiplied by the distance of 2.1 kpc. Figure 9 exhibits the distributions of velocities with respect to R.A. and decl.Since most groups are distributed along the east-west direction rather than the north-south direction, it is easier to probe some trends in position-velocity plane along R.A.
There is no clear tendency between V R.A. and positions of stars along R.A., while a gradual variation of V decl.along R.A. is detected.V decl.decrease at 0.08 km s −1 pc −1 .Any significant large-scale variation in RV was not found.We present the median velocities (V R.A.,med , V decl.,med , and RV LSR,med ) in Table 2.
We computed the standard deviation of the velocities of individual groups after excluding some outliers.The median measurement errors were adopted as the typical velocity errors.The velocity dispersions of given groups were then obtained from quadratic subtraction between the standard deviation values and the typical velocity errors.For RVs, we computed the velocity dispersions of the four groups that have more than ten stars with RV measurements.The velocity dispersions are presented in Table 2.
The dense, populous groups C, D, and F tend to have velocity dispersions smaller than those of the other groups.In addition, the motions of stars in such groups seem to be nearly isotropic, given similar velocity dispersions among the tangential velocities and RV.The group A shows a large velocity dispersion in V R.A. , while it has a small value in V decl.(see also Figure 9).The groups B and G have particularly large velocity dispersions.
We investigated the motions of individual groups within W5. Figure 10 shows the median PMs of individual groups relative to the systemic motion of W5.The number of members in W5 West is twice that in W5 East, so the groups A, C, and D have relative PMs close to the systemic motion of W5.The eastern groups E, F, and H are moving toward north, while the southern groups B and G are radially receding away from the center of this association.
Finally, the motion of individual members relative to the center of their host groups is probed in Figure 11.The panels in the first and third rows of the figure exhibit the relative PM vectors of individual members in given groups.The group members show different pattern of motion.Some members of the groups C, D, and F seem to be moving outward from their group center.The PM vectors of members in the other groups have somewhat random direction.
In order to quantitatively investigate the direction of their motion, we computed the vectorial angle (Φ) that is defined by the angle between the position vector from the group center and the relative PM vector of members (Lim et al. 2019(Lim et al. , 2020(Lim et al. , 2021;;Lim et al. 2022).A Φ = 0 • indicates that a star is radially receding away from the group center, while a Φ = 180 • means that it is sinking inward.
The panels in the second and fourth rows of Figure 11 displays the Φ distribution with respect to the projected distances from the center of each group.More than 40% of members in the groups C, D, and F have Φ values close to 0 • .This result indicates that these groups are expanding.The group B and G also shows a pattern of expansion, but the number of members are too small to confirm this claim.The members of the other groups show random Φ distributions, indicating random motion.The biggest difference between the dense groups and the two southern groups is that the dense groups have nearly isotropic inner region, which means that these groups are self gravitating.
Figure 12 displays the integrated intensity map of the 12 CO J = 1 − 0 taken from Heyer et al. (1998).The integrated intensity map clearly shows the cavities in W5, where the groups C, D, and F incubating massive stars are located (see also Koenig et al. 2008a).Most molecular clouds are distributed in the north of the three groups.The members of the stellar groups A, E, and H are spread over the northern clouds.Meanwhile, a small amount of clouds remain in the southern region.
The molecular clouds have RVs in a range of −45 km s −1 to −35 km s −1 as shown in the position-velocity (PV) diagrams of Figure 12.Some clouds appear to be influenced by massive stars in groups C, D, and F. This aspect is found in the PV diagram along R.A.Although there is a scatter in stellar RVs, the RVs of molecular clouds are slightly smaller than those of the adjacent stellar groups hosting massive stars.Similar results were also found in some star-forming regions (Lim et al. 2018(Lim et al. , 2021)).

STAR FORMATION IN W5
The extent of W5 is over 70 pc, and a high-level of substructures are found within the SFR.The dense stellar groups C, D, and F are located in the cavities of the giant H II regions.Their age estimated from the MSTO is, in common, about 5 Myr, indicating that they formed in almost the same epochs.These dense groups are older than the other sparse groups, and therefore the star formation had been ignited at their current locations.
These dense groups have velocity dispersions smaller than the other groups.If the small velocity dispersions indicate their physical state of their natal cloud, the cloud might have rapidly reached a subvirial state.Such a process may favorably occur in dense filaments by gravitational instability (André et al. 2010), which may lead to cluster formation (Bonnell et al. 2011;Kruijssen 2012).Hence, the three dense groups might have formed in the densest region in a giant molecular cloud on a very short timescale.
About 40% of stars in the dense groups are escaping from their host groups.Some stars scattered over this SFR may have originated from the expansion of such dense groups.However, their expansion pattern is not as significant as that found in IC 1805 which has a core-halo structure (Lim et al. 2020).We therefore argue that the expansion of the dense groups may not be the origin of the overall structure in W5.The age difference between the dense groups and the northern groups supports this argument.Despite, it is expected that their expansion will lead to a distributed stellar population over several Myrs.
Star formation propagated to the northern part of W5 2 Myr ago.The sparse groups (A, E, and H) formed along the ridge of the H II regions.A number of previous studies have proposed that W5 is the site of feedbackdriven star formation (Loren & Wootten 1978;Thronson et al. 1980;Wilking et al. 1984;Karr & Martin 2003;Koenig et al. 2008a).Koenig et al. (2008a) suggested that the radiatively driven implosion mechanism (Klein et al. 1985) operates on a small spatial scale, e.g., cometary globules or elephant trunk structures in the southern ridge of W5 West, while the collect and collapse mechanism (Elmegreen & Lada 1977) works on a larger scale, e.g., the northern clouds.
If the sparse groups were formed by the expansion of the H II regions, then they are expected to be receding away from ionizing sources.Since the group members form in compressed clouds, these stars have similar kinematics to that of the remaining clouds.However, observational results do not fully support the argument of Koenig et al. (2008a).
In figure 12, the PM vectors of the group A members are shown relative to the group C. Their PM vectors have somewhat random orientation.Only one member has a RV measurement, and its RV is far different from that of the adjacent clouds.The RV data of more members would be required to better determine their systemic RV.
We display the PM vectors of the group E and H members relative to the nearest dense group F in W5 East.Similarly, the group E members show random motions although they have similar RVs to those of the remaining clouds.The group H members are systemically receding away from the group F. We computed the Φ values of the group H members relative to the group H.As a result, the Φ distribution shows a peak at ∼ −20 • .The fraction of stars showing such a systemic motion is over 50% of all group members.Also, their RVs are consis-tent with those of the elephant trunks at the eastern edge of the H II region.The formation of the group H out of the three northern groups is likely associated with feedback from massive stars.
Perhaps, the groups A and E spontaneously formed.They have velocity dispersions larger than those of the dense groups.It implies that their natal cloud was probably less favorable sites of star formation, like lowdensity thin filaments (André et al. 2010).Star formation have proceeded on a timescale longer than the formation timescale of the dense groups.
Figure 12 also exhibits the PM vectors of the members in the southern groups B and G relative to the dense group C and F, respectively.The group B members have randomly oriented PM vectors, while more than 50% of the group G members are moving away from this SFR.As seen in Section 4.2, the group B and G members are not as young as those of the groups A, E, and H. Also, it is not sure that the members of these two groups are coeval populations as seen in their CMDs.
Hence, these group members may have different origins.One possible explanation is that some of them are walkaway stars (de Mink et al. 2014).Runaway and walkaway stars can originate from the end of binary evolution (Blaauw 1961).In this scenario, the massive primary star undergoes supernova explosion, and then the less massive secondary star is ejected, becoming either a runaway or a walkaway star.Another hypothesis is related to the dynamical ejection of stars in star-forming regions (Poveda et al. 1967;Oh et al. 2015;Oh & Kroupa 2016).Since there is weak evidence of supernova explosions in W5 (Vallee et al. 1979), the latter one is the more favorable mechanism for the southern groups in W5.The noncoeval population and the abnormal highmass star population in CMDs (Figure 8) can also be naturally explained according to this mechanism.Lim et al. (2020) confirmed, in both numerical and observational ways, that subvirial collapse of cluster can lead to isotropic core and expanding halo structure.Maíz Apellániz et al. ( 2022) also confirmed some stars and stellar systems isotropically ejected from the Bermuda cluster.They suggested that these ejected stars and stellar systems have a large amount of mass, and therefore such large mass-loss can result in cluster expansion.However, the group B and G members show anisotropic spatial distribution relative to this SFR.They occupy only the southern regions.If group G had the same origin as the halo of IC 1805 or the stellar systems ejected from the Bermuda cluster, it might miss some stars escaping in different directions, or the clustering algorithm we used might not be able to iden-tify them.Based on current observational data, only a few stars are moving outward beyond this SFR.
The group B members have randomly-oriented PM vectors.The dynamical ejection mechanism cannot explain the random orientation in PM vectors.Low-levels of local star formation events may be another possible explanation for the origin of the southern groups.In this case, it should solve the question why there are more early-type stars than later-type stars, given the typical initial mass function (Salpeter 1955;Kroupa 2001).
In this study, a total of eight groups were identified.However, it is necessary to search for smaller subgroups in the future work.For instance, stars in the eastern part of group H (∆R.A. ∼ 60 ) constitute a small aggregation.They seem to be associated with a small H II bubble east of the W5 East bubble (see figure 7 of Koenig et al. 2008a).Their PM vectors relative to the group F show random directions, unlike stars in the western part of the group H.In addition, there is a small subgroup south of the group C.This group seems to be associated with the pillar-like structures at the border of the southern ridge of the W5 West bubble (see also figure 7 of Koenig et al. 2008a).These further groupings will help to better understand the formation process of this SFR.

SUMMARY AND CONCLUSION
We studied the spatial and kinematic properties of young stars in the massive SFR W5 of the Cassiopeia OB6 association using the Gaia EDR3 data and highresolution spectra to understand the formation process of stellar associations.
A total of 490 out of 2,000 young stars over W5 were selected as members using the Gaia parallaxes and PMs.The spatial distribution of the members reveals highlevels of substructures in W5.We identified eight stellar groups in total by means of the k-mean clustering algorithm.Three dense groups are centered at the cavities of the giant H II regions, and three sparse groups are found at the border of the H II bubble.The other two groups were found on the outskirt of the southern bubble.Our results were compared with those obtained from the other unsupervised machine learning algorithms, such as DBSCAN, HDBSCAN, and Agglomerative Clustering.
The dense groups are composed of the oldest stellar population (5 Myr), indicating that they are the first generation of stars in W5.They are now expanding.Three million years after their birth, star formation might have propagated toward the northern regions.Only one group (H) shows the signature of feedbackdriven star formation.A number of its members are moving away from the nearest ionizing sources in the neighboring dense group F. In addition, their RVs are similar to those of the adjacent gas structures.On the other hand, the other two northern groups do not show such signatures, and therefore they might have spontaneously formed in the current positions.
The southern groups B and G seem not to be composed of coeval population.The group B members have randomly oriented PM vectors, while more than half of the group G members are moving away from W5.We discussed their possible origins as walkaway stars from W5 and/or multiple low-level star formation events.
In conclusion, the major star formation process in W5 may be associated with the structure formation in a giant molecular cloud.Multiple star formation might have spontaneously taken place in different positions and epochs.In addition, feedback from massive stars has triggered the formation of a new generation of stars, but the spatial scales at which this mechanism occurs may not as large as Koenig et al. (2008a) suggested.Subsequent dynamical evolution of stellar groups will form a distributed stellar population in several Myr.
The authors thank the anonymous referee for constructive comments and suggestions.The authors would also like to express thanks to Prof. Mark Heyer for providing supplementary data, Dr. Nelson Caldwell, and the other mountain staffs for assisting with Hectochelle observations.Observations reported here were conducted at the MMT Observatory, a joint facility of the University of Arizona and the Smithsonian Institution.This paper has made use of data obtained under the K-GMT Science Program (PIDs: MMT-2020B-001 and MMT-2021B-001) partly supported by the Korea Astronomy and Space Science Institute (KASI) grant funded by the Korean government (MSIT; No. 2023-1-860-02, International Optical Observatory Project) and from the European Space Agency (ESA) mission Gaia (https://www.cosmos.esa.int/gaia),processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/gaia/dpac/consortium).Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement.This research has also made use of the SIMBAD database, operated at CDS, Strasbourg, France.This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT; grant Nos.NRF2019R1C1C1005224 and 2022R1C1C2004102) and the research grant of Kongju National University in 2022.BL is grateful for Ms. Seulgi Kim's assistance in data reduction and Prof. Jeong-Eun Lee's comments on observing proposal.

Figure 2 .
Figure2.Parallax (left) and PM (right) distributions of member candidates.The left panel displays the parallaxes and their associates errors.We plot stars that are brighter than 18 mag in GRP band and have parallaxes greater than three times the associated errors.In the left panel, dashed lines indicate the boundary used to search for genuine members between 1.0 and 3.5 kpc.The right panel exhibits the PM distributions of member candidates between the 1.0 and 3.5 kpc.Only stars with parallaxes greater than their associated errors are considered for analysis.The ellipse in the right panel shows the region confined within five times the standard deviation from the weighted mean PMs, where the inverse of the squared PM error is used as the weight value.The selected members are shown by red dots.

Figure 3 .
Figure 3. CMD of the selected members.The symbols are the same as in the Figure 1.

Figure 4 .
Figure 4. Spatial distribution of members in W5.The size of dots is proportional to the brightness of individual stars.The positions of stars are relative to the reference coordinate R.A. = 02 h 54 m 45. s 80, decl.= +60 • 22 04. 3 (J2000).

Figure 5 .
Figure 5. Distribution of distances (left) and RVs (right).In order to compute the reliable distance to W5, we used members with parallaxes larger than 10 times the associated errors.The bin sizes of 0.2 kpc and 1.2 km s −1 were used to obtain the distance distribution and RV distribution, respectively.The red curves represents the best-fit Gaussian distributions.

Figure 6 .
Figure6.Relation between the number of groups and inertia values.The number of groups was obtained from the elbow value of the inertia curve.The elbow value of eight (8) was determined from the point of intersection of the two red straight lines.See the main text for detail.
additionally tested three different clustering algorithms: Density-Based Spatial Clustering of Applications with Noise (DBSCAN; Ester et al. 1996), hierarchical DBSCAN (HDBSCAN; Campello et al. 2013), and Agglomerative Clustering.Adopting the results from the k-means clustering, we tuned parameters for each algorithm that makes similar results.Scikit-learn (Pedregosa et al. 2011) was used for k-means clustering, DBSCAN, and Agglomerative Clustering, while the software developed by McInnes et al. (2017) was used for HDBSCAN.

Figure 7 .
Figure 7. Identification of stellar groups in W5.A total of eight groups were identified by means of the k-means clustering algorithm.The identified clusters were shown by different colors.The size of dots is proportional to the brightness of individual stars.

Figure 8 .
Figure8.CMD of stellar groups.The colors of dots represent the members of given groups corresponding to the color codes shown in Figure7.The CMD of the group C is plotted by gray dots for comparison.The red and blue curves exhibit the 5 Myr isochrones reddened by a total extinction (AV ) of 1.86 and 3.00 mag, respectively, while the 2 Myr isochrone reddened by AV of 3.00 is plotted by gray curves.The CMDs of individual stellar groups are compared with that of the open cluster group C (green).

Figure 9 .Figure 10 .
Figure 9. Position-velocity diagrams of stars.The colors of dots are the same as those in Figure 7.The vertical lines represent the errors of velocity measurements.

Figure 11 .
Figure 11.Motions of stars in their host groups.The panels in the first and third rows display the spatial distributions of stars in given groups.The straight lines with different lengths represent the PM vectors of stars relative to their host groups.The Φ distributions are shown in the panels in the second and fourth rows.The colors of dots corresponds to those of stellar groups in Figure 7.

Figure 12 .
Figure 12.Integrated intensity maps and PVs of the entire region (left) and W5 East region (right).These radio data were obtained from 12 CO (J = 1 − 0) line (Heyer et al. 1998).The gray scale represents the distribution of molecular clouds.The color-coded dots shows the distribution of members in the integrated intensity maps and PVs.The size of dots are proportional to the brightness of members.Arrows indicate the PM vectors of the sparse group members relative to their nearest dense groups (C in W5 West and F in W5 East).The nearest dense group containing ionizing sources in W5 West and East are the group C and F, respectively.

Table 1 .
List of members (Wenger et al. 2000;Reed 2003;Skiff 2009;umns (4) and (5) : Absolute parallax and its standard error.Columns (6) and (7) : PM in the direction of right ascension and its standard error.Columns (8) and (9): PM in the direction of declination and its standard error.Columns (10 -12) : Heliocentric RV, RV in the local standard of rest frame, and its error.Column Classification of young stars.'E'representsO-orB-typestars obtained from the data bases of MK classification(Wenger et al. 2000;Reed 2003;Skiff 2009; Maíz Apellániz et al. 2013, SIMBAD).'H' and 'h' are Hα emission stars and Hα emission star candidates, respectively.'1','2',and 'T' denote Class I, Class II, and YSOs with a transitional disk, respectively.Column (18) : The host group name.The astrometric and photometric data were taken from the Gaia Early Data Release 3 (GaiaCollaboration et al. 2021).This table is available in its entirety in machine-readable form.