The ALMA Survey of Star Formation and Evolution in Massive Protoclusters with Blue Profiles (ASSEMBLE): Core Growth, Cluster Contraction, and Primordial Mass Segregation

The Atacama Large Millimeter/submillimeter Array (ALMA) Survey of Star Formation and Evolution in Massive Protoclusters with Blue Profiles (ASSEMBLE) aims to investigate the process of mass assembly and its connection to high-mass star formation theories in protoclusters in a dynamic view. We observed 11 massive ($M_{\rm clump} \gtrsim 10^3\,M_\odot$), luminous ($L_{\rm bol} \gtrsim 10^4\,L_\odot$), and blue-profile (infall signature) clumps with ALMA at a resolution of ∼2200–5500 au (median value of 3500 au) at 350 GHz (870 μm). We identified 248 dense cores, including 106 cores showing protostellar signatures and 142 prestellar core candidates. Compared to early-stage infrared dark clouds (IRDCs) observed by ASHES, the core mass and surface density within the ASSEMBLE clumps show a significant increase, suggesting concurrent core accretion during the evolution of the clumps. The maximum mass of prestellar cores is 2 times larger than that in IRDCs, indicating that evolved protoclusters have the potential to harbor massive prestellar cores. A mass relation between clumps and their most massive cores (MMCs) is observed in ASSEMBLE but not in IRDCs, which is suggested to be regulated by multiscale mass accretion. The mass correlation between the core clusters and their MMCs has a steeper slope than that observed in stellar clusters, which can be due to fragmentation of the MMC and stellar multiplicity. We observe a decrease in core separation and an increase in central concentration as protoclusters evolve. We confirm primordial mass segregation in the ASSEMBLE protoclusters, possibly resulting from gravitational concentration and/or gas accretion.


INTRODUCTION
Observations suggest that massive stars form either in bound clusters (Lada & Lada 2003; Longmore et al. 2011, 2014; Motte et al. 2018) or in large-scale hierarchically structured associations (Ward & Kruijssen 2018). However, the process of stellar mass assembly, which includes fragmentation and accretion, remains poorly understood. This process is a critical step in determining important parameters such as the number of massive stars and their final stellar masses. Moreover, the fragmentation of molecular gas and core accretion are both time dependent, because the instantaneous physical conditions in the cloud vary during ongoing star formation and feedback (e.g., radiation and outflows). It is therefore challenging to pinpoint the physical conditions that gave rise to the fragmentation and accretion observed at present.
Over the past decades, researchers have focused on massive clumps associated with infrared dark clouds (IRDCs), which are believed to harbor the earliest stage of massive star and cluster formation (e.g., Rathborne et al. 2006, 2007; Chambers et al. 2009; Zhang et al. 2009; Zhang & Wang 2011; Wang et al. 2011; Sanhueza et al. 2012; Wang et al. 2014; Zhang et al. 2015; Yuan et al. 2017; Pillai et al. 2019; Huang et al. 2023). Despite their large reservoirs of molecular gas at high densities of $> 10^4$ cm$^{-3}$, IRDC clumps show few signs of star formation. For example, only 12% of a sample of 140 IRDCs have water masers (Wang et al. 2006). Moreover, IRDCs have consistently lower gas temperatures and line widths, with NH$_3$ studies finding temperatures of ≲ 15 K (Pillai et al. 2006; Ragan et al. 2011; Wang et al. 2012, 2014; Xie et al. 2021) and line widths of ≲ 2 km s$^{-1}$ averaged over a spatial scale of 1 pc (Wang et al. 2008; Ragan et al. 2011, 2012). Both parameters are lower than those observed in high-mass protostellar objects (HMPOs), with temperatures of ∼ 20 K and line widths of ∼ 2 km s$^{-1}$ (Molinari et al. 1996; Sridharan et al. 2002; Wu et al. 2006; Longmore et al. 2007; Urquhart et al. 2011), and those observed in ultra-compact H II (UC H II) regions, with > 25 K and ≳ 3 km s$^{-1}$ (Churchwell et al. 1990; Harju et al. 1993; Molinari et al. 1996; Sridharan et al. 2002). Therefore, there is a clear evolutionary sequence from IRDCs to HMPOs and then to UC H II regions, which sets the basis for a time-dependent study of massive star formation.
Taking advantage of the low contamination from stellar feedback in IRDCs, great efforts have been made to investigate the initial conditions of massive star formation therein. For example, Zhang et al. (2009) first conducted arcsecond-resolution studies of the IRDC G28.34+0.06 with the Submillimeter Array (SMA) and found that dense cores giving rise to massive stars are much more massive than the thermal Jeans mass of the clump. This discovery challenges the notion in the "competitive accretion" model that massive stars should arise from cores of thermal Jeans mass (Bonnell et al. 2001). The larger core masses demand either additional support from turbulence and magnetic fields (Wang et al. 2012) or continuous accretion onto the core (Vázquez-Semadeni et al. 2023). On the other hand, observations also find that these cores do not contain sufficient material to form a massive star (Sanhueza et al. 2017, 2019; Morii et al. 2023), and the cores typically continue to fragment when observed at higher angular resolution (Wang et al. 2011, 2014; Zhang et al. 2015; Olguin et al. 2021, 2022) or at slightly later evolutionary stages (e.g., Palau et al. 2015; Beuther et al. 2018). Therefore, the idea of monolithic collapse (McKee & Tan 2003) for massive star formation does not match the observations. On the simulation side, recent work by Pelkonen et al. (2021) has shed light on the inadequacies of both the core collapse and competitive accretion scenarios. Their findings reveal a lack of direct correlation between the progenitor core mass and the final stellar mass for individual stars, as well as no increase in accretion rate with core mass.
However, Padoan et al. (2020) suggested a scenario where massive stars are assembled by large-scale, converging, inertial flows that naturally occur in supersonic turbulence. Very recently, He & Ricotti (2023) performed simulations at resolutions up to ∼ 7 au and found that gas should be continuously supplied from larger scales beyond the mass reservoir of the core. Such continuous mass accretion is observed directly (Dewangan et al. 2022; Redaelli et al. 2022; Xu et al. 2023) or indirectly (Contreras et al. 2018). More recent observations of IRDCs with the Atacama Large Millimeter/submillimeter Array (ALMA) routinely reach a mass sensitivity far below the thermal Jeans mass and detect a large population of low-mass cores in the clumps that are compatible with the thermal Jeans mass (Svoboda et al. 2019; Sanhueza et al. 2019; Morii et al. 2023). These cores may form low-mass stars in a cluster. To summarize, these observations point to a picture of massive star formation in which dense cores continue to gain material from the parental molecular clump while the embedded protostar undergoes accretion (see the review in Section 1.1 of Xu et al. 2023).
Mass assembly is, after all, a dynamic process that unfolds over time. To understand high-mass star and cluster formation, it is essential to compare the predictions of theoretical models and numerical simulations with observations of massive clumps over a broad range of evolutionary stages. While the state-of-the-art understanding of massive star formation suggests that gas is transferred along filamentary structures to feed the massive dense cores where protostars grow in mass (Gómez & Vázquez-Semadeni 2014; Motte et al. 2018; Naranjo-Romero et al. 2022; Xu et al. 2023), observational evidence is required to provide more direct constraints on the physical processes during protocluster evolution, which will yield a time-tracked understanding of high-mass star and cluster formation.
Therefore, we conduct the ALMA Survey of Star Formation and Evolution in Massive Protoclusters with Blue Profiles (ASSEMBLE), designed to study mass assembly systematically, including fragmentation and accretion, and its connection to high-mass star formation theories. The survey aims at providing a "dynamic" view from two main perspectives: 1) answering a series of kinematic questions such as when infall starts and stops, how gas is transferred inwards, and where infalling gas goes; and 2) unveiling the evolution of key physical parameters in the protoclusters, since the survey sample comprises more evolved protoclusters than early-stage IRDCs. The first idea is reflected in our sample selection: all 11 massive clumps are chosen from pilot single-dish surveys with evident blue profiles indicating global infall motions and rapid mass assembly. The sample also benefits from synergy with the ALMA Three-millimeter Observations of Massive Star-forming regions (ATOMS; Liu et al. 2020), supporting gas kinematics analyses (Xu et al. 2023). The second idea is to compare the ASSEMBLE results with those in early-stage IRDCs reported by Sanhueza et al. (2019), Morii et al. (2023), and Svoboda et al. (2019). Sanhueza et al. (2019) and Morii et al. (2023) are both part of the series "The ALMA Survey of 70 µm Dark High-mass Clumps in Early Stages" (ASHES hereafter), and focus on a pilot sample of 12 (ASHES Pilots; Sanhueza et al. 2019) and a total sample of 39 (ASHES Totals; Morii et al. 2023) carefully chosen IRDCs, respectively. The mean temperature of these IRDCs is ∼15 K, with a range of 9 to 23 K, and the luminosity-to-mass ratio ranges from 0.1 to 1 $L_\odot/M_\odot$, supporting the idea that these clumps host the early stages of massive star formation (Morii et al. 2023).
In this paper, we present comprehensive analyses of dust continuum emission from a carefully selected sample comprising 11 massive protocluster clumps that exhibit evidence of gas infall. Our study focuses on investigating the physical properties and evolution of cores within these clumps, including their mass, spatial distribution, and comparison with earlier stages. The paper is structured as follows. Section 2 describes the criteria used for the selection of our sample. Section 3 provides a summary of the observation setups and details the data reduction process. In Section 4, we present the fundamental results derived from the ASSEMBLE data. Section 5 offers in-depth discussions on the implications and significance of the observed results. To gain further insights into protocluster evolution, Section 6 presents comparative analyses with the ASHES data and contributes to the development of a comprehensive understanding of protocluster evolution. Finally, in Section 7, we summarize the key findings and provide future prospects.

Massive Clumps with Infall Motion
The ASSEMBLE sample, consisting of 11 carefully selected sources, builds on observations from the IRAS, Spitzer, and Herschel satellites, as well as various ground-based surveys of dust continuum and molecular lines. Bronfman et al. (1996) conducted a comprehensive and homogeneous CS (2-1) line survey of 1427 bright IRAS point sources in the Galactic plane that were suspected to harbor UC H II regions. Subsequently, Faúndez et al. (2004) conducted a follow-up survey in 1.2 mm continuum emission of 146 sources suspected of hosting high-mass star formation regions (bright CS (2-1) emission of $T_b > 2$ K, indicative of reasonably dense gas). The same set of 146 high-mass star-forming clumps was then surveyed by Liu et al. (2016a) in the HCN (4-3) and CS (7-6) lines with the 10 m Atacama Submillimeter Telescope Experiment (ASTE) telescope. Using the HCN (4-3) line, the most reliable tracer of infall motions (Chira et al. 2014, Xu et al. in submission), they identified 30 infall candidates based on "blue profiles".
Out of the 30 infall candidates, 18 are further confirmed by HCN (3-2) and CO (4-3) lines observed with the Atacama Pathfinder Experiment (APEX) 12 m telescope (Yue et al. 2021). Furthermore, the 18 sources were found to have virial parameters below 2, indicating that they are likely undergoing global collapse. All 18 sources are covered by both the APEX Telescope Large Area Survey of the Galaxy (ATLASGAL; Urquhart et al. 2018) and the Herschel Infrared Galactic Plane Survey (Hi-GAL; Molinari et al. 2010; Elia et al. 2017, 2021), allowing well-constrained estimates of clump mass and luminosity from infrared SED fitting (Urquhart et al. 2018). Given that ASSEMBLE's goal is to investigate massive and luminous star-forming clumps, we adopt the additional selection criterion that the clump should be both massive and luminous.

Physical Properties of Selected Sample
Table 1 presents the basic properties of the ASSEMBLE sample, including the clump kinematic properties, distances, and physical characteristics. The velocity in the local standard of rest ($V_{\rm lsr}$) was determined from the C$^{17}$O (3-2) lines in the APEX observations (Yue et al. 2021) and is listed in column (5). The line asymmetry parameter ($\delta V$), listed in column (6), defines the line as having a blue profile. The kinematic distance, together with its upper and lower uncertainties, is estimated using the latest rotation curve model of the Milky Way (Reid et al. 2019) and is listed in column (7). The clump radius is derived from 2D Gaussian fitting, the same method as adopted in Sanhueza et al. (2019) to allow a direct comparison, and is listed in column (8). The dust temperature ($T_{\rm dust}$), clump mass ($M_{\rm cl}$), bolometric luminosity ($L_{\rm bol}$), and luminosity-to-mass ratio ($L/M$) are obtained from far-IR (70-870 µm from the Herschel and ATLASGAL surveys) SED fitting (Table 5 in Urquhart et al. 2018) and are listed in columns (9)-(12), respectively. The clump surface density, $\Sigma_{\rm cl} = M_{\rm cl}/(\pi R_{\rm cl}^2)$, is listed in column (13). It is noteworthy that all of the ASSEMBLE clumps have a surface density of $\Sigma_{\rm cl} \gtrsim 1$ g cm$^{-2}$, significantly surpassing the threshold (0.05 g cm$^{-2}$) for high-mass star formation proposed by Urquhart et al. (2014) and He et al. (2015), which further justifies our sample selection.
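As an illustration of the surface density definition above, the following minimal Python sketch evaluates Σ_cl = M_cl/(πR_cl²) in cgs units. The function name and the example clump (1000 M⊙ within 0.2 pc) are hypothetical and are not taken from Table 1.

```python
import math

M_SUN_G = 1.989e33   # solar mass in grams
PC_CM = 3.086e18     # parsec in centimeters

def clump_surface_density(mass_msun, radius_pc):
    """Return Sigma_cl = M_cl / (pi R_cl^2) in g cm^-2."""
    mass_g = mass_msun * M_SUN_G
    radius_cm = radius_pc * PC_CM
    return mass_g / (math.pi * radius_cm ** 2)

# Hypothetical clump (not a Table 1 entry): 1000 Msun within a radius of 0.2 pc.
sigma = clump_surface_density(1000.0, 0.2)
print(f"Sigma_cl = {sigma:.2f} g cm^-2")
print("above the 0.05 g cm^-2 threshold:", sigma > 0.05)
```

Such a clump would sit well above the 0.05 g cm⁻² threshold quoted in the text.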
The background Spitzer three-color composite map (blue: 3.6 µm; green: 4.5 µm; red: 8 µm) in Figure 1 displays the infrared environment. All 11 targets exhibit bright infrared sources indicating active massive star formation, although $L_{\rm bol}/M_{\rm cl}$, derived from columns (10)-(11) of Table 1, varies from 12 to 80. The differences in $L_{\rm bol}/M_{\rm cl}$ suggest potential variations in the evolutionary stages across the sample. For instance, I16272-4837 (also known as SDC335; Peretto & Fuller 2009), with a value of 12, is in an early stage of high-mass star formation embedded in a typical IRDC (Xu et al. 2023). Another example is I15520-5234, where extended radio (ν = 8.64 GHz) continuum emission indicates evolved UC H II regions (see Fig. 4 …).
In Appendix A, we present additional information on the radio emission derived from the MeerKAT Galactic Plane Survey 1.28 GHz data (Padmanabh et al. 2023, Goedhart et al. in prep.). All of the protoclusters included in our study exhibit either embedded radio emission, which can originate from UC H II regions, or extended radio emission, which may arise from radio jets or extended H II regions. For instance, in the case of I14382-6017, we observe cometary radio emission that is spatially correlated with the 8 µm emission, outlining the extended shell of the H II region. In contrast, I16272-4837, which is at an early stage, displays two radio point sources associated with two UC H II regions (Avison et al. 2015). Additionally, I17220-3609 exhibits northward extended radio emission that is linked to a blue-shifted outflow (Baug et al. 2020, 2021).
… 1. Each mosaicked field has a uniform size of ∼ 46″ and a sky coverage of ∼ 0.58 arcmin$^2$. The on-source time per pointing is 2.7-3.7 minutes, which is listed in column (2) of Table 2.
High-density tracers, such as H$^{13}$CN (4-3) and CS (7-6), can determine the core velocity. Additionally, some sulfur-bearing molecules, such as H$_2$CS and SO$_2$, can serve as tracers of rotational envelopes, while shock tracers include SO $^3\Sigma$ ($8_8-7_7$) and CH$_3$OH ($13_{1,12}-13_{0,13}$). The hot-core molecular lines, such as H$_2$CS, CH$_3$OCHO, and CH$_3$COCH$_3$, have a sufficient number of transitions to facilitate rotation-temperature and chemical abundance studies. A summary of the target spectral lines can be found in Table A1 of Xu et al. (2023).

ALMA Data Calibration and Imaging
The pipeline provided by the ALMA observatory was used to perform data calibration in CASA (McMullin et al. 2007) version 5.1.15. The phase, flux, and bandpass calibrators are listed in columns (7)-(8) of Table 2. The imaging was conducted with the TCLEAN task in CASA 5.3. To aggregate the continuum emission, line-free channels were selected by visual inspection, with the bandwidth and its percentage of the total bandwidth listed in column (3). A total of three rounds of phase self-calibration and one round of amplitude self-calibration were run to enhance the dynamic range of the image. For self-calibration, antenna DA47 was designated as the reference antenna. During imaging, the deconvolver was set to "hogbom" and the weighting was set to "briggs" with a robust value of 0.5 to balance sensitivity and angular resolution. The primary beam correction is conducted with pblimit=0.2. Following self-calibration, the sensitivity and dynamic range of the final continuum images were significantly improved; the resulting sensitivity, listed in column (4), ranges from 0.5 to 1.7 mJy beam$^{-1}$ with a mean value of ∼ 1 mJy beam$^{-1}$. The beam size (i.e., angular resolution) of 0.8″-1.2″ and the maximum recoverable scale (MRS) of 7.2″-9.2″ are presented in columns (5)-(6).
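The imaging settings quoted above can be collected as keyword arguments for a CASA `tclean` call, as in the sketch below. Only the deconvolver, weighting, robust, and pblimit values come from the text; the file names, image size, cell size, iteration count, threshold, and the choice of `specmode="mfs"` are assumptions made for illustration and may differ from the setup actually used.

```python
# Keyword arguments for a continuum imaging call of the form tclean(**tclean_kwargs).
tclean_kwargs = {
    "vis": "clump_selfcal.ms",   # hypothetical measurement set name
    "imagename": "clump_cont",   # hypothetical output image prefix
    "specmode": "mfs",           # assumed: multi-frequency synthesis for continuum
    "deconvolver": "hogbom",     # stated in the text
    "weighting": "briggs",       # stated in the text
    "robust": 0.5,               # stated in the text
    "pblimit": 0.2,              # stated in the text
    "pbcor": True,               # assumed: also write a primary-beam-corrected image
    "imsize": [512, 512],        # hypothetical image size in pixels
    "cell": "0.15arcsec",        # hypothetical pixel size (beam is 0.8-1.2 arcsec)
    "niter": 10000,              # hypothetical iteration cap
    "threshold": "3mJy",         # hypothetical cleaning threshold
}

for key, value in tclean_kwargs.items():
    print(f"{key:12s} = {value!r}")
```

Inside a CASA session with `casatasks` available, these arguments would be passed as `tclean(**tclean_kwargs)`.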

Dust Continuum Emission
Figure 2 presents the ALMA 870 µm dust continuum images without primary beam correction, which have a uniform rms noise. For comparison, the dust continuum emission at the same wavelength from the single-dish survey ATLASGAL (Schuller et al. 2009), with a beam size of 19.2″, is overlaid as black contours. In all 11 targets, the small-scale structures resolved by ALMA show a good spatial correlation with the large-scale structures seen by ATLASGAL. In other words, the dense structures that survive the interferometric "filtering-out" effect are mostly distributed in the densest part of the clump. However, the small-scale structures present various morphologies: some present elongated filaments (e.g., I14382-6017 and I16071-5142); some have centrally concentrated cores (e.g., I16272-4837); and some have spiral-like dust arms (e.g., I15596-5301 and I16060-5146).

Core Extraction and Catalog
Here we present the extraction of core-like structures (or cores) and the measurement of fundamental physical parameters including integrated flux, peak intensity, size, and position. The choice of core extraction algorithm should be made carefully based on the actual physical scenario and scientific expectations. In this work, we use the getsf extraction algorithm, which spatially decomposes the observed images to separate relatively round sources from elongated filaments as well as from their background emission (Men'shchikov 2021). As suggested by Xu et al. (2023), the getsf algorithm is a better choice than astrodendro in the case study of SDC335 (one of the ASSEMBLE clumps), because it can: 1) deal with uneven background and rms noise; 2) separate blended sources/filaments; and 3) extract extended emission features.
We run the getsf algorithm on the continuum emission maps without primary beam correction (unpbcor). The unpbcor map is first smoothed to a circular beam whose size is equal to the major axis of the original beam. The getsf algorithm is set to extract sources whose sizes are larger than the beam size but smaller than the MRS. As suggested by Men'shchikov (2021), significantly detected sources are defined as having: 1) a signal-to-noise ratio larger than unity; 2) a peak intensity at least five times larger than the local intensity noise; 3) a total flux density at least five times larger than the local flux noise; 4) an ellipticity not larger than 2, to ensure a core-like structure; and 5) a footprint-to-major-axis ratio larger than 1.15, to rule out cores with abrupt boundary emission. After core extraction and fundamental measurements by getsf, the two flux-related parameters (integrated flux and peak intensity) are corrected for the primary beam response, depending on the core location in the continuum emission maps with primary beam correction (pbcor). Fundamental measurements of the core parameters are listed in Table 3.
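The selection cuts listed above can be sketched as a simple filter. This is an illustration only, not the getsf implementation: the dictionary keys are hypothetical names rather than getsf catalog columns, ellipticity is assumed to mean the major-to-minor FWHM ratio, and the size limits are assumed to apply to the major-axis FWHM.

```python
def is_significant(src, beam_arcsec, mrs_arcsec):
    """Apply the five quoted cuts plus the beam/MRS size limits to one source."""
    size_ok = beam_arcsec < src["fwhm_major"] < mrs_arcsec
    return (
        size_ok
        and src["snr"] > 1.0                                   # cut 1
        and src["peak"] >= 5.0 * src["peak_err"]               # cut 2
        and src["flux"] >= 5.0 * src["flux_err"]               # cut 3
        and src["fwhm_major"] / src["fwhm_minor"] <= 2.0       # cut 4
        and src["footprint_major"] / src["fwhm_major"] > 1.15  # cut 5
    )

# Two made-up sources: a compact core and an elongated, filament-like one.
core = {"snr": 8.0, "peak": 12.0, "peak_err": 1.0, "flux": 30.0, "flux_err": 2.0,
        "fwhm_major": 1.8, "fwhm_minor": 1.4, "footprint_major": 2.6}
filament = dict(core, fwhm_major=4.0, fwhm_minor=1.2, footprint_major=5.5)

print(is_significant(core, beam_arcsec=1.0, mrs_arcsec=9.0))
print(is_significant(filament, beam_arcsec=1.0, mrs_arcsec=9.0))
```

The second source fails only the ellipticity cut, which is the criterion intended to separate core-like structures from elongated ones.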
To evaluate how much flux is recovered by the ALMA observations, we integrate the ATLASGAL 870 µm flux over the field of view of the ASSEMBLE clumps. If all the sources and filaments extracted by getsf are included, the flux recovered by ALMA ranges from 10% to 25%. Although the flux recovery can be further improved by including short-baseline observations (e.g., with the Atacama Compact Array), other SMA/ALMA observations show a typical flux recovery of between 10% and 30% (e.g., Wang et al. 2014; Sanhueza et al. 2017; Liu et al. 2018; Sanhueza et al. 2019). In our case, the maximum recoverable scale is ∼ 9″. Therefore, most of the mass in the massive clumps is not confined in dense structures (cores).
As shown in panel (a) of Figure 3, we find no overall correlation between the number of detected cores and the mass sensitivity, with a Pearson correlation coefficient of 0.08. Likewise, panel (b) reveals no overall correlation between the number of detected cores and the physical resolution, with a Pearson correlation coefficient of −0.04. Therefore, the number of detected cores shows no significant dependence on the mass sensitivity or spatial resolution of the observations.
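For reference, a Pearson coefficient of the kind quoted above can be computed as in the following sketch. The input arrays are made-up placeholder values for 11 clumps and are not the ASSEMBLE measurements, so the printed coefficient is not expected to reproduce 0.08 or −0.04.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Placeholder values only: core counts and mass sensitivities for 11 clumps.
n_cores = [10, 25, 18, 30, 12, 22, 27, 15, 20, 28, 16]
mass_sensitivity = [0.10, 0.25, 0.08, 0.12, 0.30, 0.15, 0.22, 0.09, 0.18, 0.11, 0.27]

r = pearson_r(mass_sensitivity, n_cores)
print(f"Pearson r = {r:.2f}")
```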

Core Classification and Evolutionary Stages
All the ASSEMBLE clumps have infrared-bright signatures. As an example of a relatively early stage, I16272-4837 has extended 4.5 µm emission, which is a common feature of outflows (Cyganowski et al. 2008). A more evolved example, I14382-6017, is totally immersed in a cometary H II region traced by PAH emission at 8 µm. Therefore, at least some cores in each clump are in an active star formation stage.
The classification of the evolutionary stages of the 248 cores is based on the identification of star-formation indicators, including molecular outflows, H$_2$CS multiple transition lines, and CH$_3$OCHO multiple transition lines.
For molecular outflows, Baug et al. (2020) used CO (3-2), HCN (4-3), and SiO (2-1) emission lines to confirm the presence of 32 bipolar and 41 unipolar outflows in the 11 ASSEMBLE clumps, with a total of 42 continuum cores associated with outflows. In this study, we updated the outflow catalogs through a channel-by-channel analysis of the outflow lobes to determine their association with the extracted cores, and assigned the outflows accordingly. A total of 39 (∼ 16%) cores are assigned bipolar or unipolar outflows. If a core is assigned outflows, it is classified as protostellar (Nony et al. 2023). Some cores even show multipolar outflows (e.g., I16272-4837 ALMA8; Olguin et al. 2021), indicating either precession of accreting and outflowing protostars or the presence of multiple outflows from a multiple system. However, we acknowledge that this method will miss weak outflows associated with the lowest-mass objects, especially in the more distant regions, and therefore yields a lower limit on the number of protostellar objects.
Owing to its comparatively high abundance, H$_2$CS emission is observed extensively in the core population (Chen et al., in preparation). However, the relatively low abundance of CH$_3$OCHO restricts its detection to hot molecular cores with line-rich features. In this paper, we first classify cores with robust (> 3σ) detections of both CH$_3$OCHO and H$_2$CS multiple transition lines as "hot cores", especially those that have robust rotation temperature estimates from both CH$_3$OCHO and H$_2$CS. Since "hot cores" are believed to be the result of warm-up by central protostar(s) to 100-300 K (Gieser et al. 2019), there should be a stage of dense cores with temperatures of < 100 K and without line-rich features, which are called "warm cores" (Sanhueza et al. 2019). We therefore define cores with a robust detection of H$_2$CS but without detection of CH$_3$OCHO lines as warm cores. We present examples of both "hot cores" and "warm cores" in Figure 4, where I16060-ALMA7 is a typical hot core with line-rich features and evident detection of multiple transitions of CH$_3$OCHO as well as H$_2$CS. In contrast, I16060-ALMA15 shows a paucity of hot molecular lines, including CH$_3$OCHO, with only H$_2$CS detected.
Among the 248 ASSEMBLE cores, H$_2$CS line emission has been identified in 92 cores, of which 35 display "line-rich" features and are further categorized as hot cores, while the other 57 cores are classified as "warm cores" based on the robust detection of H$_2$CS lines. Among these warm cores, 22 have insufficient H$_2$CS transitions available for the calculation of temperature. The 142 cores without any of the star-forming indicators mentioned above (outflows, H$_2$CS, or CH$_3$OCHO lines) are then classified as prestellar core candidates, implying a stage preceding the protostellar phase. Based on the classification above, we mark each core in column (10) of Table 3: 0 = prestellar candidate, 1 = only outflow is detected, 2 = only H$_2$CS line is detected, 3 = both H$_2$CS line and outflow are detected, 4 = both CH$_3$OCHO line and outflow are detected, and 5 = only CH$_3$OCHO line is detected.
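The flag scheme of column (10) in Table 3 can be written as a small decision function. The sketch below is an illustration, not the authors' code: the function name and boolean inputs are hypothetical, and the case of a CH₃OCHO detection without H₂CS, which the text does not cover, is rejected explicitly.

```python
def classify_core(outflow, h2cs, ch3ocho):
    """Map three detection booleans to the Table 3 classification flag (0-5)."""
    if ch3ocho and not h2cs:
        # Hot cores are defined in the text by detections of both species.
        raise ValueError("CH3OCHO without H2CS is not covered by the scheme")
    if ch3ocho:                      # hot-core lines detected
        return 4 if outflow else 5
    if h2cs:                         # warm-core line detected
        return 3 if outflow else 2
    return 1 if outflow else 0       # outflow only, or prestellar candidate

def is_protostellar(flag):
    """Any star-formation indicator (flag > 0) marks a protostellar core."""
    return flag > 0

for detections in [(False, False, False), (True, False, False),
                   (False, True, False), (True, True, True)]:
    flag = classify_core(*detections)
    print(detections, "->", flag, "protostellar" if is_protostellar(flag) else "prestellar")
```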
Caveats of the core classification results: 1) external heating by hot cores in the vicinity can also excite H$_2$CS lines in some prestellar cores, so some warm cores may have no stars forming inside; 2) prestellar core candidates may include both pre-protostellar cores that are gravitationally bound and cores that are not bound and unable to form stars. For consistency, we do not distinguish between the two and refer to them as prestellar core candidates in the rest of the paper. We note that spectral analyses of these cores can further constrain their dynamic states.
[Figure legend: I14382 @ 4.1 kpc, I14498 @ 3.2 kpc, I15520 @ 2.6 kpc, I15596 @ 4.4 kpc, I16060 @ 5.2 kpc, I16071 @ 4.9 kpc, I16076 @ 5.0 kpc, I16272 @ 3.2 kpc, I16351 @ 2.9 kpc, I17204 @ 2.9 kpc, I17220 @ 7.6 kpc]
Notes to Table 3: … (2). The core IDs are ordered from north to south. The equatorial coordinates of the core centers are listed in (3)-(4). The peak intensity and integrated flux are listed in (5)-(6). The fitted FWHM of the major and minor axes convolved with the beam and the position angle (anticlockwise from the north) are listed in (7)-(8). The deconvolved FWHM of the core size is shown in (9). The core classification in (10) is based on Section 4.3. This table is available in its entirety in machine-readable form. Core classification: 0 = prestellar candidate, 1 = only molecular outflow is detected, 2 = only warm-core line is detected, 3 = both outflow and warm-core line are detected, 4 = both outflow and hot-core line are detected, and 5 = only hot-core line is detected.
It is noteworthy that outflows have been observed in all of the ASSEMBLE clumps, providing evidence of star-forming activity with a 100% occurrence rate in our clump sample. However, two massive star-forming clumps, namely I14382-6017 and I17204-3636, do not exhibit any detection of hot cores. This absence is confirmed by cross-matching with the ALMA Band 3 dataset (Qin et al. 2022). In the case of the protocluster I14382-6017, the extended spherical morphology of the H40α line emission is spatially consistent with the MeerKAT Galactic Plane Survey 1.28 GHz data (Padmanabh et al. 2023, Goedhart et al. in prep.). As identified by Zhang et al. (2023), it represents a UC H II region with an electron density of (0.15-0.16) × $10^4$ cm$^{-3}$. The protocluster I14382-6017 is situated on the outskirts of the UC H II region, suggesting the possibility of a second generation of cores (refer to Figure 14). As a result, the absence of hot cores in this particular region can be attributed to the relatively young age of the newly formed protocluster. The absence of hot cores in I17204-3636 may have a different origin, as the H40α and the 1.28 GHz emission are spatially correlated with dense cores (see Figure 14). We note, however, that I17204-3636 has the lowest clump mass, 760 $M_\odot$, with a maximum core mass of 2.9 $M_\odot$ (refer to Section 4.4). Furthermore, the temperature of the only warm core, I17204-ALMA16, is 88 (±7) K (Section 4.4), which is below the ≳ 100 K expected for a hot core. Therefore, in the case of I17204-3636, the cores may not be massive and hot enough to excite hot molecular lines or initiate hot core chemistry.

Core Physical Properties
Temperature estimation uses a hybrid of three methods (clump-averaged temperature, H$_2$CS lines, and CH$_3$OCHO lines), chosen according to the core properties. H$_2$CS lines are chosen due to their strong spatial correlation with dust, as demonstrated in Xu et al. (2023), and their widespread distribution (Chen et al. submitted). The ASSEMBLE spectral window encompasses multiple hyperfine components of the J = 10-9 transitions, with upper energy levels from 90 to 420 K (see Table C1 in Xu et al. 2023). However, H$_2$CS lines could be optically thick towards massive hot cores and would then trace only the core envelope. To trace the dust temperature of hot cores, the CH$_3$OCHO molecule, with upper energy levels up to ∼ 589 K, is employed instead. Temperatures obtained from CH$_3$OCHO (mean value of 110 K) are consistently higher than those derived from H$_2$CS (mean value of 95 K), indicating that CH$_3$OCHO is a suitable tracer of the inner and denser gas. In cases where neither H$_2$CS nor CH$_3$OCHO lines are detected, it is assumed that the core either lacks sufficient column density or is too cold to excite the lines. This suggests that the core has not developed its own temperature gradient, so it is assumed to share the temperature of the clump from the SED fitting. The temperature and the method used to obtain it are listed in columns (3)-(4) of Table 4.
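The hybrid temperature assignment described above amounts to a priority order, sketched below under the assumption that a CH₃OCHO rotation temperature is preferred when available, then an H₂CS one, and otherwise the clump SED temperature. The function name, argument names, and method labels are hypothetical; the 88 K value is the H₂CS temperature quoted for I17204-ALMA16, while the 25 K clump temperature is a placeholder.

```python
def core_temperature(t_clump, t_h2cs=None, t_ch3ocho=None):
    """Return (temperature in K, method label) following the assumed priority."""
    if t_ch3ocho is not None:      # hot core: traces the inner, denser gas
        return t_ch3ocho, "CH3OCHO"
    if t_h2cs is not None:         # warm core with enough H2CS transitions
        return t_h2cs, "H2CS"
    return t_clump, "clump SED"    # no line temperature: adopt the clump value

print(core_temperature(25.0))                                  # no line detections
print(core_temperature(25.0, t_h2cs=88.0))                     # warm core
print(core_temperature(25.0, t_h2cs=95.0, t_ch3ocho=110.0))    # hot core
```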
Assuming that all the emission comes from dust at a single temperature $T_{\rm dust}$ and that the dust emission is optically thin, the core masses are calculated using
$$M_{\rm core} = \mathcal{R}\,\frac{F^{\rm int}_{\nu}\,D^2}{\kappa_\nu\,B_\nu(T_{\rm dust})}, \tag{1}$$
where $F^{\rm int}_{\nu}$ is the measured integrated dust emission flux of the core, $\mathcal{R}$ is the gas-to-dust mass ratio (assumed to be 100), $D$ is the distance, $\kappa_\nu$ is the dust opacity per gram of dust, and $B_\nu(T_{\rm dust})$ is the Planck function at a given dust temperature $T_{\rm dust}$. In our case, $\kappa_\nu$ is assumed to be 1.89 cm$^2$ g$^{-1}$ at ν ∼ 350 GHz (Xu et al. 2023), which is interpolated from the table given in Ossenkopf & Henning (1994), assuming grains with thin ice mantles, the MRN (Mathis et al. 1977) size distribution, and a gas density of $10^6$ cm$^{-3}$. Substituting the temperature into Equation 1, the core masses are calculated and listed in column (5).
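As a numerical illustration of the mass estimate, the sketch below assumes the standard optically thin relation M = R F D² / (κ_ν B_ν(T_dust)) with the opacity (1.89 cm² g⁻¹), frequency (350 GHz), and gas-to-dust ratio (100) quoted in the text. The function names and the example core (0.1 Jy at 3 kpc with T_dust = 20 K) are hypothetical.

```python
import math

H = 6.626e-27      # Planck constant [erg s]
K_B = 1.381e-16    # Boltzmann constant [erg K^-1]
C = 2.998e10       # speed of light [cm s^-1]
M_SUN = 1.989e33   # solar mass [g]
PC = 3.086e18      # parsec [cm]
JY = 1.0e-23       # jansky [erg s^-1 cm^-2 Hz^-1]

def planck_bnu(nu_hz, t_k):
    """Planck function B_nu(T) in erg s^-1 cm^-2 Hz^-1 sr^-1."""
    return 2.0 * H * nu_hz ** 3 / C ** 2 / math.expm1(H * nu_hz / (K_B * t_k))

def core_mass_msun(flux_jy, dist_kpc, t_dust_k,
                   nu_hz=350e9, kappa=1.89, gas_to_dust=100.0):
    """Optically thin dust mass in solar masses (assumed standard relation)."""
    d_cm = dist_kpc * 1.0e3 * PC
    mass_g = gas_to_dust * flux_jy * JY * d_cm ** 2 / (kappa * planck_bnu(nu_hz, t_dust_k))
    return mass_g / M_SUN

# Hypothetical core: 0.1 Jy integrated flux at 3 kpc.
for t_dust in (20.0, 100.0):
    print(f"T_dust = {t_dust:5.1f} K -> M_core = {core_mass_msun(0.1, 3.0, t_dust):.2f} Msun")
```

The same flux corresponds to a much smaller mass at 100 K than at 20 K, which is why the choice of temperature method matters for hot and warm cores.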
Cores are characterized by 2D Gaussian-like ellipses with the FWHM of the major and minor axes ($\theta_{\rm maj}$ and $\theta_{\rm min}$) and the position angle (PA) listed in columns (7)-(8) of Table 3. Following Rosolowsky et al. (2010) and Contreras et al. (2013), the angular radius can be calculated as the geometric mean of the deconvolved major and minor axes:
$$\theta_{\rm core} = \eta\left[\left(\sigma_{\rm maj}^2 - \sigma_{\rm bm}^2\right)\left(\sigma_{\rm min}^2 - \sigma_{\rm bm}^2\right)\right]^{1/4},$$
where $\sigma_{\rm maj}$ and $\sigma_{\rm min}$ are calculated from $\theta_{\rm maj}/\sqrt{8\ln 2}$ and $\theta_{\rm min}/\sqrt{8\ln 2}$, respectively. The $\sigma_{\rm bm}$ is the averaged dispersion size of the beam (i.e., $\sqrt{\theta_{\rm bmaj}\theta_{\rm bmin}/(8\ln 2)}$, where $\theta_{\rm bmaj}$ and $\theta_{\rm bmin}$ are the FWHM of the major and minor axes of the beam). $\eta$ is a factor that relates the dispersion size of the emission distribution to the angular radius of the object. We have elected to use a value of $\eta = 2.4$, which is the median value derived for a range of models consisting of spherical emissivity distributions (Rosolowsky et al. 2010). The core physical radius is then calculated directly as $R_{\rm core} = \theta_{\rm core} \times D$, as shown in column (6) of Table 4.
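A minimal sketch of the radius calculation, assuming the Rosolowsky et al. (2010) form θ_core = η[(σ_maj² − σ_bm²)(σ_min² − σ_bm²)]^(1/4) with σ = FWHM/√(8 ln 2) and η = 2.4. The function name and the example core and beam sizes are hypothetical; returning `None` for unresolved sources is a choice made for this sketch.

```python
import math

FWHM_TO_SIGMA = 1.0 / math.sqrt(8.0 * math.log(2.0))

def core_radius_au(theta_maj, theta_min, beam_maj, beam_min, dist_kpc, eta=2.4):
    """Deconvolved core radius in au from FWHM sizes in arcsec; None if unresolved."""
    sig_maj2 = (theta_maj * FWHM_TO_SIGMA) ** 2
    sig_min2 = (theta_min * FWHM_TO_SIGMA) ** 2
    sig_bm2 = beam_maj * beam_min * FWHM_TO_SIGMA ** 2
    a, b = sig_maj2 - sig_bm2, sig_min2 - sig_bm2
    if a <= 0.0 or b <= 0.0:
        return None                            # not resolved along at least one axis
    theta_core = eta * (a * b) ** 0.25         # angular radius in arcsec
    return theta_core * dist_kpc * 1.0e3       # 1 arcsec at 1 pc subtends 1 au

# Hypothetical core: 2.0" x 1.5" FWHM, 1.0" x 0.8" beam, distance 3.2 kpc.
print(core_radius_au(2.0, 1.5, 1.0, 0.8, 3.2))
# An unresolved source returns None.
print(core_radius_au(0.9, 0.7, 1.0, 0.8, 3.2))
```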
The number density, n, is then calculated by assuming a spherical core,

$$n = \frac{M_{\rm core}}{\frac{4}{3}\pi\,\mu_{\rm H_2}\,m_{\rm H}\,R_{\rm core}^{3}}, \qquad (3)$$

where $\mu_{\rm H_2}$ is the molecular weight per hydrogen molecule and $m_{\rm H}$ is the mass of a hydrogen atom. Throughout the paper, we adopt the molecular weight per hydrogen molecule $\mu_{\rm H_2} = 2.81$ (Evans et al. 2022) and derive the number density of hydrogen molecules, n(H2).
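The core-averaged surface density can be calculated by $\Sigma = M_{\rm core}/(\pi R_{\rm core}^{2})$. The peak column density is estimated from

$$N_{\rm H_2}^{\rm peak} = R\,\frac{F_{\nu}^{\rm peak}}{\Omega\,\mu_{\rm H_2}\,m_{\rm H}\,\kappa_{\nu}\,B_{\nu}(T_{\rm dust})}, \qquad (4)$$

where $F_{\nu}^{\rm peak}$ is the measured peak flux of the core within the beam solid angle Ω¹. The calculated volume, surface, and peak column densities are shown in columns (7)-(9) of Table 4.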
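The volume and surface densities follow directly from the core mass and radius. The sketch below assumes cgs units and hypothetical function names; the inputs in the usage note are illustrative only.

```python
import math

M_SUN = 1.98847e33           # solar mass [g]
PC = 3.0856775814913673e18   # parsec [cm]
M_H = 1.6735575e-24          # mass of a hydrogen atom [g]
MU_H2 = 2.81                 # molecular weight per hydrogen molecule


def number_density_h2(mass_msun, radius_pc):
    """H2 number density [cm^-3] of a uniform spherical core."""
    volume = 4.0 / 3.0 * math.pi * (radius_pc * PC) ** 3
    return mass_msun * M_SUN / (volume * MU_H2 * M_H)


def surface_density(mass_msun, radius_pc):
    """Core-averaged surface density Sigma = M / (pi R^2) [g cm^-2]."""
    return mass_msun * M_SUN / (math.pi * (radius_pc * PC) ** 2)
```

For a hypothetical core of 10 solar masses and radius 0.02 pc, `number_density_h2(10.0, 0.02)` is of order 10^6 cm^-3 and `surface_density(10.0, 0.02)` is of order 1 g cm^-2.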
The major sources of uncertainty in the mass calculation come from the gas-to-dust ratio and the dust opacity. We adopt the uncertainties derived by Sanhueza et al. (2017) of 28% for the gas-to-dust ratio and 23% for the dust opacity, which combine in quadrature to a ∼36% uncertainty in the specific dust opacity. The uncertainties of the core flux (∼14%), temperature (∼20%), and distance (assumed to be 10%) are also included. Monte Carlo methods are adopted for uncertainty estimation, and 1σ confidence intervals are given for core mass, volume density, surface density, and peak column density in columns (5) and (7)-(9) of Table 4.
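A Monte Carlo propagation of this kind could look like the sketch below. It is not the authors' implementation: the independent Gaussian perturbations, the rejection of non-positive draws, the 16th/84th percentile interval, and the nominal 25 K temperature are all assumptions made for illustration.

```python
import math
import random

H = 6.62607015e-27   # Planck constant [erg s]
K_B = 1.380649e-16   # Boltzmann constant [erg/K]
NU = 350e9           # observing frequency [Hz]

# Fractional 1-sigma uncertainties quoted in the text
FRAC_ERR = {"flux": 0.14, "temp": 0.20, "dist": 0.10,
            "gas_to_dust": 0.28, "kappa": 0.23}


def mass_scale(flux, dist, gas_to_dust, kappa, temp_k):
    """Quantity proportional to the optically thin mass R F D^2 / (kappa B_nu(T))."""
    inv_planck = math.expm1(H * NU / (K_B * temp_k))  # 1/B_nu up to a constant
    return gas_to_dust * flux * dist**2 * inv_planck / kappa


def mc_mass_ratio(t_dust=25.0, n_draws=10000, seed=1):
    """Return (16th percentile, median, 84th percentile) of M / M_nominal."""
    rng = random.Random(seed)
    nominal = mass_scale(1.0, 1.0, 1.0, 1.0, t_dust)
    ratios = []
    for _ in range(n_draws):
        draw = {key: rng.gauss(1.0, err) for key, err in FRAC_ERR.items()}
        if min(draw.values()) <= 0.0:
            continue  # discard unphysical non-positive draws
        perturbed = mass_scale(draw["flux"], draw["dist"], draw["gas_to_dust"],
                               draw["kappa"], t_dust * draw["temp"])
        ratios.append(perturbed / nominal)
    ratios.sort()
    n = len(ratios)
    return ratios[int(0.16 * n)], ratios[n // 2], ratios[int(0.84 * n)]
```

Multiplying the returned ratios by a nominal core mass gives an asymmetric 1σ interval; the same draws could be reused for the densities so that correlated quantities stay consistent.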
We also summarize the statistics of the core physical parameters in Table 5. The number of cores in each clump is listed in column (3). The minimum, maximum, and mean core masses are listed in columns (4)-(6). The mean values of core radius, volume density, surface density, and column density are listed in columns (7)-(10). The numbers of prestellar and protostellar cores are listed in column (11).

Coevolution of Clump and Most Massive Core
The ability of a clump to form massive stars is directly linked to the amount of material within the natal clump (Beuther et al. 2013). Therefore, it is essential and straightforward to study the relation between a clump and its most massive core (MMC), which is most likely to form massive stars inside the clump. The left panel of Figure 5 shows the core masses (M core) versus the mass of the clump (M clump) of the ASSEMBLE clumps, with the maximum value, that is, the mass of the MMC (M max), labeled. As demonstrated in the right panel, a positive sublinear correlation is observed between M ASSEMBLE,max and M ASSEMBLE,clump, with a power-law index of 0.75(0.08). The Pearson and Spearman correlation coefficients are calculated to be 0.67 and 0.73, respectively. Both correlation coefficients have p-values below 0.05, indicating that the observed correlation is statistically significant. This positive correlation indicates a coevolution between the clump and the MMC, i.e., a more massive clump contains a more massive core, which is consistent with what has been found in Anderson et al. (2021).
Furthermore, the coevolution of the massive clump and its most massive core can be connected to gas kinematics in a dynamic picture. In massive star-forming regions, filamentary gas accretion flows frequently connect clump and core scales in both observations (Peretto et al. 2013, 2014; Liu et al. 2016b; Lu et al. 2018; Yuan et al. 2018; Dewangan et al. 2020; Sanhueza et al. 2021; Li et al. 2022; Xu et al. 2023; Yang et al. 2023) and simulations [...] found four spiral-like gas streams conveying gas from the natal clump directly to the most massive core, with a continuous and steady gas accretion rate across three orders of magnitude. Therefore, we suggest that such a "conveyor belt" (Longmore et al. 2014) should be the main reason for coevolution. If all the massive clumps are undergoing a quick mass assembly, the sublinearity of the mass scaling relation also suggests that the clump-to-cores efficiency should vary among different clumps (Xu et al., in preparation). To understand the dynamic picture of the coevolution of clump and core more directly, detailed gas kinematics analyses should be systematically performed in a sample with a wide range of evolutionary stages.
It is worth noting that most of the efforts in the search for massive starless cores have been focused on IRDCs. However, several numerical simulations suggest that thermal feedback from OB protostars and strong magnetic fields in protostellar clusters can play a crucial role in reducing the level of further fragmentation and producing more massive dense cores (Offner et al. 2009; Krumholz et al. 2007, 2011; Myers et al. 2013), and hints of such a reduction of fragmentation for strong magnetic fields have been suggested observationally (Palau et al. 2021). Observations also suggest that a 5 M⊙ zero-age main sequence (ZAMS) star can produce radiation feedback to support high-mass fragments (Longmore et al. 2011). In particular, massive starless core candidates such as G9.62+0.19 MM9 (Liu et al. 2017) and W43-MM1#6 (Nony et al. 2018) have been found in evolved protostellar clusters. Moreover, Contreras et al. (2018) reported a relatively massive but highly subvirial collapsing prestellar core with a mass of 17.6 M⊙ that is heavily accreting from its natal cloud at a rate of 1.96 × 10^−3 M⊙ yr^−1. If the accretion rate persists during the lifetime of the massive starless clump (≲ 1−3 × 10^4 yr), then the mass of the prestellar core can be doubled by the beginning of the protostellar stage. Therefore, it would be even more promising to search for high-mass prestellar cores in protostellar clusters than in prestellar clusters.
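As a rough check of the doubling argument, assuming the quoted accretion rate stays constant over 10^4 yr (the lower end of the quoted lifetime range), the accreted mass is comparable to the current core mass:

```latex
\Delta M \simeq \dot{M}\,t
  = 1.96\times10^{-3}\,M_\odot\,{\rm yr^{-1}} \times 10^{4}\,{\rm yr}
  \approx 19.6\,M_\odot ,
\qquad
M_{\rm final} \simeq 17.6\,M_\odot + 19.6\,M_\odot \approx 37\,M_\odot \approx 2\,M_{\rm core}.
```

A longer lifetime at the same rate would give a proportionally larger accreted mass.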
Within the ASSEMBLE protoclusters, the most massive prestellar core, I17220-ALMA9, has a mass of 18.3 M⊙ within 0.065 pc, which is about two times larger than those found in the ASHES IRDCs. The second most massive prestellar core, I16060-ALMA17, has a mass of 16.5 M⊙ within 0.045 pc. However, we should note that 1) the ASSEMBLE data only have the ALMA 12m array configuration, so the core flux can be underestimated with extended flux filtered out; and 2) we adopt the clump-averaged temperature as the temperature of the prestellar core, which can be overestimated, resulting in an underestimated core mass. A complementary short-baseline configuration and a better estimation of temperature should give a better estimate of the prestellar core mass. At any rate, the available evidence strongly suggests that 1) prestellar cores are becoming more massive, which can be due to the continued mass accumulation along with the natal clump (see Section 6.1); and 2) high-mass prestellar cores can survive in protostellar clusters. However, to demonstrate the causality between the survival of high-mass prestellar cores and the protocluster environment, both a systematic search for high-mass prestellar cores in massive protoclusters and a determination of environmental effects are needed.

Core Separation
To study the spatial distribution of cores, we first build the minimum spanning tree (MST) for each ASSEMBLE core cluster; the details can be found in Appendix B.
Following the convention of Wang et al. (2016) and Sanhueza et al. (2019), we take the "edge" of the MST as the separation between cores. A total of N − 1 separation lengths are defined in each clump, where N is the number of cores. The upper panel of Figure 6 shows the distributions of core separation of the ASSEMBLE sample in blue, the ASHES total sample (ASHES Totals; Morii et al. 2023) [...] probability density as shown with the line-connected scatter plot, the Kolmogorov-Smirnov test between the distributions of the ASHES Totals and Pilots gives a p-value of 0.57 ≫ 0.1, indicating that the ASHES Pilots are consistent with being drawn from the same distribution as the ASHES Totals. Therefore, the ASHES Pilots are good enough to represent the ASHES Totals in the case of studying core separation. Since the sample size of the ASHES Pilots is comparable to that of the ASSEMBLE, we only compare core separations from the ASSEMBLE with those from the ASHES Pilots in the following analyses.
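The N − 1 separations defined by the MST edges can be obtained as in the sketch below, which uses Prim's algorithm on projected 2D core positions. This is an illustrative implementation with a hypothetical function name, not the procedure of Appendix B; positions are assumed to be in a common length unit such as pc.

```python
import math


def mst_edges(points):
    """Return the MST of 2D points as a list of (i, j, length), via Prim's algorithm."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # shortest known distance from each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        # Update distances of the remaining points to the grown tree
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges
```

The core separations of a clump are then `[length for _, _, length in mst_edges(points)]`, which always contains N − 1 values.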
The bias of the mass sensitivity and spatial resolution should be excluded. For example, if the ASSEMBLE mass sensitivity were better than that of ASHES, we would detect more low-mass cores, which would reduce the measured separations. Thanks to the comparable sensitivities of the two samples, we have detected the core population with the same truncation limited by the mass sensitivity. In addition, the ASSEMBLE and ASHES surveys share similar spatial resolutions, as indicated by the orange shadows, and therefore we can directly compare their core separations.
The Mann-Whitney U test² between the two groups of core separations gives a p-value ≪ 0.01, rejecting the null hypothesis that the two distributions are the same. To further test the effects of the uncertainty of the clump distance, 1000 Monte Carlo runs are adopted to simulate the 1σ distribution dispersion, as shown by the blue and gray extents in the lower panel of Figure 6. The distribution of the p-values derived from the Mann-Whitney U test is shown in the subpanel in the upper right corner of the lower panel. Even when perturbed by the 1σ uncertainty from distance (∼10-20%), the majority of p-values are significantly lower than 0.01, suggesting that the two distributions are truly different. In other words, the core separations in the ASSEMBLE protoclusters are systematically smaller than those in the ASHES protoclusters, suggesting that the cluster becomes tighter with closer separations during the clump evolution indicated by L/M.
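For illustration, a two-sided Mann-Whitney U test can be written with the standard library alone, as in the sketch below. It uses the large-sample normal approximation without tie or continuity corrections, which is a simplification assumed here; the function name is hypothetical and the paper does not state which implementation was used.

```python
import math


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided p-value) using the normal approximation."""
    # Pool both samples, remembering which sample each value came from
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the tied ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u1, p_value
```

To mimic the distance test described above, one could rescale each clump's separations by a random factor drawn from its distance uncertainty and repeat the test for each realization.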
It should be noted that the ASSEMBLE core separation exhibits a significant peak at ∼0.035 pc. The value is twice the spatial resolution (mean value of ∼0.018 pc), suggesting that it is not a result of resolution effects. Furthermore, both Tang et al. (2022) and Palau et al. (2018) have also observed two peaks in the separation histograms of W51 North and OMC-1S. One of these peaks falls within the range of 0.032 to 0.035 pc, which aligns with the results obtained in our study. Such consistency among three independent observations (with different spatial resolutions) might suggest a typical level of hierarchical fragmentation at this scale.

The Q Parameter
² The Mann-Whitney U test is a null-hypothesis test used to detect differences between two independent data sets. The test is non-parametric, i.e., it does not assume a specific distribution for the data (Mann & Whitney 1947). It can therefore be applied to any distribution, whether Gaussian or not.
To quantify the spatial distribution of cores, we follow the approach of Cartwright & Whitworth (2004) and define the Q parameter as

$$Q = \frac{\bar{m}}{\bar{s}}, \qquad (5)$$

where $\bar{m}$ is the normalized mean edge length of the MST, given by

$$\bar{m} = \sum_{i=1}^{N_{\rm c}-1}\frac{L_{i}}{(N_{\rm c}A)^{1/2}}, \qquad (6)$$

where $N_{\rm c}$ is the number of cores, $L_{i}$ is the length of each edge, and $A$ is the area of the protocluster, $A = \pi R_{\rm cluster}^{2}$, with $R_{\rm cluster}$ calculated as the distance from the mean position of the cores to the farthest core. $\bar{s}$ is the normalized correlation length,

$$\bar{s} = \frac{L_{\rm av}}{R_{\rm cluster}}, \qquad (7)$$

where $L_{\rm av}$ is the mean separation length between all cores and $R_{\rm cluster}$ is the cluster radius. The Q value serves as a measure of the degree of subclustering and the large-scale radial density gradient in a given region. As indicated by Fig. 5 in Cartwright & Whitworth (2004), a value of Q ≳ 0.8 indicates a centrally condensed spatial distribution characterized by a radial density profile of the form n(r) ∝ r^−α. On the other hand, when Q ≲ 0.8, the Q parameter decreases from approximately 0.80 to 0.45 with an increasing degree of subclustering, ranging from a fractal dimension of D = 3.0 (representing a uniform number-density distribution without subclustering) to D = 1.5 (indicating strong subclustering).
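The definition of Q can be sketched in code as follows, assuming projected 2D core positions in a common length unit. The helper names are hypothetical, and the MST is built with a simple Prim's algorithm for self-containment.

```python
import math
from itertools import combinations


def mst_total_length(points):
    """Total edge length of the minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], math.dist(points[u], points[v]))
    return total


def q_parameter(points):
    """Q = m_bar / s_bar following Cartwright & Whitworth (2004)."""
    n = len(points)
    center = (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
    r_cluster = max(math.dist(center, p) for p in points)
    area = math.pi * r_cluster**2
    # Normalized mean MST edge length
    m_bar = mst_total_length(points) / math.sqrt(n * area)
    # Normalized correlation length: mean pairwise separation over cluster radius
    pair_dists = [math.dist(a, b) for a, b in combinations(points, 2)]
    s_bar = (sum(pair_dists) / len(pair_dists)) / r_cluster
    return m_bar / s_bar
```

Applying `q_parameter` to the core positions of one clump returns a single Q value, which can then be compared with the Q ≈ 0.8 threshold described above.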
From the MST results, the derived Q parameters for the ASSEMBLE clumps range from 0.53 to 0.89, with a median value³ of 0.71(0.13). We note that four protoclusters (I15520, I16060, I16351, and I17204) have Q greater than 0.8, indicative of a centrally condensed spatial distribution. As shown in Figure 7, the Q parameter shows a weak correlation with the luminosity-to-mass ratio, with a Pearson correlation coefficient R p = 0.56. The positive correlation suggests that a protocluster becomes more centrally condensed as it evolves. In Section 6.5, a correlation among a sample of both ASSEMBLE and ASHES could be more instructive, since a wider dynamic range of L/M is available.

Mass Segregation
As defined in Allison et al. (2009) and Parker & Goodwin (2015), mass segregation refers to a more concentrated distribution of more massive objects with respect to lower mass objects than that expected by random chance. For dynamically old bound systems (i.e., relaxed and virialized clusters), the process of two-body relaxation has redistributed energy between stars, and they approach energy equipartition, whereby all stars have the same mean kinetic energy. Therefore, more massive stars will have a lower velocity dispersion, and they will sink into the deeper gravitational potential, i.e., the center of the cluster (Spitzer 1969).
Although mass segregation is observed in old stellar clusters, it need not arise from the canonical dynamical process of two-body relaxation. If we observe mass segregation in a region that is so young that two-body encounters cannot have mass segregated the stars, then the mass segregation must be set by some aspect of the star formation process and is often called "primordial mass segregation" (Parker & Goodwin 2015), which has been found in some simulations of star formation (e.g., Moeckel & Bonnell 2009; Myers et al. 2014). Observationally, Sanhueza et al. (2019) found only weak mass segregation in 4 out of 12 IRDCs and no mass segregation in the others. The overall conclusion is that there is no significant evidence of primordial mass segregation in IRDCs (Sanhueza et al. 2019; Morii et al. 2023). In contrast, at a similar physical resolution of 2400 au and in the same band (1.3 mm) with ALMA, Dib & Henning (2019) found that the massive star-forming region W43 exhibits evident mass segregation, with a maximum mass segregation ratio $\Lambda_{\rm MSR}^{\rm max} = 3.49$ (see definition in Equation 8).

Λ Plots: Characterisation of Mass Segregation
To quantify the mass segregation in the protoclusters, we adopt the mass segregation ratio (MSR), Λ MSR, which is defined by Allison et al. (2009) and shown to perform best compared to three other methods by Parker & Goodwin (2015). The value of Λ MSR at N MST is given by

$$\Lambda_{\rm MSR}(N_{\rm MST}) = \frac{\langle l_{\rm random}\rangle}{l_{\rm massive}} \pm \frac{\sigma_{l,{\rm random}}}{l_{\rm massive}}, \qquad (8)$$

where $l_{\rm random}$ is the mean MST edge length of an ensemble of N MST cores randomly chosen from the protocluster and $l_{\rm massive}$ is the mean MST length of the top-N MST most massive cores. In our analyses, we performed 1000 Monte Carlo runs of choosing N MST random cores to obtain a set of $l_{\rm random}$, calculating the mean value $\langle l_{\rm random}\rangle$ and its standard deviation $\sigma_{l,{\rm random}} = \sqrt{\langle (l_{\rm random} - \langle l_{\rm random}\rangle)^{2}\rangle}$. For each N MST, Λ MSR is meant to measure how much the MST length of the top-N MST most massive cores deviates from that of randomly chosen ensembles of the same size. If the MST length of the top-N MST ensemble is shorter than that of the random ensembles, the massive cores have a more concentrated distribution. By definition, Λ ≃ 1 means that the massive cores are distributed in the same way as the other cores (i.e., no mass segregation), Λ > 1 means that the massive cores are concentrated (i.e., mass segregation), and Λ < 1 means that the more massive cores are spread out relative to the other cores (i.e., inverse mass segregation).
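The Monte Carlo estimate of Λ MSR can be sketched as below. This is an illustrative implementation assuming projected 2D positions; the function names, the random seed, and the use of a simple Prim's algorithm are choices of this sketch rather than details given in the text.

```python
import math
import random


def mst_mean_edge(points):
    """Mean edge length of the minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], math.dist(points[u], points[v]))
    return total / (n - 1)


def lambda_msr(points, masses, n_mst, n_runs=1000, seed=0):
    """Return (Lambda_MSR, uncertainty) for the n_mst most massive cores."""
    rng = random.Random(seed)
    order = sorted(range(len(points)), key=lambda i: masses[i], reverse=True)
    l_massive = mst_mean_edge([points[i] for i in order[:n_mst]])
    # MST lengths of random ensembles of the same size
    l_random = [mst_mean_edge(rng.sample(points, n_mst)) for _ in range(n_runs)]
    mean = sum(l_random) / n_runs
    sigma = math.sqrt(sum((l - mean) ** 2 for l in l_random) / n_runs)
    return mean / l_massive, sigma / l_massive
```

Evaluating `lambda_msr` for N MST from 2 up to the total number of cores yields the profile that is plotted against f MST; when N MST equals the total number of cores, the ratio approaches 1 by construction.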
Figure 8 presents the mass segregation ratio Λ MSR versus the fraction of the selected core number to the total core number, f MST = N MST/N MST,max, which is called the "Λ MSR plot" hereafter. We arrange the protoclusters in descending order of the maximum value of the mass segregation ratio, Λ MSR,max. For example, the protocluster I16071 in the first panel has the highest Λ MSR,max of 8.72(±3.69), which implies strong mass segregation. In contrast, the protocluster I14382 has no mass segregation or even a weak inverse mass segregation, as shown in the last panel.
There are three notable features that deserve additional explanation in the Λ plots:
• Λ MSR peaks at small f MST: protoclusters have a wide range of dense core masses, while there are only a small number of massive dense cores. When f MST or N MST is small, massive dense cores account for a large proportion of the ensemble, so l massive should be significantly smaller than ⟨l random⟩ if mass segregation exists.
• Λ MSR decreases with f MST: when N MST increases, l massive involves more low-mass cores, so that the mass segregation trend, if it exists, is washed out. Furthermore, when f MST is larger, the ensembles of cores used to compute both l massive and ⟨l random⟩ are more similar to the entire core sample, so that both quantities theoretically approach the same value, the MST length of the entire core sample.
• Diverse Λ MSR profiles, or diverse fractions of cores involved in mass segregation: Λ MSR drops with f MST at different rates. The clump with the strongest mass segregation, I16071, has its Λ MSR dropping toward 1 around f MST ≃ 0.6, while the clump with the second strongest mass segregation, I16060, has its Λ MSR rapidly dropping toward 1 around f MST ≃ 0.2. Such diversity is also seen among the clumps with lower degrees of mass segregation (e.g., I15596 vs. I14498). Therefore, it is of great interest for future work to understand why different protoclusters show such different Λ MSR profiles.

I MSR
Λ MSR plots are difficult to compare with each other, because Λ MSR by definition depends on N MST or f MST. In other words, to fully characterise the degree of mass segregation of a protocluster, two main factors need to be taken into account: 1) Λ MSR,max, which directly determines the largest deviation from the random process, according to the definition in Equation 8; and 2) N MST,crit or f MST,crit, which determines the value of N MST or f MST at which the mass segregation of the cluster disappears.
Here we propose the mass segregation integral (MSI), I MSR Λ, to describe how a cluster is segregated, [...] where Λ MSR,i is the mass segregation ratio at a given f MST or N MST. The MSI is meant to record every deviation from Λ MSR,i = 1 (when there is no mass segregation) at each N MST. In Section 6.6, we will examine the significant mass segregation observed in the ASSEMBLE protoclusters and explore its possible origins using the MSI. However, it is important to acknowledge the limitations of using the MSI. First, the MSI collapses the Λ MSR profile into a scalar value, thus disregarding the potentially various spatial distributions that can produce the same MSI value. Second, the physical and mathematical interpretations of the MSI are not yet fully understood, as it only records the deviations from the random distribution of cores within a protocluster. To enhance our understanding, future studies could establish a correlation between the MSI and the evolutionary timescale of a protocluster.

EVOLUTION OF MASSIVE PROTOCLUSTERS
The ASSEMBLE clumps have evolved to a late stage in the formation of massive protoclusters, with an L/M ratio ranging from 10 to 100 L⊙/M⊙. In contrast, the ASHES clumps are in an early stage, with an L/M ratio between 0.1 and 1 L⊙/M⊙. Therefore, the ASSEMBLE and ASHES clumps can serve as mutually informative comparison groups and form the basis of our dynamic view of protocluster evolution. As introduced in Section 4.4, the statistics of the core parameters are summarized in Table 5, which can be directly compared to Table 5 in Sanhueza et al. (2019). To highlight the quiescent and active nature of the ASHES and ASSEMBLE clumps, respectively, the samples are also referred to as infrared dark clouds (IRDCs) and infrared bright clouds (IRBCs), respectively.

Core Growth and Mass Concentration
As shown in Figure 9, the IRBCs exhibit a median volume density of the core number, n core,med, of 177 per pc³, which is approximately three times greater than the 61 per pc³ observed in the IRDCs. We consider the potential effects of the different mass sensitivities between the two projects. The slightly worse sensitivity of the ASSEMBLE (σ = 0.089 M⊙) compared to the ASHES (σ = 0.078 M⊙) shows that correcting for sensitivity would only increase the core density in IRBCs.
To exclude the effects of different source extraction algorithms, we also perform the source extraction using getsf on the 12m-alone data of ASHES, as was done for the ASSEMBLE in Appendix B, obtaining a much lower core number of 66, mostly due to two factors: 1) getsf tends to extract spherical cores but miss irregular ones; and 2) the 12m-alone data filter out large-scale structures that were previously identified by the astrodendro algorithm. Therefore, correcting for the effects of the array configuration and source extraction algorithm would only result in an even larger difference between the two sets of parameters mentioned above. In any case, the core number densities in the ASSEMBLE clumps are considerably higher than those of IRDCs.
As demonstrated in simulations by Camacho et al. (2020), massive clumps accrete mass and increase in density as they evolve, resulting in a decrease in the free-fall timescale. Consequently, the dense cores formed by Jeans fragmentation collapse to form protostars more quickly, leading to a higher fraction of protostellar cores, f proto. As shown in the second column of Figure 9, f proto increases significantly from 29% in the early-stage IRDCs to 42% in the late-stage IRBCs on average. The increasing trend of f proto with respect to evolutionary stage has also been previously reported by Sanhueza et al. (2019) and is consistent with the fragmentation reported by Palau et al. (2014, 2015, 2021) in more evolved IRBCs, because in these works most of the cores are protostellar (given their higher masses and compactness compared to the ASSEMBLE sample).
Furthermore, we provide several pieces of evidence for the growth of the core mass from IRDC to IRBC in columns (3-11) in Figure 9. Parameters including the maximum, mean, and median mass of protostellar cores (M proto,max, M proto,mean, and M proto,med, respectively); those of prestellar cores (M pre,max, M pre,mean, and M pre,med, respectively); and those of the surface mass density of the total core population (Σ mass,mean, Σ mass,max, and Σ mass,med, respectively) all exhibit systematic increments from IRDCs to IRBCs. These mass or surface density increments have also been observed in another comparative work between IRDCs and IRBCs with hub-filament structures (Liu et al. 2023), where gas inflow is thought to be responsible for the hierarchical and multiscale mass accretion (Galván-Madrid et al. 2010; Liu et al. 2022a,b, 2023; Xu et al. 2023; Yang et al. 2023). Very recent statistical studies of dense cores in the Dragon infrared dark cloud (Kong et al. 2021), the Orion Molecular Cloud (OMC; Takemura et al. 2023), the ASHES IRDC sample (Li et al. 2023), and the ALMA-IMF protoclusters (Nony et al. 2023; [...] et al. 2023) also suggest that the protostellar cores are considerably more massive than the starless cores, suggesting that cores grow with time. If the missing flux in the ASSEMBLE sample were recovered, the effect would be to increase the core mass and surface density, strengthening the arguments above.
Figure 9. Comparison of parameters including the median volume density of the core number n core,med, the number fraction of protostellar cores f proto, the maximum/mean/median mass of protostellar cores M proto,max/M proto,mean/M proto,med, the mass of prestellar cores M pre,max/M pre,mean/M pre,med, and the core surface mass density Σ mass,max/Σ mass,mean/Σ mass,med. The IRDCs and IRBCs are in blue and red, respectively. The y value is normalized by the ratio to the value of IRDCs.

"Nurture" but not "Nature": A Dynamic View of the M max versus M clump Relation
As discussed in Section 5.1, a positive correlation between M max and M clump is observed within the ASSEMBLE protoclusters, suggesting a close relationship between the natal clump and the most massive core through multiscale gas accretion (Xu et al. 2023). In contrast, the ASHES Pilots, represented by the gray data points in the right panel of Figure 5, exhibit no significant M max versus M clump correlation. The Spearman rank correlation coefficient is calculated to be 0.04, with a corresponding p-value of 0.90. This lack of correlation aligns with the concept of thermal Jeans fragmentation (e.g., Palau et al. 2015, 2018; Sanhueza et al. 2019), where the clump's thermal Jeans mass is primarily determined by its dust temperature within a narrow range of 10-20 K (Morii et al. 2023), rather than by the turbulence, whose energy is governed by the clump's gravity assuming energy equipartition (Palau et al. 2015). This finding supports the notion that early-stage cores are characterized by dominant initial fragmentation rather than gravitational accretion. In that case, no correlation is naturally expected between the mass of the natal clump and the mass of the core resulting from fragmentation. Therefore, we propose a dynamic picture of the clump-core connection.
• At the beginning, initial Jeans fragmentation produces a set of dense cores whose mass is not associated with clump-scale gravitational potential (mass) and turbulence.
• As a massive clump evolves, multiscale continuous gas accretion helps build up the connection between clump and core scales, for example the mass correlation that we have observed.

Implications of the M max versus M cluster Relation
The relation M ⋆max versus M ⋆cluster, which describes the relationship between the mass of the most massive star (M ⋆max) and the total mass of the star cluster (M ⋆cluster), has been previously established both analytically (Weidner & Kroupa 2004) and observationally (e.g., Testi et al. 1999; Weidner & Kroupa 2006). This relation highlights the systematic variation of the typical upper mass limit with the overall mass of the star cluster. It suggests that the formation of stars within cloud cores is primarily influenced by growth processes occurring in an environment with limited resources. This finding underscores the significance of resource availability in shaping the stellar population within star clusters.
Protoclusters provide a retrospective glimpse of the early version of star clusters. Throughout this paper, we refer to M cluster as the sum of all the core masses in a protocluster. Note that M cluster is different from the total mass of a stellar cluster (M ⋆cluster). As shown in the left panel of Figure 5 [...] masses of the most massive cores M max in Figure 10. Contrary to what has been found in the "M max versus M clump" plane (shown in the right panel of Figure 5), both the ASHES and the ASSEMBLE protoclusters have a positive correlation between M max and M cluster. A second-order polynomial model, "y = −0.27x^2 + 1.96x − 1.51", fits the M max versus M cluster relation best, with a correlation coefficient of 0.68, and is shown as the blue solid line. In addition, we present the relation in stellar clusters given by observations (Weidner et al. 2013) as the green solid line and by smoothed particle hydrodynamics simulations (Bonnell et al. 2003, 2004) as the green dashed line.
Figure 10. [M max versus M cluster; legend entries: Polynomial Model, Linear Model, Weidner (2013), Bonnell (2003, 2004).] The blue solid line shows the second-order polynomial fitting of all the data points. M max ∝ M cluster^0.88 in the blue box indicates the power-law index in the first-order approximation. The gray dashed line shows the linear regression of all the data points. The green solid line shows the polynomial fitting of the data from 139 star clusters (Weidner et al. 2013), with a first-order approximation of M ⋆max ∝ M ⋆cluster^0.58 in the upper green box. The dashed green line as well as the lower green box show the relation from smoothed particle hydrodynamics simulations (Bonnell et al. 2003, 2004).
In order to facilitate a direct comparison between the stellar cluster and the protostellar cluster, we have also performed a first-order approximation of the polynomial models represented by the blue and green solid lines. By utilizing the mean value theorem, the first-order power-law index can be estimated by considering the average derivative within the given value range. The estimated power-law indices are indicated in the blue and green boxes, which are overlaid on the respective solid lines. As shown by the gray dashed line, we also directly perform a linear regression to derive a power-law index of 0.9, validating our first-order approximation of 0.88.
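The average derivative of the quoted polynomial over an interval equals its secant slope, which can be checked with the short sketch below. It assumes that x and y are the base-10 logarithms of M cluster and M max in solar masses, and the interval 1 ≤ x ≤ 3 (10 to 1000 M⊙) is chosen here only for illustration; the fitting range used in the paper is not stated in the text.

```python
def log_mmax_model(x):
    """Quoted second-order polynomial fit, y = -0.27 x^2 + 1.96 x - 1.51."""
    return -0.27 * x**2 + 1.96 * x - 1.51


def mean_slope(x_lo, x_hi):
    """Average derivative over [x_lo, x_hi], equal to the secant slope."""
    return (log_mmax_model(x_hi) - log_mmax_model(x_lo)) / (x_hi - x_lo)
```

With the assumed interval, `mean_slope(1.0, 3.0)` evaluates to 0.88, matching the first-order index quoted above; a different interval would give a different index, since the fit is curved.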
Despite uncertainties, the slope of the log M max versus log M cluster relation is notably steeper than that of the log M ⋆max versus log M ⋆cluster relation. To reconcile this disparity within the context of protocluster evolution, we take into account the influence of multiple star systems on massive star formation. As depicted in Figure 1 of Offner et al. (2022), the probability of events involving multiplicity is nearly 100%. In other words, it is highly likely that massive cores give rise to the formation of more than one massive star. Hence, it is natural to expect that the slope of the log M max versus log M cluster relation can evolve into a shallower version, akin to what is observed in the log M ⋆max versus log M ⋆cluster relation.
Another noteworthy finding is the similarity in the total cluster mass distribution between the ASHES and ASSEMBLE protoclusters, particularly within the range of 10-100 M⊙, when excluding four outliers with M cluster > 100 M⊙. Since these outliers also exhibit higher clump masses (refer to Figure 5), a more fundamental question arises: why do these protoclusters at different evolutionary stages consistently maintain a mass proportion of cluster to clump (M cluster/M clump) between 1% and 10%? This proportion can be regarded as the dense gas fraction (DGF), which is often closely associated with star formation efficiency (Ge et al. 2023). Consequently, it is of great significance to investigate the evolution of the DGF in relation to massive star-forming clumps (Xu et al., in preparation).

Gravitational Contraction: Protoclusters Evolve to Greater Compactness
As shown in Figure 6, the core separation distribution of the ASSEMBLE protoclusters has two prominent features. One is a peak at 0.035 pc, as discussed in Section 5.3, systematically smaller than what has been found in the ASHES, meaning that the spatial distribution becomes tighter and more compact as the protocluster evolves. The other is an extended tail at 0.06-0.3 pc, numerically consistent with the 0.06-0.24 pc found in the ASHES protoclusters (refer to the green or gray histograms of Figure 6), which is assumed to be the residual of the initial fragmentation at the early stage. In this section, we discuss how gravity leads to the tightening process of protoclusters and complete the dynamic picture of fragmentation and gravitational contraction.
We can make a simple semi-quantitative calculation. Given that thermal Jeans fragmentation is observed to dominate at the early stage of massive star formation (Sanhueza et al. 2019; Li et al. 2022; Morii et al. 2023; Palau et al. 2014, 2015, 2021), we assume as the initial condition that the cores initially fragmented on Jeans length scales of ∼0.14 pc (mean Jeans length in the ASHES; Sanhueza et al. 2019). If dense cores are moving toward the center of the clump under gravity, then the velocity of the cores should be the free-fall velocity, $v_{\rm ff} = R_{\rm cl}/t_{\rm ff}$. Adopting the typically observed massive clump size and density of $R_{\rm cl} = 1$ pc and $n_{\rm H_2} = 10^{4}$ cm^−3, the core separation will tighten by

$$\Delta l_{\rm gc} \simeq v_{\rm ff}\,t_{\rm life} = R_{\rm cl}\,\frac{t_{\rm life}}{t_{\rm ff}}, \qquad (10)$$

where $t_{\rm ff}$ and $t_{\rm life} \sim 0.2-1 \times 10^{5}$ yr (Motte et al. 2018) are the free-fall timescale and the statistical lifetime of massive starless clumps, respectively. Therefore, the core separation should tighten by ∆l gc ≃ 0.04-0.18 pc in the protoclusters through gravitational contraction, numerically consistent with the shift from the extended tail (0.06-0.3 pc) to the observed separation (peaked at 0.035 pc).
The simple gravitational contraction model fits the observations well, indicating ongoing bulk motions from the global gravitational collapse of massive clumps (Beuther et al. 2018;Vázquez-Semadeni et al. 2019).But note that we are still unable to rule out another possibility that the closer separation is due to hierarchical fragmentation to produce a series of condensations inside a massive core.
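The order of magnitude of this contraction estimate can be checked with a short sketch. It assumes the standard free-fall time t_ff = sqrt(3π / (32 G ρ)), a mean molecular weight per H2 molecule of 2.8, and a displacement of v_ff × t_life; none of these choices is stated explicitly in the text, so the output should be expected to agree with the quoted 0.04–0.18 pc only to within a factor of about two.

```python
import math

# Physical constants (CGS) and unit conversions
G = 6.674e-8        # gravitational constant [cm^3 g^-1 s^-2]
M_H = 1.6735e-24    # hydrogen atom mass [g]
YR_S = 3.156e7      # one year [s]
MU_H2 = 2.8         # ASSUMED mean molecular weight per H2 molecule


def free_fall_time_yr(n_h2_cm3, mu=MU_H2):
    """Free-fall time t_ff = sqrt(3*pi / (32*G*rho)), returned in years."""
    rho = mu * M_H * n_h2_cm3  # mass density [g cm^-3]
    return math.sqrt(3.0 * math.pi / (32.0 * G * rho)) / YR_S


def contraction_pc(r_cl_pc, n_h2_cm3, t_life_yr):
    """ASSUMED displacement v_ff * t_life, with v_ff = R_cl / t_ff."""
    v_ff_pc_per_yr = r_cl_pc / free_fall_time_yr(n_h2_cm3)
    return v_ff_pc_per_yr * t_life_yr


# Values quoted in the text: R_cl = 1 pc, n_H2 = 1e4 cm^-3,
# t_life = 0.2-1 x 1e5 yr
t_ff = free_fall_time_yr(1e4)
shifts = [contraction_pc(1.0, 1e4, t) for t in (0.2e5, 1.0e5)]
print(f"t_ff ~ {t_ff:.2e} yr")
print("contraction range [pc]:", [round(s, 2) for s in shifts])
```

With these assumptions the displacement is of order 0.1 pc, comparable to the initial Jeans-scale separation, which is the point of the argument.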

Evolution of the Q Parameter
As clumps evolve over time, the primordial distribution of cores dissolves due to dynamical relaxation, leading to a more radially concentrated structure as predicted by simulations (Guszejnov et al. 2022). Consequently, more-evolved clumps are predicted to have higher Q values. In the 12 ASHES Pilots, Sanhueza et al. (2019) used the fraction of protostellar cores f_proto to gauge the evolutionary stage. Due to the narrow parameter space of similar evolutionary properties such as dust temperature T_dust (10–15 K) and luminosity-to-mass ratio L/M (0.1–1 L⊙/M⊙), only a weak correlation between Q and f_proto was found (Sanhueza et al. 2019). However, Dib & Henning (2019) found that the most active star-forming region, W43, has a higher Q value than more quiescent regions (L1495 in Taurus, Aquila, and Corona Australis). These studies motivate a larger sample spanning a wide range of evolutionary stages to shed more light on the interplay between star formation in clouds and the spatial distribution of dense cores.
The combination of the ASSEMBLE and ASHES clumps provides a systematic sample with a wide dynamic range of evolutionary stage (L/M of 0.1–100 L⊙/M⊙ and T_dust of 10–35 K). To make the comparison between the two samples more direct, we have simulated mock 0.87 mm continuum data with only the 12-m array configuration (see details in Appendix C). Following the same core extraction procedure, we obtain an updated ASHES core catalog for use with the MST algorithm (see more details in Appendix B). The Q parameters for the ASHES sample range from 0.40 to 0.75, with a median value of 0.61(0.13). The Mann-Whitney U test between the Q parameters of the ASSEMBLE and ASHES samples gives a p-value of 0.03 (<0.05), showing that the two samples have significantly different Q parameters.
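As a reference for how a Q value is obtained from core positions, the sketch below builds a minimum spanning tree with Prim's algorithm and evaluates Q in the normalization of Cartwright & Whitworth (2004): the mean MST edge length divided by sqrt(N A)/(N − 1), over the mean pairwise separation divided by the cluster radius. Taking the cluster radius as the distance from the mean position to the farthest core, and the cluster area as a circle of that radius, are assumptions of this sketch; the exact definitions used for the survey are given in Appendix B and may differ. The core coordinates are hypothetical.

```python
import math
from itertools import combinations


def mst_edge_lengths(points):
    """Edge lengths of the Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # shortest known link from the tree to each point
    best[0] = 0.0
    edges = []
    for step in range(n):
        # Pick the closest point that is not yet part of the tree
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:
            edges.append(best[u])
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], math.dist(points[u], points[v]))
    return edges


def q_parameter(points):
    """Q = m_bar / s_bar in the Cartwright & Whitworth (2004) normalization."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    # ASSUMPTION: cluster radius = distance from mean position to farthest core
    r_cl = max(math.dist((cx, cy), p) for p in points)
    area = math.pi * r_cl ** 2
    # Mean MST edge length normalized by sqrt(N * A) / (N - 1)
    m_bar = sum(mst_edge_lengths(points)) / math.sqrt(n * area)
    # Mean pairwise separation normalized by the cluster radius
    pair_d = [math.dist(a, b) for a, b in combinations(points, 2)]
    s_bar = (sum(pair_d) / len(pair_d)) / r_cl
    return m_bar / s_bar


# HYPOTHETICAL core positions in pc, for illustration only
cores = [(0.00, 0.00), (0.03, 0.01), (0.05, -0.02), (0.20, 0.15),
         (-0.18, 0.10), (0.02, 0.04), (-0.05, -0.20)]
print(f"Q = {q_parameter(cores):.2f}")
```

In this convention, Q below roughly 0.8 points to a sub-clustered distribution and larger values to a centrally concentrated one.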
The linear regressions between the Q parameters and the evolutionary indicators (L/M and T_dust) are performed in log-log space and shown in Figure 11. The positive correlations of Q with both L/M and T_dust indicate that the Q parameters evolve with time. The correlations are confirmed to be statistically significant by the high Pearson correlation coefficients ρ_p of 0.61 and 0.60, with p-values of 0.0024 and 0.0034 for L/M and T_dust, respectively. Moreover, the Spearman correlation coefficients ρ_s are 0.57 and 0.53, with p-values of 0.0056 and 0.0088. Statistically, there is tentative evidence that the Q parameter of protostellar clusters increases at later evolutionary stages, indicating a more sub-clustered distribution at an early stage but a more centrally condensed structure as the cluster evolves, which agrees with the results and predictions in Sanhueza et al. (2019).
Compared to the early-stage clusters previously reported by Sanhueza et al. (2019) and Morii et al. (2023), we have identified three main differences in mass segregation according to the Λ-plots in Figure 8. Firstly, evident mass segregation (with Λ_MSR ≳ 3) was found in 73% (8 out of 11) of the ASSEMBLE protoclusters, more than 5 times the fraction identified in the ASHES sample (13%, 5 out of 39; Morii et al. 2023). Secondly, the mass segregation ratios we observed were significantly higher, with some clusters exhibiting values as large as ∼9. Finally, we observed Λ_MSR > 1 even for larger core numbers (≳10) in certain protoclusters such as I16351, I15520, and I15596.
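The log-log regression and the Pearson and Spearman coefficients quoted for Q against L/M and T_dust can be illustrated with a minimal standard-library sketch. The (L/M, Q) pairs below are hypothetical and are not survey measurements, ordinary least squares is assumed for the fit, and p-values are omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = mean_rank
        i = j + 1
    return r


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def loglog_fit(x, y):
    """Least-squares slope and intercept of log10(y) versus log10(x)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    slope = sxy / sxx
    return slope, my - slope * mx


# HYPOTHETICAL (L/M, Q) pairs, for illustration only
l_over_m = [0.2, 0.8, 3.0, 10.0, 30.0, 90.0]
q_values = [0.45, 0.55, 0.52, 0.66, 0.70, 0.81]

slope, intercept = loglog_fit(l_over_m, q_values)
rho_p = pearson([math.log10(v) for v in l_over_m],
                [math.log10(v) for v in q_values])
rho_s = spearman(l_over_m, q_values)
print(f"log Q = {slope:.3f} log(L/M) + {intercept:.3f}")
print(f"Pearson = {rho_p:.2f}, Spearman = {rho_s:.2f}")
```

The Spearman coefficient depends only on rank order, so it is unchanged by the logarithmic transform, whereas the Pearson coefficient is evaluated here on the logged values to match the log-log fit.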
Using the mass segregation integral (MSI) introduced in Section 5.5.2, we present a direct comparison between the ASSEMBLE and the ASHES Totals, as illustrated in Figure 12. The Mann-Whitney U test reveals significant differences between the ASSEMBLE and the ASHES protoclusters, as indicated by the green histogram of p-values. To establish a reference sample for statistical analysis, we simulate 100 clusters with a mean MSI of µ = 0 (no mass segregation) and a random perturbation of σ = 0.1 in MSI (assumed to be the same as the typical uncertainty when calculating the MSI). The Mann-Whitney U test rejects the null hypothesis that the MSI of the ASSEMBLE protoclusters follows the random perturbation (p-value ≪ 0.01), highlighting the presence of evident mass segregation. In contrast, the null hypothesis cannot be confidently rejected for the ASHES protoclusters, with mean and median p-values of 0.18 and 0.12, respectively. Thus, the ASSEMBLE protoclusters exhibit robust evidence of mass segregation, whereas the mass segregation in the ASHES protoclusters is weak to moderate.
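The comparison against a Gaussian reference sample can be sketched as follows. The reference follows the description in the text (100 draws with µ = 0 and σ = 0.1); the MSI values for the "segregated" sample are hypothetical, and the test uses the normal approximation to the Mann-Whitney U statistic without tie or continuity corrections, which is an assumption of this sketch rather than the procedure used in the paper.

```python
import math
import random


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test with the normal approximation.

    Returns (U statistic of sample a, approximate p-value).
    No tie or continuity correction is applied.
    """
    n1, n2 = len(a), len(b)
    # U counts pairs with a_i > b_j; ties contribute one half
    u = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return u, p


# Reference sample as described in the text: no segregation, sigma = 0.1
rng = random.Random(0)
reference = [rng.gauss(0.0, 0.1) for _ in range(100)]

# HYPOTHETICAL MSI values for 11 clearly segregated clusters
msi_segregated = [0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.4, 1.6, 1.9, 2.3]

u_seg, p_seg = mann_whitney_u(msi_segregated, reference)
print(f"U = {u_seg:.1f}, p = {p_seg:.2e}")
```

A small p-value here means the sample is unlikely to be drawn from the unsegregated reference distribution; for samples as small as 11 clusters an exact test would be preferable to the normal approximation.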
In the context of protocluster evolution, the degree of mass segregation increases unambiguously from the ASHES clusters to the ASSEMBLE clusters (this work). The natural question is therefore the origin of this mass segregation. Here, we test whether it can result from canonical two-body dynamical relaxation.
To analyze the dynamics of the cluster, we adopt the formulation of Reinoso et al. (2020), who extended the framework of Spitzer (1987) to include the effect of a gas potential. The crossing time of the cluster, t_cross, is then expressed in terms of the velocity under virial equilibrium, V_vir, and q = M_gas/M_cluster. Here, M_gas and M_cluster are the ambient low-density gas mass and the total mass of the dense core cluster, respectively, and R is the radius of the cluster. The relaxation time, t_relax, is then expressed in terms of N, the number of cores in a cluster, and γ, a constant of proportionality in the virial velocity term. The value of γ lies between 0.42 and 0.38 for polytropic indices of the cluster system between 3 and 5, and γ = 0.4 provides a reasonably good approximation for most systems (Spitzer 1969). We therefore use γ = 0.4 here.
We take the ASHES sample as the initial condition for our protocluster analyses, assuming typical values of R = 0.5 pc, N = 25, and M_cluster = 100 M⊙ (i.e., the mass of the protoclusters). For the gas mass, we used two different methods. The first method calculates the total gas mass from an average volume density of 5 × 10^4 cm^-3 and a radius of R = 0.5 pc (from Table 1 in Morii et al. 2023), resulting in a gas mass of ∼200 M⊙. The second method considers that the flux recovered by ALMA comes only from the dense cores that we identified, and that the missing flux comes from diffuse gas, both of which are covered by the ATLASGAL emission. As shown in Section 4.2, the flux ratio of ALMA to ATLASGAL has a mean value of ∼20%. Therefore, the total gas mass should be four times larger than the total mass of the core cluster, giving a value of 400 M⊙. The two independent methods yield M_gas within the range of 200–400 M⊙, from which we derive a q value of 2–4. Taking all factors into account, we find that the typical relaxation time of a protocluster is as long as 70–500 Myr, which is much longer than the typical timescale of massive star formation (several Myr). Considering the short formation timescale of massive stars, the mass segregation is unlikely to be caused by dynamical relaxation (Zhang et al. 2022), the mechanism usually invoked for more evolved stellar clusters. In the context of a stellar cluster, such mass segregation should be considered "primordial", although it has already evolved from its initial stage.
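The scale of this relaxation time can be illustrated with a short sketch. The expressions used below, t_cross = R / V_vir with V_vir = sqrt(G M_cluster (1 + q) / R) and t_relax = 0.138 N (1 + q)^4 / ln(γN) × t_cross, are an assumed form of the gas-potential-corrected relaxation time and are not reproduced from the text, so the output should be expected to match the quoted 70–500 Myr only to within a factor of a few. The input values R, N, M_cluster, q, and γ are those given above.

```python
import math

G_PC = 4.301e-3          # G in pc (km/s)^2 / Msun
KMS_TO_PC_MYR = 1.0227   # 1 km/s expressed in pc/Myr


def crossing_time_myr(r_pc, m_cluster_msun, q):
    """ASSUMED form: t_cross = R / V_vir, V_vir = sqrt(G M (1 + q) / R)."""
    v_vir_kms = math.sqrt(G_PC * m_cluster_msun * (1.0 + q) / r_pc)
    return r_pc / (v_vir_kms * KMS_TO_PC_MYR)


def relaxation_time_myr(r_pc, m_cluster_msun, n_cores, q, gamma=0.4):
    """ASSUMED form: t_relax = 0.138 N (1 + q)^4 / ln(gamma N) * t_cross."""
    t_cross = crossing_time_myr(r_pc, m_cluster_msun, q)
    return (0.138 * n_cores * (1.0 + q) ** 4
            / math.log(gamma * n_cores) * t_cross)


# Values from the text: R = 0.5 pc, N = 25, M_cluster = 100 Msun, q = 2-4
for q in (2.0, 4.0):
    t_cross = crossing_time_myr(0.5, 100.0, q)
    t_relax = relaxation_time_myr(0.5, 100.0, 25, q)
    print(f"q = {q:.0f}: t_cross ~ {t_cross:.2f} Myr, "
          f"t_relax ~ {t_relax:.0f} Myr")
```

Under these assumptions the relaxation time comes out at tens to hundreds of Myr, well above the few-Myr timescale of massive star formation, which is the comparison the argument relies on. The steep (1 + q)^4 dependence means the result is dominated by the adopted gas-to-core mass ratio.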
If the observed mass segregation is not induced by traditional dynamical processes among the cores or stars themselves, what could be its origin? We propose that it arises naturally from the gravitational concentration of the entire clump or from gas accretion toward the center. ALMA observations of IRDCs have already revealed a large number of sub-Jeans-mass cores during the initial fragmentation (Sanhueza et al. 2019; Morii et al. 2023). In the late stage, the most massive cores are always located at the centers of the clumps or of their gravitational potentials. Our work supports the predictions of numerical simulations in which members near the center of the gravitational potential become the most massive cores during the evolution owing to their privileged location in the forming cluster (Bonnell & Davies 1998; Bonnell & Bate 2006).

CONCLUSION
The ALMA Survey of Star Formation and Evolution in Massive Protoclusters with Blue Profiles (ASSEMBLE) aims at a comprehensive examination of the mass assembly process of massive star formation in a dynamic view, including fragmentation and accretion, and their relevance to theories. To this end, the survey employed ALMA 12 m mosaicked observations to capture both continuum and spectral line emission in 11 massive (M_clump ≳ 10^3 M⊙) and luminous (L_bol ≳ 10^4 L⊙) clumps with blue profiles. This paper releases the continuum data, characterizes the core physical properties, and presents analyses of the evolution of the protostellar clusters. The conclusions drawn from the analysis are as follows:
1. With a high angular resolution of ∼0.8–1.2′′, the 870 µm dust continuum emission reveals fragmentation with diverse morphologies. Applying the getsf algorithm to the continuum data, we identified a total of 248 cores across the 11 massive protoclusters, with the number of cores per clump ranging from 15 to 37.
2. We classified the cores on the basis of molecular outflows and line identification. Of the 248 cores, 142 were classified as prestellar core candidates, while 106 were identified as protostellar cores. To estimate the temperature, we used the rotational temperature derived from the multitransition lines of H2CS and CH3OCHO. If neither of the two lines is detected, we used the clump-averaged temperature for the prestellar core candidates. The properties of the H2CS lines in the ASSEMBLE sample will be discussed in a forthcoming article.
3. Compared to early-stage ASHES protoclusters, the more evolved ASSEMBLE protoclusters show systematic increases in the average and maximum mass as well as in the surface density of both protostellar and prestellar cores.These increases indicate ongoing mass accretion onto these dense cores, which aligns with the gas accretion process observed in these massive clumps with blue profiles.
4. The mass of the most massive core (MMC), M_max, correlates with the mass of the clump, M_clump, as M_max ∝ M_clump^0.75, with a Spearman correlation coefficient of 0.73. The sublinear correlation indicates a coevolution between clump and MMC, potentially through multiscale gas accretion. In contrast, the correlation is not observed in the early-stage ASHES protoclusters, consistent with the idea that early-stage cores are characterized by dominant initial fragmentation rather than clump-scale gravitational accretion.
5. The correlation between the mass of the MMC, M_max, and the mass of the protocluster, M_cluster, is almost linear, with a power-law index of ∼0.9 in the first-order approximation. Despite uncertainties, the slope of the log M_max versus log M_cluster relation is steeper than that of the log M_⋆max versus log M_⋆cluster relation found in star clusters, which can be reconciled by an increasing trend of stellar multiplicity with mass.
6. The most massive prestellar cores found in our study have an average mass of 18.6 M⊙, which is approximately two times larger than that found in the ASHES Pilots. Furthermore, the median and mean masses of the prestellar cores in the protoclusters are ∼2–3 times higher than in the IRDCs. This suggests that prestellar cores become more massive as a result of continued mass accumulation within the natal clump and that high-mass prestellar cores can potentially survive in protostellar clusters. Thus, we recommend a systematic search for high-mass prestellar cores in massive protoclusters.
7. Using the Minimum Spanning Tree (MST) algorithm, the cores within each cluster are connected by edges. The core separations in the ASSEMBLE sample are systematically smaller than those in the ASHES sample, indicating that the cluster becomes tighter, with closer separations, during its evolution. The Q parameters are observed to be positively correlated with both the luminosity-to-mass ratio L/M and the dust temperature T_dust, indicating a more sub-clustered distribution at an early stage but a more centrally condensed structure as the cluster evolves.
8. According to the mass segregation ratio (Λ) plots and the mass segregation integral (MSI) defined in this paper, mass segregation is commonly found (8 out of 11) and clearly evident in the ASSEMBLE protoclusters. The MSI of the ASHES sample does not differ significantly from that of a random spatial distribution without mass segregation, indicating weak or no mass segregation at the initial stage. We further propose that the mass segregation arises from gas accretion and gravitational concentration, rather than from dynamical interactions between point masses after the gas has been removed from the system.
Based on the results and discussion presented above, we propose a comprehensive dynamic perspective on protocluster evolution, as shown in Figure 13. At the initial stage, the protocluster originates from thermal Jeans fragmentation in infrared dark (L/M < 1 L⊙/M⊙) clumps, with wide separations and no mass segregation. Subsequently, filamentary structures, especially hub-filament systems (Morii et al. 2023), act as "conveyor belts" and facilitate mass transfer toward the cores, by which the connection between the clump and the cores is gradually established (Xu et al. 2023). Concurrently, protostars form from dense cores and heat the gas and dust within the clump, moving it into an infrared weak state (1 L⊙/M⊙ < L/M < 10 L⊙/M⊙). Owing to persistent global gravitational collapse and contraction, the protocluster becomes even tighter, with narrower core separations, and mass segregation builds up in the late stage (L/M > 10 L⊙/M⊙).
The ASSEMBLE project not only provides valuable insights into the mass segregation and clustering properties of massive protoclusters but can also be used to investigate outflows (Baug et al. 2021), chemistry, and core-scale infall motion. When combined with Band-3/6 data from the ATOMS project (PI: Tie Liu; see the survey description in Liu et al. 2020), the ASSEMBLE data can support further kinematic analyses, illuminating how gas is transferred inward and how efficient accretion is at the clump scale in a dynamic view. In this paper, our analyses and their statistical significance are mainly limited by sample size. As the ASSEMBLE project aims to expand its sample to cover a wider range of parameters such as evolutionary stage (L/M) and clump mass (M_clump), more statistically significant results are expected.
2013, 2018), CASA (McMullin et al. 2007), getsf (Men'shchikov 2021)

Figure 1. Showcase of the ASSEMBLE sample. The background shows the Spitzer infrared three-color map (blue: 3.6 µm; green: 4.5 µm; red: 8 µm). White contours are ATLASGAL 870 µm continuum emission, with levels starting from 5σ and increasing in steps of f(n) = 3 × n^p + 2, where n = 1, 2, 3, ..., N. The beam of the ATLASGAL continuum map is 19.2′′, as shown on the bottom left. The ALMA mosaicked pointings are shown with yellow circles, and the primary beam responses of 0.5 and 0.2 are outlined by orange solid and dashed lines, respectively. The scale bar of 0.5 pc is labeled in the bottom right corner.
3. OBSERVATIONS AND DATA REDUCTION
3.1. ALMA Band-7 Observing Strategy
The 17-pointing mosaicked observations were carried out with ALMA using the 12-m array in Band 7 during Cycle 5 (Project ID: 2017.1.00545.S; PI: Tie Liu) from 18 May to 23 May 2018. The observations were divided into 6 executions; 48 antennas were used to obtain a total of 1128 baselines with lengths ranging from 15 to 313.7 meters across all the executions. The mosaicked observing fields of ALMA are designed to cover the densest part of the massive clumps traced by the ATLASGAL 870 µm continuum emission. The fields are outlined by the yellow dashed loops in Figure 1. The field center of each mosaicking field is shown in columns (2)-(3) of Table

Figure 2. The ALMA 870 µm dust continuum emission without primary beam correction, as well as extracted cores, for two ASSEMBLE clumps (I14382-6017 and I14498-5856). The ALMA mosaicked primary beam responses of 0.5 and 0.2 are outlined by yellow solid and dashed lines, respectively. Only the primary beam response of 0.2 is shown in the right panel. The beam size of each continuum image is shown in the bottom left corner. Left: the background color map shows the ALMA 870 µm emission with two colorbars, the first (grayscale) showing -9 to +9 times the rms noise on a linear scale, and the second (color scheme) showing the range from +9 times the rms noise to the peak value of the image in an arcsinh stretch. The rms noise and peak intensity are given on the top right. The black contours are from the ATLASGAL 870 µm continuum emission, with power-law levels that start at 5σ and end at I_peak, increasing in steps following the power law f(n) = 3 × n^p + 2, where n = 1, 2, 3, ..., N and p is determined from D = 3 × N^p + 2 (D = I_peak/σ: the dynamic range; N = 8: the number of contour levels). The values of each contour level are labeled in units of Jy beam^-1. Right: the background grayscale map shows the arcsinh-stretch part of the left panel, outlined by the 5σ contour. The ALMA continuum emission map is smoothed to a circular beam with a size equal to the major axis of the original beam. The cores extracted by the getsf algorithm are shown as red or blue ellipses with black IDs, numbered in order from north to south. The red and blue ellipses represent protostellar and prestellar cores, respectively, as defined in Section 4.3.

Figure 3. (a) Number of cores detected against the 1σ mass sensitivity.The mass of clump is coded as the size of circle.(b) Number of cores detected against the physical resolution.The distance of clump is coded as the size of circle.

Figure 4. Examples of "hot core" and "warm core" spectra. The gray lines are the real spectral data extracted from the I16060-ALMA7 and I16060-ALMA15 dense cores. The best-fit line models of CH3OCHO and H2CS are shown in red and blue, respectively. The core temperatures are assumed to be 112 K and 89 K for the hot and warm core, respectively.

Figure 5. Left: Core masses M_core versus the clump masses M_clump. The masses of the most massive cores M_max in each clump are labeled with the corresponding colors. Right: The scaling relation between M_max and M_clump, with the result of linear regression shown on the top left. The shaded area shows the 2σ uncertainty of the fitting result of the ASSEMBLE sample. The gray points from the ASHES Pilots show no correlation.
ulations (Schneider et al. 2010; Naranjo-Romero et al. 2022), which can play a crucial role in regulating mass reservoirs at different scales. Notably, Xu et al. (2023) found four spiral-like gas streams conveying gas from the natal clump directly to the most massive core, with a continuous and steady gas accretion rate across three orders of magnitude. Therefore, we suggest that such a "conveyor belt" (Longmore et al. 2014) should be the main reason for coevolution. If all the massive clumps are undergoing a quick mass assembly, the sublinearity of the mass scaling relation also suggests that the clump-to-core efficiency should vary among different clumps (Xu et al., in preparation). To understand the dynamic picture of the coevolution of clump and core more directly, detailed gas kinematics analyses should be systematically performed in a sample with a wide range of evolutionary stages.

Figure 6. Upper: the number distribution (stacked histogram) and probability density distribution (line-connected points) of core separation, presented on a logarithmic scale, where the ASSEMBLE, the ASHES Total (Morii et al. 2023), and the ASHES Pilot (Sanhueza et al. 2019) samples are shown in blue, green, and gray, respectively. The mean spatial resolution of both the ASSEMBLE and the ASHES surveys is ∼0.02 pc, shown with orange shading. Lower: the 1000 Monte Carlo runs of the probability density distribution of core separation for the ASSEMBLE (blue lines) and the ASHES (gray lines) samples, considering the Gaussian-like uncertainty of clump distance. The Mann-Whitney U test is performed on each set of core separation distributions, and the distribution of the p-value is shown in the top right. The p-values are much lower than 0.01, showing that the two samples have significantly different distributions of core separation.

Figure 7.The Q values versus luminosity-to-mass ratio L/M in the ASSEMBLE protoclusters.The linear regression results including the fitting model, Pearson correlation coefficient ρp, Spearman rank correlation coefficient ρs, and scatter σ are shown on the upper left.

Figure 8. The mass segregation ratio Λ_MSR plots: Λ_MSR is presented as a function of the fraction of core number f_MST = N_MST/N_MST,max for the 11 protoclusters. The shaded regions in each panel represent the local uncertainties. The upper left corner of each panel shows the maximum value Λ_MSR,max and its corresponding N_MST. The Λ_MSR versus f_MST panels are arranged in descending order of Λ_MSR,max for the corresponding protoclusters. The total core number is listed in the lower right corner.
Figure 9. Comparison of parameters including the median volume density of the core number n_core,med, the number fraction of protostellar cores f_proto, the maximum/mean/median mass of protostellar cores M_proto,max/M_proto,mean/M_proto,med, the maximum/mean/median mass of prestellar cores M_pre,max/M_pre,mean/M_pre,med, and the core surface mass density Σ_mass,max/Σ_mass,mean/Σ_mass,med. The IRDCs and IRBCs are in blue and red, respectively. The y value is normalized by the ratio to the value of the IRDCs.

Figure 10. The M_max versus M_cluster relation. The ASSEMBLE and the ASHES Pilots are plotted with errorbars. The blue solid line shows the second-order polynomial fitting of all the data points. M_max ∝ M_cluster^0.88 in the blue box indicates the power-law index in the first-order approximation. The gray dashed line shows the linear regression of all the data points. The green solid line shows the polynomial fitting of the data from 139 star clusters (Weidner et al. 2013), with the first-order approximation of M_⋆max ∝ M_⋆cluster^0.58 in the upper green box. The dashed green line and the lower green box show the relation from smoothed particle hydrodynamics simulations (Bonnell et al. 2003, 2004).

Figure 11.The Q values versus luminosity-to-mass ratio L/M (left) and temperature T dust (right).The linear regression results including the fitting model, Pearson correlation coefficient ρp, Spearman rank correlation coefficient ρs, and scatter σ are shown on the upper left.ASHES clumps are shown in the Q versus T dust panel with gray points.

Figure 12. The mass segregation integrals I^MSR_Λ of the ASSEMBLE and the ASHES samples are shown as blue and gray histograms with errorbars. The p-values of the Mann-Whitney U test between the ASSEMBLE, ASHES, and Gaussian distributions (µ = 1, σ = 0.1) are shown in three colors in the upper right corner.

Figure 13. Cartoon of protocluster evolution from infrared dark, to infrared weak, to infrared bright. The black filamentary structures connect the dense cores at the early stage (Morii et al. 2023) and transfer gas inward (Xu et al. 2023), and then fade away as the protocluster evolves (Zhou et al. 2022). The black arrows indicate inflowing gas streams. Separate symbols mark prestellar cores, protostellar cores, and OB stars.
grant. AP acknowledges financial support from the Sistema Nacional de Investigadores of CONAHCyT, and from the CONAHCyT project number 86372 of the 'Ciencia de Frontera 2019' program, entitled 'Citlalcóatl: A multiscale study at the new frontier of the formation and early evolution of stars and planetary systems', México. GCG acknowledges support from UNAM-PAPIIT IN10382 grant. A.S., G.G., and L.B. gratefully acknowledge support of ANID through the BASAL project FB210003. A.S. also gratefully acknowledges support from the Fondecyt Regular (project code 1220610). This work is sponsored (in part) by the Chinese Academy of Sciences (CAS), through a grant to the CAS South America Center for Astronomy (CAS-SACA) in Santiago, Chile. C.W.L. is supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF-2019R1A2C1010851), and by the Korea Astronomy and Space Science Institute grant funded by the Korea government (MSIT; Project No. 2023-1-84000). K.T. was supported by JSPS KAKENHI (Grant Number JP20H05645). This paper uses the following ALMA data: ADS/JAO.ALMA#2017.1.00545.S and 2019.1.00685.S, and ADS/JAO.ALMA#2015.1.01539.S, 2017.1.00716.S, and 2018.1.00192.S. ALMA is a partnership of ESO (representing its member states), NSF (USA) and NINS (Japan), together with NRC (Canada), MOST and ASIAA (Taiwan), and KASI (Republic of Korea), in cooperation with the Republic of Chile. The Joint ALMA Observatory is operated by ESO, AUI/NRAO, and NAOJ. Data analysis was in part carried out on the High-Performance Computing Platform of Peking University, and the open-use data analysis computer system at the Astronomy Data Center (ADC) of the National Astronomical Observatory of Japan. The MeerKAT telescope is operated by the South African Radio Astronomy Observatory, which is a facility of the National Research Foundation, an agency of the Department of Science and Innovation.

Table 1. Physical Properties of the ASSEMBLE Clumps

Table 2. ALMA Observational Parameters

Table 3. Fundamental Measurements of Core Parameters from getsf

Table 4. Calculated Properties for the Core Sample
Note-ASSEMBLE clump and extracted core ID are listed in (1) and (2). Dust temperature and its estimation method are listed in (3) and (4). The mass, radius, volume density, surface density, and peak column density are listed in (5)-(9). This table is available in its entirety in machine readable form.
a Temperature estimation method: G = global clump-averaged temperature in column (9) of Table 1; H = H2CS rotation temperature; C = CH3OCHO rotation temperature.

Table 5. Statistics of the ALMA Cores in Each Clump
Note-ASSEMBLE clump is listed in (1). The 1σ mass sensitivity and the number of extracted cores are listed in (2) and (3). The minimum, maximum, and mean values of the mass are listed in (4)-(6). The mean values of the core radius, volume density, surface density, and peak column density are listed in (7)-(10). The numbers of prestellar and protostellar cores are listed in (11).