Tracing the Total Stellar Mass and Star Formation of High-redshift Protoclusters

As the progenitors of present-day galaxy clusters, protoclusters are excellent laboratories to study galaxy evolution. Since existing observations of protoclusters are limited to the detected constituent galaxies at UV and/or infrared wavelengths, the details of how typical galaxies grow in these young, pre-virialized structures remain uncertain. We measure the total stellar mass and star formation within protoclusters, including the contribution from faint undetected members by performing a stacking analysis of 211 z = 2–4 protoclusters selected as Planck cold sources. We stack Wide-field Infrared Survey Explorer and Herschel/SPIRE images to measure the angular size and the spectral energy distribution of the integrated light from the protoclusters. The fluxes of protoclusters selected as Planck cold sources can be contaminated by line-of-sight interlopers. Using the WebSky simulation, we estimate that a single protocluster contributes 33% ± 15% of the flux of a Planck cold source on average. After this correction, we obtain a total star formation rate of 7.3 ± 3.2 × 103 M ⊙yr−1 and a total stellar mass of 4.9 ± 2.2 × 1012 M ⊙. Our results indicate that protoclusters have, on average, 2× more star formation and 4× more stellar mass than the total contribution from individually detected galaxies in spectroscopically confirmed protoclusters. This suggests that much of the total flux within z = 2–4 protoclusters comes from galaxies with luminosities lower than the detection limit of SPIRE (L IR < 3 × 1012 L ⊙). Lastly, we find that protoclusters subtend a half-light radius of 2.′8 (4.2–5.8 cMpc), which is consistent with simulations.


INTRODUCTION
Protoclusters are overdensities of galaxies at high redshift that are expected to collapse into low redshift galaxy clusters (Overzier 2016).Unlike galaxy clusters, protoclusters are not yet compact and virialized.In the local universe, galaxies located within galaxy clusters have lower star formation rates and older stellar populations compared to galaxies in the field (Dressler 1980).However, observations and simulations of galaxies suggest that at higher redshifts cluster galaxies have higher star formation rates for a fixed stellar mass (Casey 2016;Chiang et al. 2017).At redshifts of z > 1.2, the star-Corresponding author: Roxana Popescu rpopescu@umass.eduformation rates of individual galaxies within clusters exceed those of field galaxies (Alberts et al. 2014).
It is important to study protoclusters to understand the history of the formation of galaxy clusters and the processes that govern the growth and quenching of galaxies that reside in dense environments.Studies of protoclusters can also indicate how structure evolves in the universe and indicate the impact of environment on galaxy evolution.Infrared observations, in particular, provide a powerful constraint on stellar content growth via star formation in protoclusters (see, e.g., Alberts & Noble 2022, and references therein).At z > 2, continuum emission from old stars is redshifted to the near-infrared.Dust obscured star-formation is observed in the mid-to far-infrared as emission from stars is absorbed by and re-radiated from the dust.State-ofthe-art hydrodynamic simulations are unable to repro-duce the combined star-formation rates of highly dustobscured starburst galaxies observed in protoclusters (Lim et al. 2021;Remus et al. 2022).Sensitive infrared observations can place firm observational constraints on how galaxies grow in protocluster environments and inform theoretical models.
Since protoclusters are extended structures that have not yet collapsed, the hot intracluster gas might not exist.As a result, conventional methods of detecting galaxy clusters such as the Sunyaev-Zeldovich effect and X-ray emission are not as effective.In recent years, highredshift protoclusters have been discovered as overdensities of Lyα emitters (LAEs: e.g., Steidel et al. 2000;Kurk et al. 2000;Hayashino et al. 2004;Lee et al. 2014;Dey et al. 2016;Higuchi et al. 2019;Shi et al. 2019a), Hα emitters (Kurk et al. 2004;Tanaka et al. 2011;Hayashi et al. 2012;Koyama et al. 2013), Lyman Break Galaxies (LBGs: e.g., Steidel et al. 1998;Toshikawa et al. 2012;Shi et al. 2019b), dusty star-forming galaxies (DSFGs: e.g., Chapman et al. 2009;Oteo et al. 2018;Casey et al. 2015) and through infrared color selection (e.g., Galametz et al. 2012).Recently, observatories designed to map the CMB such as Planck and the South Pole Telescope (SPT) have identified bright millimeter/submillimeter sources, which, when followed up with observations at higher resolution, resolve into overdensities of dusty galaxies that are candidate protoclusters (Planck Collaboration et al. 2015;Wang et al. 2021;Miller et al. 2018).While these selection techniques have been proven effective in identifying protoclusters, they target specific types of galaxies and thus do not provide a complete picture of protocluster formation (Kubo et al. 2013;Shi et al. 2019b).Additionally, the small number of deep, multi-wavelength follow-up observations of protoclusters do not present a uniform census of the low-luminosity members that may dominate in number and in the star-formation/stellar mass budget.
In principle, confirming a protocluster structure requires spectroscopic measurements of its member galaxies.However, even in the era of deep, wide-field imaging surveys by facilities such as Euclid, Vera C. Rubin Observatory, and the Nancy Grace Roman Space Telescope which will identify a large number of protocluster candidates, it will be expensive to spectroscopically follow up all of these candidates.A study of a statistical sample of protoclusters that accounts for all of the emission in protocluster members and not just the massive and bright galaxies is needed to quantify when and where star formation takes place within protoclusters and to provide a useful benchmark to interpret further observations.A statistical study would also be useful for comparing the average properties of protoclusters selected by different methods.Comparing statistical samples of protocluster candidates identified by overdensities in the far-IR to optically or near-IR selected overdensities would demonstrate differences in the total star formation within these populations.
In this study we use a WISE and Herschel1 SPIRE stacking analysis to measure the total light from Planckidentified protocluster candidates (Planck Collaboration et al. 2015) in order to quantify the average stellar mass and the dust-obscured star formation rate.We compare our results to a similar stacking analysis done by Kubo et al. (2019) on protoclusters detected by overdensities of Lyman break galaxies with the Hyper Suprime-Cam Subaru Strategic Program (Toshikawa et al. 2018).We also compare to clusters at z ∼ 1−2 (Alberts et al. 2021) stacked using the same total light stacking method.
This paper is organized as follows: Section 2 describes the protocluster sample and the datasets used in this work.In Section 3 we discuss our total light stacking technique.In Section 4 we calculate a correction factor for line of sight interlopers within the sample.In Section 5, we apply the stacking technique to the Planck protocluster candidates, analyzing the average radial profiles and total photometry.We build SEDs from measurements of the light from the stacked protoclusters.Section 6 discusses our results and Section 7 presents our conclusions.Throughout this work, we adopt the standard cosmology (Ω M , Ω Λ , h) = (0.3, 0.7, 0.70) and a Chabrier (2003) initial mass function (IMF).

Protocluster Sample
Our sample consists of protocluster candidates from Planck Collaboration et al. (2015) that were selected as "cold" sources in the cosmic infrared background from Planck High Frequency Instrument (HFI) maps or selected from the Planck Catalogue of Compact Sources (PCCS: Planck Collaboration et al. 2014).A "cold" source has a redder color, peaking between 353 and 857 GHz, which corresponds to a cold dust temperature or a high redshift.We hereafter refer to this sample as the PC15 protocluster candidates.
According to Planck Collaboration et al. (2016) the sources in the Planck High-z (PHz) catalog, which were selected similarly to the PC15 protocluster candidates, have far-infrared luminosities ranging from 1 − 10 × 10 14 L ⊙ and a surface density of 0.2 deg −2 .A cold point source identified with Planck, which has an extent smaller than the 4 ′ .5 Planck beam, is likely an overdensity of unresolved dusty sources or a strongly lensed galaxy (Greenslade et al. 2018;Negrello et al. 2005Negrello et al. , 2017;;Gouin et al. 2022).As described in Planck Collaboration et al. (2015), 228 Planck sources identified were followed up with Herschel SPIRE, which has a smaller beamsize than Planck and can resolve the Planck sources into overdensities or single lensed galaxies.Of the 228 sources, 12 are strongly lensed galaxies confirmed by spectroscopic follow-up, 4 are dominated by Galactic cirrus and the remaining 212 are robust protocluster candidates.Stacked observations of these protocluster candidates show significant and extended emission that extends larger than the Planck beam (Planck Collaboration et al. 2015).We assume a redshift of z = 2 for the PC15 protoclusters, since this is consistent with the colors of individual SPIRE sources, assuming a dust temperature of 35 K (Figure 13, Planck Collaboration et al. 2015).
51 protocluster candidates from both the PC15 protocluster candidates as well as other protocluster candidates from the PHz catalog were followed up with observations by SCUBA-2 on the JCMT (MacKenzie et al. 2017).They find that within the combined 1.20 deg 2 of these 51 fields there is an enhanced source density of S 850 > 8 mJy sources relative to the field.Cheng et al. (2020) individually analyze 46 of these 51 fields and classify 25 as significantly overdense, 11 as slightly overdense, and 10 as not overdense.Martinache et al. (2018) follow up 83 of the PC15 and PHz protocluster candidates with Spitzer IRAC and determine that around 46% of the fields are 3σ overdense compared to the field.Three individual Planck protocluster candidates were spectroscopically observed to identify member galaxies (Flores-Cacho et al. 2016;Kneissl et al. 2019;Polletta et al. 2021).One of these protocluster candidates is observed to be an overdensity of Hα-emitter galaxies (Koyama et al. 2021).Polletta et al. (2022) follows up 18 PHz fields with IRAM-30m/EMIR to detect CO emission from individual galaxies.Out of 8 IRAM/EMIR fields with successfully measured redshifts, half contain 2-3 galaxies each with similar redshifts.In summary, follow-up observations of Planck protocluster candidates so far indicate that at least half correspond to overdense fields.
We use the masked coadds in which outlier pixels are rejected when combining the images.We download tiles from the unWISE website that contain the center of each protocluster candidate.Each WISE tile is 2048 pixels on a side, covering a 1.56 • × 1.56 • area.
We determine which tile is best to download for each protocluster by identifying the tile with the smallest separation between the tile center and the protocluster center.We inspect the unWISE tiles containing the protocluster candidates.Some lie near the intersection of two or four adjacent WISE tiles.In principle, all of these frames contain useful data.However, using multiple tiles for a given protocluster can introduce unwanted systematic effects.As each WISE tile is projected with its own field center as a tangent point, simply coadding two adjacent tiles would artificially decrease the sky noise.In addition, the unWISE data processing includes its background subtraction (Lang et al. 2016) and these overlapping tiles often have different background levels, which can potentially dilute the low-level signal.
To minimize this effect, we reject the unWISE tiles if the center of a given cluster lies too close to the image boundaries.We require that a protocluster center lies within 0.7 • in both X and Y directions from the center of the tile, ensuring that it is at least 7.5 ′ away from the image boundary.20 protoclusters do not meet this criterion and thus are removed from subsequent analysis.Additionally, our visual inspection reveals that some frames contain a star or an image artifact that is too close to the protocluster.In a handful of frames, one bright star dominates a substantial fraction of the frame with its scattered light, raising the background level for a much wider surrounding area, which could mimic the protocluster signal.All in all, 59 protoclusters are removed from the WISE analysis leaving 152 protoclusters.We do not expect the removal of any frames to bias the WISE image stacks since the properties of the removed protoclusters are not intrinsically different from the rest of the sample.

PROTOCLUSTER STACKING METHOD
For our analysis we use the same total light stacking method developed in Alberts et al. (2021) which captures detected cluster members as well as undetected lower luminosity cluster members and/or intra-cluster dust (ICD).In this paper, we stack WISE and SPIRE images of the PC15 protocluster candidates.We create cutouts around each of the PC15 protocluster candidates and stack the cutouts to measure the average signal from all constituents within a sample of protocluster candidates.In this section, we describe the processing of the WISE and Herschel data and our stacking analysis.

WISE Image Processing
We first remove all individually detected sources from these images, as we expect most detected sources to be foreground galaxies and stars, since only very massive galaxies would be detected in WISE at z = 2.The 5σ limiting magnitude of the unWISE W 1, W 2, W 3, and W 4 data is 19.6, 19.3, 16.8, and 14.7 AB, respectively (Wright et al. 2010;Cutri et al. 2013;Meisner et al. 2017a).Assuming the Chabrier IMF and the formation redshift of z f = 6, a WISE source at z = 2 detected in W1 with S/N=5 is expected to have a stellar mass of ≈ 4 × 10 11 M ⊙ , 3 × 10 11 M ⊙ , and 2.2 × 10 11 M ⊙ for an exponentially decaying star formation history (SFH) with τ = 0.1, 1.0 Gyr, and a constant SFH, respectively.The use of the Salpeter IMF would increase the mass by 50-60% while changing the formation redshift to z f = 4 would decrease the mass by up to 30%.Given everything equal, a higher source redshift would also increase the mass of the detected galaxy.
Based on studies of DSFG overdense protoclusters located at z = 2−3 (Polletta et al. 2021;Casey et al. 2015;Kubo et al. 2015), we can expect that there are around 3 spectroscopically confirmed galaxies per protocluster field with WISE detections and stellar masses greater than 10 11 M ⊙ .By masking out the WISE detections in our analysis, we might be losing the flux from at most 3 galaxies per protocluster field, which does not affect the stellar mass derived significantly (see Section 5.2).
In order to isolate the much fainter signal from the protoclusters, we create pixel masks as follows.First, we use the SExtractor software (Bertin & Arnouts 1996, SE, hereafter) to detect sources in each WISE tile.We use a DETECT THRESH of 3.0 and a DETECT MINAREA of 3 pixels, requiring the isophotal signal-to-noise ratio to be 5 or higher.While changing the detection setting can alter the overall background level in the final stacked image, our results are insensitive to such changes.The detection check image is generated as a segmentation map that tags all pixels belonging to an individual source.
Second, we estimate a constant background of each 1.5 • × 1.5 • WISE tile and subtract it from the image.This step is different from the approach taken by Alberts et al. (2021), in which we used the SE-generated, spatially varying background map with BACK SIZE of 10 ′ .At z = 1.0 (1.5), the virial radius of the galaxy clusters studied by Alberts et al. (2021) is 1 Mpc corresponding to 2.0 ′ (1.9 ′ ), i.e, much smaller than the scale of 10 ′ at which the background is allowed to vary.In comparison, unvirialized protoclusters can have half-light radii up to 10 cMpc (6.5 ′ at z = 2: e.g., Chiang et al. 2013;Muldrew et al. 2015).Although the fraction of pixels in a 1.5 • × 1.5 • WISE tile containing a single protocluster is still expected to be small, allowing the background to vary within the tile could remove the low-level protocluster signal.To avoid this effect, we estimate the sigma-clipped mode value and subtract it from the entire image.
Finally, using the segmentation map we mask all pixels that belong to individually detected WISE sources and saturated pixels.The masked pixels are flagged as NaN.The mean (median) fraction of pixels flagged as NaN for the full sample is 31% (26%) with a standard deviation of 13% for W 1. As for W 4, the number is 13 % (7%).

WISE Image Stacking
For each protocluster, we create a 901×901 pixel image (41 ′ on a side) centered on its position using the IDL routine hastrom.pro.All pixels in the no-data region are replaced with NaN.We then convert counts to physical units and store the resampled image of each protocluster in a three-dimensional datacube.We create a weighted mean stack of the datacube where we use the inverse variance of the pixel-wise rms as the weight of each protocluster image.
The stacked image inherently contains residual background which primarily originates from faint sources that are not masked prior to stacking.Additionally, the averaging of > 100 images reveals the low-level residual sky that had not been properly accounted for at the initial background subtraction step.We estimate the constant background in the annular bins in the range of 9 ′ -12 ′ .At z = 2 (3), the range corresponds to 14-18 cMpc (17-23 cMpc), and thus the range ensures that the background is estimated safely away from the expected extent of protoclusters.
It is worth stressing that rigorous sky subtraction is key to obtaining a robust stacking result.It is particularly important for protocluster analyses not only because the expected signal is diffuse but also because protoclusters are very extended in the sky.The use of an SE-generated spatially-varying background map for background subtraction instead of a constant back-ground can considerably alter the signal.Setting the spatial filter size (set by the BACK SIZE parameter in SE) to be larger than the expected protocluster size mitigates the problem.Any filter smaller than ≈ 15 ′ has the potential to remove the low-level signal from the protocluster and to alter the light profile.
Still, the resultant stacked images suggest that the background may have been oversubtracted.Visual inspection of the stacked image -most prominent in W 1shows a slight increase in the background level at > 12 ′ from cluster center.This upturn does not depend on the manner in which we perform sky subtraction including the choice of BACK SIZE values.Thus, we speculate that it is a feature already present in the unWISE tiles.As described in Lang (2014), for each of the individual un-WISE frames, a constant sky background was estimated -as a mode of the pixel histogram -and taken out.The fraction of pixels containing protocluster signal varies with the position of the protocluster and its brightness, but it is conceivable that the background level may have been overestimated and thus oversubtracted.Since our analyses make use of the final unWISE coadds, it is not possible for us to correct individual WISE exposures for this effect nor can we track down the exact cause.
However, this oversubtraction is unlikely to affect our ability to measure the surface brightness profile or the total flux originating from protoclusters.Because a constant sky level is subtracted from each tile (during the unWISE processing as well as our own), the shape of the profile on individual protoclusters is preserved while the background level is likely set below the true value.Thus, if we set our 'sky' baseline at this level, it is possible to recover the true total flux.In practice, this is readily achieved in all bands by requiring that, at radii larger than the expected protocluster size, the surface brightness profile asymptotes to zero, or alternatively, the cumulative flux plateaus to a constant value.
Following Alberts et al. (2021), we create stacks from random positions to test the robustness of the total light cluster stacking method.For each of the protoclusters, we pick a random position within the same unWISE tile (1.5 • on a side) up to 0.7 • away from its center in both X and Y directions and perform image stacking using the identical procedure as before.The result is shown in Figure 1.While these "off-cluster" stacks indicate no detection at their centers, the noise properties are reassuringly similar to those of the protocluster stacks.For each band, we create 500 off-cluster stacks and measure the mode of the pixel distribution from each image as an estimate of the background value and its standard deviation, σ sys , is recorded.

Herschel/SPIRE Image Processing
Prior to stacking, we do not perform any additional processing of the SPIRE images.The pipeline-processed images are already background subtracted and have a mean of zero.We do, however, account for background over-subtraction after stacking as described in Appendix A. This over-subtraction is due to the large relative size of the protoclusters to the images.
Given the SPIRE confusion limit, the sources detected individually in the SPIRE images have flux densities ≳ 20 mJy.The number of low-redshift (z < 2) sources at this FIR flux level is low (Béthermin et al. 2012); in turn, a significant fraction of the detected SPIRE sources are expected to lie at z ≳ 2 and possibly belong to PC15 protoclusters.For this reason, we do not mask any SPIRE detection prior to our stacking analysis.
Additionally, at these wavelength bands, the images are less dominated by foreground galaxies.Alberts et al. (2021) reported that field galaxies do not contribute to the stacked signal in the SPIRE bands by stacking cutouts that were offset from the clusters by a random amount.We are unable to perform a similar test as the SPIRE data we do not provide a sufficiently large areal coverage to produce 'off-protocluster' stacks.

Herschel/SPIRE Stacking
We stack the 211 protocluster candidates in the three SPIRE bands, by constructing a 3D datacube that is the size of the smallest available image dimensions for each band.The cubes are 26.5 ′ ×24.0 ′ for the 250 µm band, 26.6 ′ ×24.2 ′ for the 350 µm band, and 26.8 ′ ×24.0 ′ for the 500 µm band.We randomly rotate the individual cutouts in 90 • steps before placing them in the data cube in order to avoid systematic effects from the mapping since the Herschel scan pattern can result in offsets in the stack centers (Alberts et al. 2021).We then take the inverse variance weighted mean across each pixel of the cutouts.
We compute the root-mean-square (RMS) fluctuation of the pixel distribution for each image.The average RMS over each protocluster cutout ranges from 9.3-10.9mJy/beam at 250 µm.We do not perform sigma clipping to remove outliers because the individual noise maps are smooth and do not vary significantly.
We calculate the stacked emission in each pixel according to: where S i,j is a pixel in the stacked image, I i,j,k is a pixel in each cutout image, σ i,j,k is the error for each pixel of each cutout image.We sum over the running index k, which is the number of image cutouts.
In Figure 1 we show the protocluster stacks for the SPIRE 250, 350, and 500 µm bands.The protocluster signal is clearly detected in all bands and extends to a radius of at least 6 ′ (9.0-12.5 cMpc at z = 2 − 4).

Photometry of WISE and Herschel Stacked Images
We perform aperture photometry on the cluster stacks as follows.In all bands, we make the cumulative flux and the surface brightness measurements in a series of circular apertures and annuli with a step size that is equal to the beam FWHM for each band.We assume the center of the image stack to be the center of the targeted SPIRE observations.For the WISE stacks, we subtract the image by a constant background estimated at 10 ′ -15 ′ away from the image center.For the SPIRE stacks we add to the stacked image the flux that was oversubtracted as described in Appendix A.
The cumulative flux is a sum of all pixel values within the given aperture while the radial surface brightness is measured as the mean of all pixels within a given annulus.For the SPIRE bands, the photometric uncertainties are estimated from 1,000 stacks created by bootstrapping in which a random subset of protoclusters are selected and stacked.For the WISE bands, we add the background uncertainties calculated in Section 3.2 to the total uncertainty.
We compare the total protocluster flux measured in the SPIRE 350 µm and SPIRE 500 µm bands to the unresolved fluxes measured by Planck for a subset of 78 PC15 sources within the PHz catalog.The average Planck fluxes of a subset of the protoclusters have larger uncertainties but are consistent with stacked SPIRE signal.
We also compare the total flux of the stacked PC15 sample to the average flux from individually detected sources in the SPIRE images.We find that on average ∼ 45% of the total 250 µm stacked flux is from individually detected sources, although some fraction of the flux is expected to be from line of sight interlopers (see Section 4).

Redshift Subsamples
The PC15 protoclusters are expected to span a wide redshift range of z=2-4.We subdivide the sample by redshift based on the flux ratios of the total SPIRE flux in different bands within a 7 ′ aperture.We calculate the ratios of the total flux F 500 /F 350 and F 250 /F 350 as done in Planck Collaboration et al. (2015).Seven protocluster candidates are removed due to being individually undetected in one of the SPIRE bands.
At higher redshift, the far-IR SED peak shifts to longer wavelengths, and as a result, galaxies are expected to have a higher F 500 /F 350 ratio and a lower F 250 /F 350 ratio than the lower-redshift counterparts.We define our higher-z galaxy subsample as those with colors F 500 /F 350 > F 250 /F 350 .The remainder (with colors F 500 /F 350 < F 250 /F 350 ) are selected as the lower-z subsample.This criterion results in 57 and 147 protocluster candidates in the higher-and lower-z bins, respectively.We calculate the fluxes in each of the bands for the subsample stacks and include them in Table 1.

CORRECTION FOR LINE OF SIGHT INTERLOPERS
Sources selected by instruments with low resolution, such as single-dish far-infrared instruments, suffer from contamination and subsequent flux boosting by line of sight interlopers (Hodge et al. 2013).Indeed, Negrello et al. (2017) showed that the number density of PC15 protocluster candidates is much higher than the theoretical expectation; the implication is that a large fraction of PC15 protocluster candidates may be superpositions of multiple, unassociated protoclusters along the sightline.Gouin et al. (2022) further measures the contamination along the line of sight within PC15 protoclusters in the IllustrisTNG simulation.While the manner in which star formation occurs in dense protoclusters is not well understood (Lim et al. 2021) and may vary greatly from the expectation of dusty star formation in an average-density environment, these considerations stress the importance of properly accounting for this effect and of correcting our measurements.
In order to realistically simulate the largest cosmic structures at high redshift, we use the halo catalog from the WebSky Simulation (Stein et al. 2020).Websky is created to provide accurate extragalactic mocks to be compared against the current and next-generation cosmic microwave background observations covering large areas of the sky.By providing ≈ 9 × 10 8 halos in the entire sky with the minimum halo mass 1.2 × 10 12 M ⊙ out to z = 4, the Websky catalog can account for the largescale density fluctuations, and thus is ideal to model the all-sky Planck data and the selection of massive protoclusters therein.
To assign an infrared luminosity to each halo, we use the L IR − M h − z calibration empirically obtained by Wu et al. (2018, Equation 32) assuming a 0.25 dex scatter.We model the redshift selection function by taking the N (z) predictions presented in Figure 13 of Planck Collaboration et al. (2015) and fitting them into a smooth curve with the Equation 7 of Schmidt et al. (2015).These models reflect the Planck color selection corresponding to five dust temperature scenarios ranging from 25-45 K and dust emissivity β = 1.5.While we take the 35 K case as our fiducial model, we compute our results in all cases.
All halo positions are projected onto the entire sky and their fluxes are pixelated by matching to the Planck beamsize of 4.5 ′ .The 8,251 brightest IR sources are then selected above a flux density threshold, determined to satisfy the surface density of PC15 protocluster candidates of 0.2 deg −2 (Planck Collaboration et al. 2016).For each of these sources, we construct the IR luminosity contributed by different halos as a function of redshift.In Figure 2, we illustrate example sightlines: the top two panels show the sightlines in which a single structure makes the highest contribution to the total flux.The bottom panels show two randomly chosen structures selected as mock 'PC15 protocluster candidates'.
We do not directly compare the IR luminosities of these simulated PC15 protoclusters with those of the data.WebSky lacks the mass resolution to properly model star formation in halos or to identify all halos.Additionally, observed star formation in protoclusters cannot be reproduced in higher-resolution hydrodynamical simulations (e.g.Lim et al. 2021).Regardless, the abundance matching technique we employ in this work should offer a robust way to identify structures comparable to the PC15 protoclusters provided that halo mass and IR luminosity are broadly correlated as observed (Wu et al. 2018).
The fractional contribution from the dominant structure in each sightline is computed as where B is the background signal, F tot is the total flux density summed over the redshift selection function, and F 1 is the flux density from the dominant structure.The background signal is measured by averaging over all Planck-beam-sized pixels.This step of taking out the background signal is appropriate as it mirrors the sky subtraction procedure adopted in our observational measurement (Section 3.2).This analysis finds, on average, that the contribution from a single protocluster in a Planck-sized beam is 33±15%, i.e., about one third of the total flux.This is in broad agreement with Negrello et al. (2017) in that the interloper contribution is non-negligible and Lammers et al. (2022), who find that almost all simulated protocluster candidates have at least two physically unassociated structures along the line of sight.Our analysis is relatively insensitive to the assumed T dust and scatter in the L IR − M h relation.Assuming T dust =25, 35, and 45 K while keeping the scatter to 0.25 dex, f changes from 40 ± 19, 33 ± 15, and 34 ± 16%, respectively.Similarly, changing the scatter to 0.15, 0.25, and 0.35 dex results in 30±12, 33±15, and 41±22%, respectively.Assuming warmer dust leads to lower interloper fractions while a higher scatter increases the interloper fraction.In all cases, the change is minor compared to the intrinsic spread in the fraction of emission originating from Poisson fluctuations.

Total Cluster Light: Radial Surface Brightness
Profiles and Fluxes of Planck Protoclusters In Figure 3, we show the radial surface brightness profiles of the protoclusters.In all bands, the shapes of the profiles are similar and extend out to 6 ′ -8 ′ from the center.The Planck beamsize of the shortest wavelength band detection band (4.5 ′ at 875 GHz) is indicated as a dotted line in each panel.We compare the radial surface brightness profiles to the Planck beam to verify that the emission is more extended than the Planck beam and therefore resolved.
Some of the WISE radial profiles show dips within the innermost region (r < 1 ′ ).This is because our annular bins are centered on the image center (i.e., the center of the Planck detections) for consistency.Given the large beamsize of Planck, we expect that the center of each protocluster may be offset within the limit, and as a result, the stacked signal may be slightly offset as well (see Appendix B).In addition, we note that the measurements from the innermost annular bins contain a much smaller number of pixels than measurements from the outer bins, and thus are noisier as reflected by the associated error bars.
To measure total cluster light at each band, we perform aperture photometry of the image stack using a circular aperture with a radius of 7 ′ .This corresponds to 9.0-12.5 cMpc at z = 2 − 4, much larger than the expected size of protoclusters (Chiang et al. 2013).As mentioned in Section 3, the surface brightness profiles level off at zero for most bands.
The stacked total protocluster fluxes and associated uncertainties are listed in  ing.We repeat our Herschel stacking and flux measurements on the 152 systems and find that doing so does not change the stacked average flux within the errors for all three SPIRE bands.This is not surprising considering that the removal of a given protocluster is determined by the WISE tiling scheme and the positions of bright stars in and around the WISE images and not by the properties of protoclusters.We assume the Herschel observation centers on the most massive halo within (an asymmetrical) protocluster.Since we are stacking, the random asymmetries of individual protoclusters are then washed out.However, because of the positional uncertainties of protoclusters due to the large Planck beam, the average radial profiles of the protoclusters in the stack smear out.In Appendix B, we quantify this effect by approximating the protoclusters as 2D gaussian distributions and shifting the centers.Our result shows that the effect of positional uncertainty is small.

Spectral Energy Distribution
Using the flux measurements made in WISE and SPIRE bands, we fit the spectral energy distribution (SED) fitting to derive the average physical properties  1.Total stacked fluxes of the PC15 protoclusters in units of mJy within a circular aperture with a radius 7 ′ (3.5 cMpc at z = 2).For the higher-z sample, we do not detect any flux in W 3 and list the 3σ upper limit.Uncertainties are calculated by bootstrapping.The fluxes presented in this table are not corrected for line of sight interlopers (see Section 4).
of the PC15 protoclusters.Since the interloper contamination rate estimated in Section 4 is not wavelengthdependent, we first run the SED fitting using the photometry measured on the image stack and correct the derived quantities after.The W 3 and W 4 measurements are excluded from the SED fitting since they sample the region of Polycyclic Aromatic Hydrocarbon (PAH) emission for galaxies at z ∼ 2; the spread of the protoclusters in redshift space will cause significant scatter in W 3 and W 4 due to PAH emission lines moving in and out of the band (e.g Alberts et al. 2021).Though we do not use the W 3 and W 4 bands for the SED fitting, they strongly indicate the presence of PAH features with a strength consistent with galaxy templates as shown in Figure 4. We fit the observed SED to model SEDs using CIGALE, a Bayesian SED modeling code that employs multi-wavelength energy balance (Burgarella et al. 2005;Boquien et al. 2019;Noll et al. 2009).We parameterize the dust emission following Casey (2012), using the casey2012 module within CIGALE, as a singletemperature modified blackbody which traces cold dust emission from the reprocessed light from young stars, and a mid-IR power law which traces warmer dust emission from starbursts and/or AGN.We allow the dust temperature to vary from 20 K to 60 K while the dust emissivity index (β = 1.5) and the mid-IR power-law index (α = 2.0) are fixed.
We model the star formation history (SFH) from z = 15 up to the assumed protocluster redshift of z = 2.0, 2.5, and 3.0.The expectation is that a protocluster SFH should increase with time at a rate commensurate with the cosmic star formation rate density (SFRD), which Madau & Dickinson (2014) parametrized as: in their Equation 15.Indeed, semi-analytic models predict that the cosmic SFRD contributed by protoclusters has a similar shape as the field counterpart (see Figures 4 and 5 in Chiang et al. 2017).For this reason, we adopt Equation 2 as our custom SFH in CIGALE using the sfhfromfile module.However, we caution that the true SFH of the protoclusters can be different from Equation 2and we use this assumption since we do not have enough photometric data points to constrain the SFH.
We use the bc03 module based on the Bruzual & Charlot ( 2003) stellar population synthesis model, assuming solar metallicity and the Chabrier (2003) IMF.We use the dustatt powerlaw module with an implementation of the Charlot & Fall (2000) attenuation law with both young and old stellar populations.We vary the V band attenuation for the young stellar population from 0 to 20 while keeping the ratio of the attenuation for the oldand young populations constant at 0.44.We set the amplitude of the UV bump to zero and assume a power-law slope of −0.7.The resulting best fit CIGALE SED for the full sample assuming z = 2 is shown in Figure 4.
We calculate SFR IR from the the best-fit total infrared luminosity, L IR , following the calibration of Murphy et al. (2011): SFR IR = 1.5 × 10 −10 L IR .Since the SFR calculation within CIGALE is sensitive to the assumed star formation history, we use an empirical calibration.Assuming z = 2, the best-fit model for the full sample yields a stellar mass of 4.9 ± 2.2 × 10 12 M ⊙ and an SFR of 7.3 ± 3.2 × 10 3 M ⊙ yr −1 after applying the interloper correction factor of 33 ± 15 %.The SED parameters for the full sample derived with and without correction are listed in Table 2.As we describe in Section 3.1, masking out detected WISE sources would only underestimate the stellar mass by at most 6%.
Though it is unphysical to assume that protocluster SFH remains constant or declines exponentially with time, we repeat the SED fitting with these standard assumptions.Compared to our fiducial model, a constant SFH model leads to a decrease in stellar mass by a factor of ≈ 2 while the best-fit exponentially decaying model results in an increase of stellar mass by up to 50%.SFRs remain unchanged within a few percent regardless of the assumed SFHs.

Redshift Evolution within the Sample
We look for redshift evolution within the protocluster sample, by subdividing the full sample based on SPIRE colors as discussed in Section 3.6.We fit the SEDs of the subsamples with CIGALE, using the identical setting described in Section 5.1 other than the fact that we assume z = 2 and z = 3 for the two subsamples.We estimate the redshifts of the subsamples based on the SPIRE color-color plot in Figure 8 from Planck Collaboration et al. (2015).The results of the SED fitting are presented in Table 2. Our results suggest that galaxies in higher-redshift protoclusters have higher specific star formation rates (2.9±0.3×10−9 yr −1 ) than those in their lower-redshift counterparts (1.5±0.1×10−9 yr −1 ), consistent with the trend seen in field galaxies (Speagle et al. 2014).

The Spatial Extent of Stellar Light and Dusty Star Formation in Protoclusters
We compare the normalized radial surface brightness profiles of the stacks in the WISE W 1 and W 2 bands and in the SPIRE bands in Figure 3 (bottom right panel).The WISE W 1 and W 2 bands trace the stellar light, while the SPIRE bands trace the dust emission.We determine the half-light radius of each of the surface brightness profiles by fitting a 1D Gaussian model to each profile and we list them in Table 3.We measure the half-light radii to be around 2.8 ′ (corresponding to 4.2-5.8cMpc at z = 2 − 4).The angular extent of the emission in all of these bands is similar, although the radial surface brightness profiles in the WISE bands appear to have a slightly larger extent.We cannot say if this is because the stellar light traces a larger area than the dust emission or if this is a systematic effect due to the lower signal-to-noise ratio of the WISE stacks.We also determine the half-light radii for the lower-z and higher-z subsamples and, when detected, find them to be consistent with the full sample.
The half-light radii we measure may be overestimated due to positional uncertainty given the Planck beamsize.In Appendix B, we simulate this effect and find that a simulated protocluster light profile has a FWHM that is on average 14% smaller in all bands than the measured FWHM of the protocluster profiles.We list the corrected half-light radii in Table 3. Chiang et al. (2013) use numerical simulations to predict the extent of protoclusters at different redshifts for structures that end up with halo masses ranging from 10 14 M ⊙ to 10 15 M ⊙ at z = 0.As we discuss in Section 6.4, we estimate that the PC15 protoclusters represent Coma-cluster progenitors, with halo masses greater than ≈ 10 15 M ⊙ .The effective radii of these simulated protoclusters correspond to the angular radii of 3 ′ -7 ′ , broadly consistent with our measured half-light radii.Lammers et al. (2022) also perform a stacking analysis of the PC15 protocluster candidates, although they center each image on the brightest SPIRE source prior to stacking.This will cause the radial profile to be a combination of a bright central point source and extended emission.Ignoring the bright central point source, the Lammers et al. (2022) stacks show extended emission out to a radius of around 5 ′ where the radial profile flux decreases to zero, similar to our results (∼7 ′ as shown in Figure 3).We compare our results to the star formation rate estimated for another sample of protoclusters from Kubo et al. (2019), which are selected as overdensities of Lyman Break Galaxies (LBGs) identified from the Hyper Suprime-Cam Subaru Strategic Program (Aihara et al. 2018).The protocluster sample from Kubo et al. ( 2019) is located at z ∼ 3.8, have a number density of n = 4.6 × 10 −7 h 3 Mpc −3 and an average halo mass of M ∼ (0.8 − 1.1) × 10 13 h −1 M ⊙ (Toshikawa et al. 2018).Kubo et al. (2019) employed a similar stacking technique to our analysis, which should capture the total light from the protoclusters.We compare the restframe average SED of the PC15 protocluster candidates to that of the protocluster candidates from Kubo et al. (2019) in Figure 4 and find that the SEDs peak at different wavelengths.The SED of the protocluster candidates from Kubo et al. (2019) implies much hotter dust, which is inconsistent with a purely star-formation dominated SED.Kubo et al. (2019) fit a composite starforming and AGN template and obtain a total SFR of 2.1 +6.3  −1.7 × 10 3 M ⊙ yr −1 , which is consistent with the average star-formation rate we calculate for the PC15 protocluster candidates, indicating that both samples have large amounts of star formation, but the Kubo et al. (2019) sample potentially has a large AGN contribution to its dust emission.
there is little evidence for a dominant AGN contribution to the average SED within the PC15 protocluster sample.The lack of warm dust inferred from our SED fitting may be a selection effect from the Planck color selection.

Fraction of Flux from Bright DSFGs
We estimate the fraction of the total SFR and stellar mass of the PC15 protoclusters from galaxies which are undetected on the SPIRE images by comparing the total SFR and stellar mass from our total light stacking to the sum of the SFR and stellar mass from spectroscopicallyconfirmed bright DSFGs in protoclusters: four protoclusters at z ∼ 2 (Casey 2016), one Planck selected pro- M a i n S e q u e n c e a t z = 2 This Work (Total Light) Summed DSFGS in Individual Protoclusters (z ∼ 2) Summed DSFGS in Individual Protoclusters (z ∼ 4) Clusters (z = 1.4;Total Light) Figure 5.Total star formation rates as a function of total stellar masses for different studies of protoclusters and clusters.Red indicates our total light measurement for PC15 protoclusters corrected for LOS interlopers.We also plot individual protoclusters (Casey 2016) that contain DSFGs (square points), and a confirmed Planck-selected protocluster from the PHz catalog (Polletta et al. 2021) indicated by the circular point.We also show z ∼ 4 protoclusters (Long et al. 2020;Hill et al. 2020Hill et al. , 2022) ) and the z = 1.4 cluster sample from Alberts et al. (2021).The main sequence at z = 2 from Speagle et al. (2014) Table 3. Half-light radii from the radial surface brightness profiles in each band of the PC15 protoclusters, corrected for positional uncertainty in the Planck beam as described in Appendix B. The measured half-light radii are indicated in brackets.The uncertainties range from 0.05 ′ to 0.3 ′ .For the higher-z sub-sample, we do not detect any flux in W 3 and do not measure a robust radial profile in W 4.
tocluster at z = 2.16 (Polletta et al. 2021) and two protoclusters at z ∼ 4 (Long et al. 2020;Hill et al. 2020Hill et al. , 2022)).The individual protoclusters have halo masses greater than 10 13 M ⊙ at their respective redshifts, which places them on similar evolutionary tracks as the PC15 protoclusters (Chiang et al. 2013).We define low luminosity galaxies as those not detectable individually on the SPIRE images, corresponding to the flux F 250 < 20 mJy at 250 µm (L IR ∼ 3 × 10 12 L ⊙ assuming a star-forming galaxy template at z = 2 from Kirkpatrick et al. 2012).
In Figure 5, we plot the total stellar masses and star formation rates calculated for the PC15 protocluster sample in comparison with summed stellar mass and star formation rate values from the literature of individual spectroscopically confirmed DSFGs in protoclusters.Our stacking analysis of the PC15 protoclusters finds two times more SFR and four times more stellar mass than the sum of the SFRs and stellar masses of sources Figure 6.The star formation rate density (SFRD) of the PC15 protocluster candidates, including lower-z and higherz subsamples, and the LBG selected Hyper Suprime-Cam protocluster candidates (Kubo et al. 2019) in comparison to the projected SFRD for protoclusters with a z = 0 halo mass of log M200/M⊙ ∼ 14 − 15 scaled from Chiang et al. (2017) and the observational measurements of field galaxies (Madau & Dickinson 2014) This figure is adapted from Alberts & Noble (2022).
. detected individually in DSFG overdense protoclusters.This suggests that much of the light from protoclusters comes from less luminous member galaxies, which is also consistent with studies of lower-z galaxy clusters (Alberts et al. 2021;McKinney et al. 2022).
We find that around 50% of far-IR light within our protocluster sample could come from detected DSFGs.In blank fields, Béthermin et al. (2012) reported that DSFGs with fluxes greater than 20 mJy account for only 15% of the integrated light in the 250 µm band.The threshold of 20 mJy at 250 µm in Béthermin et al. (2012) is chosen to correspond to sources detected with ∼ 3 − 4σ, where σ includes confusion, in SPIRE 250 µm maps.This is roughly the same limit as the submm followup of the DSFGs with F 250 µm ≥ 20 mJy in protoclusters.Our results indicate that the PC15 protocluster candidates have an abundance of luminous DSFGs relative to the field.

Star Formation Rate Density of Protoclusters
Given the total star formation that we measure for the PC15 protocluster sample, we calculate the star for- mation rate density (SFRD) of the PC15 protoclusters assuming that a protocluster volume is approximated by a sphere of a radius 10.5 cMpc at z = 2 (corresponding to the 7 ′ aperture in which SFR and stellar mass are measured).We include the lower-z and higher-z subsamples of the PC15 protoclusters discussed in Section 5.3, which are consistent with the same SFRD given the uncertainties.
We also compare to the LBG-selected protocluster candidates from Kubo et al. (2019), since this is the only other protocluster sample that has been analyzed through total light stacking.Though the LBG-selected protoclusters are not expected to be the progenitors of the PC15 protoclusters, we include them as a reference point to show the evolution of different protocluster populations.We calculate the SFRD of the LBG-selected protocluster candidates where we assume a spherical volume with a radius of 10.2 cMpc, as determined by a radius of 5 ′ at z = 3.8.
The result is illustrated in Figure 6 together with the cosmic SFRD of the field (Madau & Dickinson 2014).We also show the Chiang et al. (2017) predictions based on semianalytic models, which show the evolution of galaxy clusters at z = 0 with halo masses greater than 10 14 M ⊙ (the median mass is M 200 = 10 14.3 M ⊙ ).
Our SFRD measurement lies an order of magnitude above the field expectation.The SFRD values from both the PC15 protoclusters and the LBG-selected protoclusters are consistent with the theoretical predictions of Chiang et al. (2017).

Evolution from Protoclusters to Clusters
Following the method used by Long et al. (2020), we estimate the average mass of the dark matter halos hosting the PC15 protoclusters by converting the total stellar mass to a total mass by assuming a constant stellarto-halo-mass ratio, M * /M halo .
Motivated by Behroozi et al. (2013, see their Figure 7b), we adopt M * /M halo = 0.01, representative of ≈ 10 11 M ⊙ halos hosting galaxies with stellar masses ≈ 10 9 M ⊙ at z ∼ 2. Recent surveys showed that the galaxy stellar mass function increases steeply towards the low-mass end (with α ≈ −1.6: e.g., Santini et al. 2022); as a result, the stellar mass budget is dominated by low-mass galaxies.We then calculate the total halo mass of PC15 protoclusters to be 4.9 ± 2.2 × 10 14 M ⊙ .
This estimate is uncertain as it sensitively depends on the adopted M * /M halo value and the line of sight interloper correction.The former is known to vary with a galaxy's stellar mass (Behroozi et al. 2013) while the latter is based on the empirical 'field' expectation of how halo mass correlates with infrared luminosity (see Section 4).It is also conceivable that both relations may scale differently in protocluster environments.Still, the total halo mass of PC15 protoclusters is consistent with that of Coma progenitors at z = 2 − 4 (Chiang et al. 2013), reinforcing the notion that they represent the most massive cosmic structures in the universe.
We compare the spatial extents of the PC15 protoclusters to those obtained by the stacked observations of galaxy clusters (log M 200 /M ⊙ ∼ 13.8) in the Boötes field from Alberts et al. (2021).The halo mass of the PC15 protoclusters is estimated to be much greater than the halo mass determined for the Boötes clusters.The most massive of the stacked Boötes clusters have average stellar masses around 1.4 × 10 12 M ⊙ , while the PC15 protoclusters have an average stellar mass around 4.7 × 10 12 M ⊙ .As such, the PC15 protoclusters are much more massive than the Boötes clusters and are unlikely to be the progenitors of the Boötes clusters.However, given that this is the only cluster sample with a measurement of the total cluster light we compare the profile of the PC15 protoclusters to the Boötes clusters.
The Boötes clusters were split into four redshift bins: z = 0.5−0.7,z = 0.7−1.0,z = 1.0−1.3,and z = 1.3−1.6 and were stacked in both WISE and SPIRE using the same method that we implement.In Figure 7 we overplot the normalized radial surface brightness profiles of the Boötes clusters on top of the normalized radial profile of the protoclusters in W 1. The Boötes clusters in all redshift bins have half-light radii around 0.5 ′ (0.5 cMpc at z = 1.15), while the protoclusters have a halflight radius of around 3 ′ (4.5 cMpc at z = 2).The extent of the Boötes clusters is consistent with their virial radii.Though the PC15 protoclusters are not expected to be progenitors of the Boötes clusters, the difference in radial extent is consistent with theoretical predictions from Chiang et al. (2013), which suggest that protoclusters decrease in angular size as traced by both stellar mass and star formation as they assemble and gravitationally coalesce.

CONCLUSIONS
We study the physical properties of z = 2 − 4 protocluster candidates identified by (Planck Collaboration et al. 2015, PC15 protoclusters).We perform a stacking analysis of WISE and Herschel images of the PC15 protoclusters to study the average properties of all galaxies residing in these cosmic structures.Total cluster light stacking analyses performed in this work complement the existing spectroscopic observations of individual protoclusters which exist only for a handful of systems.
Our main conclusions are as follows: 1.The PC15 protoclusters have a similar extent in the WISE W 1 and W 2 bands and the three SPIRE bands indicating that the stellar light and the light from dust trace similar regions.
2. We fit the SED of the PC15 protocluster stacks and determine the total star formation rate and the total stellar mass.When compared to the total star formation from individual DSFGs in protoclusters we find a total SFR that is two times larger and a stellar mass that is four times larger, indicating that much of the star formation and stellar mass in protoclusters comes from galaxies with lower luminosities than DSFGs.
3. From the SED fits of the PC15 protoclusters, we do not see any evidence of additional warm dust that would indicate a significant AGN contribution, contrary to the results of (Kubo et al. 2019).This could be due to the different methods for selecting these respective protocluster samples.
4. We measure the half light radii of the protoclusters to be ∼ 2.8 ′ (corresponding to 4.2-5.8cMpc at z = 2 − 4).This angular size is consistent with the evolution of Coma-cluster progenitors from Chiang et al. (2013).The PC15 protoclusters have a more extended radial distribution than the stacked measurement of low-redshift clusters (Alberts et al. 2021), which is in qualitative agreement with the expectation that protoclusters are not yet collapsed.
The total light stacking analysis that we present can be applied to other samples of protoclusters and clusters selected by different methods.Follow-up with upcoming mm-wavelength surveys by TolTEC on the Large Millimeter Telescope will reveal the amount of dustobscured star formation within protocluster candidates.Upcoming large wide-field surveys such as the Vera C. Rubin Observatory's Legacy Survey of Space and Time, Euclid, and the Nancy Grace Roman Space Telescope will generate large samples of candidate protoclusters.A stacking analysis can compare the different populations of protocluster candidates selected by different methods, determine how much star formation there is within populations of protoclusters, and determine where the cosmic star-formation rate density of protoclusters peaks.

B. EFFECT OF PLANCK BEAMSIZE ON MEASURED PROTOCLUSTER SIZE
In determining the angular size of the Planck protoclusters, the main limiting factor is the positional uncertainty within the beamsize of the Planck High Frequency Instrument (Planck Collaboration et al. 2015), which is 4.5 ′ in the detection band.This means that the sources that contribute to the Planck flux, protoclusters and interlopers alike, are randomly offset in position relative to the center of the Planck beam and with one another.Thus, the measured angular profile of the protocluster stack is expected to be broadened by this uncertainty.
To quantify the extent of this effect and correct for it, we create simulated protocluster stacks as follows.We place 211 mock protoclusters on a two-dimensional image where each source is modeled as a 2D Gaussian with a fixed FWHM while the source position relative to the image center is determined at random following a normal distribution with the width of the Planck beam.Ten input FWHM values are used ranging from 3.5 ′ to 8.0 ′ in increments of 0.5 ′ .We then resample the image at the pixel scale of our data and convolve it with the PSF of each band to create a mock protocluster stack.The FWHM of the profile is measured using scipy.curvefit assuming a Gaussian model.
In Figure 9, we illustrate the result of our simulation for the unWISE W 1 and Herschel SPIRE 350 µm bands in orange.The one-to-one relation is shown in blue and the Planck beam FWHM is indicated by the red vertical line.The horizontal grey line indicates the protocluster FWHM as measured while the vertical dashed line and the swath, both in grey, mark the inferred FWHM of our protocluster after the correction.In all cases, the protocluster emission is clearly resolved (i.e., more extended than the Planck beam).We repeat the similar analyses in all unWISE and Herschel SPIRE bands and find that the level of broadening is 10%, 9%, 18%, and 21% for the W 1, W 2, W 3, W 4 and 18%, 16%, and 18% for the SPIRE 250, 350 and 500 µm bands, respectively.
In Section 4, we find that only ≈one third of the measured flux belongs to the 'main' protocluster while the remaining ≈two thirds come from interlopers which themselves may be (typically unassociated) smaller structures.In light of this, we run another set of simulations intended to set the most conservative upper limit in a configuration that mimics the reality.The procedure differs from the previous simulation in two aspects: first, we place the main protocluster, represented by a Gaussian model, carrying 1/3 of the flux at image center while other sources carry the rest of the flux.Second, the sources are this time distributed uniformly within the Planck beamsize.This setup results in more severe broadening of up to 35% compared to our nominal simulation (up to ≈20%), as expected.By changing the fractional contribution of interlopers and protoclusters, we find that the increased level of broadening is primarily attributed to the uniform distribution of sources within the Planck beam, which is unlikely to reflect the reality.

Figure 1 .
Figure 1.Weighted mean stacks of the PC15 protoclusters showing extended emission from 3.4 µm -500 µm.The top panel shows the stacks and the offset stacks in the WISE bands.The offset stacks are created to verify that the extended emission does not appear when stacking random areas of the sky background.The bottom panel shows the stacks in the SPIRE bands.The size of the image point spread function FWHM is indicated by a red circle at the bottom left corner of each image.The measured half-light radii of the protoclusters are indicated by the dashed red circles.

Figure 2 .
Figure 2. IR flux distribution of cosmic structures constructed from the WebSky simulation are shown along four Planck-beam-sized sightlines.Flux densities of individual structures are indicated in blue while the dominant structure with the highest flux density is marked in red with its fractional contribution indicated at the top left corner of each panel.The top panels show the two brightest sources in the simulation while the bottom panels show two randomly drawn from the sample of simulated PC15 protoclusters (see Section 4 for more detail).

Figure 3 .
Figure3.Radial surface brightness profiles of the WISE and SPIRE stacks of the PC15 protoclusters, with the best-fit Gaussian model plotted in red.The dotted gray line indicates the profile of the Planck beam, normalized to the peak of the best-fit Gaussian model.The FWHM of the beam in each band is indicated in brackets.The bottom right panel shows the surface brightness profiles normalized to the peak value of the bands which trace stellar mass (W 1 and W 2), and dust-obscured star formation(SPIRE 250, SPIRE 350, and SPIRE 500).The comoving (top x-axis) scale is calculated for z = 2.

Figure 4 .
Figure 4.The rest-frame SED of the full sample of stacked PC15 protocluster candidates and the best-fit model of the spectrum to the SED from CIGALE (assuming z = 2).The data points used to fit the SED are shown by filled red circles, while the open red circles show the W3 and W4 points which are not used in the fit.AGN and star-forming galaxy templates, AGN4 and SFG3, from the comprehensive library ofKirkpatrick et al. (2015) are plotted and normalized to the best-fit model at 83 µm (250 µm observed frame).The SED of the stack of the brighter half of the protocluster candidates from(Kubo et al. 2019) are plotted in gray.

Figure 9 .
Figure 9. Input vs measured FWHMs of simulated protocluster stacks are shown in orange.A one-one relation is shown in blue while the red vertical line indicates the FWHM of the Planck beam.The solid gray horizontal line marks the measured FWHM from our data.The vertical dashed line and the shaded region, both in gray, show the range of the inferred protocluster size after correcting for the broadening effect.

Table 1 .
It is worth noting that the SPIRE fluxes in the table are made on the full sample of 211 PC15 protocluster candidates while the WISE fluxes are made on 152 protoclusters (see Section 3.2) after removing 59 frames to allow clean stack- Table protocluster candidates.We conclude