The Space Density of Intermediate-redshift, Extremely Compact, Massive Starburst Galaxies

We present a measurement of the intrinsic space density of intermediate-redshift (z ∼ 0.5), massive (M * ∼ 1011 M ⊙), compact (R e ∼ 100 pc) starburst (ΣSFR ∼ 1000 M ⊙ yr−1 kpc−1) galaxies with tidal features indicative of them having undergone recent major mergers. A subset of them host kiloparsec-scale, > 1000 km s−1 outflows and have little indication of AGN activity, suggesting that extreme star formation can be a primary driver of large-scale feedback. The aim for this paper is to calculate their space density so we can place them in a better cosmological context. We do this by empirically modeling the stellar populations of massive, compact starburst galaxies. We determine the average timescale on which galaxies that have recently undergone an extreme nuclear starburst would be targeted and included in our spectroscopically selected sample. We find that massive, compact starburst galaxies targeted by our criteria would be selectable for ∼148−24+27 Myr and have an intrinsic space density nCS∼(1.1−0.3+0.5)×10−6Mpc−3 . This space density is broadly consistent with our z ∼ 0.5 compact starbursts being the most extremely compact and star-forming low-redshift analogs of the compact star-forming galaxies in the early universe, as well as them being the progenitors to a fraction of intermediate-redshift, post-starburst, and compact quiescent galaxies.


INTRODUCTION
Galaxy formation models within a Λ-Cold Dark Matter (ΛCDM) framework that do not include feedback typically over-predict the present day baryon fraction as well as the number of number density of galaxies on the high and low mass ends of the local stellar mass function (SMF) (e.g., Croton 2006;Kereš et al. 2009;Moster et al. 2010;Moustakas et al. 2013).This implies that star formation over cosmic timescales is inefficient, which requires that galaxy formation models inject energy into cooling clouds of gas.This is typically done by invoking feedback from massive stars and active galactic nuclei (AGNs) to heat and eject gas, thus reducing star formation efficiency (e.g., Springel et al. 2005b;Di Matteo et al. 2005;Somerville & Davé 2015).Feedback as a driver of the cosmic star formation inefficiency is supported by evidence of large-scale gas outflows and/or relativistic jets in star forming and active galaxies (e.g.Veilleux et al. 2005;McNamara & Nulsen 2007;Fabian 2012;Somerville & Davé 2015).
In massive galaxies, feedback-driven outflows are often attributed to AGN activity since dark matter halo mass, galaxy stellar mass, bulge mass, and black hole mass all scale with one another (e.g., Ferrarese & Merritt 2000;Guo et al. 2010;Kormendy & Ho 2013).However, cosmological galaxy formation simulations show that the exclusion of stellar feedback in models leads to the formation of galaxies that are ∼ 10 times more massive than observed at a given redshift, showing that stellar-driven feedback plays an integral role in regulating star formation in massive galaxies (e.g., Springel et al. 2005b;Hopkins et al. 2012).On small (giant molecular cloud) scales, feedback can slow the local star forma-tion rate by decreasing the gas surface density in a region, but this alone is not sufficient to produce simulated galaxies whose masses match those observed.Large-scale galactic wind-driven outflows where Ṁ * ,outflow ∼ SFR are necessary to be able to model galaxies with masses that are consistent with observations (e.g., Veilleux et al. 2005).
Constraining the importance of feedback-driven quenching is crucial to understanding how massive galaxies form, especially at high redshift.Massive, quiescent galaxies at z > 1.5 are typically more compact than their local counterparts by roughly a factor of 5 (e.g.Zirm et al. 2007;van Dokkum et al. 2008;van der Wel et al. 2014).The likely progenitors of these massive, compact quiescent galaxies are similarly compact star forming galaxies that were formed in gas-rich mergers of disk galaxies and were then rapidly quenched via some dissipative feedback (e.g., Barro et al. 2013;Stefanon et al. 2013;van Dokkum et al. 2015).However, heavy dust obscuration coupled with high redshift makes constraining the role of AGN vs. stellar-driven feedback difficult with the typical UV signatures of outflows (e.g., van Dokkum et al. 2015).
We have been studying a population of z ∼ 0.5 massive, compact galaxies which show signs of recent, extreme bursts of star formation and gas depletion, similar to what we would expect as the progenitors to high-z massive, quiescent galaxies (Tremonti et al. 2007;Diamond-Stanic et al. 2012, 2021;Geach et al. 2013;Sell et al. 2014;Geach et al. 2014;Rupke et al. 2019;Petter et al. 2020).Our sample of galaxies consists of sources initially targeted as SDSS quasars, but subsequently classified as young post-starburst galaxies due to their blue stellar continua, weak nebular emission lines, and bright infrared photometry (Tremonti et al. 2007).Hubble Space Telescope (HST) imaging showed that these galaxies have extremely compact morphologies (R e ∼ 100 pc) with tidal features indicative of having recently undergone a major merger event (see Figure 1) (Diamond-Stanic et al. 2012;Sell et al. 2014).We also note that rings and diffraction spikes from the HST PSF are visible in the images of our sources, showing that their angular sizes are on the order of that of the PSF which further highlights their compactness (Sell et al. 2014;Diamond-Stanic et al. 2021;Davis et al. in prep).The sources in our sample can have SFR surface densities up to ∼ 1000 M yr −1 kpc −1 (Diamond-Stanic et al. 2012;Sell et al. 2014), and lie below the 0.5 < z < 1 size-mass relations for star forming and quiescent galaxies (see Figure 2; Mowla et al. 2019;Diamond-Stanic et al. 2021).Spectroscopic observations show that these galaxies host outflows with velocities > 1000 km s −1 that can extend to tens of kpc (Tremonti et al. 2007;Rupke et al. 2019;Davis et al. in prep).There is also little evidence that these massive outflows are primarily driven by AGN activity based on X-ray, IR, radio, and spectral line diagnostics, meaning that extreme star for-mation can be responsible for gas depletion in these galaxies (Diamond-Stanic et al. 2012;Sell et al. 2014;Petter et al. 2020).
These galaxies are important because they allow us to directly observe the effects of extreme star formation on gas kinematics in starburst and post-merger galaxies.In mergerdriven galaxy evolution scenarios, a major merger event can trigger a strong burst of obscured star formation.Dissipative feedback via AGN or starburst activity can then expel large amounts of gas and dust from the galaxy, allowing it to passively evolve into a gas-poor massive elliptical galaxy (e.g.Sanders et al. 1988;Lonsdale et al. 2006).The objects we are studying can possibly be representative of galaxies that are actively undergoing quenching, and might be an important phase for the building up of a massive, quiescent elliptical population.However, this is difficult to determine without knowing the space density of extreme compact starburst galaxies like the ones we have been studying.We are broadly defining our compact starbursts as massive, centrally concentrated galaxies that have recently experienced a burst of star formation.The space density of extreme massive, compact starbursts is strongly dependent on the timescales upon which starburst events can be observed using our selection criteria.
The aim of this paper is to estimate the average amount of time sources in a simulated galaxy population would be selected as extreme compact starburst galaxies under our selection criteria, in addition to their space density.We also place our galaxies into context with their high redshift compact star forming analogs, compact quiescent galaxies, post starburst galaxies, ultraluminous infrared galaxies (ULIRGs), the merger rate density, and massive, quiescent galaxies within the same redshift interval (e.g.Sanders et al. 1988;Lonsdale et al. 2006;Lotz et al. 2011;Barro et al. 2013;van der Wel et al. 2014;Wild et al. 2016).
The outline of the paper is as follows: in Section 2 we discuss the selection of the parent sample of galaxies.In Section 3 we discuss empirical model construction and constraining model free parameters via an MCMC routine.In Section 4 we discuss our implementation of the SDSS quasar selection function.In Section 5 we calculate the average observability timescale and space density for our population of compact starbursts.In Section 6 we place our galaxies into cosmological context with other phases of merger-driven galaxy evolution.We adopt a cosmology of H 0 = 70.2kms −1 Mpc −1 , Ω M = Ω CDM + Ω b = 0.229 + 0.046 = 0.275, and Ω Λ = 0.725 (Komatsu et al. 2011)

THE OBSERVED SAMPLE
The selection criteria used for our sample will be detailed in Tremonti et al. in prep, but we will give a brief summary in this section.
Our sample was originally selected with the objective to understand the role galaxy-scale winds play in star formation quenching for massive, intermediate redshift galaxies.The parent sample of galaxies we use in this work is drawn from the Eighth Data Release of SDSS (York et al. 2000;Aihara et al. 2011).We set out to select sources that were targeted as quasars (flagged either as QSO HIZ, QSO CAP, QSO SKIRT, QSO MAG OUTLIER), since the SDSS QSO sample extends to fainter magnitudes than the main galaxy sample (Strauss et al. 2002).Selecting sources that have been targeted as quasars allows our sample to consist of objects that are massive and compact.The magnitude limits ensure that our sources are massive, highly star forming, and not strongly dust attenuated and the SDSS quasar selection algorithm requires that our sources are either unresolved or that they are resolved but satisfy more stringent color-magnitude cuts.This is described in more detail in Section 4.1.
We required that our sources were spectrally classified as galaxies with apparent 16 < i < 20.We selected sources within 0.4 < z < 0.9 to ensure that the MgII λλ2796, 2804 line would be shifted into the optical so we could use that as a probe of galactic winds.We also exclude sources that were classified as distant red galaxies (LEGACY TARGET1 != DRG).Sources with redshift warnings and bad quality plates were also thrown away.This initial cut left us with a sample of 1198 galaxies.
We fit the SDSS spectra with a combination or simple stellar population models, similar to Tremonti et al. (2004), and a type I quasar template.From the spectral fitting, we calculated the fraction of light attributed to the quasar model (f qso ).We also measured nebular emission and stellar absorption line indices (following Kauffmann et al. 2003) for the sources in our parent sample as well as the strength of the 4000 Å break (D n (4000)) (Balogh et al. 1999).Our initial aim was target post starburst galaxies (PSBs) by selecting galaxies with evidence of having gone through a starburst event within the last 1 Gyr ((Hδ A + Hγ A )/2 OR D n (4000) < 1.2.), but with little ongoing star formation within the last 10 Myr ([OII] 3727 Å equivalent width (EW) > −20 Å).These cuts reduce our sample to 645 sources.
Lastly, our sample was limited to consisting of brighter galaxies with tighter cuts on [OII] EW and including a cut on the measured quasar fraction to further ensure that strong AGN were not included.The new cuts imposed were [OII] 3727 ÅEW > −15 Å, and f qso < 0.25.We also require that apparent g and i magnitudes were brighter than g < 20 or i < 19.1.Although we select for weak nebular emission to eliminate starbursts, many of our sources were detected in WISE (Wright et al. 2010), and SED fitting through the mid-infrared shows they can have SFRs= 20 − 500 M yr −1 (Diamond-Stanic et al. 2012;Perrotta et al. 2021;Davis et al. in prep).These cuts leave us with a sample of 121 galaxies.
We take advantage of the WISE detections for our sources and make an IR color cut of W 1 − W 2 < 0.8 to further limit AGN contamination (Stern et al. 2012;Hickox et al. 2017).The WISE AGN cut leaves us with a population of 115 galaxies in what we are considering to be our parent sample.We include this selection criteria in our modeling of compact starburst galaxies to estimate the amount of time our galaxies would be targeted and selected by this set of criteria.A full list of targets is given in Table 1 along with their redshifts, stellar masses, and SDSS photometry.
In addition to the SDSS and WISE data for our parent sample, we also have high-S/N (∼ 15 − 30 per pixel) spectra from the Blue Channel Spectograph on the 6.5-m MMT (Angel et al. 1979), the Magellan Echellette (MagE; Marshall et al. 2008) spectrograph on the Magellan Clay telescope, and the Low Resolution Imaging Spectrometer (LRIS; Oke et al. 1995) on the Keck I telescope for 37 of the sources in our parent sample.These observations and their data reduction are detailed in Davis et al. (in prep), but broadly these observations were done using 1" slits resulting in spectra with resolution R ∼ 600 − 4100.We refer to these 37 galaxies as the MgII sample.

MODEL CONSTRUCTION
The aim of this work is to constrain the importance of massive, compact starburst events in galaxy quenching at z ∼ 0.5 by estimating the space density of these objects.Here, we do this by constructing an empirical model based on the galaxies we have in our sample and then evolving a large simulated population of compact starbursts to estimate the timescales upon which they would be targeted by our selection criteria.This process can be broken down into two steps: 1. Construct a set of template distributions of stellar population parameters and SFHs by fitting SDSS ugriz model mags and W1, W2 photometry for the 115 galaxies in our sample with a Markov Chain Monte Carlo (MCMC; Metropolis et al. 1953;Foreman-Mackey et al. 2013) fitter.
2. Use the posterior distribution of SFH parameters from step 1 to predict luminous properties of a set of mock galaxies whose SFHs are consistent with our observed sample.The luminous properties are computed using the FLEXIBLE STELLAR POPULATION SYNTHE-SIS models (FSPS; Conroy et al. 2009).
Since our small sample of galaxies consists of sources that are unresolved in SDSS imaging, we have to make a number of assumptions about their underlying stellar populations.First, we assume that the light from our compact starburst galaxies can largely be broken down into two components: a young, simple stellar population (SSP) that formed in a single, nuclear burst, and an older component that has a star formation history representative of a massive, star forming galaxy at z ∼ 0.5.We note that there is likely clumpy star formation occurring outside of the nuclear regions of our galaxies, but due to their extremely compact HST morphologies it is fair to assume that the contribution of these star forming regions to the total emitted light is minimal compared to the large nuclear burst.We also assume that our galaxies will only experience one burst of nuclear star formation and will then passively evolve.Although HST observations (Sell et al. 2014) showed that many of our sources have more than one core that could trigger a starburst event, we note that these sources are still unresolved in SDSS so the burst would not be localized to a particular core.This assumption is also consistent with the single burst of star formation triggered by a merger event seen in simulations (e.g.Springel et al. 2005a).Next, we naively assume that since the nuclear burst component dominates the spectral energy distribution (SED) of the total system, that the differences observed between the galaxies in our sample can solely be attributed to differences in the properties of the nuclear starburst.This assumption is consistent with the galaxies in the MgII sample having very blue spectra and young ages as derived from spectral modeling (e.g., Davis et al. in prep).
These assumptions allow us to construct a model that utilizes FSPS to simulate the stellar populations for the nuclear starburst component as well as the older, non-burst underlying stellar population.In our modeling framework, we introduce four free parameters that are fit via an MCMC routine for each of the galaxies in our sample: the age of the burst (t age ), the fraction of total galaxy stellar mass formed in the nuclear burst (f burst ), the optical depth for the dust around young stars formed in the nuclear burst (τ dust,1 ), and the total stellar mass of the system (M * ).We separately calculate the ugriz, W1, W2, [OII] (3727 Å) fluxes for the nuclear burst and non-burst components and their f burst weighted sum to determine the SED and [OII] EW for the total simulated galaxy.
In this section, we describe the assumptions made in the FSPS modeling of both the extended non-burst and nuclear starburst components as well as the MCMC fitting we use to constrain values for the free parameters in our model.

Modeling the extended, non-burst component
The photometric and morphological properties of the extended stellar population are most important in the later stages of the compact starburst's evolution since the contribution of the nuclear burst wanes over time.Here, we describe the assumptions we make in the FSPS modeling of the extended, non-burst component.We initialize FSPS such that TAGE is the Hubble time (in Gyr) at the redshift of a given galaxy, DUST1 = 1, and DUST2 = 0.5.We chose these dust optical depths to ensure that the ug photometry for the modeled extended stellar component would be fainter than that of the reddest observed sources in our sample, while being consistent with the recommended values given in Charlot & Fall (2000).We explored the effects of changing TAGE and the dust parameters for the extended components in the galaxies shown in Figure 1 to ensure that our modeling is largely robust to extended component assumptions and found that the results of our MCMC fitting do not change with changing non-burst initial conditions.
A crucial piece to modeling the stellar population of the extended, non-burst component is assuming a particular star formation history (SFH).HST images show hints of a smooth, extended underlying stellar population (Diamond-Stanic et al. 2021).The presence of tidal features in our HST observations suggests that the galaxies in our sample have recently undergone merger events, and their high star formation surface densities indicate that that these mergers were likely gas rich (e.g., Diamond-Stanic et al. 2012;Sell et al. 2014).Based on this, we assume that the extended, non-burst stellar populations have a star formation history typical of actively star forming disk galaxies.
However, the SFHs of star forming disk galaxies are uncertain.There are many possible SFHs that would be able to build up the tightly-correlated star formation main sequence at late cosmic times (e.g.Oemler et al. 2017).For simplicity, since young stars dominate the light output from a stellar population we approximate the SFH as being flat over cosmic time to ensure that the progenitor galaxies in the system were experiencing some degree of star formation prior to merging.We do this by setting the FSPS SFH parameter as a delayedburst SFH (sfh = 4 in FSPS) but with the constant star formation fraction set to 1.
We also note that we explored other SFHs that peaked at earlier cosmic times, such as the dark matter halo mass dependent models constructed in Behroozi et al. (2019), but our MCMC chains for these models were not able to reach con-vergence.The inability for our chains to converge is consistent with the fact that we do not believe that Behroozi et al. (2019)-like SFHs would be physically representative of galaxies like those in our sample.For massive (M * ∼ 10 11 M ) galaxies like the ones in our sample, this would suggest that our sources would have peaked in star formation at z ∼ 2 and then passively evolved until z ∼ 0.5.This would imply that the progenitors of our compact starbursts would be almost entirely be quiescent, which is unlikely do to their high gas fractions.Therefore, we do not include models like this in our analysis.

Modeling the nuclear burst
Recent observational evidence has shown that intermediate redshift, extreme compact starburst galaxies are likely to exhibit flat age gradients, meaning that their optical light is dominated by star formation that began and ended in one uniform event (e.g., Setton et al. 2020).Since we expect all of the stars formed in the nuclear burst to have formed at approximately the same time, we model the starburst as a simple stellar population (SSP) in FSPS (sfh = 0).This choice is consistent with very short burst durations we derive from non-parametric SFH modeling of a subset of our sample with high S/N spectra (Geach et al. 2018;Tremonti et al. in prep;Davis et al. in prep).This work (detailed in Davis et al. (in prep)) is done by fitting the rest frame UV-mid IR broadband photometry and high-resolution spectra simultaneously using Prospector (Leja et al. 2019;Johnson et al. 2021).We also assume that the dust in the vicinity of the nuclear starburst extincts some of the light from the newly formed stars.We leave the age of the central burst (log t age ) and the optical depth (τ burst ) as free parameters that will later be constrained with MCMC fits to the photometric data of the sources in our observed sample.We set DUST2 = τ burst /2 (e.g., Wild et al. 2011).We similarly calculate SDSS ugriz and WISE W1 & W2 magnitudes for the nuclear bursts as we did for the extended, non-burst stellar population.

Calculating PSF magnitudes
Once we have the model photometry for the extended, non-burst stellar populations and their nuclear bursts, we can combine them to get the photometry for the entire system.We start by converting the modeled apparent AB magnitudes for the extended, non-burst stellar population and the burst component to flux densities.The output magnitudes of FSPS are normalized to 1 M at every epoch, so we calculate the fluxes for our galaxies and nuclei by multiplying their 1 M flux densities by their respective masses.We define the mass of the nuclear burst as We also leave f burst and M * as free parameters in our MCMC fitting in addition to τ dust and log t age as described earlier.
For sources observed in SDSS, the QSO targeting pipeline takes a source's ugriz PSF magnitudes as input rather than its de Vaucouleurs or exponential disk model magnitudes (Richards et al. 2002).The output magnitudes from FSPS are representative of model magnitudes, so we must first convert these to PSF magnitudes before we run the SDSS QSO targeting algorithm on our modeled sample.We do this by first assigning surface brightness profiles to both components of the galaxy.For the extended, non-burst component, we assume a n = 1 Sérsic profile where the effective radius (R eff ) is taken from the redshift-dependent star forming galaxy size-mass relation presented in Mowla et al. (2019).Due to the nuclear starburst's compact nature, we assume a n = 4 Sérsic profile where R eff is ∼ 300 pc, as motivated by observations (e.g., Geach et al. 2013;Sell et al. 2014).Diamond-Stanic et al. (2021) showed that R ef f < 1 kpc for the HST-observed galaxies.We do not vary R ef f for the nuclear components for our modeled galaxies since ∼ 100 pc scale starbursts would always be unresolved in SDSS and are effectively observed as point sources.
We convert R eff for each component from kpc to arcsec using their cosmological angular size distances and normalize the surface brightness profiles (I(r)) for each component such that We then convolve these component surface brightness profiles with the SDSS PSF in each photometric band.The full width half maxes (FWHMs) for the ugriz bands are 1.53, 1.44, 1.32, 1.26, 1.29 arcsec, respectively.The convolved burst and disk components are then added together to create a modeled total galaxy surface brightness profile.We then fit this profile with a 2D-Gaussian model of the SDSS PSF and integrate the Gaussian model fit to obtain PSF fluxes in each respective band.The PSF fluxes are then converted to apparent AB magnitudes so they could later ( §4.1) be passed through the SDSS QSO selection pipeline.

Constraining model free parameters with MCMC
We have constructed a 4-parameter model for the photometry and [OII] (3727 Å) EW of intermediate-z compact starbursts by utilizing FSPS.FSPS directly outputs model mags and spectra of stellar populations.We calculate [OII] (3727 Å) EW from the FSPS output spectrum using SPECUTILS (Earl et al. 2022).As stated above, our compact starburst model is the sum of separately modeling the host galaxy and nuclear burst contributions to the overall photometric and spectral properties.In this model, we leave the age of the nuclear starburst (log t age /Myr), the burst fraction (f burst ), optical depth of dust extincting young stellar light (τ dust ), and the galaxy stellar mass (log M * /M ) as free parameters.
Here we detail how we constrain possible parameter values using MCMC fitting to the ugriz and W1/W2 photometry for our observed galaxies.

Parameter fitting
As discussed in Section 2, our collaboration has been studying a sample of 115 intermediate-z compact starburst galaxies.Archival SDSS ugriz and WISE W1 and W2 photometry are available for the full parent sample.For each of these, we constrain the probability densities for log t age , f burst , τ dust , and log M * using the ensemble adaptation of the Metropolis-Hastings MCMC algorithm from the package, EMCEE (Metropolis et al. 1953;Foreman-Mackey et al. 2013).Each step of our MCMC calculates the model SDSS ugriz, WISE W1, and W2 photometry, and compares them to those for each observed galaxy.For each galaxy, we run the MCMC such that the autocorrelation time for each walker is ∼ 50 times less than the run time.For most of our galaxies this is ∼ 60, 000 steps.We use the EMCEE ensemble stretch move with scale parameter a = 2.We randomly initialize each walker in the intervals and allow them to explore the parameter space 0.5 < log t age /Myr < 3 0.05 < f burst < 0.65 0 < τ dust < 5 10 < log M * /M < 12 such that it finds the parameter values that are most likely to minimize the difference between the model and observed photometry.
For each galaxy in our sample, we output the mean parameter values and their covariance from MCMC-calculated posterior distributions.We use these mean values and their covariances to model these posteriors as 4-dimensional Gaussian distributions whose means and standard deviations are identical that of the MCMC output.We do this to reduce noise later in our analysis since we use these distributions to randomly draw sets of parameter values to model mock galaxies based on the ones in our observed sample.The best fit SED and parameter probability distributions for a constantly star forming host based on the galaxy J0826+4305 can be seen in panels (a) and (b) Figure 3, respectively.We also include these for J1713+2817, J2118+0017, J1506+6131, J1558+3957, and J1613+2834 in Figures 9,10,11,12,and 13, respectively.For consistency with other studies of our objects, we note general agreement between our best fit stellar masses and those presented in Sell et al. (2014) for the galaxies that were included in both of our samples.This is shown in Table 2.
For each of the 115 galaxies in our sample we randomly draw log t age , f burst , τ dust , and log M * values from their respective Gaussian-modeled posterior distributions taking into account the covariances between each of the parameters, to model a population of galaxies with properties similar to the observed source.We can then evolve these modeled galaxies to estimate a distribution of selectable lifetimes for each of the galaxies in our sample.

MODELING THE TARGETING ALGORITHM & SELECTION FUNCTION
The ultimate goal for our model is to be able to estimate the space density of z ∼ 0.5, massive, compact starburst galaxies.To do this, we need to understand the timescales upon which these galaxies would be selected under a set of targeting criteria.Here, we detail how we model the various components of the selection function we use to identify sources in our sample.

The SDSS QSO targeting algorithm
All of the sources in our observed sample were initially targeted for SDSS spectroscopy as QSOs based their bright magnitudes and blue colors.In order to ensure that our modeled galaxies would satisfy these criteria, we need to incorporate this selection into our modeled targeting function.
The SDSS QSO targeting algorithm identifies sources based on their location in three-dimensional color space.This is the (u − g)-(g − r)-(r − i) (ugri) color cube for z < 3 sources and (g − r)-(r − i)-(i − z) (griz) cube for galaxies at higher redshifts.The QSO catalog constructed from SDSS DR8 sources was selected using the Richards et al. (2002) targeting algorithm1 .The SDSS quasar selection function aims to identify sources that lie far from the region of color space where stars are most likely to be found as well as for sources to satisfy general color/magnitude cuts.All magnitudes referenced in the targeting algorithm are PSF magnitudes.Since we are working with modeled data that is free from observational uncertainty, we do not include the steps in the algorithm that flag sources for having data with fatal errors.
Since quasars and local stars both exhibit bright apparent magnitudes and are unresolved point sources, the algorithm needs to be able to differentiate between them in color-colorcolor space.The algorithm makes use of the method described in Newberg & Yanny (1997) that defines a "stellar  2014) star forming and quiescent galaxies, respectively.The red, blue, and grey lines are the best fit size-mass relations for the quiescent, star forming, and total CANDELS/3DHST galaxies in Mowla et al. (2019).Our data point represents the average R ef f and M * for a subset of the MgII galaxies presented in Davis et al. (in prep).Our sources are significanly more compact than other galaxies at similar z and M * .
locus" in color-color-color space where stars are most likely to exist.The stellar locus is constructed by analyzing the distribution of SDSS identified stars in color space.To maintain generality, we will refer to the main coordinate system describing the color-color-color cube as x, ŷ, ẑ , where x is in the direction of the bluest color axis and ẑ in the direction of the reddest.The locus construction algorithm begins by setting the endpoints of the stellar distribution in color space and then iteratively calculating midpoints.This process allows a local coordinate system ( îi , ĵi , ki ) to be defined at each locus point.At each locus point (p i ), ki is defined as a unit vector in the direction − −−−−− → p i+2 − p i .As detailed in Newberg & Yanny (1997), unit vectors îi , ĵi , and ki are given as The cross section of the stellar locus is measured by fitting an ellipse perpendicular to ki at each point.The semi-major and semi-minor axes of the ellipses are in the direction of unit vectors li and mi , respectively, and are defined as li ≡ îi cos θ i + ĵi sin θ i , mi ≡ − îi sin θ i + ĵi cos θ i where θ i is the angle between the major axis of the ellipse and unit vector î.We adopted the locus point positions, Richards et al. (2002), and proceeded to construct right cylinders that define the 4σ stellar locus probability region in color-color-color space.We also incorporate the mid-z inclusion region as the white dwarf/A star exclusion regions detailed in Richards et al. (2002).
Sources targeted as quasars must also satisfy color and magnitude cuts in addition to not belonging to the stellar locus.For low-z sources in the ugri color cube, all objects must have apparent i-band magnitude 15 < i < 19.1 (Richards et al. 2002).Both extended and point source objects are allowed to be selected as quasars, but they need to satisfy different sets of criteria.Point source objects only need to fulfill the magnitude and stellar locus cuts to be targeted.Extended sources are kept if they are likely to contain an active nucleus.This is most likely when (u − g) < 0.9, as redder AGN would be at high-z and would not be extended (Richards et al. 2002;Adelman-McCarthy et al. 2006).This (u − g) cut does not remove blue, extended star forming galaxies, so a second cut of l i > 0 and m i > 0 is applied where l i and m i are positions within the k, l, m coordinate space defined earlier.In the high-z griz color cube, all outliers from the stellar locus with 15 < i < 20.4 are targeted as quasars.However, to avoid contamination from low-z quasars, sources are removed from the high-z sample when all of the following criteria are met; We allow the sources in our sample to be targeted as either low-z or high-z quasars since our observed sample contains a mixture of both target types.

Spectroscopic/photometric selection
In addition to being blue, unresolved sources, the galaxies in our sample also exhibit weak nebular emission characteristic of post starburst galaxies.As mentioned earlier, we implement an emission line equivalent width (EW) cut on [OII] (3727 Å) such that [OII] EW> −15 Å, consistent with that used for our parent sample (Sell et al. 2014;Davis et al. in prep;Tremonti et al. in prep).We also model the g < 20 flux limit and W 1 − W 2 < 0.8 WISE color cut that we impose on our sample.

ESTIMATING THE SPACE DENSITY
In this section, we discuss the various parameters that contribute to the calculated compact starburst space density (n CS ) as well as the possible sources of uncertainty.We estimate the space density in the redshift range 0.4 < z < 0.9 as Here, N targeted is defined as the number of galaxies in our observed sample of massive, compact starburst galaxies, f complete is the completeness of the SDSS QSO catalog (f complete ∼ 0.9; Vanden Berk et al. 2005), V 0.4<z<0.9 is the volume in Mpc −3 contained within the redshift range 0.4 < z < 0.9, A SDSS /A sky is the fractional area of the SDSS footprint relative to the area of the entire sky, t cosmic is the amount of cosmic time in Myr contained in the redshift range 0.4 < z < 0.9, and 1/t obs is the average of the inverse selectability timescale in Myr.The only modeldependent factor in this calculation is the amount of time our sources would be selected under a particular set of targeting criteria, so we will spend the first part of this section focusing on calculating this value.
It is also worth highlighting that the timescale we are calculating for our sources is the amount of time these objects would be targeted under our set of selection criteria.This is a separate quantity from the amount of physical time galaxies might be undergoing an extremely compact starburst phase.The physical timescale is also dependent on how we define these sources.A unifying feature of the observed sources in our sample is that they are late-stage major mergers that host extremely young stellar populations.It is possible that some of them have quenched/are very recent PSBs and that others are still forming stars.Broadly, we define our sources as galaxies that have recently experienced an extreme nuclear burst of star formation.Calculating the physical timescale for these sources would require much more detailed modeling which is beyond the scope of this work.Our goal here is to estimate the space density of objects that would be targeted by our selection criteria at some point in their evolution.

Calculating observed lifetimes
For each of the 115 galaxies in our sample, we used SDSS ugriz model mags and WISE W1/W2 measured photometry to construct SEDs which were then fit by our MCMC routine to obtain the posterior distributions for log t age /Myr, f burst , τ dust , and log M * ,tot /M .These posterior distributions were then modeled as 4-dimensional Gaussian distributions and we output their covariance matrices.For each of the 115 observed galaxies in our sample, we draw 200 sets of parameters from the respective posterior distributions while taking into account covariances between parameters.This gives us 115×200 mock galaxies which we then evolve.We evolve our modeled galaxies within the time interval −1 < log t age /Myr < 2.5 in 1000 uniformly spaced steps.We calculate [OII] EWs from the output FSPS spectrum using SPECUTILS (Earl et al. 2022), as well as the photometry at each step to determine if the sources would be targeted by our selection criteria at each time step.This allows us to construct selected lifetime distributions for each of the 115 observed galaxies in our sample.The evolutionary tracks for a subset of randomly selected galaxies' i and g-band magnitudes, [OII] EWs, and W 1 − W 2 colors, as well as the selection limits on each respective parameter can be seen in Figure 4. We note that Figure 4 does not include the SDSS QSO targeting selection since that is a much more complicated set of criteria and would be impossible to visually display.However, we do apply it in our target selection.
In the following section, we detail how we determine the space density of our sources by randomly sampling with for a sub-sample of modeled galaxies.The x-axis is age relative to the burst peak.The grey-shaded rectangles represent the regions of parameter space that would not be selected by the criteria placed on that given parameter.This is a schematic representation-the full details of our source selection can be found in Section 2. We find that extreme nuclear starbursts like the ones observed in our galaxies would be selected for ∼ 148 +27 −24 Myr, consistent with the burst ages calculated in Davis et al. (in prep).
replacement the selected lifetime distribution calculated by evolving mock galaxies.In short, we bootstrap by generating 100,000 randomly sampled (with replacement) populations of 115 mock galaxies.For each iteration, we randomly draw an array of 115 indices which correlates to the various observed galaxies in our sample.We use the randomly drawn indices to pull selected lifetimes from the corresponding selected lifetime distributions.We then average these lifetimes to determine a selectability timescale for that given mock population of galaxies.The average selected lifetime distribution for the 100,000 samples of 115 mock galaxies is shown in Figure 5.We find that on average, compact starburst galaxies like the ones we observe would be selected under our set of targeting criteria for 148 +27 −24 Myr.This timescale is broadly consistent with the average poststarburst peak age of 70 ± 106 Myr calculated in Davis et al. (in prep).
In our modeling, we find that our mock galaxies would be targeted soon after the nuclear burst occurs, meaning that we can directly compare our selectability timescale and the post-starburst peak SF ages in Davis et al. (in prep).The light-weighted stellar ages of the MgII sample ranging from ∼ 13-300 Myr) galaxies are consistent with the calculated selectability timescale in this work.This is a good consistency check to ensure that our modeling shows that galaxies in our observed sample would be selectable at their best-fit stellar ages.
We next use the selectability timescales of our modeled compact starburst galaxies to estimate their space density.

Calculating space density
As stated above, we estimate the space density in the redshift range 0.4 < z < 0.9 (Equation 1) by randomly sampling from our selected lifetime distributions.To ensure that we sample a sufficiently large population of mock galaxies, we iterate this part of the calculation 100,000 times.
For each of the 100,000 iterations, we randomly sample with replacement 115 galaxies from our mock sample.For each of the galaxies in that sample, we randomly draw a log t obs /Myr value from the observable lifetime distribution that corresponds to that particular galaxy.In each iteration, we use these log t obs /Myr values to compute, where N sim = 115.We then use this to calculate the space density for the random population generated each iteration using the expression above.The resulting space density dis- .Space density distribution calculated from our mock population of galaxies.We estimate that the space density for our population of 0.4 < z < 0.9 compact starburst galaxies is tribution (calculated using Equation 1) can be seen in Figure 6.We estimate the space density of these massive, compact starbursts to be (1.1 +0.5 −0.3 )×10 −6 Mpc −3 in the redshift range 0.4 < z < 0.9.

COSMOLOGICAL CONTEXT
One of the most interesting questions surrounding our sample of galaxies is whether or not this type of compact starburst phase is characteristic in the evolution of many, if not most, massive galaxies.A widely supported view of galaxy formation and evolution is that mergers are responsible for building up increasingly massive galaxies and for triggering starbursts and AGN activity (e.g., Toomre 1977 Figure 7.Comparison of the average timescales (in Gyr) upon which various phases of massive galaxy evolution would be observable.The black star represents the average selectability timescale for the modeled compact starburst galaxies in our sample, and its error bar along the redshift axis represents the size of the redshift range of our sources and the error bar along the t obs axis is the statistical uncertainty calculated via bootstrapping as described in Section 5.2 The grey, purple, and blue shaded regions represent the range of observable timescales for galaxy mergers (Lotz et al. 2011), ULIRGs (Farrah et al. 2003), and post starburst galaxies (PSBs; Wild et al. 2016), respectively.We note that the timescales presented for galaxy mergers and PSBs correspond to the amount of time a source would be targeted under a set of selection criteria (similar to the value calculated for our sources), while the timescale for ULIRGs reflects the amount of physical time a source would experience star formation characteristic of the ULIRG phase.We elaborate on how we obtain the timescale estimates for the shaded regions in the text.It is clear that compact starburst galaxies like the ones in our sample occur on relatively short lived timescales that are comparable to that of ULIRG star formation.et al. 1988;Kauffmann et al. 1993;Mihos & Hernquist 1996;Hopkins et al. 2006Hopkins et al. , 2008;;Lotz et al. 2011;Somerville & Davé 2015).Sanders et al. (1988) presented a basic framework in which the collision of two gas-rich disk galaxies would funnel gas towards the center of the system via tidal streams or shocks, thus creating a dusty, gas-rich environment to foster rapid star formation (e.g.Lonsdale et al. 2006).This dusty starburst stage would be selected as a ULIRG.As gas is fueling rapid star formation, it is continuously being funneled into the nucleus and also being accreted onto the black hole, thus also triggering AGN activity (e.g.Hopkins et al. 2006Hopkins et al. , 2008)).Within this framework, gas from the galaxy can be expelled by a blowout phase driven by violent, dissipative feedback.
The galaxies in our observed sample have many features that could tie them into this evolutionary framework.We know that the galaxies for which we have HST observations have disturbed morphological features such as tidal tails or Figure 8.Comparison of the space densities of various phases of massive galaxy evolution.The black star represents the modeled space density for compact starburst galaxies like those in our observed sample.Its error bar along the redshift axis represents the size of the redshift range of our sources and the error bar along the space-density axis is the statistical uncertainty calculated via bootstrapping as described in Section 5.2.We note that there are additional systematic errors, including uncertainty with model assumptions, which make this statistical error a lower limit.The blue squares represent the space density evolution of massive, compact star forming galaxies from the CANDELS survey (Barro et al. 2013), the red points represent massive (log M * /M ∼ 11), compact quiescent galaxies (van der Wel et al. 2014), the green triangle represents low-z PSBs (Pattarakijwanich et al. 2016), and the purple hexagon represents low-z ULIRGs (Kim & Sanders 1998).The grey, red, purple, and green shaded regions depict the Lotz et al. (2011) observed merger rate density, the Stott et al. (2013) observed merger rate density (calculated using merger observability timescales), ULIRG space density (Magnelli et al. 2011), and intermediate-z PSB space density (Wild et al. 2016) ranges, respectively.The Barro et al. (2013) points, Lotz et al. (2011) region, andStott et al. (2013) region have been adjusted to account that our sources have masses log M * /M > 10.5, while most of the other populations shown include galaxies log M * /M > 10.While only a relatively small fraction of intermediate-z major mergers will result in an extreme compact starburst similar to those in our sample, it is likely that sources like ours are the more extreme, lower-z analogs to compact star forming galaxies more common in the early Universe and are closely related to intermediate-z PSBs.
two nuclei, which is indicative of them having undergone a recent merger (e.g Sell et al. 2014).In addition to having disturbed morphologies, our galaxies host high velocity ionized and molecular gas outflows which can extend out to kpc scales (e.g.Tremonti et al. 2007;Diamond-Stanic et al. 2012;Geach et al. 2013Geach et al. , 2014;;Sell et al. 2014;Geach et al. 2018) or even over 100 kpc scales (Rupke et al. 2019).
In order to understand the evolutionary significance of extreme, compact star formation events like those observed in our galaxies, we need to contextualize their space density relative to that of various phases within massive galaxy, mergerdriven evolution.Our results are summarized in Figures 7  and 8, and we discuss in greater detail within this section.

Evolution of massive compact galaxies
The sample of galaxies we have been studying is comparable to a high-z population of similarly compact, massive forming galaxies.Massive, quiescent galaxies in the Uni-verse at z > 1.5 are typically more compact than their local counterparts by roughly a factor of 5 (e.g.Zirm et al. 2007;van Dokkum et al. 2008;van der Wel et al. 2014).The progenitors of these galaxies were likely compact star forming galaxies that were formed in gas-rich mergers of disk galaxies and were then rapidly quenched via some dissipative feedback, a formation scenario that is reminiscent of what we expect for ULIRGs and quiescent galaxies in the lower-z Universe (e.g., Barro et al. 2013;Stefanon et al. 2013;van Dokkum et al. 2015).Barro et al. (2013) observed populations of compact quiescent and star forming galaxies in the redshift range ∼ 1 < z < 3 to understand the evolutionary pathways that lead to the assembly of massive, compact quiescent galaxies we see predominantly in the early Universe.We include their compact star forming galaxy space density evolution as the blue squares in Figure 8 for comparison with the intermediate-z massive, compact starburst galaxies we are studying (black star).We adjust the points from Barro et al. (2013) using redshift appropriate stellar mass functions (Moustakas et al. 2013;Adams et al. 2021) to account for the fact that their sample consists of sources with a wider stellar mass distribution than our sample.The adjusted space density is given as where n literature is the literature space density calculated for a larger mass range than our sample, and φ SMF is the stellar mass function.We use the Moustakas et al. (2013) and Adams et al. (2021) SMFs for z ≤ 1.5 and z > 1.5, respectively.The Barro et al. (2013) compact star forming galaxies have constant space densities at high redshift, but begin to decline at z <∼ 1.5.This decline is consistent with the decline in galaxy merger, star formation, and cold gas densities with decreasing redshift (e.g., Tacconi et al. 2010;Daddi et al. 2010;Tacconi et al. 2013;Madau & Dickinson 2014;Riechers et al. 2019).We show in Figure 8 that the space density of our sources lies only slightly below the space density evolution trend shown with the Barro et al. (2013) compact star forming galaxies.We note that our galaxies are more extreme than the Barro et al. (2013) sources as they are both more compact and more rapidly star forming.This likely biases our compact starburst space density to be slightly lower than that for the Barro et al. (2013) galaxies.It is possible that our sources represent the low redshift analogs for an extreme subset of compact starburst galaxies that are more prevalent in the early Universe.
Understanding how stellar feedback rapidly quenches star formation at intermediate redshift is necessary to be able to build models for galaxy formation and evolution in the early Universe when compact star formation events were significantly more common.For compact star-forming galaxies in the early Universe, it is difficult to observe the effects of feedback due to their high redshift and the fact they are commonly obscured by dust, making it nearly impossible to observe UV spectral signatures of outflows (e.g., van Dokkum et al. 2015).The broad consistency between the space density of our extreme, compact starburst galaxies and the Barro et al. (2013) sample allows us to better understand how compact star formation might be a phase that massive galaxies go through across a wide range of cosmic time.Barro et al. (2013) also presented a schematic representation of how galaxies evolve onto the local size-mass relation.Within this framework, compact star forming galaxies will experience rapid quenching via AGN or star formation feedback, resulting in a massive, compact quiescent galaxy population.Over cosmic time, these sources will undergo minor and major mergers resulting in a buildup of mass and size (e.g.Naab et al. 2009).If our sources are the low-redshift analogs of early Universe compact star forming galaxies beginning their quenching phase, we would expect that they would also end up as compact, quiescent galaxies.We show the space density evolution from van der Wel et al. ( 2014) for high-z, massive (M * ∼ 10 11 M ), compact (R/(M * /M 11 ) 0.75 < 2.5 kpc) galaxies as red points in Figure 8.The space density of compact quiescent galaxies peaks just as that of compact star forming galaxies begins to decline.It then wanes with decreasing redshift due to size buildup via galaxy mergers.Within the lowest redshift bin, the van der Wel et al. ( 2014) sources have a space density of ∼ 10 larger than that of our compact, starburst galaxies.It is also worth noting that the compact quiescent galaxies would be considered to be "compact" for ∼ 2 Gyr before minor mergers significantly contribute to size buildup (e.g., Naab et al. 2009;Newman et al. 2012)-a timescale that is significantly longer than the ∼ 100 Myr timescale for which our sample would be targeted as extremely compact starbursts (e.g.Barro et al. 2013).In addition to this, the effective radii for the van der Wel et al. ( 2014) sources is significantly larger than that of our nuclear starbursts.This could be due to the compact quiescent radii being more linked to the stellar mass profiles, while ours might be biased to small values because of mass-to-light ratio (M/L) effects.However, Diamond-Stanic et al. (2021) showed that even accounting for M/L effects that the stellar mass effective radius for our systems is on the order of 0.1-0.5 kpc, which indicates that our population could be even smaller and potentially more extreme than the compact quiescent galaxies in the van der Wel et al. (2014) sample.All of this together suggests that a significant fraction of massive, compact quiescent sources at intermediate redshift could have recently gone through a starburst similar to what we observe for the galaxies in our sample.

Comparison to post starburst galaxies
In order to get a full picture of the role intermediate-z, extremely compact starbursts galaxies play in the buildup of a massive, quiescent population, we also need to understand the evolutionary stages that follow their bursts.By design of our selection criteria, the compact starburst galaxies in our sample are similar to PSBs in that they have B and Astar dominated spectral features and weak nebular emission.Understanding the population of PSBs in a similar redshift interval as our sources would provide context for quenching timescales as well as what the progenitors of PSBs might look like.Wild et al. (2016) studied a population of massive, PSBs within 0.5 < z < 2, and determined that PSBs are a relatively short-lived, transitory phase in galaxy evolution, likely lasting ∼ 0.1 − 1 Gyr (see also Wild et al. 2009).This timescale range was determined by modeling PSBs in both toy-model and hydrodynamic simulations, and evolving them to determine the amount of time they would be targeted as PSBs-a similar method to what we do here for our compact starburst galaxies.The PSBs selectability timescale is given as the blue region in Figure 7.Our compact starburst galaxies with selectability timescales of ∼ 100 Myr would be selected for 10 − 100% of the time PSBs would be selected by their respective selection criteria.
It would be expected that extremely compact starburst galaxies and PSBs would have similar space densities within a given redshift range if they were two evolutionary stages that were directly related to each other.In other words, if compact starburst galaxies are the immediate progenitors to PSBs, they should be found in similar abundances.This is what is seen in Figure 8.The Wild et al. (2016) PSBs within the mass range 10.5 < log M * /M < 12.5 show a decrease in space density with decreasing redshift.The lowest redshift bin for the Wild et al. (2016) PSBs overlaps with the upper limits of the redshift range probed for our compact starburst galaxies.The mass bin for Wild et al. (2016) is consistent with that of our sources so we did not have to correct for integrating the SMF within different mass intervals.Our sources overlap within the margin of error with the estimated PSB space density at the lowest redshift included in the Wild et al. (2016) sample.
The redshift evolution of the Wild et al. ( 2016) PSB space density is also consistent with declining star formation and cold-gas densities over cosmic time-properties that would also impact the frequency of extremely compact bursts of star formation (e.g Madau & Dickinson 2014;Riechers et al. 2019).Since the cosmic SFR density sharply declines at lowz, we also want to compare our compact starburst space density to that of low-z PSBs to determine if our calculated space density is consistent with the decline in PSB space density on the interval 0 < z < 1.We calculate the z ∼ 0.05 PSB space density by integrating the lowest-z luminosity function presented in Pattarakijwanich et al. (2016).This luminosity function is given per [5000 Å] magnitude, a fiducial top hat filter used to calculate average f λ across 4950 < λ/ Å < 5100 for the rest frame spectra of the PSBs in their sample.In order to calculate a comparable space density from this, we needed to construct a [5000 Å] mass-luminosity relation to determine our bounds of integration.We did this by calculating [5000 Å] magnitudes from SDSS spectra for the low-z PSBs studied in French et al. (2018) using the methodology described in Pattarakijwanich et al. (2016) and using MPA-JHU stellar masses (Brinchmann et al. 2004;Tremonti et al. 2004).We then integrated the Pattarakijwanich et al. (2016) luminosity function within 10.5 < log M * /M < 11.5, which corresponds to −23.3 < [5000 Å] < −21.3, to obtain a low-z PSB space density of ∼ (2.9 +1.2 −1.3 ) × 10 −6 Mpc −3 .This is given as the green triangle in Figure 8.This is of the same order of magnitude of that for our z ∼ 0.5 compact starburst galaxies, which supports that a fraction of the most extreme PSBs might have undergone an extremely compact starburst phase like that observed in our galaxies.

Comparison to ULIRGs
Within the framework of merger-driven galaxy evolution, it is likely that extremely compact starburst events are most relevant in the remnants of major, gas-rich mergers.We also know that major, gas-rich mergers can trigger strong bursts of dusty star formation which would be observed as a ULIRG with L F IR > 10 12 L .It is possible that sources like the massive, extremely compact starburst galaxies in our sample could represent the transition between the dust-obscured ULIRG and the beginning of a galaxy-scale blowout.Here, we compare the selectability timescale and space density of our compact starbursts to that of ULIRGs in order to contextualize their importance in merger-driven galaxy evolution.
The timescales upon which a galaxy will experience ULIRG-like star formation are poorly constrained.On the low end, SN-driven winds could cut the lifetime of a single starburst in a ULIRG to 1-10 Myr (e.g., Thornley et al. 2000).However, studies of ULIRGs with a wide variety of morphologies have allowed the ULIRG lifetime to be estimated to be in the 0.1-1 Gyr range (e.g.Farrah et al. 2001;Murphy et al. 2001;Farrah et al. 2003).It is possible that this wide range of estimated ULIRG lifetimes is due to the fact that it is likely that a ULIRG undergoes multiple large bursts of star formation, allowing it to be selected as such on discontinuous time intervals (e.g., Bekki 2001;Farrah et al. 2001).Farrah et al. (2003) analyzed a population of 41 local ULIRGs and found that most of their sources would have lifetimes 10 Myr 40.From all of the values quoted above, we assume that the lifetime of a ULIRG is ∼ 1 − 100 Myr, and show this range as the purple shaded region in Figure 7.However, it is important to make the distinction that these timescales are more strongly related to the physical timescales of dusty star formation than to observable lifetimes caused by respective selection criteria as discussed in other sections.The post-peak SF ages for the MgII galaxies in our sample calculated in Davis et al. (in prep) are better comparisons to the ULIRG lifetimes due to the fact that they are tied more to the physical properties of the galaxies.As stated earlier, Davis et al. (in prep) calculated the average post-peak SF age of ∼ 70 Myr, which is largely consistent our estimate that they would be able to be targeted for ∼ 148 +27 −24 Myr.These timescales are of a similar order of magnitude to that of ULIRGs, which is largely unsurprising because both types of systems are characterized by their energetic starbursts, albeit ours are a bit more extreme.
We next compare our estimated compact starburst space density to that of ULIRGs in a similar redshift interval.Ko-prowski et al. (2017) computed the evolution of the far-IR luminosity function for galaxies out to z ∼ 5. We estimate the observed space density of ULIRGs by adopting the 0.5 < z < 1.5 far-IR luminosity function presented here.Integrating the luminosity function for L IR > 10 12 L gives n ULIRG ∼ 6 × 10 −5 Mpc −3 .This is shown as the purple shaded region in Figure 8, where the range of values is due to the uncertainty in the Schechter function fit as described in Koprowski et al. (2017).We note that we do not correct for differences in the mass distributions between the ULIRG sample and our sources because ULIRG sample was luminosity selected.Similarly, Magnelli et al. (2009) calculated the evolving far-IR luminosity function and space density for ULIRGs for several redshift bins within the interval 0.4 < z < 1.3.For the 0.4 < z < 0.7 and 0.7 < z < 1 bins, n ULIRG ∼ 3 × 10 −5 Mpc −3 and n ULIRG ∼ 2 × 10 −5 Mpc −3 , respectively.
Comparing these values to our estimated compact starburst space density ((1.1 +0.5 −0.3 ) × 10 −6 Mpc −3 ) suggests that it is possible that ∼ 3 − 8% of intermediate-z ULIRGs can experience a phase similar to that observed in our sample of extremely compact starburst galaxies.The physical timescales of ULIRGs and our compact starbursts are driven by the same processes, and they are on the same order of magnitude, while there is a factor ∼ 12 − 40 difference in their space densities.It is possible the sources in our sample represent a small fraction of the most extreme population of ULIRGs that have the highest SFRs and/or are the most compact.
We also compare the space density of our intermediate-z massive, compact starburst galaxies to that of low-z ULIRGs, similar what we hae done in the previous subsection for PSBs since we expect a sharp decline in the ULIRG space density alongside that of the cosmic SFR density (e.g., Madau & Dickinson 2014).Kim & Sanders (1998) presented a luminosity function for 0.05 < z < 0.2 ULIRGs, and integrating the luminosity for log L IR /L > 12 gives a space density of ∼ (4 ± 1) × 10 −7 Mpc −3 .This is given as the purple hexagon in Figure 8.Given that the space density of our intermediate-z, compact starburst galaxies is calculated in a redshift range between that of the low and intermediatez ULIRGs, this very steep decline in ULIRG space density also suggests that a small fraction of ULIRGs could undergo a phase like that observed in our galaxies as they evolve.
6.4.Comparison to z ∼ 0.5 merger rate per co-moving unit volume Since extremely compact starburst galaxies are likely formed by the merging of gas-rich disk galaxies, it is important to characterize how many major mergers could produce events like those observed in our sample of galaxies.This requires having knowledge of the major merger rate over a given redshift range.In the past few decades, much work has been done to constrain the galaxy-galaxy merger rate throughout cosmic time.However, there are large systematic uncertainties in this measurement that have prevented the reaching of a consensus between theory and observations and even between different observational techniques.Here, we summarize the most recent results in calculating the z ∼ 0.5 galaxy merger rate per co-moving unit volume and use them to contextualize our compact starburst space density.To be more concise, we will refer to the merger rate per co-moving unit volume as the merger rate density for the rest of this paper.
A crucial piece of calculating the galaxy merger rate density is understanding the timescales upon which a system would be identified as a major merger.This is also the aspect of the calculation that contributes the most uncertainty to the major merger rate density.The two main methods to identify merging galaxies are to select systems with disturbed morphologies (e.g., Abraham et al. 1994Abraham et al. , 2003;;Conselice 2009;Lotz et al. 2008) or to search for systems comprised of close pairs (e.g, Le Fèvre et al. 2000;Bluck et al. 2009).Each of these methods probe different stages of the merger and are susceptible to different biases.Close pair selection identifies sources before the merger begins but morphological selection can detect systems before, during, and after the merger occurs, allowing morphologically selected galaxy mergers to be identifiable on different timescales than their close pair counterparts.
In Figure 7, we compare the selectability timescale calculated for our modeled compact starburst galaxies (black star) to that of all galaxy mergers presented in Lotz et al. (2008) (grey shaded region).The Lotz et al. (2011) region reflects the range of timescales calculated for simulated systems with mass ratios 1 : 10 < µ < 1 : 1 that were selected morphologically (for a detailed review; Abraham et al. 1994Abraham et al. , 2003;;Lotz et al. 2011).We find that extreme compact starburst events are selectable for a fraction of the amount of time that a morphologically selected galaxy merger would be under its own respective criteria.
Having constraints on galaxy merger timescales allows for the merger rate density to be calculated.We show our calculated compact starburst space density (black star) in conjunction with merger rate densities (grey and red shaded regions) as well as the space densities of other phases of mergerdriven evolution in Figure 8.The grey shaded region represents the range of the predicted observable merger rate densities calculated in Lotz et al. (2011), and the red shaded region represents the observed range of merger rate densities presented in Stott et al. (2013) which used Lotz et al. (2011) predicted observable timescales.Both the Lotz et al. (2011) and Stott et al. (2013) merger rate densities were calculated for samples containing galaxies with log M * /M > 10, while the compact starburst galaxies in our sample are typically log M * /M > 10.5.We therefore adjusted the Lotz et al. (2011) and Stott et al. (2013) merger rate densities to ensure that we are working within the same mass interval of the galaxy stellar mass function (SMF) within the appropriate redshift range, as described above.We also converted these merger rate densities to merger space densities by assuming a typical merger timescale of 0.5 Gyr (Lotz et al. 2011).
We find that our estimated massive compact starburst space density is ∼ 200 times smaller than the merger rate density within a similar redshift interval, suggesting that only a small fraction of galaxy mergers would trigger an extreme burst of compact star formation similar to our observed sample.However, we reiterate that the Lotz et al. (2011) and Stott et al. (2013) merger rates consider both major and minor mergers.It is likely that these compact starburst events are triggered only by gas-rich major (mass ratio 1:1 -4:1) mergers which only make up a fraction of the total number of mergers occurring across a given redshift range (e.g., Lin et al. 2010).This suggests that although only a small fraction of all galaxy mergers might result in extremely compact starbursts, that these could be a likely result of a larger fraction of gas-rich major mergers.6.5.Comparison to z ∼ 0.5 massive, quiescent galaxies Another way of understanding the role of compact starburst galaxies in the buildup of quiescent galaxy populations is to compare their space density to that of massive, quiescent galaxies within the same redshift range.Moustakas et al. (2013) presented a detailed study of galaxies targeted in PRism Multi-object Survey (PRIMUS) and provided contsraints on the evolution of the stellar mass function from 0 < z < 1.The galaxies in PRIMUS were sorted into star forming and quiescent populations, and the evolution of their space density was calculated across different stellar mass and redshift bins.For quiescent PRIMUS galaxies in the mass range 10.5 < log M/M < 11, their space density increases by ∼ 2×10 −4 Mpc −3 from z ∼ 0.8 to z ∼ 0.35.The net decline in space density for star forming galaxies in this redshift interval is ∼ 9 × 10 −5 Mpc −3 .These changes in space density are comparable to the merger rate in this redshift range and are a factor of ∼ 1000 larger than our measured space density of n ∼ (1.1 +0.5 −0.3 ) × 10 −6 Mpc −3 for our sample of massive, compact starburst galaxies.This is broadly consistent with short-lived compact starbursts existing for ∼ 100 Myr, evolving into massive, quiescent galaxies which would exist on ∼Gyr timescales.It is likely that this is a relatively rare phase of galaxy evolution within the general population of massive, quiescent galaxies.However, it is possible that the fraction of those that have also previously undergone extreme ULIRG or PSB phases also could have experienced extremely, compact starbursts like those in our sample.

SUMMARY & CONCLUSIONS
In order to build up a population of quiescent galaxies, otherwise gas-rich and star forming galaxies need to undergo some type of quenching process to either disrupt or expel the gas in the system.Violent, dissipative feedback in which either AGN activity or rapid star formation injects energy into the ISM is an important process that impedes the formation of stars in a galaxy.Observationally, feedback manifests as large-scale gas outflows being driven from a galaxy.
Within the context of merger-driven galaxy evolution, we expect gas-rich mergers of massive star forming galaxies to trigger dusty starburst events that would then be followed by a blowout event in which nuclear gas and dust is expelled from the system, therefore exposing the nuclear regions of the galaxy.In this work, we have studied a population of 115 z ∼ 0.5 massive galaxies that are experiencing extreme, compact starburst events and outflows.Resolved HST WFC3 observations of a subset of these show that they are merger remnants, suggesting that these types of events could be an phase within a simple merger-driven evolutionary pathway.
Our goal for this work was to determine how long galaxies like the ones we observe would be selected under a certain set of selection criteria, to estimate their space density, and to place them into cosmological context with other evolutionary phases massive galaxies could experience.We do this by empirically modeling the stellar populations of z ∼ 0.5 massive, compact starburst galaxies.Our model is dependent on four parameters: nuclear burst age, burst mass fraction, optical depth of dust enshrouding newly formed stars, and total galaxy stellar mass.These posterior distributions for these parameter values are constrained for each of the 115 galaxies in our sample by fitting the SDSS ugriz and WISE W1/W2 photometry for the 151 galaxies in our sample using an MCMC technique.We randomly draw sets of parameters from the Gaussian models for the MCMC-calculated posterior distributions to assemble a mock population of compact starburst galaxies.We evolve the modeled sources to determine the timescales under which the galaxies we model would be selected by our targeting criteria.We find that this timescale is 148 +27 −24 Myr and that the corresponding intrinsic space density is n CS ∼ (1.1 +0.5 −0.3 ) × 10 −6 Mpc −3 .Our results, as summarized in Figure 8, suggest that our observed population of extreme compact starburst galaxies could fit into an evolutionary scheme described in Barro et al. (2013).At higher redshifts massive, compact star forming galaxies are more common, and they are believed to be the progenitors of massive, compact quiescent galaxies.Based on comparisons with the Barro et al. (2013) sample of massive, compact galaxies it is likely that our sources follow a similar life cycle in which a gas-rich major merger triggers a burst of star formation.This starburst then drives massive, high velocity gas outflows, thus rapidly quenching the galaxy.This galaxy would be observable for ∼ 100 Myr timescales as a PSB (e.g., Wild et al. 2016), and would then evolve into a massive, compact, quiescent galaxy.Throughout cosmic time, the massive, quiescent galaxy will undergo minor mergers, allowing it to grow in both mass and size to become a typical quiescent galaxy consistent with the masssize relation of the massive quiescent galaxy population at z=0, which is notably devoid of compact quiescent galaxies (e.g., Taylor et al. 2010).Although it is more common for galaxies to experience this timeline earlier in the Universe, our galaxies appear to be consistent with these trends within their respective redshift interval.The space density of our massive, compact starbursts suggests that they can contribute to the buildup of a fraction of PSBs and massive, extreme compact quiescent galaxies within their epoch, which in turn could contribute to the overall population of massive, quiescent galaxies in the future.

Figure 1 .Figure 2 .
Figure1.HST WFC3 cutouts of 6 representative galaxies in our sample that overlap with those presented inSell et al. (2014).We note that we omit J0944+0930 and J1104+5946 fromSell et al. (2014) as they do not satisfy all of our selection criteria.All of these galaxies show clear signs of tidal disruptions, consistent with their extreme nuclear starbursts being triggered by major merger events.

Figure 3 .
Figure 3. Panel (a): Best fit SED for galaxy J0826+4305.The red points and error bars are the observed photometry and ±0.25 magnitude uncertainty region, respectively.The open black squares are the modeled photometry.The blue, violet, and green curves are the modeled SED for the total galaxy system, nuclear burst, and host galaxy, respectively.Panel (b): Triangle plot of parameter posterior distributions for galaxy J0826+4305.We calculate the mean and covariances of these posterior distributions to model them as 4D-Gaussian distributions.We then randomly draw sets of parameter values from the Gaussian-modeled posterior to construct a mock population of compact starbursts.Panel (c): Galaxy cutout as seen in Figure 1.

Figure 4 .
Figure 4. Shown here are the modeled evolutionary tracks of the apparent i-band and g-band SDSS magnitudes (panels (a) & (b)), [OII] equivalent width (panel (c)), and WISE W 1 − W 2 color (panel (d))for a sub-sample of modeled galaxies.The x-axis is age relative to the burst peak.The grey-shaded rectangles represent the regions of parameter space that would not be selected by the criteria placed on that given parameter.This is a schematic representation-the full details of our source selection can be found in Section 2.

Figure 5 .
Figure5.Distribution of average selected lifetimes from the mock sample.We find that extreme nuclear starbursts like the ones observed in our galaxies would be selected for ∼ 148 +27 −24 Myr, consistent with the burst ages calculated inDavis et al. (in prep).

Figure 9 .
Figure 9. Panel (a): Best fit SED for galaxy J01713+2817.The red points and error bars are the observed photometry and ±0.25 magnitude uncertainty region, respectively.The open black squares are the modeled photometry.The blue, violet, and green curves are the modeled SED for the total galaxy system, nuclear burst, and host galaxy, respectively.Panel (b): Triangle plot of parameter posterior distributions for galaxy J01713+2817.We calculate the mean and covariances of these posterior distributions to model them as 4D-Gaussian distributions.We then randomly draw sets of parameter values from the Gaussian-modeled posterior to construct a mock population of compact starbursts.Panel (c): Galaxy cutout as seen in Figure 1.

Figure 10 .
Figure 10.Panel (a): Best fit SED for galaxy J2118+0017.The red points and error bars are the observed photometry and ±0.25 magnitude uncertainty region, respectively.The open black squares are the modeled photometry.The blue, violet, and green curves are the modeled SED for the total galaxy system, nuclear burst, and host galaxy, respectively.Panel (b): Triangle plot of parameter posterior distributions for galaxy J2118+0017.We calculate the mean and covariances of these posterior distributions to model them as 4D-Gaussian distributions.We then randomly draw sets of parameter values from the Gaussian-modeled posterior to construct a mock population of compact starbursts.Panel (c): Galaxy cutout as seen in Figure 1.

Figure 11 .
Figure 11.Panel (a): Best fit SED for galaxy J1506+6131.The red points and error bars are the observed photometry and ±0.25 magnitude uncertainty region, respectively.The open black squares are the modeled photometry.The blue, violet, and green curves are the modeled SED for the total galaxy system, nuclear burst, and host galaxy, respectively.Panel (b): Triangle plot of parameter posterior distributions for galaxy J1506+6131.We calculate the mean and covariances of these posterior distributions to model them as 4D-Gaussian distributions.We then randomly draw sets of parameter values from the Gaussian-modeled posterior to construct a mock population of compact starbursts.Panel (c): Galaxy cutout as seen in Figure 1.

Figure 12 .
Figure 12.Panel (a): Best fit SED for galaxy J1558+3957.The red points and error bars are the observed photometry and ±0.25 magnitude uncertainty region, respectively.The open black squares are the modeled photometry.The blue, violet, and green curves are the modeled SED for the total galaxy system, nuclear burst, and host galaxy, respectively.Panel (b): Triangle plot of parameter posterior distributions for galaxy J1558+3957.We calculate the mean and covariances of these posterior distributions to model them as 4D-Gaussian distributions.We then randomly draw sets of parameter values from the Gaussian-modeled posterior to construct a mock population of compact starbursts.Panel (c): Galaxy cutout as seen in Figure 1.

Figure 13 .
Figure 13.Panel (a): Best fit SED for galaxy J1613+2834.The red points and error bars are the observed photometry and ±0.25 magnitude uncertainty region, respectively.The open black squares are the modeled photometry.The blue, violet, and green curves are the modeled SED for the total galaxy system, nuclear burst, and host galaxy, respectively.Panel (b): Triangle plot of parameter posterior distributions for galaxy J1613+2834.We calculate the mean and covariances of these posterior distributions to model them as 4D-Gaussian distributions.We then randomly draw sets of parameter values from the Gaussian-modeled posterior to construct a mock population of compact starbursts.Panel (c): Galaxy cutout as seen in Figure 1.