LISA Gravitational Wave Sources in A Time-Varying Galactic Stochastic Background

A unique challenge for data analysis with the Laser Interferometer Space Antenna (LISA) is that the noise backgrounds from instrumental noise and astrophysical sources will change significantly over both the year and the entire mission. Variations in the noise levels will be on time scales comparable to, or shorter than, the time most signals spend in the detector's sensitive band. The variation in the amplitude of the galactic stochastic GW background from galactic binaries as the antenna pattern rotates relative to the galactic center is a particularly significant component of the noise variation. LISA's sensitivity to different source classes will therefore vary as a function of sky location and time. The variation will impact both overall signal-to-noise and the efficiency of alerts to EM observers to search for multi-messenger counterparts.


INTRODUCTION
The Laser Interferometer Space Antenna (LISA) is a space-based mHz gravitational wave (GW) observatory set to launch in the mid-2030s Amaro-Seoane et al. (2017). Perhaps hundreds of millions of galactic ultra-compact binaries (UCBs), primarily white dwarf binaries, will continuously contribute to the gravitational-wave signal seen by LISA, with the brightest sources likely being first detected within days or even hours of LISA entering observing mode (Cornish & Robson 2017;Timpano et al. 2006). Over the mission's lifetime, LISA will likely detect tens of thousands of these sources (Cornish & Robson 2017). Dozens of LISA-detectable white dwarf binaries have already been detected electromagnetically (Stroeer & Vecchio 2006;Kupfer et al. 2018), and are currently the only guaranteed-in-advance sources for multi-messenger gravitational-wave astronomy. Such sources, which are bright in both the electromagnetic (EM) spectrum and gravitational waves, present excellent opportunities for white dwarf science (Marsh 2011), tests of general relativity (Littenberg & Yunes 2019), and checking the instrumental calibration of LISA (Littenberg 2018).
While the enormous quantity of galactic GW sources is a scientific opportunity for LISA, it also presents a challenge for LISA's other observations. The galaxy contains far more GW sources than LISA will be able to resolve. These unresolved sources will combine to form a stochastic gravitationalwave background which will compete with other expected LISA science sources, such as Supermassive arXiv:2206.14813v1 [astro-ph.IM] 29 Jun 2022 Black Hole Binaries (SMBHBs), Stellar Origin Black Hole Binaries (SOBHBs), and Extreme Mass Ratio Inspirals (EMRIS). The stochastic background from galactic sources could also mask extragalactic and cosmological stochastic backgrounds (Bonetti & Sesana 2020;Boileau et al. 2021;Adams & Cornish 2014), which could be a powerful test of models of early universe cosmology (Auclair et al. 2020;Ricciardone 2017).
To maximize the impact of LISA's transient source alerts to EM observers, the stochastic background must be characterized and updated continuously to give the best possible inputs to parameter constraint pipelines and mitigate the impacts on other source classes (Bellovary et al. 2020). As new galactic sources are detected and characterized, they can be subtracted from the LISA data stream, reducing the amplitude of the stochastic background. Because the signals from galactic sources are continuous, subtracting them from the data will retrospectively improve constraints on past transient sources throughout the mission's lifetime and enhance the sensitivity of future alerts.
Existing analyses have generally modeled the galactic stochastic background as constant throughout the year (Timpano et al. 2006;Karnesis et al. 2021). In reality, LISA's quadrupolar antenna pattern continuously rotates across the sky, causing the sensitivity as a function of sky location to vary over time. The rotating sensitivity would still produce an approximately constant noise level for an isotropic stochastic signal (as expected from cosmology). However, like most mass in the galaxy, GW emission from galactic binaries is expected to be highly concentrated in the direction of the galactic center (Cornish 2002). As a result, the galactic stochastic gravitational-wave background amplitude will vary significantly throughout of the year based on the position of LISA's antenna pattern relative to the galactic center. In this analysis, we adopt an improved technique to better model the effect of this variation on LISA.
This paper describes a time-frequency decomposition of the stochastic background with LISA. This decomposition allows us to effectively model the galactic stochastic background as a source of cyclostationary noise, consisting of a time-independent spectral contribution to the noise signal modulated by a frequency-independent amplitude.
The paper is organized as follows. In Section. 2, we adapt existing iterative fitting procedures to a time-varying galactic spectrum, and show that the decomposition results in a much-improved noise model compared to the previously assumed constant-noise model. In Section 3, we provide coefficients for an empirical fitting formula to the cyclostationary spectrum. In Section 4, we explore the results of using the improved cyclostationary model on the detectability of galactic binaries and SMBHBs. In Section 5, we draw conclusions and look toward our result's impact on future analysis.

METHODS
To develop the time-varying galactic stochastic background, we use the catalog of galactic binary sources from the "Sangria" dataset produced from a population synthesis model for the LISA data challenge (Baghi 2022). Other galactic populations could produce different results (Nissanke et al. 2012;Lamberts et al. 2019;Breivik et al. 2020). To generate a model for the noise background as a function of time, we use a version of the iterative fitting procedure described in (Timpano et al. 2006) and ) adapted to a time-varying spectrum. See (Adams & Cornish 2014) for a different approach to handling the temporal variation in the galactic spectrum.
Calculating the signals in the time-frequency domain allows the methodology to be efficiently adapted to handle a time-varying spectrum. We use the Wilson-Daubechies-Meyer (WDM) wavelet basis (Necula et al. 2012) with a modified normalization chosen such that a unit power spectrum in the wavelet domain S nm = 1 generates unit variance white noise in the time domain. The basis has a uniform tiling in frequency and time, which permits generating a uniform grid of frequency spectra as a function of time.
We generate LISA waveforms as described in Cornish (2020) using the Rigid Adiabatic approximation (Cornish & Shuman 2020;Rubbo et al. 2004) We first generate a realization of the signal seen by LISA as where w nm,inst is a random realization of the instrumental noise and w nm,k are the wavelet decompositions of each of the 29,000,036 binaries in the Sangria dataset. The vast majority of these binaries produce too faint a signal to be individually detectable. To mitigate unnecessary evaluations, we do an initial pass where we calculate the instrumental-noise-only SNR and add every source with SN R inst < SN R thresh into an 'irreducible' background w nm,irr , so that we can rewrite: where now the sum over k runs only over the O(100, 000) binaries bright enough to pass the initial minimum SNR cut. This component of the procedure will not be possible for the real LISA data because we will be starting from the observed w AET nm and not an artificially simulated catalog. To further accelerate convergence, we also remove the bright binaries with SN R inst > 30 at this step. At a later iteration, we verify all removed bright binaries retain SN R calc > SN R thresh and replace them if necessary.

The Cyclostationary Model
From Eq. 2, we must estimate the noise level S AE nm for the two channels which have an appreciable component from the galactic stochastic background. As a first approximation, we can approximate the mean noise spectrum as: We assume the instrumental component is constant in time and the same for both channels, although relaxing this assumption would be straightforward in the time-frequency domain. To write the signal as cyclostationary, we decompose the signal into two parts, one depending only on time and one only on the frequency: where S m,gal is the mean galactic spectrum averaged over time and the A and E channels, and r AE n is a multiplier which varies in time and will be different for the A and E channels. To determine r AE n , we can average over frequency where the sum from m 1 to m 2 runs over only pixels where S m,gal / S m,instr > 5 to reduce the influence of numerical noise on r AE n . For the particular test case presented in this paper, we can separate the instrumental noise from the galactic signal exactly, so it is possible to instead obtain a more precise fit to r AE n by subtracting the instrumental signal exactly from w AE nm to obtain instead where m 1 and m 2 can now obtain a numerically stable average over a larger range in frequency, for example S m,gal / S m,instr > 0.1. Although this version of the procedure would not be possible for real LISA data, this more precise estimator of r AE n allows us to more thoroughly investigate the limitations of the cyclostationary model compared to a hypothetical perfectly optimal model of the galactic stochastic background.
Eq. 5 allows for an arbitrary cyclostationary noise model. Physically, the time variation of the signal is generated entirely by the shifting antenna pattern due to the rotation of the constellation relative to the galaxy throughout the year (Cornish 2001). Consequently, we can generate a smoother noise model by extracting only the frequency components in harmonics of 1 year: where r AE n = r AE (t n ). The detailed distribution of the population of galactic binaries will ultimately determine the number of harmonics necessary to fit the time evolution of the amplitude, but there are practical limits to LISA's ability to place constraints on higher harmonics (Edlund et al. 2005;Renzini et al. 2022;Nissanke et al. 2012;Cornish 2002). We find that the first five harmonics of 1 year, k = 1...5, are sufficient to capture essentially all of the physical time variation of the signal while keeping the spurious variation in r AE n due to noise artifacts minimal. The coefficients A AE,k and φ AE,k can be extracted from either the Fast Fourier Transform of r AE n or a least-squares fit. A fit to r AE n for our two year simulated dataset is shown in Fig. 1, and parameters for several different observing time intervals are shown in Table. 1.

The Iterative Fit
The objective of the iterative fitting procedure is to remove all binaries bright enough that their parameters would be sufficiently well characterized by a full MCMC pipeline that their signal could be subtracted from the LISA data stream. Because the binaries interfere with each other's signals, subtracting a bright binary will often cause further fainter binaries to become detectable in the data stream. We use the well-established iterative procedure described in, e.g., Karnesis et al. (2021).
By iterating this procedure until there are no bright binaries left, we arrive at an estimate of the noise level as a function of time S AE nm . For this analysis, we choose a cutoff of SNR thresh = 7 as the threshold for a galactic binary to be considered detectable. Our method allows directly comparing results between a more realistic cyclostationary noise model, described in Sec. from Eq. 6 as a function of time for two years of simulated LISA data. The amplitude in the A channel varies by a factor of 7, while the amplitude in the E channel varies by a factor of 4, and the mean of the envelopes varies by a factor of 5, from 0.32 to 1.67. The model includes the first 5 harmonics of 1. yr as described in Eq. 7, although in this case the amplitudes of the first two harmonics dominate the other three by more than an order of magnitude. incoherent superposition of signals becomes intrinsically reasonably smooth after a few iterations, we adopt a Gaussian smoothing with a smoothing length that decreases significantly over the first few iterations, rather than the simple moving average or windowed median smoothing used in Karnesis et al. (2021). The smoothing procedure is described in more detail in Sec. 2.3. For approximately the first two iterations of the iterative subtraction procedure, the signal is not well approximated as cyclostationary due to distortion from the individual influence of bright, wellisolated galactic binaries. For those iterations, we set r AE n = 1. To prevent the initial iterations with a different noise model and smoothing length from causing sources to be incorrectly subtracted, we use a more conservative SNR thresh for those first few iterations. In testing, an exponential phase-in of the SNR cutoff from the initial SNR thresh = 7(1 + e −2k+4 ) for iterations k = 1..3 and SNR thresh = 7 thereafter works well to prevent any sources from ever being incorrectly subtracted due to the changing smoothing length or time dependence, while ensuring each iteration subtracts almost as many sources as possible. As long as the cutoff is sufficiently conservative to prevent any erroneous subtractions of sub-threshold sources, the exact details of the phase-in of the cutoff should not affect the final result at all. However, the rate of phase-in does affect the number of iterations and total compute time it takes to achieve the final converged result.
An example illustrating the convergence to a galactic spectrum for two years of data using our procedure is shown in Fig. 2. In this example, the procedure has essentially converged after < 10 iterations, and reached full convergence on iteration 17.

Smoothing
During the iterative fitting procedure, it is necessary to smooth the modeled galactic background in frequency so that individual bright binaries do not artificially compete with their own power. The final output power spectrum does not include any smoothing. To accomplish the smoothing, we apply a small Gaussian smoothing in log-frequency to the whitened log-power spectrum, where the width of the Gaussian smoothing in log-frequency is chosen such that, for the first iteration, the width around the peak at 2 mHz is σ(log(2 mHz)) = 6 frequency pixels. Because the smoothing is only really necessary for the first few iterations, we reduce the smoothing length by one e-folding per iteration to a floor of σ(log(2 mHz)) = 0.25 pixels starting in the fifth iteration.
The addition and subsequent subtraction of the factor of = 10 −50 is a small numerical stabilizer to prevent the argument of the logarithm from becoming zero or negative, and is chosen to be much smaller than the lowest value of S m,gal of interest. Using S m,instr as the numerical stabilizer would obtain a more realistic cutoff. However, for the purposes of this paper, the smaller numerical stabilizer complements the more precise estimator of r AE n in Eq. 6 to allow us to investigate the ultimate limitations of the cyclostationary model to the best possible precision.
We can then replace S m,gal with S AE m,gal,smooth in Eq. 4 throughout the iterative fitting procedure.

EMPIRICAL FITTING FORMULA
The smoothed average frequency spectrum in Eq. 8 allows the spectrum fitting to have essentially monotonic convergence to a final self-consistent estimate of the spectrum of the galactic background and is agnostic regarding the predicted frequency distribution of galactic sources. However, for some applications, it is desirable to reduce the number of parameters in the fit and obtain a smoother spectrum by fitting to a spectrum of known shape. A variety of ways of parameterizing the shape of the galactic background exist (Flauger et al. 2021;Caprini et al. 2019;Karnesis et al. 2021). Similar to Karnesis et al. (2021), we use a 5-parameter model of the galactic background: where log 10 (f knee ) = a k log 10 (T obs ) + b k and log 10 (f 1 ) = a 1 log 10 (T obs ) + b 1 can be used to predict the coefficients as a function of observation duration, where T obs is in years. The improvement in the overall amplitude of the galactic stochastic background as a function of T obs is absorbed by the variation in f 1 , so a shift in A as a function of time is not necessary. For fitting purposes A and f 2 have a large dynamic range, so we search over the logarithms log 10 A and log 10 f 2 instead, and fit over the spectrum in the wavelet domain, S m,gal S gal (f )/dt. To fit the parameters of Eq. 9, we calculate the galactic spectra S AE m,gal (T obs ) in 1 year increments T obs = 1..8 yr using the iterative fitting procedure described in Sec. 2.2. We then use the dual annealing algorithm described in Xiang et al. (2013) as implemented in SciPy (Virtanen et al. 2020) to obtain a least-squares fit of log 10 S AE m,gal (T obs ) + S m,inst to the 7-parameter model in Eq. 9. We fit both channels and all eight years of data simultaneously.
The empirical best fit parameters are given in Table 2, with spectra plotted for several selected observing times in Fig. 3. The best fit spectral parameters shown in Table 2 are similar whether or not we use the cyclostationary model, and are in good general agreement with the results in Karnesis et al. (2021) for all the shape parameters, given expected differences due to our different smoothing and fitting procedures. Our amplitude normalization in the wavelet domain is chosen such that  Table 2. The best fit parameters to the fitting formula in Eq. 9 in both our cyclostationary model and a constant noise model. The shape parameters change only slightly between the models, while there is a small shift in the overall amplitude normalization. Note that the normalization of our amplitudes A is different from Karnesis et al. (2021). Although we fit the amplitudes to the mean spectrum in the wavelet domain, to enable easier comparison to other frequency domain results, we report instead A = dtA wavelet , where dt 30.075 s.
Our normalization in the wavelet domain is chosen such that for N f = 2048, N t = 512 pixels/year, a unit power spectrum in the wavelet domain corresponds to unit variance white noise in the time domain.
S nm = 1 corresponds to unit variance white noise in the time domain for a grid size N f = 2048, N t = 512 pixels/year. To enable easier comparison to other frequency domain results, we report instead A = dtA wavelet , where dt 30.075 s. In combination with the parameters in Table 1 used in the formula for r AE n in Eq. 7, the parameters in Table 2 can be used to obtain a reasonable 25 parameter fit to our full cyclostationary model of the galactic spectrum fit to Sangria data for any given T obs , A full joint least-squares fit of the time and frequency parameters would likely find a slightly better best-fit model, but would be significantly more computationally expensive.

Improving The Fit
In this section, we evaluate quantitatively how the cyclostationary model improves the fit of the galactic stochastic background compared to a constant background model. To assess the improvement, we compare the whitened residuals from the simulated dataset; when the model is a good fit, the residuals will approach unit variance Gaussian white noise. This comparison is shown in Fig. 4.
We can then use an Anderson-Darling test (Anderson & Darling 1954;Jones et al. 1987) to detect deviations from normality. The constant model residuals give a test statistic A * 2 const ≈ 37.24, an extremely significant detection of deviation from normality (p < 10 −5 corresponds to a test statistic A * 2 const 2.28). The cyclostationary model residuals give A * 2 const ≈ 0.30, which is unable to reject the null hypothesis of normality at p > 0.1. Both sets of residuals are consistent with zero mean and unit variance.
Because we can separate the instrumental and galactic noise perfectly with our simulated "Sangria" dataset, we can investigate the true level of deviation from Gaussian white noise in the cyclostationary model, shown in the right panel of Fig. 4. Although this plot makes a slight deviation from cyclostationarity apparent by eye, note that, even in the complete absence of instrument noise, the overall distribution of the residuals still gives p > 0.1 consistency with normality by the Anderson-Darling test for 0.1 mHz < f < 4 mHz.
The deviation is likely due to a combination of the slight frequency dependence of the LISA antenna pattern in this frequency range (see, e.g., Cornish (2002)) and the fact that there are fewer ] 1 year 2 years 4 years 8 years Figure 3. Improvement of galactic confusion noise spectrum with observation time in the SNR thresh = 7 cyclostationary model, with the best fit spectrum from Eq. 9. Our fits are all performed intrinsically in the wavelet domain, but over this frequency range we approximate S AE (f ) S AE m dt, dt 30.075 s, to enable easier comparison to frequency domain results. Overall the model produces an excellent fit to the overall shape of the spectrum.
nearly-detectable binaries to average over at frequencies near the tails of the galactic spectrum. The deviation from the cyclostationary model is of little practical significance where the instrumental and galactic contributions to the noise cannot be perfectly separated, as in real data.

Sensitivity to Galactic Binaries
In our simulated dataset, the constant model predicts slightly more galactic binaries are detectable than in our cyclostationary model. However, the individual detected binaries have a different distribution on the sky, as shown in Fig. 5. The difference in detectability is due to the interaction between LISA's antenna pattern and the concentration of binaries within the galaxy. As shown in Fig. 6, the cyclostationary model significantly improves sensitivity away from the galactic center and, to a lesser extent, degrades towards the galactic center and anticenter. Overall, the improved cyclostationary model generally predicts a larger total sensitive volume for LISA than the constant model by a factor that varies depending on the frequency and observation time, as shown in Table. 3.

Sensitivity to Supermassive Black Hole Binaries
Because LISA's antenna pattern rotates relative to the galaxy throughout the year, the degradation in LISA's sensitivity due to the galactic stochastic background also evolves with time. This effect is analogous to the obscuration of electromagnetic signals by the matter in the galactic plane. An excellent source class to illustrate the impact of the variation is supermassive black hole binaries, The cyclostationary model is a significant improvement, with residuals statistically well approximated as Gaussian. In Right: we show the whitened residuals of the cyclostationary component after turning off instrumental noise completely, i.e. (w AE nm,gal ) 2 /S AE nm,gal . At frequencies around 2 mHz, the signal is still very well approximated as cyclostationary, while at the tails of the spectrum, a slight deviation from perfect cyclostationarity is apparent. This deviation is of limited practical significance because the spectrum falls off rapidly compared to the instrumental noise at those frequencies. Additionally, it will not be possible to separate the instrumental and galactic contributions to the noise spectrum in real data t (yrs) N det,const N det,cyclo N disagree r vol,1.0 mHz r vol,1.5 mHz r vol,2.0 mHz r vol,2.5 mHz  Table 3. Detection efficiency for the binaries in the Sangria dataset as a function of total observation time, and the ratio of the sensitive observing volumes for an injected test binary at several different GW frequencies. At shorter observation times, the cyclostationary model detects less binaries near the galactic center but overall improves sensitivity. The cutoff frequency of the galactic background decreases over time, and the models begin to agree above the cutoff frequency, as shown in Fig. 3.
which evolve over timescales of days to weeks and therefore sample a different galactic background amplitude depending on their time of merger. To illustrate the difference in sensitivity, we inject simulated SMBHB sources at a grid of chirp times t c and sky positions. The injection procedure as function of t c is shown in Fig. 7. In Fig. 9 The models disagree on the threshold SNR > 7 for 241 binaries, out of 7,273 total predicted detectable by the cyclostationary model, vs. 7,470 in the constant model. Because the binaries are physically highly clustered at the galactic center, the relative increase in detection efficiency far from the galactic center is significantly greater than the loss of efficiency near the galactic center. Therefore, the overall observing volume is larger in the cyclostationary model, despite the lower number of galactic binaries detected. The source of the observed structure is explored in Fig. 6. four selected chirp times. In Fig. 8, we show the sky-averaged effective range of LISA as a function of sky location in the different models. As anticipated by the analogy with the electromagnetic case, LISA is less sensitive to mergers in the direction of the galactic center and anticenter than mergers that occur out of the galactic plane. The uneven distribution of noise on the sky over time will likely also impact parameter estimation for SMBHB sources even at fixed signal-to-noise ratio. For example, the shifting relative amplitude of the noise in the A and E channels may result in altered constraints on sky location, inclination, and polarization in some parts of the sky. There will also be parameter-space volume effects due to the larger sensitive volume away from the galactic plane, which will likely shift the mass range LISA is sensitive as a function of sky location. We will examine the effects of the time-varying galactic stochastic background on parameter estimation in more detail in a future publication.

CONCLUSION
In this paper, we have extended existing iterative techniques for deriving the galactic stochastic background with LISA to handle the temporal variation in the galactic signal more correctly. Our technique significantly improves the quality of the fit of the noise model to the data.   Figure 7. Illustration of the injection procedure used to generate in Fig. 9 and Fig. 8. For this figure, we superpose the simulated signals for an injected SMBHB at each of the four chirp times presented in Fig. 8 with a whitened galactic+instrument noise spectrum from the constant noise model. The source has a total mass chosen such that f RD = 2 mHz, and a mean SNR 2 = 10, 000 across all realizations of the chirp time and sky location. We have examined how the amplitude of the galactic stochastic background varies throughout the year as a function of sky location for various sources. We have also shown how the amplitude variation's level and shape change over the mission's lifetime. To facilitate future analyses of the effects of time-varying galactic stochastic background without replicating the entire iterative procedure, we have a simple empirical fit to the time variation in the galactic stochastic background in the "Sangria" dataset in Eq. 7. We provide empirical coefficients for the annual cyclostationary time variation of the spectrum in Table 1 and the spectral shape parameters include the secular dependence on observation time in Table 2.
Overall, the improved modeling shows that results derived from the previous spectral models underestimate the total integrated observation volume of LISA by a factor of 10 − 20% for sources that accumulate most of their SNR in the 1 − 3 mHz frequency band. This improvement occurs because most of the sky is less severely contaminated than the galactic center. The level of improvement would depend on the specific galactic population model used, and it might be possible that some alternate galactic population models would not exhibit a significant improvement in the sensitive volume. We emphasize that our method is simply a more correct model of the noise spectrum, and it would still produce more correct results even for a choice of galaxy model where it predicted a lower total integrated observing volume.
Future work will combine our code with an MCMC pipeline to investigate the impact of the improved galactic spectrum modeling on parameter estimation for various sources.The temporal variation in the galactic spectrum may have more significant implications for parameter estimation than expected based on its effect on the computation of SNR alone. Therefore, correctly modeling the t c =0.73 yr t c =1.00 yr t c =1.26 yr t c =1.48 yr 50 0 50 Figure 9. Difference in predicted SNR between the cyclostationary and constant models for an injected f RD = 2 mHz supermassive black hole binary at four selected chirp times corresponding to the extrema in Fig. 3. The minima in LISA's sensitivity to SMBHBs correspond to the times when LISA's peak sensitivity passing closest to the galactic center (t c 1.26 yr) and closest to the galactic anticenter (t c 0.73 yr).
temporal variation in the noise spectrum will be an essential component of pipelines to obtain a global fit to LISA data. Our technique can easily be extended to any cyclostationary noise source in LISA, or even arbitrary non-cyclostationary time-frequency noise models, provided the noise is adiabatic.
Future analyses attempting to extract conclusions about LISA sources should use the improved noise modeling to more accurately model the impact of the galactic stochastic background on LISA data. This work was supported by NASA LISA foundation Science Grant 80NSSC19K0320.