Bayesian Analysis for Remote Biosignature Identification on exoEarths (BARBIE). II. Using Grid-based Nested Sampling in Coronagraphy Observation Simulations for O2 and O3

We present the results for the detectability of the O2 and O3 molecular species in the atmosphere of an Earth-like planet using reflected light at visible wavelengths. By quantifying the detectability as a function of the signal-to-noise ratio (S/N), we can constrain the best methods to detect these biosignatures with next-generation telescopes designed for high-contrast coronagraphy. Using 25 bandpasses between 0.515 and 1 μm and a preconstructed grid of geometric albedo spectra, we examined the spectral sensitivity needed to detect these species for a range of molecular abundances. We first replicate a modern-Earth twin atmosphere to study the detectability of current O2 and O3 levels, and then expand to a wider range of literature-driven abundances for each molecule. We constrain the optimal 20%, 30%, and 40% bandpasses based on the effective S/N of the data, and define the requirements for the possibility of simultaneous molecular detection. We present our findings of O2 and O3 detectability as functions of the S/N, wavelength, and abundance, and discuss how to use these results for optimizing future instrument designs. We find that O2 is detectable between 0.64 and 0.83 μm with moderate-S/N data for abundances near that of modern Earth and greater, but undetectable for lower abundances consistent with a Proterozoic Earth. O3 is detectable only at very high S/N data in the case of modern-Earth abundances; however, it is detectable at low-S/N data for higher O3 abundances that can occur from efficient abiotic O3 production mechanisms.


INTRODUCTION
Corresponding author: Natasha Latouf (nlatouf@gmu.edu, natasha.m.latouf@nasa.gov)
Over 5,000 exoplanets have been discovered and confirmed in the last three decades. With the field of exoplanet exploration booming since the first exoplanet atmosphere discovery (HD 209458; Charbonneau et al. 2002; Seager & Deming 2010; Kaltenegger 2017; Madhusudhan 2019), the next hurdle is the detection and characterization of terrestrial exoplanet atmospheres. Current space-based observatories (e.g., Spitzer, HST, JWST) are beginning to probe the characteristics of potentially rocky planets with both transmission (de Wit et al. 2016; Lim et al. 2023) and emission (Kreidberg et al. 2019; Zieba et al. 2023) measurements, but potentially habitable planets will remain largely inaccessible except for 1-2 unique systems around the smallest stars (e.g., TRAPPIST-1) and impossible to characterize for Sun-like stars. However, with the advancements of high-contrast instrument technology and future mission concepts (e.g., Roberge & Moustakas 2018; The LUVOIR Team 2019; Gaudi et al. 2020), the ability to find a habitable Earth-twin around a Sun-like star is now a realistic goal for the next flagship visible-light space telescope, the Habitable Worlds Observatory (HWO).
In the search for signs of planetary habitability, the detection of potential biosignatures can hint at biological activity on the planetary surface. Since Earth is currently our only example of a conclusively habitable (and inhabited) planet, using model scenarios with conditions similar to those on modern Earth or its distant past serves as a starting point for examining the detectability of planetary characteristics (Arney et al. 2016; Rugheimer & Kaltenegger 2018). In particular, at different times in its history Earth hosted varying concentrations of the atmospheric biomarkers O2 and O3, which are primarily produced on Earth through photosynthesis and subsequent photochemical reactions; therefore, measurements of geochemical proxies for the oxygenation of Earth's surface and atmosphere over time provide examples of possible global biogeochemical scenarios that we could encounter when observing Earth-like exoplanets. It is also useful to consider planets that, like Earth, have global liquid water oceans, but no biospheres. In some conditions, such
The Phanerozoic (modern) epoch of Earth (541 million years ago to present), which forms the basis for most habitability and biosignature studies, has relatively high levels of both O2 (21%) and O3 (0.7 ppm) generated by abundant plant life and subsequent photochemistry. However, in the Proterozoic epoch (2.5 billion to 541 million years ago), photosynthetic life was less abundant and measurements suggest oxygen accumulation in the atmosphere at approximately 0.1% to 1% of modern Earth O2 levels but with potentially detectable levels of O3 (Planavsky et al. 2014). In a planetary atmosphere similar to the Archean epoch of Earth (4 to 2.5 billion years ago), we would expect an oxygen-poor and ozone-poor atmosphere, with a large abundance of greenhouse gases including CO2 and CH4, which may have in turn formed a photochemical organic atmospheric haze (Zerkle et al. 2012; Arney et al. 2016; Krissansen-Totton et al. 2018). In addition to past epochs of Earth's history, models of alternative atmospheric scenarios are capable of producing O2 and O3 abundances dramatically higher than today's values, even without the presence of biology, due to a variety of mechanisms (Hu et al. 2012; Domagal-Goldman et al. 2014; Gao et al. 2015; Tian 2015; Harman et al. 2018).
As shown in Figure 1, varying the abundances of O2 and O3 can alter the resultant visible spectra and lead to differences in the detectability of these gases. In this work we quantify the detectability of O2 and O3, using reflected-light measurements at visible wavelengths (0.515-1 µm), for a range of abundance values and spectral bandpasses. This project is a direct continuation of Latouf et al. (2023, hereafter BARBIE1). In BARBIE1, we studied the detection of H2O as a function of SNR and abundance throughout the visible spectral range; in this paper we extend the same methodology to the study of O2 and O3 and also examine the impact of the width of spectral bands on the SNR required for detection. In § 2 we present the methodology of our simulations, also providing a brief summary of BARBIE1. In § 3 we present the results of our simulations for both the modern Earth-like SNR study and the molecular abundance study. In § 4 we discuss the presented results and analyze the impact for future observations of varying Earth-twin epochs. In § 5 we present our conclusions and ideas for future work.

METHODOLOGY
We follow a similar methodological approach to that of BARBIE1. Herein we present a summary of the main steps in our analysis.

Pre-Computed Spectral Grid
We use a geometric albedo spectral grid that was precomputed by Susemiehl et al. (2023, hereafter S23). This grid is housed in the Planetary Spectrum Generator (PSG; Villanueva et al. 2018, 2022). The grid contains 1.4 million geometric albedo spectra that have been pre-computed with the parameters, minimum, and maximum values laid out in Table 1 of S23. The native resolution of the grid is set to R=500 and binned to R=140 for our analysis. For more information on the creation and verification of the grid, see S23 and BARBIE1. There are three molecular species in the grid: H2O, O3, and O2, with N2 as the assumed background gas. The minimum and maximum grid values for these species are listed in Table 1 of S23. The model is composed of 50% clear and 50% cloudy spectra, i.e., C_f = 50%. There are three planetary parameters in the grid: surface pressure (P_0), surface albedo (A_s), and gravity (g). The minimum and maximum values for these parameters are as follows: P_0 in [10^−3 bar, 10 bar]; A_s in [10^−2, 1]; gravity in [1 m/s^2, 100 m/s^2]; and R_p is fixed to 1 R⊕. The grid covers a wavelength range from 0.515 to 1.0 µm, as this range has been defined as the VIS channel in the exoplanet imaging instrument concepts studied in the Astro2020 Decadal Survey that form the starting point for HWO.

Figure 1. Reflection spectra of a modern-Earth atmospheric composition but with varied O3 and O2 abundances. We feature a high O3 abundance (3×10^−6), a 50% PAL O2 abundance (2.1×10^−2), and a modern Earth (7×10^−7 O3, and 2.1×10^−1 O2), respectively. All spectra are binned at R=140.

Mock Data and Retrieval Methodology
Our fiducial "data" spectrum is primarily set as a modern-Earth twin following Feng et al. (2018), with constant volume mixing ratios (VMRs), a constant temperature profile at 250 K, A_s of 0.3, P_0 of 1 bar, and a planetary radius fixed at R_p = 1 R⊕. We consider a resolving power of 140, binned from the native grid resolving power of 500. This fiducial spectrum is given as data to the nested sampler in conjunction with the grid. We use the modern-Earth twin fiducial spectrum for our initial SNR study, wherein we focused on SNRs 3-16 moving in steps of 1. We then change the fiducial spectrum for our abundance study, changing only the molecule of interest one at a time, to the values listed in Tables 1 and 2. All other parameters were left at modern-Earth values, in order to specifically constrain the molecule of interest. In BARBIE1, we focused on H2O, centered on modern-Earth values and moving in increasing and decreasing log-steps. This was due to the lack of constraint on H2O abundance throughout time. In this study, we consider O2 and O3, and set our maximum O2 value as the modern-Earth value of 0.21 VMR and move in decreasing log-steps of 0.25 and 0.5 down to values of 1% to 0.1% of the modern-Earth value to represent a mid-Proterozoic Earth epoch (Planavsky et al. 2014). For O3 we wished to test the possibility that values higher than on modern-day Earth might present stronger detectability. To study this potential, we set the maximum value as 3×10^−6 VMR, a value that is ∼5x the modern-Earth value of 7×10^−7 VMR and well within the values created in models with high rates of abiotic O2 and O3 production on planets around F-, K-, or M-type stars (Hu et al. 2012; Domagal-Goldman et al. 2014; Gao et al. 2015; Tian 2015; Harman et al. 2018), and decrease to the modern-Earth value as the minimum. In log space, we decreased in steps of 0.1, due to the small gap between the maximum and minimum values.
As in BARBIE1, to examine the detectability of the molecular species as a function of the central wavelength and width of a bandpass, we chose 25 evenly spaced values for the bandpass central wavelength. However, in this study we also vary the total width of the spectral bandpass, examining values of 20%, 30%, and 40%. These ranges are consistent with the simultaneous bandpasses that may be achieved with future high-performance coronagraphs (Ruane et al. 2015; Por 2020; Juanola-Parramon et al. 2022). Using these bandpass values and the inputs laid out in § 2.1.1 and § 2.1.2, we run a series of Bayesian nested sampling retrievals for each abundance and SNR combination using the PSGnest application for the Planetary Spectrum Generator (PSG; Villanueva et al. 2018, 2022).
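As a minimal sketch of the bandpass layout described above, the snippet below generates 25 evenly spaced central wavelengths and the edges of a bandpass of a given fractional width. It assumes that the centers span the full 0.515-1.0 µm range including both endpoints, that a "20% bandpass" means a full width of 0.20 times the central wavelength, and that bandpasses are clipped to the wavelength range of the grid. None of these details is stated explicitly in the text, and the function names are hypothetical.

```python
def bandpass_centers(lam_min_um=0.515, lam_max_um=1.0, n_centers=25):
    """Evenly spaced central wavelengths in microns.

    Assumption: both endpoints of the grid range are used as centers.
    """
    step = (lam_max_um - lam_min_um) / (n_centers - 1)
    return [lam_min_um + i * step for i in range(n_centers)]


def bandpass_edges(center_um, fractional_width, lam_min_um=0.515, lam_max_um=1.0):
    """Lower and upper edge of a bandpass in microns.

    Assumption: the full width is fractional_width * center_um, and the
    bandpass is clipped to the wavelength range covered by the grid.
    """
    half_width = 0.5 * fractional_width * center_um
    lower = max(center_um - half_width, lam_min_um)
    upper = min(center_um + half_width, lam_max_um)
    return lower, upper


centers = bandpass_centers()
for width in (0.20, 0.30, 0.40):
    lower, upper = bandpass_edges(0.76, width)
    print(f"{round(width * 100)}% bandpass at 0.76 um: {lower:.3f}-{upper:.3f} um")
```

Under these assumptions, a 20% bandpass centered at 0.76 µm spans roughly 0.684-0.836 µm, while a 40% bandpass spans roughly 0.608-0.912 µm.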
PSG is a radiative transfer model and tool for synthesizing and retrieving planetary spectra. This includes planetary atmospheres and surfaces covering wavelengths from 50 nm to 100 mm (i.e., from UV to radio) and a large range of planetary properties. PSG includes aerosol, atomic, continuum, and molecular scattering/radiative processes, implemented layer-by-layer. PSG also includes the nested sampling routine PSGnest (footnote 1), which is an adaptation of the algorithm used in Fortran MultiNest (Feroz et al. 2009). PSGnest is a Bayesian retrieval tool based on the well-known MultiNest framework and designed for exoplanetary observations; for more information on PSGnest and our retrieval methodology, see S23 and BARBIE1.

Outputs
The output results file from PSGnest contains the highest-likelihood values of the output parameters, the average value resulting from the posterior distribution, uncertainties estimated from the standard deviation of the posterior distribution, and the log evidence (logZ) (Villanueva et al. 2022). It also includes the input data (wavelength, uncertainty, and input spectrum, in respective columns) with the best-fit spectrum. We calculate the median values, as well as the upper and lower limits of the 68% credible region (Harrington et al. 2022), using the output results. We also extract the posteriors and the global log-evidence, the numerical representation that quantifies the relative support for each model given the input data (Feroz et al. 2009), for use in our detectability calculations. Using these outputs, we compute the Bayes factor. The Bayes factor is calculated by differencing the Bayesian log-evidences of retrievals performed using the fiducial gas abundances. The resulting differences are referred to as the log-Bayes factor (lnB; Benneke & Seager 2013). The log-Bayes factor determines which scenario is most likely by examining the hypothesis where all parameters are present, and systematically subtracting the evidences from scenarios where each parameter in turn is nullified. In our studies, if lnB is less than 2.5, it represents an unconstrained detection; if lnB is between 2.5 and 5, it is a weak detection; if lnB is greater than 5, it is a strong detection (cf. Table 2 of Benneke & Seager 2013). These labels differ slightly from those in Table 2 of Benneke & Seager (2013), in which the lnB thresholds represent weak, moderate, and strong detections, respectively. We do not calculate the log-Bayes factor for non-gaseous components, as those factors cannot be absent and thus the calculation cannot represent a detection of those components.

Footnote 1: https://psg.gsfc.nasa.gov/apps/psgnest.php
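The detection criterion above can be sketched in a few lines. This is an illustration only, not the PSGnest interface: the function names are hypothetical, the log-evidences are placeholder numbers, and the handling of values exactly at 2.5 and 5 (both treated as weak) is an assumption based on the wording "less than 2.5" and "greater than 5".

```python
def log_bayes_factor(ln_z_full, ln_z_without_gas):
    """lnB for one gas: log-evidence of the retrieval with all parameters
    minus the log-evidence of the retrieval with that gas removed."""
    return ln_z_full - ln_z_without_gas


def classify_detection(ln_b):
    """Map lnB onto the detection strengths used in this work.

    Assumption: values exactly equal to 2.5 or 5 are labeled weak.
    """
    if ln_b < 2.5:
        return "unconstrained"
    if ln_b <= 5.0:
        return "weak"
    return "strong"


# Placeholder log-evidences, chosen only to exercise the three branches.
example_cases = {
    "case A": (-100.0, -107.0),
    "case B": (-100.0, -103.0),
    "case C": (-100.0, -101.0),
}
for label, (ln_z_full, ln_z_without) in example_cases.items():
    ln_b = log_bayes_factor(ln_z_full, ln_z_without)
    print(label, ln_b, classify_detection(ln_b))
```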

Modern Earth Case Results
We begin by presenting the detectability of O2 and O3 as a function of SNR for the fiducial modern-Earth case, as first examined by S23 and BARBIE1; all of the O2 and O3 data and calculated log-Bayes factors across abundance, SNR, and wavelength are available to the community on Zenodo (footnote 2). In Figures 2 and 3 we present only the narrow SNR range within which detectability strength changes materially for O2 and O3. Based on Figure 2, for SNR ≤ 7 there is only a weak or unconstrained detection possible for O2. However, beginning at SNR = 7.5, we see strong detections of O2 corresponding to bandpasses containing deep O2 spectral features, such as at 0.74 µm. A strong detection of O2 becomes possible from 0.68 to 0.84 µm by SNR = 8. At SNRs higher than 8, all bandpass locations yield a strong detection of O2 with increasingly better constraints of the 68% credible region.
Turning to Figure 3, a significantly higher SNR was required to achieve a strong detection of O3. We look at an SNR range of 18-20, which is quite high; however, it is only at this point that we begin to see a strong detection of O3 within the wavelength range of our simulations. At an SNR of 18, we can see a small area of weak detection between 0.6 and 0.67 µm covering six bandpasses. At an SNR of 19, strong detection becomes possible for two bandpasses, and for three bandpasses at an SNR of 20. It takes a very high SNR to achieve a strong detection of O3, which would lead to an exceptionally long integration time.
We can also see that although three bandpasses yield a strong detection at SNR=20, none of those bandpasses correctly constrain the abundance of O3 within the 68% credible region. We present further investigation on this in Figure 4a. In this corner plot, which is focused on 0.64 µm at a 20% bandpass, O3 is not retrieved within the 68% credible region, and there is a large spread of the high probability region which does not center on the true value of O3. This is largely due to the lack of continuum caused by the depth and width of the ozone features; a similar problem for detecting H2O at long wavelengths is discussed at length in BARBIE1. We present Figure 4b to show that by increasing the bandpass width from 20% to 40% centered on 0.64 µm, and thus increasing the amount of spectral region covered in each bandpass, an adequate continuum is constrained and the retrieval of O3 is firmly within the 68% credible region with little spread in the high probability regions.

Figure 3. Results of the fiducial case study for O3 but focused on a specific narrow SNR range. We show the SNR range at which detectability materially changes for O3. All facets of the plot remain the same as in Figure 2. We notice that for SNR=20, 3 bandpasses achieve a strong detection but none of these bandpasses correctly constrain the abundance of O3 within the 68% credible region. We present further investigation on this in Figure 4.
However, we can see in the corner plots shown in Figures 4a and 4b that A_s is also consistently retrieved at a value at least a factor of two away from the true value. To investigate the source of this incorrectly retrieved parameter, we ran several retrievals with very high SNR (200) and with a data spectrum using values centered on a grid point, and compared both sets of results to our lower-SNR modern-Earth results. This allowed us to test whether the result was due to degeneracies for lower-SNR data or due to the impact of interpolation error in our grid-based retrieval scheme (as examined by S23). In Figure 4c, where we use a very high SNR and all of the parameters are set to exact values found in the S23 grid, we can see that all parameters are retrieved within a 68% credible region except for C_f, which is known to be degenerate. When C_f is locked to its true value (0.5) as in Figure 4d, we can see this issue disappears and the range in the highest likelihood region decreases. There is higher interpolation error in A_s, likely due to the sparse sampling of the grid points and the limited bandwidth of the sampled spectra considered here. As there are only three grid points in this parameter space, there is a higher likelihood of interpolation error, as there is more distance between the given true value and the nearest grid point. For a more in-depth description of this interpolation error and its impact, see S23.
In Figure 5 we present the shortest wavelength at which a weak or strong detection is achieved for O2 and O3; as described in BARBIE1, this is an important metric since the number of planets amenable to high-contrast coronagraphic imaging is higher when observing at shorter wavelengths due to the smaller inner working angle. Figure 5a provides a summary of Figures 2 and 3, covering the full range of SNR from 3-20 at a 20% bandpass. In Figure 5b, we present the shortest wavelength for strong detection if we assume a 30% bandpass, while in Figure 5c we present the shortest wavelength for strong detection if we assume a 40% bandpass. We also present the range for a strong detection of H2O as in BARBIE1, with additional SNRs to 20, to provide context to the results and present the possibilities for dual or triple molecule detection. To show the full range, we shade out to the longest wavelength where detection is possible.
In Figure 5a we can see that O3 is only detectable, whether weakly or strongly, with high-SNR (≥ 14) data over a very narrow range. O3 can be detected (albeit weakly) simultaneously with a strong detection of H2O at high SNR (≥ 17) with careful bandpass selection. However, the high SNR required for detection indicates that this would be costly in terms of observing time. Conversely, O2 is strongly detectable at an SNR of 8, with the range between shortest and longest wavelength encompassing all O2 features in the visible wavelength range. This detectability range overlaps significantly with the strong detection range of H2O. O2 and H2O can be simultaneously observed at SNR = 10 with careful selection of wavelength and corresponding bandpass. This also allows for a range of possible SNRs depending on the desired wavelength of detection: if a longer wavelength such as 0.83 µm is accessible, then SNR = 10 would allow for a dual detection, but if a short wavelength such as 0.68 µm is required, then a much higher SNR is needed. Next, looking to Figure 5b, we see the detectability change, with triple molecule detection possible from a shortest wavelength of 0.66 µm out to 0.7 µm at an SNR ≥ 13. O2 remains strongly detectable beginning at an SNR of 8 as in Figure 5a; however, while the shortest wavelength of detectability for O2 starts at approximately 0.69 µm with a bandpass of 20%, it starts at approximately 0.66 µm with a bandpass of 30%. When we look to Figure 5c, we see that detectability changes drastically, with triple molecule strong detection possible from a shortest wavelength of 0.625 µm out to 0.725 µm at an SNR ≥ 11. Once again, O2 remains strongly detectable beginning at an SNR of 8 as in Figures 5a and 5b, while the shortest wavelength of detectability for O2 changes once more, starting at approximately 0.625 µm with a bandpass of 40%. Thus, although the SNR of strong detectability does not change, the minimum possible wavelength of strong detectability significantly decreases as a function of bandpass width.
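The shortest-wavelength metric and the overlap used to identify dual or triple detections can be illustrated with a small sketch. The function names are hypothetical and the lnB values are made-up placeholders rather than results from this work. The sketch also assumes, as in the shading of Figure 5, that the detectable region is summarized by its shortest and longest bandpass centers.

```python
def strong_detection_window(centers_um, ln_b_values, threshold=5.0):
    """Shortest and longest bandpass center (microns) with lnB above the
    strong-detection threshold, or None if no bandpass qualifies."""
    detected = [c for c, b in zip(centers_um, ln_b_values) if b > threshold]
    if not detected:
        return None
    return min(detected), max(detected)


def joint_window(windows):
    """Overlap of several per-molecule windows, or None if they do not overlap."""
    if any(w is None for w in windows):
        return None
    lower = max(w[0] for w in windows)
    upper = min(w[1] for w in windows)
    return (lower, upper) if lower <= upper else None


# Placeholder inputs for one SNR and one bandpass width (not from the paper).
centers = [0.60, 0.64, 0.68, 0.72, 0.76]
ln_b_o2 = [1.0, 2.0, 6.0, 7.0, 9.0]
ln_b_h2o = [0.5, 1.0, 3.0, 6.0, 8.0]

o2_window = strong_detection_window(centers, ln_b_o2)
h2o_window = strong_detection_window(centers, ln_b_h2o)
print("O2:", o2_window, "H2O:", h2o_window, "joint:", joint_window([o2_window, h2o_window]))
```

Repeating this for each SNR and bandpass width would give the kind of shortest-wavelength curves summarized in Figure 5, under the stated assumptions.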

Results for Varying Abundance Cases
At this point in our study, we shift to our abundance case study, wherein we vary the abundance of O2 below modern-Earth values and O3 above modern-Earth values. To assess the trade-off between longer observations (i.e., higher SNR) and different concentrations of O2 and O3, we varied the SNR of the observations for the full range of VMRs per molecule. In Figures 6a, 6b, and 6c, we display the minimum SNR required to achieve a strong detection for each abundance of O2 at 0.76 µm with 20%, 30%, and 40% bandpasses, respectively. Looking first to Figure 6a, we notice that O2 quickly requires mid- to high-SNR data to be strongly or weakly detected with even a one order of magnitude decrease in abundance. In fact, many of the abundances in our simulation are not detectable at all in this SNR range. The Proterozoic abundances of O2 (0.1% to 1% of modern Earth abundance) are not detectable at any SNR ≤ 20, and in fact will likely require an extremely high SNR to become detectable, as these values are two to three orders of magnitude lower than modern Earth abundance values. As seen in Figure 6b, widening the bandpass to 30% decreases the required SNR for detection across almost all abundances of O2. For instance, where at a 20% bandpass it requires an SNR of 19 to strongly detect O2 at 2.1×10^−2 VMR, with a 30% bandpass the required SNR drops to 18. This is true for all detectable abundance cases except modern Earth values, which consistently require an SNR of 8 for strong detection. In Figure 6c, we do not see a difference in required SNR for strong detection with a 40% bandpass.
While bandpass width makes little difference to the detectability of O2, it makes a significant difference when detecting O3. We can see in Figure 7 the large impact of the change in bandpass width. With a 20% bandpass centered on the 0.76 µm O2 feature, we capture the entirety of the O2 feature, along with the two smaller H2O features at 0.74 and 0.84 µm. When the bandpass is widened to 40%, we capture all of the above, along with a portion of both the O3 feature that peaks at approximately 0.63 µm and the 0.9 µm H2O feature. In Figures 8a, 8b, and 8c, we display the minimum SNR required to achieve a strong detection for each abundance of O3 at 0.64 µm with 20%, 30%, and 40% bandpasses, respectively. In Figure 8a, we can see that O3 is detectable, strongly and weakly, in the full range of our simulation values. At modern Earth abundances, the required SNR for a strong detection is high at 19, but it is possible to achieve a weak detection at SNR = 14. When we increase to high O3 values, approximately 5x higher in abundance than modern Earth, the required SNR drops drastically, with strong or weak detection requiring SNRs of 7 or 5, respectively. The largest decrease in required SNR occurs between modern Earth values (7×10^−7 VMR) and 1.25×10^−6 VMR, dropping steeply from a required SNR of 19 to 11 for a strong detection, and 14 to 9 for a weak detection. When we increase the bandpass width to 30% as in Figure 8b, we see the required SNR for detection drop across all abundance values. For instance, at 1.5×10^−6 VMR, the required SNR for strong detection at a 20% bandpass is 10, while with a 30% bandpass the required SNR drops to 8. This is true for all values across the abundance cases, to varying degrees. This happens once more when the bandpass is widened to 40% as in Figure 8c. Looking to the same example case of 1.5×10

Figure 6 (caption fragment). We note that although the 0.1% and 1% modern O2 abundances (i.e., mid-Proterozoic abundances) are in our study, they are not presented on the plot. We denote these abundances with a pink arrow. The VMR values are on the x-axis, with SNR on the y-axis. Strong detection is shown in purple squares, weak detection is shown in pink dots, and the modern abundance range is highlighted with a light pink strip.

As previously discussed, in our previous simulations we varied one molecular abundance (for O2) while the other parameters were held fixed to modern Earth-like values; thus we did not explore the relationship between varying both O2 and O3 and the resultant change in detectability. However, due to the photochemical relationship between the O2 abundance and the production of O3, the abundances of both molecules would actually be linked in any atmospheric scenario. As found in Kozakis et al. (2022), the relationship between O2 and O3 varies for model atmospheres depending on the type of host star, and the linearity of the relationship also appears to vary. When looking to hotter host stars (e.g., a G2 star), peak O3 abundance occurs at lower than modern Earth O2 abundances due to the O3 layer shifting in the atmosphere downwards to O2 levels due to O2 photochemical shielding.
This O2/O3 relationship means that we can constrain the overall molecular oxygen abundance by detecting either of the species; O3 is in fact a highly sensitive tracer of lower abundances of O2. In order to explore the impact of this, we examined several scenarios where we varied our parameter values as coupled parameters, e.g., 10% PAL O2/105% PAL O3 (where PAL denotes the present atmospheric level), consistent with the results for a Sun-like star from Kozakis et al. (2022) (as seen in Table 3). We present our detectability results for these abundance pairs in Figure 9. In Figure 9a-c 5. In Figure 9d-f, we present the O3 results with H2O at 20%, 30%, and 40% bandpasses at 105% PAL and 110% PAL of O3. We can see that detectability does not shift significantly for O3, as our values do not vary greatly from a modern-Earth-like value. However, one notable difference is that there is a larger range for strong O3 detection at both 105% and 110% O3 PAL with a bandpass width of 20% than with the same bandpass width at 100% O3 PAL as in Figure 5. We also see that O3 is detectable at an SNR of 17 rather than 19 as in the modern-Earth-like results for a 20% bandpass. The detectability range of O3 increases with increasing bandpass width, as expected following Figure 5.
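For reference, the coupled abundance pairs can be converted from percent PAL to VMR with a short helper. This is a sketch that takes PAL to mean the modern-Earth VMRs quoted earlier in this paper (0.21 for O2 and 7×10^−7 for O3); the function name is hypothetical, and only the two pairs named in the text are used.

```python
# Modern-Earth VMRs quoted in this paper, taken here as 100% PAL.
MODERN_VMR = {"O2": 0.21, "O3": 7e-7}


def pal_to_vmr(molecule, percent_pal):
    """Convert an abundance given in percent PAL to a volume mixing ratio."""
    return MODERN_VMR[molecule] * percent_pal / 100.0


# The two coupled O2/O3 pairs named in the text, in percent PAL.
coupled_pairs = [(10.0, 105.0), (50.0, 110.0)]
for o2_pal, o3_pal in coupled_pairs:
    o2_vmr = pal_to_vmr("O2", o2_pal)
    o3_vmr = pal_to_vmr("O3", o3_pal)
    print(f"{o2_pal:.0f}% PAL O2 = {o2_vmr:.3e} VMR, {o3_pal:.0f}% PAL O3 = {o3_vmr:.3e} VMR")
```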
Following this, we analyze the limiting molecule in double or triple molecular detection in Figure 10. We present 10% O2/105% O3 PAL in Figure 10a-c. We present 50% O2/110% O3 PAL in Figure 10d-f. We can see that, as in the modern Earth case, dual detection of H2O and O2 is always possible at all bandpasses. The same is not true for H2O and O3, however, with dual strong detection possible at only a single point in the 20% bandpass, and with a wider range possible at 30%. At 40%, there is no dual detection of H2O and

Figure 7. Spectra with varied O3 abundances. We feature the highest abundance (3×10^−6), the mid point abundance (1.5×10^−6), and lowest abundance in the study (7×10^−7) (modern Earth), respectively. All spectra are binned at R=140. We also present a 20% and 40% bandpass range width, in light grey and light blue respectively.

DISCUSSION
The primary utility of the retrieval results presented here is to help understand how O2 and O3 abundances affect the optimal strategy for detecting the presence of an atmosphere with water and oxygen using spectroscopic bandpasses in the visible-light spectral region. Here we discuss the major conclusions from this work, as well as some of the limitations of the study that may impact these conclusions.
The most important result is the role that the width of the bandpass plays. When examining Figures 6 and 8, we can see the notable difference in how a widening bandpass changes the detectability of molecules across abundance and type; where O2 detectability does not vary significantly as a function of bandpass across abundance values, O3 detectability changes significantly as the bandpass widens. This is likely due to the fact that the O2 feature is narrow and deep, making it easily detectable at high abundance values and easily captured in a smaller bandpass width, whereas O3 has a very broad feature, so the widening of the bandpass allows for a stronger continuum and a significant change in detection as a function of bandpass. O3 also does not have any strong features in the 0.515-1 µm region, leading to inherent difficulty in detection in this range. However, if we are able to observe the strong O3 band at 0.36 µm, detection would become stronger at lower SNR. Recent works, such as Damiano et al. (2023), have investigated this principle and concluded that even with the impact of possible confusion due to SO2 and other species, O3 detection is indeed easier with lower-SNR and lower-resolution data. Damiano et al. (2023) found that with a spectral resolution of 7 and SNR = 10, O3 can be detected in the UV at sub-PAL VMRs; they also note that observing additional signatures of habitability in the NIR region is crucial to interpreting O3 detections at UV and VIS wavelengths. We will examine these questions further in future work addressing the complete UV, VIS, and NIR wavelength regions.
A wider bandpass also better enables detections of one or more molecules with a single bandpass, in particular the detection of O3 with other molecules. In Figure 5, we can see that for modern Earth abundances a bandpass width of ≥ 30% would allow for a detection of H2O at slightly shorter wavelengths (long-wavelength cutoff of ∼0.95 µm versus 0.99 µm for a 20% bandpass) and also a joint measurement of a strong O2 detection and at least a weak O3 detection with SNR = 8-9 versus only O2 (which could help to limit false positives and motivate additional observations). Furthermore, Figure 10 shows that a 40% bandpass width enables a significant improvement in the ability to detect all three species with a single bandpass at around 0.72 µm; this could either be used alone as a first reconnaissance bandpass or could be used as the follow-up to a longer-wavelength search for H2O alone at low SNR. The optimal choice between these two options depends significantly on whether a planet is detectable at longer wavelengths (due to IWA) and whether the exposure time required for longer-wavelength measurements is significantly impacted by instrument sensitivity; we leave these instrument- and target-dependent considerations to future work.
The second important result is that the abundance versus SNR results in Figures 6 and 8 show a non-linear increase in the required SNR at smaller abundances. Combined with the fact that the required exposure time scales as the square of the SNR (i.e., SNR grows as the square root of exposure time), this suggests that our exposure time will be extremely sensitive to the limiting abundance that must be detectable. Figure 10 further demonstrates this, showing that when O2 drops from 50% PAL (12% VMR) to 10% PAL (2% VMR), it essentially becomes undetectable. This motivates a deeper analysis of whether detecting O2 at visible wavelengths should be a high priority in the progression of measurements if there is a high likelihood of a non-detection for even a moderate abundance, compared with a measurement of more sensitive markers of atmospheric abundance in the UV or NIR.
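To make the exposure-time sensitivity concrete, the sketch below assumes the idealized photon-noise-limited case, in which SNR grows as the square root of exposure time, so that the required exposure time scales as the square of the SNR. The function name is hypothetical, and real coronagraphic observations may depart from this scaling (for example, because of systematic noise floors).

```python
def relative_exposure_time(snr_target, snr_reference):
    """Exposure time needed to reach snr_target, in units of the time needed
    to reach snr_reference.

    Assumption: photon-noise-limited data, so time scales as SNR squared.
    """
    if snr_target <= 0 or snr_reference <= 0:
        raise ValueError("SNR values must be positive")
    return (snr_target / snr_reference) ** 2


# SNR values quoted in this work for strong detections at modern-Earth
# abundances with a 20% bandpass: 8 for O2 and 19 for O3.
ratio = relative_exposure_time(19, 8)
print(f"Reaching SNR = 19 takes about {ratio:.1f}x the exposure time of SNR = 8")
```

Under this assumption, moving from the SNR of 8 that suffices for O2 to the SNR of 19 needed for O3 at modern-Earth abundance with a 20% bandpass costs roughly 5.6 times the exposure time.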
We note that high O3 abundances are challenging to model, in that it takes a very small amount to completely overwhelm the spectrum. Spectra with high O3 have continua near zero, which also causes the error value to increase substantially; thus, high values of O3 must be handled with care. At high abundances, the error value increase can lead to poorly constrained retrievals using our grid. Susemiehl et al. (2023) found that this occurs when high O3 (log10 O3 > −5) coincides with low P0 (log10 P0 < −1.7) using our grid. Our simulations go no higher than log10 O3 = −6 and therefore avoid this degeneracy and the resulting poorly constrained retrievals.
As discussed in BARBIE1, similar prior works have investigated the relationship between SNR and detectability, specifically Feng et al. (2018) and Damiano & Hu (2022). We explore the differences in retrieval parameterization structure in BARBIE1; however, even with the differences in techniques and analysis, our results for O2 and O3 detectability are in agreement. An SNR of 10 at R = 140 is sufficient to firmly constrain the abundances for an Earth-twin atmosphere with O2 and O3. Our work also explores the influence of varying bandpass width on molecular detection, which differs from prior work and thus presents a new analysis of detection possibilities.

CONCLUSIONS & FUTURE WORK
To summarize, by investigating bandpass width in tandem with SNR, we can see that detectability is intrinsically linked to both factors as an additional function of molecular type. The ability to properly prioritize and select the best combination of parameters can drive efficient observing practices. By understanding the SNR requirements for strong detection of O2 and O3, and also understanding which bandpass width could result in simultaneous strong detection of O2, O3, and H2O, we can properly prioritize the best combination of bandpass and required SNR for detection, thereby informing the best options for instrument design trades and observing procedure. O3 is most easily observable using a 30% or 40% bandpass width at shorter wavelengths with mid-low SNR data. With a 20% bandpass width, O3 is difficult to detect, requiring SNR ≥ 19 at modern-Earth values. O2 is not significantly affected by bandpass width, and consistently requires an SNR of ≥ 8 to be strongly detected at modern-Earth values at 0.76 µm and shorter. It is not detectable at SNR ≤ 20 at Proterozoic-era abundances. Since the coronagraph and instrument capabilities may dictate that the best observation occurs at shorter wavelengths, O2 and O3 are well within the most optimized instrument capabilities. O2 and O3 have strong geochemical markers that provide a depth of knowledge to potential Earth observations, and a heightened ability to constrain the observed Earth-like epoch.
By also investigating coupled atmospheric abundances of O2 and O3, we can study how the detectability of each molecule varies with the abundance of the other. Allowing parameters to vary with each other following leading coupled photochemistry and atmospheric models, we can prioritize the optimal bandpass width for more realistic exoEarth simulations.
In future work, we plan to build a new spectral model grid that includes a wider range of molecular species, and we will extend to shorter and longer wavelengths than the visible range. Specifically, we will cover the same wavelengths as the proposed coronagraph instrument for the Habitable Worlds Observatory (Juanola-Parramon et al. 2022), including UV, optical, and NIR. This will allow us to study more biosignatures and molecular species of interest, and represent the physical chemistries more accurately. It will also allow us to expand our simulations of possible observations and establish best observational practices for exoEarth observations using next-generation telescopes. We will also develop a PSG module to display all detectability information in a publicly accessible format. We note that SNR results may be subject to minor changes following grid reconstruction due to PSG radiative transfer upgrades. We will also include a full exoplanet yield calculation using the data contained within BARBIE1 and BARBIE2 to give an educated baseline for detection of biomarkers. The interpolation method used in this work is a trilinear interpolation scheme, which can be used on 3D grids as in this work; in future work we will investigate whether another interpolation method, such as inverse distance weighted interpolation or multiplicative weight interpolation, can minimize interpolation error in grid-based retrievals.
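For readers unfamiliar with trilinear interpolation, the following is a minimal generic sketch in Python with NumPy; it is not the implementation used to build or query our grid. The function name, the toy axes, and the toy grid values are all hypothetical. The function blends the eight grid nodes surrounding a point, with weights given by the fractional position along each axis, and works equally for scalar values or for a grid that stores a spectrum at each node (extra trailing dimensions).

```python
import numpy as np


def trilinear_interpolate(grid, axes, point):
    """Trilinearly interpolate `grid` (shape nx, ny, nz[, ...]) at `point`.

    `axes` holds three ascending 1D coordinate arrays; `point` is (x, y, z).
    """
    lower, frac = [], []
    for axis, value in zip(axes, point):
        axis = np.asarray(axis, dtype=float)
        if not axis[0] <= value <= axis[-1]:
            raise ValueError("point lies outside the grid")
        # Index of the lower bracketing node, clipped to a valid cell.
        i = int(np.clip(np.searchsorted(axis, value) - 1, 0, len(axis) - 2))
        lower.append(i)
        frac.append((value - axis[i]) / (axis[i + 1] - axis[i]))

    result = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                # Weight of this corner: product of per-axis linear weights.
                weight = 1.0
                for offset, f in zip((dx, dy, dz), frac):
                    weight *= f if offset else 1.0 - f
                result = result + weight * grid[lower[0] + dx, lower[1] + dy, lower[2] + dz]
    return result


# Toy grid: a function that is linear in each coordinate is reproduced exactly.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 0.5, 1.0])
z = np.array([-1.0, 0.0, 1.0, 2.0])
X, Y, Z = np.meshgrid(x, y, z, indexing="ij")
toy_grid = 2.0 * X + 3.0 * Y - Z + 1.0
value = trilinear_interpolate(toy_grid, (x, y, z), (0.3, 0.7, 1.4))
print(value)  # approximately 2.3 for this linear toy function
```

Because the scheme is exact only for functions that are linear along each axis within a cell, spectra that vary non-linearly between grid nodes will carry some interpolation error, which is the motivation for testing alternative schemes.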
We will also conduct photochemically self-consistent studies in order to inform and broaden the results for future retrieval studies using grids. The expansion of grid parameter space will allow for the possibility of chemically consistent modeling, combined with input from updated photochemical models to ensure self-consistent atmospheric gas compositions with retrievals. This will be particularly important for considerations of gas detections for planets around different star types, given the impact star type has on the photochemistry of planetary atmospheres (Segura et al. 2005, 2010; France et al. 2012).
N.L. gratefully acknowledges financial support from an NSF GRFP. N.L. gratefully acknowledges Dr. Joesph Weingartner for his support and editing. N.L. also gratefully acknowledges Greta Gerwig, Margot Robbie, Ryan Gosling, Emma Mackey, and Mattel Inc.™ for Barbie (doll, movie, and concept), after which this project is named. This Barbie is an astrophysicist! The authors also gratefully acknowledge conversation with Dr. Chris Stark regarding exoEarth yields and instrument design. The authors would like to thank the Sellers Exoplanet Environments Collaboration (SEEC) and ExoSpec teams at NASA's Goddard Space Flight Center for their consistent support. MDH was supported by an appointment to the NASA Postdoctoral Program at the NASA Goddard Space Flight Center, administered by Oak Ridge Associated Universities under contract with NASA.

Figure 2. Results of the fiducial case study for O2, but showing only a narrow range of SNR values. The y-axis shows the abundance values for each molecule in log scale, with the true value shown by the black dashed line (in this case, a modern Earth-like O2 value of 0.21 VMR). Each dot represents a bandpass center, with the pink line portraying the median retrieved values of O2 abundance for that bandpass, and the gray shaded region representing the upper and lower limits for the 68% credible region. Increased retrieval certainty can be seen where the gray regions narrow. Each point is colored to indicate varying detection strength. Unconstrained regions (lnB < 2.5) are shown in light pink dots, weak detections (2.5 ≤ lnB < 5) are shown in dark pink diamonds, and strong detections (lnB > 5) are shown in purple squares. We present the range at which detectability materially changes for O2.
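The detection-strength bins used in this caption can be written as a short Python sketch. The function names are hypothetical. Two assumptions are made here and are not taken from the caption: the log-Bayes factor is computed as the difference in log-evidence between retrievals with and without the molecule, and a value of exactly lnB = 5 (which the caption leaves unassigned) is treated as a strong detection.

```python
def log_bayes_factor(ln_evidence_with, ln_evidence_without):
    """lnB as the difference of log-evidences (assumed model comparison)."""
    return ln_evidence_with - ln_evidence_without


def detection_strength(ln_b):
    """Bin a log-Bayes factor using the thresholds quoted in the caption.

    Assumption: lnB == 5 exactly is counted as 'strong'.
    """
    if ln_b >= 5.0:
        return "strong"
    if ln_b >= 2.5:
        return "weak"
    return "unconstrained"


# Illustrative (made-up) log-evidence values for a single bandpass.
ln_b = log_bayes_factor(-120.0, -126.5)
print(ln_b, detection_strength(ln_b))
```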

Figure 3. Results of the fiducial case study for O3, but focused on a specific narrow SNR range. We show the SNR range at which detectability materially changes for O3. All facets of the plot remain the same as in Figure 2. We notice that for SNR = 20, three bandpasses achieve a strong detection, but none of these bandpasses correctly constrain the abundance of O3 within the 68% credible region. We present further investigation of this in Figure 4.

Figure 4. Corner plots for O3 at 0.64 µm at SNR = 20 to portray the lack of continuum and its impact on O3 abundance retrieval. We also show a corner plot for SNR = 200 with all parameters set to exact grid points to portray the interpolation error in As. The 68% credible region is shown as pink shading in the 1D marginalized posterior distributions along the diagonal of the corner plot, and the true values are represented by black lines in the diagonals of the corner plot and by black stars within the 2D plots. It is clear that the error in As is due to a combination of parameter degeneracy and interpolation error in the grid-derived forward models.

Figure 5. In all panels, the shortest bandpass centers at which one can achieve a strong detection for O2, or a strong or weak detection for O3, are shown in purple triangles, dark purple diamonds, and light pink dots, respectively. The strong detection range for H2O is presented in pink squares, to provide context for dual or triple molecule detection. The range between the shortest and longest bandpass center at which detection is possible is filled in. We present the range for a strong detection; SNR is on the y-axis, and the bandpass centers are on the x-axis. Note that the last bandpass center for strong detection is shaded out to the long edge of the bandpass, to maintain the ability to directly compare the panels.
(a) Shown above is the detectability of the 0.74 µm O2 feature spanning multiple abundances and SNRs with a 20% bandpass. (b) Same, but a 30% bandpass. (c) Same, but a 40% bandpass.

Figure 6. Shown above are the lowest SNR values for strong detection of the 0.74 µm O2 feature as a function of O2 abundance. We note that although the 0.1% and 1% modern O2 abundances (i.e., mid-Proterozoic abundances) are in our study, they are not presented on the plot. We denote these abundances with a pink arrow. The VMR values are on the x-axis, with SNR on the y-axis. Strong detection is shown in purple squares, weak detection is shown in pink dots, and the modern abundance range is highlighted with a light pink strip.
(a) Shown above is the detectability of the 0.63 µm O3 feature spanning multiple abundances and SNRs with a 20% bandpass. (b) Same, but a 30% bandpass. (c) Same, but a 40% bandpass.

Figure 8. Shown above are the lowest SNR values for strong detection of the 0.63 µm O3 feature as a function of O3 abundance. The VMR values are on the x-axis, with SNR on the y-axis. Strong detection is shown in purple squares, weak detection is shown in pink dots, and the modern abundance range is highlighted with a light pink strip.

detectability thus it is highly likely to strongly detect H2O. In all bandpasses, O2 is the limiting molecule for dual or triple detection in both SNR and wavelength. In terms of dual H2O/O3 detection, O3 is the limiting molecule in wavelength, but H2O is the limiting molecule in SNR.

Figure 9. The shortest bandpass centers at which one can achieve a strong detection for O2 or O3 at varying present atmospheric levels (PALs) for a specific SNR are shown in purple triangles and dark purple diamonds, respectively. The range between the shortest and longest bandpass center at which detection is possible is filled in. The strong detection range for H2O is presented in pink squares to provide context to the results.

O3, however there is a large range (approximately 0.12 µm in width) wherein triple detection is possible from SNR ≥ 11. At 30% and 40% bandpasses there are also areas where dual detection is possible with O2 and O3 at mid-high SNR and short wavelengths. With a 20% bandpass, H2O is the limiting molecule for dual H2O/O2 detection in both SNR and wavelength. At a 30% bandpass, H2O is still the limiting molecule for dual H2O/O2 detection and triple H2O/O2/O3 detection in SNR; however, O3 is the limiting molecule in wavelength. When looking at dual H2O/O3 detections, H2O is the limiting molecule in SNR and wavelength, mimicking the dual H2O/O2 detection. At a 40% bandpass, O3 is the limiting molecule in wavelength for a triple H2O/O2/O3 detection, and H2O is the limiting molecule in SNR for a triple H2O/O2/O3 detection. We then have the same

Figure 10. Shown above are the regions of single, dual, and triple molecular detection as a function of wavelength. Wavelength is presented on the x-axis, and SNR is presented on the y-axis. The hatched regions symbolize regions with single molecule detection, shaded regions symbolize only the overlapping areas between dual molecular strong detection zones, and the black outlined shaded regions symbolize the overlapping areas between triple molecular strong detection. We present the 10% O2 PAL and 105% O3 PAL case in the first row, the 50% O2 PAL and 110% O3 PAL case in the second row, and the 100% O2 PAL and 100% O3 PAL case in the last row. For all cases, we present a 20%, 30%, and 40% bandpass width.

Table 1.
The above values were used in our simulations for O2, moving from Earth-like values into different epochs of Earth's history. † Modern Earth-like value.

Table 2 .
The † Modern Earth-like value.

Table 3 .
Kozakis et al. (2022)um values for the chemically consistent retrievals, following the Atmos values found in Kozakis et al. (2022) for O2 and O3, we present the O2 results with H2O at 20%, 30%, and 40% bandpasses at 10% PAL, 50% PAL, and 75% PAL of O2. We can see that at 10% PAL, O2 detectability is unlikely, requiring a minimum SNR of 19 at a 20% bandpass, and an SNR of 18 at 30% and 40% bandpasses. However, between 50% PAL, 75% PAL, and the previously presented 100% PAL of O2, there is little difference, with slight variance in SNR (e.g., from requiring an SNR of 8 for 100% PAL O2 to requiring an SNR of 10 for 50% PAL O2) and little change across bandpass width as in Figure . We can see that detection of O2 dually or triply with H2O or O3 is difficult. At a 20% bandpass, H2O and O2 are dual detectable only at an SNR ≥ 19. At a 30% bandpass, triple H2O/O2/O3 detection is possible at an SNR ≥ 18 in a very narrow wavelength range (approximately 0.66 to 0.7 µm), and dual H2O/O2 detection is also possible at an SNR ≥ 18, while dual H2O/O3 detection has a wider range of detectability, from SNR ≥ 14, and from approximately 0.63 to 0.7 µm. At a bandpass of 40%, the range for triple detection increases in wavelength, to approximately 0.63 to 0.76 µm. The range for dual H2O/O3 detection also increases, starting from SNR ≥ 11 and from 0.63 to 0.75 µm. At all bandpasses, H2O has a large range in wavelength and SNR of single H2O/O2 detection. In the 40% bandpass, once again the results mimic the 50% O2/110% O3 PAL counterpoint case shown in Figure 10f. The differences shown are a smaller region of triple H2O/O2/O3 detection and dual O2/O3, resulting in larger single O2 and dual H2O/O2 detection regions. This also results in a smaller single H2O detection region. In these cases, the limiting molecules are the same as above: at a 20% bandpass, H2O is the limiting molecule for dual H2O/O2 detection in both SNR and wavelength. At a 30% bandpass, H2O is the limiting molecule for dual H2O/O2 detection and triple H2O/O2/O3 detection in SNR; however, O3 is the limiting molecule in wavelength. At a 40% bandpass, O3 is the limiting molecule in wavelength and SNR for a triple H2O/O2/O3 detection.