Retrievals Applied to a Decision Tree Framework Can Characterize Earthlike Exoplanet Analogs

Exoplanet characterization missions planned for the future will soon enable searches for life beyond our solar system. Critical to the search will be the development of life detection strategies that can search for biosignatures while maintaining observational efficiency. In this work, we adopted a newly developed biosignature decision tree strategy for remote characterization of Earthlike exoplanets. The decision tree offers a step-by-step roadmap for detecting exoplanet biosignatures and excluding false positives, based on Earth’s biosphere and its evolution over time. We followed the pathways for characterizing a modern-Earth-like planet and an Archean-Earth-like planet and evaluated the observational trades associated with coronagraph bandpass combinations of designs consistent with the Habitable Worlds Observatory precursor studies. With retrieval analyses of each bandpass (or combination), we demonstrate the utility of the decision tree and evaluate the uncertainty on a suite of biosignature chemical species and habitability indicators (i.e., the gas abundances of H2O, O2, O3, CH4, and CO2). Notably for modern Earth, less than an order of magnitude spread in the 1σ uncertainties was achieved for the abundances of H2O and O2, planetary surface pressure, and atmospheric temperature, with three strategically placed bandpasses (two in the visible and one in the near-infrared). For the Archean, CH4 and H2O were detectable in the visible with a single bandpass.


INTRODUCTION
We as a science community are approaching a new horizon of scientific discovery for exoplanets. With the thousands of exoplanet detections that have been made to date, we are uniquely positioned to take the next steps toward exoplanet characterization and evaluating the potential for life to exist outside our solar system. The outcome of these efforts will depend on our ability to identify planets with habitable conditions and to remotely analyze them for signs of life. The recent Decadal Survey on Astronomy and Astrophysics 2020 recommended a space-based ∼6 m telescope operating at near-infrared, optical, and ultraviolet wavelengths as a means to search for Earthlike planets orbiting nearby Sun-like stars and characterize their atmospheres with direct imaging and spectroscopy (National Academies of Sciences, Engineering, and Medicine 2021). While this mission, dubbed the Habitable Worlds Observatory (HWO) by NASA, may not launch until the late 2030s or early to mid 2040s, now is the time to develop life detection strategies so that these science-driven objectives can inform the instrumentation and architecture of future exoplanet characterization missions like HWO. This is especially important given that the search for life will likely drive multiple challenging telescope architecture and instrument decisions (e.g., coronagraph design).
To conduct a search for life via spectroscopic observations, we can extract planetary contextual information from exoplanet spectra and search for biosignatures. This can generally be done through atmospheric retrievals, which are inference analyses that statistically determine the atmospheric states able to reproduce the spectral data while taking into account sources of uncertainty such as observational noise (Madhusudhan 2018). Coronagraph bandpasses may be limited to widths of ∼10-20% (Roberge & Moustakas 2018). Therefore, spectrally characterizing potentially Earth-like planets via reflected light coronagraphy with HWO may require a piece-wise approach in which we observe discrete portions of the planet's spectrum at a given time. Placing these bandpasses and determining the observational requirements necessary to make biosignature detections are crucial for developing a robust search strategy and making informed instrument recommendations. Community endeavours like the Confidence of Life Detection (CoLD) Scale (Green et al. 2021) and the Biosignature Standards of Evidence workshop (Meadows et al. 2022) have laid the groundwork by generating community discourse on the topic and ideas for reporting a level of confidence associated with a potential life detection. Additionally, prior studies have explored planetary and atmospheric inference as a function of signal-to-noise ratio (SNR) and resolution (e.g., Madhusudhan & Seager 2009; Benneke & Seager 2012; Line et al. 2013; Feng et al. 2018; Barstow et al. 2020). However, there is work to be done to connect observational strategies for life detection to observational considerations for specific facilities like HWO.
Here, we present a newly developed decision tree framework that organizes the workflow of observations and outlines bandpass choices to conduct a search for Earth-like atmospheric biosignatures based on Earth through time. We test the feasibility of this framework by calculating how well it can constrain key atmospheric species, despite only covering part of the planet's spectrum, for modern and Archean Earth as examples of types of exoplanets we may someday observe. Specifically, the decision tree considers bandpasses placed to observe key gases for the characterization of habitability and search for biosignatures: atmospheric water vapor (H2O), molecular oxygen (O2), ozone (O3), methane (CH4), and carbon dioxide (CO2). H2O is required by all life as we know it, and remotely detecting it in an exoplanet atmosphere can help identify the habitable targets amongst a broader set of terrestrial planets (e.g., Mottl et al. 2007). O2 is a byproduct of the dominant metabolism on modern Earth, oxygenic photosynthesis, and a key biosignature that will be sought on exoplanets (e.g., Meadows et al. 2018). O3 can provide UV shielding from stellar activity that can be potentially harmful to organic life, but is also a photochemical byproduct of O2 and can remain detectable at lower O2 abundances (Schwieterman et al. 2018). Finally, CH4 is included because it is the main product of biogenic processes like methanogenesis and has few false positive mechanisms in the context of terrestrial planet atmospheres (e.g., Wogan et al. 2020). Additionally, CH4 detected alongside CO2 could provide chemical context for a more oxidizing environment that would necessitate a substantial CH4 flux in order to maintain its presence in the atmosphere (e.g., Krissansen-Totton et al. 2018; Arney et al. 2018; Thompson et al. 2022). Atmospheric pressure is included in the decision tree to rule out a low pressure atmosphere false positive for O2 that can leave interpretation of O2 in absence of a detection of CH4 ambiguous for modern Earth (Wordsworth & Pierrehumbert 2014).
The decision tree includes observational pathways tuned to detecting each of the aforementioned biosignature chemical species, and also has built-in logic to account for potential false positive scenarios. We studied two pathways (modern Earth and Archean Earth) outlined in the decision tree as a first step towards testing the utility of this strategy for characterizing observationally distinct Earth-like biospheres. We chose modern Earth because it is the most well studied inhabited environment and a common point of reference in many exoplanet habitability studies. In contrast, the Archean was an era of Earth's geological history spanning roughly 4.0 to 2.5 billion years ago (Ga) that was representative of an anoxic (but still inhabited) environment highly distinct from modern Earth. Observationally, Archean Earth would have looked substantially different from modern Earth, and in the decision tree we reflect that with the observations that are prioritized for each planetary case. Examining both of these pathways allows us to conduct robust trade studies using simulated observations and retrievals to guide instrument development in the context of future exoplanet direct imaging and characterization missions.

METHODS
We performed a series of atmospheric studies to test a proposed observational strategy to characterize potentially Earth-like exoplanets with the HWO. The main goal of this study was to assess the general feasibility of the decision tree framework, and highlight key trade-offs that could be used to inform the development of HWO. The proposed decision tree framework outlines a step-by-step workflow of observations, which are prioritized based on searching for Earth-like habitability markers, atmospheric biosignature species, and ruling out potential false positives. Using the retrieval model, we executed each level of the decision tree strategy and made key atmospheric inferences.

The Decision Tree Observational Strategy
This observational strategy offers a top-level guideline for detecting exoplanet biosignature species and excluding false positives based on our current best understanding of Earth's biosphere and its evolution throughout geologic history. Outlined below (Figure 1) is a cartoon diagram of our adopted decision tree showing the step-by-step process for characterizing Earth-like biosignatures. While this decision tree is tailored for the search for biosignatures on exoplanets analogous to Earth history, future iterations of HWO decision trees could be broadened to include other types of planets, including non-habitable ones.
Figure 1. Cartoon schematic of the decision tree framework. The modern Earth and Archean Earth paths, highlighted in blue and orange respectively, were selected for exploration in our retrieval simulations as a first look at how this framework could work in observations of exoplanets. In general, the purple, green, and red boxes represent wavelength regions in the UV, visible, and NIR channels respectively. In practice, observations following the framework would be performed from the top to the bottom where, at the base, observational targets could be categorized into the respective planetary scenarios. Ambiguous planetary scenarios are in dark grey, dead rocky cases are in red, modern Earth-like cases are in blue, Proterozoic Earth-like cases are in pink, and Archean Earth-like cases are in orange.
In practice, the order of observations for a given high priority target would be executed from top to bottom, where the color coding for each layer of the decision tree represents the following coronagraph channels split by wavelength ranges: visible from 0.4-1.0 µm (green), near infrared from 1.50-1.80 µm (red), and ultra-violet from 0.2-0.4 µm (purple). Observations that use different coronagraph channels could be performed in parallel. Hypothetically, the search would start with H2O to establish the possibility of habitability. In that same observation that searches for H2O, CH4 could also be observed if it were present at abundances higher than modern Earth and more akin to Archean Earth, with specifics to be determined by future work. If water or both species were to be detected, then one could move on to the next set of observations outlined in the decision tree. For example, if water vapor was detected but methane was not, one could perform a search for molecular oxygen with the 0.68-0.81 µm bandpass, as this would suggest a planet deficient in CH4 and perhaps similar to modern Earth. If O2 were detected, one could next search for CH4 in the NIR to help rule out biosignature false positive scenarios for O2 and search for an additional line of evidence for life. If CH4 were not detected, obtaining constraints on atmospheric pressure is a useful way to rule out a low pressure atmosphere false positive scenario for O2. Alternatively, if methane was detected along with the initial water vapor observation, a search for CO2 could be conducted in the 1.50-1.80 µm bandpass to provide geophysical context for interpreting the CH4 as a biosignature (or not); this bandpass additionally has a CH4 absorption feature that would improve the CH4 constraint. Each observation could be performed sequentially until an end branch is reached in the decision tree. Those end branches then indicate planets of multiple types, which could range from dead and rocky, to observationally ambiguous, to a potential biosignature detection characteristic of an era of Earth's history.
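The branching just described can be written down as a small lookup, which may help when checking which observation comes next for a given set of detections. The following is only a sketch of the two pathways walked through in this work: the class name `Detections`, the function `walk_tree`, and the returned strings are hypothetical, the shortcut that a NIR CH4 detection alongside O2 leads directly to a modern Earth-like label is a simplifying assumption, and branches the text does not spell out (no H2O, or no O2) are returned as such rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detections:
    """Detection flags per species; None means 'not yet observed'."""
    h2o_visible: bool                              # 0.81-0.97 um bandpass
    ch4_visible: bool = False                      # same bandpass as H2O
    o2_visible: Optional[bool] = None              # 0.68-0.81 um bandpass
    ch4_nir: Optional[bool] = None                 # 1.50-1.80 um bandpass
    co2_nir: Optional[bool] = None                 # 1.50-1.80 um bandpass
    pressure_above_0p1_bar: Optional[bool] = None  # any bandpass


def walk_tree(d: Detections) -> str:
    """Return the next observation or an end-branch label (illustrative only)."""
    if not d.h2o_visible:
        return "no H2O detected: branch not detailed in the text"
    if d.ch4_visible:
        # H2O and CH4 together: follow the Archean Earth pathway.
        if d.co2_nir is None:
            return "observe 1.50-1.80 um: search for CO2 and more CH4"
        return "Archean Earth-like" if d.co2_nir else "Ambiguous"
    # H2O without CH4: follow the modern Earth pathway.
    if d.o2_visible is None:
        return "observe 0.68-0.81 um: search for O2"
    if not d.o2_visible:
        return "O2 not detected: branch not detailed in the text"
    if d.ch4_nir is None:
        return "observe 1.50-1.80 um: search for CH4 and CO2"
    if d.ch4_nir:
        # Assumption: CH4 alongside O2 is treated as sufficient here.
        return "modern Earth-like"
    if d.pressure_above_0p1_bar is None:
        return "constrain surface pressure"
    return "modern Earth-like" if d.pressure_above_0p1_bar else "Ambiguous"


modern = Detections(h2o_visible=True, o2_visible=True, ch4_nir=False,
                    pressure_above_0p1_bar=True)
archean = Detections(h2o_visible=True, ch4_visible=True, co2_nir=True)
print(walk_tree(modern))
print(walk_tree(archean))
```

Leaving unobserved quantities as `None` lets the same function report either the next recommended bandpass or a final category, mirroring the top-to-bottom execution of the tree.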
In this work, we focused on simulating habitable Earth-like planets around a Sun-like star and used the decision tree to characterize both a modern Earth-Sun twin and an Archean Earth-Sun twin. However, the decision tree could be adapted to consider potential Earth-like planets around other host stars as well. For example, a modern Earthlike planet around an M dwarf or late K dwarf host star may produce more substantial amounts of CH4 given the different photochemistries driven by the different UV spectra of these stars (e.g., Segura et al. 2005; Arney 2019). With detectable levels of CH4, O2, H2O, and an inferred global surface pressure above 0.1 bar, one could still classify that planet as modern Earth-like, given knowledge of these stars' photochemistries, but the optimal order of observations in a decision tree framework might differ. We focus on the G-type scenario here as a starting point, but developing iterations of the tree for other stars will be an important avenue for future work.

The Retrieval Model
The open-source rfast model (Robinson & Salvador 2023) was adopted for the atmospheric retrievals. It incorporates a radiative transfer forward model, an instrument noise model, and a retrieval tool to facilitate "fast" investigations of exoplanet atmospheric remote sensing scenarios. The radiative transfer forward model can simulate both 1-D and 3-D views of an exoplanet in (1) reflected light, (2) emission, and (3) transit, and the atmospheric chemical and thermal state (including profiles of cloud properties) are the forward model inputs. The incorporated noise model is based on Robinson et al. (2016), and provides wavelength-dependent noise estimates for a variety of observing scenarios. Finally, the retrieval package employs emcee, a Bayesian sampling package (Foreman-Mackey et al. 2013), which calls the aforementioned radiative transfer model while mapping out the posterior distribution for the atmospheric and planetary parameters used to fit a noisy observation.
In applications of the atmospheric retrieval model presented below, flat priors are used to bound parameter space and the likelihood function, $\mathcal{L}$, is given by $\ln \mathcal{L} = -\chi^2/2$, where $\chi^2$ is computed in the standard fashion given a set of simulated datapoints, associated Gaussian uncertainties, and a model prediction. The retrievals assume an isothermal temperature profile (representative of an atmospheric column-averaged temperature) with constant profiles assumed for the species mixing ratios. The decision tree framework is not formally Bayesian and our atmospheric retrieval approach simply maximizes the likelihood function within the bounds of the imposed flat priors.
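As an illustration of how flat priors and a Gaussian likelihood combine into the quantity a sampler such as emcee evaluates, here is a minimal sketch. It assumes the standard Gaussian form, ln L = -χ²/2 up to an additive constant, and it is not the rfast implementation: the function names, the `forward_model` callable, and the toy constant-flux model are all hypothetical.

```python
import numpy as np


def log_prior(theta, lower, upper):
    """Flat prior: 0 inside the bounds, -inf outside."""
    theta = np.asarray(theta, dtype=float)
    inside = np.all((theta >= np.asarray(lower)) & (theta <= np.asarray(upper)))
    return 0.0 if inside else -np.inf


def log_likelihood(model_flux, data_flux, sigma):
    """Gaussian log-likelihood up to a constant: -chi^2 / 2."""
    chi2 = np.sum(((np.asarray(data_flux) - np.asarray(model_flux))
                   / np.asarray(sigma)) ** 2)
    return -0.5 * chi2


def log_probability(theta, forward_model, data_flux, sigma, lower, upper):
    """Log-posterior (up to a constant) for a given parameter vector."""
    lp = log_prior(theta, lower, upper)
    if not np.isfinite(lp):
        return -np.inf  # skip the forward model outside the prior bounds
    return lp + log_likelihood(forward_model(theta), data_flux, sigma)


# Toy usage: a one-parameter "spectrum" that is flat at the value theta[0].
data = np.array([1.0, 2.0])
sigma = np.array([1.0, 1.0])
toy_model = lambda theta: np.full(2, theta[0])
print(log_probability([1.0], toy_model, data, sigma, lower=[0.0], upper=[5.0]))
```

A function with this shape (parameter vector in, scalar log-probability out) is what ensemble samplers typically expect; returning `-inf` outside the bounds is how the flat priors restrict the explored parameter space.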

Modern Earth & Archean Earth Atmospheric Modeling
Self-consistently calculated atmospheres were generated using the photochemical model incorporated as part of the Atmos tool (Arney et al. 2016; Arney et al. 2017; Teal et al. 2022). Atmos is a coupled 1-D photochemical-climate model that uses planetary inputs (e.g., chemical species mixing ratios, chemical species fluxes, gravity, and stellar spectrum) and associated chemical reactions to calculate the steady-state atmospheric profiles of gases, hazes (if applicable), pressure, temperature, etc. In both the modern Earth and Archean Earth atmospheric scenarios, the default solar spectrum was used to model Earth-like cases relevant to reflected light observations (Thuillier et al. 2004). The simulated modern Earth atmospheric case assumed a total fixed surface pressure of 1 bar. The atmospheric state we simulated for the Archean Earth-like scenario represents a haze-free case and was modeled with a CO2 mixing ratio of 4% and a CH4/CO2 ratio of 0.075, which is below the value of 0.1 where hazes are expected to form (Trainer et al. 2006). Additionally, for the Archean Earth scenario, the solar spectrum was scaled to reflect its wavelength-dependent insolation from 2.7 Gyr ago (Claire et al. 2012).
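The Archean inputs quoted above imply a CH4 mixing ratio through simple arithmetic, and the haze criterion is a single comparison. The short sketch below works through both using only the numbers stated in the text; the variable names are illustrative and the result is the input value implied by the ratio, not an Atmos output.

```python
# Values quoted in the text for the haze-free Archean Earth-like case.
co2_mixing_ratio = 0.04   # 4% CO2
ch4_to_co2 = 0.075        # adopted CH4/CO2 ratio
haze_threshold = 0.1      # CH4/CO2 ratio at which hazes are expected to form

# Implied CH4 mixing ratio: ratio times the CO2 mixing ratio.
ch4_mixing_ratio = ch4_to_co2 * co2_mixing_ratio

# Hazes are expected only at or above the threshold ratio.
haze_expected = ch4_to_co2 >= haze_threshold

print(f"CH4 mixing ratio: {ch4_mixing_ratio:.1e}, haze expected: {haze_expected}")
```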

Simulated rfast Observations
Reflected light observations in this study were modeled after the decadal direct imaging exoplanet mission concepts that will likely inform the future development of HWO. For the purpose of our study, we assume HWO's proposed exoplanet coronagraphic instrumentation will be based on the properties of the coronagraphs studied in the Large-UV-Optical-Infrared (LUVOIR) and Habitable Exoplanet (HabEx) observatory concept studies (Roberge & Moustakas 2018; Gaudi et al. 2018). This coronagraph instrument would include three separate coronagraph channels, labeled as "ultra-violet" (with a wavelength range of 0.2-0.4 µm), "visible" (0.4-1.0 µm), and "near-infrared" (1.0-1.8 µm). Over this range, the observed spectrum is split into smaller spectral bandpasses. It is possible to conduct simultaneous observations in two of the different observational channels (i.e., bandpasses in the visible and NIR channels could be observed in parallel, while two bandpasses in a single channel must be observed in series). In this study, each observational channel is modeled at resolving powers of 7, 140, and 70 for the UV, visible, and NIR respectively, based on the LUVOIR design (Roberge & Moustakas 2018). We then assume 20% spectral bandpasses across each wavelength channel. Following the modern Earth decision tree pathway (see Figure 1), we have three main observational bandpasses of interest. (1) The 0.89 µm bandpass at 0.81-0.97 µm, which is capable of measuring H2O abundances at Earth-like levels, as well as CH4 abundances higher than modern levels. (2) The 0.75 µm bandpass at 0.68-0.81 µm, which was chosen to retrieve atmospheric abundances of O2 by encapsulating the strongest O2 absorption feature in this range (the O2 A-band) at 0.762 µm. (3) The 1.65 µm bandpass at 1.50-1.80 µm, which can be used to constrain (but not necessarily measure) CH4 and CO2 abundances. We simulated observations using each singular bandpass and subsequently modeled observations of bandpass combinations, all at a nominal SNR of 10. Table 1 and Table 2 of the Appendix include all the single bandpasses and bandpass combinations that we simulated for characterizing modern Earth-like and Archean Earth-like atmospheric scenarios, respectively. Each observation consisted of a simulated noisy spectral dataset, which was generated with randomized error bars and constant noise set at the shortest wavelength within a given bandpass combination. We adopted this formalism for modeling the observational noise from results outlined in the studies from the LUVOIR and HabEx decadal survey reports (Roberge & Moustakas 2018; Gaudi et al. 2018).
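To make the observational setup concrete, the sketch below builds a bandpass, a constant-resolving-power wavelength grid, and a noisy spectrum with constant error bars. It rests on stated assumptions rather than on the rfast noise model: the quoted bandpass edges (0.81-0.97 µm, 0.68-0.81 µm, 1.50-1.80 µm) are approximately reproduced if the 20% width is measured relative to the short-wavelength edge, the noise level is taken as the flux at the shortest wavelength divided by the SNR, and all function names are hypothetical.

```python
import numpy as np


def bandpass_edges(lam_min_um, fractional_width=0.20):
    """Edges of a bandpass whose width is a fraction of its short edge (assumed)."""
    return lam_min_um, lam_min_um * (1.0 + fractional_width)


def constant_r_grid(lam_min_um, lam_max_um, resolving_power):
    """Wavelength grid with constant R: each point is (1 + 1/R) times the last."""
    step = np.log1p(1.0 / resolving_power)
    n = int(np.floor(np.log(lam_max_um / lam_min_um) / step))
    return lam_min_um * (1.0 + 1.0 / resolving_power) ** np.arange(n + 1)


def add_constant_noise(flux, snr, rng):
    """Gaussian noise with one sigma for all points, set at the shortest wavelength."""
    flux = np.asarray(flux, dtype=float)
    sigma = np.full_like(flux, flux[0] / snr)  # flux[0] is the bluest point
    noisy = flux + rng.normal(0.0, sigma)
    return noisy, sigma


# Example: the 0.89 um visible bandpass at R = 140 and SNR = 10.
lo, hi = bandpass_edges(0.81)
grid = constant_r_grid(lo, hi, resolving_power=140)
flat_spectrum = np.ones_like(grid)  # placeholder for a forward-model spectrum
noisy, sigma = add_constant_noise(flat_spectrum, snr=10,
                                  rng=np.random.default_rng(0))
print(len(grid), round(hi, 3), sigma[0])
```

In a real retrieval the placeholder flat spectrum would be replaced by the forward-model output on the same grid; the grid and noise construction would otherwise be unchanged.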

RESULTS
Retrieval results are analyzed for information inferred from spectral "pieces" of modern Earth-like (Section 3.1) and Archean Earth-like (Section 3.2) spectra composed of 20% spectral bandpasses according to their respective pathways on the decision tree. The modern Earth pathway (Figure 1, blue path) included the 0.75 µm, 0.89 µm, and 1.65 µm bandpasses, and the Archean Earth pathway (Figure 1, orange path) included the 0.89 µm and 1.65 µm bandpasses. The following subsections highlight constraints on the volume mixing ratios of a number of gases related to habitability or a biosignature search: O2, H2O, CO2, O3, and CH4. They also include the retrieved values for various atmospheric state parameters, including surface albedo, planetary mass/radius, surface pressure, and atmospheric temperature.

The Modern Earth Pathway
Figure 2 depicts the full spectrum of modern Earth, which was retrieved on in pieces. Each spectral piece corresponds to a 20% bandpass indicated in the decision tree and labeled by the midpoint of each spectral swath. In the 0.75 µm region, the O2 A-band feature is present at 0.76 µm, and an H2O feature is present at 0.72 µm. The 0.89 µm bandpass includes two prominent H2O features at 0.82 µm and 0.94 µm. Finally, the 1.65 µm region contains weak CH4 (at 1.65 µm) and CO2 (at 1.6 µm) absorption features. The goal of this study was to determine the amount of planetary information that could be inferred from a given observation spanning one or a combination of these smaller spectral regions. The breadth of information that can be inferred from spectral information is strongly dependent on the singular bandpass choice (or bandpass combination). Figure 3 shows the retrieved marginal posterior distributions of gas mixing ratios for H2O (Figure 3a), CH4 (Figure 3b), O2 (Figure 3c), and CO2 (Figure 3d). The retrievals show that observations gathering spectral information from all three bandpasses (labeled as "All three") provide the best constraints for each of these gases, and for the majority of atmospheric parameters that were retrieved on (this is described in more detail below). Notably, the 0.75 µm & 0.89 µm bandpass combination provides excellent constraints on both the O2 and H2O abundance and is second only to the "All three" combination in terms of the quality of the constraints. CH4 and CO2 are extremely difficult to constrain with any of the bandpass combinations. At best, we obtain upper limit constraints on the abundances of each of these species. Most of the posteriors for CH4 and CO2 exhibit distributions with a significant drop-off at a given value not dictated by the statistical prior of that parameter, making these upper limit constraints. In Figure 3b, a vertical grey dashed line representing an Archean Earth-like estimate of 3.5 × 10^−2 for the CH4 abundance (adopted from Robinson & Reinhard (2020)) is included and demonstrates that we can only rule out high CH4 scenarios with the atmospheric retrievals.
Figure 4 outlines the marginal posteriors for the log of the surface albedo (A_s), the planetary radius (R_p), and planetary mass (M_p). We found that constraints on the surface albedo (Figure 4a) are strongly dependent on the bandpass (or bandpass combination) choice. For planetary radius (Figure 4b), we find the posteriors for all the observational cases exhibit distributions that are peaked closely to "truth" (i.e., the original input value) and have statistical tails extending toward higher values. Planetary mass (Figure 4c) goes unconstrained for all observational cases and is bound solely by the mass prior. Constraints on the surface pressure (P_0) and the characteristic atmospheric temperature (T_0) are shown in Figure 5. For surface pressure (Figure 5a), a majority of the distributions (most notably the "All three" combination and the 0.75 µm & 0.89 µm combination) have peaked distributions near truth and exhibit somewhat Gaussian behavior with tails on either side of the peak. In these retrieval analyses, the bandpasses were not selected to detect a Rayleigh slope for an Earth-like planet. However, pressure constraints can come from the widths of gas absorption bands due to broadening. In Figure 5b we also find that the constraints on atmospheric temperature are peaked near the surface temperature (288 K) and have extended tails toward higher temperatures for all the observational scenarios. The temperature constraints ultimately stem from temperature-dependent opacities (mainly H2O).
Figure 6 takes Figure 3 and re-formats it such that the posterior distributions for all the atmospheric biosignatures are shown for an individual bandpass or combination of bandpasses in each subplot. This is to better highlight the bandpass combinations that may be optimal for detecting (or constraining) one or more biosignature species. For instance, the 0.89 µm bandpass (Figure 6b) provides a clear detection of H2O (filled purple distribution) but provides no constraint on the O2 abundance (purple hatched distribution). In order to get constraints on both H2O and O2, the 0.75 µm & 1.65 µm (Figure 6e), the 0.75 µm & 0.89 µm (Figure 6d), or the "All three" (Figure 6g) bandpass combinations would be required.

The Archean Earth Pathway
For the Archean Earth pathway on the decision tree, there are two main bandpasses of interest, the 0.89 µm bandpass and the 1.65 µm bandpass. This pathway has fewer observations, because in this case the "search for water" can occur in the same bandpass as the search for the primary biosignature, CH4. This feature is detectable at CH4 concentrations higher than those present on modern Earth, which may occur in biospheres similar to Archean Earth whose primary producers generated CH4 in an oxygen-poor atmosphere. This feature might also be detectable in an oxygen-rich atmosphere around M and K dwarfs (Segura et al. 2005; Arney 2019), whose photochemistry allows for higher CH4 abundance in the presence of oxygen; this will be treated in the decision tree framework in future studies. Figure 7 shows a noise-free Archean Earth spectrum generated from the aforementioned Atmos atmospheric profiles. The atmosphere used in our analysis does not include a haze, which may have been intermittently present during the Archean (e.g., Arney et al. 2016). The 0.89 µm and 1.65 µm bandpasses are color coded in the green and red shaded regions respectively, and the proposed wavelength regions for each coronagraph channel are marked along the top axis. Absorption features of key species are labeled along with the spectral slope due to Rayleigh scattering. Within each of the two bandpasses there are CH4 features at 0.87 µm and 1.27 µm, which both provide key CH4 abundance information in the atmospheric retrievals. Additionally, there is a CO2 absorption feature encapsulated in the 1.65 µm bandpass at 1.6 µm and H2O absorption features in the 0.89 µm bandpass at 0.82 µm and 0.94 µm. Figure 8 shows the resulting constraints on the H2O, CH4, and CO2 abundances for the singular and combined bandpass observations. For CH4 especially (Figure 8b), the overall constraints are much improved in comparison to the modern Earth case because of its increased abundance for this Archean Earth-like scenario. We also find here that the prominent CH4 feature at 0.87 µm allows for the detection of CH4 at the shorter wavelength bandpass compared to modern Earth. Because the CO2 constraints are weaker compared to the H2O and CH4 constraints with the nominal SNR 10 observation, we simulated an additional SNR 20 observation with the 1.65 µm singular bandpass. We find that at higher SNR the retrievals produced a more sharply peaked posterior distribution for CO2 (Figure 8c, purple filled curve). H2O (Figure 8a) can be detected with the 0.89 µm & 1.65 µm bandpass combination or the singular 0.89 µm bandpass. Similar trends to the modern Earth scenario can be seen for the parameters in Figure 9 and Figure 10. However, the pressure constraints are notably weaker, with an inability to rule out high pressure scenarios in comparison to the modern Earth results. Similar to Figure 6 for the modern Earth decision tree pathway, Figure 11 shows species abundance constraints for each singular bandpass observation, and for the combination of the two. In this atmospheric scenario, all oxygen-bearing species (i.e., O2 and O3) go unconstrained at all observational bandpass combinations due to the very low relative abundances of these species during the Archean. In the 0.89 µm bandpass (Figure 11a), we can constrain both CH4 and H2O. It remains challenging to extract CO2 abundance information at R=70, even from the 1.65 µm singular and 0.89 µm & 1.65 µm combined observations, which are the best case scenarios here. In the 0.89 µm & 1.65 µm case the CO2 abundance posterior still exhibits a statistically significant tail toward lower values, so that low CO2 abundances are just as likely as the abundances at the peak (Figure 11c, red curve). Better constraints on CO2 can come from higher SNR, as demonstrated here, or higher spectral resolution.

DISCUSSION
Overall, our decision tree retrieval analyses for both modern and Archean Earth indicate the best constraints on each planetary parameter are achieved when maximizing the amount of spectral data we retrieve on. In other words, simulations that retrieved on the highest number of combined bandpasses (e.g., the "All three" combination for modern Earth and the 0.89 µm & 1.65 µm combination for Archean Earth) generally provided more contextual information and thus led to higher quality constraints on a given parameter, even when a given gas did not have absorption features in all of the bandpasses being combined. By design, the observations for single bandpasses that contained absorption features for a given species also generally provided good constraints for that species. For example, the 0.89 µm bandpass does well at constraining the H2O abundance because of the multiple H2O features to retrieve on in this wavelength range. However, when using that same bandpass on its own to retrieve the modern Earth O2 abundance, it does poorly due to a lack of O2 features within that wavelength range.
Stepping through the decision tree for the modern Earth case, H2O can be detected and even constrained to within an order of magnitude (1-σ uncertainty) for the "All three" combination (Table 1). We saw that for the 0.89 µm bandpass, including the weaker 0.82 µm H2O feature in addition to the stronger 0.94 µm feature is useful for better constraining the H2O abundance. This is because the depth of that weaker feature allows the retrieval model to rule out high H2O abundance scenarios, which are limited by the signal strength (or depth) of that feature. Given the low abundance of CH4 for modern Earth, it is not detected in the initial bandpass observation. Obtaining H2O abundance constraints will inform interpretation of the actual habitability of a given potentially habitable target orbiting in the habitable zone of its host star. A non-detection of CH4 in this first observation means the next observation should prioritize a search for O2, as the planet is unlikely to be an Archean Earth analog. Using the "All three" bandpass combination, gas abundance constraints for O2 are (to 1-σ confidence) within an order of magnitude of the true value. The 0.75 µm & 0.89 µm bandpass combination also provides excellent constraints on the O2 abundance, allowing us to move on to the next level of the decision tree, which is conducting a search for CH4 and CO2 in the NIR. Both CH4 and CO2 remained difficult to constrain for all bandpass combinations, which was to be expected. This is due to the low relative abundances of these species for a modern Earth-like atmospheric scenario, and the correspondingly small features caused by these gases in the assumed HWO wavelength range. From the CH4 posteriors, we can infer an upper limit on the abundance, and rule out an Archean Earth-like scenario with a high CH4 abundance to 1-σ level confidence for all the observational combinations (1-σ spreads shown in Table 1). While the retrievals only ruled out very high CO2 concentrations on the order of ∼10% of the atmosphere, this can be useful in ruling out scenarios where CO2-dominated atmospheres lead to the abiotic photochemical production of O2 or O3 (Segura et al. 2007; Domagal-Goldman et al. 2014; Tian et al. 2014). The final observational priority is to constrain the atmospheric pressure. Atmospheric pressure is intentionally generalized in the decision tree without a specific wavelength range indicated. While one could tune a dedicated bandpass on the decision tree to encompass, for example, the Rayleigh scattering slope, we show here that pressure constraints can also be derived at various wavelengths from the widths of spectral features that are broadened due to pressure. The marginal posterior distributions for pressure rule out < 0.1 bar scenarios to 1-σ confidence for all the observational combinations except for the 0.75 µm singular bandpass (1-σ spreads shown in Table 1), helping rule out low pressure O2 false positive cases. This indicates that the final categorization for the planet would be "modern Earth" as opposed to "Ambiguous", which is an accurate characterization for our modern Earth abundance scenario.
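Statements such as "constrained to within an order of magnitude" and "rules out < 0.1 bar scenarios to 1-σ confidence" can be checked directly against posterior samples. The sketch below shows one way to do so, under two assumptions that are not taken from the text: the 1-σ interval is defined by the 16th and 84th percentiles of the marginal posterior, and a scenario counts as ruled out at 1-σ when the 16th percentile lies above the threshold. The samples are synthetic stand-ins, not retrieval output, and the function names are hypothetical.

```python
import numpy as np


def one_sigma_summary(log10_samples):
    """Median, 16th/84th percentiles, and their separation in dex."""
    lo, med, hi = np.percentile(log10_samples, [16, 50, 84])
    return {"median": med, "lower": lo, "upper": hi, "spread_dex": hi - lo}


def rules_out_below(log10_samples, log10_threshold):
    """True if the lower 1-sigma bound sits above the threshold (assumed criterion)."""
    return bool(np.percentile(log10_samples, 16) > log10_threshold)


# Synthetic stand-in for a marginal posterior on log10(surface pressure / bar).
rng = np.random.default_rng(42)
log10_p0 = rng.normal(loc=0.0, scale=0.3, size=20000)

summary = one_sigma_summary(log10_p0)
within_one_dex = summary["spread_dex"] < 1.0
low_pressure_excluded = rules_out_below(log10_p0, np.log10(0.1))
print(summary, within_one_dex, low_pressure_excluded)
```

Working in log10 space makes the "order of magnitude" criterion a simple difference of percentiles, and the same two functions apply unchanged to gas mixing ratio posteriors.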
Going through the decision tree for the Archean Earth case, observations would again start at the top of the tree, prioritizing a search for H2O and CH4 in the visible. H2O remains detectable in the retrieval results with the 0.89 µm singular bandpass, and the CH4 constraints are vastly improved in comparison to the modern Earth scenario. Since H2O and CH4 were successfully detected, the next step would be a search for CO2 and additional CH4 features in the NIR to provide geochemical context for interpreting the CH4 detection. In our retrievals, CO2 was on the cusp of detection at nominal SNR 10 observations with a resolving power of 70. With initial observations, this planetary scenario could be placed in the Ambiguous planet category and flagged as a potentially key target for follow-up. We performed simulated follow-up observations at an increased SNR of 20 with the singular 1.65 µm bandpass. Those results yielded significantly improved abundance constraints on CO2, which better ruled out low CO2 abundance scenarios. In practice, observations at increased SNR or increased resolving power could help strengthen the CO2 inference such that this abundance scenario could be characterized as Archean Earth-like.
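The two walkthroughs above can be condensed into a short Python sketch of the branching logic. This is a simplified reading of the pathways described in the text, not the full decision tree: the detection flags are booleans rather than posterior-based criteria, the `o2_high` flag stands in for a threshold that is not specified here, and the class name, function name, and the return strings for branches the text does not cover are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Inferences:
    """Hypothetical summary of retrieval outcomes for one target."""
    h2o_detected: bool                 # visible, 0.89 um bandpass
    ch4_detected: bool                 # visible, 0.89 um bandpass
    o2_detected: bool = False          # visible, 0.75 um bandpass
    o2_high: bool = False              # placeholder flag; threshold not given here
    co2_detected: bool = False         # NIR, 1.65 um bandpass
    pressure_above_0p1_bar: Optional[bool] = None  # None means unconstrained

def classify(inf: Inferences) -> str:
    """Simplified sketch of the modern Earth and Archean Earth pathways."""
    if not inf.h2o_detected:
        return "not covered by this sketch"
    if inf.ch4_detected:
        # Archean pathway: CO2 in the NIR supplies the geochemical context.
        return "Archean Earth" if inf.co2_detected else "Ambiguous"
    if not inf.o2_detected:
        return "UV follow-up (Proterozoic pathway)"
    if inf.o2_high:
        return "dead and rocky"
    # Pressures below 0.1 bar (or unconstrained) leave an O2 false positive open.
    return "modern Earth" if inf.pressure_above_0p1_bar else "Ambiguous"

modern = Inferences(h2o_detected=True, ch4_detected=False,
                    o2_detected=True, pressure_above_0p1_bar=True)
archean_snr10 = Inferences(h2o_detected=True, ch4_detected=True)
archean_snr20 = Inferences(h2o_detected=True, ch4_detected=True, co2_detected=True)

for case in (modern, archean_snr10, archean_snr20):
    print(classify(case))
```

In this sketch the Archean case moves from "Ambiguous" to "Archean Earth" once the follow-up observation secures CO2, mirroring the SNR 10 and SNR 20 outcomes described above.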
A notable pathway on the decision tree that will be further explored in future work is the Proterozoic Earth Pathway, which is key to contextualizing potential biosignature detections made in the UV wavelength range. In planetary scenarios where atmospheric O2 is present but difficult to detect, the 0.255 µm bandpass (which extends from 0.23-0.28 µm) incorporated in the decision tree would play an important role in characterizing less oxygenated planets. In these instances, observations in the UV would be the only avenue for detecting oxygen-bearing species (e.g., O3) within the proposed HWO wavelength range. As a whole, this decision tree is designed to categorize planets based on Earth through time because Earth is our only example of a habitable and inhabited world. Thus, it provides preliminary classifications based on the best-studied biosignatures expressed by Earth over its history and flags planets for detailed follow-up. However, it is important to acknowledge that the current iteration of the decision tree may not be able to identify alternative biospheres different from Earth's. Future work can expand or rework the current decision tree framework to test other hypotheses, including alternative biosignatures and their false positives.
In stepping through these pathways on the decision tree, our results validate the prioritization of the 0.89 µm bandpass at the top of the decision tree, since we can potentially constrain water and a key biosignature species in a single observation. Obtaining abundance constraints on multiple species in one observation enhances our ability to establish planetary habitability (via an H2O detection) and to perform more efficient preliminary characterization studies of a given target. NIR observations with the 1.65 µm bandpass provide important contextual climate information through abundance inferences of key species like CO2, which could help guard against misinterpretations of Archean Earth-like and CO2-dominated planets (Damiano & Hu 2022). Future telescope architectures also have the potential to leverage parallel observations (in contrast to sequential observations) for increased observational efficiency. For example, with the currently assumed divisions between the visible and NIR coronagraph channels, a search for H2O and/or CH4 in the visible could occur simultaneously with a search for CH4 and CO2 in the NIR, which would be the ideal setup for classifying a potentially Archean Earth-like planet. Conversely, if the long wavelength cutoff for the visible channel were moved shortward such that the 0.75 µm and 0.89 µm bandpasses were in separate channels, then a search for H2O/CH4 and O2 could be done in parallel, which would be more useful for a modern Earth scenario. Future trade studies on optimizing the coronagraph channel wavelength ranges will be important to consider as mission development continues.
Another key result of this work was our ability to constrain various planetary parameters that did not have a dedicated observational bandpass in the decision tree. For both atmospheric cases and all the simulated observational scenarios, the characteristic atmospheric temperature (T0) exhibited posterior distributions that peaked near the surface temperature (288 K for modern Earth and 296 K for Archean Earth) with statistically significant tails extending toward high values. These constraints largely arise from water vapor opacities, which are temperature sensitive, and prior work has shown similar trends (Young et al. 2023; Gomez Barrientos et al. 2023; Robinson & Salvador 2023). However, the constraints found here are poorer than those of the aforementioned studies, suggesting that bandpasses encompassing H2O features in the NIR, which are not present in this iteration of the decision tree, are necessary if the aim is to improve the atmospheric temperature constraints further. With regard to other parameters, we also saw surprisingly good constraints on the planetary radius: for both atmospheric cases, all the simulated observations produced posteriors peaked at 1 R⊕ with tails extending toward high values. These radius distributions all rule out (at 1-σ confidence) planets > 3 R⊕ (Table 1). Strong radius constraints in general can arise when the geometric albedo and/or observational phase angle are well known. Our retrievals assume a fixed observational phase angle representative of a planet observed in quadrature/gibbous phase, which is the most likely cause of the strong radius constraints. Additionally, we assume the orbital distance of the planet is known and fixed at 1 AU, which also leads to improved radius constraints. Taken together, this implies that if the scattering environment of the planet is well characterized along with the phase angle geometry and the orbital distance, there is sufficient information to constrain the planetary radius. While planetary mass was included as a retrieved parameter in these analyses, mass went unconstrained in all observational scenarios. This was to be expected given that mass (equivalent to surface gravity) has been seen to be difficult to constrain in reflected light (Feng et al. 2018). Mass estimates could be improved with astrometry or radial velocity measurements that could be incorporated into the mass prior of the retrieval. Additional parameters that were included in the retrieval but did not have dedicated bandpasses in the decision tree were the cloud parameters (cloud thickness, cloud top pressure, cloud optical depth, and cloudiness fraction). Constraining these parameters would be a crucial aspect of exoplanet characterization because they could indicate potentially Earth-like climatic states. Cloud optical depth, for example, is best constrained in the NIR with the 1.65 µm bandpass or the "All three" combination (see Table 1 and Table 2).
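The link between radius, albedo, phase angle, and orbital distance noted above follows from the standard reflected-light relation Fp/Fs = Ag Φ(α) (Rp/a)^2. The Python sketch below illustrates it under simplifying assumptions that are not taken from the retrieval model used in this work: a Lambertian phase function, a gray geometric albedo of 0.2 chosen purely for illustration, and hypothetical function names.

```python
import math

R_EARTH_M = 6.371e6        # mean Earth radius in meters
AU_M = 1.495978707e11      # astronomical unit in meters

def lambert_phase(alpha_rad):
    """Lambertian phase function (assumed here; equals 1 at full phase)."""
    return (math.sin(alpha_rad) + (math.pi - alpha_rad) * math.cos(alpha_rad)) / math.pi

def flux_ratio(geometric_albedo, radius_rearth, a_au, alpha_rad):
    """Planet-to-star flux ratio: Ag * Phi(alpha) * (Rp / a)^2."""
    rp = radius_rearth * R_EARTH_M
    a = a_au * AU_M
    return geometric_albedo * lambert_phase(alpha_rad) * (rp / a) ** 2

def radius_from_flux_ratio(ratio, geometric_albedo, a_au, alpha_rad):
    """Invert the relation for Rp (in Earth radii) when Ag, a, and alpha are known."""
    a = a_au * AU_M
    rp = a * math.sqrt(ratio / (geometric_albedo * lambert_phase(alpha_rad)))
    return rp / R_EARTH_M

quadrature = math.pi / 2
ratio = flux_ratio(0.2, 1.0, 1.0, quadrature)             # illustrative albedo of 0.2
r_true = radius_from_flux_ratio(ratio, 0.2, 1.0, quadrature)
r_bright = radius_from_flux_ratio(ratio, 0.8, 1.0, quadrature)

print(f"flux ratio = {ratio:.2e}, Rp = {r_true:.2f} vs {r_bright:.2f} Earth radii")
```

With the phase angle and orbital distance fixed, the same flux ratio maps to half the radius if the assumed albedo is four times higher, which is why spectral information on the scattering environment is what pins down the radius.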
The decision tree takes into consideration observations that may potentially rule out known false-positive scenarios. Mechanisms for generating O2 abiotically, for example, are well studied in the literature (Harman et al. 2018; Domagal-Goldman et al. 2014; Luger & Barnes 2015; Wordsworth & Pierrehumbert 2014). On the decision tree, high O2 abundances lead to the dead and rocky category, which represents instances where large volumes of O2 are generated by extensive photodissociation of water vapor followed by atmospheric escape of the hydrogen, leaving the atmospheric O2 behind. This false positive mechanism generally applies to planets orbiting M dwarfs, since these stars typically generate higher UV fluxes than sun-like stars. Additionally, M dwarfs go through an extended super-luminous pre-main sequence evolutionary phase that can drive off volatiles and create high levels of O2 from photodissociation of H2O (Luger & Barnes 2015). Another O2 false-positive pathway in the decision tree is at the pressure branch. The 0.1 bar threshold for categorization represents an atmospheric regime in which a limited abundance of non-condensable species leads to a buildup of water vapor at high altitudes (due to the lack of an atmospheric cold trap), where the H2O is then susceptible to photolysis (Wordsworth & Pierrehumbert 2014).
The decision tree retrieval analyses performed in this work are key precursor studies for the development of the upcoming Habitable Worlds Observatory and for developing direct imaging observational strategies for exoplanet characterization in general. We have demonstrated the utility of the decision tree and the planetary parameter constraints that can be achieved with simulated nominal SNR 10 observations in a given 20% bandpass at fixed resolving power. The decision tree framework lays out a step-by-step guide for how one might prioritize observations for characterizing Earth-like planets orbiting sun-like stars, but one could also expand on or adapt the decision tree based on the scientific motivations for a given observation.
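To give a sense of how much spectral information a single bandpass carries, the sketch below counts the resolution elements in a bandpass observed at constant resolving power R = λ/Δλ, using N ≈ R ln(λmax/λmin). The edges (1.50-1.80 µm) and R = 70 are the values quoted in this work for the NIR bandpass. The function names are hypothetical, and measuring the fractional width relative to the short-wavelength edge is an assumed convention that happens to give 20% for these edges.

```python
import math

def n_resolution_elements(lam_min_um, lam_max_um, resolving_power):
    """Approximate number of resolution elements at constant R = lambda / dlambda."""
    return resolving_power * math.log(lam_max_um / lam_min_um)

def fractional_width(lam_min_um, lam_max_um):
    """Bandpass width relative to its short-wavelength edge (assumed convention)."""
    return (lam_max_um - lam_min_um) / lam_min_um

nir = (1.50, 1.80)  # edges of the 1.65 um bandpass in microns
n_nir = n_resolution_elements(*nir, resolving_power=70)
width_nir = fractional_width(*nir)

print(f"1.65 um bandpass: width = {width_nir:.0%}, about {n_nir:.1f} resolution elements")
```

Other bandpasses can be evaluated the same way by passing their own edges and whatever resolving power applies to that channel.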

CONCLUSION
Future exoplanet characterization efforts will rely on our ability to carry out observations efficiently while also maximizing the contextual information derived from them. Coronagraph instrumentation is limited to observing fractions of a full 0.2-1.8 µm spectrum at a given time, but critically, we show that meaningful atmospheric inferences can nonetheless be made through strategically placed spectroscopic bandpasses. The observational decision tree framework presented in this work can provide a useful roadmap for categorizing Earth-like exoplanets in a minimal number of observations. We have demonstrated the utility of the decision tree using retrieval analyses following the pathways for both the modern Earth and Archean Earth. In both atmospheric scenarios, we found that not only can we achieve constraints on the biosignatures and habitability markers prioritized within the decision tree, we can also constrain additional planetary parameters not outlined in the decision tree (e.g., planetary radius, global surface pressure, and atmospheric temperature). We simulated SNR 10 observations with every singular bandpass and bandpass combination for each atmospheric pathway and tabulated the quality of each parameter constraint for each combination. We found that the "All three" bandpass combination (consisting of the 0.75 µm bandpass from 0.68-0.81 µm, the 0.89 µm bandpass from 0.81-0.97 µm, and the 1.65 µm bandpass from 1.50-1.80 µm) for modern Earth and the "0.89 µm & 1.65 µm" bandpass combination for Archean Earth provided the best constraints for each of the retrieval parameters. Overall, we show that the decision tree can be implemented with retrieval analyses and can give meaningful insights for developing observational strategies for characterizing Earth-like exoplanets.
Table 2. List of Parameters and 1-σ Spreads for Archean Earth Retrieval Combinations.
*The CO2 abundance was not well detected in the SNR 10 simulated observations, and we refrain from reporting its 1-σ intervals as they were likely influenced by the priors.

Figure 2. Simulated Reflected Light Spectrum of Modern Earth. Shown here is the reflected light spectrum of modern Earth with coronagraph wavelength channels indicated along the top for the UV (0.2-0.4 µm), the Vis (0.4-1.0 µm), and the NIR (1.0-1.8 µm). Individual 20% bandpasses are indicated according to the decision tree strategy: the 0.75 µm bandpass (0.68-0.81 µm; dark green shaded region), the 0.89 µm bandpass (0.81-0.97 µm; green shaded region), and the 1.65 µm bandpass (1.50-1.80 µm; red shaded region). The Rayleigh scattering slope and absorption features of key species are labeled in text along with an error bar scaling for an SNR of 10 relative to 0.68 µm. Note that atmospheric retrievals were performed on individual bandpasses or combinations of these three bandpasses.

Figure 3. Posterior Distributions for Key Decision Tree Biosignature Species (modern Earth). a, Posterior distributions for the H2O abundance derived from simulated SNR 10 observations at various combinations of wavelength coverage. The "All three" combination encompasses the 0.89 µm bandpass, the 0.75 µm bandpass, and the 1.65 µm bandpass (grey filled). The singular observational bandpasses include the 0.75 µm bandpass (red), the 0.89 µm bandpass (blue hatched), and the 1.65 µm bandpass (yellow). The paired observations include the 0.75 µm & 0.89 µm bandpass pair (purple), the 0.75 µm & 1.65 µm bandpass pair (orange), and the 0.89 µm & 1.65 µm bandpass pair (green hatched). For CH4, O2, and CO2, black vertical dashed lines indicate the surface values given by the input atmospheric model. The input value for H2O (also indicated by a vertical dashed line) is taken to be the column average mixing ratio. b, The same as a but for the CH4 abundance. The grey vertical dashed line represents a modeled estimate for an Archean Earth-like abundance of CH4 (3.5 × 10⁻²) from Robinson & Reinhard (2020) for comparison to the modern value. c, The same as a and b but for the O2 abundance. d, The same as a, b, and c but for the CO2 abundance.

Figure 4. Posterior Distributions for Planetary Surface Albedo, Planetary Radius, and Planetary Mass. a, Posterior distributions for the log of the planetary surface albedo (As) derived from simulated SNR 10 observations and all of the decision tree combinations of wavelength coverage. The input values for each parameter are indicated by a black vertical dashed line. b, Same as a, but for the planetary radius (Rp). c, Same as a and b, but for the planetary mass (Mp).

Figure 5. Posterior Distributions for Global Surface Pressure and Global Temperature. a, Posterior distribution for global surface pressure of the planet derived from simulated SNR 10 observations and all of the decision tree combinations of wavelength coverage. For each parameter, black vertical dashed lines indicate the surface values given by the input atmospheric model. b, Same as a, but for temperature. Note that the retrieved spread would be representative of temperatures at the surface and throughout the troposphere.

Figure 6. Biosignature Abundance Constraints Broken Down by Observational Bandpass Combinations. a, Chemical species abundance constraints for the 0.75 µm bandpass. The abundance constraints for O2 (purple hatched), H2O (purple filled), CO2 (red), O3 (grey), and CH4 (yellow filled) are all shown. b, Abundance constraints for the same species but for the singular 0.89 µm bandpass observation. c, Constraints for the same species but for the singular 1.65 µm bandpass observation. d, Constraints for the same species but for the 0.75 µm & 0.89 µm bandpass pair. e, Constraints for the same species but for the 0.75 µm & 1.65 µm bandpass pair. f, Constraints for the same species but for the 0.89 µm & 1.65 µm bandpass pair. g, Constraints for the same species but for "All three" bandpasses combined (i.e., 0.75 µm & 0.89 µm & 1.65 µm bandpasses).

Figure 7. Modeled Haze-free Archean Earth Spectrum. Shown here is the planet-to-star flux ratio versus wavelength, with the simulated spectrum for a haze-free Archean Earth-like planet around the Sun in black. The spectral bandpasses from the Archean Earth pathway on the decision tree are shown, with the 0.89 µm bandpass shaded in green and the 1.65 µm bandpass shaded in red. The Rayleigh scattering slope and species absorption features are labeled in text. Also shown is an SNR 10 error bar scaling indicative of the simulated observations for each singular bandpass and combination.

Figure 8. Posterior Distributions for Key Decision Tree Species (Archean Earth Pathway). a, Marginal posterior distributions for the H2O abundance derived from the simulated SNR 10 observations covering two singular bandpasses, the 0.89 µm bandpass and the 1.65 µm bandpass, and a combination of the two. The posterior for the 0.89 µm bandpass is the blue hatched distribution, the 1.65 µm posterior is the yellow distribution, and the combination of the two is the green filled distribution. For CH4 and CO2, black vertical dashed lines indicate the surface values given by the input atmospheric model. The input value for H2O (also indicated by a vertical dashed line) is taken to be the column average mixing ratio. b, The same as a but for the CH4 abundance. c, The same as a and b but for the CO2 abundance. An additional retrieved posterior is included for a simulated SNR 20 observation with the 1.65 µm singular bandpass (purple filled).

Figure 9. Posterior Distributions for Planetary Surface Albedo, Planetary Radius, and Planetary Mass (Archean Earth Pathway). a, Posterior distributions for the log of the planetary surface albedo (As) derived from simulated SNR 10 observations and all of the decision tree combinations of wavelength coverage for the Archean Earth path. The input values for each parameter are indicated by a black vertical dashed line. b, Same as a, but for the planetary radius (Rp). c, Same as a and b, but for the planetary mass (Mp).

Figure 10. Posterior Distributions for Global Surface Pressure and Global Temperature. a, Posterior distribution for global surface pressure of the planet derived from simulated SNR 10 observations and all of the decision tree combinations of wavelength coverage for the Archean Earth path. For each parameter, black vertical dashed lines indicate the surface values given by the input atmospheric model. b, Same as a, but for temperature. Note that the retrieved spread would be representative of temperatures at the surface and throughout the troposphere.

Figure 11. Biosignature Abundance Constraints Broken Down by Observational Bandpass Combinations (Archean Earth Pathway). a, Chemical species abundance constraints for the 0.89 µm bandpass. The abundance constraints for CH4 (yellow filled), H2O (purple filled), O2 (purple hatched), CO2 (red), and O3 (grey) are all shown. b, The same as a, but for the singular 1.65 µm bandpass. c, The same as a and b, but for the 0.89 µm & 1.65 µm bandpass combination. Note that oxygen-bearing species go unconstrained for all observational scenarios given the anoxic environment of Archean Earth.