Water Absorption in the Transmission Spectrum of the Water World Candidate GJ 9827 d

Recent work on the characterization of small exoplanets has allowed us to accumulate growing evidence that sub-Neptunes with radii greater than ∼2.5 R ⊕ often host H2/He-dominated atmospheres both from measurements of their low bulk densities and from direct detections of their low mean molecular mass atmospheres. However, the smaller sub-Neptunes in the 1.5–2.2 R ⊕ size regime are much less understood and often have bulk densities that can be explained either by the H2/He-rich scenario or by a volatile-dominated composition known as the “water world” scenario. Here we report the detection of water vapor in the transmission spectrum of the 1.96 ± 0.08 R ⊕ sub-Neptune GJ 9827 d obtained with the Hubble Space Telescope (HST). We observe 11 HST Wide Field Camera 3 transits of GJ 9827 d and find an absorption feature at 1.4 μm in its transit spectrum, which is best explained (at 3.39σ) by the presence of water in GJ 9827 d’s atmosphere. We further show that this feature cannot be caused by unocculted starspots during the transits by combining an analysis of the K2 photometry and transit light source effect retrievals. We reveal that the water absorption feature can be similarly well explained by a small amount of water vapor in a cloudy H2/He atmosphere or a water vapor envelope on GJ 9827 d. Given that recent studies have inferred an important mass-loss rate (>0.5 M ⊕ Gyr−1) for GJ 9827 d, making it unlikely to retain a H-dominated envelope, our findings highlight GJ 9827 d as a promising water world candidate that could host a volatile-dominated atmosphere. This water detection also makes GJ 9827 d the smallest exoplanet with an atmospheric molecular detection to date.


INTRODUCTION
While many questions remain regarding the nature of sub-Neptune exoplanets, the last decade of trans-Corresponding author: Pierre-Alexis Roy pierre-alexis.roy@umontreal.camission spectroscopy with the Hubble Space Telescope (HST) has shown that the larger sub-Neptunes are often best described by H 2 /He-dominated atmospheres (e.g., Benneke et al. 2019a,b, Mikal-Evans et al. 2020, Kreidberg et al. 2022).However, this picture is much less clear when considering sub-Neptunes that are in the smaller 1.5-2.2R ⊕ size-regime, near the radius valley (Fulton et al. 2017, Fulton & Petigura 2018, Van Eylen et al. 2018, Hardegree-Ullman et al. 2020).These planets have bulk densities than can be explained by either the H 2 /He-rich sub-Neptune scenario, or by a volatile-dominated composition, where water (or another molecule of similar mean-molecular-weight) supplants H 2 and He as the most abundant atmospheric species (Rogers & Seager 2010, Luque & Pallé 2022, Rogers et al. 2023).This type of exoplanet has been long theorized and is referred to as "water world" (Adams et al. 2008, Acuña et al. 2022).
These smaller sub-Neptunes, which are often inconsistent with extended H-dominated atmospheres with large scale heights (Aguichine et al. 2021, Piaulet et al. 2022), are also found in a smaller mass regime than the larger sub-Neptunes, making these close-in planets much more exposed to mass-loss processes (Owen 2019), and thus more likely to have lost their H 2 and He envelope over their lifetime.A recent study found a first line of evidence for the existence of such volatile-rich water worlds in the super-Earth Kepler-138 d, by combining a thorough interior analysis of the planet with mass-loss estimates, effectively showing that this super-Earth cannot be purely rocky, but that it also cannot retain a hydrogen layer (Piaulet et al. 2022).However, the direct spectroscopic confirmation of a volatile-rich high meanmolecular-weight atmosphere on a water world candidate still eludes us, and such a result would provide a new line of evidence for the water worlds.
The discovery of the transiting sub-Neptune GJ 9827 d (Niraula et al. 2017, Rodriguez et al. 2018) represents a rich opportunity to characterize the atmosphere of a warm sub-Neptune via transmission spectroscopy and to deepen our understanding of this potential water world (Aguichine et al. 2021).Rapidly orbiting (6.2 days) a low-mass K6V star with a size of 1.96 ± 0.08 R ⊕ , a mass of 3.4 ± 0.6 M ⊕ (Kosiarek et al. 2021), and a zero-albedo equilibrium temperature of 680 ± 25 K (Rodriguez et al. 2018), GJ 9827 d allows us to obtain a high signal-tonoise ratio (S/N) in transmission spectroscopy, and to add a precious new target to the sample of sub-Neptunes with transit spectra.While JWST now allows to observe the eclipses and phase curves of small exoplanets deeper in the infrared (e.g., Kempton et al. 2023), transit spectroscopy remains the best method to obtain indepth looks into the atmospheres of sub-Neptunes and potential water worlds with HST, as they rarely are hot enough to provide high S/N in the near-infrared (hot Neptune desert; Owen & Lai 2018).
While the average density of GJ 9827 d has now been constrained (> 3σ) by numerous RV studies of the system (Prieto-Arranz et al. 2018, Rice et al. 2019, Kosiarek et al. 2021), there still remains ambiguity regarding its bulk composition, as its density could be explained by a range of compositions from an extended H 2 /He layer to a water world composition with a ∼20 % water mass fraction (Aguichine et al. 2021).However, the high irradiation of the planet, the old age of the system (Kosiarek et al. 2021) and the non-detections of HeI and Hα absorption from the ground (Kasper et al. 2020, Carleo et al. 2021, Krishnamurthy et al. 2023) make it unlikely that GJ 9827 d would have retained a primordial H-dominated envelope to date.
In this work, we present the most precise look yet at GJ 9827 d via transmission spectroscopy with the Hubble Space Telescope Wide Field Camera 3 (HST/WFC3), and reveal a water absorption feature in its transmission spectrum.In Section 2, we present the observations obtained for this study and we describe the data analysis in Section 3. Section 4 presents our atmospheric analysis of the HST transit spectrum and the related results are presented in Section 5. We end by discussing our findings and presenting our conclusions in Section 6.

OBSERVATIONS AND DATA REDUCTION
GJ 9827 d was observed transiting its host star 11 times between December 2017 and December 2020 with the Wide Field Camera 3 (WFC3) onboard the Hubble Space Telescope (HST) as part of the mini-Neptune atmosphere diversity survey (GO 15333: PI Crossfield).The G141 grism was used in order to obtain the transmission spectrum of the sub-Neptune over the 1.1-1.7 µm range.
Each of the 11 HST/WFC3 transit observations consisted of ∼3 hours of observing time spanning three telescope orbits with ∼1 hour gaps between them.Each transit observation is thus composed of one orbit before, during, and after transit.The transit time series were obtained with the G141 grism using the spatial scan mode.In order to optimize the duty cycle of our observation, we used both the forward and backward detector scans.We discard one of the transits from our analysis (November 1st 2019) since a pointing maneuver cut orbit 1 short before the ramp was stabilized, effectively carrying an unusually strong ramp to orbit 2 and polluting the in-transit observations.
We reduce the observations following standard procedures for HST/WFC3 observations (details in Benneke et al. 2019a,b).In order to minimize the background contribution, we subtract consecutive reads upthe-ramp and then add together the background subtracted frames.We then construct flat-fielded images from the flat-field data product provided by STScI.We use a normalized row-added flux template in order to remove and replace outlier pixels in our frames.We follow Benneke et al. (2019a) in order to correct for the slight slanted shape of the trace on the detector, which is introduced by the spatial scan mode, using a trapezoidal shape integration scheme for the wavelength bins, which we choose to be 30 nm wide.Our flux integration does not perform presmoothing and does account for partial pixels along the trapezoidal bin boundaries.Finally, in order to account for the small drift of the star across the detector during the observations, we account for a small position shift which is measured in each frame.

DATA ANALYSIS
We perform the light-curve fitting of our 10 transits of GJ 9827 d individually using the ExoTEP framework (Benneke et al. 2017(Benneke et al. , 2019a,b),b).We use ExoTEP to jointly fit the transit model with a systematics model and a photometric noise parameter in a Markov Chain Monte Carlo scheme (Foreman-Mackey et al. 2013).We decide to fit the transits individually since they display variability in the transit timings (see Figures 1, 2, and Section 6.1).
Each visit in our data set consists of three HST orbits which do not cover the full transit duration of GJ 9827 d 7800 8000 8200 8400 8600 8800 9000 9200 (Figure 1).The in-transit observations only occur during the second orbit of each visit, either observing the ingress, the middle of transit, or the egress (Figure 1).
For that reason, we cannot obtain reliable constraints on the orbital parameters out of the partial transit observations and decide to use a fixed orbital solution during the fits (b=0.91,a/R ⋆ = 19.88).Since the visits do not include a burn-in orbit, we cannot follow the standard procedure to discard the first orbit, which displays a stronger ramp in time as the detector is still settling (e.g., Kreidberg et al. 2014).We rather choose to keep the 2-4 last points of orbit 1 in each visit (Figure 1), as the strong ramp has stabilized by then, and it provides essential baseline information, especially for visits that only have mid-transit or egress data in orbit 2 (Figure 1).For all orbits, we discard the first forward and backward scans.
Because GJ 9827 is a close-in, near-resonant system, some of the visits in our data set also include transits of GJ 9827 b.Given the partial coverage of our visits, we simply remove the points where GJ 9827 b is expected to transit, which affects visits 5 and 10.Visits 6 and 7 also include a transit of planet b, but it happens in the first orbit which is already mostly discarded.Finally, we remove the last 5 points of visit 3 since they are clear outliers.

White-light-curve fit
We fit for systematics trends in the normalized transit light curves simultaneously with the transit model using an analytical model that allows for a linear slope throughout the visit duration and an exponential ramp in each orbit.Following previous work (e.g., Kreidberg et al. 2014, Benneke et al. 2019b) we use the following parameterization to account for these systematics: Here, c is the normalization constant, v is the linear slope throughout the visit, a and b are the rate and amplitude of the exponential ramp in each orbit, and d is an offset only for first orbit reads.S(t) is a function that is equal to 1 for forward scans and to s for backward scans, allowing to correct for the offset between forward and backward scans.Finally t v is the time since the start of the visit, while t orb is the time since the start of the orbit.
We produce our transit light-curve models using the Batman package (Kreidberg 2015).Since we are not trying to fit the orbital solution, the only two astrophysical parameters that we are fitting in the transit light curve are the transit depth, and the mid-transit time (Figure 2).For the limb darkening, we use the Exotic-LD package (Grant & Wakeford 2022) to compute the coefficients using 1D stellar models (Kurucz 1993) and the quadratic limb darkening law.The impact parameter and semi-major axis are set to the values in Niraula et al. (2017).We obtain the posterior distribution on our parameters by running a MCMC analysis individually for each of the 10 visits.We use four walkers per parameter and all priors are uniform.The 10 white-light-curve fits are shown in Figure 1.

Spectroscopic light-curve fit
We use the white-light-curve fits results in terms of the systematics model to pre-correct the light curves in each spectral bin (divide-white method; Stevenson et al. 2014).Thus, we divide the spectroscopic light curves by the white-light-curve best-fitting systematics model before starting the fitting.We produce our spectroscopic transit light-curve models similarly as in the white-light-curve case, but we now keep the mid-transit time fixed to the best-fitting value found by the whitelight-curve fit.The limb darkening is again modelled with Exotic-LD, and this time, our systematics model is a three-parameter linear slope with trace position (measured during the data reduction, see Section 2).We thus obtain 10 transmission spectra, one from each visit, by running an MCMC analysis on each.We again use four .Individual relative transit depths compared to the relative final transmission spectrum.For each of the 10 transmission spectra, we obtain the relative transit spectrum by subtracting the average across wavelengths.We then subtract the relative final transmission spectrum from the individual relative spectra, and display these transit depths for all visits in each channel.The relative transit depths are consistent across the 10 visits with the final relative spectrum.Points that are significantly away from zero (dotted lines) have larger error bars which makes them less important in the weighted average and thus do not bias our transmission spectrum.The uncertainty on the combined spectrum is also displayed as the grey region.In the top left panel, the broadband transit depths are shown, centered on the average across the 10 visits, highlighting the variability in the observed broadband transit depths from the different visits.
walkers per parameter and uniform priors on all parameters.

Combining all the visits together
We compute a weighted average of our 10 individual transmission spectra to obtain our final transmission spectrum of GJ 9827 d.In order to verify the robustness of our spectrum, and to ensure that it is not affected by that variability in the observations, we compare the relative transit depth in each channel, for each visit (Figure 3).From each spectrum, we subtract the weighted average (across wavelengths) of said spectrum, essentially making it a relative transit spectrum centered around zero.We then subtract the relative combined spectrum of all visits (also centered around zero) from each individual spectrum to effectively center each spectroscopic transit depth around zero.We then inspect this relative transit depth for each spectroscopic bin and for each visit, in order to ensure that the points in our final transmission spectrum are not affected by outliers  The black contours highlight the 1, 2 and 3σ Bayesian credible regions.The water abundance relative to a solar composition atmosphere is shown on the top axis.The posterior probability distribution allows for multiple atmospheric scenarios ranging from H2/He envelopes with small amounts of water to water-dominated envelopes.The blue points identify two representative samples of these two scenarios which are displayed in Figure 6.(Figure 3).We find that for each spectroscopic channel, all visits mostly agree with the weighted average within error bars, and points that are inconsistent have larger error bars, which makes them much less important in the weighted average, since the weight of each points is inversely proportional to the uncertainty squared (Figure 3).The final average transmission spectrum is presented in Table 1 and in Figure 4. We decide to discard the last spectroscopic channel (1.67-1.70 µm) since it is systematically lower than the rest of the spectrum, and is near the edge of the trace on the detector where the data is less reliable.

ATMOSPHERIC MODELING
We perform atmosphere retrievals on our GJ 9827 d transmission spectrum using the SCARLET framework (Benneke & Seager 2012, 2013, Knutson et al. 2014, Kreidberg et al. 2014, Benneke 2015, Benneke et al. 2019a,b, Pelletier et al. 2021, Roy et al. 2022).SCAR-LET parameterizes the molecular abundances, the cloud deck pressure and the temperature to fit our spectrum.SCARLET uses a Bayesian nested sampling analysis (with single ellipsoïd sampling ;Skilling 2004Skilling , 2006) ) to obtain the posterior probability distribution of our parameter space and the Bayesian evidence of our models.For each set of parameters, SCARLET produces a forward atmosphere model in hydrostatic equilibrium (Benneke 2015), computes the opacity associated to each molecule throughout the 40 vertical (pressure) layers of the model, computes the transmission spectrum for that model and finally performs the likelihood evaluation.The model transmission spectra produced at each step have a resolution of 16000 and are then con-volved to the wavelength bins of the observed spectrum.The molecules considered in our retrieval are H 2 O, CH 4 , CO 2 , CO and N 2 , as well as H 2 and He which are not parameterized and rather fill up the atmosphere (Benneke & Seager 2013).
We assume well-mixed vertical chemical profiles, where the abundances of molecules do not vary throughout the atmosphere.We choose a log-uniform prior on Bayesian model comparison results from our SCARLET atmosphere retrievals in the free chemistry setting the abundance of each molecule ranging from 10 −10 to 1 in volume mixing ratio.We assume a constant temperature structure throughout the atmosphere and use a Gaussian prior centered on the planet's equilibrium temperature (680 ± 100 K; Rodriguez et al. 2018) on that parameter.The parameterization also includes a cloud deck top pressure, which blocks all light rays going through that pressure level.We again use a log-uniform prior on that parameter from 10 −4 mbar to 10 5 mbar.Thus, our atmosphere retrieval includes seven free parameters in total (or less when molecules are removed, see Table 2).

RESULTS
The observed transit spectrum of GJ 9827 d displays a water absorption feature at 1.4 µm (Figure 4).Qualitatively, the transit depths in the spectrum are deeper in the 1.4 µm water band followed by a downward slope that follows the wing of the water absorption feature.Quantitatively, a Bayesian model comparison analysis of our well-mixed retrievals (Benneke & Seager 2013) prefers models that include the presence and absorption of water with a significance of 3.39σ (Bayes Factor = 72.52;Table 2) compared to models that do not include water.

Metallicity-clouds degeneracy
Our free chemistry retrievals show that the data can both be explained by a water-rich atmosphere, where water is the most abundant molecule, as well as with a H 2 /He-dominated atmosphere that still contains a small amount of water (Figure 4).At 1σ, models with a water mixing ratio between 0.02% and 80% are preferred by the spectrum.When compared to the amount of water in a solar metallicity envelope, we see that this interval in abundance ranges from 1 × solar metallicity models, which are dominated by H 2 and He gas, to 1000 × solar metallicity models, where water is now the principal species (Figure 4).
In the cases where water is present in small amounts, a cloud deck is needed to explain the observed transit spectrum (Figures 4, 5).This is due to the fact that the observed spectrum does not display the large amplitude expected from cloud-free models (Figure 6).This need for clouds in low mean-molecular-weight atmospheres is also seen in the marginalized probability distributions of the other molecules (Figure 5), since their spectral features are inconsistent with the observed spectrum, and thus must be muted by clouds in order to yield highlikelihood models.In cases where the water is more abundant and becomes the principal molecule, the spectra naturally display muted features because of the high density of the atmospheres and lower atmospheric scale height, thus removing the need for high clouds.In this water-rich scenario, the constraint on the cloud-top pressure disappears, explaining the observed posterior distribution (all deep cloud decks become equally consistent; Figure 4).

Upper limits on other molecules
While methane is known to display a similar 1.4 µm absorption feature as water (Bézard et al. 2022), it remains disfavored in our free chemistry retrieval analysis.The spectral signatures of methane not only include a feature around 1.4 µm, but also at 1.2 µm and at 1.7 µm, as shown by SCARLET forward atmosphere models for a pure methane envelope and for a solar composition atmosphere (Figure 6).However, the observed transmission spectrum of GJ 9827 d does not display these absorption features at 1.2 and 1.7 µm (Figure 6) and is in better agreement with the water models that display a smaller feature at 1.2 µm, no absorption at 1.7 µm, and a broader feature at 1.4 µm.In order to obtain chemically consistent models that agree with the observed transit spectrum, the carbon-to-oxygen (C/O) ratio must be decreased in order to favor O-bearing molecules (here, water) and a cloud deck must be included to mute the spectral amplitude of the water absorption features.Such models (e.g.C/O = 0.1, Metallicity = 100 × solar and pCloud = 1 mbar) yield qualitatively and quantitatively similar transmission spectra to those favored by our free chemistry retrieval (Figures 4,6).
No other molecule besides water is statistically detected by our retrievals (Table 2, Figure 5).However, we can derive upper limits in their abundances based on our results from the free chemistry retrievals, either from the non-detection of specific spectral features (e.g., .HST/WFC3 transmission spectrum of GJ 9827 d (data points) along with SCARLET forward atmosphere models (colored lines).Top: Two samples from our well-mixed retrieval analysis (Figure 4) are shown, representing the mini-Neptune scenario with a cloudy H2/He atmosphere composed of ∼1% water (pale blue) and a water world scenario with a water-rich atmosphere (dark blue).Middle: A secondary atmosphere model for a pure methane envelope is also shown (red) in order to highlight the methane absorption features.Chemically consistent models (still assuming a uniform temperature profile) are shown for a cloud-free, solar composition case (solar metallicity, solar C/O; orange) and for a cloudy case with C/O=0.1 and a 100 × solar metallicity (green).The observed spectrum is inconsistent with cloud-free low-metallicity scenarios and prefers water absorption features to methane absorption features, mainly around 1.2 and 1.65 µm.The strength of the features in the spectrum is also displayed in units of H/He scale heights (right axis).Bottom: The best-fit model from the retrieval analysis is shown (pale blue), along with the transmission spectrum of the same model once the water opacity is turned off (grey).
The contribution of water opacity to the spectral signatures is highlighted in blue.We also present a binned version of the transmission spectrum where points are binned together by pair with the exception of the blue-most channel.
CH 4 ) or because too much of anyone species increases the mean-molecular-weight of the atmosphere, eventually yielding a flat spectrum (e.g., N 2 ).Thus, our retrievals allow us to constrain the upper limits on the mixing ratios of CH 4 , CO 2 , CO and N 2 to 3.04, 19.4,52.5 and 60.4 % at 3σ significance.

Significance of a featureless spectrum
In order to evaluate how our spectrum deviates from a featureless spectrum, we compute the deviation from the best-fitting straight line using χ 2 statistics.Using the binned version (bottom of Figure 6), we obtain that the transmission spectrum of GJ 9827 d deviates from a straight line at 3.24σ.In order to assess how our water detection compares to a featureless flat spectrum within the Bayesian paradigm, we compute the Bayesian evidence of a one-parameter flat line model Z flat .Given the simplicity of this one-parameter model, we do not need to use SCARLET nested sampling to obtain that value, and rather numerically estimate it via the following analytical solution: where N is the number of points in the spectrum, σ i is the uncertainty of the i th point of the spectrum denoted D i , and where θ 1 and θ 2 are the limits of our uniform prior on the transit depth parameter θ.This allows us to show that our atmosphere model is preferred to the flat spectrum model at 4.01σ (Bayes Factor = 620.76;Table 2).

Ruling out stellar contamination
The Transit Light Source Effect (TLS, Rackham et al. 2018) can mimic water features at the 20 ppm level or more for modestly spotted stars under certain observational configurations, and could thus create the feature observed in our transmission spectrum.The best constraint available for the starspot coverage and contrast for GJ 9827 comes from the K2 Campaign 12 (C12) lightcurve, with an SFF-derived (Vanderburg & Johnson 2014) peak-to-valley amplitude of 0.45%, slightly lower than the typical 0.7% for K6 spectral types (Rackham et al. 2019).Coarse scaling laws can relate the observed K2 amplitude to the surface coverage of starspots, under assumptions of size, number, and location of spots on a stellar surface (Rackham et al. 2018, Guo et al. 2018).For a K6 spectral type, 0.45% amplitude variations relate (conservatively) to spot-covering fractions of 1-4% (Rackham et al. 2019).Thus, we adopt a spot contrast typical of a K6 star and a filling factor of 1-4% (Rackham et al. 2019).Under these assumptions, a planet with a 1% transit depth could expect an H 2 O contamination of <15 ppm from unocculted spots.GJ 9827 d's much smaller transit depth of < 0.1% would therefore yield a negligible TLS water contamination of < 1.5 ppm.
We perform a retrieval on our combined transit spectrum in which we fit for TLS parameters rather than for atmospheric parameters.We use the 1-4% spots filling factor as a prior in this TLS retrieval, and the other parameters are the photospheric temperature, the spots' temperature difference and a scaling factor.Our TLS models simulate unocculted star spots by averaging the stellar spectrum of the star's photoshere with the spectrum of the cooler spots (weighted by the spots coverage) based on Phoenix stellar models (Husser et al. 2013).It then simulates the transit of an airless planet to obtain the stellar-contaminated transit spectrum.When using the photometry-derived prior, we find that the TLS fitting is restricted to flat models which indicate that stellar contamination is not expected for that system (Figure 4).
Repeating the same TLS retrieval with nonrestrictive uniform priors on all parameters further demonstrates that the signal cannot be caused by the star.We run the same TLS retrieval as described above, but without using the 1-4% spots filling factor prior, to see under what stellar conditions the signal can be explained by the star.We find that, in order for the TLS to reproduce the signal in our spectrum, not only does the spots parameters need to be unrealistically large (73% spots coverage and < −794 K spot temperature difference), but the model needs to adopt a strong positive ramp towards short wavelengths (as in Moran et al. 2023), which becomes inconsistent with the K2 transit depth measurement.We thus conclude that stellar contamination cannot explain the feature in the transmission spectrum of GJ 9827 d, and that the water detection comes from the planet's atmosphere.

DISCUSSION AND CONCLUSIONS
The HST/WFC3 transmission spectrum of GJ 9827 d presented in this work provides a precious target in the population of sub-Neptune exoplanets for which we have precise transmission spectra; and highlights GJ 9827 d as the smallest exoplanet with an unambiguous atmospheric molecular detection to date.Compared to the other characterized sub-Neptunes, our detection of a ∼1 H/He scale height water feature (Figure 6) makes it stronger than for similarly hot sub-Neptunes, although it remains broadly consistent with the previously observed trend where hotter sub-Neptunes display stronger H 2 O amplitudes (Crossfield & Kreidberg 2017).Our analysis of the 10 observations of GJ 9827 d's transits also revealed some variability in this planet's orbit, which is to be considered for further monitoring of the system with state-of-the-art facilities such as JWST.Finally, our detection of a water feature in GJ 9827 d's transit spectrum provides the first detection of water in the atmosphere of a potential water world, which, when combined with GJ 9827 d's large mass-loss rate, provides a first line of evidence for this sub-Neptune hosting a water-steam dominated atmosphere.

Variability in the transits of GJ 9827 d
The analysis of the 10 transits of GJ 9827 d with HST revealed a significant variability in the transit timings observed from one visit to the other (Figure 2).While this variation is not surprising for a near-resonant system and did not impact the features observed in the transit spectrum (as shown by our analysis of the relative spectra; Figure 3), it still is in contrast with the previously observed TTVs for this system which were of the order of ∼3 minutes (Niraula et al. 2017).However, the 5-10 minutes TTVs observed for planet d in this work are consistent with an independent study of the TTVs of the GJ 9827 system combining all photometric and radial velocity data (Livingston et al., in prep.).
The limited number of in-transit data points in the time series in this program could also explain the range of transit depths observed in our results.As described earlier, each HST orbit displays an exponential ramp in time that is fitted by our systematics model.This ramp has a much stronger effect in the first few integrations than in the last few integrations of each orbit, when it has settled.Thus, HST/WFC3 observations inherently provide better quality observations towards the end of each orbit.When considering the individual visits in our program, it seems that egress visits yield deeper transit depths than mid-transit or ingress visits (Figures 1,  3).This could then be explained by the fact that the different visits in our data set have varying in-transit data quality depending on whether the late-orbit integrations are in the transit (ingress and mid-transit visits) or are in the baseline (egress visits).For instance, visit 6, which is an egress observation, displays a deeper transit depth than in the other visits.However, the relative shape of the transmission spectrum, which is the quantity of interest for this work, is consistent with the other visits (Figure 3).
Another potential source of the TTV and transit depth variability observed in our program is star-spot crossing.GJ 9827 has been shown to display quasiperiodic flux (∼0.45%) variations with a period of ∼30 days (Rodriguez et al. 2018, Teske et al. 2018, Prieto-Arranz et al. 2018, Rice et al. 2019, Kosiarek et al. 2021).If these stellar variations were caused by stellar spots, then using a fixed orbital solution and a transit model that does not include the effect of spots in our light-curve fits could lead to biases in our retrieved parameters.
Because of the variability discussed above, we decided to fix the orbital solution and limb darkening coefficients for the light-curve fits in our program (Section 3).In order to ensure that the limb darkening coefficients and stellar parameters chosen do not affect our atmospheric inference, we repeat our light-curve fits for multiple assumptions on the limb darkening.Using quadratic limb darkening coefficients, we reproduce the same fitting but using a 3D stellar model (Magic et al. 2015) when computing the coefficients.We further try the light-curve fits by varying the effective stellar temperature to the +1σ and −1σ values of that parameter (Kosiarek et al. 2021).In all cases, we find that the limb darkening and choice of stellar parameters only affect the retrieved spectrum with a constant offset throughout the wavelength range, and that the relative spectra all show the water absorption feature and are all consistent within 1σ.This thus shows that our choice for the stellar and limb darkening parameters does not affect the shape of the transmission spectrum, and subsequently, our atmospheric analysis.
Similarly, given the difficult observational setting, we test the robustness of the spectrum to the systematics models used by trying two alternative light-curve fitting methods.First, we start from the divide-white corrected spectroscopic light curves and jointly fit a relative transmission spectrum.To do so, we jointly fit (across visits) the relative transit depth in each channel, where the broadband average of the individually fitted spectra is subtracted for each visit (since there sometimes are discrepancies between the white-light-curve transit depths, and average spectral depths in HST/WFC3 data).Secondly, we repeat the method presented in Section 3, but using the RECTE systematics model and orbit 1 (and not using the divide-white method; Zhou et al. 2017) for the seven visits which are not affected by transits of planet b in orbits 1 or 2. This gives us 7 transmission spectra that we combine with a weighted average.Both of these methods produce relative transmission spectra that are consistent within uncertainties with the one presented in Table 1.The spectrum obtained from the RECTE models has increased uncertainties, both from the smaller number of visits and from increased fitted scatter in some visits, but is still in agreement with the other two spectra.The spectrum presented in this work is thus robust to the choice of systematics model.

Water in the envelope of a potential water world
The water detection in the transit spectrum of GJ 9827 d makes it the first water world candidate with an atmospheric water detection consistent with a water-rich envelope.It thus positions itself in the sample of potential water worlds, with other small sub-Neptunes and super-Earths such as Kepler-138 d (Piaulet et al. 2022), L 98-59 d (Kostov et al. 2019), TOI-1685 b (Bluhm et al. 2021), GJ 3090 d (Almenara et al. 2022), TOI-270 d (Günther et al. 2019, Mikal-Evans et al. 2023).In contrast, a water feature was also detected in the transit spectrum of TOI-270 d (Mikal-Evans et al. 2023), but the analysis revealed that the H-rich atmosphere scenario was favored for this sub-Neptune, showing that the line can be fine between a mini-Neptune and a water world.
With its small mass of 3.42 M ⊕ (Kosiarek et al. 2021) and its proximity to its host star (6.2 d orbit), the estimated mass-loss rate of GJ 9827 d is >0.5 M ⊕ /Gyr (Krishnamurthy et al. 2023).With an estimated age around 6 Gyr (Kosiarek et al. 2021), GJ 9827 d is unlikely to retain an extended H 2 /He envelope today.Furthermore, monitoring of GJ 9827 d's spectrum in the search of Hα and HeI signatures with Keck/NIRSPEC (Kasper et al. 2020) CARMENES (Carleo et al. 2021) and IRD (Krishnamurthy et al. 2023) resulted in no evidence of an extended H 2 /He atmosphere around the planet.Hence, the H-rich scenario with a smaller water abundance would provide a somewhat contradictory statement to the previous studies that observed GJ 9827 d from the ground.However, the water-rich scenario can both explain the observed HST transit spectrum, as well as the non-detection of Hα/HeI lines from ground-based studies.The water-dominated envelope is thus the compositional scenario that explains all of the data at hand on this system in the most natural way.
In this water-rich scenario, GJ 9827 d would thus represent a larger, hotter, close-in version of the icy moons of the giant planets in the solar system.Indeed, water is believed to be the dominant volatile of the icy moons of the solar system (Schubert et al. 2004).GJ 9827 d could then have formed outside of the water ice line, where water ice is available in large amounts as a planetary building block (Mousis et al. 2019, Venturini et al. 2020).It could then have migrated towards its current stable near-resonant orbit, during which the increasingly important stellar irradiation would have driven an important H 2 /He loss, and it would be observed today with a high mean-molecular-weight water vapor atmosphere due to its warm temperature (Adams et al. 2008, Pierrehumbert 2023) and its H 2 /He depletion.
While our transmission spectrum cannot unambiguously distinguish between the H-rich and H-depleted scenarios, we have provided the first water detection in the envelope of a water world candidate, making it a key target for further monitoring with JWST.Transmission spectroscopy of GJ 9827 d with NIRISS/SOSS and NIRSpec/G395H would provide the high-precision continuous viewing of the full transit of the planet that is needed to explain the variability observed with HST, as well as provide a more precise transmission spectrum that could not only probe the water absorption bands, but also probe for the presence of carbon bearing species like CO and CO 2 above 4 µm.A JWST transmission spectrum of GJ 9827 d would thus lift the degeneracy observed in our study (Figure 4) and potentially confirm the water world nature of this sub-Neptune, simultaneously yielding the first direct detection of a water vapor dominated envelope.

Figure 1 .
Figure 1.All 10 HST/WFC3 broadband light-curve fits of the transits of GJ 9827 d.Left: Systematics-corrected and normalized broadband light curves for the 10 transits of GJ 9827 d (data points).Each visit is centered around the fitted transit time for that visit.The best-fitting transit model is also shown as the grey line.Right: Residuals of the broadband light-curve fits shown on the left.
Figure3.Individual relative transit depths compared to the relative final transmission spectrum.For each of the 10 transmission spectra, we obtain the relative transit spectrum by subtracting the average across wavelengths.We then subtract the relative final transmission spectrum from the individual relative spectra, and display these transit depths for all visits in each channel.The relative transit depths are consistent across the 10 visits with the final relative spectrum.Points that are significantly away from zero (dotted lines) have larger error bars which makes them less important in the weighted average and thus do not bias our transmission spectrum.The uncertainty on the combined spectrum is also displayed as the grey region.In the top left panel, the broadband transit depths are shown, centered on the average across the 10 visits, highlighting the variability in the observed broadband transit depths from the different visits.

Figure 4 .
Figure 4. Water detection in the transmission spectrum of GJ 9827 d.Left: Transmission spectrum of GJ 9827 d (black points) shown with our model transmission spectra constraints from the nested sampling atmosphere retrieval (blue) and from the photometry-informed "transit light-source effect" retrieval (orange).The dark blue and light blue shaded regions show the 1σ and 2σ Bayesian credible intervals from the atmosphere retrieval respectively.The atmospheric median transmission model is shown in blue and the best-fitting model is shown in red.The best-fitting TLS model is shown in orange along with the 1σ and 2σ Bayesian credible intervals in light orange.Right: Joint constraints on the cloud-top pressure versus the water mixing ratio derived from our Scarlet well-mixed retrieval.The colored shading describes the normalized probability density as a function of the water mixing ratio (assuming uniform vertical profiles) of the atmosphere, and of the cloud-top pressure.The black contours highlight the 1, 2 and 3σ Bayesian credible regions.The water abundance relative to a solar composition atmosphere is shown on the top axis.The posterior probability distribution allows for multiple atmospheric scenarios ranging from H2/He envelopes with small amounts of water to water-dominated envelopes.The blue points identify two representative samples of these two scenarios which are displayed in Figure6.

Figure 5 .
Figure5.Posterior probability distributions of the free parameters used in the SCARLET free chemistry retrieval.The diagonal panels show the marginalized probability distributions of all individual parameters, whereas the off-diagonal panels show the marginalized probability distributions for each pair of parameters as colored shading.The 1, 2, and 3σ contours are shown in the 2D posteriors.Water is the only molecule detected in our retrieval analysis.
Figure6.HST/WFC3 transmission spectrum of GJ 9827 d (data points) along with SCARLET forward atmosphere models (colored lines).Top: Two samples from our well-mixed retrieval analysis (Figure4) are shown, representing the mini-Neptune scenario with a cloudy H2/He atmosphere composed of ∼1% water (pale blue) and a water world scenario with a water-rich atmosphere (dark blue).Middle: A secondary atmosphere model for a pure methane envelope is also shown (red) in order to highlight the methane absorption features.Chemically consistent models (still assuming a uniform temperature profile) are shown for a cloud-free, solar composition case (solar metallicity, solar C/O; orange) and for a cloudy case with C/O=0.1 and a 100 × solar metallicity (green).The observed spectrum is inconsistent with cloud-free low-metallicity scenarios and prefers water absorption features to methane absorption features, mainly around 1.2 and 1.65 µm.The strength of the features in the spectrum is also displayed in units of H/He scale heights (right axis).Bottom: The best-fit model from the retrieval analysis is shown (pale blue), along with the transmission spectrum of the same model once the water opacity is turned off (grey).The contribution of water opacity to the spectral signatures is highlighted in blue.We also present a binned version of the transmission spectrum where points are binned together by pair with the exception of the blue-most channel.