Water Vapor and Clouds on the Habitable-Zone Sub-Neptune Exoplanet K2-18b

Results from the Kepler mission indicate that the occurrence rate of small planets ($<3$ $R_\oplus$) in the habitable zone of nearby low-mass stars may be as high as 80%. Despite this abundance, probing the conditions and atmospheric properties on any habitable-zone planet is extremely difficult and has remained elusive to date. Here, we report the detection of water vapor and the likely presence of liquid and icy water clouds in the atmosphere of the $2.6$ $R_\oplus$ habitable-zone planet K2-18b. The simultaneous detection of water vapor and clouds in the mid-atmosphere of K2-18b is particularly intriguing because K2-18b receives virtually the same amount of total insolation from its host star ($1368_{-107}^{+114}$ W m$^{-2}$) as the Earth receives from the Sun (1361 W m$^{-2}$), resulting in the right conditions for water vapor to condense and explain the detected clouds. In this study, we observed nine transits of K2-18b using HST/WFC3 in order to achieve the necessary sensitivity to detect the water vapor, and we supplement this data set with Spitzer and K2 observations to obtain a broader wavelength coverage. While the thick hydrogen-dominated envelope we detect on K2-18b means that the planet is not a true Earth analog, our observations demonstrate that low-mass habitable-zone planets with the right conditions for liquid water are accessible with state-of-the-art telescopes.


INTRODUCTION
The recent discovery of the transiting 8.63 ± 1.35 M ⊕ exoplanet K2-18b in the habitable zone of a bright, nearby M3-dwarf provides us with an opportunity to carry out a spectroscopic study of the atmosphere of a Corresponding author: Björn Benneke bbenneke@astro.umontreal.ca habitable-zone planet outside our solar system (Montet et al. 2015, Benneke et al. 2017, Cloutier et al. 2017, Sarkis et al. 2018, Cloutier et al. 2019). K2-18b is an intriguing planet because its equilibrium temperature (255 ± 4 K at an albedo of A = 0.3) is potentially very close to that of the Earth (257 K). The planet's predicted temperature provides the right conditions for water vapor to condense to the liquid phase in its atmosphere. While K2-18b has a much shorter orbital period (33 days) than the Earth, its host star is also cooler and smaller than the Sun (3457 K, 0.45 R ) resulting in only 2.53% of the Sun's luminosity. As a result, the total insolation received by K2-18b (1368 +114 −107 W/m 2 ) is very close to that received by the Earth (1361 W/m 2 ), putting K2-18b firmly in the habitable zone.
Compared to habitable-zone planets around Sun-like stars, habitable-zone planets around M dwarfs offer two key advantages for atmosphere studies (Nutzman & Charbonneau 2008). The small diameter of the star results in larger transit signatures as the amplitude of transit and the atmospheric signal scale inversely with the square of the stellar radius. Furthermore, the short orbital periods for habitable-zone planets around mid M dwarfs (30-70 days) enables the observation of repeated transits within a relatively short time frame. This means that for K2-18b, we were able to observe nine transits with Wide Field Camera 3 on the Hubble Space Telescope (HST /WFC3) within a period of 3 yr.
Kepler showed that as much as 80% of M dwarfs host small planets (< 3R ⊕ ) in the habitable zone (Dressing & Charbonneau 2013, Kopparapu 2013, Silburt et al. 2015, Farr et al. 2015, suggesting that planets like K2-18b may be common. However, most of the M dwarf planetary systems detected in the original Kepler survey are extremely faint, making spectroscopic characterization of these planets prohibitively inefficient. Fortunately, K2-18b orbits a relatively bright (K = 8.89) host star, permitting detailed characterization of its atmosphere.
Beyond its potentially temperate climate, K2-18b also occupies an interesting niche in mass-radius space. Very little is currently known about the bulk and atmospheric compositions of planets with masses between those of Earth and Neptune. These planets have no analogs in the solar system, and aside from the recent atmospheric detection for GJ 3470b (Benneke et al. 2019), most atmospheric studies resulted in nondetections due to the prevalence of high-altitude clouds (Knutson et al. 2014b, Kreidberg et al. 2014, Crossfield et al. 2017. Refined population studies of the radius distribution of sub-Neptune-sized planets have revealed a significant drop in the planet occurrence rates between near 1.5-2.0 R ⊕ (Fulton et al. 2017, Fulton & Petigura 2018. Photoevaporation (Owen & Wu 2013, Lopez & Fortney 2013, Lopez & Rice 2018 and core-powered mass loss (Schlichting 2018) are two theories that can reproduce this observed drop in the occurrence rate. According to photoevaporation models, the most highly irradiated super-Earths are expected to have primarily rocky compositions and relatively small radii, while less irradiated super-Earths are able to retain a modest (few percent in mass) primordial hydrogen-rich atmosphere that inflates their observed radii to values greater than 2.0 R ⊕ . With a relatively low incident flux and a measured radius of 2.61 R ⊕ , K2-18b would then be expected to host an extended hydrogen-rich atmosphere, making it a favorable target for atmospheric characterization studies using the transmission spectroscopy technique.
In this work, we present the detection of water vapor and clouds in the atmosphere of the habitable-zone exoplanet K2-18b. In Section 2 we describe our observations obtained with HST, Spitzer, and K2, as well as the techniques used to reduce the data and produce spectrophotometric lightcurves. In Section 3 we discuss the data analysis and present the best-fit white light-curve parameters and transmission spectrum. Importantly, we also provide a refined stellar characterization. Our atmospheric modeling analysis is described in Section 4. The main results are presented in Section 5, and the possibility of liquid water clouds is discussed in Section 6.

OBSERVATIONS AND DATA REDUCTION
Our team observed the transiting habitable-zone exoplanet K2-18b with Wide Field Camera 3 on the Hubble Space Telescope as part of two large spectral surveys of low-mass exoplanets (GO 13665 and GO 14682;PI Benneke). Building up sufficient signal-to-noise was possible for this habitable-zone planet because the shorter orbital period (33 days) enabled us to observe nine independent transits within 3 yr. The SNR of the transmission signal was further boosted by the small stellar radius, which amplifies the signal of the planet and atmosphere during transit. We complement the HST /WFC3 observations with two new Spitzer transit observations taken at 3.6 µm (Program 12081, PI Benneke) as well as previously published Spitzer (4.5 µm) and Kepler/K2 transit observations from Benneke et al. (2017, see our Table 1).

HST/WFC3 transits
Each of our nine HST /WFC3 visits spanned 6.5 hours and consisted of four full telescope orbits separated by 45 minute gaps in data collection due to Earth occultation. We obtained the HST /WFC3 time series with the G141 grism in spatial scan mode. In this configuration the telescope is scanned during the exposure, moving the stellar spectrum across the detector perpendicular to the dispersion direction (Deming et al. 2013, Kreidberg et al. 2014. This allows for a significantly higher efficiency when observing bright stars like K2-18b. In order to minimize instrumental overheads we utilized both forward and backward scans with the maximum possible duration, covering a large fraction of the 256 × 256 pixel detector subarray used for fast readouts. We do not include the final HST/WFC3 transit in the analysis because it was corrupted by telescope guiding errors which resulted in the spectrum migrating off the detector subarray preventing us from doing useful science with the this particular transit observations. Following standard procedure (e.g., Deming et al. 2013), we minimize the contribution from the sky background by subtracting consecutive nondestructive reads and then coadding these background-subtracted subexposures. We then use the wavelength-dependent flatfield data provided by STScI to build flat-fielded images. Bad pixels are removed and replaced by the corresponding value in a normalized row-added flux template.
The combined effect of the spatial scans and the position-dependent grism dispersion results in a slightly trapezoidal shape for the illuminated patch. The trapezoidal shape is due to a small difference in the dispersion on the detector along its y-axis, yielding a systematic horizontal shift of each wavelength by 2-3 pixels across the scan. To correctly capture this effect, we integrate over trapezoidal wavelength bins instead of rectangular ones, built from lines of constant wavelength obtained from our 2D wavelength solution computed across the detector. We use the same procedure as in Benneke et al. (2019) for the flux integration, avoiding any presmoothing and accounting for the partial pixel flux along the bin boundaries to ensure total flux conservation. We also account for small x position shifts in each frame to correct for the small observed drift in the star's position across the observations.

Spitzer/IRAC transits
We obtained two new transit observations of K2-18b with Spitzer at 3.6 µm (Program 12081, PI Benneke) and reanalyzed the 4.5 µm transit observation previously published in Benneke et al. (2017). The 3.6 µm transit observations were preceded by 30 minute preobservations in peak-up mode to mitigate telescope drift and temperature variations associated with a recent shift in pointing (Grillmair et al. 2012). We used 0.4 s exposures and observed for a total of ∼8 hr on 2016 March 14 and 2016 August 26.
We follow standard procedure for Spitzer /IRAC image processing and start from the flat-fielded and darksubtracted "Basic Calibrated Data" (BCD) images. We use the method presented in Kammer et al. (2015) for background estimation, and determine the position of the star following Benneke et al. (2019). We then chose the aperture, trim duration, and bin size to minimize both the RMS of the unbinned residuals and the timecorrelated noise in the data for individual fits.

Kepler/K2 transits
We supplement our dataset with two previously published K2 transits of K2-18b, corrected for variations associated with telescope jitter and cosmic ray hits as described in Benneke et al. (2017).

Updated stellar parameters
For the correct analysis and interpretation of the K2-18b observations, it is absolutely crucial to update the stellar and planetary bulk parameters compared to the discovery and confirmation papers (Montet et al. 2015, Benneke et al. 2017. This is particularly important for K2-18b because GAIA DR2 substantially updated the parallax distance from the previously estimated 34 to 38.025 ± 0.079 pc (see also Cloutier et al. 2019). In its final consequence, this results in a substantial update to the planet radius from the previous value of R p = 2.29 R ⊕ to R p = 2.61 R ⊕ . Not correcting the stellar radius led Tsiaras et al. (2019) to use a planetary density that was 27% too high and a surface gravity that was 18% too high. The overestimated planetary density may have led to the continued classification of K2-18b as a super-Earth, even though the radius of R p = 2.61 R ⊕ puts K2-18b directly in the center of the intriguing sub-Neptune population (Fulton & Petigura 2018). Besides, the overestimated surface gravity unavoidably resulted in a systematic offset in the atmospheric scale height for each model in the atmospheric retrieval.
To revise the stellar parameters, we first calculate the distance modulus from the GAIA DR2 stellar parallax to be µ = 2.900 ± 0.005, where we add 30 µas systematic offset in the measured parallax (Cloutier et al. 2019). By propagating the uncertainties in K2-18's K-band magnitude (K = 8.899 ± 0.019), we then find an absolute K-band magnitude of M K = 5.999 ± 0.020. Using this absolute K-band magnitude, we infer the stellar radius of K2-18b from the empirical M star correlation (Mann et al. 2015) and the stellar mass from the empirical M dwarf mass-luminosity relation by Benedict et al. (2016, see our Table 2). We choose this approach for the stellar radius because it does not rely on the empirical massradius relationship, which we agreed with Dr. Cloutier to be the preferred approach. Either way the stellar radii presented here and the one from Cloutier et al. (2019) are consistent at the 1σ. However, both estimates present a substantial update to the old pre-GAIA stellar radius estimate, which led Tsiaras et al. (2019) to use an overestimated planetary density and surface gravity for their analyses. Finally, we use the effective stellar temperature from Benneke et al. (2017) to com-pute the stellar luminosity as well as the insolation of the planet (Table 2).

Transit White-light-curve Fitting
We carry out a global analysis of our HST /WFC3, K2, and Spitzer transit light curves within the ExoTEP analysis framework (Benneke et al. 2017(Benneke et al. , 2019. Ex-oTEP jointly fits the transit and systematics models of all transits along with photometric noise parameters using a Markov Chain Monte Carlo (MCMC) method. The main astrophysical outputs of the white-light-curve analysis are the global transit parameters (a/R , b), the transit ephemeris (T 0 , P ), and the transit depths in the K2, HST /WFC3, and Spitzer /IRAC bandpasses (Table 2). Prior to the global MCMC fit, we analyze each transit light curve individually and then initialize the corresponding systematics parameters in the global fit at their best-fit values to ease convergence. We find that the planet-to-star radius ratio estimates in the same band are consistent within their statistical uncertainties across all epochs. The white-light-curve fits to the HST /WFC3 and Spitzer transits are depicted in Figures  1 and 2.

HST/WFC3 Instrument Model
We correct for systematic trends in the uncorrected WFC3 transit light curves by simultaneously fitting an analytical model-ramp function along with the astrophysical transit model. Following previous studies (e.g., Berta et al. 2012, Deming et al. 2013, Kreidberg et al. 2014), we account for the possible presence of both visitlong slopes and orbit-long exponential ramps using the parametric instrumental systematics model: Here, c is a normalization constant, v is the visit-long linear slope, a and b describe the rate and amplitude of the orbit-long exponential slope, S(t) is set to 1 for forward scans and s for backward scans to account for the systematic flux offset between the scan directions, and t v and t orb are the time in hours since the start of the visit and the start of the observations within the current orbit. Following standard procedure, we discard the first HST orbit of each visit, as it systematically exhibits a stronger ramp than the three subsequent orbits. We also remove the first forward and backward scan exposures of each orbit.

Spitzer/IRAC Instrument Model
We account for the presence of systematic variations in the Spitzer time series due to subpixel inhomogeneities in the detector response using the pixel-level decorrelation (PLD) model (Deming et al. 2015, Benneke et al. 2017. For each Spitzer visit, the systematics model . (2) includes both a linear-exponential ramp in time (A e −ti/τ + m t) and the PLD term. In this analytical model, D k (t i ) are the registered counts in each of the 3 × 3 pixels covering the central region of the point spread function. The coefficients w k are time-independent PLD weights. The 10 parameters in this model are fitted along with the transit model for all Spitzer data sets.

Kepler/K2 Instrument Model
We use the detrended K2 light curves of K2-18b from Benneke et al. (2017) and additionally fit a linear trend with time to allow for a residual slope in each visit superimposed with the transit signal.

Transit Model and MCMC Analysis
Our astrophysical transit light-curve model f (t i ) is computed using the Batman module (Kreidberg 2015). In the joint white-light-curve fit, four distinct transit depths are fitted for the four bandpasses of K2, HST /WFC3, Spitzer /IRAC Channel 1, and Spitzer /IRAC Channel 2. The orbital parameters are assumed to be consistent for all 12 transit light curves. We also fit limb-darkening coefficients using a quadratic law because the eight HST /WFC3 combined provide full time coverage of the transit light curve including ingress and egress (Figure 1). The cadence of the observations is accounted for by integrating the model over time within each exposure. Finally, the total loglikelihood for a single set of parameters is calculated using where N is the number of visits, n V is the number of data points d V (t i ) in visit V , and σ V is a photometric noise parameter associated with each visit, simultaneously fitted to account for the possibility of variations in the scatter between independent visits. Each visit has a different systematics model S V (t i ) with additional free parameters specific to the instrument that performed the observations, as described in the previous sections.  data are to first-order independent of wavelength. We test two methods for the correction: either dividing each of the spectroscopic time series by its corresponding best-fit systematics model from the white-light-curve analysis, or dividing it by the ratio of the white light curve to its best-fitting transit model. We find no significant difference between the two approaches for precorrecting the light curves in terms of the derived parameters.
In each spectroscopic channel, we then perform a joint MCMC fit to the eight pre-corrected spectroscopic light curves from the eight HST /WFC3 transits. We keep the parameters a/R , b, T 0 , and P fixed to the best-fitting values from the white-light-curve fit and fit a simplified systematics model that detrends linearly against the x position of the trapezoidal spectral trace on the detector. We obtain consistent transmission spectra when fitting either a single linear limb-darkening coefficient at each wavelength or two coefficients for a quadratic limb-darkening profile. A consistent transmission spectrum is also obtained when using pre-computed limbdarkening coefficients from stellar atmosphere models (see Lothringer et al. (2018) and Benneke et al. (2019) for details). The resulting WFC3 transit depths for the latter are quoted in Table 3 and depicted in Figure 3 together with the K2 and Spitzer transit depths. A typical spectral-light-curve fit for one wavelength bin is depicted in Figure 1.  sit depth across multiple visits in the same bandpass we obtain consistent values, implying that the observed transit depths are relatively insensitive to variable spot coverage on the star.

Stellar Activity
Still, to further rule out a stellar origin, we attempt to fit the transit spectrum with a stellar contamination model. Following Rackham et al. (2018) we consider the hypothesis that the observed transit depth variations are due to inhomogeneities on the stellar surface. The ob- served wavelength-dependent transit depths D λ,obs are modeled as: (4) where D λ is the geometric radius ratio (R p /R ) 2 , assumed here to be constant with wavelength (= D), f spot and f fac are the spot and faculae covering fractions, and the F λ refer to bandpass-integrated PHOENIX model spectra of the photosphere, spots, and faculae (Hauschildt et al. 1999). Using K2-18b stellar parameters (Table 2), we explore a wide range of possible stellar contamination spectra for M2-M3 stars with spot and faculae covering fractions consistent with 1% I-band variability. Based on Rackham et al. (2018), we then explore scenarios with giant spots and solar-like spots as well as with and without faculae, resulting in spot fractions of up to 20%. However, none of those models can explain the amplitude of the observed transit spectrum. The amplitude of resulting transit depth variations due to the stellar inhomogeneities are up to 3-8 parts-permillion (ppm), about an order of magnitude smaller than the observed 80-90 ppm transit depth variations in the observed WFC3 spectrum (Figure 3). Finally, we also Pressure [mbar] Figure 3. Transmission spectrum of K2-18b computed from our global spectroscopic and broadband transit light-curve analysis (black points), and a random sampling of the model transmission spectra in the retrieval MCMC chain (blue). The shaded regions indicate 1σ and 2σ credible intervals in the retrieved spectrum (medium and light blue, respectively), relative to the median fit (dark blue line) and the overall best-fitting model (red). The main feature of the transmission spectrum is the prominent increase in transit depth within the 1.4 µm vibrational band of water vapor covered by the HST /WFC3 data. The K2 data point is plotted at visible wavelengths and the Spitzer /IRAC measurements are indicated at 3.6 and 4.5 µm. The secondary vertical axis on the right indicates the atmospheric pressure for the best-fitting model.
explore the most extreme scenarios where the spot and faculae covering fractions can be as high as 100%, but even those stellar inhomogeneity models fail to explain the amplitude of the observed transit depth variation. They deliver an absolute maximum of 20 ppm at 1.4 µm, which still only corresponds to less than a quarter of the transit depth variation in the observations. We conclude that stellar inhomogeneities and activity cannot explain the measured transmission spectrum.

ATMOSPHERIC MODELING
We compute quantitative constraints on the atmosphere of K2-18b using the SCARLET atmospheric retrieval framework (Benneke & Seager 2012, 2013, Knutson et al. 2014a, Kreidberg et al. 2014, Benneke 2015, Benneke et al. 2019. To be as independent of model assumption as possible, we employ the "free retrieval" mode, which parameterizes the mole fractions of the molecular gases, the pressure of the cloud deck, and the atmospheric temperature as free fitting parameters. SCARLET then determines their posterior constraints by combining the atmospheric forward model with a Bayesian MCMC analysis. To evaluate the likelihood for a particular set of parameters, the atmospheric forward model first computes a model atmosphere in hydrostatic equilibrium, then determines the opacities of molecules at each layer, and finally computes the transmission spectrum. Beyond H 2 /He, our model allows for H 2 O, CH 4 , CO, CO 2 , NH 3 , HCN, and N 2 with a log-uniform prior for mixing ratios between 10 −10 and 1. We find that only H 2 O is required by the data and that including the other molecules has virtually no impact on the best fit to the data. Following Benneke & Seager (2012, 2013, we also include a cloud deck at a freely parameterized cloud top pressure with a log-uniform prior between 0.1 mbar and 10 bar. The cloud deck is assumed to be opaque to grazing light beams below the cloud top pressure as would occur for large droplets. We also explored a more complex three-parameter Mie-scattering cloud description as introduced in Benneke et al. (2019); however, we find no significant improvement in the fit to the observed transmission spectrum compared to gray clouds. Our atmospheric temperature is parameterized using a single free parameter for the mid-atmosphere probed by the observations because low-resolution transmission spectra are largely insensitive to the exact vertical temperature structure (Benneke & Seager 2012). We also considered a five-parameter analytic model (Parmentier & Guil-  the cloud top pressure. The colored shading indicates the normalized probability density as a function of the water vapor mole fraction above the clouds and cloud top pressure derived using our Bayesian atmosphere retrieval framework. The black contours show the 1σ and 2σ regions. Absolute water vapor mole fraction is indicated at the top axis; the water abundance relative to a solar composition atmosphere is shown at the bottom. lot 2014, Benneke et al. 2019), but no gradients in the temperature-pressure profile can be constrained.
High-resolution synthetic transmission spectra are computed using line-by-line radiative transfer, which are integrated over the appropriate instrument response functions to obtain synthetic observations to be compared to the observations. Sufficient wavelength resolution in the synthetic spectra is ensured by repeatedly verifying that the likelihood for a given model is not significantly affected by the finite wavelength resolution (∆χ 2 < 0.001). Reference models are computed at λ/∆λ = 250000. Our final retrieval run computes all atmospheric models at λ/∆λ = 30000.
We perform all retrieval analyses with 100 walkers using log-uniform priors on all parameters. Following best practice for emcee (Foreman-Mackey et al. 2013), our formal convergence test is based on the integrated autocorrelation time (Goodman & Weare 2010), but we run the chains well beyond formal convergence to obtain smooth posterior distribution contours. We doublecheck that the number of iterations of our 100 walker chain exceeds at least 50 times the autocorrelation time computed based on Goodman & Weare (2010) as well as the autocorrelation time computed computed based on the more conservative Farad method in the emcee 3.0.0 documentation.

RESULTS
Our transmission spectrum of K2-18b reveals a slightly attenuated but statistically significant water absorption feature at 1.4 µm in our HST /WFC3 data (Figure 3). The water absorption is detected in multiple neighboring spectroscopic channels covering the 1.4 µm water band and protrudes over an otherwise relatively flat visible to near-IR transmission spectrum. Quantitatively, retrieval models that include molecular absorption by water are favored by the Bayesian factor at 459:1 (3.93σ). We evaluate the Bayes factor by comparing the Bayes evidence for models with and without water opacity (Benneke & Seager 2013). We also evaluate the AIC of our model in comparison to the simple flat line model with R P /R as the only free parameter and find a strong preference of ∆AIC=18.1 in favor of the model with water absorption.
Our retrieval modeling also shows that the data are best matched by a hydrogen-dominated atmosphere with water vapor absorbing above clouds in the midatmosphere (Figure 4). Water abundances between 0.033% and 8.9% are most consistent with the data at 1σ, corresponding to 0.6 times and 162 times the value for a solar abundance atmosphere, respectively. The clouds become optically thick below the 7.7-139 mbar level. The water abundance and cloud top pressure are strongly correlated because models with similar water vapor column density above the cloud deck result in similar transmission spectra (see also Benneke & Seager 2013). For the lowest water abundances, pressure levels up to a few hundred millibars are probed by our observations because K2-18b's radius is relatively small and the low temperature results in a relatively small atmospheric scale height. Both effects reduce the path lengths for starlight grazing through the atmosphere compared to most previously studied exoplanets.
Cloud-free models with water mixing ratios greater than several hundred times solar are substantially disfavored by our combined data set indicating that K2-18b hosts an atmosphere rich in hydrogen and helium. This is consistent with our new measurement of the planet density (Section 3.1) which supports the presence of a thick hydrogen-rich gas envelope (Lopez & Fortney 2014). Quantitatively, the retrieval model with clouds in the mid-atmosphere is favored by at least 8.3:1 (2.6σ) in the Bayesian evidence over retrieval models without clouds. The significance rises further if extreme subsolar O/H ratios for K2-18b are not considered. We conclude that it is crucial to report the water abun- Figure 5. Self-consistent temperature-pressure profile for the retrieved best-fitting molecular composition of K2-18b in comparison to the water phase diagram. The blue curve shows the vertical temperature profile for an Earth-like Bond albedo (A=0.3). The partial pressure of water is shown on the left; the total atmospheric pressure is shown on the right. The temperature profile crosses the liquid area of the water phase diagram (blue circle). Clouds that form in this regime form liquid water droplets and possibly result in liquid water rain. Icy clouds can possibly form at higher altitude as well. Note that the pressure axis is inverted as is commonly done for atmospheric temperature profiles.
dance constraints from retrieval models that consider clouds. Cloud-free retrieval models would misleadingly favor extremely high water mole fractions and narrower constraints.
Finally, while no molecular species other than H 2 O is directly detected in our observations of K2-18b, we can still infer useful upper limits on their mole fractions. The two dominant effects of additional molecules are the introduction of additional wavelength-dependent opacities and, at high mole fraction, the increase of the overall atmospheric mean molecular weight (Benneke & Seager 2012). The latter, in turn, dampens the amplitudes of all molecular absorption features in the transmission spectrum. The upper limit on the spectrally inactive gas N 2 , for example, is 10.9% at 2σ (97.5%) because above that level the atmospheric mean molecular mass would be sufficiently increased to not allow for the amplitude of the detected 1.4 µm H 2 O absorption feature. The other molecular gases (CO, CO 2 , NH 3 , and CH 4 ) are similarly constrained to 7.45%, 2.4%, 13.5%, and 0.248% at 2σ, respectively. Molecular opacities also played an important role for these constraints because CO, CO 2 , NH 3 , and CH 4 can result in substantial opacities within the HST /WFC3 and/or Spitzer bandpasses.

DISCUSSION AND CONCLUSIONS
The simultaneous detection of water vapor and clouds in the mid-atmosphere of K2-18b is particularly intriguing because K2-18b's stellar insolation and atmospheric temperature are sufficiently low for water vapor to condense. Water clouds are therefore a very plausible explanation for the detected clouds. To demonstrate the plausibility of liquid droplet formation, we explore self-consistent temperature structures for atmospheric compositions consistent with the observations ( Figure  5). We computed a set of fiducial temperature-pressure profiles using the implementation described in Benneke et al. (2019) and Morley et al. (2013). Both models iteratively solve the radiative-convective heat transport. The models predict the onset of liquid condensation between approximately 10 and 1000 mbar at locations where the water vapor is supersaturated. The fact that the bestfitting retrieval models are obtained for cloud top pressures in the same range further supports the scenario of liquid water clouds (Figure 4).
The water vapor mixing ratio retrieved from our spectroscopic observations should be regarded as the mole fraction of water vapor in the gaseous atmosphere above the clouds. Because water condenses into clouds, this presents a lower limit on the overall water fraction in the atmosphere K2-18b. The water abundance in the upper atmosphere is set by a complex interplay between upward transport through vertical mixing and homogeneous/heterogeneous condensation and coagulation processes. Detailed modeling of these cloud formation processes will be presented in a follow-up study.
The detection of water vapor makes K2-18b a key target for more detailed follow-up studies with the upcoming James Webb Space Telescope (JWST ). Unlike any other temperate and low-mass planet, we now know that K2-18b shows evidence for atmospheric water vapor and is amenable to characterization via transmission spectroscopy. Our result suggests K2-18b orbits far enough away from its moderately active host star to avoid high levels of haze production, while also remaining cool enough that cloud species like KCl and ZnS that may exist in warmer atmospheres do not form. This may indicate that cooler planets like K2-18b occupy a sought after niche in the low-mass planet parameter space that is accessible to spectroscopic characterization with JWST. JWST 's wavelength coverage will extend from 0.55 µm to the thermal infrared, where many other molecular species like CH 4 , CO, CO 2 , and NH 3 can be probed directly. The higher precision and spectral resolution obtained from repeated JWST transit observations will allow us to better constrain K2-18b's atmospheric composition and cloud properties, and po-tentially even look for biomarkers in the gas envelope of a habitable-zone exoplanet (e.g., Seager et al. 2013).