Exploring the Ability of Hubble Space Telescope WFC3 G141 to Uncover Trends in Populations of Exoplanet Atmospheres through a Homogeneous Transmission Survey of 70 Gaseous Planets

We present analysis of the atmospheres of 70 gaseous extrasolar planets via transit spectroscopy with Hubble’s Wide Field Camera 3 (WFC3). For over half of these, we statistically detect spectral modulation that our retrievals attribute to molecular species. Among these, we use Bayesian hierarchical modeling to search for chemical trends with bulk parameters. We use the extracted water abundance to infer the atmospheric metallicity and compare it to the planet’s mass. We also run chemical equilibrium retrievals, fitting for the atmospheric metallicity directly. However, although previous studies have found evidence of a mass–metallicity trend, we find no such relation within our data. For the hotter planets within our sample, we find evidence for thermal dissociation of dihydrogen and water via the H− opacity. We suggest that the general lack of trends seen across this population study could be due to (i) the insufficient spectral coverage offered by the Hubble Space Telescope’s WFC3 G141 band, (ii) the lack of a simple trend across the whole population, (iii) the essentially random nature of the target selection for this study, or (iv) a combination of all the above. We set out how we can learn from this vast data set going forward in an attempt to ensure comparative planetology can be undertaken in the future with facilities such as the JWST, Twinkle, and Ariel. We conclude that a wider simultaneous spectral coverage is required as well as a more structured approach to target selection.


Introduction
The exoplanet field has rapidly expanded, with thousands of planets currently known and thousands more anticipated in the coming decade. The vast number of detected worlds has allowed us to begin to further characterize a diverse selection. While direct imaging has provided high-quality thermal emission spectra for a handful of planets (e.g., Samland et al. 2017; Zhou et al. 2020; Wang et al. 2022), the bulk of atmospheric characterization has been undertaken using transit or eclipse spectroscopy. Ground-based, high-resolution observations have been used to detect atomic metals and their ions (e.g., Birkby 2018; Ehrenreich et al. 2020; Kawauchi et al. 2022; Kesseli et al. 2022; Prinoth et al. 2022; Yan et al. 2022) as well as evidence of high-speed winds in the terminator region (e.g., Seidel et al. 2020; Cauley et al. 2021).
While lower-resolution, space-based data are not capable of distinguishing individual absorption or emission lines, molecular species have been detected via their broadband features, giving insights into the atmospheric diversity of extrasolar planets (e.g., Tinetti et al. 2007; Swain et al. 2008). Although most research papers have focused on individual objects, some have begun to conduct population-style studies (e.g., Cowan & Agol 2011; Sing et al. 2016a; Tsiaras et al. 2018; Changeat et al. 2022). For instance, the Infrared Array Camera (IRAC) on board Spitzer was used extensively before the end of the observatory's life in 2020, and Spitzer eclipses and phase curves have been used to search for trends in the day-night temperatures of hot Jupiters (e.g., Baxter et al. 2020; Garhart et al. 2020; Bell et al. 2021; Keating & Cowan 2022; May et al. 2022).
While the Space Telescope Imaging Spectrograph (STIS) has studied a number of planets, leading to the detection of a variety of spectral features in the visible and UV (e.g., Evans et al. 2018; Von Essen et al. 2020), Hubble's Wide Field Camera 3 (WFC3) has been the workhorse of infrared (IR) space-based spectroscopy, initially using staring-mode observations (e.g., Berta et al. 2012; Mandell et al. 2013; Ranjan et al. 2014). The later development of the spatial scanning technique (McCullough & MacKenty 2012) led to far greater efficiencies and thus more precise spectra (e.g., Deming et al. 2013; Kreidberg et al. 2014b; Changeat & Edwards 2021).
Using this technique, groups of planets began to be analyzed in transmission. Sing et al. (2016a) combined Hubble STIS/WFC3 and Spitzer IRAC data of 10 hot Jupiters, finding a range of atmospheric feature sizes indicative of different cloud levels within the selected planets. The data set from Sing et al. (2016a) was later used in a number of different retrieval studies (Barstow et al. 2017; Pinhas et al. 2019). Nineteen Hubble Space Telescope (HST) WFC3 G141 transmission spectra were analyzed by Iyer et al. (2016), who also concluded that clouds were common in the atmospheres of hot Jupiters. Tsiaras et al. (2018) conducted a larger population analysis which included 30 planets. The study used data purely from HST WFC3 G141, finding that around half the data sets showed significant evidence for atmospheric features. Tsiaras et al. (2018) did not search for a trend between chemistry and the bulk characteristics of a planet but their data were later used by Fisher & Heng (2018), who also included data for the TRAPPIST-1 system from de Wit et al. (2018) and several other studies (Huitson et al. 2013; Mandell et al. 2013; Kreidberg et al. 2014b; Knutson et al. 2014). In their study, they found no trends between water abundance and planet mass or temperature. However, the analysis of 19 planets by Welbanks et al. (2019) suggested a mass-metallicity trend where the water abundance increased with decreasing mass. They also noted that the metallicities implied by these water abundances were generally below those of the giant planets in our solar system (Atreya et al. 2016). Additionally, many other studies have performed retrieval analyses of planets by taking spectral data from the literature (e.g., Barstow et al. 2017; Pinhas et al. 2019; Cubillos & Blecic 2021; Kawashima & Min 2021). For smaller planets within the sub-Neptune or Neptune regime (∼2-6 R⊕), a study of six planets showed a strong correlation between the amplitude of the water feature and the equilibrium temperature of the planet or its bulk mass fraction of H/He (Crossfield & Kreidberg 2017).
Population studies have also been undertaken in emission. For instance, data from Spitzer IRAC channels 1 and 2 have been utilized to study tens of planets (e.g., Baxter et al. 2020; Garhart et al. 2020; Keating & Cowan 2022). Mansfield et al. (2021) presented a simplistic metric which was designed to indicate whether a spectrum showed evidence for a thermal inversion by measuring if the water feature in the HST WFC3 G141 band was in absorption or emission, applying it to 19 planets and comparing the values to a fiducial model. Finally, Changeat et al. (2022) presented an analysis of 25 hot and ultrahot Jupiters in emission with HST and Spitzer and, by using atmospheric retrievals, observationally uncovered an apparent link between the abundance of optical absorbers in their atmospheres and the temperature structure.
In many studies in the literature, data from multiple instruments are combined to expand the wavelength coverage. However, there are many potential issues when trying to infer atmospheric properties based on these merged data sets. First, the wavelength region probed determines the sensitivity of the data to each molecular opacity. Therefore, studying the same planet but with different instruments can often lead to differing constraints on the abundance of a species (Pinhas et al. 2019; Pluriel et al. 2020). Applying this to different planets implies that, if the data sets are not homogeneous, then the cause of any trends seen in the retrievals cannot be determined: the underlying abundances could be different or the data sets could have differing sensitivities to these molecules. Combining instruments can also lead to inconsistencies, as the data sets are not necessarily compatible (Yip et al. 2020, 2021). Hence, while a longer spectral baseline may give precise atmospheric abundances (e.g., Wakeford et al. 2018), these constraints could be wholly inaccurate. By combining a menagerie of data sets, biases can therefore be introduced into the analysis of a single planet as well as a population as a whole. Several works have attempted to overcome the vertical offsets often seen between data sets by adding an offset parameter to the retrieval model (e.g., Luque et al. 2020; Murgas et al. 2020; Wilson et al. 2020; Yan et al. 2020; Yip et al. 2021; McGruder et al. 2022). However, it is unclear how well this works: the issue of varying sensitivity could still apply, and temporal changes (e.g., Bruno et al. 2020; Saba et al. 2022) provide an additional challenge.
Here we conduct a spectroscopic population study of 70 gaseous exoplanetary atmospheres, using a methodology which is standardized and applied uniformly to all targets to try to ensure the extraction of robust trends. In an attempt to avoid the aforementioned biases, we restrict ourselves to using data from HST WFC3 G141 only. To further seek homogeneity, the data were extracted using the same pipeline, Iraclis (Tsiaras et al. 2016b, 2016c). By performing standardized atmospheric retrievals using the TauREx 3 code (Al-Refaie et al. 2021) within the Alfnoor pipeline (Changeat et al. 2020a), we search for trends within these data sets. We attempt to find correlations between the water abundance recovered and the planet's bulk parameters, such as mass and temperature, comparing our findings to those from the literature. We also investigate the amplitude of the spectral features seen, searching for trends with the planet's temperature, surface gravity and, for smaller planets, H/He mass fraction. At each stage we attempt to understand the limitations of our approach, including the potential biases that could be introduced. As we stand at the dawn of a new era of increased data quality, consideration of these will be crucial to avoid misinterpreting these data sets.

Observations
To ensure homogeneity in our study, we aimed for all data to be analyzed with a single pipeline: Iraclis (Tsiaras et al. 2016b). The pipeline has previously been used in a number of studies and so we acquired a number of spectra from these. Many of these were taken from the population study by Tsiaras et al. (2018) and the papers resulting from the Ariel Retrieval of Exoplanets School (Edwards et al. 2020b; Pluriel et al. 2020; Skaf et al. 2020; Guilluy et al. 2021). We constrain ourselves to planets which are likely to possess an atmosphere containing significant amounts of hydrogen and helium (R > 2 R⊕; Fulton & Petigura 2018). Therefore, we do not include HST WFC3 data of 55 Cancri e (55 Cnc e; Tsiaras et al. 2016c), LHS 1140b (Edwards et al. 2021a), GJ 1132b (Mugnai et al. 2021; Swain et al. 2021; Libby-Roberts et al. 2022), or TRAPPIST-1 b-h (de Wit et al. 2018; Garcia et al. 2022; Gressier et al. 2022).
A list of all sources of previous data sets analyzed with Iraclis is given in Table 1 along with the proposal numbers and principal investigators of the observing proposals. Meanwhile, the observations analyzed in this work are given in Table 2. These account for 28 new planets, though we note that many of these data sets have been analyzed using other pipelines (Huitson et al. 2013; Mandell et al. 2013; Kreidberg et al. 2014b; Knutson et al. 2014; Ranjan et al. 2014; Evans et al. 2016; Kreidberg et al. 2018a, 2018b; Spake et al. 2018; Carter et al. 2020; Chachan et al. 2020; Guo et al. 2020; Libby-Roberts et al. 2020; Alam et al. 2022; Brande et al. 2022; Glidic et al. 2022), meaning that only 16 data sets had not previously been published at the time of writing. The distribution of our targets, in terms of the planet's semimajor axis and mass, is shown in Figure 1.
We detail the methodology of our Iraclis analysis in Appendix A, where we also discuss the results of each individual fitting and comparisons to previous works. For two planets where reasonable fits could not be obtained with the Iraclis pipeline, we took values from the literature, which are noted in Table 3. We also attempted to fit the transit observation of K2-33 b (PN: 14887, PI: Björn Benneke; Benneke et al. 2016a) but it did not have a post-egress orbit and reliable constraints on the transit depth could not be achieved. Furthermore, a staring-mode observation of WASP-18 b (PN: 12181, PI: Drake Deming; Deming 2010) was analyzed but the precision achieved on the transit depth was far lower than that of the scanning-mode observation. It was therefore discarded. Similarly, we also analyzed the staring-mode data of GJ 1214b (PN: 12251, PI: Zach Berta-Thompson; Berta et al. 2012) but did not use it in our final analysis due to the better sensitivity offered by the scanning-mode data.
In total we analyze the HST WFC3 G141 transmission spectra of 70 planets, 68 of which have been reduced with the Iraclis pipeline. We note that neither of the spectra that were taken from the literature, and therefore not reduced using the Iraclis pipeline, led to detections of atmospheric features.

Data Analysis
Having created a database of HST WFC3 G141 spectra, we set about analyzing them in search of trends within the population. The analysis included Bayesian retrievals as well as studying the strength of the 1.4 μm water feature.

Retrieval Setup
Atmospheric retrievals were performed on the transmission spectra using the population analysis tool Alfnoor (Changeat et al. 2020a). Alfnoor extends the capabilities of the publicly available retrieval suite TauREx 3 (Al-Refaie et al. 2021) to populations of exoatmospheres. The atmospheres of the planets analyzed here were simulated to range from 10^−4 to 10^6 Pa (10^−9 to 10 bar) and sampled uniformly in log-space by 100 atmospheric layers. For the spectra taken from other studies, the star and planet parameters are given in Tables 6 and 7. For the spectra derived here, the star parameter values we used are listed in Table 8, while the planet parameters are given in Table 9.
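As an illustration of the vertical sampling described above, the following minimal sketch builds 100 pressure levels spaced uniformly in log-space between 10^6 and 10^−4 Pa. The function name and the choice to place the first and last levels exactly on the bounds are assumptions made here for illustration; the layer convention used internally by TauREx 3 may differ.

```python
import math

PA_PER_BAR = 1e5  # 1 bar = 10^5 Pa


def log_pressure_grid(p_max_pa=1e6, p_min_pa=1e-4, n_layers=100):
    """Return n_layers pressures (Pa), uniform in log10, from deepest to highest."""
    log_max = math.log10(p_max_pa)
    log_min = math.log10(p_min_pa)
    step = (log_min - log_max) / (n_layers - 1)
    return [10 ** (log_max + i * step) for i in range(n_layers)]


grid_pa = log_pressure_grid()
# Same grid expressed in bar, matching the 10 bar to 10^-9 bar range in the text.
grid_bar = [p / PA_PER_BAR for p in grid_pa]
```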
In our retrievals we assumed that all planets possess a primary atmosphere with a solar ratio of helium-to-hydrogen (He/H2 = 0.17). To this we added trace gases and included the molecular opacities from the ExoMol (Tennyson et al. 2016), HITRAN (Gordon et al. 2016), and HITEMP (Rothman and Gordon 2014) databases.
The key molecular absorption within the WFC3 range is H2O. However, in a free chemical retrieval, the other molecules chosen can affect the resulting abundance of H2O (e.g., Changeat et al. 2020b). Therefore, we attempted several different retrievals to test the robustness of our results. These were as follows:
1. Standard retrieval. In this setup we included the opacities of H2O (Polyansky et al. 2018), CH4 (Yurchenko & Tennyson 2014), CO (Li et al. 2015), CO2 (Rothman et al. 2010), HCN (Barber et al. 2013), and NH3 (Yurchenko et al. 2011). On top of this, we also included collision-induced absorption (CIA) from H2-H2 (Abel et al. 2011; Fletcher et al. 2018) and H2-He (Abel et al. 2012) as well as Rayleigh scattering for all molecules. We modeled clouds in two ways: first, as a uniform opaque deck, fitting only the cloud-top pressure (i.e., gray clouds), and second, via wavelength-dependent Mie scattering using the approximation from Lee et al. (2013).
2. Optical absorbers. A number of previous WFC3 studies have found evidence for hydrides or oxides (e.g., Evans et al. 2016; Pluriel et al. 2020; Skaf et al. 2020). Hence, in this setup, we included all the opacity sources from our standard retrieval with the addition of TiO [...]
[...] Hence, we attempted several retrievals with the C/O ratio fixed to various values in an attempt to see whether the metallicity could be well constrained.
5. Flat model. In this retrieval, no molecular opacities were included. Instead, the only fitted parameters were the planet's temperature and radius as well as the pressure of a gray cloud deck. CIA and Rayleigh scattering were also included. To quantify the significance of our molecular detections, we compare the Bayesian evidence (Kass & Raftery 1995) from each of the retrievals to this flat model. We use this as a baseline from which to calculate the significance of any apparent atmospheric detections.
In each free chemistry case, all molecular abundances were allowed to vary from log(VMR) = −1 to log(VMR) = −12, where VMR is the volume-mixing ratio. Higher mixing ratios are not expected in the majority of these atmospheres and these would also necessitate accounting for self-broadening of the molecular lines (Anisman et al. 2022a, 2022b). For the equilibrium chemistry retrievals, the metallicity was allowed to vary from 0.1 to 100 and the C/O ratio had bounds of 0.1 and 2. For the Mie clouds we followed the methodology of Tsiaras et al. (2018), who found that the uncertainty induced by either varying or fixing Q_0 is negligible given the quality of the data at hand, and fixed Q_0 to 50. We set a log-uniform prior of χ_0 ranging from 10^−40 to 10^−10, a particle size from 10^−5 to 10 μm, and a cloud-top pressure from 10^−4 to 10^6 Pa (Lee et al. 2013).
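The prior ranges quoted above can be collected into a simple prior transform of the kind used by nested samplers, which map draws from the unit hypercube onto the parameter bounds. The sketch below is illustrative only: the dictionary keys and the function name are hypothetical, and it assumes that the particle size and cloud-top pressure are sampled log-uniformly in the same way as χ_0. It is not the Alfnoor or TauREx 3 interface.

```python
# Bounds in log10 units, taken from the ranges quoted in the text.
# Key names are hypothetical labels chosen for this sketch.
LOG10_PRIOR_BOUNDS = {
    "log10_vmr": (-12.0, -1.0),              # any trace-gas volume-mixing ratio
    "log10_chi0": (-40.0, -10.0),            # Mie cloud mixing parameter
    "log10_particle_size_um": (-5.0, 1.0),   # assumed log-uniform
    "log10_cloud_top_pa": (-4.0, 6.0),       # assumed log-uniform
}


def prior_transform(unit_cube):
    """Map unit-cube draws to log10 parameter values, in dictionary order."""
    values = {}
    for u, (name, (low, high)) in zip(unit_cube, LOG10_PRIOR_BOUNDS.items()):
        values[name] = low + u * (high - low)
    return values


example = prior_transform([0.0, 1.0, 0.5, 0.5])
```

A uniform draw in log10 space is equivalent to a log-uniform prior on the underlying quantity, which is why only a linear rescaling is needed here.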
For each planet, the equilibrium temperature was calculated from the host star's temperature, T_s, the host star's radius, R_s, and the planet's semimajor axis, a, with the albedo and heat redistribution factor set to A = 0.2 and e = 0.8, respectively. An isothermal temperature-pressure profile was assumed. While this is an oversimplification and can lead to retrieval biases (Rocchetto et al. 2016), the restrictive wavelength range does not allow for the differentiation of an isothermal from a more complex profile. The bounds on the retrieved temperature were set to ±500 K of the planet's equilibrium temperature while the planet's radius was allowed to vary between ±50% of its literature value. The planet's mass was fixed to the values in Tables 7 and 9 (Changeat et al. 2020c). Finally, we explored the parameter space using the nested sampling algorithm MultiNest (Feroz et al. 2009; Buchner et al. 2014) with 1000 live points and an evidence tolerance of 0.5.

Tsiaras et al. (2018) introduced the atmospheric detectability index (ADI), which compares the Bayesian evidence of an atmospheric retrieval to a flat model. Using this metric, they concluded that 16 of the 30 planets analyzed had detectable atmospheres before searching for trends with different bulk parameters, finding a correlation with the planet's radius. Given that we have taken the planets studied in Tsiaras et al. (2018) and expanded upon their sample, we also explored the detectability of atmospheres and searched for links with planet parameters.
We used the Bayesian evidence to determine the preferred atmospheric model, comparing this to the evidence from the flat model to calculate the significance of any atmospheric detection. Rather than using the ADI, we transformed the difference in the Bayesian evidence into a sigma detection and used these values to search for trends with planetary parameters. However, we note that these systems of identifying atmosphere detections are analogous, with an ADI of 3 being equivalent to a 3σ detection (Tsiaras et al. 2018).
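The text does not spell out how the difference in log-evidence is mapped to a sigma value. One widely used conversion in the exoplanet retrieval literature calibrates the Bayes factor B against a p-value through B = −1/(e p ln p) and then expresses p as a two-sided Gaussian significance. The sketch below implements that conversion as an assumption, not as a statement of the authors' exact procedure; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def log_bayes_factor_to_sigma(ln_b):
    """Convert ln(Bayes factor) to an equivalent two-sided Gaussian sigma.

    Assumes the calibration B = -1 / (e * p * ln p), valid for p < 1/e.
    """
    if ln_b <= 0.0:
        return 0.0  # the more complex model is not preferred
    # With u = ln(p), the calibration reads ln B = -1 - u - ln(-u).
    # The right-hand side decreases monotonically in u on (-inf, -1),
    # reaching 0 at u = -1, so bisection finds the unique root.
    lo, hi = -700.0, -1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if -1.0 - mid - math.log(-mid) > ln_b:
            lo = mid  # implied evidence too strong, so p must be larger
        else:
            hi = mid
    p_value = math.exp(0.5 * (lo + hi))
    # Two-sided significance: p = 2 * (1 - Phi(sigma)).
    return -NormalDist().inv_cdf(0.5 * p_value)


sigma_for_ln_b_3 = log_bayes_factor_to_sigma(3.0)
```

Under this calibration a log-evidence difference of 3 corresponds to a significance just under 3σ, consistent with the equivalence noted above between an ADI of 3 and a 3σ detection.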

Search for Atmospheric Trends
Large gaseous planets are thought to initially form via solid core accretion before undergoing runaway gas accretion and, in the case of the planets studied here, migration is also likely (Mizuno 1980; Bodenheimer & Pollack 1986; Ikoma et al. 2000). In the core-accretion model, lower-mass planets are incapable of accreting substantial gaseous envelopes, instead preferentially accreting higher-metallicity solids (Mordasini et al. 2012; Fortney et al. 2013). Therefore the metallicity, the ratio of the elements heavier than helium to all the elements, can act as a key test of this theory, and studies of the methane content of the gaseous planets within our own solar system are in agreement with the predictions of the core-accretion scenario. Over the last decade, exoplanet observations have expanded the search for a mass-metallicity trend to other planetary systems. Previous observational studies with the HST have found some indications of a mass-metallicity trend within exoplanets' atmospheres (e.g., Wakeford et al. 2017b; Welbanks et al. 2019). Furthermore, by comparing the bulk characteristics of exoplanets to structural evolution models, there is evidence that an exoplanet mass-metallicity trend is likely but could differ from that seen in our own solar system (Thorngren et al. 2016).
The targets studied here cover a wide mass range, from the ultra-low-density Kepler-51 b (M = 0.0166 M_J) to the brown dwarf KELT-1 b (M = 27.23 M_J). We utilized both our free chemistry retrievals, and those conducted assuming equilibrium chemistry, to search for an enrichment trend. The planet metallicities extracted by the chemical equilibrium retrievals were compared to the trend found in Thorngren et al. (2016). They found the strongest correlation was not between the planet's mass and the planet's metallicity but between the mass and the ratio of the planet-to-star metallicity. Hence we, like them, calculated the host star metallicity from

$$Z_s = 0.014 \times 10^{[\mathrm{Fe/H}]}$$

using the Fe/H values given in Tables 6 and 8. We then divided our retrieval metallicities by these values to ascertain the ratio of the metallicities and search for a trend against the planet's mass.
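The stellar metallicity step can be sketched as follows, assuming the relation of Thorngren et al. (2016), Z_s = 0.014 × 10^[Fe/H], in which 0.014 is the solar heavy-element mass fraction. The function names are hypothetical, and the second function further assumes that the retrieved metallicity is expressed in units of the solar value, so that the planet-to-star ratio reduces to a division by 10^[Fe/H]; the paper's exact bookkeeping may differ.

```python
Z_SUN = 0.014  # assumed solar heavy-element mass fraction


def stellar_metal_fraction(fe_h):
    """Heavy-element mass fraction of the host star from its [Fe/H] (dex)."""
    return Z_SUN * 10.0 ** fe_h


def planet_to_star_metallicity(retrieved_metallicity_solar, fe_h):
    """Ratio Z_planet / Z_star for a retrieved metallicity in solar units.

    Assumes Z_planet = Z_SUN * retrieved_metallicity_solar.
    """
    z_planet = Z_SUN * retrieved_metallicity_solar
    return z_planet / stellar_metal_fraction(fe_h)


# Example: a 10x solar atmosphere around a star with [Fe/H] = +0.3.
ratio_example = planet_to_star_metallicity(10.0, 0.3)
```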
In the free chemistry case, we attempted to constrain a multitude of molecular species. However, due to the wavelength coverage of HST WFC3 G141, only the abundance of water can be convincingly constrained in each case. Taking the water abundance from the preferred atmospheric model (i.e., one with or without optical absorbers) we followed the methodology of Welbanks et al. (2019) to get the ratio of water-to-hydrogen with respect to solar values. For each planet, the expected water-based metallicity was determined by computing the theoretical abundance at 1 × 10^4 Pa (0.1 bar) in thermochemical equilibrium, assuming C/O = 0.54 and a metallicity equivalent to that of the host star (Fe/H). The expected water-to-hydrogen ratio was then compared to the retrieved one to give the relative level of enrichment.
In addition to searching for trends with mass, we also investigated the dependence of the retrieved abundances of molecules on temperature. For these, we computed the expected abundance of H2O, CH4, TiO, VO, FeH, and H− using GGchem over a temperature grid of 100-3000 K at pressures between 1 × 10^2 and 1 × 10^5 Pa (1 × 10^−3 to 1 bar).

Bayesian Hierarchical Modeling
So far, atmospheric retrievals have been limited to a case-by-case basis, where each observation yields its own atmospheric parameters of interest (such as the molecular abundances in the atmosphere). With 70 observations available in our study, we would like to seek trends within our sample. The conventional approach is to fit a trend to a set of error bars, where the mean and sigma values are those computed from the individual posterior distributions. The mean and sigma fall short when attempting to capture the statistics presented by the rich and often non-Gaussian posterior distribution. Bayesian hierarchical modeling (BHM) is a principled way to estimate the (hyper)parameters of the trends that may exist within a population. BHM does this by first treating the posterior distribution from each observation as a submodel and then using these submodels together to infer the hyperparameters of the global trend across different data sets. The multistage approach accounts for the planet-to-planet variability presented in each observation and properly propagates the uncertainty from each observation to the next layer in the hierarchical model (Gelman et al. 2013).
While Bayesian retrievals have been common in the exoplanetary field for some time, and are now the standard methodology, BHM has not been so widely utilized, perhaps partly due to the general lack of a sufficient number of data sets. However, a number of studies have employed it (e.g., Hogg et al. 2010; Wang et al. 2011; Wolfgang et al. 2016), including works focused on seeking trends in exoplanet atmospheres (Keating & Cowan 2022; Lustig-Yaeger et al. 2022).
When searching for temperature-related trends using the abundances from our retrievals we compared two models: a linear trend and a flat trend (i.e., a null hypothesis). We again utilized MultiNest for this fitting and used the Bayesian evidence from these fits to determine which gave the best representation of the data. We discuss our implementation of BHM in more depth in Appendix C.
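To make the hierarchical step concrete, the sketch below evaluates a population-level log-likelihood for a linear trend by averaging a Gaussian population density over each planet's posterior samples, in the spirit of importance-sampling approaches such as Hogg et al. (2010). It is a simplified illustration, not the implementation described in Appendix C: it assumes flat interim priors on the retrieved log-abundance (so the importance weights are constant), a Gaussian intrinsic scatter, and toy samples invented here for demonstration. Setting the slope to zero gives the flat (null) model.

```python
import math


def log_population_likelihood(samples_per_planet, temperatures,
                              intercept, slope, scatter):
    """Log-likelihood of trend hyperparameters given per-planet posterior samples."""
    total = 0.0
    norm = scatter * math.sqrt(2.0 * math.pi)
    for samples, temp in zip(samples_per_planet, temperatures):
        mean = intercept + slope * temp
        # Average the population density over this planet's posterior samples.
        densities = [math.exp(-0.5 * ((s - mean) / scatter) ** 2) / norm
                     for s in samples]
        # Floor avoids log(0) when every sample is far from the trend.
        total += math.log(max(sum(densities) / len(densities), 1e-300))
    return total


# Toy posterior samples of log10(abundance) scattered around a known trend.
temps = [1000.0, 1500.0, 2000.0]
toy_samples = [[-8.0 + 0.002 * t + d for d in (-0.1, 0.0, 0.1)] for t in temps]

ln_l_linear = log_population_likelihood(toy_samples, temps, -8.0, 0.002, 0.5)
ln_l_flat = log_population_likelihood(toy_samples, temps, -5.0, 0.0, 0.5)
```

In the analysis described in the text, the hyperparameters are not fixed in this way: they are sampled with MultiNest and the linear and flat models are compared through their Bayesian evidence.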

Results
For each transmission spectrum analyzed, we determined the best-fitting models using the Bayesian evidence of our retrievals. For the free chemistry cases, four models were compared as well as a flat model. These spectra, and their best-fitting models, are shown in Figure 2. In each case, two or three models are shown: the flat model, the preferred free chemistry model that does not include optical absorbers, and, for planets above 1500 K, the preferred free chemistry model that does include optical absorbers. The preferred overall model is shown by the solid line while dashed lines show the other models. In the following sections, we place the results of these retrievals in the context of the findings of previous studies.

Atmospheric Detectability
We find that, of our sample of 70 planets, 37 have strong evidence (>3σ) of atmospheric modulation. We note that, for several planets (KELT-1 b, HAT-P-2 b, and WASP-18 b) the error bars are too large to detect even a completely clear atmosphere. Therefore, we conclude that, of those for which an atmosphere could have been statistically detected, we find evidence for atmospheric detections on 57% of the population studied here, similar to the 53% from Tsiaras et al. (2018).
To explore the concept further, we computed the expected signal-to-noise ratio (S/N) on a single scale height of atmospheric signal for each planet, comparing this to the achieved detection significance. We find that while there is evidently a correlation between the anticipated S/N and the significance of the detection, for some planets no detection is made even though the precision of the observations is high enough to expect one. GJ 1214b provides the perfect example of this as it has the highest predicted S/N based on a H/He-dominated atmosphere yet there is not strong evidence for spectral modulation due to an atmosphere (Kreidberg et al. 2014b).
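The expected signal from one scale height can be estimated with the standard approximations H = k_B T / (μ m_u g) and ΔD ≈ 2 R_p H / R_s², with the S/N then given by ΔD divided by the transit-depth uncertainty. The sketch below uses these textbook expressions as an assumption, since the exact calculation is not given in the text. The mean molecular weight is derived from the He/H2 = 0.17 ratio adopted in the retrievals, and the example inputs are round, Jupiter-like values chosen here for illustration.

```python
K_B = 1.380649e-23        # Boltzmann constant, J/K
M_U = 1.66053906660e-27   # atomic mass unit, kg

# Mean molecular weight (amu) of an H2/He mix with He/H2 = 0.17 by number.
MU_H_HE = (2.016 + 0.17 * 4.0026) / 1.17


def scale_height_m(temperature_k, gravity_ms2, mu_amu=MU_H_HE):
    """Atmospheric scale height in metres."""
    return K_B * temperature_k / (mu_amu * M_U * gravity_ms2)


def one_scale_height_depth(r_planet_m, r_star_m, temperature_k, gravity_ms2):
    """Approximate change in transit depth from one scale height of atmosphere."""
    h = scale_height_m(temperature_k, gravity_ms2)
    return 2.0 * r_planet_m * h / r_star_m ** 2


def scale_height_snr(depth_signal, depth_uncertainty):
    """S/N on one scale height given the per-channel depth uncertainty."""
    return depth_signal / depth_uncertainty


# Illustrative hot-Jupiter-like inputs (not values from the paper's tables).
h_example = scale_height_m(1500.0, 10.0)
signal_example = one_scale_height_depth(7.1492e7, 6.957e8, 1500.0, 10.0)
snr_example = scale_height_snr(signal_example, 50e-6)  # 50 ppm uncertainty
```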
In Figure 3 we also look to see if planet radius, temperature, and surface gravity have an effect on the chances of detecting an atmosphere. We see that large (>1 R_J), hotter (>1000 K) planets generally have a better chance of a detection, even if the S/N is low. Meanwhile, for cooler (<1000 K), smaller (<1 R_J) planets, a nondetection at a high S/N is more prevalent. When comparing the detection rate for planets with similar temperatures but with different surface gravities, there is some indication that those with a lower gravity more regularly have atmospheric detections. However, it is clearly correlated with the expected S/N, which is generally larger for those with a smaller surface gravity within our sample. The effect of the differing precision in the data with respect to the atmosphere's size therefore makes it difficult to distinguish if this is indeed due to this bulk parameter.

Search for Trends between Chemistry and Temperature
We sought to go beyond the work of Tsiaras et al. (2018) by also searching for trends in the chemistry, not just the atmospheric detectability. The planets studied in this work vary in equilibrium temperature by over 2000 K and, as temperature and chemistry are unequivocally intertwined, we searched for evidence of this within our data. Figure 4 shows the retrieved abundances for a number of absorbers, each plotted against temperature. Comparing the findings of our retrievals to chemical equilibrium models with GGchem, we notice a number of things.
First, the retrieved water abundance is almost always above that which is predicted. Indeed, many of the abundances are in fact constrained by our priors, with the upper bound placed at a volume mixing ratio of 10%. These high water abundances contradict the findings of Changeat et al. (2022), where this molecule was generally found to be subsolar. Second, methane is never constrained, despite being predicted at abundances that would be detectable with HST WFC3 at cooler temperatures. An absence of methane despite it being predicted has also been found by previous works (e.g., Benneke et al. 2019; Anisman et al. 2020; Carone et al. 2021; Baxter et al. 2021), and clouds, which are found across the majority of the planets studied here, have been suggested as a mechanism for methane depletion (Molaverdikhani et al. 2020). Additionally, we rarely find evidence for HCN and NH3, with two of the three "detections" of the latter species being questionable due to the large abundance in the case of K2-24 b and the high equilibrium temperature in the case of WASP-121 b. We do detect NH3 in the atmosphere of HD 106315c (T_eq ∼850 K), a result which has also previously been found by Guilluy et al. (2021), although they note that the model with NH3 is only preferred at <2σ over one without this molecule.
For the optical absorbers we considered, there are a number of planets for which models with these species are preferred, and a couple where their abundances are constrained in our retrievals. While no planet had a lower 1σ abundance limit of TiO greater than 10^−9, we see evidence for VO in WASP-103 b. As in other low-resolution studies of the planet (Skaf et al. 2020), a large abundance of FeH was found in the atmosphere of WASP-127 b. However, compared to Skaf et al. (2020), we do not find as high abundances of FeH for WASP-62 b and WASP-79 b, potentially due to the inclusion of the H− opacity, as noted for WASP-79 b in Rathcke et al. (2021). For this absorption we followed the procedure described in Edwards et al. (2020b) and retrieved the abundance of e−, with the strongest constraints on this species being in the atmospheres of KELT-7 b, WASP-12 b, WASP-79 b, WASP-103 b, and WASP-178 b, all of which are planets with equilibrium temperatures higher than 1800 K. The abundances predicted with GGchem for TiO, VO, and FeH are relatively low and potentially below the detection limit of HST WFC3 G141. Hence, any trend in retrieved abundances is hard to draw out. However, for e−, the predicted abundance increases strongly after 1500 K and the retrieved abundances for the hottest planets appear to follow this trend.
In Figure 4 we also show the results of our fittings with BHM. In each case, the best-fit linear model is shown as well as the traces that were within 1σ of this. We computed the significance of these trends by comparing the Bayesian evidence of this fit to one without a slope (i.e., a model of constant abundance with temperature). Of all the species, only for e− did the fitting with a slope provide a preferable fit to the data and, even in this case, the significance was relatively low (2.28σ). The emergence of the H− ion comes from the thermal dissociation of H2 at high temperatures. As H2O can also thermally dissociate, albeit at higher temperatures than H2, we compared our retrieved abundances of H2O and e− to explore whether a correlation can be seen in the data. We plot these in Figure 5, in which a general trend of decreasing water abundance with increasing e− can be seen. The overplotted models are from GGchem, showing the expected trend in the data for atmospheres of 1×, 10×, and 100× solar metallicity. These again show that the water abundance retrieved is generally supersolar.

Constraints on Formation via Elemental Ratios
The ratio of elements within the atmospheres of gaseous planets should provide an indication of the formation and evolutionary processes that have shaped the world into what we observe today. In particular, the metallicity and the ratio of carbon-to-oxygen have been proposed as key tracers of where and how giant planets collect gas and solids in evolving protoplanetary disks (e.g., Öberg et al. 2011; Mordasini et al. 2016; Booth et al. 2017; Brewer et al. 2017; Madhusudhan et al. 2017; Eistrup et al. 2018; Turrini et al. 2018; Cridland et al. 2019; Shibata et al. 2020). Hence, in our chemical equilibrium retrievals, we attempted to constrain both the metallicity and C/O ratio of the planets.
In Figure 6, we show the retrieved C/O ratio against the retrieved metallicity as well as the reduced semimajor axis of the planet. We note that the C/O ratio is generally poorly constrained and thus a conclusive trend cannot be extracted. Such a result is expected given that the wavelength coverage of WFC3 G141 offers only very limited sensitivity to carbon-bearing species. While the data sets analyzed here offer the possibility of detecting and constraining CH₄ and HCN, the latter of which has been proposed as an indicator of high-C/O atmospheres (Venot et al. 2015), determining the presence and abundance of CO and CO₂ is much harder as their features are weak in this band. The study of KELT-11 b by Changeat et al. (2020b) provides a good example of the complexity of constraining carbon-bearing species using only data from HST WFC3 G141.
Figure 3. Top: the signal-to-noise ratio (S/N) on the expected spectral modulation due to 1 scale height of a H/He-dominated atmosphere against the significance of the atmospheric detection from this work. While there is a positive correlation, in some cases no atmosphere is detected despite very precise observations. Middle: scatter plot of planet radius against planet temperature, where the color of the points indicates the significance of the atmosphere detection and the size shows the S/N on 1 scale height of atmosphere. Nondetections when detections might be expected happen more often for cooler, smaller planets, while large, hot planets generally have a better chance of an atmospheric detection, even at low S/Ns. Bottom: scatter plot of planet surface gravity against planet temperature, where the color and size again represent the detection significance and S/N, respectively. No clear trend can be distinguished given that the variance in the precision of the data will affect the significance of the detection.
For the lower four plots, black data points indicate planets for which the retrieval model with these optical absorbers is preferred, while gray points represent those for which the fit is preferable without them. In all cases, abundances are only plotted if the associated models provided a >3σ detection compared to a flat model. The thick colored line on each plot indicates the linear trend from the BHM, while the thinner colored lines represent the traces from the fit that were within the 1σ errors of the best-fit model. Only the fit to the abundance of e− gave a Bayesian evidence which was greater than that of the null hypothesis.
Despite the large uncertainties on the retrieved C/O ratio, we noted that the majority of planets were not consistent with C/O ratios larger than 1. However, the constraints are not precise enough to distinguish between different formation scenarios, which generally predict values between 0.5 and 0.9 (e.g., Turrini et al. 2018). Hence we can only conclude that HST WFC3 G141 data alone are not enough to accurately determine elemental ratios and thus confidently distinguish between formation scenarios. Furthermore, we note that work by Turrini et al. (2021) and Pacetti et al. (2022) suggests other ratios may be more important. These definitely cannot be constrained with HST WFC3 G141 data alone but may be achievable with data where a wider spectral coverage has been achieved with a single instrument (e.g., Gardner et al. 2006; Tinetti et al. 2018; Edwards et al. 2019).

Search for a Mass-Metallicity Trend
Several previous studies of exoplanetary atmospheres have sought to find trends between the mass of the planet and the metallicity of its atmosphere (e.g., Wakeford et al. 2017b; Welbanks et al. 2019). Here we explore this across our population in two ways: first, using the GGchem chemical equilibrium retrievals where the metallicity is a fitted parameter and, second, using the methods employed in Welbanks et al. (2019). In Figure 7, we compare our results to those of Thorngren et al. (2016), where the plotted parameter is the ratio of the planet's metallicity to the star's. We find that our retrieved planet metallicities lead to ratios which are generally far above those found by Thorngren et al. (2016). While the best-fit model when fitting a linear trend led to a negative slope, the BHM analysis did not find strong evidence for a mass-metallicity trend, as the null hypothesis (i.e., a model of constant metallicity with mass) is preferred when fitting a trend to the retrieved metallicities. We give the preferred models, and associated Bayesian evidence, in Table 4.
Meanwhile, in Figure 8 we show the metallicity derived from the H₂O abundance, overplotting the data from Welbanks et al. (2019) as well as their best fit to the data for comparison. To maintain consistency with their study, we also plot the results of retrievals where the detection significance is 2-3σ. However, their trend is evidently not reflected in our data. All planets studied in Welbanks et al. (2019) are also included in our population, and one can also notice differences in masses, depending upon which reference studies were used, as well as in the retrieved water-to-hydrogen ratios. The latter of these could be due to a number of factors, but the major cause is likely to be a difference in the data sets used. Welbanks et al. (2019) used data from a variety of instruments, while we used only HST WFC3 G141. Each instrument gives access to differing absorption features and thus provides the observer with different sensitivities. Observing the same planet but with different instruments could provide contrasting findings on its composition. The same is therefore true when observing a selection of planets with disparate data sets, and so the trend they see may be caused by these differing sensitivities to given molecules.
Figure 6. The retrieved carbon-to-oxygen ratio for planets within our sample against the retrieved metallicity (top) and planet's semimajor axis (bottom). We note that the C/O ratio is poorly constrained in most cases due to the narrow wavelength coverage offered by HST WFC3 G141. In both plots, the dotted line highlights the prior range assuming a solar metallicity. The dashed line indicates the upper bound on the metallicity used for the planet around the star with the highest metallicity. The dashed-dotted line is the lower bound for the planet around the lowest-metallicity star.
On the other hand, the lack of an obvious trend derived here could well imply biases in our retrievals, with the high water abundances derived, and the large planet-to-star metallicity ratios, potentially providing evidence for this. Alternatively, we could conclude that HST WFC3 alone is simply not sensitive enough to accurately constrain atmospheric metallicities, even in the form of the H₂O abundance, as the uncertainty on this parameter is generally very high. Our BHM analysis found that, as with the GGchem metallicities, the null hypothesis is preferred over a mass-metallicity trend, with the results given in Table 5. Hence, we find no evidence for a mass-metallicity trend within our data sets.

Comparison of Free Chemistry and Chemical Equilibrium Retrievals
To assess which models may be best for fitting the population as a whole, we compared the Bayesian evidence of our free chemistry and chemical equilibrium retrievals. The latter has fewer free parameters and thus is penalized less, while the former is more flexible as the relative abundances of molecules are completely uninhibited. By comparing the evidence for both, we find that both provide fits of a similar quality for lower temperatures but, for planets above 1500 K, the free chemistry model often provides a statistically preferable fit to the data, as shown in Figure 9. Such a finding could be suggestive of disequilibrium chemistry, a claim which has also been made in previous studies (e.g., Baxter et al. 2021; Roudier et al. 2021; Keating & Cowan 2022). However, the comparable quality of the two fits at lower temperatures is at odds with the dearth of methane detected in our free chemistry retrievals. Given that equilibrium models predict significant amounts of methane at these lower temperatures, it is strange to find they fit the data as well as models which do not infer the presence of this molecule. It also contrasts with the results from Baxter et al. (2021), whose analysis of Spitzer data suggested that cooler planets were not in chemical equilibrium.
[...] Thorngren et al. (2016) also shown. Top: BHM results when only using retrievals which give a >3σ atmospheric detection. Bottom: BHM results when using retrievals which give a >2σ atmospheric detection. In both cases, the BHM found a best-fit trend line which had a negative slope. However, in both cases, the null hypothesis (metallicity is not dependent upon mass) was preferred.

Amplitude of Absorption Features
The key spectral feature within the HST WFC3 G141 band is the 1.4 μm water feature. Instead of performing retrievals in an attempt to recover the abundance of this molecule, several studies have measured the size of the feature in relation to other bands within WFC3's spectral range (e.g., Stevenson 2016; Fu et al. 2017; Wakeford et al. 2019; Dymont et al. 2022). We fitted the feature using the process described in Appendix C.
The 1.4 μm feature size was utilized by Crossfield & Kreidberg (2017) to infer a number of trends within the atmospheres of sub-Neptunes based on the spectra of six planets. Using the most massive planet from their sample, HAT-P-11 b (25.7 M⊕), as the upper boundary in terms of mass, and the largest of their sample, HAT-P-26 b (6.3 R⊕), as the radius limit, we have extended this to 16 gas dwarf planets. The two key correlations found by Crossfield & Kreidberg (2017) were with the planet's H/He mass fraction and its equilibrium temperature. One interpretation of the trend seen with the former would be that planets with a low H/He mass fraction would have a high mean molecular weight, leading to a smaller scale height than predicted, as the prediction was calculated assuming μ = 2.3. Meanwhile, a reduction in the size of the H₂O amplitude with decreasing temperature was postulated to be due to hazes for cooler (<850 K) planets.
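To make the dependence on mean molecular weight concrete, the following minimal sketch computes the pressure scale height H = k_B T / (μ m_u g) and the standard approximation for the change in transit depth per scale height, 2 R_p H / R_s². The planet and star values are hypothetical and chosen purely for illustration; they are not taken from the sample.

```python
K_B = 1.380649e-23       # Boltzmann constant [J/K]
AMU = 1.66053906660e-27  # atomic mass unit [kg]
R_EARTH = 6.371e6        # Earth radius [m]
R_SUN = 6.957e8          # solar radius [m]


def scale_height(temperature_k, gravity_ms2, mu=2.3):
    """Pressure scale height in metres for mean molecular weight mu."""
    return K_B * temperature_k / (mu * AMU * gravity_ms2)


def depth_per_scale_height(planet_radius_m, star_radius_m, height_m):
    """Approximate change in transit depth caused by one scale height."""
    return 2.0 * planet_radius_m * height_m / star_radius_m ** 2


def feature_size(delta_depth, planet_radius_m, star_radius_m, height_m):
    """Express a measured feature amplitude in units of scale heights."""
    return delta_depth / depth_per_scale_height(
        planet_radius_m, star_radius_m, height_m
    )


if __name__ == "__main__":
    # Hypothetical sub-Neptune around a K dwarf (illustrative values only).
    radius_p, radius_s = 4.0 * R_EARTH, 0.7 * R_SUN
    for mu in (2.3, 5.0, 18.0):
        h = scale_height(700.0, 10.0, mu)
        ppm = 1e6 * depth_per_scale_height(radius_p, radius_s, h)
        print(f"mu = {mu:5.1f}: H = {h / 1e3:6.1f} km, {ppm:5.1f} ppm per H")
```

Because H scales as 1/μ, a feature measured in units of the μ = 2.3 scale height appears smaller than predicted whenever the true mean molecular weight is higher.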
We updated these plots but find these correlations no longer hold, as shown in Figure 10. We now find a decreasing feature size with increasing H/He mass fraction, a result which is counterintuitive. Furthermore, we find that, at temperatures cooler than 550 K, the feature size increases with decreasing temperature, a result that concurs with work done by Kawashima et al. (2019). We highlight the planets that were in the Crossfield & Kreidberg (2017) sample and suggest the trends were seen due to a selection bias. We find that these trends are driven significantly by GJ 1214b, as its feature size is extremely well constrained but also very close to zero.
We also attempted to draw out other trends by looking across two parameters simultaneously, but found no significant correlations. From this we infer that either (a) no simple correlations exist, or (b) the sample is not large enough, or has not been selected carefully enough, to allow for any trends to be teased out. Such a finding highlights the importance of a structured, hypothesis-based selection of planets when attempting population studies and the need for dedicated exoplanet atmospheric survey missions (e.g., Twinkle, Ariel).
We extended the analysis of the feature size to all planets in our study, and Figure 11 displays the recovered 1.4 μm amplitude against temperature. Previous studies have suggested that the surface gravity of a planet, along with its temperature, affects the cloud coverage, which in turn would affect the WFC3 feature size (e.g., Stevenson 2016; Bruno et al. 2018). Hence, we divided the population up by its surface gravity to search for such a trend. While some differences in feature size are seen in planets with similar temperatures, the surface gravity does not appear to be driving this in all cases: in some, the feature size is identical, while, where there are differences, these are not universal. In general, the distribution of feature size is similar to those of previous studies (e.g., Fu et al. 2017), with an increase at around 1200 K. In Figure 11, we also plot the retrieved cloud pressure. Here we see that, in the same temperature range in which the feature size increases, the cloud pressure retrieved is deeper in the atmosphere. Therefore, the knowledge gained by performing the spectral feature analysis is also readily available from retrieval analyses.
The spectral feature size has often been used to infer the presence of clouds and has been proposed as a way of guiding observers as to the spectral modulation that could be expected when planning future observations. Across the population, we recover an average (weighted mean) feature size of 0.81 scale heights, but we note that the median value is much higher (1.37 scale heights). Such a low value for the weighted mean is driven by the feature size of GJ 1214b (0.12 ± 0.04), and if one removes this planet from the weighted mean calculation, an average value of 1.21 scale heights is obtained. All these values are comparable to those of previous studies: 1.4 H (Fu et al. 2017) and 0.89 H (Wakeford et al. 2019). As demonstrated in Figure 11, the amplitude of this feature is far below what would be expected from a clear, solar-metallicity atmosphere. There would also appear to be some correlation with temperature, with hotter planets (T > 1500 K) generally having larger feature sizes than cooler ones (T < 700 K). Additionally, we see a trend at around 1200 K where the feature size increases before decreasing again. A similar feature was found by Fu et al. (2017), and our retrievals find these planets have a lower cloud-top pressure (see Figure 11).
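The sensitivity of a weighted mean to a single very precise measurement can be illustrated with a short sketch. Inverse-variance weighting is assumed here, since the weighting scheme is not specified in the text, and every value other than the GJ 1214b feature size (0.12 ± 0.04) is hypothetical.

```python
from statistics import median


def weighted_mean(values, errors):
    """Inverse-variance weighted mean (an assumed weighting scheme)."""
    weights = [1.0 / err ** 2 for err in errors]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Feature sizes in scale heights. Only the first entry (GJ 1214b) comes
# from the text; the remaining four are made-up illustrative values.
sizes = [0.12, 1.5, 1.2, 2.0, 0.9]
errors = [0.04, 0.4, 0.3, 0.6, 0.35]

mean_all = weighted_mean(sizes, errors)
mean_without = weighted_mean(sizes[1:], errors[1:])

if __name__ == "__main__":
    print(f"weighted mean (all planets):    {mean_all:.2f}")
    print(f"weighted mean (without first):  {mean_without:.2f}")
    print(f"median (all planets):           {median(sizes):.2f}")
```

The tightly constrained, near-zero point dominates the weights and pulls the weighted mean far below the median, which is the same qualitative behavior described above for the real sample.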
Extending the models used to derive the 1.4 μm feature size in the HST WFC3 G141 data, we estimate the amplitude of [...]
Notes. Top: using retrievals where the GGchem model provided a >3σ detection of an atmosphere. Bottom: using retrievals where the GGchem model provided a >2σ detection of an atmosphere. Neither linear model, fitted in log-log space, is preferred to the null hypothesis (i.e., no trend between mass and metallicity).
[...] Welbanks et al. (2019) are also shown, as are their best-fit trends. Top: BHM results when only using retrievals which give a >3σ atmospheric detection. Bottom: BHM results when using retrievals which give a >2σ atmospheric detection. In the former case, the BHM found a best-fit trend line which had a negative slope, while the latter had a positive slope. However, in both cases the null hypothesis (metallicity is not dependent upon mass) was preferred. Additionally, either sign of slope, or indeed no slope, is also within the 1σ errors of each fit.
Therefore, while clouds and hazes obviously need to be accounted for during the planning of observations with future facilities, current data show the expected amplitude should, on average, be greater than a single scale height. However, we note also that the methodology used to measure the amplitude of absorption features is somewhat flawed. While the parameters in Equation (12) allow the spectrum to be modulated to fit the data, and account for the features seen, the final fit does not provide a robust analysis of the nature of the atmosphere: this can only be achieved by running a Bayesian retrieval which models the passage of starlight through the atmospheric layers to explain the spectrum via base atmospheric properties such as temperature, composition, and clouds.
Hence, for comparison, we also took our preferred spectral retrieval models and computed the amplitude of features seen across these wavelength bands. For the spectral ranges 0.6-2.8 μm, 0.5-5.3 μm, and 0.5-7.8 μm, we found median feature sizes of 3.32, 3.66, and 4.38 scale heights, respectively. These are roughly 1 scale height larger than the feature amplitude fit suggests. However, we note that these predictions may also be biased as some species, such as CO₂, are not constrained by our Hubble WFC3 observations. Therefore, the best-fit value of these is essentially an average of the prior range. When considering the spectrum across these longer wavelengths, this may induce features which are not present if the molecules actually exist at values lower than the value "retrieved." In any case, both models point toward features several scale heights in amplitude being observable with future facilities, largely due to the strong absorption that occurs at longer wavelengths. As data are collected with these facilities, our understanding of the cloudiness of extrasolar planets will be enhanced and therefore should allow us to more confidently predict the data quality. Nevertheless, we will never truly know until we look.

Discussion and Conclusions
The HST has been at the forefront of exoplanet atmospheric characterization over the last two decades. While many different instruments on this facility have been used, WFC3 has perhaps been the most widely utilized due to its sensitivity to water. In this work we have presented a population study of atmospheres, each studied with the WFC3 G141 grism as the planet transits its host star.
Of the 70 planets studied, we found strong evidence (>3σ) for atmospheric features on 37 of them, with some evidence (2-3σ) for spectral modulation on an additional 14 planets. We note that for several planets (e.g., WASP-18 b), the derived spectrum has error bars that are several scale heights in size, meaning no atmospheric constraints could be expected. As noted by other studies, clouds are ever present and are muting the size of the features seen. While clouds should certainly not be discounted from observational planning, future instruments will probe longer wavelengths, where the absorption of molecules such as H₂O and CO₂ is stronger, and so may not be as affected, particularly given that the S/Ns should be higher for these data sets.
Figure 11. Size of the 1.4 μm feature against planet temperature and subdivided by the planet's surface gravity. At lower temperatures, the surface gravity may have an impact on the cloudiness of the planet, and thus the feature size, but at higher temperatures there appears to be no obvious correlation. Gray regions highlight the expected feature size for solar-metallicity atmospheres for a given cloud pressure and a surface gravity between 10 and 20 m s⁻². Across the entire population, the mean 1.4 μm feature size is 0.81 scale heights in the HST/WFC3/G141 band, while the median is 1.37. The feature size can be directly correlated to the retrieved cloud pressure from our retrievals (bottom), and we note that the 1.4 μm feature size is not a good predictor of the feature size across wider wavelength ranges due to the stronger absorption seen at longer wavelengths. The priors for the cloud pressure in our retrievals are shown by the dotted lines.
In this work we have largely struggled to draw out trends in the population, despite employing BHM, which exploits the full richness of the posterior distributions from our Bayesian retrievals. Nevertheless, these null results are not without value as they can inform us of how to approach atmospheric studies in the future. Upon viewing our results, it may be concluded that HST WFC3 G141 data alone are insufficient to draw out detailed trends from a population of objects. To explore this, we created simulated data sets of the planets for which our retrievals suggested there was evidence for spectral modulation due to an atmosphere. For each planet, we utilized the error bars, the retrieved cloud pressure, and the retrieved 10 bar radius from the real observations to generate fake data sets, inputting the mass-metallicity trend suggested by Thorngren et al. (2016) and performing retrievals to see if our simulated data would recover the trend. We added Gaussian scatter to the data sets, and the retrieved planet-to-star metallicity ratios are shown in Figure 12. We find that our simulated data are capable of recovering the input trend, which suggests that, in our analysis of the real data, we do not recover a trend because (i) one does not exist across the whole population, (ii) there are biases in our retrieval analysis which are obscuring the true trend, or (iii) both.
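The logic of this injection-recovery test can be sketched in a heavily simplified form: inject a linear trend in log-log space, add Gaussian scatter, and check whether a fit recovers the input slope. In the sketch below a weighted least-squares line stands in for both the spectral retrievals and the BHM, and the slope, intercept, scatter, and mass range are illustrative placeholders rather than values from this work or from Thorngren et al. (2016).

```python
import math
import random


def simulate_population(n, slope, intercept, sigma, seed=0):
    """Draw n planets following log10(Z_p/Z_s) = intercept + slope * log10(M)."""
    rng = random.Random(seed)
    log_mass = [rng.uniform(-1.5, 1.0) for _ in range(n)]  # placeholder range
    log_ratio = [intercept + slope * x + rng.gauss(0.0, sigma) for x in log_mass]
    return log_mass, log_ratio, [sigma] * n


def fit_line(x, y, err):
    """Weighted least-squares straight line; returns slope, intercept, slope error."""
    w = [1.0 / e ** 2 for e in err]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = sw * sxx - sx ** 2
    slope = (sw * sxy - sx * sy) / delta
    intercept = (sxx * sy - sx * sxy) / delta
    return slope, intercept, math.sqrt(sw / delta)


if __name__ == "__main__":
    # Illustrative input trend; not the published relation.
    x, y, e = simulate_population(n=40, slope=-0.45, intercept=1.0, sigma=0.3)
    slope, intercept, slope_err = fit_line(x, y, e)
    print(f"recovered slope = {slope:.2f} +/- {slope_err:.2f}")
```

If the recovered slope is consistent with the injected one at the noise level of the real observations, a null result on the real data cannot be attributed to data precision alone, which is the reasoning applied above.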
In our analysis, on both the real data and the simulated data, the constraints placed on the atmospheric chemistry are poor. Such a result is not overly surprising: all those who have worked with HST WFC3 G141 understand that its limited wavelength coverage means the conclusions that can be drawn from the analysis of it are similarly restricted. The inability to place tight constraints on a planet's metallicity or C/O ratio is a byproduct of a data set which is only truly sensitive to a single feature of a single molecule: the 1.4 μm water feature. As such, studies in the literature have often combined data sets from multiple instruments to extend the wavelength coverage and unlock additional spectral features. However, such an approach has serious implications for the analysis, particularly when attempting to draw trends from a population.
First, the instruments experience significant systematics. While the correction of these with different pipelines usually leads to uniform spectral features, offsets in the transit depth are common (e.g., Guo et al. 2020; Luque et al. 2020; Murgas et al. 2020; Yip et al. 2021; McGruder et al. 2022). When analyzing data from a single instrument, this has little effect as the retrieved planetary radius is generally just slightly smaller or larger. However, when combining instruments, these offsets can cause wild differences in the retrieved atmospheric parameters (Yip et al. 2020). While an offset can be fitted for in the retrieval, without spectral overlap between instruments one cannot be sure of the compatibility of data sets. Currently, only HST WFC3 G102 offers spectral overlap with HST WFC3 G141, but this grism has rarely been used. Combining both the WFC3 IR grisms with data from HST STIS can be done with some confidence due to this spectral overlap (e.g., Wakeford et al. 2017b), although the data are not taken in the same epoch so temporal changes may be an issue (e.g., Saba et al. 2022).
Yet, even if one can confidently combine data from multiple instruments, another factor for consideration emerges. If one compares two planets having utilized different instruments in each case, one cannot be sure if any differences seen are because the planets are distinct or if the instruments simply offer discrete views of these atmospheres. An example of this is demonstrated in Pinhas et al. [...], which is comparable to the value obtained here of $-3.10^{+0.95}_{-1.17}$. With the addition of STIS data, the value retrieved was $-4.66^{+0.39}_{-0.30}$, which is not compatible to within 1σ and takes the water abundance of HD 209458b from being solar to distinctly subsolar. The debate here is not about the true value (the STIS data may bring one closer to it by avoiding some degeneracies) but about the change the new data bring about in the result. Consider other planets, for example HAT-P-38 b or TOI-674 b, for which we retrieved similar water abundances to HD 209458b ($\log_{10}({\rm H_2O}) = -3.07^{+1.12}_{-2.27}$ and $-3.12^{+0.78}_{-1.04}$, respectively). Comparing the retrieved abundances from the WFC3 G141 data alone would lead us to conclude the atmospheres were similarly enriched. Comparing against the HD 209458b abundance retrieved using the STIS data as well would lead us to believe they had differing water abundances. As neither HAT-P-38 b nor TOI-674 b has STIS data at the time of writing, it is impossible to know if the retrieved water abundances would also change if such data were added to our retrievals. While analyzing HST WFC3 G141 data alone has its limitations, such as potentially introducing biases into the retrievals, these limitations are at least uniform across the population. Therefore, when searching for trends the recovered correlations may be biased, but at least the relative trend between planets would be consistent. However, when utilizing differing instruments between planets, one cannot tell if the changes seen are due to the instruments used or due to the atmospheres actually differing in composition. If one looks at which planets have been studied by both STIS and WFC3, this point is further reinforced: higher-mass planets more often have both data sets whereas lower-mass planets have generally only been studied with WFC3 G141. Therefore, if one finds evidence for a mass-metallicity trend, which is based on the H₂O abundance in the atmosphere, with inhomogeneous data sets, one cannot be sure the trend is not caused by the addition of these data sets for the higher-mass planets. As such, if one attempts a population study where the instruments each member is studied with vary, one must account for this bias when inferring the presence, or lack, of trends, for example by conducting retrievals on both data sets (e.g., with/without STIS or Spitzer) and comparing the results (e.g., Pinhas et al. 2019; Pluriel et al. 2020; Yip et al. 2021).
The ability to seek out these correlations is further impaired by the choice of targets, with those studied in this work essentially being a random collection of worlds as they have been observed via a variety of proposals, each with different aims. As such, drawing comparisons becomes difficult because, in all likelihood, more than one parameter will be affecting the chemistry. One may therefore be tempted to split the targets into subgroups in an attempt to uncover trends, but doing so after the observations are acquired is risky; with the data set available, if one tries hard enough "trends" can be uncovered due to the large uncertainties in the derived parameters and the sparsity of the data once more than a single parameter is used to divide the population. Such an effect can be seen in the relation between temperature and atmospheric feature size previously proposed for sub-Neptunes. In this case, and in all other attempts to draw conclusions, the fact that the precision of the data with respect to the expected atmospheric signal (i.e., the S/N) also differs across the population makes it yet harder to definitively draw comparisons, to infer the atmospheric conditions, or to conclude why an atmospheric detection has not occurred.
In addition to free chemistry retrievals, we fitted chemical equilibrium models to our data sets in an attempt to draw out trends in metallicity and the C/O ratio. However, as noted in the methodology, these retrievals are essentially fitting two free parameters (metallicity and C/O ratio) to a single observable (the H₂O abundance). As such, the retrieved values are not necessarily reliable. Taking again the example of HD 209458b and HAT-P-38 b, for which we recovered almost identical water abundances in the free chemistry experiment, we can see the issue clearly. While we recovered a metallicity of $\log_{10}(Z_P) = -0.04^{+0.83}_{-0.62}$ for HD 209458b, the value for HAT-P-38 b, $1.59^{+0.25}_{-0.65}$, differs by more than 1σ (despite the large uncertainties) as the model preferred a higher C/O ratio for this planet. Many previous studies have fitted for these without additional data, or with optical data which do not add an additional molecular observable, but our results suggest these findings should be taken cautiously. Spitzer IRAC data have often been added to HST WFC3 G141 to help further constrain the C/O ratio by giving sensitivity to carbon-bearing species (e.g., CH₄, CO, CO₂), but these studies are then exposed to the potential offset risks that we have discussed previously.
Hence, our current ability to extract population-level trends in exoplanet atmospheres is limited by a number of factors. In short, to truly perform population studies of exoplanetary atmospheres, one must achieve a wide spectral coverage with a single instrument and ensure that the planets are selected in a robust manner. JWST (Gardner et al. 2006) will provide better opportunities for this than Hubble, particularly on the spectral coverage front, although for brighter targets multiple instruments will still need to be combined to get the wavelength coverage necessary to accurately constrain both refractory elements and carbon-bearing species. However, while certain JWST proposals are designed as miniature population studies, an organized, well-structured survey of a hundred or more worlds is unlikely to occur due to the time required for such a survey and the proposal-based nature of time allocation. Therefore, it is likely that the population of planets studied will be somewhat random, with the added complexity of different instruments being used, and so some of the hurdles discussed here will still be relevant. Additionally, the S/Ns achieved will vary and so comparative studies will have to be careful when drawing conclusions, even if the same instruments are used.
To overcome these hurdles, one requires missions with dedicated exoplanet surveys which will allow researchers to pose and, hopefully, answer specific questions on the nature of exo-atmospheres. By allowing a large population of targets to be selected with these questions in mind, Twinkle and Ariel will be better placed to provide demographic insights into the atmospheres of exoplanets, revolutionizing our understanding of them in the process (e.g., Edwards et al. 2019; Changeat et al. 2020a; Tinetti et al. 2021; Edwards & Tinetti 2022; Stotesbury et al. 2022). However, these missions will, of course, have their own limitations in terms of data quality. For instance, Ariel will only provide photometric data at visible wavelengths and so may struggle to disentangle the spectral features of optical absorbers. Furthermore, these surveys will recoup a higher science yield if they are constructed upon robust prior knowledge instead of undertaking a blind search for trends.
Therefore, we must strive to utilize each facility in ways which complement their respective capabilities. It is undeniable that JWST, and the continued use of Hubble and other facilities, will provide critical insights into the nature of a diverse set of worlds, with JWST in particular facilitating extraordinary sensitivity and thus offering the chance to probe for extremely small signals (e.g., secondary atmospheres). The knowledge gained must then be leveraged to inform us of how to use these dedicated surveys to best understand the population at large. By striving for these meticulous chemical surveys we will then truly begin to understand the demographics of exoplanet atmospheres.
We thank the referee of this manuscript for taking the time to read our work and provide feedback. Their constructive comments guided the direction of the paper, thereby improving the quality of the final results.
Computing: We acknowledge the availability and support from the High Performance Computing (HPC) platforms from the Simons Foundation (Flatiron), DIRAC, and OzSTAR, which provided the computing resources necessary to perform this work.
Data: This work is based upon publicly available observations taken with the NASA/ESA Hubble Space Telescope obtained from the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5-26555. These were obtained from the Hubble Archive, which is part of the Mikulski Archive for Space Telescopes. We are thankful to those who operate the Hubble Space Telescope and the corresponding archive, the public nature of which increases scientific productivity and accessibility (Peek et al. 2019).
For each observation, the associated proposal number and principal investigator are given in Tables 1, 2, and 3.
Planet Discovery Papers: The characterization of exoplanetary atmospheres cannot occur without first knowing of the planet's existence. We are therefore grateful to all those who have contributed to planet discovery efforts. Thus, the works announcing the discovery of all planets studied here are now given: CoRoT-1 b (Barge et al. 2008), GJ 436 b (Butler et al. 2004; Gillon et al. 2007)

Appendix A Light-curve Fitting with Iraclis
We carried out the analysis of the transit data using Iraclis, our highly specialized software for processing WFC3 spatially scanned spectroscopic images (Tsiaras et al. 2016b, 2016c, 2018), which has been used in a number of studies (e.g., Libby-Roberts et al. 2022; Brande et al. 2022; Garcia et al. 2022). The reduction process included the following steps: zero-read subtraction, reference pixels correction, nonlinearity correction, dark-current subtraction, gain conversion, sky background subtraction, calibration, flat-field correction, and bad-pixels/cosmic-rays correction. Then, we extracted the white (1.088-1.68 μm) and the spectral light curves from the reduced images, taking into account the geometric distortions caused by the tilted detector of the WFC3 IR channel.
We fitted the light curves using our transit model package PyLightcurve (Tsiaras et al. 2016a) with the transit parameters from Tables 7 and 9. The limb-darkening coefficients were calculated using ExoTETHyS (Morello et al. 2020) and based on the PHOENIX 2018 models from Allard et al. (2012). The stellar parameters are also given in Tables 6 and 8.
During our fitting of the white light curve, the planet-to-star radius ratio and the midtransit time were the only free parameters, along with a model for the systematics (Kreidberg et al. 2014b; Tsiaras et al. 2016b). It is common for WFC3 exoplanet observations to be affected by two kinds of time-dependent systematics: the long-term and short-term "ramps." The first affects each HST visit and has a linear behavior, while the second affects each HST orbit and has an exponential behavior. The formula we used for the white light-curve systematics ($R_w$) was the following:

$$R_w(t) = n_w^{\rm scan}\,[1 - r_a (t - T_0)]\,[1 - r_{b1}\, e^{-r_{b2} (t - t_o)}],$$

where $t$ is time, $n_w^{\rm scan}$ is a normalization factor, $T_0$ is the midtransit time, $t_o$ is the time when each HST orbit starts, $r_a$ is the slope of a linear systematic trend along each HST visit, and ($r_{b1}$, $r_{b2}$) are the coefficients of an exponential systematic trend along each HST orbit. The normalization factor we used ($n_w^{\rm scan}$) was changed to $n_w^{\rm for}$ or $n_w^{\rm rev}$ for upward or downward scanning directions, respectively.
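As a minimal numerical sketch of the white light-curve systematics described above, the function below multiplies a visit-long linear trend by a per-orbit exponential ramp. The exact functional form, the function name `white_systematics`, and all parameter values in the usage lines are assumptions for illustration; they are not taken from the Iraclis source code.

```python
import numpy as np


def white_systematics(t, n_scan, r_a, r_b1, r_b2, t_mid, t_orbit_start):
    """Assumed form: n_scan * (1 - r_a (t - T0)) * (1 - r_b1 exp(-r_b2 (t - t_o))).

    t_orbit_start holds, for every time stamp, the start time of the HST
    orbit that the exposure belongs to.
    """
    t = np.asarray(t, dtype=float)
    t_orbit_start = np.asarray(t_orbit_start, dtype=float)
    linear = 1.0 - r_a * (t - t_mid)                          # visit-long slope
    ramp = 1.0 - r_b1 * np.exp(-r_b2 * (t - t_orbit_start))  # per-orbit ramp
    return n_scan * linear * ramp


# Illustrative values only (times in days, two HST orbits of three exposures).
t = np.array([0.000, 0.010, 0.020, 0.067, 0.077, 0.087])
t_o = np.array([0.000, 0.000, 0.000, 0.067, 0.067, 0.067])
systematics = white_systematics(t, 1.0, 0.005, 0.002, 300.0, 0.04, t_o)
```

In a fit, the observed white light curve would be modeled as the product of a transit model and this systematics term, with a separate normalization for each scanning direction.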
We fitted the white light curves using the formulae above and the uncertainties per pixel, as propagated through the data-reduction process. However, it is common in HST/WFC3 data to have additional scatter that cannot be explained by the ramp model. For this reason, we scaled up the uncertainties in the individual data points, for their median to match the standard deviation of the residuals, and repeated the fitting (Tsiaras et al. 2018). The only free parameters in our white fitting, other than the HST systematics, were the midtransit time and the planet-to-star radius ratio. The full set of white light-curve fits are given in Figure 13. The transit midtimes and white light-curve depths are given in Table 10.
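The uncertainty rescaling described above can be sketched as follows. This is a minimal illustration, assuming a single multiplicative factor applied to all points; the function name `rescale_uncertainties` and the toy data are hypothetical.

```python
import numpy as np


def rescale_uncertainties(flux, model, sigma):
    """Scale sigma so that its median equals the standard deviation of the residuals."""
    flux, model, sigma = (np.asarray(a, dtype=float) for a in (flux, model, sigma))
    residuals = flux - model
    scale = np.std(residuals) / np.median(sigma)
    return sigma * scale


# Toy example: residual scatter (300 ppm) is three times the pipeline uncertainty.
model = np.ones(6)
flux = model + 3e-4 * np.array([1, -1, 1, -1, 1, -1])
pipeline_sigma = np.full(6, 1e-4)
scaled_sigma = rescale_uncertainties(flux, model, pipeline_sigma)
```

The fit is then repeated with `scaled_sigma` in place of the pipeline uncertainties, which widens the posterior on the transit depth when the ramp model leaves unexplained scatter.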
Next, we fitted the spectral light curves with a transit model (with the planet-to-star radius ratio being the only free parameter) along with a model for the systematics ($R_\lambda$) that included the white light curve (divide-white method; Kreidberg et al. 2014b) and a wavelength-dependent, visit-long slope (Tsiaras et al. 2016b):

$$R_\lambda(t) = n_\lambda^{\rm scan}\,[1 - \chi_\lambda (t - T_0)]\,\frac{LC_w}{M_w},$$

where $\chi_\lambda$ is the slope of a wavelength-dependent linear systematic trend along each HST visit, $LC_w$ is the white light curve, and $M_w$ is the best-fit model for the white light curve. Again, the normalization factor we used ($n_\lambda^{\rm scan}$) was changed to $n_\lambda^{\rm for}$ or $n_\lambda^{\rm rev}$ for upward or downward scanning directions, respectively. Also, in the same way as for the white light curves, we performed an initial fit using the pipeline uncertainties and then refitted while scaling these uncertainties for their median to match the standard deviation of the residuals.
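A minimal sketch of the divide-white systematics term described above is given here. The functional form (a visit-long linear slope multiplied by the ratio of the white light curve to its best-fit model) is an assumption consistent with the definitions in the text; the function name `spectral_systematics` and the numbers are illustrative only.

```python
import numpy as np


def spectral_systematics(t, n_scan, chi, t_mid, white_lc, white_model):
    """Assumed form: n_scan * (1 - chi (t - T0)) * LC_w / M_w."""
    t = np.asarray(t, dtype=float)
    # The ratio LC_w / M_w carries the common-mode systematics of the visit.
    ratio = np.asarray(white_lc, dtype=float) / np.asarray(white_model, dtype=float)
    return n_scan * (1.0 - chi * (t - t_mid)) * ratio


# Illustrative values only.
t = np.array([-0.02, 0.0, 0.02])
white_model = np.array([1.0, 0.99, 1.0])
white_lc = white_model * np.array([0.999, 1.000, 1.001])
r_lambda = spectral_systematics(t, 1.0, 0.01, 0.0, white_lc, white_model)
```

Because the white-light residual pattern is shared by all spectral channels, only the slope and the normalization remain to be fitted per wavelength bin alongside the radius ratio.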

Appendix B Summary of Results for Individual Planets
Here we present further information for the planets where the spectra were fit with Iraclis for this study. We briefly summarize the data that were taken and compare our results to any we could find in the literature for the same data set.

B.1. CoRoT-1 b
The first transit of CoRoT-1 b was observed in 2008 by Barge et al. (2008), unveiling an inflated, 1.5 R J planet. A ground-based transmission spectrum of CoRoT-1 b, obtained with the InfraRed Telescope Facility, achieved a spectral precision that was comparable to the modulation expected from a single scale height of atmosphere but no spectral features could be discerned (Schlawin et al. 2014).
HST WFC3 observed the transit of CoRoT-1 b in staring mode as part of proposal 12181 (PI: Drake Deming; Deming 2010). The GRISM128 subarray was utilized alongside the SPARS10 sequence and 16 up-the-ramp reads leading to an exposure time of 100.651947 s. These data were previously studied but no evidence for molecular absorption was found due to the high uncertainties on the data (Ranjan et al. 2014). Similar results were also recently found by Glidic et al. (2022).
We fitted the data with Iraclis, and it is worth noting that there is no post-egress orbit, likely due to poor knowledge of the ephemeris at the time of observing, and this leads to larger uncertainties on the transit depth. For the transit spectrum, the decreasing absorption with wavelength is best fit with a cloudless atmosphere and the continuous absorption from H−. The result is consistent with the recovered temperature of around 1800 K, which should lead to the dissociation of molecular species.
The transit spectrum is compared to literature results in Figure 14, showing a good consistency with these results in terms of shape but with an offset between the spectrum recovered here and that from Glidic et al. (2022).

B.2. GJ 1214 b
In 2011, three staring-mode transits of GJ 1214 b were taken for proposal GO-12251 (PI: Zachory Berta). The analysis of these data led to a flat spectrum, which was interpreted as suggesting GJ 1214 b had an atmosphere with a mean molecular weight >4. Subsequently, 15 scanning-mode transits of GJ 1214 b were taken between 2012 September and 2013 August as part of proposal GO-13021 (PI: Jacob Bean). Combining these led to a spectrum that had uncertainties of around 50 ppm, equivalent to around 0.15 scale heights assuming a hydrogen-dominated atmosphere. Yet, despite the high precision of the observations, no indication of an atmosphere could be extracted (Kreidberg et al. 2014b).
We analyzed all available data for GJ 1214 b. However, the scanning-mode observations provided a higher precision than the staring-mode data. As discussed in Kreidberg et al. (2014b), the observation taken on 2013 April 12 was affected by poor pointing and so was excluded from the analysis. Kreidberg et al. (2014b) found evidence of a starspot crossing in two observations (2013 August 4 and 12) and so they excluded these from their analysis. We show the white light curves of these transits in Figure 15, showing slight bumps in the residuals which could indeed be due to a starspot crossing. We remove these data, along with the staring observations, to compute the final spectrum analyzed here. In Figure 16, we show the final spectrum obtained in this study using different data sets as well as a comparison to the spectrum from Kreidberg et al. (2014b). We find that the spectrum has slightly more modulation than found by Kreidberg et al. (2014b). Additionally, our retrievals prefer a model with spectral modulation to the 2.41σ level for the free chemistry model and to 2.38σ for the chemical equilibrium run. Given the relatively low significance of these atmospheric detections, and the large number of observations that were combined to create the final data set, we are cautious about inferring the presence of atmospheric features for this planet. Given the size of the error bars derived, the terminator of GJ 1214 b is evidently not cloud-free. The JWST Mid-Infrared Instrument phase curve taken during Cycle 1 should shed more light on the nature of this world.

B.3. HAT-P-2 b
HAT-P-2 b was discovered in 2007 by Bakos et al. (2007b). It is a massive hot Jupiter (9.1 M J ) that orbits its host star in a highly eccentric orbit (e = 0.52) in about 5.6 days. Due to its large density, the planet is believed to require the presence of a large core. This large mass, combined with the highly eccentric orbit, raises many questions regarding the physics of this planet and its formation. For instance, along the entire orbit, the planet's equilibrium temperature varies from 1240 to 2150 K (Bakos et al. 2007b). Studying the Rossiter-McLaughlin effect, it was found that the stellar spin axis and orbital axis of the planet should be aligned, thus implying that the planet did not evolve through scattering or Kozai migration (Winn et al. 2007; Loeillet et al. 2008).
Although it is a very interesting planet, the atmosphere of HAT-P-2 b has not been studied with many instruments. A phase-curve observation with Spitzer at 3.6 μm, 4.5 μm, 5.6 μm, and 8 μm was presented in Lewis et al. (2013), highlighting a very complex atmosphere due to the particular orbital configuration of this planet. The study also suggested the planet might experience a temporary dayside thermal inversion near periapse. In a follow-up work, Lewis et al. (2014) performed a complementary analysis with general circulation models to evaluate the impact of the eccentricity on the chemistry and the thermal structure of this planet, highlighting that disequilibrium processes on this planet might be important.
Recently, a partial phase curve was acquired with the HST using the G141 grism (PN: 16194, PI: Jean-Michel Desert; Desert et al. 2020). The eclipse spectrum from this proposal was presented in Changeat et al. (2022) and here we extract the transit spectrum. However, due to the high mass of HAT-P-2 b, we did not recover a spectrum that was sensitive enough to allow for atmospheric constraints to be made.

… (2008). This optical eclipse measurement was combined with Spitzer photometry over 3.5-8 μm to infer the presence of a thermal inversion (Christiansen et al. 2010), suggested by the high flux ratio in the 4.5 μm channel of Spitzer compared to the 3.6 μm channel. In their paper, chemical equilibrium models associated these emission features with CO, H 2 O, and CH 4 . A thermal inversion was also reported to provide the best fit to these data by the atmospheric models of Spiegel & Burrows (2010) and Madhusudhan & Seager (2010), but all three studies noted that models without a thermal inversion could also explain the data well, though only with an extremely high abundance of CH 4 . Further Kepler phase curves identified an offset in the dayside hot spot (Esteves et al. 2013, 2015) as well as changes in its location (Armstrong et al. 2016), highlighting the complex dynamics of hot Jupiter atmospheres. However, while Spitzer phase curves at 3.5 μm and 4.5 μm were also best fitted with a thermal inversion on the dayside and relatively inefficient day-night recirculation, Wong et al. (2016) did not find evidence of a hot-spot offset.
We fitted the HST WFC3 data with Iraclis, but it is worth noting that there is no post-egress orbit. Combined with the fact these observations were taken in staring mode, this led to a transit spectrum wherein the uncertainties on the depth were equivalent to nearly 10 scale heights, meaning no atmospheric signal could be discerned.

B.5. HD 97658 b
The sub-Neptune HD 97658 b was discovered as part of the NASA-UC Eta-Earth Program (Howard et al. 2011) and four HST WFC3 transit observations have been obtained across two proposals (GO-13501, Knutson 2012; GO-13665, Benneke et al. 2015). These have previously been analyzed in Knutson et al. (2014) and Guo et al. (2020).
The data from GO-13501 were collected using the GRISM256 subarray, the SPARS10 sequence, and four up-the-ramp reads leading to an exposure time of 14.970785 s. The scan rate was 1.4″ s⁻¹. Meanwhile, the data from GO-13665 used a different observational setup, with an exposure time of 12.795406 s via 16 up-the-ramp reads using the RAPID sampling sequence. The GRISM512 subarray was utilized with a scan rate of 1.4″ s⁻¹.
For the observations from the latter proposal, the spatial scan was not correctly positioned within the subarray window. Therefore, the longer wavelengths were not recovered. Hence, for consistency, we extracted the white light curves of all data sets from 1.088 to 1.650 μm. The change in the white light curve did not affect our spectral bins, which only extend to 1.643 μm. We found that the white curve depths were consistent across the first three visits, with the fourth visit being slightly shallower. The fourth visit also displayed a greater slope across the WFC3 bandpass.
We compare our spectrum to the literature results in Figure 17. We find that, with the exception of a small offset between the two spectra, our fitting of the first two visits compares well to the results of Knutson et al. (2014). However, we find that our analysis of all four observations differs from the spectrum of Guo et al. (2020) at shorter wavelengths, despite excellent agreement across the 1.3-1.5 μm range. We note that Guo et al. (2020) found that the choice of long-term detrending technique affected the white light-curve depth for the HST WFC3 G141 observations of HD 97658 b and suggested that a linear trend may not fully explain the systematics seen. We only fitted a linear trend here such that the analysis was consistent across the population. The analysis of this spectrum was complicated by the aforementioned incorrect positioning of the spatial scan and this may have further contributed to this discrepancy. Our final spectrum is shown in Figure 39. Our retrievals showed a clear preference for the presence of an atmosphere and the free chemistry retrieval provided the best fit to the data.

Figure 17. Left: comparison with the spectrum of Knutson et al. (2014), who only analyzed visits 1 and 2. Other than a slight offset, the spectra agree well after two visits but the averaged spectrum of all four transits has a stronger slope. Right: comparison of the spectrum of HD 97658 b derived in this work to that from Guo et al. (2020). While the spectra agree over 1.3-1.5 μm, outside of this range there is a significant disagreement in the features.

B.6. HD 219666 b
HD 219666 b, a hot Neptune (T eq = 1070 K, R = 4.71 R ⊕ ), was discovered using data from the first sector of the Transiting Exoplanet Survey Satellite (TESS; Esposito et al. 2019) and was one of the first discoveries by this mission to be announced. With an orbital period of around 6 days, it lies in an underpopulated region of the period-radius diagram in an area often coined the "hot Neptune desert." It is one of two planets studied in this work that lies within this region, the other being LTT 9779 b. Esposito et al. (2019) showed that the target was an excellent candidate for atmospheric studies by simulating JWST observations. The planet was subsequently observed as part of proposal GO-15698 (PI: Thomas Beatty), which planned to take two transit observations of HD 219666 b. The first set of observations was taken in 2019 June but pointing was lost and so the target was revisited in August and November of the same year. However, for the visit in August, the first orbit was taken during the transit of the planet, rather than the third orbit as planned. Evidently the planning of the observations was based on a poor orbital ephemeris, and this highlights the critical importance of programs which seek to refine the periods of exoplanets that are good targets for atmospheric characterization (e.g., Edwards et al. 2021b; Kokori et al. 2022a, 2022b).
Despite these issues, the spectrum derived from this single visit led to a confident atmospheric detection. The TauREx free chemistry retrieval recovered a very high volume mixing ratio of water: $\log_{10}({\rm H_2O}) = -1.54^{+0.37}_{-1.08}$. The retrieval was preferred to the flat model to 3.57σ, suggesting a strong atmospheric detection.

B.7. HIP 41378 b
Discovered using data from the K2 mission, HIP 41378 b is a sub-Neptune (R = 2.6 R ⊕ ) with an orbital period of 15.5712 days and an equilibrium temperature of 960 K (Vanderburg et al. 2016b; Santerne et al. 2019). Its atmosphere has not previously been studied and the HST transit data were taken as part of GO-15333 (PI: Ian Crossfield). Three successful transit observations of HIP 41378 b were taken, each consisting of seven orbits. Two other visits were attempted with one resulting in a complete loss of data and the other having issues due to a guide star acquisition failure, which led to large shifts in the position of the spectrum on the detector. A further two observations were originally planned but were scrapped in favor of observing three transits of TOI-674 b (see Section B.17). For these observations, the GRISM512 subarray was used, with eight up-the-ramp reads using the SPARS25 sampling sequence, and an exposure time of 138.380508 s, combined with a scan rate of 0.25″ s⁻¹.
From our free chemistry retrievals, we find the spectrum to be compatible with a flat line. However, our chemical equilibrium retrieval provided a better fit than the flat model, being preferred at 2.65σ. The 1.4 μm feature size derived was 2.25 ± 0.97.

B.8. HIP 41378 f
The outermost currently known planet in the HIP 41378 system, HIP 41378 f, orbits its host star every 542 days (Vanderburg et al. 2016b; Santerne et al. 2019). The host star is bright and, as the planet's density is very low (Santerne et al. 2019), it is an excellent target for atmospheric studies. Given its temperature (T ∼ 300 K), HIP 41378 f is also far cooler than any of the other large gaseous planets studied here. The transit spectrum of HIP 41378 f required 18 consecutive orbits of HST and was taken by proposal GO-16267 (PI: Courtney Dressing). The data were obtained with a scan rate of 0.419″ s⁻¹. The SQ256 subarray was used as well as the SPARS10 reading sequence. The first forward and reverse scans of each orbit were acquired using seven nondestructive reads while the rest had nine. Direct images were acquired with the F126N at various points in the visit as a guidance check.
The transmission spectrum of HIP 41378 f was analyzed in Alam et al. (2022). However, their derived spectrum differs dramatically from ours, as shown in Figure 19. We performed a number of additional fittings to check the credibility of our result. First, we fitted the data using the wavelength bins and limb-darkening coefficients from Alam et al. (2022), but this did not yield a similar spectrum to their work. We tried extracting the data by splitting the nondestructive reads in case of a contamination by a background star, but this did not change the derived spectrum. We also sought an independent fit of the data, with CASCADe also being used to analyze the data. The CASCADe pipeline is instrument independent, meaning the manner in which systematic effects are removed from the data is completely different from Iraclis, which was built purposefully for HST WFC3. The initial fit was also done without prior knowledge of the Iraclis result but we ensured the orbital parameters utilized were the same. The resulting CASCADe spectrum has highly similar features to the Iraclis fit. We note there is a slight offset between the data sets, but this is common when analyzing spectra with different pipelines (e.g., Mugnai et al. 2021), as well as a slightly steeper slope in the CASCADe data, which gives lower transit depths at shorter wavelengths.
From comparisons between our white light-curve fit and that from Alam et al. (2022), we noted that data had been removed from their analysis (see Figure 1 from their work), including an entire orbit just after ingress. In Alam et al. (2022) they state that the observations were affected by the South Atlantic Anomaly, which is undoubtedly true: many orbits did not acquire the full number of requested frames. However, they do not discuss why this orbit, and data from others, were not present in their fitting despite the data being taken and having no obvious signs of degradation. We also attempted fits without this orbit but still found a spectrum with the same features as before. We also noted that the uncertainties on the spectrum derived by Alam et al. (2022) were much higher than those from Iraclis and CASCADe. While some of this could be due to the removal of data by Alam et al. (2022), the residuals on the white light-curve fit are also significantly higher in their work. Here, the residuals of the white light-curve fit had a standard deviation of around 125 ppm while the standard deviation of the residuals from the fit of Alam et al. (2022) was around 500 ppm. While we cannot know for sure the source of this increased scatter on their white light curve as their pipeline is not public, we speculate that some part of their calibration, reduction, or extraction process must have been suboptimal.
Having found no way to obtain a spectrum without significant features, we attempted to fit the data using TauREx. Our free chemistry retrievals were flexible enough to fit the features seen, but the credibility of the fit is questionable. Nevertheless, we show the best-fit model to the data in Figure 40. In the case of the GGChem retrieval, while it is preferred to the flat line fit, it is obviously unable to replicate the features in the data.
There have been suggestions that HIP 41378 f could have rings, which would inflate the measured radius of the planet, thereby explaining the very low density (Akinsanmi et al. 2020; Belkovski et al. 2022). Based on data from K2, their modeling suggested the true radius of HIP 41378 f could be 3.7 R ⊕ . The effect of these rings on the transit depth is likely to be chromatic as at wavelengths where the rings are optically opaque more stellar light would be blocked. While this could be the source of the features seen, we do not have the modeling capability to pursue this further, although frameworks have been proposed (Ohno & Fortney 2022). If the planet's true radius is 3.7 R ⊕ , the uncertainties on the transit spectrum would now each be equivalent to around 1.5 scale heights. Therefore, any attempt to determine the effect of rings, if present, may also need to account for the atmospheric contribution. Over such a short spectral range, the solutions to this are likely to be degenerate, but we encourage further work into this option as it may be useful for future data taken of this planet, and others like it, with observatories such as the JWST.
We note that the slope seen in the spectrum of HIP 41378 f is similar to that seen for HD 97658 b and LTT 9779 b, which might suggest that all these observations are affected by a systematic or astrophysical effect that cannot currently be identified. The narrow wavelength range of HST WFC3 G141 makes it difficult to determine whether strange spectra are a result of poor reduction or are a true representation of the signal. Hopefully, future data of this planet will allow us to uncover the truth about its nature. However, due to the strange nature of the HST WFC3 G141 spectrum, we chose not to include it when fitting for trends within the data using BHM.

B.9. K2-24 b
Transits of two planets orbiting K2-24, a G3 dwarf, were detected during Campaign 2 of the K2 mission (Petigura et al. 2016; Sinukoff et al. 2016). K2-24 b, the inner and smaller (R = 5.68 ± 0.56 R ⊕ ) of the two planets, orbits the host star in 20.8851 days. With a mass of 19 ± 2 M ⊕ (Petigura et al. 2018), K2-24 b has a relatively low density which, combined with the brightness of the host star and relatively large transit depth, makes it a good target for atmospheric characterization (Petigura et al. 2016).
A single-transit observation was obtained with HST WFC3 G141 by proposal GO-14455 (PI: Erik Petigura). The data were collected using the GRISM256 subarray, the SPARS10 sequence, and 16 up-the-ramp reads leading to an exposure time of 103.128586 s. The scan rate was 0.16″ s⁻¹, leading to a peak signal of around 20,000 counts.
Our chemical equilibrium retrieval was not capable of fitting the spectral features seen in the K2-24 b data. However, the best-fit free chemistry model was preferred to a flat line with a significance of 2.46σ. It suggested the presence of NH 3 , in a high abundance, but did not find evidence for H 2 O. Due to the shape of the spectrum, the fitting of the 1.4 μm feature yielded a negative value (−1.04 ± 0.68).

B.10. KELT-1 b
The first low-mass object discovered by the KELT-North survey, KELT-1 b is a 27 M J , 1.12 R J planet with a very short-period circular orbit of 29 hr. Siverd et al. (2012) presented spectroscopy, photometry, and radial velocity data in order to obtain an equilibrium temperature T eq ≈ 2400 K, assuming zero albedo, due to a significant amount of stellar irradiation.
Its extreme temperature and significant inflation make KELT-1 b a valuable case study for short-period atmospheric characterization. In several early studies it was successfully characterized in eclipse (Beatty et al. 2014, 2017a), suggesting a monotonically decreasing temperature-pressure profile. However, more recent work has suggested the atmosphere presents indications of a localized thermal inversion associated with VO, FeH, and H − (Changeat et al. 2022).
We find that, due to its high mass, which leads to it being classified as a brown dwarf, there is no evidence for atmospheric species in the transit spectrum. We note that the expected transit depth modulation due to 1 scale height of atmosphere is far below the size of the error bars, explaining why a flat spectrum is recovered.
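To illustrate why a high mass suppresses the signal, the sketch below evaluates the standard estimate for the transit depth change produced by one scale height, 2 R p H / R s ², with H = k T / (μ g). The planet mass, radius, and temperature are those quoted above; the mean molecular weight of 2.3 follows the H/He assumption used elsewhere in this appendix; the stellar radius of 1.47 R ☉ is an assumed illustrative value that is not taken from this paper, and the function name is hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant (J/K)
M_U = 1.66053907e-27  # atomic mass unit (kg)
G = 6.674e-11         # gravitational constant (m^3 kg^-1 s^-2)
M_JUP = 1.898e27      # Jupiter mass (kg)
R_JUP = 7.1492e7      # Jupiter equatorial radius (m)
R_SUN = 6.957e8       # solar radius (m)


def scale_height_signal(mass_mjup, radius_rjup, temp_k, rstar_rsun, mmw=2.3):
    """Approximate transit depth change from one scale height: 2 Rp H / Rs^2."""
    r_p = radius_rjup * R_JUP
    gravity = G * mass_mjup * M_JUP / r_p**2
    scale_height = K_B * temp_k / (mmw * M_U * gravity)
    r_s = rstar_rsun * R_SUN
    return 2.0 * r_p * scale_height / r_s**2


# Stellar radius (1.47 R_sun) is an assumed value used only for illustration.
kelt1b_ppm = 1e6 * scale_height_signal(27.0, 1.12, 2400.0, 1.47)
jupiter_mass_ppm = 1e6 * scale_height_signal(1.0, 1.12, 2400.0, 1.47)
```

Under these assumptions the one-scale-height signal for the 27 M J case is a few parts per million, 27 times smaller than for a Jupiter-mass object of the same size and temperature, since the signal scales inversely with surface gravity.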

B.11. Kepler-9 b and Kepler-9 c
The Kepler-9 system was discovered in 2010, with planets b and c being the first planets to be confirmed via transit timing variations (Holman et al. 2010). These allowed the masses of the planets to be measured (e.g., Dreizler & Ofir 2014), which were later verified by radial velocity observations (Borsato et al. 2019).
The two transit observations of Kepler-9 b, as well as the two transit observations of Kepler-9 c, were taken in staring mode using the GRISM256 aperture. The SPARS10 sampling sequence was utilized, with 12 up-the-ramp reads, leading to an exposure time of 73.742661 s. These were taken as part of GO-12482 (PI: Jean-Michel Desert).
For Kepler-9 b, the free retrievals did not provide a preferable fit to the data compared with the flat model. However, the GGChem retrieval did provide a preferable fit, albeit to only 1.89σ. In the case of Kepler-9 c, the opposite was true: the flat model was preferred to the GGChem retrieval but not to the free chemistry retrievals. However, the preferred free chemistry model did not detect features, rather a slope in the spectrum. Neither atmospheric "detection" is convincing, but this is unsurprising given the size of the error bars with respect to the expected atmospheric modulation due to a single scale height of atmosphere.

B.12. Kepler-51 b and Kepler-51 d
Each of the three planets which are known to be orbiting Kepler-51 have low densities (Masuda 2014). Kepler-51 c has only a grazing transit, making constraints on its size difficult. However, Kepler-51 b and Kepler-51 d were originally determined to have radii of 7.1 ± 0.3 R ⊕ and 9.7 ± 0.5 R ⊕ , respectively (Masuda 2014). Their masses, derived from transit timing variations, gave them both densities of less than 0.05 g cm −3 (Masuda 2014).
Two HST WFC3 G141 observations were taken of each planet (PN: 14218, PI: Zach Berta-Thompson) and these were analyzed by Libby-Roberts et al. (2020). In this study, the radii and masses of the planets were updated and they found that the densities were slightly higher than previously thought. However, they were still low, at 0.064 g cm −3 and 0.038 g cm −3 , respectively. Despite the low density, and thus large atmospheric scale height, the transmission spectra uncovered by Libby-Roberts et al. (2020) did not yield any spectral features.
The HST WFC3 G141 observations were taken with the following settings. The GRISM256 aperture and the SPARS10 readout sequence were utilized, with 15 up-the-ramp reads resulting in an exposure time of 103 s. Due to the faintness of the host star (J = 13.56), the staring mode was used as it offered a higher efficiency and precision than the now-common scanning mode (Libby-Roberts et al. 2020).
We analyzed all four observations but had issues with the extraction of data for one of the Kepler-51 b visits. As we could only analyze a single visit of Kepler-51 b with Iraclis, we use the spectrum from Libby-Roberts et al. (2020) for this planet. However, we note the good comparison between the single visit and their spectrum and also clarify that we utilize our fitting of the Kepler-51 d data.

B.13. LTT 9779 b
LTT 9779 b is an ultrahot Neptune discovered using data from TESS (Jenkins et al. 2020). With a period of less than a day, LTT 9779 b lies within the Neptune desert: there is a dearth of planets between 2 and 10 R ⊕ in these very short orbits. Photometric eclipse observations of LTT 9779 b with Spitzer revealed a spectrum which is best fitted by a noninverted atmosphere, and evidence was found for the presence of CO (Dragomir et al. 2020). Phase-curve observations with TESS and Spitzer have also been made, which found a large (1100 K) day-night brightness contrast and suggestions of a supersolar atmospheric metallicity (Crossfield et al. 2020). The transit observations from these phase curves were not precise enough to constrain the atmosphere in transmission.
HST WFC3 G141 observations of LTT 9779 b's transit were taken as part of proposal GO-16457 (PI: Billy Edwards; Edwards et al. 2020a), with an additional data set using the G102 grism also taken. The data were taken using the SPARS10 sequence and the GRISM256 aperture. The total exposure time was 103.13 s, with 16 samples being taken per exposure. To avoid contamination by a background star, we used the up-the-ramp reads, using Iraclis's splitting mode to extract these individually.
Assuming a H/He-dominated atmosphere (mmw = 2.3), the derived spectrum has error bars that are just below the expected atmospheric modulation due to 1 scale height of atmosphere. The spectrum has an increasing transit depth with wavelength, which is best fitted by the model with no absorbers (flat model) where the slope is caused by CIA. The bluest spectral point is significantly higher than the subsequent data points and the retrieval including optical absorbers suggests this could be due to TiO. However, as the detection is based on a single data point it is unreliable and the G102 data will be required to ascertain the presence of this molecule. We do not analyze the G102 data here, to maintain homogeneity across the planets studied.

B.14. TrES-2 b
TrES-2 b was among the first transiting exoplanets to be discovered (O'Donovan et al. 2006). The 1.2 R J planet orbits its host star in roughly 2.47 days, giving it an equilibrium temperature of around 1700 K. It was among the planets considered for the JWST Early Release Science Transiting Exoplanet program (Stevenson et al. 2016b) but ultimately not selected (Bean et al. 2018). Turner et al. (2016) obtained a ground-based UV transit of TrES-2 b to search for asymmetries in the light curve but no such phenomena were observed. Given the large transit depth, the planet has also been regularly followed up from the ground with numerous studies updating the ephemerides and searching for nonlinear periods (e.g., Rabus et al. 2009; Raetz et al. 2014; Öztürk & Erdem 2019; Edwards et al. 2020c). Additionally, a K-band eclipse was detected by Croll et al. (2010), which, when combined with Spitzer eclipses of the planet (O'Donovan et al. 2010), suggested the dayside could be represented by a blackbody and confirmed the planet's orbit was circular.
The HST WFC3 G141 transmission spectrum was obtained using the staring mode. The RAPID readout mode was used for the GRISM512 aperture, with 16 samples being taken per exposure to give an exposure time of 12.8 s. The data were previously analyzed by Ranjan et al. (2014), who found that the spectrum was not precise enough to constrain the atmosphere. We reanalyze these data here, using Iraclis, and reach the same result: the uncertainties on each data point are several scale heights in size. A comparison between our spectrum and the one from Ranjan et al. (2014) is given in Figure 21.

B.15. TrES-4 b
TrES-4 b is a short-period (3.5539268 days) hot Jupiter (R = 1.706 R J ; Mandushev et al. 2007). The planet has been observed in eclipse, with data from the Spitzer Space Telescope suggesting a thermal inversion (Knutson et al. 2009), while attempts have also been made to measure the thermal emission from the ground (Martioli et al. 2018).
As with TrES-2 b, the HST WFC3 G141 transmission spectrum was obtained using the staring mode. Again, the RAPID readout mode was used for the GRISM512 aperture, with 16 samples being taken per exposure to give an exposure time of 12.8 s. The data were also previously analyzed by Ranjan et al. (2014), who found that the spectrum was not precise enough to constrain the atmosphere. We reanalyze these data here and find that the recovered error bars are tens of scale heights in size, therefore offering no information on the atmosphere. A comparison of the spectra is shown in Figure 22.

B.16. TOI-270 c and TOI-270 d
TOI-270 c is part of a three-planet system, consisting of two sub-Neptunes and a super-Earth (Gunther et al. 2019), found by TESS. The planets orbit a bright (K = 8.25) M dwarf (T eff ∼3500 K) and the masses of the planets were subsequently measured using radial velocity measurements (Van Eylen et al. 2021).
Three transit observations were taken of TOI-270 c, each consisting of three orbits, which meant the first orbit was not, in this case, discarded. Despite these first orbits having larger ramps than subsequent orbits, fitting the light curves with them included still led to good fits to the data. For these observations, the GRISM512 subarray was used, with eight up-the-ramp reads using the SPARS25 sampling sequence, and an exposure time of 138.380508 s, combined with a scan rate of 0.111 arcsec/s. Additionally, a single transit of TOI-270 d was acquired using the same setup. These observations were taken as part of GO-15814 (PI: Thomas Mikal-Evans).
We find that the white light-curve depth for TOI-270 c varies drastically, with a 1000 ppm difference between visits 2 and 3. On the other hand, the spectral shape of these visits is consistent, as shown in Figures 23 and 24. However, the spectral shape is itself strange: there is a significant drop at bluer wavelengths. Atmospheric scenarios seem unlikely because of the size of the feature (3.5 scale heights), leaving two hypotheses. First, it could be due to the fitting of the data, but this seems unlikely as the effect is recovered in all three visits. Furthermore, there are no significant residuals within any of the fits and there are no indications that the fits are, in any way, poor.
The second possible explanation is that the feature could be due to the transit light source effect, as stellar spots or faculae can cause significant spectral modulation. While no spot-crossing events were found in the light curves, unocculted spots could be to blame and these could also explain the large difference between the averaged HST data and the TESS transit depth from Van Eylen et al. (2021). Surprisingly, the HST data for TOI-270 d have no such feature. The observation of TOI-270 d was taken 4 days after the third visit of TOI-270 c, with the observations of TOI-270 c being taken over a period of roughly 7 months. There are 56.61 days between visits 1 and 2, while visits 2 and 3 were separated by 164.16 days. Van Eylen et al. (2021) estimated that the rotation period of the star was around 58 days, which would imply that approximately the same face of the star was visible for each observation of TOI-270 c. Nevertheless, the visible portion of the star would have only changed by around 7% between the third visit of TOI-270 c and the transit of TOI-270 d. The transit chords of these planets are different as they have slightly different inclinations, but this would not affect the transit light source effect from unocculted spots.
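The timing argument above can be checked with a few lines of arithmetic. The sketch below treats the roughly 58 day rotation period quoted from Van Eylen et al. (2021) as exact, which is an assumption, and uses a hypothetical helper name (rotation_cycles) to convert each gap between observations into a number of stellar rotations plus the leftover fraction of a rotation.

```python
def rotation_cycles(gap_days, rotation_period_days=58.0):
    """Return (whole rotations, leftover fraction of a rotation) for a time gap.

    rotation_period_days defaults to the approximate value quoted in the text;
    treating it as exact is an assumption made for illustration only.
    """
    cycles = gap_days / rotation_period_days
    whole = int(cycles)
    return whole, cycles - whole


# Gaps quoted in the text, in days.
gaps = {
    "TOI-270 c visit 1 to visit 2": 56.61,
    "TOI-270 c visit 2 to visit 3": 164.16,
    "TOI-270 c visit 3 to TOI-270 d": 4.0,
}

for label, gap in gaps.items():
    whole, frac = rotation_cycles(gap)
    print(f"{label}: {whole} full rotations + {frac:.3f} of a rotation")
```

The 4 day gap corresponds to about 0.07 of a rotation, in line with the figure of around 7% quoted above. Because the rotation period is itself only an estimate, the leftover fractions for the longer gaps carry a correspondingly larger uncertainty.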
We attempted atmospheric retrievals on both planets, but could find no model that would adequately fit the data for TOI-270 c. Meanwhile, for TOI-270 d, our retrievals uncovered evidence for H 2 O.

B.17. TOI-674 b
The discovery of the Neptune-sized planet TOI-674 b (R = 5.25 R ⊕ , M = 23.6 M ⊕ ) was recently announced (Murgas et al. 2021). The planet has a period of only 1.977143 days, but orbits an M dwarf (T eff = 3514 K) and thus has an equilibrium temperature of around 650 K. Three transits of TOI-674 b were taken as part of GO-15333 (PI: Ian Crossfield). These have previously been presented in Brande et al. (2022), where the discovery of a water feature was announced. Their study also utilized data from TESS and Spitzer, but the retrieved water abundance in their study (log(VMR) = −3.00 ± 1.00) is consistent with ours (log(VMR) = $-3.12^{+0.78}_{-1.04}$). We note that a small vertical offset is present between our HST WFC3 G141 spectrum and that of Brande et al. (2022). As such offsets are a common occurrence, this is not concerning except when combining instruments without wavelength overlap. Finding such an offset further motivates our choice to only consider data from HST WFC3 G141 instead of combining all available data sets without being able to verify their compatibility.

B.18. V1298 Tau b and c
V1298 Tau b, a warm Jupiter-sized planet with an orbital period of 24.148 days, was detected around a young solar analog by David et al. (2019b). Further analysis of the K2 data unveiled three additional planets in the system (David et al. 2019a), with planets c and d having orbital periods of 8.25 and 12.4 days, respectively. However, the period of V1298 Tau e could not be constrained as it only transited once in the K2 data. A second transit was later observed by TESS, which helped constrain potential orbital periods, with the highest-probability orbit placing V1298 Tau e close to a 2:1 resonance with V1298 Tau b (Feinstein et al. 2022).
Measuring the masses of these planets has proved difficult due to the intense activity of the host star. Beichman et al. (2019) placed a 3σ upper limit on the mass of V1298 Tau b of 2.2 M J , while dynamical arguments have placed constraints on the total masses of the planet pairs (David et al. 2019a). Further radial velocity measurements were taken and led to the first mass measurements in the system (Suárez Mascareño et al. 2021). Their work concluded that V1298 Tau b has a mass of 0.64 M J while V1298 Tau e has a mass of 1.16 M J , making it much denser than typical giant exoplanets. However, the period of V1298 Tau e derived by Suárez Mascareño et al. (2021) disagrees at the 4σ level with the value from Feinstein et al. (2022).
Two data sets have been taken with Hubble WFC3 G141 to probe the atmospheres of worlds in the V1298 Tau system. The transit of V1298 Tau b utilized 10 Hubble orbits and was taken as part of GO-16083 (PI: Kamen Todorov). For these observations, the GRISM256 subarray was used, with five up-the-ramp reads using the SPARS25 sampling sequence, leading to an exposure time of 89.661957 s. A scan rate of 0.230 arcsec/s was used, which gave a scan length of around 170 pixels. The observation of V1298 Tau c (GO-16462, PI: Vatsal Panwar) required only eight orbits but used the same detector setup as the transit of V1298 Tau b.
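As a rough consistency check on the quoted setup, the scan length follows from the scan rate multiplied by the exposure time and divided by the detector plate scale. The sketch below reads the quoted scan rate as 0.230 arcsec per second and assumes a plate scale of 0.121 arcsec per pixel along the scan direction; that plate scale is not stated in the text, and the function name is purely illustrative.

```python
def scan_length_pixels(scan_rate_arcsec_per_s, exposure_time_s,
                       plate_scale_arcsec_per_pixel=0.121):
    """Approximate spatial-scan length on the detector, in pixels.

    The default plate scale is an assumed value for the WFC3 IR channel
    along the scan direction; it does not come from the text.
    """
    scan_length_arcsec = scan_rate_arcsec_per_s * exposure_time_s
    return scan_length_arcsec / plate_scale_arcsec_per_pixel


# V1298 Tau b setup quoted in the text.
length = scan_length_pixels(0.230, 89.661957)
print(f"Scan length: {length:.0f} pixels")
```

Under this assumption the result is close to the scan length of around 170 pixels quoted above; a slightly different plate scale would shift it by a few pixels.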
The white light-curve fits for the two planets are shown in Figure 25. In both cases non-Gaussian residuals can be seen, which are likely due to the high variability of the host star. Nevertheless, thanks to the divide-by-white method, the residuals on the spectral light curves were Gaussian. The spectra recovered showed clear evidence of spectral modulation and we conducted retrievals to attempt to constrain the chemistry. As noted above, there are disagreements about the orbital period of V1298 Tau e. As the total radial velocity signal is a combination of components from all planets in the system, an error in the derivation of the period of one planet can lead to an inaccurate mass measurement of both it and another planet in the system. Additionally, V1298 Tau c does not have a measured mass. Therefore, for both planets we attempted retrievals in which we fitted for the mass (Changeat et al. 2020c). For V1298 Tau b, we also ran retrievals with the mass fixed to the value from Suárez Mascareño et al. (2021).
In the case of V1298 Tau b, the fixed-mass retrievals did not lead to models that fitted the data well. As can be seen in Figure 26, the retrievals were unable to recreate the strong water feature seen at 1.4 μm. When we fitted for the mass, the retrieval was subsequently able to fit this feature but the retrieved mass was extremely low. We placed a lower bound on the mass at 0.1 M J , which is far below the 0.64 M J from Suárez Mascareño et al. (2021).
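One way to see why a freely fitted mass can drift low is through the scale height. As a sketch, assuming an isothermal atmosphere with surface gravity g = GM/R p ² (with G the gravitational constant, M the planet mass, and the remaining symbols as used elsewhere in this work), the scale height responds to the mass as

```latex
H = \frac{kT}{\mu g} = \frac{kT R_p^2}{\mu G M}
\quad\Longrightarrow\quad
\frac{H(M = 0.1\,M_{\rm J})}{H(M = 0.64\,M_{\rm J})} = \frac{0.64}{0.1} = 6.4 .
```

At fixed temperature, radius, and mean molecular weight, lowering the mass from 0.64 M J to the 0.1 M J bound therefore inflates the scale height, and with it the depth of a feature spanning a given number of scale heights, by a factor of about 6.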
For V1298 Tau c, both retrievals fitted for the mass as there are currently no constraints on it from radial velocities or transit timing variations. As seen in Figure 27, the retrieval without optical absorbers struggles to fit the spectrum. When optical absorbers are included, a combination of FeH and e− helps to create the features seen, yet the cool temperature of the planet makes the existence of these species unlikely.
Given the issues with fitting these spectra, it is possible that the stellar activity has affected the spectrum recovered with Iraclis. Iraclis can fit the long-term trend using a linear or quadratic model, but in these cases a different model might be preferable (e.g., a sinusoid). However, we leave such an exploration for future work, which should also aim to resolve the discrepancy in the period of V1298 Tau e to increase the accuracy of, and confidence in, the mass measurements. Some transit timing variations have already been seen in the system and future transit measurements may also lead to mass constraints. Due to the poor fitting of the spectra of V1298 Tau b and c, we do not include them in the primary analyses of this paper (e.g., any of the trend fitting) but include them for completeness and to highlight the issue with the data.

B.19. WASP-6 b
Discovered by Gillon et al. (2009), WASP-6 b has a mass around half that of Jupiter but is inflated and has an equilibrium temperature of 1200 K. The atmosphere of the planet has been widely studied, with Nikolov et al. (2015) presenting the HST STIS and Spitzer IRAC transmission spectrum, which showed signs of scattering in the optical. The same data were later analyzed by Sing et al. (2016a) before Carter et al. (2020) presented the HST WFC3 G141 spectrum. Carter et al. (2020) also took data with the Very Large Telescope (VLT), and other ground-based spectra of WASP-6 b have also been obtained, all of which have concluded the atmosphere is hazy because of a lack of spectral features due to Na or K (Jordán et al. 2013; Carter et al. 2020).
The HST WFC3 G141 data for WASP-6 b were taken for proposal GO-14767 (PI: David Sing; Sing et al. 2016b). For these observations, the GRISM512 subarray was used, with eight up-the-ramp reads using the SPARS25 sampling sequence, and an exposure time of 138.380508 s, combined with a scan rate of 0.06 arcsec/s. The data were previously presented in Carter et al. (2020), where they were combined with data from the VLT. The analysis by Carter et al. (2020) again found the atmosphere to be hazy and they also noted that corrections for the stellar heterogeneity could have a significant effect on the Na and K abundances. We find a highly similar HST WFC3 G141 spectrum to that work, as shown in Figure 28.

B.20. WASP-18 b
WASP-18 b (Hellier et al. 2009) has been thoroughly studied since its discovery in 2008. Spitzer, Hubble WFC3, and ground-based eclipses have been taken (Nymeyer et al. 2011; Iro & Maxted 2013; Sheppard et al. 2017; Arcangeli et al. 2018; Manjavacas et al. 2019; Kedziora-Chudczer et al. 2019; Gandhi et al. 2020) as well as phase curves with Hubble WFC3 and TESS (Arcangeli et al. 2019; Shporer et al. 2019). These have revealed a low albedo, poor redistribution of energy to the nightside, and evidence for an inverted dayside temperature-pressure profile. In particular, the analysis from Sheppard et al. (2017) considered a similar data set to ours and detected a strong thermal inversion, associated with the presence of H 2 O and CO.
A HST transmission spectrum was taken, along with two eclipses, as part of a phase curve (GO-13467, PI: Bean). These observations had an exposure time of 73.74 s, having used the GRISM256 aperture and the SPARS10 sampling sequence with 12 up-the-ramp reads. Due to the high mass of WASP-18 b, the 1σ uncertainties on the recovered transit spectrum were equivalent to around 5 scale heights, denying any chance of recovering spectral features.

B.21. WASP-19 b
WASP-19 b orbits its bright host star on a very short orbit (0.94 days) and its high temperature and large size make it an excellent target for atmospheric studies (Hellier et al. 2011a). WASP-19 b has been the subject of a number of investigations from both the ground and from space. Work by Anderson et al. (2013) analyzed four Spitzer eclipses, taken across 3.6-8 μm, and constructed a spectral energy distribution of the planet's dayside atmosphere. They found no stratosphere, supporting the hypothesis that hot Jupiters orbiting active stars have suppressed thermal inversions (Knutson et al. 2010). Analysis of the TESS optical phase curve showed moderately efficient day-night heat transport, with a dayside temperature of 2240 K and a day-to-night contrast of around 1000 K (Wong et al. 2020). This study also utilized a host of ground-based observations by Anderson et al. (2010b), Burton et al. (2012), Abe et al. (2013), and Bean et al. (2013).
WASP-19 b has also been studied via transmission spectroscopy. The retrievals of the STIS-G430L, G750L, WFC3-G141, and Spitzer-IRAC observations suggest the presence of water at log(H 2 O) ≈ −4 but show no evidence for optical absorbers (Sing et al. 2016a; Barstow et al. 2017; Pinhas et al. 2019). Those results do not match the ground-based transits that were acquired with the European Southern Observatory's VLT, using the low-resolution FORS2 spectrograph, which covers the entire visible-wavelength domain (0.43-1.04 μm). When analyzing these data, Sedaghati et al. (2017) detected the presence of TiO to a confidence level of 7.7σ. However, data from the Magellan/Inamori-Magellan Areal Camera did not find any evidence for TiO or Na as a featureless transmission spectrum was recovered (Espinoza et al. 2019).
Two transit observations of WASP-19 b have been acquired. The first, taken in staring mode as part of Proposal GO-12181 (PI: Drake Deming; Deming 2010), has previously been presented in other studies (e.g., Sing et al. 2016a). The scanning-mode data were taken with the GRISM512 aperture and consisted of four up-the-ramp reads with the SPARS25 sequence. This gave an exposure time of 46.695518 s, with a scan rate of 0.026 arcsec/s. These observations were part of a phase curve and, while guide star acquisition issues were incurred, the failure in pointing happened long after the transit had occurred, meaning the data analyzed here were unaffected.

B.22. WASP-103 b
WASP-103 b is an ultrashort-period planet (P = 22.2 hr) whose orbital distance is less than 20% larger than its Roche radius, resulting in the possibility of tidal distortions and mass loss via Roche-lobe overflow (Gillon et al. 2014). Given its size, temperature, and the brightness of its host star it is a great target for atmospheric studies and has been observed with numerous instruments.
WASP-103 b's HST WFC3 emission spectrum was found to be featureless down to a sensitivity of 175 ppm, showing a shallow slope toward the red (Cartier et al. 2017). Work by Manjavacas et al. (2019), which performed a reanalysis of the same data set, found that the emission spectrum of WASP-103 b was comparable to that of an M3 dwarf. Delrez et al. (2018) obtained several ground-based, high-precision photometric eclipse observations which, when added to the HST data, could be fit with an isothermal blackbody or with a low-water-abundance atmosphere with a thermal inversion. However, their Ks-band observation showed an excess of emission compared to both these models. More recently, a phase curve of the planet was observed and reported in Kreidberg et al. (2018b). The study also utilized the previous HST emission spectra and confirmed a seemingly featureless dayside. A later study on the same data employed a unified phase-curve retrieval technique to obtain a more complex picture of the planet (Changeat 2022). It confirmed the presence of a thermal inversion and dissociation processes on the dayside of the planet and found a signature of FeH emission. The study also constrained water vapor across the entire atmosphere. Furthermore, ground-based transmission observations found strong evidence for Na and K (Lendl et al. 2017). Later observations by Wilson et al. (2020) yielded a featureless spectrum, while a comprehensive analysis of 11 transmission spectra by Kirk et al. (2021) found evidence for unocculted regions of the star as well as weak evidence for TiO.
Two HST G141 phase curves of WASP-103 b were obtained (PN: 14050, PI: Kreidberg et al. 2014a), each containing a single transit and eclipse. We analyzed these two transits and observed that the slope in the final spectrum is well fit by VO. Other optical absorbers might be present (TiO, H − ) but the data do not allow verification of this. The solution found spans a wide range of metallicities, all subsolar. We note that Kirk et al. (2021) found that VO could only account for their spectrum if present in extremely high quantities.

B.23. WASP-107 b
A sub-Saturn around a solar-metallicity K6 star, WASP-107 b was immediately noted as an excellent target for atmospheric studies (Anderson et al. 2017). Soon after its discovery, a transmission spectrum of WASP-107 b was taken with HST WFC3 G141 as part of proposal GO-14915 (PI: Laura Kreidberg). The data were previously presented in Kreidberg et al. (2018a), showing strong evidence for the presence of water but a possible methane depletion. The study also noted that the features seen were smaller than would be expected for a cloud-free atmosphere, implying the presence of high-altitude aerosols. In line with the results from Kreidberg et al. (2018a), we found strong evidence for water but muted features compared to a clear atmosphere.
Additionally, an observation was taken with the G102 grism of HST WFC3 (GO-14916, PI: Jessica Spake), which was used to demonstrate that the atmosphere of WASP-107 b was eroding as the G102 data gave access to the 1.083 μm He line (Spake et al. 2018). High-resolution observations have since confirmed this detection (Allart et al. 2019; Kirk et al. 2020). We do not fit the G102 data as part of this study to ensure homogeneity across our data sets. We show a comparison between the spectrum from previous studies and ours in Figure 29.

B.24. WASP-121 b
(Evans et al. 2016, 2018). The authors of these studies note that chemical equilibrium models with solar abundances cannot reproduce the spectrum seen, while free chemical retrievals can only do so by converging to high abundances of VO and FeH. In parallel, high-resolution, ground-based observations of the transit have put upper limits on the abundances of TiO and VO at the terminator with log (VMR) < −7.3 and −7.9, respectively, suggesting these cannot be causing the inversion seen (Merritt et al. 2020). However, the study highlighted that these limits are largely degenerate with other atmospheric properties such as the scattering properties or the altitude of clouds on WASP-121 b. Another study found a host of atomic metals, including V, which are predicted to exist if a planet is in equilibrium and has a significant quantity of VO (Hoeijmakers et al. 2020). They too noted the absence of TiO, which could support the hypothesis that Ti is depleted via a cold trap. Furthermore, various high-resolution studies have found evidence for absorption due to metallic lines (e.g., Cabot et al. 2020; Gibson et al. 2020; Borsa et al. 2021).
Here, we fit three transit observations. Two of these were obtained as part of a phase curve  Evans et al. (2016). We choose to refit this observation to ensure the methodology, parameters, and limb-darkening coefficients were the same across all three transit fits.

B.25. WASP-178 b
WASP-178 b, also known as KELT-26 b, is an ultrahot Jupiter orbiting an A1V host star (Hellier et al. 2019). The planet appears to be in a highly misaligned orbit and has a mass of $1.93^{+0.14}_{-0.16}$ M J (Martinez et al. 2020), making it one of the heaviest planets in our sample.
A single transit was taken as part of proposal GO-16450 (PI: Joshua Lothringer). For these observations, the GRISM256 subarray was used, with eight up-the-ramp reads using the SPARS25 sampling sequence, and an exposure time of 138.354034 s, combined with a scan rate of 0.07597 arcsec/s. We note that WFC3 data with the UVIS and G102 grisms have also been taken for this planet but were not included in this study due to the need to ensure homogeneity across the planet sample.
We fitted the data with Iraclis, and the preferred atmospheric model finds strong evidence for water in the atmosphere. The increasing absorption at shorter wavelengths is best fit with a large abundance of TiO as well as absorption by H −. The result is consistent with the expected chemistry given the equilibrium temperature of around 2400 K, although we note that the retrieved temperature is cooler than expected (∼1250 K). We note that analysis of the HST WFC3 UVIS data by Lothringer et al. (2022) found evidence for SiO. To understand this planet fully, an analysis of all three HST data sets is required.

Appendix C Fixed C/O Retrievals
One set of retrievals conducted in this study fitted equilibrium chemistry models to the data, with the C/O ratio and metallicity as free parameters controlling the chemistry. Despite the large uncertainties on the retrieved C/O ratio, we noted that the majority of planets were not consistent with C/O ratios larger than 1. To explore this further we attempted chemical equilibrium retrievals with fixed C/O ratios of 0.5 or 1.
We then compared the metallicities retrieved in each case, as well as the goodness of the fit. We find that forcing the C/O to be equal to 1 led to the metallicities of the planets generally being retrieved as solar but that these models provided a poorer fit to the data than when the C/O ratio was free to vary. On the other hand, we find that fixing the C/O to 0.5 generally leads to only minor differences in the metallicity and goodness of fit in comparison to the free C/O case. The results of these retrievals are displayed in Figures 30 and 31 and provide further suggestions that high C/O ratio atmospheres are not compatible with the spectra derived in this work. However, we caution that the narrow wavelength coverage of HST WFC3 G141 does not provide strong constraints on carbon-bearing species so the robustness of the finding is questionable.

Appendix D Bayesian Hierarchical Modeling
The following derivation is based primarily on the work of Lustig-Yaeger et al. (2022), so, while we will provide a brief discussion on the method, interested readers can refer to Section 2.2 in that work for more detailed discussion. Consider that we have observed N exoplanets, each with M n observations D n . For our case M n = 1 but it can be more than 1. For each of these planets we can infer the joint posterior distribution of a predefined set of parameters θ n given D n , P(θ n |D n ), via the standard Bayesian formulation:

$$P(\theta_n | D_n) = \frac{P(D_n | \theta_n)\, P(\theta_n)}{P(D_n)}, \qquad (5)$$

where P(D n |θ n ) is the likelihood of observing D n given the specific set of parameters θ n . P(D n ) is the marginal likelihood over all possible sets of θ n , also known as the evidence. The prior function, P(θ n ), is taken to be flat or uninformative.
Suppose we would like to derive the population-level trend of planetary temperature against some molecular abundance. If we assume the trend can be parameterized by a set of hyperparameters α, we can compute the (population-level) likelihood of the entire data ensemble, $\mathcal{L}_\alpha$, as

$$\mathcal{L}_\alpha \equiv P(\{D_n\} | \alpha) = \int P(\{D_n\} | \{\theta_n\})\, P(\{\theta_n\} | \alpha)\, d\{\theta_n\}. \qquad (6)$$

If we further assume that there is no likelihood covariance between parameters of different exoplanets n, we can express Equation (6) as the product of N marginalized integrals over the parameters θ n :

$$\mathcal{L}_\alpha = \prod_{n=1}^{N} \int P(D_n | \theta_n)\, P(\theta_n | \alpha)\, d\theta_n. \qquad (7)$$

The first term is simply the likelihood function for each individual observation (nth planet) and the second term acts as a reweighting function.
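We can further manipulate Equation (6) by recognizing that the second term is the ratio between the new prior, P α ( f n,T ), which depends solely on α, and the original prior, P 0 ( f n,T ), multiplied by the prior function of the parameters, P(θ n ):

$$\mathcal{L}_\alpha = \prod_{n=1}^{N} \int P(D_n | \theta_n)\, P(\theta_n)\, \frac{P_\alpha(f_{n,T})}{P_0(f_{n,T})}\, d\theta_n \propto \prod_{n=1}^{N} \int P(\theta_n | D_n)\, \frac{P_\alpha(f_{n,T})}{P_0(f_{n,T})}\, d\theta_n.$$

Note that we have combined P(D n |θ n ) and P(θ n ) to get the posterior P(θ n |D n ). We have also omitted the evidence term here since it is merely a constant in this context. In our implementation, we have assumed a linear trend for the temperature, hence

$$f_T(m, c, X_{\rm mol}) = m X_{\rm mol} + c,$$

where m and c represent the slope and intercept of the straight line, and X mol represents the molecular abundance of the chemical species. We can simplify the integral in Equation (7) by summing over all K samples in the posterior traces:

$$\mathcal{L}_\alpha \approx \prod_{n=1}^{N} \frac{1}{K} \sum_{k=1}^{K} \frac{P_\alpha(f_{T,n,k})}{P_0(f_{T,n,k})}.$$

The new prior function, P α ( f T,n,k ), is assumed to be a Gaussian distribution, and therefore we can compute the probability analytically by comparing f T,n,k with f T (m, c, X mol ):

$$P_\alpha(f_{T,n,k}) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{\left(f_{T,n,k} - f_T(m, c, X_{\rm mol})\right)^2}{2\sigma^2}\right].$$

The hyperparameter α has in total three free parameters, i.e., α ≡ [m, c, σ], that we can infer from the posterior traces.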
Once the likelihood function, $\mathcal{L}_\alpha$, is computed, the posterior on the hyperparameter, α, can be inferred simply by referring to Equation (5):

$$P(\alpha | \{D_n\}) \propto \mathcal{L}_\alpha\, P(\alpha),$$

where P(α) is the (hyper)prior function for the hyperparameters α. In this paper we have fixed all the hyperpriors as uniform distributions.
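The sample-based estimate of the population likelihood described above can be sketched in a few lines. This is an illustrative toy rather than the pipeline used for the results: the function names, the toy samples, and the flat original prior of width prior_width are all assumptions, and the linear relation f_T = m X_mol + c is taken from the notation f T (m, c, X mol ) used in the text. In practice a nested sampler would evaluate a function like this for each proposed α.

```python
import math


def log_gaussian(x, mean, sigma):
    """Natural log of a normal density with the given mean and width."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def log_population_likelihood(alpha, planets, prior_width):
    """Log of the importance-sampled population likelihood.

    alpha       : (m, c, sigma), i.e. slope, intercept, and Gaussian scatter
    planets     : one list per planet of (x_mol, f_t) posterior samples
    prior_width : width of the assumed flat original prior on f_t (P_0 = 1 / width)
    """
    m, c, sigma = alpha
    log_p0 = -math.log(prior_width)
    total = 0.0
    for samples in planets:
        # Per-sample log weight: log P_alpha(f) - log P_0(f).
        log_w = [log_gaussian(f_t, m * x_mol + c, sigma) - log_p0
                 for x_mol, f_t in samples]
        # Log of the mean weight, computed stably (log-mean-exp).
        peak = max(log_w)
        mean_w = sum(math.exp(w - peak) for w in log_w) / len(log_w)
        total += peak + math.log(mean_w)
    return total


# Toy posterior samples (hypothetical numbers) lying close to f_t = 2 x + 1.
toy_planets = [
    [(0.0, 1.0), (0.1, 1.2)],
    [(1.0, 3.0), (1.1, 3.2)],
]
on_trend = log_population_likelihood((2.0, 1.0, 0.5), toy_planets, prior_width=10.0)
flat = log_population_likelihood((0.0, 1.0, 0.5), toy_planets, prior_width=10.0)
print("trend preferred over constant:", on_trend > flat)
```

Working with log weights and a log-mean-exp avoids numerical underflow when individual samples sit far from the proposed trend, which is common early in a sampling run.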
To infer the hyperparameters, we have used the MultiNest algorithm (Feroz et al. 2009; Buchner et al. 2014) to compute the preferred BHM importance sampling model. Using the log-evidence provided by MultiNest, we are able to compare the different models and assess the evidence for a particular trend when compared to a null hypothesis.
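A log-evidence difference is often quoted as an equivalent "sigma" significance. The text does not state which conversion was used, so the sketch below assumes one common choice: the Bayes factor B is mapped to a p-value through the calibration B = −1/(e p ln p), valid for p < 1/e, and the p-value is then mapped to a two-sided Gaussian significance. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def bayes_factor_to_sigma(delta_log_evidence):
    """Convert a natural-log Bayes factor (> 0) to an equivalent two-sided sigma.

    Assumes the calibration B = -1 / (e * p * ln p), which holds for p < 1/e.
    """
    if delta_log_evidence <= 0.0:
        raise ValueError("the preferred model must have the larger evidence")
    target = math.exp(-delta_log_evidence)  # equals 1/B = -e * p * ln(p)

    # -e * p * ln(p) rises monotonically from 0 to 1 as ln(p) goes to -1,
    # so bisect on ln(p) to find the matching p-value.
    lo, hi = -700.0, -1.0  # bounds on ln(p)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        value = -math.e * math.exp(mid) * mid
        if value < target:
            lo = mid
        else:
            hi = mid
    p_value = math.exp(0.5 * (lo + hi))

    # Two-sided Gaussian significance corresponding to this p-value.
    return -NormalDist().inv_cdf(0.5 * p_value)


for delta in (1.0, 2.5, 5.0):
    print(f"Delta ln(E) = {delta}: {bayes_factor_to_sigma(delta):.1f} sigma")
```

Other conventions exist, so significances computed this way should be read as indicative rather than as a reproduction of the values reported in the tables.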
In the main text we showed the linear models and provided the evidence for these and for the null hypothesis. Here we show the linear models again, but also show the model, as well as the traces, for the null hypothesis in each case. For H 2 O, CH 4 , HCN, and NH 3 , these are given in Table 11.

Note. In each case, the null hypothesis (i.e., constant abundance with temperature) yielded a preferable fit to the data.
In the main text we fitted the trends for the optical absorbers on all retrievals which, when the optical absorbers were included, led to an atmospheric detection of >3σ, even if the model without optical absorbers was preferred. We did this because we have no reason to expect these planets are not drawn from the same distribution as the others and so we want to test how applicable the GGchem predictions are to all planets. However, in this appendix we also show the fits to only those planets where the retrieval with optical absorbers gave a preferable fit.
When all optical-absorber retrievals are used, we find evidence for an increasing abundance of e− with increasing temperature. The fits to these are given in Figures 33, 34, 35, and 36. Additionally, the associated evidence is given in Table 12. Similarly, the fits to only retrievals which preferred the presence of optical absorbers are given in the same figures. In this case, no statistically viable trend was uncovered and the log-evidence for the models is given in Table 13.
For the mass-metallicity fits, in the main text we explored the impact of using only >3σ atmospheric detections or using those at 2-3σ, too. We provide here the full set of models, for both the linear trend and the null hypothesis, in Figure 37.
With the metallicity for the GGchem retrievals, we used the host star's metallicity [Fe/H] to infer the relative planet-star metallicity and fit our trend to these data. Here we also explore fitting to just the recovered planet metallicity. When only retrievals which led to strong (>3σ) atmospheric detections are used, we again find the linear model has a negative slope (implying a decreasing metallicity with increasing mass, as expected). However, the null hypothesis again has a higher Bayesian evidence. Furthermore, no trend, or even a positive slope, is within the 1σ bounds of the best-fitting model. When the retrievals which provided a fit which was preferred to 2-3σ over the flat model are utilized, the BHM shows even less evidence for a mass-metallicity trend. The results are shown in Figure 38 and the hyperparameters, as well as the log-evidence, are given in Table 14.

Note. Only the fit to e− provided evidence of a trend with temperature (at a 2.28σ level). In the other cases, the null hypothesis (i.e., constant abundance with temperature) was preferred.

Table 13
Hyperparameters for the BHM Fits for Each Molecule when Using All Retrievals where Optical Absorbers Were Preferred to the Model without Them
Note. In each case, the null hypothesis (i.e., constant abundance with temperature) yielded a preferable fit to the data.
For completeness, we also show the fits for the water-to-hydrogen case. Again, there is no evidence for a mass-metallicity trend within the data. The plots are given in Figure 39, while Table 5 contains the log-evidence for each fit. We provide the retrieved metallicities and water abundances in Tables 15 and 16.

Figure 38. Same as Figure 37 but without accounting for the stellar metallicity. Again, there is no evidence for a mass-metallicity trend in these data as the constant-metallicity model provides a preferable fit to these data.
Figure 39. Fits from our BHM to the water ratio which was determined using the methods of Welbanks et al. (2019). Left: BHM applied to retrievals that yielded a fit which was preferred to >3σ compared to the flat model. Right: BHM applied to retrievals that yielded a fit which was preferred to >2σ compared to the flat model. The fits for a linear trend (top) and flat trend (i.e., constant with mass, bottom) are shown, with the thick line indicating the best-fit model and the thinner lines representing the traces from the fit that fell within 1σ of the best-fit model. In both cases, the flat model is preferred, implying we cannot conclude there is a trend between planet mass and the water-to-hydrogen ratio from these data.

where R p is the planet's radius, R s is the radius of the host star, and H is the atmospheric scale height, calculated from

$$H = \frac{kT}{\mu g},$$

where k is the Boltzmann constant, T is the planet's equilibrium temperature, μ is the atmospheric mean molecular weight (set to 2.3), and g is the planet's surface gravity.
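To give a sense of scale for the quantities defined above, the following sketch evaluates the scale height and the transit-depth change produced by one scale height of atmosphere, using the common approximation 2 R p H / R s ². The planet and star values are hypothetical round numbers rather than entries from this study; μ is multiplied by the atomic mass unit to give a mass, and the surface gravity is computed from an assumed mass and radius. The function names are illustrative only.

```python
K_BOLTZMANN = 1.380649e-23         # J/K
ATOMIC_MASS_UNIT = 1.66053907e-27  # kg
GRAV_CONSTANT = 6.67430e-11        # m^3 kg^-1 s^-2
M_JUPITER = 1.898e27               # kg
R_JUPITER = 7.1492e7               # m
R_SUN = 6.957e8                    # m


def scale_height(t_eq, mass, radius, mu=2.3):
    """Isothermal scale height in metres for temperature t_eq (K), mass (kg), radius (m)."""
    gravity = GRAV_CONSTANT * mass / radius ** 2
    return K_BOLTZMANN * t_eq / (mu * ATOMIC_MASS_UNIT * gravity)


def depth_per_scale_height(radius, star_radius, height):
    """Approximate change in transit depth from one scale height of atmosphere."""
    return 2.0 * radius * height / star_radius ** 2


# Hypothetical hot Jupiter around a Sun-like star (not a planet from the sample).
planet_radius = 1.2 * R_JUPITER
h = scale_height(t_eq=1500.0, mass=1.0 * M_JUPITER, radius=planet_radius)
delta = depth_per_scale_height(planet_radius, R_SUN, h)
print(f"H = {h / 1e3:.0f} km, depth change = {delta * 1e6:.0f} ppm per scale height")
```

Because the scale height is inversely proportional to surface gravity, repeating the calculation with a much larger mass at similar radius shrinks the signal per scale height accordingly, which is why the spectra of the most massive planets in the sample are the hardest to interpret.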
An example fit for WASP-107 b is given in Figure 40. We provide the feature size for all planets studied here in Table 17. The spectral feature size has often been used to infer the presence of clouds and proposed as a way of guiding observers as to the spectral modulation that could be expected when planning future observations. Across the population, we recover an average feature size of 0.92 scale heights, a comparable value to previous studies, e.g., 1.4 H (Fu et al. 2017) and 0.89 H (Wakeford et al. 2019). The amplitude of this feature is far below what would be expected from a clear, solar-metallicity atmosphere. However, we note that the magnitude of the atmospheric absorption is only valid across the HST WFC3 G141 range and that other instruments, particularly those that probe further into the IR, will see larger feature sizes.
Extending the models used to derive the 1.4 μm feature size in the HST WFC3 G141 data, we estimate the amplitude of features seen in observations with future instruments by studying the minimum and maximum transit depth across their spectral coverage. For JWST NIRISS GR700XD (0.6-2. Therefore, while clouds and hazes obviously need to be accounted for during the planning of observations with future facilities, current data show the expected amplitude should, on average, be greater than a single scale height. However, we note also that the methodology used to measure the amplitude of absorption features is somewhat flawed. While the parameters in Equation (12) allow the spectrum to be modulated to fit the data and account for the features seen, the final fit does not provide a robust analysis of the nature of the atmosphere. The presence, and effect, of clouds can be better understood by fitting physical models to the data via atmospheric retrievals.
(McKemmish et al. 2019), VO (McKemmish et al. 2016), FeH (Wende et al. 2010), and H⁻ (John 1988; Lothringer et al. 2018; Edwards et al. 2020b).

3. Equilibrium chemistry. For these retrievals we used the equilibrium chemistry code GGchem (Woitke et al. 2018) via the recently developed TauREx plugin (Al-Refaie et al. 2022). As with the free chemistry retrievals, we included Rayleigh scattering and CIA as well as both simple gray clouds and Mie scattering. We ran these retrievals with optical absorbers and without, with the free parameters being the atmospheric metallicity and C/O ratio.

4. Fixed C/O equilibrium chemistry. As the HST WFC3 band only really allows for the confident detection of H₂O, chemical equilibrium retrievals are essentially fitting two free parameters (metallicity and C/O ratio) to a single observable (the H₂O abundance). Therefore, the results are generally highly degenerate, particularly given the lack of sensitivity to carbon-bearing species.

Figure 1. For the planets studied here, the distribution of their semimajor axis and mass (gold). The entire currently known transiting population with mass measurements is shown in gray.

Figure 2. HST/WFC3/G141 data sets utilized in this study. For each planet, the preferred model is denoted by the solid line while the dotted lines show other models which provide a poorer fit. Models including optical absorbers (red) are only shown if they provide a preferable fit to the data than those without (blue). In each case, the gray line denotes the flat model. The error bar in the top-right corner of each plot highlights the transit depth variation due to 1 scale height of atmosphere. Planets have been ordered in terms of their equilibrium temperature, from coolest (top left) to hottest (bottom right).

Figure 4. Retrieved abundances of H₂O, CH₄, HCN, NH₃, TiO, VO, FeH, and e− against planet equilibrium temperature. In some cases, only an upper bound on the presence of the molecule could be placed and, for these, the error bar extends to log₁₀(VMR) = −12, the lower bound of our priors. The upper bound is shown by the dashed line. The filled regions bounded by dashed-dotted gray lines indicate the predicted abundances from GGchem chemical equilibrium models (assuming C/O = 0.54 and solar metallicity) across 1 × 10² to 1 × 10⁵ Pa (1 × 10⁻³ to 1 bar). For the lower four plots, black data points indicate planets for which the retrieval model with these optical absorbers is preferred while gray points represent those for which the fit is preferable without them. In all cases, abundances are only plotted if the associated models provided a >3σ detection compared to a flat model. The thick colored line on each plot indicates the linear trend from the BHM while the thinner colored lines represent the traces from the fit that were within the 1σ errors of the best-fit model. Only the fit to the abundance of e− gave a Bayesian evidence which was greater than that of the null hypothesis.

Figure 5. Comparison of the retrieved abundances of H₂O and e−. Black data points indicate planets for which the retrieval model with these optical absorbers was preferred to >3σ compared to the retrieval without optical absorbers. Gray points represent those which are still best fit with optical absorbers present, but at a lower significance compared to the models without them. In all cases, abundances are only plotted if the associated models provided a >3σ detection compared to a flat model. The blue, red, and green filled regions indicate the abundances derived from GGchem at different metallicities at pressures of 1 × 10² to 1 × 10⁵ Pa. The dotted lines indicate the upper bound used as a prior for each molecule. The lower bound in each case was log₁₀(VMR) = −12.

Figure 7. Retrieved metallicity from our GGChem runs with the model from Thorngren et al. (2016) also shown. Top: BHM results when only using retrievals which give a >3σ atmospheric detection. Bottom: BHM results when using retrievals which give a >2σ atmospheric detection. In both cases, the BHM found a best-fit trend line which had a negative slope. However, in both cases, the null hypothesis (metallicity is not dependent upon mass) was preferred.

Figure 8. Ratio of H₂O-to-H with respect to stellar metallicity values for the planets from our study. The recovered values from Welbanks et al. (2019) are also shown, as are their best-fit trends. Top: BHM results when only using retrievals which give a >3σ atmospheric detection. Bottom: BHM results when using retrievals which give a >2σ atmospheric detection. In the former case, the BHM found a best-fit trend line which had a negative slope while the latter had a positive slope. However, in both cases the null hypothesis (metallicity is not dependent upon mass) was preferred. Additionally, either sign of slope, or indeed no slope, is also within the 1σ errors of each fit.

Figure 9. Comparison of the Bayesian evidence for retrievals with free chemistry and those using the GGChem chemical equilibrium scheme.

Figure 10. Comparison of the 1.4 μm feature height to different bulk parameters for the sub-Neptunes within our study. Red data points indicate those studied by Crossfield & Kreidberg (2017) with their trends also shown in red. Our trends are shown in black and are linear fits, except with temperature, where a second-order polynomial fit was used.

Figure 12. Retrieved metallicity when a simulated data set is created using the model from Thorngren et al. (2016). From our fake data sets, which utilized the error bars from the real observations as well as the retrieved 10 bar radius and cloud pressure, we can recover the input trend, albeit as a marginal detection due to the size of the uncertainties on the planet's metallicity. This finding contrasts with the real data, where no obvious trend was uncovered.
-scanning directions (forward scanning) and to n w rev ( ) for downward-scanning directions (reverse scanning). The reason for using separate normalization factors is the slightly different effective exposure time due to the known upstream/downstream effect (McCullough & MacKenty 2012).

Figure 14. Comparison of the transit spectrum of CoRoT-1b obtained here to those in the literature. While the features within the spectra are similar, there are offsets between different studies.

Figure 15. White light curves for scanning-mode data of GJ1214 b on 2013 August 4 (top) and 12 (bottom). The residuals of the in-transit orbit are non-Gaussian, and this could be caused by starspot crossing.

Figure 16. Transit spectra of GJ1214 b using different amalgamations of data sets. All are roughly consistent with one another, and the spectrum used for atmospheric analyses in this study is shown in black.

B.4. HAT-P-7 b

HAT-P-7 b is an inflated hot Jupiter of 1.4 R_J (Pal et al. 2008), which was studied during the commissioning program of Kepler when the satellite detected the eclipse as part of an optical phase curve (Borucki et al. 2009). These measurements indicated that HAT-P-7 b could have a dayside temperature of around 2650 K, which confirmed predictions from Pal et al. (2008); Fortney et al.

Figure 17. Left: comparison of the spectrum derived in this work of HD 97658 b to that from Knutson et al. (2014), who only analyzed visits 1 and 2. Other than a slight offset, the spectra agree well after two visits but the averaged spectrum of all four transits has a stronger slope. Right: comparison of the spectrum derived in this work of HD 97658 b to that from Guo et al. (2020). While the spectra agree over 1.3-1.5 μm, outside of this range there is a significant disagreement in the features.

Figure 18. (Left) Uncorrected transit spectra for each visit of HD 97658 b. The shaded regions indicate the white light-curve depth, and 1σ uncertainty, for each visit. (Right) Normalized transit spectra for each visit of HD 97658 b. The averaged transit spectrum, which was used in the analysis here, is given in black. (The complete figure set (10 images) is available.)

Figure 19. Comparisons of different spectra for HIP 41378 f. All spectra derived in this work exhibit features while the result from Alam et al. (2022) is flat. Two pipelines were used here to check the result. They give slightly different mean depths and the Cascade spectrum has a decreasing slope at shorter wavelengths. Utilizing the wavelength bins and limb-darkening coefficients from Alam et al. (2022) did not remedy the situation.

Figure 20. Spectrum of HIP 41378 f recovered with Iraclis and the preferred models from our retrievals.

Figure 22.Comparison of the transit spectrum of TrES-4 b obtained here to the spectrum from Ranjan et al. (2014).

Figure 21.Comparison of the transit spectrum of TrES-2 b obtained here to the spectrum from Ranjan et al. (2014).

Figure 23. Left: uncorrected transit spectra for each visit of TOI-270 c. The shaded regions indicate the white light-curve depth, and 1σ uncertainty, for each visit. Right: normalized transit spectra for each visit of TOI-270 c. The averaged transit spectrum, which was used in the analysis here, is given in black.

Figure 24. Comparison of the transit depths obtained here to those from Van Eylen et al. (2021) for TOI-270 c (left) and TOI-270 d (right). As previously noted, we find a large variance in the transit depth obtained for each visit of TOI-270 c and only one of these is of a similar depth to the TESS data.

Figure 25. White light-curve fits for the observations of V1298 Tau b (top) and V1298 Tau c (bottom). We note the non-Gaussian residuals in each case, potentially due to stellar activity.

Figure 26. Top: spectrum of V1298 Tau b recovered with Iraclis and the preferred models from our retrievals. Bottom: the probability distributions for the planetary mass when it was fitted for. We note that, while it gives a preferable fit to the data, the fitted mass is extremely low (a value of 0.64 M_J was recovered by Suárez Mascareño et al. 2021). The optical absorber model is unlikely for this planet given the equilibrium temperature (∼1000 K).

Figure 27. Top: spectrum of V1298 Tau b recovered with Iraclis and the preferred models from our retrievals. Bottom: the probability distributions for the planetary mass when it was fitted for. The optical absorber model is preferred, but it is unlikely that FeH and e− are present in the atmosphere of this planet given the equilibrium temperature (∼700 K).

Figure 28.Comparison of the transit spectrum of WASP-6 b obtained here to the spectrum from Carter et al. (2020).

Figure 29.Comparison of the transit spectrum of WASP-107 b obtained here to the spectrum from Kreidberg et al. (2018a) and Spake et al. (2018).

Figure 30. Comparison of the retrieved metallicity when the C/O ratio is a free parameter to when it is fixed to 0.5 (top) and 1 (bottom). We only show planets within our sample which have a strong (>3σ) atmosphere detection.

Figure 31. Comparison of the Bayesian evidence for the fixed C/O ratio retrievals when it is fixed to 0.5 (top) and 1 (bottom). The dotted line shows the 3σ region.

Figure 32. Retrieved abundances of H₂O, CH₄, HCN, and NH₃ against planet equilibrium temperature. In some cases, only an upper bound on the presence of the molecule could be placed and, for these, the error bar extends to log₁₀(VMR) = −12. The filled regions bounded by dashed gray lines indicate the predicted abundances from GGchem chemical equilibrium models (assuming C/O = 0.54 and solar metallicity) across 1 × 10² to 1 × 10⁵ Pa (1 × 10⁻³ to 1 bar). Left: the thick colored line on each plot indicates the linear trend from the BHM while the thinner colored lines represent the traces from the fit that were within the 1σ errors of the best-fit model. Right: the same except for the flat model (i.e., the null hypothesis).

a Similar to Lustig-Yaeger et al. (2022), we have added an additional term, σ, to account for the variability in the traces.

Figure 33. Retrieved abundances of TiO against planet equilibrium temperature. In some cases, only an upper bound on the presence of the molecule could be placed and, for these, the error bar extends to log₁₀(VMR) = −12. The filled regions bounded by dashed gray lines indicate the predicted abundances from GGchem chemical equilibrium models (assuming C/O = 0.54 and solar metallicity) across 1 × 10² to 1 × 10⁵ Pa (1 × 10⁻³ to 1 bar). Left: the thick colored line on each plot indicates the linear trend from the BHM while the thinner colored lines represent the traces from the fit that were within the 1σ errors of the best-fit model. Right: the same except for the flat model (i.e., the null hypothesis). Top: black data points indicate planets for which the retrieval model with optical absorbers is preferred, while gray points represent those for which a preferable fit is obtained without them. In these plots, all these retrieval traces were used in the BHM. Bottom: only the retrievals for which the optical absorber model was preferred.

Figure 32. The associated log-evidence and best-fit parameters are shown in Table 11. In the main text we fitted the trends for the optical absorbers on all retrievals which, when the optical absorbers were included, led to an atmospheric detection of >3σ, even if the model without optical absorbers was preferred. We did this because we have no reason to expect these planets are not drawn from the

Figure 34.Same as Figure 33 but for VO.

Figure 35.Same as Figure 33 but for FeH.
Table 12
Hyperparameters for the BHM Fits for Each Molecule when Using All Retrievals where Optical Absorbers Were Preferred to the Flat Model by >3σ
In each case, the null hypothesis (i.e., constant abundance with temperature) was preferred.

Figure 36.Same as Figure 33 but for e−.

Figure 37. Retrieved metallicity from our GGChem runs. Left: BHM applied to retrievals that yielded a fit which was preferred to >3σ compared to the flat model. Right: BHM applied to retrievals that yielded a fit which was preferred to >2σ compared to the flat model. The fits for a linear trend (top) and flat trend (i.e., constant with mass, bottom) are shown with the thick line indicating the best-fit model and the thinner lines representing the traces from the fit that fell within 1σ of the best-fit model. In both cases, the flat model is preferred, implying we cannot conclude there is a trend between planet mass and the metallicity from these data.

Figure 38. Same as Figure 37 but without accounting for the stellar metallicity. Again, there is no evidence for a mass-metallicity trend in these data as the constant-metallicity model provides a preferable fit to these data.

Figure 40.Posteriors and best-fit spectrum for amplitude fitting of WASP-107 b.

Table 3
Proposal Information for Data for Which a Good Fit Could Not Be Achieved with Iraclis and the References for the Spectrum Analyzed in This Study

Table 4
Results from Our BHM Fittings in Search of a Mass-Metallicity Trend from Our GGChem Retrievals

Table 6
Stellar Parameters for Observations Acquired from the Literature

Table 7
Planet Parameters for Observations Acquired from the Literature
Note. For consistency, these match those used in the original studies.

Table 9
Planet Parameters Utilized in This Work on Observations Fitted Here with Iraclis
a Mass taken from Suárez Mascareño et al. (2021).

Table 10
White Light-curve Depths, and Midtransit Times, for all Observations Fitted Here

Table 11
Hyperparameters for the BHM Fits for Each Molecule

Table 14
Hyperparameters for the BHM Fits for a Mass-Metallicity Trend when the Stellar Metallicity Was Not Accounted for

Table 15
Results of Our Chemical Equilibrium Retrievals
Planet Name | C/O | log₁₀(Z_P) | log₁₀(Z_P/Z_S) | σ

Table 17
1.4 μm Feature Size, in Scale Heights, of the Planets Studied Here
Note. In this table we report all fitted values, but in the figures we only plot those where the uncertainty of the feature size is less than 2 scale heights.