The JWST Early Release Science Program for Direct Observations of Exoplanetary Systems. V. Do Self-consistent Atmospheric Models Represent JWST Spectra? A Showcase with VHS 1256–1257 b

The unprecedented medium-resolution (R λ ∼ 1500–3500) near- and mid-infrared (1–18 μm) spectrum provided by JWST for the young (140 ± 20 Myr) low-mass (12–20 M Jup) L–T transition (L7) companion VHS 1256 b gives access to a catalog of molecular absorptions. In this study, we present a comprehensive analysis of this data set utilizing a forward-modeling approach applying our Bayesian framework, ForMoSA. We explore five distinct atmospheric models to assess their performance in estimating key atmospheric parameters: T eff, log(g), [M/H], C/O, γ, f sed, and R. Our findings reveal that each parameter’s estimate is significantly influenced by factors such as the wavelength range considered and the model chosen for the fit. This is attributed to systematic errors in the models and their challenges in accurately replicating the complex atmospheric structure of VHS 1256 b, notably the complexity of its clouds and dust distribution. To propagate the impact of these systematic uncertainties on our atmospheric property estimates, we introduce innovative fitting methodologies based on independent fits performed on different spectral windows. We finally derived a T eff consistent with the spectral type of the target, considering its young age, which is confirmed by our estimate of log(g). Despite the exceptional data quality, attaining robust estimates for chemical abundances [M/H] and C/O, often employed as indicators of formation history, remains challenging. Nevertheless, the pioneering case of JWST’s data for VHS 1256 b has paved the way for future acquisitions of substellar spectra that will be systematically analyzed to directly compare the properties of these objects and correct the systematics in the models.


Introduction
The diversity of exoplanets that have been discovered to date has reshaped our understanding of planetary systems and has raised new questions regarding their formation and evolution.
Although different formation scenarios, such as core accretion (Pollack et al. 1996), gravitational instability (Boss 2000), and gravoturbulent fragmentation of molecular clouds (Padoan et al. 2005), have been proposed, constraining the evolutionary history of observed planets remains challenging.Indeed, the initial physical and chemical conditions of the circumstellar disk in which they formed are no longer directly observable, and the dynamical evolution of the system (migration, ejection, or planetary capture) is unknown.However, formation models have identified key parameters in the atmosphere of young planets that can potentially serve as formation tracers, such as 70 51 Pegasi b Fellow.
Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence.Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
the metallicity [M/H] (Ormel et al. 2021), the carbon-tooxygen ratio C/O (Öberg et al. 2011), and, more recently, the isotopic ratio of 12 CO/ 13 CO (Zhang et al. 2021).By detecting specific molecules in the atmospheres of transiting planets, it has become possible to estimate these observables (Hoeijmakers et al. 2019;Spake et al. 2021).However, this technique is limited to highly irradiated planets close to their host stars.Direct imaging allows for the characterization of the atmospheres of planets with larger separations, but due to the high contrast with their host star, only sufficiently bright companions can be accessed.Presently, there are fewer than 30 lowmass companions for which spectroscopic data have been obtained.Only a few of these observations have reached sufficient spectral resolution (R λ > 1000) to attempt estimating chemical abundances in their atmospheres, including β Pic b (Snellen et al. 2014), HD 106906 b (Daemgen et al. 2017), 1RXS J1609 b (Lafrenière et al. 2010), Delorme 1(AB) b (Eriksson et al. 2020;Betti et al. 2022;Ringqvist et al. 2023), HR 8799 b, c (Konopacky et al. 2013;Ruffio et al. 2019;Wang et al. 2021Wang et al. , 2023)) (Hoch et al. 2022;Petrus et al. 2023), which is the subject of this work.
The system VHS J125601.92-125723.9 (hereafter VHS 1256) has been identified as a hierarchical system with the detection of the companion VHS 1256 b, orbiting at ∼8 06 ± 0 03 (projected physical separation of 179 ± 9 au) from the tight (0.1 as) M7.5+M7.5 binary VHS 1256 AB (Gauza et al. 2015;Stone et al. 2016).Its distance and age have been estimated to 21.14 ± 0.22 pc (Brown et al. 2021) and 140 ± 20 Myr (Dupuy et al. 2023), respectively.Bowler et al. (2020) and Zhou et al. (2020) monitored the companion with the Hubble Space Telescope and Spitzer, respectively, and revealed significant wavelength-dependent variability, corresponding to an estimated rotation period ranging between 21 and 24 hr.Zhou et al. (2022) confirmed this hourly period variability and further identified a higher temporal baseline variability with multiepoch tracking over a 2 yr period.Both publications emphasized VHS 1256 b as one of the most variable substellar objects discovered thus far (over 20% between 1.1 and 1.7 μm) and interpreted this variability as the signature of a complex and heterogeneous atmospheric structure (inhomogeneous cloud cover, heterogeneous temperature, etc.).These conclusions align with the results provided by Gauza et al. (2015), who identified VHS 1256 b as an L7type object at the boundary of the L-T transition (between 1000 and 1400 K).Through this temperature range, planetary atmospheres are expected to undergo significant modifications to their cloud structure in terms of vertical distribution (condensation and/or sedimentation of FeH, TiO, and VO; Cushing et al. 2008) and/or horizontal distribution (appearance of holes in the cloud cover; Burgasser et al. 2002;Marley et al. 2010).Furthermore, by employing the methodology detailed in Allers & Liu (2013), Gauza et al. (2015) noted indicators of low surface gravity within the spectrum of this object, consistent with its young age.Notably, they observed a characteristic triangular H-band shape.This low gravity implied a reddening effect (see Figure 1 in Miles et al. 2023) that provides additional evidence supporting the presence of cloudy layers.
Several studies have analyzed medium-and high-resolution spectra of VHS 1256 b to better understand this complexity.Using Keck Near-Infrared Spectrograph (NIRSpec) data in the H band (R λ ∼ 25,000), Bryan et al. (2018) estimated a projected rotational velocity v sin(i) of 13.5 ± 4.1 km s −1 , which was consistent with the rotational period estimated by Zhou et al. (2020).This v sin(i) measurement also enabled Zhou et al. (2020) to estimate that VHS 1256 b is very likely to be observed equatorially.Keck/NIRSpec L-band spectroscopy (Δλ = 2.9-4.4 μm; R λ ∼ 1300) enabled the detection of methane, which revealed nonequilibrium chemistry between CO and CH 4 (Miles et al. 2018).Hoch et al. (2022) compared medium-resolution Keck/OSIRIS spectroscopy in the K band (R λ ∼ 4000) with a custom grid of precomputed synthetic spectra to derive a T eff range of 1200-1300 K, a log(g) range of 3.25-3.75dex, and a C/O of 0.590 ± 0.354.These atmospheric parameter estimates were confirmed by Petrus et al. (2023), who fit a medium-resolution spectrum from the X-Shooter (R λ ∼ 8000) spanning 1.10-2.48μm simultaneously with the ATMO grid (Tremblin et al. 2015) to estimate T eff = 1380 ± 54 K, log(g) = 3.97 ± 0.48 dex, [M/H] = 0.21 ± 0.29, and C/O > 0.63.They concluded that while the estimates of [M/H] and C/O suggested supersolar values, indicating significant enrichment in solids during its formation, the hypothesis of a formation by fragmentation of the initial molecular clouds could not be dismissed due to the characteristics of the system's architecture (hierarchical system) and the lack of sufficient precision in estimates of these formation tracers.
More recently, VHS 1256 b has been observed with JWST, which combined medium-resolution spectroscopy from NIR-Spec and the Mid-Infrared Instrument (MIRI) (Miles et al. 2023; see Section 2.1).The spectra were integrated to obtain a robust estimate of the bolometric luminosity log(L bol /L e ) = −4.550± 0.009.Dupuy et al. (2023) combined this luminosity with their estimate of the age to derive the mass of VHS 1256 b using the hybrid cloudy evolutionary model of Saumon & Marley (2008).They identified a bimodal solution with a lower-mass regime at 12 ± 0.1 M Jup and a higher-mass regime at 16 ± 1 M Jup .Given this bimodality, we can confidently place the mass between 12 and 20 M Jup , but, as we do not know whether this object is currently burning deuterium, we do not know which of the bimodal peaks it truly falls in.Dupuy et al. (2023) also estimated a radius R of 1.30 R Jup and 1.22 R Jup , a T eff of 1153 ± 5 K and 1194 ± 9 K, and a log(g) of 4.268 ± 0.006 dex and 4.45 ± 0.03 dex according to the lower-mass and higher-mass solution, respectively.Miles et al. (2023) also compared this spectrum with a grid of synthetic spectra from Mukherjee et al. (2023) to estimate a T eff of 1100 K, a log(g) of 4.5 dex, and a radius of 1.27 R Jup and to highlight a complex atmosphere with a different cloud deck and varying regimes of vertical mixing.Additionally, the MIRI data enabled the detection of a strong silicate absorption at around 10 μm, supporting the presence of silicate clouds in its atmosphere.
These previous studies indicate the usefulness of employing grids of precomputed synthetic spectra for characterizing the atmosphere of VHS 1256 b.These grids are generated from atmospheric models that incorporate our current understanding of the physics and chemistry involved.However, it has been observed that these models sometimes encounter challenges in accurately reproducing certain spectral features and extended wavelength ranges.These systematic errors can be attributed to the use of strong assumptions when constraining the physical and chemical processes occurring in planetary atmospheres, such as cloud formation and evolution, chemical disequilibrium, vertical mixing, and dust sedimentation (see Section 2.2).The imposition of these assumptions leads to a limited number of free parameters (<6), and the quality of the fit heavily relies on the quality of the models.The impacts of these systematic errors on the estimation of atmospheric properties have already been recognized for L and M spectral types (Cushing et al. 2008;Stephens et al. 2009;Bonnefoy et al. 2014;Manjavacas et al. 2014;Lachapelle et al. 2015;Bayo et al. 2017;Petrus et al. 2020Petrus et al. , 2023;;Suárez et al. 2021;Lueber et al. 2023;Palma-Bifani et al. 2023).These studies have interpreted these discrepancies as indications of a deficiency in the modeling of dust and haze within the synthetic atmosphere models.
With the wavelength coverage and data quality of the spectra now provided by JWST, it is imperative to establish a comprehensive understanding of the limitations of selfconsistent models of atmospheres and to develop analysis methods that take appropriate account of their systematic errors.This is crucial to ensure the production of reliable and robust estimates of the atmospheric properties of planetarymass objects and to avoid overinterpretation of these properties, especially when dealing with spectra covering a narrower spectral interval.In this paper, we exploit the unprecedented data provided by JWST to propose an original method that aims to identify and propagate the systematic errors of the models directly into the error bars of the atmospheric parameters estimated from the forward-modeling approach.Five different grids of precomputed synthetic spectra are used, generated from five different self-consistent models, to compare the results they yield and the interpretation that can be made to constrain the formation pathway of VHS 1256 b.Section 2 describes the data set and models used in this study.Section 3 exploits the broad wavelength coverage provided by JWST to study the dispersion of parameter estimates along the spectral energy distribution (SED).Section 4 focuses on medium-resolution spectral features to estimate chemical abundances.Finally, Section 5 is devoted to a discussion of the results, and a conclusion is given in Section 6.

Data
The data set analyzed in this study was first presented in Miles et al. (2023) as the first spectroscopic data set of a substellar object obtained through direct imaging with JWST (ERS #1386; Hinkley et al. 2022).The observations were conducted using two instruments: NIRSpec (Böker et al. 2022;Jakobsen et al. 2022) in integral field unit mode and MIRI (Wright et al. 2023) in medium-resolution spectrometer mode (Argyriou et al. 2023).This data set offers the widest wavelength coverage to date for this kind of object, spanning from ∼1 to 18 μm.It is important to note that the signal-tonoise ratio (S/N) dropped significantly in channels 4A, 4B, and 4C of MIRI, which covers wavelengths above 18 μm.As a result, this channel was not included in this analysis.The spectral resolution of the data ranges from ∼1500 to 3500 (see Figure 1 for a visualization of the resolution).The data set is composed of 12 individual channels, with three acquired by NIRSpec and nine obtained by MIRI.These channels were sequentially observed over a time span from UT 09:43:42 to 13:56:05 on 2022 July 5, corresponding to ∼20% of the rotation period of the object.The data were reduced using adapted version 1.7.2 of the standard JWST pipeline for NIRSpec (Bushouse et al. 2022a) and adapted version 1.8.1 for MIRI (Bushouse et al. 2022b).The different steps of the reduction are detailed in Miles et al. (2023).

Models
Since the initial application of atmospheric models to imaged planetary-mass objects in the late 1990s (Allard et al. 1996;Marley et al. 1996;Tsuji et al. 1996), various families of 1D models have been developed to replicate the spectral characteristics of progressively colder atmospheres.To address current challenges in atmospheric modeling (cloud formation and evolution, nonequilibrium chemistry, etc.), these models employ different strategies, varying in the complexity and inclusion of physical processes, as well as the range of parameters they explore.Additionally, these models can cover different wavelength ranges and operate at various spectral resolutions.In this study, we considered five distinct grids of synthetic spectra, generated by four cloudy and one cloud-free model.The rest of this section is dedicated to their introduction.The parameter spaces they investigate are summarized in Table 1.Their spectral resolutions have been estimated from the sampling of the synthetic spectra they provided, assuming a Nyquist sampling rate.They are illustrated in Figure 1.
1. ATMO (Tremblin et al. 2015).This is a cloudless model that focuses on two nonequilibrium chemical reactions: CO/CH 4 and N 2 /NH 3 .These instabilities can induce diabatic convection (fingering convection) due to the difference in mean molecular weights between CO and CH 4 at the L-T transition and N 2 and NH 3 at the T/Y transition.That affects the temperature-pressure (T-P) profile in the atmosphere, reducing the temperature gradient and providing an alternative explanation for observed reddening that does not invoke clouds.The  Marley (2001).Opacities are included for 15 molecules and atoms, as well as the collision-induced opacity of hydrogen and helium, and the solar abundances from Lodders (2010) are considered.Vertical mixing in the cloud model is calculated using the mixing-length theory.Models assume chemical equilibrium throughout the atmosphere.The parameter space considered includes temperatures from 900 to 2400 K, log(g) from 3.5 to 5.5 dex, [M/H] from −0.5 to +0.5, and the cloud sedimentation efficiency parameter f sed from 1 (fully cloudy) to 8 (cloudless).4. BT-Settl (Allard et al. 2012).This model estimates the abundance and size distributions of dust grains by comparing timescales of condensation, coalescence, mixing, and gravitational settling for 55 types of solids in different atmospheric layers.The radiative transfer is then calculated using the PHOENIX code.Nonequilibrium chemistry is permitted for several molecules, including CO, CH 4 , CO 2 , N 2 , and NH 3 .Vertical mixing is accounted for using the mixing-length theory under hydrostatic and chemical equilibrium.We used the CIFIST version of BT-Settl, which incorporates solar chemical abundances defined by Caffau et al. (2011).The explored parameter space includes T eff ranging from 1200 to 7000 K (with the high end limited to 3000 K for computational efficiency) and log(g) ranging from 2.5 to 5.5 dex. 5. DRIFT-PHOENIX (Helling et al. 2008;Witte et al. 2009Witte et al. , 2011)).This model combines the stationary nonequilibrium cloud model called DRIFT (Woitke & Helling 2003, 2004;Helling & Woitke 2006) that incorporates growth, evaporation, and gravitational settling of seven solids (TiO 2 , Al 2 O 3 , Fe, SiO 2 , MgO, MgSiO 3 , and Mg 2 SiO 4 ) with the code PHOENIX (Hauschildt et al. 1997;Allard et al. 2001) that calculates radiative transfer and hydrostatic and chemical equilibrium and employs mixing-length theory in 256 atmospheric layers.To replenish the upper layers depleted of grains due to condensation, vertical mixing by convection is included.The grain coagulation is not taken into account.The parameter space covered by this model ranges from T eff of 1000-3000 K, log(g) from 3.0 to 6.0 dex, and metallicity [M/H] from −3.0 to 6.0.For [M/H] = 0.0, the solar element abundances from Grevesse et al. (1992) are used, implying a fixed C/O = 0.55 in the grid.

Parameter Space Exploration with ForMoSA
To compare these models with the data, we used the forward-modeling code ForMoSA.71Initially introduced in Petrus et al. (2020), this code has been updated for the analysis of the X-Shooter data related to VHS 1256 b (Petrus et al. 2023).This updated version is the one considered in this work.This publicly available tool utilizes the Bayesian inversion technique known as "nested sampling" (Skilling 2004) to efficiently explore the complex parameter space provided by grids of precomputed synthetic spectra.By doing so, it proposes an estimation of the atmospheric properties of brown dwarfs and directly imaged planets but also other parameters such as the radius, the interstellar extinction, the radial velocity, and the projected rotational velocity.Regarding the spectral resolution of the data analyzed in this study, only the radius will be constrained here, in addition to the parameters explored by the grids.In order to optimize the analysis process, the model grids presented in this study have been converted to the xarray72 format.This format allows for efficient manipulation, including interpolation and labeling of the various dimensions allowed by the grids.The spectral resolutions of both the data and synthetic spectra are compared at each wavelength to align them with optimized values.When the model has a lower spectral resolution compared to the data (as observed in certain ATMO and Exo-REM wavelengths; see Figure 1), the data's resolution is adjusted accordingly.This adjustment involves convolving both the observed and synthetic spectra with a Gaussian distribution, using the calculated FWHM to achieve the desired resolution.Subsequently, the Python module spectres73 is employed to resample the spectra onto a wavelength grid, ensuring a defined Nyquist sampling and achieving the desired spectral resolution with wavelengths.This final step optimizes computation time by limiting the number of data points considered during Bayesian inversion, all while preserving the information.The nested sampling inversion process is then carried out using the Python module nestle. 74For each fit, we considered priors uniformly distributed through the parameter space defined by the grids, a uniform prior between 0 and 10 R Jup for the radius, and a Gaussian prior for the distance with μ = 21.15pc and σ = 0.22 pc, corresponding to the Gaia DR3 measurement and error, respectively.

Fits as a Function of the Wavelength Range
The forward-modeling approach used in this study offers several advantages, including the ability to fit high-resolution data across a broad range of wavelengths within a relatively short computing time (less than 1 day).However, this approach has its challenges, as the quality of the fit is directly influenced by the quality of the model, which can be difficult to fully quantify.Petrus et al. (2020) demonstrated that systematic errors present in the model BT-Settl greatly influenced the posteriors of the parameter estimates explored during the inversion of nine spectra of young M7-M9 dwarfs.The fits converged to different sets of parameters depending on the spectral band considered.These differences considerably dominated the errors on these parameters derived from the Bayesian algorithm.This work was further extended to the analysis of isolated brown dwarfs at the L-T transition observed with Spitzer (Suárez et al. 2021), L dwarfs using X-Shooter data of VHS 1256 b with the ATMO grid (Petrus et al. 2023), and Very Large Telescope/SINFONI data of AB Pic b with the BT-Settl and Exo-REM grids (Palma-Bifani et al. 2023).These subsequent studies confirmed that the systematic errors observed in the previous work were indeed a recurring issue in the forward-modeling approach.In this section, we present an extension of the method proposed in these three previous studies.We address the systematics in the models to provide a robust estimation of the atmospheric properties of VHS 1256 b, taking advantage of the extensive spectral coverage offered by JWST.

Description of the Method
To assess the influence of systematic effects on the estimation of parameters, we conducted a comparative analysis between the results obtained from fitting the full wavelength coverage permitted by JWST and the results obtained from fitting different spectral windows defined along the SED.We applied each model introduced in Section 2.2 for this purpose.
In the initial step, we combined the NIRSpec and MIRI spectra.For regions where the wavelength coverage overlapped between the channels, we selected the data with the highest spectral resolution.The fitting procedure was then carried out for each model, considering a wavelength range from 0.97 to 18.02 μm.The corresponding fits are represented by unicolor solid lines in Figure 2, and the corresponding sets of parameters are summarized in Tables 4 and 5.
Next, we divided the spectrum into 15 distinct spectral windows and conducted independent fits for each window.To account for potential calibration discrepancies between channels, we utilized the wavelength coverage of each channel to define the specific range for each window.For the NIRSpec data, we further divided them into subwindows to examine the model performance across different infrared bands, while in the case of MIRI, each channel was treated as an individual spectral window.The complete list of spectral windows can be found in Tables 2 and 3, and the corresponding best fits and parameter estimates are illustrated in Figure 2 using bicolor lines and in Tables 4 and 5, respectively.
One of the advantages of this approach is the ability to assess the contribution of each spectral window to the likelihood calculation, denoted as C win .This allows us to identify the spectral windows that have the most significant impact on the inversion results when considering the full SED.The calculation of C win is performed using the following formula: where S/N λ is the S/N associated with each wavelength, assuming completely uncorrelated errors.The sum of S/N 2 is considered to align with the definition of the likelihood function utilized during the fits, which is directly proportional to a χ 2 value.Consequently, C win is maximized for spectral windows that have a greater number of data points and higher S/N.This information is illustrated in the top right panel of Figure 2. In the case of JWST's data, the fit is primarily driven by the NIRSpec G395HF/290LP channel, which contributes approximately 60% to the overall fit.On the other hand, all MIRI data collectively contribute around 0.26% to the overall fit.
In this approach, we also gain the ability to estimate the sensitivity of the models to each parameter depending on the spectral window used.For a given spectral window Δλ, model m, and parameter p, we calculated this sensitivity S Δλ,m,p using the following relation: ) (and consequently S Δλ,m,p ) will tend toward 0. Conversely, a substantial S Δλ,m,p value will result if there is significant spectral variation within the window corresponding to variations in the parameter p.The sensitivity is visualized in Figures 3-5 through the size of the data points and will be further discussed in the subsequent sections.

Results: Parameters as a Function of Wavelengths
All five models exhibit deviations from the observed data when the entire wavelength range is considered for the fit, particularly at longer wavelengths (λ > 7 μm), which carry less weight in the likelihood calculation.To assess the goodness of fit, we calculated the reduced χ 2 values for the fits using the full SED, denoted as c r,full 2 , as well as for the fits using the spectral windows, denoted as c r,win 2 (see Figure 2).For the latter case, we constructed a synthetic SED by combining the best fit from each spectral window, following the merging procedure described earlier, enabling a direct comparison with the fit of the full SED.Among the models tested, the Exo-REM model demonstrates the best performance, yielding c r,full 2 = 246 and c r,win 2 = 57.The utilization of smaller wavelength ranges in the independent fits (spectral windows) results in a decrease in the overall c r,win 2 from a factor of 1.5 for the DRIFT-PHOENIX model to 6 for the BT-Settl model.This improvement is not surprising, as this procedure is similar to a fit in which the number of free parameters is multiplied by the number of spectral windows.Consequently, there is a variation in the estimated parameters depending on the specific spectral window considered.
As illustrated in Figures 3-5, the high S/N provided by JWST allows for precise estimation of the atmospheric properties of VHS 1256 b in each fit, resulting in very small error bars.However, the significant dispersion observed in these estimates across the different spectral windows indicates that these errors cannot be representative of this dispersion.Furthermore, as depicted in these figures, the sensitivity of the models to each parameter varies depending on the spectral window considered.We propose here a procedure that provides a robust estimate of each parameter Θ win,f , where Θ represents the parameter of interest.This procedure considers both the dispersion and the models' sensitivity of each parameter to redefine a more robust error bar.
For each parameter estimated with each model, we defined a sample composed of subsamples of points randomly drawn from the posteriors obtained with each spectral window.We excluded those affected by the silicate absorption between 7.5 and 10.5 μm, which are known to not be reproduced by the current generation of models.The number of points in each subsample is determined proportionally based on the calculated sensitivity S Δλ,m,p using Equation (2).The final value and error of each parameter are defined as the mean and standard deviation, respectively, of this sample and are presented in Figures 3-5 as the solid gray line and gray area, respectively.They are also reported in Tables 4 and 5.

The Effective Temperature T eff
Considering the young age of VHS 1256 b, the estimate of T eff,full considering the full wavelength range is consistent with its spectral type and with the prediction of the evolutionary models (Dupuy et al. 2023) for the grids Sonora, Exo-REM, and ATMO (Figure 3).The BT-Settl and DRIFT-PHOENIX grids appear to converge toward higher T eff values than those predicted by evolutionary models.However, it is noteworthy that for these two models, ForMoSA converges to T eff values close to their lower limit.The ATMO and Exo-REM grids demonstrate greater stability across the different spectral windows compared to BT-Settl, Sonora, and DRIFT-PHOENIX.This leads to smaller error bars on the final values T eff,win,f , which remains consistent with T eff,full for each model.The large error bar at λ > 13 μm is attributed to the low S/N at these wavelengths.The sensitivity of T eff remains relatively constant across all spectral windows.

The Surface Gravity log(g)
The results using the full wavelength range for the models Exo-REM, ATMO, Sonora, and BT-Settl converge toward low values of log(g) (<4.5 dex), whereas DRIFT-PHOENIX tends to converge toward high values (>4.5 dex; Figure 3).Significant dispersion was observed across the various spectral windows, covering the entire range of log(g) values explored,  Note.The nine windows correspond to the three channels used to acquire the data combined with the three grating settings.
particularly for the BT-Settl, Sonora, and DRIFT-PHOENIX grids.Smaller dispersion was also evident in the other grids, Exo-REM and ATMO.However, trends could be identified, with a preference for lower surface gravities along the SED.The grid Sonora provides low log(g) for wavelengths shorter than 4 μm but higher log(g) for longer wavelengths.The sensitivity analysis confirmed that the log(g) parameter is primarily constrained by the NIR data (λ < 5 μm).Due to the wide dispersion observed across the different spectral windows, it is challenging to provide a robust value for log(g) except for Exo-REM and ATMO, which converge to consistent values.Although these log(g) win,f point to low values (<4.12 dex), typical of young objects, they are significantly lower than the predictions of evolutionary models provided by Dupuy et al. (2023).

The Metallicity [M/H]
The exploration of [M/H] is performed using the grids Exo-REM, Sonora, DRIFT-PHOENIX, and ATMO.When the full wavelength range is considered, the ATMO model indicates a solar [M/H], while the Sonora model suggests a supersolar metallicity.On the other hand, both the Exo-REM and DRIFT-PHOENIX models converge toward a subsolar metallicity (Figure 4).These results vary when considering the spectral windows.In this case, the Exo-REM model estimates a solar [M/H], while the Sonora model suggests a supersolar [M/H].Conversely, both the DRIFT-PHOENIX and ATMO models converge toward a subsolar [M/H].The three identified regimes of metallicity are further confirmed by the estimates of [M/H] win,f .Similar to the log(g) parameter, the [M/H] estimation is primarily constrained by the NIR data (λ < 5 μm).

The Carbon-Oxygen Ratio C/O
Among the considered grids, only Exo-REM and ATMO enable the exploration of the C/O ratio (Figure 4).The Exo-REM model appears to exhibit a preference for low C/O ratios, although the wide dispersion of estimates makes it difficult to establish a definitive trend.The situation is even more complex with the ATMO model, as the estimated C/O ratios span the entire range of values explored by the grid.Consequently, this parameter cannot be reliably estimated using this method.Nonetheless, it appears that the NIR data exhibit the highest sensitivity to this parameter 3.2.5.The Adiabatic Index γ and the Sedimentation Factor f sed The parameters γ and f sed are specific to the ATMO and Sonora models, respectively (see Figure 5).When considering the NIRSpec data, their estimates exhibit distinct trends, with a medium/high value for γ and a low value for f sed .However, these parameters display similar variations when analyzed with the MIRI data.Specifically, there is an initial increase in both parameters up to λ ∼ 9 μm, followed by a sharp decline in the MIR range.Similarly to C/O, the new errors for γ win,f and f sed,win,f both fail to represent the dispersion adequately.The f sed values are particularly intriguing, indicating a minimal impact of clouds at λ < 3 μm ( f sed ∼ 2), a significant impact at λ = [3-10] μm ( f sed > 5), and no discernible impact at λ > 10 μm ( f sed ∼ 1).This may reveal the complexity of the cloud clover in the atmosphere of VHS 1256 b, which stands at the L-T transition.Additionally, the estimation of γ (∼1.03) and f sed (∼1) using the full SED Note.Estimations of the parameters explored by the grids are provided for both the full wavelength range and the combined spectral windows.For the combined spectral windows, error bars have been calculated by considering the dispersion of the individual fits from each spectral window.The error bars of the fits on the full SED reflect the high S/N of the data.Note.Estimations of the parameters not explored by the grids are provided for both the full wavelength range and the combined spectral windows.The radius R is determined from the dilution factor C K , the luminosity L is calculated using the Stefan-Boltzmann law and the estimates of the radius and the T eff , and the mass M is calculated using the gravitational law and the estimates of the radius and the log(g).
aligns coherently with the results observed for late L objects (Stephens et al. 2009;Tremblin et al. 2017).

The Radius R
The radius is not a parameter explored by the model grids.Instead, it is estimated using the dilution factor C K = (R/d) 2 , which takes into account the flux dilution between the synthetic spectra generated at the outer boundary of the atmosphere and the observed spectrum.Here, R represents the planet's radius and d represents the distance to the planet.To propagate the error from the distance to the radius, we define a Gaussian prior for the distance centered at 21.14 pc with a standard deviation of 0.22 pc, according to the estimate from Brown et al. (2021).We impose a flat prior to the radius to leave it unconstrained.The results are illustrated in Figure 6.It is observed that the radius estimated by each model is consistently underestimated, except for the Exo-REM grid, which provides a radius in good agreement with the predictions of evolutionary models.When the full SED is used for the fit, the radius estimates from the Sonora and ATMO grids also align with the results from Dupuy et al. (2023) and Miles et al. (2023).However, a significant dispersion remains, resulting in substantial error bars for the final values (see Table 5).

Focusing on Spectral Features
The medium spectral resolution provided by JWST has enabled the redetection of various atomic and molecular absorptions in the spectrum of VHS 1256 b.Notably, the two potassium (K I) doublets at 1.173 and 1.248 μm are present, previously identified as robust indicators of surface gravity and metallicity through empirical and synthetic analysis of spectra from planetary-mass objects (Allers & Liu 2013;Petrus et al. 2023).Additionally, we identify carbon monoxide (CO) overtones at approximately 2.3 μm, which can be used to measure the C/O ratio (Konopacky et al. 2013;Nowak et al. 2020;Petrus et al. 2021;Hoch et al. 2022;Petrus et al. 2023).The methane (CH 4 ) absorption between 2.8 and 3.8 μm is also detected, confirming the lower-resolution detection of these species in the atmosphere of VHS 1256 b by Miles et al. (2018), who suggested nonequilibrium chemistry between CO and CH 4 and a significant vertical mixing (K zz > 10 8 cm 2 s −1 ) in the atmosphere to explain the presence of this feature.Finally, this JWST spectrum enabled the detection of a highly resolved CO absorption forest centered at 4.6 μm, already detected in the low-resolution spectra of isolated brown dwarfs by Sorahana & Yamamura (2012).In this section, we conduct a comprehensive analysis of these prominent spectral features to determine the extent to which they can be utilized to constrain atmospheric properties through the forward-modeling approach.We assess the performance of each model described in Section 2.2.

Description of the Method
In one of our previous works, Petrus et al. (2021) demonstrated the challenges associated with fitting spectral features detected at medium resolution using precomputed grids of synthetic spectra.They highlighted that even small inhomogeneities in the grids could remain and propagate through interpolation and introduce biases in the resulting posteriors.Reproducing the depth of the absorption features and accounting for systematics in the continuum shape were also identified as potential sources of bias.To mitigate these issues, we have developed a two-step procedure that analyzes each spectral feature independently.
We initially conducted a fit on the wavelength ranges, Δλ continuum , that avoided the considered spectral features (red spectral windows in Figure 7).From this fit, we extracted the T eff , which was then used to define a local pseudocontinuum at the positions of the spectral features.This approach serves as an alternative to the definition of the pseudocontinuum as a low-resolution version of the spectrum (e.g., Petrus et al. 2023), which may be suitable for higher spectral resolution data but could be unreliable for the resolution of JWST (R λ < 3000).Additionally, we extracted the dilution factor C K to ensure that the depth of the absorption is solely related to the atmospheric properties explored by the grid.In the second fit, we only considered the wavelength ranges, Δλ fit , which focused on the spectral features (green spectral windows in Figure 7).We fixed the values of T eff and C K to the estimates obtained from the first fit.All spectral windows considered are summarized in Table 6.

Results
For each spectral feature, we illustrate in Figure 7 the synthetic spectrum interpolated from the model grids that maximizes the likelihood when compared to the observed data.The reduced c r 2 , calculated using Δλ fit , is provided to quantify and compare the quality of each fit.Its value greater than 1 suggests that the models may have limitations in replicating these spectral signatures and/or that there may have been an underestimation of the error bars during data extraction.The corresponding set of parameters is represented in Figure 8. Similar to the method described in Section 3, we have observed variations in the estimates of atmospheric parameters depending on the spectral feature considered for the fit, as well as the model used.Globally, the models are capable of reproducing this information at medium resolution.However, certain spectral features appear to be more challenging to reproduce, such as the potassium line at 1.243 μm.Now, let us delve into the detailed performance of each individual model.

ATMO
The T eff derived from the various spectral windows excluding the spectral features is in agreement with the estimate obtained in Section 3, ranging between 1242 and 1520 K.This consistency is expected, as we employed a similar methodology by utilizing a fit based on Δλ continuum , which encompasses a broad range of wavelengths.The low values of log(g) (less than 4.37 dex) also align well with the age of the system.However, it is worth noting that a remarkably low and unphysical estimate (2.56 dex) is obtained when considering methane absorption.Subsolar metallicities are estimated for the companion, except for the potassium absorption lines, which yielded a supersolar value.The reported C/O ratios range from solar to supersolar, except for the CH 4 feature that provides a subsolar value.Furthermore, the adiabatic index (γ) shows variability across the entire grid size in the different fits, making it challenging to identify a clear trend.Among the models used, ATMO demonstrates the ability to reproduce this information at medium resolution with the lowest c r 2 .However, it seems to inaccurately reproduce the CO overtone at 2.30 μm.

Exo-REM
Similarly to ATMO, Exo-REM allowed us to estimate consistent T eff values, ranging from 1233 to 1460 K, and low log(g) values, lower than 4.17 dex.We also encountered discrepancies with the very low T eff values provided by methane absorption and the very low log(g) values provided by both methane absorption and CO overtones.The estimated metallicity is mostly solar, except for the fit considering the CO absorption forest, which yielded a supersolar value at the grid's edge.However, caution should be exercised when interpreting this solar value, as it appears that the fits converged to a single node of the grid, possibly due to a bias in the interpolation step caused by small inhomogeneities in the grid.Two modes of C/ O ratios are identified: one supersolar, when CO is considered in the fits, and one subsolar, when methane and potassium are used.Exo-REM demonstrated a successful reproduction of each spectral feature with a relatively low c r 2 compared to the other models, except for the potassium absorption lines, where the depth is not fully reached.

Sonora
The T eff estimates obtained from Sonora range from 1170 to 1401 K, with a deviant value of 1837 K when considering the CH 4 feature.These results are consistent with the two previous models and with Section 3. In contrast, we observed high surface gravities, with values exceeding 4.5 dex, which are inconsistent with the age of the system.Additionally, a supersolar metallicity is found, except when fitting using CH 4 .Similar to the adiabatic index γ explored by ATMO, the sedimentation factor f sed also varies across the size of the grid.Note.The Δλ continuum is used to estimate the T eff and the radius.These derived values serve as constraints within the Δλ fit for fitting these features.
Sonora is capable of reproducing the different spectral features and generally exhibits the second-lowest c r 2 compared to the other grids.

BT-Settl
This model estimates T eff to be between 1426 and 1718 K, with a low surface gravity of <4.5 dex, which is coherent with the spectral type and age of the object, respectively.Despite its ability to reproduce each spectral feature, the calculated c r 2 is generally higher than that of the three previous models we discussed, indicating lower performance.

DRIFT-PHOENIX
As shown in Figure 7 and by its high c r 2 values, this particular version of DRIFT-PHOENIX was unsuccessful in accurately reproducing the spectral information at medium resolution.Consequently, making a robust estimate of atmospheric parameters becomes challenging.

Discussion
In Sections 3 and 4, we have presented two different approaches to leverage the spectral information observable from imaged exoplanets with JWST, aiming to estimate their atmospheric properties.The results of this study indicate a dispersion of these estimated parameters based on the wavelength coverage used for the fit, the considered spectral features, and the adopted model.We have proposed a method to incorporate this dispersion into a new error bar, attempting to provide a robust estimate for each parameter.This section is dedicated to discussing these results, investigating the origin of the dispersion, and analyzing the detected silicate absorption at 10 μm.

The Bolometric Luminosity and the Mass
This JWST spectrum covers 98% of the bolometric flux of 1256 b.By filling the remaining 2% with a model and integrating this full SED, Miles et al. (2023) calculated a log(L bol /L e ) of −4.550 ± 0.009, which was consistent with previous estimates provided by Hoch et al. (2022) and Petrus et al. (2023).In our study, we derived the bolometric luminosity directly from our estimates of T eff and the radius R. For each spectral window defined in Section 3.1, we calculated L bol using the Stefan-Boltzmann law: 4 , where σ is the Stefan-Boltzmann constant.These results are visualized in Figure 9, and as expected, we observe dispersion across the different spectral windows, influenced by the dispersion observed for the T eff and the radius estimates.The fits using the full SED yielded bolometric luminosity in good agreement with the estimates of Miles et al. (2023) for each model but with a very small relative error (less than 0.03%).Using the same method described in Section 3.2, we calculated robust values of L bol for each model and reported them in Table 5. Exo-REM and ATMO are the two models that provide the most stable L bol over wavelength coverage.
We also calculated the mass from our estimates of the log(g) and the radius, applying the gravitational law: where G is the gravitational constant.As with the bolometric luminosity, we also observe dispersion in the mass estimates (Figure 9).We take this dispersion into account to provide the final values reported in Table 5.Interestingly, for the grids BT-Settl, Exo-REM, and ATMO, the mass values obtained from each spectral window appear to be too low and unphysical (<5 M Jup ) when compared to the previous estimate made by Miles et al. (2023).They considered the age of the system, the measured L bol , and the hybrid cloudy evolutionary model of Saumon & Marley (2008) to estimate the mass of VHS 1256 b to be lower than 20 M Jup with two potential solutions: one at 12 ± 0.5 M Jup and another at 16 ± 2 M Jup .Nonetheless, this comparison must be approached with caution due to the inherent uncertainties within the evolutionary models, encompassing factors such as initial entropy variations and the uncertainty in age determinations, including the potential for age discrepancies between the star and its companion.The trend of log(g) as shown in Figure 3 seems to impact the Sonora grid, indicating a higher mass when using MIRI data and a lower mass when using NIRSpec data.Lastly, the substantial dispersion in log(g) obtained with the DRIFT-PHOENIX grid results in a wide range of mass estimates.

The Origins of the Dispersion
The results presented in Sections 3 and 4 have revealed that the estimates of atmospheric properties for VHS 1256 b, derived using the forward-modeling approach, exhibit a significant dependence on several factors.These include the wavelength coverage, the specific model used, and the type of information considered for the fit (large spectral windows or specific spectral features).The dispersion observed in Figures 3-6 can be attributed to three main sources: the object's characteristics, the data reduction, and the models' performances and properties (chemical abundances, physics considered).
First, VHS 1256 b has been identified as the most variable substellar object by Bowler et al. (2020) and Zhou et al. (2022), showing a luminosity variation of over 20% between 1.1 and 1.7 μm over an 8 hr period and ∼5.8% at 4.5 μm (Zhou et al. 2020).This variability has been attributed to an inhomogeneous cloud cover.A spectroscopic time follow-up of this object is underway to analyze its cloud and atmospheric structure, and the results will be presented in dedicated papers.Using the three-sinusoid model proposed by Zhou et al. (2022), Miles et al. (2023) conducted a simulation to estimate the expected variability during the 4 hr time acquisition of the JWST data used in this study.They estimated a maximum variability of less than 5% for 1 μm < λ < 3 μm, about 1.5% for 3 μm < λ < 5 μm, and negligible variability for λ > 5 μm.However, Figure 9 reveals an unexpected increase in the bestfit bolometric luminosity as a function of wavelength for fitted spectral channels between 1 and 7 μm, with approximate percentage changes of 330%, 390%, 400%, 40%, and 210% for the grids BT-Settl, Sonora, DRIFT-PHOENIX, Exo-REM, and ATMO, respectively, which cannot be attributed to the variability of VHS 1256 b.
An alternative explanation for this luminosity dispersion might be attributed to the extraction methodology.The data set employed in this analysis comprises distinct channels observed independently (three for NIRSpec and nine for MIRI).As the spectral coverage provided by these channels is employed to define the spectral windows used in the fitting process in Section 3, inaccuracies in flux calibration during extraction could potentially influence the inferred luminosity.However, it appears improbable, given that this discrepancy is discernible between the sub-windows of each NIRSpec channels, despite a similar extraction process, and therefore a similar flux calibration (see Table 2).Furthermore, the magnitude of the dispersion varies across models, indicating a model-dependent interpretation.
Previous studies have invoked systematic errors within models of atmospheres to explain biases in estimating atmospheric parameters for early-type and hotter objects.Petrus et al. (2020) attributed them to deficiencies in dust grain modeling, which were partially mitigated by introducing interstellar extinction (A V ) as an additional free parameter in the fitting process.This approach was further employed by Hurt et al. (2023) in the characterization of 90 late M and L dwarfs.They confirmed that the fit quality significantly improved when incorporating an extra source of extinction.Notably, no interstellar extinction was detected for these observed targets.While both Petrus et al. (2020) and Hurt et al. (2023) focused on data within the wavelength range λ < 2.5 μm, we extended this concept to a broader wavelength range.We incorporated the interstellar extinction law as defined by Draine (2003) as a free parameter in the fitting process, using our five different models.
The results, as presented in Appendix B, demonstrate that exploring the impact of an interstellar extinction leads to an enhancement in the quality of the fits for the models BT-Settl, Exo-REM, and ATMO, evident through reduced c r 2 values.This improvement is not observed in the case of DRIFT-PHOENIX and Sonora, where the calculated c r 2 values across the entire SED are higher when considering A V .This phenomenon can be explained by the significant heterogeneity inherent in the data set concerning spectral resolution and S/N, as illustrated in the top right panel of Figure 2. Notably, focusing exclusively on the channel G395HF/290LP of NIRSpec (2.87-5.27μm), which holds the greatest sensitivity in the likelihood computation, yields a reduction in With each model, substantial deviations persist when considering A V , particularly at λ > 5 μm, underscoring that the considered extinction law, derived from a population of dust grains within the interstellar medium, fails to accurately capture the dust and cloud characteristics that shape the atmosphere of VHS 1256 b.We have also observed a negative value of A V when the model Exo-REM is considered, indicating that this model is excessively red.This underscores the critical significance of advancing the development of extinction laws specifically tailored for characterizing atmospheric dust and haze.
Moreover, the luminosity depicted in Figure 9 is calculated based on the estimated values of T eff and radius considering the Stefan-Boltzmann law.Consequently, the dispersion of the luminosity within the 1-7 μm range directly correlates with variations in these two parameters.Upon comparing the dispersions of T eff and radius illustrated in Figures 3 and 6, respectively, it becomes apparent that the T eff estimates remain relatively consistent across all models, except for Sonora.Conversely, the radius estimates show a general increase across all models, except for Sonora.One plausible explanation for this behavior is the potential variance in cloud cover between the Sonora model and the remaining models, attributed to the parameter f sed , which governs the sedimentation process of the diverse clouds generated (Luna & Morley 2021).Indeed, it is expected that different cloud coverage would strongly impact the shape of the SED at low spectral resolution (N.Whiteford et al. 2024, in preparation) but also should be critical in the reproduction of the spectral information at higher resolution (see Section 4) and consequently in the robust estimate of chemical abundances.Moreover, the cloudless model ATMO, grounded in chemical disequilibrium physics, offers good performances in reproducing the data.We suggest that this cloudless approach should be integrated with cloud models, as a synergistic interplay between the two factors is plausible and can yield combined effects.Indeed, in the case of VHS 1256 b, the presence of clouds in the photosphere is directly detected from the spectra, as evidenced by the silicate absorption.This implies that a fully cloud-free model is evidently not selfconsistent in reproducing the atmosphere of this kind of object.
Lastly, it is important not to disregard the potential influence of the spectral covariance and the choice of Bayesian priors, which are known to impact the estimation of atmospheric parameters (Greco & Brandt 2016).In this work, we have opted for a conservative approach by defining flat priors constrained by the size of the grids considered.Some inversion codes, like STARFISH (Czekala et al. 2015), have integrated the impact of an error covariance matrix into likelihood calculations.This code also quantifies the influence of systematic errors in models using a Markov Chain Monte Carlo framework.In our future work, we plan to implement and test these functionalities while considering nested sampling in our code ForMoSA (M.Ravet et al. 2024, in preparation).

Exploitation of the Silicate Absorption
JWST's data provided the first detection of silicate absorption in the atmosphere of a low-mass companion (Miles et al. 2023), directly revealing the presence of clouds in the atmosphere of VHS 1256 b.This feature had already been detected in the atmosphere of isolated objects, in particular at the L-T transition (e.g., Cushing et al. 2006;Looper et al. 2008;Suárez & Metchev 2022).At these temperatures (∼1200-1700 K), clouds of forsterite (Mg 2 SiO 4 ), enstatite (MgSiO 3 ), and quartz (SiO 2 ) can condense following nonequilibrium chemistry (Lodders 2002;Helling & Woitke 2006).This has been detected empirically using atmospheric retrieval methods (Burningham et al. 2021;Vos et al. 2023) and will be explored in the retrieval analysis of the data set of VHS 1256 b used in this paper (N.Whiteford et al. 2024, in preparation).In their work, Suárez & Metchev (2022) used a library of lowresolution spectra (R λ ∼ 90), obtained with Spitzer, to define a spectral index in order to investigate the depth of the silicate absorption as a function of the spectral type.Their library was composed of ∼110 field and young isolated planetary-mass objects covering a wide range of spectral types from M5 to T9.Their method was based on multiple linear interpolations to estimate the pseudocontinuum and the absorption depth at the position of the feature (at 9 μm).The spectral index was defined as the ratio of both fluxes.Thanks to the diversity of their spectral library, they identified positive correlations between their silicate index and the NIR color excess and between their silicate index and the photometric variability, proving that silicate clouds had a great impact on the atmospheric properties of planetary-mass objects.The silicate index definition was improved in Suárez & Metchev (2023) considering the silicate absorption extends up to 13 μm in relatively young objects, like VHS 1256 b, and a better approach to estimate the continuum.This led to the main conclusion that the silicate absorption is sensitive to surface gravity.It is redder, broader, and more asymmetric in low surface gravity dwarfs.This work was followed by Suárez et al. (2023), who identified, in the same Spitzer library, a dependence between the depth of the silicate absorption and the viewing geometry of the target, confirming that equatorial latitudes are cloudier.In the case of VHS 1256 b, which is seen equator-on (90°-+ 28 0 ; Zhou et al. 2020), they find a strong silicate absorption and reddening.However, the inability to date to observe the silicate absorption between 8.0 and 12.0 μm at medium resolution (R λ > 100) has restricted the comparison of this molecular absorption's properties (depth, shape, etc.) with atmospheric model predictions.This limitation has slowed down the development of these models, as they currently cannot replicate this feature.Consequently, our comprehension of the physical and chemical mechanisms underlying its appearance, notably its chemical composition, remains hindered.In this section, we propose a new method based on the forward-modeling analysis to define a new silicate index.

Method
To avoid the region where the models fail to reproduce the silicate absorption, we defined two windows on the left side (6.0-8.0 μm) and the right side (12.0-14.0μm) of the feature.The data were fitted using ForMoSA, considering only the wavelengths covered by these two windows.The parameters estimated from this fit were then used to estimate the pseudocontinuum between 8 and 12 μm.We performed this fit for each model defined in Section 2.2, as well as for each isolated object in the Spitzer library and for the JWST data of VHS 1256 b.We limited the Spitzer library to objects with spectral types between M6 and T8.The final sample consisted of 12 young objects (age <400 Myr) and 93 field objects.The index I Si was calculated as the equivalent width of the silicate absorption using the formula where C λ and F λ are the fluxes of the estimated pseudocontinuum and the data, respectively, at the wavelength λ, with λ ä [8.0, 12.0] μm.The error bar on I Si is estimated by the calculation of I Si,min and I Si,max , considering F − F err and F + F err , respectively, where F err is the error of the data.This method is illustrated in Figure 10.

Results
In their work, Suárez & Metchev (2022) found that the silicate index was higher on average between the spectral types L4 and L6 compared to colder and warmer objects in their library.They interpreted this phenomenon as the result of Si being unable to condense into clouds for T eff > 2000 K and the sedimentation of clouds below the photosphere after the L-T transition.Figure 11   The spectral resolution of the JWST data has been reduced to match that of Spitzer for a fair comparison.The red area represents the spectral window used to estimate the pseudocontinuum, while the green area indicates the spectral window used to calculate the spectral index.In this figure, we specifically present the case of BT-Settl as an illustrative example.Suárez & Metchev (2023), enabling a direct comparison between the two approaches.Similar results were obtained for each model used.An I Si ∼ 0 is calculated for M-type objects, indicating a lack of significant silicate absorption in this spectral range.Additionally, the detection of CH 4 after the L-T transition, as well as the potential contribution of the NH 3 absorption at 10.5 μm, which is present in the spectra of T dwarfs, may lead to negative values for the estimated equivalent widths.However, it should be noted that the validity of the results from the five models is limited to the spectral types they cover, as indicated by the colored area.Furthermore, we observe that the silicate index is generally higher for young objects compared to field objects, as recently highlighted by Suárez & Metchev (2023).This is likely caused by the higher surface gravity of older objects, resulting in their contraction, which increases the sedimentation rate of the silicate clouds below the photosphere.
The silicate index calculated for VHS 1256 b, following the equivalent-width procedure, aligns with the observed trends.The L7 spectral type and young age of this object favor the presence of a silicate absorption feature, resulting in a high value of I Si for each model.As observed in Figure 10, the absorption feature of VHS 1256 b seems to be redder and broader compared to the absorption detected in the spectrum of other objects (e.g., 2MASS 2148+4003).The reason for this distinct shape needs to be investigated, but it could be attributed to complex silicate clouds with varying abundances of different silicate elements.Indeed, Cushing et al. (2006) showed that forsterite (Mg 2 SiO 4 ) was redder than enstatite (MgSiO 3 ).The in-depth analysis of the silicate feature is a relatively new research direction that is currently in progress (Burningham et al. 2021;Suárez & Metchev 2023;Suárez et al. 2023).With this analysis, we have demonstrated that the current generation of atmospheric models can be utilized to accurately estimate the pseudocontinuum at the position of this feature, taking advantage of their inability to reproduce the silicate absorption feature.This allows for robust quantification of the equivalent width and facilitates discussions on the spectral type and age of observed objects.

Current and Future Improvements in Models
This work highlighted the current limitations of forward modeling in characterizing the atmospheres of low-mass objects.One approach to overcome these limitations is to improve the quality of the existing generation of atmospheric models.As demonstrated in this study, it is crucial to compare the outcomes obtained by considering various model families.Therefore, simultaneous improvements are necessary for each model.Various enhancement points have been identified.
The physics governing the clouds (formation and evolution) requires improvement.Currently, the particle size distribution is defined by a single broad lognormal distribution, which may need adaptation because it suggests smaller particles at higher atmospheric levels.Incorporating microphysical models that present smaller particles in a "nucleation mode" and larger particles in a "growth/settling mode" could be an interesting solution (see Helling & Woitke 2006).Having a cloud model with smaller particles at the right heights might aid in reproducing the ∼10 μm silicate feature and potentially explain some of the observed interstellar reddening from optical to NIR wavelengths.
As demonstrated by ATMO, diabatic convection can replicate cloud effects in the atmosphere, substantially altering temperature structures, particularly at the L-T transition.However, the presence of clouds is now confirmed through the detection of silicate absorption.Simultaneous integration of disequilibrium and clouds marks a promising advancement, slated for implementation in Sonora and refinement in Exo-REM.Additionally, introducing physically motivated adjustments to temperature structures, including effects like mean molecular weight gradients not presently considered, could impact the T-P profile in the mid-and upper atmosphere, influencing cloud physics and estimates of chemical abundances.
All the model families considered in this study are 1D, combining simulations of the vertical atmosphere structure with radiative transfer codes to generate synthetic spectra.VHS 1256 b exhibits significant variability, suggesting a complex atmosphere potentially marked by an inhomogeneous cloud cover.It is evident that 1D models alone will not suffice to capture the complexity of these objects existing in three dimensions.Currently, significant efforts are underway to interpret such variability by developing a new generation of 3D models of atmospheres (Showman et al. 2020;Tan & Showman 2021a, 2021b;Plummer & Wang 2022) that aim to map the surfaces of exoplanets observed by JWST and, in the near future, by the Extremely Large Telescope (ELT).
Finally, the recent identification of isotopologue ratios as tracers of formation motivates the development of models that can explore their potential different values.Recently, isotopologues ( 13 C, 18 O, and 17 O) were identified in the atmosphere of VHS 1256 b through comparisons of JWST data with synthetic spectra generated within a retrieval framework (Gandhi et al. 2023).It is important to investigate the capability of self-consistent models in detecting these chemical elements, estimating their abundance, and studying the impact of systematic errors in the models on these values.
Expansion of grids is currently underway to address this new avenue of research.

Summary and Conclusion
The main objective of this study was to conduct a comprehensive analysis, employing the forward-modeling approach, of the most detailed spectrum obtained to date for an imaged planetary-mass companion.The aim was to assess the capabilities and limitations of the current generation of five different self-consistent atmospheric models and determine the extent to which we can characterize the atmospheres of such objects.
In Section 3, we presented an innovative method that took advantage of the extended wavelength coverage provided by JWST.By dividing the SED into multiple spectral windows, we explored the dispersion of parameter estimates across these windows and propagated this dispersion to determine new error bars for each atmospheric property.Our analysis revealed that relatively robust estimates could be obtained for T eff , log(g), and radius.However, we observed that the log(g) and radius tended to be underestimated compared to the predictions of evolutionary models.This discrepancy resulted in underestimated values of the mass.On the other hand, the calculated luminosity, derived from T eff and radius, was in good agreement with evolutionary models.Interestingly, we observed a curious trend of increasingly large fitted bolometric luminosities as a function of wavelength region fit, which may be attributed to a misrepresentation of dust, haze, and clouds in the models of atmospheres.Estimating chemical abundances ([M/H] and C/O), as well as parameters such as γ and f sed , proved to be more challenging due to the significant dispersion observed along the SED.Furthermore, we found that the choice of the model used for the fit had a significant impact on the final parameter values, underscoring the importance of careful model selection in interpreting atmospheric properties, confirming the results of Lueber et al. (2023) regarding the estimation of log(g).
In Section 4, we employed a targeted approach to constrain the [M/H] and C/O ratios by focusing on specific molecular and atomic absorption features known to be sensitive to these chemical abundances.By optimizing the fitting strategy for each absorption feature, we obtained consistent estimates of T eff and log(g) with those derived in Section 3. Regarding [M/ H], different scenarios emerged depending on the choice of the atmospheric model.The Sonora model suggested a supersolar metallicity, while the DRIFT-PHOENIX and ATMO models indicated a subsolar metallicity.Lastly, the Exo-REM model tended to converge toward solar values.Moreover, obtaining a robust estimate of the C/O ratio proved to be challenging as well.The ATMO model suggested supersolar values, similar to the Exo-REM model when considering CO, and coherent with previous studies (Hoch et al. 2022;Petrus et al. 2023).Nevertheless, in both cases, the inversion reached the edge of the grid.
After considering various sources of dispersion in the estimation of atmospheric parameters, such as the high variability of VHS 1256 b and potential biases during data reduction, we turned our attention to systematic errors in the five different models used.Indeed, there is a noticeable discrepancy in accurately reproducing the complexity of the atmosphere of VHS 1256 b, encompassing factors such as the clouds' chemical composition and dispersion, the properties of dust and haze, and various physical processes (vertical mixing, sedimentation, chemical disequilibrium, etc.).These systematics inherently contribute to biases in the determination of atmospheric parameters.Identifying and rectifying them will constitute one of the main challenges of exoplanetary atmospheric characterization in the coming decade.
To characterize the silicate absorption feature detected, we took advantage of the fact that its absorption band at ∼10 μm is not reproduced by the models.This allowed us to estimate the pseudocontinuum between 8 and 12 μm, which enabled the calculation of the equivalent width of the silicate absorption.We compared it to those obtained from the silicate absorption detected in the spectra of isolated low-mass objects observed at low resolution with Spitzer.Through this analysis, we were able to confirm the correlation between the depth of the silicate absorption and the spectral type, as previously identified by Suárez & Metchev (2022).Additionally, we identified a connection between the silicate absorption feature and the age of the objects.
All of these results have demonstrated the capability of models to partially reproduce the observed data.The use of JWST data has allowed the development of optimized inversion procedures, which take into account the systematics present in the models.By performing a multimodel strategy and propagating these systematics into the error bars of the estimated parameters, we ensure a more accurate and realistic characterization of the atmospheric properties and avoid any potential overinterpretation of the results.However, constraining the formation history of VHS 1256 b using its atmospheric properties estimated from the forward-modeling approach remains challenging.A supersolar metallicity is usually used to quantify the solid material that is accreted during the formation, and the C/O is considered as a tracer of the birth location of the planet (see Nowak et al. 2020;Petrus et al. 2021;Hoch et al. 2022;Palma-Bifani et al. 2023), but with this study, we have shown that the estimates of these commonly used tracers of formation were strongly dependent on the model and the spectral information used for the fit.Moreover, Mollière et al. (2022) have demonstrated that the relationship between chemical abundances and the formation mechanisms described by the models (e.g., Öberg et al. 2011) is more complex than initially anticipated, and it is dependent on various intricate parameters, particularly in the case of formation within a disk.The chemical composition of the protoplanetary disk from which VHS 1256 b may have formed is unknown, but it needs to be estimated in order to establish reference values.An alternative approach is to estimate it from the host star.However, in our case, VHS 1256 (AB) is a shortperiod binary, which poses significant challenges in accurately determining its chemical abundances.Furthermore, Mollière et al. (2022) have demonstrated the significant influence of various factors on the evolution of the disk structure, such as gap opening, winds, and self-shadowing.They have also highlighted the impact of pebble evaporation inside ice lines on the accretion of chemical elements (in both gas and solid phases) during the planet's formation.These processes can potentially affect the overall metallicity and the carbon-tooxygen ratio (C/O).Lastly, the authors emphasize that the chemical composition estimated solely from spectroscopic data may not necessarily reflect the bulk composition of the planet but rather describes the photospheric composition.In order to truly connect the measured atmospheric composition and the original disk composition, it is crucial to gain a comprehensive understanding of the mechanisms that facilitate vertical chemical exchange between the upper and deeper layers of the planetary atmosphere.
Due to the presence of these unknown factors, it seems challenging to definitively determine the mode of formation of VHS 1256 b, and more generally of planetary-mass objects, by exclusively exploring its atmosphere.Nevertheless, the development and systematic application of spectral inversion methods, such as the one presented in this study, to the populations of imaged planetary-mass objects (both star companions and isolated objects) will enable a "comparative" analysis of their properties, evolving into a "statistical" study over the course of the JWST mission and once ELTs become operational around 2030.
Figure 12.Same as Figure 2. A zoom has been performed to illustrate the differences between the models and the data with more details.In this figure, the zoom was chosen to represent the data acquired with the channel G140HF/100LP of NIRSpec between 0.97 and 1.89 μm.
Figure 13.Same as Figure 12 but with the zoom chosen to represent the data acquired with the channel G235HF/170LP of NIRSpec between 1.89 and 3.17 μm.
Figure 14.Same as Figure 12 but with the zoom chosen to represent the data acquired with the channel G395HF/290LP of NIRSpec between 3.17 and 5.27 μm.

Appendix B Interstellar Extinction as a Free Parameter
Figure 16 illustrates the best-fitting model, achieved when integrating the interstellar extinction A V as defined by Draine (2003) into the list of parameters explored by ForMoSA, following the concept introduced by Petrus et al. (2020).Notably, the c r 2 value is substantially diminished in comparison to cases where A V is omitted (with the exception of DRIFT-PHOENIX and Sonora).However, despite these improvements, notable deviations persist in the reproduction of the observed data.

Figure 2 .
Figure2.Comparison between the data and the interpolated synthetic spectra generated with the set of parameters that maximized the likelihood for each model.The best fit using the full SED is depicted with unicolor lines, while the best fits using different spectral windows are illustrated with bicolor lines.The data are represented by black lines.All spectra have been normalized with the bolometric luminosity of VHS 1256 b to simplify the figure, and an offset was applied.The use of spectral windows enhances the quality of the fit for each model.We note that a possible underestimation of the error bars of the data may play a role in the high value of c r 2 , which is provided as a comparative measure of the performance of the best-fit models, as well as the number of degrees of freedom (DoF) that have been used to calculate it.The top right panel shows the relative contribution of each spectral window to the fit using the full SED.It corresponds to C C win win ot t when C win is calculated from Equation (1) and C win ot t is the sum of C win over the 15 spectral windows.The NIRSpec channel G395HF/290LP, from 2.87 to 5.27 μm, with its higher spectral resolution and S/N, carries the greatest weight in the calculation of the likelihood.See Appendix A for a zoomed version of this plot.
synthetic flux generated by the model m at the wavelength λ with the minimum and maximum values of the parameter p explored by m, respectively, and N Δλ is the number of data points contained in Δλ.The remaining parameters are estimated using the posteriors of the fits corresponding to the specific spectral window and model considered.Therefore, these index values are only relevant for objects with atmospheric properties similar to VHS 1256 b.If, for a given model m, the spectral information F m,λ (p) within the considered window is not affected significantly by the parameter p, the difference l

Figure 3 .
Figure3.Estimation of the T eff (left panel) and log(g) (right panel) using the full SED during the fit (dashed colored line) and using the spectral windows (colored dots).The size of the dots indicates the sensitivity of each spectral window to each parameter.If the sensitivity is too low to be represented by a dot, a colored cross is used.Each panel corresponds to a different model, with the parameter range indicated by the filled colored area.The final value and error extracted from these results are shown by the solid gray line and gray area, respectively.The dotted black lines represent the value estimated byDupuy et al. (2023), who interpolated the evolutionary model fromSaumon & Marley (2008)  assuming the bolometric luminosity and age of VHS 1256 b.The two solutions they have found are given.

Figure 4 .
Figure 4. Same as Figure 3 but for the [M/H] (left) and the C/O (right).The dotted black lines represent the solar values 0.0 and 0.55 for [M/H] and C/O, respectively.

Figure 5 .
Figure 5. Same as Figure 3 but for the adiabatic index γ (left) and the sedimentation factor f sed (right).

Figure 6 .
Figure 6.Same as Figure 3 but for the radius R.This parameter is estimated from the dilution factor C K considering the distance from Brown et al. (2021).

Figure 7 .
Figure7.Comparison between the data and the interpolated synthetic spectra generated with the set of parameters that maximized the likelihood for each model and for each spectral feature considered.The red area represents the wavelength coverage used to estimate T eff and the dilution factor C K , while the green area represents the wavelengths used to fit the spectral features with T eff and C K fixed.The calculated c r 2 values for each fit are provided, allowing for a comparison of each model's ability to reproduce the data.

Figure 8 .
Figure 8. Parameter estimates provided by each model.The results obtained using each spectral feature are represented by different line formats.The colored areas represent the parameter spaces explored by each grid.The black lines indicate the estimates of T eff and log(g) from Dupuy et al. (2023) for the two mass scenarios, as well as the solar values of [M/H] and C/O.

Figure 9 .
Figure 9. Same as Figure 3 but for the bolometric luminosity L bol (left panel) and the mass M (right panel).These parameters are estimated from the Stefan-Boltzmann law and the gravitational law, respectively.
displays I Si as a function of spectral type for each object in the Spitzer library, including VHS 1256 b.The top right panel reproduces the silicate index as defined by

Figure 10 .
Figure 10.Comparison between three examples of silicate absorption identified in the Spitzer library and the detection in the JWST data of VHS 1256 b.The spectral resolution of the JWST data has been reduced to match that of Spitzer for a fair comparison.The red area represents the spectral window used to estimate the pseudocontinuum, while the green area indicates the spectral window used to calculate the spectral index.In this figure, we specifically present the case of BT-Settl as an illustrative example.

Figure 11 .
Figure 11.The silicate index as a function of spectral type for the objects in the Spitzer library and VHS 1256 b.The top right panel displays the results obtained using the method defined by Suárez & Metchev (2023) for comparison.The other panels present the results obtained in this study, where the silicate index is calculated as an equivalent width.The open circles and filled circles represent young and field objects, respectively, and the star represents the estimate for VHS 1256 b.The colored areas indicate the spectral type ranges covered by the five different models used in the analysis.

Figure 15 .
Figure 15.Same as Figure 12 the zoom chosen to the data acquired MIRI 17.98 μm.

Figure 16 .
Figure 16.Same figure as Figure 2 but with the exploration of the interstellar extinction A V .The estimate of the other parameters, obtained with each model, are given in each panel, respectively.
Figure1.Spectral resolution as a function of the wavelength coverage allowed by the NIRSpec + MIRI data.The observed data are represented by the black lines.The five models are depicted by the colored lines: ATMO in green, Exo-REM in blue, Sonora in yellow, BT-Settl in red, and DRIFT-PHOENIX in purple.For the models, the spectral resolution has been calculated assuming a Nyquist sampling rate.

Table 1
Parameter Spaces Explored by the Models

Table 2
Spectral Windows Defined for the NIRSpec Data

Table 4
Summary of the Estimations of the Parameters Explored by the Grids

Table 5
Summary of the Estimations of the Parameters Not Explored by the Grids

Table 6
Spectral Windows Were Employed to Fit the Four Spectral Features: K I lines, CO Overtones, CH 4 Absorption, and CO Forest