An Empirical Bayesian Approach to Limb Darkening in Modeling WASP-121b Transit Light Curves

, , , , , , , , and

Published 2021 June 3 © 2021. The American Astronomical Society. All rights reserved.
, , Citation Fan Yang et al 2021 AJ 161 294 DOI 10.3847/1538-3881/abf92f

Download Article PDF
DownloadArticle ePub

You need an eReader or compatible software to experience the benefits of the ePub3 file format.

1538-3881/161/6/294

Abstract

We present a novel, iterative method using an empirical Bayesian approach for modeling the limb-darkened WASP-121b transit from the TESS light curve. Our method is motivated by the need to improve Rp/R* estimates for exoplanet atmosphere modeling and is particularly effective with the limb-darkening (LD) quadratic law requiring no prior central value from stellar atmospheric models. With the nonlinear LD law, the method has all the advantages of not needing atmospheric models but does not converge. The iterative method gives a different Rp/R* for WASP-121b at a significance level of 1σ when compared with existing noniterative methods. To assess the origins and implications of this difference, we generate and analyze light curves with known values of the LD coefficients (LDCs). We find that noniterative modeling with LDC priors from stellar atmospheric models results in an inconsistent Rp/R* at a 1.5σ level when the known LDC values are the same as those previously found when modeling real data by the iterative method. In contrast, the LDC values from the iterative modeling yield the correct value of Rp/R* to within 0.25σ. For more general cases with different known inputs, Monte Carlo simulations show that the iterative method obtains unbiased LDCs and correct Rp/R* to within a significance level of 0.3σ. Biased LDC priors can cause biased LDC posteriors and lead to bias in the Rp/R* of up to 0.82%, 2.5σ for the quadratic law and 0.32%, 1.0σ for the nonlinear law. Our improvement in Rp/R* estimation is important when analyzing exoplanet atmospheres.

Export citation and abstract BibTeX RIS

1. Introduction

Stellar limb darkening (LD) is an optical effect caused by photons originating shallower inside the star as distances from the center of a star increase. As a consequence, stellar surface brightness decreases with increasing radius. Being able to represent surface brightness and LD accurately is fundamental to exoplanet transit modeling. Typically, LD is characterized by what is referred to as "LD laws." The most popular laws are the linear, quadratic, and nonlinear LD laws (Claret 2000; Claret & Bloemen 2011). Deriving LD coefficients (LDCs) by finding adjustments from stellar atmosphere spherical modeling has been proved effective, even though it sometimes induces biases for exoplanet parameters, e.g., to the radius ratio of exoplanet to star (Rp /R*; Espinoza & Jordán 2015). Removing or at least reducing this potential uncertainty in Rp /R* caused by LD modeling is important when constraining the exoplanet atmosphere through precise measurements of bandpass-dependent Rp /R* (Deming et al. 2005; Karkoschka & Tomasko 2011; Madhusudhan 2019).

In recent applications of exoplanet transit fitting, LD has been implemented with floating coefficients (LDCs) with a Gaussian prior based on theoretical stellar atmosphere models (Demangeon et al. 2018; Evans et al. 2018; Nikolov et al. 2018). This approach is a compromise as a result of quality-limited data and stellar atmospheric model approximations. Fully free LDCs in fitting cause a large uncertainty in the estimation of LDCs, and this uncertainty propagates to the other transit parameters (Bordé et al. 2010; Cabrera et al. 2010). Applying LDCs from an independent method can significantly reduce the uncertainty. Unfortunately, LDCs are difficult to obtain accurately. Direct observation is only available when host stars are close enough to be spatially resolved (Haubois et al. 2009). Microlensing is another independent method to measure the stellar LD (Witt 1995; Dominik 2004; Zub et al. 2011). Theoretical predictions from stellar atmospheric models sometimes show inconsistencies in their inputs (e.g., PHOENIX and ATLAS; Kurucz 1979; Husser et al. 2013). These inconsistencies are reported to cause significant biases to Rp /R* at a level of 1%–10% when fixing LDCs during transit fitting (Csizmadia et al. 2013; Espinoza & Jordán 2015).

Howarth (2011) showed that the LDCs obtained from stellar atmosphere and transit models are sometimes different due to the different fitting processes (synthetic-photometry/atmosphere-model effects). More interestingly, the difference yields approximately constant values for the same planet (within reasonable stellar parameter ranges). The discussion about geometry is described in, for example, Espinoza & Jordán (2015). The offset due to geometry is also nearly constant but only for 3500 K ≤ Teff ≤ 7500 K (see Figure 6 of Espinoza & Jordán 2015).

Another unsolved issue is the standard deviation (σ) of the LDC prior, which is usually set as a Gaussian distribution (e.g., Silvotti et al. 2014; Chen et al. 2018; Luque et al. 2019). The σ should take into account knowledge obtained from stellar modeling–predicted LDCs, the transit fitting experience, and observational data. The selected value of σ is quite different among different works (Wang et al. 2013; Siverd et al. 2018; Barkaoui et al. 2019; Wang et al. 2019; Shporer et al. 2020). The LDC prior significantly affects the final LDC value for transit fitting.

The LD error propagation to Rp /R* is reported to sometimes be up to 1% when applying theoretical LDCs as a prior (Müller et al. 2013; Espinoza & Jordán 2015). This Rp /R* uncertainty is not negligible when analyzing exoplanet atmospheres through wavelength-dependent Rp /R* (Seager & Deming 2010; Madhusudhan 2019). For atmosphere-detected sources, Rp /R* varies in different observational bands from a few tenths of a percent to a few percent, correlated with the width of the bands. The observational significance level ranges from less than one to a few σ (Charbonneau et al. 2002; Sing et al. 2016; Evans et al. 2017). It has been reported that the bias can be reduced but not fully prevented when applying higher-order LD laws or using higher-precision photometry, e.g., Kepler, the Transiting Exoplanet Survey Satellite (TESS), and, once launched, the James Webb Space Telescope (Csizmadia et al. 2013; Müller et al. 2013; Espinoza & Jordán 2015). The creation of unbiased Rp /R* modeling is still in the exploratory stage.

Some pioneering works have been performed by applying stellar atmosphere model–independent LDC priors using Kepler light curves and simulation light curves (Csizmadia et al. 2013; Müller et al. 2013). They obtained the Rp /R* bias caused by different synthetic stellar atmosphere models and compared the result with parameters derived with free LDCs. A transit light curve with a signal-to-noise ratio (S/N) of ∼10 is reported to cause a 2% uncertainty in Rp /R* when applying a fully free LDC prior (Csizmadia et al. 2013).

Facing the same challenge in the TESS era, we present an iterative method in the framework of empirical Bayes (Casella 1985) to obtain LDCs without resorting to a synthetic stellar model and perform a proof-of-concept assessment. The paper is organized as follows. Section 2 describes the iterative method and its application to the light curves of WASP-121b in the TESS survey. In Section 3, we perform Monte Carlo simulations to assess the transit parameters derived. We examine the biases of transit parameters caused by biased LDC priors when applying classic (hereafter termed "noniterative") methods. Any multimodality of the LDCs is addressed in this section. We also examine whether there is any evidence of overfitting for both the quadratic and nonlinear law. Section 4 summarizes our findings, particularly for exoplanet atmosphere modeling.

2. Iterative Method for LD with TESS Light Curves

The LD interacts with other transit parameters in transit fitting (Mandel & Agol 2002). In this section, we describe an iterative method for implementing LD and apply it to WASP-121b. Also, we present noniterative fittings applying float LDCs with Gaussian priors to WASP-121b and a TESS-identified source, HD 219666b, for comparison.

The discovery of exoplanet WASP-121b was reported by Delrez et al. (2016). The host star has a mass of ${1.353}_{-0.079}^{+0.080}{M}_{\odot }$, a radius of 1.458 ± 0.030 R, and a luminosity of 10.4 mag in the V band. The planet is a hot Jupiter with a period of 1.28 days, a mass of ${1.183}_{-0.062}^{+0.064}$ MJ, and a radius of 1.865 ± 0.044 RJ.

Exoplanet HD 219666b was discovered by Esposito et al. (2019) using TESS data. It is a hot Neptune around a G7 star (mass 0.92 ± 0.03 M, radius 1.03 ± 0.03 R) with a period of 6.04 days. The planet has a mass of 16.6 ± 1.3 M and a radius of 4.71 ± 0.17 R.

2.1. Mathematical Justification of the Iterative Method: Empirical Bayes

The Bayes theorem is used to infer parameters from our preknowledge and the observations. Bayesian inference (Feroz et al. 2009) can be expressed as

Equation (1)

where Θ stands for parameters, D for the data, and M for the model. Here P(Θ D , M) is the posterior distribution, P( D Θ, M) is the likelihood, P(ΘM) is the prior, and P( D M) is the Bayesian evidence.

The model (M) contains the Bayesian idea of theories, experience, and the weight toward former and new observations. The prior is a model prediction that takes the above into account (Efron 2010). The prior shapes the form of the posterior distribution. However, a good model-based prior is sometimes hard to obtain, e.g., the arbitrary choice of σ in the LDC prior in planet fitting (Shporer et al. 2020). The empirical Bayes method (Morris 1983; Casella 1985) solves this issue in a general way by obtaining the prior from both the new data and the former knowledge.

The external knowledge is treated as former observations D 1, D 2, ..., D m , with the new observations as D m+1, D m+2, ..., D n . The new observations here refer to different data sets from either different observing runs or just resampling of one sample, e.g., jackknife or bootstrap sampling. A basic assumption is that the relation between the observations and the parameters Θ is the same among D 1, D 2,..., D n . If we use the prior obtained purely from D 1, D 2, ..., D m , the empirical Bayes is the same as the nonempirical Bayes.

The empirical Bayes allows a prior distribution form to be assumed from former observations and obtains the parameters of the distribution form through both the former observations and the new observations (data). The prior obtained is finally applied in the Bayes inference (Equation (1)). The empirical Bayes has recently become a widely used method for Bayes inference (Figueiredo & Nowak 2001; Malinverno & Briggs 2004; Wipf & Rao 2007). The conditions under which we use the method are similar to those in other astronomical work, for example, Anderson et al. (2018) and Osborne et al. (2020).

In our approach, we apply an empirical Gaussian prior with the σ value from foreknowledge and iteratively fit the data to find the proper prior center for LDCs. Iteration is commonly used with the empirical Bayes (Efron 2010). It is more mathematically sound to use subsamples generated from the observational data in the iterative process in considering if the data sets were not identically distributed and if there were any abnormal data points. The subsample here can be obtained by a resampling method, e.g., jackknife or bootstrap. In each iteration cycle, the subsample is updated, and the model parameters inherit from the previous iteration. The subsample can be replaced by the whole sample if the difference between using a subsample and the whole sample is not evident. The rationale behind this is that reuse of the whole data set is being treated as the use of an identical distribution. In this work, the subsample is replaced by the whole data set. We have tested the subsample approach by randomly omitting 10% of the whole data set in each iteration and found only negligible differences from just using the whole data set for all iterations. In addition, the resampling does not have much practical significance in our empirical Bayes approach. It would be impossible for us to improve the precision of transit parameters at the level of 1σ–2σ if the data sets were not identically distributed.

2.2. Iterative Method of LD Determination

A parameterized transit model is typically used (Mandel & Agol 2002) to determine the LD by fitting transit light curves. An alternative choice proposed recently is through tabulated stellar intensities (Short et al. 2020). Markov Chain Monte Carlo (MCMC) methods and multimodal nested sampling algorithms (MULTINEST) have proved to be effective in determining the multidimensional distribution of parameters (Feroz & Hobson 2008; Feroz et al. 2009; Patil et al. 2010; Foreman-Mackey et al. 2013). We obtain our fitting results mainly from an MCMC routine (PYMC; Patil et al. 2010) and find that the results are similar when we replace PYMC with a MULTINEST routine (PyMultiNest; Buchner et al. 2014).

Our method is particularly effective for the case when the stellar atmosphere model–predicted LD is significantly different from the LD obtained from transit fitting. The difference could come from the systematic errors in the stellar models or the different fitting processes (Howarth 2011). It is clear from Howarth (2011) that offsets between the LDC values from different modeling schemes are approximately constant for the same planet in different wavelengths (see their Figures 6 and 7). Also, the LDC offsets from different models are almost constant or weakly dependent on the stellar effective temperature. The offset values for different planets are related to, e.g., the impact factor of the planet system, the temperature, and the gravity of the star. This nearly constant offset acts to support an approach that knowledge of LD uncertainty from stellar atmospheric models is also helpful for setting the LD prior σ during transit fitting. The widths of the distributions inferred from the model grids should not, however, be severely affected.

Our method iterates the MCMC fitting process to the light curves, rather than just taking the result of a single MCMC fitting process. In every iteration loop, the LDC prior updates with the fit result from the previous loop. Every loop is an independent MCMC transit fitting process and does not differ from the classic transit fitting as described in Section 2.3. For the avoidance of doubt, no MCMC chain-breaking takes place.

The LDCs start with arbitrary initial priors. We determine the LDCs' mean values from the morphology of the transit light curves. The LD output from light-curve fitting is set as the input prior to the next iteration. For the most general case, it would be possible to define the iteration step as some fraction of the difference between two iterations. We currently do not use that level of generality, and our fraction is 100%. A theoretical discussion of the choice of the iteration step is beyond the scope of this work.

Our LDC prior for each parameter is a separate Gaussian with a σ of 0.05. A Bayesian interpretation of the prior is a mean value with a σ of 0.05, predicted by the fitting model that includes, e.g., stellar atmospheric knowledge, the transit fitting experience, and TESS data knowledge. The resultant effect is very hard to quantify and causes difficulties in LDC prior σ applications (Shporer et al. 2020). Utilizing the "control variate method" to examine the relative influences of stellar atmospheric knowledge, the transit fitting experience, and TESS data knowledge would shed light on the quantitative prior σ choice. However, it is very hard to achieve and beyond the scope of this work. We apply an empirical LD prior σ taken from similar works. The σ value of 0.05 is empirically used in TESS exoplanet fitting (Wang et al. 2019). For the noniterative method, the prior central value is mainly derived from stellar atmospheric models. For the iterative method, the prior center is obtained from the new observations.

The fitting is repeated 10 times with the same prior, and we take the median value as the LDC output to reduce fluctuations. We repeat the loop if the standard deviation of LDCs among duplicated fittings is larger than 0.05. We also repeat the iterative loop when the difference in the LDCs compared to the previous loop is larger than 0.05. These actions improve the convergence speed.

2.3. Light Curves Derived from TESS Photometry

TESS (Ricker et al. 2015) was launched in 2018, aiming to discover transiting exoplanets in the solar neighborhood. TESS employs four cameras with a total field of view (FOV) of 24 × 96 deg2. TESS captures 30 minute cadence images, named full-frame images, for all of the sources in the FOV and 2 minute images, as well as photometry products, for certain sources. Photometry precision reaches ∼1% at 16 mag in a broad optical band (0.6–1 μm).

We use 2 minute cadence image frames, named the target pixel file (TPF), to generate light curves. The pipeline versions are spoc-3.3.57-20190215 for WASP-121b and spoc-3.3.36-20180925 for HD 219666b. A comprehensive description of the data reduction was presented in earlier work (Yang et al. 2019, submitted). We briefly overview the reduction here. Also, we have compared our result with the light curve generated by TESS Science Processing Operations (PDC light curves). The differences in the light curves are negligible for the transit fitting in this work.

Our photometry pipeline starts by checking and correcting the stellar astrometry relative to the nominal position of the host star of the exoplanet as measured by Gaia. Our photometry measurement is then taken from a circular aperture of 3 pixels, corresponding to 63''. The sky background is accounted for as the median value of the pixels that constitute the lowest fifth percentile in flux in the vicinity of the host star. The quadrature sum of the standard deviation of these pixels and the Poisson noise of the source itself is used as the photometric uncertainty.

Contamination flux from nearby unresolved stars is removed based on a relationship between the flux brightness profile and the distance to the Gaia centroid of the unresolved stars. The flux percentage removed is 31.49% ± 1.79% and 0.2% ± 0.007% for WASP-121b and HD 219666b, respectively. Possible blending from a binary companion should be subtracted as well. However, for these two sources, there is no evidence of a binary companion in earlier literature (Delrez et al. 2016; Esposito et al. 2019).

Detrending the light curve to remove the long-term trends is needed for exoplanet transit fitting (Gibson et al. 2012; Vanderburg & Johnson 2014). WASP-161b has a well-measured ephemeris for transits. This enables us to apply a simple but effective detrending algorithm based on the ephemeris (Yang et al. 2020). We extract the light curve centered at the transit midpoint time. The light curves during the transit are then masked and fitted with a linear function. The fitted function is used to apply a correction to the light curve including the transit phase. Then, we fold the light curve around the transit phase, as shown in Figure 1. We also perform tests with higher-order polynomial functions (up to order 10) and a cubic spline function (Daylan et al. 2021) that give negligible differences when fitting the transit light curves.

Figure 1.

Figure 1. Fits to the TESS transit-folded light curves of WASP-121b (top) and HD 219666b (bottom). The lower panel in each plot gives the residual map. The blue points are the observed data, the green points are binned to 30 minute cadence for clarifying, the red solid line is the best-fit transit model, and the yellow region is the 1σ confidence region of the fits. The midpoint of the transit event is shifted to half the period, at the center of the x-axis.

Standard image High-resolution image

2.4. The Application of the Iterative Method to TESS Light Curves

We apply our new method to the TESS light curves of WASP-121b (transit depth S/N ∼ 9.6). We present HD 219666b (transit depth S/N ∼ 2.7) using noniterative transit fitting as a comparison. The systems have already been identified as containing exoplanets, so we take the same period, orbit type (circle orbit), and binary companion contamination from the discovery papers (Delrez et al. 2016; Esposito et al. 2019).

The difference between the iterative and noniterative methods is described in Section 2.1. The operation in a single iteration is the same as the classic MCMC fitting. A single iteration requires 50,000 steps, ignoring the first 30,000 steps as burn-in. The number of steps is determined by a trial run, which shows that the chain is stable after the first 30,000 steps. The trial also indicates a negligible difference if we run for yet more steps, e.g., 200,000. The free parameters in MCMC fitting are Rp /R*, the inclination of the planet orbit (i), the semimajor axis in stellar radii (a/R*), the time of transit center (T0; the transiting midpoint is shifted to half of the period), and the LD parameters. The priors for the free parameters except LD are all uniform (shown in Table 1).

Table 1. Free Parameters of Light-curve Fitting

ParametersDescriptionWASP-121bWASP-121bHD 219666b
  Iterative MethodNoniterativeNoniterative
Parameter Prior
Rp /R* Planet/star radius ratio ${ \mathcal U }$[0, 0.2] ${ \mathcal U }$[0, 0.2] ${ \mathcal U }$[0, 0.1]
i Inclination ${ \mathcal U }$[70, 110] ${ \mathcal U }$[70, 110] ${ \mathcal U }$[70, 90]
a/R* Semimajor axis of planet orbit in stellar radii ${ \mathcal U }$[0, 30] ${ \mathcal U }$[0, 30] ${ \mathcal U }$[13.3, 0.3]
T0Time offset of transit center ${ \mathcal U }$[0.3*1.27, 0.6*1.27] ${ \mathcal U }$[0.3*1.27, 0.6*1.27] ${ \mathcal U }$[0.3*6, 0.6*6]
u1 Linear LD ${ \mathcal N }$(0.33, 0.05) ${ \mathcal N }$(0.33, 0.05)
u2 Quadratic LD ${ \mathcal N }$(0.21, 0.05) ${ \mathcal N }$(0.20, 0.05)
c1 Quadratic LDC ${ \mathcal N }$(2.84, 0.05)
c2 Quadratic LDC ${ \mathcal N }$(−4.93, 0.05)
c3 Quadratic LDC ${ \mathcal N }$(4.77, 0.05)
c4 Quadratic LDC ${ \mathcal N }$(−1.64, 0.05)
Fitting Result (Quadratic Law)
u1 Linear LDC0.24 ± 0.030.23 ± 0.040.32 ± 0.03
u2 Quadratic LDC0.09 ± 0.030.16 ± 0.040.19 ± 0.03
Rp /R* Planet/star radius ratio0.1239 ± 0.00030.1234 ± 0.00040.0418 ± 0.0004
i Inclination89.5 ± 1.589.9 ± 1.686.45 ± 0.13
a/R* Semimajor axis of planet orbit in stellar radii3.80 ± 0.033.81 ± 0.0313.33 ± 0.31
Standard deviation of residualParts per million17631763652
Reduced χ2 Reduced χ2 1.00041.00041.0005
Fitting Result (Nonlinear Law) a
c1 Quadratic LDC1.54 ± 0.042.81 ± 0.04
c2 Quadratic LDC−2.42 ± 0.04−4.96 ± 0.04
c3 Quadratic LDC2.28 ± 0.044.74 ± 0.04
c4 Quadratic LDC−0.76 ± 0.04−1.67 ± 0.04
Rp /R* Planet/star radius ratio0.1237 ± 0.00030.1234 ± 0.0003
i Inclination89.9 ± 2.090.0 ± 1.8
a/R* Semimajor axis of planet orbit in stellar radii3.80 ± 0.033.81 ± 0.03
Standard deviation of residualParts per million17631763
Reduced χ2 Reduced χ2 1.00041.0004

Note.

a Quadratic law only fitted for HD 219666b to be consistent with identification paper (Esposito et al. 2019).

Download table as:  ASCIITypeset image

An inclination prior of ${ \mathcal U }$[70, 90] is commonly used in the literature but changed to be ${ \mathcal U }$[70, 110] for WASP-121b. The prior cutoff at an inclination of 90° is to avoid the posterior presenting two symmetry modalities centered at 90°. When the real inclination is close to 90°, the posterior distribution is severely affected by the cutoff. Also, the two modalities are both very close to 90° and negligible. The WASP-121b inclination shows a more reasonable posterior distribution without a 90° cutoff in prior.

2.5. Convergence and Multimodality for Quadratic and Nonlinear Models

We utilize the iterative method with both the quadratic and nonlinear LD laws. These laws are expressed with the following equations:

Equation (2)

Equation (3)

where μ is given by cos(γ), γ is the angle from the outward surface to our line of sight, and u1, u2, and cn are the LDCs.

The iteration is taken as convergent if the transit parameters reach a certain set of values and then only show small fluctuations subsequently. In this work, the convergence is treated as achieved if the fluctuation from loops 25–30 of the iteration is less than 0.005 (calculated from the LDC prior's σ/10). The iteration is convergent for WASP-121b when modeled with the quadratic law (as shown in Figure 2). The LDCs are constrained by the transit light curve to some extent when applying the noniterative method (Müller et al. 2013; Espinoza & Jordán 2015; Evans et al. 2018). Otherwise, the transit fitting in earlier works makes no sense. The partial constraint to LDCs in every single iteration fitting can be enlarged with our iterative method and thereby forms an LDC prior independent fitting method.

Figure 2.

Figure 2. Iteration process for the quadratic law. The blue diamonds show the root squared difference (as defined by Equation (4)), the green points show u1, and the black points show u2. The error bar of each coefficient is the standard deviation of the parameter from 10 fittings with the same input prior. The error bar of the root squared difference is derived from the error of each LDC following the error propagation.

Standard image High-resolution image

For the nonlinear law, the transit parameters do not change much during an iteration, within 0.5σ derived from MCMC fitting. The transit parameters, except for the LDCs, reach convergence in the first loop. This is consistent with the conclusion from earlier literature (Espinoza & Jordán 2015) that a more complex LD model gives more reliable transit parameters for the noniterative (single iteration) method.

However, the LDCs for the nonlinear law do not converge to specific values. The standard deviation of the fitting residual is about the same for different LDCs, implying multimodality, which is a typical issue in statistics (Geyer & Thompson 1995).

The multimodality of the LDCs does not appear in a single MCMC fitting. The MCMC having difficulty in jumping between the local modes is a common occurrence (Metropolis et al. 1953; Tak et al. 2018). This increases the risk of underestimating the uncertainty in the LDCs and other parameters. How to resolve this is a much-debated topic in the field of machine learning (Møller et al. 2006; Tak et al. 2018; Paulin et al. 2019).

2.6. LD Result

The estimates of the transit parameters are taken from the "final fitting" (as shown in Table 1). To arrive at our final transit fitting, for the quadratic law, we take the LDC values at convergence, while for the nonlinear law, we take the LDC values after the fifth iteration. The fitted light curve when applying the quadratic law is as shown in Figure 1. The fitted light curve using the nonlinear law looks very similar and so is not included.

Using the convergence definition in the previous section, the quadratic law converges. To assess the speed and convergence of the iterative method, we define the root squared difference as

Equation (4)

where ai,n is LDCi at the nth iteration loop, and ai is the median value of LDCi when the iteration converges. The iteration procedure is shown in Figure 2. Each parameter shows an individual path to convergence. When converging, the iterations show a fluctuation within 0.005.

This work aims at discussing and reducing the bias of transit parameters to within a few σ. It is necessary to prove that our fitting does not induce extra uncertainties compared to the published works. To this end, HD 219666b is used for comparison purposes. We fit the light curve with the noniterative method using the same prior as the identification paper (Esposito et al. 2019) but replacing the median value of the prior with their reported result. This exercise is to understand whether we can obtain the same fitting parameter estimates when applying the same data and fitting method. The fitting applies the quadratic law as in Esposito et al. (2019). The transit parameter estimates from our MCMC fitting, as shown in Table 1, are consistent with Esposito et al. (2019) to within 0.3σ. We try the iterative method on HD 219666b but find that it does not converge. This nonconvergence is mostly due to the low S/N. A comprehensive analysis of convergence using simulations is described in the next section.

We also show the result for WASP-121b following the classic transit fitting method (no iteration), applying both the quadratic and nonlinear laws. The LDC priors use median values taken from the atmospheric models of the host star (Claret 2018) with a Gaussian distribution σ of 0.05. From the results (as shown in Table 1), we find that the transit parameters' posterior values are highly consistent (within 0.1σ) with the quadratic and nonlinear laws when applying the LDCs from the same stellar atmospheric model.

The results from the iterative and noniterative methods show significant differences in the LDC posteriors. More importantly, the difference in Rp /R* between the iterative and noniterative methods is 0.4%, 1σ for the quadratic law and 0.2%, 0.6σ for the nonlinear law. This difference level is crucial to consider when analyzing the exoplanet atmosphere (Yang et al. 2019, submitted; Seager & Deming 2010; Evans et al. 2018). Simulations are used to help assess these differences (see the next section).

3. Assessment of LD Applications Using Simulations

Monte Carlo simulations are performed to evaluate the biases and uncertainties of LDCs and the other transit parameters, especially Rp /R*, obtained by iterative and noniterative methods. The LDC biases, if present, and error propagation to other parameters are most important to evaluate. The simulations generate light curves with certain inputs, mainly based on the fitting result of WASP-121b (as shown in Table 1). Gaussian random noise is added to the light curve.

3.1. LDC Measurement from the Iterative Method

From our experiences on WASP-121b with real data, the LD quadratic law converges, but the nonlinear law does not. We evaluate the LDC measurement using Monte Carlo simulations.

Simulated light curves are constructed by applying Mandel & Agol (2002) models with the derived transit parameters of WASP-121b obtained with the iterative method (as shown in Table 1). Time sampling is the same as in TESS: 2 s of sampling but binned to 2 minutes. Random noise with a σ of 1763 parts per million (ppm) is added to the light curve. We apply the quadratic law iterative method as described previously. We repeatedly generate and fit new light curves 100 times in total. The starting LDC priors are randomly chosen and different every time. The MCMC process takes a few days to complete running on an 8-core processor with a 2 GHz clock frequency.

Out of the 100 modeling runs, 99 converge. Convergence is taken as achieved if the LDC fluctuation between iteration loops 25–30 is less than 0.005. The distribution of the LDCs when converged is shown in Figure 3. The LDC values are 0.25 ± 0.03 for u1 and 0.08 ± 0.06 for u2, which are very close to the input LDC values of 0.24 for u1 and 0.09 for u2.

Figure 3.

Figure 3. Distribution of LDCs from 100 simulations. The dashed lines in each histogram indicate the input value of the LDCs: 0.24 for u1 and 0.09 for u2. The estimated values of u1 and u2 are 0.25 ± 0.03 and 0.08 ± 0.06, respectively.

Standard image High-resolution image

As with real data, the nonlinear law applied iteratively does not converge for the simulated light curves. The LDC input values are set as the result from the real light curve modeled with the iterative nonlinear law. The input values of other parameters are taken from the quadratic law result. The analysis in this subsection from now on refers to quadratic law only.

We thus conclude that the quadratic law iterative method arrives at a correct LDC estimation, at least for WASP-121b TESS data. In the simulation test, we cover the parameters we see as important. The effectiveness of our iterative method depends on the photometry accuracy, binning, and transit parameters, e.g., transit duration, inclination, and stellar surface brightness (described by LDCs). Empirically, the parameters above are the most crucial in our topic, although more parameters affect the transit fitting. Photometry accuracy includes two parts, i.e., the noise in the photometry and the number of measurement points. In statistics, the two parts are connected by the relation

Equation (5)

where σl is the noise of data sets with larger binning intervals, σs is the noise of data sets with smaller binning intervals, and n is the ratio of the number of data sets. In the simulations, we take the same number of data points as the real data of WASP-121b. The light curves with worse photometry accuracy result in a lower S/N of Rp /R*.

We change the input values of these parameters in the simulations one at a time and fit the light curve to see if we obtain an output consistent with the input. For each set of parameters, we repeat the simulation 10 times to ensure statistical significance.

Our fitting results indicate that the output LDCs are consistent with the input values, except for light curves with low S/N, a high impact parameter (b), and unusually large LDC values.

The quadratic law iterative method does not converge when the S/N is below 5. The light curves have insufficient S/N and/or number of data points to model the LD properly when S/N ∼ 5. The standard deviation of the residual of the fitted light curve is the same as the noise added in the light curve. The well-fitted light curves with different sets of LDCs hint at LDC multimodality. The simulations imply that 8 is the lower limit for the S/N to ensure that the iterative method converges when the transit parameters are the same as WASP-121b.

The morphological distortion of light curves caused by finite integration times has been considered by Kipping (2010). We have discussed inclination and semimajor axis biases versus the binning in the TESS WASP-121b light curve in earlier work (Yang et al. 2019, submitted). We concluded that the 30 minute cadence caused significant bias to the estimation of inclination and the semimajor axis for WASP-121b. The binning threshold for nonbias detection in the inclination (semimajor axis) strongly depends on the transit duration, which is 5 minutes for WASP-121b.

We have assessed the impact on the iterative method of simulated light curves with a binning interval of fewer than 5 minutes. The iterations do converge, and parameters are obtained consistently. For the simulations with a binning interval longer than 5 minutes, we find that the iterations converge, and the LDCs obtained do not show any significant bias to the input value as shown in Figure 4.

Figure 4.

Figure 4. The LDC distributions for different binning intervals. The simulation of light curves including generating and fitting is repeated 10 times for each binning interval. The LDCs are obtained as the median values among 10 times repeated simulations. The error bars are obtained as LDC standard deviations among 10 simulations.

Standard image High-resolution image

A high impact parameter causes difficulties in transit fitting because the transit happens too close to the limb of the star. The impact parameter is defined as

Equation (6)

where a is the semimajor axis, i refers to inclination, and R* indicates stellar radius. The LDCs from light-curve fitting are significantly different from the stellar model–predicted LDCs when the impact parameter is high (Müller et al. 2013; Espinoza & Jordán 2015). In our simulations, the iterative method fails to converge when the inclination is close to 75°, which is the highest impact parameter among all possible transits. The situation is straightforward to understand when the transit just grazes the edge (limb) of the host star; the stellar surface provides little useful input to the transit modeling. Uncertainty in the LDC values propagates severely to the other transit parameters, especially Rp /R*, which disrupts the whole modeling process.

The iteration mechanism works for general cases of LD but becomes less effective when both the linear and quadratic coefficients are larger than 0.4. Such LDCs would be appropriate for log g ∼ 2.5 cgs and Teff ∼ 3000 K (Claret 2018), which are uncommon and very different from the main-sequence stars (Gaia Collaboration et al. 2018; Zhang et al. 2020).

3.2. Iterative Method Influence on Transit Parameters

Ideally, transit modeling, including LD modeling, should provide unbiased estimates of all parameters and, most importantly, Rp /R*. We perform simulations to assess the bias of transit parameters.

The simulations are performed using the quadratic law iterative method to check if the method can derive unbiased transit parameters. The simulation is the same as the one that gives the LDC distributions in Figure 3. We generate light curves and fit them with arbitrary initial LDC priors, repeating the simulations 100 times. The estimated transit parameters are consistent within 0.66σ with the input values to construct the light curves. The Rp /R* appears at 0.1240 ± 0.0004, which differs at 0.25σ to the input value (as shown in Figure 5). The differences in inclination and semimajor axis are at 0.35σ and 0.66σ. The simulation shows that the iterative method is effective in estimating unbiased transit parameters.

Figure 5.

Figure 5. The Rp /R* distribution from using the iterative method with 100 simulated light curves. The vertical line indicates the input value of Rp /R* at 0.1239.

Standard image High-resolution image

Moreover, the transit parameters (except LDCs) from the iterations after the first few are close to the final converged result. The difference in Rp /R* between the 5th, 10th, 15th, and final Rp /R* is negligible at the 0.2σ level. The result indicates that the LDCs after five loops, as well as the converged LDCs, are all suitable for citing in the "final fitting" if aiming at unbiased transit parameters like Rp /R*.

The transit parameters are not biased (Rp /R* within 0.3σ) if the iteration converges with changed photometry precision, transit duration, inclination, and LDCs when constructing the simulated light curves. The simulations also reveal that the transit parameters are usually not biased, even if the LDCs do not converge. The bias of transit parameters (except LDCs) arises when the inputs of simulated light curves are more extreme, e.g., an S/N less than 1.25, each LDC larger than 0.4, or an impact parameter larger than 0.7. The light curve contains encoded information about an exoplanet's transit across the face of a star, but it will not necessarily be able to constrain all transit parameters equally well. In particular, a light curve may not be suitable for estimating the LD parameters but may still be usable for estimating other transit parameters. For LD, this unsuitability presents itself as nonconvergence, possibly multimodality. When the suitability drops further and the data are not really usable for other parameters, e.g., Rp /R*, the estimates become biased.

Simulation tests with the nonlinear law perform similarly in parameter estimating. The simulation is the same as described in Section 3.1. The transit parameters (except LDCs) are almost stable after the first iteration. We take the LDC values from the 5th, 10th, and 15th iterations to be priors for the "final iteration." Unlike the unconverged LDC posterior parameters, the estimated Rp /R* is stable. The value is 0.1237 ± 0.0004, which is 0.5σ different from the input radius ratio. The differences in other parameters are within 1σ compared to their values.

3.3. Comparison with Transit Parameters Obtained by Noniterative Methods

The simulation tests on the iterative method indicate that the transit parameter estimation is unbiased. Another important matter is whether or not the transit parameters estimated by noniterative means are biased. Furthermore, if the bias is present, how large the bias is for each parameter, especially for Rp /R*, needs to be understood.

A basic test is to take the values used in creating the light curves, use them as priors in a simulation, and then check the posterior values after the simulations have been completed. This enables us to establish whether the fitting itself is introducing any bias.

We perform noniterative fitting of 1000 simulated light curves following the procedure described in Section 3.1. The light curves are constructed with input values that are the same as the iteration result from using real data. The distribution of the results centers around the input value and forms a distribution due to the random noise added to the light curves and uncertainty in the fitting process. We show the distribution of Rp /R* as an example (in Figure 6). It yields an Rp /R* value of 0.1240 ± 0.0004 compared with the input value of 0.1239.

Figure 6.

Figure 6. The Rp /R* distribution with the noniterative method for 1000 simulated light curves. The vertical line gives the input value.

Standard image High-resolution image

Also, the posteriors of the parameters in a single fitting just represent the distribution of the transit parameters from the fitting to 1000 simulated light curves. The σ of the Rp /R* posterior is 0.0003, while the standard deviation of the Rp /R* derived from fitting to 1000 light curves is 0.0004. The σ and standard deviation of each LDC are ∼0.03 and 0.03, respectively. The consistency between the LDC posterior from a single fitting and the statistical distribution of LDC values from multiple light curves demonstrates that the MCMC modeling is self-consistent.

Similarly, we perform fitting for 100 simulated light curves generated using the nonlinear law. The LDC priors have the same values as used in constructing the light curves. The distribution of Rp /R* centers at 0.1239, with an uncertainty of 0.0004.

The simulations with the noniterative method show the fitting results centering at the input values if the priors are set to the same values as the input values. Fitting to the real data, we are not sure about the correctness of the priors and the bias propagated to the transit parameters. This impact needs to be evaluated through simulations. We therefore apply offsets to the input values and take the shifted values as the priors in simulations. We only apply offsets to the LDCs for the purposes of this work. The unshifted input values are 0.24 for u1 and 0.09 for u2.

For every offset of the prior, we generate and fit 100 light curves. The offset is with a step of 0.05 independently for u1 and u2. The fitting results imply that the bias in the fitted parameters is significant. The bias of Rp /R* is up to 0.82%, which is at a significance level of 2.5σ when the offset is 0.3 or larger for both u1 and u2 (as shown in Figure 7).

Figure 7.

Figure 7. The Rp /R* distribution for fitting with a shifted LDC prior; the upper panel shows the quadratic law, and the lower panel shows the nonlinear law. The vertical line gives the input value of Rp /R* at 0.1239.

Standard image High-resolution image

We emphasize a certain combination of priors, which is 0.33 ± 0.05 for u1 and 0.21 ± 0.05 for u2. This set of LDC priors is the same as the prior we applied in real data fitting with the noniterative method. The simulation result implies an Rp /R* distribution centered at 0.1233 with a σ of 0.0004. The results of fitting the transit parameters are almost the same (within 0.25σ) as the fitting of real data. We thus conclude that the difference in Rp /R* between the iterative and noniterative methods when fitting the real data is due to the offset of the LDC priors, and that Rp /R* is biased in noniterative fitting at 0.5%, 1.5σ.

We also generate and fit simulated light curves 1000 times with a broad uniform prior centered on the input LDCs and with an interval of 0.5. The parameters fluctuate too much and are not suitable for transit fitting. This implies that predictions of appropriate LDC priors are needed when applying noniterative MCMC fitting to transit light curves, at least for the WASP-121b TESS light curve.

In the case of the nonlinear law, the simulation is performed similarly to the quadratic law. The result shows that the offset of the LDC priors biases the obtained Rp /R* but not as much as in the quadratic law. The distributions of radius ratios among simulations with different LDC priors are about the same. We take one set of biased LDC priors as an example. The LDC prior is shifted with an offset of 0.2 for each coefficient. Generating and fitting the light curve is repeated 100 times. The Rp /R* value is 0.1235 ± 0.0004, which is 0.32%, 1.0σ away from the input value (as shown in Figure 7). The simulation result is consistent with the real data fitting.

Our simulations and real data modeling with the nonlinear law confirm the claims in the literature that an LD law with higher-order terms used noniteratively performs better in deriving transit parameters than laws with lower-order terms (Espinoza & Jordán 2015; Magic et al. 2015; Espinoza & Jordán 2016; Morello et al. 2017; Maxted 2018; Neilson et al. 2018; Claret 2000). The bias propagated to transit parameters is reduced but still exists when the LDC priors are not precisely the same as the real LDCs.

We also simulate light curves using nonlinear law inputs and fit the light curves with the quadratic law (with and without the noniterative method). The results are consistent with the results of the quadratic law simulation when the nonlinear inputs are set exactly as the nonlinear fitting result to real data. However, the simulated light curves are very sensitive to the nonlinear law inputs. Any small changes to the LDCs would severely affect the light curves and cause biased fittings with both the nonlinear and quadratic laws. Also, the fittings are not biased when we replace the nonlinear law inputs with the multimodality values. The multimodality, together with the sensitivity (the width of each multimodality range is narrow) of the input stellar parameters, makes it very hard and beyond the scope of this work to discuss the reasonable nonlinear LDC ranges.

3.4. Overfitting Check

The multimodality of the LDCs triggers a question: which model does the relation between the transit parameters and light curves with random noise follow, many-to-many or many-to-one? The many-to-many model indicates a robust MCMC process in which the light curves are dominated by transit parameters rather than random noise. The many-to-one model indicates an overfitted model where the fitting is too close to the exact data values regardless of noise (Tetko et al. 1995; Liu et al. 2018).

We perform a simulation to check if MCMC fitting with the nonlinear law is overfitted. We first generate a light curve as described previously. The random noise has a σ of 1763 ppm. The MCMC fitting is performed to the simulated light curve. The parameter values and standard deviation of the residuals are recorded. Next, we generate another light curve with the same input as the first but with different random noise (keeping the σ of random noise the same). We apply the recorded transit parameters to fit the light curve and calculate the standard deviation of the residual. The standard deviations of both residuals are 1763 ppm. The consistency of the standard deviations indicates that the fitting is not an overfitting.

We perform similar overfitting checks with the quadratic law with S/N ∼ 5 (LDC multimodality) and ∼10 (LDC convergence) and find no evidence of overfitting. We thus conclude that neither the quadratic nor the nonlinear laws are being overfitted.

4. Summary and Discussion

We have presented a novel, iterative method using an empirical Bayesian approach to handle LD in modeling the light curves of exoplanet transits. The method is especially effective for the quadratic LD model and works for the nonlinear law as well. Prior centers from the synthetic stellar atmosphere modeling are not needed. The method converges to obtain unbiased LDCs. Our method is particularly effective for the case when the LD obtained from the stellar atmosphere model is significantly different from the result obtained from transit fitting. The difference could be due to inconsistency among the stellar atmosphere models (with different geometry or input physics) or from the intrinsic difference among the fitting processes of the stellar atmosphere and transit models (Howarth 2011). Howarth (2011) showed that the difference is nearly a constant offset for the same planet in different wavelengths. Also, the LDC offsets are almost constant or weakly dependent on the stellar effective temperature. This motivates the idea that even if the central values of LDCs are different between the stellar model and transit fitting predictions, the widths of the distributions inferred from the model grids would be unaffected. Preknowledge of, e.g., the stellar model prediction, the fitting experience, and the observation data quality is inheritable when setting the LDC prior for transit fitting.

We have applied the iterative method to the WASP-121b TESS light curve and compared the parameter estimates with the classic MCMC fitting for both the quadratic and nonlinear laws. The result implies that the iterative method obtains a different, more accurate Rp /R* at 0.32%, a significance level of 1σ compared to the noniterative method. This improvement is important when taken in the context of exoplanet atmosphere analysis.

Monte Carlo simulations indicate that the iterative method obtains unbiased transit fitting parameters. In WASP-121b simulations, the LDC bias from applying the iterative method is within 0.4σ. The difference in Rp /R*, inclination, and semimajor axis between the iteratively obtained and input values is 0.25σ, 0.35σ, and 0.66σ, respectively. By comparison, the noniterative modeling will cause a bias in Rp /R* of up to 0.82% (2.5σ) when applying shifted LDCs as priors. The simulations also imply that if the real LDCs are as the iterative method obtained, then the noniterative method obtains biased transit parameters when applying the same prior in the real data fitting. The biased estimate of the transit parameters is the same as the estimate from the noniterative method fitting to the real data. This hints at the transit parameters of WASP-121b (real data) obtained with the noniterative method being biased.

We have also presented simulations indicating that the iterative method does not converge for the nonlinear law, as well as for the quadratic law with poor S/N data, due to the multimodality in fitting the LD law. From the simulation results, we recommend taking the result after the first few loops in the iteration for the application of nonlinear law modeling. The simulations show that applying the nonlinear law is better at reducing the bias in parameter fitting compared to the quadratic law when using the noniterative method.

Our work is a new approach to LD, requiring no theoretical LDC value from the stellar atmosphere models. We demonstrate the effectiveness of the method with the TESS WASP-121b light curve and simulations. Our proof of concept shows that the iterative method helps solve some of the difficulties in LDCs in transit fitting. Our unbiased fitting results, especially for Rp /R*, are particularly important for the analysis of the exoplanet atmosphere.

5. Data Availability

The data underlying this paper are available in the paper and its online supplementary material.

This work made use of Astroquery 8 and the NASA Exoplanet Archive. We would like to thank Ranga-Ram Chary for many fruitful discussions and Juan Carlos Segovia for very useful contributions. We also thank You-Jun Lu for feedback on our work. F.Y., J.-F.L., and S.-S.S. acknowledge funding from the National Science Fund for Distinguished Young Scholars (No. 11425313), National Key Research and Development Program of China (No. 2016YFA0400800), and National Natural Science Foundation of China (NSFC.11988101).

Footnotes

Please wait… references are loading.
10.3847/1538-3881/abf92f