Exoplanet Imitators: A Test of Stellar Activity Behavior in Radial Velocity Signals


Published 2019 December 20 © 2019. The American Astronomical Society. All rights reserved.
Citation: Chantanelle Nava et al 2020 AJ 159 23. DOI: 10.3847/1538-3881/ab53ec


1538-3881/159/1/23

Abstract

Accurately modeling effects from stellar activity is a key step in detecting radial velocity (RV) signals of low-mass and long-period exoplanets. RVs from stellar activity are dominated by magnetic active regions that move in and out of sight as the star rotates, producing signals with timescales related to the stellar rotation period. Methods to characterize RV periodograms assume that peaks from magnetic active regions will typically occur at the stellar rotation period or a related harmonic. However, with surface features unevenly spaced and evolving over time, signals from magnetic activity are not perfectly periodic, and the effectiveness of characterizing them with sine curves is unconfirmed. With a series of simulations, we perform the first test of common assumptions about signals from magnetic active regions in RV periodograms. We simulate RVs with quasi-periodic signals that account for evolution and migration of magnetic surface features. As test cases, we apply our analysis to two exoplanet hosts, Kepler-20 and K2-131. Simulating observing schedules and uncertainties of real RV surveys, we find that magnetic active regions commonly produce maximum periodogram peaks at spurious periods unrelated to the stellar rotation period: 81% and 72% of peaks, respectively, for K2-131 and Kepler-20. These unexpected peaks can potentially lead to inaccuracies in derived planet masses. We also find that these spurious peaks can sometimes survive multiple seasons of observation, imitating signals typically attributed to exoplanet companions.


1. Introduction

Current radial velocity (RV) observations aim to detect signals of less massive and/or longer period planets than ever before. These planets induce RV semi-amplitudes similar to or smaller than those produced by stellar activity (e.g., López-Morales et al. 2016; Dai et al. 2017; Haywood et al. 2018). The field has reached an era, therefore, in which understanding and correcting for stellar activity is essential to accurate RV detections and characterizations of interesting new exoplanets.

Stellar signals in RVs result from three main physical processes: variations in a star's internal pressure produce surface oscillations, convection leads to surface granulation, and magnetic activity produces surface spots, plage, and faculae (Fischer et al. 2016). Oscillation effects are dominated by p-modes that occur on timescales of minutes, while granulation effects occur on timescales of minutes to hours (Leighton et al. 1962; Labonte et al. 1981; Kuhn 1983). When observed over three 10 minute exposures in a night, each separated by approximately two hours, signals from p-mode oscillations and granulation can average to less than one meter per second (m s−1) for Sun-like stars (Dumusque et al. 2011; Meunier et al. 2015; Chaplin et al. 2019). These observing strategies alleviate many but not all effects of stellar activity on RVs, with signals of at least a meter per second from magnetic active regions remaining (e.g., Makarov et al. 2009; Meunier et al. 2010; Lagrange et al. 2011; Haywood et al. 2014). As the next generation of spectrographs comes online, the precision of RV mass determinations will not be limited by astronomical instruments, but rather by our ability to model and remove signals from magnetic active regions.

Magnetic active regions containing spots, plage, and faculae impact the overall flux measured from their location. As the star rotates, active regions on the approaching limb impact the amount of blueshifted light measured, and those on the receding limb impact the amount of redshifted light measured. Magnetic active regions also suppress convective blueshift at any location to produce a net redshifted effect (Lagrange et al. 2010; Meunier et al. 2010; Haywood et al. 2016). RV signals from magnetic active regions vary quasi-periodically as activity features evolve on the rotating stellar surface. These signals can make detecting and characterizing exoplanets particularly difficult when the planets' orbital periods fall close to the rotation period of the star (Prot). Newton et al. (2016) and Vanderburg et al. (2016) demonstrated this to be a potentially serious challenge in the case of exoplanets orbiting in the habitable zones of M-dwarf stars.

The interference of magnetic activity signals in our ability to detect exoplanet RV signals is a long-known problem (e.g., Gillon et al. 2011; Maxted et al. 2011). It is customary, therefore, to estimate Prot using a variety of methods. Prot can be estimated using the equation ${P}_{\mathrm{rot}}\approx 2\pi {R}_{* }/v\sin i$, where R* is the radius of the star, v is its rotational velocity, and i is the inclination of the star's rotation axis with respect to Earth. R* can be accurately measured using asteroseismology (Bedding et al. 2010; Huber et al. 2013), interferometry (Boyajian et al. 2012a, 2012b), or stellar spectral models. Empirical relations between stellar effective temperature and radius have also been established to estimate radii of main-sequence A-, F-, G-, and M-dwarf stars (Boyajian et al. 2012a; Mann et al. 2015). The value of $v\sin i$ can be measured from the width of spectral lines. However, i is typically unknown, and because sin i ≤ 1, the value of Prot derived using this method is only an upper limit. Another method to estimate Prot utilizes the empirical relation, derived by Wright et al. (2011), between a star's X-ray bolometric luminosity and Prot. However, this method depends on many simplifying assumptions about stellar dynamos and is susceptible to systematic errors associated with pre-main-sequence and binary stars. These assumptions and errors lead to non-quantifiable uncertainties in final Prot estimates.
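As a worked illustration of the $v\sin i$ estimate above, the following minimal sketch evaluates $2\pi {R}_{* }/v\sin i$ in days. The function name, the nominal solar radius constant, and the example star (1 solar radius, $v\sin i$ = 2 km/s) are hypothetical choices made for illustration and are not taken from the analysis in this paper.

```python
import math

R_SUN_KM = 695_700.0       # nominal solar radius in km (assumed constant)
SECONDS_PER_DAY = 86_400.0


def prot_upper_limit_days(radius_rsun, vsini_kms):
    """Return 2*pi*R / (v sin i) in days.

    Because sin i <= 1, the true equatorial velocity is at least v sin i,
    so this value is an upper limit on the rotation period.
    """
    circumference_km = 2.0 * math.pi * radius_rsun * R_SUN_KM
    return circumference_km / vsini_kms / SECONDS_PER_DAY


# Hypothetical Sun-like star: R = 1 R_sun, v sin i = 2 km/s
print(round(prot_upper_limit_days(1.0, 2.0), 1))  # prints 25.3
```

Halving the measured $v\sin i$ doubles the limit, which is why slow rotators with poorly resolved line broadening yield only weak constraints from this method.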

Prot can also be derived from a star's photometric light curve (LC). The performance of this method has greatly improved as data from dedicated, high-cadence space-based and ground-based photometric surveys have become available. For example, McQuillan et al. (2014) estimated Prot for main-sequence stars using Kepler mission LCs, Haywood et al. (2014) demonstrated one of the early applications of a Gaussian process (GP) regression model to constrain Prot from a Kepler LC, and Newton et al. (2016) estimated Prot for 387 nearby M dwarfs using MEarth LCs (Irwin et al. 2009).

The method used by McQuillan et al. (2014) utilizes autocorrelation functions (ACFs) to estimate Prot. However, the error associated with this method is based solely on chi-squared fits and does not account for simplified assumptions about the evolution and distribution of magnetic active regions. Additionally, many LCs show no clear rotational signals and produce inconclusive ACF results. GP regression models, like those used by Haywood et al. (2014), avoid deterministic functions and can generate complex signals from magnetic activity observed in RVs. GP regression is currently the most physically motivated method to model stellar activity. Angus et al. (2018) demonstrated that GP regression with a quasi-periodic kernel provides more accurate estimates of Prot from LCs than both sine-fitting with periodograms and ACF analyses. However, it is computationally intensive and rarely applied to high-cadence data sets. Newton et al. (2016) estimated Prot according to the statistically significant periodogram peak resulting in the best-fit sine curve to the transit-removed LC. The error associated with this method suffers from the same issues as that of the ACF analysis, and many M dwarfs produce incoherent LCs that cannot be characterized by a simple sine curve.

RV analyses rely heavily on Prot estimates from the methods discussed above to differentiate magnetic activity signals from Keplerian signals produced by exoplanets. As long as a statistically significant peak in the RV periodogram is located more than a few days from the estimated value of Prot, it will often be explored as a potential exoplanet signal. If the peak is long-lived, surviving over multiple seasons of observation, stellar activity is considered unlikely to be the source of the signal (e.g., Buchhave et al. 2016; Pinamonti et al. 2019).

Validation of exoplanet companions with the above method relies on a number of assumptions. First, it assumes that estimates of Prot from other methods are correct within a few days. Next, it assumes that magnetic activity signals will usually produce a maximum peak in the periodogram near Prot or one of its major harmonics. Finally, it assumes that signals from evolving activity features will not produce long-lived peaks in the RV periodogram at periods unrelated to Prot.

In this paper we test the last two assumptions above by analyzing how magnetic activity signals present in periodograms of real RV data. We describe our methods in Section 2, and in Section 3, as test cases, we apply them to the known planetary systems, Kepler-20 and K2-131. In Section 4, we report our results and in Section 5 we discuss the implications of those results with respect to reliable RV detection of exoplanets in the presence of stellar activity signals.

2. Method

Our method follows three main steps: simulate magnetic activity RV signals, select model parameters, and investigate simulated RV periodograms. Here we provide a general outline of our method.

2.1. Simulation of Magnetic Activity RV Signals

We simulate magnetic activity RV signals using GP regression with a quasi-periodic kernel, motivated by the work of Haywood et al. (2014). The quasi-periodic kernel has the form

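$k(t,t^{\prime })={A}^{2}\cdot \exp \left[-\frac{{(t-t^{\prime })}^{2}}{{\tau }^{2}}-\frac{{\sin }^{2}\left(\pi (t-t^{\prime })/{P}_{\mathrm{rot}}\right)}{{\omega }^{2}}\right]\qquad (1)$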

where k(t,t') is the correlation weight between observations taken at times t and t'. A is the mean amplitude of the activity signal, τ is related to the evolution timescale of activity features, Prot is the stellar rotation period, and ω is related to the average distribution of activity features on the surface of the star. The parameter ω describes the level of high-frequency variation expected within a single stellar rotation. Since the level of high-frequency variation defines the number and spacing of local minima or maxima within the timescale of Prot, it is physically related to the average distribution of magnetic active regions on the stellar surface. For each unique set of GP hyperparameters (A, τ, Prot, ω), we simulate 100,000 iterations of stellar RV signals by sampling randomly from the GP prior distribution, with each iteration representing a different phase of the activity signal.
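Drawing one activity signal from a GP prior can be sketched as follows. This is a minimal illustration, not the code used for the simulations in this paper: the kernel normalization (no factors of 2 in the exponent) is an assumption, since conventions differ between papers, and the function names, the small diagonal jitter term, and the evenly spaced example schedule are hypothetical. The sketch builds the covariance matrix from the four hyperparameters (A, τ, Prot, ω), factors it with a Cholesky decomposition, and multiplies the factor by unit-variance Gaussian draws.

```python
import math
import random


def quasi_periodic_kernel(t1, t2, amp, tau, p_rot, omega):
    """Covariance between two epochs (assumed normalization, see lead-in)."""
    dt = t1 - t2
    decay = (dt / tau) ** 2
    periodic = (math.sin(math.pi * dt / p_rot) / omega) ** 2
    return amp ** 2 * math.exp(-decay - periodic)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def sample_activity_rvs(times, amp, tau, p_rot, omega, rng, jitter=1e-6):
    """Draw one zero-mean RV series from the GP prior at the given epochs."""
    n = len(times)
    cov = [[quasi_periodic_kernel(ti, tj, amp, tau, p_rot, omega) for tj in times]
           for ti in times]
    for i in range(n):
        cov[i][i] += jitter * amp ** 2  # hypothetical jitter for numerical stability
    lower = cholesky(cov)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(lower[i][k] * white[k] for k in range(i + 1)) for i in range(n)]


rng = random.Random(42)
times = [float(day) for day in range(0, 60, 2)]  # hypothetical observing schedule
rvs = sample_activity_rvs(times, amp=26.0, tau=8.9, p_rot=9.68, omega=0.5, rng=rng)
```

Each call with a fresh set of Gaussian draws gives an independent realization, which corresponds to one iteration in the description above.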

Using Equation (1), we model magnetic activity RVs for targets observed by current RV campaigns, assigning observation times and uncertainties from the real RV data with which we later compare our modeled RVs. We apply a bootstrapping method to real RV uncertainties, using them in a different randomized order with each new iteration of modeled RVs. Section 3 details observation times and RV uncertainties for specific test targets.
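The reordering of uncertainties described above can be sketched in a few lines. This is an illustrative helper with a hypothetical name and hypothetical input values; it permutes the reported uncertainties without replacement, which is one reading of "a different randomized order with each new iteration".

```python
import random


def reordered_uncertainties(uncertainties, rng):
    """Return the reported RV uncertainties in a new random order."""
    shuffled = list(uncertainties)  # copy so the original ordering is preserved
    rng.shuffle(shuffled)
    return shuffled


rng = random.Random(0)
reported = [1.2, 0.9, 1.5, 1.1, 2.0]  # hypothetical uncertainties in m/s
for _ in range(3):
    print(reordered_uncertainties(reported, rng))
```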

2.2. Selection of Model Parameters

The values of A, τ, and Prot for a given target are adopted from the literature when available or are estimated as follows. We set A equal to the standard deviation of the target's real RV residuals (with signals from confirmed exoplanets removed), minus the median value of reported observational uncertainties. With this value, we test a case in which any remaining spread in the RVs, after the removal of known exoplanet signals, can be attributed to a combination of stellar activity and errors associated with observations. Rajpaul et al. (2016) showed how the removal of signals from known exoplanets can lead to spurious periodic signals in RV data sets. However, only the mean amplitudes of our modeled RV data sets (A) depend on real RV residuals. The overall structure of each of our modeled data sets is independent of the value of A, and is therefore not susceptible to the spurious periodic signals mentioned above. We set τ and Prot according to the best estimates from LC and/or RV analyses performed on the data sets. Section 3 details the values of A, τ, and Prot adopted for specific targets.
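The amplitude estimate described above reduces to a single subtraction. The sketch below assumes the sample (N − 1) standard deviation, which the text does not specify, and uses hypothetical residuals and uncertainties.

```python
import statistics


def estimate_activity_amplitude(residuals, uncertainties):
    """A = std(RV residuals) - median(reported uncertainties), both in m/s.

    Uses the sample standard deviation (an assumption of this sketch).
    """
    return statistics.stdev(residuals) - statistics.median(uncertainties)


# Hypothetical residuals and uncertainties in m/s
residuals = [-3.0, -1.0, 0.0, 1.0, 3.0]
uncertainties = [1.0, 1.2, 0.8, 1.1, 0.9]
print(round(estimate_activity_amplitude(residuals, uncertainties), 3))  # prints 1.236
```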

For the purpose of our tests, we use two values of τ to probe different relationships between stellar rotation and the evolution timescale of magnetic active regions. In the first, with τ ≈ Prot, we utilize real τ estimates from the LC and/or RV analyses mentioned above. In the second, with τ = 10 Prot, we explore the case of highly stable magnetic activity features, compared to measured activity lifetimes on Sun-like stars (Giles et al. 2017). This is the case of an unchanging stellar surface over multiple rotations and timescales of typical RV observations. Some faculae regions fall under this category, with features surviving up to 10 times as long as spots (Collier Cameron et al. 2019).

To explore whether uncertainties associated with hyperparameter estimates affect final maximum peak distributions, we performed additional simulations using A, τ, and Prot values falling at the high and low limits of their computed uncertainties. In most of these test cases, final distributions and occurrence rates had similar overall trends to simulations using our published hyperparameter values. Any exceptions to this are further discussed in Section 3.

As mentioned above, ω is physically related to the average distribution of magnetic active regions on the stellar surface. Models have demonstrated that even highly complex activity distributions will average to just two to three large active regions in a given rotation (Jeffers & Keller 2009). The distribution ω = 0.5 ± 0.05 is consistent with this behavior, allowing for two to three local minima or maxima per rotation. This prior on ω has been used successfully to determine exoplanet masses from a number of RV data sets (e.g., Haywood et al. 2014, 2018; Grunblatt et al. 2015; López-Morales et al. 2016). While several RV characterizations have used broader priors on ω (e.g., Faria et al. 2016; Mortier et al. 2016; Astudillo-Defru et al. 2017; Cloutier et al. 2017), the results from Jeffers & Keller (2009) make a strong case for the much tighter Gaussian prior above. Broader priors risk overfitting other noise signals and mistakenly attributing them to the magnetic activity signal. We simulate five different values of ω for each target, sampling evenly from the above distribution, i.e., ω = [0.45, 0.475, 0.5, 0.525, 0.55].

2.3. Investigation of Simulated RV Periodograms

In each simulation, we investigate distributions of maximum RV periodogram peak locations over 100,000 iterations. In each iteration, we generate an RV signal and calculate a generalized Lomb–Scargle (GLS) periodogram on the signal (Lomb 1976; Scargle 1982; Zechmeister & Kürster 2009). We calculate the GLS periodogram with a lower limit of 1.5 days (to avoid the 1 day peak due to nightly observations) and an upper limit of half the baseline of the observations (to consider only periods detectable in the simulated data). From each GLS periodogram, we record the period of the maximum peak if it is statistically significant, i.e., if it has a false alarm probability (FAP) below 1%. We repeat the process of generating modeled RVs and identifying maximum periodogram peaks over all iterations, plotting the final distribution in a histogram.
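The periodogram step can be sketched with the closed-form GLS power of Zechmeister & Kürster (2009), evaluated on an even frequency grid between 1.5 days and half the baseline. This is an illustrative sketch rather than the code used for the results reported here. The function names, the grid size, the synthetic test signal, and in particular the analytic FAP estimate (with the number of independent frequencies approximated as the frequency range times the baseline) are assumptions, since the text does not state how the FAP was computed.

```python
import math
import random


def gls_power(times, rvs, errors, period):
    """Normalized GLS power (0 to 1) at one trial period, with a floating mean."""
    weights = [1.0 / err ** 2 for err in errors]
    total = sum(weights)
    w = [wt / total for wt in weights]
    ang = 2.0 * math.pi / period
    cos_t = [math.cos(ang * t) for t in times]
    sin_t = [math.sin(ang * t) for t in times]
    y_mean = sum(wi * y for wi, y in zip(w, rvs))
    c_mean = sum(wi * c for wi, c in zip(w, cos_t))
    s_mean = sum(wi * s for wi, s in zip(w, sin_t))
    yy = sum(wi * y * y for wi, y in zip(w, rvs)) - y_mean ** 2
    yc = sum(wi * y * c for wi, y, c in zip(w, rvs, cos_t)) - y_mean * c_mean
    ys = sum(wi * y * s for wi, y, s in zip(w, rvs, sin_t)) - y_mean * s_mean
    cc = sum(wi * c * c for wi, c in zip(w, cos_t)) - c_mean ** 2
    ss = sum(wi * s * s for wi, s in zip(w, sin_t)) - s_mean ** 2
    cs = sum(wi * c * s for wi, c, s in zip(w, cos_t, sin_t)) - c_mean * s_mean
    det = cc * ss - cs ** 2
    power = (ss * yc ** 2 + cc * ys ** 2 - 2.0 * cs * yc * ys) / (yy * det)
    return min(max(power, 0.0), 1.0)  # clamp rounding errors


def max_peak(times, rvs, errors, min_period=1.5, n_freq=2000):
    """Return (period, power, fap) of the highest peak on the frequency grid."""
    baseline = max(times) - min(times)
    f_lo, f_hi = 2.0 / baseline, 1.0 / min_period  # longest period = baseline / 2
    best_period, best_power = None, -1.0
    for k in range(n_freq):
        freq = f_lo + (f_hi - f_lo) * k / (n_freq - 1)
        power = gls_power(times, rvs, errors, 1.0 / freq)
        if power > best_power:
            best_period, best_power = 1.0 / freq, power
    n_obs = len(times)
    n_indep = (f_hi - f_lo) * baseline  # assumed number of independent frequencies
    prob_single = (1.0 - best_power) ** ((n_obs - 3) / 2.0)
    fap = 1.0 - (1.0 - prob_single) ** n_indep
    return best_period, best_power, fap


# Synthetic check: a 7 day sinusoid with white noise on a random schedule
rng = random.Random(1)
times = sorted(rng.uniform(0.0, 100.0) for _ in range(60))
rvs = [5.0 * math.sin(2.0 * math.pi * t / 7.0) + rng.gauss(0.0, 1.0) for t in times]
errors = [1.0] * len(times)
period, power, fap = max_peak(times, rvs, errors)
if fap < 0.01:
    print(f"significant maximum peak at {period:.2f} days")
```

In a full simulation, the recorded period from each iteration would be appended to a list only when the FAP criterion is met, and the list would then be binned into the histogram.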

We consult the final distribution of peak periods to calculate how often maximum peaks in the periodogram occur at a series of important periods, detailed below. We define two occurrence rates at any given period: the first is the percentage of iterations with significant (FAP < 1%) peaks in the RV periodogram that have a maximum peak falling within 5% of that period, and the second is the percentage of all 100,000 iterations that have a maximum peak falling within 5% of the period. We base the 5% metric on the range of uncertainties produced by LC estimates of Prot, used as priors in the RV fits (e.g., Buchhave et al. 2016; López-Morales et al. 2016; Dai et al. 2017).
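The two occurrence rates defined above can be computed as in the following sketch. The function name and the example peak periods are hypothetical; the input list is assumed to hold one period per iteration that produced a significant maximum peak.

```python
def occurrence_rates(peak_periods, target_period, n_iterations, tolerance=0.05):
    """Return (rate among significant iterations, rate among all iterations) in %.

    peak_periods holds the maximum-peak period of every iteration that had a
    significant peak; iterations without one are simply absent from the list.
    """
    window = tolerance * target_period
    hits = sum(1 for p in peak_periods if abs(p - target_period) <= window)
    rate_significant = 100.0 * hits / len(peak_periods) if peak_periods else 0.0
    rate_all = 100.0 * hits / n_iterations
    return rate_significant, rate_all


# Hypothetical outcome: 8 iterations, 6 of which had a significant peak
peaks = [3.0, 3.1, 9.7, 9.5, 4.8, 20.0]
print(occurrence_rates(peaks, target_period=9.68, n_iterations=8))
```

When nearly every iteration has a significant peak, as in the K2-131 case, the two rates coincide.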

We calculate occurrence rates at periods related to Prot, including Prot itself and integer multiples of Prot up to the longest period for which the periodograms were calculated. These periods also include rotational harmonics (e.g., Prot/2, Prot/3). We calculate occurrence rates for the same number of rotational harmonics as calculated for integer multiples of Prot. For example, if we calculate occurrence rates for integer multiples up to Prot × 5, we calculate occurrence rates for rotational harmonics down to Prot/5. To track the window function signal, we also calculate occurrence rates for the period of the cadence peak, the maximum peak produced by the cadence of observations. To calculate the cadence periodogram, we produce a signal with the same time stamps as the observations and replace RV amplitudes by random values from the uniform distribution $1.0\pm 1\times {10}^{-15}$. We then calculate a GLS periodogram on the cadence signal with the same period limits used to calculate the simulated RV periodograms.
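The set of rotation-related periods and the near-constant cadence signal can be generated as in the sketch below. The function names and dictionary labels are hypothetical; the example uses the K2-131 values quoted in Section 3.1 (Prot = 9.68 days and an upper period limit of about 33 days).

```python
import random


def rotation_related_periods(p_rot, max_period):
    """Prot, its integer multiples up to max_period, and as many harmonics."""
    n_multiples = int(max_period // p_rot)
    periods = {"Prot": p_rot}
    for k in range(2, n_multiples + 1):
        periods[f"Prot x {k}"] = p_rot * k
        periods[f"Prot/{k}"] = p_rot / k
    return periods


def cadence_signal(times, rng, scatter=1e-15):
    """Near-constant series sampled at the observation times (1.0 +/- scatter)."""
    return [1.0 + rng.uniform(-scatter, scatter) for _ in times]


for label, value in rotation_related_periods(9.68, 33.0).items():
    print(f"{label}: {value:.2f} days")
```

Passing the cadence signal through the same periodogram routine used for the simulated RVs isolates the peaks produced by the sampling alone.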

Finally, we calculate occurrence rates for maximum peaks falling at a specified period of interest (POI), typically the period of an exoplanet candidate in question. Section 3 details POI selections for specific targets. For targets with multiple seasons of simulated RVs, we also determine a rate of time coherence at the POI. In iterations with a maximum peak occurring at the POI, we calculate the GLS periodogram of each independent season by setting unused RVs to zero with an error of 100 m s−1. This method preserves periodic signals inherent to the observational cadence and was first described in Dumusque et al. (2012). We define a maximum periodogram peak at the POI to be long-lived if it remains the maximum peak in each of the periodograms of each individual season. We define the rate of time coherence as the number of iterations with a long-lived maximum peak occurring at the POI divided by the total number of iterations with a maximum peak at the POI.
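The season-isolation step and the rate of time coherence can be sketched as follows. The helper names and the example values are hypothetical; the masking follows the description above, in which RVs outside the chosen season are set to zero with an error of 100 m s−1 so that the full set of time stamps is retained.

```python
def isolate_season(rvs, errors, in_season, masked_error=100.0):
    """Keep one season's RVs; down-weight the rest (zero RV, large error)."""
    season_rvs = [rv if keep else 0.0 for rv, keep in zip(rvs, in_season)]
    season_errors = [err if keep else masked_error
                     for err, keep in zip(errors, in_season)]
    return season_rvs, season_errors


def time_coherence_rate(n_long_lived, n_max_at_poi):
    """Fraction of iterations with a maximum peak at the POI that is long-lived."""
    if n_max_at_poi == 0:
        return float("nan")
    return n_long_lived / n_max_at_poi


# Hypothetical two-season data set: first two epochs belong to season 1
rvs = [2.1, -1.3, 0.7, -0.4]
errors = [1.0, 1.1, 0.9, 1.2]
season_one = [True, True, False, False]
print(isolate_season(rvs, errors, season_one))
```

A peak would be counted as long-lived only if it remains the maximum peak in the periodogram of every season isolated in this way.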

3. Application to K2-131 and Kepler-20

We used the method described above to explore potential effects of magnetic activity in the published RV measurements of two known exoplanet systems: K2-131 and Kepler-20. Both systems contain planets detected via the transit method, with additional strong periodic signals detected in follow-up RVs and considered as potential nontransiting exoplanet companions.

3.1. K2-131

K2-131 is a solar-type star with Prot = 9.68 days and one ultra-short period transiting exoplanet companion, K2-131b (Pb = 0.369 days), confirmed by Dai et al. (2017) with combined RV observations from HARPS-N (Cosentino et al. 2012) and the Magellan Planet Finder Spectrograph (PFS; Crane et al. 2010). In addition to the RV signal of K2-131b, Dai et al. (2017) found a statistically significant peak at 3.0 days in the periodogram of their combined PFS/HARPS-N RV observations, reproduced here in Figure 1. The peak raised the possibility of a potential additional nontransiting exoplanet in the system. However, they attributed the signal to magnetic activity due to its close proximity to the second rotational harmonic (Prot/3 = 3.2 days) and additional detection of the signal in data from two magnetic activity indicators.


Figure 1. Top: the GLS periodogram of K2-131's combined HARPS-N/PFS RVs. Dai et al. (2017) explored the strong signal at the period of interest (POI), 3.0 days, as potential evidence of a nontransiting exoplanet companion. Bottom: the periodogram inherent to the cadence of K2-131 RV observations. The maximum peak inherent to observational cadence, the cadence peak, exists at 6.0 days.


We simulated combined RV signals of K2-131 and its known transiting exoplanet, K2-131b. We set a value of POI = 3.0 days to investigate whether the combined signal from K2-131's magnetic activity and K2-131b could produce the 3.0 day periodic signal discussed above. We used observation times and RV uncertainties from the combined set of 41 HARPS-N and 32 PFS observations of K2-131, in which the 3.0 day signal was originally detected (Figure 2; Dai et al. 2017). All of the observations were taken within a single season, the HARPS-N data between 2017 January and April and the PFS data over six nights in 2017 March and April. The observations have a baseline of approximately 66 days, so we set the upper limit for detection of a periodic signal at one-half that time span, approximately 33 days.


Figure 2. Example simulated RV data set for the combined magnetic activity signal of K2-131 and its companion K2-131b, using observation times from combined HARPS-N/PFS RVs (Dai et al. 2017).


We simulated magnetic activity RVs of K2-131 using GP regression and the quasi-periodic kernel described in Equation (1). In their original RV analysis, Dai et al. (2017) utilized GP regression to fit for magnetic activity. We used the final GP hyperparameters reported in Table 7 of their paper: A = 26.0 m s−1, Prot = 9.68 days, τ = 8.9 days. We test this set of hyperparameters with each of the five average activity distributions, ω = [0.45, 0.475, 0.50, 0.525, 0.55]. We also ran a second set of simulations with the activity evolution timescale increased to τ = 96.8 days, in order to explore the case of activity features that remain stable over the timescale of observations.

As described in Section 2.2, we performed additional simulations using A, τ, and Prot values falling at the high and low limits of the uncertainties reported in Table 7 of Dai et al. (2017). Most of these test cases yielded final occurrence rates and overall trends similar to those reported in Section 4. However, cases testing the low limit of Prot have an overlap in values at the POI and Prot/3 (3.0 days and 3.18 days, respectively), within the 5% error. In these cases, occurrence rates at the POI increased to resemble values reported for Prot/3 in Tables 1 and 2.

Table 1.  Occurrence Rates (%) for Select Maximum Peak Values in Periodograms of Simulated Combined RVs from K2-131 and K2-131b with Evolving Activity Features (τ = 8.9 days)

Peak Value | ω = 0.45 | ω = 0.475 | ω = 0.5 | ω = 0.525 | ω = 0.55
POI = 3.0 days | 1.7 | 1.5 | 1.3 | 1.1 | 1.0
Prot = 9.7 days | 6.6 | 6.8 | 7.4 | 7.6 | 7.7
Prot/2 = 4.8 days | 5.8 | 5.5 | 5.3 | 4.9 | 4.8
Prot/3 = 3.2 days | 3.6 | 3.3 | 3.0 | 2.7 | 2.5
Prot × 2 = 19.4 days | 1.3 | 1.4 | 1.3 | 1.3 | 1.5
Prot × 3 = 29.0 days | 2.1 | 2.3 | 2.4 | 2.5 | 2.7
Cadence peak = 6.0 days | 3.8 | 3.7 | 3.6 | 3.5 | 3.3

Note. Only 13 iterations of simulated RV periodograms had no statistically significant peaks, so occurrence rates with respect to just iterations having statistically significant peaks and with respect to all 100,000 iterations were equal in the case of K2-131. Therefore, only a single occurrence rate is listed for each combination of ω and period values.


Table 2.  Occurrence Rates (%) as Reported in Table 1, but with Stable Activity Features (τ = 96.8 days)

Peak Value | ω = 0.45 | ω = 0.475 | ω = 0.5 | ω = 0.525 | ω = 0.55
POI = 3.0 days | 0.4 | 0.4 | 0.3 | 0.3 | 0.2
Prot = 9.7 days | 36.0 | 38.2 | 40.6 | 43.2 | 45.0
Prot/2 = 4.8 days | 18.3 | 17.9 | 17.6 | 16.7 | 16.2
Prot/3 = 3.2 days | 6.9 | 6.3 | 5.4 | 4.8 | 4.1
Prot × 2 = 19.4 days | 0.5 | 0.4 | 0.4 | 0.4 | 0.3
Prot × 3 = 29.0 days | 0.1 | 0.1 | 0.1 | 0.1 | 0.1
Cadence peak = 6.0 days | 1.2 | 1.2 | 1.2 | 1.1 | 1.1


In all cases, we simulated the planetary RVs of K2-131b with a Keplerian signal, assigning the exoplanet parameters reported in Table 7 of Dai et al. (2017): K = 6.55 m s−1, P = 0.369 days, e = 0, tc = 3582.9360 (BJD − 2,454,000), where K is the semi-amplitude of the exoplanet RV signal, P is the orbital period, e is the orbital eccentricity, and tc is the central time of transit. With our lower period limit for periodogram calculations set to 1.5 days, we did not track maximum peak occurrences at the orbital period of K2-131b (P = 0.369 days) in our subsequent analysis of simulated magnetic activity signals. However, since the signal at 3.0 days appeared in real RV data for K2-131 before the removal of the signal from K2-131b, we included the planetary signal to investigate the possibility of the combined signals from magnetic activity and K2-131b producing a maximum peak at the POI.
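For a circular orbit (e = 0), the Keplerian signal reduces to a sinusoid phased to the transit time. The sketch below uses the K2-131b parameters quoted above; the function name and the sign convention (the star recedes before transit and approaches after it, so the RV crosses zero with a negative slope at tc) are assumptions of this illustration.

```python
import math


def circular_rv(t, semi_amplitude, period, t_transit):
    """Stellar RV (m/s) induced by a planet on a circular orbit.

    Assumed convention: RV is zero and decreasing at mid-transit.
    """
    phase = 2.0 * math.pi * (t - t_transit) / period
    return -semi_amplitude * math.sin(phase)


K, P, TC = 6.55, 0.369, 3582.9360  # m/s, days, BJD - 2,454,000
for offset in (-P / 4, 0.0, P / 4):
    print(f"t = tc {offset:+.3f} d: RV = {circular_rv(TC + offset, K, P, TC):+.2f} m/s")
```

Adding this term to a simulated activity signal at the same epochs gives the combined RVs analyzed for K2-131.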

We generated a total of 10 simulations, using the two values of τ and five values of ω given above, with A and Prot fixed. In each simulation, we generated 100,000 iterations of RVs from K2-131 and its known exoplanet companion. Figures 3 and 4 show example histograms for simulations with τ = 8.9 days and τ = 96.8 days, respectively, both with ω = 0.55. All five values of ω produced similar final maximum peak distributions, but for simplicity, we only show distributions for a single value of ω. Results for the other values of ω are reported in Tables 1 and 2.


Figure 3. Histogram of maximum peak periods for simulated combined RVs of K2-131 and K2-131b, with evolving activity features (τ = 8.9 days) and ω = 0.55. The histogram is shown in gray with the bin size set to match the occurrence rates listed in Table 1, and shown with a smaller bin size in black to show more detail. The vertical lines correspond to periods for which occurrence rates were calculated.


Figure 4. Same as shown in Figure 3, but with stable activity features (τ = 96.8 days) and occurrence rates in gray from Table 2. The top panel shows the full distribution and the bottom panel shows a zoomed-in view.


Tables 1 and 2 list occurrence rates for simulations with τ = 8.9 days and τ = 96.8 days, respectively. We calculated occurrence rates for maximum peaks located within 5% of Prot, its rotational harmonics (Prot/2, Prot/3), its integer multiples (Prot × 2, Prot × 3), the cadence peak (6.0 days; Figure 1), and the POI (3.0 days; Figure 1). With only a single season of RVs available in the original data set, we did not calculate a rate of time coherence at the POI.

3.2. Kepler-20

Kepler-20 is a solar-type star with Prot = 27.4 days and five confirmed transiting exoplanet companions, all with orbital periods shorter than 80 days (Table 3; Fressin et al. 2012; Gautier et al. 2012). Buchhave et al. (2016) published a sixth nontransiting companion, with combined HARPS-N and HIRES data after removing RV signals from the largest three known transiting planets. The predicted RV amplitudes of signals from the smallest two planets are too small to be detected with HARPS-N and HIRES precision. Buchhave et al. (2016) included the nontransiting planet in their fit after detecting a maximum peak located at 34.94 days in the HARPS-N RV residual periodogram that remained coherent over two seasons of observation separated by approximately 138 days (Figure 5). They concluded that the long-lived signal was planetary, considering the approximately 7 day difference between the signal in question and Prot of the star. Buchhave et al. (2016) also observed no correlation between their RVs and the three activity indicators tested, and therefore did not include a model for stellar activity in their final fit.


Figure 5. Top: the GLS periodogram of Kepler-20's HARPS-N RV residuals (signals from Kepler-20b, Kepler-20c, and Kepler-20d removed) is shown in black. The cyan and green curves show GLS periodograms of just first season and second season observations, respectively. The GLS periodogram is calculated for a single season by setting unused RVs to zero with their errors set to 100 m s−1, a method first described by Dumusque et al. (2012). The maximum peak at the period of interest (POI), 34.94 days, is long-lived because it remains the maximum peak in the periodograms of both individual seasons of observations. Due to its long-lived nature, Buchhave et al. (2016) investigated the signal at the POI and attributed it to a nontransiting exoplanet companion. Bottom: the periodogram inherent to the cadence of Kepler-20 RV observations. The cadence peak exists at 37.2 days (notably close to the POI at 34.94 days).


Table 3.  Transit and Orbital Parameters for Kepler-20's Five Transiting Exoplanet Companions

Parameter | Kepler-20b | Kepler-20c | Kepler-20d | Kepler-20e | Kepler-20f
Orbital period (days) | ${3.696115}_{-.000001}^{+.000001}$ | ${10.85409}_{-.000003}^{+.000003}$ | ${77.6113}_{-.0001}^{+.0001}$ | ${6.098523}_{-.000014}^{+.000006}$ | ${19.57758}_{-.00012}^{+.00009}$
Tc (BJD − 2,454,000) | ${967.5020}_{-.0002}^{+.0003}$ | ${971.6080}_{-.0002}^{+.0002}$ | ${997.730}_{-.002}^{+.001}$ | ${968.932}_{-.001}^{+.002}$ | ${967.5020}_{-.0002}^{+.0003}$
Orbital eccentricity | ${0.03}_{-0.03}^{+0.09}$ | ${0.16}_{-0.09}^{+0.01}$ | <0.6* | <0.28** | <0.32**
Planet radius (${R}_{\oplus }$) | ${1.868}_{-0.034}^{+0.066}$ | ${3.047}_{-0.056}^{+0.084}$ | ${2.744}_{-0.055}^{+0.073}$ | ${0.865}_{-0.028}^{+0.026}$ | ${1.003}_{-0.089}^{+0.050}$
Planet mass (${M}_{\oplus }$) | ${9.7}_{-1.44}^{+1.41}$ | ${12.75}_{-2.24}^{+2.17}$ | ${10.07}_{-3.70}^{+3.97}$ | ... | ${10.07}_{-3.70}^{+3.97}$

Note. The majority of values are taken from results reported in Table 4 of Buchhave et al. (2016). Values marked by a single asterisk are from the fit reported in Table 2 of Gautier et al. (2012). Values marked by a double asterisk are from the fit reported in Table 1 of Fressin et al. (2012).


We simulated RVs of Kepler-20's magnetic activity signal. We set a value of POI = 34.94 days to investigate whether the stellar activity signal alone could produce the long-lived periodic signal that Buchhave et al. (2016) attributed to a nontransiting planet, as discussed above. We used observation times and RV uncertainties from the 104 HARPS-N observations of Kepler-20, collected over two seasons between 2014 and 2015 (Figure 6). We did not include the 30 available HIRES observations because the peak originally motivating the planetary signal at 34.94 days was detected in the periodogram of HARPS-N RVs only. The signal was not detected in the periodogram of combined HARPS-N and HIRES RVs. The HARPS-N observations have a baseline of approximately 570 days, so we set the upper limit for the detection of a periodic signal at one-half that time span, approximately 285 days.


Figure 6. Example simulated RV data set for the magnetic activity signal of Kepler-20, using observation times from two seasons of HARPS-N RVs (Buchhave et al. 2016).


We simulated magnetic activity RVs of Kepler-20 using GP regression and the quasi-periodic kernel described in Equation (1). Unlike in the case of K2-131, the original RV analysis performed by Buchhave et al. (2016) did not utilize GP regression to fit for magnetic activity, and therefore does not provide estimates of A, τ, and Prot. We estimated those three hyperparameters as described below.

We set A equal to the standard deviation of the HARPS-N RV residuals minus the median value of the associated uncertainties. We removed signals from Kepler-20b, Kepler-20c, and Kepler-20d to calculate the RV residuals, because it was in these residuals that Buchhave et al. (2016) detected the signal at 34.94 days. As explained before, Kepler-20e and Kepler-20f induce signals smaller in amplitude than HARPS-N RV precision, and therefore cannot be reliably fitted and removed (Buchhave et al. 2016). To estimate τ and Prot, we performed an ACF analysis on the same Kepler-20 photometric data originally analyzed by Buchhave et al. (2016), consisting of LCs from 15 Kepler campaigns (Q3-Q17) collected between 2009 and 2013. We applied discrete shifts to the LC and cross-correlated the shifted LCs with the original, revealing peaks separated by a timescale related to Prot, with correlation powers dropping off at a rate related to τ (McQuillan et al. 2014; Giles et al. 2017). Our ACF analysis failed to converge on a final value for τ, but did produce a Prot estimate of 27.4 ± 0.8 days. For τ, we instead used a value of 22.9 ± 0.2 days, obtained from the relationship between τ, stellar effective temperature, and the scatter in the photometric LC, described in Equation (8) of Giles et al. (2017). We used the following final hyperparameter values in our reported simulations of Kepler-20: A = 2.31 m s−1, τ = 22.9 days, Prot = 27.4 days.
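The shift-and-correlate idea behind the ACF analysis can be illustrated with a simplified sketch. It assumes an evenly sampled LC without gaps, takes the first positive local maximum of the ACF as the period estimate, and uses a noise-free synthetic sinusoid; the function names and all numerical settings are hypothetical, and the sketch does not reproduce the fitting of τ or the treatment of real Kepler data.

```python
import math


def autocorrelation(flux, max_lag):
    """Normalized ACF of an evenly sampled, mean-subtracted series."""
    n = len(flux)
    mean = sum(flux) / n
    centered = [f - mean for f in flux]
    norm = sum(c * c for c in centered)
    return [sum(centered[i] * centered[i + lag] for i in range(n - lag)) / norm
            for lag in range(max_lag + 1)]


def first_acf_peak_period(flux, cadence_days, max_lag):
    """Period estimate from the first positive local maximum of the ACF."""
    acf = autocorrelation(flux, max_lag)
    for lag in range(1, max_lag):
        if acf[lag] > 0.0 and acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]:
            return lag * cadence_days
    return None


# Synthetic light curve: 300 days at 0.5 day cadence with a 27.4 day modulation
cadence = 0.5
flux = [math.sin(2.0 * math.pi * (i * cadence) / 27.4) for i in range(600)]
print(first_acf_peak_period(flux, cadence, max_lag=120))
```

With evolving active regions, successive ACF peaks decrease in height, and the rate of that decrease is what carries information about τ.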

Again, using the method described in Section 2.2, we found that uncertainties associated with our τ and Prot estimates did not affect final maximum peak distributions. We also tested two additional values of A: the standard deviation of HARPS-N RV residuals without the mean HARPS-N observational uncertainty subtracted (6.07 m s−1) and the reported semi-amplitude of the RV signal attributed to Kepler-20g (4.10 m s−1; Buchhave et al. 2016). In all of these test cases, final distributions and occurrence rates demonstrated similar overall trends to those reported in Section 4.

We ran a second set of simulations with the activity evolution timescale set to τ = 274.0 days, in order to explore the case of activity features that remain stable over the timescale of observations. For both sets of A, τ, and Prot values, we again tested five average activity distributions, ω = [0.45, 0.475, 0.50, 0.525, 0.55].

We generated a total of 10 simulations, using the two values of τ and five values of ω given above, again with A and Prot fixed. In each simulation, we generated 100,000 iterations of magnetic activity RVs for Kepler-20. We plotted a histogram of the distribution of maximum peak periods over all iterations. Figures 7 and 8 show example histograms for simulations with τ = 22.9 days and τ = 274.0 days, respectively, both with ω = 0.55. All five values of ω produce similar final maximum peak distributions, as shown in Tables 4 and 5.
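A single iteration of a simulation of this kind can be sketched in Python as follows. The kernel below is one common form of the quasi-periodic covariance function and is an assumption here, since Equation (1) is not reproduced in this section. The observation times and per-point uncertainties are arbitrary placeholders rather than the HARPS-N schedule, the periodogram is a minimal floating-mean least-squares implementation, all function names are hypothetical, and no false-alarm-probability test is applied.

```python
import numpy as np

def quasi_periodic_kernel(t, amp, tau, p_rot, omega):
    """Assumed quasi-periodic covariance: squared-exponential decay times a periodic term."""
    dt = t[:, None] - t[None, :]
    return amp**2 * np.exp(-(dt / tau)**2 - np.sin(np.pi * dt / p_rot)**2 / omega**2)

def draw_activity_rvs(t, amp, tau, p_rot, omega, rng):
    """Draw one zero-mean GP realization at times t."""
    cov = quasi_periodic_kernel(t, amp, tau, p_rot, omega)
    # Sample via eigen-decomposition; tiny negative eigenvalues from round-off are clipped.
    eigval, eigvec = np.linalg.eigh(cov)
    scale = np.sqrt(np.clip(eigval, 0.0, None))
    return eigvec @ (scale * rng.standard_normal(t.size))

def gls_power(t, rv, err, periods):
    """Fractional chi-square reduction of a weighted sinusoid-plus-offset fit at each period."""
    w = 1.0 / err**2
    w = w / w.sum()
    y = rv - np.sum(w * rv)
    chi2_0 = np.sum(w * y**2)
    sw = np.sqrt(w)
    power = np.empty(periods.size)
    for i, period in enumerate(periods):
        phase = 2.0 * np.pi * t / period
        design = np.column_stack([np.cos(phase), np.sin(phase), np.ones(t.size)])
        coef, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
        power[i] = 1.0 - np.sum(w * (y - design @ coef)**2) / chi2_0
    return power

def max_peak_period(t, rv, err, periods):
    """Period of the highest periodogram value on the supplied grid."""
    return periods[np.argmax(gls_power(t, rv, err, periods))]

rng = np.random.default_rng(0)
t_obs = np.sort(rng.uniform(0.0, 500.0, 60))  # placeholder observation times (days)
err = np.full(t_obs.size, 2.0)                # placeholder uncertainties (m/s)
periods = np.linspace(2.0, 300.0, 400)

peaks = []
for _ in range(20):  # the text uses 100,000 iterations per simulation
    rv = draw_activity_rvs(t_obs, amp=2.31, tau=22.9, p_rot=27.4, omega=0.55, rng=rng)
    rv = rv + rng.normal(0.0, err)  # add white noise at the level of the uncertainties
    peaks.append(max_peak_period(t_obs, rv, err, periods))
peaks = np.array(peaks)
```

A histogram of `peaks` over many iterations would then play the role of the maximum peak distributions discussed in the text, under the assumptions stated above.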

Figure 7. Histogram of maximum peak periods for simulated Kepler-20 RVs, with evolving activity features (τ = 22.9 days) and ω = 0.55. The histogram is shown in gray with the bin size set to match occurrence rates listed in Table 4, and in black with a smaller bin size to show more detail. The vertical lines correspond to periods for which occurrence rates were calculated.

Figure 8. Same as Figure 7, but with stable activity features (τ = 274.0 days) and with occurrence rates in gray taken from Table 5. The top panel shows the full distribution and the bottom panel shows a zoomed-in view.

Table 4.  Occurrence Rates (%) for Select Maximum Peak Values in Simulated RV Periodograms of Kepler-20, with Evolving Activity Features (τ = 22.9 days)

Peak Value                ω = 0.45   ω = 0.475  ω = 0.5    ω = 0.525  ω = 0.55
POI = 34.94 days          4.3/0.7    4.7/0.8    4.4/0.8    4.8/0.9    4.8/0.9
POI time coherence        17.6       17.2       16.5       18.7       17.3
Prot = 27.4 days          6.9/1.2    6.4/1.1    6.7/1.2    6.7/1.2    6.5/1.2
Prot/2 = 13.7 days        4.6/0.8    4.2/0.7    3.8/0.7    3.3/0.6    3.2/0.6
Prot/3 = 9.1 days         1.3/0.2    1.1/0.2    1.0/0.2    0.9/0.2    0.7/0.2
Prot/4 = 6.9 days         0.5/0.1    0.5/0.1    0.4/0.1    0.3/0.1    0.4/0.1
Prot/5 = 5.5 days         0.2/0.0    0.2/0.0    0.1/0.0    0.2/0.0    0.1/0.0
Prot/6 = 4.6 days         0.2/0.0    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0
Prot/7 = 3.9 days         0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0
Prot/8 = 3.4 days         0.2/0.0    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0
Prot/9 = 3.0 days         0.1/0.0    0.2/0.0    0.1/0.0    0.1/0.0    0.1/0.0
Prot/10 = 2.7 days        0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0

Prot × 2 = 54.8 days      2.3/0.4    2.0/0.3    2.1/0.4    1.5/0.5    2.4/0.5
Prot × 3 = 82.2 days      1.7/0.3    1.7/0.3    1.6/0.3    1.9/0.3    1.8/0.3
Prot × 4 = 109.6 days     1.9/0.3    1.9/0.3    2.1/0.4    2.3/0.4    2.3/0.4
Prot × 5 = 137.0 days     1.6/0.3    1.7/0.3    1.8/0.3    2.2/0.4    2.0/0.4
Prot × 6 = 164.4 days     1.5/0.3    1.4/0.3    1.7/0.3    2.0/0.4    2.0/0.4
Prot × 7 = 191.8 days     1.4/0.2    1.6/0.3    1.7/0.3    1.9/0.4    1.9/0.4
Prot × 8 = 219.2 days     1.5/0.3    1.6/0.3    1.9/0.4    1.9/0.4    2.2/0.4
Prot × 9 = 246.6 days     1.3/0.2    1.2/0.2    1.3/0.2    1.6/0.3    1.5/0.3
Prot × 10 = 274.0 days    0.8/0.1    0.9/0.2    1.0/0.2    1.0/0.2    1.0/0.2

Cadence peak = 37.3 days  3.1/0.6    3.1/0.5    3.0/0.5    3.2/0.6    3.3/0.6

Note. First rates listed are with respect to only iterations having statistically significant peaks, and second rates listed are with respect to all 100,000 iterations.

Table 5.  Occurrence Rates (%) as Reported in Table 4, but with Stable Activity Features (τ = 274.0 days)

Peak Value                ω = 0.45   ω = 0.475  ω = 0.5    ω = 0.525  ω = 0.55
POI = 34.94 days          1.1/0.3    1.3/0.4    1.5/0.4    1.7/0.5    1.7/0.5
POI time coherence        5.5        9.0        7.5        6.7        8.0
Prot = 27.4 days          66.2/18.3  68.5/19.4  70.7/20.2  72.9/20.9  74.1/21.2
Prot/2 = 13.7 days        16.8/4.7   14.4/4.1   12.4/3.5   10.5/3.0   9.0/2.6
Prot/3 = 9.1 days         1.4/0.4    0.9/0.3    0.7/0.2    0.5/0.1    0.4/0.1
Prot/4 = 6.9 days         0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0
Prot/5 = 5.5 days         0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0
Prot/6 = 4.6 days         0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0
Prot/7 = 3.9 days         0.1/0.0    0.1/0.0    0.0/0.0    0.1/0.0    0.1/0.0
Prot/8 = 3.4 days         0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0
Prot/9 = 3.0 days         0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0
Prot/10 = 2.7 days        0.0/0.0    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0

Prot × 2 = 54.8 days      0.2/0.1    0.1/0.0    0.2/0.1    0.1/0.0    0.1/0.0
Prot × 3 = 82.2 days      0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0
Prot × 4 = 109.6 days     0.4/0.1    0.4/0.1    0.3/0.1    0.3/0.1    0.3/0.1
Prot × 5 = 137.0 days     0.1/0.0    0.1/0.0    0.0/0.0    0.1/0.0    0.1/0.0
Prot × 6 = 164.4 days     0.0/0.0    0.0/0.0    0.0/0.0    0.0/0.0    0.0/0.0
Prot × 7 = 191.8 days     0.0/0.0    0.0/0.0    0.0/0.0    0.0/0.0    0.0/0.0
Prot × 8 = 219.2 days     0.0/0.0    0.0/0.0    0.0/0.0    0.0/0.0    0.0/0.0
Prot × 9 = 246.6 days     0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0
Prot × 10 = 274.0 days    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0    0.1/0.0

Cadence peak = 37.3 days  0.6/0.2    0.7/0.2    0.9/0.3    0.9/0.2    0.9/0.2

Tables 4 and 5 list occurrence rates for simulations with τ = 22.9 days and τ = 274.0 days, respectively. We calculated occurrence rates for maximum peaks located within 5% of Prot, its integer multiples (Prot × 2, ..., Prot × 10), its rotational harmonics (Prot/2, ..., Prot/10), the cadence peak (37.3 days; Figure 5), and the POI (34.94 days; Figure 5).
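The bookkeeping behind these rates can be sketched in Python as follows. The sketch assumes that "within 5%" means a window of ±5% of each target period, and it returns both normalizations used in the tables (relative to iterations with significant peaks, and relative to all iterations). The function name and the toy inputs are hypothetical.

```python
import numpy as np

def occurrence_rates(peak_periods, significant, targets, window=0.05):
    """Percent of iterations whose significant maximum peak lies near each target period.

    Returns {name: (rate among significant iterations, rate among all iterations)}.
    """
    peak_periods = np.asarray(peak_periods, dtype=float)
    significant = np.asarray(significant, dtype=bool)
    n_all = peak_periods.size
    n_sig = int(significant.sum())
    rates = {}
    for name, target in targets.items():
        near = np.abs(peak_periods - target) <= window * target  # assumed +/-5% window
        hits = int(np.sum(near & significant))
        rate_sig = 100.0 * hits / n_sig if n_sig else float("nan")
        rates[name] = (rate_sig, 100.0 * hits / n_all)
    return rates

# Toy input: five iterations, one of which has no significant peak.
toy_peaks = [27.0, 35.0, 50.0, 13.7, 28.0]
toy_significant = [True, True, True, False, True]
toy_rates = occurrence_rates(toy_peaks, toy_significant, {"Prot": 27.4, "POI": 34.94})
```

Under this reading the windows around the POI (34.94 days) and the cadence peak (37.3 days) overlap slightly, so a single peak can be counted toward both.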

In iterations with a maximum peak at the POI, we checked whether the maximum peak was long-lived, surviving both seasons of observation. We used the method described in the final paragraph of Section 2.3 to calculate a rate of time coherence at the POI, listed in Tables 4 and 5. Figure 9 shows an example periodogram in which simulated Kepler-20 activity RVs, with ω = 0.55, produce a long-lived peak at the POI.
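A minimal Python sketch of such a check is given below. Section 2.3 is not reproduced here, so the criterion is an assumption: a peak is treated as long-lived when the periodogram of every individual season has its maximum within 5% of the POI. The function names and the Gaussian-shaped toy periodograms are hypothetical.

```python
import numpy as np

def peak_period(periods, power):
    """Period at which a periodogram reaches its maximum."""
    return periods[np.argmax(power)]

def is_long_lived(periods, season_powers, poi, window=0.05):
    """True if every season's maximum peak lies within the window around the POI."""
    return all(abs(peak_period(periods, p) - poi) <= window * poi for p in season_powers)

def time_coherence_rate(long_lived_flags):
    """Percent of POI-peaked iterations whose peak is long-lived."""
    flags = np.asarray(long_lived_flags, dtype=bool)
    return 100.0 * flags.mean() if flags.size else float("nan")

# Toy periodograms on a shared period grid (illustration only).
grid = np.linspace(10.0, 60.0, 501)
season_1 = np.exp(-0.5 * (grid - 35.0)**2)
season_2 = np.exp(-0.5 * (grid - 34.5)**2)
season_off = np.exp(-0.5 * (grid - 27.4)**2)

coherent = is_long_lived(grid, [season_1, season_2], poi=34.94)
not_coherent = is_long_lived(grid, [season_1, season_off], poi=34.94)
```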

Figure 9. Example of simulated Kepler-20 magnetic activity RVs with evolving activity features (τ = 22.9 days) and ω = 0.55, producing a long-lived peak in the periodogram at the POI, 34.94 days. The power corresponding to a 1% FAP rate is indicated by the dashed black line, and the GLS periodogram of simulated Kepler-20 RVs is shown in solid black. The magenta and cyan curves show GLS periodograms of just the first season and second season observations, respectively. We use the same method to calculate periodograms of individual seasons as used in Figure 5 (Dumusque et al. 2012). Simulating only magnetic activity, we are able to reproduce a long-lived signal that could be attributed to a nontransiting planet. Of the maximum peaks occurring at 34.94 days, 16.5%–18.7% were long-lived.

4. Results

Our analyses provide insight into how RV periodograms should be interpreted with respect to magnetic activity, given the limited, nonuniform sampling typical of current RV observing strategies.

4.1. Unexpected Maximum Peaks in RV Periodograms

Figures 3 and 7 show final distributions for simulated RVs of K2-131 and Kepler-20 with evolving magnetic active regions, and therefore stellar surfaces that change over the timescale of observations (τ ≈ Prot). In these cases, a large fraction of simulated active region signals fail to produce a maximum peak at Prot or a related period. Some simulations fail to produce significant peaks at all. This is rarely the case for K2-131, with only 0.1%–0.2% of iterations lacking significant peaks, but for Kepler-20, 81.0%–83.1% of simulated signals fail to produce significant peaks. The ranges in reported rates reflect the varying ω values; for simplicity, figures are shown for only a single average active region distribution (ω = 0.55). A large fraction of the statistically significant maximum peaks that do occur in the RV periodogram are located at periods unrelated to Prot: 80.6%–81.0% of maximum peaks for K2-131 and 71.5%–73.0% of maximum peaks in the case of Kepler-20 (Tables 1 and 4). If analyses of real RV data disregard stellar active regions or use inadequate models, these spurious periodic signals could interact with Keplerian signals of known exoplanets and lead to inaccurate RV mass measurements with underestimated errors.

Figures 4 and 8 show final distributions for simulated RVs of K2-131 and Kepler-20 with magnetic active regions, and therefore stellar surfaces, that remain unchanged over the timescale of observations (activity evolution timescale increased to τ ≫ Prot). In these cases, where strong rotation signals would typically be expected, a large fraction of simulated activity signals fail to produce a maximum peak at Prot or a related period. While essentially all iterations of simulated K2-131 signals produced significant peaks, simulated Kepler-20 RVs still fail to produce significant peaks at any period in 71.6%–72.4% of iterations. Again, a considerable fraction of the significant maximum peaks that do occur in the RV periodogram are located at periods unrelated to Prot: 34.3%–38.2% of peaks in the case of K2-131 and 14.0%–15.0% of peaks for Kepler-20 (Tables 2 and 5). These results demonstrate that spurious peaks are inherent to RVs of many activity distributions, even when magnetic surface features are unchanged over the timescale of observations. Therefore, even with high-cadence observations of exoplanets orbiting relatively inactive stars, spurious periodic signals could still lead to the aforementioned errors in RV mass determinations.

4.2. Observational Cadence and the Stellar Rotation Signal

All 10 of our final distributions show maximum peaks in the simulated RV periodograms favoring a period located between Prot and the cadence peak. In cases where τ ≈ Prot (Figures 3 and 7), this feature in the histogram could be attributed to random variations in the overall distributions. However, cases where τ ≫ Prot (Figures 4 and 8) show clear features in the histogram between Prot and the cadence peak. This suggests that maximum signals in RV periodograms have a tendency to occur at a period related to the limited sampling of the signal at Prot. In the case of Kepler-20, this favored period occurs at the POI, 34.94 days.

4.3. Occurrence at the POI for Kepler-20 and K2-131

For simulations of K2-131 using a real estimate of the activity evolution timescale (τ ≈ Prot), 1.0%–1.7% of significant maximum peaks in the RV periodogram occurred at the POI, 3.0 days (Table 1). Considering the close proximity of the POI to the second rotational harmonic (Prot/3 = 3.2 days), these occurrence rates are strikingly low relative to occurrence rates at other period values. However, as mentioned in Section 3.1, simulations including the lowest value of Prot within its reported uncertainty lead to an overlap between the POI and Prot/3, within the allowed 5% window. In this case, occurrence rates at the POI increase to 2.5%–3.6%. Due to the large range of potential maximum peak values in a given distribution, maximum peaks are not particularly likely to occur at any given period value. Therefore, while occurrence rates at the POI seem low, they appear more significant when compared to the highest occurrence rates at Prot, 6.6%–7.7% (Table 1).

Simulations of Kepler-20 using a real estimate of the activity evolution timescale (τ ≈ Prot) produce a maximum peak in the RV periodogram at the POI in 4.3%–4.8% of iterations with statistically significant peaks. These rates are relatively high when compared to the highest occurrence rates at Prot, 6.4%–6.9% (Table 4). Occurrence rates at the POI in simulations for Kepler-20 also seem more significant when compared with occurrence rates at the POI in similar simulations for K2-131. Since the POI for K2-131 is relatively close (0.2 days away) to its nearest Prot relative (Prot/3), and the POI for Kepler-20 is more than seven days away from Prot, we would expect to see greater occurrence rates at the POI in the case of K2-131. However, simulations of K2-131 and K2-131b produce much lower relative occurrence rates at the POI, 3.0 days. Our results therefore run contrary to the assumption that maximum peaks from magnetic activity in RV periodograms usually occur at periods related to Prot.

Simulated Kepler-20 RVs further defy assumptions about magnetic activity signals with 16.5%–18.7% of maximum peaks occurring at 34.94 days being long-lived, remaining the maximum peak in periodograms of both seasons of observation. These long-lived periodic signals that occur many days from a star's estimated Prot could be misinterpreted as exoplanet signals, particularly when analyses disregard models for stellar activity. While our results alone cannot rule out or confirm the existence of nontransiting planets around any target for certain, simulations of Kepler-20 demonstrate how spurious signals in RV periodograms from magnetic active regions could appear planetary in nature.

5. Discussion

Our ability to detect RV signals of low-mass and long-period planets is currently limited by magnetic activity effects on stellar surfaces, which ubiquitously appear as m s−1 level variations in even the least active stars (e.g., Isaacson & Fischer 2010). The key to breaking this magnetic activity barrier is understanding how activity effects appear in observations and finding optimal ways to characterize and model them. Periodogram-based approaches are commonly used to distinguish between signals from exoplanets and magnetic active regions. However, the effectiveness of characterizing evolving, quasi-periodic signals with perfectly periodic sine curves is untested. The simulations we present in this paper are a first test of the reliability of common assumptions about magnetic activity signal behavior in RV periodograms. Here we highlight the implications of our results for past and future exoplanet detections and characterizations.

The assumption that magnetic activity signals will peak at a period related to Prot in the RV periodogram could lead to inaccurate mass measurements and missed exoplanet signals. Our results in Section 4.1 reveal that RV signals from magnetic active regions often peak at periods unrelated to Prot in the GLS periodogram, even in the case of high-cadence observations of stars with highly stable, unchanging magnetic regions. These spurious periodic peaks are unlikely to be attributed to magnetic activity when a prior estimate of Prot is known, and they can disguise RV semi-amplitudes of real exoplanet signals. Both targeted follow-up observations to determine masses of known, transiting exoplanets and blind observations to detect new exoplanets are susceptible to this effect. RV fits without a physically motivated model for stellar activity rely heavily on periodogram analyses and risk inaccurately measuring masses of known transiting exoplanets, or missing/misidentifying signals of unknown companions completely.

Magnetic activity signals produce long-lived peaks that could be misidentified as nontransiting companions. Results discussed in Section 4.3 reveal multiple examples of a purely quasi-periodic simulated magnetic activity cycle producing the same spurious maximum peak in the RV periodogram over multiple seasons of observation. Our results in Section 4.2 suggest that these long-lived spurious signals may be related to the limited sampling of the rotation signal, and therefore tend to occur at a period between Prot and the strongest periodic signal inherent to observational sampling, the cadence peak. Given the common assumption that magnetic activity cycles will not produce long-lived significant peaks at periods unrelated to Prot, these long-lived spurious peaks from magnetic activity could be mistakenly attributed to an exoplanet. Fits excluding a model for magnetic activity are particularly vulnerable to misidentifying these spurious periodic signals.

In order to model and fit stellar activity signals, we need to utilize methods that provide reliable prior estimates of associated parameters, particularly τ and Prot. ACF analyses and GP regression fits applied to Kepler LCs have provided τ and Prot estimates leading to successful RV mass measurements (e.g., Haywood et al. 2014; López-Morales et al. 2016). Ideally, photometric LCs used to inform magnetic activity models should be observed near the same time frame as RVs, in order to avoid comparing data sets taken at different phases of a star's activity cycle or with dramatically different activity feature distributions. Spectroscopic activity indicators and chromospheric RVs can also constrain magnetic activity fits, and are inherently simultaneous with RV observations.

The current approach of RV surveys is to use stellar activity information retroactively, correcting effects from magnetic active regions in RV measurements after data have already been collected. However, a more efficient way to deal with activity effects could be to schedule RV observations of specific systems in ways that optimize the sampling of both the magnetic activity and exoplanet signals, so both can be more easily extracted from the data. Early attempts to do this have proven successful (López-Morales et al. 2016; Barros et al. 2017; Santerne et al. 2018). As more precise and stable RV instruments become available, collaboration between those instruments will be necessary to achieve better coverage of stellar activity cycles. Refined algorithms can then produce optimized observing strategies for individual targets. These steps will be key to accurately measuring exoplanet masses in RV data.

Our results in Section 4 reveal that magnetic activity cannot be ignored in RV exoplanet fits. A fit accounting for stellar activity should be considered for all potential exoplanet signals, either with GP regression or another well-motivated model. This is true even for cases where stellar activity signals are not obviously observed with strong peaks in the periodogram or correlations to known activity indicators. The simulations detailed in this paper can become a standard tool for determining what signals to expect from magnetic activity in the RV periodogram, comparing those signals with real data sets, and preventing assumptions about peak locations from leading to inaccurate mass measurements and false exoplanet detections.

We would like to thank Annelies Mortier and Andrew Collier Cameron for sharing their radial velocity and stellar activity expertise in meetings. Some of this work has been carried out in the frame of the National Centre for Competence in Research "PlanetS" supported by the Swiss National Science Foundation (SNSF). This material is based upon work supported by the National Aeronautics and Space Administration under grants No. NNX15AC90G and NNX17AB59G issued through the Exoplanets Research Program. This work was also performed under contract with the California Institute of Technology (Caltech)/Jet Propulsion Laboratory (JPL) funded by NASA through the Sagan Fellowship Program executed by the NASA Exoplanet Science Institute (R.D.H.).
