PLANETARY CANDIDATES OBSERVED BY KEPLER. VII. THE FIRST FULLY UNIFORM CATALOG BASED ON THE ENTIRE 48-MONTH DATA SET (Q1–Q17 DR24)

Published 2016 May 23 © 2016. The American Astronomical Society. All rights reserved.
Citation: Jeffrey L. Coughlin et al. 2016 ApJS 224 12. DOI: 10.3847/0067-0049/224/1/12

ABSTRACT

We present the seventh Kepler planet candidate (PC) catalog, which is the first catalog to be based on the entire, uniformly processed 48-month Kepler data set. This is the first fully automated catalog, employing robotic vetting procedures to uniformly evaluate every periodic signal detected by the Q1–Q17 Data Release 24 (DR24) Kepler pipeline. While we prioritize uniform vetting over the absolute correctness of individual objects, we find that our robotic vetting is overall comparable to, and in most cases superior to, the human vetting procedures employed by past catalogs. This catalog is the first to utilize artificial transit injection to evaluate the performance of our vetting procedures and to quantify potential biases, which are essential for accurate computation of planetary occurrence rates. With respect to the cumulative Kepler Object of Interest (KOI) catalog, we designate 1478 new KOIs, of which 402 are dispositioned as PCs. Also, 237 KOIs dispositioned as false positives (FPs) in previous Kepler catalogs have their disposition changed to PC and 118 PCs have their disposition changed to FPs. This brings the total number of known KOIs to 8826 and PCs to 4696. We compare the Q1–Q17 DR24 KOI catalog to previous KOI catalogs, as well as ancillary Kepler catalogs, finding good agreement between them. We highlight new PCs that are both potentially rocky and potentially in the habitable zone of their host stars, many of which orbit solar-type stars. This work represents significant progress in accurately determining the fraction of Earth-size planets in the habitable zone of Sun-like stars. The full catalog is publicly available at the NASA Exoplanet Archive.

1. INTRODUCTION

The Kepler instrument is a 0.95 m aperture, optical (423–897 nm at >5% throughput), space-based telescope that employs 42 CCDs to photometrically observe ∼170,000 stars over a field of view of 115 square degrees (Koch et al. 2010). It achieves a combined (intrinsic and instrumental) noise on twelfth-magnitude, solar-type stars of ∼30 ppm (Gilliland et al. 2011; Christiansen et al. 2012) on a 6 hr timescale. The primary objective of the Kepler mission is to determine the frequency of Earth-size planets in the habitable zone around solar-like stars (Borucki et al. 2010) by searching for the periodic drops in brightness that occur when planets transit their host stars. Observations of the original Kepler field lasted from 2009 May 13 until 2013 May 11, when the second of four onboard reaction wheels failed. The spacecraft could no longer maintain the required pointing precision in the original Kepler field and was re-purposed for an ecliptic plane mission (K2; Howell et al. 2014). In this paper, we focus exclusively on data collected from the original Kepler field (19h 22m 40s, +44°30' 00'').

A series of previously published Kepler catalog papers presented an increasingly larger number of planet candidates (PCs) as additional observations were collected by the spacecraft (Borucki et al. 2011a, 2011b; Batalha et al. 2013; Burke et al. 2014; Mullally et al. 2015a; Rowe et al. 2015). These catalogs have been used extensively in the investigation of planetary occurrence rates (e.g., Catanzarite & Shao 2011; Youdin 2011; Howard et al. 2012; Dressing & Charbonneau 2013; Dong & Zhu 2013; Fressin et al. 2013; Petigura et al. 2013; Foreman-Mackey et al. 2014; Burke et al. 2015; Mulders et al. 2015), the determination of exoplanet atmospheric properties (e.g., Coughlin & López-Morales 2012; Esteves et al. 2013; Demory 2014; Sheets & Deming 2014), and the development of planetary confirmation techniques via supplemental analysis and follow-up observations (e.g., Moorhead et al. 2011; Morton & Johnson 2011; Adams et al. 2012, 2013; Colón et al. 2012; Fabrycky et al. 2012; Ford et al. 2012; Santerne et al. 2012; Steffen et al. 2012; Barrado et al. 2013; Dressing et al. 2014; Law et al. 2014; Lillo-Box et al. 2014; Muirhead et al. 2014; Plavchan et al. 2014; Rowe et al. 2014; Everett et al. 2015). Furthermore, astrophysically variable systems not due to transiting planets have yielded valuable new science on stellar binaries, including eclipsing (e.g., Coughlin et al. 2011; Prša et al. 2011; Slawson et al. 2011), self-lensing (Kruse & Agol 2014), beaming (Faigler & Mazeh 2011; Shporer et al. 2011), and tidally interacting systems (e.g., Thompson et al. 2012). While widely used, these previous catalogs involved a substantial amount of manual vetting by a dedicated team of scientists, and as a result were non-uniform (i.e., not every signal was vetted, and those examined were not vetted to the same standard).

This paper describes the use of a robotic vetting procedure to produce, for the first time, a fully automated and uniform planetary catalog based on the entire Kepler mission data set (Q1–Q17; 48 months; data release 24). This procedure and the resulting catalog enable a more accurate determination of planetary occurrence rates, as any potential biases of the robotic vetting can be quantified via artificial transit injection and other tests. However, we note that due to a subtle flaw in the implementation of a veto in the Kepler pipeline, a non-uniform planet search was conducted, and thus care should be taken if using this catalog to compute planetary occurrence rates (see Section 2.2).

In Section 2, we discuss the population of signals possibly due to transiting planets that are identified by the Kepler pipeline and used in this catalog. In Section 3, we describe the robotic procedure employed to vet and disposition every signal. In Section 4, we list the inputs to and results of the robotic vetting, describe the designation of Kepler Objects of Interest, and explain the subsequent transit-model fitting. In Section 5, we compare this catalog to previous and ancillary catalogs, assess the performance of the robotic vetting utilizing the results of artificial transit injection, and highlight and scrutinize new PCs that are potentially rocky and in the habitable zone of their host stars. In Section 6, we discuss the scientific impact of this catalog, and what work can be done to further improve and characterize our vetting procedures for the next Kepler catalog. Finally, due to the significant number of acronyms that are inherent to any large mission like Kepler, in Appendix A we list and define all of the acronyms used in this paper.

2. Q1–Q17 DR24 TCES

This catalog is based on Kepler's 24th data release (DR24), which includes the processing of all data utilizing version 9.2 of the Kepler pipeline (Jenkins et al. 2010). This marks the first time that all of the Kepler mission data have been processed consistently with the same version of the Kepler pipeline. Over a period of 48 months (2009 May 13 to 2013 May 11), subdivided into 17 quarters (Q1–Q17), a total of 198,646 targets were observed, with 112,001 targets observed in every quarter and 86,645 observed in a subset of the 17 quarters (Seader et al. 2015). The calibrated pixel-level images and processed light curves are publicly available at the Mikulski Archive for Space Telescopes (MAST), along with thorough documentation via the Kepler Instrument Handbook (Van Cleve & Caldwell 2009), the Kepler Data Characteristics Handbook (Christiansen et al. 2013b), the Kepler Archive Manual (Thompson & Fraquelli 2014), and the Kepler Data Release 24 Notes (Thompson et al. 2015a).

Seader et al. (2015) discuss in detail the process of identifying threshold crossing events (TCEs), which are periodic flux decrements that may be consistent with the signals produced by transiting exoplanets. Each TCE has an associated Kepler Input Catalog (KIC) ID, period, epoch, depth, and duration. For DR24, Seader et al. (2015) identified a total of 20,367 TCEs, which are publicly available at the NASA Exoplanet Archive in the Q1–Q17 DR24 TCE table. We employ these 20,367 TCEs as our starting point to produce a PC catalog with the goal of designating each TCE as a PC or false positive (FP). In the next two subsections, we explore the TCE FP population (TCEs that are not due to transiting planets) and the false negative population (transiting planets that were not detected).

2.1. The TCE FP Population

In Figure 1, we plot a histogram of the number of Q1–Q17 DR24 TCEs identified as a function of period (Seader et al. 2015). We also plot the TCE populations from the two previously published searches, which used data from Q1–Q16 (Tenenbaum et al. 2014) and Q1–Q12 (Tenenbaum et al. 2013), processed by previous versions of the Kepler pipeline. Given that the observed period distribution of transiting planets is thought to be relatively flat and smooth in log-space (Howard et al. 2012; Fressin et al. 2013), and that the population of TCEs has varied significantly between successive data releases and pipeline versions, it is clear that all of these TCE catalogs contain a large number of FPs.

Figure 1. Distribution of TCEs as a function of period, with uniform bins in log-space. TCEs from Q1–Q12 (Tenenbaum et al. 2013) are plotted in blue, TCEs from Q1–Q16 (Tenenbaum et al. 2014) are plotted in green, and TCEs from Q1–Q17 DR24 (Seader et al. 2015) are plotted in red.

For Q1–Q17 DR24 and Q1–Q12, there is a particularly large excess at short periods principally due to short-period, quasi-sinusoidal variable stars, e.g., rapid rotators with strong starspots and pulsating stars, as well as eclipsing binary (EB) stars. Spikes seen in this short-period regime are due to contamination from bright variable stars (Coughlin et al. 2014), such as RR Lyrae at 0.567 days (−0.25 in log-space), V2083 Cyg at 0.934 days (−0.03 in log-space), and V380 Cyg at 12.426 days (1.09 in log-space). For Q1–Q12 and Q1–Q16 there is a large spike of excess TCEs at ∼372 days (2.57 in log-space), which is due to quasi-sinusoidal-like red noise produced by "rolling-band" instrumental artifacts (Van Cleve & Caldwell 2009) that repeat at Kepler's orbital period. For Q1–Q16, and to a lesser extent Q1–Q17, there is a broad excess of long-period TCEs at periods ≳200 days. These are due to short-duration systematics (caused by cosmic rays, flares, starspots, stellar pulsation, edge effects around gaps, and similar features) that occur throughout the light curves and produce a TCE when three events happen to be equally spaced in time. In the Q1–Q17 data, a spike at ∼459 days (2.66 in log-space) can be seen, corresponding to TCEs that were generated due to edge effects from three equally spaced data gaps, and thus this ∼459 day systematic is common to many stars across the entire field.
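As a quick arithmetic check of the log-space positions quoted above, the following minimal Python sketch converts each quoted period to log10(period in days). The dictionary labels and the helper name log_period are illustrative choices, not identifiers from the Kepler pipeline.

```python
import math

# Periods (days) quoted in the text, paired with the log10 values given there.
# The labels are illustrative only.
QUOTED = {
    "RR Lyrae": (0.567, -0.25),
    "V2083 Cyg": (0.934, -0.03),
    "V380 Cyg": (12.426, 1.09),
    "rolling-band spike": (372.0, 2.57),
    "data-gap spike": (459.0, 2.66),
}


def log_period(period_days):
    """Return log10 of a period in days, rounded to two decimals."""
    return round(math.log10(period_days), 2)


for name, (period, quoted_log) in QUOTED.items():
    print(f"{name}: log10({period}) = {log_period(period)} (text: {quoted_log})")
```

Each computed value agrees with the quoted one to two decimal places, which makes it straightforward to locate the corresponding spikes on the logarithmic period axis of Figure 1.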

2.2. The TCE False Negative Population

The injection of artificial transits into the pixel-level data is crucial to fully characterize the false negative rate and compute accurate occurrence rates. The completeness (i.e., how often a transiting planet signal is recovered) of the Kepler pipeline has been measured for both individual transit events (Christiansen et al. 2013a) and multiple transit events spanning a year of data (Christiansen et al. 2015). An injection run for the entire Q1–Q17 DR24 data set has been completed and the results are publicly available (Christiansen 2015). We employ these results in quantifying the accuracy of our PC catalog (see Section 5.3).

In the Q1–Q17 DR24 version of the Kepler pipeline, a new veto was added called the "statistical bootstrap test," which adjusts the detection threshold of the pipeline to account for the presence of non-Gaussian noise—see the appendices of Jenkins (2002), Seader et al. (2015), and Jenkins et al. (2015) for details. While this test was successful in eliminating many long-period FPs compared to the previous Q1–Q12 and Q1–Q16 runs (see Figure 1), a subtle flaw introduced excess noise into the statistic. This eliminated a significant number of valid long-period, transit-like signals, especially at low signal-to-noise ratio (S/N), which may have included some previously designated near Earth-size PCs in the habitable zones of their host stars from Rowe et al. (2015) and Mullally et al. (2015a). Results from the Q1–Q17 DR24 transit injection run (Christiansen 2015) also indicate that the bootstrap test introduced a period-dependent, non-uniform planet search, which complicates the computation of planetary occurrence rates. Future Kepler pipeline runs will not employ the bootstrap test as a veto within the transiting planet search (TPS) module, but rather retain a correctly implemented version as a diagnostic metric.

While they are not true false negatives, as they are not transiting planets, we also note that the on-target, contact EB candidates identified by the Kepler Eclipsing Binary Working Group (EBWG; Prša et al. 2011; Slawson et al. 2011; Kirk et al. 2016) were purposely excluded from this transit search, as sinusoidal and quasi-sinusoidal signals are not considered to be transit-like for mission purposes, and significantly increase processing time. There was a total of 1033 targets excluded, which we list in Table 1. "Contact" is defined as having a morphology parameter (Matijevič et al. 2012) greater than 0.6. Detached eclipsing binaries were not excluded as they are sufficiently transit-like to include in this catalog. Stars that were not searched for transits can also be identified by lacking a value for "duty cycle" in the Q1–Q17 DR24 stellar table, which is publicly available at the NASA Exoplanet Archive.

Table 1.  The 1033 Contact Eclipsing Binaries Excluded from the Q1–Q17 DR24 Kepler Pipeline Transit Search

KIC ID
001433410
001572353
001573836
001868650
002012362
002141697
002159783
002162283
002302092
002305277
...

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

3. ROBOTIC VETTING

In previous PC catalogs (Borucki et al. 2011a, 2011b; Batalha et al. 2013; Burke et al. 2014; Rowe et al. 2015), various plots and diagnostics for each TCE were visually examined by members of the Threshold Crossing Event Review Team (TCERT), which consists of professional scientists who have a thorough understanding of Kepler data systematics and the various types of FP scenarios. Mullally et al. (2015a) employed partial automation in the Q1–Q16 catalog through the use of three simple parameter cuts, principally to cull out a large number of long-period FPs, as well as a robotic procedure to identify a particular subset of centroid offsets (see Section 5.2 of Mullally et al. 2015a).

The need to fully automate the dispositioning of TCEs, a long-standing objective of the Kepler mission, is principally driven by the desire to compute accurate planet occurrence rates, which requires that every TCE be dispositioned in a uniform manner so that it can be subjected to quantitative evaluation. As manual inspection by TCERT members is very time-consuming, it is often not feasible to examine each of the ∼20,000 TCEs produced by the Kepler pipeline. While TCERT members are well-trained, as humans they do not always agree with each other, and individuals may disposition a given TCE differently depending on external factors such as the time of day, their mood, other TCEs examined recently, etc. However, humans are naturally adept at pattern recognition and categorization, and TCERT has developed an efficient and comprehensible workflow procedure, based on understood physical processes, while working on the previous six PC catalogs.

Thus, for automating the TCE dispositioning process, we have specifically chosen a robotic vetting procedure that operates via a series of simple decision trees. Hereafter referred to as the "robovetter," it attempts to mimic the well-known human vetting process, providing a specific reason for dispositioning any TCE as an FP. The robovetter was initially developed based on the results of the Q1–Q16 catalog (Mullally et al. 2015a) and then further refined based on the results of manual checks on the Q1–Q17 DR24 data set by TCERT members.

In Rowe et al. (2015) and Mullally et al. (2015a), FP TCEs were assigned to one or more of the following FP categories.

  • "Not Transit-like": a TCE whose light curve is not consistent with that of a transiting planet or EB, such as instrumental artifacts and non-eclipsing variable stars.
  • "Significant Secondary": a TCE that is observed to have a significant secondary event, indicating that the transit-like event is most likely caused by an EB. (Self-luminous, hot Jupiters with a visible secondary eclipse are also in this category, but are still given a disposition of PC.)
  • "Centroid Offset": a TCE whose signal is observed to originate on a nearby star, rather than the target star, based on examination of the pixel-level data.
  • "Ephemeris Match Indicates Contamination": a TCE that has the same period and epoch as another object, and is not the true source of the signal given the relative magnitudes, locations, and signal amplitudes of the two objects.

In Figure 2, we present a flowchart that outlines our robotic vetting procedure. As can be seen, each TCE is subjected to a series of "yes" or "no" questions (represented by diamonds) that either disposition it into one or more of the four FP categories, or else disposition it as a PC. Behind each question is a series of more specific questions, each answered by quantitative tests. These tests are designed with the same "innocent until proven guilty" approach that was used by TCERT members in previous catalogs, such that no TCE is dispositioned as an FP without substantial evidence. Quantitatively, we aim to preserve at least ∼95% of injected transits while rejecting as many FPs as possible.

Figure 2. Overview flowchart of the robovetter. Diamonds represent "yes" or "no" decisions that are made with quantitative metrics. A TCE is dispositioned as an FP if it fails any test (a "yes" decision) and is placed in one or more of the FP categories. If a TCE passes all of the tests (a "no" decision for all tests) it is dispositioned as a PC. The section numbers on each component correspond to the sections in this paper where these tests are discussed. More in-depth flowcharts are provided for the not transit-like and significant secondary modules in Figures 3 and 4.

We note that for all of the robovetter tests that require a phased light curve and model fit, we utilize two different detrendings and model fits. In the Kepler pipeline, the Data Validation (DV) module produces a harmonic-removed, median-detrended, phased flux light curve, along with a transit-model fit (Wu et al. 2010). However, the harmonic remover is known to suppress short-period (≲3 days) signals, such that short-period eclipsing binaries with visible secondaries can appear as transiting planets with no visible secondary (Christiansen et al. 2013a). It can also make variable stars with semi-coherent variability, such as starspots or pulsations, appear as transit-like signals. Thus, we create phased flux light curves via an alternate detrending method that utilizes the pre-search data conditioned (PDC) time-series light curves and the non-parametric penalized least-squares detrending method of Garcia (2010), which includes only the out-of-transit points when computing the filter. These alternately detrended light curves are then phased and fit with a simple trapezoidal transit model. This alternate detrending technique is effective at accurately detrending short-period eclipsing binaries and variable stars, i.e., preserving their astrophysical signal. Every test that is applied to the DV phased light curves is also applied to the alternate detrending—failing a test using either detrending results in the TCE being classified as an FP.

The robovetter first checks if the TCE corresponds to a secondary eclipse associated with a previously examined system. If not, the robovetter then checks if the TCE is transit-like or not. If it is transit-like, then the robovetter looks for the presence of a secondary eclipse. In parallel, the robovetter also looks for evidence of a centroid offset and an ephemeris match to other TCEs and variable stars in the Kepler field. In the following subsections, we describe in detail each of these tests in the order in which they are performed by the robovetter.

3.1. The TCE is the Secondary of an Eclipsing Binary

If a TCE under examination is not the first one in a system, then the robovetter checks if there exists a previous TCE with a similar period that was designated as an FP due to a significant secondary (see Section 3.3). To determine whether two TCEs have the same period within a given statistical threshold, we employ the period-matching statistic σP of Coughlin et al. (2014, see Equations (1)–(3)), where higher values of σP indicate more significant period matches. Here, we re-state the equations as

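$$\Delta P=\frac{P_{A}-P_{B}}{P_{A}}\qquad (1)$$

$$\Delta P^{\prime }={\rm{abs}}\left(\Delta P-{\rm{rint}}(\Delta P)\right)\qquad (2)$$

$$\sigma_{P}=\sqrt{2}\cdot {\rm{erfcinv}}(\Delta P^{\prime })\qquad (3)$$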

where PA is the period of the shorter-period TCE, PB is the period of the longer-period TCE, rint() rounds a number to the nearest integer, abs() yields the absolute value, and erfcinv() is the inverse complementary error function. We consider any value of σP > 3.25 to indicate significantly similar periods.
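The period-matching statistic can be sketched in a few lines of Python. This is an illustrative sketch rather than the robovetter code: it assumes the Coughlin et al. (2014) form ΔP = (PA − PB)/PA, ΔP′ = abs(ΔP − rint(ΔP)), σP = √2 · erfcinv(ΔP′), and the function names are hypothetical. To stay within the standard library it uses the identity √2 · erfcinv(x) = −Φ⁻¹(x/2), where Φ is the standard normal cumulative distribution function.

```python
from statistics import NormalDist


def sqrt2_erfcinv(x):
    """Return sqrt(2) * erfcinv(x) for 0 < x < 2 (infinity when x <= 0).

    Uses sqrt(2) * erfcinv(x) = -Phi^{-1}(x / 2), with Phi the standard
    normal CDF, so that no third-party library is required.
    """
    if x <= 0.0:
        return float("inf")
    return -NormalDist().inv_cdf(x / 2.0)


def period_match_sigma(period_1, period_2):
    """Significance of a period match, allowing for integer period ratios.

    Assumed form (see lead-in): the fractional period difference is
    compared with its nearest integer, and the remainder is converted
    to a Gaussian-equivalent significance.
    """
    p_a, p_b = sorted((period_1, period_2))  # P_A is the shorter period
    delta_p = (p_a - p_b) / p_a
    delta_p_prime = abs(delta_p - round(delta_p))
    return sqrt2_erfcinv(delta_p_prime)


if __name__ == "__main__":
    for pair in [(10.0, 10.001), (5.0, 10.0005), (10.0, 13.0)]:
        sigma = period_match_sigma(*pair)
        print(pair, round(sigma, 2), "match" if sigma > 3.25 else "no match")
```

Under this form, the σP > 3.25 threshold corresponds to a residual ΔP′ of roughly 0.001, i.e., the two periods (or an integer multiple) agree to about one part in a thousand.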

If the current TCE is (1) in a system that has a previous TCE dispositioned as an FP due to a significant secondary, (2) matches the previous TCE's period with σP > 3.25, and (3) is separated in phase from the previous TCE by at least 2.5 times the transit duration, then the current TCE is considered to be a secondary eclipse. In this case, it is designated as an FP and is classified into both the not transit-like and significant secondary FP categories—a unique combination that can be used to identify secondary eclipses while still ensuring that they are not assigned Kepler Object of Interest (KOI) numbers (see Section 4.2). Note that since the Kepler pipeline identifies TCEs in order of their S/N, from high to low, a TCE identified as a secondary can sometimes have a deeper depth than the primary, depending on their relative durations and shapes.

There are two cases where we modify the three criteria above. First, it is possible that the periods of two TCEs will meet the period matching criteria, but be different enough to have their relative phases shift significantly over the ∼4 year mission duration. Thus, the potential secondary TCE is actually required to be separated in phase by at least 2.5 times the previous TCE's transit duration over the entire mission time frame in order to be labeled as a secondary. Second, the Kepler pipeline will occasionally detect the secondary eclipse of an EB at a half, third, or some smaller integer fraction of the orbital period of the system, such that the epoch of the detected secondary coincides with that of the primary. Thus, for the non-1:1 period ratio cases, we do not impose the phase separation requirement. (Note that Equations (1)–(3) allow for integer period ratios.)
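The three criteria and their two modifications can be summarized in a short Python sketch. It is one possible reading of the text, not the robovetter implementation: the function names, the argument layout, and the choice to measure separation as the smallest time offset between events over the observing window are all assumptions made for illustration, and σP is taken as a precomputed input.

```python
import math


def min_event_separation(epoch_prev, period_prev, epoch_cur, period_cur,
                         t_start, t_end):
    """Smallest time offset (days) between any current-TCE event in
    [t_start, t_end] and the nearest previous-TCE event."""
    n = math.ceil((t_start - epoch_cur) / period_cur)
    smallest = float("inf")
    while epoch_cur + n * period_cur <= t_end:
        t_event = epoch_cur + n * period_cur
        cycles = (t_event - epoch_prev) / period_prev
        offset = abs(cycles - round(cycles)) * period_prev
        smallest = min(smallest, offset)
        n += 1
    return smallest


def is_secondary_eclipse(prev_is_secondary_fp, sigma_p,
                         epoch_prev, period_prev, duration_prev,
                         epoch_cur, period_cur, t_start, t_end,
                         sigma_cut=3.25, n_durations=2.5):
    """Apply the three criteria, with the two modifications described above."""
    # Criteria (1) and (2): prior significant-secondary FP and a period match.
    if not prev_is_secondary_fp or sigma_p <= sigma_cut:
        return False
    # Second modification: no phase requirement for non-1:1 period ratios.
    p_short, p_long = sorted((period_prev, period_cur))
    if round(p_long / p_short) != 1:
        return True
    # First modification: the separation must hold over the whole time span.
    separation = min_event_separation(epoch_prev, period_prev,
                                      epoch_cur, period_cur, t_start, t_end)
    return separation >= n_durations * duration_prev
```

For example, with a previous TCE at epoch 0 days, period 10 days, and duration 0.2 days, a matching 10 day TCE at epoch 5 days is flagged as a secondary, whereas one at epoch 0.3 days is not, because it lies within 2.5 durations of the previous TCE.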

3.2. Not Transit-like

A very large fraction of FP TCEs have light curves that do not resemble a detached transiting or eclipsing object. These include quasi-sinusoidal light curves from pulsating stars, starspots, and contact binaries, as well as more sporadic light curves due to instrumental artifacts. In previous PC catalogs, a process called "triage" was employed whereby the human vetters looked at the phased light curves to determine whether the TCEs were not transit-like or should be given KOI numbers, which are used to keep track of transit-like systems over multiple Kepler pipeline runs. We thus employ a series of algorithmic tests to reliably identify these not transit-like FP TCEs, as shown by the flowchart in Figure 3.

Figure 3. Not transit-like flowchart of the robovetter. Diamonds represent "yes" or "no" decisions that are made with quantitative metrics. If a TCE fails any test (via a "yes" response to any decision), then it is dispositioned as a not transit-like FP. If a TCE passes all tests (via a "no" response to all decisions), then it is given a KOI number and passed to the significant secondary module (see Section 3.3 and Figure 4). The section numbers on each decision diamond correspond to the sections in this paper where these tests are discussed.

3.2.1. Not Transit-shaped

The human members of TCERT were given training and diagnostic plots that allowed them to quickly distinguish between a quasi-sinusoidal shaped light curve and one that is more detached due to a transit or eclipse. Also, they were trained to recognize if an individual event is due to a transit or a systematic feature, such as a sudden discontinuity in the light curve. As such, we sought metrics for the robovetter that would similarly detect quasi-sinusoidal light curves and systematics.

3.2.1.1. The LPP Metric

Many short-period FPs are due to variable stars that exhibit a quasi-sinusoidal phased light curve. Matijevič et al. (2012) used a technique known as Locally Linear Embedding (LLE), a dimensionality reduction algorithm, to classify the "detachedness" of Kepler EB light curves on a scale of 0–1, where 0 represented fully detached systems with well-separated, narrow eclipses and 1 represented contact binaries with completely sinusoidal light curves. We use a similar technique, known as Locality Preserving Projections (LPP; He & Niyogi 2004), to distinguish transit-like signals from not transit-like signals (Thompson et al. 2015b). LPP returns a single number that represents the similarity of a TCE's shape to that of known transits. Unlike LLE, LPP can be applied to any TCE, not just those that lie within the parameter space of the training set. Thus, LPP is more suitable for separating transit-like TCEs from all of the other not transit-like TCEs, and can be run on artificially injected transits.

To calculate the LPP metric, we start with detrended Kepler light curves. We then fold and bin each light curve into 141 points, ensuring adequate coverage of both the in- and out-of-transit portions of the light curve. We exclude points near a phase of 0.5, as the presence of a secondary eclipse in a short-period binary may unduly influence the LPP value, and we seek to classify detached eclipsing binaries as transit-like. These 141 points act as the initial number of dimensions that describe each TCE. Using a subset of known transit-like TCEs, we create a map from the initial 141 dimensions down to 20 dimensions. We apply this map to all TCEs and measure the average Euclidean distance of each to the 15 nearest known transit-like TCEs. This average distance is the value of the LPP metric. When small, it means other transit-like TCEs are nearby in the 20-dimensional space, and thus it is likely to be shaped like a transit. We calculate this LPP transit metric for all TCEs using both the DV and the alternate detrending, as described in Section 3.2.
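The final step of this calculation, the average Euclidean distance to the nearest known transit-like TCEs, can be illustrated with a minimal Python sketch. It covers only that step: the folding, binning, and the 141-to-20 dimension LPP map are assumed to have been applied already, and the function name and inputs are hypothetical.

```python
import math


def mean_knn_distance(point, known_transit_points, k=15):
    """Average Euclidean distance from `point` to its k nearest neighbors
    among `known_transit_points` (all given in the reduced space)."""
    if not known_transit_points:
        raise ValueError("at least one reference point is required")
    distances = sorted(math.dist(point, ref) for ref in known_transit_points)
    nearest = distances[:k]  # fewer than k are used if fewer are available
    return sum(nearest) / len(nearest)


if __name__ == "__main__":
    # Toy two-dimensional example; the real metric uses 20 dimensions.
    references = [(3.0, 4.0), (0.0, 1.0), (6.0, 8.0)]
    print(mean_knn_distance((0.0, 0.0), references, k=2))  # mean of 1 and 5
```

A small value indicates that the TCE sits close to known transits in the reduced space; a large value indicates a shape unlike the training transits.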

In order to quantitatively determine a threshold between transit-like and not transit-like, we run the LPP classifier on both detrendings of the injected transits (see Section 5.3), which we know a priori are all transit-shaped, barring any light-curve distortion due to detrending. We then fit a Gaussian to the resulting distribution, computing its median and standard deviation. We then select a maximum LPP cutoff such that we expect less than one false negative in 20,367 TCEs, via

$$\sigma_{{\rm{LPP}}}=\sqrt{2}\cdot {\rm{erfcinv}}\left(\frac{1}{N_{{\rm{TCEs}}}}\right)\qquad (4)$$

where NTCEs = 20,367, yielding σLPP = 4.06. Any TCE with an LPP value greater than the median plus 4.06 times the standard deviation, using either detrending, is considered to be not transit-like.
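The quoted threshold can be checked numerically. The sketch below assumes that the cutoff takes the form σLPP = √2 · erfcinv(1/NTCEs), which is the Gaussian two-sided significance at which fewer than one event in NTCEs is expected by chance; the function names and the example median and standard deviation are hypothetical.

```python
from statistics import NormalDist

N_TCES = 20367  # number of Q1-Q17 DR24 TCEs quoted in the text


def sigma_for_one_false_negative(n_trials):
    """sqrt(2) * erfcinv(1 / n_trials), via -Phi^{-1}(1 / (2 * n_trials))."""
    return -NormalDist().inv_cdf(1.0 / (2.0 * n_trials))


def lpp_cutoff(median, std, n_trials=N_TCES):
    """Maximum LPP value treated as transit-like: median + sigma * std."""
    return median + sigma_for_one_false_negative(n_trials) * std


if __name__ == "__main__":
    print(round(sigma_for_one_false_negative(N_TCES), 2))
    # Placeholder median and standard deviation, for illustration only.
    print(lpp_cutoff(median=0.0, std=1.0))
```

With NTCEs = 20,367 this assumed form gives a value consistent with the 4.06 quoted above.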

3.2.1.2. The Marshall Metric

A number of long-period FPs are the result of three or more systematic events that happen to be equidistant in time and produce a TCE. There are two prominent types of systematic events in Kepler data: sudden pixel sensitivity dropouts (SPSDs) and step-wise discontinuities. SPSDs are due to cosmic-ray impacts that temporarily reduce the detection sensitivity of the impacted pixels, resulting in a sudden drop in flux followed by an asymptotic rise back to the baseline flux level over a timescale of a few hours (Van Cleve & Caldwell 2009). Step-wise discontinuities are sudden jumps in the baseline flux level, in either the positive or negative flux direction, and are typically due to imperfect detrending, but may have other causes. If a TCE is due to several of these events that are of similar S/N, then they will not be flagged as FPs without examining the shape of their individual events.

In order to detect TCEs due to SPSDs and step-wise discontinuities, we developed the "Marshall" metric (Mullally et al. 2015b). Marshall fits a transit, SPSD, and step-wise discontinuity model to each individual event of a long-period TCE. The Bayesian Information Criterion (BIC; Schwarz 1978) is then used to select which model best fits each individual transit event given each model's number of degrees of freedom. If either the SPSD or step-wise discontinuity model has a lower BIC value than the transit model by a value of 10 or more for a given transit event, then that event is determined to be due to a systematic rather than a transit. After evaluating each individual event, if there are fewer than three events that are determined to be due to transits, then the TCE is dispositioned as a not transit-like FP. This is in agreement with the Kepler mission requirement of detecting at least three valid transits in order to generate a TCE.
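The decision logic described above can be sketched as follows. This is not the Marshall code: the model fitting itself is omitted, the function names are hypothetical, and the BIC is written in its common Gaussian-error form, BIC = χ² + k ln(n), which is an assumption about how the criterion would be evaluated in practice.

```python
import math


def bic(chi2, n_params, n_points):
    """BIC for Gaussian errors with known variance: chi2 + k * ln(n)."""
    return chi2 + n_params * math.log(n_points)


def event_is_transit(bic_transit, bic_spsd, bic_step, margin=10.0):
    """An event is a systematic if the SPSD or step model has a BIC lower
    than the transit model by `margin` or more; otherwise it is a transit."""
    best_systematic = min(bic_spsd, bic_step)
    return (bic_transit - best_systematic) < margin


def marshall_disposition(event_bics, min_transits=3):
    """`event_bics` holds one (transit, SPSD, step) BIC triple per event."""
    n_transits = sum(event_is_transit(*triple) for triple in event_bics)
    return "not transit-like FP" if n_transits < min_transits else "pass"


if __name__ == "__main__":
    # Hypothetical BIC triples for a four-event TCE.
    events = [(100.0, 120.0, 130.0), (100.0, 85.0, 130.0),
              (100.0, 95.0, 99.0), (100.0, 200.0, 50.0)]
    print(marshall_disposition(events))  # two transit events, so an FP
```

In this toy case the second and fourth events are better described by a systematic model by more than 10 in BIC, leaving only two transit events, so the TCE would be dispositioned as a not transit-like FP.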

3.2.2. Previous TCE With Same Period

Most quasi-sinusoidal FPs produce multiple TCEs at the same period, or at integer ratios of each other. If a TCE in a system has been declared as not transit-like due to another test, then it is logical that all of the subsequent TCEs in that system at the same period, or ratios thereof, should also be dispositioned not transit-like. Thus, we match the period of a given TCE to all of the previous not transit-like FPs via Equations (1)–(3). If the current TCE has a period match with σP > 3.25 to a prior not transit-like FP, then it is also dispositioned as a not transit-like FP.

Similarly, some TCEs are produced that correspond to the edge of a previously identified transit-like TCE in the system. This often results when the previous TCE corresponding to a transit or eclipse is not completely removed prior to searching the light curve for another TCE. Thus, we match the period of a given TCE to all of the previous transit-like TCEs via Equations (1)–(3). If the current TCE has a period match with σP > 3.25 to a prior transit-like FP, and the two epochs are separated in phase by less than 2.5 transit durations, then the current TCE is dispositioned as a not transit-like FP. For clarity, we note that it is sometimes possible that the periods of two TCEs will meet the period matching criteria, but be different enough to have their epochs shift significantly in phase over the ∼4 year mission duration. Thus, if they are separated in phase by less than 2.5 transit durations at any point in the mission time frame, the current TCE is dispositioned as a not transit-like FP.

3.2.3. The Model-shift Uniqueness Test

If a TCE under investigation is truly a PC, then there should not be any other transit-like events in the light curve with a depth, duration, and period similar to the primary signal, in either the positive or negative flux directions, i.e., the transit event should be unique in the phased light curve. Many FPs are due to quasi-sinusoidal signals (see Section 2), and thus are not unique in the phased light curve. In order to identify these cases, TCERT developed a "model-shift uniqueness test" and used it extensively for identifying FPs in both the Q1–Q12 (Rowe et al. 2015) and Q1–Q16 (Mullally et al. 2015a) PC catalogs.

See Section 3.2.2 of Rowe et al. (2015) and page 20 of Coughlin (2014) for figures and a detailed explanation of the "model-shift uniqueness test;" in brief, after removing outliers, the best-fit model of the primary transit is used as a template to measure the best-fit depth of the transit model at all other phases. The deepest event aside from the primary (pri) transit event is labeled as the secondary (sec) event, the next-deepest event is labeled as the tertiary (ter) event, and the most positive (pos) flux event (i.e., shows a flux brightening) is labeled as the positive event. The significances of these events (σPri, σSec, σTer, and σPos) are computed assuming white noise as determined by the standard deviation of the light-curve residuals. Also, the ratio of the red noise (at the timescale of the transit duration) to the white noise (FRed) is computed by examining the standard deviation of the best-fit depths at phases outside of the primary and secondary events. When examining all of the events among all of the TCEs, the minimum threshold for an event to be considered statistically significant is given by

Equation (5)

where Tdur is the transit duration and P is the period. (The quantity P/Tdur represents the number of independent statistical tests for a single target.) When comparing two events from the same TCE, the minimum difference in their significances in order to be considered distinctly different is given by

Equation (6)

In the robovetter, we disposition a TCE as a not transit-like FP if ${\sigma }_{{\rm{Pri}}}/{F}_{{\rm{Red}}}\lt {\sigma }_{{\rm{FA}}}$, ${\sigma }_{{\rm{Pri}}}-{\sigma }_{{\rm{Ter}}}\lt {\sigma }_{{\rm{FA}}}^{\prime }$, or ${\sigma }_{{\rm{Pri}}}-{\sigma }_{{\rm{Pos}}}\lt {\sigma }_{{\rm{FA}}}^{\prime }$ for either the DV or alternate detrending. These criteria ensure that the primary event is statistically significant when compared to the systematic noise level of the light curve, the tertiary event, and the positive event, respectively.
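As a minimal sketch of how these three criteria combine, assuming the thresholds σFA and σ′FA have already been evaluated and are passed in as numbers (the function names and dictionary keys are hypothetical):

```python
def fails_uniqueness(sig_pri, sig_ter, sig_pos, f_red, sigma_fa, sigma_fa_prime):
    """True if the primary event is not significant relative to the red noise,
    the tertiary event, or the positive event."""
    return (sig_pri / f_red < sigma_fa
            or sig_pri - sig_ter < sigma_fa_prime
            or sig_pri - sig_pos < sigma_fa_prime)

def is_not_transit_like(detrendings, sigma_fa, sigma_fa_prime):
    """detrendings: one metrics dict per detrending (e.g., DV and alternate).
    Failing on either detrending is enough to flag the TCE."""
    return any(
        fails_uniqueness(m["sig_pri"], m["sig_ter"], m["sig_pos"], m["f_red"],
                         sigma_fa, sigma_fa_prime)
        for m in detrendings
    )
```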

3.2.4. Dominated by Single Event

The depths of individual transits of PCs should be equal to each other, and thus, assuming constant noise levels, the S/Ns of individual transits should be nearly equivalent as well. In contrast, most of the long-period FPs that result from three or more equidistant systematic events are dominated in S/N by one of those events. The Kepler pipeline measures detection significance via the Multiple Event Statistic (MES), which is calculated by combining the Single Event Statistic (SES) of all of the individual events that comprise the TCE—both the MES and SES are measures of S/N. Assuming that all of the individual events have equal SES values,

${\rm{MES}}=\sqrt{{N}_{{\rm{Trans}}}}\cdot {\rm{SES}},\qquad (7)$

where NTrans is the number of transit events that comprise the TCE. Thus, SES/MES = 0.577 for a TCE with three transits, and less for a greater number of transits. If the largest SES value of a TCE's transit events, SESMax, divided by the MES is much larger than 0.577, then this indicates that one of the individual events dominates when calculating the S/N.

In the robovetter, a TCE with a period greater than 90 days is dispositioned as a not transit-like FP if SESMax/MES > 0.9. The value of 0.9 was empirically chosen based on the results of transit injection (Section 5.3) to reject a minimal number of valid planetary candidates, accounting for natural deviations of SES values due to light-curve systematics and changes in local noise levels. The period cutoff of 90 days is applied because short-period TCEs can have a large number of individual transit events, which dramatically increases the chance of one event coinciding with a large systematic feature, thus producing a large SESMax/MES value despite being a valid planetary signal.
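A short sketch of this test, assuming the equal-SES relation MES = √NTrans × SES (which reproduces the quoted value of 0.577 for three transits); the function names are hypothetical:

```python
import math

def expected_ses_over_mes(n_transits):
    """SES/MES when every transit contributes the same SES."""
    return 1.0 / math.sqrt(n_transits)

def dominated_by_single_event(ses_max, mes, period_days,
                              ratio_max=0.9, period_min_days=90.0):
    """True if a long-period TCE has its MES dominated by one event."""
    return period_days > period_min_days and ses_max / mes > ratio_max

print(round(expected_ses_over_mes(3), 3))  # 0.577
```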

3.3. Significant Secondary

If a TCE is deemed to be transit-like by passing all of the tests presented in Section 3.2 on both detrendings, then it is given a KOI number. However, many of these KOIs are FPs due to eclipsing binaries and contamination from nearby variable stars. In order to produce a uniform catalog, we do not designate any TCE as an FP on the basis of its transit depth or inferred radius—see Section 7 item 6 of Mullally et al. (2015a) for more detail. Thus, being agnostic to stellar parameters, the only way to definitively detect an EB via a Kepler light curve is by detecting a significant secondary eclipse. We employ a series of robotic tests to detect secondary eclipses, as shown by the flowchart in Figure 4.

Figure 4. Flowchart describing the significant secondary tests of the robovetter. Diamonds represent "yes" or "no" decisions that are made with quantitative metrics. The multiple arrows originating from "Start" represent decisions that are made in parallel.

3.3.1. Subsequent TCE With Same Period

Once the Kepler pipeline detects a TCE in a given system, it removes the data corresponding to this event and re-searches the light curve. It is thus able to detect the secondary eclipse of an EB as a subsequent TCE, which will have the same period as, but a different epoch than, the primary TCE. Thus, utilizing Equations (1)–(3), the robovetter dispositions a TCE as an FP due to a significant secondary if its period matches a subsequent TCE within the utilized tolerance (σP > 3.25), and they are separated in phase by at least 2.5 times the transit duration. For clarity, we note again that it is sometimes possible that the periods of two TCEs will meet the period matching criteria but be different enough to have their epochs shift significantly in phase over the ∼4 year mission duration. Thus, this phase-separation requirement must be upheld over the entire mission duration in order to disposition the TCE as an FP due to a significant secondary.

Occasionally, the Kepler pipeline will detect the secondary eclipse of an EB at one-half, one-third, or some smaller integer fraction of the orbital period of the system. In these cases, the epoch of the TCE corresponding to the secondary will overlap with that of the primary. These cases are accounted for by not requiring a phase separation of at least 2.5 transit durations when a period ratio other than unity is detected. (Note that Equations (1)–(3) allow for integer period ratios.) While this approach will likely classify any multi-planet system in an exact 2:1 orbital resonance as an FP due to a significant secondary, in practice, such cases do not occur. Exact 2:1 orbital resonances, where "exact" means that the period ratio is close enough to 2.0 over the ∼4 year mission duration to avoid any drift in relative epoch, appear to be extremely rare (Fabrycky et al. 2014). Also, they would produce strong transit timing variations (TTVs), which would likely preclude their detection. The Kepler pipeline employs a strictly linear ephemeris when searching for TCEs, and thus while planets with mild TTVs, e.g., deviations from a linear ephemeris less than the transit duration, are often detected, planets with strong TTVs, e.g., deviations from a linear ephemeris greater than the transit duration, are often not detected.

3.3.2. Secondary Detected in Light Curve

There are many cases when a secondary eclipse does not produce its own TCE, most often when its MES is below the Kepler pipeline detection threshold of 7.1. The model-shift uniqueness test, discussed in Section 3.2.3, is well-suited to automatically detect secondary eclipses in the phased light curve, as it searches for the next two deepest events aside from the primary event. It is thus able to detect the best-candidate secondary eclipse in the light curve and assess its significance. The robovetter dispositions any TCE as an FP due to a significant secondary if all three of the following conditions are met, for either the DV or alternate detrending: σSec/FRed > σFA, ${\sigma }_{{\rm{Sec}}}-{\sigma }_{{\rm{Ter}}}\gt {\sigma }_{{\rm{FA}}}^{\prime }$, and ${\sigma }_{{\rm{Sec}}}-{\sigma }_{{\rm{Pos}}}\gt {\sigma }_{{\rm{FA}}}^{\prime }$ (see Section 3.2.3). These criteria ensure that the secondary event is statistically significant when compared to the systematic noise level of the light curve, the tertiary event, and the positive event, respectively.

There are two exceptions when the above-mentioned conditions are met, but the robovetter does not designate the TCE as an FP. First, if the primary and secondary are statistically indistinguishable, and the secondary is located at phase 0.5, then it is possible that the TCE is a PC that has been detected at twice the true orbital period. Thus, the robovetter labels a TCE with a significant secondary as a PC when ${\sigma }_{{\rm{Pri}}}-{\sigma }_{{\rm{Sec}}}\lt {\sigma }_{{\rm{FA}}}^{\prime }$ and the phase of the secondary is within 1/4 of the primary transit's duration of phase 0.5. Second, hot Jupiter PCs can have detectable secondary eclipses due to planetary occultations via reflected light and thermal emission (Coughlin & López-Morales 2012). Thus, a TCE with a detected significant secondary is labeled as a PC with the significant secondary flag (in order to facilitate the identification of hot Jupiter occultations) when the geometric albedo is less than 1.0, the planetary radius is less than 30 R⊕, the depth of the secondary is less than 10% of the primary, and the impact parameter is less than 0.95. The additional criteria beyond the albedo criterion are needed to ensure that this test is only applied to potentially valid planets and not grazing eclipsing binaries. We calculate the geometric albedo by using the stellar mass, radius, and effective temperature from Huber et al. (2014), and the values of the period and radius ratio from the DV module of the Kepler pipeline.
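The decision logic of this subsection, including both exceptions, can be sketched for a single detrending as follows. The return labels, dictionary keys, and the conversion of the 1/4-duration tolerance into phase units (duration divided by period) are assumptions made for illustration; the thresholds σFA and σ′FA are taken as inputs.

```python
def classify_secondary(m, sigma_fa, sigma_fa_prime):
    """m: dict of metrics for one detrending. Returns one of
    'no_secondary', 'PC_double_period', 'PC_occultation', or 'FP'."""
    significant = (m["sig_sec"] / m["f_red"] > sigma_fa
                   and m["sig_sec"] - m["sig_ter"] > sigma_fa_prime
                   and m["sig_sec"] - m["sig_pos"] > sigma_fa_prime)
    if not significant:
        return "no_secondary"

    # Exception 1: primary and secondary indistinguishable, secondary at phase 0.5.
    duration_phase = (m["duration_hours"] / 24.0) / m["period_days"]
    if (m["sig_pri"] - m["sig_sec"] < sigma_fa_prime
            and abs(m["sec_phase"] - 0.5) <= 0.25 * duration_phase):
        return "PC_double_period"

    # Exception 2: plausible hot Jupiter occultation.
    if (m["albedo"] < 1.0 and m["radius_earth"] < 30.0
            and m["depth_sec"] < 0.10 * m["depth_pri"]
            and m["impact"] < 0.95):
        return "PC_occultation"

    return "FP"
```

In the second case the TCE remains a PC but would still carry the significant secondary flag.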

3.3.3. Odd/Even Depth Difference

If the primary and secondary eclipses of an EB are similar in depth, and the secondary is located near phase 0.5, then the Kepler pipeline may detect them as a single TCE at half the true orbital period of the EB. In these cases, if the primary and secondary depths are dissimilar enough, then it is possible to detect it as an FP by comparing the depths of the odd- and even-numbered transit events. Thus, we compute the following statistic, for both the DV and alternate detrending,

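${\sigma }_{{\rm{OE}}}=\frac{| {d}_{{\rm{odd}}}-{d}_{{\rm{even}}}| }{\sqrt{{\sigma }_{{\rm{odd}}}^{2}+{\sigma }_{{\rm{even}}}^{2}}},\qquad (8)$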

where dodd is the median depth of the odd-numbered transits, deven is the median depth of the even-numbered transits, σodd is the standard deviation of the depths of the odd-numbered transits, and σeven is the standard deviation of the depths of the even-numbered transits. For the alternate detrending with a trapezoidal fit, we use all of the points that lie within ±30 minutes of the central time of transit, as well as any other points within the in-transit flat portion of the trapezoidal fit. For the DV detrending, we use all of the points within ±30 minutes of the central time of transit. (This threshold corresponds to the long-cadence integration time of the Kepler spacecraft. Including points farther away from the central time of transit degrades the accuracy and precision of the test.) If σOE > 1.7 for either the DV or alternate detrending, then the TCE is labeled as an FP due to a significant secondary. The value of 1.7 was empirically derived utilizing manual checks and transit injection.
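A minimal sketch of the odd/even comparison, assuming the statistic is the absolute difference of the median depths divided by the quadrature sum of the two standard deviations, and assuming the sample (N − 1) standard deviation. The inputs are the in-transit depth measurements selected as described above; the function names are hypothetical.

```python
import math
import statistics

def odd_even_sigma(odd_depths, even_depths):
    """Significance of the depth difference between odd and even transits.
    Each input needs at least two values and nonzero combined scatter."""
    d_odd = statistics.median(odd_depths)
    d_even = statistics.median(even_depths)
    s_odd = statistics.stdev(odd_depths)
    s_even = statistics.stdev(even_depths)
    return abs(d_odd - d_even) / math.sqrt(s_odd ** 2 + s_even ** 2)

def has_odd_even_difference(odd_depths, even_depths, threshold=1.7):
    return odd_even_sigma(odd_depths, even_depths) > threshold
```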

3.4. Centroid Offset

Given that Kepler's pixels are 3.98'' square (Koch et al. 2010) and the typical photometric aperture has a radius of 4–7 pixels (Bryson et al. 2010), it is quite common for a given target star to be contaminated by light from another star. If that other star is variable, then that variability will be visible in the target aperture at a reduced amplitude. If the variability due to contamination results in a TCE, then it is an FP, whether the contaminator is an EB, planet, or other type of variable star (Bryson et al. 2013). For example, if a transit or an eclipse occurs on a bright star, then a shallower event will be observed on a nearby, fainter star. Similarly, a star can be mistakenly identified as experiencing a shallow transit if a deep eclipse occurs on a fainter, nearby source.

The DV module of the Kepler pipeline produces "difference images" for each quarter, which are made by subtracting the average flux in each pixel during each transit from the flux in each pixel just before and after each transit (Bryson et al. 2013). If the resulting difference image shows significant flux change at a location (centroid) other than the target, then the TCE is likely an FP due to a centroid offset. In prior catalogs, TCERT members manually examined the difference images to look for evidence of a centroid offset, as fully described in Bryson et al. (2013) and Sections 3.2.3–3.2.6 of Rowe et al. (2015). In this catalog, the search for centroid offsets was fully robotized and confirmed to reproduce the results of earlier catalogs using human vetting (F. Mullally et al. 2016, in preparation).

In our robotic procedure to detect FPs due to centroid offsets, we first check that the difference image for each quarter contains a discernible stellar image and is not dominated by background noise. This is done by searching for at least three pixels that are adjacent to each other and brighter than a given threshold, which is set by the noise properties of the image. We use an iterative σ clipping approach to eliminate bright pixels when calculating the background noise, as the star often dominates the flux budget of a substantial number of pixels in the aperture.
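One way to sketch this check is shown below. The one-sided clipping, the 3σ threshold, the use of the median as the background level, and 4-connectivity for "adjacent" are all assumptions chosen for illustration; the text above does not specify them.

```python
import statistics

def clipped_background(values, n_sigma=3.0, max_iter=10):
    """Iteratively discard bright pixels, then return (median, scatter)."""
    vals = list(values)
    for _ in range(max_iter):
        med = statistics.median(vals)
        sd = statistics.pstdev(vals)
        kept = [v for v in vals if v - med <= n_sigma * sd]
        if len(kept) == len(vals):
            break
        vals = kept
    return statistics.median(vals), statistics.pstdev(vals)

def has_stellar_image(image, n_sigma=3.0, min_pixels=3):
    """True if at least min_pixels mutually adjacent pixels exceed the threshold."""
    bkg, noise = clipped_background([v for row in image for v in row], n_sigma)
    threshold = bkg + n_sigma * noise
    bright = {(r, c) for r, row in enumerate(image)
              for c, v in enumerate(row) if v > threshold}
    seen = set()
    for start in bright:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # flood fill over edge-sharing neighbors
            r, c = stack.pop()
            size += 1
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in bright and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        if size >= min_pixels:
            return True
    return False
```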

For those difference images which are determined to contain a discernible stellar image, we first search for evidence of contamination from sources that are resolved from the target. Since resolved sources near the edge of the image may not be fully captured, Pixel Response Function (PRF—Kepler's point-spread function convolved with the image motion and the intra-pixel CCD sensitivity) fitting approaches do not often work well to detect them. Instead, we check if the location of the brightest pixel in the difference image is more than 1.5 pixels from the location of the target star. If at least two-thirds of the quarterly difference images show evidence of an offset by this criterion, we disposition the TCE as an FP due to a centroid offset. Note that FPs due to stars located many pixels from the target, i.e., far outside the target's image, are not detected by this approach, but rather through ephemeris matching (see Section 3.5).

If no centroid offset is identified by the previous method, we then look for contamination from sources that are unresolved from the target. We measure the PRF-fit centroid of the difference images and search for statistically significant shifts with respect to the PRF centroid of both the out-of-transit images as well as the catalog position of the source. Following Bryson et al. (2013), a TCE is marked as an FP due to a centroid offset if there is a 3σ significant offset larger than 2'', or a 4σ offset larger than 1''. F. Mullally et al. (2016, in preparation) show that when simulated transits are injected at the catalog positions of Kepler stars, these robotic methods result in <1% of valid PCs being marked incorrectly as FPs.

Note that if there are fewer than three difference images with a discernible stellar image, then no tests are performed and the TCE is not declared an FP by the centroid module.
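The order of the centroid tests can be summarized in a short sketch. It is simplified to a single PRF-fit offset and significance (the procedure above compares against both the out-of-transit centroid and the catalog position), it treats the significance cuts as inclusive, and the function and argument names are hypothetical.

```python
def centroid_offset_fp(brightest_pixel_offsets, prf_offset_arcsec, prf_offset_sigma,
                       min_images=3):
    """brightest_pixel_offsets: distance in pixels between the brightest
    difference-image pixel and the target, one value per quarterly difference
    image that contains a discernible stellar image."""
    n_images = len(brightest_pixel_offsets)
    if n_images < min_images:
        return False  # too few usable images: no test is performed

    # Resolved contaminant: brightest pixel > 1.5 pixels away in >= 2/3 of images.
    n_offset = sum(1 for d in brightest_pixel_offsets if d > 1.5)
    if 3 * n_offset >= 2 * n_images:
        return True

    # Unresolved contaminant: significant PRF-fit centroid shift.
    return ((prf_offset_sigma >= 3.0 and prf_offset_arcsec > 2.0)
            or (prf_offset_sigma >= 4.0 and prf_offset_arcsec > 1.0))
```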

3.5. Ephemeris Matching

Another method for detecting FPs due to contamination is to compare the ephemerides (periods and epochs) of TCEs to each other, as well as to other known variable sources in the Kepler field. If two targets have the same ephemeris within a specified tolerance, then at least one of them is an FP due to contamination. Coughlin et al. (2014) used Q1–Q12 data to compare the ephemerides of KOIs to each other and eclipsing binaries known from both Kepler- and ground-based observations. They identified over 600 FPs via ephemeris matching, of which over 100 were not known as FPs via other methods. They also identified four main mechanisms of contamination. The results of Coughlin et al. (2014) were incorporated in Rowe et al. (2015, see Section 3.3). Mullally et al. (2015b, see Section 5.3) slightly modified the ephemeris matching process of Coughlin et al. (2014) and applied it to all of the Q1–Q16 TCEs, as well as to known KOIs and EBs, identifying nearly 1000 TCEs as FPs.

In this Q1–Q17 DR24 catalog, we use the same method as Coughlin et al. (2014), with the modifications of Mullally et al. (2015b, see Section 5.3), to match the ephemerides of all Q1–Q17 DR24 TCEs (Seader et al. 2015) to the following sources.

  • Themselves.
  • The list of 7348 KOIs from the NASA Exoplanet Archive cumulative KOI table after the closure of the Q1–Q16 table and publication of the last catalog (Mullally et al. 2015a).
  • The Kepler EBWG list of 2605 true EBs found with Kepler data as of 2015 March 11 (Prša et al. 2011; Slawson et al. 2011; Kirk et al. 2016).
  • J.M. Kreiner's up-to-date database of ephemerides of ground-based eclipsing binaries as of 2015 March 11 (Kreiner 2004).
  • Ground-based eclipsing binaries found via the TrES survey (Devor et al. 2008).
  • The General Catalog of Variable Stars (GCVS; Samus et al. 2009) list of all known ground-based variable stars, published 2015 February 06.

Through ephemeris matching, we identify 1910 Q1–Q17 DR24 TCEs as FPs. Of these, 189 were identified as FPs only due to ephemeris matching. We list all 1910 TCEs in Table 2, as this information is valuable for studying contamination in the Kepler field. (Note that each TCE identified consists of its KIC ID and planet number, separated by a dash.) We also list in Table 2 each TCE's most likely parent, the period ratio between child and parent (Prat), the distance between the child and parent in arcseconds, the offset in row and column between the child and parent in pixels (ΔRow and ΔCol), the magnitude of the parent (mKep), the difference in magnitude between the child and parent (ΔMag), the depth ratio of the child and parent (Drat), the mechanism of contamination, and a flag to designate unique situations. In Figure 5, we plot the location of each FP TCE and its most likely parent, connected by a solid line. TCEs are represented by solid black points, KOIs are represented by solid green points, EBs found by Kepler are represented by solid red points, EBs discovered from the ground are represented by solid blue points, and TCEs due to a common systematic are represented by open black points. The Kepler magnitude of each star is shown via a scaled point size. Note that most parent-child pairs are so close together that the line connecting them is not easily visible on the scale of the plot.

Figure 5. Distribution of ephemeris matches on the focal plane. Symbol size scales with magnitude, while color represents the catalog in which the contaminating source was found. Blue indicates that the true transit is from a variable star only known as a result of ground-based observations. Red circles are stars listed in the Kepler EBWG catalog, green are KOIs, and black are TCEs. Open black points represent TCEs due to a common systematic. Black lines connect false positive matches with the most likely contaminating parent. In most cases, the parent and child are so close that the connecting line is invisible. Note that FP TCEs due to the common systematic are not connected by lines as they are not due to contamination from a variable source.

Table 2.  The 1910 Q1–Q17 DR24 TCEs Identified as FPs due to Ephemeris Matches

TCE           Parent         Prat  Distance  ΔRow      ΔCol      mKep   ΔMag   Drat        Mechanism   Flag
                                   ('')      (Pixels)  (Pixels)
001295289-01  ...            ...   ...       ...       ...       ...    ...    ...         Systematic  0
002163326-01  ...            ...   ...       ...       ...       ...    ...    ...         Systematic  0
002166206-01  3735.01        1:1   8.3       −1        −2        17.64  −4.34  3.4523E+02  Direct-PRF  0
002297793-01  ...            ...   ...       ...       ...       ...    ...    ...         Systematic  0
002305311-01  002305372-pri  1:1   42.6      2         10        13.82  1.14   6.9390E+03  Direct-PRF  0
002308603-01  ...            ...   ...       ...       ...       ...    ...    ...         Systematic  0
002309585-01  5982.01        1:1   11.7      −2        1         13.93  1.45   6.5525E+00  Direct-PRF  0
002437112-01  3598.01        1:1   19.7      −5        1         17.63  −1.48  7.0495E+02  Direct-PRF  0
002437112-02  3598.01        1:2   19.7      −5        1         17.63  −1.48  8.2520E+02  Direct-PRF  0
002437112-03  3712.01        1:1   15.1      4         1         16.99  −0.84  7.6798E+02  Direct-PRF  0

Note. A suffix of "pri" in the parent name indicates the object is an EB known from the ground and the child TCE matches to its primary. Similarly, a suffix of "sec" indicates the child TCE matches the secondary of a ground-based EB. Parent names are listed, in priority order when available, by (1) their Bayer designation (e.g., RR-Lyr-pri), (2) their EBWG designation (e.g., 002305372-pri), (3) their KOI number (e.g., 3735.01), and (4) their TCE number (e.g., 002437452-01). A flag of 1 indicates that the TCE is a bastard, which are cases where two or more TCEs match each other via the Direct-PRF contamination mechanism, but neither can physically be the parent of the other via their magnitudes, depths, and distances, and thus the true parent has not been identified. A flag of 2 indicates cases of column anomalies that occur on different outputs of the same module. These cases likely involve cross-talk to carry the signal from one output to another. TCEs due to the common systematic do not have information listed for a parent source, as they are not caused by a single parent.

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

The larger number of matches compared to the Q1–Q12 and Q1–Q16 catalogs is predominately the result of a much larger short-period FP population compared to Q1–Q16, and an extended baseline compared to Q1–Q12, coupled with matching all TCEs and not just KOIs. In Q1–Q17 DR24, we identify an additional contamination mechanism, which we label "Common Systematic." As mentioned in Section 2, these are over 200 TCEs that are caused by 3 systematic events that are common to all Kepler CCDs and happen to be equidistant in time with a spacing of ∼459 days.

We also identify 119 examples of "Column Anomaly," which is a previously identified mechanism where a parent is able to contaminate a child at large distances if they both lie on the same column of a CCD. This mechanism is particularly pernicious because it does not result in a visible centroid offset; the apparent location of the transit signal via the difference images coincides with the target. If the parent is not observed by Kepler, then the child could go undetected as an FP due to the column anomaly, as was recently the case for KOI 6705.01 (Gaidos et al. 2016). The large number of examples of column anomaly now available in the Q1–Q17 DR24 catalog reveals the following.

  • Despite equally searching for matches in row and column, no instance of "row-anomaly" has been found to occur.
  • The CCDs are read out in the column direction.
  • In 91.6% of cases, the child is at a higher row number than the parent, and thus the parent's pixels are read out before the child's. (The remaining 8.4% of cases may not have the true parent identified, but rather a sibling, as only the most likely parent is listed, and many parents are unobserved by the spacecraft.)
  • Most cases show that the depth of the child increases over time.
  • The effect appears to exhibit seasonal depth variations in most cases.
  • The average depth ratio between parent and child is a factor of $\sim {10}^{4}$, and typically the parent and child have similar magnitudes.

Combining these details leads to our conjecture that the column anomaly is due to decreasing charge transfer efficiency over time, likely due to cosmic-ray impacts. When the CCD is read out, some charge from the parent is left behind due to charge transfer inefficiency. As the child is read out and its electrons pass through the pixels where the parent was, the child picks up some of the parent's left-behind electrons. Thus, the variable signal from the parent is induced in the child. As more cosmic-ray impacts accumulate over time, the amount of charge left behind by the parent increases, resulting in an increase in contamination, and thus an increase in the observed depth of the child. Seasonal variation is seen as the parent and child rotate between four CCDs with season, and the amount of degradation varies with CCD. The average depth ratio, along with the delta magnitudes observed, indicate that a charge transfer efficiency of ∼99.99% is consistent with the observed contamination, i.e., a degradation of ∼0.01%. This is well within the range observed on Hubble's Advanced Camera for Surveys and other space-based detectors (see Section 3.7 of Sirianni et al. 2005 and references therein).
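The arithmetic behind the quoted efficiency can be checked with a back-of-the-envelope sketch. It assumes the simplest case, in which parent and child have similar magnitudes, so that the child-to-parent depth ratio approximates the fraction of the parent's charge left behind; the function name is hypothetical.

```python
def implied_cte(parent_to_child_depth_ratio):
    """Charge transfer efficiency implied by a parent/child depth ratio,
    under the equal-magnitude simplification described above."""
    fraction_left_behind = 1.0 / parent_to_child_depth_ratio
    return 1.0 - fraction_left_behind

print(f"{100.0 * implied_cte(1.0e4):.2f}%")  # 99.99%
```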

4. TCE DISPOSITIONING AND KOI MODELING

The robovetter was run on all 20,367 Q1–Q17 DR24 TCEs. In the following subsections, we describe the process of preparing the input for the robovetter, federating old and designating new KOI numbers, and modeling the KOIs to obtain planetary parameters with robust uncertainties.

4.1. The Robovetter Input

In Table 3, we list each of the 20,367 Q1–Q17 DR24 TCEs and all of the parameters that were used by the robovetter. These include the following:

  • the period of the TCE in days, epoch in Barycentric Kepler Julian Date (BKJD), and duration of the transit in hours, all from the DV module of the Kepler pipeline;
  • the MES, and maximum SES value used in determining the MES, from the Kepler pipeline;
  • the LPP value using the DV detrending (LPPDV) and the alternate detrending (LPPAlt);
  • the value of the Marshall metric;
  • the odd/even depth statistic (σOE) for both the DV and alternate detrendings;
  • the metrics produced from the model-shift uniqueness test (see Section 3.2.3) for both the DV and alternate detrending;
  • the radius of the planet in R⊕, calculated by multiplying the radius ratio from the DV module of the Kepler pipeline and the stellar radius value from Huber et al. (2014);
  • the albedo (A), primary depth (Dpri), secondary depth (Dsec), and secondary's phase (Phsec) for the DV and alternate detrendings (see Section 3.3.2);
  • the disposition value from the centroid module (a value of 1 indicates that a significant centroid offset was detected, while a value of 0 indicates no offset).

Table 3.  The Robovetter Input Parameters for the 20,367 Q1–Q17 DR24 TCEs

TCE           Period     Epoch       Duration  SESMax     MES        LPPDV      LPPAlt     Marshall   ${\sigma }_{{\rm{OE,DV}}}$
              (Days)     (BKJD)      (hr)
000757450-01  8.884923   134.452041  2.078     1.130e+02  5.240e+02  2.370e-04  4.100e-05  0.000e+00  0.000e+00
000892667-01  2.262112   132.171131  7.509     3.890e+00  8.037e+00  4.608e-03  1.884e-03  0.000e+00  6.794e-02
000892772-01  5.092598   133.451376  3.399     3.810e+00  1.562e+01  1.337e-03  1.081e-03  0.000e+00  1.692e-01
001026032-01  8.460442   133.774329  4.804     4.957e+02  3.889e+03  4.660e-04  8.300e-05  0.000e+00  1.244e-01
001026032-02  4.230222   133.998093  4.606     1.704e+02  1.440e+03  3.030e-04  3.900e-05  0.000e+00  0.000e+00
001026133-01  1.346292   132.841605  1.626     4.530e+00  1.051e+01  3.540e-03  6.126e-03  0.000e+00  6.815e-02
001026133-02  2.691910   132.267127  5.530     3.320e+00  1.135e+01  4.856e-03  7.933e-03  0.000e+00  6.316e-02
001026957-01  21.761298  144.779125  1.277     2.383e+01  1.034e+02  2.570e-04  1.230e-04  0.000e+00  1.615e-01
001028018-01  0.614378   131.652061  1.448     6.850e+00  1.281e+01  6.614e-03  6.647e-03  0.000e+00  8.698e-04
001160891-01  0.940463   132.400156  3.354     4.010e+00  1.203e+01  7.627e-03  2.202e-03  0.000e+00  7.456e-03

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.

4.2. KOI Federation and New KOI Designation

Transit-like signals found over the course of the Kepler mission are given KOI numbers in order to facilitate the tracking of these objects over multiple runs of the Kepler pipeline. Using the same procedure as in Mullally et al. (2015a, see Section 4.1), which examines the number of overlapping in-transit cadences between two ephemerides, we federate 5992 Q1–Q17 DR24 TCEs to existing KOIs.

Given that there were 7348 KOIs known prior to the Q1–Q17 DR24 pipeline run, this indicates, at first glance, an 81.5% KOI recoverability rate. Unrecovered KOIs can be planets in systems with large TTVs, or transit-like systems in regions of parameter space that are affected by Kepler pipeline changes (see Section 2). However, some unrecovered KOIs could have been dispositioned as not transit-like FPs after being promoted to KOI status in previous catalogs, and thus are rightfully not recovered by the pipeline due to additional data and improvements in the data processing and detection algorithms. (As a rule, once a KOI number is designated, it is never removed from the catalog.) Thus, given that there were 6491 transit-like KOIs known prior to the Q1–Q17 DR24 pipeline run, of which 5854 federated, this indicates a 90.2% transit-like KOI recoverability rate.
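The two recoverability rates follow directly from the counts quoted above, as this short check illustrates (the function name is hypothetical):

```python
def recovery_rate(n_federated, n_known):
    """Percentage of previously known KOIs that federate with a DR24 TCE."""
    return 100.0 * n_federated / n_known

print(f"{recovery_rate(5992, 7348):.1f}%")  # all KOIs: 81.5%
print(f"{recovery_rate(5854, 6491):.1f}%")  # transit-like KOIs: 90.2%
```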

Using the Q1–Q17 DR24 robovetter dispositions, we assign new KOI numbers to nearly all of the transit-like TCEs (i.e., those that were not designated not transit-like) that did not federate with previously known KOIs. New KOIs on stars that had previously associated KOIs were given the same base KOI number with the next-highest unused planet number. New KOIs on stars that did not have any previously associated KOIs were given numbers of 6252.01 and higher. The only TCEs deemed to be transit-like by the robovetter that did not receive KOI numbers were the 25 systems listed in Table 4. These systems were so complicated or unusual (e.g., extreme TTV systems, circumbinary planets, seasonally dependent contamination, severe detrending issues) that the resulting TCEs did not accurately correspond to the underlying transit-like signal. In total, we created 1478 new KOIs, thus extending the total number of KOIs from all of the KOI catalogs to 8826. Note that while developing the robovetter, some KOI numbers were assigned to TCEs that were initially dispositioned as transit-like but, due to code changes, were later dispositioned as not transit-like, and thus there are some new KOIs in this catalog that are dispositioned as not transit-like FPs.

Table 4.  The 25 Anomalous TCEs that were Deemed Transit-like by the Robovetter but Were Not Assigned KOI Numbers

Q1–Q17 DR24 TCE
002157247-01
003098184-01
003650049-01
004247023-01
004384675-04
005983532-01
006462874-01
006462874-02
006762829-03
006762829-04
006964043-01
007024511-01
007918172-01
007918172-02
008009496-01
008414907-01
008435232-01
009032900-02
009902856-01
009957659-01
010223616-01
010743597-04
011513441-01
012644769-03
012644774-01

In Table 5, we list all 20,367 TCEs, their assigned KOI numbers (if transit-like), their robovetter dispositions (PC or FP), the values of the four major flags (as described in Section 3), and a comment field that has mnemonic flags that describe the result of each individual robovetter test. Detailed descriptions for each mnemonic flag are located in Appendix B. Note that every FP will have at least one major flag set, and could have any combination of all four. When both the not transit-like and significant secondary flags are set, it indicates that the TCE corresponds to the secondary eclipse of a system (e.g., TCE 001026032-02 in Table 5). While we do not assign new KOI numbers to TCEs that are dispositioned as secondary eclipses in this catalog, there are pre-existing KOIs that federate with Q1–Q17 DR24 TCEs dispositioned as secondary eclipses. PCs will not have any major flags set, unless the system is a hot Jupiter with a visible secondary eclipse due to planetary reflection and/or thermal emission, in which case the significant secondary flag will be set. This information is also publicly available at the NASA Exoplanet Archive in the Q1–Q17 DR24 KOI table.

Table 5.  The Robovetter Dispositions, Major Flags, and KOI Numbers for the 20,367 Q1–Q17 DR24 TCEs

TCE KOI Disp N S C E Comments
000757450-01 0889.01 PC 0 0 0 0 ...
000892667-01 ... FP 1 0 0 0 LPP_DV_TOO_HIGH
000892772-01 1009.01 FP 0 0 1 0 CLEAR_APO
001026032-01 6252.01 FP 0 1 0 0 SIG_SEC_IN_DV_MODEL_SHIFT—SIG_SEC_IN_ALT...
001026032-02 ... FP 1 1 0 0 THIS_TCE_IS_A_SEC
001026133-01 ... FP 1 0 0 0 LPP_DV_TOO_HIGH—LPP_ALT_TOO_HIGH—ALT_SI...
001026133-02 ... FP 1 0 0 0 LPP_DV_TOO_HIGH—LPP_ALT_TOO_HIGH—ALT_SI...
001026957-01 0958.01 PC 0 0 0 0 KIC_OFFSET
001028018-01 ... FP 1 0 0 0 LPP_DV_TOO_HIGH—LPP_ALT_TOO_HIGH—EYEBALL...
001160891-01 ... FP 1 0 0 0 LPP_DV_TOO_HIGH—DV_SIG_PRI_OVER_FRED_TOO...

Note. For the four major flags, not transit-like is abbreviated as "N," significant secondary is abbreviated as "S," centroid offset is abbreviated as "C," and ephemeris match is abbreviated as "E." The mnemonic flags in the comments column are separated by dashes and described in Appendix B.

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.
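As an illustration of how the columns of Table 5 fit together, the following sketch parses one row in the printed layout shown above. It assumes whitespace-separated fields, "..." for an empty field, and mnemonic flags joined by the long dash (U+2014) seen in the printed comments; the machine-readable file may be laid out differently, and the function name `parse_table5_row` is hypothetical.

```python
def parse_table5_row(line):
    """Parse one whitespace-separated row in the printed Table 5 layout."""
    parts = line.split()
    # "..." marks an empty KOI or comment field in the printed table.
    koi = None if parts[1] == "..." else parts[1]
    flags = dict(zip("NSCE", (int(x) for x in parts[3:7])))
    raw = parts[7] if len(parts) > 7 else "..."
    # Mnemonic flags are joined by a long dash (U+2014) in the printed table.
    comments = [] if raw == "..." else raw.split("\u2014")
    return {
        "tce": parts[0],
        "koi": koi,
        "disp": parts[2],
        "flags": flags,
        "comments": comments,
        # N and S both set means the TCE is the secondary eclipse of a system.
        "is_secondary_eclipse": flags["N"] == 1 and flags["S"] == 1,
    }


row = parse_table5_row("001026032-02 ... FP 1 1 0 0 THIS_TCE_IS_A_SEC")
print(row["koi"], row["is_secondary_eclipse"], row["comments"])
```

Note that the disposition cannot be derived from the flags alone, since a PC may carry the significant secondary flag when the secondary is attributed to a hot Jupiter; the sketch therefore reads the disposition column directly.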


4.3. KOI Modeling

In order to obtain transit-model fits with robust uncertainties, we model every KOI in the same manner as described in Rowe et al. (2015, see Section 5) and Mullally et al. (2015b, see Section 6.2). To summarize briefly, we fit all of the available PDC data from DR24 at MAST after detrending via a polynomial filter as described in Section 4 of Rowe et al. (2014). We use the transit model of Seager & Mallén-Ornelas (2003), assuming a circular orbit, with the quadratic limb-darkening law of Claret & Bloemen (2011), calculated for the Kepler passband. Uncertainties are calculated using a Markov Chain Monte Carlo (MCMC; Ford 2005) method with four chains of 10^5 fits each, discarding the first 20% of each chain, to construct the posterior distributions. The transit-model fit parameters are then combined with the stellar parameters to produce planetary parameters. The MCMC chains are publicly available and documented in Rowe & Thompson (2015).
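The burn-in step can be sketched as follows for a single fit parameter. This is a minimal illustration, not the fitting code of Rowe et al.; the function name `summarize_chains` is hypothetical, and the choice of the median with a central 68.27% interval as the reported value and 1σ uncertainties is an assumption, since the text does not state how the posterior is summarized.

```python
import statistics


def summarize_chains(chains, burn_fraction=0.2):
    """Pool MCMC chains after dropping the first burn_fraction of each chain.

    Returns (median, lower_1sigma, upper_1sigma) for one parameter, where the
    1-sigma bounds are taken from the 15.87th and 84.13th percentiles
    (an assumed convention, not confirmed by the source).
    """
    pooled = []
    for chain in chains:
        start = int(len(chain) * burn_fraction)   # discard the burn-in samples
        pooled.extend(chain[start:])
    pooled.sort()

    def quantile(q):
        # Simple nearest-rank quantile on the sorted pooled samples.
        return pooled[min(len(pooled) - 1, int(q * len(pooled)))]

    median = statistics.median(pooled)
    return median, median - quantile(0.1587), quantile(0.8413) - median


# Four toy chains whose first 20% (two samples) are unconverged outliers.
toy = [[999, 999, 1, 2, 3, 4, 5, 6, 7, 8] for _ in range(4)]
print(summarize_chains(toy))
```

With four chains of 10^5 fits each, the same call would retain 8 × 10^4 samples per chain, or 3.2 × 10^5 pooled samples.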

KOIs that existed prior to Q1–Q17 DR24 were not re-fit in this work, and thus use stellar values from the Q1–Q16 stellar catalog (Huber et al. 2014) and contain values for their fit parameters identical to those given in Mullally et al. (2015a). Newly designated KOIs are fit using the DR24 light curves and use stellar values from the updated Q1–Q17 DR24 stellar catalog (Huber 2014). The best-fit value and 1σ uncertainties of each parameter are listed at the NASA Exoplanet Archive, along with the MCMC chains themselves. Note that not all KOIs could be modeled, which typically occurs when the polynomial filter (a separate detrending used specifically for the MCMC fitting) does not recover the transit events with sufficient S/N. These cases are designated in the KOI catalog by a value of "none" for the "fittype" parameter, and only the period, epoch, and duration of the federated TCE are reported.

5. ANALYSIS OF THE Q1–Q17 DR24 CATALOG

In order to be confident that the robovetter is properly reproducing the results of human TCERT members, it is informative to compare the Q1–Q17 DR24 KOI catalog to past KOI catalogs. Also, there are several ancillary Kepler catalogs that provide valuable checks on the quality of the KOI catalog. The injection of artificial transits into the Kepler pixel-level data also provides a valuable diagnostic of the performance of the robovetter and the completeness of the KOI catalog. Examining the results with respect to single- and multi-planet systems is yet another check to ensure the fidelity of the catalog. Finally, detecting potentially rocky planets that are possibly in the habitable zone of their host star is Kepler's primary science goal, and as such those candidates are given extra scrutiny.

5.1. Comparison to Past KOI Catalogs

Of the 5854 transit-like KOIs that existed prior to the Q1–Q17 DR24 activity and were detected as TCEs by the Q1–Q17 DR24 Kepler pipeline, 5700 were dispositioned by the robovetter as transit-like, yielding a 97.4% transit-like KOI recoverability rate for the robovetter. Similarly, of the 3772 pre-existing PCs that were re-detected, 3654 were dispositioned as PCs, yielding a 96.9% robovetter PC recoverability rate. Finally, of the 2220 pre-existing FPs that were re-detected, 1983 were dispositioned by the robovetter as FPs, yielding an 89.3% robovetter FP recoverability rate.

Compared to past catalogs, the dispositions of 118 KOIs changed from PC to FP, and 237 went from FP to PC. Examining these KOIs, we note that many changed dispositions due to the robovetter out-performing the human vetters. For example, the robovetter reliably detects very small secondary eclipses that the humans tended to miss. Also, the robovetter does not declare FPs based on transit depth alone, which was a directive given to the human vetters, but not followed by all of the vetters. Thus, the Q1–Q17 DR24 catalog contains more PCs with very deep depths compared to previous catalogs. We also note that the Q1–Q6 and Q1–Q8 catalogs were not solely based on the TCE list from the Kepler pipeline, and included KOIs found by other transit search techniques as well as manual light curve inspection.

5.2. Comparison to Ancillary Kepler Catalogs

5.2.1. The Eclipsing Binary Working Group Catalog

The EBWG catalog is the result of years of effort by EBWG to identify and classify every EB observed by Kepler (Prša et al. 2011; Slawson et al. 2011; Kirk et al. 2016), and provides a valuable test of the efficiency of the robovetter in detecting EBs. We searched the EBWG catalog for systems with visible secondaries, since the robovetter is designed to only disposition EBs as FPs if a distinct, significant secondary event is detected. (FPs are purposely not designated based on depth or inferred size alone; see Section 3.3.) At the time of closing the Q1–Q17 DR24 KOI table, there were 933 detached eclipsing binaries in the EBWG catalog with a distinct secondary eclipse, as defined by an EBWG morphology parameter of less than 0.6 (Matijevič et al. 2012) and a secondary eclipse that is either offset from phase 0.5 by at least 0.01 phase or has a depth at least 10% different than the primary.
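The selection criterion for a distinct secondary eclipse can be written as a small predicate. This is a sketch under stated assumptions: the function and argument names are hypothetical, and the 10% depth difference is interpreted here as relative to the primary depth, which the text does not specify explicitly.

```python
def has_distinct_secondary(morphology, secondary_phase, primary_depth, secondary_depth):
    """Apply the EBWG-based selection described in the text.

    morphology      : EBWG morphology parameter (detached systems are < 0.6)
    secondary_phase : orbital phase of the secondary eclipse (primary at 0)
    primary_depth, secondary_depth : eclipse depths in the same units
    """
    detached = morphology < 0.6
    # Secondary offset from phase 0.5 by at least 0.01 in phase.
    phase_offset = abs(secondary_phase - 0.5) >= 0.01
    # Secondary depth differs from the primary by at least 10% (assumed to be
    # measured relative to the primary depth).
    depth_differs = abs(secondary_depth - primary_depth) >= 0.10 * primary_depth
    return detached and (phase_offset or depth_differs)


# A detached EB whose secondary sits at phase 0.5 but is half as deep qualifies.
print(has_distinct_secondary(0.3, 0.5, 1000.0, 500.0))
```

The `or` between the two secondary conditions matters: an equal-depth binary on a circular orbit fails both, which is exactly the case that can be detected at half the true orbital period.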

Of these 933, 894 are detected as TCEs by the Q1–Q17 DR24 Kepler pipeline, yielding a Kepler pipeline EB detection efficiency of 95.8%. The 39 that were not detected appear to have either (1) very low S/N, (2) very short periods and shapes such that the harmonic remover may have suppressed their signal, or (3) extremely long periods such that fewer than three primary transits are visible, as is required for a TCE detection. Of the 894 that were detected as TCEs, the robovetter designates 805 as FPs specifically due to a significant secondary, yielding a robovetter EB detection rate of 90.0%. Of the 89 that the robovetter did not explicitly label EB, 40 were labeled not transit-like FPs, principally by the LPP metric, thus still yielding a robovetter FP detection rate of 94.5%. The remaining 49 systems were principally called PCs due to either (1) detrending that significantly suppressed the depth of the secondary or (2) detection by the Kepler pipeline at half the orbital period, with the resulting odd/even difference not detected by the robovetter.
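The chain of rates quoted in this paragraph follows directly from the counts, as the short calculation below shows. The variable names are illustrative only; all numbers are taken from the text.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


in_ebwg_sample = 933            # detached EBs with a distinct secondary
detected_as_tce = 894           # of those, detected by the Kepler pipeline
flagged_secondary = 805         # robovetter FP due to significant secondary
flagged_not_transit_like = 40   # robovetter FP via the not transit-like flag

pipeline_efficiency = percent(detected_as_tce, in_ebwg_sample)
eb_detection_rate = percent(flagged_secondary, detected_as_tce)
fp_detection_rate = percent(flagged_secondary + flagged_not_transit_like, detected_as_tce)
called_pc = detected_as_tce - flagged_secondary - flagged_not_transit_like

print(pipeline_efficiency, eb_detection_rate, fp_detection_rate, called_pc)
```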

We also note that EBWG often draws upon the results of the TCERT vetting from each catalog, and after performing their own vetting procedure, may incorporate them into the EBWG catalog. Prior to closing the Q1–Q17 DR24 KOI table, we found that the robovetter had identified several hundred TCEs as on-target EBs that were not yet cataloged by EBWG. The list of these potentially new EBs was sent to EBWG, who then incorporated many of them into the EBWG catalog, prior to performing the comparison above.

5.2.2. The False Positive Working Group Catalog

The False Positive Working Group (FPWG) is manually vetting every KOI previously identified as an FP, along with a subset of PCs, to create the FPWG catalog (Bryson et al. 2015). Unlike TCERT, FPWG takes a best-knowledge approach, using any and all available pieces of information to vet each KOI, including follow-up observations. This also includes designating FPs on the basis of transit depth, or inferred planetary radius, alone. We select the 1346 certified FPs from the FPWG table, at the time of closing the Q1–Q17 DR24 KOI table, which federate to Q1–Q17 DR24 TCEs and have inferred planetary radii of less than 25 R⊕. Of the 1346, the robovetter designates 1253 as FPs, yielding a 93.1% rate of agreement.

5.2.3. The Kepler Autovetter

Another ancillary catalog generated for the Q1–Q17 DR24 activity is the "autovetter" catalog (Catanzarite 2015; McCauliff et al. 2015), which uses a random forest machine learning approach to automatically classify TCEs based on training sets from previous KOI catalogs, using metrics from both DV and TCERT. It classifies each TCE into one of three categories: PC, astrophysical false positive (AFP), or non-transiting phenomenon (NTP). It defines PCs as TCEs that are consistent with a transiting planet, AFPs as TCEs that are due to detached or contact eclipsing binaries, pulsating stars, starspots, and other periodic signals of astrophysical origin, and NTPs as TCEs that are of instrumental or systematic origin.

There are 3900 Q1–Q17 DR24 TCEs that the autovetter labels as PCs, of which the robovetter designates 3775 as PCs, for an agreement rate of 96.8%. There are 16,467 TCEs that the autovetter labels AFPs or NTPs, of which the robovetter designates 15,944 as FPs, for an agreement rate of 96.8%. However, it is difficult to compare the AFP and NTP categories to any of the four major robovetter FP flags, as the robovetter considers contact eclipsing binaries, pulsating stars, starspots, and other quasi-sinusoidal signals, along with instrumental noise, to be not transit-like, while the autovetter only considers instrumental noise to be non-transiting phenomena.

5.2.4. Planet Hunters

As part of the Zooniverse citizen science platform (Simpson et al. 2014), Planet Hunters (PH) is a project where humans visually check Kepler light curves to search for transit signals, especially those not detected by the Kepler pipeline. We compiled a list of 63 PCs published by Planet Hunters (Fischer et al. 2012; Lintott et al. 2013; Schwamb et al. 2013; Wang et al. 2013; Schmitt et al. 2014a, 2014b) and compare them to the Q1–Q17 DR24 KOI Catalog. Of the 63, 38 were detected as TCEs at the period identified by PH and were dispositioned as PC, 4 were detected as TCEs at the period identified by PH but dispositioned as FP, 8 had TCEs detected around the same target but not at the period identified by PH, and 13 had no TCEs detected around the target. Of the four that were identified at the same period but were declared FPs by the robovetter, one was deemed not transit-like due to the Marshall metric, one was deemed to have a secondary eclipse by the model-shift test on the DV detrending, and the other two were deemed to have centroid offsets. The remaining PH candidates appear to mostly be planets around binary stars and in multi-planet systems with strong TTVs, or have very long periods such that three transits may not be visible, and thus are not expected to be detected as TCEs by the Kepler pipeline. However, these systems are extremely interesting scientifically, and so the PH work highlights the importance of manual inspection in a data set as rich and complex as that from Kepler.

5.2.5. Confirmed Planets

The NASA Exoplanet Archive designates some KOIs as "confirmed planets" based on the results of follow-up observations published in the literature. The follow-up observations may directly determine the mass of the planet via radial-velocity measurements, statistically validate the planet by fully characterizing the host star and any possible nearby sources of contamination, or in any other way demonstrate evidence for a planetary origin of the transit signal at the ∼99% confidence level. The designation of confirmed planets by the Exoplanet Archive is completely independent of the PC/FP disposition given by TCERT, which is based solely on Kepler data.

Of the 985 confirmed Kepler planets that were listed in the NASA Exoplanet Archive, at the time of closing the Q1–Q17 DR24 KOI table, and that federate with Q1–Q17 DR24 TCEs, the robovetter designates 976, or 99.1%, as PCs. Of the nine confirmed planets that were designated as FPs, two were dispositioned not transit-like, four were dispositioned as having significant secondaries, and three were dispositioned as having a centroid offset. For the two FPs due to being not transit-like, one failed due to the LPP test and the other due to the model-shift uniqueness test, in both cases using the DV detrending. Upon manual inspection, these transit signals seem to be distorted in the DV detrending so that they no longer appear to be transit-like, probably because of DV's harmonic remover. For the four FPs due to significant secondaries, two appear to be caused by poor detrending that mimicked the appearance of a secondary, and two are due to remaining systematics from strong TTVs. For the three FPs due to centroid offsets, two appear to be due to systematics resulting from strong TTVs in multi-planet systems, and the other one is due to the very large proper motion of the target, which is a late-type M dwarf.

In all nine cases, we conclude that these confirmed planets should have been dispositioned as PCs. While we will strive to further improve the robovetter to disposition these confirmed systems correctly, overall, these nine systems are outliers with no consistent cause, and the robovetter is very efficiently (>99%) dispositioning confirmed planets as PCs.

5.3. Artificial Transit Injection

The primary method of measuring the efficiency of a transit detection pipeline is to inject artificial transit signals into the calibrated pixel-level data, with a range of parameters, and determine what fraction are detected as a function of those parameters. For the Kepler pipeline, Christiansen et al. (2013a) measured the detection efficiency of individual transit events, finding that they were generally recovered to a high fidelity of ∼99.7%. Christiansen et al. (2015) then extended this work by injecting full time-series transit signals into a year of Kepler data, and were able to map out the actual Kepler pipeline recoverability rate as a function of MES, which is crucial for accurately determining planet occurrence rates (Burke et al. 2015).

In Q1–Q17 DR24, artificial transits were injected into the entire Kepler data set at the pixel level, one injected transiting planet signal per star, with periods between 0.5 and 500 days and planetary radii between 0.25 and 20 R⊕ (Christiansen 2015). A small fraction of these were purposely injected up to ∼10'' away from the target star in order to simulate FPs due to a centroid offset. The exact same version of the Kepler pipeline that produced the Q1–Q17 DR24 TCEs was used to search this injected data set. In total, there were 42,264 injected signals detected by the Kepler pipeline, which we refer to as "injTCEs."

We disposition the injTCEs with the exact same version of the robovetter that was used to disposition the Q1–Q17 DR24 TCEs (Coughlin 2015b). In Table 6, we list each injTCE via its KIC number, along with the resulting robovetter disposition, major flags, and all injected and recovered parameters. In Figure 6, we plot the fraction of on-target injTCEs that were labeled as PCs by the robovetter (the PC recovery fraction) as functions of their MES, period, planetary radius, planetary insolation flux, stellar radius, and stellar temperature.


Figure 6. Fraction of injected transit signals, recovered by the Kepler pipeline (i.e., injTCEs), that were labeled as PCs by the robovetter. White areas represent bins where no injTCEs were detected. Top left: the PC recovery fraction as a function of MES. Top right: the PC recovery fraction as a function of period. Middle left: the PC recovery fraction as a function of period and MES. Middle right: the PC recovery fraction as a function of period and planet radius. Bottom left: the PC recovery fraction as a function of host star radius and temperature. Bottom right: the PC recovery fraction as a function of planet radius and insolation flux. Note that the insolation flux was calculated via S = (Teq/255)^4, where S is the insolation flux relative to the Earth, Teq is the equilibrium temperature of the planet in Kelvin as calculated by the Kepler pipeline, and 255 K is the Earth's equilibrium temperature.
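The conversion between equilibrium temperature and insolation flux used for the figure is a fourth-power scaling about Earth's 255 K. A minimal sketch, with hypothetical function names, is:

```python
def insolation_from_teq(teq_kelvin, earth_teq=255.0):
    """Insolation flux relative to Earth from an equilibrium temperature in K."""
    return (teq_kelvin / earth_teq) ** 4


def teq_from_insolation(insolation, earth_teq=255.0):
    """Inverse relation: equilibrium temperature in K for a relative insolation."""
    return earth_teq * insolation ** 0.25


# Doubling the equilibrium temperature raises the insolation by a factor of 16.
print(insolation_from_teq(510.0))
```

Because of the fourth power, a 5% uncertainty in the equilibrium temperature corresponds to roughly a 20% uncertainty in the insolation flux.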


Table 6.  Injected TCEs, Robovetter Dispositions, and Significant Parameters

KIC Disp N S C E Skygroup injPeriod (days) injEpoch (BKJD) injDepth (ppm) injDuration (hr)
1701692 PC 0 0 0 0 71 357.1302 54933.9512 1224 7.31
1719026 PC 0 0 0 0 84 92.2382 54912.5810 120 7.79
1719262 PC 0 0 0 0 84 267.0962 55037.5158 164 10.29
1719371 PC 0 0 0 0 84 96.4779 54934.2982 1324 3.43
1719472 PC 0 0 0 0 84 287.7573 55174.7539 533 12.04
1719550 FP 0 1 0 0 84 80.8994 54908.5248 238 5.75
1719927 PC 0 0 0 0 84 282.5501 54980.8943 5009 10.58
1720670 PC 0 0 0 0 84 84.3903 54926.2464 1117 5.24
1721110 PC 0 0 0 0 84 3.2179 54900.7105 70 5.02
1721133 PC 0 0 0 0 84 132.4065 54911.5644 1779 8.53

Only a portion of this table is shown here to demonstrate its form and content. A machine-readable version of the full table is available.


The robovetter dispositioned 34,210 of the 35,917 injTCEs without centroid offsets as PCs, yielding a 95.25% pass rate. Examining Figure 6, specifically the top left panel, it can be seen that the PC recovery fraction increases with increasing MES. (Note that the Kepler pipeline has a minimum detection threshold of 7.1 MES, and very few transit signals were injected with MES greater than 100.) While very low MES detections pass ∼90% of the time, the highest MES detections pass ∼98% of the time, as the vetting metrics become more reliable at higher MES values. Examining the top right panel, the PC recovery fraction increases with decreasing period. (Note that no signals were injected with periods greater than 500 days.) These two trends can also be seen in the middle left panel where the PC recovery fraction is shown as a function of both period and MES, and in the middle right panel where the PC recovery fraction is shown as a function of planet radius and period. The bottom left panel indicates that planets around higher-temperature and more evolved stars, particularly the instability strip at ∼7500 K, may also have decreased PC recovery fractions compared to cooler, main-sequence stars, likely due to increased systematic noise from stellar pulsation that is not fully corrected by either of the two detrendings employed by TCERT.

Finally, we examine the PC recovery fraction of on-target injTCEs with radius (Rp) and insolation flux (Sp) values within 25% of Earth's values (0.75 < Rp < 1.25 R⊕ and 0.75 < Sp < 1.25 S⊕). There are 118 on-target injTCEs that meet these Rp and Sp criteria, of which 116 are designated as PCs by the robovetter, therefore yielding a 98.3% PC recovery fraction. This can be seen graphically in the bottom right panel of Figure 6, where the area around Earth's values (1.0 R⊕, 1.0 S⊕) shows a very high PC recovery fraction. If we add the additional constraint that the host star's effective temperature (T) is within 500 K of the Sun's (5300 < T < 6300 K), in addition to the previous radius and insolation flux constraints, then the TCERT detection efficiency is 96.1%, as 49 of 51 on-target injTCEs that meet these criteria are designated as PCs.
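The recovery fraction in a region of parameter space is simply the share of selected injTCEs dispositioned as PC. The sketch below shows the selection for a box within 25% of Earth's radius and insolation flux, with an optional stellar temperature cut. The record keys (`rp`, `sp`, `teff`, `disp`) and the function name are hypothetical and do not correspond to column names in Table 6.

```python
def pc_recovery_fraction(inj_tces, rp_range=(0.75, 1.25), sp_range=(0.75, 1.25),
                         teff_range=None):
    """Return (fraction dispositioned PC, number selected) for on-target injTCEs.

    inj_tces : iterable of dicts with hypothetical keys
               'rp' (Earth radii), 'sp' (Earth insolation), 'teff' (K), 'disp'.
    """
    selected = [
        t for t in inj_tces
        if rp_range[0] < t["rp"] < rp_range[1]
        and sp_range[0] < t["sp"] < sp_range[1]
        and (teff_range is None or teff_range[0] < t["teff"] < teff_range[1])
    ]
    if not selected:
        return None, 0   # no injections fall in this box
    n_pc = sum(1 for t in selected if t["disp"] == "PC")
    return n_pc / len(selected), len(selected)


# Synthetic records for illustration only; these are not catalog values.
sample = [
    {"rp": 1.0, "sp": 1.0, "teff": 5800, "disp": "PC"},
    {"rp": 1.1, "sp": 0.9, "teff": 4000, "disp": "FP"},
    {"rp": 3.0, "sp": 1.0, "teff": 5800, "disp": "PC"},
]
print(pc_recovery_fraction(sample))
print(pc_recovery_fraction(sample, teff_range=(5300, 6300)))
```

Returning the sample size alongside the fraction is deliberate: with only 118 and 51 injTCEs in the two boxes quoted above, the counting uncertainty on the fraction is not negligible.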

Note that one could make a robovetter with a 100% detection efficiency by simply passing every TCE as a PC; however, this would be a very poor robovetter, as it would not identify any FPs! We have specifically designed the robovetter to identify as many FPs as possible while still correctly identifying at least ∼95% of true planetary signals. This means that correcting for the robovetter's detection efficiency will only affect derived occurrence rates at the ∼5% level for the entire population, which is small compared to other systematic effects that influence the determination of planetary occurrence rates (see Figure 10 of Burke et al. 2015). We note that specific regions of interest may have higher or lower detection efficiencies.

At present (i.e., for DR24), we do not have a complete measure of how many true, underlying FPs the robovetter dispositions as PCs. This injection-only run included signals purposely injected off-target to simulate FPs due to centroid offsets, and found a ∼50% detection rate at a separation of 2'' (0.5 pixels) when recovered with an MES of 20 (F. Mullally et al. 2016, in preparation). To assess other types of FPs, we recommend (1) injecting EB signals to simulate FPs due to significant secondaries, (2) inverting the light curve and performing a transit search to simulate the population of not transit-like FPs, operating under the general observation that most not transit-like FPs tend to be symmetrical, and (3) shuffling the Kepler data by season and performing a transit search to simulate long-period FPs. Such activities are likely vital to fully evaluate the FP rate of the Kepler pipeline and the robovetter, and thus determine accurate occurrence rates, especially for planets with radii and insolation fluxes comparable to the Earth.

5.4. Systems With Multiple PCs

In the Q1–Q17 DR24 catalog, there are a total of 4293 PCs in 3355 systems. Of these systems, 636 contain 2 or more PCs, with a total of 1632 PCs in multi-PC systems. Compared to past catalogs, looking at systems that increased in PC count, we find the following:

  • 47 systems went from 1 → 2 PCs;
  • 1 system went from 1 → 3 PCs;
  • 9 systems went from 2 → 3 PCs;
  • 4 systems went from 3 → 4 PCs;
  • 1 system went from 4 → 5 PCs.

The system of five PCs, KOI 4032, appears to be a particularly interesting compact, multi-planet system, as all five PCs have periods between 2.9 and 7.2 days, with inferred radii between 0.8 and 1.0 R⊕, around a solar-type star (5575 K, 1.06 R☉).

Of the 7470 KOIs in the Q1–Q17 DR24 catalog, 5684 are in single KOI systems, and 1786 are in multi-KOI systems (at least two KOIs associated with the same target). Of the 5684 KOIs in single systems, 2661 are dispositioned as PCs and 3023 as FPs, yielding a 53.2% FP rate. Of the 1786 KOIs in multiple systems, 1632 are dispositioned as PCs and 154 as FPs, yielding an 8.6% FP rate. The lower FP rate is expected for multi-KOI systems, as systems with multiple KOIs are more likely to contain actual PCs (Rowe et al. 2014). While expected, this analysis provides a valuable check that the robovetter is not dispositioning a significant number of KOIs as FPs simply due to the fact they are in multi-KOI systems.

5.5. Potentially Rocky Planets in the Habitable Zone

In Figure 7, we plot every Q1–Q17 DR24 TCE that was dispositioned as a PC by the robovetter as a function of its inferred planetary radius (Rp) and insolation flux (Sp). We also utilize point size to represent the S/N of each candidate, and the color of the point to indicate the effective temperature of the host star. We use vertical dashed lines to indicate the insolation flux levels of Mars and Venus as a broad guide to a potential habitable zone. We use a horizontal dashed line to mark the radius at which a planet has about an even chance of being a terrestrial, rocky planet (Rogers 2015).


Figure 7. Plot of planet radius vs. insolation flux for all of the planet candidates known in the Q1–Q17 DR24 KOI catalog. (Note that some planet candidates, particularly those at large radii, lie outside the chosen axis limits for the plot, and thus are not shown.) The temperature of the host star is indicated via the color of each point, and the signal to noise of the detection is indicated via the size of each point. The two vertical dashed lines indicate the insolation flux values of Mars and Venus as a broad guide to a potential habitable zone. The horizontal dotted line is set at 1.6 R⊕ as a suggested guide to where roughly half of the planets are expected to be rocky (Rogers 2015).


As can be seen, while there are thousands of PCs, only a small percentage lie within the potential habitable zone. Smaller planets with lower insolation flux levels are predominantly found around late-type stars. This is primarily an observational bias, as planets with shorter periods and larger radii relative to their host stars are more easily detected. Many of the small, low insolation flux planets have an S/N of ∼10 or less. In this low S/N regime, the odds of the TCE being an FP that is undetectable by the robovetter are higher, and thus these candidates should continue to be treated with caution. More work is needed to obtain a quantitative measure of the rate of undetected low S/N FPs residing in the catalog.

Potentially rocky, habitable planets are the most important targets for follow-up observations to determine the frequency of Earth-size planets in the habitable zones of other stars. In Table 7, we list all of the PCs in the Q1–Q17 DR24 catalog that have Rp < 2.0 R⊕ and Sp < 2.0 S⊕. We list the values for their transit-model S/N, inferred planet radius, insolation flux, and host star effective temperature and radius from the Q1–Q17 DR24 KOI catalog. In addition, as discussed in Section 5.2.5, the NASA Exoplanet Archive maintains a list of KOIs that have been confirmed as planets via follow-up observations and/or statistical analyses, and assigns them Kepler confirmed planet numbers, e.g., Kepler-1b. (Again, note that the KOI PC/FP disposition and the NExScI confirmed planet designation are completely independent.) If the planet is listed as a confirmed planet at the NASA Exoplanet Archive confirmed planets table, we also list its Kepler confirmed planet number, reference for the confirmation, and values for the planetary radius, planetary insolation flux, and host star effective temperature and radius from the reference. If insolation flux was not given in the reference, then we derive it from other values given in the reference via

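$S_p = R^{2} \left(\frac{T}{5777}\right)^{4} \frac{1}{a^{2}}$    (9)

where a is the semimajor axis of the planet's orbit in au, T is the host star's effective temperature in Kelvin, 5777 K is the effective temperature of the Sun, R is the host star's radius in solar units, and Sp is in Earth units.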
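A minimal sketch of such a derivation is given below. It assumes the standard scaling of insolation flux with stellar luminosity and orbital distance, i.e., the square of the stellar radius in solar units times the fourth power of the effective temperature relative to 5777 K, divided by the square of the semimajor axis in au. The function name is hypothetical.

```python
def insolation_flux(r_star_solar, teff_kelvin, a_au, teff_sun=5777.0):
    """Insolation flux relative to Earth's.

    r_star_solar : host star radius in solar radii
    teff_kelvin  : host star effective temperature in K
    a_au         : semimajor axis of the planet's orbit in au
    """
    luminosity_solar = r_star_solar ** 2 * (teff_kelvin / teff_sun) ** 4
    return luminosity_solar / a_au ** 2


# Sanity check: a Sun-like star at 1 au gives Earth's insolation by construction.
print(insolation_flux(1.0, 5777.0, 1.0))
```

The strong temperature dependence explains why the catalog and confirmed insolation values in Table 7 can differ by a factor of two or more when the adopted stellar parameters change by only a few hundred Kelvin.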

Table 7.  Small Planet Candidates Potentially in the Habitable Zone in the Q1–Q17 DR24 Catalog (Rp < 2.0 R⊕ and Sp < 2.0 S⊕)

    Catalog Values   Confirmed Values  
KOI S/N Rp (R⊕) Sp (S⊕) T (K) R (R☉) Confirmed Name Rp (R⊕) Sp (S⊕) T (K) R (R☉) Reference
172.02 20.7 1.74 1.59 5637 0.94 Kepler-69c 1.71 1.92 5638 0.93 Barclay et al. (2013)
438.02 36.9 1.76 1.28 3985 0.54 Kepler-155c 2.24 2.43 4508 0.62 Rowe et al. (2014)
463.01 72.1 1.57 1.26 3387 0.30 ... ... ... ... ... ...
571.05 12.4 1.06 0.25 3761 0.46 Kepler-186f 1.17 0.30 3755 0.52 Torres et al. (2015)
701.03 45.0 1.73 1.17 4797 0.65 Kepler-62e 1.61 1.19 4925 0.64 Borucki et al. (2013)
701.04 18.1 1.42 0.41 4797 0.65 Kepler-62f 1.41 0.42 4925 0.64 Borucki et al. (2013)
775.03 29.0 1.80 1.91 3898 0.54 Kepler-52d 1.95 2.81 4263 0.56 Rowe et al. (2014)
812.03 28.1 1.94 1.16 3887 0.48 Kepler-235e 2.22 1.96 4255 0.55 Rowe et al. (2014)
854.01 30.6 1.96 0.64 3593 0.47 ... ... ... ... ... ...
947.01 54.6 1.88 1.80 3750 0.46 ... ... ... ... ... ...
1126.02^a 13.8 1.80 0.21 5209 0.59 ... ... ... ... ... ...
1422.02 34.4 1.65 1.73 3517 0.37 Kepler-296d 2.09 2.90 3740 0.48 Barclay et al. (2015)
1422.04 17.0 1.23 0.37 3517 0.37 Kepler-296f 1.80 0.62 3740 0.48 Barclay et al. (2015)
1422.05 14.0 1.08 0.84 3517 0.37 Kepler-296e 1.53 1.41 3740 0.48 Barclay et al. (2015)
1681.04 10.6 0.77 1.63 3669 0.35 ... ... ... ... ... ...
1989.01 32.0 1.84 1.83 5804 0.84 ... ... ... ... ... ...
2124.01 21.6 1.00 1.84 4029 0.55 ... ... ... ... ... ...
2184.02 9.2 1.89 1.63 4893 0.65 ... ... ... ... ... ...
2418.01 16.7 1.12 0.35 3724 0.41 ... ... ... ... ... ...
2529.02 12.8 1.90 1.28 4299 0.51 Kepler-436b 2.73 1.69 4651 0.70 Torres et al. (2015)
2626.01 16.2 1.12 0.65 3482 0.35 ... ... ... ... ... ...
2650.01 14.1 1.25 1.14 3735 0.40 Kepler-395c 1.32 2.97 4262 0.56 Rowe et al. (2014)
2719.02 14.0 1.72 1.99 4827 0.82 ... ... ... ... ... ...
3010.01 16.6 1.56 0.93 3903 0.52 ... ... ... ... ... ...
3138.01 10.8 0.57 0.47 2703 0.12 ... ... ... ... ... ...
3255.01 27.0 1.37 1.78 4427 0.62 Kepler-437b 2.14 2.15 4551 0.68 Torres et al. (2015)
3282.01 17.9 1.97 1.30 3894 0.54 ... ... ... ... ... ...
3284.01 16.4 0.98 1.31 3688 0.46 Kepler-438b 1.12 1.40 3748 0.52 Torres et al. (2015)
4036.01 25.6 1.83 1.02 4893 0.76 ... ... ... ... ... ...
4054.01 27.3 1.99 1.41 5380 0.78 ... ... ... ... ... ...
4060.01 27.3 1.96 1.82 5984 0.89 ... ... ... ... ... ...
4087.01 23.9 1.47 0.39 3813 0.48 Kepler-440b 1.86 1.20 4134 0.56 Torres et al. (2015)
4356.01 16.5 1.91 0.29 4366 0.46 ... ... ... ... ... ...
4427.01 13.7 1.47 0.17 3668 0.43 ... ... ... ... ... ...
4450.01 15.1 1.98 1.38 5536 0.82 ... ... ... ... ... ...
4550.01 12.5 1.73 1.04 4771 0.70 ... ... ... ... ... ...
4622.01 13.9 1.93 0.34 4243 0.63 Kepler-441b 1.64 0.21 4340 0.55 Torres et al. (2015)
4742.01 12.5 1.56 1.08 4569 0.65 Kepler-442b 1.34 0.66 4402 0.60 Torres et al. (2015)
5202.01 8.8 1.83 0.63 6014 0.96 ... ... ... ... ... ...
5236.01 22.5 1.98 0.79 6241 1.03 ... ... ... ... ... ...
5475.01^b 28.0 1.66 0.68 6070 0.81 ... ... ... ... ... ...
5856.01 12.7 1.70 1.47 5906 0.85 ... ... ... ... ... ...
6343.01^c 9.9 1.90 0.61 6117 0.95 ... ... ... ... ... ...
6425.01^c 8.7 1.50 0.68 5942 0.95 ... ... ... ... ... ...
6676.01 10.3 1.81 1.18 6553 0.96 ... ... ... ... ... ...
6971.01 12.2 1.60 1.66 4989 0.79 ... ... ... ... ... ...
7016.01 11.8 1.13 0.56 5578 0.79 Kepler-452b 1.63 1.10 5757 1.11 Jenkins et al. (2015)
7179.01 8.2 1.18 1.29 5845 1.20 ... ... ... ... ... ...
7223.01 9.1 1.53 0.57 5370 0.73 ... ... ... ... ... ...
7235.01^c 8.6 1.15 0.75 5606 0.76 ... ... ... ... ... ...
7470.01^c 8.9 1.90 0.60 5128 0.99 ... ... ... ... ... ...
7554.01^c 8.1 1.98 1.12 6315 1.09 ... ... ... ... ... ...
7567.01^c 11.3 1.46 0.10 4486 0.65 ... ... ... ... ... ...
7591.01^c 8.2 1.30 0.33 4906 0.67 ... ... ... ... ... ...
7592.01 10.4 1.55 0.07 3761 0.53 ... ... ... ... ... ...

Notes. KOIs with confirmed planet numbers have been confirmed as planetary in nature either via ground-based follow-up observations or statistical analyses. In these cases, we list the confirmed Kepler planet number, the confirmed values for the planet's radius, insolation flux, and stellar effective temperature and radius, and reference for the confirmation study.

^a Known to be a false positive via manual inspection. ^b Modeled at twice the true orbital period. ^c Likely to be an FP due to low-amplitude systematics given detailed manual vetting of the PDC light curves.

A machine-readable version of the table is available.


There are a number of PCs in Table 7 that are new in the Q1–Q17 DR24 catalog. A much larger fraction of them orbit solar-like stars compared to previously known PCs in the table, and they also have lower insolation flux values. Specifically, KOIs 6343.01, 6425.01, 7016.01, 7223.01, 7235.01, and 7470.01 have inferred radii between 1.13 and 1.90 R⊕, insolation fluxes between 0.56 and 0.75 S⊕, and orbit stars with T between 5128 and 6117 K. We first note that they are generally also at lower S/N compared to previously known PCs which, coupled with being in single systems, puts them at higher risk for being undetected, low S/N FPs. We also note that if any of them are confirmed to be planets by subsequent observations and analyses, then their resulting radii and insolation fluxes could change significantly as a result of more accurate stellar parameters. However, the fact that there are a significant number of new PCs that orbit Sun-like stars and have insolation fluxes even less than Earth's represents great progress by the Kepler mission in determining the fraction of Earth-size planets in the habitable zones of Sun-like stars. In order to facilitate the prioritization of follow-up observations, in the subsections below, we examine in detail each PC from Table 7 that is new in the Q1–Q17 DR24 catalog. We utilize both the TCERT vetting forms (Coughlin 2015a, publicly available for every Q1–Q17 DR24 TCE at the Exoplanet Archive), as well as the PDC data from MAST.

5.5.1. KOI 1126.02

KOI 1126.02 is a new KOI that is not correctly dispositioned by the robovetter. KOI 1126 (KIC 006307521) is contaminated by a nearby EB with a period of 29.745 days and a clearly visible secondary that is about half as deep as the primary. The first TCE produced by the Kepler pipeline, 006307521-01, which federates to KOI 1126.01, is detected at a period of 29.745 days, and the robovetter correctly dispositions it as an FP due to a significant secondary, centroid offset, and ephemeris match. After removing the primary transits, the Kepler pipeline re-searched the data and detected a second TCE, 006307521-02, at a period of 475.954 days, or ∼16 times the first TCE's and EB's period, corresponding to a subset of just three of the EB's secondary eclipses.

As it did appear to be transit-like, TCE 006307521-02 was designated as the new KOI 1126.02. However, the detected period was off enough from an exact 16:1 ratio that it just barely failed to period match to either the previous TCE or the parent EB. Also, the robovetter centroid module has a safeguard to protect low S/N PCs, where no TCE is designated as an FP if it does not have at least three valid centroid measurements. The middle of the three events for 006307521-02/KOI 1126.02 fell close enough to a data gap to prevent a valid centroid measurement, and thus the object was passed by the centroid module. However, we note that the Q1–Q17 DR24 KOI catalog prioritizes uniformity over the accuracy of individual targets, and this example shows why it is prudent to manually inspect the Q1–Q17 DR24 TCERT vetting forms (Coughlin 2015a) before committing precious telescope time to observing individual high value targets.
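As a rough illustration of why this object slipped through, the sketch below quantifies how far the second TCE's period lies from an exact 16:1 multiple of the EB period, and encodes the three-measurement centroid safeguard described above. The function names are hypothetical, and the actual robovetter period-matching metric and tolerance are not given in this section, so no pass/fail tolerance is assumed here.

```python
# Periods quoted in the text for KIC 006307521.
EB_PERIOD_DAYS = 29.745      # TCE 006307521-01 / KOI 1126.01 and the parent EB
TCE2_PERIOD_DAYS = 475.954   # TCE 006307521-02 / KOI 1126.02


def nearest_integer_ratio(p_long, p_short):
    """Return (n, frac), where n is the integer nearest to p_long / p_short
    and frac is the fractional deviation of the ratio from n."""
    ratio = p_long / p_short
    n = round(ratio)
    return n, (ratio - n) / n


def centroid_module_can_fail_tce(n_valid_centroids, minimum=3):
    """Safeguard described in the text: a TCE is only eligible for a
    centroid-offset FP if it has at least `minimum` valid measurements."""
    return n_valid_centroids >= minimum


n, frac = nearest_integer_ratio(TCE2_PERIOD_DAYS, EB_PERIOD_DAYS)
offset_days = TCE2_PERIOD_DAYS - n * EB_PERIOD_DAYS
print(f"nearest ratio {n}:1, fractional deviation {frac:.2e}, "
      f"offset {offset_days:.3f} days")

# The middle of the three events lacked a valid centroid, leaving two.
print(centroid_module_can_fail_tce(2))
```

The deviation is only a few hundredths of a day over a 476 day period, which illustrates how narrow the margin was by which the period match failed.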

5.5.2. KOI 1681.04

KOI 1681.04 appears to be a strong PC with an inferred sub-Earth size of 0.77 R⊕ in a ∼22 day orbit around a late M dwarf (0.35 R☉, 3669 K), resulting in an insolation flux 1.6 times that of Earth. There are three previously known PCs in this system with shorter periods and radii of 0.69, 0.71, and 0.99 R⊕. The existence of this new candidate in a multi-planet system lends higher confidence to it being a real planet (Rowe et al. 2014). This new candidate was also recently detected and published by Dressing & Charbonneau (2015).

5.5.3. KOI 2719.02

KOI 2719.02 was first identified as a KOI in the Q1–Q16 KOI catalog (Mullally et al. 2015a), but was considered to be not transit-like and dispositioned as an FP. KOI 2719.02 was re-detected as a Q1–Q17 DR24 TCE and dispositioned as a PC by the robovetter. Manually examining the TCERT diagnostics, KOI 2719.02 does indeed appear to be a strong PC. It is possible that detrending differences are responsible for the vetting differences between catalogs, as the star does have strong variability, though not near the periods of either KOI in the system. With a radius of 1.72 R⊕ and an insolation flux of 1.99 S⊕, given its period of 106 days around a 0.82 R☉, 4827 K star, it is likely that KOI 2719.02 lies interior to the habitable zone, but still forms part of an interesting multi-planet system given that the inner candidate, KOI 2719.01, has a nearly identical size of 1.71 R⊕, though with an insolation flux of 152 S⊕.

5.5.4. KOI 5475.01

KOI 5475.01 was first detected as a Q1–Q16 TCE at a period of 448 days and dispositioned as an FP due to a significant secondary (Mullally et al. 2015a). In the Q1–Q17 DR24 catalog, KOI 5475.01 was detected as a TCE at a period of 224 days and was dispositioned by the robovetter as a PC. Manual inspection confirms that there is no discernible odd–even difference in Q1–Q17 DR24, and thus the Q1–Q16 TCE detection was at twice the true orbital period, resulting in a perceived secondary of identical depth and width at a phase of 0.5 during the Q1–Q16 vetting. While the period of this candidate is 224 days, it was first identified in Q1–Q16 and modeled with a period of 448 days (see Section 4.3). Thus, the resulting insolation flux value of 0.68 S⊕ given in the Q1–Q17 DR24 catalog is too low, and should actually be 1.71 S⊕, while the radius of 1.66 R⊕ is still correct. This candidate also forms part of an interesting multi-planet system with the inner candidate, KOI 5475.02, at a radius of 0.54 R⊕ and insolation flux of 230 S⊕, both around a 0.81 R☉, 6070 K host star.
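The corrected insolation flux quoted above can be reproduced with a short calculation. This is a minimal sketch, assuming a circular orbit and unchanged stellar parameters, so that Kepler's third law gives a ∝ P^(2/3) and therefore S ∝ a^(−2) ∝ P^(−4/3). The function name is illustrative and is not part of any Kepler software.

```python
def rescale_insolation(s_modeled, p_modeled_days, p_true_days):
    """Rescale an insolation flux computed at the wrong period.

    Assumes a circular orbit and fixed stellar mass, radius, and temperature,
    so that S scales as P**(-4/3).
    """
    return s_modeled * (p_modeled_days / p_true_days) ** (4.0 / 3.0)


# KOI 5475.01: modeled at 448 days, true period 224 days.
s_corrected = rescale_insolation(0.68, 448.0, 224.0)
print(f"{s_corrected:.2f}")  # 1.71
```

Halving the period raises the insolation flux by a factor of 2^(4/3) ≈ 2.52, while the inferred radius depends on the transit depth and stellar radius and is therefore unaffected.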

5.5.5. KOI 6343.01

KOI 6343.01 is a newly detected single PC in Q1–Q17 DR24 with an inferred size of 1.90 R⊕ and insolation flux of 0.61 S⊕, given its period of 569 days around a 0.95 R☉, 6117 K star. Manual inspection of its PDC light curve reveals that, of its three transit events, the second event is likely a low-amplitude SPSD and the other two events may be smaller amplitude systematic events. The SPSD feature is not as readily visible in either the DV or alternate detrending, though it is perhaps not surprising given the KOI's low S/N of 9.9. We note the value of the Marshall metric is 8.1 for KOI 6343.01, which is very close to the threshold value of 10.0, above which the robovetter classifies TCEs/KOIs as FPs due to systematic events. For clarity, in Figure 8, we plot the distribution of the Marshall metric for injTCEs (see Section 5.3) where it can be seen that very few injected transiting planets have Marshall metrics near 10 or higher, even at low S/N. Overall, we deem it very likely that this object is actually a result of low-amplitude systematics.


Figure 8. Distribution of Marshall metric values for the injected TCEs. The red line represents all injected TCEs with computed Marshall metrics, while the blue line represents the subset of those with an S/N less than 10. The vertical dashed line represents the value above which the robovetter dispositions TCEs as FPs. Note that very few injected TCEs have Marshall values above this cutoff value.


5.5.6. KOI 6425.01

KOI 6425.01 is a newly detected single PC in Q1–Q17 DR24 with an inferred size of 1.50 R⊕ and an insolation flux of 0.68 S⊕, given its period of 521 days around a 0.95 R☉, 5942 K star. Manual inspection of its PDC light curve reveals that of its three transit events, the first event (in Q2) is likely due to a low-amplitude SPSD and the second event (in Q7) may be due to an edge effect. We note the value of the Marshall metric is 7.8 for KOI 6425.01, which is very close to the threshold value of 10.0, above which the robovetter classifies TCEs/KOIs as FPs due to systematic events.

5.5.7. KOI 6676.01

KOI 6676.01 is a newly detected single PC in Q1–Q17 DR24 with an inferred size of 1.81 R⊕ and insolation flux of 1.18 S⊕, given its period of 439 days around a 0.96 R☉, 6553 K star. Manual inspection of its PDC light curve reveals no systematic source of the signal for any of its three transits, and its Marshall metric value is 0.95, which is well below the FP threshold of 10.0. Thus, KOI 6676.01 appears to be a reliable PC, though we note it has an S/N of 10.3.

5.5.8. KOI 6971.01

KOI 6971.01 is a newly detected single PC in Q1–Q17 DR24 with an inferred size of 1.60 R⊕ and insolation flux of 1.66 S⊕, given its period of 129 days around a 0.79 R☉, 4989 K star. Manual inspection reveals this to be a strong PC with 10 observed transits.

5.5.9. KOI 7016.01

KOI 7016.01 is a newly detected single PC in Q1–Q17 DR24 with an inferred size of 1.19 R⊕ and insolation flux of 0.56 S⊕, given its period of 385 days around a 0.79 R☉, 5578 K star. Given these catalog parameters, it represents one of the most Earth-like PCs in the sample, at least in terms of size, insolation flux, and solar-type host star. This KOI was recently designated Kepler-452b as it was validated by Jenkins et al. (2015) via spectroscopic follow-up of the host star and statistical analyses. However, they found that due to the host star being a more evolved star than previously indicated, the planet actually has a radius of 1.6 R⊕ and an insolation flux of 1.1 S⊕.
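The sensitivity of the inferred planet radius to the stellar radius can be illustrated with a simple scaling. This is a sketch under the assumption that the fitted planet-to-star radius ratio is held fixed, so that the planet radius scales linearly with the stellar radius; the validation study may also have refit the transit, so the implied stellar radius below is only indicative and is not a measured value. The function name is hypothetical.

```python
def rescale_planet_radius(rp_catalog, rstar_catalog, rstar_revised):
    """Scale a planet radius to a revised stellar radius, holding the
    fitted planet-to-star radius ratio fixed (an assumption)."""
    return rp_catalog * (rstar_revised / rstar_catalog)


# Catalog values for KOI 7016.01 and the validated radius of Kepler-452b.
rp_catalog = 1.19      # Earth radii
rstar_catalog = 0.79   # solar radii
rp_validated = 1.6     # Earth radii

# Stellar radius scale factor implied by the change in planet radius alone.
implied_scale = rp_validated / rp_catalog
implied_rstar = rstar_catalog * implied_scale
print(f"implied stellar radius scale factor: {implied_scale:.2f}")
```

A change of roughly a third in the stellar radius is enough to move a candidate from nearly Earth-size to 1.6 Earth radii, which is why the catalog radii of the new candidates should be treated as provisional.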

5.5.10. KOI 7179.01

KOI 7179.01 is a newly detected single PC in Q1–Q17 DR24 with an inferred size of 1.18 R⊕ and insolation flux of 1.29 S⊕, given its period of 407 days around a 1.2 R☉, 5845 K star. Manual inspection reveals no evidence for any of its three transit events being due to systematics, though the KOI has a very low S/N of 8.2, and so it is difficult to definitively discern the shape of individual events. KOI 7179.01 has a Marshall metric value of 5.9, which is a moderate value only 1.2σ from the peak of the low S/N injTCE distribution (see Figure 8). Overall, this appears to be a good, but low S/N PC with Earth-like size, insolation flux, and solar-type host star.

5.5.11. KOI 7223.01

KOI 7223.01 is a newly detected single PC in Q1–Q17 DR24 with an inferred size of 1.53 R⊕ and insolation flux of 0.57 S⊕, given its period of 317 days around a 0.73 R☉, 5370 K star. Manual inspection reveals this to be a strong PC with five observed transits. KOI 7223.01 represents another new, possibly rocky PC in the habitable zone of a late G-type star.

5.5.12. KOI 7235.01

KOI 7235.01 is a newly detected single PC in Q1–Q17 DR24 with an inferred size of 1.15 R⊕ and insolation flux of 0.75 S⊕, given its period of 300 days around a 0.76 R☉, 5606 K star. Manual inspection of its PDC light curve reveals that of its five transit events, two of them occur on the edges of gaps. Of the remaining three, one or two may be due to a low-amplitude SPSD, though it is difficult to be sure given the KOI's low S/N of 9.1. KOI 7235.01 has a fairly high value for the Marshall metric of 8.1, which is close to the 10.0 FP threshold. Overall, we deem it most likely that this object is due to low-amplitude systematics.

5.5.13. KOI 7470.01

KOI 7470.01 is a newly detected single PC in Q1–Q17 DR24 with an inferred size of 1.90 R⊕ and insolation flux of 0.60 S⊕, given its period of 393 days around a 0.99 R☉, 5128 K star. Manual inspection of its PDC light curve reveals that of its three transit events, the middle event (in Q9) is very likely due to an SPSD or step-wise discontinuity. Also, the value of the Marshall metric is 8.2, which is close to the 10.0 FP threshold. Overall, we deem it very likely that this object is actually a result of low-amplitude systematics.

5.5.14. KOI 7554.01

KOI 7554.01 is a newly detected single PC in Q1–Q17 DR24 with an inferred size of 1.98 R⊕ and insolation flux of 1.12 S⊕, given its period of 483 days around a 1.09 R☉, 6315 K star. Manual inspection of its PDC light curve reveals that of its three transit events, the last event (in Q14) is very likely due to an SPSD. The value of the Marshall metric for KOI 7554.01 is 5.1. Overall, we deem it likely that this object is actually a result of low-amplitude systematics.

5.5.15. KOI 7567.01

KOI 7567.01 is a newly detected single PC in Q1–Q17 DR24 with an inferred size of 1.46 R⊕ and insolation flux of 0.10 S⊕, given its period of 608 days around a 0.65 R☉, 4486 K star. Manual inspection of its PDC light curve reveals that of its three transit events, the first event (in Q1) is very likely due to an SPSD. The value of the Marshall metric is 9.5, which is very close to the 10.0 FP threshold. Overall, we deem it very likely that this object is actually a result of low-amplitude systematics.

5.5.16. KOI 7591.01

KOI 7591.01 is a newly detected single PC in Q1–Q17 DR24 with an inferred size of 1.30 R⊕ and insolation flux of 0.33 S⊕, given its period of 328 days around a 0.67 R☉, 4906 K star. Manual inspection of its PDC light curve reveals that of its three transit events, the second event (in Q5) is likely due to an SPSD. The value of the Marshall metric is 6.7, which is moderately high. Overall, we deem it likely that this object is actually a result of low-amplitude systematics.

5.5.17. KOI 7592.01

KOI 7592.01 is a newly detected single PC in Q1–Q17 DR24 with an inferred size of 1.55 R⊕ and insolation flux of 0.07 S⊕, given its period of 382 days around a 0.53 R☉, 3761 K star. Manual inspection reveals no evidence for any of its three transit events being due to systematics, though this KOI has a low S/N of 10.4, and so it is difficult to definitively discern the shape of its three transit events. KOI 7592.01 has a Marshall metric value of 8.4, which is fairly high given the FP threshold of 10.0, but still only 2.2σ from the median Marshall value for low S/N injTCEs (see Figure 8). Overall, this appears to be a borderline, low S/N PC. If the signal really is due to a planet, then it would be unique, as it is the candidate with the lowest insolation flux in the Q1–Q17 DR24 catalog, given its long period and M dwarf host star.
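For reference, the Marshall metric values quoted in Sections 5.5.5 through 5.5.17 can be collected and ranked by their margin below the FP threshold of 10.0. The sketch below simply tabulates the numbers given in the text; the dictionary and function names are illustrative, and the comparison only mirrors the statement that values above the threshold lead to an FP disposition.

```python
MARSHALL_FP_THRESHOLD = 10.0  # values above this are dispositioned as FPs

# Marshall metric values quoted in the text for the new single-planet PCs.
marshall_values = {
    "6343.01": 8.1, "6425.01": 7.8, "6676.01": 0.95, "7179.01": 5.9,
    "7235.01": 8.1, "7470.01": 8.2, "7554.01": 5.1, "7567.01": 9.5,
    "7591.01": 6.7, "7592.01": 8.4,
}


def fails_marshall(value, threshold=MARSHALL_FP_THRESHOLD):
    """Return True if the value exceeds the FP threshold."""
    return value > threshold


# Rank from closest to the threshold to farthest below it.
ranked = sorted(marshall_values.items(),
                key=lambda item: MARSHALL_FP_THRESHOLD - item[1])

for koi, value in ranked:
    margin = MARSHALL_FP_THRESHOLD - value
    print(f"KOI {koi}: Marshall = {value:5.2f}, margin = {margin:4.2f}, "
          f"fails = {fails_marshall(value)}")
```

None of these candidates exceeds the threshold, which is why they remain PCs in the catalog, but most sit within about two units of it.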

6. DISCUSSION

The Q1–Q17 DR24 KOI catalog represents the first time that every TCE from a Kepler pipeline search has been uniformly vetted. Of the 20,367 Q1–Q17 DR24 TCEs, the robovetter ruled 13,283 as not transit-like, and another 2786 as transit-like FPs, leaving 4298 PCs. (Note that 5 of the TCEs that were designated as PCs were not given KOI numbers, as discussed in Section 4.2, resulting in a total of 4293 PCs in the Q1–Q17 DR24 catalog.) Combining these results with previous Kepler catalogs, there are now 4696 PCs in the cumulative Kepler KOI catalog. Due to the uniform vetting, the vast majority of known KOIs from previous catalogs have been re-vetted, many of which were previously vetted with only a few quarters of Kepler data. This should result in more accurate dispositions for most KOIs, though we note that the catalog values uniformity over individual correctness. Users of the catalog who are interested in investigating individual KOIs are encouraged to check the disposition from this catalog, as well as the dispositions given by previous catalogs.
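The TCE accounting above can be checked directly. The sketch below uses only the counts quoted in this paragraph; the variable names are illustrative.

```python
n_tce = 20367               # all Q1-Q17 DR24 TCEs
n_not_transit_like = 13283  # ruled not transit-like by the robovetter
n_transit_like_fp = 2786    # transit-like, but dispositioned as FPs
n_pc_without_koi = 5        # PCs not given KOI numbers (Section 4.2)

n_transit_like = n_tce - n_not_transit_like
n_pc_tce = n_transit_like - n_transit_like_fp
n_pc_catalog = n_pc_tce - n_pc_without_koi

print(n_pc_tce, n_pc_catalog)  # 4298 4293
print(f"fraction of TCEs dispositioned as PCs: {n_pc_tce / n_tce:.1%}")
```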

As only known contact eclipsing binaries were excluded from the Kepler pipeline transit search, and as the robovetter designates specific categories of FPs, the catalog is also a valuable repository of information for detached eclipsing binaries and other specific classes of FPs. For example, there are 1215 on-target eclipsing binaries in the Q1–Q17 DR24 KOI catalog, which can be identified as those KOIs that were dispositioned as FPs only due to a significant secondary (i.e., no centroid offset nor ephemeris match was identified). The study of these EBs can yield valuable stellar science, especially when coupled with follow-up observations. There are 1730 KOIs dispositioned as FPs due to a centroid offset or ephemeris match, which is a valuable sample for studying how Kepler targets are contaminated across the entire field. A third category of interest includes those KOIs that had visible secondary eclipses that could be attributed to planetary reflection and/or thermal emission, identified as PCs with the significant secondary flag marked, of which 40 exist in the catalog.
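The three categories described above follow from simple combinations of the disposition and the major flags. The sketch below assumes a hypothetical record layout in which each KOI is a dictionary of booleans; the field names are illustrative and are not the column names of any archive table, and the toy records are invented solely to exercise the logic.

```python
def make_koi(disposition, ntl=False, ss=False, co=False, em=False):
    """Build a toy KOI record (hypothetical field names)."""
    return {
        "disposition": disposition,      # "PC" or "FP"
        "not_transit_like": ntl,
        "significant_secondary": ss,
        "centroid_offset": co,
        "ephemeris_match": em,
    }


def categorize(koi):
    """Assign a KOI to one of the categories discussed in the text."""
    is_fp = koi["disposition"] == "FP"
    off_target = koi["centroid_offset"] or koi["ephemeris_match"]
    if is_fp and off_target:
        return "contaminated"
    if is_fp and koi["significant_secondary"] and not koi["not_transit_like"]:
        return "on-target EB"
    if not is_fp and koi["significant_secondary"]:
        return "PC with possible planetary secondary"
    return "other"


toy_kois = [
    make_koi("FP", ss=True),           # significant secondary only
    make_koi("FP", ss=True, co=True),  # secondary plus centroid offset
    make_koi("FP", em=True),           # ephemeris match only
    make_koi("PC", ss=True),           # secondary attributed to a planet
    make_koi("PC"),                    # ordinary PC
]
for koi in toy_kois:
    print(categorize(koi))
```

Applied to the real catalog with its actual flag columns, a selection of this kind would be expected to recover the 1215, 1730, and 40 objects quoted above.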

In Figure 9, we plot a histogram of the number of Q1–Q17 DR24 TCEs, the number of TCEs designated as transit-like (KOIs), and the number of KOIs designated as PC as a function of period (similar to Figure 1) and planetary radius. As can be seen, the short- and long-period TCE excesses, as well as the local TCE period spikes, have generally been eliminated. The TCEs with very small and very large radii are also eliminated. FP KOIs, which are represented by the difference between the green and blue lines, must either have a significant secondary, centroid offset, or ephemeris match, and thus are principally due to eclipsing binaries. As expected, the FP KOI population is dominated by short periods and large radii, similar to the Kepler EB period and radius distribution (Prša et al. 2011; Slawson et al. 2011).


Figure 9. Distribution of Q1–Q17 DR24 TCEs (red), KOIs (green), and PCs (blue) as a function of period (top) and radius (bottom). Note that the difference between the red and green lines represents the population of not transit-like FPs, and the difference between the green and blue lines represents the population of the transit-like FPs. Also note that, as shown, some planet candidates can have very large inferred radii, as FPs are purposely not designated based on depth or inferred size alone.


This is also the first time that artificial transit injection has been used in both the development and evaluation of the Kepler pipeline and the TCERT vetting process. We note again that special care should be taken in computing occurrence rates using this catalog due to the period-dependent search performed by the Q1–Q17 DR24 Kepler pipeline as a result of the bootstrap veto (Section 2.2). Kepler mission completeness and reliability products$^{18}$ should also be used in conjunction with the Q1–Q17 DR24 catalog when computing occurrence rates. However, overall, the uniform vetting via our robotic approach, coupled with the transit injection results, enables a more accurate computation of the number of Earth-size planets in the habitable zone of Sun-like stars.

Overall, the robovetter is successful in robustly identifying FPs while retaining valid PCs. As presented in Section 5.3, the robovetter has a small false negative rate as measured by artificial transit injection. Given the qualitatively efficient elimination of FP TCEs shown in Figure 9, and the quantitative comparison to ancillary catalogs in Section 5.2, the robovetter also likely has an overall small FP rate. We note again here that the robovetter purposely does not designate FPs based on transit depth or inferred planet size in order to produce a uniform catalog that is agnostic concerning the stellar parameters—PCs with inferred radii several times that of Jupiter and larger are very likely to be due to eclipsing binaries, and users of this catalog are encouraged to make cuts on the inferred radii where it is appropriate for their scientific objectives. We also note that the FP rate is likely enhanced for very low S/N candidates (≲10), as shown by the manual inspection of new, low S/N candidates in Section 5.5. We stress that full simulations of FPs, alongside the existing simulated planet transits, are needed to fully quantify the FP rate as a function of S/N and other parameters, and thus calculate accurate occurrence rates.
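Because the robovetter does not cut on inferred size, a radius cut is left to the user. The sketch below shows one way such a cut might be applied; the record layout, the toy values, and the choice of two Jupiter radii as the limit are all hypothetical and should be replaced by whatever suits a given science case.

```python
R_JUPITER_IN_EARTH_RADII = 11.2  # approximate radius of Jupiter in Earth radii


def apply_radius_cut(candidates, max_radius_earth):
    """Keep only candidates whose inferred radius does not exceed the limit."""
    return [c for c in candidates if c["radius_earth"] <= max_radius_earth]


# Invented toy records, not catalog entries.
toy_candidates = [
    {"name": "toy-A", "radius_earth": 1.5},
    {"name": "toy-B", "radius_earth": 12.0},
    {"name": "toy-C", "radius_earth": 40.0},  # several Jupiter radii
]

kept = apply_radius_cut(toy_candidates, 2.0 * R_JUPITER_IN_EARTH_RADII)
print([c["name"] for c in kept])
```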

The robovetter has known areas where it could be improved. For example, there are a handful of slightly eccentric eclipsing binaries with nearly equal primary and secondary depths that are detected as TCEs at half the true orbital period. These are not detected by the current odd–even depth test, and thus a test to search for an odd–even epoch offset is needed. A small number of TCEs due to flux contamination are sometimes detected at an integer ratio of the true orbital period, and when seasonal depth variations are present, they can sometimes escape identification by the robovetter. In addition, planets with strong TTVs can be erroneously labeled as FPs, though the number of these systems is extremely low both due to their intrinsic occurrence rate as well as non-detection by the Kepler pipeline, which is not designed to detect non-periodic transits resulting from strong TTVs. Finally, there are a number of variable stars which generate TCEs at the same period as their variability. While the stellar variability is obvious in the PDC data, both detrendings can sometimes make the resulting TCE appear to be transit-like, and thus a test to detect these cases would be valuable. These issues will be examined, and the robovetter further improved, for the next Kepler PC catalog.

7. CONCLUSION

We produced, for the first time, a uniform PC catalog based on the entire 48-month Kepler data set. We developed a robotic vetting program that mimics the human decision-making process employed by previous catalogs to examine the periodic signals identified by the Kepler pipeline. Our robotic vetting approach is able to eliminate the vast majority of FP signals, while simultaneously retaining greater than 98% of artificially injected planets similar to Earth in size and insolation flux, and over 99% of confirmed planets. Coupled with the injection of artificial transits, these advancements allow for a more accurate computation of the fraction of Earth-size planets in the habitable zone of Sun-like stars. We note that this robotic vetting approach can be readily applied to other large-scale photometric survey missions, such as K2 (Howell et al. 2014), TESS (Ricker et al. 2015), and LSST (Ivezic et al. 2008).

We thank the anonymous referee for a careful reading of the paper and comments which greatly helped to improve the readability of the paper. B.Q. gratefully acknowledges support by an appointment to the NASA Postdoctoral Program at the Ames Research Center, administered by Oak Ridge Associated Universities through a contract with NASA. D.H. acknowledges support by the Australian Research Council's Discovery Projects funding scheme (project number DE140101364) and support by the National Aeronautics and Space Administration under grant NNX14AB92G issued through the Kepler Participating Scientist Program. This paper includes data collected by the Kepler mission. Funding for the Kepler mission is provided by the NASA Science Mission Directorate. The authors acknowledge the efforts of the Kepler Mission team for obtaining the calibrated pixel, light curve, and data validation reports used in this publication, which were generated by the Kepler Mission science pipeline through the efforts of the Kepler Science Operations Center and Science Office. The Kepler Mission is led by the project office at NASA Ames Research Center. Ball Aerospace built the Kepler photometer and spacecraft which is operated by the mission operations center at LASP. These data products are archived at the NASA Exoplanet Science Institute, which is operated by the California Institute of Technology, under contract with the National Aeronautics and Space Administration under the Exoplanet Exploration Program. This research has made use of NASA's Astrophysics Data System. Some of the data presented in this paper were obtained from the Mikulski Archive for Space Telescopes (MAST). STScI is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS5-26555. Support for MAST for non-HST data is provided by the NASA Office of Space Science via grant NNX09AF08G and by other grants and contracts.

APPENDIX A: LIST OF ACRONYMS

NASA missions like Kepler tend to accumulate a large number of acronyms. Hence, we provide a summary of those used in this paper for easy reference, along with their definitions.

  • BKJD: Barycentric Kepler Julian Date: BKJD = BJD-2454833.0.
  • DR: Data Release.
  • DV: Data Validation: The module of the Kepler pipeline that provides diagnostics for TCEs.
  • EB: Eclipsing Binary.
  • EBWG: Kepler's Eclipsing Binary Working Group.
  • FP: False Positive.
  • FPWG: Kepler's False Positive Working Group.
  • HZ: Habitable Zone: The region around a star where a planet could have surface temperatures that allow for the presence of liquid water.
  • KIC: Kepler Input Catalog: The catalog of stars in the Kepler field that was used for target selection.
  • KOI: Kepler Object of Interest: A unique identifier of a signal consistent with a transiting or eclipsing system.
  • MCMC: Markov Chain Monte Carlo.
  • MES: Multiple Event Statistic: The S/N for the detection of a TCE by the TPS module of the Kepler pipeline.
  • PC: Planet Candidate.
  • SES: Single Event Statistic: The S/N for the detection of an individual transit-like event by the TPS module of the Kepler pipeline.
  • S/N: Signal-to-noise Ratio.
  • TCE: Threshold Crossing Event: A series of periodic flux decrements consistent with the signal produced by a transiting planet.
  • TCERT: Threshold Crossing Event Review Team: A committee that reviews TCEs to identify FPs and PCs.
  • TPS: Transiting Planet Search: The module of the Kepler pipeline that searches for transits.
  • TTV: Transit Timing Variation: A deviation in the expected time of transit due to gravitational interaction in multi-planet systems.
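The BKJD definition in the list above amounts to a constant offset from BJD. A minimal sketch of the conversion, with illustrative function names, is:

```python
BKJD_OFFSET = 2454833.0  # BKJD = BJD - 2454833.0


def bjd_to_bkjd(bjd):
    """Convert Barycentric Julian Date to Barycentric Kepler Julian Date."""
    return bjd - BKJD_OFFSET


def bkjd_to_bjd(bkjd):
    """Convert Barycentric Kepler Julian Date back to Barycentric Julian Date."""
    return bkjd + BKJD_OFFSET


print(bjd_to_bkjd(2454964.5))  # 131.5
```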

APPENDIX B: THE ROBOVETTER MNEMONIC FLAGS

In Table 5, we list mnemonic flags that describe the results of individual robovetter tests in the comments column. Here, we describe the meaning of each mnemonic flag.

  • ALT_ROBO_ODD_EVEN_TEST_FAIL: The TCE failed the robovetter's odd–even depth test on the alternate detrending, and thus is marked as an FP due to a significant secondary.
  • ALT_SEC_COULD_BE_DUE_TO_PLANET: A significant secondary eclipse was detected in the alternate detrending, but it was determined to possibly be due to planetary reflection and/or thermal emission. While the significant secondary major flag remains set, the TCE is dispositioned as a PC.
  • ALT_SEC_SAME_DEPTH_AS_PRI_COULD_BE_TWICE_TRUE_PERIOD: A significant secondary eclipse was detected in the alternate detrending, but it was determined to be the same depth as the primary within the uncertainties. Thus, the TCE is possibly a PC that was detected at twice the true orbital period. When this flag is set, it acts as an override to other flags such that the significant secondary major flag is not set, and thus the TCE is dispositioned as a PC if no other major flags are set.
  • ALT_SIG_PRI_MINUS_SIG_POS_TOO_LOW: The difference of the primary and positive event significances, computed by the model-shift test using the alternate detrending, is below the threshold ${\sigma }_{{\rm{FA}}}^{\prime }$. This indicates the primary event is not unique in the phased light curve, and thus the TCE is dispositioned as an FP with the not transit-like major flag set.
  • ALT_SIG_PRI_MINUS_SIG_TER_TOO_LOW: The difference of the primary and tertiary event significances, computed by the model-shift test using the alternate detrending, is below the threshold ${\sigma }_{{\rm{FA}}}^{\prime }$. This indicates the primary event is not unique in the phased light curve, and thus the TCE is dispositioned as an FP with the not transit-like major flag set.
  • ALT_SIG_PRI_OVER_FRED_TOO_LOW: The significance of the primary event divided by the ratio of red noise to white noise in the light curve, computed by the model-shift test using the alternate detrending, is below the threshold ${\sigma }_{{\rm{FA}}}$. This indicates the primary event is not significant compared to the amount of systematic noise in the light curve, and thus the TCE is dispositioned as an FP with the not transit-like major flag set.
  • CENTROID_SIGNIF_UNCERTAIN: The significance of the centroid offset cannot be measured to high enough precision, and thus the centroid module cannot confidently disposition the TCE as an FP. This is typically due to having only a very small number (3 or 4) of offset measurements, all with low S/N.
  • CLEAR_APO: The TCE was marked as an FP due to a centroid offset because the transit occurs on a star that is spatially resolved from the target.
  • CROWDED_DIFF: More than one potential stellar image was found in the difference image. The EYEBALL flag is always set when the CROWDED_DIFF flag is set.
  • DV_ROBO_ODD_EVEN_TEST_FAIL: The TCE failed the robovetter's odd–even depth test on the DV detrending, and thus is marked as an FP due to a significant secondary.
  • DV_SEC_COULD_BE_DUE_TO_PLANET: A significant secondary eclipse was detected in the DV detrending, but it was determined to possibly be due to planetary reflection and/or thermal emission. While the significant secondary major flag remains set, the TCE is dispositioned as a PC.
  • DV_SEC_SAME_DEPTH_AS_PRI_COULD_BE_TWICE_TRUE_PERIOD: A significant secondary eclipse was detected in the DV detrending, but it was determined to be the same depth as the primary within the uncertainties. Thus, the TCE is possibly a PC that was detected at twice the true orbital period. When this flag is set, it acts as an override to other flags such that the significant secondary major flag is not set, and thus the TCE is dispositioned as a PC if no other major flags are set.
  • DV_SIG_PRI_MINUS_SIG_POS_TOO_LOW: The difference of the primary and positive event significances, computed by the model-shift test using the DV detrending, is below the threshold ${\sigma }_{{\rm{FA}}}^{\prime }$. This indicates the primary event is not unique in the phased light curve, and thus the TCE is dispositioned as an FP with the not transit-like major flag set.
  • DV_SIG_PRI_MINUS_SIG_TER_TOO_LOW: The difference of the primary and tertiary event significances, computed by the model-shift test using the DV detrending, is below the threshold ${\sigma }_{{\rm{FA}}}^{\prime }$. This indicates the primary event is not unique in the phased light curve, and thus the TCE is dispositioned as an FP with the not transit-like major flag set.
  • DV_SIG_PRI_OVER_FRED_TOO_LOW: The significance of the primary event divided by the ratio of red noise to white noise in the light curve, computed by the model-shift test using the DV detrending, is below the threshold ${\sigma }_{{\rm{FA}}}$. This indicates the primary event is not significant compared to the amount of systematic noise in the light curve, and thus the TCE is dispositioned as an FP with the not transit-like major flag set.
  • EYEBALL: The metrics used by the centroid module are very close to the decision boundaries, and thus the centroid disposition of this TCE is uncertain and warrants further scrutiny. No TCEs are marked as an FP due to a centroid offset if this flag is set.
  • FIT_FAILED: The transit was not fit by a model in DV and thus no difference images were created for use by the centroid module. Thus, the TCE is not failed due to a centroid offset by default. This flag is typically set for very deep transits due to eclipsing binaries.
  • INVERT_DIFF: One or more difference images were inverted, meaning the difference image claims the star got brighter during transit. This is usually due to variability of the target star and suggests the difference image should not be trusted. When this flag is set, the TCE is marked as a candidate that requires further scrutiny, i.e., the EYEBALL flag is set and the TCE is not marked as an FP due to a centroid offset.
  • KIC_OFFSET: The centroid module measured the offset distance relative to the star's recorded position in the Kepler Input Catalog (KIC), not the out-of-transit centroid. The KIC position is less accurate in sparse fields, but more accurate in crowded fields. If this is the only flag set, there is no reason to believe a statistically significant centroid shift is present (F. Mullally et al. 2016, in preparation).
  • LPP_ALT_TOO_HIGH: The LPP value (Thompson et al. 2015b), as computed using the alternate detrending, is above the robovetter threshold. This indicates the TCE is not transit-shaped, and thus is dispositioned as an FP with the not transit-like major flag set.
  • LPP_DV_TOO_HIGH: The LPP value, as computed using the DV detrending, is above the robovetter threshold. This indicates the TCE is not transit-shaped, and thus is dispositioned as an FP with the not transit-like major flag set.
  • MARSHALL_FAIL: The TCE failed the Marshall metric (Mullally et al. 2015b), which indicates that the TCE's individual transits are not transit-shaped and more likely due to instrumental artifacts. Thus, the TCE is dispositioned as an FP with the not transit-like major flag set.
  • OTHER_TCE_AT_SAME_PERIOD_DIFF_EPOCH: Another TCE on the same target with a higher planet number was found to have the same period as the current TCE, but a significantly different epoch. This indicates the current TCE is an EB with the other TCE representing the secondary eclipse. If the ALT_SEC_COULD_BE_DUE_TO_PLANET and DV_SEC_COULD_BE_DUE_TO_PLANET flags are not set, the TCE is dispositioned as an FP with the significant secondary major flag set.
  • PARENT_IS_X: The TCE has been identified as an FP due to an ephemeris match. This flag indicates the most likely parent, or true physical source of the signal, where X will be substituted for the parent's name. Note that X is not guaranteed to be the true parent, but simply is the most likely source given the information available.
  • PERIOD_ALIAS_IN_ALT_DATA_SEEN_AT_X:1: Using the results of the model-shift test (specifically the phases of the primary, secondary, and tertiary events) a possible period alias is seen at X:1, where X is an integer. This indicates the TCE has likely been detected at a period that is X times longer than the true orbital period. This flag is currently informational only and not used to declare any TCE an FP.
  • RESID_OF_PREV_TCE: The TCE has the same period and epoch as a previous transit-like TCE. This indicates the current TCE is simply a residual artifact of the previous TCE after it was removed from the light curve. Thus, the current TCE is dispositioned as an FP with the not transit-like major flag set.
  • SAME_P_AS_PREV_NTL_TCE: The current TCE has the same period as a previous TCE that was dispositioned as FP with the not transit-like major flag set. This indicates that the current TCE is due to the same not transit-like signal. Thus, the current TCE is dispositioned as an FP with the not transit-like major flag set.
  • SATURATED: The star is saturated. The assumptions employed by the centroid robovetter module break down for saturated stars, so the TCE is marked as a candidate requiring further scrutiny, i.e., the EYEBALL flag is set and the TCE is not marked as an FP due to a centroid offset.
  • SEASONAL_DEPTH_DIFFS_IN_ALT: There appears to be a significant difference in the computed TCE depth when using the alternate detrending light curves from different seasons. This indicates significant light contamination is present, usually due to a bright star at the edge of the image, which may or may not be the source of the signal. As it is impossible to determine whether or not the TCE is on-target from this flag alone, it is currently informational only and not used to declare any TCE an FP.
  • SEASONAL_DEPTH_DIFFS_IN_DV: There appears to be a significant difference in the computed TCE depth when using the DV detrending light curves from different seasons. This indicates significant light contamination is present, usually due to a bright star at the edge of the image, which may or may not be the source of the signal. As it is impossible to determine whether or not the TCE is on-target from this flag alone, it is currently informational only and not used to declare any TCE an FP.
  • SIG_SEC_IN_ALT_MODEL_SHIFT: The significance of the secondary event divided by the ratio of red noise to white noise in the light curve, computed by the model-shift test using the alternate detrending, is above the threshold ${\sigma }_{{\rm{FA}}}$. Also, the difference between the secondary and tertiary event significances, and the difference between the secondary and positive event significances, both computed by the model-shift test using the alternate detrending, are each above the threshold ${\sigma }_{{\rm{FA}}}^{\prime }$. This indicates that there is a unique and significant secondary event in the light curve, i.e., a secondary eclipse. Thus, assuming the ALT_SEC_COULD_BE_DUE_TO_PLANET flag is not set, the TCE is dispositioned as an FP with the significant secondary major flag set.
  • SIG_SEC_IN_DV_MODEL_SHIFT: The significance of the secondary event divided by the ratio of red noise to white noise in the light curve, computed by the model-shift test using the DV detrending, is above the threshold ${\sigma }_{{\rm{FA}}}$. Also, the difference between the secondary and tertiary event significances, and the difference between the secondary and positive event significances, both computed by the model-shift test using the DV detrending, are each above the threshold ${\sigma }_{{\rm{FA}}}^{\prime }$. This indicates that there is a unique and significant secondary event in the light curve, i.e., a secondary eclipse. Thus, assuming the DV_SEC_COULD_BE_DUE_TO_PLANET flag is not set, the TCE is dispositioned as an FP with the significant secondary major flag set.
  • SIGNIF_OFFSET: There is a statistically significant shift in the centroid during transit. This indicates the variability is not due to the target star. Thus, the TCE is dispositioned as an FP with the centroid offset major flag set.
  • THIS_TCE_IS_A_SEC: The TCE is determined to have the same period as, but a different epoch from, a previous transit-like TCE. This indicates that the current TCE corresponds to the secondary eclipse of an EB (or of a planet, if the ALT_SEC_COULD_BE_DUE_TO_PLANET or DV_SEC_COULD_BE_DUE_TO_PLANET flags are set). Thus, the current TCE is dispositioned as an FP with both the not transit-like and significant secondary major flags set.
  • TOO_FEW_CENTROIDS: The PRF centroid fit used by the centroid module does not always converge, even in high S/N difference images. This flag is set if centroid offsets are recorded for fewer than three high S/N difference images.
  • TOO_FEW_QUARTERS: Fewer than three difference images of sufficiently high S/N are available, and thus very few tests in the centroid module are applicable to the TCE. If this flag is set in conjunction with the CLEAR_APO flag, the source of the transit may be on a star clearly resolved from the target.
  • TRANSITS_NOT_CONSISTENT: The TCE has a max_ses_in_mes/mes ratio greater than 0.9 and a period greater than 90 days. This indicates that the TCE is dominated by a single large event, and thus is due to a systematic feature such as a sudden pixel sensitivity dropout. Thus, the TCE is dispositioned as an FP with the not transit-like major flag set.
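
The way the minor flags listed above map onto major flags can be illustrated with a short sketch. The Python below is a minimal illustration under stated assumptions, not the robovetter's actual code: the function names `transits_not_consistent` and `major_flags` are hypothetical, only flags described in this part of the list are covered, and the rule that INVERT_DIFF or SATURATED prevents a centroid-offset FP is read directly from the flag descriptions rather than from the robovetter source.

```python
from typing import Set

# Major-flag labels as they are named in the flag descriptions above.
NOT_TRANSIT_LIKE = "not transit-like"
SIGNIFICANT_SECONDARY = "significant secondary"
CENTROID_OFFSET = "centroid offset"

# Minor flags that, per the descriptions above, set the not transit-like major flag.
NTL_MINOR_FLAGS = {
    "LPP_ALT_TOO_HIGH",
    "LPP_DV_TOO_HIGH",
    "MARSHALL_FAIL",
    "RESID_OF_PREV_TCE",
    "SAME_P_AS_PREV_NTL_TCE",
    "THIS_TCE_IS_A_SEC",
    "TRANSITS_NOT_CONSISTENT",
}


def transits_not_consistent(max_ses_in_mes: float, mes: float, period_days: float) -> bool:
    """Return True when a single event dominates a long-period TCE.

    Uses the thresholds quoted above: ratio > 0.9 and period > 90 days.
    """
    return (max_ses_in_mes / mes) > 0.9 and period_days > 90.0


def major_flags(minor: Set[str]) -> Set[str]:
    """Map a set of minor flags to the major flags they imply (partial sketch)."""
    major = set()

    if minor & NTL_MINOR_FLAGS:
        major.add(NOT_TRANSIT_LIKE)

    # THIS_TCE_IS_A_SEC sets both the not transit-like and significant secondary flags.
    if "THIS_TCE_IS_A_SEC" in minor:
        major.add(SIGNIFICANT_SECONDARY)

    # A secondary that could be due to a planet does not count as an eclipse.
    alt_planet = "ALT_SEC_COULD_BE_DUE_TO_PLANET" in minor
    dv_planet = "DV_SEC_COULD_BE_DUE_TO_PLANET" in minor
    if "SIG_SEC_IN_ALT_MODEL_SHIFT" in minor and not alt_planet:
        major.add(SIGNIFICANT_SECONDARY)
    if "SIG_SEC_IN_DV_MODEL_SHIFT" in minor and not dv_planet:
        major.add(SIGNIFICANT_SECONDARY)
    if "OTHER_TCE_AT_SAME_PERIOD_DIFF_EPOCH" in minor and not (alt_planet or dv_planet):
        major.add(SIGNIFICANT_SECONDARY)

    # Assumption: INVERT_DIFF or SATURATED blocks a centroid-offset FP,
    # because the centroid measurement is then not trusted.
    untrusted_centroid = bool(minor & {"INVERT_DIFF", "SATURATED"})
    if "SIGNIF_OFFSET" in minor and not untrusted_centroid:
        major.add(CENTROID_OFFSET)

    return major
```

In this sketch, a TCE carrying only SIGNIF_OFFSET receives the centroid offset major flag, whereas one carrying both SIGNIF_OFFSET and SATURATED receives none and would instead be left for further scrutiny. Purely informational flags, such as the SEASONAL_DEPTH_DIFFS and PERIOD_ALIAS flags, never change the result.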
