The Occurrence of Rocky Habitable Zone Planets Around Solar-Like Stars from Kepler Data

We present occurrence rates for rocky planets in the habitable zones (HZ) of main-sequence dwarf stars based on the Kepler DR25 planet candidate catalog and Gaia-based stellar properties. We provide the first analysis in terms of star-dependent instellation flux, which allows us to track HZ planets. We define $\eta_\oplus$ as the HZ occurrence of planets with radius between 0.5 and 1.5 $R_\oplus$ orbiting stars with effective temperatures between 4800 K and 6300 K. We find that $\eta_\oplus$ for the conservative HZ is between $0.37^{+0.48}_{-0.21}$ (errors reflect 68\% credible intervals) and $0.60^{+0.90}_{-0.36}$ planets per star, while the optimistic HZ occurrence is between $0.58^{+0.73}_{-0.33}$ and $0.88^{+1.28}_{-0.51}$ planets per star. These bounds reflect two extreme assumptions about the extrapolation of completeness beyond orbital periods where DR25 completeness data are available. The large uncertainties are due to the small number of detected small HZ planets. We find similar occurrence rates using both a Poisson likelihood Bayesian analysis and Approximate Bayesian Computation. Our results are corrected for catalog completeness and reliability. Both completeness and the planet occurrence rate are dependent on stellar effective temperature. We also present occurrence rates for various stellar populations and planet size ranges. We estimate with $95\%$ confidence that, on average, the nearest HZ planet around G and K dwarfs is about 6 pc away, and there are about 4 HZ rocky planets around G and K dwarfs within 10 pc of the Sun.


INTRODUCTION
One of the primary goals of the Kepler mission Koch et al. 2010;Borucki 2016) is to determine the frequency of occurrence of habitablezone rocky planets around Sun-like stars, also known as "η ⊕ ". Habitable-zone rocky planets are broadly construed as any rocky planet in its star's habitable zone (HZ), roughly defined as being at the right distance from the star so that its surface temperature would per-mit liquid water (see §2). Measuring η ⊕ informs theories of planet formation, helping us to understand why we are here, and is an important input to mission design for instruments designed to detect and characterize habitable-zone planets such as LUVOIR (The LUVOIR Team 2019) and HabEX (Gaudi et al. 2020).
Kepler's strategy to measure η ⊕ was to continuously observe >150,000 Solar-like main-sequence dwarf stars (primarily F, G, and K) with a highly sensitive photometer in Solar orbit, identifying planets through the detection of transits. In the process, Kepler revolutionized our perspective of exoplanets in the Galaxy. The planet catalog in the final Kepler data release 25 (DR25) contains 4034 planet candidates (PCs; Thompson et al. 2018), leading to the confirmation or statistical validation of over 2,300 exoplanets 1 -more than half of all exoplanets known today.
Identifying habitable zone rocky planets proved to be a greater challenge than anticipated. Based on the sensitivity of the Kepler photometer and the expectation that Solar variability is typical of quiet main-sequence dwarfs, it was believed that four years of observation would detect a sufficient number of rocky habitable-zone planets to constrain their frequency of occurrence. However, Kepler observations showed that stellar variability was, on average, ∼50% higher than Solar variability (Gilliland et al. 2011), which suppressed the number of habitable-zone rocky planets that could be detected in four years. In response, Kepler's observational time was extended to eight years, but the failure of reaction wheels, required to maintain precise pointing, prevented continuation of high-precision observations in the original Kepler field after four years (Howell et al. 2014). Furthermore, by definition, Kepler planet candidates must have at least three observed transits. The longest orbital period with three transits that can be observed in the four years of Kepler data is 710 days (assuming fortuitous timing in when the transits occur). Given that the habitable zone of many F and late G stars require orbital periods longer than 710 days, Kepler is not capable of detecting all habitable-zone planets around these stars.
The result is Kepler data in which transiting rocky habitable zone planets are often near or beyond Kepler's detection limit. Of the thousands of planets in the DR25 catalog, relatively few are unambiguously rocky and near their habitable zones: there are 56 such PCs with radius ≤ 2.5 R ⊕ , and 9 PCs with radius ≤ 1.5 R ⊕ (using planet radii from Berger et al. (2020a)). As described in §2, we expect many planets near the habitable zone larger than 1.5 R ⊕ to be non-rocky. These small numbers present challenges in the measurement of the frequency of occurrence of habitable-zone planets.
Converting a planet catalog into an underlying occurrence rate is also challenging due to the existence of selection effects and biases, with issues only exacerbated in the η ⊕ regime. Planet candidate catalogs generally suffer from three types of error: • The catalog is incomplete, missing real planets.
• The catalog is unreliable, being polluted with false positives (FPs).
Near the detection limit, both completeness and reliability can be low, requiring careful correction for the computation of occurrence rates. The DR25 planet candidate catalog includes several products that facilitate the characterization of and correction for completeness and reliability (Thompson et al. 2018). The data supporting completeness characterization, however, are only supplied for orbital periods of 500 days or less, requiring extrapolation of completeness for planets beyond these orbital periods. These issues are summarized in Figure 1, which show the DR25 PC population and its observational coverage, observational error, completeness, and reliability. The details of these populations are given in Appendix C.
Our calculation of habitable zone occurrence will be in terms of planet radius and instellation flux, measuring the photon flux incident on the planet from its host star, which allows us to consider each star's habitable zone. We will proceed in two steps: 1. Develop a model describing the planet population in the neighborhood of the habitable zone ( §3.4).
Because this is a statistical study, the model will be based on a large number of stars, using the observed DR25 planet candidate catalog and Gaiabased stellar properties, and will include corrections for catalog completeness and reliability.
2. The derivation of the average number of rocky planets per star in each star's habitable zone from the planet population model ( §3.5). This will often be done in a subset of the parameter space used to compute the population model.
When computing a quantity over a desired range of parameters such as radius and instellation flux, it is often the case that using data from a wider range will give better results. For example, it is well known that polynomial fits to data have higher uncertainty near the boundaries of the data. As explained in §2, we are primarily interested in rocky habitable zone planets, with planet radii between 0.5 R ⊕ and 1.5 R ⊕ and instellation flux within each star's estimated habitable zone, for stars with effective temperature between 4800 K and 6300 K.
To create our population model, we will use a larger domain with a planet radius range of 0.5 R ⊕ to 2.5 R ⊕ , and an instellation range from 0.2 to 2.2 times Earth's insolation, which encloses the habitable zones of all the stars we consider. We will focus on using two stellar populations: one exactly matching our desired effective temperature range of 4800 K to 6300 K and one with a larger range of 3900 K to 6300 K to investigate whether the larger range will improve our results. Most of our results will be reported for both stellar populations because it is possible that including stars in the 3900 K to 4800 K range will bias our results. We will have a population model for each choice of stellar population. Once we have our population models, we will use them to compute our definition of η ⊕ , the average number of planets per star with radii between 0.5 and 1.5 R ⊕ , in the star's habitable zone, averaged over stars with effective temperature from 4800 K to 6300 K. In the end we will find that the two stellar populations predict similar median values for η ⊕ , but the model using stars with effective temperatures from 3900 K to 6300 K yields significantly smaller (though still large) uncertainties. While we are focused on our definition of η ⊕ , occurrence rates over other ranges of planet radius and stellar temperature are of interest. We will use population model based on the 3900 K to 6300 K stellar population to compute the average number of habitable zone planets per star for various ranges of planet radius and stellar effective temperature.

Previous Kepler-based η ⊕ Estimates
Attempts to measure η ⊕ and general occurrence rates with Kepler have been made since the earliest Kepler catalog releases (Borucki et al. 2011). Youdin (2011) and Howard et al. (2012) were two influential early studies, in which planets found in only the first four months of data (Q0-Q2) were used to constrain Kepler occurrence rates. Youdin (2011) developed a maximum likelihood method to fit an underlying planetary distribution function, which later influenced the Poisson likelihood function method adopted by, e.g., Burke et al. (2015); Bryson et al. (2020a). Howard et al. (2012) took an alternative approach of estimating occurrence rates in bins of planets, defined over a 2D grid of planet radius and orbital period. In each bin, non-detections are corrected for by weighting each planet by the inverse of its detection efficiency. This inverse detection efficiency method (IDEM) is one of the most popular approaches in the literature. Catanzarite & Shao (2011) and Traub (2012) (Q0-Q5, Borucki et al. 2011) were among the first papers to focus on the η ⊕ question specifically. Later η ⊕ papers from Dressing & Charbonneau (2013) (Q1-Q6, Batalha et al. 2013), Kopparapu et al. (2013) (Q1-Q6), Burke et al. (2015) (Q1-Q16, Mullally et al. 2016), and Silburt et al. (2015) (Q1-Q16) were able to take advantage of newer planet catalogs based on increased amounts of data. Other papers have used custom pipelines to search Kepler light curves to estimate η ⊕ with indepen-dently produced planet catalogs: namely, Petigura et al. (2013) Garrett et al. 2018).
Comparisons between these η ⊕ studies are challenging due to the wide variety of catalogs used, some of which are based on only a fraction of the data as others. Characterization of completeness has also varied between authors, with some assuming a simple analytic model of detection efficiency (e.g., Youdin 2011;Howard et al. 2012), some empirically estimating detection efficiency with transit injection/recovery tests (e.g., Petigura et al. 2013;Burke et al. 2015), and others simply assuming a catalog is complete beyond some threshold (e.g., Catanzarite & Shao 2011). Borucki et al. (2011) provided a comprehensive analysis of completeness bias, reliability against astrophysical false positives and reliability against statistical false alarms based on manual vetting and simple noise estimates. Fully automated vetting was implemented via the Robovetter (Coughlin 2017) for the Kepler DR24 ) and DR25 catalogs. The final Kepler data release (DR25), based on the full set of Kepler observations and accompanied by comprehensive data products for characterizing completeness, has been essential for alleviating issues of completeness and reliability. The DR25 catalog is now the standard used by occurrence rate studies (e.g., Mulders et al. 2018;Hsu et al. 2018;Zink et al. 2019;Bryson et al. 2020a).
DR25 was the first catalog to include data products that allowed for the characterization of catalog reliability against false alarms due to noise and systematic instrumental artifacts, which are the most prevalent contaminants in the η ⊕ regime. Thus nearly all previous works did not incorporate reliability against false alarms in their estimates. Bryson et al. (2020a) was the first to directly take into account reliability against both noise/systematics and astrophysical false positives, and in doing so found that occurrence rates for small planets in long-period orbits dropped significantly after reliability correction. Mulders et al. (2018) attempted to mitigate the impact of contamination by using a DR25 Disposition Score cut (see §7.3.4 of Thompson et al. 2018) as an alternative to reliability correction. As shown in Bryson et al. (2020b), while this approach does produce a higher reliability planet catalog, explicit accounting for reliability is still necessary for accurate occurrence rates.
Studies have also varied in stellar property catalogs used, and exoplanet occurrence rates have been shown to be sensitive to such choices. For instance, the discovery of a gap in the radius distribution of small planets, first uncovered in observations by Fulton et al. (2017), was enabled by improvements in stellar radius measurements by the California Kepler Survey (CKS; Petigura et al. 2017;Johnson et al. 2017). The use of Gaia DR2 parallaxes, which have resulted in a reduction in the median stellar radius uncertainty of Kepler stars from ≈ 27% to ≈ 4% (Berger et al. 2020b), has been another significant improvement with important implications for η ⊕ . Bryson et al. (2020a) showed that occurrence rates of planets near Earth's orbit and size can drop by a factor of 2 if one adopts planet radii based on Gaia stellar properties rather than pre-Gaia Kepler Input Catalog stellar properties.

Our Work
Measuring η ⊕ requires a definition of what it actually means to be considered a rocky planet in the habitable zone. Different authors use different definitions, including regarding whether η ⊕ refers to the number of rocky habitable zone planets per star, or the number of stars with rocky habitable zone planets. In this paper, for reasons detailed in §2, we define η ⊕ as the average number of planets per star with planet radius between 0.5 and 1.5 Earth radii, in the star's habitable zone, where the average is taken over stars with effective temperatures between 4800 K and 6300 K. We compute η ⊕ for both conservative and optimistic habitable zones, denoted respectively as η C ⊕ and η O ⊕ . Most of the existing literature on habitable zone occurrence rates are in terms of orbital period, where a single period range is adopted to represent the bounds of the habitable zone for the entire stellar population considered. However, no single period range covers the habitable zone for a wide variety of stars. Figure 2 shows two example period ranges used for habitable zone occurrence rate studies relative to the habitable zone of each star in our stellar parent sample. The SAG13 2 habitable zone range of 237 ≤ period ≤ 860 days is shown in blue, and ζ ⊕ , defined in Burke et al. (2015) as within 20% of Earth's orbital period, is shown in orange. While these period ranges cover much of the habitable zone for G stars, they miss significant portions of the habitable zones of K and F stars, and include regions outside the habitable zone even when restricted to G stars. This will be true for any fixed choice of orbital period range 2 https://exoplanets.nasa.gov/exep/exopag/sag/#sag13 for the range of stellar effective temperatures required for good statistical analysis. Such coverage will not lead to accurate occurrence rates of planets in the habitable zone. Given that the period ranges of many habitable zone definitions also extend beyond the detection limit of Kepler, computing η ⊕ requires extrapolation of a fitted population model to longer orbital periods. Lopez & Rice (2018) and Pascucci et al. (2019) present evidence and theoretical arguments that inferring the population of small rocky planets at low instellation from the population of larger planets at high instellation can introduce significant overestimates of η ⊕ .
For these reasons, we choose to work in terms of the instellation flux, measuring the photon flux incident on the planet from its host star, rather than orbital period. In §3 we describe how we adopt existing occurrence rate methods and completeness characterizations to use instellation flux instead of orbital period. We address concerns with extrapolating completeness to long orbital periods by providing bounds on the impact of the limited coverage of completeness data ( §3.3). Following Howard et al. (2012); Youdin (2011);Burke et al. (2015), among others, we compute the number of planets per star f . As in Youdin (2011) and Burke et al. (2015), we first compute a population model in terms of the differential rate λ ≡ d 2 f /dr dI, where r is the planet radius and I is the instellation flux. We consider several possible functional forms for λ, and will allow λ to depend on the stellar host effective temperature. We compute λ over the radius range 0.5 R ⊕ ≤ r ≤ 2.5 R ⊕ and instellation flux range 0.2 I ⊕ ≤ I ≤ 2.2 I ⊕ , averaged over the effective temperatures of the stellar population used for the computation ( §3.4). Occurrence rates will be computed by integrating λ over the desired planet radius and instellation flux range, and averaging over the desired effective temperature range to give f , the average number of planets per star. ( §3.5).
By restricting our analysis to planets with r ≤ 2.5 R ⊕ in regions of instellation flux close to the habitable zone, we believe we are avoiding the biases pointed out by Lopez & Rice (2018) and Pascucci et al. (2019) -As seen in Figure 1, there are more detected planets with 1.5 R ⊕ ≤ r ≤ 2.5 R ⊕ than planets with 0.5 R ⊕ ≤ r ≤ 1.5 R ⊕ , so our results in §4 will be driven by these larger planets, but all planets we consider are at similar low levels of instellation. In Figure 2 of Lopez & Rice (2018) we note that for instellation flux between 10 and 20 there is little change in the predicted relative sizes of the 1.5 R ⊕ ≤ r ≤ 2.5 R ⊕ and 0.5 R ⊕ ≤ r ≤ 1.5 R ⊕ planet populations. Naively extrapolating this to instellation < 2 in the HZ, we infer that the sizes of these larger and smaller planet populations in the HZ are simi-lar. Therefore by working at low instellation flux we are likely less vulnerable to overestimating the HZ rocky planet population. Figure 2. The habitable zone flux range compared with example orbital periods, previously used to estimate habitable zone occurrence for G, K, and F stars. For each star in the stellar parent sample, we show the instellation flux range of each orbital period range, with the SAG13 instellation flux range shown as the blue region (comprised of a horizontal blue line for each star showing the flux range for that orbital period range) and ζ⊕ shown as the orange region. The solid green lines are the boundaries of the optimistic habitable zone, while the dashed green lines are the boundaries of the conservative habitable zone (see §2). The planet population is the same as in Figure 1, and are sized by their radius.
We use both Poisson likelihood-based inference with Markov-Chain Monte Carlo (MCMC) and likelihoodfree inference with Approximate Bayesian Computation (ABC). The Poisson likelihood method is one of the most common approaches to calculating exoplanet occurrence rates (e.g. Burke et al. 2015;Zink et al. 2019;Bryson et al. 2020a), while ABC was only applied for the first time in Hsu et al. (2018Hsu et al. ( , 2019. As described in §3.4.2, these methods differ in their treatment of reliability and input uncertainty, and allow us to assess possible dependence of our result on the assumption of a Poisson likelihood function. We present our results in §4. Recognizing the importance of reliability correction in the η ⊕ regime, and confirming the same impact of reliability as Bryson et al. (2020a), we choose to report only our reliability-incorporated results. We also present results both with and without incorporating uncertainties in planet radius and instellation flux and host star effective temperature in our analysis. Our final recommended population models and implications for η ⊕ , as well as how our results relate to previous estimates are discussed in §5.
All results reported in this paper were produced with Python code, mostly in the form of Python Jupyter notebooks, found at the paper GitHub site 3 .

Notation
When summarizing a distribution, we use the notation m +e1 −e2 to refer to the 68% credible interval [m − e2, m + e1], where m is the median and "n% credible interval" means that the central n% of the values fall in that interval. We caution our distributions are typically not Gaussian, so this interval should not be treated as "1σ". We will also supply 95% and 99% credible intervals for our most important results. We use the following notation throughout the paper: r: Planet radius in units of Earth radii R ⊕ . I: Planet instellation flux in units of Earth instellation I ⊕ .
T eff : Stellar effective temperature in Kelvins. When referring to a planet, this is the effective temperature of that planet's host star. f : The number of planets per star, typically a function of r, I and T eff . λ: The differential rate population model ≡ d 2 f /dr dI, typically a function of r, I and T eff . λ is defined by several parameters, for example exponents when λ is a power law. θ: the vector of parameters that define λ, whose contents depend on the particular form of λ. η ⊕ without modification refers to the average number of habitable-zone planets per star with 0.5 R ⊕ ≤ r ≤ 1.5 R ⊕ and host star effective temperature between 3900 K and 6300 K, with the conservative or optimistic habitable zone specified in that context. η C ⊕ and η O ⊕ , respectively, specifically refer to occurrence in the conservative and optimistic habitable zones. Additional subscripts on η ⊕ refer to different stellar populations. For example η C ⊕,GK is the occurrence of conservative habitable zone planets with 0.5 R ⊕ ≤ r ≤ 1.5 R ⊕ around GK host stars.

Characterizing Rocky Planets in the Habitable Zone
A key aspect in computing habitable zone planet occurrence rates is the location and width of the Habitable Zone. Classically, it is defined as the region around a star in which a rocky-mass/size planet with an Earthlike atmospheric composition (CO 2 , H 2 O, and N 2 ) can sustain liquid water on its surface. The insistence on surface liquid water is important for the development of life as we know it, and the availability of water on the surface assumes that any biological activity on the surface alters the atmospheric composition of the planet, betraying the presence of life when observed with remote detection techniques.
Various studies estimate the limits and the width of the HZ in the literature (see Kopparapu (2018) and Kopparapu et al. (2019) for a review), and explored the effect of physical processes such as tidal locking, rotation rate of the planet, combination of different greenhouse gases, planetary mass, obliquity, and eccentricity on HZ limits. These effects paint a more nuanced approach to identify habitability limits, and are particularly useful to explore those environmental conditions where habitability could be maintained. However, for the purpose of calculating the occurrence rates of planets in the HZ, it is best to use a standard for HZ limits as a first attempt, such as Earth-like conditions. One reason is that it would become computationally expensive to estimate the occurrence rates of HZ planets considering all the various HZ limits arising from these planetary and stellar properties. Furthermore, future flagship mission concept studies like LUVOIR (The LUVOIR Team 2019), HabEX (Gaudi et al. 2020), and OST (Meixner et al. 2019) use the classical HZ limits as their standard case to estimate exoEarth mission yields and for identifying associated biosignature gases. Therefore, in this study we use the conservative and optimistic HZ estimates from Kopparapu et al. (2014), where the conservative inner and outer edges of the HZ are defined by the 'runaway greenhouse' and 'maximum greenhouse' limits, and the optimistic inner and outer HZ boundaries are the 'recent Venus' and 'early Mars' limits. By using these HZ limits, we (1) are able to make a consistent comparison with already published occurrence rates of HZ planets in the literature that have also used the same HZ limits, (2) provide a range of values for HZ planet occurrence, and (3) obtain an 'average' occurrence rate of planets in the HZ, as the conservative and optimistic HZ limits from Kopparapu et al. (2014) span the range of HZ limits from more complex models and processes.
We consider planets in the 0.5 − 1.5 R ⊕ size range to calculate rocky planet occurrence rates, as studies have suggested that planets that fall within these radius bounds are most likely to be rocky (Rogers 2015;Wolfgang et al. 2016;Chen & Kipping 2017;Fulton et al. 2017). While some studies have indicated that the rocky regime can extend to as high as 2.5 R ⊕ (Otegi et al. 2020), many of these high radius-regime planets seem to be highly irradiated planets, receiving stellar fluxes much larger than the planets within the HZ. Nevertheless, we have also calculated occurrence rates of planets with radii up to 2.5 R ⊕ . We note that Kane et al. (2016) also used Kopparapu et al. (2014) HZ estimates to identify HZ planet candidates using DR24 planet candidate catalog and DR25 stellar properties.
We also limit the host stellar spectral types to stars with 4800 ≤ T eff ≤ 6300 K, covering mid K to late F. The reason for limiting to T eff > 4800 K is two fold: (1) The inner working angle (IWA, the smallest angle on the sky at which a direct imaging telescope can reach its designed ratio of planet to star flux) for the LUVOIR coronagraph instrument ECLIPS falls off below 48 milli arc sec at 1 micron (3λ/D) for a planet at 10 pc for T eff ≤ 4800 K, and (2) Planets are likely tidal-locked or synchronously rotating below 4800 K that could potentially alter the inner HZ limit significantly (Yang et al. 2013;Yang et al. 2014;Wolf & Toon 2015;Way et al. 2015;Godolt et al. 2015;Kopparapu et al. 2016;Kopparapu et al. 2017;Bin et al. 2018). The upper limit of 6300 K is a result of planets in the HZs having longer orbital periods around early F-stars, where Kepler is not capable of detecting these planets, as described in §1.

Effective Temperature Dependence of the Width of the Habitable Zone
The width of the HZ for hotter stars is larger than the width for cooler stars, implying that the habitable zone occurrence rate may be dependent on the host star effective temperature. In this section we derive an approximate form for this effective temperature dependence, which we refer to as the "geometric effect".
We compute the instellation flux I on a planet orbiting a particular star as I = R 2 * T 4 /a 2 , where R * is the stellar radius in Solar radii, T = T eff /T is the effective temperature divided by the Solar effective temperature, and a is the semi-major axis of the planet orbit in AU. We assume the orbit is circular. Then the size of the habitable zone ∆a is determined by the instellation flux at the inner and outer habitable zone boundaries, I inner and I outer , as The factor 1/ √ I outer − 1/ √ I inner has a weak T eff dependence, ranging from 1.25 at 3900 K to 0.97 at 6300 K, which we crudely approximate as constant in this paragraph. We also observe that, for the main-sequence dwarf stellar populations we use in our computations (described in §3.1), R * has an approximately linear dependence on T , which we write as (τ T + R 0 ) (τ ≈ 1.8 and R 0 ≈ −0.74). Therefore ∆a ∝ (τ T + R 0 ) T 2 . (2) So even if the differential occurrence rate λ has no dependence on a, and therefore no dependence on I, the habitable zone occurrence rate may depend on T eff simply because hotter stars have larger habitable zones.
Several studies, such as Burke et al. (2015) and Bryson et al. (2020a), studied planet occurrence in terms of the orbital period p and have shown that df /dp is wellapproximated by a power law p α . In Appendix A we show that this power law, combined with the relationship between instellation flux and period, implies that the instellation flux portion of the differential rate function λ, df /dI, has the form where ν = − 3 4 α − 7 3 and δ = −ν −1. This form incorporates the combined effects of the size of the habitable zone increasing with T eff , as well as dependence from the period power law p α . The derivation in Appendix A uses several crude approximations, so Equation (3) is qualitative rather quantitative.
In §3.4 we consider forms of the population model λ that separate the geometric effect in Equation (2) from a possible more physical dependence on T eff , and compare them with direct measurement of the T eff -dependence. To separate the geometric effect, we incorporate a geometric factor g(T eff ) inspired by Equation (2). Because of the crude approximations used to derive Equations (2) and (3) we use an empirical fit to the habitable zone width ∆a for all stars in our stellar sample. Because we will use this fit in models of the differential population population rate df /dI in §3.4.1, we perform the fit computing ∆a for each star using a fixed flux interval ∆I ∈ [0.25, 1.8]. Because a = R 2 * T 4 / √ I, ∆a is just a scaling of each star's luminance R 2 * T 4 by the factor 1/ √ 0.25 − 1/ √ 1.8. As shown in Figure 3, ∆a is well-fit, with well-behaved residuals, by the broken power law If the semi-major axes of planets are uniformly distributed in our stellar sample, then we expect that habitable zone planet occurrence would have a T eff dependence due to Equation 4. In individual planetary systems, however, there is evidence of constant spacing in log(a) (Weiss et al. 2018), implying spacing proportional to ∆a/a. In this case there would be no impact of the larger habitable zones with increasing T eff : taking a to be the average of the inner and outer semi-major axes, the star-dependent terms cancel, so ∆a/a is the same for all stars, independent of T eff . This would imply that HZ occurrence has no T eff dependence due to the increasing size of the HZ. Figure 3. The width of the optimistic habitable zone (outer HZ boundary minus inner HZ boundary) as a function of effective temperature for all the stars in our parent sample with effective temperature between 3900 K and 6300 K. The line shows the broken power law fit in Equation (4). The percentage residual from the fit is shown in the lower panel.

METHODOLOGY
We base our occurrence rate of f planets per star on a differential population rate model λ(I, r, T eff ) = d 2 f (I,r,T eff ) dI dr that describes how f varies as a function of incident stellar flux I and planet radius r. We allow λ(I, r, T eff ) (and therefore f (I, r, T eff )) to depend on the host star effective temperature T eff . In §3.4 we use the DR25 planet candidate catalog to determine λ. We cannot, however, simply take all the planet candidates in the DR25 catalog at face value. We must statistically characterize and correct for errors in the catalog.
The DR25 planet candidate catalog contains 4034 planet candidates, identified through a uniform method of separating planet candidates from false positives and false alarms (Thompson et al. 2018). This automated method is known to make mistakes, being both incomplete due to missing true transiting planets, and unreliable due to misidentifying various types of astrophysical false positives and instrumental false alarms as transiting planets. Low completeness and low reliability are particularly acute near the Kepler detection limit, which happens to coincide with the habitable zones of F, G, and K stars. We characterize DR25 completeness and reliability using the synthetic data described in Thompson et al. (2018) with methods described in Bryson et al. (2020a). We correct for completeness and reliability when determining the population rate λ using the methods of Bryson et al. (2020a) and Kunimoto & Bryson (2020).
The methods used in Bryson et al. (2020a) and Kunimoto & Bryson (2020) computed population models in orbital period and radius. Generalizing these methods to instellation flux, radius, and effective temperature is relatively straightforward, with the treatment of completeness characterization presenting the largest challenge. In this section we summarize these methods, focusing on the changes required to operate in instellation flux rather than period and to allow for dependence on T eff .

Stellar Populations
As in Bryson et al. (2020a), our stellar catalog uses the Gaia-based stellar properties from Berger et al. (2020b) combined with the DR25 stellar catalog at the NASA Exoplanet Archive 1 , with the cuts described in the baseline case of Bryson et al. (2020a). We summarize these cuts here for convenience.
We begin by merging the catalogs from Berger et al. (2020b), the DR25 stellar catalog (with supplement), and Berger et al. (2018), keeping only the 177,798 stars that are in all three catalogs. We remove poorly characterized, binary and evolved stars, as well as stars whose observations were not well suited for long-period transit searches (Burke et al. 2015;Burke & Catanzarite 2017) with the following cuts: • Remove stars that, according to Berger et al. (2018), are likely binaries, leaving 160,633 stars.
• Remove stars that have evolved off the main sequence, recomputing the Evol flag described in Berger et al. (2018) using the Berger et al. (2020b) stellar properties, leaving 105,118 stars.
• Remove noisy targets identified in the KeplerPorts package 4 , leaving 103,626 stars.
• Remove stars with NaN limb darkening coefficients, leaving 103,371 stars.
• Remove stars with NaN observation duty cycle, leaving 102,909 stars.
• Remove stars with a decrease in observation duty cycle > 30% due to data removal from other transits detected on these stars, leaving 98,672 stars.
• Remove stars with data span < 1000 days, leaving 87,765 stars.
Selecting the FGK stars with effective temperature between 3900 K and 7300 K, which is a superset of the stellar populations we consider in this paper, we have 80,929 stars. We are primarily interested in habitable zone occurrence rates for stars with effective temperatures hotter than 4800 K, and Kepler observational coverage is very poor above 6300 K (see §2). We fit our population model using two stellar populations, and examine the solutions to determine which range of stellar temperature is best for computing the desired occurrence rate. These stellar populations are: • Hab: Stars with effective temperature between 4800 K and 6300 K (61,913 stars).
The effective temperature distribution of the stars in these populations is shown in Figure 4. This distribution has considerably fewer cooler stars than we believe is the actual distribution of stars in the Galaxy. Our analysis is weighted by the number of stars as a function of effective temperature. There are two stellar population cuts recommended in Burke & Catanzarite (2017) that we do not apply. The first is the requirement that stellar radii be less than 1.35 R (1.25 R in Burke & Catanzarite (2017), but Burke now recommends 1.35 R (private communication)). We do not impose this stellar radius cut, instead opting for the physically motivated selection based on the Evol flag. After our cuts, 6.8% of the the hab2 population contains stars that have radii larger than 1.35 R . The completeness analysis for these stars is not expected to be as accurate as for smaller stars.
The second recommended cut that we do not apply is the requirement that the longest transit duration be less than 15 hours. This cut is due to the fact planet search in the Kepler pipeline does not examine transit durations longer than 15 hours (Twicken et al. 2016). For the hab2 population, assuming circular orbits, transit durations of planets at the inner optimistic habitable zone boundary exceed 15 hours for 2.7% of the stars. Transit durations of planets at the outer optimistic habitable zone boundary exceed 15 hours for 35% of the stars, with the duration being less than 25 hours for 98.7% of the stars. While transit durations longer than 15 hours will have an unknown impact on the completeness analysis of these stars, there is evidence that the impact is small. KOI 5236.01, for example, has a transit duration of 14.54 hours, orbital period of 550.86 days and a transit signal to noise ratio (S/N) of 20.8. KOI 5236.01 is correctly identified in the Kepler pipeline in searches for transit durations of 3.5 to 15 hours. KOI 7932.01, has a transit duration of 14.84 hours, orbital period of 502.256 days, and a transit S/N of 8.1, among the smallest transit S/N for planet candidates with period > 450 days. KOI 7932.01 is correctly identified in searches using transit durations of 9 to 15 hours. So even for low S/N transits, the transit can be identified in searches for transit durations 9/15 times the actual duration. If these examples are typical, we can expect that transit durations of up to 25 hours will be detected. While these examples do not show that the impact of long transits on completeness is actually small, the bulk of these long durations occur in orbits beyond 500 days, so they are absorbed by the upper and lower bounds in completeness we describe in §3.3.2. We are confident that long transit durations, as well as large stars, cause completeness to decrease, so their impact falls within these upper and lower bounds.

Planet Input Populations
We use planet properties from the Kepler DR25 Threshold Crossing Events (TCE) catalog (Twicken et al. 2016), with the Gaia-based planet radii and instellation flux from Berger et al. (2020a).
Three DR25 small planet candidates that are near their host star's habitable zones (planet radius ≤ 2.5 R ⊕ and instellation flux between 0.2 and 2.2 I ⊕ ) are not included in our planet sample. KOI 854.01 and KOI 4427.01 are orbiting host stars with effective temperatures ≤ 3900 K, and KOI 7932.01's host star is cut from our stellar populations because it is marked "Evolved" (see Bryson et al. 2020a).

Detection and Vetting Completeness
The DR25 completeness products are based on injected data -a ground-truth of transiting planets obtained by injecting artificial transit signals with known characteristics on all observed stars at the pixel level . A large number of transits were also injected on a small number of target stars to measure the dependence of completeness on transit parameters and stellar properties (Burke & Catanzarite 2017). The data are then analyzed by the Kepler detection pipeline (Jenkins et al. 2010) to produce a catalog of detections at the injected ephemerides called injected and recovered TCEs, which are then sent through the same Robovetter used to identify planet candidates.
Detection completeness is defined as the fraction of injected transits that are recovered as TCEs by the Kepler detection pipeline, regardless of whether or not those TCEs are subsequently identified as planet candidates. We use the detection completeness of Burke & Catanzarite (2017), which was computed for each target star as a function of period and simulated Multiple Event Statistic (MES), based on stellar noise properties measured in that star's Kepler light curve. MES is a measure of the signal-to-noise ratio (S/N) that is specific to the Kepler pipeline (Jenkins et al. 2010). The result is referred to as completeness detection contours.
Vetting completeness is defined as the fraction of detected injected transits that were identified as planet candidates by the Robovetter (Coughlin 2017). We compute vetting completeness for a population of stars based on the simulated MES and orbital period of injected transits. We use the method of Bryson et al. (2020a), which models vetting completeness as a binomial problem with a rate given by a product of rotated logistic functions of MES and orbital period. We assume that vetting completeness and detection completeness are independent, so we can multiply them together to create combined completeness contours.
The product of vetting and detection completeness as a function of period and MES is converted to a function of period and planet radius for each star. This product is further multiplied by the geometric transit probability for each star, which is a function of planet period and radius, given that star's radius. The final result is a completeness contour for each star that includes detection and vetting completeness, and geometric transit probability.
We need to convert the completeness contours from radius and period to radius and instellation flux. For each star, we first set the range of instellation flux to 0.2 ≤ I ≤ 2.2, which contains the habitable zone for FGK stars. We then interpolate the completeness contour from period to instellation flux via I = R 2 * T 4 /a 2 , where R * is the stellar radius, T = T eff /T is the effective temperature relative to the Sun, and a is the semi-major axis of a circular orbit around this star with a given period.
Once the completeness contours are interpolated onto radius and instellation flux for all stars, they are summed or averaged as required by the inference method used in §3.4.2.

Completeness Extrapolation
For most stars in our parent sample, there are regions of the habitable zone which require orbital periods beyond the 500-day limit of the period-radius completeness contours. Figure 5 shows the distribution of orbital periods at the inner and outer optimistic habitable zone boundaries for FGK stars in our stellar sample relative to the 500-day limit. We see that a majority of these stars will require some completeness extrapolation to cover their habitable zones, and a small fraction of stars have no completeness information at all. It is unknown precisely how the completeness contours will extrapolate out to longer period, but we believe that the possible completeness values can be bounded.
We assume that completeness is, on average, a decreasing function of orbital period. Therefore, the completeness beyond 500 days will be less than the completeness at 500 days. While this may not be a correct assumption for a small number of individual stars due to local completeness minima in period due to the window function (Burke & Catanzarite 2017), we have high confidence that this assumption is true on average. We therefore bound the extrapolated completeness for each star by computing the two extreme extrapolation cases: • Constant completeness extrapolation, where, for each radius bin, completeness for periods greater than 500 days is set to the completeness at 500 days. This extrapolation will have higher completeness than reality, resulting in a smaller completeness correction and lower occurrence rates, which we take to be a lower bound. In the tables below we refer to this lower bound as "low" values. Here "low" refers to the resulting occurrence rates, and some population model parameters in the this case will have higher values.
• Zero completeness extrapolation, where, for each radius bin, completeness for periods greater than 500 days is set to zero. Zero completeness will have lower completeness than reality, resulting in a larger completeness correction and higher occurrence rates, which we take to be an upper bound. In the tables below we refer to this upper bound as "high" values. Here "high" refers to the resulting occurrence rates, and some population model parameters in this case will have lower values.
We solve for population models and compute occurrence rates for both extrapolation cases. Figure 6 shows the relative difference in the completeness contours summed over all stars. We see that for effective temperatures below ∼4500 K the difference between constant and zero completeness extrapolation is very close to zero, because these cooler stars are well-covered by completeness data (see Figure 1). We therefore expect the upper and lower occurrence rate bounds to converge for these stars. The Poisson likelihood we use requires the completeness contours summed over all stars, while the ABC method requires the completeness averaged over all stars. We observe a significant dependence of summed completeness on effective temperature, shown in Figure 7. We address this dependence of completeness on effective temperature by summing (for the Poisson likelihood) or averaging (for ABC) the completeness contours in effective temperature bins, as described in §3.4.2.

Reliability
We compute planet reliability as in Bryson et al. (2020a). Because this is done as a function of multiple event statistic (MES) and period, there is no change from the methods of that paper.
3.4. Computing the Population Model λ(I, r, T ) As described in §1.2, we develop a planet population model using a parameterized differential rate function λ, and use Bayesian inference to find the model  Figure 5. The distribution orbital periods of the inner and outer optimistic habitable zone boundaries. We show the orbital period distribution of circular orbits at the outer (blue) and inner (orange) boundaries of the optimistic habitable zone for our FGK stellar sample. The blue vertical dashed line at 500 days indicates the limit of the completeness contours, beyond which there is no completeness data. The orange vertical dotted line at 710 days shows the limit of Kepler coverage, in the sense that beyond 710 days there is no possibility of three detected transits resulting in a planet detection. Stars whose orbital period for the inner habitable zone boundary is beyond 500 days have no completeness data in their habitable zone, while stars whose outer habitable zone boundary orbital period is beyond 500 days require some completeness extrapolation. Kepler cannot detect habitable zone planets for Stars whose inner habitable zone orbital period is beyond 710 days, while stars whose outer habitable zone orbital periods are beyond 710 days have only partial coverage, which will decrease completeness.
parameters that best explains the data. To test the robustness of our results, we use both the Poisson-Likelihood MCMC method of Burke et al. (2015) and the Approximate Bayesian Computation method of Kunimoto & Matthews (2020) to compute our population model. Both methods are modified to account for vetting completeness and reliability, with the Poissonlikelihood method described in Bryson et al. (2020a) and the ABC method described in Kunimoto & Bryson (2020) and Bryson et al. (2020b). New to this work, we also take into account uncertainties in planet radius, instellation flux, and host star effective temperature, described in §3.4.2.

Population Model Choices
We consider three population models for the differential rate function λ(I, r, T ). These models are functions of instellation flux I, planet radius r and stellar effec-tive temperature T eff . These models depend on possibly different sets of parameters, which we describe with the parameter vector θ. For each model, we will solve for the θ that best describes the planet candidate data.
where g(T ) is given by Equation (4). The normalization constants C i in Equation (5) are chosen so that the integral of λ from r min to r max and I min to I max , averaged over T min to T max , = F 0 , so F 0 is the average number of planets per star in that radius, instellation flux and effective temperature range. λ 1 allows for dependence on T eff beyond the geometric dependence described in §2.2, breaking possible degen-  eracy between any intrinsic T eff and the geometric dependence by fixing the geometric dependence as g(T ). So, for example, if the planet population rate's dependence is entirely due to the larger HZ for hotter stars, captured in λ 1 by g(T ), then there is no additional T eff dependence and γ = 0. λ 2 does not separate out the geometric T eff dependence. λ 3 assumes that there is no T eff dependence beyond the geometric effect.

Inference Methods
Both the Poisson likelihood and ABC inference methods use the same stellar and planet populations, and the same characterization of completeness and reliability computed using the approach of Bryson et al. (2020a). These steps are as follows: • Select a target star population, which will be our parent population of stars that are searched for planets. We apply various cuts intended to select well-behaved and well-observed stars. We consider two such populations, defined by effective temperature range as described in §3.1, in order to explore the dependence of our results on the choice of stellar population.
• Use the injected data to characterize vetting completeness.
• Compute the detection completeness using a version of KeplerPorts 5 modified for vetting completeness and insolation interpolation, incorporating vetting completeness and geometric probability for each star, and sum over the stars in effective temperature bins, as described in §3.3.1.
• Use observed, inverted, and scrambled data to characterize false alarm reliability, as described in §3.3.3.
• Assemble the collection of planet candidates, including computing the reliability of each candidate from the false alarm reliability and false positive probability.
• For each model in Equation 5, use the Poisson likelihood or ABC methods to infer the model parameters θ that are most consistent with the planet candidate data for the selected stellar population.
Because vetting completeness and reliability depend on the stellar population and the resulting planet catalog, all steps are computed for each choice of stellar population.
A full study of the impact of the uncertainties in stellar and planet properties would include the impact of uncertainties on detection contours, and is beyond the scope of this paper. However, we study the impact of the uncertainties in planet radius, instellation flux and host star effective temperature, shown Figure 1, on our occurrence rates. For both the Poisson likelihood and ABC methods we perform our inference computation both with and without uncertainties, which allows us to estimate the approximate contribution of input planet property uncertainties on the final occurrence rate uncertainties. In Bryson et al. (2020a) it was shown that the uncertainties in reliability characterization have effectively no impact.
For the Poisson likelihood inference of the parameters in Equation (5) without input uncertainty, reliability is implemented by running the MCMC computation 100 times, with the planets removed with a probability given by their reliability. The likelihood used in the Poisson method is Equation (17) in Appendix B. For details see Bryson et al. (2020a).
We treat input uncertainties similar to how we treat reliability: we run the Poisson MCMC inference 400 times, each time selecting the planet population according to reliability. We then sample the planet instellation flux, radius and star effective temperature from the two-sided normal distribution with width given by the respective catalog uncertainties. We perform this sampling prior to restricting to our period and instellation flux range of interest so planets whose median property values are outside the range may enter the range due to their uncertainties. The posteriors from the 400 runs are concatenated together to give the posterior distribution of the parameters θ for each model. This approach to uncertainty does not recompute the underlying parent stellar population with re-sampled effective temperature uncertainties, because that would require re-computation of the completeness contours with each realization, which is beyond our computational resources. Shabram et al. (2020) preforms a similar uncertainty study, properly re-sampling the underlying parent population and observes an impact of uncertainty similar to ours (see §4.2). Our analysis of uncertainty should be considered an approximation. While the result is not technically a sample from a posterior distribution, in §4 we compare the resulting sample to the posterior sample from the model neglecting these uncertainties and find that the population parameter values and resulting occurrence rates change in a predictable way.
The ABC-based inference of the parameters in Equation (5) is computed using the approach of Kunimoto & Bryson (2020), with some modifications to accommodate temperature dependence and uncertainties on planet radius, instellation flux, and temperature.
In the ABC method, the underlying Kepler population is simulated in each completeness effective temperature bin separately. N p = F 0 N s h(T ) planets are drawn for each bin, where N s is the number of stars in the bin and h(T ) collects the model-dependent temperature terms from Equation (5), averaged over the temperature range of the bin and normalized over the entire temperature range of the sample. Following the procedure of Mulders et al. (2018), we assign each planet an instellation flux between 0.2 and 2.2 I ⊕ from the cumulative distribution function of I β , and a radius between 0.5 and 2.5 R ⊕ from the cumulative distribution function of r α .
The detectable planet sample is then simulated from this underlying population by drawing from a Bernoulli distribution with a detection probability averaged over the bin's stellar population. We compare the detected planets to the observed PC population using a distance function, which quantifies agreement between the flux distributions, radius distributions, and sample sizes of the catalogs. For the distances between the flux and radius distributions, we chose the two-sample Anderson-Darling (AD) statistic, which has been shown to be more powerful than the commonly used Kolmogorov-Smirnoff test (Engmann & Cousineau 2011). The third distance is the modified Canberra distance from Hsu et al. (2019), where n s,i and n o,i are the number of simulated and observed planets within the ith bin's temperature range, and the sum is over all N bins. For more details, see Bryson et al. (2020b). These simulations are repeated within a Population Monte Carlo ABC algorithm to infer the parameters that give the closest match between simulated and observed catalogs. With each iteration of the ABC algorithm, model parameters are accepted when each resulting population's distance from the observed population is less than 75th quantile of the previous iteration's accepted distances. Following the guidance of Prangle (2017), we confirmed that our algorithm converged by observing that the distances between simulated and observed catalogues approached zero with each iteration, and saw that the uncertainties on the model parameters flattened out to a noise floor.
This forward model is appropriate for estimating the average number of planets per star in a given flux, radius, and temperature range, similar to the Poisson likelihood method. However, rather than requiring many inferences on different catalogues to incorporate reliability or input uncertainty, we take a different approach. For reliability, we modify the distance function as described in Bryson et al. (2020b). In summary, we replace the two-sample AD statistic with a generalized AD statistic developed in Trusina et al. (2020) that can accept a weight for each datapoint, and set each observed planet's weight equal to its reliability. We also alter the third distance (Equation 6) so that a planet's contribution to the total number of planets in its bin is equal to its reliability. As demonstrated in Kunimoto & Bryson (2020) and Bryson et al. (2020b), this weighted distance approach gives results consistent with the Poisson likelihood function method with reliability. Meanwhile, to account for input uncertainty, the observed population is altered for every comparison with a simulated population by randomly assigning each observed planet a new radius, instellation flux, and host star effective temperature from the two-sided normal distribution with width given by their respective uncertainties.

Computing Occurrence Rates
Once the population rate model λ has been chosen and its parameters determined as described in §3.4, we can compute the number of habitable zone planets per star. For planets with radius between r 0 and r 1 and instellation flux between I 0 and I 1 , for a star with effective temperature T eff the number of planets per star is For a collection of stars with effective temperatures ranging from T 0 to T 1 , we compute the average number of planets per star, assuming a uniform distribution of stars in that range, as We typically compute equation (8) for every θ in the posterior of our solution, giving a distribution of occurrence rates. The habitable zone is not a rectangular region in the I-T eff plane (see Figure 1), so to compute the occurrence in habitable zone for a given T eff , we integrate I from the inner habitable zone flux I out (T eff ) to the outer flux λ(I, r, T eff , θ) dI dr. (9) The functions I out (T ) and I in (T ) are given in Kopparapu et al. (2014) and depend on the choice of the conservative vs. optimistic habitable zone. f HZ (T eff ) will be a distribution of occurrence rates if we use a distribution of θ. For a collection of stars with effective temperatures ranging from T 0 to T 1 , we compute f HZ (T eff ) for a sampling of T ∈ [T 0 , T 1 ], and concatenate these distributions together to make a distribution of habitable zone occurrence rates f HZ for that radius, flux and temperature range. When we are computing f HZ to determine the occurrence rate for a generic set of stars, we uniformly sample over [T 0 , T 1 ] (in practice we use all integer Kelvin values of T ∈ [T 0 , T 1 ]). The resulting distribution is our final result. Figure 8 shows the impact of uncertainty in stellar effective temperature on the habitable zone boundaries. For each star we computed the uncertainty in the habitable zone boundaries with 100 realizations of that star's effective temperature with uncertainty, modeled as a two-sided Gaussian. The grey regions in Figure 8 show the 86% credible intervals of the uncertainty of the habitable zone boundaries. These intervals are small relative to the size of the habitable zone, and are well centered on the central value. For example, consider the inner optimistic habitable zone boundary, which has the widest error distribution in Figure 8. The median of the difference between the median habitable zone uncertainty and the habitable zone boundary without uncertainty is less than 0.002%, with a standard deviation less than 0.9%. Therefore, we do not believe that uncertainties in habitable zone boundaries resulting from stellar effective temperature uncertainties have a significant impact on occurrence rates.

Inferring the Planet Population Model Parameters
For each choice of population differential rate model from Equation (5) and stellar population from §3.1, we determine the parameter vector θ with zeroextrapolated and constant extrapolated completeness, giving high and low bounds on occurrence rates. These solutions were computed over the radius range 0.5 R ⊕ ≤ r ≤ 2.5 R ⊕ and instellation flux range 0.2 I ⊕ ≤ I ≤ 2.2 I ⊕ using the hab and hab2 stellar populations described in §3.1. We perform these calculations both without and with input uncertainties in the planet ra-dius, instellation flux, and T eff shown in Figure 1. Example of the resulting θ posterior distributions are shown in Figure 9 and the corresponding rate functions λ are shown in Figure 10. The solutions for models 1-3 for the hab and hab2 stellar populations computed using the Poisson likelihood method are given in Table 1, and the results computed using ABC are given in Table 2. An example of the sampled planet population using input uncertainties is shown in Figure 11.
We compared models 1-3 using the Akaike Information Criterion (AIC), but AIC did not indicate that one of the models was significantly more consistent with the data than another. The resulting relative likelihoods from the AIC analysis relative to model 1 were 0.31 for model 2 and 2.07 for model 3. Such small relative likelihoods ratios are not considered compelling.
The F 0 parameter, giving the average number of planets per star in the solution domain (see §5) indicates that solutions using zero completeness extrapolation (see §3.3.2), yield higher occurrence than constant completeness extrapolation. This is because zero completeness extrapolation induces larger completeness corrections. The zero completeness extrapolation solution provides an upper bound on the habitable zone occurrence rate, and the constant extrapolation solution provides a lower bound. Reality will be somewhere between, and is likely to be closer to the zero extrapolation case for hotter stars, which have lower completeness in their habitable zone.
The hab stellar population is a subset of the hab2 population, so one may expect that they give similar solutions. But the hab stellar population contain significant regions of extrapolated completeness and lowreliability planet candidates for the flux range considered in our solution (see Figure 1). While the hab2 population contains the same regions, hab2 also contains many cooler stars that host higher-reliability planet candidates. These stars provide a better constraint on the power laws we use to describe the population. The result is that the hab2 solution has significantly smaller uncertainties than the hab solution, as seen in Tables 1 and 2. Table 3 gives η ⊕ , computed using the Poisson likelihood method for the optimistic and conservative habitable zones for the hab and hab2 stellar populations and models 1-3. The low and high values correspond to the solutions using constant and zero completeness extrapolation, which bound the actual occurrence rates (see §3.3.2). We see the expected behavior of zero completeness extrapolation leading to higher occurrence due to a  Table 4 gives the same occurrence rates computed using the ABC method. The distributions of η ⊕ using these models and the Poisson likelihood method are shown in Figure 12. We see that for each model, when incorporating the input uncertainties, the hab and hab2 stellar populations yield consistent values of η ⊕ . Without using the input uncertainties the hab population yields consistently lower values for η ⊕ , though the difference is still within the 68% credible interval. Model 3 with hab2 gives generally larger occurrence rates than model 1. We also see that the median occurrence rates are about ∼ 10% higher when incorporating input uncertainties, qualitatively consistent with Shabram et al. (2020), who also sees higher median occurrence rates when incorporating uncertainties. It is not clear what is causing this increase in occurrence rates: on the one hand the sum of the inclusion probability, defined in Appendix C, for the planet candidates in Table 8 is 53.6, compared with 54 planet candidates in the analysis without uncertainty, indicating that, on average, more planets exit the analysis than enter the analysis when incorporating uncertainties. On the other hand, the sum of the inclusion probability times the planet radius is 106.6, compared with 105.0 for the planet candidates in the analysis without uncertainty, indicating that, on average, larger planets are entering the analysis. This may have an impact on the power law model, leading to higher occurrence rates. Table 5 gives occurrence rates for a variety of planet radius and host star effective temperature ranges, com-puted using the hab2 stellar population and models 1-3. We see that the uncertainties for the 1.5 -2.5 R ⊕ planets are significantly smaller than for the 0.5 -1.5 R ⊕ planets, indicating that the large uncertainties in η ⊕ are due to the small number of observed planets in the 0.5 -1.5 R ⊕ range. The distributions of occurrence for the two bounding extrapolation types is shown in Figure 13. The difference between these two bounding cases is smaller than the uncertainties. Table 6 gives the 95% and 99% intervals for the 0.5 -1.5 R ⊕ planets using model 1 computed with hab2. Figure 14 shows the dependence of the habitable zone occurrence rate on effective temperature for models 1-3 based on the hab2 stellar population, and model 1 for the hab stellar population. For each model, the occurrence using zero and constant extrapolation is shown. Models 1 and 2 show a weak increase in occurrence with increasing effective temperature. Model 3 shows a stronger increase occurrence with effective temperature, consistent with model 3's assumption that the only temperature dependence is the geometric effect described in §2.2. However, as shown in Figure 6, the difference between constant and zero extrapolated completeness is near zero for T eff ≤ 4500 K, so we would expect the difference in occurrence rates to be close to zero in that temperature range. This is true for models 1 and 2, but not true for model 3. We take this as evidence that models 1 and 2 are correctly measuring a T eff dependence beyond the geometric effect. We recognize that the statistical evidence for this T eff dependence is not compelling, since the overlapping 68% credible intervals for the two completeness extrapolations would allow an occurrence rate independent of T eff . An issue that arises with zero completeness extrapolation (= high bound) is that, strictly speaking, PCs with orbital periods > 500 days are in a region of zero completeness around their host star and would not contribute to the Poisson likelihood (see Equation (16) in Appendix B). There is one such PC in the hab2 stellar sample with reliability = 0.67. Performing our Poisson inference, removing planets with period > 500 days, for model 1 and incorporating input uncertainties yields an optimistic η ⊕ = 0.70 +1.01 −0.41 , compared with 0.88 +1.27 −0.51 (from Table 3) when including the planet. While well within the 68% credible interval of the result with this planet included, removing this planet has a noticeable impact on the upper bound for the optimistic η ⊕ . However this planet was in fact detected, implying that the completeness is not zero for periods > 500 days, at least for this planet's host star. If the actual completeness is very close to zero, a planet detection implies a large population. We therefore leave this planet in the analysis, thinking of "zero completeness" as a limit of the completeness going to zero when the habitable zone includes orbital periods > 500 days, summed or averaged over the stellar population for our computations.

Instellation Flux vs. Orbital Period
We choose to compute our occurrence rates as a function of instellation flux for two major reasons: this allows a more direct characterization of each star's habitable zone, and it allows a more constrained extrapolation to longer orbital periods than working directly in orbital period.
By considering instellation flux, we can measure habitable zone occurrence by including observed planets from the habitable zone of their host stars, which is not possible across a wide range of stellar temperatures when using orbital period (see Figure 2). Instellation flux also allows a direct measurement of the impact of uncertainties in stellar effective temperature and planetary instellation flux.
The habitable zone of most G and F stars includes orbital periods that are beyond those periods well-covered by Kepler observations (see Figures 1 and 2), requiring Note-The low and high bounds correspond to the constant and zero completeness extrapolation of §3.3.2. "With Uncertainty" means planet candidate radius, instellation flux and host star effective temperature uncertainties were taken into account.
significant extrapolation of orbital-period based planet population models to long orbital periods. Such extrapolation is poorly known or constrained, leading to possible significant and unbounded inaccuracies. In instellation flux, however, there is planet data throughout those regions of our domain of analysis that have reasonable completeness (see Figure 1) so no extrapolation in instellation flux is required. In this sense, replacing orbital period with instellation flux (determined by orbital period for each star) moves the problem of extrapolating the population model to longer orbital period to extrapolating the completeness data to lower instellation flux. In §3.3.2 we argue that completeness, on average, decreases monotonically with decreasing instellation flux. This allows us to bound the extrapolated completeness between no decrease at all (constant extrapolation) and zero completeness for instellation flux for orbital periods beyond the 500-day limit where completeness was measured. We then perform our analysis for the two extrapolation cases, and find that their difference in habitable zone occurrence rates is small relative to our uncertainties. In this way we provide a bounded estimate of habitable zone occurrence rates using instellation flux, rather than the unbounded extrapolation resulting from using orbital period.

Comparing the Stellar Population and Rate Function Models
Our approach to measuring η ⊕ is to compute the planet population rate model λ(r, I, T eff ) ≡ d 2 f (r, I, T eff )/dr dI, integrate over r and I and average over T eff . We compute the population model λ using the hab and hab2 stellar populations ( §3.1) to measure the sensitivity of our results to stellar type, and we consider several possible functional forms for λ (Equation 5).

Comparing Population Rate Function Models
We believe that we have detected a weak dependence of habitable zone occurrence on host star effective temperature T eff , with hotter stars having slightly higher habitable zone occurrence. Model 2 (F 0 r α I β T γ eff ), which directly measures T eff dependence as a power law with exponent γ, indicates a weak T eff dependence for the zero completeness extrapolation case, though in the constant extrapolation case γ includes 0 in the 68% credible interval (see Table 1). This is somewhat remarkable given that, as discussed in §2.2, the size of the habitable Table 3. η⊕, computed using population models based on the hab and hab2 stellar populations, with and without uncertainties for models 1-3 and using the Poisson method.  zone grows as at least T 3 eff . This is consistent with model 1 (F 0 r α I β T γ eff g(T eff )), which includes a fixed g(T eff ) term from Equation 4, reflecting the increase in size of the habitable zone with increasing temperature, and an additional T γ eff power law to capture any additional T eff . In Table 1 we see that model 1 yields a very weak or negative value for γ, consistent with the weak direct detection of T eff dependence in model 2. The consistency between models 1 and 2 is further indicated by the fact that they yield very similar occurrence rates, as shown in Tables 3, 4 and 5, as well as Figure 14.
Model 3 (F 0 r α I β g(T eff ))) assumes that the T eff dependence of habitable zone occurrence is entirely due to the increase in size of the habitable zone with increasing T eff . When averaged over our η ⊕ effective temperature range of 4800 K -6300 K, model 3 yields somewhat higher occurrence rates than models 2 and 3 (see Tables 3 and  4, and Figure 12). Models 1 and 2 have the expected behavior of the high and low bounds converging for cooler stars (see Figure 14), consistent with the extrapolation options coinciding for these stars (see Figure 6). Model 3 (λ 3 = F 0 C 3 r α I β g(T )) does not have this behavior but model 3's fixed T eff dependence does not allow such a convergence.
Because models 1 and 2 detect a weaker T eff dependence than would be expected due to the larger HZ for hotter stars (Equation (4) if planets were uniformly distributed, we don't believe that model 3 is the best model for the data.
Models 1 and 2 yield essentially the same habitable zone occurrence results, but model 1 separates the geometric effect from intrinsic T eff dependence. We there-

With Uncertainty
Without Uncertainty  fore emphasize model 1, but model 2 provides a direct measure of the total T eff dependence.

Comparing the hab and hab2 Stellar Populations
Without input uncertainties, the hab and hab2 stellar populations yield interestingly different values for η ⊕ . However, Tables 1 and 2 shows that the F 0 parameter fits for hab have significantly larger relative uncertainties than hab2, with hab having ≈ 200% positive uncertainties compared with the ≈ 100% positive uncertainties for the hab2 fits. In addition, the effective temperature exponent γ has larger absolute uncertainties for hab than hab2. These larger uncertainties propagate to larger relative uncertainties in the occurrence rates in Tables 3,  4, and 5. This can also be seen by comparing hab and hab2 for model 1 in Figure 14. These large uncertainties result in the differences between hab and hab2 being less than the 68% credible interval. With input uncertainties, the results for the hab and hab2 stellar populations are more consistent, being well inside the 68% credible interval, but hab still has larger relative uncertainties.
We believe the larger uncertainties for hab relative to hab2 is due to the hab2 population being less well covered by the Kepler observations than hab. A larger fraction of planet candidates for stars in the hab effective temperature range of 4800 K -6300 K are in a region of lower completeness and reliability (Figure 7), and have poorer observational coverage (Figure 1). The hab2 population, with an effective temperature range of 3900 K -6300 K includes regions with better observational coverage and more reliable planet detections.
Basing our occurrence estimates on hab2 covers the full range of K stars without extrapolation, allowing us to produce, for example, GK or G or K habitable zone occurrence rates using the same population model. This avoids possible ambiguities that may be due to different population models. Finally, when considering T eff uncertainties, there are several planets close to the lower hab boundary at 4800 K, which lead to larger impact of T eff uncertainties on the population model because those planets will be in some uncertainty realizations and not in others (see Figure 1). In contrast, the lower T eff boundary for hab2 is outside the 68% credible interval for all detected planets. Therefore, although the hab population exactly matches our effective temperature range for η ⊕ , we prefer models computed using the hab2 population.
To summarize, we adopt model 1 based on hab2 for our primary reported result, but we also provide the results for models 1-3 and the hab stellar populations.

Computing η ⊕
We find reasonable consistency in η ⊕ across models for both the hab and hab2 stellar population as shown in Tables 3 and 4. Table 5 gives occurrence rates for  Figure 12. The distribution of η⊕ for population models computed using the hab (dashed lines) and hab2 (solid lines) stellar populations, for the three models in Equation (5), demonstrating that we get similar results from models 1 and 2 for both stellar populations. Medians and 68% credible intervals are shown above the distributions. The result from the hab2 population including effective temperature dependence is shown with the thick black line. Top: incorporating the uncertainty on planet radius, and stellar instellation and stellar effective temperature. Bottom: without incorporating uncertainties. Left: the conservative habitable zone. Right: the optimistic habitable zone.
several planet radius and stellar effective temperature ranges, using the population model from the hab2 population. The uncertainties reflecting our 68% credible intervals for our η ⊕ , counting HZ planets with radius 0.5 -1.5 R ⊕ , are large, with positive uncertainties being nearly 150% of the value when using input uncertainties. Comparing occurrence rate uncertainties with and without input uncertainties in Tables 3 and 4, we see that the bulk of the uncertainties occur without input uncertainties, while using input uncertainties increases the output uncertainty by nearly 20%. The much smaller uncertainties for the larger planets (1.5 R ⊕ ≤ r ≤ 2.5 R ⊕ ) in Table 5 suggest the large uncertainties in η ⊕ are driven by the very small number of detections in the η ⊕ range, combined with the very low completeness (see Figure 1). Low completeness will cause large completeness corrections, which will magnify the Poisson uncertainty due few planet detections. There are more planets with larger radius, which are in a regime of higher completeness, resulting in lower uncertainties for larger planet occurrence rates.

Implications of η ⊕
Estimates of η ⊕ are useful in calculating exoEarth yields from direct imaging missions, such as the flagship concept studies like LUVOIR and HabEX. These  mission studies assumed an occurrence rate of η ⊕ = 0.24 +0.46 −0.16 for Sun-like stars based on the NASA ExoPAG SAG13 meta-analysis of Kepler data ). The expected exoEarth candidate yields from the mission study reports are 54 +61 −34 for LUVOIR-A (15m), 28 +30 −17 for LUVOIR-B (8m), and 8 for HabEx 4m baseline configuration which combines a more traditional coron-agraphic starlight suppression system with a formationflying starshade occulter. Table 5 provides η ⊕ values based on three models for G (Sun-like) and K-dwarfs. If we assume the range of η ⊕,G from conservative to optimistic HZ from Table 5 for planets in the 0.5 − 1.5 R ⊕ from the "low" end, say for Model 1, η ⊕,G would be between 0.38 +0.50 −0.22 and 0.60 +0.75 −0.34 . Added after acceptance: While these η ⊕ values appear to be larger than the 0.24 +0.46 −0.16 occurrence rate assumed by the mission studies, it should be noted that these studies adopted radius range of 0.82 to 1.4 R ⊕ , and a lower limit of 0.8 * a −0.5 , where a is the HZ-corrected semi-major axis. This is slightly a smaller HZ region and lower radius than the one used in our study. As a result, it is possible that we might be agreeing with their assumed η ⊕ value if we use the same bounding boxes. Computing the conservative habitable zone as described in §3.5 but replacing the planet radius range with 0.82 ≤ r ≤ 1.4 R ⊕ gives a lower bound of 0.18 +0.16 −0.09 and an upper bound of 0.28 +0.30 −0.14 , nicely bracketing the value assumed in mission studies.
η ⊕ can also be used to estimate, on average, the nearest HZ planet around a G and K-dwarf assuming the planets are distributed randomly. Within the Solar neighborhood, the stellar number density ranges from 0.0033 to 0.0038 pc −3 for G-dwarfs, and 0.0105 to 0.0153 pc −3 for K-dwarfs (Mamajek & Hillenbrand 2008;Kirkpatrick et al. 2012)  HZ planets pc −3 . The nearest HZ planet around a Gdwarf would then be expected to be at a distance of d = (3/(4π×n p )) 1/3 , where n p is the planet number density in pc −3 . Substituting, we get d between 5.9 +1.76 −1.26 pc and 5.5 +1.83 −1.34 pc, or essentially around ∼6 pc away. A similar calculation for K-dwarfs assuming Model 1 conservative HZ η ⊕,K values from Table 5 indicates that, on average, the nearest HZ planet could be between 4.1 +1.19 −0.90 pc and 3.6 +1.05 −0.79 pc, or around ∼4 pc away. An additional speculative calculation one could do is to take the number of G-dwarfs in the Solar neighborhood within 10 pc -19 from RECONS 7 -and multiply it with the "low" conservative η ⊕,G value from Model 1, 7 http://www.recons.org/census.posted.htm 0.38 +0.50 −0.22 . We then get 7.2 +9.5 −4.2 HZ planets around Gdwarfs (of all sub-spectral types) within 10 pc. A similar calculation for K-dwarfs from the same RECONS data with 44 stars, and Model 1 "low" value, conservative HZ η ⊕,K = 0.32 +0.35 −0.17 indicates that there are 14 +15 −7.5 HZ planets around K-dwarfs within 10 pc. It should be noted that the numbers for the nearest HZ planet and the number of HZ planets in the solar neighborhood used the "low" end of the rocky planet (0.5-1.5 R ⊕ ) occurrence rate values from Table 5. As such, these represent the lower bound estimates. In other words, there may potentially be a HZ rocky planet around a G or a Kdwarf closer, and may be more HZ planets, than the values quoted above. This can be quantified from the numbers shown in Table 6, which provides the 95% and 99% credible intervals of the upper and lower bounds on habitable zone occurrence for model 1 computed with hab2 and accounting for input uncertainty. If we use only the "low" and the lower end of the conservative HZ occurrence values from this table (0.07 for 95%, 0.04 for 99% credible intervals), then the nearest HZ planet around a G or K-dwarf star is within ∼6 pc away with 95% confidence, and within ∼ 7.5 pc away with 99% confidence. Similarly, there could be ∼4 HZ planets within 10 pc with 95% confidence, and ∼3 HZ planets with 99% confidence.
We again caution that these are only estimates and do not necessarily indicate actual number of planets that could be detectable or exist. The numbers provided in this section are first order estimates to simply show a meaningful application of η ⊕ , given its uncertainties. Mulders et al. (2018) and He et al. (2020) have shown that there is strong evidence for multiplicity and clustering of planets within a system. This implies that the nearest such planets would be farther than if there were not clustering.

Comparison with previous estimates of η ⊕
Our work is the first to compute habitable zone occurrence rates using the incident stellar flux for habitable zones around subsets of FGK stars based on the DR25 catalog, Gaia-based stellar properties, and using the habitable zone definition of Kopparapu et al. (2014). Other works in the literature have produced occurrence rates for orbital periods related to FGK habitable zones, but as discussed in §1.2 and §3.5, we are measuring occurrence between habitable zone boundaries as a function of T eff . This is a different quantity than occurrence rates based on orbital period. The few occurrence rate estimates in the literature based on instellation flux, such as Petigura et al. (2013), used a rectangular region in radius and instellation flux, and only approximated the habitable zone. Therefore, we do not directly compare our occurrence rates with those in previous literature.
We provide a formal computation of Γ ⊕ , which is commonly used to compare η ⊕ estimates (see, for example, Figure 14 of Kunimoto & Matthews 2020). For a planet of period p and radius r, we define Γ ≡ d 2 f /d log p d log r = p r d 2 f /dp dr, and Γ ⊕ is Γ evaluated at Earth's period and radius. We need to express Γ in terms of instellation flux I, which we will do using d 2 f /dp dr = d 2 f /dI dr (dI/dp).
For a particular star, the instellation on a planet at period p in years is given by I = R 2 * T 4 M −2/3 * p −4/3 , where R * is the stellar radius in Solar radii, M * is the stellar mass in Solar masses and T = T eff /T is the star's effective temperature divided by the Solar effective temperature. Then because d 2 f /dI dr = λ(I, r, T, θ), one of the differential population rate models from Equation (5). To compute Γ ⊕ for a particular star, we evaluate Equation 10 at r = 1 R ⊕ , p = 1 year, and I = R 2 * T 4 M −2/3 * , the instellation a planet with a one-year orbital period would have from that star. The result is the Γ ⊕ in radius and period implied by our differential population rate function in radius and instellation for that star, and may be compared directly with Γ ⊕ from period-based occurrence studies.
We compute Γ ⊕ using model 1 from Equation (5) with input uncertainty on the hab2 stellar population. For each star in hab2, we evaluate Equation 10 using the posterior θ distribution, and concatenate the resulting Γ ⊕ distributions from all the stars. We do this for both completeness extrapolations in §3.3.2, giving low and high bounds. This results in a Γ ⊕ between 0.45 +0.46 −0.24 and 0.50 +0.46 −0.26 . While this is a formal mathematical exercise that has not been demonstrated to be truly equivalent to Γ ⊕ defined in period space, the match between this value and our conservative habitable zone η C ⊕ = 0.37 +0.48 −0.21 -0.60 +0.90 −0.36 for the model 1, hab2 with input uncertainty in Table 3 is remarkable.
Our value of Γ ⊕ is somewhat higher than values using post-Gaia stellar and planet data (see, for example, figure 14 of Kunimoto & Matthews (2020)), but not significantly so. For example, Bryson et al. (2020a) found Γ ⊕ = 0.09 +0.07 −0.04 when correcting for reliability in periodradius space. Using the same population model Bryson et al. (2020a) found a SAG13 η ⊕ value of 0.13 +0.10 −0.06 . It is not clear how much we should expect Γ ⊕ to correspond with η ⊕ .

Effective Temperature Dependence
As described in §4.2 and Figure 14, our results indicate a weak, but not compelling, increase in HZ planet occurrence with increasing stellar effective temperature. This T eff dependence is weaker than would be expected if planet occurrence were uniformly spaced in semi-major axis (see §2.2) because hotter stars have larger HZs. This can be seen quantitatively in the median differential population rates for models 1 and 2 using the hab2 population in Tables 1 and 2. In model 2 we observe a median T eff exponent γ < 3, compared with the prediction of γ ≈ 3 to 4.5 due to the larger HZ for hotter stars from Equation (4). This is reflected in model 1, which includes the correction for the larger HZ so if T eff dependence were due only to the larger HZ then γ would equal 0. The high bound of model 1 finds a median γ < −1 indicating that we see fewer HZ planets than expected in the larger HZ of hotter stars if the planets were uniformly spaced. However the upper limits of γ's 68% credible interval in Tables 1 and 2 are consistent with the prediction of uniform planet spacing in larger HZs for hotter stars. For example, the posterior of γ for model 1 (hab 2 population, high bound) has γ ≥ 0 22.3% of the time.
Our detection of a weaker T eff dependence than expected from larger HZs is qualitatively consistent with the increasing planet occurrence with lower T eff found in Garrett et al. (2018) and Mulders et al. (2015). But our uncertainties do not allow us to make definitive conclusions about this T eff dependence.

Dependence on the Planet Sample
To study the dependence of our result on the planet sample, we performed a bootstrap analysis. We ran the Poisson likelihood inference using hab2 and model 1 with zero extrapolation (high bound) 400 times, resampling the planet candidate population with replacement. Each re-sampled run removed planets according to their reliability as described in §3.4.2, but did not consider input uncertainty. The concatenated posterior of these re-sampled runs gives  Tables 1 and 3, we see that the central values from the bootstrap study are well within the 68% credible interval of our results, and the uncertainties are as much as 50% higher.
A similar study of the dependence on the stellar sample is not feasible because each re-sampled stellar population would require a full re-computation of detection and vetting completeness and reliability. Performing hundreds of these computations is beyond our available resources.

Impact of Catalog Reliability Correction
All results presented in this paper are computed with corrections for planet catalog completeness and reliability (see §3.3). Figure 15 shows an example of what happens when there is no correction for catalog reliability. We compute η C ⊕ , occurrence in the conservative habitable zone, with model 1, zero completeness extrapolation (high value), accounting for input uncertainty and using the hab2 stellar population. With reliability correction, we have η C ⊕ = 0.60 +0.90 −0.36 and without relia-bility correction we have η C ⊕ = 1.25 +1.40 −0.60 . In this typical case reliability has a factor-of-two impact, consistent with Bryson et al. (2020a), though because of the large uncertainties the difference is less than the 68% credible interval. 0 1 2 3 4 5 Conservative HZ Occurrence Relative Frequency Reliability No Reliability Figure 15. A comparison of the distributions, with and without reliability correction, of the conservative habitable zone η C ⊕ computed with model 1, zero completeness extrapolation (high value), accounting for input uncertainty and using the hab2 stellar population.

η ⊕ based on the GK and FGK Stellar Populations
Our definition of η ⊕ , restricted to stars with effective temperatures between 4800 K and 6300 K, varies somewhat from the literature. To connect with other occurrence rate studies we repeat our analysis using the GK (3900 K ≤ T eff ≤ 6000 K) and FGK (3900 K ≤ T eff ≤ 7300 K) stellar populations to compute planet population models, with results in Table 7. We provide our η ⊕ derived from these stellar populations as well as habitable zone occurrence for the GK and FGK T eff ranges. The values for our definition of η ⊕ are consistent with the values in Tables 3 and 4. We caution, however, that the FGK population extends well into stellar effective temperatures where there are no planet detections and very low or zero completeness, so an FGK result necessarily involves extrapolation from cooler stars.

Caveats
While this study takes care to incorporate detection and vetting completeness, and importantly both reliability and observational uncertainty, there are still unresolved issues. We summarize these issues here, each of which can motivate future improvements to our occurrence rate model and methodology. Power Law Assumption: Products of power laws in radius and period are commonly adopted for planet population models in occurrence rate studies, but there is growing evidence that calls their suitability into question. For instance, improvements to stellar radius measurements have revealed that the radius distribution for small, close-in planets is bi-modal, rather than a smooth or broken power law , which has also been observed in K2 data (Hardegree-Ullman et al. 2020). Power laws are not capable of describing such non-monotonic populations. Looking at the bottom panels of Figure 10 (without uncertainty), some data points in the radius and instellation flux distributions do not lie along our inferred power laws. However, using input uncertainties (top panels of Figure 10) washes out this structure, making it more difficult to discern a failure or success of a power law model as a descriptor of the data. There is also strong evidence that populations are not well described by products of power laws in radius and period Lopez & Rice 2018) for orbital periods < 100 days. Therefore a product of power laws such as Equation (5) in radius and instellation flux is unlikely to be a good description of planet populations at high instellation. At the low instellation of the habitable zone, however, the observed PC population does not indicate any obvious structure (see Figure 1) Given that our domain of anal-ysis is plagued by few detections, low completeness, and low reliability, more observations are likely needed to determine more appropriate population models. Therefore, because most of our planet population have radii larger than 1.5R ⊕ , those larger planets are likely driving the population model, and may be causing bias in the model in the smaller planet regime due to our simple product power laws in Equation (5).
Planetary Multiplicity: Zink et al. (2019) point out that when short-period planets are detected in the Kepler pipeline, data near their transits are removed for subsequent searches, which can suppress the detection of longer period planets around the same star. They find that for planets with periods greater than 200 days detection completeness can be suppressed by over 15% on average. Our stellar population has removed stars for which more than 30% of the data has been removed due to transit detection via the dutycycle post stellar property from the DR25 stellar properties table (for details see Bryson et al. (2020a)). We have not attempted to quantify the extent to which this stellar cut mitigates the impact identified in Zink et al. (2019), nor have we accounted for this effect in our analysis.
Stellar Multiplicity Contamination: Several authors (Ciardi et al. 2015;Furlan & Howell 2017, 2020 have shown that undetected stellar multiplicity can impact occurrence rate studies in at least two ways. Stellar multiplicity can reveal planet candidates to be false positives, reducing the planet population, and valid planet candidates in the presence of unknown stellar multiplicity will have incorrect planet radii due to flux dilution. They estimate that these effects can have an overall impact at the 20% level. Stellar multiplicity can also bias the parent stellar sample because unaccounted for flux dilution will bias the completeness estimates. Our analysis does not take possible stellar multiplicity into account. However stellar multiplicity has been shown to be associated with poor quality metrics, specifically the BIN flag of Berger et al. (2018) and the GAIA RUWE metric (Lindegren 2018). For example, Kraus et al. (in prep) finds that few Kepler target stars with RUWE > 1.2 are single stars. As described in §3.1, we remove stars from our parent stellar population that have been identified as likely binaries in Berger et al. (2018) or have RUWE > 1.2, which is expected to remove many stars with undetected stellar multiplicity (for details see Bryson et al. 2020a). We have not attempted to quantify the impact of undetected stellar multiplicity for our stellar population after this cut.
Planet radius and HZ limits: There are several stellar, planetary and climate models dependent factors that could reduce the occurrence rates calculated in this work. It is quite possible that the uncertainties in stellar radii may alter the planet radii, moving some rocky planets into the mini-Neptune regime of > 1.5 R ⊕ . Or, it is possible that the upper limit of 1.5 R ⊕ is an overestimate of the rocky planet limit, and the rocky to gaseous transition may lie lower than 1.5 R ⊕ . Although, as pointed out in section 2, Otegi et al. (2020) indicate that the rocky regime can extend to as high as 2.5 R ⊕ , many of these large-radius regime planets are highly irradiated ones, so they may not be relevant to HZ rocky planets.
The HZ limits themselves may be uncertain, as they are model and atmospheric composition dependent. Several studies in recent years have calculated HZ limits with various assumptions (see Kopparapu et al. (2019) for review). In particular, the inner edge of the HZ could extend further in, closer to the star, due to slow rotation of the planet (Yang et al. 2014;Kopparapu et al. 2016;Way et al. 2016), and the outer edge of the HZ may shrink due to 'limit cycling', a process where the planet near the outer edge of the HZ around FG stars may undergo cycles of globally glaciated and un-glaciated periods with no long-term stable climate state (Kadoya & Tajika 2014Menou 2015;Haqq-Misra et al. 2016). Consequently, the number of planets truly in the habitable zone remain uncertain.

Reducing Uncertainties
Our computation of η ⊕ has large uncertainties, with the 68% credible interval spanning factors of 2 (see Tables 3, 4 and 5). The 99% credible intervals in Table 6 span two orders of magnitude. In §5.3 we discussed how comparing occurrence rates with and without input uncertainties in Tables 3 and 4 indicates that these large uncertainties are present before considering the impact of uncertainties in the input data. We also observed in Table 5 that the uncertainties are considerably smaller for planets larger than those contributing to our η ⊕ . We conclude that, while input uncertainties make a contribution, the dominant cause of our large uncertainties is Poisson uncertainty due to the very small number of habitable zone planets smaller than 1.5 R ⊕ in very low completeness regions of the DR25 planet catalog (see Figure 1). Our uncertainties may be close to a noise floor induced by the small number of small habitable zone planets resulting from low completeness.
These large Poisson-driven uncertainties are unlikely to be reduced by resolving the issues discussed in §5.10. Only by increasing the small planet catalog completeness, resulting in a larger small-planet habitable zone population, can these uncertainties be reduced.
There are two ways in which a well-characterized catalog with more small planets can be produced: • Develop improved planet vetting metrics that produce a catalog that is both more complete and more reliable than the DR25 catalog. There are several opportunities for such improved metrics, discussed in Bryson et al. (2020a), such as more fully exploiting pixel-level data and existing instrumental flags that can give more accurate reliability characterization than that given using DR25 products. This approach requires new vetting metrics. Bryson et al. (2020b) has shown that varying the DR25 Robovetter thresholds does not significantly change occurrence rates or their uncertainties once completeness and reliability are taken into account. In Appendix D we show that such changes in Robovetter metrics also do not significantly change the occurrence rates we find in this paper.
• Obtain more data with a quality similar to Kepler, likely through more space-based observations. In §1 we described how the unrealized Kepler extended mission, doubling the amount of data relative to DR25, was expected to significantly increase the yield of small planets in the habitable zone. An additional 4 years of data observing the same stars as Kepler with similar photometric precision would be sufficient. 8 years of observation on a different stellar population would also suffice. As of this writing, plans for spacebased missions such as TESS or PLATO do not include such long stares on a single field. For example, PLATO currently plans no more than 3 years of continuous observation of a single field 8 .

CONCLUSIONS
In this paper we compute the occurrence of rocky (0.5 R ⊕ ≤ r ≤ 1.5 R ⊕ ) planets in the habitable zone for a range of main-sequence dwarf stars from the Kepler DR25 planet candidate catalog and Gaia-based stellar properties. We base our occurrence rates on differential population models dependent on radius, instellation flux and host star effective temperature ( §3.4.1). Our computations are corrected for completeness and reliability, making full use of the DR25 data products. Using instellation flux instead of orbital period allows us to measure the occurrence in the habitable zone even though the habitable zone boundaries depend on stellar effective temperature ( §3.5). Instellation flux also allows us to transfer the unconstrained extrapolation required when extending analysis based on orbital period to a bounded extrapolation of detection completeness ( §3.3.2), and we present our results in terms of these upper and lower bounds ( §4). The difference between the upper and lower bounds is smaller than the 68% credible interval on these bounds.
We compute our occurrence rates using a range of models, stellar populations and computation methods. We propagate uncertainties in the input data, account for detection completeness that depends on the stellar effective temperature, and check the dependence of our result on the population via a bootstrap study. In all cases we find consistent results. We take this as evidence that our occurrence rates are robust.
We find a likely, though not statistically compelling, dependence of our occurrence rates on stellar host effective temperature T eff ( §4.2, Figure 14). Much of this dependence can be understood as due to the habitable zone being larger for hotter stars ( §2.2). But we find that the T eff dependence is weaker than would be expected on purely geometric grounds, implying a decreasing planet occurrence for longer-period orbits.
Our occurrence rates for rocky planets have large uncertainties. Comparing computations with and without input uncertainties, we find that these large uncertainties are not caused by the input uncertainties. Comparing the uncertainties on our rocky planets with the 8 https://www.cosmos.esa.int/web/plato/observation-concept uncertainties on the occurrence of larger planets (Table 5), we find that the larger planet occurrence has much lower uncertainty. We conclude that the large uncertainties are due to the extremely low completeness of the DR25 catalog for small planets in the habitable zone, leading to few planet detections. The only way we see to reduce these uncertainties is by generating more complete and reliable catalogs, either through improved analysis of existing data or through obtaining more data with quality comparable to Kepler ( §5.11).
Conservative habitability considerations ( §2) and the limited coverage of F stars in Kepler data ( §1) drive us to define η ⊕ as the average number of habitable zone planets per star as planets with radii between 0.5 R ⊕ and 1.5 R ⊕ and host star effective temperatures between 4800 K and 6300 K. Using this definition, we find that, for the conservative habitable zone, η ⊕ is between 0.37 +0.48 −0.21 and 0.60 +0.90 −0.36 planets per star, while for the optimistic HZ η ⊕ is between 0.58 +0.73 −0.33 and 0.88 +1.28 −0.51 planets per star. These occurrence rates imply that conservatively, to 95% confidence, the nearest rocky HZ planet around G and K-dwarfs is expected to be be within ∼ 6 pc ( §5.4). Furthermore, there could, on average, be 4 HZ rocky planets around G & K dwarfs, respectively, within 10 pc from the Sun.
We do not integrate over T s because that is fixed to the effective temperature of the star. We now cover our entire flux-radius range D with a sufficiently fine regular grid with spacing ∆p and ∆r so that each grid cell i centered at flux and radius (I i , r i ) contains at most one planet.
because the B i cover D and are disjoint. Here K is the number of grid cells and K 1 is the number of grid cells that contain a single planet. So the grid has disappeared, and we only need to evaluate λ(I, r, T s ) at the planet locations (I i , r i , T s ) and integrate η s λ over the entire domain. We now consider the probability of detecting planets around a set of N * stars. Assuming that the planet detections on different stars are independent of each other, then the joint probability of a specific set of detections specified by the set {n i , i = 1, . . . , N * } in cell i on on all stars indexed by s is given by When λ does not depend on effective temperature, we are able to factor N * s=1 exp − D η s (I, r)λ(I, r)dI dr as exp − D η(I, r)λ(I, r)dI dr , where η(I, r) = N * s=1 η s (I, r) is the sum of the completeness contours over all stars. When λ depends on effective temperature we partition the stars into effective temperature bins S k , and approximate T s as the average temperature in each binT k , so within each bin λ does not depend on the star. Then we can do the factoring within each bin: N * s=1 e − D ηs(I,r)λ(I,r,Ts)dI dr ≈ k s∈S k e − D ηs(I,r)λ(I,r,T k )dI dr = k e − s∈S k D ηs(I,r)λ(I,r,T k )dI dr = k e − D η k (I,r)λ(I,r,T k )dI dr = e − k D η k (I,r)λ(I,r,T k )dI dr (14) where η k (I, r) = s∈S k η s (I, r) is the sum of the completeness contours over the stars in bin S k . Note that we are not integrating over the effective temperature. Therefore where V = (∆I∆r) (K1N * ) .
We now let the rate function λ(I, r, T, θ) depend on a parameter vector θ, and consider the problem of finding the θ that maximizes the likelihood Because we are maximizing with respect to θ, we can ignore all terms that do not depend on θ. Therefore, maximizing equation (16) is equivalent to maximizing P {N s (B i ) = n s,i , s = 1, . . . , N * , i = 1, . . . , K|θ} = e − k D η k (I,r)λ(I,r,T k ,θ)dI dr K1 i=1 λ(I i , r i , T s , θ). (17) When we neglect the effective temperature dependence of λ and have only one effective temperature partition containing all the stars, equation (17) Table 8 give the properties of the DR25 candidates used in our study. These planet candidates are detected on FGK host stars (of which hab and hab2 are subsets) after the cuts described in §3.1. The basic PC population is that within our computation domain 0.5 R ⊕ ≤ r ≤ 2.5 R ⊕ and 0.2 I ⊕ ≤ I ≤ 2.2 I ⊕ , defined by the planet radius and instellation central values. When accounting for input uncertainties as described in §3.4.2, some of these planets exit our domain and other planets enter the domain. In a particular realization, only those planets in the domain are involved in the computation of population models and occurrence rates. The probability of a PC being in the domain in a particular realization is given by the "Inclusion Probability" column of Table 8. We list PCs with an inclusion probability > 1/4000, which, if reliability = 1, have a 10% chance of being included in one of the 400 realizations used in the computation with uncertainty.    0.01 0.07088 Table 8 continued that occurrence rate estimates should roughly agree between these alternative catalogs. Figure 17 shows the distribution of the model parameter F 0 for model 1 for these catalogs, computed without input uncertainties and zero completeness extrapolation with the hab2 stellar population. We see reasonable agreement between the Robovetter variations when reliability corrections are applied.  Figure 17. Distributions of the parameter F0 in model 1 (see Equation (5)), the occurrence for the hab2 stellar population for 0.5 R⊕ ≤ r ≤ 2.5 R⊕ and instellation flux range 0.2 ≤ I I⊕ ≤ 2.2 I⊕ for the high reliability (blue), DR25 (pink), FPWG PC (green) and high completeness (orange) vetting, computed with the Poisson method. Left: without correcting for reliability. Right: corrected for reliability.

0.84
Note-The posteriors of model 1 for the DR25, high reliability, high completeness, and FPWG PC catalogs from Bryson et al. (2020b) using the hab2 stellar population and the Poisson likelihood method with zero completeness extrapolation. The maximum separation in each row is the maximum over each row of the difference in medians divided by the propagated uncertainty of that distance.