Searching for Giant Exoplanets around M-dwarf Stars (GEMS) I: Survey Motivation

Recent discoveries of transiting giant exoplanets around M-dwarf stars (GEMS), aided by the all-sky coverage of TESS, are starting to stretch theories of planet formation through the core-accretion scenario. Recent upper limits on their occurrence suggest that it decreases toward lower stellar masses, with fewer GEMS around lower-mass stars than around solar-type stars. In this paper, we discuss the existing GEMS, both as confirmed planets and in protoplanetary disk observations, and a combination of tests to reconcile these with theoretical predictions. We then introduce the Searching for GEMS survey, where we use multidimensional nonparametric statistics to simulate hypothetical survey scenarios and predict the sample size of transiting GEMS with mass measurements required to robustly compare their bulk density with that of canonical hot Jupiters orbiting FGK stars. Our Monte Carlo simulations predict that a robust comparison requires about 40 transiting GEMS (compared to the existing sample of ∼15) with 5σ mass measurements. Furthermore, we discuss the limitations of existing occurrence estimates for GEMS and briefly describe our planned systematic search to improve them.


INTRODUCTION
Corresponding author: Shubham Kanodia, skanodia@carnegiescience.edu
M-dwarfs are the most common type of star in the Galaxy (Reid & Gizis 1997; Henry et al. 2006; Reylé et al. 2021). The M-dwarf spectral type spans almost an order of magnitude in mass, from ∼0.08 M⊙ to ∼0.6 M⊙, and about 2600 K to 4000 K in effective temperature (Pecaut & Mamajek 2013). Compared to solar-type stars, these low-mass stars are expected to have correspondingly lower-mass protoplanetary disks (Andrews et al. 2013; Pascucci et al. 2016) and longer Keplerian orbital timescales (at a fixed distance). The combination of these factors is theorized to make it difficult to form giant planets around these stars in protoplanetary disks (Class II) under the core-accretion formation paradigm. Under this paradigm, it has traditionally been thought that a rocky heavy-element core of ∼10 M⊕ must first form, followed by runaway gaseous accretion that rapidly accumulates a massive gaseous envelope¹ (Mizuno 1980; Pollack et al. 1996). Early studies showed that, due to the lower disk masses and longer orbital timescales, the formation of a protoplanet massive enough to initiate runaway gaseous accretion would take too long relative to the lifetime of the gas (primarily H/He) in protoplanetary disks (Laughlin et al. 2004; Ida & Lin 2005). An alternative rapid formation mechanism has been proposed in the form of gravitational instability (GI; Boss 1997, 2006), which takes place during the protostellar phase (Class 0 or I disk) of massive disks, when the star is still embedded in a molecular cloud (Lada 1987; Dauphas & Chaussidon 2011).
While it is estimated that M-dwarfs host multiple small terrestrial planets on average (Dressing & Charbonneau 2015; Hardegree-Ullman et al. 2019; Hsu et al. 2020), the occurrence of giant planets around these stars is more uncertain due to their rarity (Endl et al. 2006; Johnson et al. 2010a; Maldonado et al. 2019; Schlecker et al. 2022; Gan et al. 2023a; Bryant et al. 2023). Attempts to understand the occurrence of giant exoplanets around M-dwarf stars (GEMS) have traditionally been limited to radial velocity (RV) surveys (Endl et al. 2006; Johnson et al. 2010a; Maldonado et al. 2019; Sabotta et al. 2021; Schlecker et al. 2022; Pinamonti et al. 2022), since M-dwarfs accounted for only a minor fraction of the target stars observed by the Kepler mission (Borucki et al. 2010). This has recently started to change with NASA's Transiting Exoplanet Survey Satellite (TESS), whose all-sky coverage includes millions of bright M-dwarfs amenable to RV follow-up (Ricker et al. 2014; Muirhead et al. 2018). Despite the low predicted occurrence of GEMS, enough M-dwarf host stars have been observed within the first few TESS cycles that attempts have been made to characterize the occurrence rate of transiting GEMS (Gan et al. 2023a; Bryant et al. 2023). However, in subsequent sections we discuss how these recent investigations are just a first step, and motivate a more detailed analysis and characterization of the TESS detection sensitivity for GEMS and the subsequent estimation of their occurrence.
In this manuscript, we present the motivation for our Searching for GEMS survey. In Section 2 we present the existing GEMS in planet samples and in protoplanetary disks. In Section 3, we discuss predictions from different formation and population synthesis models, and observational tests to distinguish between formation mechanisms. Next, in Section 4 we introduce our Searching for GEMS survey, its requirements and motivations, and provide a brief outline, before summarizing our work in Section 5. Preliminary results and trends seen in these existing samples will be discussed and evaluated in part II of this work.

As Exoplanets
In Figure 1 we show the transiting and non-transiting GEMS as queried from the NASA Exoplanet Archive on 2023 June 7 (Akeson et al. 2013; NASA Exoplanet Archive 2023), resulting in 10 non-transiting GEMS (with M_p sin i > 100 M⊕) and 18 transiting GEMS (with planetary radius ≳ 8 R⊕) orbiting stars with T_eff ≲ 4000 K, with masses measured at > 3σ precision (Table 1). In addition, there have been some detections of GEMS with microlensing (Suzuki et al. 2016) and direct imaging (Lannier et al. 2016; Nielsen et al. 2019) that contribute to statistical analyses; this will especially be true with the hundreds of detections expected from the Roman Space Telescope (Penny et al. 2019). While the transiting GEMS have largely been clustered around early M-dwarfs (M0-M2), there have been recent discoveries of four mid M-dwarf GEMS: TOI-5205 b (Kanodia et al. 2023b), TOI-3235 b (Hobson et al. 2023), TOI-519 b (Kagetani et al. 2023), and TOI-4860 b (Almenara et al. 2023; Triaud et al. 2023). The mid M-dwarf GEMS, in addition to recent RV detections around mid-to-late M-dwarfs (e.g., Morales et al. 2019; Quirrenbach et al. 2022), defy expectations from population synthesis models in protoplanetary disks and break the inferred mass budget².
We also note that while most of these transiting GEMS fall in the canonical hot-Jupiter parameter space based on their typical orbital periods of < 10 days, it is erroneous to classify them as hot. Because of their cooler and lower-mass M-dwarf hosts, the transiting GEMS have equilibrium temperatures < 1000 K, and hence are not expected to be inflated like hot Jupiters (Weiss et al. 2013; Dawson & Johnson 2018; Thorngren & Fortney 2018). Furthermore, their a/R* values are ≳ 10, and hence these objects may not be tidally locked or tidally circularized like hot Jupiters. Thus, transiting GEMS likely have different planetary properties (both bulk and atmospheric) than hot Jupiters, and should not be classified as such.
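The temperature argument above can be checked with the standard zero-order equilibrium temperature, T_eq = T_eff (R*/2a)^(1/2) (1 - A)^(1/4). The sketch below is only illustrative: it assumes a circular orbit, full heat redistribution, and zero Bond albedo by default, none of which are stated in the text, and the Sun-like comparison values are hypothetical.

```python
import math

def equilibrium_temperature(t_eff_k, a_over_rstar, bond_albedo=0.0):
    """Zero-order equilibrium temperature in kelvin.

    Assumes a circular orbit and full heat redistribution:
    T_eq = T_eff * sqrt(1 / (2 * a/R*)) * (1 - A)**0.25
    """
    if a_over_rstar <= 0:
        raise ValueError("a_over_rstar must be positive")
    return t_eff_k * math.sqrt(1.0 / (2.0 * a_over_rstar)) * (1.0 - bond_albedo) ** 0.25

# Hottest case allowed by the limits quoted in the text: T_eff = 4000 K, a/R* = 10.
t_eq_gems = equilibrium_temperature(4000.0, 10.0)

# Hypothetical comparison, not from the text: Sun-like host with a/R* = 8.
t_eq_hot_jupiter = equilibrium_temperature(5800.0, 8.0)

print(f"GEMS upper case: {t_eq_gems:.0f} K, Sun-like comparison: {t_eq_hot_jupiter:.0f} K")
```

Under these assumptions, even the hottest M-dwarf host with the smallest quoted a/R* stays below the 1000 K threshold, while a planet at a similar scaled separation around a Sun-like star does not.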

As structure in Protoplanetary Disks
In addition to the direct detection of exoplanets, high-resolution and high-contrast imaging of disks has enabled the detection of structures and gaps in Class II protoplanetary disks. While a number of physical mechanisms have been proposed to explain these structures, the presence of young protoplanets has been the topic of recent investigations (Dong et al. 2015). For example, van der Marel & Mulders (2021) identify potential trends between the incidence of structured disks and stellar mass that can be accounted for by massive Jovian-sized exoplanets. While they note a reduction in the prevalence of structured disks (rings or transition) with lower stellar masses, roughly 1-10% of their > 100 disks around such stars require the presence of a > 1 M_J planet to explain the features present. These trends were extended to very low mass stars (VLMS) and brown dwarfs using high-angular resolution observations from ALMA, which suggested the presence of substructures in a fraction of these disks (Pinilla 2022).
² An extreme example of this was the discovery of a 5 M_J planet around a 25 M_J brown dwarf, which likely formed through gravitational instability or fragmentation (Lodato et al. 2005).
Figure 2. Green triangles show an estimate of the heavy-element content (M_Z) of GEMS from planetary interior models (Thorngren et al. 2016) compared to the Class II disk dust mass (M_d) estimates as orange squares (Manara et al. 2022) and the median (and 1σ) trend seen in the Lupus sample (Ansdell et al. 2016). The green arrows show the approximate expected disk dust masses required to form the green triangles assuming 10% formation efficiency (Liu et al. 2019b). The red line shows the range of dust masses for Class 0 and I disks from Tychoniec et al. (2020). Takeaway: The formation of GEMS necessitates disks with many 100s of M⊕ of heavy elements. This could be achieved by anomalously massive Class II disks with underestimated dust masses, or by Class 0/I disks.
Zhang et al. (2023) perform a similar analysis and compare the incidence of structures in disks in the Taurus star-forming region to infer the occurrence of protoplanets as a function of stellar mass and orbital separation.
Additionally, Curone et al. (2022) performed a detailed analysis of the disk CIDA 1 around a ∼0.2 M⊙ star and the substructures present therein, and indicate that a planet of > 1.4 M_J at ∼10 AU is responsible for the observed morphology. Similarly, Long et al. (2023) note the presence of two gaps in the disk around the M3.5 star J04124068+2438157, which they attribute to a Saturn-mass planet at ∼90 AU.
To summarize, there is observational evidence for the existence of GEMS at large separations in young (Class II, i.e., < 10 Myr) systems, and at closer separations in mature stellar systems. This will be further bolstered by the addition of hundreds of GEMS projected to be astrometrically detected through Gaia DR4 (Sozzetti et al. 2014; Perryman et al. 2014).

Formation during protoplanetary phase
Under the core-accretion paradigm (Mizuno 1980), ignoring migration and transport of material across the disk, a massive solid core is formed by accretion inside its feeding zone (Ida & Lin 2004a; Alibert et al. 2005). Once this zone is depleted, the planet reaches isolation mass, where its accretion slows down and takes place primarily through the accretion of gas on to its envelope, up until it reaches crossover mass, where the mass of solids is roughly equal to the mass of gas (planetary mass of ∼30 M⊕; heavy-element mass of ∼10-20 M⊕). If this threshold is reached while the disk still contains its gas, it initiates exponential runaway gaseous accretion, where a Neptune-sized planet accretes enough gas to become a gas giant.
Notes to Table 1:
⋆ GEMS discovered through this survey.
* GEMS with distance > 200 pc, which hence will not be included in the occurrence rate sample. We note that HATS-77 b orbits a dwarf star with T_eff of 4071 K, which is ostensibly an M-dwarf host (Jordán et al. 2022). However, since the system is at > 200 pc, it is not included in the statistical sample either.
‡ TOI-4860 b was also confirmed in a separate publication by Triaud et al. (2023).
† NGTS-1 b is not shown in the plots in this manuscript since it has an imprecise radius due to its grazing transit.
We note the caveat that recent studies have suggested that the energy released during accretion of the solid core might delay runaway gaseous accretion (Venturini & Helled 2020; Kessler & Alibert 2023), such that the runaway phase only occurs much later, at a planet mass of ∼100 M⊕ with a heavy-element mass of 20-30 M⊕. On the other hand, simulations have also shown the feasibility of forming giant planets with lower-mass cores of ∼4 M⊕ in lower surface density disks, albeit over much longer timescales (Movshovitz et al. 2010). Therefore, predictions for the solid core mass depend on the models and initial conditions assumed. Yet GEMS, especially those around mid-to-late M-dwarfs, continue to present a challenge for theories of planet formation during the protoplanetary phase, i.e., core-accretion (Ida & Lin 2005; Liu et al. 2019a; Miguel et al. 2020; Burn et al. 2021).
Summarizing briefly, there are two main reasons for this difficulty in forming GEMS in the protoplanetary phase with core accretion. The first is the dust mass³ budget of the disk available to form giant planets. Based on estimates of exoplanet heavy-element content (M_Z) and the efficiency of planet formation, Kanodia et al. (2023b) show how median Class II disks might not have enough dust mass to form GEMS. In Figure 2 we compare the estimated heavy-element content (M_Z) for GEMS from planetary interior models (Thorngren et al. 2016) with disk dust masses (M_d), and show that assuming an optimistic planet formation efficiency of 10% through pebble accretion (Liu et al. 2019b) necessitates disks with many 100s of M⊕ of heavy elements. We note the caveat here that recent giant planet interior models tend to predict a lower heavy-element content when incorporating results from newer equations of state, results from the Solar System gas giants, and atmospheric metallicity estimates (Miguel et al. 2022; Miguel & Vazan 2023; Müller & Helled 2021, 2022), which might alleviate some of this mass deficit. The second is the timescale of formation of a massive solid core to initiate runaway gaseous accretion: the crossover mass must be attained before the gas in the disk is accreted on to the host star. Laughlin et al. (2004) show that the lower disk masses (and consequently lower surface densities), coupled with longer Keplerian timescales (also due to lower host stellar masses), mean that planetary cores around M-dwarfs at 5 AU might take too long to form relative to the lifetime of the disks.
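Both difficulties reduce to simple scalings, sketched below. The first function divides a heavy-element mass by the formation efficiency to get the required disk dust mass; the second applies Kepler's third law in solar units. The 60 M⊕ heavy-element content and the 0.2 M⊙ host are hypothetical inputs chosen only for illustration; the 10% efficiency and the 5 AU separation are the values quoted in the text.

```python
import math

def required_dust_mass(m_z_earth, efficiency=0.10):
    """Disk dust mass (M_earth) needed to supply m_z_earth of heavy elements."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return m_z_earth / efficiency

def orbital_period_years(a_au, m_star_msun):
    """Keplerian period from Kepler's third law (a in AU, stellar mass in M_sun)."""
    if a_au <= 0 or m_star_msun <= 0:
        raise ValueError("inputs must be positive")
    return math.sqrt(a_au ** 3 / m_star_msun)

# Hypothetical planet with 60 M_earth of heavy elements, at 10% efficiency.
dust_needed = required_dust_mass(60.0)

# Orbital period at 5 AU for a solar-mass star and a hypothetical 0.2 M_sun M-dwarf.
p_sun = orbital_period_years(5.0, 1.0)
p_mdwarf = orbital_period_years(5.0, 0.2)

print(f"dust needed: {dust_needed:.0f} M_earth")
print(f"period at 5 AU: {p_sun:.1f} yr (1.0 M_sun) vs {p_mdwarf:.1f} yr (0.2 M_sun)")
```

At a fixed separation the period scales as M*^(-1/2), so every orbit-paced process in the disk runs correspondingly slower around a lower-mass host.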
Underestimated disk dust masses: During the protoplanetary phase, i.e., Class II disks, the dust mass⁴ is typically estimated from continuum flux measurements of mm-sized dust particles, either by extrapolating from measurements at 850 µm (Hildebrand 1983) or by fitting SEDs to multi-wavelength data (Pinte et al. 2008). These flux-to-mass estimates break down if the continuum emission is optically thick (Eisner et al. 2018; Rilinger et al. 2023; Xin et al. 2023), or in the presence of gaps or rings in the disk (Liu et al. 2022), leading to underestimation of the dust masses by a factor of 3-10.
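For reference, the optically thin flux-to-mass conversion usually attributed to Hildebrand (1983) is M_d = F_ν d² / (κ_ν B_ν(T)). The sketch below implements that relation in cgs units. The dust temperature (20 K), the opacity (3.5 cm² g⁻¹ at 850 µm), and the example flux and distance are assumptions made for illustration and are not taken from the text; only the 3-10x correction range is.

```python
import math

H = 6.62607015e-27     # Planck constant, erg s
C = 2.99792458e10      # speed of light, cm/s
K_B = 1.380649e-16     # Boltzmann constant, erg/K
PC_CM = 3.0857e18      # parsec in cm
M_EARTH_G = 5.972e27   # Earth mass in g
JY_CGS = 1.0e-23       # 1 Jy in erg s^-1 cm^-2 Hz^-1

def planck_nu(nu_hz, temp_k):
    """Planck function B_nu(T) in erg s^-1 cm^-2 Hz^-1 sr^-1."""
    return 2.0 * H * nu_hz ** 3 / C ** 2 / math.expm1(H * nu_hz / (K_B * temp_k))

def dust_mass_earth(flux_mjy, dist_pc, wavelength_um=850.0,
                    temp_k=20.0, kappa_cm2_g=3.5):
    """Optically thin dust mass in M_earth; temp_k and kappa_cm2_g are assumed values."""
    nu_hz = C / (wavelength_um * 1.0e-4)
    flux_cgs = flux_mjy * 1.0e-3 * JY_CGS
    dist_cm = dist_pc * PC_CM
    mass_g = flux_cgs * dist_cm ** 2 / (kappa_cm2_g * planck_nu(nu_hz, temp_k))
    return mass_g / M_EARTH_G

# Hypothetical disk: 10 mJy at 850 micron, 150 pc away.
m_thin = dust_mass_earth(10.0, 150.0)

# Range implied by the 3-10x underestimation quoted in the text.
m_corrected = (3.0 * m_thin, 10.0 * m_thin)
print(f"optically thin estimate: {m_thin:.1f} M_earth")
print(f"with 3-10x correction: {m_corrected[0]:.1f} to {m_corrected[1]:.1f} M_earth")
```

The estimate is linear in flux and inversely proportional to the assumed opacity, so any optical-depth correction propagates directly into the inferred mass budget.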
More fundamentally, the dust mass in mm-sized particles in ALMA measurements does not represent the true primordial mass budget, with formation already underway locking up the dust in > mm-sized particles (Greaves & Rice 2010; Najita & Kenyon 2014), as well as depletion due to radial drift (Appelgren et al. 2023). If the accretion processes required to form a massive core (to initiate runaway gaseous accretion) started earlier in the protoplanetary stage, the true mass budget available for planet formation would be larger than the 850 µm mass estimates.
Alleviating the timescale problem: The timescale problem could potentially be circumvented by pebble accretion, which is a much faster process than planetesimal accretion⁵ (Lambrechts & Johansen 2012; Savvidou & Bitsch 2023). Furthermore, new theories suggest that some young disks could concentrate solids in their midplane through self-gravitating spiral waves, thereby hastening the formation of a core massive enough to reach the runaway stage (Haghighipour & Boss 2003; Baehr 2023). Lastly, studies of large disk clusters have shown that low-mass M-dwarf disks tend to have longer lifetimes than those around more massive stars (Pfalzner et al. 2022), with some extreme examples existing in the form of so-called 'Peter-Pan' disks lasting many 10s of Myr (Flaherty et al. 2019; Silverberg et al. 2020; Coleman & Haworth 2020; Wilhelm & Portegies Zwart 2022). Together these factors might ease the timescale problem against the formation of GEMS.
Thus, to summarize, while forming GEMS through core-accretion during the protoplanetary disk phase has traditionally been thought to be difficult, this formation pathway cannot be ruled out given its dependence on numerous poorly understood disk parameters.

Formation during protostellar phase
An alternative to core accretion is the formation of GEMS in the protostellar phase, i.e., in Class 0 or I disks (or protostars), through gravitational instability (GI; Kuiper 1951; Cameron 1978; Boss 1997). Under this mechanism, massive disks can start fragmenting and forming dense self-gravitating clumps as precursors to giant planets⁶. Gravitational collapse is prevented close to the star due to the faster rotation and thermal pressure (higher temperatures). Boss (2006, 2011) show how GI can form giant planets around M-dwarfs with masses of 0.1 and 0.5 M⊙ at large separations, which is then followed by migration induced by the spiral arms (Boss 2023). Mercer & Stamatellos (2020) run smoothed particle hydrodynamics (SPH) simulations to show that the GI formation of GEMS necessitates disk-to-star mass ratios between ∼0.3 and 0.6. Haworth et al. (2020) presented a range of SPH models for massive disks around M-dwarf stars to show the difference between disk-to-star mass ratios for disks that remain axisymmetric, produce spiral arms, or fragment into clumps. Boss & Kanodia (2023) perform GI-based population synthesis to quantify the frequency with which GEMS can be formed around stellar masses ranging from 0.1 to 0.5 M⊙ and disk-to-star mass ratios of 0.05 to 0.3. These mass ratios are consistent with those seen in Class 0/I protostellar samples from VLA (Tychoniec et al. 2018, 2020; Xu 2022; Fiorellino et al. 2023).

Observationally distinguishing between formation mechanisms
In this subsection, we consider and establish a few potential mechanisms to observationally distinguish between forming GEMS via core-accretion vs. gravitational instability.

Stellar metallicity trends
Since the discovery of the first few gas giant exoplanets, studies have shown that the occurrence of these planets around FGK stars is positively correlated with stellar metallicity (Gonzalez 1997; Santos et al. 2001; Fischer & Valenti 2005; Wang & Fischer 2015; Petigura et al. 2018; Narang et al. 2018; Osborn & Bayliss 2020). This positive metallicity trend agrees well with predictions from the core-accretion model of formation (Ida & Lin 2004b; Matsumura et al. 2021), where higher metallicity stars correspond to higher metallicity disks (higher solid surface density), which translates into faster solid core formation⁷. This explanation of the metallicity trend is further aided by observational evidence that higher metallicity disks tend to live longer (Yasui et al. 2009, 2010), with the higher opacity potentially shielding the disk against photoevaporation and reducing disk accretion (Yasui et al. 2021). Overall, numerous simulations and observational studies have shown that the core-accretion paradigm of planet formation favours higher metallicity host stars.
In contrast, simulations suggest that lower metallicity molecular clouds should favour gravitational fragmentation (Matsukoba et al. 2022; Elsender & Bate 2021), while GI is largely agnostic of the disk metallicity (Boss 2002). Put simply, in the optically thick environment of the disk, higher stellar metallicity leads to higher opacities (due to more free electrons in the disk) that can lengthen the cooling timescales. However, Boss (2002) showed that at the typical separations at which GI occurs, the temperature equilibrates much faster than the dynamical timescales, and a factor of a few (± 0.5 dex) change in metallicity does not make an appreciable difference in the efficiency of GI.
This difference in stellar metallicity dependence is indeed seen as one of the cleaner ways to distinguish between the two mechanisms of giant planet formation given a large enough sample, and it also sidesteps the numerous complications associated with other methods such as atmospheric chemistry (Venturini & Helled 2020; Mollière et al. 2022). This difference has been noticed in the observed sample of giant planets and brown dwarfs orbiting FGK stars, with a transition around 4-10 M_J (Santos et al. 2017; Schlaufman 2018; Narang et al. 2018): lower-mass giant planets (∼1-4 M_J) prefer more metal-rich stars than the more massive giant planets (> 10 M_J). This feature is hypothesized to be explained by the prevalence of core-accretion formed planets below this transition, and GI formed objects (planets or brown dwarfs) above it.
Complexities in M-dwarf metallicity determination: While studies have attempted to extend the positive metallicity dependence of giant planets to the M-dwarf spectral type (Johnson et al. 2010b; Maldonado et al. 2019; Gan et al. 2022), the cooler stellar atmospheres of M-dwarfs are blanketed with molecular features that complicate metallicity determination. The molecular features complicate continuum estimation (Pineda et al. 2013) and cause inaccuracies in line profile measurement due to line blending, especially in the optical, where TiO is the dominant source of opacity in the spectrum (Kirkpatrick et al. 1991; Jorgensen 1994). Rains et al. (2021) discuss how the errors in the TiO line lists (McKemmish et al. 2019) manifest as discrepancies in model atmospheres. These issues are further exacerbated by low S/N observations in the optical due to the low effective temperatures and luminosities of these stars.

Planet atmospheric composition
Studies have suggested that the metallicity and composition of the atmospheres of giant planets could hold clues to their formation and evolution history, ranging from the mechanisms and location of formation to the subsequent migration, if applicable (Helled & Bodenheimer 2010; Öberg et al. 2011; Madhusudhan et al. 2014; Knierim et al. 2022; Dash et al. 2022, and numerous others). Further to the above discussions, studies have suggested potential differences in planetary atmospheres formed through core-accretion vs. GI (Hobbs et al. 2022). However, this initial picture is likely to be complicated by the location of formation and subsequent planetary accretion (Helled et al. 2014). Mollière et al. (2022) list the numerous pitfalls and challenges in trying to invert the formation history of individual planets based on present-day atmospheres. Therefore, extreme caution must be exercised in attempts to connect present-day atmospheric observations with formation and evolution history. Instead, it is possible that planetary mass vs. atmospheric (and bulk) metallicity trends across a homogeneous sample of planets, sampled across planetary mass and stellar mass space, present a better alternative (Thorngren et al. 2016; Teske et al. 2019; Welbanks et al. 2019).
We can now begin to test the limits of these current theories by atmospheric characterization of carefully selected samples across a larger stellar mass range with JWST and ARIEL in the future (Tinetti et al. 2016).

SEARCHING FOR GEMS SURVEY OUTLINE
Given the theoretical considerations (and limitations) presented above, and the observational tests to discriminate between the formation of GEMS in the protoplanetary phase (through core-accretion) vs. the protostellar phase (through gravitational instability), we ask the following questions:
1. Mass-Radius+ (M-R+) relations: Does the difficulty in forming GEMS in this mass-starved regime through core-accretion manifest as a systematic difference in their bulk properties as a function of stellar mass, i.e., when characterizing the sample of giant planets in 3D (planet mass, planet radius, stellar mass)?
2. Occurrence Rates: Do GEMS have a lower occurrence compared to similar giant planets in short orbital periods (the canonical hot Jupiters) around FGK stars? In particular, the M-dwarf spectral type also allows comparison of the giant planet occurrence rate as a function of spectral sub-type.
To address these questions, we have started the Searching for GEMS survey, which is currently ongoing and entails the follow-up and confirmation of new GEMS planet candidates, in addition to demographic analysis of TESS light curves to provide improved constraints on the occurrence of short-period GEMS. This transit survey will also contribute GEMS to answer the question: Does the metallicity transition in the planetary mass vs. stellar metallicity plane between core-accretion and GI (Section 3.2.1) also depend on the host stellar mass? To answer this, we will extend the 2D planetary mass vs. stellar metallicity plane to a third dimension to include stellar mass.

Requirements for M-R+ relations
In order to answer the question posed for M-R+ relations we generate a set of simulated catalogues of GEMS with mass measurements based on different assumptions, and assess our ability to recover the input trends as a function of catalogue size.

Mass-Radius (M-R)
We start off by trying to recover and quantify differences in the 2D mass-radius (M-R) distribution of giant planets between a sample of FGK stars (∼0.6 M⊙ - 1.5 M⊙) and a simulated sample of GEMS (≲ 0.6 M⊙). Our FGK sample consists of ∼350 transiting giant planets (8 R⊕ ≲ R_p ≲ 15 R⊕) orbiting stars ranging from ∼0.6 M⊙ to 1.5 M⊙ with planetary masses known to better than 3σ (Akeson et al. 2013; NASA Exoplanet Archive 2023). The steps followed are as follows:
1. Obtain the joint mass-radius distribution, f(m, r)_FGK, for the FGK giant planet sample: We fit the FGK sample using the updated nonparametric inference tool MRExo (Kanodia et al. 2019, 2023a), which utilizes beta-density (or normalized Bernstein) polynomials (Ning et al. 2018) to jointly fit their masses and radii and obtain the 2D probability density function (PDF), f(m, r), where m and r signify the planetary mass and radius respectively. We use the cross-validation functionality within this toolkit to optimize the number of degrees (for the beta density functions), which yields 48 in each dimension (Kanodia et al. 2019, 2023a).
2. Simulated M-dwarf sample: We perform rejection sampling in 2D to obtain a simulated M-dwarf sample with n planets in mass-radius space following the same distribution as the FGK sample, to which we ascribe 10σ and 5σ errors in radius and mass respectively. We then use MRExo to fit the joint M-R distribution, f(m, r)_{M=FGK}, for this simulated sample. We repeat this step 100 times to obtain a distribution of PDFs for the simulated M-dwarf catalogue across 6 steps spanning a range of GEMS sample sizes from 15 to 150 (Figure 3). We simulate the entire M-dwarf catalogue from scratch (instead of training it on the existing transiting GEMS sample) to avoid any potential biases that could be incurred from this limited sample size, and to estimate the ideal sample size in an agnostic manner.
3. Obtain the conditional distribution of planetary mass for a given radius, f(m|r), for the real FGK and simulated M-dwarf samples: The 2D joint distribution f(m, r) can then be conditioned on Jupiter's radius (R_J or 11.2 R⊕) to obtain a PDF for the inferred mass of Jovian-sized objects, f(m|r = 1 R_J). The ability to compare f(m|r = 1 R_J)_FGK and f(m|r = 1 R_J)_{M=FGK} in this simulated case, where the two samples share the same joint M-R distribution (i.e., the conditional distributions should be similar), represents our 'best-case scenario' for distinguishing between the inferred PDFs of two samples of a given size.
5. We then repeat steps 2 and 3, reducing the masses for the M-dwarf planet sample to 60% and 80% of the FGK sample, to quantify the ability to distinguish between the M-dwarf and FGK samples as a function of M-dwarf sample size.
6. Finally, we calculate the Earth-mover distance (EMD) and t-statistic for comparing histograms of the M-dwarf and FGK samples, and pick the optimum sample size where the 5th percentile of the metric comparing our test distribution with the FGK sample (f(m|r = 1 R_J)_{M=80%FGK}, f(m|r = 1 R_J)_FGK) intersects the 50th percentile (median) of the metric distribution from our best-case scenario with similar distributions (f(m|r = 1 R_J)_{M=FGK}, f(m|r = 1 R_J)_FGK).
Figure 4. The solid line represents the median metric value, while the different shaded regions represent the 16-84%, 5-95%, and 1-99% percentile regions respectively. The vertical black line is the minimum sample size of transiting GEMS required based on comparing the distributions. Left: Welch's t-test, suggesting the need for about 20 GEMS if we solely wish to compare the median values of the two distributions assuming normality. Right: The Earth-mover distance, which suggests the need for about 35 GEMS to distinguish between the two distributions agnostic of Gaussian assumptions. Takeaway: To distinguish between a sample of giant planets around FGK stars and GEMS based on their M-R distribution when the GEMS are about 60% of the mass of the FGK planets, we require about 35 GEMS if we do not assume an underlying normal distribution and 20 GEMS if we do.
Based on the criterion defined above, we are able to distinguish between the means of the M-dwarf and FGK planetary M-R distributions when the M-dwarf distribution is reduced to 80% and 60% (i.e., a 20% and 40% difference) with about 100 and 20 GEMS respectively using Welch's t-test. Using the more agnostic and model-independent EMD, we require 35-40 GEMS to distinguish between the two distributions when the M-dwarf masses are offset to 60%, and > 150 for the 80% case (Figure 4). Thus, limiting the comparison between the two samples to just the M-R plane would require about 35 GEMS with mass measurements at the 5σ level to ascertain a 40% difference in mass.
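The two comparison metrics used in the steps above can be illustrated with a toy Monte Carlo. This is a minimal sketch, not the survey's pipeline: it skips the MRExo fit and the conditioning on radius, and instead draws planet masses directly from a hypothetical lognormal distribution (the `mu` and `sigma` values, the function names, and the quantile-based EMD approximation are all assumptions made here).

```python
import math
import random
import statistics

def welch_t(x, y):
    """Welch's t-statistic for two samples with unequal variances."""
    vx, vy = statistics.variance(x), statistics.variance(y)
    return (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(vx / len(x) + vy / len(y))

def emd_1d(x, y, n_quantiles=1000):
    """Approximate 1D Earth-mover distance as the mean gap between empirical quantiles."""
    xs, ys = sorted(x), sorted(y)
    total = 0.0
    for i in range(n_quantiles):
        u = (i + 0.5) / n_quantiles
        qx = xs[min(int(u * len(xs)), len(xs) - 1)]
        qy = ys[min(int(u * len(ys)), len(ys) - 1)]
        total += abs(qx - qy)
    return total / n_quantiles

def simulate(n_gems, scale, n_fgk=350, n_trials=100, seed=0, mu=0.0, sigma=0.5):
    """Median |t| and median EMD between a fixed toy FGK sample and scaled toy GEMS samples."""
    rng = random.Random(seed)
    fgk = [rng.lognormvariate(mu, sigma) for _ in range(n_fgk)]  # toy masses, M_J
    t_vals, emd_vals = [], []
    for _ in range(n_trials):
        gems = [scale * rng.lognormvariate(mu, sigma) for _ in range(n_gems)]
        t_vals.append(abs(welch_t(fgk, gems)))
        emd_vals.append(emd_1d(fgk, gems))
    return statistics.median(t_vals), statistics.median(emd_vals)

for n in (15, 35, 150):
    t_med, emd_med = simulate(n, scale=0.6)
    print(f"n={n}: median |t|={t_med:.2f}, median EMD={emd_med:.3f}")
```

Comparing `simulate(n, scale=0.6)` against the `scale=1.0` control as n grows mirrors the logic of the percentile-crossing criterion, although the numbers from this toy model should not be read as estimates of the required sample size.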

Mass-Radius-Stellar Mass (MRStM)
Instead of restricting our analysis to just the two M-R dimensions, we can incorporate additional information about these systems in the form of their stellar mass to jointly fit f(m, r, stm), where stm refers to the continuous variable spanning stellar mass using MRExo.Using the sample defined above, we perform a fit in 3D where the cross-validation method optimizes for the number of degrees to be 41 in each dimension.Conditioning on planetary radius and stellar mass to predict the planetary mass -f (m|r = 1R J , stm) -across a range of stellar masses, we obtain a nominal linear trend between the expectation values of the PDFs and stellar mass as shown in Figure 5.
Similar to the methodology from the 2D simulation, we generate synthetic M-dwarf GEMS samples and compare the predicted masses for these planets with those around more massive stars using the EMD metric. We do not investigate the uncertainties associated with the expectation values here (Figure 5), since the significance of this trend (both statistical and physical) will be discussed in a follow-up paper. We first generate a synthetic M-dwarf catalogue where the 2D joint f(m, r) distribution is the same as that for a solar-mass star, i.e., f(m, r|stm = 1.0 M⊙). For this synthetic catalogue of planets, we ascribe stellar masses using a half-normal distribution, 0.6 - |N(0, 0.1)|, to avoid an undue bias towards mid-to-late M-dwarfs and to avoid including K-dwarfs.
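The half-normal prescription for stellar masses can be sketched as follows. The function name is hypothetical, and the lower cut at 0.08 M⊙ (the approximate bottom of the M-dwarf mass range) is a guard added here rather than part of the stated prescription; with σ = 0.1 it is reached by a negligible fraction of draws.

```python
import random
import statistics

def draw_stellar_masses(n, upper=0.6, sigma=0.1, lower=0.08, seed=None):
    """Draw n stellar masses (M_sun) from upper - |N(0, sigma)|.

    The lower bound is an assumed guard, not part of the source prescription.
    """
    rng = random.Random(seed)
    masses = []
    while len(masses) < n:
        mass = upper - abs(rng.gauss(0.0, sigma))
        if mass >= lower:
            masses.append(mass)
    return masses

masses = draw_stellar_masses(5000, seed=1)
print(f"median = {statistics.median(masses):.3f} M_sun, max = {max(masses):.3f} M_sun")
```

Because the distribution is folded at 0.6 M⊙, no draw exceeds the M-dwarf upper mass limit, and most of the sample sits among early M-dwarfs, as in the existing transiting GEMS sample.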
Then the mass for each M-dwarf planet is scaled based on an extrapolation of the linear trend shown in Figure 5, which is further normalized to be ∼1 for Jupiters around solar-mass stars. This way we ascribe a stellar mass dependence to the simulated GEMS, generate 100 such catalogues for each M-dwarf sample size ranging from 15 to 150, and append these to the original FGK sample. This dependence is partly motivated by the existing sample, and will be explored in further detail (both empirically and theoretically) in part II of this work. After performing a 3D fit, f(m, r, stm), for each catalogue of planets, we compare f(m, r|stm = 0.75 M⊙) with f(m, r|stm = 1.0 M⊙) as a control that should not show any sample-size dependence (since the FGK sample is fixed), and then f(m, r|stm = 0.5 M⊙) with f(m, r|stm = 1.0 M⊙) to show the convergence of the EMD with increasing M-dwarf sample size (Figure 6).
To establish convergence, we consider the M-dwarf sample size at which the fractional change of the central 68% interval (EMD_84% - EMD_16%) of the EMD distribution with sample size goes below 1%, which is at ∼ 40. To test the impact of the assumed slope on the inferred sample size, we also compare the EMD for two f(m|r = 1 R_J) distributions for stellar masses of 0.55 M⊙ and 0.6 M⊙, i.e., with more similar distributions, and find that we attain a similar 1% convergence criterion with a GEMS sample of about 40-50.
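One way to read the convergence criterion is sketched below. The helper names are hypothetical. We assume that the "fractional change" is measured between consecutive sample sizes, and the toy numbers are invented purely to exercise the function (they are not survey results).

```python
import statistics


def central_68_width(values):
    """Width of the central 68% interval: 84th minus 16th percentile."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return cuts[83] - cuts[15]  # cuts[k] is the (k + 1)th percentile


def convergence_size(emd_by_size, threshold=0.01):
    """Smallest sample size whose interval width changed by < threshold.

    emd_by_size maps an M-dwarf sample size to the EMD values obtained from
    the simulated catalogues of that size. The change is taken relative to
    the previous (smaller) sample size, which is our assumption.
    """
    sizes = sorted(emd_by_size)
    widths = [central_68_width(emd_by_size[n]) for n in sizes]
    for prev_w, w, n in zip(widths, widths[1:], sizes[1:]):
        if abs(w - prev_w) / prev_w < threshold:
            return n
    return None


# Toy input: 101 evenly spaced EMD values per sample size, with a spread
# that shrinks and then flattens out.
toy = {
    15: [1.000 * i for i in range(101)],
    25: [0.800 * i for i in range(101)],
    40: [0.796 * i for i in range(101)],
}
converged_at = convergence_size(toy)
```

In this toy input the width drops by 20% from 15 to 25 and by 0.5% from 25 to 40, so the function returns 40.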
These simulations help us assert that we should be able to distinguish between different stellar-mass and planetary-mass dependencies for Jovians with a sample size of about 40-50 planets, which will help answer the first question regarding M-R+ relations: whether the difficulty in forming GEMS through core-accretion in a mass-starved environment manifests as a systematic difference in their bulk-density as a function of stellar mass.

Limitations of Existing Transiting GEMS Occurrence Estimates
Gan et al. (2023a) have estimated the occurrence of short-period (0.8-10 days) transiting giant planets (7 R⊕ ≤ R_p ≤ 2 R_J) around ∼ 60,000 early M-dwarfs with T_eff between 2900 K and 4000 K, stellar masses between 0.45 M⊙ and 0.65 M⊙, and 10.5 ≤ T_mag ≤ 13.5, as 0.27 ± 0.09 %. Similarly, Bryant et al. (2023) suggest an occurrence rate of 0.194 ± 0.072 % for planets with 0.6 R_J ≤ R_p ≤ 2.0 R_J orbiting ∼ 90,000 nearby low-mass stars with M_* ≤ 0.71 M⊙. These investigations have already started to provide insight into the occurrence of transiting GEMS; however, we note three main limitations with these existing studies that our survey design aims to improve upon:
1. Jovian-sized objects can range in mass from 0.3 M_J to ∼ 100 M_J, i.e., from Saturn-mass planets to late M-dwarfs. It is not possible to confirm the planetary nature of transiting Jovian-sized objects with statistical validation alone (i.e., without dynamical confirmation with RVs or TTVs). Therefore, the transiting objects used in these two studies to estimate occurrence rates are not necessarily all planets and are likely contaminated by astrophysical false positives such as brown-dwarfs and eclipsing binaries (EBs). This is indeed seen for TOI-5375 B, which was identified as a planet candidate by Gan et al. (2023a), but confirmed to be an EB by Lambert et al. (2023). These studies have attempted to account for this by assigning a false positive probability (FPP) to the candidates discovered through their pipelines using statistical validation tools (Bryant et al. 2023), or by estimating it based on the prevalence of brown-dwarfs and EBs in the literature (Gan et al. 2023a). However, literature samples of brown-dwarfs and giant planets are heterogeneous and not drawn from M-dwarf hosts. Additionally, as pointed out by Bryant et al. (2023), the statistical techniques are limited in their ability to distinguish between giant planets, brown-dwarfs and very low-mass stars. Without more relevant FPP estimates (from M-dwarf hosts, short orbital periods and homogeneous samples) or spectroscopic validation, these existing estimates should be considered carefully (and with the appropriate caveats) while comparing across samples and surveys.
2. The input sample of M-dwarfs for both studies conflates mid-to-late K-dwarfs with early M-dwarfs. For example, Bryant et al. (2023) have a T_eff and stellar mass upper limit of 4500 K and 0.75 R⊙ respectively, which includes mid-to-late K-dwarfs. Given the potentially positive correlation between stellar mass and the occurrence of GEMS (Johnson et al. 2010a; Zhou et al. 2019; Bryant et al. 2023), this choice for the input stellar sample biases the occurrence rate to higher values. Bryant et al. (2023) account for this by dividing their stellar sample into different stellar mass bins.
In particular, the Gan et al. (2023a) sample is (mostly) magnitude limited, but confounded by the selection function of the QLP sample, and consists of 60,000 (primarily) early M-dwarfs. Conversely, the Bryant et al. (2023) sample is volume limited to 100 pc, but also includes cuts based on magnitude, which precludes it from being volume complete across the entire M-dwarf spectral type⁸. Furthermore, the distance and magnitude cut-offs chosen for the above studies do not include most of the confirmed planetary transiting GEMS that have been discovered thus far and are beyond 100 pc (Table 1).
Despite these limitations, these existing occurrence estimates already demonstrate the lower occurrence of transiting GEMS compared to FGK hot-Jupiters. We will attempt to tackle these limitations through our systematic candidate search and analysis that we describe in the survey work plan.

Survey Design
While the methodology and preliminary results from this ongoing survey will be described in upcoming manuscripts, we briefly describe the sample selection, work plan and survey status here.

Sample Selection
We adopt a volume-limited sample spanning 200 pc, which covers most of the existing confirmed planets. Since confirming the planetary nature of transiting Jovian objects requires resource-intensive observations, this approach capitalizes on existing observations while providing a large sample for more precise occurrence estimates. This is to help overcome the limitations of existing efforts described above, primary among which is the need for spectroscopic validation of the planet candidates utilized in occurrence rate estimates or, in its absence, for more informed and relevant false positive estimates. To do this, we start off with the Gaia DR3 catalogue (Gaia Collaboration et al. 2023), to which we apply the following cuts: (i) a parallax of ϖ > 5 mas (corresponding to an upper limit of 200 pc), (ii) a parallax error of σ_ϖ < 1 mas, and (iii) null quality flags indicating the object is not a quasi-stellar object or galaxy (not in the Gaia DR3 qso_candidates or galaxy_candidates tables), not a binary star (not identified in Gaia DR3 as an astrometric, spectroscopic, or EB), and not a duplicated source.
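The three Gaia cuts listed above can be expressed as a simple filter. This is an illustrative sketch only: the record is a plain dictionary, the key names are modelled on Gaia DR3 source-table columns but are our assumption about how the flags would be exposed, and the distance is the naive inverse of the parallax.

```python
def distance_pc(parallax_mas):
    """Naive distance estimate in parsec from a parallax in milliarcseconds."""
    return 1000.0 / parallax_mas


def passes_gaia_cuts(row):
    """Return True if a Gaia-like record passes cuts (i) to (iii).

    Assumed keys: parallax and parallax_error in mas, boolean flags
    in_qso_candidates, in_galaxy_candidates and duplicated_source, and an
    integer non_single_star that is 0 for sources not flagged as binaries.
    """
    if row["parallax"] is None or row["parallax"] <= 5.0:  # (i) within 200 pc
        return False
    if row["parallax_error"] >= 1.0:                       # (ii) sigma < 1 mas
        return False
    flagged = (                                            # (iii) quality flags
        row["in_qso_candidates"]
        or row["in_galaxy_candidates"]
        or row["non_single_star"] != 0
        or row["duplicated_source"]
    )
    return not flagged


example = {
    "parallax": 8.0, "parallax_error": 0.05,
    "in_qso_candidates": False, "in_galaxy_candidates": False,
    "non_single_star": 0, "duplicated_source": False,
}
keep = passes_gaia_cuts(example)
```

The example record corresponds to a clean source at 125 pc and is retained. Changing any one flag, or lowering the parallax to 5 mas or below, rejects it.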
The source identifier in each Gaia data release is not guaranteed to be the same, so we obtain the Gaia DR2 source identifier using the dr2_neighbourhood table provided with the Gaia DR3 release. We reject sources without matching dr2_source_id and dr3_source_id because this can arise from a spurious detection in either DR2 or DR3 (Torra et al. 2021), or from erroneous proper motions from DR2 and double stars that are a resolved pair in Gaia DR3 (Fabricius et al. 2021). We crossmatch the Gaia objects with TICv8.2 using the Gaia DR2 source identifier (Stassun et al. 2019) and retain stars observed through TESS Cycle 5 using the high-precision TESS pointing tool⁹ (Burke et al. 2020). To select the M dwarfs in the sample, we place constraints on the 2MASS colors and magnitudes (Table 7; Cifuentes et al. 2020), keeping stars with (i) 0.8 < J - K < 1.25 and (ii) 4.7 < M_Ks < 10.1, resulting in our final sample of ∼ 1 million stars, the properties of which are depicted in Figure 7. Splitting this along the spectral sub-type using M_Ks cut-offs from Cifuentes et al. (2020), we note that about 21% of our sample are early M-dwarfs (< M2.5V; M_Ks < 6) and 30% are mid M-dwarfs (M2.5V-M4; 6 < M_Ks < 7.1), with the rest being late M-dwarfs. While we denote GJ 488 and TRAPPIST-1 as example M0/M9s (Figure 7), these do not set our CMD cut-offs due to the intrinsic astrophysical scatter (∼ 0.2 mag) in the colour of early M-dwarfs as seen in Figure 14 and A2 from Cifuentes et al. (2020). We use the CMD to define our sample instead of derived stellar parameters, which often suffer from inaccuracies due to systematics in internal structure models and evolutionary tracks (Kesseli et al. 2018; Dieterich et al. 2021; Passegger et al. 2022).
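The colour-magnitude selection and the split into early, mid and late M-dwarfs can be sketched as below. The function names are hypothetical, the absolute magnitude uses the naive parallax distance with no extinction correction (our simplification), and the treatment of stars that fall exactly on a boundary is an assumption.

```python
import math


def absolute_ks(ks_mag, parallax_mas):
    """Absolute Ks magnitude: M = m + 5 log10(parallax / mas) - 10."""
    return ks_mag + 5.0 * math.log10(parallax_mas) - 10.0


def classify_m_dwarf(j_mag, ks_mag, parallax_mas):
    """Return 'early', 'mid' or 'late', or None if outside the CMD cuts.

    Cuts: 0.8 < J - Ks < 1.25 and 4.7 < M_Ks < 10.1.
    Sub-types: early for M_Ks < 6, mid for 6 <= M_Ks < 7.1, late otherwise.
    """
    colour = j_mag - ks_mag
    m_ks = absolute_ks(ks_mag, parallax_mas)
    if not (0.8 < colour < 1.25 and 4.7 < m_ks < 10.1):
        return None
    if m_ks < 6.0:
        return "early"
    if m_ks < 7.1:
        return "mid"
    return "late"


# A star with Ks = 10 at 100 pc (parallax of 10 mas) has M_Ks = 5.
label = classify_m_dwarf(j_mag=10.85, ks_mag=10.0, parallax_mas=10.0)
```

Selecting on observed colour and absolute magnitude keeps the sample definition independent of model-derived stellar parameters, which is the motivation given in the text.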

Work Plan
Given their faintness, publicly available light curves such as those from SPOC or QLP do not exist for our entire sample. Instead, we will extract light curves from the full-frame images (FFIs) using the public package TESS-Gaia Light Curves (TGLC; Han & Brandt 2023) for all our targets, which performs a contamination correction for the TESS light curves based on their Gaia positions, magnitudes and colours. A detailed characterization of the photometric performance of this reduction routine for our sample, as a function of stellar properties, will be included in a future manuscript upon completion of said search. Then, we will search for transiting GEMS candidates around our M-dwarf sample using the box-least squares algorithm (Kovács et al. 2002) to identify candidates, and a combination of publicly available tools to vet these candidates for astrophysical false positives (e.g., DAVE and TRICERATOPS; Kostov et al. 2019; Giacalone et al. 2021). The candidates that survive vetting will be observed from ground-based facilities to obtain transits (to rule out background EBs), high-contrast imaging (to constrain dilution) and spectra (to rule out EBs and brown-dwarfs). This sample of well-vetted planets will be used to inform the occurrence rates for GEMS.
⁹ https://github.com/tessgi/tess-point
To estimate the number of true GEMS we will obtain here, we have assumed an occurrence rate of 0.1-0.2% alongside a transit probability of 5-10%; the occurrence rate is from a 0.4 % estimate for AFG stars from Zhou et al. (2019), which is then scaled by a stellar-mass-dependent ratio. This estimate also agrees with recent upper limits from Gan et al. (2023a) and Bryant et al. (2023). We note that since we already have ∼ 15 confirmed transiting GEMS (Table 1), we can already place a lower limit on the occurrence of short-period transiting GEMS at ∼ 0.03% (after accounting for transit probability). While we do not have good priors on the astrophysical false positives (brown-dwarfs, EBs) around M-dwarfs at these orbital periods, we assume a 1:1 ratio between bona fide planets and false positives here.
Based on these estimates, we expect to discover ∼ 100 transiting GEMS, the spectroscopically validated sub-sample (roughly half) of which will enable accurate (due to spectroscopic mass upper limits) and precise (due to the > 10x larger input sample size) occurrence estimates. The large sample size will also enable estimating the occurrence rate as a function of stellar mass across the M-dwarf spectral sub-type. To convert our detection efficiency into occurrence rates, we will follow previous studies by performing injection and recovery tests to quantify the search sensitivity (Dressing & Charbonneau 2015; Gan et al. 2023a; Bryant et al. 2023; Ment & Charbonneau 2023). The parameters, methodology, and results from these tests will be published along with the final occurrence rates.
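The arithmetic behind the expected yield and the occurrence lower limit can be reproduced in a few lines. This is a back-of-the-envelope sketch with hypothetical function names: it ignores detection efficiency and crowding, and the use of the 5% transit probability for the lower limit is our reading of the quoted ∼ 0.03%.

```python
def expected_transiting(n_stars, occurrence, transit_prob):
    """Expected number of transiting planets in a stellar sample."""
    return n_stars * occurrence * transit_prob


def occurrence_lower_limit(n_confirmed, n_stars, transit_prob):
    """Occurrence implied if the confirmed planets were the only ones."""
    return n_confirmed / (n_stars * transit_prob)


N_STARS = 1_000_000  # approximate size of the 200 pc M-dwarf sample

low = expected_transiting(N_STARS, occurrence=0.001, transit_prob=0.05)
high = expected_transiting(N_STARS, occurrence=0.002, transit_prob=0.10)
floor = occurrence_lower_limit(15, N_STARS, transit_prob=0.05)

print(f"expected transiting GEMS: {low:.0f} to {high:.0f}")
print(f"occurrence lower limit: {100 * floor:.3f}%")
```

The two extremes give roughly 50 to 200 transiting planets, which brackets the ∼ 100 quoted above, and 15 confirmed planets among a million stars at a 5% transit probability correspond to an occurrence of about 0.03%.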
Despite using TGLC, which performs a dilution correction based on Gaia magnitudes, to estimate the yield from our sample we assume a TICv8.2 contamination ratio (Stassun et al. 2018) cut-off¹⁰ of 1.0, beyond which the field is likely to be too crowded for follow-up.
The bulk of our spectroscopic follow-up will be performed with the Habitable-zone Planet Finder (HPF; Mahadevan et al. 2012, 2014) on the 10-m Hobby-Eberly Telescope (Ramsey et al. 1998) and NEID on the 3.5-m WIYN telescope (Halverson et al. 2016; Schwab et al. 2016), alongside MAROON-X on the 8-m Gemini-N (Seifahrt et al. 2022) and the Planet Finding Spectrograph (PFS) on the 6.5-m Magellan Clay telescope (Crane et al. 2006, 2008, 2010). Based on the existing observations and the exposure time calculator for HPF¹¹, we estimate being able to perform spectroscopic validation for faint M-dwarfs going down to J < 15, which, combined with the declination limits for the telescope of approximately -10° to 72° and the contamination ratio cut-off defined above, covers about half our sample of a million M-dwarfs. In addition to this, we expect to be able to observe an additional fraction of stars with NEID and MAROON-X, which despite being optical/red-optical instruments operate on conventional telescopes with more lenient declination limits. Similarly, PFS operates in the Southern hemisphere, but uses the iodine gas-cell technique for RV determination (Butler et al. 1996), making it less suitable for following up faint, red M-dwarfs. Given the vagaries of telescope time allocation, we conservatively assume that using a combination of HPF, NEID, MAROON-X and PFS, at least half the stars in our 200 pc sample will be amenable to spectroscopic validation. Therefore, any planet candidates discovered in this half of the sample should be available for validation and potential confirmation. The faintness limit of HPF allows us to be almost volume limited (covering > 90% of the sample) in the region of the sky accessible to HET, and to derive empirical FPPs for our sample, which we can then apply to the rest of the candidates that might be too faint for follow-up from the other facilities.
A subset of the validated planets (∼ 40 as motivated in previous sections, of which 15 have already been confirmed) will be selected for more intensive follow-up to obtain planetary mass measurements, which can then be utilized for future investigations that will be discussed in our follow-up paper. Through ongoing observations of objects classified as TOIs or CTOIs that also form part of our 200 pc sample, the community has confirmed ∼ 15 GEMS so far. In addition, we have identified over 20 astrophysical false positives, which will be discussed in a follow-up paper presenting preliminary results and trends from our survey.

Survey Status
The ongoing survey has so far led to the discovery and confirmation of nine of the transiting GEMS indicated in Table 1, with five additional planet confirmation manuscripts currently in preparation. The demographics analysis has been initiated, with the sample selection (as explained above) complete, and the vetting procedure being finalized and trained based on a test run on a 100 pc sub-sample. Results from the 100 pc sample will be published as an intermediate paper, which will also serve as a comparison with previous works such as those by Bryant et al. (2023). Candidates from these searches are simultaneously being validated and followed up from ground-based telescopes to confirm their planetary nature, and to estimate their stellar and planetary properties. After finalizing the candidate search and vetting procedure, we will also perform injection and recovery tests to quantify our detection sensitivity, and go from candidate detections to measured occurrences. Upon completion of the survey and confirmation of planet candidates, we will perform the statistical analysis outlined in previous sections to quantify the properties of the observed transiting GEMS and compare them with similar transiting giant planets around more massive stars.
¹¹ https://psuastro.github.io/HPF/Exposure-Times/

Requirements for planetary mass -stellar metallicity investigations
Extending the planetary mass - stellar metallicity plane into the stellar mass dimension for GEMS requires not just additional planet discoveries, but also precise stellar metallicities for M-dwarfs. Furthermore, it will be important to determine the metallicities across the sample in a homogeneous manner to avoid methodology-dependent metallicity offsets (Passegger et al. 2022).
As can be seen in Figure 1, the current sample of GEMS typically ranges in mass from around 0.3 to 3 M_J. This sample will be further augmented through our follow-up of transiting GEMS, and also through ongoing uninformed RV surveys with new instruments such as HPF (Mahadevan et al. 2014), NEID (Halverson et al. 2016; Gupta et al. 2021), ESPRESSO (Pepe et al. 2021), CARMENES (Sabotta et al. 2021), SPIROU (Donati et al. 2020), NIRPS (Wildi et al. 2017), IRD (Kotani et al. 2018), etc. Most importantly, Gaia astrometric detections (primarily expected in DR4 in 2025-2026¹²) are expected to add hundreds of GEMS and brown-dwarfs at intermediate orbital separations (∼ AU) with mass measurements (Casertano et al. 2008; Sozzetti et al. 2014; Perryman et al. 2014). An early example of this is GJ 463 b, for which a true mass was determined using a combination of RVs and Gaia astrometry (Endl et al. 2022; Sozzetti 2023). This would offer a homogeneous and well-characterized sample at intermediate-to-long periods that is ripe for population studies.

SUMMARY
In this manuscript we discuss the small but growing sample of transiting giant exoplanets around M-dwarf stars (GEMS), and the challenges posed to their formation by existing theories of planet formation by core-accretion and gravitational instability. We motivate the Searching for GEMS survey to find and characterize short-period transiting GEMS from TESS and ground-based facilities. We utilize multi-dimensional nonparametric statistics in the publicly available package MRExo to predict the required sample size to robustly confirm the tentative trends (between the mass of Jovian-sized planets and host stellar mass) seen in the data to be about 40 confirmed transiting planets with mass measurements. Lastly, we discuss the limitations of existing occurrence rate estimates for these GEMS, and highlight the stellar sample of ∼ 1 million M-dwarfs over a volume of 200 pc that we will use to characterize their occurrence.

ACKNOWLEDGEMENT
We thank the anonymous referee for the valuable feedback which has improved the quality of this manuscript. S.K. acknowledges and appreciates discussions with Kevin Schlaufman regarding target selection and the Gaia queries, and also Theodora for proof-reading, and Annie Clark for providing a suitable background. GS acknowledges support provided by NASA through the NASA Hubble Fellowship grant HST-HF2-51519.001-A awarded by the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., for NASA, under contract NAS5-26555. CIC acknowledges support from an appointment to the NASA Postdoctoral Program at the Goddard Space Flight Center, administered by the ORAU through a contract with NASA.
These results are based on observations obtained with the Habitable-zone Planet Finder Spectrograph on the HET. We acknowledge support from NSF grants AST-1006676, AST-1126413, AST-1310885, AST-1310875, ATI 2009889, ATI-2009982, AST-2108512, AST-2108801 and the NASA Astrobiology Institute (NNA09DA76A) in the pursuit of precision radial velocities in the NIR. The HPF team also acknowledges support from the Heising-Simons Foundation via grant 2017-0494. The Low Resolution Spectrograph 2 (LRS2) was developed and funded by the University of Texas at Austin McDonald Observatory and Department of Astronomy and by Pennsylvania State University. We thank the Leibniz-Institut für Astrophysik Potsdam (AIP) and the Institut für Astrophysik Göttingen (IAG) for their contributions to the construction of the integral field units. The Hobby-Eberly Telescope is a joint project of the University of Texas at Austin, the Pennsylvania State University, Ludwig-Maximilians-Universität München, and Georg-August Universität Göttingen. The HET is named in honor of its principal benefactors, William P. Hobby and Robert E. Eberly. The HET collaboration acknowledges the support and resources from the Texas Advanced Computing Center. We thank the Resident Astronomers and Telescope Operators at the HET for the skillful execution of our observations with HPF. We would like to acknowledge that the HET is built on Indigenous land. Moreover, we would like to acknowledge and pay our respects to the Carrizo & Comecrudo, Coahuiltecan, Caddo, Tonkawa, Comanche, Lipan Apache, Alabama-Coushatta, Kickapoo, Tigua Pueblo, and all the American Indian and Indigenous Peoples and communities who have been or have become a part of these lands and territories in Texas, here on Turtle Island.
WIYN is a joint facility of the University of Wisconsin-Madison, Indiana University, NSF's NOIRLab, the Pennsylvania State University, Purdue University, University of California-Irvine, and the University of Missouri. The authors are honored to be permitted to conduct astronomical research on Iolkam Du'ag (Kitt Peak), a mountain with particular significance to the Tohono O'odham.
Data presented herein were obtained at the WIYN Observatory from telescope time allocated to NN-EXPLORE through the scientific partnership of the National Aeronautics and Space Administration, the National Science Foundation, and NOIRLab. This work was supported by a NASA WIYN PI Data Award, administered by the NASA Exoplanet Science Institute. These results are based on observations obtained with NEID on the WIYN 3.5 m telescope at KPNO, NSF's NOIRLab under proposal 2022B-785506 (PI: S. Kanodia), managed by the Association of Universities for Research in Astronomy (AURA) under a cooperative agreement with the NSF. This work was performed for the Jet Propulsion Laboratory, California Institute of Technology, sponsored by the United States Government under the Prime Contract 80NM0018D0004 between Caltech and NASA.

Figure 1. Planetary mass plotted as a function of stellar mass for GEMS. The transiting planets have true mass measurements, whereas M_p sin i is plotted for the non-transiting ones, with the exception of GJ 463 b, which has a true mass measurement of ∼ 1140 M⊕ (or ∼ 3.6 M_J) from astrometry (Sozzetti 2023).

Figure 3. The 2D joint probability distribution, f(m, r)_FGK, for ∼ 350 giant planets orbiting FGK stars is shown in the background, where the black points are the real measurements of FGK planets, the red dots are an example simulation of 50 GEMS obtained via rejection sampling of the underlying FGK PDF, and the underlying colour is their probability. Each of these simulated GEMS samples is compared with the FGK sample to estimate the minimum sample size required for the survey, as described in Section 4.1.1.

4. Metrics to compare the two 1-D PDFs: We then perform rejection sampling in 1-D of the normalized PDFs, f(m|r = 1 R_J), to obtain a histogram of 1000 simulated planetary masses to compare across the M-dwarf and FGK samples. We adopt the Earth-mover distance (EMD; Rubner et al. 1998; Ramdas et al. 2015), which is a metric to quantify the difference between two distributions without any assumptions of normality or similar variance. In other words, the EMD can be used to compare two distributions, and reduces to zero when they are completely identical. The EMD is identical to the Wasserstein distance (Kantorovich 1960; Ramdas et al. 2015) in the case of PDFs and is implemented in the scipy python package (Oliphant 2007; Virtanen et al. 2020). We do not adopt the Kullback-Leibler (KL) divergence since that assumes comparison of a sample with a known distribution (Kullback & Leibler 1951), and is asymmetric. For the purposes of comparing just the mean of the two PDFs (while assuming normality), we adopt Welch's t-test (Welch 1947; Ruxton 2006) implemented in scipy through the ttest_ind function that allows for unequal variances. This is in lieu of the two-sample Kolmogorov-Smirnov (K-S; Kolmogorov 1933; Smirnov 1948) and Anderson-Darling (AD; Anderson & Darling 1952; Scholz & Stephens 1987) tests.
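For illustration, both metrics can be written out in pure Python for the case of two equal-size samples. This is a sketch rather than the pipeline itself, which uses scipy (scipy.stats.wasserstein_distance for the EMD, and scipy.stats.ttest_ind with equal_var=False for Welch's t-test). The function names below are hypothetical.

```python
import math
import statistics


def emd_equal_size(a, b):
    """1-D Earth-mover (Wasserstein-1) distance for equal-size samples.

    With equal weights and equal sample sizes, the optimal transport plan
    pairs the sorted values, so the EMD is the mean absolute difference
    between order statistics.
    """
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-size samples")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    var_a = statistics.variance(a) / len(a)  # sample variance over n
    var_b = statistics.variance(b) / len(b)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(var_a + var_b)


sample_a = [0.0, 1.0, 2.0]
sample_b = [1.0, 2.0, 3.0]
shift = emd_equal_size(sample_a, sample_b)  # a pure shift of 1 gives EMD = 1
t_stat = welch_t(sample_a, sample_b)
```

The EMD is symmetric and equals zero only when the two samples coincide, which is the property the text relies on. The t statistic, by contrast, responds only to a difference in means.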

Figure 5. The scatter plot shows the expectation value, f(m|r, stm), for the Jovian-sized objects orbiting different stellar masses, whereas the orange line shows the linear fit, which we then extrapolate to lower stellar masses for M-dwarfs.

Figure 7. The ∼ 1.0 million stars observed by TESS through Cycle 5 and contained in our cuts in Section 4.3. Previous studies have used an upper limit of 100 pc, which represents a small fraction (< 20%) of our sample. Our sample spans the full range of color and magnitude for M0-M9. For reference, we indicate GJ 488 (M0V; square) and TRAPPIST-1 (M8V; triangle) in the CMD to approximately denote the two ends of the M-dwarf spectral type. The stellar radii, masses and T_eff are calculated from M_Ks using relations from Mann et al. (2015, 2019), whereas the TICv8.2 contamination ratio is from Stassun et al. (2018).
