Machine Learning Implementation on TTV Analysis using TTVFast

Exoplanet research has developed rapidly in recent years thanks to the growing number of space telescopes and survey missions. The two missions with the most exoplanet discoveries by the transit method are Kepler and TESS. Both carried out photometric surveys that detect planets by observing the periodic dimming of a star's brightness as a planet passes in front of it. Transit observations also enable another method, transit timing variation (TTV): by measuring deviations in the transit times of a transiting planet, we can infer the existence of another body in the system. Estimating exoplanet parameters from TTV data is an expensive process, because the data must be fitted over a wide range of parameters using N-body simulations, and thousands of such simulations are needed. N-body simulation at this scale is computationally costly. Following the growing use of machine learning in research, we apply machine learning to make TTV analysis more effective and efficient. We train machine learning models on N-body simulations generated with REBOUND and TTVFast and compare the results. REBOUND is an N-body code with a Python interface designed for simulating the dynamics of gravitationally interacting bodies such as planets and stars, while TTVFast is an N-body code designed specifically to calculate TTVs in transiting planetary systems. We find that TTVFast is much faster than REBOUND at generating training and testing samples while maintaining similar accuracy, and that the machine learning models trained on the two datasets perform similarly.


Introduction
The study of exoplanets has received a massive boost from the dedicated exoplanet missions launched in the last few years. More than 5000 exoplanets have been discovered to date, almost 75% of them by the transit method. Most transiting planets were detected by surveys, with Kepler [1] and the Transiting Exoplanet Survey Satellite (TESS) [2] being the two missions with the largest contributions. Apart from detecting numerous exoplanets, the transit method offers the possibility of searching for additional planets within the same system using Transit Timing Variations (TTV) [3].
The main challenge in using the TTV method to detect exoplanets is that TTV signals can be too small to detect easily. Detecting these subtle variations in transit timing requires highly precise, long-term observations, often spanning many orbital periods. This is why TESS data are well suited to the method: TESS provides continuous, high-precision photometry for a very large number of stars. Based on the TESS mission design, TTVs should be measurable for about 90 exoplanets [4]. If we combine TESS transit light curves with those from other surveys, the number of exoplanets with a measurable TTV signal should increase. Ground-based and space-based transit surveys continue to provide large amounts of transit data, which enrich the transit catalogs and make TTV analysis more feasible and more accurate.

Transit Timing Variation
Currently only 26 exoplanets have been detected using the TTV method, the most recent being AU Mic d. AU Mic d is a planet with more than 10 times the mass of Earth orbiting a young red dwarf star with half the mass of the Sun. The transit data used in the detection combine ground-based and space-based observations [5]. Transit Timing Variations (TTV) is a method for detecting exoplanets by studying the precise timing of a planet's transits in front of its host star. A single planet orbiting its star exhibits regular and predictable transits separated by a constant interval. However, when multiple planets reside in the same system, the interval between transits varies. In such cases, the gravitational influence of the other planets, which may not themselves transit, perturbs the orbit of the transiting planet and causes its transit times to deviate from the expected regular pattern. Astronomers use high-precision observations of these transit timing variations to deduce the presence of additional planets and to estimate their masses, their orbits, and the dynamics of multi-planet systems.
It is very difficult to calculate TTV accurately using an analytical approach. Analytical approaches, e.g. perturbation theory, can be used in some cases, but their application is limited to systems in which the approximation still gives acceptable results. There are multiple ways to model or calculate the TTV of a planetary system whose orbital parameters are already known. The most reliable is N-body simulation, but N-body simulation is computationally costly, which makes the analysis inefficient. Several N-body codes can be used to calculate the TTV of planetary systems, one of which is REBOUND [6]. REBOUND is an open-source code designed for simulating a broad range of gravitational systems, including the dynamics of planetary systems. It models the gravitational interactions between the bodies in these systems, so it can be used to simulate and analyze the mutual perturbations between planets and how these interactions shape the evolution of planetary orbits over extended periods.
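The deviation from a regular transit pattern is commonly expressed as observed minus calculated (O-C) transit times, where the calculated times come from a best-fit linear ephemeris. The following is a minimal sketch of that calculation in plain Python. The function names and the toy transit times (a 3.2888-day period with a 5-minute sinusoidal perturbation) are illustrative assumptions, not data or code from this work.

```python
import math


def linear_ephemeris_fit(epochs, times):
    """Least-squares fit of times ~ t0 + period * epoch. Returns (t0, period)."""
    n = len(epochs)
    mean_e = sum(epochs) / n
    mean_t = sum(times) / n
    cov = sum((e - mean_e) * (t - mean_t) for e, t in zip(epochs, times))
    var = sum((e - mean_e) ** 2 for e in epochs)
    period = cov / var
    t0 = mean_t - period * mean_e
    return t0, period


def o_minus_c(epochs, times):
    """Observed minus calculated transit times, in the same unit as `times`."""
    t0, period = linear_ephemeris_fit(epochs, times)
    return [t - (t0 + period * e) for e, t in zip(epochs, times)]


# Toy transit times in days (hypothetical values, for illustration only):
# a strictly periodic signal plus a sinusoidal timing perturbation.
epochs = list(range(40))
amplitude_days = 5.0 / 1440.0  # 5 minutes expressed in days
times = [
    2.0 + 3.2888 * e + amplitude_days * math.sin(2.0 * math.pi * e / 13.0)
    for e in epochs
]

# Convert the residuals from days to minutes.
ttv_minutes = [1440.0 * r for r in o_minus_c(epochs, times)]
```

Because the linear fit absorbs any constant offset and linear trend, the O-C values sum to zero by construction, and the fitted period can differ slightly from the true mean period when the baseline does not cover a whole number of TTV cycles.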

TTVFast and Machine Learning
TTVFast calculates Transit Timing Variations (TTV) by numerically modeling the gravitational interactions between multiple planets within an exoplanetary system [7]. It uses a symplectic integrator, a numerical method well suited to simulating the gravitational interactions between multiple bodies such as the planets in an exoplanetary system. Although TTVFast is itself an N-body code, it is designed to be faster and more efficient than general-purpose N-body simulation for the specific purpose of calculating TTV. It focuses on the interactions between the planets in a system and models their effects on each other's orbits, which give rise to the TTV signals.
Deep learning models, particularly neural networks with multiple hidden layers, excel at recognizing complex patterns within large datasets. TTV signals contain complex patterns that are very sensitive to small changes in the orbital parameters of the system. The deep learning model is trained to predict the parameters of the additional planet (M_2, P_2, e_2, ω_2) given the known parameters (M_*, M_1, P_1, e_1, ω_1) and the TTV (O-C). To train the deep learning models, we need to generate a large dataset of exoplanet systems with a wide range of orbital configurations. We generate training and testing datasets using TTVFast and, for comparison, using REBOUND [6].
To train the neural network effectively, we generate random simulations of short-period planets orbiting solar-type stars. The stellar masses range from 0.85 to 1.15 times that of the Sun, in line with the focus on solar-type stars in this study. Inner planets are drawn from uniform distributions, with masses from 0.66 to 150 times that of Earth and periods from 0.75 to 10 days. The outer planet is generated differently: a conditional probability distribution is used so that unstable configurations are not generated at random. The mass of the outer planet is randomly chosen between 0.25 and 3 times that of the inner planet. The period of the outer planet is set by a period ratio between 1.25 and 2.5 and is adjusted if the planet falls within the Hill sphere of the first planet. In addition, simulations that produce Transit Timing Variations (TTVs) exceeding 10 minutes are excluded, which prevents unstable orbital scenarios from entering the dataset. This value is adopted from previous research on detecting exoplanets using TTV with deep learning [8].
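To make the input and output of the model concrete, the sketch below assembles one training example from a simulated system: the known parameters and the O-C series form the feature vector, and the parameters of the second planet form the target vector. This is only an illustration under stated assumptions. The fixed series length `n_transits`, the zero padding of short series, the dictionary keys, and the example values are all hypothetical and are not taken from this work.

```python
def build_example(system, ttv_minutes, n_transits=50):
    """Return (features, targets) for one simulated two-planet system.

    features: [M_*, M_1, P_1, e_1, omega_1] followed by the O-C series
    targets:  [M_2, P_2, e_2, omega_2]
    """
    # Assumption: the network expects a fixed-length TTV series, so longer
    # series are truncated and shorter ones are padded with zeros.
    ttv = list(ttv_minutes)[:n_transits]
    ttv += [0.0] * (n_transits - len(ttv))

    features = [
        system["m_star"],
        system["m1"],
        system["p1"],
        system["e1"],
        system["omega1"],
    ] + ttv
    targets = [system["m2"], system["p2"], system["e2"], system["omega2"]]
    return features, targets


# Hypothetical system used only to show the shapes involved.
example_system = {
    "m_star": 1.0, "m1": 50.0, "p1": 1.5, "e1": 0.0, "omega1": 0.0,
    "m2": 60.0, "p2": 3.0, "e2": 0.02, "omega2": 1.0,
}
features, targets = build_example(example_system, [0.5, -0.3, 0.1], n_transits=5)
```

With `n_transits=5`, the feature vector has ten entries (five known parameters plus five O-C values) and the target vector has four.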
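The sampling scheme described above can be sketched with the Python standard library as follows. The ranges are those given in the text. Everything else is an assumption made for illustration: the source does not specify how the outer period is adjusted when the planet falls inside the Hill sphere, so this sketch simply redraws the outer planet; the distributions for the mass ratio and period ratio are taken to be uniform; eccentricities and arguments of periastron are omitted because their distributions are not stated. The 10-minute TTV cut is also left out, since it requires running the N-body code.

```python
import math
import random

M_EARTH_IN_MSUN = 3.003e-6  # Earth mass in solar masses


def semi_major_axis_au(period_days, m_star_msun):
    """Kepler's third law with a in AU, P in years, M in solar masses."""
    return (m_star_msun * (period_days / 365.25) ** 2) ** (1.0 / 3.0)


def hill_radius_au(a_au, m_planet_mearth, m_star_msun):
    """Hill radius of a planet on a near-circular orbit."""
    mass_ratio = m_planet_mearth * M_EARTH_IN_MSUN / (3.0 * m_star_msun)
    return a_au * mass_ratio ** (1.0 / 3.0)


def sample_system(rng, max_tries=100):
    """Draw one star plus two-planet configuration as a dict."""
    m_star = rng.uniform(0.85, 1.15)  # solar masses
    m1 = rng.uniform(0.66, 150.0)     # Earth masses
    p1 = rng.uniform(0.75, 10.0)      # days
    a1 = semi_major_axis_au(p1, m_star)
    r_hill = hill_radius_au(a1, m1, m_star)

    for _ in range(max_tries):
        m2 = m1 * rng.uniform(0.25, 3.0)  # mass relative to the inner planet
        p2 = p1 * rng.uniform(1.25, 2.5)  # period ratio P_2 / P_1
        a2 = semi_major_axis_au(p2, m_star)
        # Assumption: redraw when the outer orbit lies within the Hill sphere
        # of the inner planet. For the ranges above the separation exceeds the
        # Hill radius, so this branch is rarely, if ever, taken.
        if a2 - a1 > r_hill:
            return {"m_star": m_star, "m1": m1, "p1": p1, "m2": m2, "p2": p2}
    raise RuntimeError("could not draw an outer planet outside the Hill sphere")


rng = random.Random(42)  # fixed seed so the draw is reproducible
systems = [sample_system(rng) for _ in range(1000)]
```

Each returned dictionary would then be passed to the N-body code to produce the transit times, after which systems with TTVs above 10 minutes would be discarded.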

Training and Testing Dataset
The datasets used to train the machine learning models are generated with REBOUND and TTVFast. Each dataset contains about 10000 simulations. The main difference between REBOUND and TTVFast is simulation speed: generating 10000 samples takes almost 24 hours with REBOUND, while TTVFast takes only a few minutes to generate the same number of simulations. The corner plots of the training datasets are shown in Figure 1 and Figure 2. The two datasets are almost identical, so TTVFast delivers an equivalent dataset at a fraction of the computational cost. With this in mind, we could use TTVFast to build larger datasets covering a much wider range of orbital parameters and explore more kinds of planetary configurations.

Machine Learning Model
The machine learning model is then trained and tested on each dataset, resulting in two models that predict the planet's orbital parameters. The hyperparameters of the two models are identical; the only difference is the training and testing dataset. Figures 3 and 4 show the accuracy of the models after training and testing for 100 epochs. The training process yields similar accuracy for both models, indicating that the datasets from TTVFast and REBOUND do not differ significantly. The models were trained for only 100 epochs; training for more epochs might change their accuracy.

Parameter Prediction
The machine learning models are then applied to test inputs to check their performance. We generate artificial TTV signals from two simulations to see whether the models can recover the orbital parameters of the perturbing planet from the input TTV. The inputs are the mass of the star, the mass and period of planet 1, and the TTV values. The models should predict the parameters of the second planet: period, mass, eccentricity, and argument of periastron (ω). System A consists of a star of mass 1.12 M⊙ and a planet with a period of 3.2888 days and a mass of 80 M⊕. System B consists of a star of mass 1 M⊙ and a planet with a period of 1.5 days and a mass of 50 M⊕. The comparison between actual and predicted parameters is shown in Table 1 for system A and in Table 2 for system B. Both models perform similarly for both systems, but in general the model trained on the TTVFast dataset is better at predicting the mass, while the model trained on the REBOUND dataset is better at predicting the period. For eccentricity and argument of periastron there is almost no difference between the two models, and for system B neither model can predict the argument of periastron at all: both return the same value, very close to zero. We also compare the actual TTV with the TTV computed from the predicted parameters. Figures 5 and 6 show how the TTV predicted by each model compares with the actual TTV. The model trained on the TTVFast dataset is closer to the actual TTV than the model trained on the REBOUND dataset. In general, however, the two models perform very similarly in terms of how close their predictions are to the actual parameter values.
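The text compares the models qualitatively. One possible way to quantify how close a predicted TTV curve or a predicted parameter is to the actual one is sketched below, using the root-mean-square difference between two TTV series and the relative error of a single parameter. These metrics and function names are assumptions chosen for illustration, not the measures used in this work, and the two curves are toy sinusoids rather than results from the models.

```python
import math


def rms_difference(ttv_a, ttv_b):
    """Root-mean-square difference between two equal-length TTV series."""
    if len(ttv_a) != len(ttv_b):
        raise ValueError("TTV series must have the same length")
    squared = sum((a - b) ** 2 for a, b in zip(ttv_a, ttv_b))
    return math.sqrt(squared / len(ttv_a))


def relative_error(predicted, actual):
    """Absolute relative error of a predicted parameter value."""
    if actual == 0:
        # Undefined for a true value of zero, e.g. omega = 0.
        raise ValueError("relative error is undefined when the actual value is 0")
    return abs(predicted - actual) / abs(actual)


# Toy curves in minutes (hypothetical, not output of either model).
actual_ttv = [5.0 * math.sin(2.0 * math.pi * k / 13.0) for k in range(40)]
predicted_ttv = [4.5 * math.sin(2.0 * math.pi * k / 13.0 + 0.1) for k in range(40)]

curve_mismatch = rms_difference(predicted_ttv, actual_ttv)  # minutes
```

A smaller RMS difference means the predicted parameters reproduce the observed signal more closely, which gives a single number per model that can be set alongside the per-parameter errors.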

Conclusion
We built machine learning models that predict the orbital parameters of an additional planet in a planetary system using TTV values as input. The training datasets were generated using the N-body simulation codes REBOUND and TTVFast. Both models show similar accuracy and performance, but TTVFast is much faster and more efficient at generating the training dataset. The benefit of the shorter computation time is that we can build a much bigger dataset covering a much broader range of orbital parameters. This matters because the models in this research can only predict parameters from relatively small TTVs, with amplitudes of 0-10 minutes, in short-period planetary systems. With a bigger training dataset covering more general parameters, the model should be able to predict parameters from a much wider range of TTV signals.

Figure 1. Corner plot of training dataset using REBOUND.

Figure 5. Predicted value from model compared to actual TTV using model with REBOUND dataset.

Figure 6. Predicted value from model compared to actual TTV using model with TTVFast dataset.

Table 1. Predicted orbital parameters for planet 2 in system A.

Table 2. Predicted orbital parameters for planet 2 in system B.