Inferring nonlinear fractional diffusion processes from single trajectories

We present a method to infer the arbitrary space-dependent drift and diffusion of a nonlinear stochastic model driven by multiplicative fractional Gaussian noise from a single trajectory. Our method, fractional Onsager-Machlup optimisation (fOMo), introduces a maximum likelihood estimator by minimising a field-theoretic action which we construct from the observed time series. We successfully test fOMo for a wide range of Hurst exponents using artificial data with strong nonlinearities, and apply it to a data set of daily mean temperatures. We further highlight the significant systematic estimation errors when ignoring non-Markovianity, underlining the need for nonlinear fractional inference methods when studying real-world long-range (anti-)correlated systems.

The dynamic behaviour of complex systems comprising many degrees of freedom poses significant challenges to analytic description and often eludes physical intuition.Typically, measurements only access few slowly evolving degrees of freedom which fluctuate stochastically, indicating the presence of hidden interactions within the system.If the system is organised in hierarchical selfsimilar subsystems, their fractal nature may be mirrored in scale-free correlated, i.e., fractional, fluctuations leading to non-negligible departure from Markovianity.Important examples of systems displaying fractional correlations include the climate [1][2][3][4][5], DNA [6,7], cell motility [8,9], and the brain [10,11].Recent efforts in studying such processes have focused on machine learning-based prediction; despite their power [12], such approaches often lack interpretability [13] or robustness in real-world scenarios [14,15].In order to build an analytic and intuitive understanding of such fractional processes, statistical inference, in which parameters of a conjectured stochastic model are inferred from experimental data, may help in providing clearer explainability and interpretation.Recent inference methods focused on either linear fractional [16][17][18][19], or nonlinear non-fractional processes [20][21][22][23][24][25][26][27][28][29][30].In scenarios, however, where nonlinear dynamics and fractional fluctuations appear simultaneously, their complicated interplay leads to many emergent features not present in either purely nonlinear or fractional models.Hence, if these phenomenologically richer scenarios are to be studied, more generalised statistical inference methods are required.
In this Letter, we propose an estimation method for a fully nonlinear, fractional model of time series, recovering space dependent drift and diffusion parameters from a single trajectory.Thus, it is suitable for scenarios in which only single realisations are available [31,32].In developing the estimation method, we draw on fieldtheoretic tools, in particular the Onsager-Machlup for-FIG.1.The stochastic difference equation given in Eq. ( 1) describes the stochastic evolution of a process xn subject to a nonlinear drift f (x) = −V (x) (V (x) shown as blue shade) and a non-homogeneous diffusion g(x) (rainbow coloured ground).The process is driven by multiplicative fractional driving noise g(x)∆ξ H .We propose an algorithm which infers both f (x), g(x) from a single finite trajectory (green dots).
malism [33][34][35][36] which we generalise to fractional processes.We benchmark the method using both synthetic and real world data, where it reconstructs nonlinear parameters and statistics with excellent accuracy.We further show that ignoring fractional correlations leads to systematically wrong parameter inferences.
Fractional processes-We discuss a stochastic model for nonlinear time series described by an Itô-type stochastic difference equation driven by fractional noise where f (x), g(x) denote space-dependent drift and diffusion terms of dimension The process is illustrated in Fig. 1.Eq. ( 1) is the first-order discretisation (Euler-Maruyama scheme) of a stochastic differential equation driven by fractional noise [37].While ∆t denotes the time step of the discretisation, ∆ξ H n is a discretised fractional Gaussian noise (fGN) [38,39], defined by its correlation matrix Choosing H = 1 2 , the noise ∆ξ H n is discrete standard Gaussian white noise.For H > 1 2 (H < 1 2 ), the increments ∆ξ H n are positively (negatively) correlated.
Inference method-We propose an algorithm that given a single measured trajectory x = {x 0 , x 1 , . . ., x N } sampled at ∆t infers the functional parameters f (x), g(x) as introduced in Eq. (1).The asymptotic scaling of the correlation function of Eq. ( 1) is solely determined by H and ∆t and does not depend on f (x), g(x).We assume that the Hurst parameter has already been estimated previously using established methods (e.g., [40][41][42][43]).
The central idea is to find the functions f and g which maximise the log-likelihood ln P [ x|f, g] of the observed trajectory x.The likelihood is obtained by measuring the likelihood of the noise realisation ∆ξ H = {∆ξ H 1 , ..., ∆ξ H N }, which together with f and g produces the measurement x.This realisation is found by inversion of Eq. ( 1), The probability to observe a measurement x then follows from P ξ [ ∆ξ H ], the probability distribution of a noise realisation, by a standard change of variables in R N , , where the nonlinear transform (3) induces a Jacobian determinant accounting for the measure change from n d∆ξ H n to n dx n [44].Since ∆ξ H is discrete fractional Gaussian noise, its probability distribution is a N -dimensional Gaussian distribution.We write the distribution field-theoretically as where A is the and C −1 is the matrix inverse of the correlation matrix given in Eq. ( 2) [45].Here, the non-diagonal correlation matrix accounts exactly for the full memory of the process.
The determinant ∂ ∆ξ H /∂ x itself is readily evaluated as ∂∆ξ H n /∂x m is a lower triangular matrix with all entries vanishing for m > n (see Eq. ( 3)).Hence, the determinant is given by the product over the diagonal entries, where we drop added normalisation constants independent of f or g.
In the field-theoretic literature, expressions of the form of Eq. ( 4) are well known for diagonal correlation matrices C, corresponding to uncorrelated noise with H = 1  2 , where they are referred to as Onsager-Machlup actions [33][34][35][36].These have been extensively studied to characterise the "most likely path" of a specifically parametrised stochastic process [34,[46][47][48][49][50].In this Letter, we take the opposite direction and use Onsager-Machlup theory to determine which parameters render the observed path the likeliest.In doing so, we draw a connection from established (Markovian) maximum likelihood estimators [22,25,29] to the Onsager-Machlup formalism.We generalise these results to fractionally driven processes and hence refer to the estimation method based on maximising (4) as fractional Onsager-Machlup optimisation (fOMo).
The problem of finding the optimum of Eq. ( 4) is generally solved numerically.This requires a finite functional representation of f (x), g(x), e.g., polynomials of finite de-gree.Nonetheless, in certain cases the optimum of S can be found analytically.
Exact results-The estimation method corresponds to finding the configuration f (x), g(x) that minimises that is parametrised by the inherently stochastic measurement x and therefore random ("disordered"), fOMo amounts to finding the saddle point of S, where simultaneously δS/δf ≡ δS/δg ≡ 0.
If g is fixed, but not necessarily constant, the saddle point in f can be found analytically.
We introduce the empirical propagator ], as well as the empirical velocity v n = (x n − x n−1 )/∆t.Inserting these into Eq.( 4), the action then reads omitting f -independent terms.This bilinear action corresponds to a Gaussian field theory [45] x for some L < N , the optimal coefficients f can be found by inserting the polynomial ansatz into Eq.( 5) and setting ∂S/∂ f = 0.The resulting action in the f resembles the Hamiltonian of L fully coupled harmonic oscillators in a confining harmonic potential which allows for a single optimal configuration corresponding to the estimate f ; a calculation provided in the SI shows that this minimum is given by f = (H −1 j) , where we introduced This estimate is exact for arbitrary inhomogeneous fixed diffusion g(x); it further simplifies when f is linear and g constant (discretised fractional Ornstein-Uhlenbeck process, see SI).
Analogously, we consider the case of f (x) general but fixed.Introducing the empirical noise correlation The action resembles a fully interacting N -body Hamiltonian on the positive half line; the "particles" at position ψ n are quadratically coupled to one another, confined harmonically for ψ n → ∞, since D nn > 0, yet repelled from the origin by a logarithmically diverging potential giving rise to a stable minimum satisfying a self-consistency relation (∆t) 2 ψn m D m,n ψm = 1 for all n.If g ≡ g 0 is assumed to be constant (additive noise), this immediately returns ĝ2 0 = (∆t) 2 m,n D m,n .If further H = 1 2 , D m,n is diagonal, and Eq. ( 6) resembles the Hamiltonian of N non-interacting particles in a logarithmic-harmonic potential ( [51]).The estimate for g 0 then recovers the well-known Markovian result ĝ2 0 = n (f (x n−1 )∆t − (x n − x n−1 )) 2 [22].Superposition of noise processes-The method is readily adapted to processes subject to different additional Gaussian noise sources which could model, for instance, the additional coupling of the process to a heat bath.We consider a generalisation of x n , the fractional process introduced in Eq. ( 1), by setting where we omit terms independent of f, g, and, in contrast to (4), replace the empirical propagator by Fast Inversion-In order to numerically evaluate the log-likelihood (4) the correlation matrix C has to be inverted.This is an operation of computational complexity O(N 3 ) and memory requirements of order O(N 2 ), rendering it prohibitively costly for times series of only intermediate length (N ∼ 10 4 ).We circumvent this problem by exploiting the Toeplitz structure of the correlation matrix; since the process is stationary, C mn depends on |m − n| only and can be inverted efficiently [53].In doing so, we reduce computational complexity to O(N 2 ) and memory requirements to O(N ) (see SI for further details), rendering fOMo computationally tractable even for long time series (N ∼ 10 6 ).This fast inversion method may be generalised to other stationary driving processes such as, for instance, tempered fractional noise [54][55][56].
Synthetic Data-We have succesfully tested fOMo with various nonlinear models and illustrate a representative example in the following.We consider a model system defined by Eq. (1) choosing f (x) = −0.25x 3 + 0.5x, corresponding to a double well potential, and inhomogeneous diffusion g(x) = 0.2x 2 + 0.5, implying large stochastic fluctuations far away from the origin.We assume the Hurst exponent to be already determined and infer f and g employing fOMo (Eq.( 4)).In order to highlight the significance of the noise correlations, we compare this estimate to a fOMo estimate where we wrongly fix H = 1 2 , whence C −1 is diagonal, effectively producing a Markovian maximum likelihood estimate (see [22]).We use polynomial basis functions for drift (L = 3) and diffusion (L = 2).For an ensemble of 100 trajectories (see Fig. 2 for details), we simultaneously infer drift and diffusion terms and measure the root-mean-square error (RMSE) of the inferred terms f , ĝ. Inference errors are weighted with the empirical invariant measure of the pro- , and analogous for g.
Remarkably, the Markovian model underestimates (overestimates) both drift and diffusion terms for H > 1 Root-mean-square-errors are computed using the empirical invariant density (see text).
(H < 1  2 ) while fOMo correctly recovers the true input functions (Fig. 2) with excellent accuracy.The (anti-)correlations of the noise process lead to a wider (narrower) width of the invariant density of the process, also explaining the visible deviations of the fOMo samples from the ensemble mean for H = 0.35 (Fig. 2 (top left panel)).Markovian modelling fails to recover the double well shape of the corresponding potential in our example for H = 0.35.The inference error of Markovian modelling steeply increases for small deviations from H = 1 2 , highlighting the necessity of taking fractional correlations into account.
We investigate the finite-size error scaling as a function of the trajectory length N and find a drift error scaling RMSE ∼ 1/ √ N for H = 1 2 , and RMSE ∼ N α with α ∼ H 2 for H = 1 2 , while the diffusion error scaling does not show a clear trend (see SI).
Temperature Data-We apply fOMo to daily mean temperature data recorded at Potsdam Telegrafenberg weather station, Germany.This time series consists of 130 years of uninterrupted temperature measurements taken from the ECAD project [57,58].Removing the seasonal cycle using a Fourier series, we obtain the temperature anomalies T , an approximately stationary time series.Temperature anomalies are long-range correlated [3,59], monofractal [60] and have been described by over-damped models driven by fractional Gaussian noise [61][62][63].We construct the correlation matrix using the estimated Hurst exponent H ≈ 0.65 [61], and a sampling time ∆t = 1 d.In this model, f may be interpreted as an atmospheric response function bearing the unit [f ] = K/day and its stochastic fluctuations g, measured in [g] = K/day H [62].We use cubic and quadratic polynomial ansatzes for drift and diffusion, respectively and employ fOMo and a Markovian maximum likelihood estimate (H = 1 2 , see above).In order to test the consistency of the inferred fOMo model, we generate a trajectory ensemble using the obtained parameters and apply fOMo to this ensemble.Temperature data and synthetic data show very good agreement (Fig. 3).The Markovian model significantly underestimates the deterministic force term compared to the force term inferred by fOMo which takes the long-range correlations into account.Unlike for the double well potential, the Markovian diffusion estimate agrees well with the fOMo estimate.This is due to the approximate linearity of f .Conclusion-By way of a fractional generalisation of the Onsager-Machlup formalism, we create an estimation method that is able to infer functional parameters of a fully nonlinear model driven by multiplicative fractional noise from a single trajectory.Applying the algorithm to both synthetic and real data, we recover excellent esti- mates even for Hurst values deep in the non-Markovian regime, where ignoring the (anti-)correlations of the fluctuations leads to gross systematic errors.The method provides a needed tool for modelling real-world complex systems whose fractal nature cannot be neglected, and illuminates intriguing connections bridging time series analysis with the statistical physics of fractionally correlated many-body systems.
Appendix C: A Note on the Numerics In this section, we elaborate on the computation of the stochastic action Eq. ( 4).For stationary processes, the autocovariance function solely depends on the difference between two time instances.Hence, the covariance matrix C ij is a (N − 1) × (N − 1) symmetric Toeplitz matrix, where N is the length of the time series, i.e.

2 m
where η n is an additional independent Gaussian noise term of correlation K m,n = η m η n .Repeating the previous steps in constructing the fractional Onsager-Machlup action, one inverts the new stochastic equation forx n to find ∆ξ H ( x) n = (x n −x n−1 −f (x n−1 )∆t−η n )/g(x n ) whichone inserts into the free action A[ ∆ξ H ] (see Eqs. (3), (4)).This transformation leaves the Jacobian, n |(g(x n ))| −1 , unchanged.Instead the path likelihood of x conditioned on a particular η, Eq. (4), acquires some extra terms of the form n η n a n − 1 ,n η m G m,n η n , where we abbreviate a n = m G m,n (f (x m−1 ) − v m ) ∆t.It remains to average over the additional noise distribution P η [ η] = exp −1/2 m,n η m (K −1 ) mn η n / (2π) N det K. Carrying out the Gaussian integral in n dη n , one finds that the modified action S = − ln P

FIG. 2 .
FIG. 2. Inference via fOMo correctly recovers drift and diffusion in contrast to Markovian estimation.Ensemble study with 100 trajectories with N = 10 6 , ∆t = 10 −2 , random initial conditions and equilibration time Teq = 10 4 [52].(top left and top center panels) Drift and (bottom left and bottom center panels) diffusion estimation via fOMo (red) and Markov leastsquares fit (brown) for anti-correlated (H = 0.35) and long-range correlated (H = 0.65) dynamics.Inferred drift and diffusion of all ensemble members are plotted in light grey, only visible in (top left panel).(top right panel) Reconstruction error of drift f (x) for fOMo (red) and Markovian estimate (brown).(bottom right panel) Reconstruction error of diffusion g(x).Root-mean-square-errors are computed using the empirical invariant density (see text).

FIG. 3 .
FIG. 3. Drift and diffusion term estimates (upper and center panel), and histogram (lower panel) for Potsdam daily mean temperature anomalies.(Upper and center panel) Blue lines represent the fOMo estimate, red lines the ensemble mean of fOMo inference from one hundred synthetic trajectories generated with the inferred model, and dark olive lines the Markovian model estimate.Red and blue lines show excellent agreement proving fOMo's consistency.(Lower panel) The exponential decay of the histogram's tails of the marginal density P (T ) is well captured for large negative anomalies while there is a slight offset for large positive anomalies where there are fewer data points.