Recovering `lost' information in the presence of noise: application to rodent–predator dynamics
V N Smelyanskiy1,5, D G Luchinsky2,3, M M Millonas4 and P V E McClintock3
1 NASA Ames Research Center, MS 269-2, Moffett Field, CA 94035, USA
2 Mission Critical Technologies Inc., 2041 Rosecrans Ave. Suite 225 El Segundo, CA 90245, USA
3 Department of Physics, Lancaster University, Lancaster LA1 4YB, UK
4 CardioPrint Biometrics Inc., 170 Boulder Brook Dr., Boulder Creek, CA 95006, USA
5 Author to whom any correspondence should be addressed.
E-mail: Vadim.N.Smelyanskiy@nasa.gov, d.luchinsky@lancaster.ac.uk, millonas@cardioprint.com and p.v.e.mcclintock@lancaster.ac.uk
Received 17 February 2009
Published 27 May 2009
| Abstract. A Hamiltonian approach is introduced for the reconstruction of trajectories and models of complex stochastic dynamics from noisy measurements. The method converges even when entire trajectory components are unobservable and the parameters are unknown. It is applied to reconstruct nonlinear models of rodent–predator oscillations in Finnish Lapland and high-Arctic tundra. The projected character of noisy incomplete measurements is revealed and shown to result in a degeneracy of the likelihood function within certain null-spaces. The performance of the method is compared with that of the conventional Markov chain Monte Carlo (MCMC) technique. |
Contents
1. Introduction
Measurements of complex (multidimensional, nonlinear) stochastic dynamical systems are often missing essential information about the dynamical model and system trajectory (hidden variables). For historic time series where measurements cannot be repeated, missing information can be considered `lost'. Examples include: reconstruction of hidden variables of climate evolution [1] from observable isotope concentrations; fluctuations in predator–prey communities [2]–[6]; and coupled matter radiation systems [7]. How to reconstruct missing information about a stochastic dynamical model, including the full system trajectory, from incomplete noisy measurements is an unresolved scientific conundrum of long standing.
Recently, this problem has received considerable attention. A particle filter approach [8] and the Markov chain Monte Carlo (MCMC) approach [9] were applied successfully to reconstruct the model parameters alone. A simplified version, with all the dynamical variables measured directly, was considered in [10]–[13] using Bayesian and log-likelihood approaches. In the absence of measurement noise, attractor reconstruction [14] in time-delayed variables was used [5] to infer model parameters; in the absence of dynamical noise, shooting and recursive approaches were applied to reconstruct both the parameters and the trajectory [15]. The problem has remained unresolved, however, for the practically important case where hidden dynamical variables, measurement noise and dynamical noise all coexist together. In this case, unknown dynamical model parameters are trading against hidden trajectory components within certain null-spaces leading to a quasi-degenerate likelihood function with a very complex `landscape' and numerous local minima (cf [15]).
In what follows, we present a solution of this general problem based on a novel Hamiltonian formalism derived from the first principles of stochastic dynamics [16]–[19] within the maximum likelihood estimation (MLE) framework. The key result is the discovery of the Hamiltonian equations underlying the MLE problem, by analogy with the Wentzel–Friedlin theory [17] of large fluctuations. It reveals the complex structure of the likelihood and provides for robust convergence. We apply it to the analysis of an archetypical problem in nonlinear dynamics and statistics: noise-driven rodent–predator oscillations [2, 3, 5, 6] for which the predator population could not in practice be observed. We show that, contrary to earlier beliefs [2, 3, 5, 6], noise-corrupted measurements of the prey dynamics alone are a sufficient basis for computing both the likelihood of the predator population and the ecological model parameters. We compare the performance of the method with that of the conventional [12, 13] MCMC technique. Our formalism promises to be useful, not only for the analysis of population dynamics with hidden variables (see e.g. several thousand time series in the database [20]), but also in many other disciplines and contexts.
2. A problem in ecology
Figure 1(a) plots N(t) time-series data (points) for a rodent (prey) population density in Finnish Lapland [20]. Such cyclic behavior among small mammalian species has preoccupied population ecologists for decades. Perhaps on account of its stochastic multiannual quasi-periodicity, and difficulties in measuring predator numbers, there is still no agreement on the mechanisms of oscillation, nor of the reasons why some populations are cyclic and others are not (e.g. noncyclic lemming populations in Arctic Canada [2, 3, 21]). In these settings, the ability to discriminate between various models by inferring both the model and the unobserved population is especially useful. In the example we consider here, the problem is to use a limited number (80) of noise-corrupted measurements to recover both the unmeasured predator population density P(t) and the underlying nonlinear stochastic dynamical model. To solve it, we introduce a Hamiltonian approach. In doing so, we chose from among the various predation models [2, 3] one that distinguishes between the specialist and generalist predators that respectively promote and suppress the cyclic behavior. This particular model explains many observed features of rodent–predator dynamics and can be written as [2, 3, 5]
Functions fn, p(σn, p,t) = (1–en, p sin(2πt +
n, p) + σn, pξn, p(t)) in (1) represent the modulation of the birth rate by periodic (seasonal) and stochastic forcing with amplitudes en, p and σn, p respectively, and
n, p are unknown phase shifts of periodic forcing. ξn, p(t) are mutually uncorrelated zero-mean white Gaussian dynamical noise sources; r and s are, respectively, the intrinsic rates of prey and predator growth; k is the habitat's prey-carrying capacity, which determines the saturation level of the prey population in the absence of predators. The effect of generalist predators (e.g. foxes) whose population does not change with N is encapsulated within the term
N2 with coefficient g (maximum mortality rate). The effect of the specialist predators (small mustelids) believed to maintain the oscillatory prey dynamics is described by the term
NP, with maximum killing rate c. However, the predator population is notoriously difficult to study in the field, so that the corresponding density P(t) is an unmeasured (hidden) variable. The measured rodent density N ' (t) is related to the actual (unknown) value N(t) via N ' = Neσobsη(t) where η(t) is white Gaussian measurement noise of unit intensity.
| Figure 1. (a) Measured population dynamics of small rodents in Finnish Lapland, 1952–1992 [20] (yellow dots) and the solution of the optimization problem (dashed line). (b) The hidden dynamics of the specialist predator population, recovered by use of (1). The inset shows a cross section of the weighted distribution of dynamical trajectories for 1956 (arrow in main figure). |
The precise functional forms are known neither for predation nor for the predator response, and some modifications of equations (1) have been considered [5]. By a change of variables x1(t) = log(N(t)/k '), x2(t) = log(q 'P/k ') (for known nominal values of the scaling coefficients k ' and q ') the predator–prey and the measurement equations are transformed into a standard [8]–[15] form with noise entering the equations only additively
Here x(t) = {xi(t)}i = 1L and {ym(t)}m = 1M are the state vector trajectory and observed time series, respectively. The existence of hidden variables implies that M < L (L = 2 and M = 1 in (1)). {Ki}i = 1L are components of a drift vector field with (unknown) model parameters c = {c1, ...} and Bmi is an M×L measurement matrix. ξi(t) and {βm(t)} are zero-mean components of white Gaussian processes characterized by L×L and M×M correlation matrixes
and
, respectively (M≤L). The stochastic model (2) is characterized by the full set of unknown parameters
. For our particular problem
, the measurement matrix
, and the vector fields are
The standard form (2) represents a generalization that can be applied to a great variety of stochastic dynamical model reconstruction problems with hidden variables.
3. Hamiltonian formalism for dynamical inference
3.1. General solution
We now formulate the general inference problem of finding unknown model parameters
and unobserved trajectory components (here x2(t)) in probabilistic terms. A key statistical quantity is the likelihood probability density functional (LPDF), which is proportional (within the MLE framework) to the posterior distribution
representing the joint probability density that the system trajectory is x(t) and that the system parameter values are
, conditioned on the observed time series
. It can be written as
, where
is a normalization constant. A negative log-likelihood functional
can then be found using the path-integral approach to fluctuational dynamics [18, 19], as
Here
and
with x = x(t), y = y(t). Also
where
is the number of data points, h is the sampling time, and M≤L implying the existence of hidden variables. Despite the hidden dynamical variables not being measured directly, the functional
(3) contains the dynamical coupling between the variables via the force fields Ki(x1, x2). If the measurements provide information sufficient to pin down both the key model parameters of the system and its trajectory, the joint LPDF
will be well localized in the vicinity of its maximum, thus revealing the most probable trajectory x*(t) and the parameter values
that minimize the functional
for a given set of measurements
.
We solve this variational problem by introducing a new paradigm in which
is viewed as the mechanical action of an auxiliary Hamiltonian system with coordinate x, momentum
and a Hamiltonian function H(x, p)
The extremum of
in the joint space
is then found by solving the coupled variational problems
The first condition corresponds to a solution of the boundary value problem (BVP) for the Hamiltonian equations
satisfying the boundary conditions p(0) = p(t) = 0. This BVP can be solved very efficiently using, e.g. the results of [22]. An interesting general property of this solution is that the difference between measured data y(t) and x(t) is fedback in the last term of the second equation in (6). This term effectively acts as a `control force' of an amplitude inversely proportional to the measurement noise intensity driving the x(t) towards its most probable value at a fixed
. We then fix the inferred trajectory x(t) and update the values of the parameters in the set
, using an analytic solution [11] of the second variational problem,
. This procedure is iterated until the desired convergence is achieved. The outcome is the most probable system trajectory x*(t) and set of model parameters
providing a minimum of the action functional
. It corresponds to a point p* = (x*(t),
) in the joint trajectory–parameter space.
3.2. Application to the ecological problem
Note that the accuracy of the presentation (3) depends strongly on the sampling rate 1/h. For the particular ecological problem (1), it was possible to simulate densely-sampled data by introducing a spline interpolation of the experimental points. Convergence was guaranteed by the fact that the data set has nine points per period of oscillation.
Next, we note that in the presence of hidden variables slight changes in the model parameters can lead to significant changes in the system's trajectory, giving rise to a complex shape for
. This is the case for the problem in hand because, with a periodic driving force, (1) can undergo a transition to chaos as the system parameters vary [2, 5]. To investigate the shape of
we define a window of possible values for each model parameter within the set of plausible parameter ranges elucidated in earlier research [2, 3, 5, 6]. For many parameters the windows are very broad. We show some of them as rectangular planes in figure 2. Starting from a random point inside the predefined range we find a point p* corresponding to one of the solutions of the dual variational problem (5). Random restarts are needed if there is (as in our case) more than one solution p*. Projections of the points p* onto the 2D parameter plains and corresponding 3D plots of the LPDF and
are shown in figure 2.
Figure 2. Weighted distributions of the inferred values of the model parameters: (a) r and rk ' /k and (b) s and sq/q ', and the corresponding action functions (c) and (d) . Their likelihood distributions are well localized around expected values r = 5.69±0.49, k = 76±17, s = 1.08±0.31, q = 43±22, g = 0.12±0.3, e1 = 1.4±0.4, e2 = 1.0±0.5 which are close to the values considered in the earlier ecological research. The projections of the points p* onto the two-dimensional (2D) parameter planes are shown in the top planes of (a) and (b) by black dots. |
For example figure 2(a) shows the projection of the LPDF on the plane of parameters (r,rk ' /k). In this figure, all the parameters except parameters r and rk ' /k are kept fixed at their mean values. Parameters (r,rk ' /k) are then chosen from the uniform distribution shown in the background of the figure. After one step of the double optimization procedure described above, the optimal values of the trajectory and parameters r and rk ' /k are obtained p* = (x*(t), r*, (rk '/k)*) and the corresponding value of the cost function
is calculated. The procedure is repeated N = 20 000 times. Then the projection of the LPDF surface on the (r,rk ' /k) plane (figure 2(a)) is found as follows:
where ρij is the density of points in the bin (ij),
is the mean value of the cost function in this bin, Nij is the subset of N found in the bin (ij), I = 60 and J = 40 are the number of bins in r and rk ' /k directions, and dri d(rk/k ' j) is the size (area) of the bin. The corresponding projection of the cost function (figure 2(c)) is found as
Projections of the LPDF and cost functions on the plane (s,sq/q ' ) (figures 2(b) and (d)) are found in a similar way.
It can be seen in the density plots that the LPDF has a very steep localized ridge whose top forms a `line'
in the joint parameter–trajectory space. Fluctuations along
are large and non-Gaussian, while fluctuations transverse to the ridge top are strongly suppressed (figure 2). The mechanical action along
varies slightly, each point on
corresponding approximately to a solution of (5). This difference in scales of fluctuations and the quasi-degenerate forms of
and LPDF can be attributed to the projective character of the measurements with M < L: they are incapable of resolving certain tradeoffs between the parameters and hidden trajectory components. The different points along
correspond to quite different time traces of the hidden trajectory component (predator oscillations shown in figure 1(b)) but they all fit very well to the set of prey population observations. Such tradeoffs are intrinsic to any given model and their understanding is crucial for experts applying the technique for data interpretation and model discovery.
4. Validation of the new dynamical inference approach
Next, we validate the approach by applying it to the inference of population dynamics in a stoat–lemming community [6] and by comparing it to the MCMC method. The corresponding data (figure 3) are similar to the earlier set [5] but with an important difference: the dynamics of both populations, prey N(t) and predator P(t), were recorded. We assume initially, however, that only the prey population was measured, and subsequently compare the inferred predator population with the measured one. The dynamical model [6] for this community is rather elaborate. However, given the close similarity between the two communities it is instructive to perform such a comparison treating (1) as a simplified model. Our results are shown in figure 3: the measured predator trajectory is reproduced, including the correct slope and scale of the oscillations, unlike the earlier simulations [6]. With sufficient care, the MCMC using a Gibbs sampler with the Metropolis–Hastings steps method can show comparable performance. We have found for a number of models, however, that convergence of the Hamiltonian method is faster and more robust with respect to the initial conditions due to the existence of local minima corresponding to nonsmooth time functions P*(t). These local minima slow down the convergence of the MCMC algorithm (see e.g. dashed blue curve in figure 3), but are excluded from the start in our Hamiltonian approach (4)–(6).
| Figure 3. Populations of (a) lemmings and (b) stoats (individuals ha–1) observed in the high-Arctic tundra (1988–2002) [6] are shown by yellow ellipses. Jagged thin solid lines show the initial guesses for each population. Dash-dot (red) lines show the population dynamics inferred from (1) using the Hamiltonian approach and assuming that only the lemming population was recorded, with a measurement error of 0.1, while the stoat dynamics remained hidden. Dashed (blue) lines are from MCMC after 40 000 iterations of the whole trajectory, under the same assumption. |
5. Conclusion
In conclusion, our Hamiltonian approach facilitates inference of parameters and hidden (unobserved) trajectory components of a nonlinear dynamical model in the presence of dynamical noise and measurement errors. The application of the method to ecological problems leads to the conclusion that a lack of observational data for predator populations need no longer constitute a fundamental obstacle to the inference of hidden population dynamics and ecological parameters from noisy measurements [2, 3, 5, 6]. The key limitation of the proposed method is that it relies on the Euler approximation in derivation of the cost function. Correspondingly, the accuracy of this approximation depends strongly on the sampling rate, which has to be dense enough. Here, we simulated dense sampling by spline interpolation and convergence was guaranteed by the data set having nine periods of population oscillation with nine points per period. Note, however, that the convergence of the BVP solution does not require dense sampling and in the future the restrictions on the sampling density can be relaxed. In general, the dependence of the uncertainty of inference on the number of data points is nontrivial in character and will be discussed in detail elsewhere. The method has allowed us to reveal a quasi-degeneracy of the log-likelihood along a hyperplane
in the joint parameters–trajectory space. The existence and shape of
reflects tradeoffs between parameters r and k ' /k (s and q/q ') on one hand, and hidden trajectory components on the other. Importantly, our procedure, unlike MCMC, avoids sampling in the trajectory space. Instead, it relies on a continuous-in-time solution of the deterministic auxiliary problem (6), thereby exploiting recent advances [22] in tackling the BVP. The method will also be applicable quite generally to cases where some state variables could not be recorded, e.g. those mentioned in the introduction [1, 23]. It could be particularly useful in the context of physiological measurements relating difficult-to-access parameters to noninvasively measured data [24], and to cases where dynamical variables are intrinsically hidden and cannot be measured experimentally, even in principle: e.g. flow parameters in aerospace applications [25] and susceptible and exposed populations in epidemiology [23].
Acknowledgments
We acknowledge support from NASA and the Engineering and Physical Sciences Research Council (UK).
References
V N Smelyanskiy et al 2009 New J. Phys. 11 053012
S Tadigadapa and K Mateti 2009 Meas. Sci. Technol. 20 092001
Akira Ozawa et al 2009 New J. Phys. 11 083029
Franz J Kaiser et al 2008 New J. Phys. 10 065013
R H M Smit et al 2009 New J. Phys. 11 073043
D. Adén et al 2009 ApJ 706 L150
B Ziaja et al 2008 New J. Phys. 10 043003
Jaume Haro and Emilio Elizalde 2008 J. Phys. A: Math. Theor. 41 372003
H. Ding et al 2008 EPL 83 47001
Luigi Delle Site et al 2007 J. Phys.: Condens. Matter 19 242101