Classification of anomalous diffusion in animal movement data using power spectral analysis

The field of movement ecology has seen a rapid increase in high-resolution data in recent years, leading to the development of numerous statistical and numerical methods to analyse relocation trajectories. Data are often collected at the level of the individual and for long periods that may encompass a range of behaviours. Here, we use the power spectral density (PSD) to characterise the random movement patterns of a black-winged kite (Elanus caeruleus) and a white stork (Ciconia ciconia). The tracks are first segmented and clustered into different behaviours (movement modes), and for each mode we measure the PSD and the ageing properties of the process. For the foraging kite we find $1/f$ noise, previously reported in ecological systems mainly in the context of population dynamics, but not for movement data. We further suggest plausible models for each of the behavioural modes by comparing both the measured PSD exponents and the distribution of the single-trajectory PSD to known theoretical results and simulations.


I. INTRODUCTION
In natural stochastic phenomena, random fluctuations are often modelled via uncorrelated "white noise", with finite mean and variance, and with a fluctuation amplitude that is independent of the frequency [1]. However, a large class of natural systems exhibit so-called 1/f noise, considered to be an emergent property of correlations that extend across multiple temporal scales [2][3][4][5][6][7]. Traditionally, 1/f noise is defined in terms of the power spectral density (PSD). For an individual trajectory, the PSD of a d-dimensional path, x(t) = (x 1 (t), x 2 (t), ..., x d (t)), as a function of time t, is defined in terms of the sum of squares of the Fourier transform over the d individual components [8,9] Here, t m is the measurement time of the process and f the frequency. "Coloured noise" corresponds to the asymptotic scaling S(f ) ∝ 1/f β of the averaged PSD at low frequencies (see below), where angular brackets imply ensemble averaging, and 0 ≤ β ≤ 2 [10]. Notably, we here focus on the regime 1 ≤ β ≤ 2 where S(f ) is nonintegrable as ∞ 0 S(f )df ∼ ∞ 0 1/f β df → ∞. Naturally, low frequencies that cause such divergence are associated with infinitely long correlation and measurement times. This non-integrability is often referred to as "the infrared catastrophe", as it implies that infinite energy is stored in the low frequency fluctuations, even in a bounded process, which is evidently not physical [2,11,12]. Nevertheless, experimentally, power-law shaped PSDs attributed to 1/f noise have been inferred from finite-time observations in systems ranging from semiconductor devices and metal films [2,13] to earthquakes [14] and human cognition [15].
In recent years, the "paradox" of the divergent power spectrum was resolved by showing that at finite measurement times the exponent β is not sufficient to characterise all properties of the PSD. Specifically, to physically interpret the apparent divergence at small f , one should use the non-stationary, ageing power spectrum, of the form [16,17] where z is called the ageing exponent. This form entails that the divergent nature of the spectrum expected at infinite times manifests itself through the (finite-time) finite value of a non-stationary power spectrum [16][17][18]. This ageing pattern has been measured in a range of processes [19][20][21][22][23], including intermittent quantum dots [12], telomeres in the nucleus of cells [9] and the motion of membrane proteins in living cells [24]. In ecology, 1/f noise is primarily observed in population dynamics and thought to account for the variability and autocorrelation of such ecological time-series [25]. For example, the variability of the population density in multiple species has been shown to increase over time, a phenomena that can be accounted for by 1/f noise [26]. Rapid increase in high-resolution movement data has led to the development of numerous statistical and numerical methods [27,28], as well as analytical methods [29] to analyse relocation trajectories. Data is often collected for multiple individuals and for long periods that encompass a range of behaviours, from local searches within a bounded patch, to long migratory flights [30]. Here, the PSD is also used in the analysis of movement data, mainly to detect frequencies that account for the variability across a movement path (see, e.g., [31]). However, to the best of our knowledge, neither the PSD nor its ageing effects have been used to model and classify the noise of movement paths in movement ecology. Specifically, ageing can have important implications on the process, entailing if it is stationary and ergodic, which in turn affects the ability to average over common observables (e.g., net and squared displacements, as well as velocity) [32,33].
In processes with 1/f 2 noise (corresponding to Brownian noise), finite-time effects in the power spectrum were shown to decay in the limit t → ∞, and averaging over an ensemble of independent trajectories, the value measured numerically from finite paths converges to a single function S(f ) as the path length is increased [8]. However, having an ageing power spectrum with 1 ≤ β < 2, the dependence of the PSD on both f and t m always persists. To account for the time dependence, periodograms are often used [34]. Additionally, several recent studies have focused on the PSD of individual trajectories, which enable data analysis of experimental systems even where only few trajectories are available. It is commonly the case that the PSD remains very stable across trajectories, as was explicitly shown for Brownian motion (BM) and fractional Brownian motion (FBM) [8,9].
Notably, the study of the ageing properties of PSD is complementary to the study of the ensemble-averaged (EA) and ensemble-averaged time-averaged (EA-TA) mean squared displacement (MSD) [35]. The EA-MSD, x 2 (t) , is defined as the squared displacement of an individual's position with respect to a reference position, averaged over an ensemble of movement paths. The EA-TA-MSD is given by averaging over the squared displacement performed during a time lag τ , and then averaging again over the ensemble [36,37] The EA-and EA-TA-MSD are commonly used to study correlations and ergodicity in anomalous systems [35], and when discrepancy appears the process is said to exhibit weak ergodicity breaking, see e.g., [38,39]. An important observable characteristic for different stochastic processes, is the scatter of amplitudes of the TA-MSD at a given time lag [35,36] In this study, we perform an empirical analysis of the PSD of movement tracks of a kite and a stork. These animals are chosen as their recorded tracks are of different spatiotemporal scales: high resolution data for the kite (0.25 Hz) and long track duration for the migrating stork (> 8 years). Additionally, for both animals the tracks are known to encompass a range of different behaviours (see below). As animal movement typically varies across spa-tiotemporal scales, and strongly depends on seasonality as well as behavioural mode [40], we focus here on interpreting different movement modes of an animal, shown to each have different (non-)stationary properties. We further analyse the single-trajectory PSD, which can vary depending on the physical process and on the dimension d. Finally, we find significant ageing effects when the initiation time of the process differs from the initial measurement time. To interpret our findings we use simulations and theoretical results developed in recent works for several well-known physical models, BM [8,41], FBM [9] and the subordination of FBM to continuous-time random walks (CTRW) [17,24] (details below). These models allow us to infer properties of the observed movement paths, particularly to distinguish sub-from superdiffusion, namely diffusion where then MSD grows slower or faster than linear in time, and to quantify its nonstationary properties.

A. Data collection
Black winged kite. An individual black-winged kite (Elanus caeruleus), residing in the Hula Valley, Israel, was tracked using ATLAS, an innovative reverse-GPS system. ATLAS localises extremely light-weight, lowcost tags [42], where each tag transmits a distinct radio signal which is detected by a network of base-stations distributed in the study area. Tag localisation is computed using nanosecond-scale differences in signal timeof-arrival to each station, alleviating the need to retrieve tags or have power-consuming remote-download capabilities [32,42]. The kite was tracked for 164 consecutive days in years 2019-2020, with a mostly constant tracking frequency of 0.25 Hz, see Ref. [32] for more details. Our analysis is limited to data collected during the activity hours (omitting the nights for the diurnal kite) and we further excluded data collected in proximity to the observed nest to focus on local search behaviour.
White stork. An adult white stork (Ciconia ciconia) was tracked between May 2012 and July 2020 using the configuration described in Ref. [43]. Here the GPS location and speed were recorded during the day at a frequency of 1/300 Hz when solar recharge was high (92% of the time), and at 1/1200 Hz otherwise. We omit days with lower frequency (< 1% of tracked days) and only include localisations that occur between the first recorded velocity of > 4 m/s [44].

B. Theoretical models
In this work we focus on empirical evidence for 1/f noise and ageing of the PSD in ecological movement data. In several cases it is possible to relate our findings to known physical models. Here, we list those models, for which analytical results for the PSD have been reported in recent years.

Brownian motion
For an overdamped, one-dimensional Brownian trajectory x(t), with t ∈ [0, t m ], the dynamics is given by the Langevin equation [45] where ξ(t) is a Gaussian, delta-correlated white noise term with zero mean and ξ(t)ξ(t ) = 2Dδ(t−t ), and D is the diffusion constant. For a particle governed by Eq. (4), the EA-and EA-TA-MSD, Eq. (3), are respectively given by δ 2 (τ, t) = 2Dτ and x 2 (t) = 2Dt [6,37]. As both show identical scaling with time, the process is ergodic in the (weaker) Boltzmann-Khinchin sense of equality between (long) time and ensemble averages [35]. For this process, the mean of the PSD follows [41] S entailing β = 2 and that the amplitude of the PSD is linear in D without explicit dependence on t m . Here, the non-integrability of the noise is generally solved by considering an unbounded process, however it does demonstrate the significant difference between "white" noise and processes where the noise is Brownian (sometimes called "red" or "brown" noise [8,25]). For any finite t m the single-trajectory PSD will fluctuate from one realisation to the next. The distribution of the single-trajectory PSD [Eq. (1)] around its ensemble average is then quantified by the amplitude [8] S where A d is a random number that depends on the choice of the stochastic model (see examples below), and can also depend on the dimension d, the frequency f , and the measurement time t m . The distribution of A d is highly informative since it is model specific as shown analytically for a number of processes [8,9,46,47]. In particular, it is important in studies where only few paths are available, as it relates the single PSD to its average. For BM, the distribution of the single-trajectory PSD around the mean (5) is given by Eq. (6), with [8] P where I ξ (·) is the modified Bessel function of the first kind and Γ(·) is the Gamma function. Notably, the shape of the distribution depends on the dimensionality, where dimensions lower than the embedding space of the process can be achieved by projection or component-wise measurement [8,9].

Fractional Brownian motion
As in the case of BM, FBM can also be defined in terms of the Langevin equation (4), replacing the deltacorrelated noise term ξ(t) with zero-mean, power law correlated fractional Gaussian noise ξ fGn (t) defined by [35] Here, H is termed the Hurst exponent and, similarly to the case of BM, the process is ergodic with δ 2 (τ, t) = 2Dτ 2H and x 2 (t) = 2Dt 2H , whereD is an effective diffusion constant [35,48,49]. In accordance with the definition of ξ fGn (t), for H > 0.5 the noise is persistent leading to a positively correlated process while for H < 0.5 it is antipersistent leading to a negatively correlated process. For FBM and other stationary (in increments) processes, the PSD is written in terms of the EA autocorrelation function Here, C EA and the time averaged (TA) autocorrelation function, defined by are identical at long times. In the limit of long measurement times it was shown that for FBM [9] S i.e., for persistent FBM the PSD is ageing, while it is not for subdiffusive FBM. Here, it is also possible to obtain the distribution of the single-trajectory PSD around the mean; for example, for superdiffusive FBM (H > 1/2) in d-dimensions it was found that [9] see Ref. [9] for more details and results. Here, the distributions P (A d ) are in general different from those of BM, compare Eq. (7).

Subordinated FBM
Here we consider a process in discrete time steps, expressed by the number of jumps n = 1, 2, 3, . . . , such that the autocorrelation function is given by where ∆x is a scaling parameter. Furthermore, H is analogous to the Hurst exponent of the FBM, although here it is not directly related to the scaling of the EA-MSD on time (see below). In this process, the discrete-time FBM (13) is subordinated to the operational time of the CTRW, such that after each FBM step, the particle is immobilised for a random waiting time τ drawn from a fat-tailed distribution [50] with 0 < α < 1, such that τ diverges. By combining the correlation structure of the jump sizes of FBM, with the unbounded waiting times of CTRW, this process effectively distinguishes the "operative clock", defined by the sequence of jump events themselves, from the "real" clock defined by t, thus allowing for both temporal correlations as well as ageing effects (as in individual processes, FBM only allows for the former, while CTRW allows for the latter 1 .) Waiting time distributions similar to Eq. (14) have been reported in multiple empirical systems, e.g., [32,[51][52][53][54].
As subordinated FBM is a non-stationary process, C EA and C T A are not identical, and the Wiener-Khinchin theorem no longer applies. Similar observations in other processes have led to the development of the ageing Wiener-Khinchin theorem in recent years [16,17], which enabled analytical calculations of the PSD of both stationary and non-stationary processes, relating the exponents β and z to underlying physical processes, see, e.g., [9,17,24,46,47,55]. In particular, for subordinated FBM the PSD has been rigorously computed in Ref. [24]. The leading term of the spectrum was found to again differ between positively and negatively correlated increments as [24] Note that for α = 1, Eq. (11) is retrieved, i.e., the process corresponds to FBM. In Eq. (15), for positively correlated increments H > 1/2, the process does not exhibit 1/f noise (i.e., β = 2, as is the case for BM, see the discussion in [9]); however, the ageing properties depend on α: for αH > 1/2 the spectrum increases with the measurement time t m as in superdiffusive FBM [9], while for αH < 1/2 the spectrum amplitude decreases with t m . In contrast, for anticorrelated increments in terms of the number of jumps n (H < 1/2), for any α < 1 the spectrum displays 1/f noise (β < 2) and ageing which decreases with the measurement time (z < 0). Below we compare these theoretical predictions to the recorded movement paths of a kite and a stork, for which we can obtain the values of α and H and infer the non-stationary properties of different movement modes.
Importantly, the values of α and H in this model can also be obtained from independent analysis of the EA-TA-MSD, Eq. (3). For the subordinated process, when 0 < α < 1 the EA-TA-MSD is given by [56] The EA-TA-MSD was found for our empirical data (kite and stork) in a recent study [44]; however, the MSD (and in particular the EA-MSD) is typically a noisy observable [24], and in some cases it was not possible to perform adequate power-law fits. Thus, the analysis of the PSD can be highly valuable when comparing the data to theoretical processes, as shown below. Finally, as the theoretical distribution of the single-trajectory PSD is yet unknown for subordinated FBM, we compare our results to stochastic simulations of the process 2 .

A. Kite
Similarly to Ref. [32], the kite's tracks are segmented into two behavioural modes: local searches (area restricted search) and commutes (directed flights between local searches). Localisations were segmented by detecting switching points in the data -distinct points in which the bird switches between the two behaviours [57]. Switching points were detected using spatiotemporal criteria segmentation, such that localisations that are in proximity to one another both in space and time, were segmented together [27]. In accordance with the conclusions of Ref. [32] we independently analysed the time series ensemble that represents instances of searches and the time series ensemble that represents commutes. Below, we compare these analyses to an unsegmented ensemble of daily tracks, showing that ecological knowledge of the underlying processes is crucial to correctly identify the noise properties of the data.
For searches, the PSD (Fig. 1b) These results are consistent with subdiffusive subordinated FBM with α = 0.43±0.04 and H 0, see Eq. (15). Notably, the value of H is close to zero and the error is relatively large, thus an 2 Here, the increments of the FBM, in discrete time n, are obtained using the python function fgn from the package fbm (Fractional Gaussian noise -fGn -is the increment process of the FBM). The increments, δrn, are projected to two dimensions by generating a random number θ ∈ [−π, π] and setting δxn = δrn cos θ and δyn = δrn sin θ. The FBM in two dimensions is given by the cumulative sums xn = n m=0 δxm and yn = n m=0 δym. The times between (the discrete-time) steps are then drawn from a Pareto distribution, Eq. (14). Notably, there are also other ways to simulate FBM for d = 2. We are currently studying other simulation methods and how they will affect our results. exact estimate of its value is hard to achieve. An analysis of the MSD for the same ensemble of searches produces δ 2 (τ, t) ∼ τ 0.55 t −0.55 , which is consistent with α = 0.45 ± 0.03 and H 0, see Eq. (16). Here, both the analysis of the MSD and of the PSD suggest that subordinated FBM is a plausible model for the movement within search grounds of this kite. In contrast, for the commutes (Fig. 1c) we find that S ∼ f −2 t 0.60 m , consistent with (ergodic) superdiffusive FBM with H = 0.80 ± 0.03, see Eq. (11). Here, analysis of the EA-TA-MSD gives a scaling of δ 2 (τ, t) ∼ τ 1.66 , suggesting H = 0.83 ± 0.03, in agreement with the analysis of the PSD.
We compare these results to the PSD of unsegmented daily trajectories (Fig. 1a). Here, the trajectories include both the commutes and searches, each occurring at different temporal and spatial scales. We find that β = 2.10, however there is strong ageing in the process as S ∼ t −0.66 m . Notably, the 1/f noise found during searches is completely skewed due to the lack of segmentation, and the process is most likely a mixture of (at least) two different processes, making these results hard to interpret. Additionally, the dependence of the PSD on t m (Fig. 1a, inset) does not provide a clear scaling form when normalised by f −β for different f as is the case for searches and commutes (Fig. 1b-c, insets). These findings highlight the importance of correctly accounting for different movement phases, allowing us to better interpret the 1/f noise in the data.
Some discussion on stationarity is in order here. For searches, the longer we observe the system, the smaller the amplitude A of the PSD becomes (Fig. 1b, inset). This occurs because the longer we track the searching bird, the more likely we are to find it "trapped" at a specific location for long periods; thus, the rate at which the animal moves is significantly reduced and the noise levels decrease accordingly [12]. In a recent work we have shown that this effect originates from long waiting times during searches [32]. The scaling exponent which is proportional to α [see Eq. (15)] allows for yet another way to quantify this effect. In contrast, for commutes the PSD increases with the measurement time (Fig. 1c, inset). This process can be modelled with the use of FBM (i.e., α > 1) and is Gaussian and ergodic in nature. As these flights are highly correlated, persistent movement at increasingly long measurement times increases the amplitude of the PSD -hence the growth of the PSD.
Finally, in Fig. 2 we plot the distribution of A d given by Eq. (6) for searches and commutes in d =1 and d =2 dimensions. For searches (Fig. 2a), we compare these distributions to simulation results for subordinated FBM with α = 0.45 and H = 0.01, which are consistent with the results obtained above. For both dimensions we find excellent agreement between the simulated and observed distributions. Interestingly, we find that for searches, the empirical distribution does not strongly differ between d = 1 and d = 2. Here, for subordinated FBM a theoretical distribution for A is yet unknown. In comparison, for the commutes (Fig. 2b) the distributions are in good agreement with the theory given by Eq. (12) for superdiffusive FBM.

B. Stork
The relatively low tracking frequency of the stork (one recorded point every five minutes) does not allow to identify and independently analyse potential searches, as these will include a very limited number of points (e.g., a search of 1 hour will include ∼ 12 points). Thus, daily tracks are not segmented but rather considered as a single process, limiting the scope of the analysis. Nevertheless, to properly account for the PSD for different behaviours, daily paths are clustered into subsets that represent distinct behaviours in a bird's life cycle [43]. This is done using a Gaussian mixture model (GMM) [58] based on the logarithm of the maximum displacement (∆X) that the stork performs during each day (Fig. 3a). We have found three significant movement modes which are highly correlated to the time of year: the mode with ∆X = 3 km occurs primarily (> 80%) between April and July, the mode with ∆X = 33 km occurs primarily (> 80%) between September and February, and the mode with ∆X = 213 km occurs primarily (> 87%) during wellknown migratory periods [43]. We thus refer to these three modes as breeding, wintering and migrating respectively and perform the analysis below for each subset of daily paths separately. Note that the GMM identified another mode with ∆X = 0.4 km. However, as this mode includes a small number of daily paths (60 days out of ∼ 2500), we discard it from our analysis.
For the breeding ensemble (966 days, Fig. 3b) we found S ∼ t −0.45 m f −1.9 , which is consistent with subordinated FBM with α = 0.55 ± 0.09 and H = 0.40 ± 0.05, i.e., a non-stationary process with near 1/f 2 noise. For the wintering mode (1068 days, Fig. 3c) S ∼ t 0.07 m f −2.10 , i.e., the noise is approximately 1/f 2 noise and we find no significant ageing patterns. Importantly, the fact that we measure exponents β = 2 and z = 0 in Eq. (2) does not entail that the process is Brownian, in accordance with the results of Ref. [44]. On the contrary, subordinated FBM with any α and H that fulfil 2Hα = 1.07 is a plausible model, as it is consistent with S ∼ t 0.07 m f −2.10 and Eq. (15). Here, the knowledge of the analytical scaling of the PSD [Eq. (15)] is not sufficient to determine the values of both α and H independently, and complementary analysis is crucial. For instance, based on the conclusions of Ref. [44] one can determine H 0.7 and α 0.77, which is consistent with the above relation. Below, we show that the distribution of the single PSD further supports this. For the migrating mode (480 days, Fig. 3d) we get S ∼ t 1.27 m f −2 , which is consistent with the ballistic motion of a migrating stork. Here, the stork migrates approximately in a straight line towards a stationary target (breeding and wintering grounds for spring and fall migrations, respectively), thus accounting for the rapid increase in amplitude (Fig. 3d, inset). Notably, the fact that the scaling with the measurement time is steeper than linear implies that, in addition to the highly correlated nature of this process, the animal is also accelerating, which is again consistent with the complementary analysis in Ref. [44]. 3 Indeed, storks roost in stopover sites during the night and tend to depart in the late morning, when soaring conditions improve, facilitating faster flights at lower energy costs [59].
In Fig. 4 we plot the distribution of the single PSD around its average for breeding, wintering and migration in one and two dimensions. For breeding we compare these distributions to simulations of subordinated FBM with α = 0.55 and H = 0.40 (see above). Similarly, for wintering we compare to subordinated FBM with α = 0.77 and H = 0.70. Here, the predicted distributions (7) for BM for d = 1, 2, can be shown to be far from the empirical distributions, entailing that this process is not Brownian. For both the breeding and wintering modes the empirical distributions show good agreement with simulated subordinated FBM. Note that although the ageing properties differ between these modes (see insets of Fig. 3b-c), the distributions of the single PSD ( Fig. 4a-b) are similar, suggesting some kind of unifying features of the underlying dynamics of the stork during breeding and wintering, see Ref. [44] for further discussion. Finally, for migration (Fig. 4c) we find that the distribution is in excellent agreement with the theoretical distribution for superdiffusive FBM given by Eq. (12).

C. Ageing effects
In many applications, the initial measurement time (the time from which the process is first observed) is different from the initiation time of the process, where the latter is often unknown. For nonergodic processes, such a discrepancy can make an anomalous process appear closer to normal diffusion, as the observed statistics of the initial unmeasured period can be distinct from the statistics of the measured period [18,53,[60][61][62]. We denote the time between the initiation of the process and the beginning of the measurement by t a , and refer to it as ageing time [35,62]. Here, we measure the exponents (2), blue crosses] and exponents γ and z in δ 2 (τ, t) ∼ t −z m τ −γ (orange circles) as measured for the kite's trajectories, as a function of the ageing time ta. In (a) the error bars represents variations between different measurement times, while in (b) they represent variations between different frequencies k.
β and z in Eq. (2) for the kite during ARS, as a function of t a , see Fig. 5. We vary t a between 1 and 512 by discarding all points occurring at t < t a for each t a and obtain the PSD from the remaining points in the range [t a , T ] (such that t m = T − t a ), T being the time since the unknown initiation of the process. Notably, searches typically last 40 min (= 2400 s), which is significantly longer than any choice of t a .
For the PSD of the searching kite, we find that variations in β are relatively small, with < 10% difference between its value at t a = 0 and its value at t a = 512. This entails that ageing the trajectory (starting the measurement at increasingly long lag time t a instead of the initiation time of the process) does not affect the powerlaw shape and 1/f noise. In contrast, the exponent z significantly changes with t a (> 70%), meaning that ageing the trajectory changes the non-stationary properties of the system. For instance, if the process is modelled with subordinated FBM (Sec. III A), analysis of the PSD can give different values of α, depending on t a . This effect has important implications for ecologists, as it is often the case that the initiation time of a motion and the initial measurement time are not the same. Importantly, similar ageing effects can appear in the analysis of the EA-TA-MSD, as can be seen by comparing Eqs. (15) and (16). In Fig. 5 we show a remarkable agreement between the two methods (PSD and MSD) with respect to the ageing analysis of the empirical tracks.

IV. DISCUSSION
In this work we analysed the PSD of ecological movement data. By segmenting and clustering the movement of a kite and a stork we are able to classify different behavioural modes based on the properties of the PSD. For the kite we found different characteristics for bounded searches and commutes. The PSD of the former dis-plays 1/f noise and strong ageing, both indicative of a non-stationary process and long-range correlations, as highlighted by comparison to known theoretical results and simulations of subordinated FBM. In contrast, for commutes, the process shows 1/f 2 noise and an amplitude that increases with the measurement time, suggesting superdiffusive movement patterns of the commuting kite. These results are in agreement with the conclusions of Ref. [44] for both the search and commute modes, strengthening the validity of the models suggested above. Comparing these results to the analysis of unsegmented tracks highlights the improved insight one can obtain by properly identifying behavioural modes.
For all movement modes of the stork we find approximately 1/f 2 scaling of the PSD; however, the ageing exponent z significantly varies between modes. While breeding is a non-stationary process, wintering appears stationary as indicated by the exponent z, and migrating yields a PSD that increases with time. Although this analysis may be limited by the resolution of the data, as discussed above, we used known theoretical results and simulations to show that these daily movement patterns can be modelled as a subordinated FBM (breeding and wintering) or as an FBM (commuting). The analysis performed here highlights the importance of considering the ageing exponent of the data and not solely the 1/f scaling. While the three modes of the stork only slightly differ in their β values and may appear Brownian, they significantly differ in the ageing exponent. In particular, the breeding periods displays clear non-stationary patterns with decreasing diffusivity over time (see also [44]).
Although the time series for the stork are relatively short (∼ 110 points for each day), the analysis presented in Fig. 3 spans multiple decades in frequency. This is in contrast to the analysis performed in the temporal domain in Ref. [44] where the power law scalings of the MSD were local and differed significantly between temporal scales. This indicates that the PSD is typically more robust, allowing for more accurate estimates of the underlying process. However, we stress that the PSD analysis used here [specifically Eq. (15)] does not generally give a unique set of model parameters. As shown above for the wintering stork, the PSD exponents can be consistent with a range of interconnected values of α and H. Moreover, one could mistakenly assume that the process is close to Brownian. Here, complementary analysis is vital for correct inference. For instance, analyses based on the amplitude scatter of the time-averaged MSD [35,36], the p-variation method [63,64], or the first-passage time distribution [65,66] are useful statistical observables for single particle tracking data. Nevertheless, our analysis demonstrates that the PSD method provides vital information in the analysis of movement data. Finally, our results highlight possible future directions, particularly in the study of the subordinated FBM and the CTRW formalism. First, to better quantify the empirical distributions found in Figs. 2 and 4, a theoretical study of distribution of the PSD for CTRW and subor-dinated FBM is required. Second, it has recently been shown that anomalous transport processes which do not obey the predictions of the Gaussian central limit theorem, can be decomposed into three individual effects (exponents); non-stationarity, temporal correlations and extreme events [44,[67][68][69]. It can thus be interesting to test whether the PSD method can be extended as an independent method to extract these exponents (the PSD analysis we use here only provides two independent exponents). To this end, it would also be useful to compare the insights extracted directly from the ageing PSD of the data, with that obtained using more data-driven approaches, such as machine-learning [70][71][72][73][74][75], or Bayesian maximum likelihood analysis [76]. Finally, following the empirical evidence that an ageing time can drastically affect the measured exponents, a general ageing theory for both CTRW and subordinated FBM is needed.

V. ACKNOWLEDGEMENTS
For fieldwork and technical assistance we thank Y. Orchan, R. Shaish, A. Levi, S. Rotics and M. Kaatz. RN acknowledges support from JNF/KKL grant 60-01-221-18 and DIP (DFG) grant NA 846/1. RN also acknowledges support from The Minerva Foundation, the Minerva Center for Movement Ecology and the Adelina and Massimo Della Pergola Chair of Life Sciences. RM acknowledges the German Science Foundation (DFG) for support within grant ME 1535/12-1. OV and MA acknowledge support from the ISF grant 531/20. MA also acknowledges Alexander von Humboldt Foundation for an experienced researcher fellowship.