Stochastic-like characteristics of arithmetic dynamical systems: the Collatz hailstone sequences

The numerical hailstone sequences, or orbits, generated by the Collatz map have been shown to present features commonly associated with complex systems, despite the extreme simplicity of the iteration rule of this arithmetic dynamical system. Indeed, for a positive integer n, the Collatz map f reads f(n) = n/2 for n even and f(n) = 3n + 1 for n odd. Seeking to elucidate this surprising fact, here we unveil distinct characteristics of stochastic-like behavior for collections of Collatz orbits by considering methods commonly applied to time series, such as cryptography tests, power spectrum, detrended fluctuation analysis, autocorrelation and entropy measures. Besides confirming previous predictions that the Collatz orbits display some global properties of geometric Brownian motion, our results also explain, at least heuristically, why this is so. In particular, we show by means of a comprehensive analysis that our findings cannot be ascribed to standard chaotic evolution. Moreover, we identify novel short- and mid-range correlations in the Collatz orbits. The Collatz map is hence a paradigmatic example of an arithmetic dynamical system that can also be regarded as displaying key characteristics of an arithmetic statistical physics system, which explains its dynamical richness.


Introduction
Is there a more primitive structure underlying natural phenomena than nature itself? This apparently meaningless question is in fact related to, quoting E Wigner, the 'unreasonable effectiveness of mathematics in the natural sciences' [1]. The key issue is how pure mathematical patterns and formal relations could mimic or even explain essential characteristics of actual processes and dynamical behavior, which otherwise are understood as directly resulting from the physical laws applied to ontological objects (particles, mass, spin, charge, fields, etc) of the concrete world. In fact, this foundational problem has been discussed from many different points of view (for just a glimpse, see, e.g. [2-5]).
A distinct (less philosophical) perspective on the issue is just to consider 'plain' mathematical constructions and then to analyze how they might relate to emergent behavior in real systems. For instance, very representative are the close associations between: arithmetic (algebraic) geometry and quantum gauge field theories [6], basic problems in differential topology and the unfolding of phase transitions [7], and, perhaps more striking, number theory patterns and dynamical systems, leading to the relatively new subject of arithmetic dynamics [8].
Arithmetic dynamical models are those whose evolution takes place in discrete spaces (e.g. in the set of integers or rationals) by the action of a map, so time is also discrete. In a narrower context, they should just connect results in the topic of Diophantine equations to discrete versions of dynamical systems [8,9]. However, arithmetic dynamics has already found many applications [10-12], extrapolating the domain of dynamical systems. This is so because arithmetic dynamics deals mostly with p-adics [13,14]. In turn, p-adic numbers (simply put, the decomposition of an integer as a power series of a prime p, see next sections) are very useful technical tools in different areas of physics such as quantum mechanics [15,16], quantum logic [17], gravity [18], and string theory [19]. Further, they have been employed in the study of stochastic and complex systems, like in self-organized criticality (SOC) [20], phase transitions in Ising [21,22] and Potts [23-26] models, as well as in Gibbs measures [21,27,28], Markov processes [29], and diffusion in random media [30]. For a complete review of the uses of p-adics see, e.g. [12].
In number theory (very boldly speculated to be the basis of fundamental physics [31]) there is a dichotomy between the deterministic character of a numerical series, given by a specific rule, and the eventual randomness (or more properly pseudo-randomness) of the full sequence of these numbers [32-36]. From this, a pertinent query is whether, associated with certain arithmetic dynamical systems, we could also identify 'arithmetic statistical mechanics' models. Note that this leads to a type of dilemma bearing a close parallel to the comprehension of an actual physical process: whether it can be fully explained in terms of deterministic laws, or random elements are dominant so that it essentially represents a stochastic dynamics. It is relevant to emphasize that such dual behavior in nature [37] constitutes a possible mechanism for the unfolding of complexity [38,39].
In the search for elementary dynamics presenting clear features, but not all, of complex systems [40], the study in [41] has analyzed the so-called Collatz hailstone sequences. They are generated by a discrete time t map of the form $f(n_{t-1}) = n_t$, with the $n_t$'s integer numbers, thence constituting an arithmetic deterministic rule. The Collatz sequences are associated with the Collatz conjecture, arguably one of the most important open problems in mathematics [42] (details in the following). In [41] it was shown, somewhat surprisingly, that quantities calculated from the Collatz sequences do display scale invariance. Moreover, some exploratory numerical calculations in [41] complied with formal results in the literature [43-46], demonstrating that if certain conditions could be met (see section 2), then collections of Collatz sequences would exhibit properties akin to those of geometric Brownian motion (GBM) [47]. GBM characterizes processes whose logarithm of the dynamical variable follows a Brownian motion [47]. Since the latter is normally distributed, the former is log-normally distributed. Notably, GBM is an important ingredient in explaining certain self-reproducing phenomena, such as those of population [48], wealth [49-51], and finances [52]. GBM has also been used to describe bacterial cell division [53] and noise effects in electrical circuits [54].
Although representing good indications, the partial results in [41] (which also do not focus on ensemble-like statistical analysis) are not enough to definitely demonstrate that the Collatz mapping is an arithmetic dynamical system also displaying features of an arithmetic statistical physics system. Thus, our aim here is to perform a comprehensive investigation of the Collatz sequences, treating them as arbitrary time series. From pertinent quantifiers like cryptography statistical tests, power spectrum, detrended fluctuation analysis (DFA), autocorrelation function (ACF) and entropy measure, we disclose relevant stochastic-like traits in the Collatz dynamics. We also show that the standard properties of chaos cannot be attributed to the Collatz orbits, hence they cannot account for such trends.
Our findings unveil an important example of a mathematically motivated set of very simple rules displaying the aforementioned dual behavior. Additionally, they reinforce the view that arithmetic dynamical systems could potentially constitute very basic instances of complex systems (see [41,55] and the references therein, as well as the topical issue in [56]).
The work is organized as follows. Section 2 briefly reviews the Collatz map, where the idea of p-adic numbers and an appropriate representation for the Collatz sequences are also discussed. Further, information-related issues are addressed. Section 3 details the methodology used in the problem analysis. The results are presented in section 4. Section 5 demonstrates that common features ascribed to chaotic systems do not apply to Collatz orbits. Final remarks and conclusions are drawn in section 6.

The Collatz map and its hailstone sequences
We address a 1D arithmetic dynamical system, the Collatz map [42], defined on the positive integers $\mathbb{N}^+$. For $t = 0, 1, 2, \ldots$, it is given by $f : \mathbb{N}^+ \to \mathbb{N}^+$, so that

$$ n_{t+1} = f(n_t) = \begin{cases} n_t/2 & \text{if } n_t \text{ is even}, \\ 3 n_t + 1 & \text{if } n_t \text{ is odd}. \end{cases} $$

Assuming an initial condition $n_0 \in \mathbb{N}^+$, the corresponding Collatz sequence is generated by successive iterations of f, producing the orbit

$$ O(n_0) = \{ n_0,\ n_1 = f(n_0),\ n_2 = f^2(n_0),\ \ldots \} $$

constituted by positive integers. We denote as $\{f^{(S)}\}$ the set of numbers up to S applications of f. The $\{f^{(S)}\}$'s are frequently called hailstone sequences because of the multiple ascents and descents their terms go through, in analogy to real hailstones before they fall [57,58].

The Collatz map is related to a very simple to state but still unproved conjecture in number theory. Indeed, the Collatz conjecture [42,59] proposes that for any finite initial positive integer $n_0$, the orbit will always eventually reach 1. In other words, regardless of $n_0$, there exists a finite T such that $f^T(n_0) = 1$, i.e. $O(n_0) = \{n_0, n_1, \ldots, n_T = 1, \ldots\}$, where T is usually called the total stopping time. Note that once reaching $n_T = 1$, the orbit becomes periodic, $\{\ldots, n_T = 1, 4, 2, 1, 4, 2, \ldots\}$. By 2020 the conjecture had been verified for all initial conditions $n_0 < 2^{68}$ [60]. Moreover, T Tao has demonstrated that 'almost all orbits of the Collatz map attain almost bounded values' [61]. From a dynamical system point of view, the conjecture asserts that the Collatz map has only one periodic orbit, with an infinite basin of attraction in $\mathbb{N}^+$.
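The definitions above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `collatz_step` and `collatz_orbit` and the `max_steps` safeguard are assumptions introduced here, not part of the original work.

```python
def collatz_step(n):
    """One application of the Collatz map f (hypothetical helper name)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n // 2 if n % 2 == 0 else 3 * n + 1


def collatz_orbit(n0, max_steps=10**6):
    """Return [n_0, n_1, ..., n_T = 1], stopping at the first visit to 1.

    max_steps is an illustrative safeguard, since termination for every
    n_0 is exactly what the conjecture leaves unproved.
    """
    orbit = [n0]
    while orbit[-1] != 1:
        if len(orbit) > max_steps:
            raise RuntimeError("step budget exhausted before reaching 1")
        orbit.append(collatz_step(orbit[-1]))
    return orbit


orbit = collatz_orbit(27)
total_stopping_time = len(orbit) - 1  # T such that f^T(n_0) = 1
```

Python integers have arbitrary precision, so the same sketch works unchanged for very large initial conditions, although a dedicated representation is more convenient for long orbits.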
Since $3n + 1$ is necessarily even if n is odd, it is common (and useful for our later analysis) to define the accelerated Collatz map [42] $\tilde{f} : \mathbb{N}^+ \to \mathbb{N}^+$,

$$ \tilde{f}(n) = \begin{cases} n/2 & \text{if } n \text{ is even}, \\ (3n + 1)/2 & \text{if } n \text{ is odd}. \end{cases} $$

This is called the accelerated map because it absorbs the mandatory division by 2 that always follows the iteration of an odd number, thus eliminating a certain bias in f. Among the many fascinating characteristics of the Collatz sequences, we cite one that is very important in our present context. Under very specific conditions described in appendix A, for a certain number k of iterations, a proper collection of Collatz hailstone orbits leads to a 50%-50% splitting between odd and even integers. Hypothesizing that such balanced splitting is valid beyond k steps (the case if the Collatz conjecture could be settled positively), it is possible to associate scaled hailstone sequences with log-normal distributions [43,45,46,62,63] (log-normal-like distributions arise from a variable performing GBM, see [53] for a review). Moreover, it would also indicate that Collatz hailstone sequences present a pseudo-random behavior, a trend often believed to make the conjecture so hard to prove.
The above and many other properties [42] of the Collatz sequences result in a very rich dynamics, despite the extreme simplicity of the evolution rules. Actually, such features have been explored in distinct model applications [64-68], and have moreover been pointed out as establishing a connection between the Collatz map and the emergence of complexity in different contexts [41,55].
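A minimal sketch of the accelerated map and of the parity record of an orbit is given below. The helper names are hypothetical, and the check uses the classical fact that the first k parities under the accelerated map are in one-to-one correspondence with the residues of $n_0$ modulo $2^k$; whether this coincides exactly with the conditions stated in appendix A is an assumption.

```python
def accelerated_step(n):
    """Accelerated Collatz map: n/2 for even n, (3n + 1)/2 for odd n."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def parity_sequence(n0, k):
    """Parities (0 = even, 1 = odd) of n_0, n_1, ..., n_{k-1}."""
    bits, n = [], n0
    for _ in range(k):
        bits.append(n % 2)
        n = accelerated_step(n)
    return bits


# Over one full set of residues modulo 2**k, every binary word of length k
# shows up exactly once, so odd and even entries split 50%-50%.
k = 6
words = [tuple(parity_sequence(n0, k)) for n0 in range(1, 2**k + 1)]
all_words_distinct = len(set(words)) == 2**k
fraction_odd = sum(sum(w) for w in words) / (k * 2**k)
```

The balanced split is exact only for the first k steps of such a complete collection; beyond k steps it remains a hypothesis, as stated in the text.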

m-vectors representation for the Collatz dynamics
A very useful way to describe the successive Collatz map iterations is through a particular version of the p-adic representation. Thus, here we just discuss the basic idea and then apply it to the Collatz map. For comprehensive reviews of p-adics see, e.g. [12,13,69].
Suppose any $a \in \mathbb{Z}$ and p a specified prime number. For $0 \leqslant |a| < p^n$, consider the integer coefficients $\{a_0, \ldots, a_{n-1}\} \in \mathbb{Z}_p$ with $0 \leqslant |a_i| \leqslant p - 1$. So, we can write a in p-adic form as

$$ a = \sum_{i=0}^{n-1} a_i\, p^i . $$

For non-negative $a_i$'s, the set of $p^n$ combinations of the coefficients $\{a_0, a_1, \ldots, a_{n-1}\}$ is in bijection with the $p^n$ non-negative integers smaller than $p^n$. The same is valid for non-positive $a_i$'s regarding $-p^n$.

For the statistical analysis of the Collatz hailstone sequences, we typically need to investigate very long orbits, implying very large $n_t$'s. To handle these numbers, a particular version of the p-adic representation, with p = 2, has been developed in [41]. It consists in writing any natural number $n \in \mathbb{N}^+$ as

$$ n = 2^{m_1}\left(1 + 2^{m_2}\left(1 + 2^{m_3}\left(\cdots\left(1 + 2^{m_r}\right)\cdots\right)\right)\right) = \sum_{j=1}^{r} 2^{\,m_1 + m_2 + \cdots + m_j} . \tag{4} $$

Above, $m_1 = 0, 1, 2, \ldots$ (with $m_1 = 0$ if n is odd) and $m_i = 1, 2, \ldots$, for $i = 2, 3, \ldots, r$. This construction for n results in a unique m(n). Hence, for any positive natural number n, from such a protocol we find a single m-vector $(m_1, \ldots, m_r)$ of r components yielding equation (4). For full details about the advantages (as concerns the application of the Collatz map) of writing an integer as in equation (4), and for an algorithm to obtain and manipulate the m-vector representation, refer to [41]. We also mention that m(n) will be very useful to implement relevant statistical analysis for the Collatz sequences, see section 3.2. An interesting property of equation (4) is that by applying log base 2 to its right-hand side we find (with $\lfloor q \rfloor$ the integer part of q)

$$ \lfloor \log_2 n \rfloor = \sum_{i=1}^{r} m_i . \tag{6} $$

Let now $m(t) = (m_1(t), m_2(t), \ldots, m_r(t))$ be the m-vector (using the above unique representation procedure) of $n_t = \tilde{f}^t(n_0)$. We then define the M-matrix as

$$ M(t) = \begin{pmatrix} m_1(1) & m_2(1) & \cdots & m_R(1) \\ m_1(2) & m_2(2) & \cdots & m_R(2) \\ \vdots & \vdots & \ddots & \vdots \\ m_1(t) & m_2(t) & \cdots & m_R(t) \end{pmatrix}, $$
where R is the largest dimension achieved by m(t) during the hailstone sequence. Thence, for any $n_t$ having $r < R$ we set $m_{r+1}(t) = m_{r+2}(t) = \ldots = m_R(t) = 0$. Note from equation (6) that the sum of the elements in each row $\tau = 1, \ldots, t$ of M(t) is essentially the value of $\log_2[n_\tau]$. Besides the computational convenience of working with the vectors m(t) instead of directly with $n_t$, another advantage is that each hailstone sequence can be interpreted as a collection of time series, $(m_1(1), \ldots, m_1(t))$, $(m_2(1), \ldots, m_2(t))$, ..., $(m_R(1), \ldots, m_R(t))$. In this way, we can think about an ensemble of numbers $m_i(\tau)$ evolving in time (and described by the transpose of M(t), or $M^T(t)$) whose sum potentially results in a Brownian motion. Recall that exponentiation of their sum roughly gives $n_t$, which is associated with a GBM (see equation (6)).
As an example, figure 1(a) displays a heatmap representation of $M^T(T)$ for a very large initial condition, $n_0 \approx 2^{968}$, and figure 1(b) compares $\log_2[n_t]$ with the sum of the $m_i$'s at each t. As it should be, these quantities are practically the same. Further, successive blow-ups of a region delimited by a white dashed rectangle in the previous heatmap plot are depicted in figures 1(c)-(f). Such graphs help to better visualize the dynamics of the m-vector components as a function of the time steps.
For instance, some recurrent structures are observed in distinct time intervals along which only $m_1$ changes, with $m_{i>1}$ remaining fixed; indeed, contrast row i = 0 with the others in figure 1(f). These time intervals emerge when at a specific t, $n_t = n_{\rm odd} \times 2^{\tau}$ (with $n_{\rm odd}$ odd), and then $m_1(t + s) = m_1(t) - s$ and $m_{i>1}(t + s) = m_{i>1}(t)$ for $s = 1, \ldots, \tau$. On the other hand, when $n_t$ is odd, $m_i(t) \to m_i(t + 1)$ is a rather non-trivial map (which unfortunately we have not been able to obtain analytically), eventually discontinuing previous $m_i(t)$ patterns and giving rise to $m_i(t + 1)$'s resembling a pseudo-random distribution.
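As a rough illustration of this representation, the sketch below extracts an m-vector from the binary expansion of n (the first component is the position of the lowest set bit, the remaining ones are the gaps between consecutive set bits), rebuilds n from it, and assembles a zero-padded M-matrix. All function names are hypothetical, the algorithm of [41] may be organized differently, and the use of the accelerated map for the rows is an assumption.

```python
def m_vector(n):
    """Return (m_1, ..., m_r) such that n = sum_j 2**(m_1 + ... + m_j)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    m, previous, position = [], 0, 0
    while n:
        if n & 1:
            m.append(position - previous)  # gap to the previous set bit
            previous = position
        n >>= 1
        position += 1
    return m


def from_m_vector(m):
    """Rebuild n from its m-vector by accumulating the exponents."""
    n, exponent = 0, 0
    for m_i in m:
        exponent += m_i
        n += 1 << exponent
    return n


def accelerated_step(n):
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2


def m_matrix(n0, steps):
    """Rows m(1), ..., m(steps), zero-padded to the largest length R.

    Requires steps >= 1.
    """
    rows, n = [], n0
    for _ in range(steps):
        n = accelerated_step(n)
        rows.append(m_vector(n))
    R = max(len(row) for row in rows)
    return [row + [0] * (R - len(row)) for row in rows]


M = m_matrix(27, 10)
row_sums = [sum(row) for row in M]  # each equals floor(log2(n_tau))
```

Because each row sum equals the position of the highest set bit, `n.bit_length() - 1` gives a quick consistency check of the floor-of-log property.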

Tree-like structure and information restriction in the Collatz map
The Collatz orbits present a particularly curious behavior leading to a tree-like structure for the trajectories [42]. Indeed, there are infinitely many pairs of integers $n'_0 \neq n''_0$ and associated integers $t' \neq t''$ such that

$$ f^{t'}(n'_0) = f^{t''}(n''_0) . $$

Then, from this point on the orbits $O(n'_0)$ and $O(n''_0)$ will coincide. This implies that the Collatz map is not everywhere invertible, a feature potentially influencing the statistical trends of the Collatz hailstone sequences from the point of view of a deterministic dynamical system. But could this fact ascribe a random-like character to the Collatz sequences?
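The merging of orbits can be made concrete with a small sketch that locates the first common term of two hailstone sequences. The helper names are hypothetical and the example values are chosen only for illustration.

```python
def collatz_orbit(n0):
    """Orbit of the (non-accelerated) Collatz map, cut at the first 1."""
    orbit = [n0]
    while orbit[-1] != 1:
        n = orbit[-1]
        orbit.append(n // 2 if n % 2 == 0 else 3 * n + 1)
    return orbit


def first_merge(a, b):
    """Return (t_a, t_b, n) with f^{t_a}(a) = f^{t_b}(b) = n at the first
    point where the two orbits meet."""
    index_in_a = {n: t for t, n in enumerate(collatz_orbit(a))}
    for t_b, n in enumerate(collatz_orbit(b)):
        if n in index_in_a:
            return index_in_a[n], t_b, n
    raise RuntimeError("orbits did not meet")  # unreachable if both hit 1


merge = first_merge(3, 32)
```

Here the orbit of 3 needs three steps and the orbit of 32 needs one step to reach the common value 16, after which both trajectories are identical.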
To clarify the issue, consider the accelerated Collatz map $\tilde{f}$. The eventual loss of information in a specific step can be quantified from the probability that $n_t$ completely determines its predecessor $n_{t-1}$ [70]. Note that, for m any natural number, if $n_t = 3m + 2$ then $n_{t-1}$ could be either even, $n_{t-1} = 2 n_t = 6m + 4$, or odd, $n_{t-1} = 2m + 1$, since $(3(2m + 1) + 1)/2 = 3m + 2$. But for $n_t = 3m$ or $n_t = 3m + 1$, necessarily $n_{t-1} = 2 n_t$ is even. In this way, one can define $p(n_t)$ as the probability of $n_{t-1}$ being determined from $n_t$, or (for arbitrary natural m)

$$ p(n_t) = \begin{cases} 1/2 & \text{if } n_t = 3m + 2, \\ 1 & \text{if } n_t = 3m \text{ or } n_t = 3m + 1 . \end{cases} $$

Thence, the information contained in the step t is [71]

$$ I(n_t) = -\log_2 p(n_t), $$

with the Shannon entropy of the hailstone sequence $O(n_0)$ reading

$$ H = -\sum_{t=1}^{T} p(n_t) \log_2 p(n_t), $$

where T is the stopping time of the hailstone sequence. Since just $p(n_t) = 1/2$ accounts for information loss in the inverse map, one gets

$$ H = \frac{N_{3m+2}}{2} . $$

Here $N_{3m+2}$ is the number of occurrences of $n_t = 3m + 2$ along the orbit. Given that only one in every three positive integers is of this form, for T large enough a proper estimation leads to $N_{3m+2} \sim T/3$, so that $H \approx T/6$.
Comparing H with the upper bound entropy value, i.e. that of a hypothetical series with all $p(n_t) = 1/2$ and so $H_{\max} = T/2$, we have

$$ \frac{H}{H_{\max}} \approx \frac{1}{3} . $$

Therefore, no more than about 1/3 (in many cases less than such a fraction) of the information cannot be recovered from a backtracking process in the accelerated hailstone sequences [72]. However, as concerns the purely dynamical features of the map, the present amount [71,73] and kind (non-invertibility) [72] of information restriction in the Collatz map should not be enough to induce a stochastic-like trend in the statistical properties of ensembles of orbits.

The statistical analysis characterizations
To study the statistical properties of the present arithmetic dynamical system, we choose 'samples' of orbits, interpreted as time series, and then perform specific statistical analyses: power spectrum, DFA, autocorrelation, cryptography (pseudo-random bits) tests, and entropy measure.
Regarding typical features of the hailstone sequences, we remark the following. Along an overall decreasing trend (see a proper characterization in section 3.1), they are usually not structured or biased, tending to display fairly heterogeneous patterns of up and down variations for the $n_t$'s. Actually, from many numerical tests, one finds that rather organized sequences of numbers, although existing, are not frequent in a sample of arbitrarily chosen initial $n_0$'s. By organized we mean orbits presenting sub-sequences of very particular lengths or a very repetitive succession of ascents and descents. Hence, most of the hailstone sequences are 'typical' ones.
To create the orbits, the protocol follows specific ways to generate the m-vector components of each $n_0$, to be used in equation (4), but always with $m_1 = 0$ so that the $n_0$'s are always odd numbers. All the $m_i$'s can be selected randomly, then referred to as the random type, or following a given characteristic, i.e. all being even, odd, prime, oscillating (sine or cosine function-like distributions, thus with a maximum and a minimum value), Pascal triangle distributed, or linear (fitting a straight line with either a positive or negative slope). Also, for these structured m-vectors, leading to initial $n_0$'s of a certain type, we consider a blocksize parameter b.
This parameter determines the number b of successive exponents $m_i$ forming a motif. For instance, if we have an initial condition of odd type with b = 4, it means that we can have $m_2 = 1$, $m_3 = 3$, $m_4 = 5$, $m_5 = 7$, then $m_6 = 1$, $m_7 = 3$, $m_8 = 5$, $m_9 = 7$, etc. In appendix B we give full details about the exact manner in which the $n_0$'s are generated.
Finally, we perform our statistical analysis assuming two kinds of time series, viewed as 'data': those obtained directly from the $n_t$'s and those obtained from their m-vector representation. For each type of m-vector, relevant information about the associated sample orbits is given in table B1 of appendix B. For all the cases put together, we have investigated roughly $10^3$ extremely long Collatz hailstone sequences.
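The motif construction in the example above can be sketched as follows. This only mirrors the b = 4 odd-type illustration given in the text; the function names are hypothetical and the actual generation procedure of appendix B may differ.

```python
def structured_m_vector(motif, r):
    """m_1 = 0 (so that n_0 is odd), then the motif repeated cyclically
    until the vector has r components; len(motif) plays the role of b."""
    return [0] + [motif[i % len(motif)] for i in range(r - 1)]


def n_from_m_vector(m):
    """n = sum over j of 2**(m_1 + ... + m_j)."""
    n, exponent = 0, 0
    for m_i in m:
        exponent += m_i
        n += 1 << exponent
    return n


# Odd type with blocksize b = 4: the motif (1, 3, 5, 7) fills m_2, m_3, ...
m0 = structured_m_vector([1, 3, 5, 7], r=9)
n0 = n_from_m_vector(m0)
```

With r = 9 the resulting initial condition is an odd number whose highest set bit sits at position 32, the sum of the nine exponents.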

Statistics quantification for the hailstone sequences
Perhaps a natural first analysis for the Collatz orbits, regarding statistical properties, should be to verify how close they are to actual random sequences. With such an aim, we can use certain reliable cryptography protocols, like the frequency (monobit), poker and runs tests, see, e.g. [74]. The key idea is to transform a given time series into a long string of zeros and ones (the so-called associated parity sequence, appendix A), and then to look at different aspects of the distribution of these 0's and 1's, calculating proper statistical quantifiers X. Lastly, for an ensemble of parity sequences, arising from an ensemble of time series, we can generate a $\chi^2(X)$ distribution and compare it with the expected results for the random case. The specificities of such an approach, as well as the concrete definitions and computations, are presented in appendix C.
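To give a flavor of such quantifiers, the sketch below implements the textbook forms of the monobit and poker statistics for a string of bits. These are the standard definitions found in the applied cryptography literature; that they coincide with the exact definitions of appendix C is an assumption, and the function names are hypothetical.

```python
from collections import Counter


def monobit_statistic(bits):
    """X = (n0 - n1)**2 / n, approximately chi-square with 1 degree of
    freedom for a random sequence of n bits."""
    n = len(bits)
    ones = sum(bits)
    zeros = n - ones
    return (zeros - ones) ** 2 / n


def poker_statistic(bits, m):
    """X = (2**m / k) * sum_i n_i**2 - k over k = n // m non-overlapping
    blocks of length m; approximately chi-square with 2**m - 1 degrees
    of freedom for a random sequence."""
    k = len(bits) // m
    counts = Counter(tuple(bits[i * m:(i + 1) * m]) for i in range(k))
    return (2 ** m / k) * sum(c * c for c in counts.values()) - k


def parity_bits(n0):
    """Parity sequence of the accelerated orbit of n0 down to 1."""
    bits, n = [], n0
    while n != 1:
        bits.append(n % 2)
        n = n // 2 if n % 2 == 0 else (3 * n + 1) // 2
    return bits


bits = parity_bits(2**200 + 1)
x_monobit = monobit_statistic(bits)
x_poker2 = poker_statistic(bits, 2)
```

For an ensemble of orbits, the histogram of such X values would then be compared with the corresponding chi-square density, as described in the text.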
Obviously, the hailstone sequences interpreted as time-series are non-stationary.Actually, by plotting the orbits in a semi-log scale (as in figure 1), we observe an overall exponential-like decreasing towards the unit.Hence, to derive a stationary or undrifted time series, we consider the following transformation.Given an orbit, we fit it through The parameter γ 2 is obtained by taking only the initial and final points, so that no regression, least squares, spline, etc, is applied.So, we can interpret γ 2 as the full orbit relaxation exponent.In this way, a stationary series results from Note that the elements of ∆ n0 represent the deviations from the global drift of the Collatz map.In other words, ∆ n0 is a deviation-from-drift time series (DDs).
In our numerical analysis we discuss the specific values of the pair α and β, verifying whether or not equation (16) is observed according to the classes of initial conditions. We further calculate the average values of these exponents, denoted as ⟨α⟩ and ⟨β⟩, by computing simple means of α and β for the hailstone sequences considered.
We obtain the ACF, denoted by C(τ ), as the following.For I T representing the mean of DDs increments and supposing a discrete time lag τ (τ = 0, 1, . . ., T), the ACF reads (17)
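A common sample estimator of an autocorrelation function, normalized so that C(0) = 1, is sketched below together with the simple i.i.d. reference threshold of three standard deviations, 3/√T. The normalization may differ from the authors' equation (17), and the function names are hypothetical.

```python
import math


def autocorrelation(x, max_lag):
    """Sample ACF C(0), ..., C(max_lag) of the series x, with C(0) = 1.

    Requires max_lag < len(x).
    """
    T = len(x)
    mean = sum(x) / T
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("constant series has no defined autocorrelation")
    return [
        sum(dev[t] * dev[t + lag] for t in range(T - lag)) / c0
        for lag in range(max_lag + 1)
    ]


def significant_lags(x, max_lag, n_sigma=3.0):
    """Lags whose |C(tau)| exceeds n_sigma / sqrt(T), the i.i.d. reference.

    The absolute value is an illustrative choice; a one-sided criterion
    C(tau) > n_sigma * sigma can be used instead.
    """
    threshold = n_sigma / math.sqrt(len(x))
    acf = autocorrelation(x, max_lag)
    return [lag for lag in range(1, max_lag + 1) if abs(acf[lag]) > threshold]


alternating = [1, -1] * 50
acf = autocorrelation(alternating, 2)
```

A strictly alternating series is a convenient sanity check, since its autocorrelation is close to -1 at lag 1 and close to +1 at lag 2.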

Statistics quantification for the m-vectors
The m-vector representation for the hailstone sequence terms $n_t$ allows a proper analysis of correlations. Indeed, each exponent $m_i(t)$ can be interpreted as a time series, and so a Pearson correlation matrix R can be created relating $m_{i'}$ and $m_{i''}$ pairwise. In addition, from R we can employ a recently developed method (see below) to ascribe an entropy measure to the full set $\{m_i(t)\}$.
Let $m_i = (m_i(1), m_i(2), \ldots, m_i(s))$ be the time series of the first s steps of the ith m-vector component. Define $M_i = m_i - \langle m_i \rangle$ as the deviation from the mean $\langle m_i \rangle$, with variance $\sigma_i^2 = \langle M_i^2 \rangle$. The correlation between the i'th and i''th components is given by the Pearson correlation coefficient [85]

$$ R_{i'i''} = \frac{\langle M_{i'} M_{i''} \rangle}{\sigma_{i'}\, \sigma_{i''}} . \tag{18} $$

From the first N components of the m-vectors, an N × N Pearson correlation matrix R can be calculated with entries $R_{i'i''}$ given by equation (18).
To obtain an entropy measure from R, we consider the following. We write ρ = R/N. This matrix is Hermitian (in fact real symmetric), positive semi-definite (i.e. the eigenvalues are non-negative) and has unit trace. So, as shown in [86], a well-defined entropy measure is given by

$$ S = -\sum_{j=1}^{N} \lambda_j \ln \lambda_j , $$

where $\lambda_j$ is the jth (from a total of N) eigenvalue of ρ. If the time series are completely uncorrelated, $R_{i'i''} = \delta_{i'i''}$ and all eigenvalues are $\lambda_j = 1/N$, leading to the maximum value $S_{\max} = \ln[N]$. Otherwise $S < S_{\max}$, indicating some degree of correlation between the time series.
Here we remark that, by considering equation (6), the log of the Collatz orbit elements at each time step is given as a sum of r variables. Then, if the time series of the distinct $m_i$'s were uncorrelated, this would mean that these variables could be associated with independent processes, and a kind of central limit theorem would hold true for $\ln[\tilde{f}^t(n_0)]$ (since it results from a sum of independent random variables). But in fact the $m_i(t)$'s originate from a deterministic dynamical rule. Therefore, our correlation and entropy analysis aims to identify their degree of interdependence.
From the above, we must define a sample of N time series, all of size s. For the numerical ranges used in the present work, we have verified that taking the first N = 100 exponents $m_i$ already leads to representative S values for the full m-vector. For each orbit, the correlation matrix and then the entropy are computed for different values of s. We have numerically found that in the great majority of the cases, the obtained entropy values are higher for s > T/2. So, we have varied s from T/2 to T with steps of size 0.05 T. Lastly, the maximum entropy value determined in such s range is selected as the orbit entropy.
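As a small worked illustration of this entropy measure (a sketch for the simplest case N = 2, with a single correlation coefficient r introduced here only for illustration), the matrix ρ and its eigenvalues are

```latex
\rho = \frac{1}{2}
\begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix},
\qquad
\lambda_{\pm} = \frac{1 \pm r}{2},
\qquad
S(r) = -\frac{1+r}{2}\,\ln\!\frac{1+r}{2}
       -\frac{1-r}{2}\,\ln\!\frac{1-r}{2} .
```

For r = 0 both eigenvalues equal 1/2 and S = ln 2, the maximum for N = 2, whereas for r → ±1 one eigenvalue tends to 1 and the other to 0, so that S → 0. The entropy thus decreases monotonically as the two series become more strongly (anti)correlated.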

Results and discussion
The distinct cryptography tests (monobit, poker 2, 3, 4 and runs) used to infer the similarity of the Collatz orbits with random sequences are shown in figure 2. For all of them, we compare the associated $\chi^2_d$ distribution of a true ensemble of random series, equation (C.4), with the numerical calculations for a representative collection of $O(n_0)$'s, generated from the full set of initial $n_0$'s discussed in appendix B. Note that regardless of the test, the agreement is always fairly good. We mention that we have generated many other examples, of $10^3$ sequences each, consistently obtaining curves similar to those in figure 2.
In appendix A we summarize formal findings in the literature, establishing a partial (i.e. k-truncated) pseudo-randomness of Collatz-generated parity sequences of zeros and ones. Nonetheless, they are not enough to guarantee that collections of Collatz orbits can display features of GBM. Of course, the plots in figure 2 are not definitive rigorous results. But given the way we sample the orbits (appendix B), and since five different cryptography tests point in the same direction, to date ours constitutes one of the strongest indications that the Collatz map leads to fully pseudo-random parity vectors (thus, much beyond k-truncated). This is a necessary condition (appendix C) for ensembles of Collatz hailstone sequences to be […]

The mean values of the power spectra and DFA exponents for the increments of the DDs (over our full sample of orbits) are ⟨β⟩ = −0.002206 and ⟨α⟩ = 0.4895, hence 2⟨α⟩ − ⟨β⟩ ≈ 0.9812. Note that these averages are in fair agreement with the relation in equation (16). By separating the time series into types, according to the specific feature of the corresponding m-vectors, the resulting ⟨β⟩ and ⟨α⟩, together with their respective standard deviations, are summarized in table 1. In figures 3(a) and (b) we display examples of power spectrum and DFA curves for typical orbits. Remarkably, they present the characteristic behavior of white noise time series (for the F(ℓ) of a non-typical orbit resulting from a very special $n_0$, see the inset of figure 3(b)). Box-plots for the distribution of the β and α exponents for each initial condition type are depicted in figures 3(c) and (d).
Naturally, the orbits with random initial m-vector components should be expected to be the most unbiased possible sample. But if the successive iterations of the Collatz map somehow wash out information about the specificities of the initial $n_0$ (provided it is large enough and not too specific, e.g. $2^{m_1}$), then we can test a null hypothesis. We can ask whether the more structured types of initial conditions display the same distributions of α and β as the random type. A t-test is then performed using the package HypothesisTests.jl (for the Julia language) to compare the α and β distributions of the distinct cases.
For the prime type, the p-value for α (β) is 0.086 (0.78). Although relatively small for α, it is still above the commonly adopted threshold of statistical significance, p = 0.05. Of course, for β it is considerably high. Thus, the prime type has no statistically significant difference from the random type. The same is true for the odd and even types. For the former, the p-values are 0.44 and 0.65 for β and α, respectively. For the latter, they are 0.13 and 0.067, so smaller than for the odd case, but still larger than the common standard of statistical significance. For the oscillatory, Pascal and linear types, the p-values for α (β) are 0.09 (0.22), 0.11 (0.0078 < 0.05) and 0.65 (0.10), respectively. The only case with a statistically significant difference from the random type is the β exponent for the Pascal type.
This single discrepant case (Pascal) is better understood with the help of table 1 and figures 3(c) and (d). From table 1 we see that $\sigma_\beta$ and $\sigma_\alpha$ for the Pascal type are the largest ones. This is corroborated by comparing the variability of the β's and α's observed in figures 3(c) and (d). As a consequence, it is more usual for Pascal-selected $m(n_0)$'s to present non-typical orbits (although still rare enough for our statistical analysis purposes). For example, consider the Collatz sequence giving rise to the largest α ≈ 0.84 in figure 3(d). The resulting F(ℓ) is depicted in the inset of figure 3(b). We observe that it refers to a very specific initial condition with $m(n_0) = (0, 1, 1, \ldots, 1, 1)$ for r = 2100, which is the Mersenne number $2^{2100} - 1$. However, for the other orbits of this type, there is no great dispersion of the corresponding α's.
The situation is a bit distinct for the power spectrum exponent β, with $P(f) \sim f^{-\beta}$, as one readily realizes from its Pascal box-plot in figure 3. Note that the frequencies f associated with a given orbit will obviously depend on its particular shape. So, it implies that the set of hailstone sequences whose $m_i$'s of $n_0$ are selected from the Pascal triangle tends to show more diverse patterns of trajectories than those in the other m-vector types. For instance, by direct inspection (results not shown) we have found that for the Pascal type there is a certain number of orbits displaying very long stretches of successive ascents or descents, thus yielding β's which deviate considerably from the average value (a characteristic much rarer in the other cases). This explains the presence of some 'outliers' among the orbits for $n_0$'s generated by the Pascal triangle.
Hence, despite the one atypical case illustrated above, our detailed numerical investigations for the full collection of orbits strongly support the following:
(i) The increments of the DDs display a behavior akin to white noise, characterized by β = 0 and α = 1/2. This indicates that the DDs are similar to a Brownian-like motion. Hence, recalling that the DDs are in logarithmic scale (see section 3.1), we can conclude that the hailstone sequences follow the general trend of GBM (a process whose logarithm is Brownian) with drift.
(ii) The distributions of the power spectrum and DFA exponents β and α for random and structured initial m-vectors tend to be comparable. This reinforces the idea that regardless of the initial condition $n_0$ (if large and 'typical' as previously explained), the Collatz map generates similar global statistical patterns for the hailstone sequences.

[Figure 4 caption, beginning lost in extraction] ... (arbitrary, i.e. not necessarily i.i.d.) process whose standard deviation σ is given by 1/√N (by Bartlett's scheme, equation (20)), the cases where C(τ) > 3σ ensure that the autocorrelation is significant with more than 99.7% certainty. These instances are highlighted in all the plots. For b = 2 and b = 3, the curves indicate short- to mid-range correlations for the structured initial $n_0$'s. On the other hand, for b = 5 this is not observed for any type of $n_0$. The same trends are likewise found for the other structured types, but not shown here.
We should remark that (i)-(ii) above are in agreement with certain distinct analytical findings in the literature [42,45,62], but here reached through a completely different point of view.
The pseudo-random bits (cryptography) tests, power-spectrum and DFA computations indicate properties of (pseudo-) random processes for an ensemble of Collatz orbits.However, the Collatz map is a deterministic rule and somehow this must also be manifested in the hailstone sequences statistical characteristics.For example, if the increments time series were very narrowly linked to white noise, the ACFs should be rather small for any τ > 0 since for pure white noise C WN (τ = 0) = 1 and C WN (τ > 0) = 0.By calculating C(τ ) for a large number of orbits (representative examples are depicted in figure 4), we have identified some recurrent behavior.Indeed, as seen in all plots in figure 4, generally for our time series samples we have an initial very rapid decay for C(τ ), but not displaying exponential or power-law profiles.Furthermore, some considerable C(τ ) > 0 values are observed for not too long τ 's.
To better characterize the statistical significance of these auto-correlations, we need to estimate the standard deviation, σ_C, of C for each τ. We do so in two different manners. First, recall that for time series of size T which are independent and identically distributed (i.i.d.), simply σ_C^iid = 1/√T. Of course, this would be too simplistic an assumption for our deviation-from-drift sequences. Nonetheless, σ_C^iid is useful as a standard reference. A second, far more appropriate estimation is given by Bartlett's scheme [81], equation (20). In that expression one must set C(t < 0) = C(t > T) = 0. Note that we recover the i.i.d. condition, namely, 1/√T, for the pure white noise case.
Figures 4(a), (d) and (g) show the ACF for some time series of the random type. These plots indicate that the most unbiased sample of orbits displays almost no significant auto-correlations, resembling white noise. Conversely, for the time series with structured initial conditions (of prime and even types), examples of the resulting auto-correlations are depicted in figures 4(b), (c), (e) and (f), for which the BlockSize parameter is b = 2 and b = 3 (see the beginning of section 3). Now, significant short- to mid-range correlations can be identified. This points to the existence of mild deviations from the global GBM behavior, a possibility which has been hypothesized in [41]. In fact, when we examine the Collatz hailstone sequences considering specific initial conditions (for instance, primes with b = 2 in figure 5), we see plateaus of periodic ascents and descents along the orbits up to a certain time. Certainly, this is a departure from the GBM and explains the presence of auto-correlations. But interestingly, in figures 4(h) and (i), for which b = 5, the number of peaks in the auto-correlations is similar to that for n_0's of the random type. This distinction between b's will be explained below.
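The i.i.d. reference test described above can be illustrated with a minimal sketch. It assumes the usual normalized sample auto-correlation (the paper's own definition of C(τ) is given in section 3.2 and may differ in detail), uses only the simple reference σ_C^iid = 1/√T rather than Bartlett's scheme, and flags both positive and negative excursions beyond 3σ. The function names `acf` and `significant_lags` are hypothetical.

```python
import math

def acf(x, max_lag):
    """Normalized sample auto-correlation C(tau) for tau = 0..max_lag.

    Assumption: standard estimator, sum of lagged products of the
    mean-subtracted series divided by the total sum of squares.
    """
    T = len(x)
    mean = sum(x) / T
    dev = [v - mean for v in x]
    var = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + tau] for t in range(T - tau)) / var
        for tau in range(max_lag + 1)
    ]

def significant_lags(x, max_lag, n_sigma=3.0):
    """Lags tau > 0 with |C(tau)| > n_sigma * sigma_iid, sigma_iid = 1/sqrt(T).

    Assumption: a two-sided criterion; the text states it as C(tau) > 3 sigma.
    """
    sigma_iid = 1.0 / math.sqrt(len(x))
    c = acf(x, max_lag)
    return [tau for tau in range(1, max_lag + 1)
            if abs(c[tau]) > n_sigma * sigma_iid]

# A strictly periodic toy series (period 4) has strong correlations
# at even lags and none at odd lags.
toy = [1, 0, -1, 0] * 25
print(significant_lags(toy, 4))
```

For a white-noise-like series, almost no lag should be flagged, which is the situation reported for the random type; structured initial conditions with small b correspond to the periodic toy case.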
Although not the goal of the present contribution, here we shall mention a link between the Collatz GBM-like trends and the Collatz conjecture itself. Conceivably, the conjecture could be confirmed by rigorously mapping the hailstone sequences to a random walk model, specifically a GBM with an absorbing barrier (for extra discussions see, e.g. [41,42]). This would be so given the well known recurrence property of certain random walks in one and two dimensions [47]: due to the mapping, recurrence should then also be valid for the Collatz orbits. But this potential demonstration scheme cannot work if the observed correlations prevent the Collatz sequences from being formally associated to random walks. However, the peaks in the auto-correlations in figure 4 only appear for some specific structured initial conditions and when the BlockSize parameter b (see appendix B) is small, b = 2 and b = 3. Indeed, for the case of b = 5 in figures 4(h) and (i), the auto-correlations are not significant (we have tested other larger b's, finding the same absence of correlations). Such results indicate that orbits deviating from GBM are rare, representing only fluctuations around β = 0 and α = 1/2, see figure 3. Moreover, even when present, the correlations are not strong enough to avert the overall GBM behavior. So, we speculate this is not a problem for a proof using the aforementioned strategy.
To better characterize the peaks in the plateau-like regions of the auto-correlation plots in figure 4, we analyze the M-matrix, equation (7), in terms of a heatmap representation. Figure 5 displays the entire hailstone sequences (in log_2-linear scale) for two prime type initial conditions, with b = 2 and b = 5, as well as the heatmaps of the corresponding M^T matrices. Details of the dynamics for the first 250 steps are also presented in figures 5(b) and (d), for b = 2, and figures 5(f) and (h), for b = 5. The descent staircase-like pattern in figure 5(b), absent in figure 5(f), results from a somewhat regular sequence of ups and downs in the evolution of the associated m_i(t)'s, figure 5(d). This is a general tendency for small b's and time steps up to t ≈ r (as numerically checked for many other examples, but not shown here). Note that such a trend is directly associated to the peaks observed in figure 4, appearing for steps up to 200 (recall that in these cases r = 180). As t increases, there is a sort of 'thermalization' of the m_i(t)'s, due to the Collatz map action, and the GBM-like behavior with almost no deviations takes place. For greater b's, e.g. b = 5 in figures 5(g) and (h), even for an n_0 of structured type, the number of short- to mid-range auto-correlation peaks considerably diminishes, see the last row of figure 4. Indeed, intuitively speaking, the initial m_i's (of n_0) are more 'randomized' if b is larger. Nevertheless, some structures (difficult to identify in orbit plots and auto-correlation analysis) are still observed in the heatmaps, figures 5(g) and (h).
As a final simple example, supporting the above discussion, we consider a rather large r = 2100 for m(n_0) = (0, 1, 2, 1, 2, ...), which is basically the most trivial oscillatory type initial condition having b = 2. The orbit heatmap is depicted in figure 6(a). The regular oscillations of the m_i(t)'s up to t ∼ r are clearly seen. Thence, as expected from the previous considerations, the ACF analysis in figure 6(b) shows many relevant auto-correlation peaks for τ up to r.
The above results indicate that the m-vector evolution under f is strongly related to the statistical characteristics of the Collatz orbits. This fact can be made explicit by computing a proper entropy measure for the time series originated from the m-vector components. This is relatively easy to implement from the method recently proposed in [86] and discussed in section 3.2. So, we calculate the Pearson correlation matrix R for the first N = 100 components of the m-vectors (refer to equation (18)), thus generating matrices of dimensions 100 × 100. For each matrix, we use equation (19) to obtain the resulting entropy. The time step span for these correlation matrices is chosen so as to maximize the final value of S (see section 3.2).
In figures 7(a)-(f) we give examples of the R matrix for six distinct n_0 structured types, considering rather large r's, 2100 and 1000, and b equal to 2 and 3. Although not displayed, we mention that for n_0's of the random type, the resulting R matrices resemble those for random temporal series processes. Indeed, in appendix D we present an analysis of the distribution PDF(R) of the values of the off-diagonal elements of the Pearson correlation matrix. We show that for the random type, the PDF(R) is similar to the symmetric Laplace distribution. For the structured types, the resulting PDF(R)'s are skewed, with R_ij > 0 (but relatively small) tending to be more probable.
The distributions of the normalized entropy S̄ for each initial condition type are displayed in figure 8 as box-plots. For a given type, we calculate S and then define S̄ = S/S_max, with S_max = ln[N], which for N = 100 gives S_max ≈ 4.60517. For all the types, the relative entropy averages, together with their standard deviations, are summarized in table 1. The mean relative entropy for the structured types is always lower than for the random type. By considering all the orbits in all our samples, we have that the mean is S̄ = 0.9793, which is close to the maximum possible S̄_max = 1. This points to fairly uncorrelated m-vectors after long enough time iterations of the Collatz map. Further, the entropy distributions for the structured n_0's can be contrasted, by means of a t-test, with the null hypothesis of the random type case. For all six structured types, i.e. prime, even, odd, Pascal, oscillatory and linear, p < 0.05, meaning that the null hypothesis is rejected with statistical significance and therefore such distributions are different from that of the random type. This is, of course, in agreement with the comparisons between the PDF(R)'s made in appendix D.
Finally, the above numerical findings lead to a heuristic, yet qualitatively sound, justification for the GBM-like character of the Collatz hailstone sequences. To a good extent, the m-vector components can be regarded, after some t, as approximately following the central limit theorem. But given equation (6), the logarithm of n_t is a sum of these m_i(t)'s, thus a sum over reasonably uncorrelated variables (in the sense described along this work). Hence, given that Brownian motion may take place from the combined action of random forces (see, e.g. [87]), the mentioned parallel arises.

Collatz sequences do not comply with most of the standard features of chaotic behavior
It is a known fact that chaotic systems can generate time series resembling those from pure stochastic processes. An important example is the continuous variable-discrete time logistic map x_{t+1} = r x_t (1 − x_t) in the particular case of parameter value r = 4 [88]. Further, signals from chaotic synchronization or quasi-periodic dynamics can also exhibit colored noise [89][90][91]. In this way, a pertinent question is whether or not the behavior previously observed in section 4 could be attributed to an eventual underlying (in opposition to explicit) non-linearity of the Collatz map, thus linked to chaos.
We start by observing that, regarding synchronization or quasi-periodicity, the Collatz sequences do not present (at least considering the already tested n_0's in the literature [60] as well as the important result in [61]) quasi-recurrences or cycles other than {1, 4, 2}. In addition, a global definition of chaos for arithmetic dynamical systems is apparently lacking, except in very specific contexts as in [92]. Hence, we pragmatically adopt one of the most broadly used characterizations of a 'traditional' chaotic dynamical system M : J → J [93]. Then, we address how some properties of the accelerated Collatz map, f : N+ → N+, compare with those from a generic M.
The mapping M : J → J is said to be chaotic if [93]: (i) periodic points of M are dense in J; (ii) M is topologically transitive, i.e. for any pair of open sets U, V ⊂ J, there exists an integer k > 0 such that M^k(U) ∩ V ≠ ∅; (iii) M has a sensitive dependence on initial conditions, namely, there is a δ > 0 such that, for any x ∈ J and any neighborhood N of x, there are y ∈ N and an integer k > 0 such that |M^k(x) − M^k(y)| > δ. Often it is very hard to show that a certain M is indeed chaotic through formally proving (i)-(iii). Instead, usually one infers the character of M by numerically investigating the aforementioned conditions. Surprisingly, for the Collatz map it is straightforward to rule out (i) and (ii). However, to study (iii) we will rely on numerical simulations.
The requirement (i) obviously would be violated if the Collatz conjecture were true. Indeed, then N+ − {1, 2, 4} should have no periodic points. Nonetheless, the findings in [61] already provide a large subset of N+ with no periodic orbits. Thus, (i) cannot stand. For (ii), we recall that for any non-negative integer m, if n_t = 3m + 2 then n_t is the convergence point of two branches of the Collatz tree-like structure, for which n_{t−1} is either 2(3m + 2) or 2m + 1, section 2.2. We name as B_e [B_o] the even [odd] branch of n_t, corresponding to the full collection of integers starting at arbitrarily large n_0^e [n_0^o] and ending, by means of successive applications of f, at 2(3m + 2) [2m + 1]. Note there are infinitely many pairs B_e and B_o. Now, by identifying V = B_e and U = B_o, no points in V can be reached from points in U through the action of f (and vice versa), see figure 9(a). Consequently, (ii) does not hold.
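The branching at n_t = 3m + 2 can be checked directly. The sketch below assumes the accelerated map f(n) = n/2 for even n and f(n) = (3n + 1)/2 for odd n, under which the even preimage of n is 2n and an odd preimage (2n − 1)/3 exists only when n = 3m + 2, in which case it equals 2m + 1. The helper name `preimages` is hypothetical.

```python
def f(n):
    """Accelerated Collatz map (assumed form): n/2 if n is even, (3n+1)/2 if odd."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

def preimages(n):
    """All positive integers p with f(p) == n.

    The even preimage 2n always exists. An odd preimage p solves
    (3p + 1)/2 = n, i.e. p = (2n - 1)/3, which is an integer
    (and then automatically odd) exactly when n = 3m + 2.
    """
    result = [2 * n]
    if n % 3 == 2:
        result.append((2 * n - 1) // 3)
    return result

# Convergence points n = 3m + 2 have two incoming branches.
for m in range(5):
    n = 3 * m + 2
    print(n, preimages(n))
```

Points that are not of the form 3m + 2 have a single (even) preimage, so the two branches B_e and B_o only ever join at such convergence points.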
To discuss (iii), we adapt the finite size Lyapunov exponent (FSLE) method of [94]. As in [94], we suppose two initial conditions n_0 and n_0 + h_0, for h_0 = 1, and compute the separation between the orbits at each iteration, or h_t(n_0) = |f^t(n_0) − f^t(n_0 + 1)|, until one of them reaches the periodic set. But when contrasted with the concrete problem in [94], the Collatz sequences display key extra features, namely, the notorious hailstone-like trait and the fact that, for large enough t's, the orbits will eventually end up close together. To incorporate them, as traditionally done [95], we assume FSLE for both the doubling [94] and halving times. Thus, we define τ_1 as the number of iterations after which, for the first time, h_{τ_1}(n_0) is either ⩾ 2 h_0 or ⩽ h_0/2. The subsequent τ_i's are such that h_{τ_i}(n_0) is either ⩾ 2 h_{τ_{i−1}} or ⩽ h_{τ_{i−1}}/2, up to a maximum τ_{N_d} associated to the time at which one of the two orbits finally gets to {1, 4, 2}. This procedure, following [94], yields a set of FSLE {λ_i}, from which we define the Lyapunov exponent λ(n_0) of O(n_0).
However, frequently we find pairs of successive integers n_0 and n_0 + 1 with n_0 ∈ B_e and n_0 + 1 ∈ B_o, for B_e and B_o as described above. Thence, necessarily their associated sequences must meet after some number of iterations, figure 9(b). In these cases, the FSLE is not suitable to measure sensitivity to initial conditions, since λ_i = λ(n_0) = −∞, as expected for a super-stable orbit. Such dynamics cannot be ignored, with coalescent trajectories being ubiquitous in the Collatz map [42]. In order to estimate the non-coalescence probability P_nc as a function of log_2[n_0], we have considered N^(k) initial conditions in the range 2^k ⩽ n_0 ⩽ 2^(k+1) (named the partition k) and counted the number of non-coalescent orbits N_nc^(k) among these N^(k). So, P_nc^(k) = N_nc^(k)/N^(k) is an assessment of the probability of non-coalescent orbits in the partition k. In our explicit simulations, for 4 ⩽ k ⩽ 17 we have taken all the 2^k points in the corresponding partition, thus N^(k) = 2^k. For 18 ⩽ k ⩽ 1000, we have randomly selected N^(k) = 2^11 = 2048 initial conditions in k. In figure 9(b) we depict the probability of non-coalescence P_nc^(k) as a function of k = log_2[n_0]. The probability decreases as a power law (in accordance with the trends unveiled in [41]) as the initial condition n_0 increases. Accordingly, on the one hand this discloses that the larger the n_0, the larger the number of pairs of consecutive integers resulting in super-stable orbits with λ = −∞. On the other hand, provided the exponent is limited to a specific range, a power law for P_nc^(k) indicates that, although becoming comparatively scarcer, non-coalescent orbits always exist regardless of n_0. In fact, this holds when the power-law exponent ν of P_nc^(k) satisfies equation (23). For example, from a fitting across 100 < k < 1000, figure 9(b) shows that ν = 0.373, verifying equation (23). Figures 9(c) and (d) present the FSLE estimator λ(n_0) for the full collection of 71 501 non-coalescent orbits in the interval 2^4 ⩽ n_0 ⩽ 2^17 and for a set of 4908 (over 2^15 tested) non-coalescent orbits in the interval 2^1000 ⩽ n_0 ⩽ 2^1001, respectively. All the FSLE in these ranges are positive, but tend to considerably decrease, approaching zero as we consider higher and higher partitions k, e.g. compare figures 9(c) and (d).
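The counting of non-coalescent pairs in a partition k can be sketched as follows. This is only an illustration under stated assumptions: it uses the accelerated map, whose cycle is {1, 2} (the text writes the periodic set as {1, 4, 2}); it calls a pair (n_0, n_0 + 1) coalescent when both orbits take the same value at the same iteration before either one enters the periodic set; and it scans the half-open range 2^k ⩽ n_0 < 2^(k+1), which contains exactly 2^k points. The names `coalesces` and `p_noncoalescent` are hypothetical and do not come from the authors' repository.

```python
def f(n):
    """Accelerated Collatz map (assumed form)."""
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

# Periodic set of the accelerated map (assumption; the text writes {1, 4, 2}).
PERIODIC = (1, 2)

def coalesces(n0, max_steps=100_000):
    """True if the orbits of n0 and n0 + 1 meet at the same iteration
    before either of them reaches the periodic set."""
    a, b = n0, n0 + 1
    for _ in range(max_steps):
        if a in PERIODIC or b in PERIODIC:
            return False
        a, b = f(a), f(b)
        if a == b:
            return True
    raise RuntimeError("no decision within max_steps")

def p_noncoalescent(k):
    """Fraction of non-coalescent pairs for 2**k <= n0 < 2**(k + 1)."""
    start, stop = 2 ** k, 2 ** (k + 1)
    n_nc = sum(1 for n0 in range(start, stop) if not coalesces(n0))
    return n_nc / (stop - start)

for k in range(4, 12):
    print(k, p_noncoalescent(k))
```

For instance, the orbits of 12 and 13 both reach 5 after three accelerated steps, so that pair is coalescent, whereas 16 reaches the cycle before ever meeting the orbit of 17. Exhaustive scanning is only practical for small k; for large partitions a random sample of initial conditions, as described in the text, would replace the full range.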
The previous results can be summarized as follows. (a) For n_0 large, but not too high, the non-coalescent orbits have positive λ's. Nevertheless, they are less common than the coalescent trajectories. (b) For higher k partitions, the density of non-coalescent sequences is even lower, with their FSLE tending to zero. Remarkably, this suggests that ensembles of orbits with extremely big n_0's represent critical states characterized by (almost) null Lyapunov exponents, in the sense proposed in [95,96]. We note that such a type of regime has previously been speculated for the Collatz map via rather distinct arguments [41]. Obviously, point (b) precludes (iii). Moreover, by also taking into account the vicinity (neighborhood) assumptions in condition (iii), likewise (a) does not support it.
Concluding, it seems that chaos, at least under a more standard point of view, cannot account for the stochastic-like behavior of the Collatz map.

Conclusion
Contrary to what it might seem, properly identifying which problems can or cannot be classified as complex systems is not such a direct task [97][98][99][100]. In fact, such difficulty has motivated some proposals for establishing basic phenomenological conditions typifying complexity (see, e.g. [101,102]). For instance, an operational approach is to focus on the expected purely technical (even formal) properties of complexity [55,56,103]. Then, one might search for minimal instances, either based on idealized physical models [104,105] or originated from plain mathematical constructions [106,107], which could constitute prototype examples for more realistic natural processes, and where characteristics commonly ascribed to actual complex systems (say, affinity, self-organization, feedback loops, hierarchical organization, etc [40,102,108]) may come about.
In [41], an association has been made between many general aspects of complex systems and the Collatz map. Further analyses, relating the resulting Collatz sequences to emergent patterns in general, have been presented elsewhere (see, e.g. [55,65,109,110]). These works highlight certain striking features that this very straightforward arithmetic dynamical system shares with complex systems, like scale invariance, SOC [40], built-in feedbacks, and surprisingly long-range power-law correlations (but in this latter case depending on fine tuned and exceptionally large initial conditions n_0 [41]).
A great challenge is therefore to comprehend why such rich dynamics can stem from such a primitive mathematically-based structure. We have shown that the Collatz sequences, despite being generated by a deterministic rule, do display strong characteristics of a stochastic process: a GBM with some extra short range correlations. But given the fact that this type of dichotomy is an important mechanism leading to dynamical diversity and behavior richness [56,[111][112][113], then the above mentioned traits for the Collatz hailstone sequences, at least in part, should be understood from our present findings.

relation with the GBM should follow. We notice that properly normalized processes from ARW yield GBM probability distributions [49].
However, it remains an open question whether the pseudo-randomness in p(n 0 ) is universal, valid beyond the first k steps.Certain partial results [45,62,63] indicate that scaled hailstone sequences are fairly log-normal distributed, but they are far from being conclusive.

Appendix B. Sampling the set of initial conditions
The process of generating initial conditions was performed in such a way as to guarantee suitable (i.e. typical and representative) orbits. The inputs for the code (freely available in the GitHub repository) that creates the initial n_0's comprise four arguments, namely,
• type (type of exponents m_i): it can be assigned as random, for random initial components, or prime, even, odd, Pascal triangle, oscillatory, and linear for structured initial components (with the exception of m_1, which is always zero, so n_0 is always odd).
We observe that from the sampling procedure parameters we can directly estimate how large a created initial condition is. We have a 0 = 2 log 2 [n0], but from equation (6) [...] Above, γ_2 ≈ 1/5 is a numerical fitting obtained by transforming the Collatz's into stationary sequences, as described in section 3.2.
For the type = random, all the m-vector components are randomly chosen, with the maximum possible value set by the MaxRand argument. For the analysis, in total 168 initial conditions of the random type have been generated, always with MaxRand = 10 so that the mean of the m_i's is 5.5. From this total of 168, for 152 (16) we have chosen mVectorSize = 180 (=2100).
For each one of the types prime, even and odd, we have considered b = 2, 3, 4, 5, generating 2! + 3! + 4! + 5! = 152 initial conditions with mVectorSize = 180. Four extra n_0's (thus, a total of 156 initial conditions), with mVectorSize = 2100 and b = 2, 3, 4 and 5, have also been created in order to have very large orbits of these kinds. For the oscillatory type, we have considered b = 2, 4, ..., 100, originating 50 sine-like and 50 cosine-like initial conditions. To guarantee that in all these cases log_2 n_0 = 3000 (so that the a_0's are not too far apart from each other), we have chosen r_0 = (log_2 n_0 · b)/Σ_j ε_j, where Σ_j ε_j = b^2/4 + b. For the sine-like building block, another eight initial conditions have been generated for b = 2, 4, 6, 8: four with r_0 = 180 and four with r_0 = 2100.
Finally, for the linear type we have set b = 3, 4, ..., 60, thus creating 57 initial conditions for the ascending and 57 for the descending linear building blocks. To fix log_2 n_0 = 2000, we have considered r_0 = (log_2 n_0 · b)/Σ_j ε_j with Σ_j ε_j = b(b + 1)/2. Another three initial conditions for the ascending pattern were generated for b = 30, 60, 180, with r_0 = 180.
All the above as well as information about the magnitude of the initial conditions and estimation of the total stopping times are summarized in table B1.

Figure 1.(a) Heatmap of M T (t) for a very large initial condition n0 (close to 2 968 ).Each column represents one time-step and each row contains the time series of the associated m-vector component.(b) For n0 of (a), the base two logarithm of the elements of O(n0) compared to the corresponding ∑ i m i (t).It illustrates the good approximation given by the equation (6).(c) Blow-up of the region marked by a white dashed rectangle in (a).(d)-(f) Blow-ups of the regions marked by white dashed rectangles in the predecessors plots (c)-(e).These zoom-ins allow to identify certain qualitative trends of the M(t) matrices, such as temporal stretches where only m1 changes, see (f).Also, at certain time scales (like a coarse-grained view), an apparent pseudo-randomness for the m-vectors is observed, (d).

Figure 2. The normalized PDFs of the corresponding quantifiers X of distinct cryptography tests for the parity of the hailstone sequences (appendix C).The dots result from the numerical calculations for a representative collection of orbits (main text).The dashed lines are the associated analytical curves (χ 2 d (X), equation (C.4)) of the random case.For the monobit d = 1 and for the poker d = 2 M − 1, where here M = 2, 3 and 4. For the runs d = 2 (L − 1), here with L = 8.All comparisons show good agreement between the parity of the hailstone sequences and the expected distributions of random series.

Table 1. For each type of m-vector, the mean value (computed over the Collatz orbit samples) of the exponents β and α, as well as the relative von Neumann entropy S. The respective standard deviations are also shown.

Figure 3. For a representative orbit of each m-vector type, plots of the power-spectrum P(f) and of the DFA F(ℓ), respectively, in (a) and (b). The behavior in (a), with the dashed line marking the value P = 0.3 ∝ f^0, indicates the usual white noise trend. The curves in (b) are also compared with ℓ^(1/2). The reasonable fittings once more point to a white noise-like behavior. Box-plot distributions of (c) β and (d) α for the sample orbits of each m-vector type. The greatest dispersion results from the Pascal type. The Pascal orbit with the largest α in (d) corresponds to m(n0) = (0, 1, 1, ..., 1, 1), so that n0 is the Mersenne number 2^2100 − 1, a rather structured initial condition. For such a sequence, the associated F(ℓ) is shown in the inset of (b). The observed 'bumps' for large ℓ's illustrate the peculiarities of the f^t(n0)'s due to this special initial n0.

Figure 4. Examples of ACFs for orbits with random (first column), prime (second column) and even (third column) initial conditions, also for b equal to 2, 3, 5, respectively, in the (a)-(c), (d)-(f) and (g)-(i) panels. For all the n0's, r = 180. For an i.i.d. (arbitrary, i.e. not necessarily i.i.d.) process whose standard deviation σ is given by 1/√N (by Bartlett's scheme, equation (20)), the cases where C(τ) > 3σ ensure that the auto-correlation is significant with more than 99.7% of certainty. These instances are highlighted in all the plots. For b = 2 and b = 3, the curves indicate short- to mid-range correlations for the structured initial n0's. On the other hand, for b = 5 this is not observed for any type of n0. The same trends are likewise found for the other structured types, but not shown here.
displays the entire hailstone sequences (in log 2 -linear scale) for two prime type initial conditions, with b = 2 and b = 5, as well as the heatmaps of the corresponding M T matrices.Details of the dynamics for the first 250 steps are also presented in figures 5(b) and (d), for b = 2, and figures 5(f) and (h), for b = 5.The descent staircase-like pattern in figure 5(b), absent in figure

Figure 8. Box-plot of the entropy distribution for (a) random and structured prime, odd, even and Pascal types, whose b parameter ranges from 2 to 6, and (b) oscillatory and linear, with b up to 100 (see appendix B).

Figure 9. (a) Schematic representation of the disjoint even Be and odd Bo branches of orbits which must concur at the point 3 m + 2 (with m non-negative integers).From this point on, under the action of the map f these orbits always coincide.(b) Average probability P (k) nc of an orbit to be non-coalescent as function of the partition number k (main text).In the present case, a power law fitting leads to an exponent ν = 0.373.The FSLE λ's are shown for all the non-coalescent orbits within 2 4 ⩽ n0 ⩽ 2 17 in (c) and for an ensemble of 2 15 orbits within 2 1000 ⩽ n0 ⩽ 2 1001 in (d).The λ's are non-negative, indicating that the non-coalescent sequences are sensitive to initial conditions.Nonetheless, λ tends to zero as n0 increases, compare (c) with (d).
• mVectorSize (variable r_0): the length of the m-vector.
• MaxRand (only when type = random): establishes the maximum random value of the m-vector components.
• BlockSize (variable b): used for structured initial components, setting the size of the building block of the m-vector. For BlockSize = b the m_i components will be formed by blocks with b elements of that type (denoted as [ε_1, ..., ε_b]), repeated mVectorSize/b times. This requires that mVectorSize be divisible by b. Further, in each code realization the ε_j's of [ε_1, ..., ε_b]: (i) are the first b integers of that type if type = prime, even, odd, but generated in an arbitrary order; (ii) are the elements of the bth row of the Pascal triangle, in an arbitrary order, when type = Pascal triangle; (iii) form a sine-like (cosine-like) building block [1, 2, ..., A − 1, A, A − 1, ..., 2] ([A, A − 1, ..., 2, 1, 2, ..., A − 1]), with A = (b + 2)/2 and b even when type = oscillatory; (iv) form an arithmetic progression from 1 to b (as well as its reverse) when type = linear.
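The block construction above can be sketched in a few lines. Two points are assumptions and not statements from this appendix: first, that an m-vector encodes the integer n = Σ_i 2^(m_1 + ... + m_i), a reading that is consistent with the Mersenne example in the caption of figure 3 and with log_2 n_0 being close to Σ_i m_i; second, that the condition m_1 = 0 is imposed by overwriting the first entry of the repeated pattern. The function names are hypothetical and are not those of the authors' repository.

```python
def build_m_vector(block, m_vector_size):
    """Repeat `block` to fill an m-vector of length m_vector_size.

    Assumption: m_1 = 0 is enforced by overwriting the first entry.
    """
    b = len(block)
    if m_vector_size % b != 0:
        raise ValueError("mVectorSize must be divisible by BlockSize")
    m = list(block) * (m_vector_size // b)
    m[0] = 0
    return m

def n_from_m(m):
    """Integer encoded by an m-vector.

    Assumption: n = sum over i of 2**(m_1 + ... + m_i), so that the
    m_i's are the gaps between consecutive 1 bits of n.
    """
    n, exponent = 0, 0
    for m_i in m:
        exponent += m_i
        n += 1 << exponent
    return n

# Prime type with b = 2, block [2, 3], and a short vector of length 4.
m = build_m_vector([2, 3], 4)
n0 = n_from_m(m)
print(m, n0, n0.bit_length() - 1, sum(m))
```

Under these assumptions n_0 is always odd, since the first term is 2^0, and the position of its highest bit equals Σ_i m_i, which is why fixing r_0 = (log_2 n_0 · b)/Σ_j ε_j controls the magnitude of the initial condition.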

Figure D1.The (normalized) probability distribution function PDF(R) of the values of the Pearson correlation matrix R off-diagonal elements for all the initial conditions in the samples for random and structured types.The random case is compared with the Laplace distribution (2 β) −1 exp[−|x − µ|/β], where µ = 0.000 9493 and β = 0.0272.