When is the average number of saddle points typical?

A common measure of a function's complexity is the count of its stationary points. For complicated functions, this count grows exponentially with the volume and dimension of their domain. In practice, the count is averaged over a class of functions (the annealed average), but the large numbers involved can produce averages biased by extremely rare samples. Typical counts are reliably found by taking the average of the logarithm (the quenched average), which is more difficult and not often done in practice. When most stationary points are uncorrelated with each other, quenched and annealed averages are equal. Equilibrium heuristics can guarantee when most of the lowest minima will be uncorrelated. We show that these equilibrium heuristics cannot be used to draw conclusions about other minima and saddles by producing examples among Gaussian-correlated functions on the hypersphere where the count of certain saddles and minima has different quenched and annealed averages, despite being guaranteed “safe” in the equilibrium setting. We determine conditions for the emergence of non-trivial correlations between saddles, and discuss the implications for the geometry of those functions and what out-of-equilibrium settings might be affected.

Random high-dimensional energies, cost functions, and interaction networks are important across disciplines: the energy landscape of glasses, the likelihood landscape of machine learning and inference, and the interactions between organisms in an ecosystem are just a few examples [1][2][3][4]. A traditional tool for making sense of their behavior is to analyze the statistics of points where their dynamics are stationary [5][6][7][8]. For energy or cost landscapes, these correspond to the minima, maxima, and saddles, while for ecosystems and other non-gradient dynamical systems these correspond to equilibria of the dynamics. When many stationary points are present, the system is considered complex.
Despite the importance of stationary point statistics for understanding complex behavior, they are often calculated using an uncontrolled approximation. Because their number is so large, it cannot be reliably averaged. The annealed approximation takes this average anyway, risking a systematic bias by rare and atypical samples. The annealed approximation is known to be exact for certain models and in certain circumstances, but it is used outside those circumstances without much reflection [9][10][11]. In a few cases researchers have instead made the better-controlled quenched average, which averages the logarithm of the number of stationary points, and find deviations from the annealed approximation with important implications for behavior [12][13][14][15][16]. Generically, the annealed approximation to the complexity is wrong when a nonvanishing fraction of pairs of stationary points have nontrivial correlations in their mutual position.
A heuristic line of reasoning for the appropriateness of the annealed approximation is sometimes made when the approximation is correct for an equilibrium calculation on the same system. The argument goes like this: since the limit of zero temperature in an equilibrium calculation concentrates the Boltzmann measure onto the lowest set of minima, the equilibrium free energy in the limit to zero temperature will be governed by the same statistics as the count of that lowest set of minima. This argument is strictly valid only for the lowest minima, which at least in glassy problems are rarely relevant to dynamical behavior. What about the rest of the stationary points?
In this paper, we show that the behavior of the ground state, or any equilibrium behavior, does not govern whether stationary points will have a correct annealed average. In a prototypical family of models of random functions, we determine a condition for when annealed averages should fail and some stationary points will have nontrivial correlations in their mutual position. We produce examples of models whose equilibrium is guaranteed to never see such correlations between thermodynamic states, but where a population of saddle points is nevertheless correlated.
We study the mixed spherical models, which are models of Gaussian-correlated random functions with isotropic statistics on the $(N-1)$-sphere. Each model consists of a class of functions $H : S^{N-1} \to \mathbb{R}$ defined by the covariance between the functions evaluated at two different points $\boldsymbol\sigma_1, \boldsymbol\sigma_2 \in S^{N-1}$, which is a function of the scalar product (or overlap) between the two configurations:

$$\overline{H(\boldsymbol\sigma_1)\,H(\boldsymbol\sigma_2)} = N f\!\left(\frac{\boldsymbol\sigma_1 \cdot \boldsymbol\sigma_2}{N}\right) \tag{1}$$

Specifying the covariance function $f$ uniquely specifies the model. The series coefficients of $f$ need to be nonnegative in order for $f$ to be a well-defined covariance. The case where $f$ is a homogeneous polynomial has been extensively studied, and corresponds to the pure spherical models of glass physics or the spiked tensor models of statistical inference [17]. Here we will study cases where $f(q) = \frac12\big(\lambda q^3 + (1-\lambda) q^s\big)$ for $\lambda \in (0, 1)$, called $3+s$ models. These are examples of mixed spherical models, which have been studied in the physics and statistics literature and host a zoo of complex orders and phase transitions [18][19][20][21][22][23][24][25].
There are several well-established results on the equilibrium of this model. First, if the function $\chi(q) = f''(q)^{-1/2}$ is convex then it is not possible for the equilibrium solution to have nontrivial correlations between states at any temperature [26]. This is a strong condition on the form of equilibrium order. Note that non-convex $\chi$ does not imply that you will see nontrivial correlations between states at some temperature. In the $3+s$ models we consider here, models with $s > 8$ have non-convex $\chi$ and those with $s \le 8$ have convex $\chi$ independent of $\lambda$. Second, the characterization of the ground state has been made [18,19,22,27]. In the $3+s$ models we consider, for $s > 12.430...$ nontrivial ground state configurations appear in a range of $\lambda$. These bounds on equilibrium order are shown in Fig. 1, along with our result for where the complexity has nontrivial correlations between some stationary points. As evidenced in that figure, correlations among saddles are possible well inside regions that forbid them among equilibrium states.
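The convexity criterion above can be probed numerically. The following is a minimal sketch, assuming the $3+s$ covariance $f(q) = \frac12(\lambda q^3 + (1-\lambda)q^s)$ with $s \ge 4$ and restricting attention to overlaps $q \in (0, 1]$. It uses the fact that, for $g = f'' > 0$, the second derivative of $\chi = g^{-1/2}$ has the same sign as $3g'^2 - 2gg''$. The function names, the grid size, and the tolerance are illustrative choices, not taken from the paper.

```python
def f2(q, lam, s):
    """f''(q) for f(q) = (lam*q**3 + (1 - lam)*q**s) / 2, assuming s >= 4."""
    return 3 * lam * q + 0.5 * s * (s - 1) * (1 - lam) * q ** (s - 2)


def f3(q, lam, s):
    """Third derivative f'''(q)."""
    return 3 * lam + 0.5 * s * (s - 1) * (s - 2) * (1 - lam) * q ** (s - 3)


def f4(q, lam, s):
    """Fourth derivative f''''(q)."""
    return 0.5 * s * (s - 1) * (s - 2) * (s - 3) * (1 - lam) * q ** (s - 4)


def chi_is_convex(lam, s, n_grid=2000, tol=1e-9):
    """Check chi''(q) >= 0 on a grid of q in (0, 1].

    With g = f'', chi = g**-0.5 gives chi'' = (3*g'**2 - 2*g*g'') / (4*g**2.5),
    so only the sign of 3*g'**2 - 2*g*g'' matters. It is divided by g'**2 > 0
    so that the tolerance is scale-free.
    """
    for k in range(1, n_grid + 1):
        q = k / n_grid
        g, g1, g2 = f2(q, lam, s), f3(q, lam, s), f4(q, lam, s)
        if 3 - 2 * g * g2 / g1 ** 2 < -tol:
            return False
    return True


if __name__ == "__main__":
    for s in (4, 8, 9, 14):
        print(s, chi_is_convex(0.5, s))
```

On this grid the check passes for $s \le 8$ at the sampled values of $\lambda$ and fails for $s = 9$ and $s = 14$ at $\lambda = 0.5$. A grid check of this kind can only suggest convexity, not prove it.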
There are two important features which differentiate stationary points $\boldsymbol\sigma^*$ in the spherical models: their energy density $E = \frac1N H(\boldsymbol\sigma^*)$ and their stability $\mu = \frac1N \operatorname{Tr} \operatorname{Hess} H(\boldsymbol\sigma^*)$. The energy density gives the 'height' in the landscape, while the stability governs the spectrum of the stationary point. In each spherical model, the spectrum of every stationary point is a Wigner semicircle of the same width $\mu_m = \sqrt{4 f''(1)}$, but shifted by a constant. The stability $\mu$ sets this constant shift. When $\mu < \mu_m$, the spectrum has support over zero and we have saddles with an extensive number of downward directions. When $\mu > \mu_m$ the spectrum has support only over positive eigenvalues, and we have stable minima. When $\mu = \mu_m$, the spectrum has a pseudogap, and we have marginal minima.
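As a small illustration of the classification above, the sketch below computes the threshold $\mu_m = \sqrt{4f''(1)}$ for a $3+s$ model and labels a stationary point by where the shifted semicircle sits relative to zero. It assumes the covariance $f(q) = \frac12(\lambda q^3 + (1-\lambda)q^s)$, so that $f''(1) = 3\lambda + \frac12 s(s-1)(1-\lambda)$. The function names and the tolerance are hypothetical, and the mirrored branch with $\mu \le -\mu_m$ is not distinguished from other saddles here.

```python
import math


def mu_marginal(lam, s):
    """Threshold mu_m = sqrt(4 f''(1)) for f(q) = (lam*q**3 + (1-lam)*q**s)/2."""
    f2_at_1 = 3 * lam + 0.5 * s * (s - 1) * (1 - lam)
    return math.sqrt(4 * f2_at_1)


def spectrum_support(mu, lam, s):
    """Edges of the semicircle: centered at mu, extending mu_m on each side."""
    mu_m = mu_marginal(lam, s)
    return mu - mu_m, mu + mu_m


def classify(mu, lam, s, tol=1e-12):
    """Label a stationary point following the convention in the text."""
    lower_edge, _ = spectrum_support(mu, lam, s)
    if abs(lower_edge) <= tol:
        return "marginal minimum"   # spectrum touches zero (pseudogap)
    if lower_edge > 0:
        return "stable minimum"     # all eigenvalues positive
    return "saddle"                 # support extends below zero


if __name__ == "__main__":
    lam, s = 0.5, 5
    print(mu_marginal(lam, s))
    for mu in (0.0, mu_marginal(lam, s), 6.0):
        print(mu, classify(mu, lam, s))
```

For $\lambda = 0.5$ and $s = 5$ one has $f''(1) = 6.5$, so the threshold is $\sqrt{26}$.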
The number $\mathcal N(E, \mu)$ of stationary points with energy density $E$ and stability $\mu$ is exponential in $N$. Their complexity $\Sigma(E, \mu)$ is defined by the average of the logarithm of their number: $\Sigma(E, \mu) = \frac1N \overline{\log \mathcal N(E, \mu)}$.
More often the annealed complexity is calculated, where the average is taken before the logarithm: $\Sigma_\mathrm{a}(E, \mu) = \frac1N \log \overline{\mathcal N(E, \mu)}$. The annealed complexity has been computed for these models [23,28], and the quenched complexity has been computed for a couple of examples which have nontrivial ground states [14]. The annealed complexity bounds the complexity from above. A positive complexity indicates the presence of an exponentially large number of stationary points of the indicated kind, while a negative one means it is vanishingly unlikely they will appear. The line of zero complexity is significant as the transition between many stationary points and none. In these models, trivial correlations between stationary points correspond with zero overlap: almost all stationary points are orthogonal to each other. This corresponds with replica symmetric (RS) order. The emergence of nontrivial correlations, and the invalidity of the annealed approximation, occurs when some non-vanishing fraction of stationary point pairs have a nonzero overlap. This corresponds to some kind of replica symmetry breaking (RSB). Here we restrict ourselves to a 1RSB ansatz, which corresponds to two kinds of pairs of stationary points: a fraction $x$ of pairs have the trivial zero overlap, and the remaining fraction $1-x$ have a nontrivial overlap $q_1$.
In the annealed or replica-symmetric case, $x = 1$ and all but a vanishing fraction of stationary points are uncorrelated with each other. Since other kinds of RSB order encompass 1RSB, we are guaranteed that $\Sigma \le \Sigma_\mathrm{1RSB} \le \Sigma_\mathrm{a}$. We will discuss later in what settings the 1RSB complexity is correct.
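The gap between the two averages can be seen in a toy calculation. The sketch below is not a computation for the spherical models: it uses an invented ensemble in which 999 of 1000 samples have $\log \mathcal N = 0.1 N$ and one rare sample has $\log \mathcal N = 0.5 N$, with $N = 100$. The annealed average is dominated by the rare sample, while the quenched average reflects the typical one, and Jensen's inequality guarantees that the annealed value is never the smaller of the two. All names and numbers are assumptions made for illustration.

```python
import math


def annealed_and_quenched(log_counts, n):
    """Return (annealed, quenched) complexities from per-sample log-counts.

    annealed = (1/n) * log(mean of counts), computed with a log-sum-exp
    shift so that huge counts never need to be exponentiated directly.
    quenched = (1/n) * mean of log(counts).
    """
    shift = max(log_counts)
    mean_scaled = sum(math.exp(l - shift) for l in log_counts) / len(log_counts)
    annealed = (shift + math.log(mean_scaled)) / n
    quenched = sum(log_counts) / len(log_counts) / n
    return annealed, quenched


if __name__ == "__main__":
    n = 100
    # Hypothetical ensemble: typical samples plus one exponentially rich outlier.
    log_counts = [0.1 * n] * 999 + [0.5 * n]
    annealed, quenched = annealed_and_quenched(log_counts, n)
    print(f"annealed = {annealed:.4f}, quenched = {quenched:.4f}")
```

In this toy ensemble the annealed value sits close to the rare sample's 0.5 even though 99.9% of samples have complexity 0.1, which is the kind of bias the quenched average avoids.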
When the complexity is calculated using the Kac-Rice formula and a physicists' tool set, the problem is reduced to the evaluation of an integral by the saddle point method for large $N$ [14]. The complexity is given by extremizing an effective action, […] for the action $\mathcal S_\mathrm{1RSB}$ given by […] where $\Delta x = 1 - x$ and […]. The details of the derivation of these expressions can be found in [14]. The extremal problem in $\hat\beta$, $r_d$, $r_1$, $d_d$, and $d_1$ has a unique solution and can be found explicitly, but the resulting formula is unwieldy. The action can have multiple extrema, but the one for which the complexity is smallest gives the correct solution. There is always a solution for $x = 1$ which is independent of $q_1$, corresponding to the replica symmetric case, and with $\Sigma_\mathrm{a}(E, \mu) = \mathcal S_\mathrm{1RSB}(E, \mu \mid q_1, 1)$. The crux of this paper will be to determine when this solution is not the global one.
It isn't accurate to say that a solution to the saddle point equations is 'stable' or 'unstable.' The problem of solving the complexity in this way is not a variational problem, so there is nothing to be maximized or minimized, and in general even global solutions are not even local minima of the action. However, the stability of the action can still tell us something about the emergence of new solutions: when a new solution bifurcates from an existing one, the action will have a flat direction. Unfortunately this is difficult to search out, since one must know the parameters of the new solution, and $q_1$ is unconstrained and can take any value in the old solution.
There is one place where we can consistently search for a bifurcating solution to the saddle point equations: along the zero complexity line $\Sigma_\mathrm{a}(E, \mu) = 0$. Going along this line in the replica symmetric solution, the 1RSB complexity transitions at a critical point where $x = q_1 = 1$ [14]. Since all the parameters in the bifurcating solution are known at this point, we can search for it by looking for a flat direction. In the annealed solution for points describing saddles ($\mu < \mu_m$), this line is […] where we have chosen the lower branch as a convention (see Fig. 2) and where we define for brevity (here and elsewhere) the constants […]. When $f$ and its derivatives appear without an argument, the implied argument is always 1, so, e.g., $f' \equiv f'(1)$. If $f$ has at least two nonzero coefficients at second order or higher, all of these constants are positive. Though in figures we focus on the lower branch of saddles, another set of identical solutions always exists for $(E, \mu) \mapsto (-E, -\mu)$.
We also define $E_\mathrm{min}$, the minimum energy at which saddle points with an extensive number of downward directions are found, as the energy for which $\mu_0(E_\mathrm{min}) = \mu_m$.
Let $M$ be the matrix of double partial derivatives of the action with respect to $q_1$ and $x$. We evaluate $M$ at the replica symmetric saddle point $x = 1$ with the additional constraint that $q_1 = 1$ and along the extremal complexity line (5). We determine when a zero eigenvalue appears, indicating the presence of a bifurcating 1RSB solution, by solving $0 = \det M$. We find […] where an auxiliary variable […] is proportional to the square-root term in (5) and the constants […] are defined by […]. Changing variables from $E$ to this auxiliary variable is convenient because the branch of (5) is chosen by its sign (the lower-energy branch we are interested in corresponds with a positive value). The relationship between the two variables on the extremal line is […], where the constants […] are given by […].

The solutions for $\det M = 0$ can be calculated explicitly and correspond to energies $E^\pm_\mathrm{1RSB}$ that satisfy […]. This predicts two points where a 1RSB solution can bifurcate from the annealed one. The remainder of the transition line can be found by solving the extremal problem for the action very close to one of these solutions, and then taking small steps in the parameters $E$ and $\mu$ until it terminates. In many cases considered here, the line of transitions in the complexity that begins at $E^+_\mathrm{1RSB}$, the higher energy point, ends exactly at $E^-_\mathrm{1RSB}$, the lower energy point, so that these two points give the precise range of energies at which RSB saddles are found. An example that conforms with this picture for a $3+5$ mixed model is shown in Fig. 2.
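The procedure of starting near a known solution and then taking small steps in the parameters is a form of numerical continuation. The sketch below shows the idea on a one-dimensional toy equation, $F(x, p) = x^3 - x - p = 0$, which is a stand-in chosen for illustration and not the saddle-point equations of the action. Each step reuses the previous solution as the initial guess for Newton's method at the next parameter value. All function names are hypothetical, and the sketch does not attempt to detect where a branch terminates.

```python
def newton(F, dF, x0, p, tol=1e-12, max_iter=50):
    """Solve F(x, p) = 0 for x by Newton's method, starting from x0."""
    x = x0
    for _ in range(max_iter):
        step = F(x, p) / dF(x, p)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")


def continue_branch(F, dF, x_start, p_start, p_end, n_steps):
    """Follow a solution branch x(p) from p_start to p_end in small steps.

    The solution at each parameter value seeds the solve at the next one,
    which keeps the iteration on the same branch as long as steps are small.
    """
    x = x_start
    branch = []
    for k in range(n_steps + 1):
        p = p_start + (p_end - p_start) * k / n_steps
        x = newton(F, dF, x, p)
        branch.append((p, x))
    return branch


def F_toy(x, p):
    """Toy equation standing in for the extremal conditions."""
    return x ** 3 - x - p


def dF_toy(x, p):
    """Derivative of the toy equation with respect to x."""
    return 3 * x ** 2 - 1


if __name__ == "__main__":
    # x = 1 solves the toy equation at p = 0; follow that root up to p = 0.3.
    for p, x in continue_branch(F_toy, dF_toy, 1.0, 0.0, 0.3, 30)[::10]:
        print(f"p = {p:.2f}, x = {x:.6f}")
```

In the multi-parameter problem described in the text the same pattern would apply with a vector of unknowns and a Jacobian in place of the scalar derivative. That generalization is not shown here.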
The expression inside the inner square root of (11) is proportional to […]. If $G_f > 0$, then the bifurcating solutions exist, and there are some saddles whose complexity is corrected by a 1RSB solution. Therefore, $G_f > 0$ is a condition to see 1RSB in the complexity. If $G_f < 0$, then there is nowhere along the extremal line where saddles can be described by such a complexity. The range of $3+s$ models where $G_f$ is positive is shown in Fig. 1.

Fig. 3 shows the range of energies where nontrivial correlations are found between stationary points in several $3+s$ models as $\lambda$ is varied. For models with smaller $s$, such correlations are found only among saddles, with the boundary never dipping beneath the minimum energy of saddles $E_\mathrm{min}$. Also, these models have a transition boundary that smoothly connects $E^+_\mathrm{1RSB}$ and $E^-_\mathrm{1RSB}$, so $E^-_\mathrm{1RSB}$ corresponds to the lower bound of RSB complexity.
Figure 2: The dashed black line shows the line of zero annealed complexity, and the region it encloses is where the annealed complexity is positive. The solid black line (only visible in the inset) gives the line of zero 1RSB complexity. The red region (blown up in the inset) shows where the annealed complexity gives the wrong count and a 1RSB complexity is necessary. The red points show where $\det M = 0$. The left point, which is only an upper bound on the transition, coincides with it in this case. The gray shaded region highlights the minima, which are stationary points with $\mu \ge \mu_m$. $E_\mathrm{min}$ is marked on the plot as the lowest energy at which extensive saddles are found.
For large enough $s$, the range passes into minima, which is expected as these models have nontrivial complexity of their ground states. This also seems to correspond with the decoupling of the RSB solutions connected to $E^+_\mathrm{1RSB}$ and $E^-_\mathrm{1RSB}$, with the two phase boundaries no longer corresponding, as in Fig. 4. In these cases, $E^-_\mathrm{1RSB}$ sometimes gives the lower bound, but sometimes it is given by the termination of the phase boundary extended from $E^+_\mathrm{1RSB}$.

There are implications for the emergence of RSB in equilibrium. Consider a specific $H$ with […] where the interaction tensors $J$ are drawn from zero-mean normal distributions with $\overline{(J^{(p)})^2} = p!/2N^{p-1}$ and likewise for $J^{(s)}$. Functions $H$ defined this way have the covariance property (1) with $f(q) = \frac12\big(\lambda q^p + (1-\lambda) q^s\big)$. With the $J$s drawn in this way and fixed for $p = 3$ and $s = 14$, we can vary $\lambda$, and according to Fig. 1 we should see a transition in the type of order at the ground state. What causes the change? Our analysis indicates that stationary points with the required order already exist in the landscape as unstable saddles for small $\lambda$, then eventually stabilize into metastable minima and finally become the lowest lying states. This is different from the picture of existing uncorrelated low-lying states splitting apart into correlated clusters. Where uncorrelated stationary points do appear to split apart, when $\lambda$ is decreased from large values, is among saddles, not minima.
A similar analysis can be made for other mixed models, like the $2+s$, which should see complexities with other forms of RSB. For instance, in [14] we show that the complexity transitions from RS to full RSB (FRSB) along the line […] which can only be realized when $f''(0) \ne 0$, as in the $2+s$ models. For $s > 2$, this transition line always intersects the extremal line (5), and so RSB complexity will always be found among some population of stationary points. However, it is likely that for much of the parameter space the so-called one-full RSB (1FRSB), rather than FRSB, is the correct solution, as it likely is for large $s$ and certain $\lambda$ in the $3+s$ models studied here. Further work to find the conditions for transitions of the complexity to 1FRSB and 2FRSB is necessary. For values of $s$ where there is trivial RSB in the ground state, we expect that the 1RSB complexity is correct.
What are the implications for dynamics? We find that nontrivial correlations tend to exist among saddle points with the largest or smallest possible index at a given energy density, which are quite atypical in the landscape. However, these strangely correlated saddle points must descend to uncorrelated minima, which raises questions about whether structure on the boundary of a basin of attraction is influential to the dynamics that descends into that basin. These saddles might act as early-time separatrices for descent trajectories of certain algorithms. With open problems in even the gradient descent dynamics on these models (itself attracted to an atypical subset of marginal minima), it remains to be seen whether such structures could be influential [28][29][30]. This structure among saddles cannot be the only influence, since it seems that the $3+4$ model is 'safe' from nontrivial RSB among saddles.
We have determined the conditions under which the complexity of the mixed $3+s$ spherical models has different quenched and annealed averages, as the result of nontrivial correlations between stationary points. We saw that these conditions can arise among certain populations of saddle points even when the model is guaranteed to lack such correlations between equilibrium states, and exist for saddle points at a wide range of energies. This suggests that studies making complexity calculations cannot reliably use equilibrium behavior to defend the annealed approximation. Our result has direct implications for the geometry of these landscapes, and perhaps could be influential to certain out-of-equilibrium dynamics.
Funding information JK-D is supported by a DynSysMath Specific Initiative of the INFN.

Figure 1: A phase diagram of the boundaries we discuss in this paper for the $3+s$ model with $f(q) = \frac12\big(\lambda q^3 + (1-\lambda) q^s\big)$. The blue region shows models which have some stationary points with nontrivial correlated (RSB) structure, and is given by $G_f > 0$ where $G_f$ is found in (12). The yellow region shows where $\chi(q) = f''(q)^{-1/2}$ is not convex and therefore nontrivial correlations between states are possible in equilibrium. The green region shows where nontrivial correlations exist at the ground state, adapted from [27]. We find that models where correlations between equilibrium states are forbidden can nonetheless harbor correlated stationary points.

Figure 2: Stationary point statistics as a function of energy density $E$ and stability $\mu$ for a model with $f(q) =$

Figure 3: The range of energies where RSB saddles are found for the $3+s$ model with varying $s$ and $\lambda$. In the top row the black line shows $E_\mathrm{min}$, the minimum energy where saddles are found, and in the bottom row this energy is subtracted away to emphasize when the RSB region crosses into minima. For most $s$, both the top and bottom lines are given by $E^\pm_\mathrm{1RSB}$, but for $s = 14$ there is a portion where the low-energy boundary has $q_1 < 1$. In that plot, the continuation of the $E^-_\mathrm{1RSB}$ line is shown dashed. Also marked is the range of $\lambda$ for which the ground state minima are characterized by nontrivial RSB.

Figure 4: Examples of $3+14$ models where the solution $E^-_\mathrm{1RSB}$ does and doesn't define the lower limit of energies where RSB saddles are found. In both plots the red dot shows $E^-_\mathrm{1RSB}$, while the solid red line shows the transition boundary with the RS complexity. The dashed black line shows the RS zero complexity line, while the solid black line shows the 1RSB zero complexity line. The dashed red lines show where a nonphysical 1RSB phase appears (the spinodal of that phase). The dotted red line shows an abrupt phase transition between different 1RSB phases. Top: $\lambda = 0.67$. Here the end of the transition line that begins at $E^+_\mathrm{1RSB}$ does not match $E^-_\mathrm{1RSB}$ but terminates at higher energies. $E^-_\mathrm{1RSB}$ still corresponds with the lower bound. Bottom: $\lambda = 0.69$. Here the end of the transition line that begins at $E^+_\mathrm{1RSB}$ terminates at lower energies than $E^-_\mathrm{1RSB}$, and therefore its terminus defines the lower bound.