Time scales in the dynamics of political opinions and the voter model

Opinions in human societies are measured by political polls on time scales of months to years. Such opinion polls do not resolve the effects of individual interactions but constitute a stochastic process. Voter models with zealots (individuals who do not change their opinions) can describe the mean-field dynamics in systems where no consensus is reached. We show that for large populations, the voter model with zealots is equivalent to the noisy voter model and it has a single characteristic time scale associated with the number of zealots in the population. We discuss which parameters are observable in real data by analysing time series of approval ratings of several political leaders that match the statistical behaviour of the voter model using the technique of the time-averaged mean squared displacement. The characteristic time scale of political opinions in societies is around 12 months, so it cannot be resolved by analysing election data, for which the resolution is several years. The effective population size in all fitted data sets is much smaller than the real population size, which indicates positive correlations of successive voter model steps. We also discuss the heterogeneity of voters as a cause of subdiffusion on long time scales, i.e. slow changes in the society.


Introduction
The voter model [1] describes the dynamics of opinions in a network of agents in different situations. It is one of many models in the context of social physics [2] that have gained popularity in recent years, highlighted by the 2021 Nobel Prize in Physics 'for groundbreaking contributions to our understanding of complex physical systems' to Syukuro Manabe, Klaus Hasselmann, and Giorgio Parisi [3]. Originally, it was introduced in ecology for the competition of species [4]. The model consists of a network of nodes (voters) that can adopt one of (in most cases) two opinions (states) [5] (see sketch in figure 1). The opinions are updated iteratively by pairing up two voters through one of the network edges; one of the two nodes then adopts the state of the other. This procedure is repeated many times, while the average state (sometimes called magnetisation [6]) changes. The dynamics encoded by most voter models is non-stationary, starting from a state of divided opinions and converging, eventually, to a (stationary) consensus, at which all voters adopt the same opinion [7][8][9]. Accordingly, these types of models only cover a single time scale of a specific discussion, depending on group size and other parameters. This time scale can be minutes in small groups or centuries in societies. The convergence time depends on factors such as the network topology [10,11] and the properties of the voters [12,13], such as a bias towards one opinion [14].
There are two types of applications of voter models for real political opinion data. On the one hand, data are available that resolve the individual voters, such as information on parliamentary votes and parliamentary attendance [15] or data from social networks [16][17][18]. The behaviour at short time scales is well resolved in these data sets, and the network structure can be studied in addition to the opinion formation. The voter model can be used to discuss various effects such as the network structure of societies [17][18][19], polarisation [20], filter bubbles [21], or the impact of bots in social networks [22]. However, this approach cannot be applied to long time scales and may not easily be generalised to the level of the whole society. On the other hand, data from political polls or elections [23][24][25] have a poor time resolution and contain no information on the individual agents, the only measurable observable being the share $n_+(t)$ of voters who support the proposition at each time step. At the same time, such data are available up to the time scale of centuries (for elections) and describe the average opinion of the whole society and not only of a subgroup. While election results do not resolve individual agents, they may provide some spatial resolution of different districts within the country [26].
For both the high-resolution data and the long-time society averages, modifications must be made to make the dynamics of the voter model more realistic [27] and to reproduce features of real data. Most importantly, societies do not tend to reach consensus on any time scale that is observable in data. On such time scales of years to decades, approval of the relevant parties and politicians tends to fluctuate, and no consensus is reached.
In the presence of 'zealots', who do not change their opinion, the system never reaches consensus and does not converge to a fixed proportion [28][29][30][31], because at all times there exist agents on both sides recruiting voters for their opinion. The average state keeps fluctuating around some mean value. This corresponds to the observations in all types of political polling: in the observable time range, no consensus is reached, neither on specific issues nor on opinions about parties or politicians. Another model version that is used to make the voter model more similar to political discussions is the noisy voter model [26,30,32–35] or voter model with external influences [20,36] (appendix). The noisy voter model is also a popular tool in finance, where it is known as the herding model [37]. The noisy voter model can be introduced in different ways. It adds a second source of opinion changes that do not originate from individual persuasion. This can be interpreted as a one-particle interaction [38] or as an incorrect copy of the given opinion [26]. Similar to the voter model with zealots, this model does not lead to consensus, but to fluctuating dynamics around some fixed value. There are several other modifications of the voter model that reproduce observed or suspected features of real opinion formation. An important generalisation is the heterogeneous voter model [39,40], in which voters adopt opinions with individual rates.
In this work, we revisit the question 'is the voter model a model for voters?' [26] or, in other words: how can observed opinion time series be explained by voter models? In previous studies, election data over several decades [24][25][26][41][42][43][44] were analysed and linked to voter models. The model can be used to fit parameters and make predictions for future election outcomes [25]. Such studies focus on the probability density function (PDF) and not on autocorrelations and time scales. This is because the data have a very low resolution and the time scales under investigation are decades. For polling data [45], very few studies exist.
Using polling data has the advantage of a higher sampling frequency compared to election data, so the model can be fitted to a much shorter period of time than a model that only uses election data. However, it has the disadvantage that polling data are subject to measurement noise, which adds one additional parameter to be estimated, as we will see below. The noise is due to selecting a subsample of voters for polling, which might not have the same average opinion as the full population. Still, having a model of political opinions enables us to discuss features such as the stability [46] of election results. The first step is to find a model that reproduces basic statistical observables. The second step is the definition of meaningful parameters corresponding to the real world. We discuss both steps in the following.
We employ methods from time series analysis and the theory of stochastic processes to understand the time evolution of the opinion data. Our most important tool to find the correct model for data is the time-averaged mean squared displacement (TAMSD). This measure is introduced in section 2, along with the effect of noise on the TAMSD. In section 3 we recall properties of the voter model and analytical solutions to measures such as the PDF and the TAMSD. We first restrict ourselves to the voter model with zealots on the complete graph, which is an approximation that should be representative of a society in which everyone accesses the same media. For this model, results are known in the literature. Then we discuss two generalisations of the model that help to better interpret the measured parameters. In section 4 we fit the voter model to real data: approval ratings of political leaders and support for parties. Finally, in section 5 we discuss the results and interpret the measured parameters.

The TAMSD
Fluctuations of time series can be classified via the mean-squared displacement (MSD). If several realisations of the process exist, the MSD is calculated by taking the ensemble average. In the case of polling data, the process is given by the shares of a specific answer to a survey question for different polls over time. Each issue is different, so only one realisation of each process exists, analogous to single realisations in financial time series [47][48][49]. Accordingly, we use the TAMSD for the analysis,

$$\delta_x^2(\Delta) = \frac{1}{T-\Delta}\sum_{t=1}^{T-\Delta}\left[x(t+\Delta)-x(t)\right]^2. \qquad (1)$$

Here, $x(t)$ is the time series of length $T$ to be analysed and $\Delta$ is the lag time. Note that the results for individual, finite time series (e.g. for two subsamples) typically show variations of their amplitude and may also differ from the ensemble average in many cases, the phenomenon of non-ergodic dynamics in the Boltzmann-Birkhoff-Khinchin sense [50,51]. The function $\delta_x^2(\Delta)$ depends on the autocorrelations of the increments of $x$. The shape of $\delta_x^2(\Delta)$ is, therefore, characteristic of the underlying dynamics of the process under investigation [50]. In this sense, the TAMSD is a tool similar to detrended fluctuation analysis [52], which can be used equivalently [53,54]. If $x(t)$ is described, e.g., by Brownian motion with independent increments, $\delta_x^2(\Delta) \sim \Delta$ scales linearly. The relevant case in our analysis is the TAMSD of confined motion [55],

$$\delta_x^2(\Delta) = 2\sigma^2\left(1 - e^{-\Delta/\tau}\right). \qquad (2)$$

Here $\tau$ is the characteristic time scale, i.e. the relaxation time of a stochastically driven particle in a confining potential. The stability of this measure for a single trajectory is known [55]. The TAMSD of confined Brownian motion is linear, $\sim 2\sigma^2\Delta/\tau$, at short times $\Delta \to 0$. This linear scaling is called normal diffusion. If the MSD scales with a steeper slope than linear, i.e. $\delta^2(\Delta) \simeq \Delta^\alpha$ with $\alpha > 1$, the system is called superdiffusive. If the slope is shallower than linear, i.e. $\alpha < 1$, the system is called subdiffusive [56]. There are various processes that can generate either slope, as discussed in the literature [57][58][59][60][61][62][63]. Due to the confinement, the TAMSD converges to the constant $2\sigma^2$ for $\Delta \to \infty$, when the full potential is already explored. The variance of the process is connected to the TAMSD via $\sigma^2 = \lim_{\Delta\to\infty} \delta_x^2(\Delta)/2$ [57].
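To make the TAMSD concrete, the following minimal Python sketch evaluates the time average for a discretely sampled series, together with the model curve of confined motion. It assumes the standard discrete definition (the average of $[x(t+\Delta)-x(t)]^2$ over all admissible $t$) and the saturating form $2\sigma^2(1-e^{-\Delta/\tau})$, which reproduces the two limits quoted in the text, $2\sigma^2\Delta/\tau$ at short times and $2\sigma^2$ at long times. The function names `tamsd` and `tamsd_confined` are illustrative choices, not taken from the original analysis.

```python
import math


def tamsd(x, lag):
    """Time-averaged mean squared displacement of the sequence x at an integer lag."""
    n_terms = len(x) - lag
    if lag < 1 or n_terms < 1:
        raise ValueError("lag must satisfy 1 <= lag < len(x)")
    # Average the squared increments over all start times t = 0 .. len(x) - lag - 1.
    return sum((x[t + lag] - x[t]) ** 2 for t in range(n_terms)) / n_terms


def tamsd_confined(lag, sigma2, tau):
    """Model TAMSD of confined Brownian motion, 2*sigma2*(1 - exp(-lag/tau))."""
    return 2.0 * sigma2 * (1.0 - math.exp(-lag / tau))


if __name__ == "__main__":
    series = [0.50, 0.52, 0.49, 0.51, 0.53, 0.50, 0.48]  # made-up shares, for illustration only
    for lag in (1, 2, 3):
        print(lag, tamsd(series, lag), tamsd_confined(lag, sigma2=0.0003, tau=2.0))
```

For a single short trajectory the estimate at large lags rests on only a few increments, so in practice the comparison with the model curve is most informative at small $\Delta$.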

Noisy data
The noise in polling data is due to the nature of the sampling of voters. Since not all voters but only a small group are polled, there is some uncertainty as to how representative the group is of the entirety of voters. Since new samples of voters are picked for each poll, temporal correlations in the noise can be neglected. We model the uncertainty due to the sampling as an additive noise term $\eta(t)$ in the observed signal $X(t) = x(t) + \eta(t)$, where the true signal $x(t)$ and the noise are independent. For the TAMSD we find, accordingly,

$$\delta_X^2(\Delta) = \delta_x^2(\Delta) + \delta_\eta^2(\Delta). \qquad (3)$$

Such an additive error typically leads to underestimations in the short-time scaling exponent of the TAMSD [64]. $\eta(t)$ is an uncorrelated signal, i.e. the correlation time $\tau$ of the process $\eta$ is zero. Accordingly, the TAMSD $\delta_\eta^2(\Delta)$ can be calculated from equation (2) in the limit $\tau \to 0$. We find a constant expression for the TAMSD, $\delta_\eta^2(\Delta) = 2\sigma_\eta^2$. Thus, in order to see the actual TAMSD of the voter model in the data, one has to subtract a constant from the TAMSD of the observed signal according to

$$\delta_x^2(\Delta) = \delta_X^2(\Delta) - 2\sigma_\eta^2, \qquad (4)$$

where $\sigma_\eta^2$ is the average variance of the noise over the whole trajectory. The decomposition of $\delta_X^2$ is analogous to the decomposition of fluctuation functions presented in [54,65]. It allows one to separate the signal $\delta_x^2(\Delta)$ from the noise contribution $\sigma_\eta^2$. For data analysis, an estimation of the variance of the noise is needed. We therefore require a proxy for measuring the noise amplitude, or we have to make an assumption on the data. For opinion polls with result $X$ for $M$ participants, the noise variance can be estimated as [66]

$$\sigma_\eta^2 = \frac{X(1-X)}{M}. \qquad (5)$$

Note that $X$ is always smaller than or equal to 1. Accordingly, the variance $\sigma^2$ of the true process $x(t)$ deviates from the variance $\sigma_m^2$ of the measured process $X(t)$ by

$$\sigma^2 = \sigma_m^2 - \sigma_\eta^2. \qquad (6)$$

For fitting the TAMSD in this manuscript, we first calculate the variance $\sigma^2$. Then we calculate the TAMSD of the data and subtract the noise according to equation (4). Finally, we fit the short-time slope of the TAMSD of the model to the TAMSD of the data.
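The noise treatment described above can be sketched in a few lines of Python. The sketch assumes that the sampling noise of a poll with result $X$ and $M$ participants has the binomial variance $X(1-X)/M$, averaged over the trajectory, and that it enters the TAMSD as the constant offset $2\sigma_\eta^2$. All function names are hypothetical, and a single typical $M$ is used for all polls, as in the text.

```python
def noise_variance(shares, participants):
    """Average binomial sampling variance X(1 - X)/M over all polls (assumed noise model)."""
    return sum(x * (1.0 - x) for x in shares) / (len(shares) * participants)


def subtract_noise(tamsd_observed, sigma2_eta):
    """Remove the constant noise offset 2*sigma2_eta from an observed TAMSD curve."""
    return [value - 2.0 * sigma2_eta for value in tamsd_observed]


def signal_variance(shares, participants):
    """Variance of the true process: measured variance minus the noise variance."""
    mean = sum(shares) / len(shares)
    measured = sum((x - mean) ** 2 for x in shares) / len(shares)
    return measured - noise_variance(shares, participants)


if __name__ == "__main__":
    polls = [0.55, 0.58, 0.52, 0.60, 0.57]  # made-up approval shares
    s2_eta = noise_variance(polls, participants=1000)
    print(s2_eta, signal_variance(polls, 1000), subtract_noise([0.002, 0.003], s2_eta))
```

For very short or very quiet series the corrected values can become negative, which signals that the noise estimate is of the same size as the signal and that the assumed $M$ should be checked.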

The homogeneous voter model with zealots
We consider a complete graph with $N$ voters as nodes, as sketched in figure 1. Out of these $N$ voters, $Z$ are zealots and the remaining $S = N - Z$ are susceptible voters. The nodes can be in one of the two states '+' and '−'. A number $N_+$ of voters support the proposition (e.g. favour a given politician or party), while $N_- = N - N_+$ oppose the proposition. Analogously, we denote the zealots in support/opposition as $Z_\pm$ and the supporting/opposing susceptible voters as $S_\pm$. In each step, we select two voters $V_i$ and $V_j$ from the full population. If node $V_i$ is a zealot, no change of the state of $V_i$ occurs; otherwise, node $V_i$ adopts the opinion of node $V_j$. The share $n_+ = N_+/N$ of proponents only increases if $V_j$ has opinion '+' and $V_i$ previously had opinion '−' and is not a zealot. The probability for this case is $n_+ S_-/N$. The share of proponents decreases in the opposite case with probability $n_- S_+/N$, where $n_- = N_-/N$. Thus the system can be described as a Markov jump process with transition probabilities

$$P\left(n_+ \to n_+ + \tfrac{1}{N}\right) = \frac{n_+ S_-}{N}, \qquad P\left(n_+ \to n_+ - \tfrac{1}{N}\right) = \frac{n_- S_+}{N}. \qquad (7)$$

The following theory is analogous to the theory of the noisy voter model in the limit of large populations, $N \to \infty$ (see appendix), so we can make use of literature results for both models. The expected change in the share of proponents in one step, i.e. the drift of the stochastic process, is

$$\langle n_+(t+1) - n_+(t) \rangle = \frac{n_+ S_- - n_- S_+}{N^2}. \qquad (8)$$

In order to understand the time series generated from the voter model, we reformulate equation (8) in terms of the observable $n_+$, removing all dependent variables. Using $S_\pm = N_\pm - Z_\pm$ and $n_- = 1 - n_+$, we obtain

$$\langle n_+(t+1) - n_+(t) \rangle = -\frac{Z}{N^2}\left[n_+(t) - z_+\right], \qquad (9)$$

which implies an exponential relaxation of the expectation value of $n_+(t)$,

$$\langle n_+(t) \rangle \simeq z_+ + \left[n_+(0) - z_+\right] e^{-Zt/N^2},$$

to $z_+ = Z_+/Z$, which is the share of proponents in the set of zealots. In reality, not every step is recorded. Only after $d$ steps, the share $s_+ \in [0, 1]$ of proponents is recorded, and then again after $2d$, $3d$, …. The relaxation time of the process in units of recorded steps is

$$\tau = \frac{N^2}{Z d}.$$

The PDF of the share $n_+$ is known [67] for the noisy voter model (for $s_+$, see appendix). After applying a variable transformation, it is a beta distribution,

$$p(n_+) \propto n_+^{Z_+ - 1}\,(1 - n_+)^{Z_- - 1},$$

and it has the variance [68]

$$\sigma^2 \approx \frac{z_+(1 - z_+)}{Z + 1},$$

where the approximation holds for $N \gg Z$, i.e. when the majority of voters are able to change their minds, which appears to be a realistic setting. We see that the variance $\sigma^2$ depends, to good approximation, only on the number of zealots and is independent of the number $S$ of susceptibles. By calculating the variance, only one of these parameters can be determined. The exact shape of the PDF determines all parameters at once; however, differences between two PDFs with the same variance are hard to detect.
To summarise, the voter model with zealots for a large population $N$ only depends on three parameters: (i) the share $z_+$ of zealot proponents, which is identical to $\mu$, the average value of $n_+$; (ii) the number $Z$ of zealots, which can be calculated from $\sigma^2$; and (iii) the ratio $N^2/d$, which can be measured from the short-$\Delta$ behaviour of the TAMSD $\delta^2(\Delta)$, i.e. from $\tau$.
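The update rule of this section can be illustrated with a short simulation. The sketch below is a minimal, unoptimised Python implementation under the assumptions stated in the text: a complete graph, a listener $V_i$ and a speaker $V_j$ drawn uniformly from the full population (with replacement), and one recorded value every $d$ steps. The function names, the initial condition (half of the susceptibles in favour), and the helper `relaxation_time`, which evaluates $N^2/(Zd)$ as implied by the drift, are illustrative choices.

```python
import random


def simulate_zealot_voter_model(n_total, z_plus, z_minus, d, n_records, seed=0):
    """Return n_records recorded shares n_+ of a voter model with zealots."""
    rng = random.Random(seed)
    n_zealots = z_plus + z_minus
    s_plus = (n_total - n_zealots) // 2  # susceptibles initially in favour
    shares = []
    for _ in range(n_records):
        for _ in range(d):
            n_plus = z_plus + s_plus
            listener = rng.randrange(n_total)  # indices below n_zealots are zealots
            if listener < n_zealots:
                continue  # a zealot never changes its state
            # Among the susceptibles, the first s_plus indices hold opinion '+'.
            listener_plus = (listener - n_zealots) < s_plus
            speaker_plus = rng.random() < n_plus / n_total
            if speaker_plus and not listener_plus:
                s_plus += 1
            elif listener_plus and not speaker_plus:
                s_plus -= 1
        shares.append((z_plus + s_plus) / n_total)
    return shares


def relaxation_time(n_total, n_zealots, d):
    """Relaxation time N^2 / (Z d) in units of recorded steps."""
    return n_total ** 2 / (n_zealots * d)


if __name__ == "__main__":
    trajectory = simulate_zealot_voter_model(200, z_plus=12, z_minus=8, d=100, n_records=500)
    print(sum(trajectory) / len(trajectory), relaxation_time(200, 20, 100))
```

With these example parameters the long-run mean should fluctuate around $z_+ = 0.6$, although a single finite run will scatter around that value.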
We now briefly discuss possible extensions of this model accounting for modified dynamics. However, we note that when analysing data, only the few parameters above can be inferred reliably. Accordingly, the paradigmatic model for data analysis is the standard homogeneous model presented above. The generalisations are nevertheless useful because they help us understand the meaning of the voter model parameters and possible deviations of the data from the model.

Superdiffusion: effective number of voters N
In this section we introduce a generalisation of the voter model with zealots by defining the effective number of voters in addition to the total number of voters. This modification affects neither the PDF nor the TAMSD. However, it helps to interpret the model parameters defined in section 3.1.
In real-world systems, there is usually a cause why voters are influenced. It happens not only because they hear the opinion of other voters, but also because of, e.g., a statement made by a politician or an external event that requires the voters to reconsider their opinions. Such events provide arguments in the discussions among voters and can be considered to be the major driver of the dynamics. When such an event occurs, which is either in favour of the proposition or not, the proponents (or opponents) will be able to convince not only one voter but several voters at a time. In other words, there is a correlation between opinions that are adopted in two successive steps.
Instead of picking a new agent at each time step who influences another voter, one agent is able to influence several voters, so the same agent stays the influencer for several time steps. We define the length of this period with a fixed influencing agent as a Poissonian random variable with an average number $\nu$. The system can no longer be described as a Markov jump process, because the probabilities depend on the previous state of the influencing voter. At each step, the implementation can be done by deciding with a random variable whether or not the influencing agent should be changed, where the probability of not changing the agent is $\exp(-1/\nu)$. Accordingly, equation (8) can be modified to account for the persistent influencing agent; transforming all time-dependent terms to expressions of $n_+$ then yields equation (14). The dynamics (the deterministic drift) in equation (14) corresponds to an autoregressive model of order 2 (AR(2)) and has the same shape of autocorrelations as the 'finite correlation time' model in [59].
If $\nu$ and $d$ are of the same order of magnitude, the slope of the TAMSD becomes steeper than linear for small $\Delta < \nu$ according to equation (14). Such a model is superdiffusive for short times $\Delta$. We show the deviation of the TAMSD of model (14) with $\nu > d$ from the TAMSD (2) of the uncorrelated standard model in figure 2(a) ('+'). For $\Delta \to 0$, the TAMSD of equation (14) scales quadratically [59].
For $\nu \ll d$, the correlated model with $N_\mathrm{total}$ nodes is similar to the standard homogeneous voter model introduced above (see figure 2(a) with parameters $\nu = 10$, $d = 60$). In a period of length $\nu$, a number of (on average) $\nu$ voters are confronted with the opinion of the influencing agent. A share of $1 - Z/N_\mathrm{total}$ of these voters are susceptible on average, so a parameter transformation, equation (15), ensures that the statistical properties of the standard voter model are reproduced by the model with correlated steps. The parameter $N$ is then no longer the size of the population, but rather an effective size; the density stays the same under the transformation (15) (see figure 2(b)).
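The correlated-step variant can be sketched by a small change to the update loop of the standard model. The Python sketch below keeps the current influencer with probability $\exp(-1/\nu)$ at each step, as described in the text. It makes two simplifying assumptions that are not taken from the original: only the opinion of the influencer is tracked (it is frozen during the influencer's tenure), and a new influencer is drawn uniformly from the population, so that it holds '+' with probability $n_+$. The function name is hypothetical.

```python
import math
import random


def simulate_persistent_influencer(n_total, z_plus, z_minus, nu, d, n_records, seed=0):
    """Voter model with zealots in which one influencer persists for about nu steps."""
    rng = random.Random(seed)
    keep_probability = math.exp(-1.0 / nu)  # chance of keeping the current influencer
    n_zealots = z_plus + z_minus
    s_plus = (n_total - n_zealots) // 2
    influencer_plus = rng.random() < (z_plus + s_plus) / n_total
    shares = []
    for _ in range(n_records):
        for _ in range(d):
            if rng.random() >= keep_probability:
                # Replace the influencer by a randomly drawn voter (assumption).
                influencer_plus = rng.random() < (z_plus + s_plus) / n_total
            listener = rng.randrange(n_total)
            if listener < n_zealots:
                continue  # zealots ignore the influencer
            listener_plus = (listener - n_zealots) < s_plus
            if influencer_plus and not listener_plus:
                s_plus += 1
            elif listener_plus and not influencer_plus:
                s_plus -= 1
        shares.append((z_plus + s_plus) / n_total)
    return shares


if __name__ == "__main__":
    trajectory = simulate_persistent_influencer(600, 30, 30, nu=10, d=60, n_records=300)
    print(min(trajectory), max(trajectory))
```

Comparing the TAMSD of such trajectories for $\nu \ll d$ and $\nu \gtrsim d$ is one way to explore the short-time superdiffusion discussed above, within the limits of these assumptions.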

Subdiffusion: a heterogeneous voter model
Finally, we introduce a second generalisation of the voter model with zealots. The setting is the same as in the voter model with zealots; however, each node $i$ now has a property $c_i \in [0, 1]$, which describes how easily it changes its state. If the parameter is $c_i = 1$, the node behaves like a susceptible voter in the model above. If the parameter is $c_i = 0$, the node is a zealot, who never changes its state. If $c_i$ is between 0 and 1, the voter is in principle susceptible but only changes its state with probability $c_i$. As a special case, we consider a model in which some individuals are zealots, $c_i = 0$, and all other individuals have the same property $c_j = c$. The dynamics of the system is identical to the normal voter model with zealots, except for the time steps, where the effective number of steps $d$ is different from the total number $d_\mathrm{total}$,

$$d = c\, d_\mathrm{total}. \qquad (16)$$

Indeed, the resulting time series is stretched in time by the factor $1/c$. A different concept, describing a quantity related to $c$, is internal aging [69]. Note that with a data-driven approach, we can only measure $d$ and not $d_\mathrm{total}$. The TAMSD and PDF of the model for different values of $d_\mathrm{total}$ but with the same $d$ are indistinguishable in our numerics (see figures 2(c) and (d)). The model can be called heterogeneous if it has three different types of voters: $Z$ zealots, $S$ susceptibles, and $I$ intermediate nodes with $0 < c < 1$. In this case, the TAMSD can be understood as a superposition of the short-time TAMSD, where intermediates are counted as zealots ($Z' = Z + I$, $S' = S$), and the long-time TAMSD, where intermediates are counted as susceptibles ($Z' = Z$, $S' = S + I$). The resulting TAMSD in the intermediate time range grows with some subdiffusive scaling exponent $< 1$ (see figure 3(a)). The variance can be calculated from the number of zealots in the model, equivalently to the homogeneous voter model (see section 3.1), since intermediates behave like susceptibles at long times. The density of a heterogeneous model in figure 3(b) is well approximated by a homogeneous model with $S' = S + I$ susceptible voters. Note that in figure 3(c), we see that for two populations, the TAMSD is difficult to distinguish for values of $c > 0.1$, so heterogeneity only changes this statistical measure in very heterogeneous populations, while the statistical behaviour of a population with some smaller heterogeneity is still well approximated by the homogeneous model. Heterogeneous populations with various groups of voters with different $c$-parameters can produce arbitrary TAMSD functions. One important class are scale-free models with power-law growth of the TAMSD. Other versions of the voter model have previously been linked to $1/f$ noise and long-range memory [38,70,71]. The study of these voter models is relevant especially for economics and finance, where power laws are omnipresent [72]. Here we see that heterogeneity of voters is a particularly simple way of generating complex autocorrelation functions.
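The heterogeneous model can be sketched by assigning each voter its own change probability $c_i$. The minimal Python example below follows the verbal description in this section: a listener and a speaker are drawn uniformly from the population, and the listener adopts the speaker's opinion with probability $c_i$ (so $c_i = 0$ is a zealot and $c_i = 1$ an ordinary susceptible voter). The function name and the example population are illustrative assumptions.

```python
import random


def simulate_heterogeneous(opinions, change_prob, d, n_records, seed=0):
    """Voter model in which voter i adopts the speaker's opinion with probability c_i."""
    rng = random.Random(seed)
    state = list(opinions)  # True stands for '+', False for '-'
    n_total = len(state)
    n_plus = sum(state)
    shares = []
    for _ in range(n_records):
        for _ in range(d):
            listener = rng.randrange(n_total)
            speaker = rng.randrange(n_total)
            differs = state[listener] != state[speaker]
            if differs and rng.random() < change_prob[listener]:
                n_plus += 1 if state[speaker] else -1
                state[listener] = state[speaker]
        shares.append(n_plus / n_total)
    return shares


if __name__ == "__main__":
    # 10 zealots (c = 0), 40 intermediates (c = 0.05), 50 susceptibles (c = 1).
    c = [0.0] * 10 + [0.05] * 40 + [1.0] * 50
    start = [k % 2 == 0 for k in range(100)]
    print(simulate_heterogeneous(start, c, d=50, n_records=10))
```

Setting all non-zero $c_i$ to the same value $c$ reproduces the special case discussed above, in which the trajectory is merely stretched in time by the factor $1/c$.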

Approval ratings of political leaders: fitting the homogeneous voter model
To illustrate the above models, we apply them to investigate the statistics of approval ratings of various political leaders. We consider data for Angela Merkel [73], Barack Obama [74], David Cameron [75], Justin Trudeau [77], Emmanuel Macron [78], and Shinzō Abe [76] (in contrast to the other data sets, the latter describes government approval, not personal approval). All of these leaders were reelected at least once. For each data set we consider the percentage of people who respond that they approve of the work of the political leader. This share corresponds to the ratio $n_+$ in our model; $n_+$ changes over time with each new poll. The opposing voters $n_-$ consist of both the people who respond that they disapprove and those who did not respond at all. Some of the data sets are not homogeneous in time, i.e. the number of polls per year is not constant. One reason for this inhomogeneity is that the demand for political polls increases before elections; the other reason is a change in the strategy of the polling agency over the years. The details on the data sets and our preprocessing can be found in table 1. To make the time series more homogeneous, we excluded some data points in months or years where the resolution was sufficiently high already. Moreover, we did not include more than 13 data points per year, so in most cases, the time series we used is shorter than the original data. The number of participants is reported by the polling agency for each poll. For most agencies, the number of participants fluctuates from poll to poll. In order to estimate the error, we took a typical number of participants, reported in table 1.
Table 1. Summary of the polling data sets used in this text. The data on Merkel, Obama, Cameron, Trudeau, and Macron are personal approval ratings of political leaders in power; the data on Abe measure government approval. The data on UK parties measure voting intention, i.e. which share of the population intends to vote for the Conservative party and the Labour party, respectively. In the table we show the length of these data sets, how many data points we included in the analysis (others were excluded to make the density in time more homogeneous), the period of time in which these polls were performed, and a typical number of participants, which we use in our analysis to subtract the noise from the TAMSD (see equation (4)).

Subject | Agency
Merkel | Forschungsgruppe Wahlen [73]
Obama | Gallup [74]
Cameron | Ipsos [75]
Abe | NHK [76]
Trudeau | Angus Reid Institute [77]
Macron | BVA [78]
UK parties | Ipsos [79]

Figure 4. […] the dashed lines show the residual $\delta_X^2(\Delta) - 2\sigma_\eta^2$, which is used for analysis; the short-time scaling of the residuals is close to linear for all three data sets. Stripes in the background indicate linear scaling. (c) PDF of three of the data sets and fitting models with parameters given in table 2.

Figure 5. TAMSD of opinions follows the theory of the voter model with zealots. We show the master curve of rescaled TAMSD curves for approval ratings of political leaders. The corresponding fit parameters can be found in table 2. The grey interval corresponds to the model uncertainty (98%) of voter models using the parameters obtained from the approval ratings of Shinzō Abe.

Five out of the six time series are shown in figure 4(a). For all six data sets, we calculate $\mu = Z_+/Z$ and the variance $\sigma^2$ using equation (6) with the sample sizes $M$ given in table 1. These two properties enable us to estimate $Z_+$ and $Z_- = Z - Z_+$. Finally, we calculate the TAMSD and treat the noise according to equation (4). The effect of subtracting the noise variance is shown in figure 4(b) for the approval ratings of Angela Merkel. The noise reduces the apparent slope of the TAMSD; taking it into account generates a close-to-linear scaling for small $\Delta$. Fitting the TAMSD enables us to calculate the correlation time $\tau$ and thus the value of $N^2/d$. The obtained parameters for all data sets are shown in table 2. In figure 4(c), we show the PDF of three data sets along with the obtained model PDFs. The models describe the measured PDFs reasonably well; however, the data sets are too short to see smooth functions and exact fits. The best test of whether the voter model in fact describes the dynamics of political opinions is the TAMSD. In figure 5, we show the master curve of TAMSD functions for all data sets, rescaled by $\sigma^2$ and $\tau$ in the $y$ and $x$ direction, respectively. For $\Delta/\tau < 5$ the data sets collapse rather well onto one curve, which is well approximated by the theoretical TAMSD given by equation (2). For larger $\Delta$, some of the curves (for Cameron, Macron, and Abe) increase significantly. This deviation from the theory can be attributed mostly to very high approval ratings just after the politician came to power, i.e. some election euphoria (see figure 4(a)). Despite these deviations, the voter model with zealots is a good first approximation, describing the rough statistics of the dynamics of opinions on political leaders.
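The fitting procedure can be summarised in a short Python sketch. It rests on three relations: the short-time slope $2\sigma^2/\tau$ of the TAMSD (from the text), the approximation $\sigma^2 \approx \mu(1-\mu)/(Z+1)$ for a beta-distributed stationary share with $N \gg Z$, and $\tau = N^2/(Zd)$ as implied by the drift of the model. The latter two are assumptions of this sketch; the function names are hypothetical, and the least-squares line through the origin is one possible choice of fit, not necessarily the one used for table 2.

```python
def tau_from_short_time_slope(lags, tamsd_values, sigma2):
    """Fit TAMSD = slope * lag through the origin and return tau = 2*sigma2/slope."""
    numerator = sum(lag * value for lag, value in zip(lags, tamsd_values))
    denominator = sum(lag * lag for lag in lags)
    return 2.0 * sigma2 * denominator / numerator


def zealots_from_moments(mu, sigma2):
    """Invert sigma2 ~ mu*(1 - mu)/(Z + 1) for the number of zealots Z (assumed relation)."""
    return mu * (1.0 - mu) / sigma2 - 1.0


def fit_summary(mu, sigma2, lags, tamsd_values):
    """Collect the inferred parameters; lags must be restricted to the short-time range."""
    tau = tau_from_short_time_slope(lags, tamsd_values, sigma2)
    z = zealots_from_moments(mu, sigma2)
    return {
        "tau": tau,
        "Z": z,
        "Z_plus": mu * z,
        "Z_minus": (1.0 - mu) * z,
        "N2_over_d": tau * z,  # follows from tau = N^2 / (Z d)
    }


if __name__ == "__main__":
    # Noise-corrected TAMSD values at lags of 1-3 months (made-up numbers).
    print(fit_summary(mu=0.6, sigma2=0.008, lags=[1, 2, 3], tamsd_values=[0.0013, 0.0027, 0.0040]))
```

The inputs are meant to be the noise-corrected variance and TAMSD; feeding in uncorrected values would bias both $Z$ and $\tau$.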

Support for political parties: society changes on long time scales
For the support of political parties, we can apply the same analysis as for the approval ratings of leaders. In political polls, approval is measured by asking people which party they would currently vote for. While the question is not binary, it can be reformulated into a set of questions 'Would you vote for party X?'. In each case, there are supporters and opponents of each party. There are also zealots, who will always vote for a specific party or never vote for it. We focus on the support for parties in the United Kingdom (UK), again from Ipsos [79], available for the years since 1978. We focus on the two major parties: the Conservative party and the Labour party. We consider both time series, calculate the TAMSD, and subtract the measurement noise (based on the sample size of 1000, which is about the typical value in the Ipsos polls). We calculate the mean variance $\sigma^2$ and fit the correlation times $\tau$ (both taking into account the noise due to the sample size) to the averaged TAMSD of both time series (see figure 4(b)). With the obtained parameters $Z_+$, $Z_-$, and $d$, displayed in table 2, the theory curve matches the data for short times smaller than $\tau$.
Compared to the parameters obtained for political leaders, the characteristic time scale $\tau$ is of the same order of magnitude. The number of zealots $Z$ is also similar for party support compared to personal approval ratings. The long-time behaviour beyond the correlation time $\tau$ differs between parties. It cannot be generally determined from a data set of around 50 years in length. However, we can use other data sets of political party support to learn more about these long time scales. Support for political parties on long time scales is measured regularly through elections. In figure 6(a), we show the TAMSD of the British election time series of the Conservative and Labour parties along with the TAMSD of their polling results. The election data of 28 parliamentary elections from 1918 to 2019 can be found at [80]. Similar to the long-time behaviour of the TAMSD of polling data, we see an irregular increase in the displacement.
Figure 6. (a) … [75], the dashed lines are calculated from election data [80]. Black lines represent the fitted voter models with $\tau = 11.5$ months for the Labour party and $\tau = 14.2$ months for the Conservative party. (b) TAMSD for data from ten US-American states; the black line is the ensemble average for each $\Delta$, the dashed-dotted line is the power law fit to this ensemble average (the same line is also shown in figure (a)); the scaling exponent is 0.3.

In order to understand more about the long-time behaviour of the TAMSD of political party support, we look at data from a second country. In the United States of America (USA), the time series of election results are long and can be obtained for all states [81]. We show the TAMSD for ten different states from different regions of the country (California, Florida, Illinois, Montana, New Hampshire, New York, Ohio, Texas, Utah, and Virginia) in figure 6(b). We find an average power law scaling exponent of 0.3, which describes most of the time series in the data set well. We conclude that the TAMSD of political opinions on long time scales is subdiffusive. Such a subdiffusive scaling can be explained by a heterogeneous voter model (see section 3.3). In addition to the fast time scale of $\tau \approx 1$ year, there is at least one slow time scale on which the society itself, and therefore its zealot structure, changes. Such a deep change in societies naturally happens at least once per generation, since the acting personas retire or die, while new citizens and zealots are coming of age. The scaling is also compatible with the long-time polling and election results of the British Conservative and Labour parties (see figure 6(a), where we show the scaling exponent via the dashed-dotted line for comparison). We note that the scaling is rather noisy here, due to the short time series.
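A power law scaling exponent such as the one quoted above can be estimated, for instance, by ordinary least squares in log-log space. The Python sketch below shows this generic technique; it is an assumption that a comparable fit underlies figure 6, and the function name `scaling_exponent` is illustrative.

```python
import math


def scaling_exponent(lags, tamsd_values):
    """Slope of log(TAMSD) versus log(lag), i.e. alpha in TAMSD ~ lag**alpha."""
    xs = [math.log(lag) for lag in lags]
    ys = [math.log(value) for value in tamsd_values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


if __name__ == "__main__":
    lags = [4, 8, 16, 32, 64]  # e.g. years between elections
    values = [0.010 * lag ** 0.3 for lag in lags]  # synthetic subdiffusive curve
    print(scaling_exponent(lags, values))
```

An exponent below 1 obtained in this way indicates subdiffusion, an exponent above 1 superdiffusion; for short and noisy series the estimate carries a large uncertainty.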

The effective number of voters
In section 3.2, we showed how the voter model becomes superdiffusive on short time scales if correlations of successive steps are considered. The data analysis does not indicate such an effect; however, a closer look reveals that some short-time correlation is required to make the obtained parameters meaningful.
In the last row of table 2, we report our measurements for $N^2/d$ in units of 1/months. Since $d$ is the number of steps per time unit, $d/N$ is the frequency with which each voter is exposed to an opinion (and is ready to change their own opinion, since $c \neq 0$). We can assume that typical susceptible voters will still not be ready to change their mind more than once a month, i.e. $d/N < 1$, from which follows $N < N^2/d$. So according to the last row of table 2, the effective population size is smaller than some hundred voters (e.g. 297 effective voters in Germany concerning Angela Merkel), while the true population in these countries is many millions of voters. Thus we see that the effective population size that we measure in the data is much smaller than the true population size, indicating strong correlations of successive steps.

Discussion
Public opinions are shifting perpetually. The key politicians change, as do the political agendas. Furthermore, the population is in a permanent state of change due to human ageing, but also due to shifts in their communication, means, and networks. Nevertheless, it is important to understand the stationary behaviour of a system before changes can be correctly set into context. Such a stationary theory is the noisy voter model (see appendix), which is well known in opinion dynamics and finance. An alternative suggestion has been the voter model with zealots, which is equivalent for large populations but does not postulate external influence. The model effectively describes the stochastic dynamics in a potential defined by the zealots. The autocorrelation function decays exponentially, as for confined Brownian motion. At short times, the TAMSD scales linearly. By generalising the model, on the one hand, we can associate faster scaling (scaling exponents > 1) with correlations between successive steps, i.e. one voter influences several other voters at once. On the other hand, slower scaling (exponents < 1) can be associated with heterogeneity among voters, i.e. not all voters are persuaded equally easily, which leads to additional time scales in the system, such as the time scale on which even the zealots change their opinion.
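The stationary behaviour of the voter model with zealots can be illustrated with a minimal simulation sketch on a fully connected network. The update rule below (a randomly chosen voter copies the opinion of a randomly chosen other voter, unless the chosen voter is a zealot) is an assumption about one common variant of the model, and the function name `simulate_zealot_voter_model` is hypothetical. The parameters N = 80, Z+ = 10, and Z− = 15 follow the example values of figure 2. For this update rule the mean drift vanishes when the fraction of proponents equals Z+/(Z+ + Z−), so the opinion fluctuates around this value and never reaches consensus.

```python
import random


def simulate_zealot_voter_model(n_total, z_plus, z_minus, steps, seed=0):
    """Simulate a fully connected voter model with zealots.

    Returns the fraction of voters holding opinion +1 after each step.
    Voters 0 .. z_plus-1 are proponent zealots, the next z_minus voters
    are opponent zealots, and the remaining voters are susceptible.
    """
    rng = random.Random(seed)
    n_zealots = z_plus + z_minus
    # All susceptible voters start with opinion -1 (arbitrary initial state).
    opinions = [1] * z_plus + [-1] * z_minus + [-1] * (n_total - n_zealots)
    n_positive = z_plus
    fractions = []
    for _ in range(steps):
        i = rng.randrange(n_total)
        if i >= n_zealots:  # zealots never update
            # Pick a partner uniformly among the other n_total - 1 voters.
            j = rng.randrange(n_total - 1)
            if j >= i:
                j += 1
            if opinions[i] != opinions[j]:
                n_positive += opinions[j]  # +1 or -1
                opinions[i] = opinions[j]
        fractions.append(n_positive / n_total)
    return fractions


fractions = simulate_zealot_voter_model(80, 10, 15, 200_000)
burn_in = 5_000  # discard the initial relaxation
stationary = fractions[burn_in:]
mean_fraction = sum(stationary) / len(stationary)
print(f"time-averaged support: {mean_fraction:.3f} (Z+/Z = {10 / 25:.3f})")
```

Feeding `fractions` into a TAMSD estimator should then show linear growth at short lags and saturation at long lags, as expected for confined dynamics.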
Polling results on parties may change due to changing personnel (i.e. key politicians), changing opinions on specific issues due to changed legislation, or because the perceived relevance of a topic compared to other topics changed due to external events. Politicians may lose popularity due to errors or unfulfilled expectations if they only stay in power for a short time. Such phase transitions require a more general model with additional slow time scales. The approval ratings of several reelected political leaders are good examples of relatively stationary time series in opinion dynamics. For instance, for the former German chancellor Angela Merkel, even though approval ratings fluctuated strongly and changed, e.g. due to different crises [82], they also returned to previous levels, which in the language of our model means that even when public opinion changed, the underlying structure of zealots in favour of and against Angela Merkel did not change.
We have two main findings. First, we find different characteristic parameters for different political leaders. Second, we show that the effective population size (in the sense of voters switching their opinions independently without correlations to other switches) is in all cases much smaller than the actual size. From this observation, it can be concluded that there is a correlation between successive steps of opinion formation in the model. However, the resolution of the data does not resolve this correlation. The calculated slope of the TAMSD at short times is linear, which also shows that non-zealot voters generally are not too different from one another (heterogeneity cannot be detected with our method below a certain threshold, see section 3.3) in the sense of how often they change their minds, i.e. how easily they are convinced. In this sense the standard homogeneous voter model with zealots (or the noisy voter model) is already quite a good description of the data. It can indeed be used to model stochasticity and compare parameters among data sets.
In addition, we analysed the time series of support for political parties in the UK and support for parties in national elections for various states of the USA. While polls for political parties exhibit only small deviations from the expected scaling of the TAMSD, election data are not suitable to be analysed by a stationary voter model, because the resolution of the data is too low to resolve the characteristic time scale. Moreover, the dynamics is subdiffusive due to slow changes in the population (zealots).
Making use of theoretical knowledge about the noisy voter model, we developed a framework for analysing public opinion time series from the perspective of stochastic processes. We pointed out the properties of the system which are relevant for understanding the spectrum and autocovariance of the data. In this approach we neglected all details about human interactions and network effects arising from realistic structures beyond the fully connected network. Such effects might be considered in a second step as an additional source of heterogeneity, leading to corrections in all statistical measures. Nevertheless, we show that the homogeneous model with one characteristic time scale can be a reasonable approximation and fits the time series of approval ratings of several reelected political leaders very well.
We hope that our methods and data analysis inspire more research in the field of empirical applications of voter models. We have outlined two important generalisations of the voter model, one leading to superdiffusion and one leading to subdiffusion. Both effects can be expected to be present in data, but are difficult to measure, because the correlations of successive steps are too short to be resolved in a data set with monthly resolution, and heterogeneity might require more sensitive statistical tools. Understanding these effects would constitute a further step towards a physical theory of opinion dynamics with interpretable parameters.

Figure 1. (a) Sketch of the fully connected voter network, consisting of opponent zealots Z−, opponent susceptibles S−, proponent zealots Z+, and proponent susceptibles S+. The thin lines represent the edges in the fully connected network. The oval shape marks the set of susceptible voters. (b) Possible time evolution of the network shown in (a) and of the size of the four groups Z+, S+, S−, and Z−. While groups of zealots have a constant size, susceptibles have fluctuating sizes.

Figure 2. Correlated steps lead to a decreased effective population size and superdiffusion at short time scales. Invariance of the model under shifts in the parameters. (a) TAMSD of systems with the same effective number N of voters. For the set of parameters with ν = 10, Z = 25, Z+ = 10, N_total = 550 and d_total = 7, we see a short-time superdiffusive scaling, in contrast to the indicated linear scaling of the theory curve equation (2) with parameters Z = 25, Z+ = 10, N = 80 and d = 1. Stripes in the background indicate linear scaling. (b) PDFs of voter models with zealots with the same effective number N of voters, for the same parameters as in figure (a). The theory curve follows equation (11). The case ν = 10, d_total = 7, Z = 25, Z+ = 10, N = 80 is shown as a histogram. (c) TAMSD of systems with the same effective number d of steps; the dots represent a system with parameters d_total = 10, Z = 25, Z+ = 10, N = 80, c = 0.1. The theory curve (full line) is calculated from equation (2) for parameters d = 1, Z = 25, Z+ = 10, N = 80. Stripes in the background indicate linear scaling. (d) PDF of systems with the same effective number d of steps; the parameters are the same as in figure (c). The histogram represents the simulated system with c = 0.1 and d = 10, the full line the theory curve (equation (11)) of a system with d = 1. The gaps are due to binning issues in a small population.

Figure 3. Heterogeneous voter model leads to subdiffusive dynamics. Heterogeneous voter model with three groups S, I, and Z: numerical calculations were performed with a population size N = 280, step number d = 100, and zealots Z = 20 (Z+ = 8, Z− = 12). (a) Approximation of the TAMSD of a model with c = 0.03 and I = 100 (blue line). The dashed lines represent theory curves for the homogeneous voter model with intermediates counted as susceptible and zealots, respectively. The superposition of the dashed lines δ1(∆) + δ2(∆) is a good approximation (up to the longer characteristic time scale) of the heterogeneous model (dotted-dashed line). Stripes in the background indicate linear scaling. (b) Approximation of the PDF of a model with c = 0.03 and I = 100 by a homogeneous model (dashed line) in which all intermediate voters are counted as susceptible, i.e. Z = 20 and S = 260. (c) Variation of the parameter c of I = 100 intermediates. The dashed lines represent the limits c = 0 and c = 1, for which the model is homogeneous. (d) TAMSD: variation of the parameter I of intermediate voters with the parameter c = 0.03. The dashed lines represent the limits I = 0 and S = 0, for which the model is homogeneous.

Figure 4. Statistical analysis of opinion data. (a) Time series of approval ratings of five political leaders; the vertical axis shows the support x(t) (corresponding to n+(t) in the model) and the horizontal axis the time of the poll. (b) TAMSD of the approval rating (full lines) of Angela Merkel, Barack Obama, and Emmanuel Macron along with the uncertainty (dotted-dashed lines), estimated from the typical number of participants in the polls (see table 1); the dashed lines show the residual δX(∆) − 2σ_η², which is used for analysis; the short-time scaling of the residuals is close to linear for all three data sets. Stripes in the background indicate linear scaling. (c) PDF of three of the data sets and fitting models with parameters given in table 2.

Figure 6. At long times, opinions are not in equilibrium, but change in a subdiffusive way. (a) TAMSD of the Conservative and Labour parties in the UK; the full coloured lines are calculated from polling data [75], the dashed lines are calculated from election data [80]. Black lines represent the fitted voter models with τ = 11.5 months for the Labour party and τ = 14.2 months for the Conservative party. (b) TAMSD for data from ten US-American states; the black line is the ensemble average for each ∆, the dashed-dotted line is the power law fit to this ensemble average (the same line is also shown in figure (a)); the scaling exponent is 0.3.

Table 2. Estimated parameters of the homogeneous model for approval ratings (second column) and party support (third column). The first three rows show the measured statistical quantities µ, σ, and τ. The last three rows show the corresponding model parameters Z+, Z−, and N²/d. The unit of τ is months, and the unit of d is 1/months.