Emergence of multilevel selection in the prisoner's dilemma game on coevolving random networks

We study the evolution of cooperation in the prisoner's dilemma game, whereby a coevolutionary rule is introduced that molds the random topology of the interaction network in two ways. First, existing links are deleted whenever a player adopts a new strategy or its degree exceeds a threshold value, and second, new links are added randomly after a given number of game iterations. These coevolutionary processes correspond to the generic formation of new links and deletion of existing links that, especially in human societies, appear frequently as a consequence of ongoing socialization, change of lifestyle or death. Due to the counteraction of deletions and additions of links, the initial heterogeneity of the interaction network is qualitatively preserved, and thus cannot be held responsible for the observed promotion of cooperation. Indeed, the coevolutionary rule evokes the spontaneous emergence of a powerful multilevel selection mechanism, which, despite the sustained random topology of the evolving network, maintains cooperation across the whole span of defection temptation values.


Introduction
Social dilemmas emerge if actions warranting individual success harm collective wellbeing [1]. While cooperative behavior is universally regarded as the rational strategy leading away from the impending social decline, its evolution in groups of selfish individuals is puzzling [2]. Evolutionary game theory is frequently used as the framework within which answers to the puzzle are sought [3], and the prisoner's dilemma game seems particularly suited in this respect. In this game mutual cooperation yields the highest collective payoff that is equally shared between the players. However, defectors will do better as individuals if their opponents cooperate. In well mixed populations selfishness always leads to mutual defection [4], whereby players remain empty-handed and the society suffers. The pivotal study that launched a spree of activity aimed towards resolving the dilemma is due to Nowak and May [5], who showed that spatial structure may maintain cooperative behavior in the prisoner's dilemma game. Further mechanisms promoting cooperation are kin selection [6], direct and indirect reciprocity [7,8], as well as group [9,10,11] and multilevel selection [12,13], as recently reviewed in [14]. Related specifically to the present work is the promotion of cooperation via multilevel selection, which can be related to group selection [15], although the latter term is frequently addressed in a rather problematic fashion (see [16] for a recent recap).
Particularly vibrant in recent years has been the subject of evolutionary games on complex networks [17,18]. High cooperation levels reported on scale-free networks highlighted the beneficial impact of heterogeneity that characterizes their degree distribution [19]. Notably, the promotive impact of heterogeneous states on the evolution of cooperation has been reported also in other contexts [26,27]. Although several studies have since elaborated on different aspects of strategy adoption on complex networks [20,21,22,23,24,25,28,29,30,31], open questions remain. Foremost, it is still of interest to investigate how the coevolution of networks affects the evolution of strategies. Following the earlier works in the context of evolutionary game theory [32,33,34], it has recently been shown that highly heterogeneous interaction networks may evolve spontaneously from simple coevolutionary rules [35,36,37,38,39], and moreover, processes like appropriate reactions to adverse ties [40,41], reputation-based partner choice [42], as well as increase of teaching activity [43,44], have all been considered as coevolutionary rules that can promote cooperative behavior. This subject is intimately connected with the seminal works on network growth [45,46] and their resilience to error and attack [47,48,49,50] as means to alter the topology in order to affect the spread of epidemics and viral infections [51,52,53] in an efficient way [54], as recently argued in [55]. Moreover, the evolution on networks is increasingly often accompanied also by the evolution of networks not just in the context of evolutionary games [56,57], but indeed networks are to be seen as evolving entities that may substantially influence all dynamical processes that are taking place on them [58,59].
The aim of the present work is to show that coevolutionary rules molding the interaction network can influence the evolution of cooperation not via the emergence of strong degree heterogeneity, as reported earlier [37,38,39], but may instead introduce processes that affect the macroscopic dynamics of strategy adoption. We introduce a simple strategy-independent coevolutionary rule that qualitatively preserves the Poissonian degree distribution of the initial random interaction network. The coevolutionary rule involves the deletion of existing links upon the adoption of a new strategy or upon exceeding a given maximal degree, and the addition of new links after every τ full Monte Carlo steps. These coevolutionary additions and deletions of links are motivated by the fact that, especially in human but also in animal societies, ties between members of the population change in time. In particular, the formation of links is common (e.g. we socialize, we make friends); thus, in the course of a lifetime we certainly form many new ties with others. We take this into account in our model by randomly adding a new link to each player every τ full Monte Carlo steps. Note that since we add new links every τ game iterations, the number of links an individual accumulates over time can be directly linked with age. This is why extinction (death) of players is considered by deleting existing links whenever the degree of a player exceeds a threshold value. Finally, there are situations in life when an individual changes a significant part of its existence. Within our model we consider this to be the player's strategy, while in reality related examples are changes of lifestyle, moral values or political orientation, all of which typically lead to a restructuring of one's connections with others. Accordingly, we delete the existing links of the player that changes its strategy. From a biological viewpoint, the latter act can be linked with an invasion of the subordinate species and the subsequent replacement by a newborn of the victor.
The proposed coevolutionary rule thus generically describes the formation and deletion of links with other members of the society in a general and strategy-independent manner (the same rules apply for cooperators and defectors). Jointly, the two processes of link addition and deletion annihilate each other's fingerprint on the heterogeneity of the degree distribution, thereby eliminating network structure as a potentially decisive factor in the evolution of cooperation. Remarkably though, we demonstrate that the coevolutionary rule spontaneously evokes a dynamical mechanism that can be interpreted as multilevel selection (see e.g. [12]) in that, on the macroscopic level, groups of cooperators have much better chances of dissemination than groups of defectors, whereas on the microscopic level, defectors within a given group are superior to cooperators. The spontaneous emergence of multilevel selection within the proposed coevolutionary model leads to complete dominance of cooperators across the whole span of the temptation to defect, provided τ is large enough. The study thus supplements previous works examining the impact of different coevolutionary rules, and moreover, demonstrates that heterogeneity can be of secondary importance for the evolution of cooperation on complex networks.
The remainder of this paper is organized as follows. In the next section we describe the employed evolutionary prisoner's dilemma game and the protocol for the coevolution of the random network. Section 3 is devoted to the presentation of the main findings, whereas in the last section we summarize conclusions based on them.

Table 1. Payoff matrix of the prisoner's dilemma game. Strategies in rows get the depicted payoff when playing the game with the strategies in columns.

      C    D
C     1    0
D     b    0

Mathematical model
Here the prisoner's dilemma game is used as a representative example of a social dilemma, whereby we adopt the same parametrization as proposed in [5]. Accordingly, the game is characterized by the temptation to defect T = b, the reward for mutual cooperation R = 1, and the punishment P as well as the sucker's payoff S equaling 0, whereby 1 < b ≤ 2. In the game two cooperators facing one another acquire R, two defectors get P, whereas a cooperator receives S if facing a defector, who then gains T. This can be summarized succinctly by the corresponding payoff matrix given in Table 1. It is worth noting that the prisoner's dilemma we use is not strict in that P is not strictly larger than S, but the qualitative dynamics of the game is thereby not affected. Initially, each player x is designated either as a cooperator ($s_x = C$) or defector ($s_x = D$) with equal probability, and is placed on a random network that is constructed from N individuals so that the average and minimal degree are $k_{\mathrm{avg}} = 4$ and $k_{\mathrm{min}} = 1$ (no player is detached), respectively. Duplicate links are also omitted. Evolution of the two strategies is performed in accordance with the Monte Carlo simulation procedure comprising the following elementary steps. First, a randomly selected player x acquires its payoff $p_x$ by playing the game with all its $k_x$ neighbors. Next, one randomly chosen neighbor of x, denoted by y, also acquires its payoff $p_y$ by playing the game with all its $k_y$ neighbors. Last, if $p_x > p_y$, player x tries to enforce its strategy $s_x$ on player y in accordance with the probability $W(s_x \to s_y) = (p_x - p_y)/(b k_q)$, where $k_q$ is the larger of the two degrees $k_x$ and $k_y$. This proportional imitation rule [60] is used frequently when players on heterogeneous networks have different degrees [19,23,24,61]. Notably, an alternative to the latter is the Fermi strategy adoption rule [62], which enables tuning of the selection intensity via a single parameter K towards the weak selection limit [63,64]. However, caution should be exercised when applying it on heterogeneous networks, since due to the different degrees of participating players the temperature K, and thus also the selection intensity, effectively varies from one strategy adoption to the other. We therefore use the simplified strategy adoption rule based on proportional imitation, but note that results similar to those reported below are expected also for reasonably weak selection. However, if the strategy adoption process becomes comparable to the flip of a coin, the multilevel selection reported below may be impaired. In accordance with the applied Monte Carlo procedure incorporating the random sequential update, each player is selected once on average during a full Monte Carlo step (MCS).
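The elementary step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used to obtain the reported results: the function names, the dictionary-of-sets representation of the network (`neighbors`) and the string labels "C" and "D" are assumptions made here for readability. The sketch also assumes that every player has at least one neighbor, in line with $k_{\mathrm{min}} = 1$.

```python
import random

def payoff(x, strategy, neighbors, b):
    """Accumulated payoff of player x against all its neighbors (R = 1, T = b, S = P = 0)."""
    total = 0.0
    for y in neighbors[x]:
        if strategy[y] == "C":
            # Only a cooperating opponent yields a nonzero payoff.
            total += 1.0 if strategy[x] == "C" else b
    return total

def adoption_probability(p_x, p_y, k_x, k_y, b):
    """Proportional imitation: W = (p_x - p_y) / (b * k_q), with k_q = max(k_x, k_y)."""
    if p_x <= p_y:
        return 0.0  # x only tries to enforce its strategy if it earned strictly more
    return (p_x - p_y) / (b * max(k_x, k_y))

def elementary_step(players, strategy, neighbors, b, rng):
    """One elementary Monte Carlo step.

    Returns (x, y) if y adopted a NEW strategy from x, otherwise None.
    Assumes every player has at least one neighbor.
    """
    x = rng.choice(players)
    y = rng.choice(sorted(neighbors[x]))  # sorted() makes the set indexable
    p_x = payoff(x, strategy, neighbors, b)
    p_y = payoff(y, strategy, neighbors, b)
    w = adoption_probability(p_x, p_y, len(neighbors[x]), len(neighbors[y]), b)
    if strategy[x] != strategy[y] and rng.random() < w:
        strategy[y] = strategy[x]
        return (x, y)
    return None
```

Under these assumptions, one full Monte Carlo step corresponds to calling `elementary_step` N times, so that each player is selected once on average. Returning the pair (x, y) only when the strategy of y actually changes makes it easy to attach the link deletion of the coevolutionary rule to exactly those events.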
In addition to the evolution of the two strategies, a coevolutionary rule is implemented as follows. Whenever player x adopts a new strategy, all its links, except for the one with the donor of the new strategy, are deleted. Hence, in addition to adopting a new strategy, the player is separated from its former allies and gets $k_x = 1$. This process of strategy adoption and simultaneous link deletion is demonstrated in Fig. 1, and can be motivated by the fact that changes in lifestyle, moral values, political orientation or religious beliefs frequently result in the deletion of existing ties we have formed with others. To counteract the depletion of links that constitute the random network, all individuals are allowed to form a new link with a random player with which they are not yet connected. The latter process, happening after every τ full Monte Carlo steps, corresponds to the continuous process of socialization or making of new friends, which typically entails the formation of new links. We also take into account aging, and accordingly, as soon as $k_x$ reaches a threshold $k_{\mathrm{max}}$, player x dies and is replaced by a newborn having the same strategy and keeping a single randomly selected link from its predecessor to assure connectedness. Note that since new links are added every τ game iterations, the number of links a player accumulates over time can be directly linked with age. Within the current work $k_{\mathrm{max}}$ does not play a decisive role and was simply chosen large enough so as not to influence the initial random network topology and the subsequent evolution of cooperation. The presented results were obtained on networks hosting $N = 10^4$ players, for which $k_{\mathrm{max}} = 500$ has proven to be sufficiently large. Importantly, due to the continuous additions and deletions of links, the initial Poissonian outlay of the degree distribution is qualitatively preserved irrespective of b and τ, as shown in Fig. 2. It is intriguing that, although the coevolutionary rule does not affect the heterogeneity of the interaction network in a significant manner, the promotion of cooperation depends crucially on τ, as can be inferred from the caption of Fig. 2. These motivational results presented in Fig. 2 will be explained in the next section. It is also worth mentioning that the process illustrated in Fig. 1 may occasionally result in detached players that originally formed the neighborhood of the invaded player, as shown in Fig. 1(d). In this case we relinked the detached player randomly onto the network.
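The three ingredients of the coevolutionary rule (link deletion upon strategy adoption, random link addition every τ full Monte Carlo steps, and aging at $k_{\mathrm{max}}$) can be sketched as follows. Again, this is a hedged illustration rather than the original implementation: the function names and the dictionary-of-sets network are assumptions, and the text does not specify how players detached by the aging step are treated, so the sketch simply reuses the random relinking described for Fig. 1(d).

```python
import random

def relink_if_detached(z, neighbors, rng):
    """Reattach a player left without links to one randomly chosen other player."""
    if not neighbors[z]:
        w = rng.choice([u for u in sorted(neighbors) if u != z])
        neighbors[z].add(w)
        neighbors[w].add(z)

def prune_links(x, keep, neighbors, rng):
    """Delete all links of x except the one to `keep` (assumed to be a neighbor of x)."""
    dropped = neighbors[x] - {keep}
    for z in dropped:
        neighbors[z].discard(x)
    neighbors[x] = {keep}
    # Former neighbors that lost their only link are relinked randomly.
    for z in sorted(dropped):
        relink_if_detached(z, neighbors, rng)

def on_strategy_adoption(y, donor, neighbors, rng):
    """Player y adopted a new strategy from `donor`: keep only the link to the donor."""
    prune_links(y, donor, neighbors, rng)

def age_check(x, neighbors, k_max, rng):
    """If the degree of x has reached k_max, the newborn keeps one random link.

    Relinking of players detached by this step is an assumption of this sketch.
    """
    if len(neighbors[x]) >= k_max:
        keep = rng.choice(sorted(neighbors[x]))
        prune_links(x, keep, neighbors, rng)

def add_random_links(neighbors, rng):
    """Every player forms one new link with a random, not yet connected player."""
    players = sorted(neighbors)
    for x in players:
        if len(neighbors[x]) >= len(players) - 1:
            continue  # x is already linked to everyone else
        while True:
            w = rng.choice(players)
            if w != x and w not in neighbors[x]:
                break
        neighbors[x].add(w)
        neighbors[w].add(x)
```

In a full simulation loop one would call `on_strategy_adoption` whenever a strategy transfer succeeds, `add_random_links` once after every τ full Monte Carlo steps, and `age_check` for players whose degree has grown. The strategy itself is untouched by `age_check`, which reflects that the newborn inherits the strategy of its predecessor.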
Next, we will systematically analyze the evolution of cooperation as a function of b and τ, which are the two crucial parameters within the proposed coevolutionary model. Notably, the evolution of cooperation on static random networks has been studied in [65], and the reader is referred there for comparisons with the coevolutionary model proposed above.

Results
Within the coevolutionary model the final outcome of the prisoner's dilemma game is always an absorbing C or D state. We note that this is an inherent property of the coevolutionary model that prevails irrespective of the system size N. In order to account for the resulting fluctuating output near sharp transition points, we depict in Fig. 3 not the density of cooperators $\rho_C$, but rather the probability $\psi_C$ that the final state is $\rho_C = 1$, whereby the latter is determined from 1000 independent runs for each particular combination of b and τ. Foremost, Fig. 3 depicts the effect of different values of τ on the evolution of cooperation. Clearly, the coevolutionary process promotes cooperation extremely effectively, resulting in full dominance of cooperators up to b = 1.1 at τ = 1, b = 1.6 at τ = 10 and up to b = 2.2 at τ = 70, where the latter value of the temptation to defect is already past the span of the prisoner's dilemma game as it is considered here. Given that the continuous deletions and additions of links entailed in the coevolutionary process qualitatively preserve the initial random topology and the heterogeneity of the interaction networks (see Fig. 2), the observed promotion of cooperation cannot be attributed to mechanisms that rely on heterogeneous environments reported earlier [19,26]. Thus, we have to look instead for additional clues that may explain the doom of defectors at large values of τ, as evidenced in Fig. 3.
Striving towards a mechanism that could explain the promotion of cooperation, we show in Fig. 4 temporal courses of $\rho_C$ at b = 1.5 for increasing values of τ from left to right. First, it is worth noticing that the absorbing cooperative state is indeed reached fairly quickly, i.e. within a few thousand full Monte Carlo steps. But most importantly in Fig. 4, we point out the emergence of time intervals during which $\rho_C$ is constant. This cascade-like feature becomes increasingly pronounced as τ increases. Namely, at τ = 30 (dash-dotted black line) it is practically absent, whereas at τ = 500 (solid red line) the cumulative duration of dormancy of $\rho_C$ surpasses that of active phases. We argue that the reason for the emergence of these inactive windows lies in the newly introduced coevolutionary process, which dictates the deletion of links whenever a player adopts a new strategy, and also if its degree reaches $k_{\mathrm{max}}$. If τ is sufficiently large, meaning that new links are added rarely, the deletions of links lead to the emergence of homogeneous and virtually isolated groups of players [as schematically depicted in Fig. 1(a) and 1(c), not taking into account the dashed links]. These groups remain inactive for as long as it takes for the newly added links [dashed link in Fig. 1(a) and 1(c)] to reconnect them with one another, the duration of which is roughly equivalent to τ (see the inset of Fig. 4). It is important to note that during the inactive phase there are practically no strategy transfers taking place, and thus the main source of link deletions is disabled. Consequently, the addition of new links can gradually reconnect the detached groups, which then again triggers an avalanche of strategy adoptions [see Figs. 1(b) and 1(d)], which in turn starts the whole process anew, until eventually an absorbing state is reached.

Thus, the temporal plots in Fig. 4 arguably evidence the spontaneous emergence of multilevel selection, similar to that proposed recently in [12], due to the introduction of the proposed coevolutionary rule. We emphasize that the groups are not introduced a priori but emerge spontaneously for appropriate values of τ. During the dormant phases isolated groups of cooperators can enhance their strength, while groups of defectors weaken as there is nobody to exploit. Notably, from the viewpoint of vulnerability there is no difference between defectors having a small or large degree. Thus, as soon as the two types of groups reestablish a sufficiently strong interconnectedness, cooperators can successfully invade the defectors, thereby gradually increasing the cooperative domains. This is schematically depicted in Fig. 1 if considering the cooperators to be green and defectors to be red. The latter processes manifest as rather steep jumps in the temporal traces of $\rho_C$ at large enough τ, which are then again followed by dormancy, since the many strategy adoptions again lead to the isolation of homogeneous groups of players. At small values of τ (dash-dotted black line in Fig. 4), however, these steps are absent, which indicates that the addition of new links is too fast for the homogeneous groups to become isolated enough to trigger the multilevel selection. Importantly, a thorough isolation is necessary (compare the curves in the inset of Fig. 4), since without it groups hosting both strategies are susceptible to an overrule by defectors. Namely, in the small τ region the fast additions and deletions of links yield mean-field type conditions that are arguably harmful for cooperators.

The above interpretation can be corroborated nicely by the results presented in Fig. 5, where the critical temptation to defect $b_c$ is plotted as a function of τ. We determined $b_c$ as the lowest value of b where an absorbing D state is attained with probability one. It can be observed that, as argued above, at small values of τ, where multilevel selection cannot be fully established, the promotion of cooperation is indeed marginal. However, it improves steeply and reaches a maximum at approximately τ = 70. According to the temporal courses presented in Fig. 4, this is roughly the value of τ where the multilevel selection becomes fully pronounced. In particular, note that the steps at τ = 100, and even at τ = 30, are practically just as frequent as at τ = 500. Crucially, however, the dormant phases are shorter at smaller τ. Thus, as τ increases past the optimal value, solely the dormant phases are prolonged, yet the multilevel selection remains equally intense. Therefore, at values of τ past the optimum the defectors have more time to overtake individual groups during the dormant phases. Note that the isolation of homogeneous groups from the opposite strategy is never complete, since neither individual players nor groups can become fully detached from the network. The prolongation of dormant states thus results in a slight decrease and subsequent saturation of $b_c$. Nevertheless, since the multilevel selection remains intact even at large τ, the promotion of cooperation is still significant, letting the critical temptation to defect hover comfortably over the maximal b permissible within the prisoner's dilemma game.
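The two quantities used in this section, the probability $\psi_C$ of ending in the absorbing C state and the critical temptation $b_c$, follow directly from the final states of independent runs. A minimal sketch of the bookkeeping is given below. The function names and the input format (a list of final cooperator densities per run, and a dictionary mapping b to the estimated $\psi_C$) are assumptions of this sketch, and any numbers passed to it are up to the user; none are taken from the figures.

```python
def fixation_probability(final_densities):
    """psi_C: fraction of independent runs that ended in the absorbing C state (rho_C = 1)."""
    return sum(1 for rho in final_densities if rho == 1.0) / len(final_densities)

def critical_temptation(psi_by_b):
    """b_c: lowest scanned b at which the absorbing D state is reached in every run.

    `psi_by_b` maps each scanned value of b to the estimated psi_C.
    Returns None if cooperators survive in at least one run for all scanned b.
    """
    for b in sorted(psi_by_b):
        if psi_by_b[b] == 0.0:
            return b
    return None
```

Since $b_c$ is defined through a probability of exactly one for the D state, its numerical estimate depends on the number of runs per parameter combination and on the resolution of the scanned b values.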

Summary
In sum, we have elaborated on the evolution of cooperation in the prisoner's dilemma game on evolving random networks. Foremost, we have shown that a simple coevolutionary rule, qualitatively preserving the initial heterogeneity of the interaction network, may spontaneously evoke a powerful multilevel selection mechanism that promotes cooperation across the whole span of values of temptation. The extent of the promotive effect depends on a single parameter τ, determining the frequency of forming new links between randomly chosen but not yet linked players. While this dependence exhibits a saturating outlay with a local maximum at τ = 70, the cooperation promoting mechanism remains intact also for substantially larger τ across the whole span of b. The optimum emerges as a result of the fully developed multilevel selection, which is not attainable for small values of τ because too frequent additions of new links hinder the formation of isolated homogeneous groups and yield mean-field type conditions. On the other hand, as additions of new links become rare (τ is increased), the saturating effect of cooperation promotion sets in due to the emergence of prolonged delays between periods of active multilevel selection, during which defectors may be able to invade and overtake individual cooperative groups. Our work demonstrates that simple strategy-independent coevolutionary rules may spontaneously evoke dynamical mechanisms that affect the adoption of strategies on the macroscopic level of evolutionary game dynamics. This presently manifests as multilevel selection that strongly promotes cooperation in the prisoner's dilemma game.