Adaptive multilayer networks resolve the cooperation dilemma induced by breaking the symmetry between interaction and learning

We study the coupled dynamics of strategy updating and partner switching on adaptive multilayer networks whose structure is split into an interaction layer for accumulating payoffs and a learning layer for updating strategies. Two types of adaptive multilayer network dynamics are modeled to study the evolution of cooperation. In the first, the selected individual either varies his strategy updating environment or switches his interaction partners during the partner switching process. It is shown that an increasing ratio of interaction network reconfiguration facilitates the coevolution of cooperation, indicating that the interaction network plays a profound role in promoting cooperation. In the second, the selected player simultaneously updates his learning network and interaction network during the rewiring process. When partner switching is infrequent, the evolution of cooperation is hindered whenever the two layers do not coincide. However, when partner switching becomes frequent, breaking the symmetry between the interaction network and the learning network resolves the social dilemma and enhances the evolution of cooperation. Moreover, a comparison between the two adaptive multilayer network dynamics shows that the first type, which permits only one layer to evolve at each step, suppresses the evolution of cooperation.


Introduction
Evolutionary game theory provides a general approach to modeling Darwinian competition among species in ecosystems and human societies [1,2]. It has been widely used to study cooperative behavior in evolving populations and, in particular, to elucidate how costly cooperators overcome rewarded defectors in fierce evolutionary competition. Evolutionary game dynamics on structured populations favor the evolution of cooperation through a mechanism known as network reciprocity [3][4][5]. Essentially, network reciprocity reflects the fact that individuals have diverse opportunities for interacting with others. Defectors may dominate cooperators in well-mixed populations, where individuals interact equally with everyone, yet lose this advantage in structured populations: defectors have limited opportunities to exploit cooperative clusters, while cooperators along the interface of the clusters can offset their losses to defectors by interacting with conspecifics within the clusters [6,7]. As a consequence, network reciprocity facilitates cooperation by aggregating cooperators into clusters in physical or other spaces to resist defection. Numerous investigations have studied evolutionary games on monolayer networks, including regular [6,7], random [8][9][10][11], small-world [12][13][14], scale-free [15][16][17][18][19][20][21][22][23] and coevolving networks [24][25][26][27][28][29][30][31][32][33][34][35], to name a few examples that deliver a better understanding of network reciprocity.
However, increasing attention has been shifting toward the evolution of cooperation on interdependent and multilayer networks, which extend the scope of network reciprocity, since multilayer networks characterize more real-life situations of coupled systems: diverse infrastructures comprising transportation, power systems and communication are coupled together and should be modeled as multiplex networks [36][37][38]. Significant work on evolutionary cooperation on interdependent networks has concentrated on the ways in which the interdependency between different networks may alleviate social dilemmas [38,39], not least interconnectedness [40,41], information sharing [42,43], biased imitation [44,45], as well as coupled evolutionary fitness [46][47][48][49]. In particular, the distinction between interaction and learning networks reflects the fact that humans play different roles in different network relationships [50][51][52][53][54]. Individuals reap their payoffs by playing evolutionary games with partners on the interaction network. Yet instead of relying only on information from interacting partners to make decisions, individuals have diverse ways of obtaining information and making decisions via a network that rarely overlaps with the interaction network. Significantly, studies on symmetry breaking between learning and interaction networks show that cooperators are hampered whenever the two networks do not coincide [50,51].
To resolve the social dilemma on asymmetric multilayer networks, a more realistic situation is taken into account by introducing coevolution between strategies and multilayer network topologies. Coevolution concentrates on the intriguing interplay between the dynamics on networks and the dynamics of networks, which tightly entangles the evolution of strategies with the topology of adaptive networks [55]. Coevolving networks enhance the evolution of cooperation beyond the boundaries imposed by static networks [24][25][26][27][28][29][30][31][32][33][34][35]. The combination of strategy evolution and structure adaptation presents numerous striking phenomena that reveal the complexity of interaction: adaptive structures based on rules of breaking connections with egotists and rewiring to altruists enable the prevalence of cooperation. By combining the previous findings on symmetry breaking between interaction and learning with those on the evolution of cooperation on coevolving networks, this investigation studies how adaptive multilayer networks impact the evolution of cooperation. An individual not only switches his learning partners for a favorable competitive environment, but also changes his interaction environment to avoid exploitation by defectors. This is realistic, since humans can stop interacting directly with egotists while still exchanging information with them over the internet, so that behavior propagation persists between individuals who have no direct interactions. The remainder of this investigation is organized as follows. The next section describes the mathematical model. We then specify the results in detail, and discussions and conclusions are drawn in the last section.

Model
We conduct our model on multilayer networks. Interactions are characterized by the prisoner's dilemma game, which captures the conflict of interest between the individual and the group. In a typical prisoner's dilemma game, each of two players acts either as a cooperator or as a defector. Both players obtain a reward R for mutual cooperation and receive a punishment P for mutual defection. In addition, a defector enjoys the maximum temptation to defect, T, by exploiting a cooperator, who acquires the sucker's payoff S. Following common practice, the payoff matrix is rescaled in terms of a cost-to-benefit ratio u ∈ (0, 1). Such a formalism not only satisfies the dilemma conditions T > R > P > S and 2R > T + S, but also has the advantage of controlling the game with a single parameter.
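The rescaled matrix itself is not reproduced in the text above. One common parametrization consistent with the stated constraints, written here as an assumption on our part rather than necessarily the authors' exact choice, is the donation-game form with R = 1 − u, S = −u, T = 1, P = 0:

```latex
% Rows: focal player's strategy; columns: opponent's strategy.
% Donation-game rescaling (an assumed, illustrative choice) with u \in (0,1):
% R = 1-u, S = -u, T = 1, P = 0, so T > R > P > S and 2R > T + S hold.
\bordermatrix{
    & C     & D   \cr
  C & 1-u   & -u  \cr
  D & 1     & 0   \cr
}
```

Any rescaling satisfying T > R > P > S and 2R > T + S with a single parameter u would serve the same purpose.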
At the outset of the evolution, each player is assigned as either a cooperator (C) or a defector (D) with equal probability, and is simultaneously located on both layers of the multilayer network. Each player reaps his payoffs by interacting with all neighbors on the interaction network, while updating his strategy by imitating one of his nearest neighbors on the learning network. The initial population structure is constructed as a two-layer random network. In the interaction layer, M = 5000 links randomly pair up N = 1000 nodes, so the average degree is k = 2M/N. Initially, the learning layer has a structure identical to that of the interaction layer.
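As an illustration, the initial two-layer structure described above can be sketched as follows. This is a minimal stand-alone Python sketch, not the authors' code; all function and variable names are ours:

```python
import random

def random_edge_set(n_nodes, n_edges, rng):
    """Draw n_edges distinct undirected links (no self-loops) over n_nodes."""
    all_pairs = [(i, j) for i in range(n_nodes) for j in range(i + 1, n_nodes)]
    return set(rng.sample(all_pairs, n_edges))

rng = random.Random(42)
N, M = 1000, 5000
interaction = random_edge_set(N, M, rng)           # interaction layer
learning = set(interaction)                        # learning layer: identical initially
strategies = [rng.choice("CD") for _ in range(N)]  # C or D with equal probability
avg_degree = 2 * M / N                             # k = 2M/N = 10
```

Storing edges as sorted node pairs makes later overlap comparisons between the two layers a simple set intersection.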
The proposed model integrates the entangled dynamics of strategy updating and partner switching in the learning and interaction environments. In accordance with standard Monte Carlo simulation, each generation comprises three elementary steps. First, a player i is randomly picked from the population and accumulates payoff π_i by playing games with all his neighbors on the interaction network. Then either strategy updating or partner switching takes place, controlled by a pre-assigned probability w: the individual experiences strategy updating with probability 1 − w, or attempts to adjust his neighborhood for future interactions with the complementary probability w.
Whenever the strategy updating process occurs, player i randomly selects one neighbor, say j, on the learning network. Then i adopts j's strategy with a probability given by the Fermi function W(s_i ← s_j) = {1 + exp[(π_i − π_j)/κ]}^(−1), where π_i and π_j are the payoffs of i and j, respectively, and κ = 0.1 quantifies the noise that permits irrational choices. The essence of this imitation process is that individuals with higher payoffs are more likely to spread their strategies [6,7].
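The Fermi rule translates directly into code; the sketch below is illustrative, with a function name of our own:

```python
import math

def fermi_prob(pi_i, pi_j, kappa=0.1):
    """Probability that player i adopts j's strategy:
    W = 1 / (1 + exp((pi_i - pi_j) / kappa)).
    A richer neighbor j is imitated with probability close to 1, while
    imitating a poorer neighbor stays possible with small probability."""
    return 1.0 / (1.0 + math.exp((pi_i - pi_j) / kappa))
```

For equal payoffs the rule reduces to a coin flip, while a payoff deficit of a few units at κ = 0.1 makes imitation nearly certain.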
When player i attempts to adjust his neighborhood, the following partner switching rule applies. He cuts the link to the neighbor who has provided him with the lowest payoff of all his neighbors. If more than one such neighbor exists, i severs one of them chosen uniformly at random. After this, i establishes a new link with a randomly selected player. Self-loops and duplicate links are prohibited, so the total number of links remains constant during the evolution. To ensure the connectivity of both the learning network and the interaction network, the neighborhood adjustment ceases whenever it would produce isolated individuals.
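A minimal sketch of this partner switching rule, in our own illustrative code (the paper gives no implementation):

```python
import random

def switch_partner(i, neighbors, payoff_from, players, rng):
    """Return (dropped, new): cut the link to a neighbor providing i the lowest
    payoff (ties broken uniformly at random), then rewire to a randomly chosen
    player, excluding i itself and i's current neighbors, so that no self-loop
    or duplicate link is created."""
    worst = min(payoff_from[j] for j in neighbors)
    dropped = rng.choice([j for j in neighbors if payoff_from[j] == worst])
    new = rng.choice([k for k in players if k != i and k not in neighbors])
    return dropped, new
```

A full simulation would additionally skip the rewiring whenever it would isolate the dropped neighbor, as stated above.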
We adopt two types of adaptive multilayer network dynamics, as such networks are close representatives of realistic situations and are widely studied. Scenario 1: player i either varies his strategy updating environment with probability w_L, or switches his interaction partners with probability w_I, where w_L + w_I = w. This scenario naturally distinguishes the evolutionary time scales of the learning network and the interaction network. By adjusting w_I ∈ [0, w], the evolutionary dynamics can range from an adaptive learning network with a static interaction network to a static learning network with an adaptive interaction network. This scenario allows us to explore the respective roles of the adaptive learning network and the adaptive interaction network in the evolution of cooperation.
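The event selection of scenario 1 amounts to a three-way split of the unit interval; a sketch under our own naming:

```python
import random

def choose_event(w, w_I, rng):
    """Scenario 1: strategy updating with probability 1 - w; otherwise rewire
    the interaction layer with probability w_I or the learning layer with the
    remaining probability w_L = w - w_I."""
    r = rng.random()
    if r < 1.0 - w:
        return "strategy_update"
    if r < 1.0 - w + w_I:
        return "rewire_interaction"
    return "rewire_learning"
```

Setting w_I = 0 or w_I = w recovers the two limiting cases of a static interaction layer or a static learning layer.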
Scenario 2: player i simultaneously updates his learning network and interaction network with probability w, where the two layers may differ. In particular, player i first updates his interaction network according to the partner switching rule: he cuts the link to a neighbor (say x) yielding him the lowest payoff and then reconnects to a randomly selected player y. After this, with probability p, player i checks whether x is also his neighbor on the learning network. If it is, i dismisses learning partner x as well; otherwise, the learning neighbor z selected by the partner switching rule is dismissed. Thereafter, i links to player y as a new learning partner, in concert with the new interaction partner. Otherwise, with probability 1 − p, player i dismisses the connection with a learning neighbor l according to the partner switching rule and links to a new learning partner m; under this circumstance, the network dynamics on the two layers are independent of each other. When p = 1, scenario 2 recovers the traditional setting in which the learning network and the interaction network completely overlap. For p < 1, the overlap between the two layers is broken, and decreasing p leads to sparser overlapping links between them. We continually compare the outcomes of the two scenarios, which differ in the structure of the adaptive multilayer networks, to probe the optimal conditions for cooperative behavior.
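The branching logic of the coupled rewiring in scenario 2 can be summarized as follows. This is a schematic sketch with labels of our own; the full payoff bookkeeping and layer updates are omitted:

```python
import random

def scenario2_learning_move(p, x_in_learning, rng):
    """After i rewires interaction partner x -> y: with probability p the
    learning layer follows the interaction layer (dismiss x if x is also a
    learning neighbor, else dismiss the worst learning neighbor, then link to
    y); with probability 1 - p the learning layer rewires independently via
    the partner switching rule."""
    if rng.random() < p:
        return "dismiss_x_link_y" if x_in_learning else "dismiss_worst_link_y"
    return "independent_rewire"
```

With p = 1 every interaction rewiring is mirrored on the learning layer, recovering the overlapping monolayer limit; with p = 0 the two layers rewire independently.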
Simulation results are obtained in the following way. We concentrate on how cooperators fare under the introduced coevolutionary rule. To improve accuracy, we calculate the average cooperation level ρ_C over the last 5 × 10^3 MCS of a total of 10^6 MCS (or the simulation stops when C or D occupies the whole population), and the final results are averaged over 1000 independent realizations. Figure 1 shows the evolution of cooperation as a function of w for different cost-to-benefit ratios u. The cooperation level falls for small w_L (w_L = w), yet sufficiently frequent learning partner switching guarantees the survival of cooperators. On the other hand, figure 1(b) illustrates in detail the effect of the adaptive interaction network on the evolution of cooperation when learning partner switching is left unconsidered. Increasing w_I (w_I = w) enhances the level of cooperation, a result widely replicated in previous studies on the coevolutionary dynamics of strategy imitation and partner switching. The key to promoting cooperation is that individuals can switch off adverse connections with defectors [24][25][26][27][28][29][30][31][32][33][34][35]. Consequently, individuals not only avert exploitation by defectors but also escape their invasion. We have also found that when the interaction network cannot evolve, the exploitation of cooperators by defectors intensifies, since cooperators are unable to dismiss adverse links swiftly and are thus continually exploited in future interactions. Retarding the invasion of defectors therefore plays a crucial role in promoting cooperation, which can be realized through frequent learning partner switching. If we instead fix the learning network, defectors are more likely to invade cooperators, yet unable to enjoy a significant evolutionary advantage due to network reciprocity: clustered cooperators can effectively fend off the invasion of defectors by reproducing among themselves. Network reciprocity speeds up the formation of cooperator clusters.

Evolution of cooperation based on scenario 1
However, breaking the symmetry between the interaction network and the learning network impedes the evolution of cooperation. It is well known that cooperators can persist by aggregating into cooperative clusters. On monolayer networks, defectors can hardly invade cooperators from the interface of the clusters unless the defectors have higher payoffs. In addition, cooperators inside the clusters cannot change their strategies without mutation, since they have no opportunity to meet defectors. Breaking the symmetry between the interaction network and the learning network, however, makes it possible for cooperators to learn from outside defectors. The invasion of defectors then breaks up cooperative clusters from the inside, and the reciprocity sustaining cooperation is destroyed. In figure 1(a), a low frequency of learning partner switching breaks the symmetry between the layers. Once cooperators inside the clusters switch to defection, infrequent learning partner switching fails to stem defection in time, which may lead to a snowball effect. Nevertheless, a low frequency of interaction partner switching in figure 1(b) does not result in the collapse of cooperation, indicating that adjusting interaction partners is a more efficient way of enhancing cooperation than switching learning partners.
In figure 2, we extend the analysis to arbitrary combinations of w_L and w_I and show the phase diagram of the evolution of cooperation. Cooperators fail to survive for small w_L and w_I, yet full cooperation can be achieved as w_L and w_I increase. In addition, when w = w_L + w_I = 1, players have no opportunity to update their strategies while the topology of the multilayer networks reconfigures all the time, and thus the cooperation level remains at its initial value of 0.5. Moreover, frequent learning network reconfiguration and frequent interaction partner switching differ in their efficiency at favoring the evolution of cooperation. Setting w_L + w_I = 0.6 and moving along this line, one passes from cooperator dominance through the coexistence phase separating the two absorbing states, and even into defector dominance, indicating that cooperative behavior is impeded to some extent as the ratio of learning network reconfiguration increases.
To shed light on the impact of the ratio of multilayer reconfiguration on the evolution of cooperation, we explore in figure 3 the evolutionary consequences of the time scales associated with the learning partner switching and interaction partner rewiring processes. It is found that reducing interaction network reconfiguration hinders the evolution of cooperation, implying that reducing the exploitation by defectors is more conducive to cooperation than decelerating their invasion. Network reciprocity reveals that natural selection can favor cooperation when who-meets-whom is determined by spatial relationships. More precisely, the underlying mechanism favoring cooperation in structured populations is the interplay between reducing the exploitation by defectors and limiting their opportunities to spread defection. In a frequently adapting interaction network, avoiding interactions with defectors enhances the payoff advantage of cooperators and abates the payoffs defectors gain from exploiting cooperators. Such a competitive advantage guarantees the survival and reproduction of cooperators even if defectors have more chances to invade. Conversely, avoiding defective disseminators restricts the propagation of defection, but wipes out the payoff advantage of cooperators, since defectors can frequently exploit cooperators; defection then still dominates cooperation despite its limited propagation paths. As a consequence, an increasing ratio of learning network reconfiguration raises the opportunities for defectors to interact with cooperators, and thus saps the competitiveness of cooperators. Moreover, the enhanced competitiveness of cooperation gained by restricting the invasion routes of defection cannot offset the loss induced by the suboptimal payoffs of cooperators, and thus cooperation is hindered as the ratio of interaction network reconfiguration decreases.
It is interesting to explore the underlying mechanism connecting the evolution of strategies with the evolution of structures. Here we present typical time evolutions of the coupled dynamics of strategies and structures in figure 4. Cooperators reproduce much more rapidly for w_I = 0.5 than for w_L = 0.5 (see figure 4(a)), indicating that cooperators need less time to occupy the whole population when interaction partners are switched frequently. When one layer is kept static and the other changes its structure, the overlap level of the two layers, p_O, defined as the fraction of edges shared by the learning layer and the interaction layer, monotonically decreases with time t in figure 4(b). In addition, since w = 0.5 in both cases, the two curves evolve very similarly. In figure 4(c), the degree variance of the static layer remains constant, while that of the adaptive layer increases with time t, peaks when cooperators rapidly reproduce, and decreases as the network continues to evolve. The mean clustering coefficients of the different layers are depicted in figure 4(d); the variation of the clustering coefficients is very similar to that of the degree variance in figure 4(c). It is worth noting that high network heterogeneity indeed favors the evolution of cooperation [15,56], as does a high clustering coefficient [18]. The maxima of the degree variance and clustering coefficient reached when cooperators evolve on an adaptive interaction layer are lower than those on an adaptive learning layer, indicating that frequent interaction partner switching, which reduces the exploitation by defectors, is more conducive to the evolution of cooperation than learning partner switching.
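The structural observables used here are straightforward to compute; below is a sketch with our own function names, assuming edges are stored as sorted node pairs:

```python
def overlap_level(edges_a, edges_b):
    """p_O: fraction of layer-a links that are also present in layer b
    (both layers carry the same number of links in the model)."""
    return len(edges_a & edges_b) / len(edges_a)

def degree_variance(edges, n_nodes):
    """Population variance of the degree sequence of one layer."""
    deg = [0] * n_nodes
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    mean = sum(deg) / n_nodes
    return sum((d - mean) ** 2 for d in deg) / n_nodes
```

For instance, a 4-node ring has zero degree variance, while a 4-node star with hub degree 3 has variance 0.75; the mean clustering coefficient can likewise be computed from edge sets (e.g. with networkx's average_clustering).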

Evolution of cooperation based on scenario 2
In figure 5, we not only give the evolutionary outcomes based on scenario 2, but also compare them with the consequences of the previously introduced scenario 1, which differs in its network dynamics. Cooperation is hindered if only one layer is permitted to evolve on the adaptive multilayer networks. In this case, cooperative clusters often face invasion by defectors, since the learning network remains unchanged during the evolution; individuals inside cooperative clusters have more opportunities to be replaced by defectors, and thus the evolution of cooperation is hindered. In addition, figure 5(d), which recovers the traditional coevolutionary prisoner's dilemma game on an adaptive monolayer network, shows optimal cooperation for small w. However, as w increases, the optimal condition for the evolution of cooperation reverses: the case p = 0 shown in figure 5(b) reveals that the smallest overlap between the two layers best favors the evolution of cooperation.
The evolution of cooperation in the different scenarios as a function of w for different cost-to-benefit ratios u is presented in figure 6(a). For u = 0.15, scenario 2 with p = 1 recovers the traditional coevolutionary prisoner's dilemma game on an adaptive monolayer network, and the results show that the evolution of cooperation is best favored in this case. For u = 0.3, however, cooperators need more frequent partner switching to survive; in this situation, breaking the symmetry between the interaction network and the learning network favors the evolution of cooperation. The results in figure 6(b), which show the fraction of cooperation as a function of p, also support these observations for scenario 2. On monolayer networks, cooperators at the interface of clusters resist defectors effectively. The cooperators inside the clusters have higher payoffs than those at the interface, yet they have no chance to spread their strategy since they never meet defectors. Breaking the symmetry between the interaction network and the learning network, however, enables cooperators inside the clusters to exchange strategies with outside defectors. Whenever network reconfiguration happens infrequently, cooperative clusters are unable to expel defectors rapidly; defectors then have opportunities to exploit cooperators and gain payoff advantages over them, so outside defectors overcome the cooperators inside the clusters in the evolutionary competition, and the survival of cooperative clusters is wrecked from the inside. Conversely, whenever network reconfiguration occurs frequently, cooperators can escape the exploitation of defection immediately, and thus defectors fail to obtain payoff advantages. The incongruence between the interaction network and the learning network then creates advantages for cooperators inside the clusters to replace defectors, since those cooperators have higher payoffs. As a consequence, frequent network reconfiguration best favors the evolution of cooperation when the interaction network and the learning network differ.
To further understand the evolution of the underlying structures, typical evolutionary processes of strategies and topologies over time t are shown in figures 7 and 8 for different partner switching frequencies. Figure 7(a) shows that, at low partner switching frequency, cooperators occupy the whole population slightly faster for p = 0.9. The overlap level p_O monotonically decreases with time t in figure 7(b), since the partner switching process destroys the symmetry of the two layers, but the curve for p = 0.9 falls much more slowly than that for p = 0.1. In figures 7(c) and (d), the degree variance and the mean clustering coefficient of the two networks peak when cooperators rapidly reproduce, yet their maxima in the case p = 0.9 are lower than in the case p = 0.1, indicating that a higher overlap level enhances the evolution of cooperation. However, the underlying network dynamics at a higher partner switching frequency differ from those at a lower one. In figure 8, when individuals adjust their partnerships more frequently, the overlap level falls rapidly (see figure 8(b)), but cooperators propagate more rapidly for a low overlap level (see figure 8(a)). Moreover, the peak values of the degree variance and the mean clustering coefficient that the two layers achieve for p = 0.1 are lower than those for p = 0.9 (see figures 8(c) and (d)), indicating that a low overlap level relaxes the conditions of network reciprocity for cooperators to evolve, and thus the evolution of cooperation is favored.

Discussion
It is well established that network reciprocity can promote the evolution of cooperation [3][4][5]. Network reciprocity characterizes the fact that individuals interact only within limited local neighborhoods. By playing evolutionary games with local neighbors, cooperators within clusters can avoid exploitation by defectors in physical space. In addition, cooperators within the clusters are free from the invasion of defectors when the evolutionary competition occurs only between local neighborhoods. Consequently, cooperation persists and can even pervade the whole population. However, the respective roles of avoiding the exploitation of defectors and of avoiding their invasion had not yet been disentangled. In this investigation, the evolution of cooperation on adaptive multilayer networks is studied, with evolutionary time scales that vary between the networks. By studying the effects of these varying time scales, we found that both frequent interaction partner switching and frequent learning partner switching can alleviate the social dilemma, yet switching interaction partners is much more efficient. Our result thus shows that avoiding the exploitation of defectors facilitates the evolution of cooperation more readily. In addition, Ohtsuki et al [50,51] have proven that symmetry breaking between the interaction graph and the replacement graph weakens network reciprocity and hinders the evolution of cooperation on static networks. When partner switching occurs rarely, the interaction dynamics advance as if they took place on static networks and induce a result qualitatively similar to that reported by Ohtsuki et al [50,51], with adaptive monolayer networks optimizing the evolution of cooperation. However, frequent network adjustment resolves the social dilemma induced by symmetry breaking between the interaction graph and the replacement graph. Cooperators flourish most on adaptive multilayer networks when frequent partner switching ensures a timely response to defection.
Our findings may provide new insights into many prior results. For instance, Wang et al [54] studied the evolution of cooperation on two-layer scale-free networks with all possible combinations of degree mixing, wherein the interaction layer and the strategy dispersal layer are asymmetric. However, degree mixing fails to resolve the cooperation dilemma induced by asymmetric networks: only when the topologies of the two layers are symmetric are cooperators best favored, with disassortative mixing. Amplifying the ranges of interaction and learning, by contrast, helps to resolve the cooperation dilemma induced by asymmetric interaction and learning. Xia et al [57] found that a medium-sized learning (interaction) range favors the spread of cooperators whenever the interaction (learning) neighborhood is fixed; in particular, enlarging the range of interaction does better than enlarging that of learning in favoring the evolution of cooperation. In our model, by means of coevolution, the problem of strategy evolution transforms into one in a well-mixed population under a different game. In this sense, the partner switching frequency corresponds to the range variations of interaction and learning, again indicating that switching interaction partners is much more efficient in promoting the evolution of cooperation. Furthermore, group interactions provide a framework in which two players interact with each other not only in games originated by themselves but also in games initiated by their common neighbors, making it possible for collective cooperation to evolve in asymmetric environments. In a seminal paper, Su et al [58] established an analytical model to predict the evolution of collective cooperation when the interaction graph overlaps the learning graph, and found that when a player interacts more often with a next-nearest neighbor than with a nearest neighbor, asymmetric structures of interaction and learning can provide more advantages for cooperators to evolve.
In our work, adaptive networks are introduced to resolve the social dilemma induced by asymmetric spatial structures of interaction and learning, and thus provide a new way of reshaping contact networks so that individuals can transform into cooperative societies. However, we realize that the evolution of real-world spatial structures is more complicated. The evolution of moral behaviors such as honesty [59,60] and trust [61], the form of strategy propagation [62,63] and more realistic collective interactions [64,65] all produce coupled dynamics between population structure and the evolution of strategies. How these coupled dynamics affect the spreading of strategies is an interesting aspect of evolutionary dynamics on asymmetric networks of interaction and learning. More importantly, exploring methods to solve the social dilemma induced by asymmetric spatial structures of interaction and learning is an enduring challenge, but it is key to understanding how network reciprocity affects the evolution of cooperation. Finally, we emphasize that the current work is based solely on Monte Carlo simulation; we thus expect future analytical formulas and experimental studies to predict the evolution of cooperation.