Effect of information transmission on cooperative behavior

Considering the fact that, in the real world, information is transmitted with a time delay, we study an evolutionary spatial prisoner's dilemma game where agents update strategies according to certain information that they have learned. In our study, the game dynamics are classified by the modes of information learning as well as game interaction, and four different combinations, i.e. the mean-field case, case I, case II and the local case, are studied comparatively. It is found that the time delay in case II smoothes the phase transition from the absorbing states of C (or D) to their mixed state, and promotes cooperation for most parameter values. Our work provides insights into the temporal behavior of information and the memory of the system, and may be helpful in understanding the cooperative behavior induced by time delay in social and biological systems.


Introduction
In behavioral sciences, in evolutionary biology and, more recently, in economics, understanding the conditions for the emergence and maintenance of cooperative behavior among unrelated and selfish individuals has become a central issue [1,2]. For the investigation of this problem, the most popular framework is game theory, together with its extensions involving the evolutionary context [3,4]. The prisoner's dilemma (PD) game is a general paradigm to explain cooperative behavior. It describes the situation where cooperation gives the highest population-level payoff, but, on a short timescale, defection maximizes the expected payoff of an individual. In well-mixed populations, cooperators cannot outperform defectors and are doomed to become extinct [3]. There are, however, many factors that favor cooperation in realistic systems. In particular, spatially structured interactions among elements are found to benefit cooperative behavior, both for static spatial networks [5]-[14] and for dynamic networks [15]-[18], and thus evolutionary spatial PD game dynamics have attracted much attention in the past few years.
There are two types of contacts among players in the evolutionary spatial game process: players collect payoffs from their neighbors by playing games with them, and then they update their strategies by learning from neighbors. These two processes are restricted to the neighborhoods in the corresponding underlying networks, namely the so-called game interaction network (IN) and the information learning network (LN) [19,20]. In most of the existing works, the IN and LN are assumed to be identical. However, the works in [19]- [24] have shown that, when the game interaction and information learning are based on different structures, the evolution of cooperation will be affected remarkably.
In most realistic physical and biological systems, the interaction signal (such as sound, a neural spike, etc) is transported through a certain medium with a limited speed, which thus induces a time delay in receiving the signal [25,26]. The neural network, where information propagation takes place, is a good example. Time delay in propagation has been demonstrated to have a substantial influence on the temporal characteristics of the oscillatory behavior of neural circuits [27,28]. Furthermore, the effects of time delay have also been widely studied in the context of synchronization [29,30], coupled oscillators [31], reaction-diffusion processes [32], the diversity of species in ecosystems [33], etc.
As a natural extension of the aforementioned factors, a new intriguing task is to understand how time delay in information transmission influences cooperative behavior in the real world. In the present work, we study the evolutionary PD game with the information of each individual (strategy and payoff) transmitted through the network step by step, which may induce a time delay. In the rest of the paper, we first introduce our model, defining the cases that combine different modes of game interaction and information learning. Then we present our numerical results and discussion in detail, and relate these to other studies.

The model
In the classical evolutionary PD game, players can make two choices: either to cooperate with their co-players or to defect. They are offered payoffs depending on their choices, which can be expressed as a 2 × 2 payoff matrix (the first entry in each pair is the row player's payoff):

         C         D
    C  (R, R)   (S, T)
    D  (T, S)   (P, P)

The players get the reward R (or punishment P) if both choose to cooperate (or defect). If one player cooperates while the other defects, the cooperator C gets the lowest payoff S (the sucker's payoff), while the defector D gains the highest payoff T (the temptation to defect). The elements of the payoff matrix satisfy the conditions T > R > P > S and 2R > T + S, which lead to the so-called dilemma situation where mutual cooperation is beneficial in the long run, but defection produces large short-term profits. We assume that a cooperator pays a cost c for another individual to receive a benefit b (b > c), whereas a defector pays no cost and distributes no benefits. Thus the reward for mutual cooperation is R = b − c, the sucker's payoff is S = −c, the punishment for mutual defection is P = 0, and the temptation to defect is T = b. Following [34], the payoffs are rescaled such that R = 1, T = 1 + r, S = −r and P = 0, where r = c/(b − c) denotes the ratio of the cost of cooperation to the net benefit of cooperation.

There are two types of contact among players in the evolutionary process: players collect payoffs from their neighbors in the game interaction and, subsequently, they update their strategies according to the information they have learned from others. Here, we consider that the information of each agent (e.g. the strategy adopted and the payoff obtained) diffuses through the underlying network step by step and is obtained by all the other individuals after a certain period of time. We define I_i(t) as the information of player i at time t (t = 0, 1, 2, . . .), which is discrete. Two simple cases are valuable to study. In case I, the information transmission network is fully connected.
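As a quick illustration, the rescaled payoffs R = 1, T = 1 + r, S = −r, P = 0 can be encoded as follows (a sketch; the function name `pd_payoff` is illustrative, not from the paper):

```python
# Rescaled prisoner's dilemma payoffs: R = 1, T = 1 + r, S = -r, P = 0,
# with r = c / (b - c) the cost-to-net-benefit ratio.
def pd_payoff(my_move, other_move, r):
    """Return the focal player's payoff; moves are 'C' or 'D'."""
    table = {
        ('C', 'C'): 1.0,       # reward R
        ('C', 'D'): -r,        # sucker's payoff S
        ('D', 'C'): 1.0 + r,   # temptation T
        ('D', 'D'): 0.0,       # punishment P
    }
    return table[(my_move, other_move)]

# The dilemma conditions T > R > P > S and 2R > T + S hold for any r > 0:
r = 0.2
assert 1.0 + r > 1.0 > 0.0 > -r
assert 2 * 1.0 > (1.0 + r) + (-r)
```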
Thus, at time t, each individual obtains the information that all the others emitted at t − 1. The information collection for a given agent i to update its strategy at time t is then global and instant, which we denote by {I_j(t − 1)} (j = 1, 2, . . . , N, j ≠ i). In case II, the information transmission network is identical to the game interaction network, and in each Monte Carlo (MC) time step the information from each agent diffuses over the network by one step. Then, I_i(t) is obtained by i's next-nearest neighbors at time t + 2, and by individuals d steps away from i at time t + d. Thus the global information collection for a given agent i to refer to at time t carries a time delay, which we denote by {I_j(t − d_ij)} (j = 1, 2, . . . , N, j ≠ i, where d_ij is the distance between i and j, measured by the number of edges along the shortest path between nodes i and j). In both of these cases, the information for strategy updating is global, while the game interactions among players are locally restricted to their nearest neighbors.
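The case II information set can be sketched as follows, assuming the distances d_ij are computed by breadth-first search on the (unweighted) interaction network; the function names are illustrative:

```python
from collections import deque

def bfs_distances(neighbors, source):
    """Shortest-path (edge-count) distances d_ij from `source` on an
    unweighted graph given as an adjacency dict {node: [neighbors]}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def delayed_information(history, dist, i, t):
    """Case II information set for agent i at time t: {I_j(t - d_ij)}
    for all j != i, where history[t][j] stores I_j(t).  Entries older
    than the recorded history are skipped."""
    info = {}
    for j, d in dist.items():
        if j == i:
            continue
        if 0 <= t - d < len(history):
            info[j] = history[t - d][j]
    return info
```

For example, on a ring of four nodes the agent opposite i contributes information that is two steps old, while i's nearest neighbors contribute information that is one step old.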
After each generation, the players try to maximize their individual payoffs by updating their strategies. Following previous studies, player i adopts the strategy s(j) of a randomly chosen member j of its information collection with a probability depending on the payoff difference,

    W = 1 / {1 + exp[(P_i − P_j)/κ]},

where P_i and P_j are the corresponding payoffs and κ characterizes the noise introduced to permit irrational choices. κ = 0 and κ → ∞ correspond to completely deterministic and completely random selection of j's strategy s(j), respectively, while any finite positive κ incorporates uncertainty in the strategy adoption: the better strategy is readily adopted, but there is a small probability of adopting the worse one. The effect of the noise κ on the evolution of cooperation in the spatial PD game has been studied in detail in [35]-[41]. Since this issue goes beyond the purpose of the present work, in all the following studies we simply fix κ = 0.1.
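A minimal sketch of the strategy-update step, assuming the Fermi function W = 1/(1 + exp[(P_i − P_j)/κ]) that is standard in this literature (the helper names are illustrative):

```python
import math
import random

KAPPA = 0.1  # noise level fixed in the text

def adopt_probability(payoff_i, payoff_j, kappa=KAPPA):
    """Fermi adoption rule: probability that player i adopts j's
    strategy given their payoff difference.  Higher-payoff strategies
    are readily adopted; worse ones with a small probability."""
    return 1.0 / (1.0 + math.exp((payoff_i - payoff_j) / kappa))

def update_strategy(strategy_i, payoff_i, info):
    """Pick one (strategy, payoff) entry at random from i's information
    collection `info` and adopt its strategy with the Fermi probability;
    otherwise keep the current strategy."""
    s_j, p_j = random.choice(info)
    if random.random() < adopt_probability(payoff_i, p_j):
        return s_j
    return strategy_i
```

Note that for equal payoffs the adoption probability is exactly 1/2, and it approaches 1 (or 0) as the payoff of j greatly exceeds (or falls below) that of i.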

Simulation results and discussion
It is known that, for the evolutionary PD game on a fully connected underlying network for both game interaction and information learning, the evolution of cooperation is well described by the mean-field (MF) solution, which finally reaches the Nash equilibrium with ρ_c = 0 [7,8]. From the perspective of the game interaction network (IN) among individuals, as well as the information learning for strategy updating, we can regard this MF case as the one with global IN and global instant information. Likewise, the case with both local IN and local instant information has been extensively studied in previous works [7,8], which we here call the 'local case'. Case I and case II, which we propose in this paper, are the transition cases between the MF case and the local case. Case I corresponds to local IN and global instant information, whereas case II corresponds to local IN and global information with time delay. Table 1 compares these four cases from the perspective of game interaction and information learning.

We comparatively investigate the aforesaid cases using MC simulations. Both regular rings and square lattices are studied as game interaction networks. Initially, cooperators and defectors are chosen randomly with equal probability to occupy each site. We calculate the average fraction ρ_c of cooperators in the population in the stationary state as a function of r, measured over the last 10 000 time steps of the total simulation time of 1 × 10^5 (see figure 1). The results we present are averages over ten realizations from independent initial configurations. Comparing the two cases with global instant information (refer to table 1), i.e. the MF case (with global game interaction) and case I (with local game interaction), the nonzero cooperation level of case I implies that local game interaction favors cooperation more than global interaction.
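The setup above can be reproduced in spirit with a minimal MC sketch of case I. This is an illustrative reimplementation under stated assumptions (synchronous updating, a regular ring of degree k, the Fermi rule), not the authors' code; parameter values are chosen only for a quick run:

```python
import math
import random

def simulate_case_one(N=200, k=4, r=0.2, kappa=0.1, steps=500, seed=1):
    """Case I sketch: local game interaction on a regular ring of degree
    k, strategy updates from global instant information (any j != i).
    Returns the final fraction of cooperators rho_c."""
    rng = random.Random(seed)
    half = k // 2
    # ring lattice: each site linked to its k nearest neighbors
    neigh = [[(i + d) % N for d in range(-half, half + 1) if d != 0]
             for i in range(N)]
    payoff = {('C', 'C'): 1.0, ('C', 'D'): -r,
              ('D', 'C'): 1.0 + r, ('D', 'D'): 0.0}
    s = [rng.choice('CD') for _ in range(N)]
    for _ in range(steps):
        # payoffs collected from local game interaction
        p = [sum(payoff[(s[i], s[j])] for j in neigh[i]) for i in range(N)]
        # synchronous update: compare with a random other player j
        new = list(s)
        for i in range(N):
            j = rng.randrange(N - 1)
            j = j if j < i else j + 1          # ensures j != i
            if rng.random() < 1.0 / (1.0 + math.exp((p[i] - p[j]) / kappa)):
                new[i] = s[j]
        s = new
    return s.count('C') / N
```

Extending this sketch to case II would amount to replacing the instant payoff p[j] with the payoff recorded d_ij steps earlier, as described in the model section.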
Also, it is apparent that, for case I, as r increases through a nonzero threshold value r_d in the small r region, the system evolves from the absorbing state of C to the mixed state of C and D. However, for case II, the absence of the threshold r_d is explicit for both the k = 4 and the k = 8 systems. That is to say, the cooperation level of case II is lower than that of case I in the small r region. Nevertheless, when r is slightly larger (in the region r ≳ 0.004), ρ_c of case II turns out to be larger than that of case I. Thus the time delay of the global information (case II), although it induces the earlier rise of D, favors cooperation over the major region of r more than the case without time delay (case I). Additionally, comparing the results of k = 4 with those of k = 8, we see that the cooperation levels of both case I and case II are higher in the sparser network.
The results of the local case (with both local IN and local instant information) are also plotted in figure 1 (solid triangles) for comparison. We can see that, for different values of r , ρ c of the local case is always higher than that of case I. From the difference between the local case and case I, we know that, when game interaction is locally designed, the local instant information (in the local case) promotes cooperation more than the global instant information (in case I). Furthermore, it is remarkable from the result that global information with time delay (case II) obtains a higher cooperation level than the global instant ones (case I) and, moreover, may perform better than even the local case when r is large (see figure 1).
From the aforesaid comparison, we see that, in general, locally assigned game interaction and local information are more beneficial for the persistence of cooperation, whereas time delay in the global information transmission, which introduces the effect of the historical state of the system, is found to be important in the evolution of cooperation. Figure 2 shows the time series of the cooperation density ρ_c(t) for the systems of case I and case II, respectively. Compared to case I, the fluctuation in case II is clearly much smaller, and the relaxation time to the evolutionarily stable state is longer. We can understand the different behaviors of case I and case II (as shown in figures 1 and 2) through the so-called historical memory of the whole system. For case I, an agent's new strategy at time t is obtained by referring merely to the instant information of the system at t − 1, while for case II, the strategy update refers to information spanning the period [t − max(d_ij), t − 1]. Considering a system in case II as a whole, in contrast to case I, the historical information within max(d_ij) time steps takes effect, which contains comprehensive and diverse evaluations of the C and D strategies. In the historical memory of the system, the evaluation of a given strategy may differ from its present actual performance. For example, when the payoff of C is low in the large r region, optimistic evaluations of C are still preserved in the system memory from its former good records, which thus increases the probability of agents adopting C. Indeed, the evaluation of C or D drawn from the historical memory can be more moderate than that drawn from the instant information of the last step.
The simulation results in figure 1 validate this analysis: on the one hand, the system of case II with 'memory' can maintain a high cooperation level in the large r region, where that of case I has already evolved to the absorbing state of D. On the other hand, when D is on the verge of extinction at very low r, in case II the good record of D in the system memory induces the gradual fading of D, rather than the sudden extinction of D seen in case I (see figure 1). Additionally, the smaller fluctuation of ρ_c(t) and the longer relaxation time in case II can also be attributed to the moderate evaluations of strategies in the system memory.
The previous investigation by Vukov et al [42] has shown that the spatial game on a one-dimensional regular ring is essentially different from that on other regular lattices. Therefore, simulation results on the square lattice with degree k = 4 and k = 8 are also shown in this paper (see figure 3). For case I and case II, the simulation results on the square lattice and on the regular ring are similar. For the local case with k = 4, we recover the results of [19]. As shown in figure 3, the cases with global information learning obtain a higher cooperation level than the local case when k = 4. However, the opposite holds when k = 8: in the large r region, the cooperation level of the local case is higher than that of case I and case II.
In our study, in case II the historical information of the agents impacts the dynamical processes. Indeed, the effect of the historical state on the evolution of cooperation is an important issue that has also been addressed in some recent works [43]-[45]. Wang et al [43] have presented a memory-based snowdrift game, in which the effect of the historical state was introduced through the memory of each individual (recording its own past optimal strategies via self-questioning processes). The memory effects of individuals are discussed in detail and some nonmonotonous phenomena are observed. More recently, Wu et al [44] have studied the influence of heritability of fitness, or, in other words, maternal effects [46], on the evolutionary spatial PD game. They introduce the effect of the historical state by associating an agent's fitness with the payoffs both from the current interaction and from its historical interactions. The cooperation level is found to increase with the relative importance of the previous payoffs to the agent's fitness. In another related work, Qin et al [45] have explored the effects of infinite memory of historical payoffs and strategies. Their memory mechanism directly introduces the effect of the historical state into the dynamics, and cooperation is found to be enhanced by an increasing memory effect for most parameters. Different from [43]-[45], in our model, the effect of the historical state and the so-called system memory is brought about by the time delay in information transmission. As far as we know, the time delay of information has not received much attention in the study of evolutionary spatial games and is worth further discussion in the future.

Conclusion
To summarize, in this paper we have studied the evolutionary spatial PD game considering information transmission with time delay. Various combinations of information learning modes and game interaction modes are studied comparatively, including the MF case, the local case and the two transition cases we proposed, namely case I and case II. In general, it is found that locally assigned game interaction and local information may favor cooperation more than their global counterparts, except for the results on a square lattice with degree k = 4. However, it is worth noting that the factor of information transmission with time delay gives rise to rich dynamic behavior of the system. That is, the time delay in the spreading of global information in case II induces the effect of the historical state and is found to enhance cooperation, for certain parameters, even more than the system with local instant information. Also, the historical information, spreading among individuals with time delay, can be collectively considered as the record in the system memory. This memory mechanism effectively moderates the evaluation of strategies in the updating processes, thus reducing the fluctuation of the system and thereby smoothing the transition from the absorbing state of C (or D) to the mixed state of C and D.
These findings suggest that the information with time delay might benefit cooperation in the PD situation, giving another clue to the emergence of cooperation in social and biological systems of selfish individuals. We may carefully propose that successful modeling in the real world, such as social or (sub-)culture dynamics, will require a careful consideration of the involved information time delay along the lines discussed here.