An overview of the second-previous memory effect in the strictly alternating donation game

Game theory examines strategic behaviour across diverse domains such as insurance, business, military, biology, and more, with the aim of deriving optimal decisions. Recent research on altering memory in the donation game with simultaneous iterated rounds has spurred our interest in investigating this phenomenon in the strictly alternating donation game. This study proposes a novel decision-making approach that uses the pre-previous unit instead of the most recent one. The scope is narrowed to 16 strategies, each defined by a finite two-state automaton, while accounting for potential implementation errors in the computation of strategy payoffs. Dominant strategies are determined by assessing the interaction payoffs among strategy pairs. Because no single strategy is unequivocally dominant, this article centres on the calculation of equilibrium points among heteroclinic three-cycles. Among the strategy


Introduction
Operations research, also known as the study of optimization strategies, is often described as a scientific method of decision-making [1]. Decision-making refers to the mental processes that result in choosing a course of action from a variety of options. Every decision-making process ends with a final choice. In many real-world situations, choices must be made in a setting of conflict between two or more opposing parties, each of whose actions depends on the others. Such a competitive environment is referred to as a 'game' [2]. A game sets players against one another in a race to achieve their goals. There are two primary categories of games: games of chance, like roulette, and games of strategy, like poker. This paper studies games of strategy. The key objective is to find the best strategy, one that maximizes each player's gain and minimizes each player's loss [3].
The history of game theory began in 1713, when Waldegrave solved a two-player card game using a minimax mixed strategy [4]. Von Neumann and Morgenstern later applied game theory to economics [5], and John Nash explored non-cooperative games [2].
Game theory concepts have been applied to numerous problems in politics, business, biology, the military, and many other fields [6,7]. These studies motivated researchers to examine well-known games in depth. One of the most interesting games in evolutionary game theory is the so-called donation game [8,9], which can be applied to fields such as political science and environmental problems [10].
According to several studies, the donation game is a contest between two players. Each player has two options: to defect (D) or to cooperate (C). In the case of cooperation, the donor pays a cost (c) and the recipient gains a benefit (b). In the case of defection, there is no cost or benefit. If both players cooperate, both receive (b − c). If both defect, both receive (0). If the players choose differently, the cooperator receives (−c) and the defector receives (b) [11][12][13]. The following payoff matrix represents the donation game.
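The payoff matrix described in this paragraph can be written out as below. This is a reconstruction from the stated costs and benefits rather than a copy of the original display; the rows are the focal player's move, the columns are the co-player's move, and each entry is the focal player's payoff.

```latex
\[
\begin{array}{c|cc}
    & C     & D  \\ \hline
  C & b - c & -c \\
  D & b     & 0
\end{array}
\]
```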
It is assumed that b > c > 0. Under this assumption, the payoff matrix above is an instance of the well-known symmetric 2 × 2 matrix with entries R = b − c, S = −c, T = b, and P = 0 [9].
Previous studies were carried out under these standard conditions.
If only one round is played, defection is the best choice for each player because it avoids the lowest payoff (S). As a result, both players receive (P) instead of the payoff of mutual cooperation (R). On the other hand, defection is not the preferable strategy if the donation game is repeated several times (the repeated donation game). Each player develops a strategy based on previous rounds in order to increase the payoff. The decisions taken by each player therefore affect the other player's reactions in the coming rounds, which in turn affects the players' payoffs [14]. This leads players towards mutual cooperation rather than mutual defection [15,16].
Repeated game research has a lengthy history. The repeated donation game has been the subject of many prior studies, many of which have focused on creating a new state from the results of the previous unit [17][18][19][20][21][22][23]. One of the most frequently cited issues with memory is ignorance or delay of the preceding round [3]. This delay occurs in real-time situations, when rounds take place at the same time and the decisions of some rounds are not known. In 2016, El-Seidy et al. [24,25] investigated the creation of a new state from the pre-previous one in a simultaneous iterated prisoner's dilemma game. The present research therefore explores the generation of the new state from the second-previous unit (a unit consists of two sequential rounds) in an alternating repeated donation game.
A game can be categorized as either simultaneous or alternating. In the simultaneous game, the two players act in the same round. In the alternating game, each player acts in a separate round, as in chess [26,27]. This research investigates the alternating repeated donation game.
The two players in an alternating game are not permitted to decide during the same round. Each player decides alone in a separate round, and the other player reacts alone in a subsequent round [18]. In any round, the player who makes the decision is referred to as the leader (donor), and the other player is referred to as the recipient. Each unit consists of two rounds.
The two players have an equal chance of being the leader in the alternating game [28,29]. The game is called strictly alternating when the two players swap the roles of leader and recipient in every round [30]. In the random alternating game, the leader role changes hands irregularly [31]. In this work, we focus on the strictly alternating game.
The leader has two available choices, C and D. If the leader chooses C, the leader and the recipient gain a and b, respectively. If the leader chooses D, the leader and the recipient gain c and d, respectively. The payoffs are assumed to satisfy inequalities (3) [32].
Within one unit, both players receive a + b if both play C, and both receive c + d if both play D. If the two players choose differently, the cooperator receives a + d and the defector receives c + b. These outcomes correspond to those of the simultaneous donation game through the following equations [32].
\[ R = a + b, \qquad S = a + d, \qquad T = c + b, \qquad P = c + d, \]
\[ T + S = P + R. \]
The previous equations together with the inequalities (3) imply that T > R > P > S and S + T < 2R. These are the conditions of the simultaneous donation game.
The behaviour of the strategies is studied using domination, which shows that there is no absolutely dominant strategy; mixed strategies are therefore studied [33].
Studying the equilibrium between more than three strategies is difficult and time-intensive, so only heteroclinic three-cycles are studied. Three strategies A, B, and C form a heteroclinic three-cycle when A invades B, B invades C, and C in turn invades A [34,35]. The equilibrium point of every heteroclinic three-cycle is determined using the values T = 4, R = 3, P = 1, S = 0 (these values are used because Axelrod's values do not satisfy the equation T + S = R + P) [36]. Game dynamics are used to carry this out. Mixed strategies are studied across all heteroclinic three-cycles.
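As a quick check of the parenthetical remark above, the short sketch below tests both payoff sets against the three conditions mentioned in the text. The function name `check_payoffs` is hypothetical, and Axelrod's values are taken to be the commonly quoted T = 5, R = 3, P = 1, S = 0.

```python
def check_payoffs(T, R, P, S):
    """Return which of the three conditions from the text a payoff set satisfies."""
    return {
        "T > R > P > S": T > R > P > S,
        "S + T < 2R": S + T < 2 * R,
        "T + S = R + P": T + S == R + P,
    }

# Values used in this paper.
paper_check = check_payoffs(T=4, R=3, P=1, S=0)

# Axelrod's values (assumed here to be the usual 5, 3, 1, 0).
axelrod_check = check_payoffs(T=5, R=3, P=1, S=0)

print(paper_check)
print(axelrod_check)
```

Only the last condition separates the two sets: both satisfy the ordering and the bound S + T < 2R, but 5 + 0 differs from 3 + 1.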
This paper studies the behaviour of strategies that use the second-previous unit in the strictly alternating repeated donation game under noise. The results show that defective strategies perform better than cooperative ones and overwhelm them. Strategy S2 shows the best performance because it combines the behaviour of rival strategies [37,38] with spiteful behaviour [39,40] and is not defeated by any other strategy. Strategies S0 and S8 show good performance but are defeated by some strategies. The partner strategy S10 shows only moderate performance in this case study. Interestingly, some memory-one strategies of moderate performance perform better when using the second-previous unit instead of the most recent one. Since no strategy is absolutely dominant, the heteroclinic three-cycles are also studied. Strategies S10 and S11 appear in the majority of heteroclinic three-cycles.

States generation technique
This paper's technique uses the decisions of the second-previous unit (two consecutive rounds), instead of the immediately previous one, to generate the new round. Thus, the fourth round is generated from the decisions of the first unit (the first round of each player), and the fifth round is generated from the second unit (the first round of the second player together with the second round of the first player). The second-previous unit is used because the immediately preceding round is assumed to be unknown. This process is repeated continuously, and the payoff is obtained by averaging the payoffs over all iterations.
Each player can employ an endless number of strategies, and it would be difficult to study all of them. We consider only the computationally inexpensive strategies produced by two-state automata. A two-state automaton transforms a current state into a new one. It consists of two nodes, each with two outgoing edges labelled C and D. An additional arrow shows how the first state is imposed. A two-state automaton can produce only sixteen strategies.
The possible outcomes of each unit are the pairs (C, C), (C, D), (D, C), and (D, D). These pairs are made up of the decisions taken by the players and result in the payoffs R, S, T, and P, respectively. In the binary system, the strategies are represented as quadruples of 0s and 1s. Each digit gives the player's response to one of the four potential outcomes of the unit used (CC, CD, DC, and DD). The player's subsequent move is D for a value of 0 and C for a value of 1. The digital form of the Tit-For-Tat strategy S10 is (1, 0, 1, 0), as shown in figure 1, while the grim strategy S8 is (1, 0, 0, 0) and the tweedled strategy S11 is (1, 0, 1, 1). As a result, there are sixteen distinct strategies, denoted by S0, S1, ..., S15. The contention between S8 and S11 is discussed as an example to illustrate memory-two state generation.
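The digital form described above can be sketched in a few lines. The sketch assumes that the index k of strategy S_k is the binary value of its quadruple, which agrees with the three examples given in the text; the names `strategy_rule` and `next_move` are hypothetical.

```python
# Outcomes of the unit used, in the order that pays R, S, T, P.
OUTCOMES = ("CC", "CD", "DC", "DD")

def strategy_rule(k):
    """Quadruple of 0s and 1s for strategy S_k (assumed: 4-bit binary form of k)."""
    if not 0 <= k <= 15:
        raise ValueError("a two-state automaton yields only S0 to S15")
    return tuple((k >> shift) & 1 for shift in (3, 2, 1, 0))

def next_move(k, outcome):
    """Move of an S_k player after the given outcome: 1 means C, 0 means D."""
    return "C" if strategy_rule(k)[OUTCOMES.index(outcome)] == 1 else "D"

print(strategy_rule(10))    # Tit-For-Tat
print(strategy_rule(8))     # grim
print(next_move(11, "DD"))  # response of S11 after mutual defection
```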
First, we illustrate the difference between memory-one and memory-two state generation.
Once the first three rounds have been imposed, new rounds are produced. In the first sequence, the S8-player is assumed to play (C) in his first two rounds, and the S11-player is assumed to play (D) in his first round. The new states are then generated according to the state generation technique described previously, with the result that player S8 plays (D) in each round while player S11 switches between (C) and (D). The repetition period of this sequence is four rounds with payoffs (T, T, P, P). The average payoff for this sequence is (P + T)/2. Any sequence with a payoff of (P + T)/2 is called approach B.
The second sequence has the payoff of approach B.
In the third sequence, both players are assumed to play (C) in the first three rounds. The new states are then generated according to the state generation technique described previously, with the result that both players play (C) in each round. The repetition period of this sequence is one unit. This produces an average payoff of (R) and is called approach A.
The fourth, fifth, sixth, seventh, and eighth sequences each have the payoff of approach B.
The approaches that arise and their corresponding payoffs are shown in table 1. The payoff may be affected by several types of error. This study considers only implementation errors.
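The eight sequences of the S8 versus S11 contention can be reproduced with a small simulation. This is a sketch under one reading of the generation technique: the acting player looks at their own move two rounds earlier and the co-player's move three rounds earlier, so that round four uses rounds one and two, and round five uses rounds two and three. The function names and the choice of averaging over the last 40 units are assumptions made for illustration.

```python
from itertools import product

# Payoff to the first player for (own move, co-player's move): R, S, T, P.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 4, ("D", "D"): 1}
INDEX = {("C", "C"): 0, ("C", "D"): 1, ("D", "C"): 2, ("D", "D"): 3}

def rule(k):
    """Quadruple for S_k, assuming k is the binary value of the quadruple."""
    return tuple((k >> shift) & 1 for shift in (3, 2, 1, 0))

def play(k_first, k_second, opening, rounds=400):
    """Moves of the alternating game; `opening` imposes rounds 1 to 3."""
    rules = (rule(k_first), rule(k_second))
    moves = list(opening)
    for n in range(3, rounds):          # n is the 0-based index of the new round
        actor = n % 2                   # 0: first player, 1: second player
        own, other = moves[n - 2], moves[n - 3]
        moves.append("C" if rules[actor][INDEX[(own, other)]] else "D")
    return moves

def average_payoff(moves, units=40):
    """Average payoff per unit to the first player over the last `units` units."""
    tail = moves[-2 * units:]           # starts on a first-player round
    total = sum(PAYOFF[(tail[i], tail[i + 1])] for i in range(0, 2 * units, 2))
    return total / units

results = {"".join(o): average_payoff(play(8, 11, o)) for o in product("CD", repeat=3)}
for opening, value in results.items():
    print(opening, value)
```

Under this reading, only the all-C opening stays in approach A with payoff R, and the other seven openings settle into approach B with payoff (P + T)/2, which matches the counts stated in the text.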

Perturbed payoff (noise effect)
If a player makes an incorrect move (plays C when the transition rule specifies D, or plays D when the transition rule specifies C) that contradicts the transition rule of the player's strategy (an implementation error), this may affect the generation of the new states and, in turn, the player's payoff. The transition rule is represented as a quadruple in the same way as the digital form of the strategies, but with zero replaced by ε and one replaced by 1 − ε, where ε is the probability of making an erroneous move. These numbers are the probabilities of playing C after R, S, T, and P. To illustrate the effect of an erroneous move, approach A is examined when an error occurs.
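The replacement of zeros and ones by ε and 1 − ε can be written as a one-line transformation. The helper name `perturbed_rule` and the error level 0.01 are illustrative choices, not values from the study.

```python
def perturbed_rule(rule, eps):
    """Probabilities of playing C after R, S, T, P when each move errs with probability eps."""
    if not 0.0 <= eps <= 1.0:
        raise ValueError("eps must be a probability")
    # A 1 (play C) becomes 1 - eps; a 0 (play D) becomes eps.
    return tuple(1.0 - eps if bit == 1 else eps for bit in rule)

# Tit-For-Tat (1, 0, 1, 0) with an illustrative error level.
noisy_tft = perturbed_rule((1, 0, 1, 0), eps=0.01)
print(noisy_tft)
```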
Every decision in the repetition period will be changed individually, and the generation of the new states will be tracked to specify the approach and payoff.
If the first decision of the repetition period changes (shown as a red D), approach A changes to approach B.
Likewise, if the second element of the repetition period changes (shown as a red D), approach A changes to approach B.
Consider a scenario in which two players compete using the transition rules P = (p1, p2, p3, p4) and Q = (q1, q2, q3, q4), respectively. Here p_i and q_i give the probability of playing C following the corresponding outcome of the pre-previous unit and range in value from 0 to 1. This results in a Markov process in which matrix (7) determines the transitions between the four possible states R, S, T, and P.
(Matrix (7): the 4 × 4 transition matrix with rows and columns indexed by the states R, S, T, and P; each entry is a product of a p-term and a q-term.)
If p and q lie in the interior of the strategy cube, all entries of this stochastic matrix are strictly positive. Consequently, there is a unique stationary distribution π = (π1, π2, π3, π4) such that the probability P_i^(n) of being in state i in the n-th round converges to π_i as n → ∞ (i = 1, 2, 3, 4). The components of π are positive and sum to one. They represent the asymptotic frequencies of R, S, T, and P. The vector π is the left eigenvector of matrix (7) for the eigenvalue 1. Equation (8) provides the payoff for player P playing against player Q. The payoff is unaffected by the first three imposed rounds.
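The relations described in this paragraph can be stated compactly. The display below is a sketch in standard notation, not a copy of the original equations: M is a name introduced here for matrix (7), and the payoff expression is the usual frequency-weighted average, which is what equation (8) would be expected to express.

```latex
\[
\pi M = \pi, \qquad \sum_{i=1}^{4} \pi_i = 1, \qquad \pi_i > 0,
\]
\[
\text{payoff of } P \text{ against } Q \;=\; R\,\pi_1 + S\,\pi_2 + T\,\pi_3 + P\,\pi_4 .
\]
```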
The payoff can be obtained for any level of noise ε > 0 for a player with transition rule S_i against another player with transition rule S_j. However, if the limit value of the payoff is computed as ε → 0, the stochastic matrix (7) contains many zeros, because the p_i and q_i are zeros and ones, and it is no longer guaranteed to be irreducible. The vector π is then not uniquely defined. We therefore compute this vector directly for every contention by examining mutations.
There are eight sequences with two approaches in the contention between S8 and S11. Approach A arises in one sequence, when both players play C in the imposed rounds, while approach B arises in the other seven sequences. Every decision in the repetition period is tested under a possible error (playing C instead of D or vice versa). First, approach A has only two rounds in its repetition period; if either of them changes, approach A is converted to approach B. Second, approach B has four rounds in its repetition period, and none of the mutations changes the approach when a perturbation occurs. The relevant transition matrix between the approaches in the contention between S8 and S11 is as follows.
\[ \begin{array}{c|cc} & A & B \\ \hline A & 0 & 1 \\ B & 0 & 1 \end{array} \]
Every element of the previous matrix represents the probability that an approach (row) changes to another approach (column) or remains the same when a wrong decision occurs. The first row contains the mutation probabilities of approach A. Both possible mutations of approach A change it to approach B, so the change occurs with a probability of 100%: the element at the intersection of the first row (approach A) and the second column (approach B) is one, and the other element in the same row is zero. Approach B (second row) has four possible mutations, none of which converts it to approach A, so the second row has probability zero in the first column and probability one in the second column. The corresponding stationary distribution of the contention can then be calculated using the following equation.
The corresponding stationary distribution of the contention between S8 and S11 is (0, 1). This means that, asymptotically, an iterated game between an S8-player and an S11-player is always in regime B. The S8-player therefore receives an average payoff of (P + T)/2.
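The stationary distribution (0, 1) and the resulting payoff can be checked numerically. The sketch below uses repeated multiplication from a uniform starting vector, which is an illustrative method and not necessarily the one used in the study; it converges for this matrix but would not for a periodic chain. The values T = 4, R = 3, P = 1, S = 0 are the specific values used elsewhere in the paper.

```python
def stationary(matrix, steps=1000):
    """Approximate a stationary row vector by repeated left multiplication."""
    n = len(matrix)
    dist = [1.0 / n] * n                      # start from the uniform distribution
    for _ in range(steps):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist

# Rows and columns: approach A, approach B (probabilities described in the text).
approach_matrix = [[0.0, 1.0],
                   [0.0, 1.0]]

T, R, P, S = 4, 3, 1, 0
approach_payoffs = [R, (P + T) / 2]           # payoff of approach A, approach B

pi = stationary(approach_matrix)
s8_payoff = sum(weight * value for weight, value in zip(pi, approach_payoffs))
print(pi, s8_payoff)
```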
This procedure is repeated for every contention between two strategies, and the results are collected in table 2, which gives the conflict payoff between any two strategies used in this paper. The behaviour of the strategies can be studied using domination. To study the behaviour of any two strategies (S_i × S_j) against each other, the four joint entries of the two strategies in table 2 must be extracted and placed in the following matrix.

\[ \begin{array}{c|cc} & S_i & S_j \\ \hline S_i & a_{ii} & a_{ij} \\ S_j & a_{ji} & a_{jj} \end{array} \qquad (14) \]
S_i and S_j are equivalent if a_ii = a_ji and a_ij = a_jj. S_i dominates S_j when a_ii ≥ a_ji and a_ij ≥ a_jj, with at least one of these two inequalities strict. Table 3 is created by substituting the specific values (T = 4, R = 3, P = 1, S = 0) in table 2. Table 4 shows the behaviour of the strategies under domination when these specific values are substituted.
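The pairwise comparison can be sketched as a small function. It assumes the usual reading of domination (neither inequality reversed and at least one strict); the function name `compare` is hypothetical. The example inputs are the payoffs two unconditional players would obtain from the definitions of P, T, S, and R with the specific values above; they are illustrative inputs, not entries read from table 2 or table 3.

```python
def compare(a_ii, a_ij, a_ji, a_jj):
    """Classify the relation between S_i and S_j from their four joint payoffs."""
    if a_ii == a_ji and a_ij == a_jj:
        return "equivalent"
    if a_ii >= a_ji and a_ij >= a_jj:
        return "i dominates j"
    if a_ii <= a_ji and a_ij <= a_jj:
        return "j dominates i"
    return "neither dominates"

# Illustrative inputs: an unconditional defector (i) against an unconditional
# cooperator (j) with T = 4, R = 3, P = 1, S = 0.
relation = compare(a_ii=1, a_ij=4, a_ji=0, a_jj=3)
print(relation)
```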
According to table 4, every strategy can be invaded except strategy S2, although S2 itself cannot outcompete strategies S4, S7, S10, and S11. This indicates that no strategy is absolutely evolutionarily stable in the sense of invading all other strategies. This motivates players to use several pure strategies at different rates in the same game (a mixed strategy).
As in the Rock-Scissors-Paper game, three strategies S_i, S_j, and S_k form a heteroclinic three-cycle if S_i invades S_j, S_j invades S_k, and S_k in turn invades S_i. To compute the equilibrium point of a heteroclinic three-cycle and to determine the type of the cycle, the values at the intersections of the three strategies must be extracted from table 3 and placed in a new matrix. The payoff matrix for the interaction of strategies S0, S15, and S10 is constructed below. A system of linear equations must then be constructed from the values of this matrix.

(Payoff matrix with rows and columns S0, S15, and S10; entries taken from table 3.)
The preceding system and the following equation will be solved in order to identify the equilibrium point.
To determine the type of a heteroclinic three-cycle (attractor, centre, or repellor), the diagonal entries of the preceding matrix must be reduced to zero and the determinant of the resulting matrix computed. The diagonal entry of each column is made zero by subtracting it from every entry in that column. Applying this procedure to matrix (14), we subtract 1 from each entry of the first column, 3 from each entry of the second column, and 2 from each entry of the third column to construct matrix (20). The cycle type is classified by the determinant of matrix (20): if the determinant equals zero, the cycle is a centre; if it is less than zero, the cycle is an attractor; otherwise, it is a repellor.

The determinant of matrix (20) is zero. As a result, the heteroclinic three-cycle is of centre type. This procedure is repeated for every heteroclinic three-cycle to obtain all equilibrium points, which are collected in table 5.
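The two computations in this procedure can be sketched with exact arithmetic. The sketch assumes that the linear system equates the expected payoffs of the three strategies and that the additional equation is the normalization of the three frequencies to one. The function names are hypothetical, and the example is the plain rock-scissors-paper matrix, not the S0, S15, S10 matrix from table 3.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def equilibrium(A):
    """Solve (Ax)_1 = (Ax)_2 = (Ax)_3 with x_1 + x_2 + x_3 = 1 by Cramer's rule."""
    system = [
        [A[0][k] - A[1][k] for k in range(3)],
        [A[1][k] - A[2][k] for k in range(3)],
        [1, 1, 1],
    ]
    rhs = [0, 0, 1]
    d = det3(system)
    if d == 0:
        raise ValueError("no unique equilibrium point")
    point = []
    for col in range(3):
        replaced = [[rhs[r] if k == col else system[r][k] for k in range(3)]
                    for r in range(3)]
        point.append(Fraction(det3(replaced)) / Fraction(d))
    return point

def cycle_type(A):
    """Zero each column's diagonal entry, then classify by the determinant's sign."""
    shifted = [[A[r][k] - A[k][k] for k in range(3)] for r in range(3)]
    d = det3(shifted)
    if d == 0:
        return "centre"
    return "attractor" if d < 0 else "repellor"

# Illustrative matrix: win = 1, loss = -1, tie = 0.
rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
print(equilibrium(rps), cycle_type(rps))
```

A solution with a negative component would indicate that the three strategies have no interior equilibrium, so the result should be checked before it is interpreted as a mixture.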
Table 1. Approaches and their corresponding payoffs.

Results and discussion
This research studies the performance of each strategy when a different technique is used to generate new units, with a probability of implementation error. General conditions (table 2) and specific values (T = 4, R = 3, P = 1, S = 0) (table 3) are used to determine which strategy dominates the others. Notably, the strategies behave in the same way under the general conditions and under the specific values.
The results of applying domination between all strategies show that no strategy can defeat the rival strategy S2, which itself overcomes eleven strategies, including the All-D strategy S0, the Grim strategy S8, and the Win-Stay, Lose-Shift (WSLS) strategy S9. The spiteful strategy S2 therefore performs strongly under this new round generation technique. Other strategies also perform very well, such as S0, S4, and S8, which defeat at least eleven strategies and are outcompeted by only two other strategies.
Figure 2 and the domination results show that weak strategies allow many other strategies to invade them and are unable to defeat a large number of strategies. Unexpectedly, strategy S6 performs satisfactorily: it outperforms five strategies, and only five other strategies outperform it. The poor strategies S13, S14, and S15 defeat at most three strategies and are defeated by eleven other strategies. Given these findings, we can say that these cooperative strategies cannot hold against defective ones. The results in table 4 show that the two strong strategies S0 and S8 are invaded only by the same two strategies, S2 and S10. The Tit-For-Tat strategy S10 cannot invade any strategy except the three strong strategies S0, S1, and S8. On the other hand, S10 is invaded by four cooperative strategies: S7, S11, S14, and S15. Therefore, in this study, the partner strategy S10 can be considered a moderately well-performing strategy.
Table 5 shows that strategies S10 and S11 are the most frequent strategies in heteroclinic three-cycles: they appear in ten and eleven out of twenty-two cycles, respectively. For S10, this is because it attacks the strong strategies S0 and S8, which in turn attack the majority of the sixteen strategies used. S11 beats four strategies while being outcompeted by eight others. In addition, S14 and S15 each take part in several heteroclinic cycles, most of which involve S10 or S11, because these strategies attack S10 and S11. According to the findings of this study, a player who chooses to use mixed strategies will mostly use strategies S10 or S11.

Conclusion
The primary objective of this study is to investigate the behaviour of strategies in the strictly alternating repeated donation game, in which the two players switch the roles of leader (donor) and recipient in each round, under the influence of memory changes. The research uses a set of sixteen finite two-state automata strategies to characterize the behaviour of the strategies examined.
Instead of relying on the most recent unit, which is commonly used to generate subsequent rounds, a novel approach is employed in which the pre-previous unit determines the new states. This choice is made because of the delays or inadequate information associated with the most recent unit that are commonly encountered in real-time scenarios. Implementation errors modify the generation of states; this factor is taken into account through perturbed payoffs to improve the accuracy of the results.
Throughout this research, from the creation of states to the application of the domination technique, strategy S2 emerges as a particularly robust contender. Specifically, strategy S2 (0, 0, 1, 0) exhibits non-cooperative, spiteful behaviour, cooperating only if it can deceive its opponent, and therefore falls under the category of a defective rival strategy.
In contrast, of all the strategies examined, strategy S14 performs the worst, as illustrated in figure 2. This cooperative strategy, with the configuration (1, 1, 1, 0), opts for defection only when both players defect in the unit used for generation.
Notably, strategies S10 and S11 are the most prevalent strategies in heteroclinic cycles: they appear in the majority of heteroclinic three-cycles, along with strategies S14 and S15, as shown in table 5.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files).
a) Approach A: It has two mutations because its repetition period is one unit.
• If S8 plays D instead of C: A → B
• If S11 plays D instead of C: A → B
b) Approach B: It has four mutations because its repetition period is two units.
• If S8 plays C instead of D when S11 plays C: B → B
• If S8 plays C instead of D when S11 plays D: B → B
• If S11 plays D instead of C when S8 plays D: B → B
• If S11 plays C instead of D when S8 plays D: B → B

Figure 2. The payoff of every strategy against itself and all other strategies. Each of the five sub-figures includes the effective strategy S2 together with three other strategies, to highlight how it behaves better than the others. (a) Contains strategy S2 alongside the defective strategies S0, S4, and S8, which have the highest payoffs against most strategies but are unable to outcompete strategy S2 because they are not stable and have large, rapid payoff fluctuations. (b-d) Contain the remaining strategies, which gain moderate payoffs; the payoff stability of S2 gives it the upper hand. (e) Includes the weak strategies S13, S14, and S15, which gain the lowest payoffs and are easily defeated by strategy S2.