Circadian pattern and burstiness in mobile phone communication

The temporal communication patterns of human individuals are known to be inhomogeneous or bursty, which is reflected in the heavy-tailed behavior of the inter-event time distribution. Two main mechanisms have been suggested as the cause of such bursty behavior: a) inhomogeneities due to the circadian and weekly activity patterns and b) inhomogeneities rooted in human task execution behavior. Here we investigate the roles of these mechanisms by developing and then applying systematic de-seasoning methods to remove the circadian and weekly patterns from the time series of mobile phone communication events of individuals. We find that the heavy tails in the inter-event time distributions remain robust with respect to this procedure, which clearly indicates that the human task execution based mechanism is a possible cause for the remaining burstiness in temporal mobile phone communication patterns.


Introduction
In recent years, modern information-communication technology (ICT) has opened up access to a large amount of stored digital data on human communication, which in turn has enabled us to gain unprecedented insights into patterns of human behavior and social interaction. For example, we can now study the structure and dynamics of large-scale human communication networks [1][2][3][4] and the laws of mobility [5][6][7], as well as the motifs of individual behavior [8][9][10][11][12][13][14]. One of the robust findings of these studies is that human activity over a variety of communication channels is inhomogeneous, such that high activity bursts of rapidly occurring events are separated by long periods of inactivity [15][16][17][18][19][20]. This feature is usually characterized by the distribution of inter-event times $\tau$, defined as the time intervals between, e.g., consecutive e-mails sent by a single user. This distribution has been found to have a heavy tail and show a power-law decay as $P(\tau) \sim \tau^{-1}$ [8].
In human behavior, the obvious causes of inhomogeneity are the circadian and other longer cycles of our lives as a result of natural and societal factors. Malmgren et al. [9,10] suggested that the approximate power-law scaling found in the inter-event time distribution of human correspondence activity is a consequence of the circadian and weekly cycles affecting us all, such that the large inter-event times are attributed to night-time and weekend inactivity. As an explanation they proposed a cascading inhomogeneous Poisson process, which is a combination of two Poisson processes with different time scales. One of them is characterized by the time-dependent event rate representing the circadian and weekly activity patterns, while the other corresponds to the cascading bursty behavior with a shorter time scale. Their model was able to reproduce an apparent power-law behavior in the inter-event time distribution of e-mail and postal mail correspondence. In addition, they calculated the Fano and Allan factors to indicate the existence of some correlations for the e-mail data as well as for their model of the inhomogeneous Poisson process, with quite good agreement between the two [12].
However, the question remains whether, in addition to the circadian and weekly cycle-driven inhomogeneities, there are also other correlations due to human task execution that contribute to the inhomogeneities observed in communication patterns, as suggested, e.g., by the queuing models [8,21]. There is evidence for this from Goh and Barabási [22], who introduced a measure that indicates that the communication patterns have correlations. Recently, Wu et al. [23] studied a modified version of the queuing process proposed in [8] by introducing a Poisson process as the initiator of localized bursty activity. This was aimed at explaining the observation that the inter-event time distributions in short message (SM) correspondence follow a bimodal combination of power-law and Poisson distributions. The power-law (Poisson) behavior was found to be dominant for $\tau < \tau_0$ ($\tau > \tau_0$). Since the event rates extracted from the empirical data have time scales larger than $\tau_0$ (also measured empirically), a bimodal distribution was successfully obtained. However, in their work, the effects of circadian and weekly activity patterns were not considered, and thus need to be investigated in detail.
As the circadian and weekly cycles affect human communication patterns in quite obvious ways, with communication taking place mostly during the daytime and differently during the weekends, our aim in this paper is to remove, or de-season, from the data the temporal inhomogeneities driven by these cycles. The study of the remaining de-seasoned data would then enable us to gain insight into the existence of other human-activity-driven correlations. This is important for two reasons. Firstly, communication patterns tell us about the nature of human behavior. Secondly, in devising models of human communication behavior, the different origins of inhomogeneities should be properly taken into account: is it enough to describe the communication pattern by an inhomogeneous Poissonian process, or do we need a model to reflect correlations in other human activities, such as those due to task execution?
In this paper, we provide a systematic method to de-season the circadian and weekly patterns from the mobile phone communication data. Firstly, we extract the circadian and weekly patterns from the time-stamped communication records, and secondly, these patterns are removed by rescaling the timing of the communication events, i.e. phone calls and SMs. The rescaling is performed such that the time is dilated or contracted at times of high or low event activity, respectively. Finally, we obtain the inter-event time distributions by using the rescaled timings and compare them with the original distributions to check how the heavy tail and burstiness behavior are affected. As the main result we find that the de-seasoned data still show heavy-tailed inter-event time distributions with power-law scaling, thus indicating that human task execution is a possible cause of the remaining burstiness in mobile phone communication.
This paper is organized as follows. In section 2, we introduce methods for de-seasoning the circadian and weekly patterns systematically in various ways. By applying these methods the values of burstiness of inter-event time distributions are obtained and subsequently discussed. Finally, we summarize the results in section 3.

De-seasoning analysis
We investigate the effect of circadian and weekly cycles on the heavy-tailed inter-event time distribution and burstiness in human activity by using the mobile phone call (MPC) dataset from a European operator (national market share ∼ 20%) with time-stamped records over a period of 119 days starting from 2 January 2007. The data of 1 January 2007 are not considered due to their rather unusual pattern of human communication. We have only retained links with bidirectional interaction, yielding $N = 5.2 \times 10^6$ users, $L = 10.6 \times 10^6$ links and $C = 322 \times 10^6$ events (calls). For the analysis of the SM dataset, see the appendix.
We perform the de-seasoning analysis by first defining the observable. For an individual service user $i$, $n_i(t)$ denotes the number of events at time $t$, where $t$ ranges from 0 seconds, i.e. the start of 2 January 2007 at midnight, to $T_f \approx 1.03 \times 10^7$ s (119 days). The total number of events $s_i \equiv \sum_{t=0}^{T_f} n_i(t)$ is called the strength of user $i$. In general, for a set of users $\Lambda$, the number of events at time $t$ is denoted by $n_\Lambda(t) \equiv \sum_{i \in \Lambda} n_i(t)$. $\Lambda$ can represent one user, a set of users or the whole population. If the period of cycle $T$ is given, the event rate $\rho_{\Lambda,T}(t)$ with $0 \le t < T$ is defined as

$$\rho_{\Lambda,T}(t) = \frac{T}{s_\Lambda} \sum_{k=0}^{\lfloor T_f/T \rfloor} n_\Lambda(t + kT), \qquad s_\Lambda \equiv \sum_{i \in \Lambda} s_i.$$

For convenience, we redefine the periodic event rate with period $T$ as $\rho_{\Lambda,T}(t) = \rho_{\Lambda,T}(t + kT)$ with any non-negative integer $k$ for $0 \le t < \infty$. By means of the event rate, we define the rescaled time $t^*(t)$ as follows [12]:

$$t^*(t) = \int_0^t \rho_{\Lambda,T}(t')\,dt'.$$

This rescaling corresponds to the transformation of the time variable by $\rho^*(t^*)\,dt^* = \rho(t)\,dt$ with $\rho^*(t^*) = 1$. Here $\rho^*(t^*) = 1$ means that there exists no cyclic pattern in the frame of rescaled time. The time is dilated (contracted) at the moment of high (low) activity.
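The rescaling can be illustrated with a minimal Python sketch. It assumes a piecewise-constant event rate estimated by folding event times onto one cycle and binning them, normalised so that its mean over one period is 1; the function names `event_rate` and `rescaled_time`, the parameter `n_bins` and the toy timestamps are illustrative assumptions, not part of the original analysis.

```python
def event_rate(event_times, period, n_bins):
    """Piecewise-constant periodic event rate with mean 1 over one period."""
    if not event_times:
        raise ValueError("need at least one event")
    bin_width = period / n_bins
    counts = [0] * n_bins
    for t in event_times:
        phase = t % period                          # fold the event onto one cycle
        idx = min(int(phase / bin_width), n_bins - 1)
        counts[idx] += 1
    total = len(event_times)
    # Normalise so that sum(rate) * bin_width == period, i.e. the mean rate is 1.
    return [c * n_bins / total for c in counts]


def rescaled_time(t, rate, period):
    """Integral of the periodic, piecewise-constant rate from 0 to t."""
    n_bins = len(rate)
    bin_width = period / n_bins
    full_periods, phase = divmod(t, period)
    idx = min(int(phase / bin_width), n_bins - 1)
    partial = sum(rate[:idx]) * bin_width + rate[idx] * (phase - idx * bin_width)
    # Each complete cycle integrates to exactly `period`.
    return full_periods * period + partial


events = [0.5, 2.5, 4.5, 8.5]                       # toy timestamps (hypothetical)
rate = event_rate(events, period=4.0, n_bins=4)
rescaled = [rescaled_time(t, rate, 4.0) for t in events]
```

In this toy example the first bin of the cycle holds three of the four folded events, so one unit of real time in that bin is stretched to three units of rescaled time, while bins without events are contracted to zero length.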
In order to check whether the rescaling affects the inter-event time distributions and whether some burstiness still exists, we reformulate the inter-event time distributions by using rescaled event times and compare them with the original distributions. The definition of the rescaled inter-event time from the rescaled time is straightforward. Considering two consecutive events of a user $i \in \Lambda$ occurring at times $t_j$ and $t_{j+1}$, the original inter-event time is $\tau \equiv t_{j+1} - t_j$, and the corresponding rescaled inter-event time is defined as follows:

$$\tau^*_{\Lambda,T} \equiv t^*(t_{j+1}) - t^*(t_j) = \int_{t_j}^{t_{j+1}} \rho_{\Lambda,T}(t')\,dt'.$$

To find out how much the de-seasoning affects burstiness, we measure the burstiness of events, as proposed in [22], where the burstiness parameter $B$ is defined as

$$B \equiv \frac{\sigma_\tau - m_\tau}{\sigma_\tau + m_\tau}.$$

Here $\sigma_\tau$ and $m_\tau$ are the standard deviation and the mean of the inter-event time distribution $P(\tau)$, respectively. The value of $B$ is bounded within the range of $[-1, 1]$ such that $B = 1$ for the most bursty behavior, $B = 0$ for neutral or homogeneous Poisson behavior and $B = -1$ for completely regular behavior. The burstiness of the original inter-event time distribution, denoted by $B_0$, is to be compared with the burstiness $B_T$ of the rescaled inter-event time distribution for a given period $T$. With the de-seasoning the burstiness is expected to decrease, and here we are most interested in knowing by what amount the burstiness decreases when using $T = 1$ day or 7 days, i.e. removing the circadian or weekly patterns.
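As a minimal sketch, the burstiness parameter of [22], $B = (\sigma_\tau - m_\tau)/(\sigma_\tau + m_\tau)$, can be computed from a list of event times as follows. The helper names `inter_event_times` and `burstiness` and the toy inputs are illustrative assumptions; the population standard deviation is used here, which is one possible convention.

```python
from statistics import mean, pstdev


def inter_event_times(event_times):
    """Differences between consecutive (sorted) event times."""
    ts = sorted(event_times)
    return [b - a for a, b in zip(ts, ts[1:])]


def burstiness(taus):
    """Burstiness parameter B = (sigma - m) / (sigma + m) of inter-event times."""
    m = mean(taus)
    sigma = pstdev(taus)          # population standard deviation (assumed convention)
    if sigma + m == 0:
        raise ValueError("all inter-event times are zero")
    return (sigma - m) / (sigma + m)


regular = burstiness(inter_event_times([0, 1, 2, 3, 4]))       # evenly spaced events
bursty = burstiness(inter_event_times([0, 1, 2, 3, 100]))      # one long gap
```

The same function can be applied to original and rescaled event times to obtain $B_0$ and $B_T$, respectively.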

Individual de-seasoning
First, we perform the de-seasoning analysis for individual users with various values of $T$. For some sample individuals in the MPC dataset, we obtain the original and the rescaled event rates. The event rates in the case of $T = 1$ day are depicted in the left column of figure 1. The strengths of individual users are 200, 400, 800, 1600 and 3197. In most cases, we find the characteristic circadian pattern, i.e. inactive night-time and active daytime with one peak in the afternoon and another peak in the evening. The rescaled event rate successfully shows the expected de-seasoning effect, i.e. $\rho^*(t^*) = 1$, except for weak fluctuations.
The values of burstiness are shown in figure 3(a). The more active users have larger values of burstiness, while the values of burstiness of the more active users decay faster (slower) than those of the less active users before (after) $T = 7$ days. The overall behavior of the distributions shows that de-seasoning the circadian and weekly patterns does not destroy the bursty behavior of most individual users irrespective of their strengths. In addition, we find some exceptional users whose original values of burstiness are negative, indicating more regular behavior than the Poisson process, and we also find a few individual users whose values of burstiness have grown as a result of de-seasoning, i.e. $B_T > B_0$.

De-seasoning groups of individuals with the same strength
Here we analyze the group of individual users with the same strength, i.e. $\Lambda_s \equiv \{i \mid s_i = s\}$. The averaged event rate of a group is measured by merging individual event rates, precisely by obtaining $n_{\Lambda_s}(t) = \sum_{i \in \Lambda_s} n_i(t)$. Figure 4 shows the original and the rescaled event rates with $T = 1$ day (left) and the original and the rescaled inter-event time distributions with various periods of $T$ (right) for groups with strengths $s = 200$, 400, 800 and 1600. The values of burstiness decrease only slightly as $T$ increases, but are smaller than those of the original burstiness, as shown in figure 3(b). The burstiness of groups of individuals with the same strength is larger than the average values of individual burstiness from $P(B)$ of the same strength. For example, $B_0 \approx 0.256$ for the group of strength $s = 200$ turns out to be larger than $\int B_0 P(B_0)\,dB_0 \approx 0.204$. Regarding this difference, we would like to note that the de-seasoning of individual event times by means of the averaged event rate may cause systematic errors due to the different circadian and weekly patterns between the individual and the group. To resolve this issue, various data clustering methods, such as self-organizing maps, can be used to classify users' activity patterns beyond their strengths and then perform de-seasoning separately for the different groups.

De-seasoning groups of individuals with a broad range of strengths
For the larger-scale analysis, we consider the strength-dependent grouping of users, i.e. groups of individual users with a broad range of strengths, denoted by $\Lambda_{m_1,m_2} \equiv \{i \mid m_1 \le s_i < m_2\}$, similarly to [19,20]. The values of $m$ are determined in terms of the ratio to the maximum strength $s_{\max} = 7911$; see table 1 for details of the groups. We determine the averaged event rates of the groups and some of them are shown in the left column of figure 5. By means of the event rates, we perform the de-seasoning to obtain the rescaled inter-event time distributions; see the right column of figure 5. It is found that the values of burstiness initially decrease slightly and then stay constant at relatively large values as $T$ increases, as shown in figure 3(c). These results again confirm our conclusion that de-seasoning the circadian and weekly patterns does not wipe out the bursty behavior of human communication patterns.

Power spectra analysis
In order to see clearly the effect of de-seasoning on the event rates, we compare the power spectra of the rescaled event rates with those of the original event rates. The power spectrum of the event rate $\rho(t)$ is defined as

$$P(f) = \left| \int_0^{T_f} \rho(t)\, e^{2\pi i f t}\, dt \right|^2,$$

where $f$ denotes the frequency. In figure 6 we show the results of this comparison, where it is evident that with our de-seasoning methods, the circadian and weekly peaks of the original power spectrum are successfully removed and do not show in the rescaled spectra. In all cases, ranging from individual de-seasoning to the whole population de-seasoning, when $T = 1$ day, the circadian peak at $1/f = 1$ day is removed, while the others, i.e. the weekly and monthly peaks, remain. For $T = 7$ days, the weekly peak at $1/f = 7$ days is removed and so on. This behavior can be understood because cyclic patterns longer than $T$ will not be affected by de-seasoning with the period $T$.
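The effect on the spectrum can be illustrated with a small Python sketch that assumes the event rate has been sampled on a regular grid and uses a plain discrete Fourier transform as a stand-in for the continuous power spectrum. The function name `power_spectrum` and the synthetic series (a cosine with a 24-sample cycle, and a flat rate equal to 1) are hypothetical and chosen only for illustration.

```python
import cmath
import math


def power_spectrum(series):
    """Squared modulus of the discrete Fourier transform for k = 0 .. n/2."""
    n = len(series)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * j / n)
                    for j, x in enumerate(series))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


n_samples = 96                                   # four cycles of 24 samples each
cyclic = [1.0 + math.cos(2 * math.pi * j / 24) for j in range(n_samples)]
flat = [1.0] * n_samples                         # idealised de-seasoned rate

cyclic_spec = power_spectrum(cyclic)
flat_spec = power_spectrum(flat)

# Index of the strongest non-zero-frequency component of the cyclic series.
peak_index = max(range(1, len(cyclic_spec)), key=cyclic_spec.__getitem__)
```

The cyclic series has its dominant peak at the index corresponding to four cycles in the record, whereas the flat series carries essentially no power away from zero frequency. For real data a fast Fourier transform routine would be the practical choice; the direct sum here only keeps the sketch dependency-free.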

Summary
The heavy tails and burstiness of inter-event time distributions in human communication activity are affected by circadian and weekly cycles as well as by correlations rooted in human task execution. To investigate the existence of correlations rooted in human task execution, we devised a systematic method to de-season circadian and weekly cycles appearing in human activity and successfully demonstrated their removal. Here the circadian and weekly patterns extracted from the MPC and SM records are used to rescale the timings of events, i.e. the time is dilated or contracted during high or low call or SM activity of individual service users, respectively.
We found that after de-seasoning the inhomogeneities driven by circadian and weekly cycles, the heavy tails and burstiness of inter-event time distributions remain. Hence our results imply that the heavy tails and burstiness are not only the consequence of circadian and weekly cycles but also due to correlations in human task execution. In addition, we calculated the Fano and Allan factors [12] for the de-seasoned mobile phone communication data, which yielded further evidence of the remaining burstiness in both MPC and SM activity.
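For reference, the Fano and Allan factors can be sketched in Python under their standard definitions: with $N_k$ the number of events in the $k$-th window of a fixed length, the Fano factor is the variance of $N_k$ divided by its mean, and the Allan factor is $\langle (N_{k+1} - N_k)^2 \rangle / (2 \langle N_k \rangle)$. The function names, the window length and the toy inputs below are assumptions for illustration and are not taken from the analysis in [12].

```python
from statistics import mean, pvariance


def window_counts(event_times, window, t_end):
    """Number of events in consecutive windows of equal length covering [0, t_end)."""
    n_windows = int(t_end // window)
    counts = [0] * n_windows
    for t in event_times:
        k = int(t // window)
        if 0 <= k < n_windows:
            counts[k] += 1
    return counts


def fano_factor(counts):
    """Variance of the window counts divided by their mean."""
    return pvariance(counts) / mean(counts)


def allan_factor(counts):
    """Mean squared difference of successive counts over twice the mean count."""
    diffs = [(b - a) ** 2 for a, b in zip(counts, counts[1:])]
    return mean(diffs) / (2 * mean(counts))


counts = window_counts([0.5, 1.5, 1.7, 3.2], window=1.0, t_end=4.0)
```

For a homogeneous Poisson process both factors are close to 1 for any window length, so values that grow with the window length point to correlations between events.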
Although beyond the scope of the present study, the next step is to describe the mechanisms behind the remaining burstiness in human communication patterns, a task best served by building models and analyzing their results.

Appendix
We also investigated the effect of circadian and weekly cycles on the heavy-tailed inter-event time distribution by using the SM dataset from the same European operator mentioned in the main text. We have only retained links with bidirectional interaction, yielding $N = 4.2 \times 10^6$ users, $L = 8.5 \times 10^6$ links and $C = 114 \times 10^6$ events (SMs). We have merged some consecutive SMs sent by one user to another user within 10 s into one SM event, because one longer message can be divided into many SMs due to the length limit of a single SM (160 characters) [24].
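The merging step can be sketched in Python as follows. This is a minimal illustration that assumes the 10 s threshold is applied to the gap between successive SMs of the same sender-receiver pair, and that a merged run is represented by the time of its first SM; the function name `merge_consecutive` and the toy timestamps are hypothetical.

```python
def merge_consecutive(times, max_gap=10.0):
    """Collapse runs of messages separated by at most `max_gap` seconds.

    `times` holds the timestamps (in seconds) of SMs sent from one user to
    one other user. Each run is represented by its first timestamp.
    """
    merged = []
    previous = None
    for t in sorted(times):
        # Start a new event only if the gap to the previous SM exceeds max_gap.
        if previous is None or t - previous > max_gap:
            merged.append(t)
        previous = t
    return merged


sm_times = [0, 4, 9, 30, 35, 100]                # toy timestamps (hypothetical)
sm_events = merge_consecutive(sm_times)
```

Measuring the gap from the previous SM, rather than from the first SM of the run, lets a long message split into several parts be merged even when the whole run spans more than 10 s; the alternative choice is equally compatible with the description above.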
We apply the same de-seasoning analysis described in the main text to the SM dataset. Figure A.1 shows various circadian activity patterns of individual users. The main difference from the MPC dataset is that the activity peak is about 11 pm. This feature becomes evident when the averaged event rates are obtained from the same strength groups or from groups with a broad range of strengths, shown in the left columns of figures A.4 and A.5. For details of the strength-dependent grouping, see table A.1.
The inter-event time distributions and their values of burstiness are also compared. First, the distributions cannot be described by a simple power-law form but rather by the bimodal combination of power-law and Poisson distributions suggested in [23]. As shown in figure A.3, the values of burstiness slowly decrease as the period $T$ increases up to 7 days, which implies that de-seasoning the circadian and weekly patterns does not considerably affect the bursty behavior.
Finally, we perform the power spectrum analysis for the SM dataset in figure A.6 and find again that cyclic patterns longer than T cannot be removed by the de-seasoning with period T . All these results confirm our conclusion that the heavy tail and burstiness are not only the consequence of circadian and other longer-cycle patterns but also due to other correlations, such as human task execution.