How events determine spreading patterns: information transmission via internal and external influences on social networks

Information transmission models motivated by classical epidemic propagation have recently been applied to a wide range of social systems. They generally assume that information is transmitted mainly among individuals via peer-to-peer interactions on social networks. In this paper, we consider one more way for users to get information: out-of-social-network influence. Empirical analyses of the diffusion of eight typical events on a very large micro-blogging system, Sina Weibo, show that external influence has a significant impact on information spreading alongside social activities. In addition, we propose a theoretical model to interpret the spreading process via both internal and external channels, considering three essential properties: (i) memory effect; (ii) role of spreaders; and (iii) non-redundancy of contacts. Experimental and mathematical results indicate that information indeed spreads much more quickly and broadly under the mutual effects of internal and external influences. More importantly, the present model reveals that the event characteristic largely determines the essential spreading patterns once the network structure is established. The results may shed some light on the underlying dynamics of information transmission on real social networks.


I. INTRODUCTION
How social networks affect information transmission, or information spreading, is a pressing problem. Among the spreading phenomena studied in recent years are news [1] and rumor spreading [2,3], innovation diffusion [4,5], human behaviors [6,7], and culture transmission [8,9]. The structure of a network is crucial in determining the spreading pattern and has thus been widely studied [10,11], with critical phenomena on network topology [12,13], identification of influential spreaders [14-16], and spreading dynamics on adaptive networks [17,18] being the main focuses. With the increasing availability of real, good-quality data for analysis, propagation paths [19,20], patterns of human activities [21,22] and source localization [23,24] have also become hot spots in the study of spreading dynamics.
Theoretical studies on information spreading are mostly carried out within the framework of epidemic spreading [12], where the propagation is regarded as a sequence of social interactions between infected and susceptible individuals [25,26]. Simulation results from such models, however, are very different from those observed in empirical analyses of real data [27], as information spreading has its own special features. Normally, an online individual is unlikely to forward the same piece of news to his or her friends repeatedly, whereas s/he could infect (or be infected by) a friend with the same disease more than once [28]. The memory [29] and temporal effects [30] are also significantly different, with previous behaviors having strong implications for the information spreading process. In addition, the information content [31] and timeliness [32] generate spreading patterns that are very different from those of epidemic propagation.
* zhangzike@gmail.com
The spreading channel also plays an important role in information spreading. Generally, there are two ways for an individual to access information: (i) peer-to-peer communications via a social network; and (ii) an external influence from outside the network. Many previous studies traced the information spreading process by focusing on the interactions among individuals [28,33], but spreading through the external channel was also found to be important [27,34,35]. In Twitter, for example, about 71% of information by volume can be attributed to internal diffusion within the network, and 29% to external influence [36]. In innovation diffusion, Kocsis and Kun [37] found a power law with a crossover in the cluster size distribution, where the global effect due to the external channel determines the cluster's core and the local effect due to the internal channel governs its growth. There have also been studies on the effects of an external channel in epidemics, where transmission through a medium, e.g., mosquitoes, plays the role of an external channel; these studies show that having multiple routes enhances infection [38,39].
Although external influence can apparently enhance the information diffusion [27], it remains unclear how the interplay between external influence and peer-to-peer interactions affects information transmission in social networks [36].
In this paper, we analyze internal and external influences on information spreading by tracking how events diffuse on Sina Weibo (http://www.weibo.com/), the largest micro-blogging system in China. Empirical results show that external influence plays a significant role, especially for events that readily attract the media's attention immediately after their outbreak. We then propose a diffusion model that incorporates both social interactions and media effects [27] so as to illustrate the inter-relationship between the external and internal spreading channels. Both simulation and mathematical results of the model reveal that the spreading pattern is largely determined by the event's characteristic, as found in the empirical analyses.

II. EMPIRICAL REGULARITIES
As in other micro-blogging systems (e.g., Twitter), users of Sina Weibo can post short messages, namely tweets, in a variety of formats. When an event occurs, there are basically two ways to learn about it. Through the peer-to-peer interactions in a social network, referred to as internal influence, users automatically receive the contents posted by other users whom they follow. Alternatively, users become aware of an event via an external influence outside the social network, e.g., via media broadcasts.
Figure 1 shows the spreading dynamics of some selected events from Sina Weibo in the first 100 days of their outbreak. Details on the data are given in Supplementary Materials. Each topic carries at least $10^4$ new tweets or $10^5$ retweets, which are taken as measures of the external and internal influences, respectively. The basic statistics in Table I show that the average retweet number is much larger than the number of new tweets, indicating that information diffusion on Sina Weibo occurs mainly through the internal channel, which is consistent with the results on Twitter [36]. Although all the events spread rapidly in the first ten days (shaded blue), the details of the spreading patterns are different. In Fig. 1, the quantity $p_r$ is defined for each channel from $n_\#(t)$, where $n_\#(t)$ represents the cumulative number of messages posted through channel # (internal or external) up to time $t$. Take the event labelled Yao Ming Retires (Fig. 1b) for example. As Yao Ming is an internationally famous basketball star from China, people learned the news from media coverage. The external influence led to a quicker outbreak of new tweets than of retweets as the news propagated and was discussed ($p_r$ for the external channel is higher than for the internal channel). Another type of event can be observed in the example labelled the Guo Meimei Event (Fig. 1g). It started when an ordinary lady showed off her wealthy lifestyle online, and it did not draw the media's attention initially. Many users gossiped when her account was revealed as that of a key official of the Chinese Red Cross. It quickly became a hot topic and eventually attracted the media's attention. This strong internal influence led to a quicker outbreak of retweets as the item propagated and was discussed ($p_r$ for the internal channel is higher). Figure 1b and Fig. 1g can be taken as typical of externally and internally initiated events, respectively, which are the event characteristics mainly discussed in this work.
We further analyze the diffusion network of each tweet [40,41]. It is a directed network with an edge i→j indicating information transmission from user i to user j. A tweet can be traced from its origin along the retweeting path until the spreading terminates, showing the cascade due to the tweet. The network consists entirely of internal channels and may be divided into several unconnected communities due to the effect of information blind areas [42]. For each event, the cascade size of each tweet can be found. Figure 2 shows the spreading cascade size distribution for each event. Each distribution exhibits a power law with a slope around −2.0, similar to other systems [27], which suggests that the spreading dynamics proceed via a few very large cascades and many small ones. The details, however, are different for internally and externally initiated events. For the Death of Wangyue (Fig. 2f), Guo Meimei (Fig. 2g) and Qian Yunhui events (Fig. 2h), the distribution exponents are less negative (magnitude smaller than 2), indicating that events with stronger peer-to-peer interactions lead to more large cascades. Furthermore, the average cascade size is also larger (see the average retweet number per tweet in Table I). These events were initiated within the social network (see Fig. 1f-1h) until the media picked them up, and the discussions among peers gave rise to the large cascades. In contrast, the other events caught the media's attention quickly. The stronger external influence led to more message sources and smaller cascades (see Fig. 2a-2e), and thus to a more negative exponent (magnitude larger than 2).
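As a rough illustration of the cascade analysis above, the following Python sketch groups retweet records into cascades by their originating tweet and estimates the distribution exponent with a least-squares fit on the log-log size histogram, which is the fitting method named in the caption of Fig. 2. The record format `(root_id, from_user, to_user)`, the function names and the convention of counting the original poster in the cascade size are assumptions made for illustration; they are not taken from the Sina Weibo data set.

```python
import math
from collections import Counter


def cascade_sizes(retweets):
    """Return {root_id: cascade size}.

    `retweets` is an iterable of (root_id, from_user, to_user) records
    (a hypothetical format). The size of a cascade is counted here as the
    number of distinct users it involves, including the original poster.
    """
    users = {}
    for root_id, src, dst in retweets:
        users.setdefault(root_id, set()).update((src, dst))
    return {root_id: len(members) for root_id, members in users.items()}


def loglog_slope(sizes):
    """Least-squares slope of log10(frequency) against log10(size)."""
    counts = Counter(sizes)
    xs = [math.log10(size) for size in counts]
    ys = [math.log10(freq) for freq in counts.values()]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two distinct cascade sizes")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

A slope close to −2 would correspond to the behavior reported for Fig. 2. A plain least-squares fit on a histogram is sensitive to the sparsely populated tail, so it is best read as a descriptive estimate.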

III. MODEL ANALYSIS

A. Model Description
We propose a theoretical model of information spreading that incorporates both internal and external influence. Figure 3 illustrates the model schematically. Two types of agents, ordinary individuals and media-agents, are included in the network. An agent receives information from another agent if s/he follows that agent, as indicated by the arrows (solid lines) for information flow. A tiny fraction of the agents are media-agents which, in addition to forwarding information to their followers, can broadcast information to the public, represented by a group of agents who need not follow them (dashed lines). We aim to incorporate (a) memory effects [29]; (b) external influences [36,37]; and (c) non-redundancy of contacts [28]. As an event propagates, every agent takes on one of four states at any time: (a) unaware: has not yet received information on the event; (b) aware: has received the information but hesitates to accept the content; (c) accepted: has accepted the content and is ready to transmit it; (d) removed: knows the content but will not transmit it any more. Therefore, an agent goes through the sequence unaware→aware→accepted→removed, analogous to the SIR epidemic model.
The information diffusion process can be described as follows:
• To initiate an event, an agent is chosen randomly as a seed (coloured red in Fig. 3) to spread the first piece of information, with its state set to accepted. All other agents are in the unaware state.
• At time step $t$, every agent who turned into the accepted state at time step $(t-1)$ posts the information and becomes removed. An ordinary agent forwards the information to her/his followers as a retweet. A media-agent broadcasts the information as a new tweet to a fraction of randomly chosen agents, to mimic those who gather information from the media, in addition to forwarding it as retweets to its followers.
• At time step $t$, all other agents check for information arrival. Unaware agents that receive information become aware and evaluate a time-dependent acceptance probability $p_a$ according to the source (see Eq. (1)). Aware agents update $p_a$ if information arrives. These agents then use $p_a$ to turn into accepted at time $t$. Those that change to the accepted state are recorded.
• The steps are repeated until the information has spread to all accessible agents in the network.
There is a fraction (0.1% in this paper) of media-agents, and each of them makes the same impact through broadcasting to 0.1% of all agents. The acceptance probability $p_a$ increases as one receives the same information repeatedly. For an ordinary agent $i$ at time $t$, $p_a(i,t)$ is proportional to the amount of information $C(i,t)$ received so far, which is updated according to Eq. (1), where $\Gamma_i^{t-1}$ is the set of agents that $i$ follows and who switched to the accepted state at time $(t-1)$ and thus forward the information to $i$ at time $t$; $w_{ji}$ measures the internal influence due to the interaction $j \to i$ ($w_{ji} = w$ is set for all pairs in the network); $\beta$ measures the external influence due to the media; and the set $M_t$ contains the agents who received broadcast information at time $t$.
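The update rule can be written out from the symbol definitions above. The form below is a reconstruction under stated assumptions, not necessarily the authors' exact Eq. (1): the received information is assumed to accumulate additively, and the external term is assumed to contribute $\beta$ once for each time step at which agent $i$ belongs to $M_t$.

```latex
p_a(i,t) \propto C(i,t) = C(i,t-1)
    + \sum_{j \in \Gamma_i^{t-1}} w_{ji}
    + \beta \, \mathbb{1}\!\left[\, i \in M_t \,\right]
```

With $w_{ji} = w$ for all pairs, the internal term reduces to $w\,|\Gamma_i^{t-1}|$. For example, with $w = 0.1$ and $\beta = 0.01$ (the values quoted for Fig. 5), an agent with $C(i,t-1) = 0$ that hears from two followees and one broadcast at time $t$ would have $C(i,t) = 2 \times 0.1 + 0.01 = 0.21$ under this assumed form.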
For the acceptance probability $p_a^{(m)}$ of the media-agents, we consider two extreme cases. For events initiated via gossip (labelled II for internally initiated, such as the Guo Meimei event) that the media are not eager to report, $p_a^{(m)} = p_a$ as in Eq. (1), so the media-agents follow the same updating rule as ordinary agents. To mimic externally initiated events (labelled EI, such as Yao Ming Retires) that the media rush to report, we set $p_a^{(m)} = 1$ so that media-agents accept the news immediately after they become aware of it. Note that Eq. (1) incorporates the memory effect. As expected, including the external influence enhances information diffusion (see Supplementary Fig. S1).
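The model description above can be sketched as a short simulation. This is a minimal illustration under several assumptions that the text does not fix: the acceptance probability is taken as $p_a = \min(1, C)$ (proportionality constant 1), $\beta$ is added at most once per agent per step, only agents that receive information in a step make an acceptance trial in that step, and posts are attributed to a channel by the type of the posting agent. The function name `simulate` and its parameters (`followers`, `media`, `reach`) are hypothetical.

```python
import random

UNAWARE, AWARE, ACCEPTED, REMOVED = range(4)


def simulate(followers, media, seed, w=0.1, beta=0.01, reach=0.001,
             externally_initiated=True, rng=None):
    """Run one realization of the spreading process.

    followers[i] lists the agents that follow i (information flows i -> j).
    media is the set of media-agents; seed is the initial spreader.
    Returns a list of (retweets, new_tweets) posted at each time step.
    """
    rng = rng or random.Random()
    agents = list(followers)
    state = {i: UNAWARE for i in agents}
    info = {i: 0.0 for i in agents}           # C(i, t): accumulated information
    state[seed] = ACCEPTED
    posting = [seed]                           # agents accepted at the previous step
    n_broadcast = max(1, round(reach * len(agents)))
    history = []

    while posting:
        received = set()
        broadcast_targets = set()
        retweets = new_tweets = 0
        for i in posting:
            state[i] = REMOVED                 # post once, then stop transmitting
            for j in followers[i]:             # internal channel: forward to followers
                if state[j] in (UNAWARE, AWARE):
                    info[j] += w
                    received.add(j)
            if i in media:                     # external channel: broadcast
                new_tweets += 1
                broadcast_targets.update(rng.sample(agents, n_broadcast))
            else:
                retweets += 1
        for j in broadcast_targets:            # beta added once per step (assumed)
            if state[j] in (UNAWARE, AWARE):
                info[j] += beta
                received.add(j)
        history.append((retweets, new_tweets))

        posting = []
        for j in sorted(received):             # sorted for reproducible runs
            state[j] = AWARE
            if j in media and externally_initiated:
                p_accept = 1.0                 # media rush to report EI events
            else:
                p_accept = min(1.0, info[j])   # assumed form of p_a
            if rng.random() < p_accept:
                state[j] = ACCEPTED
                posting.append(j)
    return history
```

Calling `simulate(followers, media, seed, externally_initiated=False)` gives the II variant, in which media-agents use the same acceptance rule as ordinary agents. Because every agent posts at most once, the loop always terminates.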

B. Simulation Results
The model is implemented on the who-follows-whom online social network, i.e., the followship network, extracted from Sina Weibo data. The directed links give the direction of information flow, i.e., i→j when agent i is followed by j. The basic statistics are given in the inset of Fig. 4. The network reciprocity [43] is about 15%. Fig. 4 shows the in-degree and out-degree distributions, excluding agents of degree zero. The distribution of $k_{out}$ is much broader than that of $k_{in}$, due to the two different social relationships in Sina Weibo: following someone and being followed. Agents tend not to follow too many people due to their limited attention [44]. However, some targeted users, e.g., movie stars, are followed by a large number of agents without their consent. The resulting mean degrees give $k_{out} \gg k_{in}$, suggesting that Sina Weibo has developed into a structure highly suitable for information flow (see Supplementary Fig. S2).
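The network statistics discussed above can be computed from an edge list along the following lines. This is a minimal sketch: the function name `degree_stats` is hypothetical, and reciprocity is taken here as the fraction of directed links whose reverse link also exists, which is one common definition and may differ from the measure used in [43]. Following the convention in the text, an edge (i, j) means that information flows from i to j, so the out-degree of i counts its followers. Mean degrees are taken over nodes with non-zero degree only, mirroring the exclusion of zero-degree agents.

```python
from collections import defaultdict


def degree_stats(edges):
    """Return (mean out-degree, mean in-degree, reciprocity).

    edges: iterable of (i, j) pairs, meaning that j follows i and
    information flows i -> j. Duplicate pairs are ignored.
    """
    edge_set = set(edges)
    if not edge_set:
        raise ValueError("edge list is empty")
    k_out = defaultdict(int)
    k_in = defaultdict(int)
    for i, j in edge_set:
        k_out[i] += 1                      # i gains a follower
        k_in[j] += 1                       # j gains a followee
    mean_out = sum(k_out.values()) / len(k_out)
    mean_in = sum(k_in.values()) / len(k_in)
    mutual = sum(1 for i, j in edge_set if (j, i) in edge_set)
    return mean_out, mean_in, mutual / len(edge_set)
```

When a few heavily followed users supply most of the out-links, the mean out-degree over nodes that have any out-link exceeds the corresponding mean in-degree, even though the total number of links is the same in both sums.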
We study both internally initiated (II) and externally initiated (EI) events. As in the empirical analysis, the fractions of followee-follower retweets and broadcasts (new tweets) are recorded as functions of time as the information spreads. Figure 5 shows the results in terms of the cumulative fractions of removed agents due to the two processes for EI (Fig. 5a) and II events (Fig. 5c). Obtained by tracing the propagation paths of many events, Fig. 5b and Fig. 5d give the corresponding cascade size distributions. Evidently, the model reproduces the key features of retweets and new tweets for EI events (compare Fig. 5a with Fig. 1a-1e and Fig. 5b with Fig. 2a-2e), with $p_r(t)$ for new tweets higher than $p_r(t)$ for retweets and a more negative exponent in the cascade size distribution. Similarly, the key features of II events are also reproduced (compare Fig. 5c with Fig. 1f-1h and Fig. 5d with Fig. 2f-2h), with $p_r(t)$ for retweets higher than $p_r(t)$ for new tweets and a less negative exponent in the cascade size distribution.
To understand the effect of media-agents quantitatively, we examine the sensitivity of the proposed model to the ratio of media-agents. Figure 6 shows the dynamics of the removed individuals through the two channels for various media-agent ratios for the EI events. Intriguingly, the spreading pattern is clearly affected by the ratio of media-agents: the channel that bursts first shifts from the external one for relatively large fractions of media-agents (Fig. 6a) to the internal one for small fractions (Fig. 6f). Thus, for EI events, the external channel plays the determining role in the spreading pattern only when there are enough media-agents in the system (e.g., 0.06%, shown in Fig. 6d). In other words, only a few media-agents cannot supersede the influence of gossip, even though they respond promptly to EI events. This suggests that the spreading patterns of EI events could be partially controlled by regulating the media-agents in real social networks, e.g., by persuading "stars" not to forward the target message. In contrast to EI events, for II events the information spread through the internal channel always bursts first (see Supplementary Fig. S3). For such events, the media-agents can only influence the outbreak size of the information and cannot change the spreading pattern, no matter how large a share of the network they occupy.

C. Mathematical Analysis
In this section, we give a mathematical analysis to illustrate the information diffusion patterns of the proposed model. We use the superscripts $n$ and $m$ to represent the ordinary individuals and media-agents, respectively. Let $S(t)$, $I(t)$ and $R(t)$ denote the densities of individuals in the aware or unaware, accepted, and removed states, respectively. Adopting the mean-field approach [12,45,46], we can obtain the differential equations, Eq. (2), describing the time evolution of the densities in each population, where $l$ is the average out-degree of ordinary individuals, $o$ is the number of agents that can receive the information through the broadcasting of each media-agent, and $p_a(t)$ and $p'_a(t)$ are the acceptance probabilities at time $t$ for ordinary individuals and media-agents, respectively. According to Eq. (1), the average $p_a$ is proportional to the number of removed-state individuals in the system [47]. Therefore, we assume that $p_a(t)$ follows a sigmoid function (also known as the Fermi function in physics [48]), $p_a(t) \sim c/(1 + e^{-at+b})$ (Supplementary Fig. S1 and Fig. S2 show the plausibility of this hypothesis).
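The assumed sigmoid form of the acceptance probability is easy to evaluate directly. The sketch below is only an illustration of that functional form: the function name is hypothetical, and no parameter values for $a$, $b$ and $c$ are given in the text, so any values passed in are the user's own choice.

```python
import math


def acceptance_probability(t, a, b, c):
    """Sigmoid form p_a(t) = c / (1 + exp(-a * t + b)).

    c is the saturation level, a sets the steepness, and the midpoint
    p_a = c / 2 is reached at t = b / a (for a > 0).
    """
    return c / (1.0 + math.exp(-a * t + b))
```

For $a > 0$ the function increases monotonically from near 0 at early times towards $c$ at late times, which matches the picture of acceptance growing as more removed-state individuals accumulate in the system.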
As described in the Model Description, for the diffusion of EI events the media-agents respond to the event promptly, so that $p'_a = 1$ at all times. II events, in contrast, are less attractive to media-agents when they happen, so that $p'_a(t)$ for the media-agents is identical to that of ordinary individuals, i.e., $p'_a(t) = p_a(t)$. In addition, because only a small fraction of media-agents (0.1%) is involved in the initial spreading process, $p'_a(t) \to 0$ at early times. We can therefore obtain the numerical results for Eq. (2) shown in Fig. 7, which share a similar pattern with the simulation and empirical results. That is to say, spreading via the external channel is always ahead of that through the internal channel for the diffusion of EI events (see Fig. 7a), and vice versa for II events (see Fig. 7b). A more detailed analysis of the outbreak threshold of the proposed model is presented in Supplementary Materials, which shows that including the external influence lowers the information outbreak threshold significantly (see Supplementary Fig. S4).

IV. CONCLUSIONS & DISCUSSION
In this paper, we have studied the internal and external influences on information transmission on social networks. Empirical analyses of a wide range of incidents on China's largest micro-blogging platform, Sina Weibo, show that there are apparent differences between EI and II events. EI events, which attract more attention from media-agents, result in broad and diverse popularity and correspondingly large exponents of the cascade size distribution. In comparison, II events, which are driven mainly by social communication, show the opposite behavior. The present findings therefore demonstrate that the combination of out-of-network broadcasting and peer-to-peer interactions plays a significant role in the emergence of different information transmission patterns.
To understand how information is transmitted through both peer-to-peer interactions and media effects, we have proposed an information spreading model based on the classical SIR model, considering three representative characteristics: (i) memory effect; (ii) role of spreaders; and (iii) non-redundancy of contacts. These are all essential properties of information diffusion and make it quite different from the basic models of biological epidemics. In the model, a small fraction of randomly selected individuals act as media-agents, through which information can be transmitted outside the fixed structure of the social network; this is referred to as the external influence. Both simulation and mathematical results show that, although information diffusion depends largely on the strength of the peer-to-peer interactions, the spreading pattern is essentially determined by the event attribute once the observed network structure is established, which agrees well with the empirical analyses.
In the proposed model, individuals receive information in two ways: through internal (peer-to-peer contacts) and external (media) influences. The role of the external influence can be interpreted in two aspects: (i) the depth effect, regarded as the media's credibility, which is the amount of information received by aware- and unaware-state individuals and is represented by the parameter β in the model; and (ii) the breadth effect, regarded as the media-agent's range of influence, which reaches more active individuals who have not yet accepted the information via media broadcasting. Besides, the internal and external influences also promote each other's effects. On the one hand, the breadth effect of the external influence makes more active individuals aware of the information, and they then transmit it to all their followers. On the other hand, events spreading through the internal channel attract more media to report them, which further enlarges the external influence on the event diffusion. As a consequence, information spreads more quickly and broadly in social systems through the mutual reinforcement of external and internal influences (see Supplementary Fig. S1). Furthermore, we examine the impact of network structure by investigating different media-agent ratios (see Fig. 6 and Supplementary Fig. S3). The results reveal that the population informed through both external and internal channels increases as the ratio of media-agents grows. In addition, the ratio of media-agents largely influences the spreading patterns for EI events. Therefore, strategy or policy makers should pay more attention to working with the media-agents in order to manage information diffusion effectively.
The findings of this work may have various applications in studying how information spreads on social networks: (i) rumor spreading and detection, which are both popular and serious topics in maintaining a healthy climate of public opinion; and (ii) information filtering, which faces a huge challenge in dealing with rapidly growing data every day, and where the present findings may partially inspire more effective algorithms for providing users with relevant and timely recommendations. The present work provides only a starting point for studying internal and external influences; a more comprehensive and in-depth understanding of multi-channel effects still requires further effort.

FIG. 1. The spreading dynamics versus time of eight selected events on Sina Weibo. Blue areas represent the spreading range within ten days after the corresponding events occurred. Red and black curves represent the spreading affected by internal (retweets) and external influence (new tweets), respectively.

FIG. 2. The cascade size distribution for the diffusion of eight selected events. The distribution exponent is obtained by the least-squares method.

FIG. 3. Illustration of the information spreading model with both internal and external influence. The agents with loudspeakers represent the media-agents (external influence), which can spread information to the other agents with the same probability (dashed arrows). The other, gray agents represent ordinary individuals (the red agent is randomly selected to represent the information seed in the model), which can only deliver messages via peer-to-peer interactions based on the existing social structure (solid arrows). All arrows indicate the direction of information flow.

FIG. 4. The degree distribution of the social network of Sina Weibo. With links pointing in the direction of information flow, $k_{in}$ and $k_{out}$ represent the number of followees and followers of the corresponding user, respectively. The inset gives the basic statistics of the original social network of Sina Weibo. $N_{node}$ and $N_{edge}$ are the number of nodes and directed links, respectively. In the inset, $k$, $k_{in}$ and $k_{out}$ denote the average degree, average in-degree and average out-degree, respectively. The nodes with zero in-degree or out-degree are not counted.

FIG. 5. Simulation of the information spreading via the two different channels. a and c: cumulative fraction of removed individuals as a function of time; b and d: the cascade size distribution produced by the proposed model. The parameters are set as: a and b: w = 0.1 and β = 0.01 for the EI events; c and d: w = 0.1 and β = 0.01 for the II events.

FIG. 7. Cumulative fraction of removed individuals versus time steps in the numerical analysis. a: the EI events; b: the II events. The parameters are set as: w = 0.1 and β = 0.01.

TABLE I. Basic statistics of the eight representative events. day represents the date when the corresponding event happened, $N_m$ represents the number of new tweets talking about the corresponding event, $N_r$ represents the total number of new tweets and retweets about the event, and $\bar{N}_r$ represents the average retweet number of each tweet.
FIG. 6. Different patterns of information spreading via the two channels for various media-agent ratios for the EI events. a-f represent the results for different ratios of media-agents: a 0.09%, b 0.08%, ..., f 0.04%, respectively. g represents the fraction of removed individuals through the different channels for various media-agent ratios. The average values and the corresponding standard deviations are obtained over 100 independent realizations.