The dynamic nature of conflict in Wikipedia

Published 7 October 2014 Copyright © EPLA, 2014
Citation: Y. Gandica et al 2014 EPL 108 18003. DOI: 10.1209/0295-5075/108/18003

Abstract

The voluntary process of Wikipedia edition provides an environment in which the outcome is clearly a collective product of interactions involving a large number of people. We propose a simple agent-based model, developed from real data, to reproduce the collaborative process of Wikipedia edition. With a small number of simple ingredients, our model mimics several interesting features of real human behaviour, namely in the context of edit wars. We show that the level of conflict is determined by a tolerance parameter, which measures the editors' capability to accept different opinions and to change their own opinion. We propose to measure conflict with a parameter based on mutual reverts, which increases only in contentious situations. Using this parameter, we find a distribution for the inter-peace periods that is heavy tailed. The effects of wiki-robots in the conflict levels and in the edition patterns are also studied. Our findings are compared with previous parameters used to measure conflicts in edit wars.

Introduction

The study of interacting particle systems has, for a long time, been an important subject of physics. The use of statistical methods has allowed for major advances in this area, by providing a bridge between the microscopic interactions and the large-scale collective behaviour of the system [1,2]. This success has motivated researchers to try a statistical approach to other subjects outside physics [3,4]. The application of the methods of statistical physics to social phenomena, where the interacting particles are now interacting human beings, has proved to be very fruitful in allowing for the understanding of many features of human behaviour [5–7]. Some of these properties are common to very different phenomena in nature. Scaling, for example, is generally observed in a great variety of human networks. Universality, which states that the emergent phenomena displayed by the collective behaviour of interacting particles depend on symmetries, dimensionality and conservation laws and not on the microscopic details of the intrinsic dynamics mechanism [2,8], seems to be present in many social situations [5–7]. In this sense, the successful modelling of social systems depends mainly on two major strategies: on the one hand an appropriate selection of the relevant variables and, on the other, a good visualisation/representation of the displayed phenomena, both related to the specific system being studied.

For this endeavour, Internet data has played an important role as a source of data that allows for the test of models of universal social patterns as a collective effect of interaction among single individuals [9]. The development of an article for Wikipedia (WP) is a process in which anybody may edit and change its content. Whatever the reasons behind someone's decision to edit an article in WP, and whatever his background and previous knowledge on the article's subject, it has been recognised that the reliability thus obtained is comparable to that of other high quality professional encyclopedias, such as the Encyclopedia Britannica [10]. This voluntary process clearly creates an environment where the outcome is a collective product of interactions among a large number of people [11–13]. The on-line availability of the historical record of all editions has promoted an intense research activity [14–16], trying to grasp the intrinsic behaviour that characterises several features of WP editing.

The emergence and development of conflicts in WP is one of the features that has recently received attention from the academic world. The interplay between strong convictions and tolerance leads to conflicts among the editors (the so-called edit wars) and the WP article may converge (or not) into a consensus edition [17]. Several approaches to measure the level of conflict in an article as it develops have been tried. Some take into account the resulting topology, for example on semantic flow [18] or on talk pages [19], missing the underlying dynamical process. Other approaches determine conflict levels by focusing on WP editors' dynamical behaviour, by measuring, for example, the talk page length [20–22]. The drawback of these approaches, besides the time-consuming effort needed to apply the methodologies, is that the use of this editing channel, which was created to discuss changes and controversies, depends on cultural traits. In some cultures, the talk pages are extensively used to discuss the differences of opinion, while in others they are almost not used at all for this purpose [23]. The search for culture-independent indicators led to the study of disputes between pairs of editors by measuring, for example, the number of words that they have deleted from each other in a sequence of editions [17,24]. These and other appropriate parameters, which show some degree of mutual correlation, have been shown to work well as measures of conflict [23].

Recently, it was realised that, in the context of edit wars, reversion is quite common and becomes a typical mechanism used by editors to disagree with others in a controversial mode [25]. Reversion consists in completely recovering a certain previous edition of the article, totally disregarding the changes made afterwards. However, revert maps cannot fully differentiate between conflictive and non-conflictive articles [23], as reversion cannot discriminate between dispute and the response to simple acts of vandalism (such as restoring a page that has been fully deleted). Furthermore, acts of vandalism are not such a rare event; they are actually responsible for about 24% of all the reverts [25]. Nonetheless, this problem can be avoided if, instead of simple reverts, mutual reverts (when two editors revert each other's editions) are used to define conflictual behaviours [23,26,27]. Yasseri et al. proposed a parameter which is a function of mutual reverts and may distinguish conflicts from mere vandalism (see [27] and references therein). They also proposed an agent-based model in order to reproduce the main features depicted by their controversy parameter [28].

In this letter, we address the behaviour underlying the collective dynamics that emerges from WP editors' interactions. We show some plots of WP data different from those previously presented in the literature. The analysis of these plots led us to propose a new agent-based model to simulate the edition of a WP page. Inspired by the parameter M introduced by Yasseri et al. [23] to measure conflict, we define another parameter C, also based on mutual reverts, which is similar to M but has the advantage of more accurately detecting the end of conflicts. We use the same conflict parameter to compare our model to real data. We find a scaling behaviour when measuring the inter-peace periods with the new conflict parameter. Finally, we explore how the results of WP data analysis change due to the presence of edit robots.

Real data

The analysis presented in this letter is based on the January 2010 dump of the English WP, containing 4.64 million pages, available at the WikiWarMonitor webpage (http://wwm.phy.bme.hu/). The data sample used, a "light dump", contains a reduced list of information on all the page edits (the edit timestamp, a reversal flag, the edition number and the editor identification). Only pages with more than 1800 editions and a lifetime over 6 years were further analysed.

Editor's activity

In order to study the editors' activity, the editions of the article are numbered in chronological order. The edition number is denoted by e, so that e = 1 is the first edition, when the article is created, and editions $e=2,3,4,\ldots$ are the subsequent updates. In fig. 1, we present examples of editors' activity as a function of edition number (each editor is numbered according to the chronological order of his debut in that particular article), for a low, a medium and a high controversy page, as defined later. A symbol is plotted in each graph for every edition of every editor. In the three cases shown in this figure, there is a ratio R between the number of editors and the number of editions that is approximately constant over time. We found a similar behaviour in all the articles that we checked, and this seems to be a reasonable assumption.

Fig. 1: (Colour on-line) Examples of editor's activity as a function of edition number, (a) in blue for a low, (b) in green for a medium and (c) in brown for a high controversy page. On the vertical axis each editor is numbered according to the chronological order of his debut in that article.

It is also clear from fig. 1 that the activity of most editors decays over time. We highlight this effect by summing all editions made by all editors after they start to edit. Let $e_i$ be the edition number at which editor $E_i$ edits the article for the first time. Then, $\varepsilon_i=e-e_i$ is the edition number counted from the moment $E_i$ starts to edit. We choose the article named "Jesus" (with a total of 21768 editions) and plot in fig. 2 the editing activity as a function of $\varepsilon_i$. For that purpose, we divide the total number of editions into 100 bins, each of size $\Delta \varepsilon$ equal to just over 200 editions, and plot the sum of the number of editions in each bin for all editors on the left panel in brown (for example, the first dot in this figure is the total number of times that all editors have edited the article during the first $\Delta \varepsilon$ editions since they started to edit). The brown dots follow approximately a straight line in the semi-log graph, which suggests an exponential decay in editing activity for the average editor. The number of reversal editions is shown in blue in the same graph, following a similar pattern. On the right panel of fig. 2 we plot the distribution of the number of editors as a function of the total number of editions that each one has made, for the same article. The distribution seems to follow a straight line again, but now in a log-log graph, suggesting a power-law distribution.
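The binning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `activity_since_debut`, the input format (a chronological list of editor identifiers, one per edition) and the toy history are hypothetical and are not taken from the dump or from the original analysis.

```python
from collections import Counter

def activity_since_debut(editor_sequence, bin_size):
    """Count edits per bin of eps_i = e - e_i, summed over all editors.

    editor_sequence[k] is the (hypothetical) editor id of edition e = k + 1.
    """
    first_edit = {}      # editor id -> edition number e_i of the debut
    counts = Counter()   # bin index -> number of edits falling in that bin
    for e, editor in enumerate(editor_sequence, start=1):
        e_i = first_edit.setdefault(editor, e)
        counts[(e - e_i) // bin_size] += 1
    if not counts:
        return []
    return [counts[b] for b in range(max(counts) + 1)]

# Toy history with three editors "a", "b" and "c" (illustrative only).
history = ["a", "b", "a", "a", "c", "b", "a", "c"]
print(activity_since_debut(history, bin_size=2))
```

Plotting the returned counts on a semi-log scale would be one way to look for the approximately exponential decay mentioned in the text.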

Fig. 2: (Colour on-line) For the wikipage "Jesus". Left panel: in brown, the number of editions in each bin, sized $\sim 200$ editions, as a function of $\varepsilon_i$, summed over all editors. In blue, the same for reversal editions. Right panel: distribution of the number of editions per editor.

Agent-based model

We propose an agent-based model that tries to grasp the features observed in real WP page edition. Like Yasseri et al. [27], we use the Bounded Confidence model proposed by Deffuant et al. [29] as a suitable candidate to address the dynamics of collaboration and conflicts in WP edition. We denote by $x_i$ the value of the opinion of editor $E_i$ and by $A$ the state of the article, i.e., the opinion reflected by the article, where an opinion about the subject under discussion is quantified by a continuous value between 0 and 1. Several sets of initial opinion distributions were tried, such as a uniform random distribution between 0 and 1 or a combination of Gaussian distributions with different parameters. We found the final results to be qualitatively unaffected by this choice and we have used for the initial opinion of the editors a Gaussian distribution with mean $m=0.5$ and standard deviation $\sigma=0.1$, with a cutoff for values less than zero or greater than one. The computer simulation starts with just one editor and, at each dynamical step $e$, a new editor comes in and edits the wikipage for the first time with probability $R$. This means that, at each time step, an old editor $E_i$ will interact with the article with probability $1-R$, in which case the probability for choosing $E_i$ among all the available editors will be

$$P(E_i)=\frac{N_i\,\eta_i}{\sum_j N_j\,\eta_j}, \qquad (1)$$

where $N_i$ is the number of previous editions by editor $E_i$ and $\eta_i$ is a random number between 0 and 1. The proportionality to $N_i$ is similar to the preferential attachment effect [30], "edits beget edits", in Wilkinson and Huberman's words [13]. Bryant et al. showed that the involvement of the editors with the quality of a WP article increases with the number of times they have edited it [31]. We assume this increased involvement may be described by the proportionality to $N_i$ in eq. (1) [12,27]. The parameter $\eta_i$ is intended to measure the editor's propensity to edit this particular article, either due to the extent of his knowledge on the subject or merely to some emotional connection to it (the fitness as defined by Bianconi and Barabási in [32]). Once $\eta_i$ is defined for editor $E_i$, it will maintain its value throughout the whole edition procedure. The sum is performed over all active editors.
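As a minimal sketch of this selection rule, assuming the choice probability of an active editor is proportional to the product $N_i\eta_i$ and normalised over all active editors, one could write the following in Python. The function name `choose_editor` and the list-based bookkeeping are hypothetical; the weighted draw uses the standard-library `random.choices`.

```python
import random

def choose_editor(n_edits, fitness, rng):
    """Return the index of an active editor, drawn with weight N_i * eta_i.

    n_edits[i] is N_i (previous editions of editor i) and fitness[i] is eta_i.
    Both lists are assumed to contain active editors only.
    """
    weights = [n * eta for n, eta in zip(n_edits, fitness)]
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]

rng = random.Random(1)           # fixed seed, for reproducibility only
n_edits = [3, 1]                 # editor 0 has edited three times, editor 1 once
fitness = [0.5, 0.5]             # equal fitness, so only N_i matters here
draws = [choose_editor(n_edits, fitness, rng) for _ in range(20000)]
print(draws.count(0) / len(draws))   # should be close to 3/4
```

With equal fitness values the rule reduces to plain preferential attachment on $N_i$; unequal values of $\eta_i$ tilt the choice towards editors with a stronger propensity to edit the article.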

After choosing which editor is going to interact with the article in a specific dynamical step, the edition process is decided as follows: if the difference between the editor's opinion, $x_i$, and the article's current state, $A$, is inside a tolerance threshold, $\epsilon$, $(|x_i-A|<\epsilon)$, the editor will change his mind and approach the article's point of view, which remains unchanged, $\Delta x_i = -\mu (x_i-A)$, where $\mu\,(=0.2)$ is a convergence parameter. Otherwise, the editor will maintain his opinion and change the article. In the latter case, if a previous edition is found with an opinion value difference from the current editor's opinion smaller than $\epsilon$, then with probability $P_{\textit{rev}}\,(=0.5)$, the editor will revert the article to a previous (the nearest) such edition. In case no edition reversal occurs, the article state is changed to approach the editor's opinion, $\Delta A = \mu (x_i-A)$. After each edition, each active editor will become inactive with probability $P_{\textit{inac}}\,(=0.0005)$, which reflects the editor's loss of interest in the article (an ageing effect). This parameter tries to mimic the observed progressive loss of interest of most editors in the article, which can be perceived in fig. 1. We have chosen the reversal probability value according to our findings about reversals in data, as we show, for example, in the left panel of fig. 2, for the specific case of the wikipage "Jesus". The value is able to reproduce the reversal behaviour in pages of different controversiality.
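The edition rule just described can be illustrated with a short Python sketch of a single step. It is not the authors' code: the function name `edit_step`, its signature and the list `past_states` of earlier article states are hypothetical, and "the nearest" previous edition is interpreted here as the most recent one within tolerance, which is an assumption. The inactivation step with probability $P_{\textit{inac}}$ is left out for brevity.

```python
import random

def edit_step(x_i, article, past_states, eps, mu=0.2, p_rev=0.5, rng=random):
    """Apply one edit and return (new_opinion, new_article_state, reverted)."""
    if abs(x_i - article) < eps:
        # The article is within tolerance: the editor moves towards it and
        # the article itself stays unchanged.
        return x_i - mu * (x_i - article), article, False
    # Otherwise the editor keeps the opinion and changes the article.
    for old_state in reversed(past_states):       # most recent first (assumption)
        if abs(old_state - x_i) < eps:
            if rng.random() < p_rev:
                return x_i, old_state, True       # revert to that edition
            break                                 # reversal declined
    # No reversal: the article moves towards the editor's opinion.
    return x_i, article + mu * (x_i - article), False

rng = random.Random(0)
print(edit_step(0.50, 0.45, [0.45], eps=0.1, rng=rng))   # editor adapts
print(edit_step(0.90, 0.50, [0.50], eps=0.1, rng=rng))   # article is changed
```

Keeping the step as a pure function of the current opinion, the article state and the history makes it easy to check each branch of the rule in isolation before embedding it in a full simulation loop.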

In fig. 3 we show the simulation results for 17k editions and 4k editors. Plot (a) shows the editor's activity (similar to fig. 1(c) for real data). Plot (b) shows the distribution of the number of editions per editor (similar to the right panel of fig. 2 for real data). In summary, our model has five parameters, three of which have the following fixed values for all simulations: $P_{\textit{inac}} =0.0005$, $\mu=0.2$ and $P_{\textit{rev}} =0.5$, which were obtained so as to reproduce the real data. The parameter $R$ is adjusted to describe the amount of editor participation required for each simulation and $\epsilon$ is the only parameter that controls the degree of controversy.

Fig. 3: (Colour on-line) From our simulation model: (a) editors' activity as a function of edition number $(R \approx 0.24)$ , (b) distribution of the number of editions per editor.

Controversy parameters

Several algorithms have been proposed to rank controversial articles in WP; most of them concentrate on detecting edit wars and high controversiality [23]. Yasseri et al. measured controversy by means of a sum over the minimum number of all the editions by each pair of editors, with at least one mutual reversal between them:

$$M = E \sum_{(N_i^d,N_j^r)<\max} \min(N_i^d,N_j^r), \qquad (2)$$

where $\min(N_i^d,N_j^r)$ is the minimum of the two values $N_i^d$ and $N_j^r$, which are the numbers of editions of editors $E_i$ and $E_j$ who have been involved in at least one mutual reversal with each other, and $E$ is the number of editors who, at some point, have performed a mutual reversal with any other editor. The sum excludes the term with the maximum value, in order to remove a possible personal conflict.
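To make the definition concrete, the following Python sketch evaluates M from a list of mutually reverting pairs and the total edit counts of the editors. It follows the verbal description above (sum of the pairwise minima, largest term dropped, multiplied by $E$); the function name `controversy_m`, the input format and the toy numbers are hypothetical.

```python
def controversy_m(mutual_pairs, n_edits):
    """Sketch of M: E times the sum of min(N_i, N_j), largest term excluded.

    mutual_pairs: iterable of (i, j) with i != j, editors who reverted each other.
    n_edits: dict mapping editor id -> total number of editions by that editor.
    """
    pairs = {frozenset(p) for p in mutual_pairs}     # ignore order and repeats
    if not pairs:
        return 0
    editors = set().union(*pairs)                    # E = number of such editors
    terms = sorted(min(n_edits[i], n_edits[j]) for i, j in map(tuple, pairs))
    return len(editors) * sum(terms[:-1])            # drop the maximum term

n_edits = {"a": 10, "b": 6, "c": 3, "d": 8}          # toy edit counts
pairs = [("a", "b"), ("b", "c"), ("a", "d"), ("b", "a")]
print(controversy_m(pairs, n_edits))                 # 4 * (3 + 6) = 36
```

Because the terms depend on the total edit counts of the editors, the value keeps growing whenever an editor in a mutually reverting pair edits again, even if no new reversal takes place.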

Yasseri et al. [23] showed that this parameter can effectively select the high controversy WP articles. However, there is a problem with this definition. Assuming there is a collaborative period after a conflict, this parameter keeps growing, as long as the editors involved in the conflict keep editing, thus failing to recognise the end of the conflict.

In order to avert this problem, we reworked M and defined our conflict parameter, C, as the sum of all the reversals between all pairs of editors with at least a mutual reversal between them $\langle i,j \rangle$ , multiplied by E (keeping the general recognition that "the larger the armies, the larger the war"),

$$C = E \sum_{\langle i,j \rangle} N_{i,j}^R, \qquad (3)$$

where $N_{i,j}^R$ is the number of reversals between editors $E_i$ and $E_j$ (both the reversals of $E_i$ over an edition by $E_j$ and vice versa). We chose not to exclude the maximum in this sum, as there is no way to identify a personal conflict (except by actually looking at the several editions of the WP page in detail).
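A minimal Python sketch of C, assuming the reversal record is available as a chronological list of (reverter, reverted) pairs, could look as follows. The function name `conflict_c` and the input format are hypothetical; the dump itself only stores a reversal flag per edition, so building such a list is a separate step not shown here.

```python
from collections import Counter

def conflict_c(reverts):
    """Sketch of C: E times the number of reversals within mutually reverting pairs.

    reverts: iterable of (reverter, reverted) editor ids, one entry per reversal.
    """
    directed = Counter(reverts)        # (i, j) -> times i reverted j
    editors = set()
    total = 0
    for (i, j), n_ij in directed.items():
        if i != j and (j, i) in directed:   # the pair has reverted each other
            total += n_ij                   # (j, i) is added on its own visit
            editors.update((i, j))
    return len(editors) * total

reverts = [("a", "b"), ("b", "a"), ("a", "b"), ("c", "a")]
print(conflict_c(reverts))                  # pair a-b only: 2 * 3 = 6
# C as a function of the number of recorded reversals:
print([conflict_c(reverts[:k]) for k in range(len(reverts) + 1)])
```

In this sketch the one-sided reversal of "a" by "c" leaves the value unchanged, and edits that are not reversals never enter the sum, which is the property that lets C stay constant during collaborative periods.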

Figure 4 displays the conflict parameter C as a function of the edition number for the same simulation and for different values of the tolerance parameter and number of editions. In plot (a) we show the evolution of C for $\epsilon=0.18$ , in plot (b) for $\epsilon=0.10$ and in plot (c) for $\epsilon= 0.05$ , which correspond, respectively, to a low, a medium and a high level of conflict. Comparison with plots obtained with real data (shown in fig. 5), suggests that the tolerance parameter provides an appropriate description of the level of conflict in the edition process.

Fig. 4: (Colour on-line) From our simulation model: controversy parameter $(C)$ as a function of edition number for (a) $ \epsilon=0.18$ (low controversy, blue), (b) $\epsilon=0.10$ (medium controversy, green) and (c) $\epsilon=0.05$ (high controversy, brown). Insets of the medium and high controversy plots show zooms with behaviours similar to low and medium controversy, respectively (a scale effect).

Fig. 5: (Colour on-line) Controversy parameter C as a function of edition number for WP pages of (a) low ("Hiccup"), (b) medium ("Timur") and (c) high controversy ("Jesus"). Again, insets of the medium and high controversy plots show zooms with behaviours similar to low and medium controversy, respectively.

In identifying the highly controversial pages, parameters C and M are quite similar. Out of the top 100 most controversial pages according to each of those two parameters, 80% are common to both selections, and the measured correlation between the two parameters is $r=0.97$ [33]. Out of the 4.64 million pages of the English WP, only about 216k have C > 0 and, out of these, about 58% (just over 124k) have $C\leq 10$. For medium and high controversy pages, $\sim 5.8\text{k}\ (2.7{\%})$ have $C > 1\text{k}$ and $\sim 650\ (0.3{\%})$ have $C > 10\text{k}$. Despite the above mentioned similarities, the left panel of fig. 6, showing normalised values of C and M for the same edition period, illustrates that the two parameters follow different evolutions. There are some peaceful periods (where C is constant) that are not recognised as such by M (which increases most of the time). M is suitable for detecting the most controversial articles, but it fails to capture some collaboration patterns.

Fig. 6: (Colour on-line) Left panel: an example of normalised values of C (brown) and M (blue) for the same editing interval. Note how M rises when C remains constant in several intervals, like at the beginning of the plot. Right panel: an example of M evolution with (blue) and without (brown) robots [34], for the same period. Both plots were obtained with the wikipage "Anarchism".

It is important to recognise that the edition of a WP page is shared between humans and robots, i.e., programs conceived to perform various routine tasks such as spelling correction and vandalism detection. In order to measure conflict activity among humans and uncover the intrinsic dynamics of controversy, we must remove the effect of the robots in the controversy parameters [34]. In the right panel of fig. 6 we compare the evolution of parameter M with and without the robots. It can be seen that the robots do make a difference in conflict detection, as keeping them artificially increases the conflict parameter M. A similar effect is not observed in parameter C, where the difference between the two graphs (with and without robots) is not perceivable with the naked eye (and for this reason we do not show it here).

There is a significant correlation between high levels of conflict and the number of editions [27]. Therefore, when comparing conflictuality periods, it may be necessary to normalise the measuring parameter with respect to the number of editions.

Inter-peace periods

A lot of effort has been devoted to describing and understanding temporal patterns of human activities [35,36]. Several heavy-tailed distributions of time intervals between events on different communication media have been found. WP has shown specific time attributes depending on geographical and cultural constraints [27,37,38]. Yasseri et al. reported the exponent of the inter-edit time distribution [37]. In order to analyse edition patterns of edit wars, we show in fig. 7 the fat-tailed distribution of inter-peace periods, where we define peace as a period, lasting no less than n editions, during which the controversy parameter remains constant. In our calculations, we have used n = 3 and we have gathered all the pages with more than 1500 editions and a lifetime longer than 8 years. We believe that this makes more sense than studying the length of the peace periods, as these will, in many cases, depend on exogenous factors either cyclic (like the anniversary of some event related to the article) or non-cyclic (special events) [39,40].
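The definition of peace given above can be turned into a short Python sketch that extracts inter-peace periods from a series of controversy values, one value per edition. Several details are assumptions rather than the procedure used for fig. 7: a peaceful interval is taken to be a run of at least n equal consecutive values, the inter-peace period is counted as the number of editions between two consecutive peaceful intervals, and zero-length gaps as well as stretches before the first or after the last peaceful interval are ignored. The function name `inter_peace_periods` is hypothetical.

```python
from itertools import groupby

def inter_peace_periods(series, n=3):
    """Lengths (in editions) of the stretches between consecutive peaceful runs.

    series: controversy parameter value after each edition, in chronological order.
    A run of at least n equal consecutive values counts as peace (assumption).
    """
    gaps = []
    gap = 0
    seen_peace = False
    for _, run in groupby(series):
        length = sum(1 for _ in run)
        if length >= n:                  # a peaceful interval
            if seen_peace and gap > 0:
                gaps.append(gap)
            seen_peace, gap = True, 0
        else:                            # the parameter is still changing
            gap += length
    return gaps

c_values = [0, 0, 0, 2, 5, 5, 5, 5, 6, 7, 9, 9, 9]
print(inter_peace_periods(c_values, n=3))    # [1, 2]
```

Collecting these lengths over many pages would give the sample whose distribution is examined in fig. 7.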

Fig. 7: (Colour on-line) Distribution of the number of editions between peaceful intervals for M and C on the left and on the right panel, respectively. The red lines are power-law fits to the data points.

The red line in fig. 7 represents our power-law fit to the data. With parameter C, we get for the exponent the value $\alpha=3.90\pm 0.06$, with a fit p-value of 0.012, while for M we get a marginally higher value, $\alpha=3.98\pm 0.04$, because M does not take into account some of the peaceful times between conflictual editors. The p-value for the latter fit is practically zero $(\sim 2\times 10^{-35})$. The details of these calculations are explained in ref. [41] (see the footnote). The fit p-value was obtained from the $\chi^2$ fit probability.

The robots were excluded from all these calculations. We checked that their effect on the values of α is very small.

Conclusions

The edit wars on WP are conflicts with the unusual circumstance of being a symmetric scenario, in which the antagonist entities are individuals with no connection to one another. In this letter, we aimed at capturing the basic ingredients that lead the WP editing process to a collaborative/consensual edition or to remain conflictive. With a simple agent-based model that relies on few parameters, we reproduced several interesting real behaviours. We proposed a conflict measurement parameter based on mutual reverts that is culture independent and has the advantage, over previous parameters, of being able to differentiate between collaboration and conflict among editors. We showed that the level of conflict in a WP page can be related to the tolerance in the system, which may be associated with issues that people are not prepared to negotiate because of the strong convictions involved. We found a long-tailed power-law distribution of inter-peace periods, measured with our conflict parameter, C. We also showed how the power-law exponent differs from the one obtained with a previous conflict parameter, M. Finally, we discussed the effect of the robots that edit the WP on the controversy parameters and on the edit patterns.

This study has not included the real edition time. The parameter e was defined as the edition number and all variables were defined in terms of e and not of the real time. A natural extension of this work will be to study the dynamical features in terms of real time.

Acknowledgments

JC thanks Filipe Veloso and Cláudio Silva for help on technical questions. YG thanks Ernesto Medina and Silvia Chiacchiera for useful discussions. YG acknowledges financial support from Fundos FEDER through Programa Operacional Factores de Competitividade-COMPETE and by Fundação para a Ciência e a Tecnologia, through Project FCOMP-01-0124-FEDER-015708.

Footnotes

  • The Kolmogorov-Smirnov test was used for the calculation of $x_{\min}$ (lower fit range), according to the procedure described in ref. [41].
