A. E. Allahverdyan et al 2010 EPL 90 18002 doi:10.1209/0295-5075/90/18002
A. E. Allahverdyan, G. Ver Steeg and A. Galstyan
We study the problem of graph partitioning, or clustering, in sparse networks with prior information about the clusters. Specifically, we assume that for a fraction ρ of the nodes the true cluster assignments are known in advance. This can be understood as a semi-supervised version of clustering, in contrast to unsupervised clustering, where the only available information is the graph structure. In the unsupervised case, it is known that there is a threshold of the inter-cluster connectivity beyond which clusters cannot be detected. Here we study the impact of the prior information on the detection threshold, and show that even minute (but generic) values of ρ>0 shift the threshold downwards to its lowest possible value. For weighted graphs we show that a small amount of semi-supervision can be used for a non-trivial definition of communities.
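The semi-supervised setup described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model or method: it only generates a sparse two-cluster random graph and reveals the true cluster of a fraction ρ of the nodes. The function name `planted_partition` and the parameters `c_in` and `c_out` (average numbers of intra-cluster and inter-cluster neighbours per node) are assumptions introduced here for illustration.

```python
import random


def planted_partition(n, c_in, c_out, rho, seed=0):
    """Sparse two-cluster graph plus a revealed fraction rho of labels.

    n     : number of nodes, split into two equal clusters
    c_in  : assumed average number of intra-cluster neighbours per node
    c_out : assumed average number of inter-cluster neighbours per node
    rho   : fraction of nodes whose true cluster is known in advance
    """
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n)]      # true cluster of each node
    half = n / 2
    p_in, p_out = c_in / half, c_out / half  # edge probabilities (sparse regime)

    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                edges.append((i, j))

    # Prior information: the true label of a random fraction rho of the nodes.
    known = rng.sample(range(n), round(rho * n))
    prior = {i: labels[i] for i in known}
    return labels, edges, prior


if __name__ == "__main__":
    labels, edges, prior = planted_partition(n=200, c_in=3.0, c_out=1.0, rho=0.05)
    print(len(edges), "edges;", len(prior), "nodes with known cluster")
```

A clustering routine would receive `edges` and `prior` as input, with `labels` held back to score the result. Setting `rho=0` recovers the unsupervised case, in which only the graph structure is available.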
89.75.Hc Networks and genealogical trees
Issue 1 (April 2010)
Received 16 November 2009, accepted for publication 23 March 2010
Published 27 April 2010