This site uses cookies. By continuing to use this site you agree to our use of cookies. To find out more, see our Privacy and Cookies policy.
Paper The following article is Open access

Community detection by laser network dynamics

, , , and

Published 10 January 2018 © 2018 The Author(s). Published by IOP Publishing Ltd
, , Citation Hriomasa Sakaguchi et al 2018 J. Phys. Commun. 2 015005 DOI 10.1088/2399-6528/aa9b6b

2399-6528/2/1/015005

Abstract

We introduce methods for detecting communities in networks using the dynamics of a laser network. These methods utilize phase synchronization of the coupled laser network. In these methods, lasers within the same community share the identical phases with small phase errors by making use of in-phase and out-of-phase couplings. Numerical simulation shows that our methods can identify the community structure of the networks with artificially generated communities.

Export citation and abstract BibTeX RIS

Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Community structure is one of the fundamental properties of complex networks in which nodes are densely connected only within the same clusters [1]. Many types of networks, including functional brain networks to social networks, have been used to study the community structure [24]. The detection of the community structure is useful for analyzing a network and understanding the functions of a network [5, 6]. Although community detection is one of the most well-known topics in network science, and many algorithms have been proposed, there is no universal definition for community detection that applies to either all types of networks or to a variety of purposes.

One of the approaches to community detection is the use of a dynamical process on a network such as synchronization of coupled oscillators [79], dynamics of a spin system [1014], and diffusion processes [1517]. In the case of coupled oscillators, the oscillators especially in densely connected clusters can be quickly synchronized before a fully synchronized state, which can be utilized to identify communities [7]. In another case, the Potts spin model is often used for community detection by minimizing the Hamiltonian that corresponds to the cost function of the problem [10]. A method that utilizes the Boltzmann distribution at a finite temperature to find a consensus among many partitions has also been proposed [18].

In recent years, several attempts have been made to construct various physical systems in order to solve computationally difficult problems using their own dynamical processes. There are several examples of developing a physical system to simulate the Ising spin model in order to solve combinatorial optimization problems, such as those found in superconductor-based quantum annealing machines [19], CMOS-based classical annealing machines [20], and time-division-multiplexing optical parametric oscillator networks [2123]. A laser network constructed using optical fibers where laser pulses interact with each other via optical delay lines has also been proposed [24, 25]. This laser network is intended for emulating XY spin models as a Boltzmann sampler with a tunable effective temperature. Since the laser network is a coupled oscillator system whose dynamics is similar to the Kuramoto model, it can also be used for an community detection for arbitrary network topology.

In this paper, we propose community detection methods using the dynamics of a laser network. In some existing methods [79], the transient dynamics of a coupled oscillator network toward synchronization or desynchronization have been used for a community detection method. Generally, these methods do not give the timing to be used for identifying communities. Our methods, instead, use a steady state for community detection by introducing out-of-phase coupling that corresponds to the negative coupling constant. The out-of-phase coupling prevents the laser network from reaching a fully synchronized state and arrives at a partially synchronized state with community information as a final state. We tested our methods using numerical simulation of the dynamics of lasers on an artificial network called the Lancichinetti–Fortunato–Radicchi (LFR) benchmark graph [26].

2. Laser model

We consider the following complex Langevin equation, which describes the dynamics of a laser network [24, 27, 28]:

Equation (1)

where Aj is the complex amplitude of the jth laser field, ${\gamma }_{c}$ is the cavity photon decay rate, p is the normalized pump rate, n0 is the saturation photon number, and ${\xi }_{{jk}}={\xi }_{{kj}}$ is the coupling constant between the jth and kth lasers. The first term of the right-hand side represents the net gain. The third term represents the Langevin noise source, where dW is a complex-valued Wiener process that possesses the properties of ${{\rm{d}}{W}}_{j}{{\rm{d}}{W}}_{j}^{* }={\rm{d}}{t}$ and ${{\rm{d}}{W}}_{j}{{\rm{d}}{W}}_{j}=0$, and D is the variance of the noise. Since we are interested in the phase dynamics of each laser, we consider the form of ${A}_{j}=\sqrt{{n}_{j}}{{\rm{e}}}^{{\rm{i}}{\theta }_{j}}$. Then, equation (1) can be written in the form of phase ${\theta }_{j}$ and the photon number nj as

Equation (2)

Equation (3)

where ${{\rm{d}}{W}}_{\theta j}$ and dWnj are independent real-valued Wiener processes. Equation (2) shows that the phase equation is equivalent to that of Kuramoto oscillators if all lasers have the same amplitudes.

3. Community detection

Community detection is the problem of finding densely connected groups of nodes; however, nodes between different groups are sparsely connected. To detect such communities using our laser network, we utilize the phase correlations of the pair of lasers, which is represented as

Equation (4)

where the bracket is an ensemble average over several runs: $\langle x\rangle =1/T{\sum }_{n=1}^{T}{x}^{(n)}$, where T is the number of runs and ${x}^{(n)}$ is the sample of x for the nth run. We calculate correlations only for the pairs of lasers that are connected in the original graph $G(V,E)$, i.e., $(i,j)\in E$, where V is the set of the nodes and E is the set of the edges of the graph. Then, correlations ${\rho }_{{ij}}$ are converted to binary matrix $\{{D}_{{ij}}\}$ by the threshold parameter T as

Equation (5)

We identify the communities as connected components of $\{{D}_{{ij}}\}$. If we construct the coupling term in equation (1) directly from the adjacency matrix of $G(V,E)$, and the coupling strength is sufficiently larger than the noise term, all lasers will be synchronized at the same phase. Since this situation is not suitable for our purpose, we propose two different methods for constructing a coupling matrix from a given network as follows.

3.1. Homogenous out-of-phase coupling

In the first approach, we consider homogenous out-of-phase couplings for community detection. For a giving graph $G(V,E)$, we construct a laser network such that

Equation (6)

where ξ and γ are the positive constants that determine the coupling strengths and the ratio between the in-phase and out-of-phase coupling strength, respectively. As shown in figure 1, the nodes are coupled in-phase if they are connected in the $G(V,E)$ and out-of-phase, otherwise. Since the out-of-phase coupling separates the phases of lasers that are not connected in the $G(V,E)$ we expect that only densely connected groups of nodes to oscillate in phases with close phase values.

Figure 1.

Figure 1. Schematic of the construction of a laser network from the original graph $G(V,E)$.

Standard image High-resolution image

In this approach, the important parameter is γ because it tunes the global out-of-phase coupling effect, which in turn strongly depends on the edge density $D=2{m}/\{N(N-1)\}$ of the $G(V,E)$, where m and N are a number of edges and a number of nodes in the $G(V,E)$, respectively. Therefore, the value of γ needs to be determined, depending on the D of the $G(V,E)$. To this end, we investigate the order parameter r for a random network with the same edge density as the $G(V,E)$ and the coupling protocol as described above. The order parameter ${{r}{\rm{e}}}^{{\rm{i}}\phi }$ of phases is defined as

Equation (7)

where r shows how much oscillators are synchronized globally, and ψ is the mean value for all phases. Figure 2(a) shows the numerical simulation result of the γ dependence of r with various values of edge density D in the Barabási–Albert (BA) model with number of nodes N = 1000 [29]. The BA model is a random network model with power-law degree distribution. The order parameter r becomes 1 in the region where in-phase couplings are dominant, and r decreases as γ increases. At a fixed γ, order parameter r simply increases as edge density D increases, since the number of in-phase couplings increases. Operationally, we choose the optimum value of γ at which order parameter r becomes 0.5 in a random network with the same edge density as the $G(V,E)$. Since we are interested in the parameter region where lasers can easily detect the community structure, we expect the phase correlation of lasers to become sensitive to the local change of edge density from a random connection at this critical parameter. We plotted this γ dependence on edge density D in figure 2(b). The relationship between γ and D was approximated by a linear fit using parameters a and b:

Equation (8)

where we assumed the system size N dependence as a finite size correction. The best fit was achieved using a = 1.0 and $b=-4.8$. Using this relation, we determine the value of γ depending on D of the $G(V,E)$.

Figure 2.

Figure 2. Order parameter of phases on the BA model with number of nodes N = 1000. (a) Out-of-phase coupling ratio γ dependence of order parameter r with various values of edge density D. (b) Out-of-phase coupling ratio γ that satisfies r = 0.5 depending on edge density D. The dashed line represents linear fit $\gamma ={aD}+b/N$, where a = 1.0 and $b=-4.8$.

Standard image High-resolution image

3.2. Modularity matrix

In the second approach, we construct a laser network with a coupling matrix on the basis of modularity [30] which is often used as a cost function for community detection [3134]. The modularity Q with resolution parameter ${\gamma }_{\mathrm{mod}}$ [11] is represented as

Equation (9)

Equation (10)

where $\{{A}_{{ij}}\}$ is an adjacency matrix of the $G(V,E)$, ${\sigma }_{i}$ is the community to which node i belongs, and ki is the degree of node i. We use modularity matrix Bij for the coupling matrix of the laser network as ${\xi }_{{ij}}=\xi {B}_{{ij}}$. Here, the out-of-phase coupling strength is determined by the expected number of edges between the nodes in the configuration model of the $G(V,E)$. The configuration model is a random connectivity model of the $G(V,E)$ where edges are randomly rewired, and the degree of each node is preserved. The modularity matrix is based on the assumption that two connected nodes are likely to be in the same community if the probability that the nodes are accidentally connected by chance in the configuration model is low. The global strength of the out-of-phase coupling can be tuned by resolution parameter ${\gamma }_{\mathrm{mod}}$, which determines the size of the community that is preferred by modularity Q. The issue of determining ${\gamma }_{\mathrm{mod}}$ remains controversial, and tuning ${\gamma }_{\mathrm{mod}}$ without any bias for the particular size of the community is difficult [35]. In this paper, we simply choose ${\gamma }_{\mathrm{mod}}=1$, which is the same parameter value as that of the original definition of modularity [30].

3.3. Network model and evaluation

For our benchmark, we chose the LFR benchmark graph, which has a built-in community structure and is often used for benchmarking a community detection algorithm [26]. The LFR benchmark graph has power-law community size and degree distributions with exponents α and β, respectively. The average values of the distributions can be determined by the minimum and maximum values of degree k and community size s as ${k}_{\min }$, ${k}_{\max }$, ${s}_{\min }$, and ${s}_{\max }$, respectively. Another important parameter of the LFR benchmark graph is the mixing parameter μ, which tunes the fraction of edges of each node that is connected to the nodes in other communities. We chose the same parameter of the LFR model as [36] for the undirected graph.

We evaluated the performance of our methods on the basis of normalized mutual information (NMI) [37]; the NMI measures the similarity between two partitions of the same graph on the basis of Shannon's mutual information. NMI is often used for evaluating community detection algorithms by comparing the partition embedded in the benchmark graph with the partition obtained using algorithms [36]. However, NMI is known to be biased for a large number of divisions because of the finite size effect [38]. In particular, the partition in which the all nodes belong to the different groups represents a high NMI value compared to the case where all nodes belong to the same group, although both of those partitions are trivial cases for community detection. Relative NMI (rNMI) has been proposed to consider this finite size effect [38]. The rNMI is defined as

Equation (11)

where A is the partition embedded in the graph, B is the partition obtained by the algorithm, C is a random partition with the same group size distribution as partition B, and $\langle \mathrm{NMI}(A,C)\rangle $ is the expected value of $\mathrm{NMI}(A,C)$ over different realizations of random partition C. The ratio of relative normalized mutual information (rrNMI), which is normalized to 1 for the same partitions, is defined as [39]

Equation (12)

We use rrNMI as a measure of community detection accuracy for our methods and calculate the $\langle \mathrm{NMI}(A,C)\rangle $ as an ensemble average of 100 random realizations of C.

4. Numerical simulation result

The validity of our methods was confirmed via numerical simulation. The physical parameters in the equation (1) were chosen as p = 2.0, $D/\sqrt{{\gamma }_{c}}=0.5$ and $\xi /{\gamma }_{c}=0.1/\langle k\rangle $, where $\langle k\rangle $ is the average degree per node of the $G(V,E)$. The histogram of the correlation $\{{\rho }_{{ij}}\}$ for the LFR benchmark graph with N = 1000 nodes was obtained by calculating the average of 100 trials for each ${\rho }_{{ij}}$ shown in figure 3. The exponents for the degree and community size distributions of the graphs are $\alpha =-2$ and $\beta =-1$, the average degree is $\langle k\rangle =10$, the maximum degree is ${k}_{\max }=50$, the minimum community size is ${s}_{\min }=20$, and the maximum community size is ${s}_{\max }=100$. When the mixing parameter is $\mu \lt 0.5$, the distributions of correlations of nodes in the same community and the different community are separated from each other, and the correct communities can be found by setting a threshold value between the two distributions. In contrast, two distributions overlap when $\mu \gt 0.5$, and we cannot distinguish whether the two nodes belong to the same community or not.

Figure 3.

Figure 3. Histograms of phase correlations ${\rho }_{{ij}}$ on the LFR benchmark graph. Correlations between nodes in the same community (intra community) and the different community (inter community) are plotted as solid blue and dashed green lines, respectively.

Standard image High-resolution image

We then plotted the rrNMI as a function of mixing parameter μ for different community sizes. S and B in figure 4 represent graphs with a small and big community size, where ${s}_{\min }=10$ and ${s}_{\max }=50$ for a small community size S, and ${s}_{\min }=20$ and ${s}_{\max }=100$ for a big community size B. We compared the rrNMI evaluation of our two methods with two well-known algorithms, the Louvain method [32] and the Infomap [40]. The Louvain method is based on a greedy maximization of the modularity and is performed by the iteration of a local maximization of the modularity and the aggregation of nodes belonging to the same community. The method is well known as the one of the fastest algorithm for community detection. The Infomap is based on the optimization of a map equation, which is the minimum description length of a random walk on a graph. Both algorithms performed well in the literature [36] in which various algorithms were compared by NMI using the LFR benchmark graph. In the study, the Infomap was the best algorithm in terms of the NMI, and the Potts spin model approach [13] performed similarly to the Infomap. Since the Louvain method is the fastest algorithm and the Infomap is the best in terms of the NMI, we chose these two algorithms to compare with our methods. The fraction of edges in the same community becomes small for large μ. As a result, rrNMI tends to decrease for all algorithms as the value of μ becomes large. Since the region where rrNMI equals to 1 means that the algorithm completely find embedded communities, the performance of the algorithm can be evaluated at the point where rrNMI begins decreasing from 1. In this sense, the performance of our two methods shows comparable performance to the previous methods. Although Infomap performs better than our method, our method is better than the Louvain method for the large-size and big community graphs.

Figure 4.

Figure 4. Performance of algorithms in terms of rrNMI on the LFR benchmark graph. S and B are small (10–50 nodes) and big community (20–100 nodes) size distributions respectively. The Louvain method and Infomap are also evaluated for comparison. The solid black circles represent the result for the graph with S and N = 1000, the dashed red squares for B and N = 1000, the solid green triangles for S and N = 5000, the dashed rhombus blue for B and N = 5000, the solid cyan pentagon for S and N = 10 000, the dashed magenta hexagon for B and N = 10 000, the solid yellow cross for S and N = 50 000 and the dashed brown star for B and N = 50 000.

Standard image High-resolution image

Finally, we investigated the relaxation time of laser networks to the final state. Since only the phase information was used at the final state for detecting communities, it is crucial for the computational time of our method. We estimate the relaxation time using the link order parameter ${r}_{\mathrm{link}}$ [41], which is described as

Equation (13)

This order parameter represents the degree of synchronization in phases that are connected in the $G(V,E)$. In figure 5, we plotted the relaxation times in terms of the convergence time of ${r}_{\mathrm{link}}$ for the LFR benchmark graph. In every case, the relaxation times increase as mixing parameter μ increases. This means that a strong community structure in the $G(V,E)$ makes a system converge quickly. The relaxation times in the modularity-based method are less than those in the homogenous method. A possible explanation of this tendency is as follows. In the modularity-based method, the out-of-phase coupling strength between the pairs of low-degree nodes, which are a majority on scale-free graphs, is weaker than that in the homogenous method. Therefore, the frustration effect in the modularity-based method is weaker than that in the homogenous method; this phenomenon makes the systems converge quickly.

Figure 5.

Figure 5. Relaxation time of laser networks to a steady state depending on size of the network N and mixing parameter μ of the LFR benchmark graph. Each line shows N dependence of the relaxation time at the specific μ. Relaxation time is estimated by the average of the convergence time over 100 trials. Time is normalized by the lifetime of photon in a cavity $\tau =1/{\gamma }_{c}$ as $t/\tau ={\gamma }_{c}t$.

Standard image High-resolution image

The relaxation time shown in figure 5 is normalized by the lifetime of a photon in a cavity ${\tau }_{c}=1/{\gamma }_{c}$. The photon decay rate of the cavity ${\gamma }_{c}$ depends on the Q value of a cavity, and the estimated value of ${\gamma }_{c}$ of the experimental setup in the literature [24] is $50\,\mathrm{MHz}$. Therefore, the order of the relaxation time of the system is estimated to be $10\mbox{--}100\,\mu {\rm{s}}$. It should be noted that the relaxation times do not depend on the system size N under the same value of μ with the same degree and community size distribution, except for the graphs with weak community structure ($\mu \gt 0.7$) in the homogenous method.

5. Conclusion and outlook

We proposed two methods of community detection by using a laser network with an out-of-phase coupling scheme and numerically confirmed that the performance of our methods is comparable to the existing algorithms in terms of rrNMI evaluation. We also proposed a method to tune the parameter γ of the homogenous out-of-phase coupling depending on the edge density of an the original graph. In the case of modularity-based out-of-phase coupling, the relaxation time of the laser network is less than that of the homogenous out-of-phase coupling although both schemes perform similarly for the graphs with strong community structures.

With regard to the perspective of computation within a real physical system, a scheme for constructing an arbitrary network topology is required to apply these methods. So far, 1D-ring network topology with all-optical connections has been implemented for a laser network [24, 25]. A promising way to implement an arbitrary network topology with FPGA in an optical parametric oscillator system has recently been proposed [22, 23]; this method is also applicable to a laser network. The evaluation of relaxation time suggests that the computation time for community detection using such an experimental system is estimated to be $1\mbox{--}10\,\mathrm{ms}$ for 100 runs of an ensemble average calculation. In the Louvain method, known as the fastest algorithm, the computational time for a graph with number of nodes N = 70 000 and number of edges M = 351 000 is reported to be 1 s, which linearly scales up by M for sparse graphs [32]. Our methods also require post-processes for the calculation of phase correlations of node pairs, which are connected in the original graph, and the identification of connected components of a binarized correlation matrix, whose computation time also scales up by M. With regard to the time for running dynamics, however, relaxation time does not depend on system size for the LFR benchmark graphs with the same mixing parameter and could be further reduced using methods such as the physical implementation of a replica exchange Monte Carlo algorithm [42], which could also be implemented in the laser network by the control of effective temperature [25].

Acknowledgments

This work was funded by the Impulsing Paradigm Change through Disruptive Technologies (ImPACT) Program of the Council of Science, Technology and Innovation. We thank for fruitful discussion with Taro Takaguchi, Kenji Leibnitz and Shinji Nishimoto.

Please wait… references are loading.
10.1088/2399-6528/aa9b6b