Detectability of the spectral method for sparse graph partitioning

We show that modularity maximization with the resolution parameter offers a unifying framework of graph partitioning. In this framework, we demonstrate that the spectral method exhibits universal detectability, irrespective of the value of the resolution parameter, as long as the graph is partitioned. Furthermore, we show that when the resolution parameter is sufficiently small, a first-order phase transition occurs, resulting in the graph being unpartitioned.

Introduction. -Graph partitioning is often analyzed as a fundamental problem for understanding the performance of community detection in complex networks. Graph partitioning was originally an optimization problem: for a given number of modules, the problem is to find the partition with the sparsest cut under the constraint that the sizes of the modules are exactly or nearly equal. When graph partitioning is applied to a social network with a modular structure, for instance, the nodes identified as members of a module are expected to belong to the same social group.
To clarify the perspective of graph partitioning as community detection, let us consider the partitioning of a uniform random graph (i.e., a random graph without planted block structures) as an example. While it is still meaningful to find the optimal partition for each instance when the problem is regarded purely as an optimization problem, the result is hardly significant for community detection; a significant result should be statistically different from the results for uniform random graphs. Therefore, it is important to ascertain the performance of each algorithm. Interestingly, even when we generate random graphs with a planted modular structure, which have a higher edge density within a module than between modules, the average performance of a partition may be indistinguishable from that of uniform random graphs as long as the modular structure is not sufficiently clear. This indistinguishable region is called the undetectable phase, while the region where the partition is positively correlated with the planted modules is called the detectable phase. The boundary is called the detectability threshold [1-8]. Since many real networks are sparse, we focus on the case of sparse graphs, that is, graphs whose average degree does not increase as the total number of nodes increases.
Because graph partitioning is usually formulated as a discrete optimization problem, which is computationally expensive, the spectral method [9,10], which solves a continuous relaxation of the original problem, is often used. Whereas the performance of a partition generally depends on the choice of the objective function to be optimized, it was shown in [11] that the spectral method for three popular objective functions, namely modularity, normalized cut, and the log-likelihood of the degree-corrected stochastic block model [12], reduces to an eigenvalue problem of the normalized Laplacian when an elliptical normalization is considered as a constraint.
In this paper, we first show that the above three objective functions can be formulated in the framework of modularity maximization at the level of discrete optimization. While the modularity form of the log-likelihood of the degree-corrected stochastic block model was already derived in [11], we show that the normalized cut can also be formulated in the same framework. We then conduct a detectability analysis of the spectral method for the modularity with the spherical normalization constraint, i.e., the method with the modularity matrix. The detectability analysis of the spectral method with the normalized Laplacian for sparse graphs was performed in [13].
An important difference between graph partitioning and community detection lies in whether the number of modules is given or to be estimated. While the graph is always partitioned into a given number of modules when the normalized Laplacian is used, the method with the modularity matrix has its own criterion that determines whether the graph should or should not be partitioned. We refer to the parameter region where the graph is not partitioned as the unpartitioned phase. To our knowledge, previous detectability analyses were not concerned with this unpartitioned phase. Focusing on the bisection problem, our analysis here reveals the relation between the detectable phase, undetectable phase, and unpartitioned phase.
Unifying framework of graph partitioning. -We consider the bipartitioning of a graph $G(V, E)$ with a node set $V$ and an edge set $E$. We denote the number of nodes as $N$ ($= |V|$) and the total degree, or the graph's volume, as $K$ ($= 2|E|$). The two subsets of nodes obtained by the partition and their total degrees are denoted by $S_r$ and $K_r$ ($r = 1, 2$), respectively. We indicate the set of edges that connect nodes in $S_1$ and $S_2$ as $E(S_1, S_2)$.
The modularity with the resolution parameter, the objective function to be maximized, is

$$Q_\theta(S_1, S_2) = \frac{1}{K} \sum_{r=1,2} \sum_{i,j \in S_r} \left( A_{ij} - \theta \frac{c_i c_j}{K} \right), \tag{1}$$

where $A$ is the adjacency matrix and $c_i$ represents the degree of node $i$. The modularity function distinguishes the connectivity of the actual graph, $A_{ij}$, and the corresponding value of its null model, $c_i c_j / K$, in each module. The resolution parameter θ > 0 controls the balance between them. Although the choice of the null model is in general arbitrary, we follow most of the literature and employ a random graph whose expected degree sequence is equal to that of the actual graph. Because we focus upon bisection in the present study, the modularity function $Q_\theta$ can be expressed using a spin representation $s_i = \pm 1$ ($i = 1, \dots, N$), in which nodes $i$ and $j$ belong to the same module exactly when $(1 + s_i s_j)/2 = 1$. Ignoring the constants irrelevant to the partition, we have

$$Q_\theta(\mathbf{s}) = \frac{1}{2K} \mathbf{s}^\top B \mathbf{s}, \qquad B = A - \theta \frac{\mathbf{c} \mathbf{c}^\top}{K}, \tag{2}$$

where the vector $\mathbf{c} = (c_1, \dots, c_N)^\top$ has the degree of each node as its components and $\top$ indicates the transpose. The matrix $B$ is called the modularity matrix, and θ is usually set to unity when bisection is considered. As we show in what follows, this framework contains the normalized cut minimization as a special case. The method of maximum log-likelihood is also a special case of this framework that has a particular value of θ [11]. A similar argument was presented in [14].
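The relation between the module-wise sum and the spin form can be checked numerically. The following is a minimal sketch, assuming the standard forms $Q_\theta = K^{-1} \sum_r \sum_{i,j \in S_r} (A_{ij} - \theta c_i c_j / K)$ and $B = A - \theta \mathbf{c}\mathbf{c}^\top / K$; the function names and the toy graph (two triangles joined by one edge) are hypothetical and chosen only for illustration.

```python
import numpy as np

def modularity_matrix(A, theta=1.0):
    """B = A - theta * c c^T / K for a symmetric 0/1 adjacency matrix A."""
    A = np.asarray(A, dtype=float)
    c = A.sum(axis=1)              # degree of each node
    K = c.sum()                    # total degree, K = 2|E|
    return A - theta * np.outer(c, c) / K

def spin_modularity(A, s, theta=1.0):
    """Partition-dependent part of the modularity: s^T B s / (2K)."""
    A = np.asarray(A, dtype=float)
    s = np.asarray(s, dtype=float)
    return s @ modularity_matrix(A, theta) @ s / (2.0 * A.sum())

def block_modularity(A, s, theta=1.0):
    """Sum of A_ij - theta c_i c_j / K over pairs in the same module, divided by K."""
    A = np.asarray(A, dtype=float)
    s = np.asarray(s)
    same_module = s[:, None] == s[None, :]
    return (modularity_matrix(A, theta) * same_module).sum() / A.sum()

# Toy graph (hypothetical example): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
s = np.array([1, 1, 1, -1, -1, -1])

for theta in (0.5, 1.0, 2.0):
    gap = block_modularity(A, s, theta) - spin_modularity(A, s, theta)
    # The two forms differ only by the partition-independent constant (1 - theta) / 2.
    print(theta, gap, (1.0 - theta) / 2.0)
```

Under these assumptions the difference between the two forms is the constant $(1 - \theta)/2$, which is why it can be dropped when comparing partitions.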
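The objective function $f_{\mathrm{Ncut}}$ of the normalized cut is

$$f_{\mathrm{Ncut}}(S_1, S_2) = |E(S_1, S_2)| \left( \frac{1}{K_1} + \frac{1}{K_2} \right) \tag{3}$$

for bisection. We denote the minimum value of $f_{\mathrm{Ncut}}(S_1, S_2)$ as $\theta^*$, i.e., for any partition,

$$f_{\mathrm{Ncut}}(S_1, S_2) \ge \theta^*. \tag{4}$$

By using the relations $|E(S_1, S_2)| = (K - \mathbf{s}^\top A \mathbf{s})/4$, $K_1 = (K + \mathbf{c}^\top \mathbf{s})/2$, and $K_2 = (K - \mathbf{c}^\top \mathbf{s})/2$, we can recast (4) as

$$\mathbf{s}^\top \left( A - \theta^* \frac{\mathbf{c} \mathbf{c}^\top}{K} \right) \mathbf{s} \le K (1 - \theta^*), \tag{5}$$

where we excluded the unpartitioned case, which is singular in (4). Note that the right-hand side of (5) does not depend on the partition. The equality holds when the left-hand side is maximized, which only occurs when the optimum partition is achieved unless nontrivial degeneracies exist; although the unpartitioned case also achieves the equality in (5), this choice is excluded. Therefore, if the optimum value $\theta^*$ is known, minimizing the normalized cut is equivalent to maximizing modularity with the resolution parameter $\theta = \theta^*$, i.e.,

$$\operatorname*{arg\,min}_{\{S_1, S_2\}} f_{\mathrm{Ncut}}(S_1, S_2) = \operatorname*{arg\,max}_{\mathbf{s}} Q_{\theta^*}(\mathbf{s}). \tag{6}$$

Because we do not know the minimum value of the normalized cut $\theta^*$ a priori, the above argument is completely formal. However, Eq. (6) rules out the possibility that the optimum partition in the sense of the normalized cut differs from the optimum partition in the sense of modularity with $\theta = \theta^*$.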
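The equivalence argued above can be verified by exhaustive search on a small graph. This is a minimal sketch, assuming the normalized cut has the form $|E(S_1, S_2)|(1/K_1 + 1/K_2)$ and the modularity matrix is $A - \theta \mathbf{c}\mathbf{c}^\top/K$; the helper names and the toy graph are hypothetical.

```python
import itertools
import numpy as np

def ncut(A, s):
    """Normalized cut |E(S1,S2)| (1/K1 + 1/K2), written in the spin variables s."""
    c = A.sum(axis=1)
    K = c.sum()
    cut = (K - s @ A @ s) / 4.0      # number of edges between the two modules
    K1 = (K + c @ s) / 2.0           # total degree of the +1 module
    K2 = (K - c @ s) / 2.0           # total degree of the -1 module
    return cut * (1.0 / K1 + 1.0 / K2)

def bisections(n):
    """All spin vectors with s[0] = +1, excluding the unpartitioned one."""
    for tail in itertools.product((1.0, -1.0), repeat=n - 1):
        if -1.0 in tail:
            yield np.array((1.0,) + tail)

# Toy graph (hypothetical example): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
c = A.sum(axis=1)
K = c.sum()

candidates = list(bisections(6))
best_ncut = min(candidates, key=lambda s: ncut(A, s))
theta_star = ncut(A, best_ncut)                 # minimum normalized cut

B = A - theta_star * np.outer(c, c) / K         # modularity matrix at theta = theta*
best_mod = max(candidates, key=lambda s: s @ B @ s)
bound = K * (1.0 - theta_star)                  # partition-independent upper bound

print(theta_star, best_ncut, best_mod, best_mod @ B @ best_mod, bound)
```

For this graph the two searches return the same bisection (the cut through the bridging edge), and the maximum of $\mathbf{s}^\top B \mathbf{s}$ meets the bound $K(1 - \theta^*)$.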
We now consider the spectral method for (2). As in [10], we relax the optimization of the spin variables $\mathbf{s}$ to a continuous vector $\mathbf{x}$ with the spherical normalization condition $|\mathbf{x}|^2 = N$. This leads to the eigenvalue problem of the modularity matrix $B$, and we determine the partition based on the signs of the leading eigenvector elements. That is, each element in the leading eigenvector corresponds to the weight of a vertex, and we identify the set of vertices with weights of the same sign as a module. The unpartitioned phase is the case in which every node has the same sign of weight. Note that the 1-vector, the vector in which all elements are equal to unity, is in general not an eigenvector when θ ≠ 1, because $B\mathbf{1} = A\mathbf{1} - \theta \mathbf{c} (\mathbf{c}^\top \mathbf{1})/K = (1 - \theta)\mathbf{c}$. However, as we show, the leading eigenvector is orthogonal to the 1-vector in the detectable region.
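The procedure described above can be sketched as follows. This is an illustrative implementation, assuming the modularity matrix $B = A - \theta \mathbf{c}\mathbf{c}^\top/K$ and a dense eigensolver; the function name `spectral_bisection` and the toy graph are hypothetical, and a sparse solver would be preferable for large graphs.

```python
import numpy as np

def spectral_bisection(A, theta=1.0):
    """Bisect a graph by the signs of the leading eigenvector of the modularity matrix."""
    A = np.asarray(A, dtype=float)
    c = A.sum(axis=1)
    B = A - theta * np.outer(c, c) / c.sum()
    eigvals, eigvecs = np.linalg.eigh(B)          # eigenvalues in ascending order
    x = eigvecs[:, -1] * np.sqrt(len(c))          # leading eigenvector with |x|^2 = N
    s = np.where(x >= 0, 1, -1)
    partitioned = bool(np.any(s > 0) and np.any(s < 0))
    return eigvals[-1], s, partitioned

# Toy graph (hypothetical example): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

lam1, s1, part1 = spectral_bisection(A, theta=1.0)   # splits the two triangles
lam0, s0, part0 = spectral_bisection(A, theta=0.0)   # B = A: all weights share one sign
print(lam1, s1, part1)
print(lam0, s0, part0)
```

With θ = 1 the sign pattern separates the two triangles, whereas with θ = 0 the leading eigenvector is the Perron vector of $A$ and the graph is left unpartitioned.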
Largest eigenvalue of the modularity matrix. -We analyze the performance of the spectral method for an ensemble of random graphs with a planted 2-block structure. We denote the node sets of planted modules as $V_1$ and $V_2$, where $|V_1| = p_1 N$ and $|V_2| = p_2 N$ ($p_2 = 1 - p_1$). We impose a constraint that the number of edges between blocks is $\gamma N$. The rest of the edges are placed randomly within each module so that the node degrees follow a given degree distribution $\{b_t\}$, where $b_t$ represents the fraction of nodes with degree $c_t$. The average degree is denoted by $c$ ($= \sum_t b_t c_t$). As we have $\gamma = c p_1 p_2$ for a uniform random graph, it is natural to consider $\Gamma = 1 - \gamma/(c p_1 p_2)$: Γ = 1 when modules are completely disconnected, and Γ = 0 for a uniform random graph.
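As a small illustration of the definition of Γ, the following sketch measures γ and Γ for a given graph and planted labelling, reading the definition as $\Gamma = 1 - \gamma/(c\, p_1 p_2)$ with $\gamma N$ the number of between-block edges. The function name `block_strength` and the toy graphs are hypothetical; for finite graphs the value is only an empirical counterpart of the ensemble parameter.

```python
import numpy as np

def block_strength(A, labels):
    """Return (gamma, Gamma) for adjacency matrix A and planted labels in {0, 1}."""
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    N = len(labels)
    p1 = np.mean(labels == 0)
    p2 = 1.0 - p1
    # Each between-block edge appears exactly once in the off-diagonal block.
    between = A[np.ix_(labels == 0, labels == 1)].sum()
    gamma = between / N
    c_avg = A.sum() / N                      # average degree
    return gamma, 1.0 - gamma / (c_avg * p1 * p2)

def graph_from_edges(n, edges):
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return A

triangles = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]
labels = np.array([0, 0, 0, 1, 1, 1])

gamma_sep, Gamma_sep = block_strength(graph_from_edges(6, triangles), labels)
gamma_br, Gamma_br = block_strength(graph_from_edges(6, triangles + [(2, 3)]), labels)
print(Gamma_sep, Gamma_br)
```

Two disconnected triangles give Γ = 1, and adding a single bridging edge lowers Γ to 5/7 in this toy example.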
Our goal is to evaluate the ensemble averages of the leading eigenvalues and their eigenvector distributions as functions of θ and Γ. This allows us to measure the correlation between the partition obtained by the spectral method $\{S_1, S_2\}$ and the planted partition $\{V_1, V_2\}$. As in the previous works [13,15], the largest eigenvalue of the modularity matrix $B$ can be calculated as

$$\lambda_1 = 2 \lim_{\beta \to \infty} \frac{1}{N\beta} \ln Z(\beta|B), \tag{7}$$

where

$$Z(\beta|B) = \int d\mathbf{x}\, \exp\left( \frac{\beta}{2} \mathbf{x}^\top B \mathbf{x} \right) \delta(|\mathbf{x}|^2 - N). \tag{8}$$

For the average largest eigenvalue, the replica trick yields

$$[\lambda_1]_B = 2 \lim_{\beta \to \infty} \lim_{n \to 0} \frac{1}{N\beta} \frac{\partial}{\partial n} \ln [Z^n(\beta|B)]_B, \tag{9}$$

where we denote the ensemble average over random graphs as $[\cdots]_B$. We consider the limit $N \to \infty$ and evaluate (9) using the saddle-point method with the replica-symmetric ansatz. After a calculation analogous to that in [13,15], we obtain [equation missing in source]. Solving for the saddle point in the entire function space is, however, not feasible analytically or numerically. Thus, we restrict the distributions $q_r(A, H)$ and $\hat{q}_r(\hat{A}, \hat{H})$ to the simple forms $q(A) = \delta(A - a)$ and $\hat{q}(\hat{A}) = \delta(\hat{A} - \hat{a})$. While such distributions actually provide the exact saddle point for random regular graphs, they are approximations in general; this is called the effective medium approximation (EMA). Under this restriction, we can determine the average first eigenvalue $[\lambda_1]_B$ and the saddle-point conditions in analytic forms by using the functions $R_n(\phi, \hat{a})$ and $S_n(\phi, \hat{a})$ [definitions missing in source] and the moments $m_{nr} = \int dH\, q_r(H) H^n$ and $\hat{m}_{nr} = \int d\hat{H}\, \hat{q}_r(\hat{H}) \hat{H}^n$. The average first eigenvalue $[\lambda_1]_B$ becomes [equation missing in source], where $\overline{X_{nr}} = \sum_r p_r X_{nr}$.

The extremum conditions elucidate the appearance of each phase. While we have $\hat{m}_{11} \neq \hat{m}_{12}$ with $m_{1r} \neq 0$ in the detectable phase, the condition $\hat{m}_{11}^2 = \hat{m}_{12}^2 = 0$ is satisfied in the undetectable phase. The transition occurs when $\phi$ and $\hat{a}$ satisfy [Eq. (13) missing in source]. In the detectable phase, $\phi$ and $\hat{a}$ are determined using the following extremum conditions: [Eq. (14) missing in source], while they are constant in the undetectable phase. In both phases, we have $\hat{\Omega} = 0$, and the average first eigenvalue is [equation missing in source]. Note that (13) does not contain the resolution parameter θ; therefore, the detectability threshold is universal with respect to θ.
There also exists a solution with $\hat{\Omega} \neq 0$ and $\hat{m}_{11} = \hat{m}_{12} = 0$. This solution indicates the unpartitioned phase, and it is observed when the corresponding first eigenvalue becomes larger than that of the detectable and undetectable phases. In this phase, $\phi$ and $\hat{a}$ are determined using (14) and [equation missing in source]. The transition to the unpartitioned phase occurs when the values of $\phi$ and $\hat{a}$ for the two phases coincide.

Random regular graph. -In the case of random regular graphs, the above results are exact, and we can analytically solve for the physical quantities and the boundaries of the phases. The detectable phase of a random $c$-regular graph has [equation missing in source]. The undetectable phase has [equation missing in source], and the detectability threshold is [equation missing in source]. This is equal to the case of the normalized Laplacian [13]. Finally, the boundary of the detectable phase and the unpartitioned phase is [equation missing in source]. This is a monotonically decreasing function that is minimum when θ equals $\theta_{\max}$ [equation missing in source]. The corresponding value of Γ coincides with the detectability threshold. Note that when the graph is regular, the 1-vector is the leading eigenvector in the unpartitioned phase. Its eigenvalue is $c(1 - \theta)$ and is equal to the eigenvalue of the undetectable phase at $\theta_{\max}$. Hence, the region where both θ and Γ are small is the unpartitioned phase. Consequently, we obtain the phase diagram shown in Fig. 1(a). Following the literature, we employed $c_{\mathrm{in}} - c_{\mathrm{out}}$, instead of Γ, to indicate the strength of the block structure; in the case of equal-size blocks $p_1 = p_2 = 0.5$, $c_{\mathrm{in}} - c_{\mathrm{out}}$ is twice the difference between the average degree within a block and between blocks (see [13] for the relation between γ and $c_{\mathrm{in}} - c_{\mathrm{out}}$). The fractions of correctly classified vertices and the average first eigenvalues $[\lambda_1]_B$ are plotted in Figs. 1(b) and 1(c). To draw the solid lines, we further approximated the distribution of the eigenvector elements as Gaussian.
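The statement that the 1-vector is an eigenvector with eigenvalue $c(1 - \theta)$ on a regular graph is easy to confirm numerically. The sketch below uses a deterministic circulant 4-regular ring rather than a random regular graph, purely for reproducibility; the graph, the function name, and the parameter values are assumptions made for this illustration, and the modularity matrix is taken as $A - \theta \mathbf{c}\mathbf{c}^\top/K$.

```python
import numpy as np

def circulant_regular_graph(N, offsets=(1, 2)):
    """Ring in which node i is linked to i +/- d for every d in offsets (degree 2 * len(offsets))."""
    A = np.zeros((N, N))
    for i in range(N):
        for d in offsets:
            A[i, (i + d) % N] = A[(i + d) % N, i] = 1.0
    return A

def modularity_matrix(A, theta):
    c_vec = A.sum(axis=1)
    return A - theta * np.outer(c_vec, c_vec) / c_vec.sum()

N = 10
A = circulant_regular_graph(N)
c = A.sum(axis=1)[0]                      # common degree, here 4
ones = np.ones(N)

theta = 0.2
B = modularity_matrix(A, theta)
residual = B @ ones - c * (1.0 - theta) * ones      # zero if 1 is an eigenvector
lead_small_theta = np.linalg.eigvalsh(B)[-1]        # equals c (1 - theta) for this small theta

lead_theta_one = np.linalg.eigvalsh(modularity_matrix(A, 1.0))[-1]
print(np.abs(residual).max(), lead_small_theta, lead_theta_one)
```

For this ring the 1-vector leads at θ = 0.2 with eigenvalue 3.2, while at θ = 1 its eigenvalue drops to zero and a different eigenvector, with eigenvalue $\sqrt{5}$, takes over.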
We can confirm a universal detectability curve for various values of sufficiently large θ and an abrupt transition between the detectable and unpartitioned phases for a small value of θ.
Recall that in the case of the normalized cut, the resolution parameter θ is the optimum value of the objective function itself. Although the exact value of θ is not known, it is bounded by the second-smallest eigenvalue of the normalized Laplacian [13] using the Cheeger inequality [16]. The dashed line in Fig. 1(a) indicates the lower bound of θ, i.e., one half of the second-smallest eigenvalue of the normalized Laplacian, while the upper bound is very large.
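The lower bound can be illustrated on a small graph by comparing one half of the second-smallest eigenvalue of the normalized Laplacian with the minimum normalized cut found by exhaustive search. This is a minimal sketch, assuming the normalized Laplacian $I - D^{-1/2} A D^{-1/2}$ and the normalized cut $|E(S_1, S_2)|(1/K_1 + 1/K_2)$; the function names and the toy graph are hypothetical.

```python
import itertools
import numpy as np

def normalized_laplacian(A):
    """I - D^{-1/2} A D^{-1/2} for a graph without isolated nodes."""
    d = A.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(d)
    return np.eye(len(d)) - inv_sqrt[:, None] * A * inv_sqrt[None, :]

def min_ncut(A):
    """Minimum of |E(S1,S2)| (1/K1 + 1/K2) over all bisections, by brute force."""
    c = A.sum(axis=1)
    K = c.sum()
    best = np.inf
    for tail in itertools.product((1.0, -1.0), repeat=len(c) - 1):
        if -1.0 not in tail:
            continue                              # skip the unpartitioned case
        s = np.array((1.0,) + tail)
        K1 = (K + c @ s) / 2.0
        K2 = K - K1
        cut = (K - s @ A @ s) / 4.0
        best = min(best, cut * (1.0 / K1 + 1.0 / K2))
    return best

# Toy graph (hypothetical example): two triangles joined by one edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

lam2 = np.linalg.eigvalsh(normalized_laplacian(A))[1]   # second-smallest eigenvalue
theta_star = min_ncut(A)
print(lam2 / 2.0, theta_star)
```

The brute-force search is exponential in the number of nodes and is meant only for checking the inequality on toy instances.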
[Figure caption fragment: ... [2], threshold of the normalized Laplacian [13], and threshold of the modularity matrix with EMA. Although they are not shown, we also confirmed the universal behavior with respect to θ for several unequal block sizes.]

Stochastic block model. -Although there are many variants of the stochastic block model, we consider the most fundamental model. Pairs of nodes within the same module and between different modules are connected with probabilities $p_{\mathrm{in}}$ and $p_{\mathrm{out}}$ ($p_{\mathrm{in}} > p_{\mathrm{out}}$), respectively. That is, the nodes within the same module are more densely connected than the nodes in different modules. Because we are focusing on sparse graphs, we set both $c_{\mathrm{in}} = p_{\mathrm{in}} N$ and $c_{\mathrm{out}} = p_{\mathrm{out}} N$ to be $O(1)$. This model has a Poisson degree distribution, and therefore, our EMA-based treatment no longer offers the exact result. Because no bound exists for the maximum node degree, we need to rely on the numerical estimate of the formal solution by truncating the infinite summations of $R_n(\phi, \hat{a})$ and $S_n(\phi, \hat{a})$. The results of our replica analysis and those of the corresponding numerical experiments are compared in Figs. 2(a)-(c). The comparison indicates that our estimates offer very accurate predictions. The average first eigenvalue $[\lambda_1]_B$ is fairly large even in the undetectable phase, which is consistent with [17]. Note that the degree of eigenvector localization, which we measured by the inverse participation ratio (IPR), increases gradually around the detectability threshold; as far as we explored, the value of IPR becomes significantly large only in the undetectable phase (see [13] for the definition of IPR). Although it is difficult to prove whether the localized eigenvector is absent in the detectable region, its influence seems negligible. This is consistent with what we observed empirically [18].
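A numerical experiment of the kind described above can be sketched as follows. The parameter values, the random seed, and the function names are hypothetical, the modularity matrix is taken as $A - \theta \mathbf{c}\mathbf{c}^\top/K$, and the IPR is assumed here to be $\sum_i x_i^4$ for a unit-norm eigenvector, which may differ from the exact definition used in [13].

```python
import numpy as np

def sample_sbm(N, c_in, c_out, rng):
    """Sparse two-block stochastic block model with equal-size blocks (N even)."""
    labels = np.repeat([1, -1], N // 2)
    same = labels[:, None] == labels[None, :]
    P = np.where(same, c_in / N, c_out / N)         # p_in = c_in / N, p_out = c_out / N
    upper = np.triu(rng.random((N, N)) < P, k=1)    # sample each pair once
    A = (upper | upper.T).astype(float)
    return A, labels

def leading_eigenvector(A, theta=1.0):
    """Largest eigenvalue and unit-norm leading eigenvector of the modularity matrix."""
    c = A.sum(axis=1)
    B = A - theta * np.outer(c, c) / c.sum()
    vals, vecs = np.linalg.eigh(B)
    return vals[-1], vecs[:, -1]

rng = np.random.default_rng(0)
N = 400
A, labels = sample_sbm(N, c_in=18.0, c_out=2.0, rng=rng)

lam1, x = leading_eigenvector(A, theta=1.0)
s = np.where(x >= 0, 1, -1)
# The overall sign of the eigenvector is arbitrary, so compare with both labellings.
accuracy = max(np.mean(s == labels), np.mean(s == -labels))
ipr = np.sum(x ** 4)                                # assumed IPR definition
print(lam1, accuracy, ipr)
```

Repeating the run over several samples and several values of $c_{\mathrm{in}} - c_{\mathrm{out}}$ and θ would give empirical counterparts of the curves discussed in the text; a dense eigensolver is used only because the example is small.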
Summary and discussion. -In summary, we showed that the modularity maximization with the resolution parameter θ offers a unifying framework of graph partitioning and analyzed the detectability of the spectral method of the unifying framework. Our phase diagram shows that when the resolution parameter θ is sufficiently small, the unpartitioned phase appears before the detectability threshold because the block structure is weakened; that is, even when the block structure is statistically significant in the sense of the Bayesian inference, there exist cases where the graph has no significant structure in the sense of an objective function. Otherwise, the detectability threshold is universal irrespective of θ. This behavior occurs probably because the graph is assumed to be infinitely large in our analysis; because the order of the penalty factor $(\mathbf{c}^\top \mathbf{x})^2/K$ is smaller than that of $\mathbf{x}^\top A \mathbf{x}$ in the detectable phase, the value of θ does not affect the resulting performance. Our results imply that, whereas the gap between the detectability threshold of the Bayesian inference and the spectral method was mainly due to the eigenvector localization when the normalized