Effects of homophily and heterophily on preferred-degree networks: mean-field analysis and overwhelming transition

We investigate the long-time properties of a dynamic, out-of-equilibrium network of individuals holding one of two opinions in a population consisting of two communities of different sizes. Here, while the agents' opinions are fixed, they have a preferred degree which leads them to endlessly create and delete links. Our evolving network is shaped by homophily/heterophily, which is a form of social interaction by which individuals tend to establish links with others having similar/dissimilar opinions. Using Monte Carlo simulations and a detailed mean-field analysis, we study in detail how the sizes of the communities and the degree of homophily/heterophily affects the network structure. In particular, we show that when the network is subject to enough heterophily, an"overwhelming transition"occurs: individuals of the smaller community are overwhelmed by links from agents of the larger group, and their mean degree greatly exceeds the preferred degree. This and related phenomena are characterized by obtaining the network's total and joint degree distributions, as well as the fraction of links across both communities and that of agents having fewer edges than the preferred degree. We use our mean-field theory to discuss the network's polarization when the group sizes and level of homophily vary.

While homophily appears to be ubiquitous in social networks with many examples of "birds of a feather flock together" behaviors, heterophily appears to be more elusive, with some empirical evidence in team formation processes [63], and in professional cooperation networks [38,40]. Quite interestingly however, in a twocommunity growing network according to preferential attachment, it has recently been found that heterophily is responsible for the increase of the average degree of the agents of the smaller group [55].
Here we consider an evolving network model in which links fluctuate continuously as the result of the homophilic/heterophilic interactions between individuals of two communities, holding one of two different opinions that remains fixed, who try to satisfy a prescribed preferred degree [64][65][66].
The objective of this work is to understand in some detail how homophily and heterophily affect the stationary state of the network and its emerging properties. While some of these aspects are considered in Ref. [67] for the special case of opinion groups of the same size, here we focus on the general case of communities of arbitrary sizes, showing that this leads to surprising results at a price of a considerably more challenging analysis. Our main contribution is the detailed characterization of the "overwhelming transition" arising under enough heterophily in communities of different sizes. Remarkably, as observed in Ref. [67], the network then undergoes a transition separating a phase where it is homogeneous and an "overwhelming phase" in which agents of the smaller community are overwhelmed by links from those of the larger group, with degrees greatly exceeding the preferred degree. Here, we unveil the properties of the overwhelming phase (and ordinary regime) notably in terms of the total and joint degree distributions by devising suitable mean-field theories and Monte Carlo simulations.
The remainder of the paper is organized as follows: the formulation of the model and quantities of interest are introduced in the next section. In Sec. III, we summarize the main properties of the symmetric system reported in Ref. [67]. Our main findings regarding the ordinary and overwhelming phases are presented in Secs. IV and V centred respectively on simulation results and meanfield analysis. Our conclusions are presented in Sec. VI. Further details are provided in the Supplementary Material (attached at the end of the document; or on https: //stacks.iop.org/JSTAT/2022/013402/mmedia).

II. MODEL FORMULATION AND QUANTITIES OF INTEREST
Our model consists of N agents (nodes) with a varying number of connections (links) between them, forming a fluctuating network. Each agent j = 1, · · · , N is endowed with one of two possible states ("opinions"), σ j = ±1, and a preferred number of links κ. A fraction n ± of the agents is in opinion state ±1, so that the network consists of a number N ± = N n ± of agents holding opinion ±1. Using the terminology of Ising-like models [68], (N + , N − ) can be replaced by (N, m), where m ≡ (N + − N − ) /N = n + − n − is an intensive quantity called "magnetization".
The basis of our model is a preferred degree network (PDN) [64][65][66]69] in which, at discrete time steps, different individuals (nodes) are chosen to add or cut links to others depending on whether its degree is less or greater than κ. The former are referred to as "adders" and the latter as "cutters". So as to avoid frozen networks, κ is chosen to be a half integer. Here, the PDN dynamics is supplemented by two ingredients: (i) the network consists of two communities of differing opinions and sizes; (ii) there are social interactions among agents embodied by the notion of homophily. Controlled by a parameter J ∈ [−1, 1], homophily models the behavior of individuals who prefer to "make friends" and establish links with those holding the same opinion (positive homophily or simply homophily, J > 0) or those with opposite opinions (negative homophily or simply heterophily, J < 0), see Fig. S1 and Sec. S1 in the supplementary material.
By combining Monte Carlo and mean-field techniques, we here investigate how the network topology respond to parameters (κ, J, N, m) in its out-of-equilibrium stationary state (see Sec. S2 of the supplementary material), and focus on the unexpected phenomena arising when m = 0, briefly reported in [67].

A. Model update rules
The dynamic rules of our model are conveniently stated by assuming discrete time steps, t = 1, 2, · · · . These rules are illustrated in Fig. 1. At each t, an agent i (i = 1, · · · , N ) of degree k i ∈ [0, N − 1] is randomly chosen. For convenience, we will refer to nodes connected to i as its "neighbors" and those not connected as "nonneighbors". Then, after i is chosen, • If k i > κ, choose a neighbor j and delete the link J thus plays a role similar to the nearest neighbor interaction in the ordinary Ising model (spin alignment). Note that for J = 0, the distinction between the communities is only nominal and as the opinions are irrelevant for the dynamics, and the system becomes similar to the PDN models of Refs. [64,69], see Sec. S1 in the supplementary material.

B. Quantities of interest
To study the behavior of this evolving network in various regimes of parameter space, we focus on a few quantities of interest, summarized in the table of Sec. S3 of the supplementary material. The most common of these is the degree distribution (DD) p σ (k) associated with agents in community σ, and from these p's, we can extract the average degrees µ σ ≡ k k p σ (k) for each community σ = ±, as well as the variances V σ ≡ k k 2 p σ − µ 2 σ . We also distinguish the number of links an agent has to neighbors with opinion τ ∈ {−, +}, and denote these by τ (where k = + + − ). With τ , we compile the joint degree distributions (JDD) [66] for the two communities: P σ ( + , − ), and from these can compute the averages: which can be regarded as the "centers of mass" (CM) of the two JDDs, see Fig. 9 of Ref. [67]. The average degree in community σ is thus µ σ = ¯ + σ + ¯ − σ . Another convenient characteristic is the conditional distribution of cross-links: which gives the probability for a node in the group σ to have w cross-links (CLs), provided it has total degree k.
( a d d a n I L l i n k ) FIG. 1: Illustration of the model for κ = 2.5. Dark nodes represent +1 voters (majority group) and light nodes are −1 voters (minority group). (a) Cutting process of an ij link when ki > κ. (b) Adding of an ij link when ki < κ. Dashed: cross links (CLs) between agents of different groups; solid: internal links (ILs) between voters of the same group. The probabilities of cutting an IL and adding a CL arě J = (1 − J)/2 andĴ = (1 + J)/2, respectively. Similarly, the respective probabilities of cutting a CL and adding an IL arê J andJ, see text.
We also study the (fluctuating) total number of connections, which is a global quantity denoted by L = L +L × , where L is the overall number of ILs and L × is the total number of CLs. Denoting by L στ the links between agents opinions σ and τ , we have L = L ++ + L −− and L × = L +− = L −+ , and hence Since we focus on the steady-state averages of these quantities, we simplify notation by writing L in lieu of L , etc. (The same below with α and ρ.) Clearly, these averages are related to ¯ τ σ : A natural way to describe polarization (extent of division between the communities) is to start with the ratio of CLs to the total number of links [32] ρ ≡ L × /L, and then a measure of polarization is given by so that Λ (J = ±1) = ±1. For asymmetric systems, however, the ratios for the two communities are distinct: Further, as will be discussed in Section IV.E below, Λ suffers from some deficiencies. Instead, let us introduce an alternative measure of polarization, Π, which relies on the (normalized) difference between the two CMs, Specifically, we define In order to account for the (fluctuating) number of nodes to add/cut links, we denote by N a and N c the number of "adders" and "cutters", respectively. Further we denote by N β σ with σ ∈ {+, −} and β ∈ {a, c}, the number of agents who are adders (β = a) or cutters (β = c) and hold opinion σ. Hence N a + N c = N and N σ = N a σ + N c σ , and the associated fractions are n β σ ≡ N β σ /N ; n σ = Σ β n β σ ; n β = Σ σ n β σ .
Clearly, Σ σ,β n β σ = 1. The fraction of adders, denoted by α ≡ n a = N a /N, plays an important role. It is useful to define the fraction of adders in each group σ by with n + α + + n − α − = α and α ± = α when m = 0.

III. SYMMETRIC CASE, m = 0: SUMMARY OF RESULTS
For the sake of reference and completeness, we summarize the findings on the symmetric case m = 0 reported in Ref. [67]. We showed that by solving the balance equations for L × and L in the context of a mean-field approximation, the fractions of adders and CLs in the steady state of the symmetric case are which implies that the response to homophily in a clear unique mean-field expression of the polarization: We also studied the stationary degree distributions p (k) and q (w|k). For the former, we found from which we obtain the degree average and its variance: We also consider the JDD, P σ ( + , − ), with P + (x, y) = P − (y, x) which, due to symmetry, gives the probability for either a + or − node to have x ILs and y CLs. The JDD is obtained from (2), where the conditional distribution of cross-links is approximated by a binomial distribution, i.e. q q bin (w|k) = k w ρ w (1 − ρ) k−w [67]. Hence, with (13) and (11), the product is a suitable approximation of the JDD for our purposes.
As explained in Sec. IV.D, for the generic m = 0 case with |mJ| 1, the form (15) can be used to get a threedimensional impression of the network's properties (see Fig. 7).

IV. ASYMMETRIC CASE, m = 0: SIMULATION RESULTS
In our simulations, m varies from −0.04 to −0.6, and we will see that m = −0.04 and m = −0.6 represent two very different regimes, i.e., the ordinary phase and the overwhelming phase, the system displays properties quite distinct from those summarized in Sec. III, see below.
We initiate the system with no initial links, run for typically over 10 2 MCS, and verify that quantities like α σ and ρ σ have settled into steady values. One MCS (Monte Carlo step) is N updates. Thereafter, simulation runs are carried on for up to another 10 6 MCS, during which we measured various quantities every MCS. For simplicity, we show mainly the data associated with N = 1000 and κ = 60.5, so that 1 κ N ± . The data collected for other values give some impression of finite-size effects, which can be serious when N and κ are lowered by an order of magnitude. In addition, we have carried out 10 runs with different random number generators in a handful of cases, to get a better idea of statistical errors. In all cases tested, the scatter for global quantities like α and ρ is much smaller than the size of the symbols (i.e., at most one part in a thousand).
A. Fraction of adders and cross-community links As Fig. 2(a) illustrates, the asymmetric case is of great difference from the symmetric case even for the smallest m, except that asymmetry has no effect on the fraction of adders at J = 0. The effect of homophily appears to be opposite for the two α's: when J > 0 and m < 0, α + is greater than α in the symmetric case, while α − is less than that α in the case m = 0. On the heterophily side (J < 0), we find richer phenomena, namely, the presence of a "kink" in the α − (J) curves at larger −J, accompanied by α + becoming vanishingly small. This phenomenon is most clearly seen in the m = −0.6 data, shown in Fig. 2(a), when −J is larger than ∼ 0.5. The transition to this overwhelmed state is accompanied by rapid changes in the slopes of α − (J). Such kinks can be seen even for low asymmetry systems (e.g., m −0.1), provided the strength of heterophily is sufficiently large, see Sec. IV.C. As we probe deeper into this regime (i.e., larger −J), we find α − (J) turning downwards, which is consistent with α vanishing in the limits |J| = 1. For the fraction ρ σ (J) of CLs associated in the group holding opinion σ, we have ρ ± (0) = n ∓ = (1 ∓ m)/2 and find a monotonic decrease with J in Fig. 2(b). Similar to the behavior of α ± , the curves of ρ ± vs. J also deviate from the symmetric case in opposite directions. We find that the minority fraction, ρ + approaches 1, just as the majority ρ − develops a kink, as J is decreased towards −1. As may be expected, these features occur at about the same value of J ≈ J c as those in α ± (J). Beyond the critical J, all ρ ± approach 1 rapidly when J < J c , a property which can also be understood intu- itively: heterophily is so strong in this regime that the entire population is locked into establishing CLs.

B. Mean degrees and associated variances
We now consider the mean degrees (µ σ ) and associated variances (V σ ) of each community σ = ±. As in the case of α ± and ρ ± , the two µ's deviate in opposite ways as we increase |J|, see Fig. 3(a,b). As the data for the larger κ's show, the differences µ ± − κ in this regime converge on values which are O (1), which indicates that the communities are not interacting much. In stark contrast, interactions across the communities affect the network dramatically under larger heterophily, see Fig. 3(c,d): while µ − remains relatively close to κ, µ + − κ in the minority group is strongly κ dependent, with a pronounced effect for large asymmetry, see Fig. 3(d) [note the scale of the µ − κ axis]. In this regime, the minority starts being "overwhelmed" once −J rises beyond the aforementioned −J c . Indeed, the average degree of minority agents can be enhanced by a factor E ≡ µ + /κ J N − /N + which can be much larger than unity [70]. Thus, instead of the difference µ σ − κ, we plot (for large asymmetry, m = −0.6) Fig. 3(e,f). From simulation results for M − in Fig. 3(e), we conclude that M − → 0 as κ → ∞, consistently with µ − − κ = O (1). In Fig. 3(f), we plot M + = E − 1 vs J and, from the data collapse of the simulation results (red symbols), we conjecture that M + converges to some definite, non-trivial thermodynamic limit. The behavior of M + is hence reminiscent of the Ising magnetization, becoming non-zero below a critical temperature. In our simulation results, M + appears to execute a smooth crossover through the transition region (inset of Fig. 3(f)). The DD variances paint a similar picture, with V σ = O (1) in the ordinary regime and V + = O (κ) in the overwhelmed state, see Fig. 3(g,h). The overall behavior is qualitatively clear, but the dependence on the parameters is complex as illustrated in Fig. 3(h) [note the scale of the axis]. In the inset of Fig. 3(h), V + /κ scales, to some extent, with κ but without converging as κ → ∞, which suggests the need for a further study of finite-size effects to draw conclusions about the thermodynamic limit.

C. Overwhelming transition region characteristics
Here, we highlight the transition region between the ordinary and overwhelming phases by providing a perspective of the simulation data based on the derivatives of α − , ρ − , µ + , V + in Fig. 4, see Sec. S4 in the supplementary material. We report the discrete derivatives of these quantities, and all results clearly indicate the existence of a sharp peak in the vicinity of J ≈ −0.42. These peaks correspond to the "kinks" shown in Fig. 2, at which the transition between ordinary and overwhelming phases occurs ("overwhelming transition"). The critical point J c at which this transition occurs depends on

D. Degree distributions and joint degree distributions
We now study the total and joint degree distributions. At low asymmetry and/or heterophily (|m| 1 and/or J > J c ), the DDs p ± (k) remain approximate exponentials (Laplacian distribution), dropping as k gets further from κ. As shown in Fig. 6(a-c), the Laplacian distributions are not symmetric, as the slopes on each side (log-linear plot) differ slightly, producing µ σ = κ, with the slopes of the DDs in the log-linear plots and widths being O (1). In our simulations, we found little dependence of the DDs on κ and N ± . The slopes of p ± differ in the opposite directions for the two communities, corresponding to α ± deviating from the m = 0 curve in opposite ways. In Fig. 6(c) we see that the right half of the DD of the minority is bent in the log-linear plot, and a Gaussian-like distribution for the cutters in the minority develops gradually as J is decreased further (increased heterophily), with J down to −0.6 in Fig. 6(d-f). These findings illustrate the process of the system transiting from the ordinary into the overwhelming regime. To locate the change of phase, the "overwhelming transition", we assume that the system is in the transition regime when p( κ ) p( κ + 1), i.e., the slope of the degree distribution at κ of cutters in the minority equals 0, see Fig. 6(d). Within the overwhelming regime, the fraction of adders (k < κ) decreases substantially, and p + (k) from an exponential becomes a Gaussian-like distribution. In Fig. 6(e,f) we show the DDs deep in the overwhelming phase, where novel behavior emerges: while the majority keep their DD to be narrowly distributed around κ, the dramatic rise of the average degree of the minority agents is accompanied by significant changes to p + (k) and a substantial decrease of the adders in the minority. As illustrated in Fig. 6(f) for m = J = −0.6, there are no minority nodes with k < κ = 60.5 (i.e. no minority adders), while the distribution is essentially a Gaussian peaking at k 145, well over twice κ. Unlike the narrow Laplacian, the variance of this Gaussian is considerably higher, V + 217. The inset of Fig. 6(f) shows the striking difference between the DDs on linear scale.
We also consider the joint degree distributions, (2), where k = + + − . In the regime where |mJ| is small, we can approximate the conditional distribution of cross-links by a binomial distribution, i.e., q σ ( −σ |k) [67] as in the symmetric case, yielding for the JDDs P σ ( that are of the same form as (15). Accordingly, the narrow Laplacian and broad Gaussian are embodied as different perspectives -in p σ (k) and q σ ( −σ |k). A three-dimensional plot of P σ ( + , − ) is the best way to view the "knife-edge" of the JDD caused by p σ (k) following a narrow asymmetric Laplacian distribution in the ordinary phase. As examples, in Fig. 7, we present the JDDs for the case of low asymmetry and intermediate heterophily, and the case of high asymmetry (and low heterophily). In Fig. 7(a,b), the two perspectives for the majority JDD are shown, one along the knife-edge and the other, broadside. Note that k = + + − , so that the perspective of the former is aligned with constant k. The latter clearly gives the impression of a Gaussian stemming from q σ . This picture is qualitatively the same for the minority agents. In Fig. 7(c,d), we present the broadside perspective of the minority JDD in linear and semi-logarithmic scales. The latter reveals the initial stages of the minority being overwhelmed: in Fig. 7(d), the right side of the Gaussian is truncated near + = 0, and squeezed towards implying that the probability of a minority node to have a small number of ILs (small + ) is not vanishingly small. The graphs of Fig. 7(e,f), are similar, but for a low level of heterophily: the effect of heterophily is moderate, with almost half of the Gaussian being squeezed into the + = 0 plane, with the narrowing of the JDD along + being accompanied by its broadening along − . In this cross-over regime, the partition of P + ( + , − ) into the product of a narrow Laplacian and a broad Gaussian is not valid (see Fig. 6(d)). Finally, deep in the overwhelming phase, the minority JDD collapses entirely onto the + = 0 plane, and the product expression is trivially valid: is a broad Gaussian in the overwhelmed state. Meanwhile, the JDD of the majority agents, P − ( + , − ), continues to display the same "knife-edge" characteristics, as in the ordinary phase. In the transition region, the minority JDD cannot be simply approximated by the product of the DD and the conditional degree distribution. When measuring the network polarization, the quantity Λ is commonly used and sometimes referred to as "average edge homogeneity" [47,71]. See Sec. S6 in the supplementary material for the derivation of Λ in terms of ρ ± which are functions of J and m obtained from the mean-field theory of Sec. V.B. Λ provides a sensible measure of polarization in systems with low asymmetry (m ≈ 0). However, this is generally not the case when m = 0, especially when J ≈ 0 and Λ ≈ m 2 fail to predict a vanishing polarization when J → 0, see Fig. 8(a) and Ref. [67]. This led us to introduce Π, defined by Eq. (7), which is constructed to avoid the deficiencies of Λ by providing a meaningful measure of polarization for any values of m. Although Π was motivated by the separation of the CMs of P σ (the JDDs of the two groups), we have shown that in the context of the mean-field approximation, Π can be computed from ρ ± and m, see Eq. (S7) in the supplementary material. While Π contains the same information displayed in Fig. 2(b), its mean-field expression is an instructive combination of ρ ± offering a single meaningful quantity for polarization. We indeed find that Π ∝ 1 − ρ + − ρ − which vanishes for J = 0 and any value of m (since ρ ± = n ∓ ), while it reduces to Λ for m = 0. When J = ±1, Π(±1, m) = Λ(±1, m) = ±1. This implies that the sign of Π alone signifies if the system is homophilic or heterophilic, as seen in Fig. 8(b) where the mean-field predictions are in excellent agreement with simulation data (symbols) for all values of m and J. While Π does not appear to be independent of m, the dependence found in Fig. 8(b) turns out to be weak. Finally, we note that in Fig. 8, as expected, Λ and Π display a signal of the transition from the ordinary to the overwhelmed state about J ≈ J c (m).

V. ASYMMETRIC SYSTEMS: THEORETICAL CONSIDERATIONS
From the phenomena presented in the last section, it is clear that there are two different regimes, the ordinary and overwhelming phases, with quite distinct properties. This section is devoted to their theoretical characterization in terms of mean-field theories. While we believe it is possible to formulate a single mean-field based theory which would completely describe both regimes, such a theory will be quite complex. Instead, here we adopt a simpler approach which has the advantage of being pragmatic and easily comprehensible, see Sec. V.B, at the price of being less effective in the overwhelming regime, an issue that we circumvent in Sec. V.C with a refined (complementary) mean-field theory.

A. Framework for a general mean-field analysis
Our starting point to set a general framework for a mean-field analysis that applies for any choice of N ± , for arbitrary m, is to generalize the mean-field theory devised in Ref. [67] for the symmetric case m = 0, see also Sec. III.
For arbitrary m, we now have four unknowns: α ± (or n a ± ) and ρ ± [72]. As in the symmetric case we have two equations and still need two other equations to obtain a self-contained theory. One such equation is for each L σσ , while the other is a strict constraint involving ρ ± in the identity L +− = L −+ .
Focusing first on global quantities like L, we must consider the changes for L σσ , with σ = ±. Proceeding as in Ref. [67], the gain and loss rates are n a σ n σĴ and n c σ (1 − ρ σ )J, respectively. For the former, n σ means that a node can choose to add a link to unequal fraction of partners [73], rather than 1/2 when m = 0. Now, there are four ways L σσ can arrive at a stationary state. Two correspond to extreme values of J, when one of the rates vanishes and L σσ reaches its own bound. Another is when both n a σ and 1 − ρ σ vanish (or are vanishingly small), an interesting possibility we will return to in subsection V.C. Here, we are most interested in the last scenario, when both rates reach stationary, generic values. Balancing these, we find for the steady state:Ĵn a σ n σ =Jn c σ (1 − ρ σ ), which is equivalent to where B ≡Ĵ/J = 1 + J 1 − J is a convenient way to express the bias due to homophily: B > 1 for J > 0. For CLs, the generalization of the gain/loss rates is slightly more complicated, yielding where we can read the contributions from both groups. Another equation comes from the constraint L +− = L −+ . In terms of the variables here, this equality (trivially satisfied in the symmetric case) reads In order to apply Eq. (18), we must first generalize the technique of Ref. [67] for the degree distributions p ± (k) when m = 0, from which to extract µ ± in terms of α ± and ρ ± . Thus, we devote the next subsections to studies of the DDs.
B. Systems with low asymmetry or −J 1 As seen in the simulation data, the DDs for asymmetric system with small m and −J 1 are qualitatively the same as in the case m = 0. Proceeding as in the symmetric case [67], we start from the balance equation for the addition/deletion of a link at a single node: for which we need expressions for the four R a,c σ which are the probabilities in a time unit at which a connection is added (R a σ ) and cut (R c σ ) in the community σ. Each R has contributions from both communities, and the probabilities for adding and cutting links are here associated with the symbols η and χ, respectively. We will further distinguish contributions due to the actions (η, χ) of the chosen node, or from the rest of the population (η,χ) [67]. In the context of our mean-field theory, the former pair simply reads: as the prefactors of the J's just refer to the chances our node adds/cuts a link to an agent in its community or otherwise. The latter is similar, except for the extra factors accounting for the fraction of adders/cutters in the σ group, yielding η σ = α σ n σĴ + α −σ n −σJ These enter into the R's as in the case m = 0 [67]: (22) Proceeding as in the symmetric case, solving the recursion relation (19) [67], we obtain again asymmetric Laplacian distributions (see Sec. S5 in the supplementary material and (13)): where, with (20) and (21), are the factors controlling the exponentials. Finally, by imposing the normalization conditions both p's are uniquely determined in terms of the unknowns α ± and ρ ± . Our goal is to find the averages µ σ of these distributions, which allow us to apply Eq. (18) and in turn complete our mean-field theory. From Eqs. (23) and (25), we have where we have neglected terms of order O(γ κ σ> , γ κ σ< ) since κ 1. Equation (26) gives us the expression of µ ± in terms of α ± and ρ ± , which can be determined by solving Eqs. (16), (17) and (18), and in turn allow us to obtain the predictions of this theory. Yet, since Eqs. (16) and (18) and (26) have been directly used to obtain the meanfield predictions of α ± and ρ ± , Λ and Π (see Sec. S6 in the supplementary material), µ ± and M ± = (µ ± /κ) − 1, and also the degree distribution p σ (k).
The blue and red lines of Fig. 2 show the predictions of α σ and ρ σ which are found in good accord with simulation data for all values of J, across both the ordinary and overwhelming phases. The agreement is excellent in the ordinary regime, and we note that our mean-field theory remarkably captures the "kinks" of α − and ρ − at the onset of the overwhelming transition. In the overwhelming phase, |m| = O(1) and J < J c < 0 (see Sec. IV.C), the degrees of the minority agents are no longer small compared to N , which violates a key assumption of our meanfield theory. This leads to systematic but rather modest deviations between the mean-field predictions and simulation data.
The comparison of simulation and theoretical results for Λ and Π in Fig. 8 shows a remarkable agreement for different values of m over the entire range of J, i.e., across both ordinary and overwhelming phases. In particular, the mean-field theory correctly captures the weak m-dependence of Π, and signals the transition from the ordinary to overwhelming regimes for both Λ and Π.
For µ ± and M ± , as shown in Fig. 3, mean-field predictions generally agree well with simulation results, with an agreement that improves as κ N , with κ 1. Remarkably, the mean-field results give sensible results for µ ± , and M ± also in the overwhelming regime, see Sec. V.C and Sec. S8 in the supplementary material.
Theoretical predictions of p σ (k) are used to obtain the red and blue lines in Fig. 6, which are generally in good agreement with simulation data, especially in the ordinary regime, i.e., |mJ| 1 or J > 0, see Fig. 6(a,b). As shown in Fig. 6(c), for |m| = O(1) and above a certain level of heterophily (J < 0 with |J| = O(1)), a Gaussianlike distribution for the cutters in the minority begins to develop, which is not captured by the above meanfield theory. In fact, deep in the overwhelming phase the degree distribution of the minority no longer falls off exponentially as predicted by (23), but is a broad Gaussian, see Fig. 6(f), characterized in Sec. V.C. We note that the degree distribution of the majority always falls off exponentially and, in both ordinary and overwhelming phases, is well described by (23), see blue lines in Fig. 6(a-f).

C. A refined mean-field theory for the degree distribution of the overwhelmed minority
Clearly, the overwhelmed states lie beyond the domain of validity of the above mean-field theory. This chiefly results from the DDs of the minority agents having morphed, see Fig. 6(e,f). To get a reasonable characterization of the quantities in this region, we have to study the degree distribution of the minority agents more carefully.
For this, we revise the above theory following the approach of Ref. [66], and write the balance equation obeyed by the minority DD with a full non-trivial kdependence of adding/cutting probabilities R a,c (k): Furthermore, guided by simulation data, we assume that deep in the overwhelming phase we have α + = 0 (see also at the end of this section). To determine R a (k), we recognize that there are just (N − − k) majority nodes which can add a link to our agent of degree k, each of which can be chosen with probability 1/N , and only a fraction α − of them would add. There is also the biasJ for adding CLs. Finally, the adder will choose our agent with probability 1/ N −k − 1 , wherek is the number of links it has, which is a fluctuating quantity. In the spirit of a mean-field approximation,k is replaced by µ a − (average degree of majority adders). Putting everything together, we have Similar arguments give the expression of R c + (k): With (28) and (29), the solution of the recursion (27) is a "shifted binomial" (see Sec. S7 in the supplementary material): where Since Q c > 0, this expression is well-defined as far as k = 0. Since α + ≈ 0 is our assumption in the overwhelming state we expect this theory to be fairly good deep in the overwhelming regime, but to fail near the transition line.
For the characterization of the DD by (30), we expect p + (k) to approach a Gaussian in the limit of large N, κ with generic values of m, J. Thus, we use the modek for µ + and the curvature of ln p + for V + , see also Sec. S7 in the supplementary material.
(33) To determine Q a and Q c , guided by simulation data showing that the DDs for the majority agents still follow the asymmetric Laplacian distribution (23), we assume that µ a − ≈ µ c − ≈ κ. Also, consistently with our previous assumption, in the overwhelming state, we set α + = 0 in (16) and (17), and obtain These together give us the approximation of Q a and Q c in the overwhelming state, and thus the degree distribution p + (k) of the minority agents (almost all "cutters") shown as red curves in Fig. 6(d-f). Since (30) with (34) and µ a,c − ≈ κ rely on the observed absence of adders and ILs in the minority community, it is a "semi-phenomenological" refined mean-field theory. A fully self-consistent refined mean-field theory is outlined in Sec. S9 of the supplementary material. Comparison with simulation results of Fig. 6 shows that this semi-phenomenological theory gives a good description of the minority DD not too deep in the overwhelming phase, see Fig. 6(d,e). Yet, deep in the overwhelming regime of very heterophilic systems (−J close to 1), there are quantitative deviations between the theoretical predictions and simulation data, see the red curve in Fig. 6(f). These are traced back to the use of (34), and a significantly better agreement is found when (30) is used with α − directly obtained from simulations (and µ a,c − ≈ κ), leading to the cyan curve in Fig. 6(f). The accuracy of the predictions of (30) with (34) improves as the system size is increased, i.e. the red and cyan curves of Fig. 6(f) will get closer for larger N and κ.
As a simple assessment of our refined mean-field theory, we compare its predictions with the simulation data in the case of N = 1000 and κ = 60.5, with m = J = −0.6. As shown in Fig. 6(f), p + (k) in this case study is clearly Gaussian-like, with measured µ + 146 and V + 217. These values are compared with the predictions of our theory based on Eq. (31), (30) and (34), yielding (µ + , V + ) (128, 203) from (32) and (33). These results are in reasonable but not perfect agreement with those of simulation. The data can also be compared with (32) and (33) when these are used with α − directly measured from simulations, yielding (µ + , V + ) (146, 216), which are the approximation of the mean and variance of the cyan curve and compare remarkably well with those obtained from simulations. This agreement gives us confidence that we have devised a suitable mean-field description of the DD of the minority agents.
We conclude that our results, illustrated by Figs. 2-8, show that the ordinary MF approximation gives a sound qualitative and quantitative characterization of all quantities in the ordinary phase, as well as of the global quantities in the overwhelming phase and DDs of the majority phase. Yet, the refined MF is necessary to describe the DD of the minority in the overwhelming phase, see also Sec. S8 and Fig. S2 in the supplementary material.

D. Transition line
In this section, we use the theoretical results of Secs. V.B and V.C to derive the mean-field prediction of the transition line (m, J c (m)) separating the ordinary and overwhelming phases (respectively at J > J c and J < J c , with m fixed), see Fig. 5. For the sake of concreteness, and without loss of generality, here we consider m < 0.
To find the point where the transition occurs, we start from the balance equation for a minority node of degree k = κ which, from (19), reads with R a + ( k ) =η + /N, R c + ( k ) = (χ + +χ + ) /N, (36) where we have used (22) with H(κ − κ ) = 0 and H( κ − κ) = 1. As discussed in Sec. IV.D, we consider that the transition between the ordinary and overwhelming phases occurs when p + ( k ) = p + ( k + 1), see Fig. 6(d). With (35) and (36), this readily gives Further, we assume that at the onset of the transition, the features of both phases hold: α + = 0, ρ + = 1 (see Fig. 2), and µ + ≈ µ − . With these assumptions and Eqs. (3), (6), (20), (21) and (37), we have where n − = (1−m)/2 and α − (J, m) can be approximated by Eq. (34). The unique physical root of (38) thus gives us the mean-field expression of J c (m), which explicitly reads (for m < 0): (39) This expression is plotted in Fig. 5. The predictions of (39) are found to generally agree well with simulation data, with an excellent agreement for m −0.4 and some noticeable deviations close to the symmetric case (|m| 1). These are due to the deterioration of the approximation of (α + , ρ + ) ≈ (0, 1) that we attribute chiefly to finite size effects, expected to be important close to the ordinary phase consisting of a finite fraction of adders and CLs (given by (11) in both communities). Naturally, this and the limited validity of the crude assumption µ + ≈ µ − affect the applicability of (39) [74].

VI. CONCLUSION AND OUTLOOK
We have investigated a dynamic, out-of-equilibrium, network of individuals that may hold one of two different "opinions" in a two-party society. In this work, the opinions of agents are held fixed while inter-party and cross-party links are endlessly created and deleted in order to satisfy a preferred degree. The evolving network has therefore a fluctuating number of links and is shaped by homophily and heterophily which model forms of social interactions by which agents tend to establish links with others having similar or dissimilar opinion, respectively. In our model, homophily/heterophily is modeled by an evolutionary process leading to the continuous "birth" and "death" of links within and between the communities. While the features of the system where the two opinion groups are of the same size (symmetric case) have been studied elsewhere [67], here we have focused on the generic case of communities of different sizes. We have thus investigated how the joint effect of community size asymmetry and homophily/heterophily influences the network structure in its steady state and leads to new phenomena.
The most striking feature of our model is the transition between distinct phases as the level of homophily/heterophily is varied. As main findings, we unveil the emergence of an "overwhelming phase" whose properties are analyzed in detail by a variety of analytical and computational methods presented in Sections IV and V.
When the level of heterophily is non-existing or modest, the system is in an "ordinary phase" similar to that characterizing the network with communities of equal size. Under intermediate to large heterophily, for sufficient asymmetry in the size of the communities, the agents of the majority group "overwhelm" those of the minority by creating a large number of cross-party links. We refer to this change of regime as the "overwhelming transition", and to the regime itself as the "overwhelming phase". In the overwhelming phase, the minority consists of agents having only cross-party links and large degrees following a broad distribution whose average can greatly exceed the preferred degree. By means of extensive Monte Carlo simulations and mean-field theories, we have determined the transition line separating the ordinary and overwhelming phases, and characterized in detail both regimes. In particular, we have studied the dependence on the level of homophily/heterophily and community size asymmetry of the number of cross-party links, fraction of agents with fewer links than the preferred degree, as well as the average degree in each community and the level of polarization in the network. In addition to these global quantities, we have also determined the total and joint degree distributions of both communities.
We have found that the ordinary phase is characterized by features similar to those of the symmetric case [67]. The analysis of these follows from a direct two-community generalization of the mean-field approach used in the absence of group size asymmetry. The excellent agreement between simulation and analytical results has allowed us to show that in the ordinary regime, the network is essentially homogeneous, with total degree distribution centred about the preferred degree and falling off exponentially (asymmetric Laplacian distribution), and with a broad distribution of cross-party links resulting in a "knife-edge" joint degree distribution.
Remarkably, the overwhelming phase displays a number of surprising features: generally, the agents of the minority, all have a number of edges exceeding greatly the preferred degrees, and all of these are cross-party links. This results in a degree distribution of the minority com-munity that follows a broad Gaussian-like distribution. To characterize the latter, we have devised a nontrivial generalization of the ordinary-phase mean-field analysis which is found to be in good agreement with simulation data. Interestingly, the majority community in the overwhelming regime has essentially the same properties as in the ordinary phase: it forms a homogeneous network whose degree distribution is centred about the preferred degree and that falls off exponentially. The transition from the ordinary to the overwhelming phase occurs at finite level of heterophily (when group sizes are asymmetric), and therefore differs from the fragmentation/fission, arising in other network models with homophily [27,30,32]. Such a transition, by which the network is split into disconnected communities, is also found in our model but only under extreme homophily.
It would be interesting to understand whether the existence of an overwhelming transition, the most distinctive features of our simple model, is robust against generalizations of our simple dynamic network model. As natural further avenues we could consider more than one preferred degree, or to allow agents to draw their preferred degree from a finite range. It would also be instructive to investigate other forms of update rules, e.g., like networks subject to heterophily and growing with preferential attachment [55]. An even more realistic, yet challenging, generalization would be to consider the co-evolutionary dynamics where network varies in response to changes of node states and the changes of those are coupled to updates of the network links. It would be quite relevant to investigate whether an overwhelming phase is a common feature of all these model extensions, and to what extent our analytical methods can be generalized to tackle the latter. This endeavor, while challenging and likely to unveil even richer and more complex phenomenology, would allow us to shed further light on the important problem of better understanding the general features of dynamic network shaped by social interactions.

ACKNOWLEDGMENTS
We are indebted and grateful to Andrew Mellor for substantial input and helpful discussions. The support of a joint PhD studentship of the Chinese Scholarship Council and University of Leeds to X.L. is gratefully acknowledged (Grant No. 201803170212). We are also grateful to the London Mathematical Society (Grant No. 41712) and Leeds School of Mathematics for their financial support, and R.K.P.Z. is thankful to the Leeds School of Mathematics for their hospitality at an early stage of this collaboration. This work was undertaken on ARC4, part of the High Performance Computing facilities at the University of Leeds, UK.
[2] S. E. Asch, Studies of independence and conformity: I. a minority of one against a unanimous majority, Psychol.
let us consider J = 1: since each node adds only ILs and cuts CLs, any initial network will quickly evolve into one with no CLs. Meanwhile, as soon as a node exceeds κ links, it becomes passive. However, as others in the community with fewer than κ links can choose to link to it, its degree can only increase * . While we always consider an initial network without any links, we can here consider the most atypical example consisting of an initial complete network, with κ ≪ N ± . Then, in every time step, an agent will cut a link -CL or IL -depending on J = ±1, until attaining a steady state consisting (regardless of m) of either two disjoint complete networks or a single bipartite graph as in Fig. S1(a,c).
(iii) Another singular limit of this model is extreme asymmetry: |m| → 1. With only one set of agents, the population is homogeneous and the update rules reduce to adding (cutting) links with probabilityĴ (J). Thus, balancing αĴ with (1 − α)J, we conclude α = (1 − J) /2. This result is consistent with those above: there are no adders in the final state when J = 1. The opposite limit, J = −1, is more interesting as the naive conclusion would be that there are no cutters (1 − α = 0) in the steady state. But this appears to contradict (ii) above. The resolution lies in the singular nature of the points J = −1, |m| = 1, so that the final states depend on which direction these points are approached. If J is taken to −1 first, then the final state will indeed have no adders. Such a singularity affects α (m, J) before these points are reached, namely, there is a non-uniform convergence towards α → 0 when J → −1. As shown in Sec. V, as the asymmetry becomes more pronounced (i.e. as |m| approaches 1), the fraction of adders α (m, J) in the the majority tends to (1 − J) /2 for larger portions of J ∈ [−1, 1], before dropping steeply towards α = 0 near J = −1.
These turning points are significant as they signal transitions to very different final states. These discussions can also shed light on the properties of the general system, e.g., the properties of α ± displayed in Fig. 2(a). First, we see that when J is not close to −1 and m = −0.6, the fraction of majority adders, α − (J), can be approximated by (1 − J)/2. Meanwhile, much of the minorities' response is governed by the action of the majority on the CLs. With J > 0, CLs are preferentially cut. As there are more majority cutters (α − < 1/2), L × is likely to suffer more losses, which can be compensated by having the minority agents add more links. Thus, we find more minority adders (α + > 1/2). Of course, when homophily is increased further, the level of polarization is higher. With the dominance by the majority lessened, the minority agents can revert to the behavior of a single population with J ≃ 1, i.e., α + → 0.

S2 Master equation and detailed balance
The complete description of our system is given by the adjacency matrix A with elements A ij being 1 or 0, if a link between nodes i and j is present or not. Clearly, A is symmetric and we define A ii ≡ 0. To be specific, we will order the nodes so that all those with opinion +1 comes first: σ i = 1 for i = 1, · · · , N + . It is then convenient to use the notation where the blocks represent ILs and CLs. Obviously, A −+ = A T +− is rectangular in general. Starting with some initial network A ini and generic J, the system's stochastic time evolution obeys a master equation for the distribution P (A, t) -an abbreviation of the conditional probability P (A, t|A ini , 0) -and arrive at a stationary P * (A) as t → ∞. We will show that detailed balance is violated by the dynamics of this model that is therefore genuinely out of thermal equilibrium. As a consequence, determining the full evolution of P or even an explicit form for P * are beyond reach. Thus, in the main text we focus our attention mainly on investigating the interesting steady state properties of our model.
The stochastic aspects of the system is embodied in P (A, t), the probability of having A at discrete times t = 0, 1, · · · . The evolution of P is governed by a Master equation: where, R [A ← A ′ ] stands for transition probabilities from A ′ to A. From one t to the next, an update consists of choosing a node at random, deciding if it adds or cuts a link (based on κ), choosing another node at random for this * Although the process appears Poisson-like, it is much more complex, as the number of adders not already connected to our node depends on k and dwindles with time. Even the case with the lowest κ (0.5) is non-trivial, though the N → ∞ limit may be analytically accessible (H. Hilhorst, private communication (2020)) action, and then applying the rules of homophily (controlled by J ∈ [−1, 1]). Given in words in Sec. II, these rules are presented concisely by and where accounts for the effects of homophily, and ensures that only a ij and a ji are changed. In Eq. (S2), H is the Heaviside step function, while the denominator accounts for the total number of choices node i has to act (i.e., add/cut the link to node j ̸ = i).
It is clear that, for J = 0, the opinions are irrelevant so that the system reduces to the case of "a homogeneous population" as in Ref. [64,76]. Since detailed balance was shown to be violated in that model, we can expect that to be generically true here. For example, let κ = 10.5 and nodes i, j, and ℓ with (k i , k j , k ℓ ) = (9,15,6). Now, consider the loop (9, 15, 6) → (10, 15, 7) → (11, 16, 7) → (10, 16, 6) → (9, 15, 6) with non-zero rates at every step (by i adding link to k, i adding to j, then i cutting to k, and then j cutting to i). But the reverse has R [(9, 15, 6) ← (10, 15, 7)] = 0, since k i and k ℓ are both < κ. Thus, the product of R's for the forward loop cannot be the same as for the reverse (regardless of the opinions of the nodes) and, applying the Kolmogorov criterion [77], detailed balance is violated. Since κ is picked so that a chosen node always adds or cuts a link, it is easy to verify that the system is ergodic (any configuration A can be reached from any other). Thus, the Frobenius theorem guarantees P settles in a stationary state P * (A), which is the right eigenvector of R with unit eigenvalue. Since detailed balance is violated, P * (A) should be regarded as a non-equilibrium steady state. In particular, unlike typical equilibrium states associated with the Boltzmann-Gibbs ensemble, there are non-trivial probability currents in this steady state [78].

S3 Quantities of interest: table & exact expressions for quantities of interest
We also provide explicit expressions for various quantities defined in Sec. II, summarized in Table 1. To be specific, we choose σ i = 1 for i = 1, · · · , N + and −1 for i = N + + 1, · · · , N . Since σ's are not variables, there is no need to express them explicitly in A.
The average of an observable O in the steady state is Some relevant observables and averages are listed below.

IV
Total number of links: Fraction of adders, α ≡ n a = N a /N p σ (k) Degree distribution of the community of opinion σ P σ (ℓ + , ℓ − ) Joint degree distribution of the community of opinion σ (l ± ) σ The average fraction of neighbours of opinion ± for an agent of opinion σ, see Eq.
-Number of ILs and CLs: .
-Number of adders and cutters holding opinion σ: -Average degree partitions: -Degree distribution of nodes with opinion σ: V so that the moments are with µ σ ≡ ⟨k⟩ σ = l + σ + l − σ .

S4 Discrete derivatives
Since we have taken measurements at unevenly spaced parameter values, we provide some details on how we arrive at the plots of first and second derivatives of α and ρ. Given f (x i ), where the points x i (i = 1, 2, · · · ) are not necessarily evenly spaced, we define discrete derivatives by and If we want to be consistent, then we should assign this ∆ 2 f value tõ which of course reduces to x i if the spacings are equal. It is interesting to note that the difference,x i − x i , looks like the ordinary curvature (x i−1 − 2x i + x i+1 ).

S5 Average and variance of an asymmetric Laplacian distribution (ALD)
The degree distributions considered in this paper are well approximated by ALDs, see e.g. Eq. (23) in Sec. V.B: for ℓ > 0 or ≤ 0 respectively. Along with the γ's (both < 1), the discontinuity (given by the ratio Γ) is needed to specify the distribution uniquely. Normalization implies Thus, the fraction with ℓ ≤ 0 is VI and, of course, the fraction with ℓ > 0 is The average can be computed by "pieces", starting with the result In particular, S6 Computing Λ and Π from ρ ± and m The quantity Λ is commonly used and sometimes referred to as "average edge homogeneity" [47,72]. From its definition, in the realm of the mean-field approximation, we have where we have used 1/ρ + + 1/ρ − = 2/ρ, based on Eq. (6). A salient feature of Π is its simple mean-field expression in terms of the fraction of connections within each of the three sectors: f ± ≡ 2L ±± /N 2 ± and f × ≡ L × /N + N − . Then, Eq. (3) implies l σ σ = f σ N σ and l σ −σ = f × N σ , so that by plugging these into Eq. (7), i.e., our definition of Π is equivalent to, Again, it clearly reduces to ±1 for J = ±1, since f ± or f × vanishes. For J = 0, every nodes has exactly the same average degree, κ, so that the same fraction, f = κ/ (N − 1), prevails in each sector and Π = 0. For calculating Π by Eq. (S7), we recognize that we only need the two ratios f ± /f × that will be again treated in the spirit of the mean-field approximation, see below. Using Eq. (6), we find and so f σ f × = 1 − ρ σ ρ σ 1 − σm 1 + σm .
This leads immediately to the numerator of Π being  (32) with α − determined by (34) and the simulation data, respectively, and µ a,c − = κ; the dotted line is from the predictions of (26).
that p − (k), the degree distribution of the majority nodes, and p + (k < κ), the degree distribution of the adders of the minority nodes, are again asymmetric Laplacian distributions (see Fig. 6). Hence, the mean-field theory in Sec. V.B, yielding exponential fall off of the degree distribution of both communities according to (30), still provides a good approximation for these nodes, as indicated by the red and blue straight lines in Fig. 6(d-f).