Energy landscapes of some matching-problem ensembles

The maximum-weight matching problem and the behavior of its energy landscape is numerically investigated. We apply a perturbation method adapted from the analysis of spin glasses. This method provides insight into the complexity of the energy landscape of different ensembles. Erdős–Rényi graphs and ring graphs with randomly added edges are considered, and two types of distributions for the random edge weights are used. Fast and scalable algorithms exist for maximum weight matching, allowing us to study large graphs with more than 105 nodes. Our results show that the structure of the energy landscape for standard ensembles of matching is simple, comparable to the energy landscape of a ferromagnet. Nonetheless, for some of the ensembles presented here, our results allow for the presence of complex energy landscapes in the spirit of a replica-symmetry breaking scenario.

These mentioned problems are members of the class of non-deterministic-polynomial (NP) problems [29].All known exact algorithms for treating these problems exhibit worst-case running times that grow exponentially with the problem size.This means, one is rather limited in the numerical investigations of the equilibrium behavior of these problems, because only comparable small systems can be studied.This is a pity, because the behavior of these models is very interesting.
On the other hand, in the class of polynomial (P) problems, the worst case run time growths only polynomially.Hence, much larger systems can be studied, leading to numerically more accurate results and better statistics.Unfortunately, problems from P typically feature a simple energy landscape [30][31][32][33][34][35], leading to a rather trivial behavior.This seems to indicate that complex behavior and convenient numerical access to the equilibrium behavior do not come together.But it should be noted that most studies so far where performed for model ensembles, where the randomness is introduced in a simple and uncorrelated way.
Interestingly, recently a complex behavior was found for ensembles which utilize correlations for the model of directed polymers in random media.For this model a polynomial exact sampling algorithm exists and thus large systems of up to 10 9 lattice sites could be studied and strong indications for replica-symmetry breaking have been observed [36].Furthermore, for the polynomially tractable problem of increasing subsequences, also called Ulam's problem, numerical evidence for RSB was found [37].Finally, it should be noted that also the XOR-SAT problem, which can be solved efficiently by Gaussian elimination [38,39], exhibits one-step RSB, but this becomes irrelevant in equilibrium in the thermodynamic limit [40].
Thus, by the choice of a suitable ensemble, it seems to be possible to study some systems with nontrivial behavior in a numerically efficient way.In the present work, in the spirit of [36,37], we study another problem from the class P to seek for indications for complex behavior.In particular, we consider the maximum-weight graph-matching problem.This model was one of the first optimization problems to be studied from the viewpoint of statistical mechanics [41].Analyses were extended to arbitrary graphs [30], finite-size effects [42] and Euclidean edge weights [43][44][45].
The interests of graph-matching problems in physics also comes from the fact that other problems can be solved by a mapping to suitable matching problems.This has been done for, e.g., 2D spin glasses [33], dimer covering [46], negative-weight percolation [47] and controllability of dynamic networks [48,49].
The remaining of this paper is organized as follows: In section 2, graph matching and random graph ensembles used in this work are introduced.These ensembles are Erdős-Rényi graphs and ring graphs with additional random edges, both with Gaussian distributed or constant edge weights with uniform noise.Section 3 presents the details of the perturbation technique.It it explained how one can investigate whether a complex energy landscape is present by observing the differences between matchings obtained from the original and slightly perturbed graphs, respectively.We also address the question whether the perturbation approach can be used to sample matching statistically correctly.The results of the analyses are given in section 4, where we start with the results obtained from the perturbation technique.We summarize and draw final conclusions in section 5.

Matching
An undirected graph G = (V, E) is given by a set of nodes i ∈ V and a set of edges e ∈ E ⊆ V (2) , where n = |V | and m = |E|.Two nodes i and j are called adjacent if e = {i, j} is a member of E. This edge e called incident to i and j.A matching M is a subset of E in which no two edges are incident to the same node.For each edge in M , its adjacent nodes are called matched.Nodes which are not matched are called free.A maximum cardinality matching is a matching with maximizes the number of matched nodes |M |.On a graph G with edge weights w(e), the weight of the matching is W = e∈M w(e).A maximum-weight matching is a matching M that maximizes the weight W over all possible matchings on G.In section 3 we state the algorithm we have used to obtain the maximum-weight matching of a given graph.

Random graphs
We consider two types of random graph ensembles.Details on the chosen values of the model parameters are given in section 4. The first ensemble consists of Erdős-Rényi random graphs [50].To generate such graphs, one starts with on empty graph of n nodes V = {1, 2, • • • , n}.Then, for each possible pair of distinct nodes, an edge is created independently with probability p = c/n, where c is the connectivity.The resulting degree distribution is Poissonian.
The second ensemble consist of ring graphs with additional random edges.First, a regular ring graph with mean degree of two is created.This means, the edge E set initially contains the edges {i, i + 1} for i = 1, . . ., n − 1 plus the edge {n, 0}.Next, two nodes i, j are chosen randomly with uniform probability and the edge {i, j} is created if i = j and {i, j} does not yet exist in E. This process is repeated until m + edges are added to the graph this way.The number of edges in G is then given by m = n+m + .Below we will consider the cases Two types of edge weights w(e) are investigated.First, we consider w(e) ∼ N (1, 0.01), i.e., Gaussian distributed random weights with a mean of 1 and a variance of 0.01.Second, we study constant weights with additional uniform noise, w(e) = 1 + ε.To choose the strength of the additional noise ε, we mention that with w(e) = 1 for all edges, a maximum-weight matching is equivalent to a maximum cardinality matching, since W = |M | holds.The purpose of the additive noise ε is to make the optimal solution unique.But beside that, it should not influence the size of the optimum matching even for large values of n.To ensures that, ε is drawn from a uniform distribution where the values scale with n − 1 2 , i.e., ε ∼ U (−0.01 n − 1 2 , 0.01 n − 1 2 ).

Methods
To find a maximum weight matching we use Edmond's Blossom Shrinking algorithm [51], implemented in the LEMON-library [52].The algorithm has a worstcase running time O(n 2 m), i.e., polynomial, such that correspondingly large graphs can be treated.

Perturbation technique
When studying the complexity of energy landscapes, a common technique is to apply weak perturbations on the system.To our knowledge, this technique was first used to study spin glasses (see, e.g., [53]).Since then, it was adapted for, e.g., random-field Ising models [54,55] and the traveling salesperson problem [23].
The basic idea is to first calculate a ground state and the perturb the system slightly such that by a recomputation a new ground state is obtained, which is an excited state with respect to the original system.For the matching problem, we use edge flips as a perturbation technique.First, a maximum-weight matching M 0 with total weight W 0 is calculated.Then, one randomly chosen edge e will be "flipped".If e is in M 0 , it is not allowed to be in a matching M 1 of the perturbed graph.On the other hand, if e is not in M 0 , it is enforced to be in M 1 .In practice, this can be achieved by setting the weight of e to a very small value, e.g., −2W 0 , or a very large value, e.g., 2W 0 , respectively.After M 1 is found, the weight of the edge will be reset to its original value such that the weight W 1 of M 1 is calculated with the original weights.In the following, M 0 will always denote an optimal matching with respect to the original edge weights, while M i with i > 0 denote independent matchings resulting from perturbations by a single edge flip with respect to M 0 .W 0 and W i denote the corresponding weights of the matchings.
A perturbation is weak if the relative difference of W 0 and W 1 behaves as If (1) holds, M 1 is quasi optimal.Note that this means that when considering the system in the canonical ensemble, i.e., according to the Boltzmann distribution, the matchings M 0 and M 1 will contribute with the same weight in the thermodynamic limit.To compare two matchings M i and M j , we apply a similarity measure, also called overlap.Since matchings are sets, it is convenient to use the Jaccard index [56], given by The distance between two matchings is defined as [57].Overlap and distance between optimal matching M 0 and matching M i obtained from a perturbation are denoted as q 0 and d 0 , respectively, where the second index i is omitted.A necessary condition for a complex energy landscape is that there exist quasi optimal matchings with d 0 > 0 in the thermodynamic limit of infinite large graphs [58].Hence, we measured (W 0 − W i )/W 0 and d 0 for different values of n, averaged over random graph realizations.If the matchings for the perturbed graphs are quasi optimal and d 0 converges towards zero for increasing values of n, a simple energy landscape can be assumed.But if d 0 remains finite, a complex energy landscape can not be excluded.

Sampling bias test
To further investigate whether the matching problem on a given graph ensemble indeed exhibits a complex energy landscape, it would be desirable to efficiently sample matchings.For correct statistics, matchings with equal total weight W should be sampled with equal probability.
Hence, a sampling approach needs to know the number of distinct matchings for the full problem and for encountered subproblems.Unfortunately, the matching problem is in the class of #P-complete problems [59,60].
This means, with all known algorithms, it requires to enumerate all matchings, which are typically exponentially many.Hence, such an approach is computationally demanding.
We can, in principle, use matchings obtained by the perturbation method as samples.Unfortunately, edge flips create an unknown bias to the sampling because various configurations are prohibited, which depends on the flipped edge.It could nonetheless be the case that bias errors average out, especially for large graphs.To assess this, we have performed the following investigation.
For a given graph, here we study Erdős-Rényi graphs, a set of edge weights of constant weights with noise w = 1 + ε is drawn, see section 2.2.Then, an optimal matching according to this set of weights is calculated.Next, multiple times a perturbed graph is obtained, using random edge flips, and again matchings are calculated.This process is then repeated for multiple realizations of the initial noise.
The obtained matchings are collected in a sample histogram.Here, we include only matchings with the same cardinality as the optimal matching.The resulting histogram is then used to compare the relative frequencies Q(M ) to the case where all matchings would have been drawn with equal probability.That is P (M ) = 1/n M , where n M is the number distinct matchings.
As the number of matchings grows exponentially with the number of edges, this procedure is only appropriate for small graphs.
To compare P and Q quantitatively, two distance measures are used.The first one is the Kullback-Leibler divergence D KL defined as The second measure is the mean relative error MRE given by which is the averaged relative error per matching.
If P and Q become more equal for larger graphs, the values of D KL and MRE will become smaller.In particular, if the sampling approaches the uniform one, the distance measures should converge to zero.If this was the case, the overlap distributions P (q) could be investigated in a meaningful way.In case the distribution of overlaps remains broad in the limit of large graphs, this would indicate that a complex behavior is significant in equilibrium.On the other hand, if the values of D KL and MRE increase with larger graphs, the sampling bias errors can not be neglected.In this case, a limiting nonzero value of the distance d 0 would only indicate that significantly different matchings exists, but one would not know by using the present approach whether their weights are strong enough to influence the equilibrium behavior.

Results
The following results are obtained by performing simulations [61] for the two graph ensembles, various values of parameters and graph sizes in the range of n = 7 to 131 072.If not otherwise specified, all results are averaged over 100 different random graph realizations and 100 different edge-flip perturbations and resulting matchings for each graph realization.Partially, the GNU Parallel tool was used to distribute the simulations over several CPUs [62].All data fits were performed as recommended in Ref. [63].

Perturbation
We first verify that the matchings of the perturbed graphs are quasi optimal.Our results show that for both types of random graphs, Erdős-Rényi and ring graphs, as well as for both types of edge weights, the matchings for the perturbed graphs are quasi optimal as (1) holds.In figure 1, results are shown for Erdős-Rényi graphs with connectivities c ∈ {0.5, 1, 2, 4}.Included are results from fits using In the other studied cases the results look similar, hence they are not shown here.Note that for ring graphs, the data deviates from the fit for small sizes n ≤ 2 8 .Since ( 1) is required to hold only asymptotically, the condition for quasi optimality is fulfilled for this case as well.

Erdös-Rényi
Figure 1.Verification that the matchings on perturbed graphs are quasi optimal, as (W 0 − W )/W 0 as a function of n behaves as O(1/n).Shown here are results for Erdős-Rényi graphs with Gaussian distributed random weights and connectivities c ∈ {0.5, 1, 2, 4}.The straight lines represent fits in the form of (6).
We next look at the behavior of d 0 and start with the case of Gaussian distributed weights.The results for Erdős-Rényi graphs are shown in figure 2. One observes that the distance d 0 approaches zero for increasing n.To confirm this behavior systematically, a power law fit with offset d ∞ is used, resulting in good fits.The resulting fit parameters for all studied cases are reported in table 1.
For Gaussian distributed weights, the value of d ∞ is set to a fixed value of zero in all cases to obtain reasonable results.Fits with a freely estimated d ∞ result in negative values for this parameter, i.e., nonphysical results.The observed behavior shows that the matchings for the perturbed graphs share most of their edges with the matching for the original graphs, increasingly with growing system size.This can be explained by the application of a non-uniform weight

Erdös-Rényi
Figure 2. Averaged distance d 0 between matchings on perturbed graphs and optimal matchings as a function of n for Erdős-Rényi graphs with Gaussian distributed random weights and connectivities c ∈ {0.5, 1, 2, 4}.The straight lines represent power low fits in the form of (7).distribution.To maximize W 0 , the matching algorithm focuses on few edges with large weights.Now for a matching with perturbed weights, if one edge is flipped, a large total weight is still achieved by using most of the same edges with large weights as in M 0 .Consequently, only a few edges change compared to M 0 to compensate the edge flip.This results in a small difference between optimal matching and matching of the perturbed graph that vanishes in the limit of infinity large graphs.
Overall, the results for Gaussian weights indicate a simple structure of the energy landscape.The results for ring graphs are similar and therefore not shown here.As constant values of m + result in d ∞ = 0, we omitted to investigate values of m + ∈ O(log n) or m + ∈ O(n) for Gaussian distributed weights.
We next look at constant edge weights with additional noise.Since here all edge weights are more similar to each other, the existence of very different matchings with very similar weight, i.e., a more complex energy landscape, can be anticipated.For Erdős-Rényi graphs the results for d 0 are similar to the above discussed case with Gaussian distributed weights, and therefore no shown here.Again, fixed values of d ∞ = 0 are needed to obtain physically meaningful results for the fits according to (7).Thus, also for this case no sign of a complex energy landscape can be found.
The argumentation for Gaussian weights no longer holds here since each edge is almost equally important for the total weight.
But here, the topological constrains of Erdős-Rényi graphs explain the behavior: Since loops are of order log(n) for Erdős-Rényi graphs, local changes of the matching do not spread far, resulting only in small overall differences between the Table 1.Values of the fit parameters for the fits in the form of (7) to determine d∞.Fixed values are marked with *. Results are given for various values of parameters (first column) and edge weights (second column) for Erdős-Rényi graphs and ring graphs with even or odd number of edges.A value of d∞ = 0 corresponds to a simple energy landscape.But for d∞ > 0, a complex behavior can not be excluded.Values for β 1 and β 2 are given for completeness.
On the other hand, for ring graphs the behavior is different, as it can also be seen in figure 3. Here, a value of d ∞ > 0 was needed for a good fit in all studied cases.
To understand this behavior, consider ring graphs with an even number n of nodes and no random edges, i.e., m + = 0. Since matchings along paths appear as alternations of matched and free edges, the simple ring graph structure enforces that all matchings M i with > 0 are given by E \ M 0 , i.e., all matched edges are Figure 3. Averaged distance d 0 between matchings on perturbed graphs and optimal matchings as a function of n for ring graphs.(a) shows results for constant edge weights with additional noise and values of m + ∈ {0, 1, 10} and (b) shows results for m + = l + ln(n) with l + ∈ {2, 4}.Blank symbols indicate even number of nodes n, while filled symbols are used for odd values of n.The straight or dashed lines represent power low fits with offset d∞ in the form of (7), respectively.
If n is odd, one node remains free and there are n maximum cardinality matchings.M 0 is given by the maximum cardinality matchings with the largest total weight.Using E \ M 0 for a perturbed graph would results in lower cardinality of the matchings and hence lower total weight.However, n − 1 other matchings remain to compensate for the edge flip, each of which leaves another node free.Thus, one average about half of the edges change in the optimum matching when the graph is perturbed.This can be compared to a simple spin system: for a ferromagnet when anti-periodic boundary conditions are introduced, two domain walls will be created, one where the new boundary conditions are located, the other one anywhere in the system.This also results in many different configurations, but the system still actually does not behave in a complex way.In conclusion, that d ∞ stays finite for m + = 0 for the simple ring graph is not resulting from a complex energy landscape, but rather from the simple structure of the graphs.
When adding random edges, the situation becomes more interesting.The perturbation leads for the cases of m + = const and m + ∈ O(ln n) to various matchings and more complex overlaps between the matchings become possible.The finite value of d ∞ can no longer be explained by the graph structure alone.Hence, we observe quasi optimal solutions that differ from the optimal solution by an extensive number of edges.This is a necessary condition for complexity [58].Thus a complex energy landscape may exist for this ensemble.
We also performed simulations with ring graphs where the number of random edges m + grows linearly with n.Here, the ring graphs become increasingly equal to Erdős-Rényi graphs for large values of n and the ring structure becomes less important.Hence, similar findings as for Erdős-Rényi graphs were obtained, i.e., we observed d 0 → 0 for n → ∞.Therefore, we do not report further results for this case.

Sampling
Our previous results show that there exists ensembles for the matching problem, where optimum matchings exists, which differ by O(n) variables but almost have the same energy, i.e., are quasi optimal.This is a necessary condition for the existence of a complex energy landscape.As mentioned above, it would be desirable to sample these matchings and compare them to each other, to understand the energy landscape better, in particular in the thermodynamic limit.For this purpose sampling with equal probability, i.e., in an unbiased way, is needed.
For the sampling bias test, Erdős-Rényi graphs with 1 + weights, c = 2 and n between 4 and 32 are used.Note that fully enumerating all solutions and sampling with good statistics is only feasible for such small sizes.For each graph 1000 different sets of edge weights are used and 1000 matchings for the perturbed graphs are calculated for each of these weight sets, resulting in 10 6 matchings for each graph.
The frequencies Q(M i ) of occurrences of these matchings are measured and compared to the required uniform distribution as described in section 3. error bars increase with larger values of n.This is the case, because the number of possible matchings grow with the number n of nodes resulting in an increasing variance.
This result shows that the sampling error does not averages out.Hence, it is not meaningful to investigate sampled overlap distributions P (q) with samples directly drawn from the perturbation process.

Conclusion
We have studied the behavior of the energy landscape of matchings by using "edge flips" to perturb the optimal solution of the original graphs.Our analyses have shown that for suitable ensembles there exist quasi optimal states with differ from the ground state by an extensive number of edges.Hence, a complex energy landscape can not be excluded here.These ensembles include ring graphs with a constant or logarithmic growing number of additional edges and constant edge weights with uniform noise.In these cases, the extrapolated relative distance d ∞ between the matchings stays finite.For the other observed ensembles, ring graphs with linear growing number of extra edges and for Erdős-Rényi graphs, the results show no evidence for a complex energy landscape.
It was shown that samples taken directly from the perturbation method are biased.This hindered further more detailed investigations, like obtaining the distribution of overlaps between many optimum matchings.Hence, it is unknown if the quasi optimal states are relevant in the thermodynamic limit.If this was not the case, the behavior of the energy landscape observed here resembles "weakly broken replica symmetry" [64].It should be noted that, in general, approaches based on perturbing given optimum solutions are only able to show whether there exist other solutions which differ by O(n) variables, but not whether they are relevant in the thermodynamic limit.Still, if these very different other quasi-optimal solutions do not exist, it is clear that the energy landscape is simple.
To set up further studies, it should be noted that for ring graphs with a constant or logarithmically growing number m + (n) of additional edges, a finite value of d ∞ > 0 was observed.But if m + (n) grows linearly with n, we found d ∞ = 0. Consequently, there exist a transition between choices for m + (n) where d ∞ remains finite and where it vanishes.To analyses this further, the number of extra edges could be modeled by a power law m + ∈ O(n λ ).Then, it can be evaluated if a critical value λ c for the exponent exists, where a sudden transition between d ∞ > 0 and d ∞ = 0 takes place.
Also, it would be interesting to find a method to efficiently draw unbiased samples.One could use sampling methods which are numerically more demanding, like Monte Carlo Markov-chain simulations [65] or exhaustive enumerations.With present algorithms this would restrict the accessible system sizes considerable.On the other hand, it could be possible to correct the sampling bias afterwards using the ballistic-search approach [66], which has been applied successfully to correct the sampling bias observed for evolutionary algorithms when applied to spin glasses [67].This is algorithmically quite demanding, but certainly feasible in future projects.By applying such approaches, the sampled overlap distributions P (q) can then be investigated to get more inside into the energy landscape of the ensembles proposed here.In this way, it would also possible to check whether the quasi-optimal very different solutions are relevant in the thermodynamic limit.
Beside matching and directed polymers, other P problems may also show indications of a complex energy landscape for suitable ensembles.The methods described in this or other works [36][37][38][39] can be adjusted to these problems to studied them in a similar fashion.

Figure 4 .
Figure 4. Result of the sampling bias Averaged Kullback-Leibler divergence D KL G and mean relative error MRE G both systematically increase with larger values of n.The results are obtained from Erdős-Rényi graphs with a connectivity of c = 2 and constant edge weights with additional noise.