Decentralized Gauss-Newton method for nonlinear least squares on wide area network

This paper presents a decentralized Gauss-Newton (GN) method for nonlinear least squares (NLLS) on wide area networks (WANs). In a multi-agent system, a centralized GN method for NLLS requires the global GN Hessian matrix to be available at a central computing unit, which may incur large communication overhead. In the proposed decentralized alternative, each agent needs only its local GN Hessian matrix to update its iterates in cooperation with its neighbors. The detailed formulation of decentralized NLLS on WANs is given, and the iteration at each agent is defined. The convergence property of the decentralized approach is analyzed, and numerical results validate the effectiveness of the proposed algorithm.


Introduction
The importance of nonlinear least squares (NLLS) in applications such as state estimation in power systems [1], signal detection in wireless networks [2], and target tracking in mobile networks [3] has been appreciated for decades. The Gauss-Newton (GN) method, which can be seen as a modification of Newton's method, is widely used to solve NLLS problems [4]. The GN method finds the minimizer of the NLLS problem in an iterative fashion, and obtains a solution with provable local optimality and convergence rate. In this work, a decentralized GN method for NLLS on wide area networks (WANs) is presented. In particular, only the local GN Hessian matrix is used at each agent, and limited communication is performed between neighboring agents. Decentralized optimization enjoys the advantages of scalability to network size, robustness to dynamic topologies, and privacy preservation in data-sensitive applications [5][6][7][8]. A detailed formulation of the decentralized optimization problem for NLLS on WANs is provided, and the update rule at each agent is explicitly given. Finally, we investigate the convergence property of the proposed algorithm, which shows that the convergence rate is related to the number of agents as well as the maximum node degree in the network. Numerical tests validate the performance of the proposed algorithm.
The contributions of this work are threefold. First, we do not assume any specific structure for the global Hessian matrix, and propose a decentralized GN method for NLLS that uses only the local Hessian matrix at each agent. In contrast, the localization application in [3] relies on a block-wise Jacobian matrix that is convenient to decompose, and still requires the global Hessian through network-wide consensus; [1] proposes a generalized gossip-based GN method, which likewise requires the global Hessian through gossip exchange. Second, we prove the local superlinear convergence of the proposed algorithm. Finally, we validate the proposed method through numerical simulations.

Centralized Nonlinear Least Squares
Consider an unknown variable x̄ ∈ R^n in a network, and m observations obtained through a vector-valued function h(x) = (h_1(x), . . . , h_m(x)) : R^n → R^m. Each entry of h(x) is a real-valued function and not necessarily convex. Let z ∈ R^m denote the observations, z = h(x̄) + e, where e stands for the measurement errors, which are assumed to have zero mean and known covariance R ∈ R^{m×m}. The unknown variable x̄ ∈ R^n can be estimated by the NLLS problem

min_x (z − h(x))^T R^{−1} (z − h(x)).     (1)

The GN method can be adopted to solve (1) given that all the observations and functions are available at a central computing node. Specifically, define r(x) = R^{−1/2}(z − h(x)) and its Jacobian J(x) = ∂r(x)/∂x. Let F(x) = ||r(x)||^2; then problem (1) can be solved iteratively as

x^{k+1} = x^k + α^k d^k,     (2)

where the descent direction d^k at each iteration can be obtained by solving

J(x^k)^T J(x^k) d^k = −J(x^k)^T r(x^k).

The GN method solves problem (1) at a superlinear convergence rate with order at least two under Assumption 1. Note that the majority of NLLS problems are non-convex, and in this paper we only consider the local convergence property of the algorithm. We assume Assumption 1 holds throughout the paper.
Assumption 1. Consider a function F(x), and suppose the following hold.
(i) The function F(x) is continuous, differentiable, and bounded below.
(ii) There exists a vector x̄* at which the greatest lower bound is achieved.
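To make the centralized iteration concrete, the following sketch implements the GN recursion (2) in Python. The exponential observation model, the unit step size α^k = 1, and the stopping tolerance are illustrative assumptions and not part of the paper.

```python
import numpy as np

def gauss_newton(h, J_h, z, R, x0, max_iter=50, tol=1e-12):
    """Solve min_x ||R^{-1/2}(z - h(x))||^2 with the GN recursion (2)."""
    L = np.linalg.cholesky(R)                 # R = L L^T, so r(x) = L^{-1}(z - h(x))
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = np.linalg.solve(L, z - h(x))      # residual r(x)
        J = -np.linalg.solve(L, J_h(x))       # Jacobian of r(x)
        d = np.linalg.solve(J.T @ J, -J.T @ r)  # GN normal equations
        x = x + d                             # unit step, alpha^k = 1
        if np.linalg.norm(d) < tol:
            break
    return x

# illustrative model: h(x) = x_0 * exp(x_1 * t), noiseless data
t = np.linspace(0.0, 1.0, 20)
h = lambda x: x[0] * np.exp(x[1] * t)
J_h = lambda x: np.column_stack([np.exp(x[1] * t), x[0] * t * np.exp(x[1] * t)])
z = h(np.array([2.0, -1.5]))
x_hat = gauss_newton(h, J_h, z, 0.01 * np.eye(t.size), x0=[1.0, -1.0])
```

Since this is a zero-residual problem, the iterates converge locally at the quadratic rate discussed above.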

Decentralized Nonlinear Least Squares
In a multi-agent system consisting of N networked agents, each agent is engaged in its own monitoring and control task in the network, and at the same time cooperates with the other agents to estimate the global system states x̄. Suppose these N agents are loosely coupled: there is very little, if any, central coordination or control among the agents, and each agent is able to exchange information with its neighbors. The system states x̄ can be obtained by solving the following optimization problem:

min_x Σ_{i=1}^N (z_i − h_i(x_i))^T R_i^{−1} (z_i − h_i(x_i)) subject to x_1 = x_2 = · · · = x_N,     (3)

where z_i is the local observation, a subvector of z, i.e., z = (z_1; . . . ; z_N); h_i is the local observation function, a subvector of h, i.e., h = (h_1; . . . ; h_N); R_i is the covariance matrix of the local noise vector e_i; x_i ∈ R^n is the local duplicate of x̄; and x = (x_1; . . . ; x_N) ∈ R^{Nn}.

Proposed Algorithms
For concreteness, the network model of the agents is first described. Specifically, consider a graph G = (V, E), where V represents the set of agents and E represents the set of communication links between pairs of agents. An arc e is associated with an ordered pair (i, j), written e ∼ (i, j), which means information is transmitted from agent i to agent j. Assume the graph formed by the agents is connected. By introducing an auxiliary variable w_ij ∈ R^n associated with each arc e ∼ (i, j) ∈ E, problem (3) can be reformulated as

min_{x,w} Σ_{i=1}^N (z_i − h_i(x_i))^T R_i^{−1} (z_i − h_i(x_i))     (4)

subject to

x_i = w_ij, x_j = w_ij, ∀ e ∼ (i, j) ∈ E,     (5)

where w_ij is used to enforce the equality of the variables x_i and x_j of agents i and j connected by arc (i, j). We use compact notation in the following for simplicity of the discussion. Concatenating the w_ij in a vector w, problem (4) can be reformulated as

min_{x,w} F(x) subject to A_s x = w, A_d x = w,     (6)

where A_s and A_d are the extended arc source matrix and extended arc destination matrix of the network graph G, respectively. Stacking A_s and A_d gives A = [A_s ; A_d] ∈ R^{2Mn×Nn}, and the optimization problem (6) reduces to

min_{x,w} F(x) subject to A x + B w = 0,     (7)

where B = [−I_{Mn} ; −I_{Mn}] ∈ R^{2Mn×Mn}. The GN method is utilized to solve the optimization problem (7), where the updates of x are implemented in a decentralized fashion. The local update rule at each agent is given in the following proposition.
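The extended arc matrices admit a direct construction via Kronecker products. The sketch below (the helper name `arc_matrices` is ours) builds A and B for a small three-agent ring and checks that the constraint A x + B w = 0 of (7) holds at a consensus point.

```python
import numpy as np

def arc_matrices(arcs, N, n):
    """Extended arc source/destination matrices for a graph with N agents
    and state dimension n. arcs is a list of ordered pairs (i, j), M = len(arcs)."""
    M = len(arcs)
    S = np.zeros((M, N))
    D = np.zeros((M, N))
    for e, (i, j) in enumerate(arcs):
        S[e, i] = 1.0   # arc e leaves agent i
        D[e, j] = 1.0   # arc e enters agent j
    A_s = np.kron(S, np.eye(n))          # (M n) x (N n)
    A_d = np.kron(D, np.eye(n))          # (M n) x (N n)
    A = np.vstack([A_s, A_d])            # (2 M n) x (N n)
    B = np.vstack([-np.eye(M * n), -np.eye(M * n)])
    return A, B

# three-agent directed ring, n = 2
arcs = [(0, 1), (1, 2), (2, 0)]
A, B = arc_matrices(arcs, N=3, n=2)
v = np.array([1.0, -2.0])
x = np.tile(v, 3)        # all local duplicates equal
w = np.tile(v, 3)        # w_ij = v on every arc
residual = A @ x + B @ w
```

At a consensus point every constraint in (5) is satisfied, so the stacked residual is exactly zero.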

Proposition 1.
Consider iterates x^k and w^k with the initialization E_u x^0 = 2w^0. The iterates x_i^k at each agent i can be generated by the following recursions for k > 0: where α_i^k is a positive step size, d_i^k is the descent direction to be determined, ν_i is the degree of agent i in the network, and N_i denotes the set of neighbors of agent i.
Proof. The proof is omitted here due to the space limitation.
The local convergence property and convergence rate of the proposed decentralized approach are given by the following theorem.

Theorem 1. Suppose Assumption 1 holds and the starting point x_i^0 of each agent lies in S_δ. Then the sequence {x^k} generated by the update rule of Proposition 1 is well defined and converges to x* = (x̄*; . . . ; x̄*). Furthermore, the convergence rate depends on the number of agents N and the maximum node degree max(ν_i) in the network.
Proof. The proof is omitted here due to the space limitation.
Remark that Theorem 1 establishes the local convergence property of the proposed decentralized approach, which converges to the optimal solution superlinearly. The convergence rate is related to the number of agents as well as the maximum node degree in the network.
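The explicit recursion of Proposition 1 is omitted above. Purely to illustrate the pattern it describes, a consensus step over the neighborhood N_i followed by a local GN step using only the local Hessian, the following toy sketch uses assumed linear local observations (so each local GN step is exact) on a four-agent ring; the matrices H_i, the uniform averaging weights, and the topology are our assumptions, not the authors' scheme.

```python
import numpy as np

N, n = 4, 2
xbar = np.array([1.0, -2.0])                 # true global state
# assumed invertible local observation matrices, z_i = H_i @ xbar (noiseless)
H = [np.array([[2.0, 0.3], [0.1, 1.5]]), np.array([[1.0, -0.4], [0.2, 2.0]]),
     np.array([[1.5, 0.5], [-0.3, 1.0]]), np.array([[0.8, 0.1], [0.4, 1.2]])]
z = [Hi @ xbar for Hi in H]
nbrs = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}   # bidirectional ring

x = [np.zeros(n) for _ in range(N)]
for k in range(5):
    # consensus step: average with the closed neighborhood {i} U N_i
    x_mix = [(x[i] + sum(x[j] for j in nbrs[i])) / (1 + len(nbrs[i]))
             for i in range(N)]
    # local GN step: only the local Hessian H_i^T H_i is used (R_i = I)
    x = [x_mix[i] + np.linalg.solve(H[i].T @ H[i],
                                    H[i].T @ (z[i] - H[i] @ x_mix[i]))
         for i in range(N)]
```

Because each H_i is invertible and the data are noiseless, every local GN step lands exactly on x̄, so all agents reach consensus at the true state.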

Numerical Results
A bidirectionally connected ring network composed of N = 100 agents is considered, in which each agent connects to exactly two other agents. The unknown system state in the network is x̄ ∈ R^3. The observation function h_i(x_i) at each agent i is defined in terms of coefficients a_i, b_i, and c_i, which are i.i.d. random variables following the standard normal distribution. The observation function h_i(x_i) is nonlinear, with a quadratic term, a trigonometric term, and a cross-product term. The agents in the network work cooperatively to estimate the unknown system states x̄ in a decentralized fashion. The convergence result is depicted in Fig. 1: after a moderate number of iterations, the iterates converge to the optimal values, which shows that the proposed algorithm is effective. To investigate the performance of the proposed decentralized approach at each agent, the root-mean-square error (RMSE) of the estimate at each agent is calculated. The best RMSE (agent 2), the worst RMSE (agent 52), and the average RMSE are shown in Fig. 2. At each agent, the RMSE decreases as the iterations proceed; furthermore, the convergence rate differs from agent to agent.
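The exact expression of h_i is not reproduced above. The sketch below sets up the N = 100 ring topology and one plausible observation function combining quadratic, trigonometric, and cross-product terms; the concrete functional form and the RMSE helper are our assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 100, 3
xbar = rng.standard_normal(n)                  # true system state

# bidirectional ring: agent i connects to i-1 and i+1 (mod N)
nbrs = {i: [(i - 1) % N, (i + 1) % N] for i in range(N)}

a, b, c = rng.standard_normal((3, N))          # i.i.d. standard normal coefficients

def h_i(i, x):
    # assumed concrete form: quadratic + trigonometric + cross-product term
    return a[i] * x[0] ** 2 + b[i] * np.sin(x[1]) + c[i] * x[0] * x[2]

z = np.array([h_i(i, xbar) for i in range(N)])  # noiseless local observations

def rmse(estimates):
    # per-agent RMSE of the estimates against the true state
    return np.array([np.linalg.norm(e - xbar) / np.sqrt(n) for e in estimates])
```

With this setup, any decentralized solver can be plugged in and its per-agent RMSE tracked over the iterations, as in Fig. 2.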

Conclusions
We have presented a decentralized GN method for NLLS on WANs. We have given the update rule at each agent explicitly and investigated the local convergence of the proposed algorithm. Numerical simulations validated the effectiveness of the proposed algorithm.