An accelerated randomized extended Kaczmarz algorithm

The randomized Kaczmarz (RK) method is a useful algorithm for solving consistent linear systems Ax = b (A ∈ ℝ^{m×n}, b ∈ ℝ^m). It has been proved that, for inconsistent linear systems, the randomized extended Kaczmarz (REK) method, which augments RK with a randomized orthogonal projection, converges at an expected exponential rate. We describe an accelerated randomized extended Kaczmarz (AREK) algorithm based on Nesterov's accelerated procedure. The analysis shows that AREK converges faster than REK when A is dense and the smallest singular value of A^T A is small.


Introduction
The Kaczmarz method [1] is a popular algorithm for solving overdetermined linear systems and has numerous applications from tomography to image processing [2,3]. The method sweeps through the rows of A in a cyclic manner, at each substep projecting the last iterate orthogonally onto the solution hyperplane {x : ⟨a_i, x⟩ = b_i}, where i = (k mod m) + 1 and a_i is the i-th row of A. T. Strohmer and R. Vershynin [4,5] proposed to select, at each iteration, a row of A at random with probability proportional to the squared Euclidean norm of that row. The resulting method is the randomized Kaczmarz (RK) method, which can be described by

x_{k+1} = x_k + ((b_i − ⟨a_i, x_k⟩) / ‖a_i‖²) a_i,

where row i is chosen with probability ‖a_i‖² / ‖A‖_F². This randomized version of the Kaczmarz method provides clear advantages over the standard method in many cases. In [5], using the selection rule above, they proved the expected rate of convergence

E‖x_k − x‖² ≤ (1 − κ(A)^{−2})^k ‖x_0 − x‖², with κ(A) = ‖A‖_F ‖A†‖,

for all vectors x solving Ax = b. The RK method thus addresses a significant disadvantage of the original Kaczmarz algorithm: its convergence rate is easy to analyze, whereas the original algorithm can converge very slowly when the ordering of the rows is poor.
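The RK update above can be sketched in a few lines of NumPy (a minimal illustration; the function name, iteration count, and stopping rule are our own, not from the paper):

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=5000, seed=0):
    """Randomized Kaczmarz: draw row i with probability ||a_i||^2 / ||A||_F^2
    and project the iterate onto the hyperplane {x : <a_i, x> = b_i}."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms2 = np.einsum('ij,ij->i', A, A)   # squared row norms
    probs = row_norms2 / row_norms2.sum()      # sampling distribution
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        a = A[i]
        x = x + (b[i] - a @ x) / row_norms2[i] * a
    return x
```

For a consistent system the iterates converge to the solution at the expected exponential rate quoted above.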
In 2010, D. Needell [6] analyzed the case where the system Ax = b is corrupted by noise, considering the system Ax = b + r, where r is an arbitrary error vector. In the noisy setting she showed that the RK method reaches, at the same rate as in the error-free case, an error threshold that depends on the matrix A, where x_k denotes the k-th iterate of the noisy randomized Kaczmarz method. Later, in [7], Y. C. Eldar and D. Needell utilized the Johnson-Lindenstrauss dimension reduction technique to keep the runtime on the same order as the original RK method, proposed randomized Kaczmarz via Johnson-Lindenstrauss (RKJL), and improved the convergence rate of the RK method. In [8], A. Zouzias and N. M. Freris presented the randomized extended Kaczmarz (REK) method for solving inconsistent linear systems. Using a randomized approximate orthogonal projection, they approximately compute the projection of b onto the column space of A and solve the corresponding consistent system; essentially, the RK algorithm is applied twice, and the method converges at an expected exponential rate. This method can be considered a randomized variant of the extended Kaczmarz method proposed by C. Popa [9]. In [10], J. Liu and S. J. Wright proposed an accelerated randomized Kaczmarz (ARK) algorithm with better convergence than the standard RK method on ill-conditioned problems. The ARK method starts at a significant disadvantage in the sparse setting, so they assume a dense matrix A; by applying Nesterov's accelerated procedure [11] to the iteration of the standard RK method, they proved an improved convergence rate when A is dense and the minimum singular value of A^T A is small. Motivated by the idea of Nesterov's acceleration (rather than by asynchronous-parallel techniques), we apply Nesterov's accelerated procedure within the REK method for the case where A is dense and the minimum singular value of A^T A is small, and propose an accelerated randomized extended Kaczmarz (AREK) algorithm.
In this paper, under the condition that A is dense and the minimum singular value of A^T A is small, we prove that the convergence rate of the AREK method is better than that of the REK method, and we illustrate the computational results of our approach.

Algorithm
Both theoretical and numerical research has shown that the RK algorithm provides very promising results. Here we mainly show that an accelerated REK method works well in the case where the system is corrupted with noise. In this section we consider the consistent system Ax = b after an error vector r is added to the right-hand side. The scheme (4) performs well for least-squares problems whose least-squares error is very close to zero. First we give the notation and preliminaries.

Notations and preliminaries
We denote the rows of A by a_1, ..., a_m and its columns by A^{(1)}, ..., A^{(n)}. For an integer m, let [m] := {1, 2, ..., m}. We assume Ax = b has a solution and denote x_* := A†b; the first estimate is x_0. ‖A‖_F denotes the Frobenius norm of A.
Given a positive semidefinite matrix M, the norm ‖x‖_M is defined by ‖x‖_M² = x^T M x for all vectors x. We assume throughout that the rows of A are normalized, ‖a_i‖ = 1 for i = 1, ..., m, so that ‖A‖_F² = m.
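These conventions can be checked numerically; the small sketch below (with hypothetical data, not from the paper) verifies that row normalization gives ‖A‖_F² = m and evaluates the weighted norm for M = A^T A:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 3))
A = A / np.linalg.norm(A, axis=1, keepdims=True)  # normalize rows: ||a_i|| = 1

# With normalized rows, ||A||_F^2 equals the number of rows m = 6.
assert np.isclose(np.linalg.norm(A, 'fro')**2, 6.0)

# Weighted norm ||x||_M^2 = x^T M x for the positive semidefinite M = A^T A.
M = A.T @ A
x = rng.standard_normal(3)
norm_M_sq = x @ M @ x
assert norm_M_sq >= 0
```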

A. Zouzias and N. M. Freris proposed this variant of the RK method for solving inconsistent linear systems; see [8] for more details. The approach efficiently reduces the norm of the noisy part of b and then applies the RK algorithm to a new linear system whose right-hand side is arbitrarily close to the column space of A. Algorithm 1, see Table 1, shows that the RK method is applied twice by the REK method, and its proven convergence rate is shown in (5). In fact, A. Zouzias and N. M. Freris did not give the sharpest convergence rate for the REK method; by the technique of their proof, a tighter rate for the REK algorithm is easy to obtain. Let x_k* be the k-th iterate of the noisy REK method; the resulting bound is stated as Theorem 1. We now describe the complexity of the REK method for the case of dense A. As in Section 2, we assume throughout that the rows of A are normalized, ‖a_i‖ = 1 for i = 1, 2, ..., m. Then each iteration of Algorithm 1 requires about 4(m + n) operations.
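The REK iteration just described can be sketched as follows (the sampling and update rules follow [8]; the function name, iteration count, and variable names are ours):

```python
import numpy as np

def randomized_extended_kaczmarz(A, b, iters=20000, seed=0):
    """REK: a random column projection drives z toward the component of b
    outside the column space of A, while an RK row step is applied to the
    (asymptotically consistent) system A x = b - z."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms2 = np.einsum('ij,ij->i', A, A)
    col_norms2 = np.einsum('ij,ij->j', A, A)
    p_row = row_norms2 / row_norms2.sum()
    p_col = col_norms2 / col_norms2.sum()
    x = np.zeros(n)
    z = b.astype(float).copy()
    for _ in range(iters):
        j = rng.choice(n, p=p_col)           # column step: remove component of z along A^{(j)}
        z -= (A[:, j] @ z) / col_norms2[j] * A[:, j]
        i = rng.choice(m, p=p_row)           # row step on the system A x = b - z
        a = A[i]
        x += (b[i] - z[i] - a @ x) / row_norms2[i] * a
    return x
```

On an inconsistent system the iterates converge to the least-squares solution A†b.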

Nesterov's Accelerated Randomized Kaczmarz(ARK)
. Choose K  to be the larger root as in (8). Set k  and k  as in (9).

end
The ARK algorithm, see Table 2, applies Nesterov's accelerated procedure [11] to improve the standard RK method. If we apply is the objective gradient and k  is the stepsize. Nesterov's accelerated procedure introduces two sequences   k y and   k v and obtain the following iterative scheme: With appropriate choice of k  , k  and k  , this method yields better convergence rate than the standard gradient descent. And in [12], Y. Nesterov obtained the way of choosing the parameter k  , k  and k  , we set , choose k  to be the larger root of Then we set k  and k  as follows: In [10], J. Liu and S. J. Wright showed the ARK method to address the consistent linear system.Since the main computation of REK is about 4m+4n operations per iteration then the cost per iteration of Algorithm 2 is about 11n with the same assumption as the REK method, incurred in steps of Nesterov's accelerated procedure.
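To illustrate the acceleration idea on its own, the sketch below implements the classical constant-parameter (momentum) form of Nesterov's method for a smooth, strongly convex objective. This constant-parameter form is a stand-in for illustration only; it is not the per-iteration schedule of (8)-(9):

```python
import numpy as np

def nesterov_agd(grad, x0, L, mu, iters=300):
    """Constant-step Nesterov acceleration for an L-smooth, mu-strongly
    convex objective: a gradient step from y, then momentum extrapolation."""
    kappa = L / mu
    momentum = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)
    x_prev = np.asarray(x0, dtype=float)
    y = x_prev.copy()
    for _ in range(iters):
        x_next = y - grad(y) / L                  # gradient step, stepsize 1/L
        y = x_next + momentum * (x_next - x_prev)  # extrapolation
        x_prev = x_next
    return x_prev
```

For quadratics this achieves the accelerated rate roughly (1 − 1/√κ) per iteration, compared with (1 − 1/κ) for plain gradient descent.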

Remark 1.
If the data matrix A is sparse, with a fraction δ of nonzeros (0 < δ < 1), the ARK method is at a significant disadvantage: the average number of operations per iteration of the RK method is O(δn), whereas, since the vectors x_k, y_k and v_k are dense in general, the per-iteration cost of the ARK algorithm remains O(n). Therefore, in this paper we assume that the data matrix A is dense. We present Nesterov's accelerated randomized extended Kaczmarz algorithm, a specific combination of the REK algorithm with the ARK algorithm. The proposed algorithm consists of two components. The first component is the randomized approximate orthogonal projection, which implicitly maintains an approximation to the orthogonal projection of b onto the column space of A.

Nesterov's Accelerated Randomized Extended Kaczmarz (AREK)
formed by k z b  . The second part applies the ARK method, with Nesterov's accelerated procedure, on the system

The main computation of Algorithm 3, see Table 3, consists of two parts: the first is the 4m cost of the randomized approximate orthogonal projection, and the second is the 11n cost of the ARK step, for about 4m + 11n operations per iteration. Next, we consider the expected rate of convergence of AREK.
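To show how the two components fit together, the sketch below combines the REK column-projection step with a Nesterov-style (y, v) row update. Since equations (8)-(9) are not reproduced here, it uses constant parameters γ = 1/√λ, β = 1 − √λ/m, α = √λ/(m + √λ), a plausible steady-state choice for 0 < λ ≤ λ_min(A^T A) with normalized rows; this constant-parameter choice is an assumption of the sketch, not the paper's exact schedule:

```python
import numpy as np

def arek_sketch(A, b, lam, iters=20000, seed=0):
    """AREK sketch: REK column projection on z, plus a Nesterov-style
    (y, v) row update with constant parameters. Assumes the rows of A
    are normalized and 0 < lam <= lambda_min(A^T A)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    col_norms2 = np.einsum('ij,ij->j', A, A)
    p_col = col_norms2 / col_norms2.sum()
    sq = np.sqrt(lam)
    beta = 1.0 - sq / m          # constant stand-ins for the
    alpha = sq / (m + sq)        # schedule in (8)-(9)
    gamma = 1.0 / sq
    x = np.zeros(n)
    v = np.zeros(n)
    z = b.astype(float).copy()
    for _ in range(iters):
        j = rng.choice(n, p=p_col)                    # column step on z
        z -= (A[:, j] @ z) / col_norms2[j] * A[:, j]
        i = rng.integers(m)                           # uniform row (rows normalized)
        y = alpha * v + (1.0 - alpha) * x
        a = A[i]
        r = a @ y - (b[i] - z[i])                     # residual of A y = b - z
        x = y - r * a
        v = beta * v + (1.0 - beta) * y - gamma * r * a
    return x
```

As with REK, the iterates approach the least-squares solution of the noisy system.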

Theorem 2. If the data matrix A is dense and the minimum singular value of A^T A is small, then the expected convergence rate of the AREK method satisfies the bound (10).
Before proving Theorem 2, it is important to analyze what happens to the solution space of the original equations when the error vector is added. The next lemma shows that each hyperplane {x : ⟨a_i, x⟩ = b_i} is shifted, and gives the distance between each original hyperplane and its shifted counterpart. We then use the convergence rate of the randomized approximate orthogonal projection part to obtain the convergence rate of our approach.

Lemma 2. [8] Let
Proof of Theorem 2. Let x_k* denote the k-th iterate of the noisy AREK method and x_k the k-th iterate of the noiseless AREK method. Bounding the distance between these two sequences and applying the noiseless convergence rate, after k iterations we obtain (10).
If the data matrix A is dense and the minimum singular value of A^T A is small, then by comparing the conclusions of Theorem 1 and Theorem 2 we conclude that our approach is faster than the REK method.

Numerical results
In this section, we describe some numerical results for the REK algorithm and the AREK algorithm. We compare REK and AREK for dense A with a small minimum singular value of A^T A.
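The regime of interest can be reproduced with a synthetic problem whose smallest singular value is prescribed explicitly. The generator below is a hypothetical setup for such experiments; the paper's exact test matrices are not specified here:

```python
import numpy as np

def make_test_problem(m, n, sigma_min, noise=0.01, seed=0):
    """Build a dense m x n matrix with singular values spaced from 1.0
    down to sigma_min (via an explicit SVD factorization), plus a noisy,
    generically inconsistent right-hand side."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((m, n)))   # orthonormal columns
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    s = np.linspace(1.0, sigma_min, n)
    A = U @ np.diag(s) @ V.T
    x_true = rng.standard_normal(n)
    b = A @ x_true + noise * rng.standard_normal(m)
    return A, b, x_true
```

Sweeping sigma_min toward zero makes the minimum singular value of A^T A small, which is the setting where acceleration is expected to pay off.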

Conclusion
This paper proposes an accelerated randomized extended Kaczmarz (AREK) algorithm via Nesterov's accelerated procedure and obtains a better convergence rate. REK and AREK have almost the same per-iteration complexity for dense data. In the numerical results section, under the assumption of a dense data matrix A with a small minimum singular value of A^T A, the experiments illustrate this advantage.