
Re-Orthogonalized/Affine GMRES and Orthogonalized Maximal Projection Algorithm for Solving Linear Systems

Chein-Shan Liu, Chih-Wen Chang and Chung-Lun Kuo
1 Center of Excellence for Ocean Engineering, National Taiwan Ocean University, Keelung 202301, Taiwan
2 Department of Mechanical Engineering, National United University, Miaoli 36063, Taiwan
* Author to whom correspondence should be addressed.
Algorithms 2024, 17(6), 266; https://doi.org/10.3390/a17060266
Submission received: 19 May 2024 / Revised: 9 June 2024 / Accepted: 14 June 2024 / Published: 15 June 2024
(This article belongs to the Special Issue Numerical Optimization and Algorithms: 2nd Edition)

Abstract

GMRES is one of the most powerful and popular methods for solving linear systems in the Krylov subspace. We examine it from two viewpoints: maximizing the decrease in the length of the residual vector, and maintaining the orthogonality of consecutive residual vectors. A stabilization factor, η, measuring the deviation from orthogonality of the residual vector is inserted into GMRES to preserve the orthogonality automatically. The re-orthogonalized GMRES (ROGMRES) method guarantees absolute convergence, even when the orthogonality is gradually lost during the GMRES iteration. When η < 1/2, the residuals' lengths in GMRES and GMRES(m) no longer decrease; hence, η < 1/2 can be adopted as a stopping criterion to terminate the iterations. We prove that η = 1 for the ROGMRES method; it automatically keeps the orthogonality and maintains the maximality of the reduction in the length of the residual vector. We improve GMRES by seeking the descent vector that minimizes the residual in a larger space, the affine Krylov subspace. The resulting orthogonalized maximal projection algorithm (OMPA) is identified as having good performance. We further derive the iterative formulas by extending the GMRES method to the affine Krylov subspace; these equations differ slightly from the equations derived by Saad and Schultz (1986). The affine GMRES method is combined with the orthogonalization technique to generate a powerful affine GMRES (A-GMRES) method with high performance.

1. Introduction

In this paper, we revisit GMRES and GMRES(m) to solve
A x = b, x, b ∈ R^n, A ∈ R^{n×n}.
In terms of an initial guess, x_0, and the initial residual r_0 = b − A x_0, Equation (1) is equivalent to
A u = r_0,
where
u = x − x_0
is the descent vector, also called a corrector.
Equation (2) can be used to search for a proper descent vector, u, which is inserted into Equation (3) to produce a solution, x, closer to the exact solution. We suppose an m-dimensional Krylov subspace:
K_m(A, r_0) := Span{r_0, A r_0, …, A^{m−1} r_0}.
Krylov subspaces are named after A. N. Krylov, who in [1] described a method for solving a secular equation to determine the frequencies of small vibrations of material systems.
Many well-developed Krylov subspace methods for solving Equation (1) have been discussed in the textbooks [2,3,4,5]. Godunov and Gordienko [6] used an optimal representation of vectors in the Krylov space with the help of a variational problem; the extremum of the variational problem is the solution of the Kalman matrix equation, in which the 2-norm minimal solution is used as a characteristic of the Krylov space.
One of the most important classes of numerical methods for iteratively solving Equation (1) is the Krylov subspace methods [7,8], and the preconditioned Krylov subspace methods [9]. Recently, Bouyghf et al. [10] presented a comprehensive framework for unifying the Krylov subspace methods for Equation (1).
The GMRES method in [11] used the Petrov–Galerkin method to search for u ∈ K_m via a perpendicular property:
r_0 − A u ⊥ L_m = A K_m.
The descent vector u ∈ K_m is obtained by minimizing the length of the residual vector [2]:
min ‖r‖,
where
r = b − A x = r_0 − A u,
and ‖r‖ is the Euclidean norm of the residual vector r.
In many Krylov subspace methods, the central idea is the projection. The method of GMRES is a special case of the Krylov subspace methods with an oblique projection. Let V and W = A V be the matrix representations of K m and L m , respectively. We project the residual Equation (2) from an n-dimensional space to an m-dimensional subspace:
W^T A u = W^T r_0.
We seek u in the subspace K m by
u = V y ;
hence, Equation (8) becomes
W^T W y = W^T r_0.
Solving y from Equation (10) and inserting it into Equation (9), we can obtain u, and, hence, the next x = x_0 + u; x is a correction of x_0, and we hope ‖r‖ < ‖r_0‖.
Meanwhile, Equation (7) becomes
r = r_0 − W y,
where W y = A u. By means of Equations (5), (7), and (11), we enforce the perpendicular property by
r_0 − W y ⊥ L_m;
that is, r_0 is projected onto L_m: W y = A u is the projection of r_0 on L_m, and r is perpendicular to L_m, which ensures that the length of the residual vector is decreasing. To obtain a fast convergent iterative algorithm, we must keep the orthogonality and simultaneously maximize the projection quantity W y = A u. Later on, we will modify the GMRES method from these two aspects.
Liu [12] derived an effective iterative algorithm based on the maximal projection method to solve Equation (1) in the column space of A . Recently, Liu et al. [13] applied the maximal projection method and the minimal residual norm method to solve the least-squares problems.
There are two causes of the slowdown and even the stagnation of GMRES: one is the loss of orthogonality, and the other is that its descent vector, u = V y, is not the best one. There are different methods to accelerate and improve GMRES [14,15,16,17]. The restart technique is often applied to the Krylov subspace methods, which, however, generally slows their convergence. Another way to accelerate the convergence is updating the initial value [18,19]. Preconditioning technology is often used to speed up GMRES and other Krylov subspace methods [20,21,22,23,24]. The multipreconditioned GMRES [25] uses two or more preconditioners simultaneously. The polynomial preconditioned GMRES and GMRES-DR were developed in [26]. As an augmentation of the Krylov subspace, some authors sought better descent vectors in a larger space to accelerate the convergence of GMRES [27,28,29]. Recently, Carson et al. [30] presented mathematically important and practically relevant phenomena that uncover the behavior of GMRES through a discussion of computed examples; they considered the conjugate gradient (CG) and GMRES methods, as well as other Krylov subspace methods, crucially important for further development and practical applications.
Contribution and Novelty:
The major contribution and novelty of the present paper is a quite simple strategy to overcome the slowdown and stagnation behavior of GMRES by inserting a stabilization factor, η = r^T A u / ‖A u‖^2, into the iterative equation. In doing so, the orthogonality property of the residual vector and the stepwise decreasing property of the residual length are automatically preserved. We extend the space of GMRES to an affine Krylov subspace to seek a better descent vector. When the maximal projection method and the affine technique are combined with the orthogonalization technique, two very powerful iterative algorithms for solving linear systems with high performance are developed.
Highlight:
  • Examine the iterative algorithm for linear systems from the viewpoint of maximizing the decreasing quantity of the residual and maintaining the orthogonality of the consecutive residual vector.
  • A stabilization factor to measure the deviation from the orthogonality is inserted into GMRES to preserve the orthogonality automatically.
  • The re-orthogonalized GMRES guarantees the preservation of orthogonality, even if the orthogonality is gradually lost in the iterations by means of GMRES.
  • Improve GMRES by seeking the descent vector to minimize the length of the residual vector in a larger space of the affine Krylov subspace.
  • The new algorithms are all absolutely convergent.
Outline:
The Arnoldi process and the conventional GMRES and GMRES(m) are described in Section 2. In Section 3, we propose a new algorithm modified from GMRES by inserting a stabilization factor, η , which is named a re-orthogonalized GMRES (ROGMRES). We discuss ROGMRES from two aspects: preserving the orthogonality and maximally decreasing the length of the residual vector; we also prove that ROGMRES can automatically maintain η = 1 , and preserve the good property of orthogonality and the maximality of reducing the residual. In Section 4, a better descent vector is sought in a larger affine Krylov subspace by using the maximal projection method; upon combining it with the orthogonalization technique, we propose a new algorithm, the orthogonalized maximal projection algorithm (OMPA). The iterative equations in GMRES are extended to that in the affine Krylov subspace by using the minimum residual method in Section 5, which results in an affine GMRES (A-GMRES) method. The numerical tests of some examples are given in Section 6. Section 7 concludes the achievements, novelties, and contributions of this paper.

2. GMRES and GMRES(m)

The Arnoldi process [2] is often used to orthonormalize the Krylov vectors A^j r_0, j = 0, …, m−1, in Equation (4), such that the resultant vectors v_i, i = 1, …, m, satisfy v_i · v_j = δ_ij, i, j = 1, …, m, where δ_ij is the Kronecker delta symbol. The full orthonormalization procedure (Algorithm 1) is set up as follows.
Algorithm 1: Arnoldi process
1:  Select m and give an initial r_0
2:  v_1 = r_0 / ‖r_0‖
3:  Do j = 1 : m
4:     w_j = A v_j
5:     Do i = 1 : j
6:        h_ij = v_i · w_j
7:        w_j = w_j − h_ij v_i
8:     Enddo of i
9:     h_{j+1,j} = ‖w_j‖
10:    v_{j+1} = w_j / h_{j+1,j}
11: Enddo of j
A dot between two vectors denotes the inner product, as in a · b = a^T b. V denotes the Arnoldi matrix whose jth column is v_j:
V := [v_1, …, v_m] ∈ R^{n×m}.
After m steps of the Arnoldi process, we can construct an upper Hessenberg matrix, H m , as follows:
H_m =
[ h_11      h_12      ⋯    h_1m
  h_21      h_22      ⋯    h_2m
            h_32      ⋯    h_3m
                      ⋱    ⋮
            h_{m,m−1}      h_mm ].
Utilizing H m , the Arnoldi process can be expressed as a matrix product form:
A V = V H_m + h_{m+1,m} v_{m+1} e_m^T,
where H_m ∈ R^{m×m}, and e_m = [0, …, 0, 1]^T ∈ R^m. Now, the augmented upper Hessenberg matrix, H̄_m ∈ R^{(m+1)×m}, can be formulated as
H̄_m =
[ H_m
  h_{m+1,m} e_m^T ]
=
[ h_11      h_12      ⋯    h_1m
  h_21      h_22      ⋯    h_2m
            h_32      ⋯    h_3m
                      ⋱    ⋮
            h_{m,m−1}      h_mm
  0         ⋯         0    h_{m+1,m} ].
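For readers who wish to experiment, the following is a minimal NumPy sketch of the Arnoldi process of Algorithm 1 (not the authors' code); it returns V_m, V_{m+1}, and the augmented Hessenberg matrix H̄_m, and assumes no breakdown (h_{j+1,j} ≠ 0).

import numpy as np

def arnoldi(A, r0, m):
    # Algorithm 1: orthonormalize r0, A r0, ..., A^{m-1} r0 by modified Gram-Schmidt
    n = r0.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))              # augmented Hessenberg matrix H-bar_m
    V[:, 0] = r0 / np.linalg.norm(r0)     # v_1 = r0 / ||r0||
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)   # assumes h_{j+1,j} != 0 (no breakdown)
        V[:, j + 1] = w / H[j + 1, j]
    return V[:, :m], V, H                 # V_m, V_{m+1}, H-bar_m

One can check numerically that A @ V_m equals V_{m+1} @ H̄_m up to rounding, which is the matrix relation behind Equations (15) and (94).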
According to [2], the GMRES method (Algorithm 2) is given as follows.
Algorithm 2: GMRES
1: Give m_0, x_0, and ε
2: Do m = m_0, … (k = m − m_0), until ‖r_k‖ < ε
3:    r_k = b − A x_k
4:    v_1^k = r_k / ‖r_k‖
5:    V_k = [v_1^k, …, v_m^k]   (by Arnoldi process)
6:    Solve (H̄_k^T H̄_k) y_k = ‖r_k‖ H̄_k^T e_1
7:    x_{k+1} = x_k + V_k y_k
In the above, x_k is the kth step value of x; r_k is the kth residual vector; y_k is the vector of m expansion coefficients used in V_k y_k; e_1 is the first column of I_{m+1}; V_k denotes the kth step V in Equation (13); H̄_k ∈ R^{(m+1)×m} is the kth step augmented upper Hessenberg matrix in Equation (16); and V_k y_k is the kth step descent vector. Notice that V_k y_k ∈ K_m(A, r_k).
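As an illustration only, one correction step of Algorithm 2 can be sketched as follows, reusing the arnoldi helper above; the least-squares step (6) is written through the normal equations, exactly as stated in the algorithm.

import numpy as np

def gmres_step(A, b, x, m):
    # steps (3)-(7) of Algorithm 2 for one value of m
    r = b - A @ x
    Vm, _, Hbar = arnoldi(A, r, m)        # assumes the arnoldi sketch given above
    e1 = np.zeros(m + 1)
    e1[0] = 1.0
    # step (6): solve (Hbar^T Hbar) y = ||r|| Hbar^T e1
    y = np.linalg.solve(Hbar.T @ Hbar, np.linalg.norm(r) * (Hbar.T @ e1))
    return x + Vm @ y                     # step (7): x_{k+1} = x_k + V_k y_k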
In addition to Algorithm 2, there is a restarted version, GMRES(m) (Algorithm 3) [11,31], described as follows.
Algorithm 3: GMRES(m)
1: Give m_0, m_1, x_0, and ε
2: Do j = 1, …
3:    Do m = m_0, …, m_1 (k = m − m_0)
4:       r_k = b − A x_k
5:       v_1^k = r_k / ‖r_k‖
6:       V_k = [v_1^k, …, v_m^k]   (by Arnoldi process)
7:       Solve (H̄_k^T H̄_k) y_k = ‖r_k‖ H̄_k^T e_1
8:       x_{k+1} = x_k + V_k y_k
9:       If ‖r_{k+1}‖ < ε, stop
10:   Otherwise, x_0 = x_{k+1}; go to 2
Instead of m_0 = 1, used in the original GMRES and GMRES(m), a suitable value of m_0 > 1 can speed up the convergence; hence, m = m_1 − m_0 + 1 is the restart frequency. Unlike the original GMRES and GMRES(m) [2,11,31], we take x_k, not x_0, in step (7) of Algorithm 2 and in step (8) of Algorithm 3. Because x_0 is not updated in the original GMRES and GMRES(m), they converge slowly. In Algorithm 3, if we take m_1 = m_0, then we return to the usual iterative GMRES algorithm with a fixed dimension, m = m_0, of the Krylov subspace.
The restart remedies the long-term recurrence and the accumulation of round-off errors in GMRES. In order to improve the restart process, several improvement techniques have been proposed, as mentioned above. Some techniques based on adaptively determining the restart frequency can be found in [32,33,34,35]. For quicker convergence, we can update the current value of x at each iterative step, as sketched below. Because x_k is new information for determining the next value x_{k+1}, the result of x_k should not be wasted, which saves computational cost.
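A sketch of such a restarted driver (Algorithm 3 with x_k updated at every inner step), reusing the gmres_step helper sketched above; the loop bounds and tolerance names are illustrative.

import numpy as np

def gmres_restarted(A, b, x0, m0, m1, eps, max_restarts=100):
    # outer restart loop of Algorithm 3; x is updated at every inner step
    x = x0.copy()
    for _ in range(max_restarts):
        for m in range(m0, m1 + 1):
            x = gmres_step(A, b, x, m)            # assumes the gmres_step sketch above
            if np.linalg.norm(b - A @ x) < eps:
                return x
    return x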

3. Maximally Decreasing Residual and Automatically Preserving Orthogonality

In this section, we develop a new iterative algorithm based on GMRES to solve Equation (1). The iterative form is
x_{k+1} = x_k + u_k,
where u_k is the kth step descent vector, which is determined by the iterative algorithm. For instance, u_k in GMRES is u_k = V_k y_k.
Lemma 1.
A better iterative scheme than that in Equation (17) is
x_{k+1} = x_k + (r_k^T A u_k / ‖A u_k‖^2) u_k.
Proof. 
Let
x_{k+1} = x_k + η u_k;
it follows that
r_{k+1} = r_k − η A u_k.
Taking the squared norms of both sides yields
‖r_{k+1}‖^2 = ‖r_k‖^2 − (2 η r_k^T A u_k − η^2 ‖A u_k‖^2) =: ‖r_k‖^2 − h.
To maximally reduce the residual we consider
max_η h = 2 η r_k^T A u_k − η^2 ‖A u_k‖^2,
which leads to
η_k = r_k^T A u_k / ‖A u_k‖^2.
Inserting η_k into Equation (19), the proof of Equation (18) is finished.    □
Lemma 2.
In Equation (18) the following orthogonality property holds:
r_{k+1} · (A u_k) = 0.
Proof. 
Apply A to Equation (18):
A x_{k+1} = A x_k + (r_k^T A u_k / ‖A u_k‖^2) A u_k.
Using A x_{k+1} = b − r_{k+1} and A x_k = b − r_k, Equation (25) changes to
r_{k+1} = r_k − (r_k^T A u_k / ‖A u_k‖^2) A u_k.
Taking the inner product with A u_k yields
r_{k+1} · (A u_k) = r_k^T A u_k − (r_k^T A u_k / ‖A u_k‖^2) ‖A u_k‖^2 = 0.
Equation (24) is proven.    □
It follows from Equation (5) that η = 1 is implied by GMRES. However, during the GMRES iteration, the numerical value of η may deviate from 1 to a great extent, even becoming zero or negative. Therefore, we propose a re-orthogonalized version of GMRES, namely the ROGMRES method (Algorithm 4), as follows.
Algorithm 4: ROGMRES
1: Give m_0, x_0, and ε
2: Do m = m_0, … (k = m − m_0), until ‖r_k‖ < ε
3:    r_k = b − A x_k
4:    v_1^k = r_k / ‖r_k‖
5:    V_k = [v_1^k, …, v_m^k]   (by Arnoldi process)
6:    Solve (H̄_k^T H̄_k) y_k = ‖r_k‖ H̄_k^T e_1
7:    u_k = V_k y_k
8:    x_{k+1} = x_k + (r_k^T A u_k / ‖A u_k‖^2) u_k
Correspondingly, the ROGMRES(m) Algorithm 5 is given as follows.
Algorithm 5: ROGMRES(m)
1: Give m_0, m_1, x_0, and ε
2: Do j = 1, …
3:    Do m = m_0, …, m_1 (k = m − m_0)
4:       r_k = b − A x_k
5:       v_1^k = r_k / ‖r_k‖
6:       V_k = [v_1^k, …, v_m^k]   (by Arnoldi process)
7:       Solve (H̄_k^T H̄_k) y_k = ‖r_k‖ H̄_k^T e_1
8:       u_k = V_k y_k
9:       x_{k+1} = x_k + (r_k^T A u_k / ‖A u_k‖^2) u_k
10:      If ‖r_{k+1}‖ < ε, stop
11:   Otherwise, x_0 = x_{k+1}; go to 2
Compared with GMRES(m) in Section 2, the computational cost of ROGMRES(m) is only slightly increased by computing an extra factor, r_k^T A u_k / ‖A u_k‖^2, at each iteration; it needs one matrix–vector product and two inner products of vectors.
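A minimal sketch of this extra re-orthogonalization (steps (7)–(8) of Algorithm 4), applied to an arbitrary descent vector u_k; the wrapper function is ours, not the authors'.

import numpy as np

def rogmres_update(A, x, r, u):
    # Equation (18): x_{k+1} = x_k + (r_k^T A u_k / ||A u_k||^2) u_k
    Au = A @ u
    eta = (r @ Au) / (Au @ Au)            # stabilization factor eta_k of Equation (23)
    return x + eta * u                    # guarantees r_{k+1} . (A u_k) = 0 (Lemma 2)

With u = V_k y_k from the GMRES step, this reproduces step (8) of Algorithm 4.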
Theorem 1.
For GMRES, if η_k ≠ 1 happens at the kth step, the orthogonality in Equation (5) cannot be continued after the kth step, i.e.,
r_{k+1} · (A u_k) ≠ 0;
moreover, if η_k < 1/2, the residual does not decrease, i.e.,
‖r_{k+1}‖ > ‖r_k‖.
For ROGMRES in Equation (18), the residual is absolutely decreased:
‖r_{k+1}‖ < ‖r_k‖,
and the orthogonality r_{k+1} · (A u_k) = 0 holds.
Proof. 
Applying A to x_{k+1} = x_k + V_k y_k in Algorithm 2 (GMRES) yields
r_{k+1} = r_k − A u_k,
where u_k = V_k y_k.
It follows from Equation (23) that
(r_k − η_k A u_k) · (A u_k) = 0,
which can be rearranged to
(r_k − A u_k) · (A u_k) = (η_k − 1) ‖A u_k‖^2,
and then by Equation (31),
r_{k+1} · (A u_k) = (η_k − 1) ‖A u_k‖^2.
If η_k ≠ 1, we can prove Equation (28). Taking the squared norm of Equation (31) and using Equation (23) generates
‖r_{k+1}‖^2 = ‖r_k‖^2 − 2 r_k · (A u_k) + ‖A u_k‖^2 = ‖r_k‖^2 + (1 − 2 η_k) ‖A u_k‖^2.
If η_k < 1/2, then (1 − 2 η_k) ‖A u_k‖^2 > 0 and Equation (29) is proven.
For ROGMRES, taking the squared norm of Equation (26), we have
‖r_{k+1}‖^2 = ‖r_k‖^2 − (r_k^T A u_k)^2 / ‖A u_k‖^2.
Because (r_k^T A u_k)^2 / ‖A u_k‖^2 > 0, Equation (30) is proven. The orthogonality r_{k+1} · (A u_k) = 0 was proven in Equation (24) of Lemma 2.    □
Corollary 1.
For GMRES, the orthogonality of the consecutive residual vector is equivalent to the maximality of h in Equation (22), where
η_k = 1, i.e., r_k^T A u_k = ‖A u_k‖^2.
Proof. 
By means of Equation (34), to satisfy the orthogonality condition of the residual vector we require η_k = 1; this implies Equation (37) by the definition in Equation (23). At the same time, Equation (36) changes to
‖r_{k+1}‖^2 = ‖r_k‖^2 − ‖A u_k‖^2.
Inserting η_k = 1 into Equation (22), we have
max h = 2 r_k^T A u_k − ‖A u_k‖^2 = ‖A u_k‖^2,
in which r_k^T A u_k = ‖A u_k‖^2 from Equation (37) was used. For GMRES, obtaining the maximal value of h and preserving the orthogonality are the same.    □
Remark 1.
Maintaining η_k = 1 in GMRES is a key issue for preserving the orthogonality and the maximality of reducing the residual. However, it is not always true that the traditional GMRES can maintain η_k = 1 during the iteration process.
Theorem 2.
Let
w_k = (r_k^T A u_k / ‖A u_k‖^2) u_k
be the descent vector of ROGMRES in
x_{k+1} = x_k + w_k.
The following identity holds:
η_k = r_k^T A w_k / ‖A w_k‖^2 = 1;
hence, ROGMRES automatically maintains the orthogonality of the residual vector and achieves the maximality of the decrease in the length of the residual vector.
Proof. 
Let
γ = r_k^T A u_k / ‖A u_k‖^2;
then Equation (40) is written as
w_k = γ u_k.
Inserting it into Equation (42) produces
η_k = (1/γ) r_k^T A u_k / ‖A u_k‖^2;
by means of Equation (43), it follows that
η_k = (‖A u_k‖^2 / r_k^T A u_k) (r_k^T A u_k / ‖A u_k‖^2) = 1.
Thus, Equation (42) is proven. The proof that the orthogonality condition of the consecutive residual vector is satisfied and that h is maximal is the same as that given in Corollary 1.    □
Remark 2.
The properties of ROGMRES are crucial: it can automatically maintain η_k = 1, preserving its good properties of orthogonality and the maximality of reducing the residual. The method in Equation (18) can be viewed as an automatically orthogonality-preserving method. Numerical experiments will verify these points.
Theorem 3.
For ROGMRES in Equation (18), the iterative point, x_k, is located on an invariant manifold:
‖r_k‖^2 = ‖r_0‖^2 / Q_k,
where Q_0 = 1; Q_k > 0 with Q_{k+1} > Q_k is an increasing sequence.
Proof. 
We begin with the following time-invariant manifold:
‖r‖^2 = ‖r_0‖^2 / Q(t),
where Q(0) = 1, and Q(t) is an increasing function of t.
We suppose that the evolution of x in time is driven by a descent vector, u:
ẋ = γ u.
Equation (48) is a time-invariant manifold, such that its time differential is zero, i.e.,
(Q̇(t) / (2 Q(t))) ‖r‖^2 − r · (A ẋ) = 0,
where ṙ = −A ẋ was used.
Inserting Equation (49) into Equation (50) yields
γ = (Q̇(t) / (2 Q(t))) ‖r‖^2 / (r^T A u).
Inserting γ into Equation (49) and using v = A u, we can derive
ẋ = q(t) (‖r‖^2 / (r^T v)) u,
where
q(t) := Q̇(t) / (2 Q(t)) > 0.
By applying the forward Euler scheme to Equation (52), we have
x(t + Δt) = x(t) + β (‖r‖^2 / (r^T v)) u,
where
β = q(t) Δt > 0.
An iterative form of Equation (54) is
x_{k+1} = x_k + β_k (‖r_k‖^2 / (r_k^T v_k)) u_k.
It generates the residual form:
r_{k+1} = r_k − β_k (‖r_k‖^2 / (r_k^T v_k)) v_k.
Taking the squared norms of both sides yields
‖r_{k+1}‖^2 = ‖r_k‖^2 − 2 β_k ‖r_k‖^2 + β_k^2 (‖r_k‖^4 / (r_k^T v_k)^2) ‖v_k‖^2.
Dividing both sides by ‖r_k‖^2 renders
a_k β_k^2 − 2 β_k + 1 − ‖r_{k+1}‖^2 / ‖r_k‖^2 = 0,
where
a_k := ‖r_k‖^2 ‖v_k‖^2 / (r_k^T v_k)^2 ≥ 1.
By means of Equation (47), Equation (59) changes to
a_k β_k^2 − 2 β_k + 1 − Q_k / Q_{k+1} = 0.
If we take
β_k = 1 / a_k = (r_k^T v_k)^2 / (‖r_k‖^2 ‖v_k‖^2),   Q_k / Q_{k+1} = 1 − 1 / a_k,
Equation (61) is satisfied. In terms of β_k, Equation (56) is recast to
x_{k+1} = x_k + ((r_k^T v_k)^2 / (‖r_k‖^2 ‖v_k‖^2)) (‖r_k‖^2 / (r_k^T v_k)) u_k = x_k + (r_k^T v_k / ‖v_k‖^2) u_k.
Noticing v_k = A u_k, Equation (63) is just the iterative Equation (18) for ROGMRES.    □
Remark 3.
For the factor defined in Equation (23), η_k plays a vital role in Equation (34) in dominating the orthogonality property of GMRES, and also in Equation (35) in controlling the residual-decreasing property of GMRES. When GMRES blows up for some problems, introducing η_k into GMRES stabilizes the iterations and avoids the blow-up. From this viewpoint, η_k is a stabilization factor for GMRES.
Inspired by Remark 3, two simple safeguards can be adopted for GMRES and GMRES(m): we compute η_k at each step, and take |η_k − 1| > 10^{−10} or η_k < 1/2 as the stopping criteria of the iterations; the resulting variants are labeled GMRES(η) and GMRES(m, η), respectively.
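A sketch of this monitoring (the GMRES(η) and GMRES(m, η) variants): η_k is computed at each step and the iteration is terminated once |η_k − 1| > 10^{−10} or η_k < 1/2; the helper name and its wrapping are ours.

import numpy as np

def eta_stop_test(r, Au, tol=1e-10):
    # returns (eta_k, stop?) for the GMRES(eta) / GMRES(m, eta) stopping criteria
    eta = (r @ Au) / (Au @ Au)
    lost_orthogonality = abs(eta - 1.0) > tol   # |eta_k - 1| > 10^{-10}
    residual_grows = eta < 0.5                  # by Theorem 1, ||r_{k+1}|| > ||r_k||
    return eta, (lost_orthogonality or residual_grows)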
Remark 4.
The quantity h given in Equation (21) is the decrease in the residual, whose maximal value, by means of Equation (36), is
max h = (r_k^T A u_k)^2 / ‖A u_k‖^2.
Upon comparing with a_k in Equation (60), we have
max h = ‖r_k‖^2 / a_k.
Because a_k ≥ 1, the best value of max h that can be obtained is ‖r_k‖^2 when a_k = 1; in this situation, it directly leads to the exact solution at the (k+1)th step, owing to r_{k+1} = 0, which is obtained by inserting A u_k = r_k into Equation (38). In this sense, a_k can also serve as an objective function. a_0 correlates to f = (r_0^T A u)^2 / ‖A u‖^2, which is the quantity of the projection of r_0 on L_m = A K_m, by
a_0 = ‖r_0‖^2 / f.
In the next section, we will maximize f, i.e., minimize a_0, to find the best descent vector u.

4. Orthogonalized Maximal Projection Algorithm

As seen from Equation (38), we must make ‖A u‖^2 as large as possible. A u = A V y, mentioned in Section 1, is the projection of r_0 on L_m = A K_m. Therefore, we consider
max_u f = (r_0 · v)^2 / ‖v‖^2
to be the maximal projection of r_0 on the direction v/‖v‖, where v = A u. In view of Equation (66), the maximization of f is equivalent to the minimization of a_0.
We can expand u ∈ r_0 + K_m via
u = r_0 + V y,
which is slightly different from the descent vector u = V y ∈ K_m used in GMRES.
Theorem 4.
For u ∈ r_0 + K_m(A, r_0), r_0 ≠ 0, the optimal u approximately satisfying Equation (2) and subject to Equation (67) is given by Equation (68), where
W = A V,  C = W^T W,
C y = W^T (r_0 − A r_0).
Proof. 
Due to v = A u and Equation (68),
v = v_0 + W y,
where W = A V and
v_0 = A r_0.
By Equation (71), one has
r_0 · v = r_0 · v_0 + r_0^T W y,
‖v‖^2 = ‖v_0‖^2 + 2 v_0^T W y + y^T C y,
where
C := W^T W > 0 ∈ R^{m×m}.
Resorting to the maximality condition for f in Equation (67), we can obtain
(r_0 · v) y_2 − ‖v‖^2 y_1 = 0,
where
y_1 := ∂_y (r_0 · v) = W^T r_0,
y_2 := (1/2) ∂_y ‖v‖^2 = W^T v_0 + C y.
From Equation (76), y_2 is equal to y_1:
y_2 = η^{−1} y_1 ⇒ y_2 = y_1,
because of
η^{−1} = ‖v‖^2 / (r_0 · v) = ‖A u‖^2 / (r_0^T A u) = 1,
according to Corollary 1.
Then, it follows from Equations (77)–(79) that
C y + W^T v_0 = W^T r_0,
which can be arranged to Equation (70).    □
We pull back the minimization problem used in the GMRES method to the following one:
min{‖r_0 − v‖^2} = min{‖r_0 − A u‖^2},
which is derived from Equations (6) and (7).
Theorem 5.
For u ∈ r_0 + K_m(A, r_0), derived from the minimization problem in Equation (82), the optimal u in Equation (2) is that given in Theorem 4.
Proof. 
From
∂_y ‖r_0 − v‖^2 = ∂_y ‖v‖^2 − 2 ∂_y (r_0 · v) = 0,
and Equations (78) and (77), it follows that
W^T v_0 + C y = W^T r_0,
which is just Equation (70).    □
According to Theorems 4 and 5, the maximal projection is equivalent to the minimization of the residual. Therefore, the maximal projection algorithm (MPA) is an extension of GMRES to a larger space, the affine Krylov subspace, not merely the Krylov subspace.
By considering Lemma 1 and Theorem 5, we arrive at an improvement of ROGMRES, as well as a double improvement of GMRES, in the following iterative formulas for the orthogonalized maximal projection algorithm (OMPA):
(W^T W) y = W^T (r_0 − A r_0),
u = r_0 + V y,
x = x_0 + (r_0^T A u / ‖A u‖^2) u.
Equation (85) can be further simplified in the next section. The algorithms of OMPA (Algorithm 6) and OMPA(m) (Algorithm 7) are given as follows.
Algorithm 6: OMPA
1: Give m_0, x_0, and ε
2: Do m = m_0, … (k = m − m_0), until ‖r_k‖ < ε
3:    r_k = b − A x_k
4:    v_1^k = r_k / ‖r_k‖
5:    V_k = [v_1^k, …, v_m^k]   (by Arnoldi process)
6:    W_k = A V_k
7:    Solve (W_k^T W_k) y_k = W_k^T (r_k − A r_k)
8:    u_k = r_k + V_k y_k
9:    x_{k+1} = x_k + (r_k^T A u_k / ‖A u_k‖^2) u_k
Algorithm 7: OMPA(m)
1: Give m_0, m_1, x_0, and ε
2: Do j = 1, …
3:    Do m = m_0, …, m_1 (k = m − m_0)
4:       r_k = b − A x_k
5:       v_1^k = r_k / ‖r_k‖
6:       V_k = [v_1^k, …, v_m^k]   (by Arnoldi process)
7:       W_k = A V_k
8:       Solve (W_k^T W_k) y_k = W_k^T (r_k − A r_k)
9:       u_k = r_k + V_k y_k
10:      x_{k+1} = x_k + (r_k^T A u_k / ‖A u_k‖^2) u_k
11:      If ‖r_{k+1}‖ < ε, stop
12:   Otherwise, x_0 = x_{k+1}; go to 2
In Algorithm 7 (OMPA(m)), if we take m_1 = m_0, then we return to the usual iterative algorithm with a fixed dimension m = m_0 of the Krylov subspace; m = m_1 − m_0 + 1 is the restart frequency.
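A minimal sketch of one OMPA iteration (steps (3)–(9) of Algorithm 6), assuming the arnoldi helper sketched in Section 2; it follows Equations (85)–(87) with a dense normal-equations solve.

import numpy as np

def ompa_step(A, b, x, m):
    r = b - A @ x
    Vm, _, _ = arnoldi(A, r, m)           # assumes the arnoldi sketch from Section 2
    W = A @ Vm
    # Equation (85): (W^T W) y = W^T (r - A r)
    y = np.linalg.solve(W.T @ W, W.T @ (r - A @ r))
    u = r + Vm @ y                        # Equation (86): descent vector in r + K_m
    Au = A @ u
    return x + ((r @ Au) / (Au @ Au)) * u # Equation (87): orthogonalized update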
Remark 5.
In view of Equations (67) and (80), f and η = 1 have the following relation:
f = (r_0 · v) η = r_0 · v = ‖A u‖^2.
Therefore, the maximization of f is equivalent to the maximization of the length of the projection vector A u. By means of Equation (64),
f = max h;
hence,
max_u f = max_u max_η {2 η r_0^T A u − η^2 ‖A u‖^2}
follows from Equation (22). Through a double-maximization, we have derived the maximal projection algorithm.

5. An Affine GMRES Method

In the derivation of GMRES, the following technique was employed [11]:
min_y ‖r_0 e_1 − H̄_m y‖,
where the scalar r_0 = ‖r_0‖ denotes the norm of the initial residual vector.
Instead of Equation (91), used in the GMRES method to develop the iterative algorithm from u ∈ K_m(A, r_0), we seek the descent vector, u, in the larger subspace u ∈ r_0 + K_m(A, r_0) by
min_y ‖r_0 − A r_0 − W y‖,
which is obtained from Equation (82) by inserting Equation (86) for u , and using W = A V .
Now, Equation (92) is re-written as
min_y ‖r_0 − A r_0 − A V_m y‖.
We denote by V_m the n × m Arnoldi matrix and by V_{m+1} the n × (m+1) Arnoldi matrix. It is known that
A V_m = V_{m+1} H̄_m,  V_m^T V_m = I_m,  V_{m+1}^T V_{m+1} = I_{m+1}.
Theorem 6.
For u ∈ r_0 + K_m(A, r_0), derived from Equation (93), the orthogonalized affine GMRES possesses the following iterative formulas:
(H̄_m^T H̄_m) y = r_0 H̄_m^T e_1 − r_0 [H̄_m^T H̄_m]_1,
u = r_0 + V_m y,
x = x_0 + (r_0^T A u / ‖A u‖^2) u,
where r_0 = ‖r_0‖, and [H̄_m^T H̄_m]_1 is the first column of H̄_m^T H̄_m.
Proof. 
The objective ‖r_0 − A r_0 − A V_m y‖ in Equation (93) can be simplified as follows. In terms of the first Krylov vector v_1 = r_0 / ‖r_0‖, where r_0 = ‖r_0‖, and using the first relation in Equation (94), we have
r_0 − A r_0 − A V_m y = r_0 v_1 − r_0 A v_1 − V_{m+1} H̄_m y.
We write
v_1 = V_{m+1} e_1,  A v_1 = A V_m e_2 = V_{m+1} H̄_m e_2,
where e_2 is the first column of I_m. Then, it follows that
r_0 − A r_0 − A V_m y = V_{m+1} [r_0 e_1 − r_0 H̄_m e_2 − H̄_m y].
Taking the squared norms of both sides yields
‖r_0 − A r_0 − A V_m y‖^2 = [r_0 e_1 − r_0 H̄_m e_2 − H̄_m y]^T V_{m+1}^T V_{m+1} [r_0 e_1 − r_0 H̄_m e_2 − H̄_m y] = [r_0 e_1 − r_0 H̄_m e_2 − H̄_m y]^T [r_0 e_1 − r_0 H̄_m e_2 − H̄_m y] = ‖r_0 e_1 − r_0 H̄_m e_2 − H̄_m y‖^2,
where the last relation in Equation (94) was used.
The minimization problem in Equation (93) is now simplified to
min_y ‖r_0 e_1 − r_0 H̄_m e_2 − H̄_m y‖.
To satisfy the minimality condition, we can derive Equation (95); using the technique in Lemma 1, Equation (97) can be derived.    □
Remark 6.
Notice that the minimization problem in Equation (102) is slightly different from that in Equation (91): an extra term, r_0 H̄_m e_2, appears in Equation (102). However, the resulting affine GMRES (A-GMRES) (Algorithm 8) in the affine Krylov subspace can significantly improve the performance over GMRES when the orthogonality is taken into account. We note that steps (6)–(8) in Algorithm 8 (A-GMRES) are slightly more computationally expensive than steps (6) and (7) in Algorithm 2 (GMRES); likewise, steps (7)–(9) in Algorithm 9 (A-GMRES(m)) are slightly more computationally expensive than steps (7) and (8) in Algorithm 3 (GMRES(m)).
Algorithm 8: A-GMRES
1: Give m_0, x_0, and ε
2: Do m = m_0, … (k = m − m_0)
3:    r_k = b − A x_k, r_k = ‖r_k‖
4:    v_1^k = r_k / ‖r_k‖
5:    V_k = [v_1^k, …, v_m^k]   (by Arnoldi process)
6:    (H̄_k^T H̄_k) y_k = r_k H̄_k^T e_1 − r_k [H̄_k^T H̄_k]_1
7:    u_k = r_k + V_k y_k
8:    x_{k+1} = x_k + (r_k^T A u_k / ‖A u_k‖^2) u_k
9:    If ‖r_{k+1}‖ < ε, stop
Algorithm 9: A-GMRES(m)
1: Give m_0, m_1, x_0, and ε
2: Do j = 1, …
3:    Do m = m_0, …, m_1 (k = m − m_0)
4:       r_k = b − A x_k, r_k = ‖r_k‖
5:       v_1^k = r_k / ‖r_k‖
6:       V_k = [v_1^k, …, v_m^k]   (by Arnoldi process)
7:       (H̄_k^T H̄_k) y_k = r_k H̄_k^T e_1 − r_k [H̄_k^T H̄_k]_1
8:       u_k = r_k + V_k y_k
9:       x_{k+1} = x_k + (r_k^T A u_k / ‖A u_k‖^2) u_k
10:      If ‖r_{k+1}‖ < ε, stop
11:   Otherwise, x_0 = x_{k+1}; go to 2
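A sketch of one A-GMRES correction (steps (3)–(8) of Algorithm 8), again assuming the arnoldi helper sketched in Section 2; the first column of H̄_k^T H̄_k is taken directly from the product, per Theorem 6.

import numpy as np

def a_gmres_step(A, b, x, m):
    r = b - A @ x
    rnorm = np.linalg.norm(r)
    Vm, _, Hbar = arnoldi(A, r, m)        # assumes the arnoldi sketch from Section 2
    G = Hbar.T @ Hbar
    e1 = np.zeros(m + 1)
    e1[0] = 1.0
    # step (6): (Hbar^T Hbar) y = ||r|| Hbar^T e1 - ||r|| [Hbar^T Hbar]_1
    y = np.linalg.solve(G, rnorm * (Hbar.T @ e1) - rnorm * G[:, 0])
    u = r + Vm @ y                        # step (7): affine descent vector
    Au = A @ u
    return x + ((r @ Au) / (Au @ Au)) * u # step (8): orthogonalized update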

6. Examples

In this section, we apply the GMRES, GMRES(m), A-GMRES, A-GMRES(m), ROGMRES, ROGMRES(m), OMPA, and MPA(m) algorithms to solve examples involving a symmetric cyclic matrix, a randomized matrix, an ill-posed Cauchy problem, a diagonally dominant matrix, and a highly ill-conditioned matrix. They are subjected to the convergence criterion ‖r_k‖ < ε, where ε is the error tolerance.

6.1. Example 1: Cyclic Matrix Linear System

A cyclic matrix, A, is generated from the first row (1, …, n). The exact solution is assumed to be x_i = 1, i = 1, …, n, and b can be obtained from Equation (1). First, we apply GMRES to this problem with n = 300 and ε = 10^{−10}. The initial values are x_i^0 = 1 + 1/i, i = 1, …, n. For this problem, we encounter the situation that η_k < 1 after 150 steps, rather than η_k = 1, as shown in Figure 1a by a solid line. As shown in Figure 1b, the residuals blow up for GMRES.
This problem shows that the original GMRES method cannot be applied to solve this linear system, since η_k < 1 occurred after 150 steps, and the iteration of GMRES blows up after around 157 steps. In this situation, by Equation (35),
‖r_{k+1}‖ > ‖r_k‖,
the residual obtained by GMRES grows step by step; hence, the GMRES method is no longer stable after 157 steps.
To overcome this drawback of GMRES, we can either relax the convergence criterion, or employ the GMRES(η) version before it diverges.
Table 1 compares the maximum error (ME) and number of steps (NS) for different methods under ε = 3 × 10^{−10}. m_0 = 10 is used in GMRES and ROGMRES; m_0 = 1 is used in GMRES(η); m_0 = 30 and m_1 = 40 are used in GMRES(m); m_0 = 35 and m_1 = 40 are used in ROGMRES(m) and GMRES(m, η); m_0 = 50 and m_1 = 50 are used in MPA(m) and A-GMRES(m). As shown in Figure 1a by a dashed line and a dashed–dotted line, both ROGMRES and ROGMRES(m) keep the value η_k = 1 up to termination; as shown in Figure 1b, they converge faster than GMRES. ROGMRES(m) converges faster than ROGMRES. This example shows that ROGMRES can stabilize GMRES when it is unstable, as shown in Figure 1b.
It is interesting that MPA(m), with a fixed value m = m_0 = m_1 = 50, can attain a highly accurate solution in ten steps, as shown in Figure 1b; A-GMRES(m) obtains ME = 5.88 × 10^{−14} in 15 steps by using m = m_0 = m_1 = 50. If the orthogonality is not adopted in A-GMRES(m), ME = 7.51 × 10^{−14} is obtained in 14 steps; this means that η_k = 1 can be kept even though the orthogonalization technique was not employed. Within ten steps, A-GMRES with m_0 = 50 can obtain ME = 7.06 × 10^{−14}. Within nine steps, A-GMRES(m) with m_0 = 50 and m_1 = 51 can obtain ME = 8.02 × 10^{−14}.
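For reproduction purposes, a sketch of the Example 1 data under the stated description (a cyclic matrix generated from the first row (1, …, n), exact solution of ones, initial values x_i^0 = 1 + 1/i); the particular circulant convention used below is our assumption.

import numpy as np

n = 300
first_row = np.arange(1.0, n + 1.0)
A = np.empty((n, n))
for i in range(n):
    A[i, :] = np.roll(first_row, i)        # row i is the first row cyclically shifted
x_exact = np.ones(n)
b = A @ x_exact                            # right-hand side from Equation (1)
x0 = 1.0 + 1.0 / np.arange(1.0, n + 1.0)   # initial values x_i^0 = 1 + 1/i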

6.2. Example 2: Random Matrix Linear System

The coefficient matrix A of size 50 × 50 is randomly generated with 1 < a_ij < 2. The exact solution is x_i = 1, i = 1, …, 50, and b can be obtained from Equation (1).
We consider the initial values x_i^0 = 0.5 and take ε = 10^{−10}. Table 2 compares ME and NS for different methods under ε = 10^{−10}. m_0 = 10 is used in ROGMRES, OMPA, A-GMRES, and GMRES(η); m_0 = 30 and m_1 = 40 are used in GMRES(m), ROGMRES(m), and GMRES(m, η). ROGMRES is better than OMPA and A-GMRES; the other methods are not accurate. For this example, the restarted ROGMRES(m) does not perform well.

6.3. Example 3: Inverse Cauchy Problem and Boundary Value Problem

Consider the inverse Cauchy problem:
Δu = u_rr + (1/r) u_r + (1/r^2) u_θθ = 0,
u(ρ, θ) = h(θ), 0 ≤ θ ≤ π,
u_n(ρ, θ) = g(θ), 0 ≤ θ ≤ π.
The method of fundamental solutions is taken as
u(x) = Σ_{j=1}^{n} c_j U(x, s_j),  s_j ∈ Ω^c,
where
U(x, s_j) = ln r_j,  r_j = ‖x − s_j‖.
We consider
u(x, y) = cos x cosh y + sin x sinh y,
ρ(θ) = exp(sin θ) sin^2(2θ) + exp(cos θ) cos^2(2θ),
where u(x, y) is an exact solution of the Laplace equation, and ρ(θ), 0 ≤ θ ≤ 2π, describes the boundary contour in polar coordinates.
By GMRES with n = 60, the residual over 60 iterations is shown in Figure 2a; it does not satisfy the convergence criterion with ε = 10^{−2}. If we run it to 70 steps, GMRES blows up. In contrast, ROGMRES converges within 72 steps. As compared in Figure 2b, the ME obtained by ROGMRES is smaller than 0.27, as shown in Figure 2c; GMRES obtains a worse result with ME = 0.5.
Because the inverse Cauchy problem is highly ill-posed, GMRES does not converge even under the loose convergence criterion ε = 10^{−2}, and after 70 steps it blows up. It is apparent that GMRES cannot solve the Cauchy problem adequately. The ROGMRES method is stable for the Cauchy problem and improves the accuracy to 0.27; however, there remains room to improve the Krylov subspace methods for solving the Cauchy problem.
Next, we consider the mixed boundary value problem. Under ε = 10^{−3}, for GMRES(m) with m_0 = 3 and m_1 = 10, as shown in Figure 3a, η_k becomes irregular after 12 steps; this causes the residuals to decrease non-monotonically, as shown in Figure 3b. GMRES(m) does not converge within 200 steps and obtains an incorrect solution with ME = 0.92. If η_k < 1/2 is used as a stopping criterion, ME = 6.64 × 10^{−3} is obtained after 14 steps. When the ROGMRES method is applied, we obtain ME = 4.16 × 10^{−4} within 187 steps. As shown in Figure 3a by a dashed line, η_k = 1 is kept well, and the residuals decrease monotonically, as shown in Figure 3b by a dashed line for ROGMRES. ME = 4.87 × 10^{−4} is obtained within 167 steps by OMPA with m_0 = 4, which converges faster than ROGMRES. ME = 4.09 × 10^{−4} is obtained within 109 steps by A-GMRES with m_0 = 4, which converges faster than ROGMRES and OMPA.

6.4. Example 4: Diagonal Dominant Linear System

Consider an example borrowed from Morgan [27]. The matrix A has main diagonal elements running from 1 to 1000; the super-diagonal elements are 0.1, and the other elements are zero. We take b = 1.
Table 3 compares the number of steps (NS) for different methods under ε = 10^{−10}. m_0 = 1 is used in OMPA and A-GMRES; m_0 = 1 and m_1 = 25 are used in the original GMRES(m), ROGMRES(m), and the current GMRES(m). The NS of ROGMRES(m) and the current GMRES(m) are the same, because η_k = 1 for the current GMRES(m) before convergence. Figure 4 compares the residuals.
The method in Equation (18) is an automatically orthogonality-preserving method. When OMPA and A-GMRES do not employ this automatic orthogonality preservation, the results are drastically different, as shown in Figure 5. Because η = 1 is not preserved, both OMPA and A-GMRES diverge very quickly after 40 steps.

6.5. Example 5: Densely Clustered Non-Symmetric Linear System

Consider an example with a_ij = (i + j/2)/n, i, j = 1, …, n, where n = 2000. Suppose that x_i = 1, i = 1, …, 2000 is the exact solution, and the initial values are x_i^0 = 0, i = 1, …, 2000. The values of the matrix elements are densely clustered in a narrow range, with a_ij ∈ [0.00075, 1.5]. This problem is a highly ill-conditioned non-symmetric linear system with a condition number over the order of 10^{15}. Table 4 compares the maximal errors (MEs) and numbers of steps (NS) for different methods under ε = 10^{−10}. m_0 = 1 is used in OMPA and A-GMRES; m_0 = 1 and m_1 = 3 are used in the current GMRES(m), and m_0 = 2 and m_1 = 3 are used in ROGMRES(m). ROGMRES(m) converges faster than the other algorithms. Even for this highly ill-conditioned problem, the proposed novel methods are highly efficient in finding accurate solutions.
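A sketch of the Example 5 data as stated (a_ij = (i + j/2)/n with n = 2000, exact solution of ones, zero initial guess); the condition-number estimate merely illustrates the reported ill-conditioning.

import numpy as np

n = 2000
i = np.arange(1, n + 1).reshape(-1, 1)
j = np.arange(1, n + 1).reshape(1, -1)
A = (i + j / 2.0) / n                      # entries densely clustered in [0.00075, 1.5]
x_exact = np.ones(n)
b = A @ x_exact
x0 = np.zeros(n)
print(np.linalg.cond(A))                   # extremely large condition number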

7. Conclusions

In this paper, the GMRES method was re-examined from the two aspects of preserving the orthogonality and maximizing the decrease in the length of the residual vector, both of which can significantly enhance the stability and accelerate the convergence speed of GMRES. If η_k = r_k^T A u_k / ‖A u_k‖^2 in GMRES is kept at η_k = 1 during the iterations, GMRES is stable; otherwise, it is unstable. For any iterative form x_{k+1} = x_k + u_k, we can improve it to x_{k+1} = x_k + (r_k^T A u_k / ‖A u_k‖^2) u_k, such that the orthogonality is preserved automatically and, simultaneously, the length of the residual vector is maximally reduced. This turns the GMRES method into a re-orthogonalized GMRES (ROGMRES) method, which preserves the orthogonality automatically and has the property of absolute convergence. We also derived the new iterative form x_{k+1} = x_k + (r_k^T A u_k / ‖A u_k‖^2) u_k from the invariant-manifold theory.
In order to find a better descent vector, we solved a maximal projection problem in a larger affine Krylov subspace to derive the descent vector. We proved that the maximal projection is equivalent to the minimal residual length in the m-dimensional affine Krylov subspace. After taking the automatic orthogonality preservation into account, a new orthogonalized maximal projection algorithm (OMPA) was developed. We derived formulas similar to GMRES in the affine Krylov subspace, namely the affine GMRES (A-GMRES) method, which had superior performance for the problems tested in the paper; it is even better than OMPA for some problems. The A-GMRES algorithm possesses two advantages: it solves a simpler least-squares problem in the affine Krylov subspace, and it preserves the orthogonality automatically.
Through the studies conducted in the paper, the novelty points and main contributions are summarized as follows.
  • GMRES was examined from the viewpoints of maximizing the decrease in the residual and maintaining the orthogonality of the consecutive residual vector.
  • A stabilization factor, η , to maintain orthogonality was inserted into GMRES to preserve the orthogonality automatically.
  • GMRES was improved in a larger space of the affine Krylov subspace.
  • A new orthogonalized maximal projection algorithm, OMPA, was derived; a new affine A-GMRES was derived.
  • The new algorithms ROGMRES, OMPA, and A-GMRES guarantee absolute convergence.
  • Through examples, we showed that the orthogonalization techniques are useful to stabilize the methods of GMRES, MPA, and A-GMRES.
  • Numerical tests on five different problems confirmed that the ROGMRES, ROGMRES(m), OMPA, OMPA(m), A-GMRES, and A-GMRES(m) methods can significantly accelerate the convergence speed compared with the original GMRES and GMRES(m) methods.

Author Contributions

Conceptualization, C.-S.L. and C.-W.C.; Methodology, C.-S.L. and C.-W.C.; Software, C.-S.L., C.-W.C. and C.-L.K.; Validation, C.-S.L., C.-W.C. and C.-L.K.; Formal analysis, C.-S.L. and C.-W.C.; Investigation, C.-S.L., C.-W.C. and C.-L.K.; Resources, C.-S.L. and C.-W.C.; Data curation, C.-S.L., C.-W.C. and C.-L.K.; Writing—original draft, C.-S.L. and C.-W.C.; Writing—review & editing, C.-W.C.; Visualization, C.-S.L., C.-W.C. and C.-L.K.; Supervision, C.-S.L. and C.-W.C.; Project administration, C.-W.C.; Funding acquisition, C.-W.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Krylov, A.N. On the numerical solution of the equation by which are determined in technical problems the frequencies of small vibrations of material systems. Izv. Akad. Nauk SSSR 1931, 7, 491–539.
  2. Saad, Y. Iterative Methods for Sparse Linear Systems, 2nd ed.; SIAM: Philadelphia, PA, USA, 2003.
  3. van der Vorst, H.A. Iterative Krylov Methods for Large Linear Systems; Cambridge University Press: New York, NY, USA, 2003.
  4. Meurant, G.; Duintjer Tebbens, J. Krylov Methods for Nonsymmetric Linear Systems: From Theory to Computations; Springer Series in Computational Mathematics; Springer: Berlin/Heidelberg, Germany, 2020; Volume 57.
  5. Godunov, S.K.; Rozhkovskaya, T. Modern Aspects of Linear Algebra; Translations of Mathematical Monographs; AMS: Providence, RI, USA, 1998.
  6. Godunov, S.K.; Gordienko, V.M. The Krylov space and the Kalman equation. Sib. Zhurnal Vychislitel'noi Mat. 1998, 1, 5–10.
  7. van den Eshof, J.; Sleijpen, G.L.G. Inexact Krylov subspace methods for linear systems. SIAM J. Matrix Anal. Appl. 2004, 26, 125–153.
  8. Bai, Z.Z. Motivations and realizations of Krylov subspace methods for large sparse linear systems. J. Comput. Appl. Math. 2015, 283, 71–78.
  9. Simoncini, V.; Szyld, D.B. Recent computational developments in Krylov subspace methods for linear systems. Numer. Linear Algebra Appl. 2007, 14, 1–59.
  10. Bouyghf, F.; Messaoudi, A.; Sadok, H. A unified approach to Krylov subspace methods for solving linear systems. Numer. Algor. 2024, 96, 305–332.
  11. Saad, Y.; Schultz, M.H. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 1986, 7, 856–869.
  12. Liu, C.S. A maximal projection solution of ill-posed linear system in a column subspace, better than the least squares solution. Comput. Math. Appl. 2014, 67, 1998–2014.
  13. Liu, C.S.; Kuo, C.L.; Chang, C.W. Solving least-squares problems via a double-optimal algorithm and a variant of Karush-Kuhn-Tucker equation for over-determined system. Algorithms 2024, 17, 211.
  14. Morgan, R.B. Implicitly restarted GMRES and Arnoldi methods for nonsymmetric systems of equations. SIAM J. Matrix Anal. Appl. 2000, 21, 1112–1135.
  15. Baker, A.H.; Jessup, E.R.; Manteuffel, T. A technique for accelerating the convergence of restarted GMRES. SIAM J. Matrix Anal. Appl. 2005, 26, 962–984.
  16. Zou, Q. GMRES algorithms over 35 years. Appl. Math. Comput. 2023, 445, 127869.
  17. Thomas, S.; Carson, E.; Rozložník, M. Iterated Gauss–Seidel GMRES. SIAM J. Sci. Comput. 2023, 46, S254–S279.
  18. Imakura, A.; Sogabe, T.; Zhang, S.L. An efficient variant of the GMRES(m) method based on the error equations. East Asian J. Appl. Math. 2012, 2, 19–32.
  19. Imakura, A.; Sogabe, T.; Zhang, S.L. A look-back-type restart for the restarted Krylov subspace methods to solve non-Hermitian systems. Jpn. J. Ind. Appl. Math. 2018, 35, 835–859.
  20. Benzi, M.; Meyer, C.D.; Tuma, M. A sparse approximate inverse preconditioner for the conjugate gradient method. SIAM J. Sci. Comput. 1996, 17, 1135–1149.
  21. Benzi, M.; Tuma, M. A sparse approximate inverse preconditioner for nonsymmetric linear systems. SIAM J. Sci. Comput. 1998, 19, 968–994.
  22. Benzi, M.; Cullum, J.K.; Tuma, M. Robust approximate inverse preconditioning for the conjugate gradient method. SIAM J. Sci. Comput. 2000, 22, 1318–1332.
  23. Benzi, M.; Bertaccini, D. Approximate inverse preconditioning for shifted linear systems. BIT Numer. Math. 2003, 43, 231–244.
  24. Baglama, J.; Calvetti, D.; Golub, G.H.; Reichel, L. Adaptively preconditioned GMRES algorithms. SIAM J. Sci. Comput. 1998, 20, 243–269.
  25. Bakhos, T.; Kitanidis, P.; Ladenheim, S.; Saibaba, A.K.; Szyld, D.B. Multipreconditioned GMRES for shifted systems. SIAM J. Sci. Comput. 2017, 39, S222–S247.
  26. Liu, Q.; Morgan, R.B.; Wilcox, W. Polynomial preconditioned GMRES and GMRES-DR. SIAM J. Sci. Comput. 2015, 37, S407–S428.
  27. Morgan, R.B. A restarted GMRES method augmented with eigenvectors. SIAM J. Matrix Anal. Appl. 1995, 16, 1154–1171.
  28. Baglama, J.; Reichel, L. Augmented GMRES-type methods. Numer. Linear Alg. Appl. 2007, 14, 337–350.
  29. Dong, Y.; Garde, H.; Hansen, P.C. R3GMRES: Including prior information in GMRES-type methods for discrete inverse problems. Electron. Trans. Numer. Anal. 2014, 42, 136–146.
  30. Carson, E.; Liesen, J.; Strakoš, Z. Towards understanding CG and GMRES through examples. Linear Alg. Appl. 2024, 692, 241–291.
  31. Erhel, J.; Burrage, K.; Pohl, B. Restarted GMRES preconditioned by deflation. J. Comput. Appl. Math. 1996, 69, 303–318.
  32. Baker, A.H.; Jessup, E.R.; Kolev, T.Z. A simple strategy for varying the parameter in GMRES(m). J. Comput. Appl. Math. 2009, 230, 751–761.
  33. Moriya, K.; Nodera, T. The deflated-GMRES(m,k) method with switching the restart frequency dynamically. Numer. Linear Alg. Appl. 2000, 7, 569–584.
  34. Sosonkina, M.; Watson, L.T.; Kapania, R.K.; Walker, H.F. A new adaptive GMRES algorithm for achieving high accuracy. Numer. Linear Alg. Appl. 1998, 5, 275–297.
  35. Zhang, L.; Nodera, T. A new adaptive restart for GMRES(m) method. ANZIAM J. 2005, 46, 409–426.
Figure 1. For Example 1: (a) η of GMRES, ROGMRES, ROGMRES(m), and MPA(m); (b) the residuals: those of GMRES blow up, while those obtained by ROGMRES, ROGMRES(m), and MPA(m) do not.
Figure 2. For Example 3, the inverse Cauchy problem: (a) the residuals, (b) a comparison of the solutions, and (c) the errors obtained by GMRES and ROGMRES.
Figure 3. For Example 3 with mixed boundary conditions: (a) η and (b) the residuals obtained by GMRES(m), ROGMRES, OMPA, and A-GMRES; the residuals obtained by GMRES(m) fail.
Figure 4. For Example 4, the residuals obtained by OMPA, A-GMRES, ROGMRES(m), the original GMRES(m), and the current GMRES(m), which coincides with that obtained by ROGMRES(m).
Figure 5. For Example 4, η obtained by OMPA and A-GMRES. If the automatic orthogonality-preserving method is not taken into account, they fail.
Table 1. For Example 1, ME and NS obtained by different methods.

Method   GMRES(m)        ROGMRES         ROGMRES(m)      GMRES(η)        GMRES(m, η)     MPA(m)          A-GMRES(m)
ME       7.39 × 10^−14   7.22 × 10^−14   6.50 × 10^−14   8.22 × 10^−14   9.04 × 10^−14   5.66 × 10^−14   5.88 × 10^−14
NS       28              44              18              16              16              10              15

Table 2. For Example 2, ME and NS obtained by different methods.

Method   GMRES(m)       ROGMRES         OMPA            A-GMRES        ROGMRES(m)     GMRES(η)       GMRES(m, η)
ME       3.51 × 10^−2   9.77 × 10^−14   1.88 × 10^−13   3.36 × 10^−9   3.51 × 10^−2   3.65 × 10^−2   3.25 × 10^−2
NS       >550           51              43              52             >550           45             163

Table 3. For Example 4, NS obtained by different methods.

Method   Original GMRES(m)   ROGMRES(m)   OMPA   A-GMRES   [27]   Current GMRES(m)
NS       >300                44           32     31        300    44

Table 4. For Example 5, ME and NS obtained by different methods.

Method   OMPA            A-GMRES          Current GMRES(m)   ROGMRES(m)
NS       4               3                9                  2
ME       2.10 × 10^−9    1.46 × 10^−11    1.18 × 10^−13      5.98 × 10^−13