Suggested Solution to Homework 5
1
Using QR factorization, one should decompose \(A\) into two matrices \(Q\) and \(R\) as follows.
\[ Q = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ 0 & \frac{2}{\sqrt{6}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} \end{pmatrix} \]
\[ R = \begin{pmatrix} \sqrt{2} & \sqrt{\frac{9}{2}} \\ 0 & \sqrt{\frac{3}{2}} \end{pmatrix} \]
Since \(Q\) has orthonormal columns, we can turn the least-squares problem \(Ax = b\) into the triangular system \(Rx = Q^T b\).
\[ \begin{pmatrix} \sqrt{2} & \sqrt{\frac{9}{2}} \\ 0 & \sqrt{\frac{3}{2}} \end{pmatrix} x = \begin{pmatrix} \frac{3}{\sqrt{2}} \\ -\frac{1}{\sqrt{6}} \end{pmatrix} \]
Finally, we get \(x = (2, -\frac{1}{3})^T\).
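As a sanity check, the triangular system can be solved by back substitution; the short Python snippet below (illustration only, not part of the required work) reproduces \(x = (2, -\frac{1}{3})^T\) from the entries of \(R\) and \(Q^T b\) stated above.

```python
import math

# Back substitution for R x = Q^T b, using the entries stated above
r11, r12, r22 = math.sqrt(2), math.sqrt(9 / 2), math.sqrt(3 / 2)
c1, c2 = 3 / math.sqrt(2), -1 / math.sqrt(6)

x2 = c2 / r22                # second row: r22 * x2 = c2
x1 = (c1 - r12 * x2) / r11   # first row:  r11 * x1 + r12 * x2 = c1

print(x1, x2)                # x1 ≈ 2, x2 ≈ -1/3
```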
2
2.1
Let \(q_j^{(m)}\) denote the \(j\)-th coordinate of \(q^{(m)}\) and \(v_k^j\) the \(j\)-th coordinate of \(v_k\). By definition, we have the following.
\[ \begin{aligned} q^{(m)} & = A^m q^{(0)} \\ & = A^m (A - \lambda_1 I) \sum_{k=1}^{n} a_k v_k \\ & = A^m \sum_{k=1}^{n} a_k (A - \lambda_1 I) v_k \\ & = A^m \sum_{k=1}^{n} a_k (\lambda_k - \lambda_1) v_k \\ & = A^m \sum_{k=2}^{n} a_k (\lambda_k - \lambda_1) v_k \\ & = \sum_{k=2}^{n} a_k (\lambda_k - \lambda_1) \lambda_k^m v_k \end{aligned} \]
(The \(k = 1\) term vanishes because \(\lambda_1 - \lambda_1 = 0\).) Now consider the \(j\)-th coordinates of \(q^{(m+1)} = Aq^{(m)}\) and \(q^{(m)}\). Since \(|\lambda_2| > |\lambda_3| > \dots > |\lambda_n| > 0\), as \(m \to \infty\) we have \(\left( \frac{\lambda_k}{\lambda_2} \right)^m \to 0\) for \(k = 3, 4, \dots, n\). Hence, we have the following.
\[ \begin{aligned} \frac{(Aq^{(m)})_j}{q_j^{(m)}} & = \frac{\sum_{k=2}^{n} a_k (\lambda_k - \lambda_1) \lambda_k^{m+1} v_k^j}{\sum_{k=2}^{n} a_k (\lambda_k - \lambda_1) \lambda_k^m v_k^j} \\ & = \frac{\lambda_2^{m+1} \sum_{k=2}^{n} a_k (\lambda_k - \lambda_1) \left( \frac{\lambda_k}{\lambda_2} \right)^{m+1} v_k^j}{\lambda_2^m \sum_{k=2}^{n} a_k (\lambda_k - \lambda_1) \left( \frac{\lambda_k}{\lambda_2} \right)^m v_k^j} \\ & \rightarrow \frac{\lambda_2^{m+1} a_2 (\lambda_2 - \lambda_1) v_2^j}{\lambda_2^m a_2 (\lambda_2 - \lambda_1) v_2^j} = \lambda_2 \text{ as } m \rightarrow \infty \end{aligned} \]
2.2
First, we forgot to state that \(A\) is a symmetric matrix, so that we can assume the eigendecomposition \(A = V \Sigma V^T\), where \(V\) and \(\Sigma\) are as follows.
\[ V = [v_1, v_2, ..., v_n] \] \[ \Sigma = \begin{pmatrix} \lambda_1 \\ & \lambda_2 \\ && \ddots \\ &&& \lambda_n \end{pmatrix} \]
In other words, we have \(A = \sum_{k=1}^{n} \lambda_k v_k v_k^T\). By the definition of \(B\), we have the following.
\[ \begin{aligned} B & = A - \lambda_1 v_1 v_1^T \\ & = \sum_{k=2}^{n} \lambda_k v_k v_k^T \\ & = V \tilde{\Sigma} V^T \end{aligned} \]
\[ \tilde{\Sigma} = \begin{pmatrix} 0 \\ & \lambda_2 \\ && \ddots \\ &&& \lambda_n \end{pmatrix} \]
Clearly, the eigenvalue of \(B\) that is largest in absolute value is \(\lambda_2\). Hence, power iteration applied to \(B\) will give an approximation of \(\lambda_2\).
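As an illustration, the Python snippet below (a minimal sketch; the 3×3 matrix, its eigenpairs, and the starting vector are choices of ours, not part of the homework) applies plain power iteration to the deflated matrix \(B = A - \lambda_1 v_1 v_1^T\) and watches a coordinate-wise ratio converge to \(\lambda_2\).

```python
import math

# Illustrative example (not from the homework): a 3x3 symmetric matrix with
# known eigenvalues 2 + sqrt(2), 2, and 2 - sqrt(2).
A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
lam1 = 2 + math.sqrt(2)
v1 = [0.5, math.sqrt(2) / 2, 0.5]          # unit eigenvector for lam1

# Deflated matrix B = A - lam1 * v1 * v1^T
B = [[A[i][j] - lam1 * v1[i] * v1[j] for j in range(3)] for i in range(3)]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

q = [1.0, 2.0, 3.0]                        # starting vector with a nonzero v2 component
for _ in range(60):
    q_new = matvec(B, q)
    ratio = q_new[0] / q[0]                # coordinate-wise ratio, here j = 0
    s = max(abs(c) for c in q_new)
    q = [c / s for c in q_new]             # renormalize to avoid overflow

print(ratio)                               # approaches lambda_2 = 2
```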
3
3.1
By expressing \(\vec{x}\) in terms of the basis \(\{ \vec{x}_k \}_{k=1}^{n}\), we have the following.
\[ \begin{aligned} \| A\vec{x} - q\vec{x} \|^2 & = \| (A-qI) \sum_{k=1}^{n} a_k \vec{x}_k \|^2 \\ & = \| \sum_{k=1}^{n} a_k (\lambda_k - q) \vec{x}_k \|^2 \\ & = \sum_{k=1}^{n} |a_k|^2 |\lambda_k - q|^2 \end{aligned} \]
The last equality follows from the Pythagorean theorem, since the \(\vec{x}_k\) are orthonormal.
3.2
From the above, we choose the \(\lambda_k\) such that \(|\lambda_k - q|\) is smallest, i.e. the eigenvalue closest to \(q\). Let \(\lambda_q\) be that eigenvalue. Then, we have the following inequality.
\[ \begin{aligned} \| A\vec{x} - q\vec{x} \|^2 & = \sum_{k=1}^{n} |a_k|^2 |\lambda_k - q|^2 \\ & \geq \sum_{k=1}^{n} |a_k|^2 |\lambda_q - q|^2 \\ & = |\lambda_q - q|^2 \sum_{k=1}^{n} |a_k|^2 \\ & = |\lambda_q - q|^2 \| \vec{x} \|^2 \end{aligned} \]
The result follows by dividing both sides by \(\| \vec{x} \|^2\) and taking the square root.
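As a concrete check of this bound (a toy example of our own choosing, not part of the homework), take a 2×2 symmetric matrix with known eigenvalues 1 and 3:

```python
import math

# Illustrative 2x2 symmetric matrix with known eigenvalues 1 and 3
A = [[2.0, 1.0],
     [1.0, 2.0]]
x = [2.0, 1.0]                       # arbitrary nonzero test vector
q = 2.8                              # arbitrary shift

Ax = [A[0][0] * x[0] + A[0][1] * x[1],
      A[1][0] * x[0] + A[1][1] * x[1]]
residual = math.sqrt((Ax[0] - q * x[0]) ** 2 + (Ax[1] - q * x[1]) ** 2)
norm_x = math.sqrt(x[0] ** 2 + x[1] ** 2)

closest = min(abs(1 - q), abs(3 - q))    # |lambda_q - q|
print(closest, residual / norm_x)        # the bound: 0.2 <= 0.6
```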
3.3
By a careful computation, we have the following.
\[ q = \frac{\vec{x}^T A \vec{x}}{\vec{x}^T \vec{x}} = \frac{22}{3} \]
\[ \frac{\| A\vec{x} - q\vec{x} \|}{ \| \vec{x} \|} = \frac{\sqrt{2}}{3} \]
Note that \(A\) is now symmetric, so its eigenvalues are real. Using the previous inequality, we have the following.
\[ \lambda_q \in \left[ \frac{22}{3} - \frac{\sqrt{2}}{3},\; \frac{22}{3} + \frac{\sqrt{2}}{3} \right] \approx [6.8619,\, 7.8047] \]
4
4.1
Inverse power iteration:
```matlab
function [r, x] = inverse_power(A, x, mu, N)
% INVERSE_POWER
%   A  : n x n matrix
%   x  : n x 1 nonzero vector
%   mu : eigenvalue shift
%   N  : number of iterations
%   r  : estimated magnitude of the eigenvalue of A closest to mu
B = A - mu * eye(size(A));
for i = 1 : N
    x = B \ x;                 % one step of shifted inverse iteration
    x = x / norm(x, inf);      % renormalize to avoid overflow/underflow
end
r = norm(A * x, inf);          % since norm(x, inf) = 1, this estimates |lambda|
end
```
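For readers without MATLAB, here is a minimal Python sketch of the same shifted inverse iteration (the 2×2 matrix and shift are illustrative choices of ours, not part of the assignment; the eigenvalue estimate uses a Rayleigh quotient rather than the inf-norm):

```python
import math

# Shifted inverse iteration on a toy symmetric matrix with eigenvalues 1 and 3
A = [[2.0, 1.0],
     [1.0, 2.0]]
mu = 0.9                                   # shift close to the eigenvalue 1
B = [[A[0][0] - mu, A[0][1]],
     [A[1][0], A[1][1] - mu]]
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]

x = [1.0, 0.0]
for _ in range(30):
    # solve B y = x by Cramer's rule (the 2x2 analogue of MATLAB's backslash)
    y = [(B[1][1] * x[0] - B[0][1] * x[1]) / det,
         (B[0][0] * x[1] - B[1][0] * x[0]) / det]
    s = max(abs(y[0]), abs(y[1]))
    x = [y[0] / s, y[1] / s]               # renormalize, as in the MATLAB code

# Rayleigh quotient: estimates the eigenvalue of A nearest the shift
Ax = [A[0][0] * x[0] + A[0][1] * x[1],
      A[1][0] * x[0] + A[1][1] * x[1]]
lam = (x[0] * Ax[0] + x[1] * Ax[1]) / (x[0] ** 2 + x[1] ** 2)
print(lam)                                 # approaches the eigenvalue 1
```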
Rayleigh quotient iteration:
```matlab
function [mu] = rqi(A, x, mu, N)
% RQI
%   A  : n x n matrix
%   x  : n x 1 nonzero vector
%   mu : initial shift; on return, the estimated eigenvalue of A
%   N  : number of iterations
I = eye(size(A));
for i = 1 : N
    x = (A - mu * I) \ x;      % inverse iteration with the current shift
    x = x / norm(x);           % renormalize
    mu = x' * A * x;           % update shift with the Rayleigh quotient
end
end
```
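Similarly, a minimal Python sketch of Rayleigh quotient iteration on a toy 2×2 symmetric matrix (again an illustrative choice of ours), showing the shift being updated at every step:

```python
import math

# Rayleigh quotient iteration on a toy symmetric matrix with eigenvalues 1 and 3
A = [[2.0, 1.0],
     [1.0, 2.0]]
mu = 0.9                                   # initial shift
x = [1.0, 0.0]

for _ in range(5):
    det = (A[0][0] - mu) * (A[1][1] - mu) - A[0][1] * A[1][0]
    if det == 0:                           # mu is (numerically) an exact eigenvalue
        break
    # solve (A - mu I) y = x by Cramer's rule
    y = [((A[1][1] - mu) * x[0] - A[0][1] * x[1]) / det,
         ((A[0][0] - mu) * x[1] - A[1][0] * x[0]) / det]
    n = math.sqrt(y[0] ** 2 + y[1] ** 2)
    x = [y[0] / n, y[1] / n]
    # update the shift with the Rayleigh quotient (x has unit norm)
    Ax = [A[0][0] * x[0] + A[0][1] * x[1],
          A[1][0] * x[0] + A[1][1] * x[1]]
    mu = x[0] * Ax[0] + x[1] * Ax[1]

print(mu)                                  # converges (cubically) to the eigenvalue 1
```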
4.2
In MATLAB, we can obtain \(A\) by `hilb(100)` and easily find its eigenvalues by `eig(A)`. The largest few (the output is in ascending order here) are:

```
    ...
    0.0493
    0.2186
    0.8214
    2.1827
```
You can see that the third largest eigenvalue, say \(\lambda^*\), is closest to \(0.5\).
\[ \lambda^* = 0.218595882370697... \]
4.3
We compare the eigenvalue approximated by the two iterations with the accurate one given by the built-in eigenvalue function in MATLAB. You can check the implementations using the following script.
```matlab
A = hilb(100);
eig_all = eig(A);              % ascending order for this symmetric matrix
eig_target = eig_all(end-2);   % third largest eigenvalue
x = ones(100, 1);
mu = 0.5;
for N = 1 : 5
    eig_inv = inverse_power(A, x, mu, N);
    eig_rqi = rqi(A, x, mu, N);
    err_inv = abs(eig_inv - eig_target);
    err_rqi = abs(eig_rqi - eig_target);
    fprintf('| %d | %e | %e |\n', N, err_inv, err_rqi);
end
```
With \(\mu = 0.5\) and \(x^{(0)} = (1, 1, ..., 1)^T\), one should obtain the following errors.
| Iteration | Error of inverse power iteration | Error of Rayleigh quotient iteration |
|---|---|---|
| 1 | 2.127156e-01 | 4.643714e-01 |
| 2 | 1.064413e+00 | 5.817227e-01 |
| 3 | 3.840291e-01 | 6.028210e-01 |
| 4 | 9.669190e-01 | 6.028497e-01 |
| 5 | 3.451331e-01 | 6.028497e-01 |
The inverse power iteration will eventually converge to the target eigenvalue. (Try more iterations!) However, the Rayleigh quotient iteration converges to another eigenvalue (the second largest one, \(0.8214\)). This shows that choosing a close enough initial guess of the eigenvalue is not enough; you may also need a close enough initial guess of the eigenvector.