Suggested Solution to Homework 5
1
Using QR factorization, one should decompose \(A\) into two matrices \(Q\) and \(R\) as follows.
\[ Q = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ 0 & \frac{2}{\sqrt{6}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} \end{pmatrix} \]
\[ R = \begin{pmatrix} \sqrt{2} & \sqrt{\frac{9}{2}} \\ 0 & \sqrt{\frac{3}{2}} \end{pmatrix} \]
Since \(Q\) has orthonormal columns, we can turn the least-squares problem \(Ax = b\) into the triangular system \(Rx = Q^T b\).
\[ \begin{pmatrix} \sqrt{2} & \sqrt{\frac{9}{2}} \\ 0 & \sqrt{\frac{3}{2}} \end{pmatrix} x = \begin{pmatrix} \frac{3}{\sqrt{2}} \\ -\frac{1}{\sqrt{6}} \end{pmatrix} \]
Finally, we get \(x = (2, -\frac{1}{3})^T\).
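As a sanity check, the triangular system can be solved by back substitution; the short Python snippet below (illustration only, not part of the required work) reproduces \(x = (2, -\frac{1}{3})^T\) from the entries of \(R\) and \(Q^T b\) stated above.

```python
import math

# Back substitution for R x = Q^T b, using the entries stated above
r11, r12, r22 = math.sqrt(2), math.sqrt(9 / 2), math.sqrt(3 / 2)
c1, c2 = 3 / math.sqrt(2), -1 / math.sqrt(6)

x2 = c2 / r22                # second row: r22 * x2 = c2
x1 = (c1 - r12 * x2) / r11   # first row:  r11 * x1 + r12 * x2 = c1

print(x1, x2)                # x1 ≈ 2, x2 ≈ -1/3
```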
2
2.1
Let \(q_j^{(m)}\) denote the \(j\)-th coordinate of \(q^{(m)}\) and \(v_k^j\) the \(j\)-th coordinate of \(v_k\). By definition, we have the following.
\[ \begin{aligned} q^{(m)} & = A^m q^{(0)} \\ & = A^m (A - \lambda_1 I) \sum_{k=1}^{n} a_k v_k \\ & = A^m \sum_{k=1}^{n} a_k (A - \lambda_1 I) v_k \\ & = A^m \sum_{k=1}^{n} a_k (\lambda_k - \lambda_1) v_k \\ & = A^m \sum_{k=2}^{n} a_k (\lambda_k - \lambda_1) v_k \\ & = \sum_{k=2}^{n} a_k (\lambda_k - \lambda_1) \lambda_k^m v_k \end{aligned} \]
(The \(k = 1\) term vanishes because \(\lambda_1 - \lambda_1 = 0\).) Now consider the \(j\)-th coordinates of \(q^{(m+1)} = Aq^{(m)}\) and \(q^{(m)}\). Since \(|\lambda_2| > |\lambda_3| > \dots > |\lambda_n| > 0\), as \(m \to \infty\) we have \(\left( \frac{\lambda_k}{\lambda_2} \right)^m \to 0\) for \(k = 3, 4, \dots, n\). Hence, we have the following.
\[ \begin{aligned} \frac{(Aq^{(m)})_j}{q_j^{(m)}} & = \frac{\sum_{k=2}^{n} a_k (\lambda_k - \lambda_1) \lambda_k^{m+1} v_k^j}{\sum_{k=2}^{n} a_k (\lambda_k - \lambda_1) \lambda_k^m v_k^j} \\ & = \frac{\lambda_2^{m+1} \sum_{k=2}^{n} a_k (\lambda_k - \lambda_1) \left( \frac{\lambda_k}{\lambda_2} \right)^{m+1} v_k^j}{\lambda_2^m \sum_{k=2}^{n} a_k (\lambda_k - \lambda_1) \left( \frac{\lambda_k}{\lambda_2} \right)^m v_k^j} \\ & \rightarrow \frac{\lambda_2^{m+1} a_2 (\lambda_2 - \lambda_1) v_2^j}{\lambda_2^m a_2 (\lambda_2 - \lambda_1) v_2^j} = \lambda_2 \text{ as } m \rightarrow \infty \end{aligned} \]
2.2
First, we forgot to state that \(A\) is a symmetric matrix, so that we can assume the eigendecomposition \(A = V \Sigma V^T\), where \(V\) and \(\Sigma\) are as follows.
\[ V = [v_1, v_2, ..., v_n] \] \[ \Sigma = \begin{pmatrix} \lambda_1 \\ & \lambda_2 \\ && \ddots \\ &&& \lambda_n \end{pmatrix} \]
In other words, we have \(A = \sum_{k=1}^{n} \lambda_k v_k v_k^T\). By the definition of \(B\), we have the following.
\[ \begin{aligned} B & = A - \lambda_1 v_1 v_1^T \\ & = \sum_{k=2}^{n} \lambda_k v_k v_k^T \\ & = V \tilde{\Sigma} V^T \end{aligned} \]
\[ \tilde{\Sigma} = \begin{pmatrix} 0 \\ & \lambda_2 \\ && \ddots \\ &&& \lambda_n \end{pmatrix} \]
Clearly, the eigenvalue of \(B\) that is largest in absolute value is \(\lambda_2\). Hence, power iteration applied to \(B\) will give an approximation of \(\lambda_2\).
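As an illustration, the Python snippet below (a minimal sketch; the 3×3 matrix, its eigenpairs, and the starting vector are choices of ours, not part of the homework) applies plain power iteration to the deflated matrix \(B = A - \lambda_1 v_1 v_1^T\) and watches a coordinate-wise ratio converge to \(\lambda_2\).

```python
import math

# Illustrative example (not from the homework): a 3x3 symmetric matrix with
# known eigenvalues 2 + sqrt(2), 2, and 2 - sqrt(2).
A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
lam1 = 2 + math.sqrt(2)
v1 = [0.5, math.sqrt(2) / 2, 0.5]          # unit eigenvector for lam1

# Deflated matrix B = A - lam1 * v1 * v1^T
B = [[A[i][j] - lam1 * v1[i] * v1[j] for j in range(3)] for i in range(3)]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

q = [1.0, 2.0, 3.0]                        # starting vector with a nonzero v2 component
for _ in range(60):
    q_new = matvec(B, q)
    ratio = q_new[0] / q[0]                # coordinate-wise ratio, here j = 0
    s = max(abs(c) for c in q_new)
    q = [c / s for c in q_new]             # renormalize to avoid overflow

print(ratio)                               # approaches lambda_2 = 2
```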
3
3.1
By expressing \(\vec{x}\) in terms of the basis \(\{ \vec{x}_k \}_{k=1}^{n}\), we have the following.
\[ \begin{aligned} \| A\vec{x} - q\vec{x} \|^2 & = \| (A-qI) \sum_{k=1}^{n} a_k \vec{x}_k \|^2 \\ & = \| \sum_{k=1}^{n} a_k (\lambda_k - q) \vec{x}_k \|^2 \\ & = \sum_{k=1}^{n} |a_k|^2 |\lambda_k - q|^2 \end{aligned} \]
The last equality follows from the Pythagorean theorem, since the \(\vec{x}_k\) are orthonormal.
3.2
From the above, we choose the \(\lambda_k\) such that \(|\lambda_k - q|\) is smallest, i.e. the eigenvalue closest to \(q\). Let \(\lambda_q\) be that eigenvalue. Then, we have the following inequality.
\[ \begin{aligned} \| A\vec{x} - q\vec{x} \|^2 & = \sum_{k=1}^{n} |a_k|^2 |\lambda_k - q|^2 \\ & \geq \sum_{k=1}^{n} |a_k|^2 |\lambda_q - q|^2 \\ & = |\lambda_q - q|^2 \sum_{k=1}^{n} |a_k|^2 \\ & = |\lambda_q - q|^2 \| \vec{x} \|^2 \end{aligned} \]
The result follows by dividing both sides by \(\| \vec{x} \|^2\) and taking the square root.
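As a concrete check of this bound (a toy example of our own choosing, not part of the homework), take a 2×2 symmetric matrix with known eigenvalues 1 and 3:

```python
import math

# Illustrative 2x2 symmetric matrix with known eigenvalues 1 and 3
A = [[2.0, 1.0],
     [1.0, 2.0]]
x = [2.0, 1.0]                       # arbitrary nonzero test vector
q = 2.8                              # arbitrary shift

Ax = [A[0][0] * x[0] + A[0][1] * x[1],
      A[1][0] * x[0] + A[1][1] * x[1]]
residual = math.sqrt((Ax[0] - q * x[0]) ** 2 + (Ax[1] - q * x[1]) ** 2)
norm_x = math.sqrt(x[0] ** 2 + x[1] ** 2)

closest = min(abs(1 - q), abs(3 - q))    # |lambda_q - q|
print(closest, residual / norm_x)        # the bound: 0.2 <= 0.6
```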
3.3
By a careful computation, we have the following.
\[ q = \frac{\vec{x}^T A \vec{x}}{\vec{x}^T \vec{x}} = \frac{22}{3} \]
\[ \frac{\| A\vec{x} - q\vec{x} \|}{ \| \vec{x} \|} = \frac{\sqrt{2}}{3} \]
Note that \(A\) is now symmetric, so its eigenvalues are real. Using the previous inequality, we have the following.
\[ \lambda_q \in \left[ \frac{22}{3} - \frac{\sqrt{2}}{3},\; \frac{22}{3} + \frac{\sqrt{2}}{3} \right] \approx [6.8619,\, 7.8047] \]
4
4.1
Inverse power iteration:
```matlab
function [r, x] = inverse_power(A, x, mu, N)
% INVERSE_POWER
%   A  : n x n matrix
%   x  : n x 1 nonzero vector
%   mu : eigenvalue shift
%   N  : number of iterations
%   r  : estimated magnitude of the eigenvalue of A closest to mu
B = A - mu * eye(size(A));
for i = 1 : N
    x = B \ x;                 % one step of shifted inverse iteration
    x = x / norm(x, inf);      % renormalize to avoid overflow/underflow
end
r = norm(A * x, inf);          % since norm(x, inf) = 1, this estimates |lambda|
end
```
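For readers without MATLAB, here is a minimal Python sketch of the same shifted inverse iteration (the 2×2 matrix and shift are illustrative choices of ours, not part of the assignment; the eigenvalue estimate uses a Rayleigh quotient rather than the inf-norm):

```python
import math

# Shifted inverse iteration on a toy symmetric matrix with eigenvalues 1 and 3
A = [[2.0, 1.0],
     [1.0, 2.0]]
mu = 0.9                                   # shift close to the eigenvalue 1
B = [[A[0][0] - mu, A[0][1]],
     [A[1][0], A[1][1] - mu]]
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]

x = [1.0, 0.0]
for _ in range(30):
    # solve B y = x by Cramer's rule (the 2x2 analogue of MATLAB's backslash)
    y = [(B[1][1] * x[0] - B[0][1] * x[1]) / det,
         (B[0][0] * x[1] - B[1][0] * x[0]) / det]
    s = max(abs(y[0]), abs(y[1]))
    x = [y[0] / s, y[1] / s]               # renormalize, as in the MATLAB code

# Rayleigh quotient: estimates the eigenvalue of A nearest the shift
Ax = [A[0][0] * x[0] + A[0][1] * x[1],
      A[1][0] * x[0] + A[1][1] * x[1]]
lam = (x[0] * Ax[0] + x[1] * Ax[1]) / (x[0] ** 2 + x[1] ** 2)
print(lam)                                 # approaches the eigenvalue 1
```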
Rayleigh quotient iteration:
```matlab
function [mu] = rqi(A, x, mu, N)
% RQI
%   A  : n x n matrix
%   x  : n x 1 nonzero vector
%   mu : initial shift; on return, the estimated eigenvalue of A
%   N  : number of iterations
I = eye(size(A));
for i = 1 : N
    x = (A - mu * I) \ x;      % inverse iteration with the current shift
    x = x / norm(x);           % renormalize
    mu = x' * A * x;           % update shift with the Rayleigh quotient
end
end
```
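Similarly, a minimal Python sketch of Rayleigh quotient iteration on a toy 2×2 symmetric matrix (again an illustrative choice of ours), showing the shift being updated at every step:

```python
import math

# Rayleigh quotient iteration on a toy symmetric matrix with eigenvalues 1 and 3
A = [[2.0, 1.0],
     [1.0, 2.0]]
mu = 0.9                                   # initial shift
x = [1.0, 0.0]

for _ in range(5):
    det = (A[0][0] - mu) * (A[1][1] - mu) - A[0][1] * A[1][0]
    if det == 0:                           # mu is (numerically) an exact eigenvalue
        break
    # solve (A - mu I) y = x by Cramer's rule
    y = [((A[1][1] - mu) * x[0] - A[0][1] * x[1]) / det,
         ((A[0][0] - mu) * x[1] - A[1][0] * x[0]) / det]
    n = math.sqrt(y[0] ** 2 + y[1] ** 2)
    x = [y[0] / n, y[1] / n]
    # update the shift with the Rayleigh quotient (x has unit norm)
    Ax = [A[0][0] * x[0] + A[0][1] * x[1],
          A[1][0] * x[0] + A[1][1] * x[1]]
    mu = x[0] * Ax[0] + x[1] * Ax[1]

print(mu)                                  # converges (cubically) to the eigenvalue 1
```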
4.2
In MATLAB, we can obtain \(A\) by `hilb(100)` and easily find its eigenvalues by `eig(A)`. The largest few (the output is in ascending order here) are:

```
    ...
    0.0493
    0.2186
    0.8214
    2.1827
```
You can see that the third largest eigenvalue, say \(\lambda^*\), is closest to \(0.5\).
\[ \lambda^* = 0.218595882370697... \]
4.3
We compare the eigenvalue approximated by the two iterations with the accurate one given by the built-in eigenvalue function in MATLAB. You can check the implementations using the following script.
```matlab
A = hilb(100);
eig_all = eig(A);              % ascending order for this symmetric matrix
eig_target = eig_all(end-2);   % third largest eigenvalue
x = ones(100, 1);
mu = 0.5;
for N = 1 : 5
    eig_inv = inverse_power(A, x, mu, N);
    eig_rqi = rqi(A, x, mu, N);
    err_inv = abs(eig_inv - eig_target);
    err_rqi = abs(eig_rqi - eig_target);
    fprintf('| %d | %e | %e |\n', N, err_inv, err_rqi);
end
```
With \(\mu = 0.5\) and \(x^{(0)} = (1, 1, ..., 1)^T\), one should obtain the following errors.
| Iteration | Error of inverse power iteration | Error of Rayleigh quotient iteration |
|---|---|---|
| 1 | 2.127156e-01 | 4.643714e-01 |
| 2 | 1.064413e+00 | 5.817227e-01 |
| 3 | 3.840291e-01 | 6.028210e-01 |
| 4 | 9.669190e-01 | 6.028497e-01 |
| 5 | 3.451331e-01 | 6.028497e-01 |
The inverse power iteration will eventually converge to the target eigenvalue. (Try more iterations!) However, the Rayleigh quotient iteration converges to another eigenvalue (the second largest one, \(0.8214\)). This shows that choosing a close enough initial guess of the eigenvalue is not enough; you may also need a close enough initial guess of the eigenvector.