Biostat 216
Let \(\mathbf{A} \in \mathbb{R}^{n \times n}\) and \[ \mathbf{A} \mathbf{x} = \lambda \mathbf{x}, \quad \mathbf{x} \ne \mathbf{0}. \] Then \(\lambda\) is an eigenvalue of \(\mathbf{A}\) with corresponding eigenvector \(\mathbf{x}\).
Note that if \(\mathbf{x}\) is an eigenvector with eigenvalue \(\lambda\), then \(\alpha \mathbf{x}\) is also an eigenvector with the same eigenvalue \(\lambda\) for any \(\alpha \ne 0\). Therefore eigenvectors are determined only up to a scaling factor; in practice we often normalize eigenvectors to have unit \(\ell_2\) norm.
From the eigen-equation \(\mathbf{A} \mathbf{x} = \lambda \mathbf{x}\), we have \[ (\lambda \mathbf{I} - \mathbf{A}) \mathbf{x} = \mathbf{0}. \] That is, the matrix \(\lambda \mathbf{I} - \mathbf{A}\) must be singular and \[ \det(\lambda \mathbf{I} - \mathbf{A}) = 0. \]
The degree-\(n\) polynomial \[ p_{\mathbf{A}}(\lambda) = \det(\lambda \mathbf{I} - \mathbf{A}) \] is called the characteristic polynomial of \(\mathbf{A}\). The eigenvalues are the \(n\) roots of \(p_{\mathbf{A}}(\lambda)\), counted with multiplicity, which exist by the fundamental theorem of algebra.
Example: For \[ \mathbf{A} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \] the characteristic polynomial is \[ p_{\mathbf{A}}(\lambda) = \det \begin{pmatrix} \lambda - 2 & -1 \\ -1 & \lambda - 2 \end{pmatrix} = \lambda^2 - 4 \lambda + 3 = (\lambda - 1)(\lambda - 3). \] Therefore \(\mathbf{A}\)’s eigenvalues are \(\lambda_1 = 1\) and \(\lambda_2 = 3\). Solving the linear system \[ \begin{pmatrix} \lambda - 2 & -1 \\ -1 & \lambda - 2 \end{pmatrix} \mathbf{x} = \mathbf{0} \] at each eigenvalue gives the corresponding eigenvectors \[ \mathbf{x}_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \quad \mathbf{x}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \] We observe that (1) \(\text{tr}(\mathbf{A}) = \lambda_1 + \lambda_2\), (2) \(\det(\mathbf{A}) = \lambda_1 \lambda_2\), and (3) the two eigenvectors are orthogonal to each other.
using LinearAlgebra
A = [2.0 1.0; 1.0 2.0]
2×2 Matrix{Float64}:
2.0 1.0
1.0 2.0
eigen(A)
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
2-element Vector{Float64}:
1.0
3.0
vectors:
2×2 Matrix{Float64}:
-0.707107 0.707107
0.707107 0.707107
A real matrix can have complex eigenvalues and eigenvectors. For example, the 90° rotation matrix \(\mathbf{Q}\) below has eigenvalues \(\pm i\).
Q = [0.0 -1.0; 1.0 0.0]
2×2 Matrix{Float64}:
0.0 -1.0
1.0 0.0
eigen(Q)
Eigen{ComplexF64, ComplexF64, Matrix{ComplexF64}, Vector{ComplexF64}}
values:
2-element Vector{ComplexF64}:
0.0 - 1.0im
0.0 + 1.0im
vectors:
2×2 Matrix{ComplexF64}:
0.707107-0.0im 0.707107+0.0im
0.0+0.707107im 0.0-0.707107im
If \(\mathbf{A} \mathbf{x} = \lambda \mathbf{x}\) and \(\mathbf{B}\) is a nonsingular matrix of the same size as \(\mathbf{A}\), then \[ (\mathbf{B} \mathbf{A} \mathbf{B}^{-1}) (\mathbf{B} \mathbf{x}) = \mathbf{B} \mathbf{A} \mathbf{x} = \lambda (\mathbf{B} \mathbf{x}). \] That is, \(\mathbf{B} \mathbf{x}\) is an eigenvector of the matrix \(\mathbf{B} \mathbf{A} \mathbf{B}^{-1}\) with the same eigenvalue \(\lambda\).
We say the matrix \(\mathbf{B} \mathbf{A} \mathbf{B}^{-1}\) is similar to \(\mathbf{A}\); by the calculation above, similar matrices share the same eigenvalues.
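A quick numerical check of this fact; the nonsingular \(\mathbf{B}\) below is an arbitrary choice for illustration:
B = [1.0 2.0; 0.0 1.0]    # any nonsingular matrix works
eigvals(B * A * inv(B))   # same eigenvalues as A: 1.0 and 3.0, up to rounding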
Geometric multiplicity (GM) of an eigenvalue \(\lambda\): the number of linearly independent eigenvectors for \(\lambda\), i.e., \(\text{dim}(\mathcal{N}(\lambda \mathbf{I} - \mathbf{A}))\).
Algebraic multiplicity (AM) of an eigenvalue \(\lambda\): the multiplicity of \(\lambda\) as a root of the characteristic polynomial \(\det(\lambda \mathbf{I} - \mathbf{A})\).
Always \(\text{GM} \le \text{AM}\).
The shortage of eigenvectors when \(\text{GM} < \text{AM}\) means that \(\mathbf{A}\) is not diagonalizable. There is no invertible matrix \(\mathbf{X}\) such that \(\mathbf{A} = \mathbf{X} \boldsymbol{\Lambda} \mathbf{X}^{-1}\).
Classical example of non-diagonalizable matrices: \[ \mathbf{A} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}. \] Here AM = 2 and GM = 1: the eigenvalue 0 is repeated twice, but there is only one independent eigenvector, \((1, 0)'\).
More examples: all three matrices \[ \begin{pmatrix} 5 & 1 \\ 0 & 5 \end{pmatrix}, \begin{pmatrix} 6 & -1 \\ 1 & 4 \end{pmatrix}, \begin{pmatrix} 7 & 2 \\ -2 & 3 \end{pmatrix} \] have AM=2 and GM=1.
eigen([0 1; 0 0])
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
2-element Vector{Float64}:
0.0
0.0
vectors:
2×2 Matrix{Float64}:
1.0 -1.0
0.0 2.00417e-292
eigen([5 1; 0 5])
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
2-element Vector{Float64}:
5.0
5.0
vectors:
2×2 Matrix{Float64}:
1.0 -1.0
0.0 1.11022e-15
eigen([6 -1; 1 4])
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
2-element Vector{Float64}:
5.0
5.0
vectors:
2×2 Matrix{Float64}:
0.707107 0.707107
0.707107 0.707107
eigen([7 2; -2 3])
Eigen{ComplexF64, ComplexF64, Matrix{ComplexF64}, Vector{ComplexF64}}
values:
2-element Vector{ComplexF64}:
5.0 - 4.2146848510894035e-8im
5.0 + 4.2146848510894035e-8im
vectors:
2×2 Matrix{ComplexF64}:
0.707107-0.0im 0.707107+0.0im
-0.707107-1.49012e-8im -0.707107+1.49012e-8im
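The “second eigenvectors” (and the tiny imaginary parts) returned above are floating-point artifacts; each of these matrices really has only one independent eigenvector. A minimal sketch of checking GM directly as \(\text{dim}(\mathcal{N}(\lambda \mathbf{I} - \mathbf{A}))\), using nullspace from LinearAlgebra:
size(nullspace(5I - [5.0 1.0; 0.0 5.0]), 2)   # GM of λ = 5 is 1, while AM = 2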
Multiplying both sides of the eigen-equation \(\mathbf{A} \mathbf{x} = \lambda \mathbf{x}\) by \(\mathbf{A}\) gives \[ \mathbf{A}^2 \mathbf{x} = \lambda \mathbf{A} \mathbf{x} = \lambda^2 \mathbf{x}, \] showing that \(\lambda^2\) is an eigenvalue of \(\mathbf{A}^2\) with eigenvector \(\mathbf{x}\).
Similarly \(\lambda^k\) is an eigenvalue of \(\mathbf{A}^k\) with eigenvector \(\mathbf{x}\).
For a diagonalizable matrix \(\mathbf{A} = \mathbf{X} \boldsymbol{\Lambda} \mathbf{X}^{-1}\), we have \[ \mathbf{A}^k = \mathbf{X} \boldsymbol{\Lambda}^k \mathbf{X}^{-1}. \] Recall that matrix multiplication is an expensive operation, \(O(n^3)\) flops. This formula suggests we just need one eigen-decomposition to evaluate matrix powers.
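A sketch of this formula with the symmetric \(\mathbf{A}\) from above, where \(\mathbf{X}\) can be taken orthogonal so that \(\mathbf{X}^{-1} = \mathbf{X}'\):
Aeig = eigen(Symmetric(A))
# X Λ^5 X' recovers A^5 from a single eigen-decomposition
Aeig.vectors * Diagonal(Aeig.values .^ 5) * Aeig.vectors' ≈ A^5   # true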
Shifting the diagonal of \(\mathbf{A}\) shifts all eigenvalues by the same amount: \[ (\mathbf{A} + s \mathbf{I}) \mathbf{x} = \lambda \mathbf{x} + s \mathbf{x} = (\lambda + s) \mathbf{x}. \]
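A one-line check with \(\mathbf{A}\) from above:
eigvals(A + 2I)   # eigenvalues 1 and 3 shifted by 2: 3.0 and 5.0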
\(\mathbf{A}\) is singular if and only if it has at least one 0 eigenvalue.
Eigenvectors associated with distinct eigenvalues are linearly independent.
Proof: Let \[ \mathbf{A} \mathbf{x}_1 = \lambda_1 \mathbf{x}_1, \quad \mathbf{A} \mathbf{x}_2 = \lambda_2 \mathbf{x}_2, \] and \(\lambda_1 \ne \lambda_2\). Suppose \(\mathbf{x}_1\) and \(\mathbf{x}_2\) are linearly dependent. Then there is \(\alpha \ne 0\) such that \(\mathbf{x}_2 = \alpha \mathbf{x}_1\). Hence \[ \alpha \lambda_1 \mathbf{x}_1 = \alpha \mathbf{A} \mathbf{x}_1 = \mathbf{A} \mathbf{x}_2 = \lambda_2 \mathbf{x}_2 = \alpha \lambda_2 \mathbf{x}_1, \] or \(\alpha (\lambda_1 - \lambda_2) \mathbf{x}_1 = \mathbf{0}\). Since \(\alpha \ne 0\) and \(\lambda_1 \ne \lambda_2\), this forces \(\mathbf{x}_1 = \mathbf{0}\), a contradiction. (The general case of more than two distinct eigenvalues follows by induction.)
The eigenvalues of a triangular matrix are its diagonal entries.
Proof: \[ p_{\mathbf{A}}(\lambda) = (\lambda - a_{11}) \cdots (\lambda - a_{nn}). \]
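For instance, a quick check:
eigvals([3.0 1.0; 0.0 7.0])   # the diagonal entries: 3.0 and 7.0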
Eigenvalues of an idempotent matrix (i.e., an oblique projector) are either 0 or 1.
Proof: \[ \lambda \mathbf{x} = \mathbf{A} \mathbf{x} = \mathbf{A} \mathbf{A} \mathbf{x} = \lambda^2 \mathbf{x}. \] So \(\lambda^2 = \lambda\), i.e., \(\lambda = 0\) or \(1\).
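A quick check with an orthogonal projector (one kind of idempotent matrix); the full-column-rank \(\mathbf{X}\) below is an arbitrary choice for illustration:
X = [1.0 0.0; 1.0 1.0; 0.0 1.0]
P = X * inv(X' * X) * X'    # projector onto C(X); satisfies P * P ≈ P
eigvals(Symmetric(P))       # ≈ 0.0, 1.0, 1.0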
Eigenvalues of an orthogonal matrix have complex modulus 1.
Proof: Since \(\mathbf{A}' \mathbf{A} = \mathbf{I}\) and \(\mathbf{A}\) is real, \[ \mathbf{x}^* \mathbf{x} = \mathbf{x}^* \mathbf{A}' \mathbf{A} \mathbf{x} = (\mathbf{A} \mathbf{x})^* (\mathbf{A} \mathbf{x}) = (\lambda \mathbf{x})^* (\lambda \mathbf{x}) = \lambda^* \lambda \, \mathbf{x}^* \mathbf{x}. \] Since \(\mathbf{x}^* \mathbf{x} \ne 0\), we have \(\lambda^* \lambda = |\lambda|^2 = 1\).
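Checking with the rotation matrix \(\mathbf{Q}\) from earlier:
abs.(eigvals(Q))   # complex moduli: 1.0 and 1.0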
Let \(\mathbf{A} \in \mathbb{R}^{n \times n}\) (not required to be diagonalizable). Then \(\text{tr}(\mathbf{A}) = \sum_i \lambda_i\) and \(\det(\mathbf{A}) = \prod_i \lambda_i\), where the eigenvalues are counted with algebraic multiplicity (HW6). The general version can be proved by the Jordan canonical form, a generalization of the eigen-decomposition.
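A numerical check on one of the non-diagonalizable matrices above (both eigenvalues equal 5); the name A2 is used here just to avoid overwriting A:
A2 = [6.0 -1.0; 1.0 4.0]
tr(A2), sum(eigvals(A2))     # (10.0, 10.0)
det(A2), prod(eigvals(A2))   # (25.0, 25.0)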
For a symmetric matrix \(\mathbf{A} \in \mathbb{R}^{n \times n}\),
1. all eigenvalues of \(\mathbf{A}\) are real;
2. eigenvectors corresponding to distinct eigenvalues are orthogonal to each other.
Proof of 1 (optional): Pre-multiplying the eigen-equation \(\mathbf{A} \mathbf{x} = \lambda \mathbf{x}\) by \(\mathbf{x}^*\) (conjugate transpose) gives \[ \mathbf{x}^* \mathbf{A} \mathbf{x} = \lambda \mathbf{x}^* \mathbf{x}. \] Write the eigenvector as \(\mathbf{x} = \mathbf{a} + \mathbf{b} i\). Then both \[ \mathbf{x}^* \mathbf{x} = (\mathbf{a} + \mathbf{b} i)^* (\mathbf{a} + \mathbf{b} i) = (\mathbf{a}' - \mathbf{b}' i)(\mathbf{a} + \mathbf{b} i) = \mathbf{a}' \mathbf{a} + \mathbf{b}' \mathbf{b} \] and \[ \mathbf{x}^* \mathbf{A} \mathbf{x} = (\mathbf{a}' - \mathbf{b}' i) \mathbf{A} (\mathbf{a} + \mathbf{b} i) = \mathbf{a}' \mathbf{A} \mathbf{a} + \mathbf{b}' \mathbf{A} \mathbf{b} \] are real numbers; the imaginary parts vanish because \(\mathbf{a}' \mathbf{b} = \mathbf{b}' \mathbf{a}\) and, by symmetry of \(\mathbf{A}\), \(\mathbf{a}' \mathbf{A} \mathbf{b} = \mathbf{b}' \mathbf{A} \mathbf{a}\). Therefore \(\lambda = \mathbf{x}^* \mathbf{A} \mathbf{x} / \mathbf{x}^* \mathbf{x}\) is a real number.
Proof of 2: Suppose \[ \mathbf{A} \mathbf{x}_1 = \lambda_1 \mathbf{x}_1, \quad \mathbf{A} \mathbf{x}_2 = \lambda_2 \mathbf{x}_2, \] and \(\lambda_1 \ne \lambda_2\). Then \[\begin{eqnarray*} (\mathbf{A} - \lambda_2 \mathbf{I}) \mathbf{x}_1 &=& (\lambda_1 - \lambda_2) \mathbf{x}_1 \\ (\mathbf{A} - \lambda_2 \mathbf{I}) \mathbf{x}_2 &=& \mathbf{0}. \end{eqnarray*}\] Thus \(\mathbf{x}_1 \in \mathcal{C}(\mathbf{A} - \lambda_2 \mathbf{I})\) and \(\mathbf{x}_2 \in \mathcal{N}(\mathbf{A} - \lambda_2 \mathbf{I})\). Since \(\mathbf{A} - \lambda_2 \mathbf{I}\) is symmetric, its column space equals its row space, so by the fundamental theorem of linear algebra \(\mathbf{x}_1 \perp \mathbf{x}_2\).
Note a symmetric matrix certainly can have complex eigenvectors. For example, if \(\mathbf{x}\) is a real eigenvector of \(\mathbf{A}\), then \(\mathbf{A} (i \mathbf{x}) = i (\mathbf{A} \mathbf{x}) = \lambda (i \mathbf{x})\). That is, \(i \mathbf{x}\) is a complex eigenvector of \(\mathbf{A}\) with the same eigenvalue. The next result is more important: it says we always have enough real eigenvectors for a symmetric matrix.
For a symmetric matrix, the algebraic multiplicity of each eigenvalue equals its geometric multiplicity, i.e., AM = GM. See StackExchange for a self-contained proof (optional).
For an eigenvalue with multiplicity greater than one, we can choose its eigenvectors to be orthogonal to each other. We also normalize each eigenvector to have unit \(\ell_2\) norm. Thus we obtain the extremely useful spectral decomposition of a symmetric matrix \[ \mathbf{A} = \mathbf{Q} \boldsymbol{\Lambda} \mathbf{Q}' = \begin{pmatrix} \mid & & \mid \\ \mathbf{q}_1 & \cdots & \mathbf{q}_n \\ \mid & & \mid \end{pmatrix} \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} \begin{pmatrix} - & \mathbf{q}_1' & - \\ & \vdots & \\ - & \mathbf{q}_n' & - \end{pmatrix} = \sum_{i=1}^n \lambda_i \mathbf{q}_i \mathbf{q}_i', \] where \(\mathbf{Q}\) is orthogonal (columns are eigenvectors) and \(\boldsymbol{\Lambda} = \text{diag}(\lambda_1, \ldots, \lambda_n)\) (diagonal entries are eigenvalues).
A = [2.0 1.0; 1.0 2.0]
2×2 Matrix{Float64}:
2.0 1.0
1.0 2.0
Aeig = eigen(Symmetric(A))
Eigen{Float64, Float64, Matrix{Float64}, Vector{Float64}}
values:
2-element Vector{Float64}:
1.0
3.0
vectors:
2×2 Matrix{Float64}:
-0.707107 0.707107
0.707107 0.707107
# eigenvectors are orthonormal
Aeig.vectors' * Aeig.vectors
2×2 Matrix{Float64}:
1.0 0.0
0.0 1.0
# reconstruct A from its spectral decomposition Q Λ Q'
Aeig.vectors * Diagonal(Aeig.values) * Aeig.vectors'
2×2 Matrix{Float64}:
2.0 1.0
1.0 2.0
For a symmetric matrix \(\mathbf{A}\), the eigenvectors corresponding to non-zero eigenvalues are a basis for \(\mathcal{C}(\mathbf{A})\). The eigenvectors corresponding to the zero eigenvalue are a basis for \(\mathcal{N}(\mathbf{A})\).
Proof: If \(\mathbf{A} \mathbf{x} = \lambda \mathbf{x}\) and \(\lambda \ne 0\), then \(\mathbf{x} = \mathbf{A} (\lambda^{-1} \mathbf{x}) \in \mathcal{C}(\mathbf{A})\). If \(\mathbf{A} \mathbf{x} = \lambda \mathbf{x}\) and \(\lambda = 0\), then \(\mathbf{A} \mathbf{x} = \mathbf{0}\), i.e., \(\mathbf{x} \in \mathcal{N}(\mathbf{A})\).
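A small numerical illustration with a singular symmetric matrix chosen just for this sketch:
S = [1.0 1.0; 1.0 1.0]     # symmetric, rank 1
Seig = eigen(Symmetric(S))
Seig.values                # 0.0 and 2.0
S * Seig.vectors[:, 1]     # ≈ zero vector: the λ = 0 eigenvector spans N(S)
# the λ = 2 eigenvector spans C(S), the line through (1, 1)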