Section 13.1 of the textbook states that a matrix $\mathbf{M}$ is unitary if its adjoint is its inverse:
$$\mathbf{M}^\dagger \cdot \mathbf{M} = \mathbf{I}$$where $\mathbf{I}$ is the identity matrix with 1s on the diagonal and 0s everywhere else.
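As a minimal illustration (my own example, not from the text), the matrix $\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ is unitary, which a couple of lines of NumPy can confirm:
import numpy as np
M = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # a simple unitary (real, so orthogonal) matrix
print(np.allclose(M.conj().T @ M, np.eye(2)))  # True: the adjoint is the inverse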
According to the text, the Discrete Fourier Transform (DFT) is defined by:
$$X_f = \frac{1}{\sqrt{N}} \sum^{N-1}_{n=0}e^{2\pi ifn/N} x_n$$for the given data vector $\{ x_0, x_1, \dots, x_{N-1} \}$
Build the DFT as a matrix.
Define $$M_{f,n}=\frac{e^{2 \pi i f n / N}}{\sqrt{N}}$$
Then continue: $$X_f = \sum_{n=0}^{N-1} M_{f,n} x_n $$
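As a sketch (my own, not from the text), this matrix form can be built and applied directly in NumPy; with the $+i$ sign convention and $1/\sqrt{N}$ normalization used here, it agrees with $\sqrt{N}$ times numpy.fft.ifft:
import numpy as np
N = 8
n = np.arange(N)
M = np.exp(2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)  # M[f, n] = e^(2πi f n / N) / √N
x = np.random.normal(size=N)                         # some test data
X = M @ x                                            # X_f = Σ_n M_{f,n} x_n
print(np.allclose(X, np.sqrt(N) * np.fft.ifft(x)))   # True: matches the FFT up to conventions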
And because: $$(\mathbf{M}^\dagger)_{jk} = M_{kj}^* $$
we can write an element of the product as: $$(\mathbf{M}^\dagger \cdot \mathbf{M})_{jl} = \sum^{N-1}_{k=0} M_{kj}^* M_{kl}$$
Taking the complex conjugate of a complex number means reflecting it across the real axis: $$(a+bi)^* = a-bi$$In the exponent $2 \pi f n / N$ everything is a real number, so conjugation only flips the sign of the exponent: $$(e^{ix})^* = e^{-ix} \quad \text{for real } x$$Pulling the two factors of $1/\sqrt{N}$ out in front as $1/N$:
$$=\frac{1}{N} \sum^{N-1}_{k=0} e^{(-2 \pi i k j / N + 2 \pi i k l / N)}$$Simplifying once more:
$$=\frac{1}{N} \sum^{N-1}_{k=0} e^{2 \pi i k (l-j) / N}$$What happens when $l = j$? Then we are on the diagonal, and the result should be 1. Here is the proof:
$$(\mathbf{M}^\dagger \cdot \mathbf{M})_{jj} = 1$$Setting $l = j$ in $$(\mathbf{M}^\dagger \cdot \mathbf{M})_{jl} = \frac{1}{N} \sum^{N-1}_{k=0} e^{2 \pi i k (l-j) / N}$$gives $l-j = 0$, so every exponent is 0 and every term equals 1: $$e^0 = 1$$The sum therefore contains $N$ terms of 1, and this factor of $N$ cancels the $1/N$ in front: $$\frac{1}{N} \cdot N = 1$$This is the expected value of 1 on the diagonal.
What happens when $l$ is not equal to $j$?
Take $N = 4$ and $l - j = 1$ as an example: the four terms of the sum are $e^{0}, e^{2 \pi i /4}, e^{4 \pi i /4}, e^{6 \pi i /4} = 1, i, -1, -i$, which sum to zero. In general, for $l \neq j$ the sum is a geometric series over the $N$-th roots of unity: $$\sum^{N-1}_{k=0} e^{2 \pi i k (l-j) / N} = \frac{1 - e^{2 \pi i (l-j)}}{1 - e^{2 \pi i (l-j) / N}} = 0$$since $l - j$ is a non-zero integer. So every off-diagonal element vanishes: $$(\mathbf{M}^\dagger \cdot \mathbf{M})_{jl} = 0 \quad \text{for } l \neq j$$Together with the diagonal result this gives $\mathbf{M}^\dagger \cdot \mathbf{M} = \mathbf{I}$, so the DFT is unitary.
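A quick numerical check of this result (my own sketch, using the same $N = 4$):
import numpy as np
N = 4
n = np.arange(N)
M = np.exp(2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)  # the DFT matrix defined above
product = M.conj().T @ M                                  # M† · M
print(np.round(product.real, 10))                         # the 4x4 identity matrix
print(np.allclose(product, np.eye(N)))                    # True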
The next problem considers a measurement of a three-component vector $\vec{x}$, where $x_1$ and $x_2$ are drawn from a Gaussian distribution with zero mean and unit variance, and $x_3 = x_1 + x_2$.
(a) Analytically calculate the covariance matrix of $\vec{x}$.
(b) What are the eigenvalues?
(c) Numerically verify these results by drawing a data set from the distribution and computing the covariance matrix and eigenvalues.
(d) Numerically find the eigenvectors of the covariance matrix, and use them to construct a transformation to a new set of variables $\vec{y}$ that have a diagonal covariance matrix with no zero eigenvalues. Verify this on the data set.
To answer this, we need to look at Section 13.4 of the textbook, Principal Components. The three components of the vector are defined as follows, where $N(\mu, \sigma^2)$ denotes a normal distribution: the first argument is the mean ($\mu$), set to zero, and the second argument is the variance ($\sigma^2$), equal to one:
$$ \begin{align} x_1 & = N(0,1)\\ x_2 & = N(0,1)\\ x_3 & = x_1 + x_2 \end{align} $$From this, we can write out a sampled data set:
$$ \begin{align} x_1 & = (x_1^{(0)}, x_1^{(1)}, \dots, x_1^{(999)})\\ x_2 & = (x_2^{(0)}, x_2^{(1)}, \dots, x_2^{(999)})\\ x_3 & = (x_1^{(0)} + x_2^{(0)}, x_1^{(1)} + x_2^{(1)}, \dots , x_1^{(999)} + x_2^{(999)} ) \end{align} $$for 1000 samples. We also know the expected values, written between angle brackets $\langle \cdot \rangle$:
$$ \begin{align} \langle x_1 \rangle = \langle x_2 \rangle & = 0 = \text{mean } (\mu)\\ \langle x_1^2 \rangle = \langle x_2^2 \rangle & = 1 = \text{variance } (\sigma^2) \end{align} $$The measurement vector is:
$$\vec x = (x_1, x_2, x_1+x_2)$$Resulting in the covariance matrix:
$$\mathbf{C}= \left\langle \left(\vec x - \langle \vec x \rangle \right) \cdot \left( \vec x - \langle \vec x \rangle \right)^T \right\rangle $$where $\langle \vec x \rangle$ is:
$$ \begin{align} \langle \vec x \rangle & = \left( \langle x_1 \rangle , \langle x_2 \rangle , \langle x_1 + x_2 \rangle \right) \\ & = (0, 0, 0) \end{align} $$The product of the column vector and its transpose (a row vector) is an outer product, multiplying the rows of the first by the columns of the second:
$$C= \left\langle \begin{pmatrix} x_1 \\ x_2 \\ x_1 + x_2 \end{pmatrix} \begin{pmatrix} x_1 , & x_2 , & x_1 + x_2 \end{pmatrix} \right\rangle $$Resulting in the following 3x3 matrix:
$$ C = \begin{bmatrix} x_1^2 & x_1 x_2 & x_1(x_1 + x_2) \\ x_1 x_2 & x_2^2 & x_2(x_1 + x_2) \\ x_1(x_1 + x_2) & x_2(x_1 + x_2) & (x_1 + x_2)^2 \end{bmatrix} $$Taking the expected value of each entry:
$$ C = \begin{bmatrix} \langle x_1^2 \rangle & \langle x_1 x_2 \rangle & \langle x_1(x_1 + x_2) \rangle\\ \langle x_1 x_2 \rangle & \langle x_2^2 \rangle & \langle x_2(x_1 + x_2) \rangle \\ \langle x_1(x_1 + x_2) \rangle & \langle x_2(x_1 + x_2) \rangle & \langle (x_1 + x_2)^2 \rangle \end{bmatrix} $$Then plug in the values we know:
$$ \begin{align} \langle x_1 \rangle = \langle x_2 \rangle & = 0 = \text{mean } (\mu)\\ \langle x_1^2 \rangle = \langle x_2^2 \rangle & = 1 = \text{variance } (\sigma^2) \\ \langle x_1 x_2 \rangle & = 0 \quad \text{($x_1$ and $x_2$ are independent)} \\ \langle x_1 (x_1 + x_2) \rangle & = \langle x_1^2 \rangle + \langle x_1 x_2 \rangle = 1 + 0 = 1 \\ \langle (x_1 + x_2)^2 \rangle & = \langle x_1^2 + 2x_1 x_2 + x_2^2 \rangle = 1 + 0 + 1 = 2 \end{align} $$This results in the covariance matrix of $\vec{x}$:
$$ C = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 2 \end{bmatrix} $$For part (b) we need the eigenvalues of this matrix; an MIT math chapter on eigenvalues and eigenvectors covers the background. The symbol for an eigenvalue is $\lambda$.
The determinant is a scalar value computed from the entries of a square matrix by a fixed rule. For a 2x2 matrix the determinant is:
$$\det \begin{bmatrix} a & b \\ c & d \\ \end{bmatrix} = ad - bc $$And the determinant of a 3x3 matrix is: $$\det \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = aei + bfg + cdh - ceg - bdi - afh $$
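As a quick worked check (my own addition), applying the 3x3 rule to the covariance matrix found above gives $$\det \mathbf{C} = (1)(1)(2) + (0)(1)(1) + (1)(0)(1) - (1)(1)(1) - (0)(0)(2) - (1)(1)(1) = 2 - 1 - 1 = 0$$Since the determinant is the product of the eigenvalues, this already tells us that one eigenvalue must be zero.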
For a matrix $\mathbf{C}$, the eigenvalues are the solutions of:
$$\det(\mathbf{C} - \lambda \mathbf{I}) = 0$$Plugging in the covariance matrix:
$$ \det \begin{bmatrix} 1 - \lambda & 0 & 1 \\ 0 & 1 - \lambda & 1 \\ 1 & 1 & 2 - \lambda \end{bmatrix} = 0 $$We get a cubic equation: $$(1-\lambda)^2 (2-\lambda) - 2(1-\lambda) = 0$$
Factoring out $(1-\lambda)$ leaves $(1-\lambda)(\lambda^2 - 3\lambda) = 0$, so the three eigenvalues are the solutions $\lambda = 0$, $\lambda = 1$, $\lambda = 3$.
We can also solve this for $\lambda$ symbolically; this is where SymPy comes in.
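A minimal SymPy sketch (my own check, not part of the original solution) that recovers the same eigenvalues directly from the matrix:
from sympy import Matrix
C = Matrix([[1, 0, 1], [0, 1, 1], [1, 1, 2]])
print(C.eigenvals())  # eigenvalues with multiplicities: 0, 1 and 3, each appearing once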
Alternatively, the numpy.linalg.eig routine returns both the eigenvalues and the eigenvectors of a matrix. Let's start with a basic example:
import numpy as np
from numpy import linalg as LA
w, v = LA.eig(np.diag((1, 2, 3)))  # w holds the eigenvalues, v the eigenvectors (as columns)
print('Eigenvalues')
print(w)
print('Eigenvectors')
print(v)
Eigenvalues
[1. 2. 3.]
Eigenvectors
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
# Experiment with the 2x2 determinant rule ad - bc
# a = 0, b = 1, c = 2, d = 3
import numpy as np
from numpy import linalg as LA
M = np.array([[0, 1], [2, 3]])
w, v = LA.eig(M)  # w: eigenvalues, v: eigenvectors (as columns)
print('Matrix M')
print(M)
print('Eigenvalues')
print(w)
print('Eigenvectors')
print(v)
Matrix M
[[0 1]
 [2 3]]
Eigenvalues
[-0.56155281  3.56155281]
Eigenvectors
[[-0.87192821 -0.27032301]
 [ 0.48963374 -0.96276969]]
# Create an array of the covariance matrix
import numpy as np
M = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 2]])
print('Matrix M')
print(M)
Matrix M
[[1 0 1]
 [0 1 1]
 [1 1 2]]
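The analytic eigenvalues can be checked directly with numpy.linalg.eigvals (a small sketch of my own; the order of the returned eigenvalues may vary):
import numpy as np
M = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 2]])  # the analytic covariance matrix from above
eig = np.linalg.eigvals(M)                       # expected: 0, 1 and 3, up to floating-point noise
print('Eigenvalues')
print(np.round(eig, 2))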
To verify these results numerically, I used numpy.random.normal to generate the Gaussian samples, numpy.cov for the covariance matrix, and numpy.linalg.eigvals to calculate the eigenvalues. Results are rounded with numpy.round().
import matplotlib.pyplot as plt
import numpy as np
µ = 0 # Mean
σ = 1 # Standard deviation
x1 = np.random.normal(µ,σ,1000)
x2 = np.random.normal(µ,σ,1000)
x = np.array([x1, x2, x1+x2])
C = np.cov(x) # Covariance matrix
eig = np.linalg.eigvals(C) # Calculate eigenvalues.
print('Covariance Matrix')
print(np.round(C,2)) # Round at two decimals
print('Eigenvalues')
print(np.round(eig,2))
Covariance Matrix
[[0.97 0.01 0.98]
 [0.01 1.07 1.07]
 [0.98 1.07 2.05]]
Eigenvalues
[3.08 1.01 0.  ]
These agree with the analytic covariance matrix and with the eigenvalues 3, 1, and 0 found above.
For part (d), the covariance matrix is again calculated with numpy.cov, and this is fed into numpy.linalg.eig, which returns both the eigenvalues and the eigenvectors. The eigenvectors belonging to non-zero eigenvalues define the transformation to the new variables $\vec{y}$; the matrix product is calculated with numpy.matmul.
import matplotlib.pyplot as plt
import numpy as np
µ = 0 # Mean
σ = 1 # Standard deviation
x1 = np.random.normal(µ,σ,1000)
x2 = np.random.normal(µ,σ,1000)
x = np.array([x1, x2, x1+x2])
C = np.cov(x) # Covariance matrix
eigval, eigvec = np.linalg.eig(C) # Eigenvalues and eigenvectors (as columns)
new_eigvec = eigvec[:, eigval > 1e-6] # Keep only the eigenvectors with non-zero eigenvalues
y = np.matmul(new_eigvec.T, x) # Transform the dataset: y = V^T x
# Print new outputs
C_y = np.cov(y)
eigy = np.linalg.eigvals(C_y)
print('Covariance Matrix')
print(np.round(C_y,2))
print('Eigenvalues')
print(np.round(eigy,2))
With this transformation the covariance matrix of $\vec{y}$ comes out diagonal (up to rounding and sampling noise), and its two eigenvalues match the non-zero analytic eigenvalues of approximately 3 and 1, which verifies the result on the data set.