Chapter 19 - Filtering and State Estimation

(19.1) What is the approximate Kalman gain matrix in the limit of small measurement noise? How is the error matrix updated in this case?

We have $\mathbf{N}_t^y \to 0$. In that case, assuming the measurement matrix $\mathbf{B}_t$ is square and invertible, $$\mathbf{K}_t = \mathbf{E}_{t|t-1}\mathbf{B}_t^T (\mathbf{B}_t\mathbf{E}_{t|t-1}\mathbf{B}_t^T + \mathbf{N}_t^y)^{-1} \approx \mathbf{E}_{t|t-1}\mathbf{B}_t^T (\mathbf{B}_t\mathbf{E}_{t|t-1}\mathbf{B}_t^T)^{-1} = \mathbf{E}_{t|t-1}\mathbf{B}_t^T (\mathbf{B}_t^T)^{-1} \mathbf{E}_{t|t-1}^{-1} \mathbf{B}_t^{-1} = \mathbf{B}_t^{-1} $$

Consequently, for the error matrix, $$ \mathbf{E}_{t|t} = (\mathbf{I}-\mathbf{K}_t\mathbf{B}_t)\mathbf{E}_{t|t-1} = (\mathbf{I}-\mathbf{B}_t^{-1}\mathbf{B}_t)\mathbf{E}_{t|t-1} = \mathbf{0} $$

Naturally, when there is no noise in the measurement, the measurement determines the state exactly, so the estimate after the update has zero error.
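As a quick numerical sanity check (a minimal sketch; the prior error matrix and measurement matrix below are arbitrary illustrative values, not taken from the text), the gain approaches $\mathbf{B}^{-1}$ and the updated error matrix approaches zero as the measurement noise shrinks:

import numpy as np

E_prior = np.array([[2.0, 0.3], [0.3, 1.0]])   # example prior error matrix
B = np.array([[1.5, 0.2], [0.0, 0.8]])         # example (invertible) measurement matrix

for ny in [1e-1, 1e-3, 1e-6]:
    Ny = ny * np.eye(2)                                        # shrinking measurement noise
    K = E_prior @ B.T @ np.linalg.inv(B @ E_prior @ B.T + Ny)  # Kalman gain
    E_post = (np.eye(2) - K @ B) @ E_prior                     # updated error matrix
    print(ny, np.abs(K - np.linalg.inv(B)).max(), np.abs(E_post).max())
# Both printed deviations shrink toward zero as ny -> 0.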

(19.2) Take as a test signal a periodically modulated sinusoid with noise added, $y_n = \sin[0.1t_n+4\sin(0.01t_n)] + \eta \equiv \sin(\theta_n) + \eta$, where $\eta$ is a Gaussian noise process with $\sigma = 0.1$. Design an extended Kalman filter to estimate the noise-free signal. Use a two-component state vector $\vec{x}_n = (\theta_n,\theta_{n-1})$, and assume for the internal model a linear extrapolation $\theta_{n+1} = \theta_n + (\theta_n - \theta_{n-1})$. Take the system noise matrix $N^x$ to be diagonal, and plot the predicted value of $y$ versus the measured value of $y$ if the standard deviation of the system noise is chosen to be $10^{-1}, 10^{-3},$ and $10^{-5}$. Use the identity matrix for the initial error estimate.

We have $\vec{x}_n = (\theta_n,\theta_{n-1})$ and $\theta_{n+1} = \theta_n + (\theta_n - \theta_{n-1}) = 2\theta_n - \theta_{n-1}$. The measurement $y_n = \sin(\theta_n) + \eta$ is nonlinear in the state, so for the extended Kalman filter we linearize it about the predicted estimate $\hat{\theta}_n$ (1st-order Taylor expansion). Written in the form of state equations, $$ \vec{x}_n = \begin{bmatrix} \theta_n \\ \theta_{n-1} \end{bmatrix} = \begin{bmatrix} 2 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \theta_{n-1} \\ \theta_{n-2} \end{bmatrix} \quad\text{and}\quad y_n = \sin(\theta_n) + \eta \approx \sin(\hat{\theta}_n) + \begin{bmatrix} \cos{\hat{\theta}_n} & 0 \end{bmatrix}\left(\vec{x}_n - \hat{\vec{x}}_n\right) + \eta $$

Thus, $\mathbf{A} \equiv \begin{bmatrix} 2 & -1 \\ 1 & 0 \end{bmatrix}$, and the linearized measurement matrix is the Jacobian $\mathbf{B}_n \equiv \begin{bmatrix} \cos{\hat{\theta}_n} & 0 \end{bmatrix}$, re-evaluated at the current state estimate at each step; the predicted measurement itself uses the full nonlinear model, $\hat{y}_n = \sin(\hat{\theta}_n)$.

A Python implementation of the (extended) Kalman filtering process follows.

In [201]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('white')
sns.set_context('notebook')

# Setting up signal
NPOINTS=1000
h = np.random.normal(loc=0, scale=0.1, size=NPOINTS)
t = np.arange(0, NPOINTS)
y = np.sin(.1*t + 4.0*np.sin(0.01*t)) + h

In [202]:
# Setting up Kalman matrices
A = np.asarray([[2., -1.], [1., 0.]])
stdx_opts = [1e-1, 1e-3, 1e-5]
Nx_opts = [np.diag([stdx**2, stdx**2]) for stdx in stdx_opts]
Ny = .1**2
I = np.eye(2)

y_preds = [[] for _ in Nx_opts]
for i, Nx in enumerate(Nx_opts):
    E = np.eye(2)                 # initial error estimate (identity)
    x = np.array([0.1, 0.0])      # initial state estimate (theta_n, theta_{n-1})
    for n in range(NPOINTS):
        # Predicted measurement from the full nonlinear model
        y_preds[i].append(np.sin(x[0]))
        # Linearized measurement matrix (Jacobian), re-evaluated at the current estimate
        B = np.array([np.cos(x[0]), 0.0])
        # Measurement update: Kalman gain, state correction, error matrix update
        K = np.matmul(E, B) / (np.matmul(B, np.matmul(E, B)) + Ny)
        x = x + K*(y[n] - y_preds[i][n])
        E = np.matmul(I - np.outer(K, B), E)
        # Time update: propagate the state and error through the internal model
        x = np.matmul(A, x)
        E = np.matmul(A, np.matmul(E, A.T)) + Nx
        
    plt.figure()
    plt.scatter(t, y, label='Measured')
    plt.plot(t, y_preds[i], label='Predicted', color='r')
    plt.xlim([0, NPOINTS])
    plt.ylim([-1.5, 1.5])
    plt.legend()
    plt.title('System Noise Standard Deviation: %.2e' % stdx_opts[i])
    plt.show()

(19.3) (a) Given an HMM trellis (Figure 19.5), work out the probability to see an internal state given the observations $p(x_t|y_1, \ldots, y_T)$ in terms of forward and backward terms.
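One way to write this, using the standard $\alpha$/$\beta$ notation for the forward and backward terms: let $\alpha_t(x_t) = p(y_1, \ldots, y_t, x_t)$ and $\beta_t(x_t) = p(y_{t+1}, \ldots, y_T \mid x_t)$. Their product is the joint probability of the state and the complete observation sequence, $\alpha_t(x_t)\beta_t(x_t) = p(x_t, y_1, \ldots, y_T)$, so normalizing over the possible states gives $$ p(x_t \mid y_1, \ldots, y_T) = \frac{\alpha_t(x_t)\,\beta_t(x_t)}{\sum_{x_t'} \alpha_t(x_t')\,\beta_t(x_t')} \; . $$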

(b) Generate observations from the model in Figure 19.4, with a fair and a biased coin as the internal states, and probabilities to switch between them.

In [140]:
# State 0 is the fair coin, state 1 is the biased coin.
# transition_probs[i, j] = P(next coin = j | current coin = i)
transition_probs = np.array([[.2, 0.8], [.45, .55]])
# generation_probs[i, k] = P(observe outcome k | coin = i); the biased coin favors outcome 1
generation_probs = np.array([[.5, .5], [.35, .65]])

N = 100
# Initialize with a uniformly chosen coin and its first flip
coins = [np.random.choice([0, 1])]
observations = [np.random.choice([0, 1], p=generation_probs[coins[-1]])]

# Run HMM: switch coins according to the transition probabilities, then flip
for _ in range(N):
    coins.append(np.random.choice([0, 1], p=transition_probs[coins[-1]]))
    observations.append(np.random.choice([0, 1], p=generation_probs[coins[-1]]))

(c) Use knowledge of this model and those observations to estimate the probabilities for which coin was used when, and compare with the correct values.
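A minimal forward–backward smoothing sketch for (c) follows, reusing `transition_probs`, `generation_probs`, `observations`, and `coins` from the cell above (so it assumes that cell has been run). It takes a uniform prior over the initial coin and rescales the forward and backward terms at each step to avoid underflow; the posterior probability of the biased coin is then plotted against the coin that was actually used.

T = len(observations)

# Forward pass: alpha[t, i] proportional to p(y_1..y_t, coin_t = i)
alpha = np.zeros((T, 2))
alpha[0] = 0.5 * generation_probs[:, observations[0]]   # uniform prior times emission
alpha[0] /= alpha[0].sum()
for t in range(1, T):
    alpha[t] = generation_probs[:, observations[t]] * (alpha[t-1] @ transition_probs)
    alpha[t] /= alpha[t].sum()                           # rescale to avoid underflow

# Backward pass: beta[t, i] proportional to p(y_{t+1}..y_T | coin_t = i)
beta = np.ones((T, 2))
for t in range(T-2, -1, -1):
    beta[t] = transition_probs @ (generation_probs[:, observations[t+1]] * beta[t+1])
    beta[t] /= beta[t].sum()

# Smoothed posterior p(coin_t | y_1..y_T)
posterior = alpha * beta
posterior /= posterior.sum(axis=1, keepdims=True)

# Compare the estimated probability of the biased coin with the coin actually used
plt.figure()
plt.plot(posterior[:, 1], label='p(biased coin | observations)')
plt.step(range(T), coins, label='Actual coin', alpha=0.5)
plt.xlabel('t')
plt.legend()
plt.show()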