NMM week 10: Functions & Density Estimation

Question 1: Functions

Take as a data set x = {−10, −9, …, 9, 10}, and y(x) = 0 if x ≤ 0 and y(x) = 1 if x > 0

In [76]:
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(-10, 11)
y = np.where(x <= 0, 0, 1)

print('x =', x)
print('y =', y)

print('Mean:', np.mean(y))
print('Standard deviation:', np.std(y))
print('Min:', np.min(y))
print('Max:', np.max(y))

plt.plot(x, y, 'o', color='pink')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Plot of y vs. x')
plt.show()
x = [-10  -9  -8  -7  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6   7
   8   9  10]
y = [0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1]
Mean: 0.47619047619047616
Standard deviation: 0.49943278484292924
Min: 0
Max: 1

a) Fit the data with a polynomial with 5, 10, and 15 terms, using a pseudo-inverse of the Vandermonde matrix.

We're using a singular value decomposition (SVD), as learned in a previous week, and letting NumPy calculate it.

Change the integers into floats by passing dtype=float to arange.

$$f(x)=y$$

We're fitting a polynomial, so for 5 terms we're assuming:

$$a_0 + a_1x + a_2x^2 + a_3x^3 + a_4x^4$$

Next steps:

  • Plug this in.
  • It becomes a matrix multiplication.
  • Turn each x value (e.g. −10) into a vector of its powers.
  • Make a row for each of these vectors.
  • Together the rows build the Vandermonde matrix.
  • Pseudo-invert the matrix: a singular value decomposition (SVD) is the way to do this.

U and V are unitary matrices, and Σ is diagonal (everything off the diagonal is zero):

$$M = U \Sigma V^*$$

In this assignment we just work with real numbers, no complex numbers, so u and vh are orthogonal.

$$M \vec a = \vec y$$

But it can't be solved exactly, therefore we use an SVD via np.linalg.svd. To create the matrix, I'm working with np.stack with axis=1, which puts each power vector in a column instead of a row.

More resources on SVD:

  • https://relate.cs.illinois.edu/course/cs357-f17/f/lectures/upload/sec-svd.pdf
  • https://andreask.cs.illinois.edu/cs357-s15/public/demos/09-svd-applications/Least%20Squares%20using%20the%20SVD.html
In [75]:
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(-10, 11)
y = np.where(x <= 0, 0, 1)

# Vandermonde matrix
arrays = [x**0, x**1, x**2, x**3, x**4]
M = np.stack(arrays, axis=1) # axis=1 puts each power vector in a column instead of a row

# SVD
u, s, vh = np.linalg.svd(M)

print('x =', x)
print('y =', y)
print('M =', M)


print('u =', u)
print('s =', s)
print('vh =', vh)
x = [-10  -9  -8  -7  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6   7
   8   9  10]
y = [0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1]
M = [[    1   -10   100 -1000 10000]
 [    1    -9    81  -729  6561]
 [    1    -8    64  -512  4096]
 [    1    -7    49  -343  2401]
 [    1    -6    36  -216  1296]
 [    1    -5    25  -125   625]
 [    1    -4    16   -64   256]
 [    1    -3     9   -27    81]
 [    1    -2     4    -8    16]
 [    1    -1     1    -1     1]
 [    1     0     0     0     0]
 [    1     1     1     1     1]
 [    1     2     4     8    16]
 [    1     3     9    27    81]
 [    1     4    16    64   256]
 [    1     5    25   125   625]
 [    1     6    36   216  1296]
 [    1     7    49   343  2401]
 [    1     8    64   512  4096]
 [    1     9    81   729  6561]
 [    1    10   100  1000 10000]]
u = [[ 5.45969906e-01  5.02703380e-01  2.83843133e-01 -2.54767497e-01
  -1.09179838e-01  5.61602062e-02  3.24581267e-02 -9.18359712e-03
  -5.81557256e-02 -1.05932391e-01 -1.46071099e-01 -1.74212726e-01
  -1.88081522e-01 -1.87485108e-01 -1.74314478e-01 -1.52543999e-01
  -1.28231409e-01 -1.09517819e-01 -1.06627711e-01 -1.31868943e-01
  -1.99632740e-01]
 [ 3.58220768e-01  3.66481770e-01 -5.70381850e-02 -3.04210970e-02
   6.04884931e-02 -2.26467931e-02  7.86836382e-03  2.79758715e-02
   4.13808001e-02  5.13921073e-02  6.09226379e-02  7.24891244e-02
   8.82121863e-02  1.09816331e-01  1.38629952e-01  1.75585332e-01
   2.21218640e-01  2.75669933e-01  3.38683153e-01  4.09606134e-01
   4.87390592e-01]
 [ 2.23644114e-01  2.57402667e-01 -2.47979606e-01  1.31124361e-01
   1.21993782e-01  8.91022072e-02  1.72135422e-01  2.39805754e-01
   2.88475050e-01  3.15727387e-01  3.20369070e-01  3.02428630e-01
   2.63156829e-01  2.05026655e-01  1.31733325e-01  4.81942839e-02
  -3.94507963e-02 -1.23840014e-01 -1.96389241e-01 -2.47292120e-01
  -2.65520066e-01]
 [ 1.31103472e-01  1.72450237e-01 -3.27000349e-01  2.36846759e-01
   1.05437102e-01 -3.63776616e-01 -3.59671639e-01 -3.27847741e-01
  -2.73970516e-01 -2.04001035e-01 -1.24195849e-01 -4.11069820e-02
   3.84180625e-02  1.07236305e-01  1.57909288e-01  1.82703078e-01
   1.73588265e-01  1.22239960e-01  2.00377989e-02 -1.41934061e-01
  -3.72886937e-01]
 [ 7.07725426e-02  1.08608646e-01 -3.27646781e-01  2.93723980e-01
   3.73782220e-02 -2.75880934e-01 -1.84008020e-01 -9.07241943e-02
  -1.37547350e-02  3.45244336e-02  4.70867403e-02  2.22549677e-02
  -3.62987483e-02 -1.19552918e-01 -2.13136700e-01 -2.97329896e-01
  -3.47062958e-01 -3.31916982e-01 -2.16123711e-01  4.14344648e-02
   4.87224509e-01]
 [ 3.41352001e-02  6.28620588e-02 -2.78992421e-01  3.08733906e-01
  -5.91643901e-02  8.61110780e-01 -1.37726010e-01 -1.29952204e-01
  -1.17463340e-01 -1.01929996e-01 -8.47977966e-02 -6.72874082e-02
  -5.03945424e-02 -3.48899540e-02 -2.13194419e-02 -1.00038488e-02
  -1.03906108e-03  5.70399099e-03  1.05793333e-02  1.41659479e-02
   1.72677731e-02]
 [ 1.39854917e-02  3.21946425e-02 -2.05637941e-01  2.88854419e-01
  -1.64713570e-01 -1.27386583e-01  8.60926513e-01 -1.43166552e-01
  -1.39856826e-01 -1.29650723e-01 -1.13370019e-01 -9.21518566e-02
  -6.74487420e-02 -4.10285460e-02 -1.49745039e-02  8.31478454e-03
   2.61253550e-02  3.54278787e-02  3.28776623e-02  1.48146482e-02
  -2.27365859e-02]
 [ 4.42763802e-03  1.35905625e-02 -1.27711160e-01  2.41063403e-01
  -2.63333455e-01 -1.05991748e-01 -1.29336776e-01  8.54934941e-01
  -1.51589824e-01 -1.48144695e-01 -1.34783682e-01 -1.12381193e-01
  -8.26320226e-02 -4.80513605e-02 -1.19747873e-02  2.14417244e-02
   4.72218099e-02  5.95687130e-02  5.18652850e-02  1.66739857e-02
  -5.42631172e-02]
 [ 8.76032687e-04  4.03398487e-03 -6.08670509e-02  1.72338739e-01
  -3.42629487e-01 -7.95325609e-02 -1.11871346e-01 -1.37394346e-01
   8.47019952e-01 -1.56710869e-01 -1.47873150e-01 -1.26957158e-01
  -9.56570874e-02 -5.68710566e-02 -1.47011112e-02  2.55467776e-02
   5.73627126e-02  7.30328707e-02  6.36395028e-02  1.90609337e-02
  -7.20284374e-02]
 [ 5.52427038e-05  5.09075385e-04 -1.62877363e-02  8.96583110e-02
  -3.93748409e-01 -5.20113177e-02 -8.96672049e-02 -1.21977292e-01
  -1.44759354e-01  8.44761773e-01 -1.52045800e-01 -1.35221021e-01
  -1.06209896e-01 -6.78654874e-02 -2.44479138e-02  1.83756477e-02
   5.35309633e-02  7.25367421e-02  6.55046359e-02  2.11392393e-02
  -7.32619101e-02]
 [ 8.24531683e-09 -9.13337858e-17 -6.82489984e-04  1.93368463e-14
  -4.11378267e-01 -2.66047653e-02 -6.53488961e-02 -1.00713031e-01
  -1.28073744e-01 -1.44205773e-01  8.52717973e-01 -1.36873578e-01
  -1.13949666e-01 -8.08776997e-02 -4.14232523e-02 -7.50065073e-04
   3.45799542e-02  5.66067309e-02  5.59720232e-02  2.19194230e-02
  -5.77056453e-02]
 [ 5.52427038e-05 -5.09075385e-04 -1.62877363e-02 -8.96583110e-02
  -3.93748409e-01 -5.66410025e-03 -4.11755021e-02 -7.55769559e-02
  -1.04483359e-01 -1.24682537e-01 -1.34135241e-01  8.68024850e-01
  -1.18508870e-01 -9.52159341e-02 -6.47488030e-02 -3.09328638e-02
   1.23356906e-03  2.55792541e-02  3.47600225e-02  2.02587788e-02
  -2.76145001e-02]
 [ 8.76032687e-04 -4.03398487e-03 -6.08670509e-02 -1.72338739e-01
  -3.42629487e-01  9.28503108e-03 -1.90406425e-02 -4.86207136e-02
  -7.59624875e-02 -9.83275699e-02 -1.13731867e-01 -1.20945587e-01
   8.80506763e-01 -1.09653625e-01 -9.24598613e-02 -6.96993550e-02
  -4.39138168e-02 -1.83992578e-02  2.79401007e-03  1.48613744e-02
   1.22439219e-02]
 [ 4.42763802e-03 -1.35905625e-02 -1.27711160e-01 -2.41063403e-01
  -2.63333455e-01  1.75425322e-02 -4.72474075e-04 -2.19722088e-02
  -4.48995578e-02 -6.73899482e-02 -8.77713480e-02 -1.04564266e-01
  -1.16481754e-01  8.77570599e-01 -1.21505342e-01 -1.13000248e-01
  -9.63973362e-02 -7.13723608e-02 -3.77936191e-02  4.27805096e-03
   5.45892702e-02]
 [ 1.39854917e-02 -3.21946425e-02 -2.05637941e-01 -2.88854419e-01
  -1.64713570e-01  1.92338565e-02  1.33663088e-02  2.16439763e-03
  -1.40971429e-02 -3.47087716e-02 -5.85261398e-02 -8.39700915e-02
  -1.09026663e-01 -1.31247085e-01  8.52252222e-01 -1.55210357e-01
  -1.49881631e-01 -1.27573598e-01 -8.36634511e-02 -1.30935767e-02
   8.96284478e-02]
 [ 3.41352001e-02 -6.28620588e-02 -2.78992421e-01 -3.08733906e-01
  -5.91643901e-02  1.53100077e-02  2.16784743e-02  2.15086880e-02
   1.32280414e-02 -3.71316361e-03 -2.88417154e-02 -6.06614928e-02
  -9.66534650e-02 -1.33275692e-01 -1.65963322e-01  8.10871402e-01
  -1.96160849e-01 -1.79426497e-01 -1.30269054e-01 -3.90091202e-02
   1.05055611e-01]
 [ 7.07725426e-02 -1.08608646e-01 -3.27646781e-01 -2.93723980e-01
   3.73782220e-02  7.54753935e-03  2.40322532e-02  3.37039883e-02
   3.34451364e-02  2.15777287e-02 -2.13656331e-03 -3.64964282e-02
  -7.88609148e-02 -1.25149432e-01 -1.69841747e-01 -2.05977991e-01
   7.74841349e-01 -2.17544576e-01 -1.71856975e-01 -7.53774178e-02
   8.60521679e-02]
 [ 1.31103472e-01 -1.72450237e-01 -3.27000349e-01 -2.36846759e-01
   1.05437102e-01 -1.45144502e-03  2.03613388e-02  3.63173679e-02
   4.25091400e-02  3.65547344e-02  1.75978124e-02 -1.36923825e-02
  -5.51210252e-02 -1.02967708e-01 -1.51986443e-01 -1.95405658e-01
  -2.24928201e-01  7.69268664e-01 -2.01466746e-01 -1.24260534e-01
   1.52867825e-02]
 [ 2.23644114e-01 -2.57402667e-01 -2.47979606e-01 -1.31124361e-01
   1.21993782e-01 -8.25929179e-03  1.09648870e-02  2.68396396e-02
   3.59609071e-02  3.60186588e-02  2.57968920e-02  5.17363255e-03
  -2.48790655e-02 -6.22951200e-02 -1.03914420e-01 -1.45482827e-01
  -1.81652174e-01 -2.05980266e-01  7.89069122e-01 -1.87873760e-01
  -1.27084630e-01]
 [ 3.58220768e-01 -3.66481770e-01 -5.70381850e-02  3.04210970e-02
   6.04884931e-02 -8.62279733e-03 -3.49248383e-03  2.68535968e-03
   8.92714968e-03  1.41802836e-02  1.73231400e-02  1.71650781e-02
   1.24464385e-02  1.83854259e-03 -1.60563073e-02 -4.27048277e-02
  -7.96427543e-02 -1.28474842e-01 -1.90874864e-01  7.31414388e-01
  -3.63418900e-01]
 [ 5.45969906e-01 -5.02703380e-01  2.83843133e-01  2.54767497e-01
  -1.09179838e-01  2.53679198e-03 -2.19806926e-02 -3.88071721e-02
  -4.38795637e-02 -3.53396332e-02 -1.35339948e-02  1.89858883e-02
   5.74637046e-02  9.49382939e-02  1.22243647e-01  1.28008907e-01
   9.86583680e-02  1.84114746e-02 -1.30717176e-01 -3.68917836e-01
   2.81414394e-01]]
s = [1.83169142e+04 1.98933627e+03 6.32612904e+01 1.10097314e+01
 2.42858601e+00]
vh = [[ 1.51028759e-04  1.11022302e-16  1.17944030e-02  1.03232112e-17
   9.99930432e-01]
 [ 9.05814995e-21 -1.28041020e-02 -7.07441918e-18 -9.99918024e-01
   8.67361738e-18]
 [-4.31751971e-02  9.99157354e-15 -9.98997948e-01 -1.20809693e-16
   1.17899253e-02]
 [ 9.87745279e-15 -9.99918024e-01 -1.04326270e-14  1.28041020e-02
   1.11022302e-16]
 [-9.99067505e-01 -1.03179184e-14  4.31739741e-02  1.31803269e-16
  -3.58348253e-04]]
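As a check, we can already solve $M \vec a = \vec y$ from these SVD pieces: the pseudo-inverse is $V \Sigma^{-1} U^T$. A minimal sketch (note that np.linalg.svd returns the full 21×21 u by default, so it has to be truncated to the first five columns to match s):

In [ ]:
# Hypothetical continuation of the cell above
s_inv = np.diag(1.0 / s)                  # invert the singular values
a = vh.T @ s_inv @ u[:, :len(s)].T @ y    # apply the pseudo-inverse to y
print('a =', a)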

Next, using np.linalg.pinv, which computes the pseudo-inverse (internally via the SVD).

In [54]:
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(-10, 11)
y = np.where(x <= 0, 0, 1)

# Loop over the number of terms in the polynomial
for n_terms in [5, 10, 15]:

    # Construct the Vandermonde matrix
    arrays = [x**i for i in range(n_terms)]
    M = np.stack(arrays, axis=1)

    # Compute the pseudo-inverse of the Vandermonde matrix
    M_pinv = np.linalg.pinv(M) 

    # Compute the coefficients of the polynomial
    c = np.dot(M_pinv, y)

    # Evaluate the polynomial at a higher resolution of x values;
    # c is ordered from low to high power, but np.polyval expects the
    # highest-order coefficient first, so reverse it
    x_eval = np.linspace(-10, 10, 1000)
    y_eval = np.polyval(c[::-1], x_eval)

    # Plot the data and the polynomial fit
    plt.plot(x, y, 'o', label='Data', color='pink')
    plt.plot(x_eval, y_eval, label=f'{n_terms} terms', color='purple')
    plt.xlabel('x')
    plt.ylabel('y')
    plt.title(f'Polynomial fit with {n_terms} terms')
    plt.legend()
    plt.show()

Putting it in subplots:

In [67]:
import numpy as np
import matplotlib.pyplot as plt

# Define the dataset
x = np.arange(-10, 11)
y = np.where(x <= 0, 0, 1)

# Define the number of terms to use in the polynomial fit
n_terms = [5, 10, 15]

# Create subplots
fig, axs = plt.subplots(1, len(n_terms), figsize=(15, 5))

# Loop over the number of terms
for i, n in enumerate(n_terms):

    # Construct the Vandermonde matrix
    arrays = [x**j for j in range(n)]
    M = np.stack(arrays, axis=1)

    # Compute the pseudo-inverse of the Vandermonde matrix
    M_pinv = np.linalg.pinv(M) 

    # Compute the coefficients of the polynomial
    c = np.dot(M_pinv, y)

    # Evaluate the polynomial at a higher resolution of x values
    # (reversing c for np.polyval, as above)
    x_eval = np.linspace(-10, 10, 1000)
    y_eval = np.polyval(c[::-1], x_eval)

    # Plot the data and the polynomial fit
    axs[i].plot(x, y, 'o', label='Data', color='pink')
    axs[i].plot(x_eval, y_eval, label=f'{n} terms', color='purple')
    axs[i].set_xlabel('x')
    axs[i].set_ylabel('y')
    axs[i].set_title(f'Polynomial fit with {n} terms')
    axs[i].legend()

# Show the plot
plt.show()

Not sure if this is what we're looking for...

(b) Fit the data with 5, 10, and 15 $r^3$ RBFs uniformly distributed between x = −10 and x = 10.

An RBF is a Radial Basis Function: it computes the distance from our point to some set of other points (the centers), and each of those other points gets a column in the matrix. We use the same SVD/pseudo-inverse strategy after we've built this matrix, as sketched below.
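Here is a minimal sketch of that fit, assuming $r^3$ basis functions $|x - c_i|^3$ with centers $c_i$ spaced uniformly between −10 and 10 (the centers variable and the subplot layout are my own choices):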

In [ ]:
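import numpy as np
import matplotlib.pyplot as plt

x = np.arange(-10, 11, dtype=float)
y = np.where(x <= 0, 0.0, 1.0)
x_eval = np.linspace(-10, 10, 1000)

fig, axs = plt.subplots(1, 3, figsize=(15, 5))
for i, n in enumerate([5, 10, 15]):
    centers = np.linspace(-10, 10, n)        # uniformly distributed RBF centers
    M = np.abs(x[:, None] - centers)**3      # r^3 basis: one column per center
    c = np.dot(np.linalg.pinv(M), y)         # same pseudo-inverse strategy as before
    y_eval = np.abs(x_eval[:, None] - centers)**3 @ c
    axs[i].plot(x, y, 'o', color='pink', label='Data')
    axs[i].plot(x_eval, y_eval, color='purple', label=f'{n} RBFs')
    axs[i].set_title(f'$r^3$ RBF fit with {n} RBFs')
    axs[i].legend()
plt.show()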
 

(c) Using the coefficients found for these six fits, evaluate the total out-of-sample error at x = {−10.5, −9.5, …, 9.5, 10.5}.
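Here is a sketch of that evaluation, rebuilding the three polynomial fits from part (a) and the three $r^3$ RBF fits from part (b) and summing the squared errors at the half-integer points (assuming the step function also defines the true y out of sample):

In [ ]:
import numpy as np

x = np.arange(-10, 11, dtype=float)
y = np.where(x <= 0, 0.0, 1.0)
x_out = np.arange(-10.5, 11)                 # -10.5, -9.5, ..., 10.5
y_out = np.where(x_out <= 0, 0.0, 1.0)

for n in [5, 10, 15]:
    # Polynomial fit from part (a)
    M = np.stack([x**i for i in range(n)], axis=1)
    c = np.dot(np.linalg.pinv(M), y)
    y_pred = np.stack([x_out**i for i in range(n)], axis=1) @ c
    err_poly = np.sum((y_out - y_pred)**2)

    # r^3 RBF fit from part (b)
    centers = np.linspace(-10, 10, n)
    R = np.abs(x[:, None] - centers)**3
    c_rbf = np.dot(np.linalg.pinv(R), y)
    y_pred_rbf = np.abs(x_out[:, None] - centers)**3 @ c_rbf
    err_rbf = np.sum((y_out - y_pred_rbf)**2)

    print(f'{n} terms: polynomial error = {err_poly:.3g}, RBF error = {err_rbf:.3g}')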

(d) Using a 15th-order polynomial, fit the data with the curvature regularizer in equation (14.29), plot the fits, and compare the out-of-sample errors for λ = 0, 0.01, 0.1, 1 (this part is harder than the others).

Equation 14.29 is:

$$I = \sum^N_{n=1} [ y_n - f(x_n)]^2 + \lambda \int^{x_{max}}_{x_{min}} \left[ \frac{\partial ^2 f}{\partial x^2} \right] ^2 dx $$
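A sketch of one way to do this, reusing the 15-term monomial basis from part (a): writing the regularizer with the curvature matrix $D_{ij} = \int p_i''(x)\, p_j''(x)\, dx$, minimizing $I$ gives the normal equations $(M^T M + \lambda D)\,\vec a = M^T \vec y$. Note that $M^T M$ is very badly conditioned at this order, so I use pinv rather than an exact solve, and the λ = 0 curve should be treated with suspicion:

In [ ]:
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(-10, 11, dtype=float)
y = np.where(x <= 0, 0.0, 1.0)
n = 15                                   # same 15-term basis as part (a)

M = np.stack([x**i for i in range(n)], axis=1)

# Curvature matrix: D[i, j] = integral of (x^i)'' (x^j)'' over [-10, 10]
D = np.zeros((n, n))
for i in range(2, n):
    for j in range(2, n):
        p = i + j - 4                    # power of x in the integrand
        if p % 2 == 0:                   # odd powers integrate to zero on [-10, 10]
            D[i, j] = i*(i-1) * j*(j-1) * 2 * 10**(p + 1) / (p + 1)

x_out = np.arange(-10.5, 11)
y_out = np.where(x_out <= 0, 0.0, 1.0)
x_eval = np.linspace(-10, 10, 1000)

for lam in [0, 0.01, 0.1, 1]:
    # Normal equations (M^T M + lambda D) a = M^T y, via pinv
    a = np.linalg.pinv(M.T @ M + lam * D) @ (M.T @ y)
    y_pred = np.stack([x_out**i for i in range(n)], axis=1) @ a
    err = np.sum((y_out - y_pred)**2)
    y_fit = np.stack([x_eval**i for i in range(n)], axis=1) @ a
    plt.plot(x_eval, y_fit, label=f'λ = {lam}, out-of-sample error = {err:.3g}')

plt.plot(x, y, 'o', color='pink', label='Data')
plt.ylim(-0.5, 1.5)
plt.legend()
plt.title('15-term polynomial with curvature regularizer')
plt.show()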

Question 2: Density Estimation

a)

Generate a data set by sampling 100 points from $y = \tanh(x)$ between $x = -5$ and $x = 5$, adding random noise with a magnitude of $0.25$ to y.

NumPy provides a few options for generating random noise; np.random.rand always gives numbers between 0 and 1, so I'm choosing np.random.uniform. For creating x and y I'm using np.linspace and np.tanh.

In [64]:
import numpy as np
import matplotlib.pyplot as plt

µ = 0 # Mean
σ = 1 # Standard deviation
N = 100 # Data points

noise = np.random.uniform(low=-0.25, high=0.25, size=N) # Adding random noise with a magnitude of 0.25 to y

x = np.linspace(-5, 5, N) # From -5 to 5 in linspace
y = np.tanh(x) # Hyperbolic tangent 

# Cluster-weighted model using a local linear function 
# y = a + b*x

plt.scatter(x, noise, color="pink")
plt.title(f'Noise with {N} data points')
plt.show()


plt.scatter(x, y, color="pink")
plt.title(f'x and y with {N} data points')
plt.show()


plt.scatter(x, y+noise, color="pink")
plt.title(f'x and y+noise with {N} data points')
plt.show()

Fit it with a cluster-weighted model using a local linear function in the clusters, $y = a + bx$. Plot the data and the resulting forecast $\langle y|x \rangle$, uncertainty $\langle \sigma_y^2|x \rangle$, and cluster probabilities $p(\vec{x}\,|c_m)\,p(c_m)$ as the number of clusters is increased (starting with one), and animate their evolution during the EM iterations.

For this problem set, corresponding equations are in chapter 16.4 (page 202). In the book, EM is explained: 'Calculating an expected distribution given parameters, then finding the most likely parameters given a distribution, is called Expectation-Maximization (EM), and it converges to the maximum likelihood distribution starting from the initial guess [Dempster et al., 1977].' To get a better understanding of how to translate this part of the equation into code, I found Eyal's documentation helpful.

Then I asked ChatGPT the following: Do you know how to make a cluster-weighted model using a local linear function? Sharing the results here for future reference (and fun).

Yes, I can explain how to make a cluster-weighted model using a local linear function. A cluster-weighted model (CWM) is a type of mixture model where each mixture component is a locally weighted linear regression model. The idea is to partition the input space into clusters and fit a linear regression model to each cluster. The weights of the local regression models are determined by a probability distribution over the clusters, which is learned from the data.

Here's the basic algorithm to construct a CWM using a local linear function:

1. Partition the input space into clusters using a clustering algorithm such as k-means or Gaussian mixture model. The number of clusters and their locations are typically chosen based on prior knowledge or through cross-validation.

2. For each cluster, fit a locally weighted linear regression model to the data points in that cluster. The weights are determined by a kernel function that assigns higher weights to nearby points and lower weights to distant points.

3. Compute the probability of each data point belonging to each cluster, using the learned clustering probabilities. These probabilities can be computed using Bayes' rule and the Gaussian kernel density estimator.

4. Predict the output for a new input point by taking a weighted average of the predictions of the local regression models, where the weights are the probabilities of the point belonging to each cluster.

5. Tune the hyperparameters of the model, such as the kernel bandwidth and the number of clusters, using cross-validation.

In practice, the algorithm can be implemented using existing machine learning libraries such as scikit-learn in Python. I hope this helps!

Next, I asked, Do you know what an Expectation-Maximization (EM) iteration is?

Yes, I can explain what an Expectation-Maximization (EM) iteration is. The EM algorithm is a widely used iterative method for finding maximum likelihood estimates of parameters in statistical models where some of the variables are not observed. The basic idea of EM is to iteratively estimate the missing variables, while maximizing the likelihood of the observed data.

The algorithm consists of two steps: the Expectation (E) step and the Maximization (M) step. In the E step, the missing data is estimated based on the current estimate of the parameters. In the M step, the parameters are updated based on the estimated missing data. These two steps are repeated until convergence.

Here is the EM algorithm in more detail:

1. Initialization: Choose an initial estimate for the parameters of the model.

2. E step: Compute the expected value of the complete data log-likelihood given the observed data and the current estimate of the parameters. This involves estimating the missing data by computing the posterior distribution of the missing data given the observed data and the current estimate of the parameters.

3. M step: Maximize the expected value of the complete data log-likelihood with respect to the parameters. This involves updating the parameters to increase the likelihood of the observed data, using the estimated missing data from the E step.

4. Repeat steps 2-3 until convergence: Check for convergence, and if the convergence criterion is not met, return to step 2.

The EM algorithm is used in a wide range of applications, including image processing, natural language processing, and bioinformatics, among others. It is particularly useful in cases where some of the data is missing, or where the likelihood function is difficult to optimize directly.
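To make the E and M steps concrete before tackling the CWM, here is a minimal sketch of EM for a toy two-component 1D Gaussian mixture (the data, the initialization, and the 50 fixed iterations are my own choices, not part of the assignment):

In [ ]:
import numpy as np

rng = np.random.default_rng(1)
# Toy data: two overlapping Gaussians
data = np.concatenate([rng.normal(-2, 1, 150), rng.normal(2, 1, 150)])

# Initial guesses for means, variances, and mixture weights
mu = np.array([-1.0, 1.0])
var = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])

for it in range(50):
    # E step: posterior probability of each component for each point
    p = pi * np.exp(-(data[:, None] - mu)**2 / (2*var)) / np.sqrt(2*np.pi*var)
    r = p / p.sum(axis=1, keepdims=True)
    # M step: update parameters from the responsibility-weighted data
    w = r.sum(axis=0)
    pi = w / len(data)
    mu = (r * data[:, None]).sum(axis=0) / w
    var = (r * (data[:, None] - mu)**2).sum(axis=0) / w

print('means:', mu, 'variances:', var, 'weights:', pi)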

With that information, I'm attempting to create the cluster weighted model.

In [66]:
import numpy as np
import matplotlib.pyplot as plt

µ = 0 # Mean
σ = 1 # Standard deviation
N = 100 # Data points

noise = np.random.uniform(low=-0.25, high=0.25, size=N) # Adding random noise with a magnitude of 0.25 to y

x = np.linspace(-5, 5, N) # From -5 to 5 in linspace
y = np.tanh(x) + noise # Hyperbolic tangent plus the noise

# Cluster-weighted model using a local linear function y = a + b*x
M = 1 # number of clusters 

# y = a + b*x

# Plotting a resulting forecast ⟨y|x⟩

# Plotting uncertainty ⟨σ²_y|x⟩

# Plotting cluster probabilities p(x⃗|c_m)p(c_m)


plt.scatter(x, y, color="pink") # y already includes the noise here
plt.title(f'x and y with {N} data points')
plt.show()
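Here is a minimal sketch of the cluster-weighted model, following the chapter 16.4 update equations as I understand them: Gaussian input clusters $p(x|c_m)$, local lines $y = a_m + b_m x$, and EM alternating between responsibilities and weighted parameter updates. The cluster count M, the 100 fixed iterations, and the initialization are my own choices, and the animation of the EM evolution is left out of this sketch:

In [ ]:
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
N = 100
x = np.linspace(-5, 5, N)
y = np.tanh(x) + rng.uniform(-0.25, 0.25, N)

M = 4                              # number of clusters (start with 1 and increase)
mu = np.linspace(-5, 5, M)         # cluster means in x
var = np.ones(M)                   # cluster variances in x
a = np.zeros(M)                    # local linear intercepts
b = np.zeros(M)                    # local linear slopes
var_y = np.ones(M)                 # output variances around the local lines
p_c = np.ones(M) / M               # cluster probabilities p(c_m)

def gauss(z, m, v):
    return np.exp(-(z - m)**2 / (2*v)) / np.sqrt(2*np.pi*v)

for it in range(100):              # EM iterations
    # E step: responsibilities p(c_m|x_n, y_n)
    px = gauss(x[:, None], mu, var)                  # p(x|c_m), shape (N, M)
    py = gauss(y[:, None], a + b*x[:, None], var_y)  # p(y|x, c_m)
    joint = px * py * p_c
    r = joint / joint.sum(axis=1, keepdims=True)

    # M step: responsibility-weighted parameter updates
    w = r.sum(axis=0) + 1e-12
    p_c = w / N
    mu = (r * x[:, None]).sum(axis=0) / w
    var = (r * (x[:, None] - mu)**2).sum(axis=0) / w + 1e-6
    ym = (r * y[:, None]).sum(axis=0) / w
    b = (r * (x[:, None] - mu) * (y[:, None] - ym)).sum(axis=0) / w / var
    a = ym - b * mu
    var_y = (r * (y[:, None] - (a + b*x[:, None]))**2).sum(axis=0) / w + 1e-6

# Forecast <y|x>, uncertainty <sigma_y^2|x>, and cluster probabilities p(x|c_m)p(c_m)
xg = np.linspace(-5, 5, 400)
pxg = gauss(xg[:, None], mu, var) * p_c            # p(x|c_m) p(c_m)
post = pxg / pxg.sum(axis=1, keepdims=True)        # p(c_m|x)
y_fc = (post * (a + b*xg[:, None])).sum(axis=1)
var_fc = (post * (var_y + (a + b*xg[:, None])**2)).sum(axis=1) - y_fc**2

plt.scatter(x, y, color='pink', label='data')
plt.plot(xg, y_fc, color='purple', label='forecast <y|x>')
plt.fill_between(xg, y_fc - np.sqrt(var_fc), y_fc + np.sqrt(var_fc),
                 color='purple', alpha=0.2, label='+/- sqrt(<sigma_y^2|x>)')
for m in range(M):
    plt.plot(xg, pxg[:, m], ':', label=f'p(x|c_{m+1})p(c_{m+1})')
plt.legend(fontsize=8)
plt.title(f'Cluster-weighted model, M = {M} clusters')
plt.show()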