Problem Set 11 (Density Estimation)

1

Generate a data set by sampling 100 points from $y = \tanh(x)$ between $x = -5$ and 5, adding random noise with a magnitude of 0.25 to $y$. Fit it with cluster-weighted modeling, using a local linear model for the clusters, $y = a + bx$. Plot the data and the resulting forecast $\langle y|x \rangle$, uncertainty $\langle \sigma_y^2|x \rangle$, and cluster probabilities $p(x|c_m)\,p(c_m)$ as the number of clusters is increased (starting with one), and animate their evolution during the EM iterations.

My code lives here.
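For a self-contained picture of the method, here's a minimal sketch of cluster-weighted modeling with local linear models, written in Python with NumPy. It's an illustration under stated assumptions rather than the linked code: I assume x is drawn uniformly from [-5, 5], the noise is Gaussian with standard deviation 0.25, clusters are initialized on random data points, and a small variance floor keeps clusters from collapsing. The names `make_data`, `gauss`, `fit_cwm`, and `predict` are made up for this sketch.

```python
import numpy as np


def make_data(n=100, noise=0.25, seed=0):
    """Noisy tanh samples. Uniform x on [-5, 5] and Gaussian noise are assumptions."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, n)
    y = np.tanh(x) + noise * rng.standard_normal(n)
    return x, y


def gauss(z, mean, var):
    """One-dimensional Gaussian density, broadcast over its arguments."""
    return np.exp(-0.5 * (z - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def fit_cwm(x, y, n_clusters=3, n_iter=50, var_floor=1e-3, seed=0):
    """EM for p(y, x) = sum_m p(y | x, c_m) p(x | c_m) p(c_m) with y = a_m + b_m x."""
    rng = np.random.default_rng(seed)
    n = len(x)
    F = np.column_stack([np.ones(n), x])            # basis functions [1, x]
    mu = rng.choice(x, n_clusters, replace=False)   # input means (assumed init)
    var_x = np.full(n_clusters, np.var(x))          # input variances
    beta = np.zeros((n_clusters, 2))                # rows hold (a_m, b_m)
    beta[:, 0] = np.mean(y)
    var_y = np.full(n_clusters, np.var(y))          # output variances
    prior = np.full(n_clusters, 1.0 / n_clusters)   # p(c_m)
    for _ in range(n_iter):
        # E-step: posterior p(c_m | y_n, x_n) for every point and cluster.
        joint = gauss(y[:, None], F @ beta.T, var_y) * gauss(x[:, None], mu, var_x) * prior
        post = joint / (joint.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: posterior-weighted updates of every cluster parameter.
        weight = post.sum(axis=0) + 1e-12
        prior = weight / weight.sum()
        mu = (post * x[:, None]).sum(axis=0) / weight
        var_x = (post * (x[:, None] - mu) ** 2).sum(axis=0) / weight + var_floor
        for m in range(n_clusters):
            w = post[:, m]
            A = F.T @ (w[:, None] * F) + 1e-9 * np.eye(2)   # tiny ridge for stability
            beta[m] = np.linalg.solve(A, F.T @ (w * y))      # weighted least squares
        resid = y[:, None] - F @ beta.T
        var_y = (post * resid ** 2).sum(axis=0) / weight + var_floor
    return {"mu": mu, "var_x": var_x, "beta": beta, "var_y": var_y, "prior": prior}


def predict(model, xq):
    """Return forecast <y|x>, uncertainty <sigma_y^2|x>, and p(x|c_m) p(c_m)."""
    xq = np.asarray(xq, dtype=float)
    w = gauss(xq[:, None], model["mu"], model["var_x"]) * model["prior"]
    local = model["beta"][:, 0] + model["beta"][:, 1] * xq[:, None]
    norm = w.sum(axis=1) + 1e-300
    mean = (w * local).sum(axis=1) / norm
    var = (w * (model["var_y"] + local ** 2)).sum(axis=1) / norm - mean ** 2
    return mean, var, w


if __name__ == "__main__":
    x, y = make_data()
    model = fit_cwm(x, y, n_clusters=3, n_iter=50)
    forecast, uncertainty, cluster_probs = predict(model, np.linspace(-5, 5, 200))
```

To animate the EM iterations, one option is to call `fit_cwm` with `n_iter` set to 1, 2, 3, and so on with the same seed, and draw one frame per call. That's wasteful but keeps the sketch short. Saving the parameters inside the loop would be the cleaner route.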

Here’s a video of 50 EM iterations with three clusters. I’m displaying the local linear models over a range. (I don’t indicate the output variance in any way; it would be nice to add this.)
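One possible way to show the output variance would be to draw a band of plus or minus one output standard deviation around each local linear model. The sketch below only computes the band geometry with NumPy. The function name `cluster_segments`, the choice to draw each model over its input mean plus or minus two input standard deviations, and the example parameter values are all assumptions for illustration, not values from my fit.

```python
import numpy as np


def cluster_segments(mu, var_x, beta, var_y, n_sigma=2.0, n_points=50):
    """For each cluster, return its local line and a +/- one sigma_y band.

    mu, var_x : input means and variances of the clusters
    beta      : array of shape (n_clusters, 2) holding (a_m, b_m) for y = a_m + b_m x
    var_y     : output variances of the clusters
    """
    segments = []
    for m in range(len(mu)):
        # Draw each model only where its cluster has appreciable input weight.
        half_width = n_sigma * np.sqrt(var_x[m])
        xs = np.linspace(mu[m] - half_width, mu[m] + half_width, n_points)
        line = beta[m, 0] + beta[m, 1] * xs
        sigma_y = np.sqrt(var_y[m])
        segments.append(
            {"x": xs, "line": line, "lower": line - sigma_y, "upper": line + sigma_y}
        )
    return segments


# Made-up parameters for three clusters (hypothetical, not fitted values).
mu = np.array([-3.0, 0.0, 3.0])
var_x = np.array([1.0, 0.25, 1.0])
beta = np.array([[-1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
var_y = np.array([0.04, 0.09, 0.04])
segments = cluster_segments(mu, var_x, beta, var_y)
```

Each entry can then be handed to a plotting library, for example matplotlib's `fill_between(x, lower, upper)` for the band with the line drawn on top.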

Not surprisingly, for tanh three clusters seem to work best. Any fewer and the fit degrades; any more and some clusters end up with needlessly similar local models.