HomePage

## 11.1

• y= 2+3x+ç
• x ~ Uniform(0,1)
• ζ ~ N(0,0.5)
• use SVD to fit y= a+bx,
• where a= [a, b] and $f_{m} =[1, x]$

writing Av=b: $~ \left( \begin{array}{cc} 1 & x_{1} \\ 1 & x_{2} \\ \vdots & \vdots \\ \vdots & \vdots \\ 1 & x_{100} \\\end{array} \right) \left( \begin{array}{c} a \\ b \end{array} \right) = \left( \begin{array}{c} b_{1} \\ b_{2} \\ \vdots \\ \vdots \\ b_{100} \end{array} \right)$

### Generating 100 data points and using SVD:

In [2]:
import random
import matplotlib.pyplot as plt
import numpy as np
import scipy.linalg as linalg

x=[]
y=[]
mu=0
sigma=0.5
ksi=np.random.normal(mu, sigma, 100)

L=np.empty([100,2])
A = L.copy()

for i in range(100):
x.append(random.random()) # produces a varaible x that is uniformly distributed between 0 and 1
y.append(2+3*x[i]+ksi[i]) # produces the data from x and a noise zeta
A[i,1]=x[i]
A[i,0]=1

b=y

U, s, Vh = linalg.svd(A)
S=linalg.diagsvd(s,100,2)
Sinv= S.copy()

Sinv[0,0]=1/Sinv[0,0]
Sinv[1,1]=1/Sinv[1,1]

v=np.dot(np.dot(np.transpose(Vh),np.dot(np.transpose(Sinv),np.transpose(U))),b)

print "a="
print v[0]
print "b="
print v[1]
yprime=[]
for i in range(100):
yprime.append(v[0]+v[1]*x[i])

import matplotlib.pyplot as plt
%pylab inline
fig=plt.figure()
ax1=plt.subplot(1,1,1)
ax1.scatter(x,y,color="red")
ax2=plt.subplot(1,1,1)
ax2.scatter(x,yprime,color="blue")

a=
2.06723182398
b=
2.87760719653
Populating the interactive namespace from numpy and matplotlib


WARNING: pylab import has clobbered these variables: ['random', 'linalg']
%matplotlib prevents importing * from pylab and numpy


Out[2]:
<matplotlib.collections.PathCollection at 0x108a1b310>


### (a) Estimating the error with equation (12.34)

${\sigma ^2_ {\eta , i} \over \sigma ^2_ \zeta} = \sum _j {V_{i,j}^2 \over w_j^2}$ where $_ =0.5$

In [11]:
Vsq= np.transpose(Vh)**2
w=Sinv**2
var=sigma**2

sigman=[]
sigman.append(var*Vsq[0,0]/w[0,0]+var*Vsq[0,1]/w[1,1])
sigman.append(var*Vsq[1,0]/w[0,0]+var*Vsq[1,1]/w[1,1])
print "sigma eta_1 = " +str(np.sqrt(sigman[0]))
print "sigma eta_2 = " +str(np.sqrt(sigman[1]))

sigma eta_1 = 5.0
sigma eta_2 = 2.96979183155



### (b) Bootstrap sampling to generate 100 data sets

Basically the new sample in a bootstrap is generated from the original sample by a random number generator to choose elements of the original sample.

In [13]:
originalsample=y
newsamples=[]
newsample=[]
k=len(originalsample)
stndrddeviation=[]
for i in range(100):
newsample=[]
for j in range(k):
newsample.append(originalsample[int(random.random()*k)])
stndrddeviation.append(np.std(newsample))
newsamples.append(newsample)

# calculate average standard deviation of newsamples:
avgstndrddeviation= np.average(stndrddeviation)
print "sigma average = " + str(avgstndrddeviation)

sigma average = 0.933022792425



### Levenberg- Marquardt

The algorithm was carried out in the following steps:

Compute $^2 (a)$.

Pick a value for $\lambda$.

Solve the linear equations ($a= -M^{-1} ^2 $) for$a$ and calculate $^2 (a + a)$.

If $^2 (a + a) ^2 (a)$, increase$$ by a factor of 10 and go back to (step 3).

If $^2 (a + a) < ^2 (a)$ decrease λ by a factor of 0.1 and go back to (step 3)

In [20]:
from sympy import *
import numpy as np
import random
import numpy.linalg as linalg

mu=0
sigma=0.1
ksi=np.random.normal(mu, sigma, 100)

x=[]
y=[]

for i in range(100):
x.append(random.random()) # produces a varaible x that is uniformly distributed between 0 and 1
y.append(np.sin(2+3*x[i])+ksi[i]) # produces the data from x and a noise zeta

#---function to calculte M -----
def params(xn,yn,sigman,a,b,lamda):
chisq=0
alpha1=0
alpha2=0
alpha3=0
deltachi1=0
deltachi2=0

R=np.matrix([[2,2],[2,2]]) #-- figure out how to populate a matrix in a more decent form
M=R.copy()

y = Symbol('y')
x = Symbol('x')
asym= Symbol('asym')
bsym= Symbol ('bsym')
sigmal= Symbol('sigmal')
f=sin(asym*x+bsym)
delchi1=(y-f)/(sigmal**2)*diff(f,asym)
delchi2=(y-f)/(sigmal**2)*diff(f,bsym)

chisqn=((y-sin(asym*x+bsym))/sigmal)**2
dchisqda=diff(chisqn, asym)
dchisqdb=diff(chisqn, bsym)
dsqchisqdasq=diff(dchisqda,asym)
dsqchisqdbsq=diff(dchisqdb,bsym)

for i in range(len(xn)):
chisq+=N(chisqn.subs([(x,xn[i]), (y,yn[i]),(asym,a),(bsym,b),(sigmal,sigman)]),20)
alpha1+=0.5*N(dsqchisqdasq.subs([(x,xn[i]), (y,yn[i]),(asym,a),(bsym,b),(sigmal,sigman)]),20)*(1+lamda)
alpha2+=0.5*N(dsqchisqdbsq.subs([(x,xn[i]), (y,yn[i]),(asym,a),(bsym,b),(sigmal,sigman)]),20)*(1+lamda)
deltachi1+=-2*N(delchi1.subs([(x,xn[i]), (y,yn[i]),(asym,a),(bsym,b),(sigmal,sigman)]),20)
deltachi2+=-2*N(delchi2.subs([(x,xn[i]), (y,yn[i]),(asym,a),(bsym,b),(sigmal,sigman)]),20)

M[0,0]=alpha1
M[1,1]=alpha2
M[0,1]=M[1,0]=alpha3

deltachisq=np.matrix([[deltachi1],[deltachi2]])

deltaA=-linalg.inv(M)*deltachisq
return deltaA,M,chisq

#Check
#deltaA,M,chisq= params(x,y,sigma,a,b,lamda)

#--- Initialize
L=np.matrix([[0],[0]])
A=L.copy()
a=2
b=3
A=np.matrix([[a],[b]])
lamda=100
yprime=[]
converged=0

for i in range(1000):
deltaA,M,ochisq= params(x,y,sigma,a,b,lamda)
a=a+float(deltaA[0])
b=b+float(deltaA[1])
deltaA,M,chisq= params(x,y,sigma,a,b,lamda)
#----Check difference in error for convergence
if abs(chisq-ochisq)<0.0001:
print "converged"
converged=i
break #check if needed
if i >= len(x):
break

#-----if less than then decrease lambda
if chisq<ochisq:
lamda=0.1*lamda

#--if  greater than then increase lambda
elif chisq >= ochisq:
lamda=10*lamda
yprime.append(np.sin(a*x[i]+b))
if converged !=0:
for i in range(len(x)-converged):
yprime.append(np.sin(a*x[i]+b))

In [22]:
import matplotlib.pyplot as plt
%pylab inline
fig=plt.figure()
ax1=plt.subplot(1,1,1)
ax1.scatter(x,y,color="red")
ax2=plt.subplot(1,1,1)
print len(x)
print len(yprime)
ax2.scatter(x,yprime,color="blue")
plt.show()

Populating the interactive namespace from numpy and matplotlib
100
100


WARNING: pylab import has clobbered these variables: ['Polygon', 'random', 'poly', 'cosh', 'flatten', 'conjugate', 'diff', 'Circle', 'tan', 'roots', 'plot', 'eye', 'trace', 'floor', 'diag', 'invert', 'var', 'nan', 'sqrt', 'source', 'add', 'zeros', 'take', 'prod', 'plotting', 'product', 'power', 'multinomial', 'transpose', 'test', 'beta', 'ones', 'sinh', 'vectorize', 'sign', 'mod', 'log', 'cos', 'seterr', 'tanh', 'det', 'pi', 'sin', 'binomial', 'solve', 'exp', 'reshape', 'trunc', 'gamma', 'interactive']
%matplotlib prevents importing * from pylab and numpy



However I tried using values of a=1 and b=1 before and I can't seem to get the initial guesses to converge.

### 11.3

#### (a)

• Maximize entropy $S= - \int _{-\infty} ^ \infty {p(x) \log p(x)}\,dx$
• Given the following constraints:
• Normalization Constraint: $\int _{-\infty} ^ \infty {p(x)} \,dx = 1$
• Measurement constraint $\int _{-\infty} ^ \infty {p(x)f_i(x)} \,dx = <f_i(x)>$

Then the variational problem is dealt with by introducing Langrange multipiers for each constraint equation $\delta [- \int _{-\infty} ^ \infty {p(x) \log p(x)}\,dx + \lambda _1 \int _{-\infty} ^ \infty {p(x)} \,dx + \sum_{i} {\lambda_i \int _{-\infty} ^ \infty {p(x)f_i(x)} \,dx }] =0$ which reduces to:${\delta \over \delta p} [-p \log p + \lambda _1 p + \sum_{i} {\lambda_i p f_i(x)}] =0$ which is equivalent to : $[- \log p -1 + \lambda _1 + \sum_{i} {\lambda_i f_i(x)}] =0$ Therefore $p(x)= e ^{(-1 + \lambda _1 + \sum_{i} {\lambda_i f_i(x)})}$

### (b)

$\sigma ^2 = \int _{- \infty} ^ {\infty} { p(x) x^2} \,dx$ since $f(x)= x^2$ $p(x)= e ^{(-1 + \lambda _1 + \lambda_2 x^2 )}$ The normalization constraint still holds where $\int _{-\infty} ^ \infty {p(x)} \,dx = 1$

In [21]:
from sympy import *

lamda1= Symbol('L1')
lamda2= Symbol('L2')
x=Symbol('x')
f= exp(-1+lamda1+lamda2*x**2)
eq1=integrate(f, (x,-oo,oo) )

eq2=integrate(f*x**2,(x,-oo,oo))

Piecewise((-I*sqrt(pi)*exp(-1)*exp(L1)*sqrt(polar_lift(L2))/L2, Abs(periodic_argument(exp_polar(I*pi)*polar_lift(L2), oo)) <= pi/2), (Integral(exp(L1 + L2*x**2 - 1), (x, -oo, oo)), True))
Piecewise((I*sqrt(pi)*exp(-1)*exp(L1)/(2*L2*sqrt(polar_lift(L2))), Abs(periodic_argument(exp_polar(I*pi)*polar_lift(L2), oo)) < pi/2), (Integral(x**2*exp(L1 + L2*x**2 - 1), (x, -oo, oo)), True))



still need to solve the equations in order to get the langrange multipliers.

In []: