In each of the following cases, construct an Accept–Reject algorithm, generate a sample of the corresponding random variables, and draw the density function on top of the histogram. Compute the acceptance probability.
A-R algorithm:
Remember to check that \(f\) and \(g\) have compatible support and that the constant \(c\) is defined as
\[c = \sup_{x} \frac{f(x)}{g(x)}, \qquad \text{so that } \frac{f(x)}{g(x)} \le c \quad \forall x.\]
set.seed(123)
## Accept-Reject for a N(0,1) target f with a Cauchy(0,1) proposal g.
## The bound is c = sup_x f(x)/g(x); optimize() returns the maximised ratio.
c <- optimize(f=function(x){dnorm(x,0,1)/dcauchy(x,0,1)},maximum=TRUE,
interval=c(-6,6))$objective
c
## [1] 1.520347
xl <- c(-4,4)
yl <- c(0,.6)
## target density (green) under the envelope c*g (red)
plot(seq(-4,4,by=.1),dnorm(seq(-4,4, by=.1)),
type="l", xlim=xl, ylim=yl,
xlab="", ylab="", col="darkgreen")
lines(seq(-4,4,by=.1),c*dcauchy(seq(-4,4, by=.1)),
col="red4")
Nsim <- 2500
## u ~ Unif(0,c): accepting when u < f(y)/g(y) is equivalent to the
## textbook Unif(0,1) test u < f(y)/(c*g(y))
u <- runif(Nsim, max=c)
y <- rcauchy(Nsim)
## single acceptance indicator, reused below so the plotted subset of u
## always matches x (the original mixed `<` and `<=`, which could give a
## length mismatch on a tie)
acc <- u < dnorm(y)/dcauchy(y)
x <- y[acc]
points(y,u*dcauchy(y),pch=19,cex=.4, col="firebrick3")
points(x,u[acc]*dcauchy(x),
pch=19,cex=.4,col="forestgreen")
##acc prob
1/c
## [1] 0.6577446
set.seed(123)
## Accept-Reject for a Gamma(4.3, 6.2) target with a Gamma(4, 7) proposal.
## NOTE(review): since the proposal rate (7) exceeds the target rate (6.2),
## f(x)/g(x) is proportional to x^0.3 * exp(0.8 x) and is UNBOUNDED as
## x -> Inf; the c found on (0,2) is only a local bound, so this A-R is not
## exact in the far right tail (negligible in practice here).
c <- optimize(f=function(x){dgamma(x,4.3,6.2)/dgamma(x,4,7)},maximum=TRUE,
interval=c(0,2))$objective
c
## [1] 4.395428
xl <- c(0,3)
yl <- c(0,1.5)
plot(seq(0,3,by=.01),dgamma(seq(0,3, by=.01),4.3,6.2),
type="l", xlim=xl, ylim=yl,
xlab="", ylab="", col="darkgreen")
lines(seq(0,3,by=.01),c*dgamma(seq(0,3, by=.01),4,7),
col="red4")
Nsim <- 2500
u <- runif(Nsim, max=c)
y <- rgamma(Nsim, 4,7)
## single acceptance indicator so the accepted u's always line up with x
acc <- u < dgamma(y,4.3,6.2)/dgamma(y,4,7)
x <- y[acc]
points(y,u*dgamma(y,4,7),pch=19,cex=.4, col="firebrick3")
points(x,u[acc]*dgamma(x,4,7),
pch=19,cex=.4,col="forestgreen")
##acc prob
1/c
## [1] 0.2275091
Imagine that the following vectors describe the probabilities to move between 4 cities, i.e. the probability to move from Trieste to Milan is \(1/9 \approx 0.11\) and the probability to move from Milan to Trieste is 1.
## One-step transition matrix between the four cities:
## rows index the origin, columns the destination, each row sums to 1.
fromMI <- c(0, 1, 0, 0)
fromTS <- c(1, 4, 4, 0)/9
fromFI <- c(0, 4, 4, 1)/9
fromBO <- c(0, 0, 1, 0)
Tmat <- matrix(c(fromMI, fromTS, fromFI, fromBO),
               nrow = 4, byrow = TRUE,
               dimnames = list(c("fromMI", "fromTS", "fromFI", "fromBO"),
                               c("toMI", "toTS", "toFI", "toBO")))
Tmat
## toMI toTS toFI toBO
## fromMI 0.0000000 1.0000000 0.0000000 0.0000000
## fromTS 0.1111111 0.4444444 0.4444444 0.0000000
## fromFI 0.0000000 0.4444444 0.4444444 0.1111111
## fromBO 0.0000000 0.0000000 1.0000000 0.0000000
Now fix your position at the initial state, here Trieste. Then compute the probabilities to move to the 4 cities at time 1 and at time 2.
## Start in Trieste with probability 1, then propagate the distribution
## one and two steps forward: p_t = p_{t-1} %*% Tmat.
p0 <- c(0, 1, 0, 0)
p1 <- p0 %*% Tmat
p1
## toMI toTS toFI toBO
## [1,] 0.1111111 0.4444444 0.4444444 0
p2 <- p1 %*% Tmat
p2
## toMI toTS toFI toBO
## [1,] 0.04938272 0.5061728 0.3950617 0.04938272
Extend the process to compute the vector of probabilities p for a generic time t.
## Distribution over the cities after t steps of the chain, starting from
## the global p0 and using the global transition matrix Tmat.
## Uses seq_len(t) so that pt(0) correctly returns the initial distribution
## (the original `1:t` would iterate over c(1, 0) when t = 0).
## The local was renamed from `pi`, which shadowed the base constant.
pt <- function(t) {
  prob <- p0
  for (i in seq_len(t)) {
    prob <- prob %*% Tmat
  }
  return(prob)
}
## Stack the distributions for t = 1,...,10 into one matrix, one row per
## time step, instead of spelling out the ten pt() calls.
p1.10 <- do.call(rbind, lapply(1:10, pt))
p1.10
## toMI toTS toFI toBO
## [1,] 0.11111111 0.4444444 0.4444444 0.00000000
## [2,] 0.04938272 0.5061728 0.3950617 0.04938272
## [3,] 0.05624143 0.4499314 0.4499314 0.04389575
## [4,] 0.04999238 0.4561805 0.4438348 0.04999238
## [5,] 0.05068672 0.4499992 0.4499992 0.04931498
## [6,] 0.04999991 0.4506860 0.4493142 0.04999991
## [7,] 0.05007622 0.4500000 0.4500000 0.04992380
## [8,] 0.05000000 0.4500762 0.4499238 0.05000000
## [9,] 0.05000847 0.4500000 0.4500000 0.04999153
## [10,] 0.05000000 0.4500085 0.4499915 0.05000000
barplot(p1.10, beside = TRUE)
## by t = 100 the chain has converged to its stationary distribution
p100 <- pt(100)
p100
## toMI toTS toFI toBO
## [1,] 0.05 0.45 0.45 0.05
Compute the stationary distribution to which our Markov Chain converges.
## Stationary distribution: the left eigenvector of Tmat for eigenvalue 1
## (the largest eigenvalue of a stochastic matrix, hence the first column
## returned by eigen()), normalised to sum to 1.
## eigen() on a non-symmetric matrix can return complex storage even when
## the relevant eigenvector is real, so keep only the real part.
eigenvect1 <- Re(eigen(t(Tmat))$vectors[,1]) #left eigenvector
p.distr <- eigenvect1 / sum(eigenvect1)      #normalise to a probability vector
p.distr
## [1] 0.05 0.45 0.45 0.05
barplot(rbind(pt(20),p.distr),beside=TRUE)
Student’s \(t\) density with \(\nu\) degrees of freedom, is given by
\[ f(x\mid\nu)= \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\nu \pi}}\left(1+x^2/\nu\right)^{-(\nu+1)/2}\]
Calculate the mean of a t distribution with \(\nu = 4\) degrees of freedom using a Metropolis–Hastings algorithm, first with a \(N(0,1)\) candidate density and then with a \(t_2\) candidate density.
Monitor the convergence across iterations and compute the acceptance rate. Note that in both cases we are dealing with:
Independent Metropolis-Hastings algorithm:
Note that the proposal distribution at time \(s\) does not depend on the value of the chain at time \(s-1\) and that the ratio of proposal distributions in the acceptance ratio does not simplify.
set.seed(123)
Nsim <- 10^4
df <- 4   # degrees of freedom of the t target
## Independent Metropolis-Hastings: target t_df, proposal N(0,1).
## Preallocate the chain instead of growing it element by element.
mh <- numeric(Nsim)
mh[1] <- rnorm(1)
acc.rate <- 0
for(s in 2:Nsim){
  proposal <- rnorm(1)
  ## acceptance ratio f(y)g(x) / (f(x)g(y)); the proposal terms do not
  ## cancel because g differs from the target
  alpha.p <- min(1, (dt(proposal, df)*dnorm(mh[s-1]))/(dt(mh[s-1],df)*dnorm(proposal)))
  if(runif(1)<alpha.p){
    mh[s] <- proposal
    acc.rate <- acc.rate+1
  }else{
    mh[s] <- mh[s-1]
  }
}
mean(mh)
## [1] -0.03735362
## NOTE: only Nsim-1 proposals are made, so dividing by Nsim very slightly
## underestimates the acceptance rate; kept as-is to match the output below.
acc.rate/Nsim
## [1] 0.9099
## Convergence diagnostics for the normal-proposal chain
par(mfrow = c(2, 2))
## trace plot
plot(mh, type = "l", ylab = "Trace", xlab = "Iteration")
## histogram with the t_4 density overlaid
hist(mh, breaks = 100, border = "gray40", freq = FALSE, main = "")
curve(dt(x, 4), from = -5, to = 5, n = 1001, add = TRUE, col = "gray30")
abline(v = mean(mh), col = "firebrick3", lty = 2)
## running (cumulative) mean
a <- cumsum(mh) / seq_len(Nsim)
plot(a, type = "l", ylab = "Cumulative mean plot", xlab = "Iteration")
abline(h = mean(mh), col = "firebrick3", lty = 2)
## autocorrelation
acf(mh, main = "", ylab = "Autocorrelation")
## the chain occasionally gets stuck at 'marginal' values because the
## normal proposal has lighter tails than the t target
## burn-in (discard the first 99 draws) and thinning (keep every 10th)
post_sample <- mh[seq(from = 100, to = Nsim, by = 10)]
## diagnostics for the thinned sample
par(mfrow = c(2, 2))
## trace plot
plot(post_sample, type = "l", ylab = "Trace", xlab = "Iteration")
## histogram with the t_4 density overlaid
hist(post_sample, breaks = 100, border = "gray40", freq = FALSE, main = "")
curve(dt(x, 4), from = -5, to = 5, n = 1001, add = TRUE, col = "gray30")
abline(v = mean(post_sample), col = "firebrick3", lty = 2)
## autocorrelation after thinning
acf(post_sample, main = "", ylab = "Autocorrelation")
## Independent MH again, now with a heavier-tailed t_2 proposal
## (Nsim and df were set above). Preallocate the chain.
mh <- numeric(Nsim)
mh[1] <- rnorm(1)
acc.rate <- 0
for(s in 2:Nsim){
  proposal <- rt(1,2)
  ## acceptance ratio f(y)g(x) / (f(x)g(y)) with g = t_2
  alpha.p <- min(1, (dt(proposal,df)*dt(mh[s-1],2))/(dt(mh[s-1],df)*dt(proposal,2)))
  if(runif(1)<alpha.p){
    mh[s] <- proposal
    acc.rate <- acc.rate+1
  }else{
    mh[s] <- mh[s-1]
  }
}
mean(mh)
## [1] 0.02742977
acc.rate/Nsim
## [1] 0.9177
## Convergence diagnostics for the t_2-proposal chain
par(mfrow = c(2, 2))
## trace plot
plot(mh, type = "l", ylab = "Trace", xlab = "Iteration")
## histogram with the t_4 density overlaid
hist(mh, breaks = 100, border = "gray40", freq = FALSE, main = "")
curve(dt(x, 4), from = -5, to = 5, n = 1001, add = TRUE, col = "gray30")
abline(v = mean(mh), col = "firebrick3", lty = 2)
## running (cumulative) mean
a <- cumsum(mh) / seq_len(Nsim)
plot(a, type = "l", ylab = "Cumulative mean plot", xlab = "Iteration")
abline(h = mean(mh), col = "firebrick3", lty = 2)
## autocorrelation
acf(mh, main = "", ylab = "Autocorrelation")