Cockroaches

Recap

We are analysing data about complaints for the presence of cocroaches in 10 residential buildings. Using a Bayesian framework, we modelled the number of complaints using a Poisson model. Diagnostic analysis of the model highlights some residual variability. We decided to try to improve our estimates adding one predictor and an exposure term to the Poisson model, but the situation did not improve. The ‘Poisson’ assumption on the model variance turned out to be too restrictive to model our data. The Negative Binomial model gave better results, but examining the standardized residual plot we noticed that some of them were still very large.

A posterior predictive check considering that the data are clustered by building.

library(rstan)
library(dplyr)
library(lubridate)
library(ggplot2)
library(bayesplot)

theme_set(bayesplot::theme_default())
rstan_options(auto_write=TRUE)

Try to expand the previous model with varying slope.

\[ \text{complaints}_{ib} \sim \text{Neg-Binomial}(\lambda_{ib}, \phi) \\ \lambda_{ib} = \exp{(\eta_{ib})}\\ \eta_{ib} = \mu_{b(i)} + \kappa_{b(i)} \, \texttt{traps}_{ib} + \text{log_sq_foot}_i \\ \mu_b \sim \text{Normal}(\alpha + \texttt{building_data} \, \zeta, \sigma_{\mu}) \\ \kappa_b \sim \text{Normal}(\beta + \texttt{building_data} \, \gamma, \sigma_{\kappa}) \]

stan_dat_hier <- readRDS('pest_data_longer_stan_dat.RDS')

comp_model_NB_hier_slopes <- stan_model('hier_NB_regression_ncp_slopes_mod.stan')

fitted_model_NB_hier_slopes <-
  sampling(
    comp_model_NB_hier_slopes,
    data = stan_dat_hier,
    chains = 4#, #cores = 4,
#    control = list(adapt_delta = 0.95)
  )

## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0.000128 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.28 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 11.4609 seconds (Warm-up)
## Chain 1:                5.8632 seconds (Sampling)
## Chain 1:                17.3241 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 9.7e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.97 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 10.8796 seconds (Warm-up)
## Chain 2:                5.48696 seconds (Sampling)
## Chain 2:                16.3666 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 0.000108 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 1.08 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 10.6592 seconds (Warm-up)
## Chain 3:                9.55951 seconds (Sampling)
## Chain 3:                20.2188 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 0.000112 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 1.12 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 10.4868 seconds (Warm-up)
## Chain 4:                8.65596 seconds (Sampling)
## Chain 4:                19.1427 seconds (Total)
## Chain 4:

mcmc_hist(
  as.matrix(fitted_model_NB_hier_slopes, pars = "sigma_kappa"),
  binwidth = 0.005
)

print(fitted_model_NB_hier_slopes, pars = c('kappa','beta','alpha','phi','sigma_mu','sigma_kappa','mu'))

## Inference for Stan model: hier_NB_regression_ncp_slopes_mod.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##              mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
## kappa[1]    -0.02    0.00 0.07 -0.14 -0.07 -0.03  0.02  0.14  1156    1
## kappa[2]    -0.42    0.00 0.09 -0.61 -0.48 -0.41 -0.35 -0.25  1439    1
## kappa[3]    -0.59    0.00 0.10 -0.80 -0.65 -0.58 -0.52 -0.38  3907    1
## kappa[4]    -0.22    0.00 0.07 -0.36 -0.27 -0.22 -0.18 -0.09  1903    1
## kappa[5]    -0.60    0.00 0.09 -0.78 -0.66 -0.60 -0.54 -0.42  2697    1
## kappa[6]    -0.43    0.00 0.10 -0.65 -0.50 -0.43 -0.37 -0.25  1672    1
## kappa[7]    -0.31    0.00 0.07 -0.44 -0.35 -0.31 -0.26 -0.19  4662    1
## kappa[8]    -0.23    0.00 0.15 -0.54 -0.32 -0.22 -0.13  0.05  1226    1
## kappa[9]     0.08    0.00 0.06 -0.03  0.04  0.08  0.12  0.19  3964    1
## kappa[10]   -0.73    0.00 0.15 -1.01 -0.83 -0.73 -0.64 -0.41  1029    1
## beta        -0.35    0.00 0.06 -0.47 -0.38 -0.35 -0.31 -0.24  1584    1
## alpha        1.43    0.01 0.29  0.82  1.24  1.44  1.62  1.96  1635    1
## phi          1.61    0.00 0.19  1.26  1.47  1.60  1.73  2.00  3224    1
## sigma_mu     0.47    0.02 0.39  0.02  0.17  0.37  0.66  1.45   544    1
## sigma_kappa  0.12    0.00 0.07  0.03  0.07  0.10  0.15  0.30   686    1
## mu[1]        0.34    0.02 0.68 -1.22 -0.03  0.43  0.80  1.49  1159    1
## mu[2]        1.64    0.01 0.49  0.75  1.30  1.61  1.98  2.65  1449    1
## mu[3]        2.13    0.01 0.32  1.53  1.90  2.12  2.34  2.78  3821    1
## mu[4]        1.49    0.01 0.50  0.51  1.17  1.49  1.81  2.51  1991    1
## mu[5]        2.40    0.01 0.41  1.59  2.12  2.38  2.68  3.20  3067    1
## mu[6]        1.90    0.01 0.38  1.23  1.65  1.87  2.12  2.75  1665    1
## mu[7]        2.68    0.00 0.26  2.20  2.50  2.67  2.85  3.21  4057    1
## mu[8]       -0.55    0.03 0.95 -2.37 -1.17 -0.59  0.08  1.41  1369    1
## mu[9]        0.23    0.01 0.56 -0.84 -0.14  0.22  0.59  1.35  3646    1
## mu[10]       1.90    0.04 1.03 -0.55  1.31  2.03  2.62  3.59   701    1
## 
## Samples were drawn using NUTS(diag_e) at Thu May 21 00:59:40 2020.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

mcmc_hist(
  as.matrix(fitted_model_NB_hier_slopes, pars = "beta"),
  binwidth = 0.005
)

y_rep <- as.matrix(fitted_model_NB_hier_slopes, pars = "y_rep")
ppc_dens_overlay(
  y = stan_dat_hier$complaints,
  yrep = y_rep[1:200,]
)

Consider to extend your model adding a log-additive monthly effect \(\texttt{mo}_t\).

select_vec <- which(stan_dat_hier$mo_idx %in% 1:12)
ppc_stat_grouped(
  y = stan_dat_hier$complaints[select_vec],
  yrep = y_rep[,select_vec],
  group = stan_dat_hier$mo_idx[select_vec],
  stat = 'mean'
) + xlim(0, 11)

The linear predictor of the random intercept and random slope Negative Binomial model is then:

\[ \eta_{bt} = \mu_{b} + \kappa_{b} \, \texttt{traps}_{bt} + \texttt{mo}_t + \text{log_sq_foot}_b \]

Then define an autoregressive prior on the monthly effects:

\[ \texttt{mo}_t \sim \text{Normal}(\rho \, \texttt{mo}_{t-1}, \sigma_\texttt{mo}) \\ \equiv \\ \texttt{mo}_t = \rho \, \texttt{mo}_{t-1} +\epsilon_t , \quad \epsilon_t \sim \text{Normal}(0, \sigma_\texttt{mo}) \\ \quad \rho \in [-1,1] \]

Using the stationary assumption of the AR models, find the marginal distribution for \(\texttt{mo}_1\).

The marginal variance:

\[Var(\texttt{mo}_{t})=Var(\rho \texttt{mo}_{t-1}) + Var(\epsilon_t)\] by independence of \(\epsilon_{t-1}\) and \(\epsilon_{t}\). Then, by stationarity \[Var(\texttt{mo}_{t})=\rho^2 Var(\texttt{mo}_{t}) + \sigma^2_{mo}\] \[Var(\texttt{mo}_{t})=\frac{\sigma^2_{mo}}{1-\rho^2}\]

The marginal mean \[E(\texttt{mo}_{t})=E(\rho \texttt{mo}_{t-1}) + E(\epsilon_t)\] \[E(\texttt{mo}_{t})=\frac{0}{1-\rho}=0\]

for \(\rho\ne 1\).

Thus,

\[\texttt{mo}_{1} \sim Normal\left (0,\frac{\sigma_{mo}}{\sqrt{1-\rho^2}} \right)\]

Use the follow setting to define the prior on the parameter \(\rho\):

\[ \rho_{\text{raw}} \in [0, 1] \\ \rho = 2 \times \rho_{\text{raw}} - 1\\ \rho_{\text{raw}} \sim Beta(10,5) \]

comp_model_NB_hier_mos <- stan_model('hier_NB_regression_ncp_slopes_mod_mos.stan')

fitted_model_NB_hier_mos <- sampling(comp_model_NB_hier_mos, data = stan_dat_hier, chains = 4, 
                                     #cores = 4, 
                                     control = list(adapt_delta = 0.9))

## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod_mos' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0.000138 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.38 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 17.7287 seconds (Warm-up)
## Chain 1:                12.7718 seconds (Sampling)
## Chain 1:                30.5005 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod_mos' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 8.6e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.86 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 20.3316 seconds (Warm-up)
## Chain 2:                19.4778 seconds (Sampling)
## Chain 2:                39.8094 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod_mos' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 9.1e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.91 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 18.3813 seconds (Warm-up)
## Chain 3:                12.5758 seconds (Sampling)
## Chain 3:                30.9571 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod_mos' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 0.000122 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 1.22 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 19.6604 seconds (Warm-up)
## Chain 4:                15.1678 seconds (Sampling)
## Chain 4:                34.8283 seconds (Total)
## Chain 4:

y_rep <- as.matrix(fitted_model_NB_hier_mos, pars = "y_rep")
ppc_dens_overlay(
  y = stan_dat_hier$complaints,
  yrep = y_rep[1:200,]
)

select_vec <- which(stan_dat_hier$mo_idx %in% 1:12)
ppc_stat_grouped(
  y = stan_dat_hier$complaints[select_vec],
  yrep = y_rep[,select_vec],
  group = stan_dat_hier$mo_idx[select_vec],
  stat = 'mean'
)

# overlay prior density curve on posterior draws
gen_rho_prior <- function(x) {
  alpha <- 10; beta <- 5
  a <- -1; c <- 1
  lp <- (alpha - 1) * log(x - a) +
        (beta - 1) * log(c - x) -
        (alpha + beta - 1) * log(c - a) -
         lbeta(alpha, beta)
  return(exp(lp))
}
mcmc_hist(as.matrix(fitted_model_NB_hier_mos, pars = "rho"),
          freq = FALSE, binwidth = 0.01) +
  overlay_function(fun = gen_rho_prior) +
  xlim(-1,1)

print(fitted_model_NB_hier_mos, pars = c('rho','sigma_mu','sigma_kappa', 'mu', 'kappa'))

## Inference for Stan model: hier_NB_regression_ncp_slopes_mod_mos.
## 4 chains, each with iter=2000; warmup=1000; thin=1; 
## post-warmup draws per chain=1000, total post-warmup draws=4000.
## 
##              mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
## rho          0.78    0.00 0.08  0.59  0.73  0.79  0.84  0.91  1281 1.00
## sigma_mu     0.32    0.01 0.25  0.01  0.12  0.27  0.46  0.91  1161 1.01
## sigma_kappa  0.08    0.00 0.05  0.01  0.05  0.07  0.11  0.22   691 1.00
## mu[1]        0.88    0.01 0.62 -0.28  0.47  0.85  1.27  2.17  1839 1.00
## mu[2]        0.50    0.01 0.52 -0.54  0.16  0.49  0.84  1.53  1904 1.00
## mu[3]        0.77    0.01 0.45 -0.12  0.49  0.77  1.05  1.64  1466 1.00
## mu[4]        0.74    0.01 0.56 -0.42  0.38  0.75  1.12  1.80  1554 1.00
## mu[5]        0.79    0.01 0.50 -0.19  0.48  0.79  1.11  1.79  1657 1.00
## mu[6]        0.81    0.01 0.48 -0.13  0.50  0.81  1.12  1.76  1586 1.00
## mu[7]        1.51    0.01 0.43  0.66  1.24  1.52  1.79  2.40  1404 1.00
## mu[8]        0.25    0.02 0.80 -1.25 -0.30  0.22  0.76  1.87  2016 1.00
## mu[9]        0.97    0.01 0.59 -0.18  0.59  0.97  1.36  2.12  2280 1.00
## mu[10]       0.47    0.02 0.76 -1.08 -0.01  0.47  0.97  1.97  2514 1.00
## kappa[1]    -0.12    0.00 0.05 -0.23 -0.16 -0.12 -0.09 -0.03  2235 1.00
## kappa[2]    -0.28    0.00 0.07 -0.42 -0.33 -0.28 -0.24 -0.15  4319 1.00
## kappa[3]    -0.26    0.00 0.07 -0.40 -0.31 -0.26 -0.22 -0.13  6328 1.00
## kappa[4]    -0.18    0.00 0.05 -0.28 -0.22 -0.18 -0.14 -0.07  1874 1.00
## kappa[5]    -0.33    0.00 0.07 -0.47 -0.38 -0.33 -0.29 -0.20  4419 1.00
## kappa[6]    -0.24    0.00 0.06 -0.38 -0.28 -0.24 -0.20 -0.13  3696 1.00
## kappa[7]    -0.06    0.00 0.04 -0.13 -0.08 -0.06 -0.03  0.02  3845 1.00
## kappa[8]    -0.41    0.00 0.11 -0.66 -0.48 -0.40 -0.33 -0.20  2067 1.00
## kappa[9]    -0.04    0.00 0.04 -0.12 -0.07 -0.04 -0.01  0.05  5513 1.00
## kappa[10]   -0.49    0.00 0.09 -0.66 -0.54 -0.48 -0.43 -0.33  3775 1.00
## 
## Samples were drawn using NUTS(diag_e) at Thu May 21 01:02:59 2020.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

ppc_intervals(
  y = stan_dat_hier$complaints,
  yrep = y_rep,
  x = stan_dat_hier$traps
) +
  labs(x = "Number of traps", y = "Number of complaints")

Compare all the fitted models using the approximate leave-one-out cross-validation.

comp_model_NB_hier_slopes_llik <- stan_model('hier_NB_regression_ncp_slopes_mod_loglik.stan')

## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -D_REENTRANT  -DBOOST_DISABLE_ASSERTS -DBOOST_PENDING_INTEGER_LOG2_HPP -include stan/math/prim/mat/fun/Eigen.hpp   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:4:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:4:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1

fitted_model_NB_hier_slopes_llik <-
  sampling(
    comp_model_NB_hier_slopes_llik,
    data = stan_dat_hier,
    chains = 4, #cores = 4,
    control = list(adapt_delta = 0.95)
  )

## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod_loglik' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0.000155 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.55 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 18.1268 seconds (Warm-up)
## Chain 1:                12.4775 seconds (Sampling)
## Chain 1:                30.6043 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod_loglik' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 8.9e-05 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.89 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 18.2094 seconds (Warm-up)
## Chain 2:                13.3443 seconds (Sampling)
## Chain 2:                31.5537 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod_loglik' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 8.8e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.88 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 15.2266 seconds (Warm-up)
## Chain 3:                9.84581 seconds (Sampling)
## Chain 3:                25.0724 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod_loglik' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 0.000108 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 1.08 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 16.3516 seconds (Warm-up)
## Chain 4:                10.7538 seconds (Sampling)
## Chain 4:                27.1054 seconds (Total)
## Chain 4:

comp_model_NB_hier_mos_llik <- stan_model('hier_NB_regression_ncp_slopes_mod_mos_loglik.stan')

## Running /Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB foo.c
## clang -mmacosx-version-min=10.13 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG   -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include/"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/unsupported"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/BH/include" -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/src/"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/"  -I"/Library/Frameworks/R.framework/Versions/4.0/Resources/library/rstan/include" -DEIGEN_NO_DEBUG  -D_REENTRANT  -DBOOST_DISABLE_ASSERTS -DBOOST_PENDING_INTEGER_LOG2_HPP -include stan/math/prim/mat/fun/Eigen.hpp   -I/usr/local/include   -fPIC  -Wall -g -O2  -c foo.c -o foo.o
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:4:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:88:
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:1: error: unknown type name 'namespace'
## namespace Eigen {
## ^
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/src/Core/util/Macros.h:613:16: error: expected ';' after top level declarator
## namespace Eigen {
##                ^
##                ;
## In file included from <built-in>:1:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:4:
## In file included from /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Dense:1:
## /Library/Frameworks/R.framework/Versions/4.0/Resources/library/RcppEigen/include/Eigen/Core:96:10: fatal error: 'complex' file not found
## #include <complex>
##          ^~~~~~~~~
## 3 errors generated.
## make: *** [foo.o] Error 1

fitted_model_NB_hier_mos_llik <-
  sampling(
    comp_model_NB_hier_mos_llik,
    data = stan_dat_hier,
    chains = 4, #cores = 4,
    control = list(adapt_delta = 0.95)
  )

## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod_mos_loglik' NOW (CHAIN 1).
## Chain 1: 
## Chain 1: Gradient evaluation took 0.000159 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.59 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1: 
## Chain 1: 
## Chain 1: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 1: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 1: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 1: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 1: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 1: 
## Chain 1:  Elapsed Time: 21.5531 seconds (Warm-up)
## Chain 1:                11.9649 seconds (Sampling)
## Chain 1:                33.518 seconds (Total)
## Chain 1: 
## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod_mos_loglik' NOW (CHAIN 2).
## Chain 2: 
## Chain 2: Gradient evaluation took 0.000113 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 1.13 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2: 
## Chain 2: 
## Chain 2: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 2: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 2: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 2: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 2: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 2: 
## Chain 2:  Elapsed Time: 21.5646 seconds (Warm-up)
## Chain 2:                14.6373 seconds (Sampling)
## Chain 2:                36.2019 seconds (Total)
## Chain 2: 
## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod_mos_loglik' NOW (CHAIN 3).
## Chain 3: 
## Chain 3: Gradient evaluation took 9.2e-05 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.92 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3: 
## Chain 3: 
## Chain 3: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 3: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 3: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 3: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 3: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 3: 
## Chain 3:  Elapsed Time: 23.145 seconds (Warm-up)
## Chain 3:                12.4903 seconds (Sampling)
## Chain 3:                35.6352 seconds (Total)
## Chain 3: 
## 
## SAMPLING FOR MODEL 'hier_NB_regression_ncp_slopes_mod_mos_loglik' NOW (CHAIN 4).
## Chain 4: 
## Chain 4: Gradient evaluation took 0.000137 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 1.37 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4: 
## Chain 4: 
## Chain 4: Iteration:    1 / 2000 [  0%]  (Warmup)
## Chain 4: Iteration:  200 / 2000 [ 10%]  (Warmup)
## Chain 4: Iteration:  400 / 2000 [ 20%]  (Warmup)
## Chain 4: Iteration:  600 / 2000 [ 30%]  (Warmup)
## Chain 4: Iteration:  800 / 2000 [ 40%]  (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%]  (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%]  (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%]  (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%]  (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%]  (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%]  (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%]  (Sampling)
## Chain 4: 
## Chain 4:  Elapsed Time: 21.1044 seconds (Warm-up)
## Chain 4:                12.5108 seconds (Sampling)
## Chain 4:                33.6152 seconds (Total)
## Chain 4:

library(loo)
log_lik_slopes <- extract_log_lik(fitted_model_NB_hier_slopes_llik)
loo_slopes <- loo(log_lik_slopes, 
                  r_eff=relative_eff(exp(log_lik_slopes),rep(c(1,2,3,4),each=1000)))
loo_slopes

## 
## Computed from 4000 by 360 log-likelihood matrix
## 
##          Estimate   SE
## elpd_loo   -874.9 28.6
## p_loo        17.3  2.3
## looic      1749.7 57.3
## ------
## Monte Carlo SE of elpd_loo is NA.
## 
## Pareto k diagnostic values:
##                          Count Pct.    Min. n_eff
## (-Inf, 0.5]   (good)     357   99.2%   299       
##  (0.5, 0.7]   (ok)         2    0.6%   726       
##    (0.7, 1]   (bad)        0    0.0%   <NA>      
##    (1, Inf)   (very bad)   1    0.3%   299       
## See help('pareto-k-diagnostic') for details.

log_lik_mos <- extract_log_lik(fitted_model_NB_hier_mos_llik)
loo_mos <- loo(log_lik_mos,
               r_eff=relative_eff(exp(log_lik_mos),rep(c(1,2,3,4),each=1000)))
loo_mos

## 
## Computed from 4000 by 360 log-likelihood matrix
## 
##          Estimate   SE
## elpd_loo   -743.9 26.8
## p_loo        42.5  3.8
## looic      1487.8 53.6
## ------
## Monte Carlo SE of elpd_loo is 0.2.
## 
## Pareto k diagnostic values:
##                          Count Pct.    Min. n_eff
## (-Inf, 0.5]   (good)     350   97.2%   327       
##  (0.5, 0.7]   (ok)        10    2.8%   321       
##    (0.7, 1]   (bad)        0    0.0%   <NA>      
##    (1, Inf)   (very bad)   0    0.0%   <NA>      
## 
## All Pareto k estimates are ok (k < 0.7).
## See help('pareto-k-diagnostic') for details.

loo_compare(loo_slopes, loo_mos)

##        elpd_diff se_diff
## model2    0.0       0.0 
## model1 -130.9      12.5

Lab sessions 6 - Bayesian Statistics

May, 21 2020

Cockroaches

Recap