Using spike and slab priors to shrink coefficients toward zero.
Stochastic search variable selection (SSVS) identifies promising subsets of multiple regression covariates via Gibbs sampling (George and McCulloch 1993). Here’s a short SSVS demo with JAGS and R.
Assume we have a multiple regression problem:
\[\boldsymbol{Y} \sim N_n(\boldsymbol{X \beta}, \sigma^2 \boldsymbol{I})\]
We suspect only a subset of the elements of \(\boldsymbol{\beta}\) are non-zero, i.e. some of the covariates have no effect.
Assume \(\boldsymbol{\beta}\) arises from one of two normal mixture components, depending on a latent variable \(\gamma_i\):
\[ \beta_i \mid \gamma_i \sim \left\{ \begin{array}{lr} N(0, \tau^2_i) & \gamma_i = 0\\ N(0, c^2_i \tau^2_i) & \gamma_i = 1 \end{array} \right. \]
Here \(\tau_i\) is positive but small, so that \(\beta_i\) is close to zero when \(\gamma_i = 0\), and \(c_i\) is large enough to allow reasonable deviations from zero when \(\gamma_i = 1\). The prior probability that covariate \(i\) has a nonzero effect is \(Pr(\gamma_i = 1) = p_i\).
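For intuition, the marginal prior on each coefficient is a two-component normal mixture: a spike concentrated at zero and a diffuse slab. Here is a minimal base-R sketch of that density, using the (illustrative) values adopted in the simulation below: \(\tau^2 = 0.004\), a slab variance multiplier of 1000, and \(p_i = 0.5\).
# spike-and-slab prior density for one coefficient; values match the
# simulation below and are illustrative, not canonical
tau2 <- 0.004   # spike variance
c2 <- 1000      # slab variance multiplier
p <- 0.5        # prior inclusion probability
beta_grid <- seq(-3, 3, length.out = 500)
dens <- (1 - p) * dnorm(beta_grid, 0, sqrt(tau2)) +  # spike component
  p * dnorm(beta_grid, 0, sqrt(c2 * tau2))           # slab component
plot(beta_grid, dens, type = "l",
     xlab = expression(beta), ylab = "Prior density")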
Let’s simulate a dataset in which some covariates have strong effects on the linear predictor, and others don’t.
library(gridExtra)
library(runjags)
library(ggmcmc)
library(coda)
library(knitr)

set.seed(123)  # for reproducibility

ncov <- 20          # number of covariates
nobs <- 60          # number of observations
var_beta <- .004    # spike variance (tau^2 above)
c <- 1000           # slab variance multiplier (the c_i^2 in the prior above)
p_inclusion <- .5   # prior probability that a covariate has a nonzero effect
sigma_y <- 1        # residual standard deviation

# generate covariates
X <- array(dim = c(nobs, ncov))
for (i in 1:ncov){
  X[, i] <- rnorm(nobs, 0, 1)
}

# draw each coefficient from the spike or the slab
included <- rbinom(ncov, 1, p_inclusion)
coefs <- rnorm(n = ncov,
               mean = 0,
               sd = ifelse(included == 1,
                           sqrt(var_beta * c),
                           sqrt(var_beta)))

# sort for cleaner plotting later (this breaks the pairing with `included`,
# which is not used again)
coefs <- sort(coefs)

# simulate the response
Y <- rnorm(nobs, mean = X %*% coefs, sd = sigma_y)
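As a quick sanity check (not part of the SSVS workflow), an ordinary least-squares fit gives noisy, unshrunk estimates of all 20 coefficients; comparing them to the truth motivates the selection prior.
# hedged sanity check: OLS estimates every coefficient, with no selection
ols <- lm(Y ~ X)
round(cbind(truth = coefs, ols = coef(ols)[-1]), 2)  # drop the intercept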
Specifying the model:
cat("model{
  alpha ~ dnorm(0, 1)
  sd_y ~ dunif(0, 10)
  tau_y <- pow(sd_y, -2)

  # ssvs priors
  sd_bet ~ dunif(0, 10)
  tau_in <- pow(sd_bet, -2)
  tau[1] <- tau_in          # spike: coefficient effectively zero
  tau[2] <- tau_in / 1000   # slab: nonzero coefficient
  p_ind[1] <- 1/2           # prior Pr(gamma[j] = 0)
  p_ind[2] <- 1 - p_ind[1]
  for (j in 1:ncov){
    indA[j] ~ dcat(p_ind[]) # returns 1 or 2
    gamma[j] <- indA[j] - 1 # returns 0 or 1
    beta[j] ~ dnorm(0, tau[indA[j]])
  }

  # likelihood
  for (i in 1:nobs){
    Y[i] ~ dnorm(alpha + X[i, ] %*% beta[], tau_y)
  }
}
", file = "ssvs.txt")
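The `dcat`/`indA` construction is one way to encode the binary indicator; an equivalent sketch (my restatement, not the model fit below) draws \(\gamma_j\) directly from a Bernoulli and uses it to index the precision vector.
# alternative indicator parameterization (a sketch; the fitted model above is
# unchanged); the file name ssvs_bern.txt is arbitrary
cat("model{
  alpha ~ dnorm(0, 1)
  sd_y ~ dunif(0, 10)
  tau_y <- pow(sd_y, -2)
  sd_bet ~ dunif(0, 10)
  tau_in <- pow(sd_bet, -2)
  tau[1] <- tau_in
  tau[2] <- tau_in / 1000
  for (j in 1:ncov){
    gamma[j] ~ dbern(0.5)                  # inclusion indicator: 0 or 1
    beta[j] ~ dnorm(0, tau[gamma[j] + 1])  # index the spike or slab precision
  }
  for (i in 1:nobs){
    Y[i] ~ dnorm(alpha + inprod(X[i, ], beta[]), tau_y)
  }
}
", file = "ssvs_bern.txt")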
Fitting the model:
dat <- list(Y = Y, X = X, nobs = nobs, ncov = ncov)
vars <- c("alpha", "sd_bet", "gamma", "beta", "tau_in", "sd_y")
out <- run.jags("ssvs.txt", vars, data = dat, n.chains = 3,
                adapt = 10000, burnin = 10000)
Compiling rjags model...
Calling the simulation using the rjags method...
Adapting the model for 10000 iterations...
Burning in the model for 10000 iterations...
Running the model for 10000 iterations...
Simulation complete
Calculating summary statistics...
Note: The monitored variables 'gamma[1]', 'gamma[2]', 'gamma[3]', 'gamma[19]' and 'gamma[20]' appear to be non-stochastic; they will not be included in the convergence diagnostic
Note: The monitored variable 'gamma[4]' appears to be stochastic in one chain but non-stochastic in another chain; it will not be included in the convergence diagnostic
Calculating the Gelman-Rubin statistic for 44 variables....
Finished running the simulation
The “non-stochastic” notes are expected: those inclusion indicators never switched mixture components during sampling, so there is no within-chain variation for the Gelman-Rubin diagnostic to use.
# convert to a ggs object for plotting and compute posterior summaries
outdf <- ggs(as.mcmc.list(out))
out_summary <- summary(out)
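Because each \(\gamma_j\) is binary, its posterior mean is the posterior inclusion probability \(Pr(\gamma_j = 1 \mid \boldsymbol{Y})\). These can also be pulled straight from the combined chains; a minimal sketch:
# posterior inclusion probabilities as posterior means of the gamma[j] draws
post <- as.matrix(as.mcmc.list(out))
p_incl <- colMeans(post[, grepl("^gamma\\[", colnames(post))])
round(p_incl, 2)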
Now, let’s visualize the inclusion probabilities.
# rows of the summary corresponding to the inclusion indicators
gamma_rows <- grepl("^gamma\\[", rownames(out_summary))
probs <- out_summary[gamma_rows, "Mean"]

# map each beta[i] to its position on the y-axis
labels <- paste0("beta[", 1:ncov, "]")
xdf <- data.frame(Parameter = labels, value = 1:ncov)

# caterpillar plot of the coefficient posteriors, with true values overlaid
p1 <- ggs_caterpillar(outdf, "beta", X = xdf) +
  theme_classic() +
  geom_vline(xintercept = 0, linetype = "longdash") +
  geom_point(data = data.frame(coefs, pos = 1:ncov),
             aes(x = coefs, y = pos), size = 5, col = "green4", alpha = .7) +
  xlab("Value") +
  ylab("Coefficient") +
  geom_hline(yintercept = 1:ncov, alpha = .05) +
  scale_y_continuous(breaks = seq(0, ncov, 1))

# posterior inclusion probability as a function of true effect size
df <- data.frame(probs = probs, coefs = coefs)
p2 <- ggplot(df, aes(x = abs(coefs), y = probs)) +
  geom_point(size = 5, alpha = .7) +
  theme_classic() +
  xlab("Absolute value of true coefficient") +
  ylab("Posterior inclusion probability")

grid.arrange(p1, p2, ncol = 2)
On the left, green points indicate true coefficient values, with black posterior credible intervals. The right plot shows the relationship between the true magnitude of the effect and the posterior probability that the coefficient was non-zero, \(E(\gamma_i \mid \boldsymbol{Y})\).
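If a single subset of covariates is needed rather than probabilities, a common rule is the median probability model (Barbieri and Berger 2004): retain every covariate whose posterior inclusion probability exceeds 0.5. A short sketch using the `probs` vector computed above:
# median probability model: covariates with posterior inclusion prob > 0.5
selected <- which(probs > 0.5)
selected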