Using spike and slab priors to shrink coefficients toward zero.
Stochastic search variable selection (SSVS) identifies promising subsets of multiple regression covariates via Gibbs sampling (George and McCulloch 1993). Here’s a short SSVS demo with JAGS and R.
Assume we have a multiple regression problem:
\[\boldsymbol{Y} \sim N_n(\boldsymbol{X \beta}, \sigma^2 \boldsymbol{I})\]
We suspect only a subset of the elements of \(\boldsymbol{\beta}\) are non-zero, i.e. some of the covariates have no effect.
Assume \(\boldsymbol{\beta}\) arises from one of two normal mixture components, depending on a latent variable \(\gamma_i\):
\[ \beta_i \mid \gamma_i \sim \left\{ \begin{array}{lr} N(0, \tau^2_i) & \gamma_i = 0\\ N(0, c^2_i \tau^2_i) & \gamma_i = 1 \end{array} \right. \]
Here \(\tau_i\) is positive but small, so that \(\beta_i\) is close to zero when \(\gamma_i = 0\), and \(c_i\) is large enough to allow reasonable deviations from zero when \(\gamma_i = 1\). The prior probability that covariate \(i\) has a nonzero effect is \(Pr(\gamma_i = 1) = p_i\).
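For intuition, the marginal prior on each coefficient is a two-component normal mixture: a spike concentrated at zero and a diffuse slab. Here is a minimal base-R sketch of that density, using the (illustrative) values adopted in the simulation below: \(\tau^2 = 0.004\), a slab variance multiplier of 1000, and \(p_i = 0.5\).
# spike-and-slab prior density for one coefficient; values match the
# simulation below and are illustrative, not canonical
tau2 <- 0.004   # spike variance
c2 <- 1000      # slab variance multiplier
p <- 0.5        # prior inclusion probability
beta_grid <- seq(-3, 3, length.out = 500)
dens <- (1 - p) * dnorm(beta_grid, 0, sqrt(tau2)) +  # spike component
  p * dnorm(beta_grid, 0, sqrt(c2 * tau2))           # slab component
plot(beta_grid, dens, type = "l",
     xlab = expression(beta), ylab = "Prior density")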
Let’s simulate a dataset in which some covariates have strong effects on the linear predictor, and others don’t.
library(gridExtra)
library(runjags)
library(ggmcmc)
library(coda)
library(knitr)

set.seed(123)  # for reproducibility

ncov <- 20          # number of covariates
nobs <- 60          # number of observations
var_beta <- .004    # spike variance (tau^2 above)
c <- 1000           # slab variance multiplier (the c_i^2 in the prior above)
p_inclusion <- .5   # prior probability that a covariate has a nonzero effect
sigma_y <- 1        # residual standard deviation

# generate covariates
X <- array(dim = c(nobs, ncov))
for (i in 1:ncov){
  X[, i] <- rnorm(nobs, 0, 1)
}

# draw each coefficient from the spike or the slab
included <- rbinom(ncov, 1, p_inclusion)
coefs <- rnorm(n = ncov,
               mean = 0,
               sd = ifelse(included == 1,
                           sqrt(var_beta * c),
                           sqrt(var_beta)))

# sort for cleaner plotting later (this breaks the pairing with `included`,
# which is not used again)
coefs <- sort(coefs)

# simulate the response
Y <- rnorm(nobs, mean = X %*% coefs, sd = sigma_y)
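As a quick sanity check (not part of the SSVS workflow), an ordinary least-squares fit gives noisy, unshrunk estimates of all 20 coefficients; comparing them to the truth motivates the selection prior.
# hedged sanity check: OLS estimates every coefficient, with no selection
ols <- lm(Y ~ X)
round(cbind(truth = coefs, ols = coef(ols)[-1]), 2)  # drop the intercept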
Specifying the model:
cat("model{
  alpha ~ dnorm(0, 1)
  sd_y ~ dunif(0, 10)
  tau_y <- pow(sd_y, -2)

  # ssvs priors
  sd_bet ~ dunif(0, 10)
  tau_in <- pow(sd_bet, -2)
  tau[1] <- tau_in          # spike: coefficient effectively zero
  tau[2] <- tau_in / 1000   # slab: nonzero coefficient
  p_ind[1] <- 1/2           # prior Pr(gamma[j] = 0)
  p_ind[2] <- 1 - p_ind[1]
  for (j in 1:ncov){
    indA[j] ~ dcat(p_ind[]) # returns 1 or 2
    gamma[j] <- indA[j] - 1 # returns 0 or 1
    beta[j] ~ dnorm(0, tau[indA[j]])
  }

  # likelihood
  for (i in 1:nobs){
    Y[i] ~ dnorm(alpha + X[i, ] %*% beta[], tau_y)
  }
}
", file = "ssvs.txt")
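The `dcat`/`indA` construction is one way to encode the binary indicator; an equivalent sketch (my restatement, not the model fit below) draws \(\gamma_j\) directly from a Bernoulli and uses it to index the precision vector.
# alternative indicator parameterization (a sketch; the fitted model above is
# unchanged); the file name ssvs_bern.txt is arbitrary
cat("model{
  alpha ~ dnorm(0, 1)
  sd_y ~ dunif(0, 10)
  tau_y <- pow(sd_y, -2)
  sd_bet ~ dunif(0, 10)
  tau_in <- pow(sd_bet, -2)
  tau[1] <- tau_in
  tau[2] <- tau_in / 1000
  for (j in 1:ncov){
    gamma[j] ~ dbern(0.5)                  # inclusion indicator: 0 or 1
    beta[j] ~ dnorm(0, tau[gamma[j] + 1])  # index the spike or slab precision
  }
  for (i in 1:nobs){
    Y[i] ~ dnorm(alpha + inprod(X[i, ], beta[]), tau_y)
  }
}
", file = "ssvs_bern.txt")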
Fitting the model:
dat <- list(Y = Y, X = X, nobs = nobs, ncov = ncov)
vars <- c("alpha", "sd_bet", "gamma", "beta", "tau_in", "sd_y")
out <- run.jags("ssvs.txt", vars, data = dat, n.chains = 3,
                adapt = 10000, burnin = 10000)
Compiling rjags model...
Calling the simulation using the rjags method...
Adapting the model for 10000 iterations...
Burning in the model for 10000 iterations...
Running the model for 10000 iterations...
Simulation complete
Calculating summary statistics...
Note: The monitored variables 'gamma[1]', 'gamma[2]', 'gamma[3]', 'gamma[19]' and 'gamma[20]' appear to be non-stochastic; they will not be included in the convergence diagnostic
Note: The monitored variable 'gamma[4]' appears to be stochastic in one chain but non-stochastic in another chain; it will not be included in the convergence diagnostic
Calculating the Gelman-Rubin statistic for 44 variables....
Finished running the simulation
The “non-stochastic” notes are expected: those inclusion indicators never switched mixture components during sampling, so there is no within-chain variation for the Gelman-Rubin diagnostic to use.
# convert to a ggs object for plotting and compute posterior summaries
outdf <- ggs(as.mcmc.list(out))
out_summary <- summary(out)
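Because each \(\gamma_j\) is binary, its posterior mean is the posterior inclusion probability \(Pr(\gamma_j = 1 \mid \boldsymbol{Y})\). These can also be pulled straight from the combined chains; a minimal sketch:
# posterior inclusion probabilities as posterior means of the gamma[j] draws
post <- as.matrix(as.mcmc.list(out))
p_incl <- colMeans(post[, grepl("^gamma\\[", colnames(post))])
round(p_incl, 2)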
Now, let’s visualize the inclusion probabilities.
# rows of the summary corresponding to the inclusion indicators
gamma_rows <- grepl("^gamma\\[", rownames(out_summary))
probs <- out_summary[gamma_rows, "Mean"]

# map each beta[i] to its position on the y-axis
labels <- paste0("beta[", 1:ncov, "]")
xdf <- data.frame(Parameter = labels, value = 1:ncov)

# caterpillar plot of the coefficient posteriors, with true values overlaid
p1 <- ggs_caterpillar(outdf, "beta", X = xdf) +
  theme_classic() +
  geom_vline(xintercept = 0, linetype = "longdash") +
  geom_point(data = data.frame(coefs, pos = 1:ncov),
             aes(x = coefs, y = pos), size = 5, col = "green4", alpha = .7) +
  xlab("Value") +
  ylab("Coefficient") +
  geom_hline(yintercept = 1:ncov, alpha = .05) +
  scale_y_continuous(breaks = seq(0, ncov, 1))

# posterior inclusion probability as a function of true effect size
df <- data.frame(probs = probs, coefs = coefs)
p2 <- ggplot(df, aes(x = abs(coefs), y = probs)) +
  geom_point(size = 5, alpha = .7) +
  theme_classic() +
  xlab("Absolute value of true coefficient") +
  ylab("Posterior inclusion probability")

grid.arrange(p1, p2, ncol = 2)
On the left, green points indicate true coefficient values, with black posterior credible intervals. The right plot shows the relationship between the true magnitude of the effect and the posterior probability that the coefficient was non-zero, \(E(\gamma_i \mid \boldsymbol{Y})\).
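If a single subset of covariates is needed rather than probabilities, a common rule is the median probability model (Barbieri and Berger 2004): retain every covariate whose posterior inclusion probability exceeds 0.5. A short sketch using the `probs` vector computed above:
# median probability model: covariates with posterior inclusion prob > 0.5
selected <- which(probs > 0.5)
selected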