Fitting a regression model with uncertainty in the explanatory variable.
Regression is a mainstay of ecological and evolutionary data analysis. For example, a disease ecologist might use body size (e.g. a weight from a scale with measurement error) to predict infection status. Classical linear regression assumes that the covariates are measured without error, i.e. that they are known exactly. This is rarely the case in ecology, and ignoring error in covariates can bias estimates of the regression coefficients. This is where model II regression (also known as errors-in-variables or measurement error regression) comes in handy. Here I'll demonstrate how to construct such a model in a Bayesian framework, where substantive prior knowledge of the covariate error facilitates less-biased parameter estimates.
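The direction of the bias has a well-known form in the simplest case (a standard result, stated here for context rather than derived): when the observed covariate equals the true covariate plus independent noise, the expected least-squares slope is attenuated toward zero by the reliability ratio,

\[ \mathbb{E}[\hat{\beta}_{\text{naive}}] \approx \beta \, \frac{\sigma_x^2}{\sigma_x^2 + \sigma_{\text{obs}}^2}, \]

where \(\sigma_x^2\) is the variance of the true covariate and \(\sigma_{\text{obs}}^2\) is the measurement error variance. With the simulation settings used below (\(\sigma_x = 6\), \(\sigma_{\text{obs}} = 5\)), the reliability ratio is \(36/61 \approx 0.59\), so a naive fit should recover a slope near 5.9 rather than the true value of 10.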
Here’s a quick illustration of the problem: I’ll generate data from a known simple linear regression model, and fit models that ignore or incorporate error in the covariate.
library(rjags)   # interface to JAGS for model fitting
library(ggmcmc)  # ggs() tidies posterior samples; also loads ggplot2 for plotting
# simulate covariate data
n <- 40
sdx <- 6                      # sd of the true covariate values
sdobs <- 5                    # sd of the measurement error in X
taux <- 1 / (sdobs * sdobs)   # known measurement precision (informs the prior later)
truex <- rnorm(n, 0, sdx)     # true (unobserved) covariate values
errorx <- rnorm(n, 0, sdobs)  # measurement error
obsx <- truex + errorx        # observed, error-prone covariate values

# simulate response data
alpha <- 0
beta <- 10
sdy <- 20                     # residual sd of the response
errory <- rnorm(n, 0, sdy)
obsy <- alpha + beta * truex + errory  # the response depends on the TRUE covariate
observed_data <- data.frame(obsx = obsx, obsy = obsy)
parms <- data.frame(alpha, beta)
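As a quick sanity check before the Bayesian fits (an illustrative addition using base R's lm, not part of the original analysis), ordinary least squares against the true covariate recovers the slope, while the same fit against the error-prone covariate is attenuated:

coef(lm(obsy ~ truex))  # slope should land near the true value of 10
coef(lm(obsy ~ obsx))   # slope attenuated toward zero, roughly 10 * 36 / (36 + 25)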
Ignoring error in the covariate:
# bundle data
jags_d <- list(x = obsx, y = obsy, n = length(obsx))
# write model
cat("
model {
  ## Priors
  alpha ~ dnorm(0, .001)
  beta ~ dnorm(0, .001)
  sdy ~ dunif(0, 100)
  tauy <- 1 / (sdy * sdy)

  ## Likelihood
  for (i in 1:n){
    mu[i] <- alpha + beta * x[i]
    y[i] ~ dnorm(mu[i], tauy)
  }
}
", fill = TRUE, file = "yerror.txt")
# initiate model
mod1 <- jags.model("yerror.txt", data = jags_d,
                   n.chains = 3, n.adapt = 1000)
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 40
Unobserved stochastic nodes: 3
Total graph size: 170
Initializing model
# simulate posterior
out <- coda.samples(mod1, n.iter = 1000, thin = 1,
                    variable.names = c("alpha", "beta", "sdy"))
# store parameter estimates
ggd <- ggs(out)
a <- ggd$value[which(ggd$Parameter == "alpha")]
b <- ggd$value[which(ggd$Parameter == "beta")]
d <- data.frame(a, b)
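(Equivalently, one could extract the draws with coda alone; as.matrix() on the mcmc.list stacks all chains into a single matrix, one row per retained draw.)

post <- as.matrix(out)            # all chains, stacked
head(post[, c("alpha", "beta")])  # first few posterior draws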
Incorporating error in the covariate: I assume that we have substantive knowledge about the covariate measurement process, represented in the prior for the measurement precision of X. Further, the prior for the true X values reflects knowledge of the distribution of X in the population from which the sample was drawn.
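In equation form, the model below treats the true covariate values as latent quantities (written with precisions, as JAGS parameterizes the normal distribution):

\[ x_i^{\text{true}} \sim \mathrm{Normal}(0,\ 0.04), \qquad x_i^{\text{obs}} \sim \mathrm{Normal}(x_i^{\text{true}},\ \tau_x), \qquad y_i \sim \mathrm{Normal}(\alpha + \beta\, x_i^{\text{true}},\ \tau_y), \]

with the prior \(\tau_x \sim \mathrm{Uniform}(0.03, 0.05)\) centered near the known measurement precision \(1/5^2 = 0.04\).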
# specify model
cat("
model {
  ## Priors
  alpha ~ dnorm(0, .001)
  beta ~ dnorm(0, .001)
  sdy ~ dunif(0, 100)
  tauy <- 1 / (sdy * sdy)
  taux ~ dunif(.03, .05)  # informative prior around the known precision, 1/25 = 0.04

  ## Likelihood
  for (i in 1:n){
    truex[i] ~ dnorm(0, .04)      # prior for the latent true covariate values
    x[i] ~ dnorm(truex[i], taux)  # measurement model for the observed covariate
    mu[i] <- alpha + beta * truex[i]
    y[i] ~ dnorm(mu[i], tauy)
  }
}
", fill = TRUE, file = "xyerror.txt")
# bundle data
jags_d <- list(x = obsx, y = obsy, n = length(obsx))
# initiate model
mod2 <- jags.model("xyerror.txt", data = jags_d,
                   n.chains = 3, n.adapt = 1000)
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 80
Unobserved stochastic nodes: 44
Total graph size: 214
Initializing model
# simulate posterior
out <- coda.samples(mod2, n.iter = 30000, thin = 30,
                    variable.names = c("alpha", "beta", "tauy", "taux"))
# store parameter estimates
ggd <- ggs(out)
a2 <- ggd$value[which(ggd$Parameter == "alpha")]
b2 <- ggd$value[which(ggd$Parameter == "beta")]
d2 <- data.frame(a2, b2)
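A quick numerical comparison of the posterior mean slopes (an illustrative addition) previews the result:

mean(b)   # naive model: slope biased toward zero
mean(b2)  # measurement-error model: closer to the true slope of 10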
Now let’s see how the two models perform.
ggplot(observed_data, aes(x = obsx, y = obsy)) +
  geom_abline(aes(intercept = a, slope = b), data = d,
              color = "red", alpha = 0.05) +
  geom_abline(aes(intercept = a2, slope = b2), data = d2,
              color = "dodgerblue", alpha = 0.05) +
  geom_abline(aes(intercept = alpha, slope = beta),
              data = parms, color = "green", size = 1.5, linetype = "dashed") +
  geom_point(shape = 19, size = 3) +
  theme_minimal() +
  xlab("X values") +
  ylab("Observed Y values") +
  ggtitle("Model results with and without modeling error in X")
The dashed green line shows the model that generated the data, i.e. the "true" line. The red lines show posterior draws from the naive model that ignores error in X, while the blue lines show posterior draws from the model incorporating error in X, which cluster much closer to the truth.
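Because the latent true X values add 40 parameters, the second model's chains mix more slowly (hence the longer run with thinning above), so it is worth confirming convergence. A minimal check using coda's diagnostics:

gelman.diag(out)    # potential scale reduction factors near 1 suggest convergence
effectiveSize(out)  # effective number of independent draws per parameter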