13  Bayesian regression analyses

Caution: Experimental

This feature is still in an experimental state. This means that the syntax structure, arguments, and defaults may change in future versions of scan and may not be backwards compatible.

Starting with version 0.63.0, scan includes Bayesian regression analyses through the bplm() function (Bayesian piecewise linear model).

In inferential statistics, a distinction is made between frequentist and Bayesian approaches. Frequentist statistics assess the probability of observing the data under the assumption that a null hypothesis (there is no effect or association) is true.

Bayesian statistics, on the other hand, begins with prior distributions that represent initial beliefs (priors) about the parameters of interest. These priors are then updated using observed data through Bayes’ theorem, which means that the initial beliefs about the parameters are adjusted in proportion to how well they explain the data, producing a posterior distribution that reflects both prior knowledge and new evidence. The Bayesian approach evaluates how well the data fit different parameter values by computing the likelihood of the data given these parameter estimates, rather than testing against a fixed null hypothesis.

The Bayesian approach is computationally intensive and often produces results that are practically similar to those of a frequentist analysis. However, it offers several advantages. In particular, when working with small samples, incorporating prior knowledge can improve parameter estimation. Additionally, Bayesian statistics does not require uniform distributional assumptions for all variables but allows each variable to have its own empirically derived distribution. Another advantage is its greater robustness against overspecified models, especially when too many predictors are included and exhibit high collinearity (intercorrelations) while the number of data points is limited.

These advantages make it worthwhile to use a Bayesian approach for single-case data.

Tip: The bplm() function call:

bplm(
  data,
  dvar,
  pvar,
  mvar,
  model = c("W", "H-M", "B&L-B"),
  contrast_level = c("first", "preceding"),
  contrast_slope = c("first", "preceding"),
  trend = TRUE,
  level = TRUE,
  slope = TRUE,
  random_trend = FALSE,
  random_level = FALSE,
  random_slope = FALSE,
  fixed = NULL,
  random = NULL,
  update_fixed = NULL,
  ...
)

The bplm() function computes a piecewise regression analysis. The syntax is quite similar to that of the plm() and hplm() functions. There you can find details about the general piecewise regression model, the interpretation of regression estimates, and the setting of contrasts in models with more than two phases.

bplm() works for single-case data frames with one or multiple cases.

Note

bplm() uses the MCMCglmm package to compute the models. Parameters can be passed directly to the underlying MCMCglmm() function through the arguments provided with bplm(), e.g., nitt for setting the number of iterations or prior for setting the priors.
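For example, the MCMC settings could be passed through like this (a sketch; nitt, burnin, and thin are MCMCglmm() arguments, and the values shown are MCMCglmm's defaults):

```r
library(scan)

# Pass MCMC settings through bplm() to the underlying MCMCglmm() call.
# nitt, burnin, and thin are MCMCglmm() arguments; the values shown here
# are MCMCglmm's defaults, made explicit for illustration.
bplm(exampleAB$Johanna, nitt = 13000, burnin = 3000, thin = 10)
```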

13.1 A one case analysis

Here is an example analysis of a one-case dataset:

bplm(exampleAB$Johanna)
Bayesian Piecewise Linear Regression

Contrast model: W (level: first, slope: first)
Deviance Information Criterion: 127.1139 

B-structure - Fixed effects (values ~ 1 + mt + phaseB + interB)

                            B lower 95% CI upper 95% CI sample size     p
Intercept              54.162       46.493       62.597     1000.00 0.001
Trend (mt)              0.158       -2.973        3.593     1000.00 0.936
Level phase B (phaseB)  7.793       -5.467       19.349      978.81 0.196
Slope phase B (interB)  1.472       -2.026        4.688     1000.00 0.378

R-Structure - Residuals

          SD lower 95% CI upper 95% CI 
       5.371        3.413        7.031 

In Bayesian terminology, the B-structure describes the fixed effects in a regression model and the R-structure the residuals of a model.

Let us take a closer look at the B-structure. As in any regression model, the coefficients represent the expected effects of the predictors on the outcome variable. In the present model, the intercept reflects the expected level of the outcome at time point zero in Phase A. The trend coefficient represents the expected change in the outcome for each additional measurement point within Phase A. The level coefficient (level effect) represents the change when Phase B begins. The slope coefficient (slope effect) represents the change in slope in Phase B compared to Phase A. Thus, the Phase B slope is given by the sum of the coefficients for trend and slope.
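As a quick back-of-the-envelope check using the posterior means from the table above (not part of the bplm() output):

```r
# The Phase B slope is the sum of the trend and slope-change coefficients
trend_A <- 0.158   # Trend (mt): slope in Phase A
slope_B <- 1.472   # Slope phase B (interB): change in slope
trend_A + slope_B  # expected slope in Phase B: 1.63
```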

The CI values (here, CI stands for credible interval, not confidence interval) indicate how precisely the regression coefficients are estimated. The p value indicates how likely it is that the corresponding regression coefficient is on the opposite side of zero.

Note: Methodological details

We have to keep in mind that the Bayesian computational approach works somewhat differently from a linear mixed-effects regression estimated with maximum likelihood. The MCMC algorithm repeatedly generates new parameter values (i.e., iterations of parameter estimates) and evaluates how plausible these values are given the data and the prior assumptions. Over many iterations, this produces a sample from the posterior distribution of the model parameters. This posterior sample forms the basis for estimating the regression parameters and their uncertainty.

For this analysis, the maximum number of iterations was set to 13,000 (nitt = 13000). The first 3,000 iterations are discarded (burnin = 3000), because during this initial phase the Markov chain moves from its starting values toward the posterior distribution. From the remaining 10,000 iterations, only every tenth draw is stored (thin = 10) in order to reduce autocorrelation between consecutive samples.

The effective sample size (here the column sample size) indicates how many independent samples these draws correspond to after accounting for autocorrelation in the MCMC chain.

The B column shows the mean of a parameter from the posterior sample. The CI columns show the credible interval, which contains the central 95% of the posterior distribution after excluding the lowest 2.5% and the highest 2.5% of the sampled values.

The p column depicts two times the proportion of the posterior sample that falls below zero (or, for negative B values, above zero). This p-value is therefore conceptually different from the p-value of a frequentist regression analysis, which tests against a null hypothesis. It is conceptually closer to the p-value derived from a randomization test.
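How these summary statistics are derived from a posterior sample can be sketched in a few lines of base R (the posterior draws below are invented for illustration and do not come from the model above):

```r
# Hypothetical posterior draws for one regression coefficient
post <- c(-0.5, 1.2, 2.1, 0.8, -0.1, 1.5, 0.9, 2.4, 1.1, 0.3)

b  <- mean(post)                               # the B column: posterior mean
ci <- quantile(post, c(0.025, 0.975))          # central 95% credible interval
p  <- 2 * min(mean(post < 0), mean(post > 0))  # two-sided MCMC p-value
```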

13.2 Multiple case analysis

For multilevel designs (i.e., multiple cases), a G-structure is reported, which describes the between-case effects (similar to the random effects of a linear mixed-effects model, but without the residual effects).

Here is an example of a multi-case dataset:

bplm(exampleAB_50)
Bayesian Piecewise Linear Regression

Contrast model: W (level: first, slope: first)
50 Cases

Deviance Information Criterion: 8574.766 

B-structure - Fixed effects (values ~ 1 + mt + phaseB + interB)

                            B lower 95% CI upper 95% CI sample size     p
Intercept              48.393       45.617       51.261    1000.000 0.001
Trend (mt)              0.574        0.379        0.818    1098.214 0.001
Level phase B (phaseB) 14.054       12.874       15.345    1000.000 0.001
Slope phase B (interB)  0.908        0.685        1.128    1000.000 0.001

G-Structure - Random effects (~case)

 Parameter     SD lower 95% CI upper 95% CI
 Intercept 10.249        8.379       12.559

R-Structure - Residuals

          SD lower 95% CI upper 95% CI 
       5.299        5.081        5.478 

13.3 Setting priors

The following example shows the influence of priors on parameter estimation. First, we create a random case from previously defined parameters:
The starting value (intercept) is 50 with a standard deviation of 10. The level effect for Phase B is one standard deviation (that is, 10 points), and there is neither a slope nor a trend effect. Random noise is introduced with 20% measurement error (reliability is 0.8).

set.seed(123) #set random seed for replicability of the example
des <- design(
  start_value = 50, 
  s = 10,
  level = list(A = 0, B = 1), 
  trend = list(0),
  slope = list(0),
  rtt = 0.8
)
scdf <- random_scdf(des)
scplot(scdf)

Here are the estimations from a Bayesian model without informative priors:

bplm(scdf)
Bayesian Piecewise Linear Regression

Contrast model: W (level: first, slope: first)
Deviance Information Criterion: 128.5297 

B-structure - Fixed effects (values ~ 1 + mt + phaseB + interB)

                            B lower 95% CI upper 95% CI sample size     p
Intercept              49.323       40.969       57.548     905.804 0.001
Trend (mt)              0.826       -2.195        4.368    1146.096 0.618
Level phase B (phaseB)  8.238       -4.645       19.994    1684.465 0.186
Slope phase B (interB) -0.984       -4.650        2.130    1109.566 0.532

R-Structure - Residuals

          SD lower 95% CI upper 95% CI 
       5.599        3.457        7.517 

Now we introduce our prior knowledge: an intercept of 50, a trend and slope effect of 0, and a level effect of 10. We also assume that our prior is quite uncertain (i.e., a weakly informative prior). mu sets the prior values for the four parameters in the order they appear in the regression model. V is a diagonal matrix with the variances of these estimates; it sets the strength of the prior. Here we set the variances to 100, which corresponds to a standard deviation of 10 (\(SD^2 = 10^2 = 100\)).

As we are setting the fixed parameters (B-structure) of the model, we enclose these settings in a list with the name B. The argument prior takes a list of lists, so we need to enclose the B-structure in a list as well.

prior <- list(
  B = list(
    mu = c(50, 0, 10, 0),
    V = diag(c(100, 100, 100, 100))
  )
)
bplm(scdf, prior = prior)
Bayesian Piecewise Linear Regression

Contrast model: W (level: first, slope: first)
Deviance Information Criterion: 127.6719 

B-structure - Fixed effects (values ~ 1 + mt + phaseB + interB)

                            B lower 95% CI upper 95% CI sample size     p
Intercept              49.375       42.277       56.028        1000 0.001
Trend (mt)              0.680       -2.038        3.266        1000 0.596
Level phase B (phaseB)  9.053       -0.393       18.603        1000 0.062
Slope phase B (interB) -0.859       -3.535        1.954        1000 0.546

R-Structure - Residuals

          SD lower 95% CI upper 95% CI 
       5.432        3.478        7.197 

Now we assume that we have some certainty (i.e., a prior of medium strength) by setting the variances to 10 (SD ≈ 3.2):

prior <- list(
  B = list(mu = c(50, 0, 10, 0), V = diag(c(10, 10, 10, 10)))  # Prior for regression effects
)
bplm(scdf, prior = prior)
Bayesian Piecewise Linear Regression

Contrast model: W (level: first, slope: first)
Deviance Information Criterion: 125.5162 

B-structure - Fixed effects (values ~ 1 + mt + phaseB + interB)

                            B lower 95% CI upper 95% CI sample size     p
Intercept              50.084       45.768       55.009    1000.000 0.001
Trend (mt)              0.409       -1.097        1.999    1000.000 0.620
Level phase B (phaseB)  9.746        4.723       15.148     895.603 0.001
Slope phase B (interB) -0.589       -2.465        1.105    1000.000 0.540

R-Structure - Residuals

          SD lower 95% CI upper 95% CI 
       5.293        3.480        6.953 

Now we are making somewhat incorrect and uncertain assumptions:

prior <- list(
  B = list(mu = c(40, 2, 5, 2), V = diag(c(100, 100, 100, 100)))
)
bplm(scdf, prior = prior)
Bayesian Piecewise Linear Regression

Contrast model: W (level: first, slope: first)
Deviance Information Criterion: 127.7585 

B-structure - Fixed effects (values ~ 1 + mt + phaseB + interB)

                            B lower 95% CI upper 95% CI sample size     p
Intercept              47.789       39.874       54.501    1000.000 0.001
Trend (mt)              1.396       -1.397        4.187    1000.000 0.322
Level phase B (phaseB)  6.931       -3.901       15.898    1136.959 0.186
Slope phase B (interB) -1.552       -4.740        1.227    1000.000 0.302

R-Structure - Residuals

          SD lower 95% CI upper 95% CI 
       5.542        3.536        7.484 

Finally, we make the wrong assumptions with medium certainty:

prior <- list(
  B = list(mu = c(40, 2, 5, 2), V = diag(c(10, 10, 10, 10)))
)
bplm(scdf, prior = prior)
Bayesian Piecewise Linear Regression

Contrast model: W (level: first, slope: first)
Deviance Information Criterion: 127.267 

B-structure - Fixed effects (values ~ 1 + mt + phaseB + interB)

                            B lower 95% CI upper 95% CI sample size     p
Intercept              44.321       39.423       48.356        1000 0.001
Trend (mt)              2.287        0.832        3.828        1000 0.004
Level phase B (phaseB)  5.500        0.509       11.276        1000 0.060
Slope phase B (interB) -2.385       -4.307       -0.758        1000 0.014

R-Structure - Residuals

          SD lower 95% CI upper 95% CI 
       5.598        3.618        7.465