12  Randomization tests

The rand_test function call

rand_test(data, dvar, pvar, statistic = c(“Mean B-A”, “Mean A-B”, “Median B-A”, “Median A-B”, “Mean |A-B|”, “Median |A-B|”), number = 500, complete = FALSE, limit = 5, startpoints = NA, exclude.equal = FALSE, phases = c(1, 2), graph = FALSE, output = NULL, seed = NULL)

The rand_test function computes a randomization test for single or multiple baseline single-case data. The function is based on an algorithm from the SCRT package (Bulté & Onghena, 2008, 2009), but has been rewritten and extended.

The basic idea of a randomization test is to think counterfactually: “Assuming that the phase had no effect on the measured data, what would be the difference between the phases of my case if I had started phase B at a different time? Given the possible phase differences under the assumption that phase had no effect, how likely are the actual phase differences of the original case?”

Therefore, a number of new cases are generated with a random start for each phase. This means that these new cases have the same data as the original case, but different starting points for each phase. A specific statistic (e.g., the mean difference between Phase A and Phase B data) is now calculated for each new case. When enough random cases have been generated, we also generate a series of new statistics (e.g., mean differences). The statistic for the original case is now compared to these new statistic values. The percentile of the original statistic within the new generated statistic values is the probability of the original statistic assuming a random distribution of starting points for each phase. This percentile is returned as the p-value in the randomization test analyses.

Warning in add_theme(., "basic"): Deprecated. Please use `set_theme()`
Figure 12.1: Illustration of a randomization test

12.1 Arguments of the rand_test() function

The statistics argument defines the statistics on which the phase comparison is based. The following comparisons can be made:

Mean A-B Uses the difference between the mean of Phase A and the mean of Phase B. This is appropriate if a decrease in scores is expected for Phase B.
Mean B-A Uses the difference between the mean of Phase B and the mean of Phase A. This is appropriate if a increase in scores is expected for Phase B.
Mean |A-B| Uses the absolute value of the difference between the means of Phases A and B.
Median A-B The same as Mean A-B, but based on the median.
Median B-A The same as Mean B-A, but based on the median.

Further argument are

Arguments of the randomization test function.
Argument What it does …
number Sample size of the randomization distribution. The exactness of the p-value can not exceed 1/number (i.e., number = 100 results in p-values with an exactness of one percent). Default is number = 500. For faster processing use number = 100. For more precise p-values set number = 1000.
complete If TRUE, the distribution is based on a complete permutation of all possible starting combinations. This setting overwrites the number Argument. The default setting is FALSE.
limit Minimal number of data points per phase in the sample. The first number refers to the A-phase and the second to the B-phase (e.g., limit = c(5, 3)). If only one number is given, this number is applied to both phases. Default is limit = 5.
startpoints Alternative to the limit-parameter, startpoints exactly defines the possible start points of phase B (e.g., startpoints = 4:9 restricts the phase B start points to measurements 4 to 9. startpoints overwrite the limit-parameter.
exclude.equal If set to FALSE, which is the default, random distribution values equal to the observed distribution are counted as null-hypothesis conform. That is, they decrease the probability of rejecting the null-hypothesis (increase the p-value). exclude.equal should be set to TRUE if you analyse one single-case design (not a multiple baseline data set) to reach a sufficient power. But be aware, that it increases the chance of an alpha-error.

12.2 Example

res <- rand_test(exampleAB)
res
Randomization Test

Combined test for three cases.

Comparing phase 1 against phase 2 
Statistic:  Mean B-A 

Minimal length of each phase: A = 5 , B = 5 
Observed statistic =  20.55556 

Distribution based on a random sample of all 1331 possible combinations.
n   =  500 
M   =  18.61113 
SD  =  1.121436 
Min =  16.12222 
Max =  21.35421 

Probability of observed statistic based on distribution:
p   =  0.044 

Shapiro-Wilk Normality Test: W = 0.985; p = 0.000  (Hypothesis of normality rejected)

Probabilty of observed statistic based on the assumption of normality:
z = 1.7339, p = 0.0415 (single sided)
The plot_rand function call

plot_rand(object, xlab = NA, ylab = “Frequency”, title = “Random distribution”, text_observed = “observed”, color = “lightgrey”, …)

The print_rand() function plots a distribution of the random sample against the observed statistic:

plot_rand(res)