Argument | What it does ... |
---|---|

n | Number of cases to be created (Default is n = 1). |

phase_design | A list defining the length and label of each phase. E.g., phase.length = list(A1 = 10, B1 = 10, A2 = 10, B2 = 10). Use vectors if you want to define different values for each case phase.length = list(A = c(10, 15), B = c(10, 15). |

trend | Defines the effect size of a trend added incrementally to each measurement across the whole data-set. To assign different trends to several single-cases, use a vector of values (e.g. trend = c(.1, .3, .5)). If the number of cases exceeds the length of the vector, values are recycled. When using a 'gaussian' distribution, the trend parameters indicate effect size d changes. When using a binomial or poisson distribution, trend indicates an increase in points / counts per measurement. |

level | A list that defines the level increase (effect size d) at the beginning of each phase relative to the previous phase (e.g. list(A = 0, B = 1)). The first element must be zero as the first phase of a single-case has no level effect (if you have one less list element than the number of phases, scan will add a leading element with 0 values). Use vectors to define variable level effects for each case (e.g. list(A = c(0, 0), B = c(1, 2))). When using a 'gaussian' distribution, the level parameters indicate effect size d changes. When using a binomial or poisson distribution, level indicates an increase in points / counts with the onset of each phase. |

slope | A list that defines the increase per measurement for each phase compared to the previous phase. slope = list(A = 0, B = .1 generates an incremental increase of 0.1 per measurement starting at the B phase. The first list element must be zero as the first phase of a single-case has no slope effect (if you have one less list element than the number of phases, scan will add a leading element with 0 values). Use vectors to define variable slope effects for each case (e.g. list(A = c(0, 0), B = c(0.1, 0.2))). If the number of cases exceeds the length of the vector, values are recycled. When using a 'gaussian' distribution, the slope parameters indicate effect size d changes per measurement. When using a binomial or poisson distribution, slope indicates an increase in points / counts per measurement. |

rtt | Reliability of the underlying simulated measurements. Set rtt = .8 by default. To assign different reliabilities to several single-cases, use a vector of values (e.g. rtt = c(.6, .7, .8)). If the number of cases exceeds the length of the vector, values are repeated. rtt has no effect when you're using binomial or poisson distributed scores. |

start_value | Starting value at the first measurement. Default is 50. To assign different start values to several single-cases, use a vector of values (e.g. c(50, 42, 56)). If the number of cases exceeds the length of the vector, values are recycled. |

s | Standard deviation used to calculate absolute values from level, slope, trend effects and to calculate and error distribution from the rtt values. Set to 10 by default. To assign different variances to several single-cases, use a vector of values (e.g. s = c(5, 10, 15)). If the number of cases exceeds the length of the vector, values are recycled. if the distribution is 'poisson' or 'binomial' s is not applied. |

extreme_prop | Probability of extreme values. extreme.p = .05 gives a five percent probability of an extreme value. A vector of values assigns different probabilities to multiple cases. If the number of cases exceeds the length of the vector, values are repeated. |

extreme_range | Range for extreme values, expressed as effect size d. extreme.d = c(-7,-6) uses extreme values within a range of -7 and -6 standard deviations. In case of a binomial or poisson distribution, extreme.d indicates points / counts. Caution: the first value must be smaller than the second, otherwise the procedure will fail. |

missing_prop | Portion of missing values. missing.p = 0.1 creates 10% of all values as missing). A vector of values assigns different probabilities to multiple cases. If the number of cases exceeds the length of the vector, values are repeated. |

distribution | Distribution of the scores. Default is distribution = 'normal'. Possible values are 'normal' (or 'gaussian'), 'binomial', and 'poisson'. |

prob | If distribution is set 'binomial', prob passes the probability of occurrence. |

# 13 Power analyses (scan version 0.54 or later)

## 13.1 The idea of a power-test

The `powert_test()`

function provides the alpha error probability and power when analyzing a specific effect of a single-case design with a given statistical method.

For example, you have a one case design with phase length A = 10 and B = 20. You assume a strong level effect of d = 1 and you expect a slight trend effect of d = 0.02 (per measurement). You might be interested to answer two questions:

- How suitable is a plm model for detecting the level-effect? (also: what is the power to detect the level effect?).
- What if I had the same design but without a level-effect. How often would the plm falsely find a significant level-effect? (also: how large is the alpha-error probability for the level-effect?).

In principle, `power_test()`

takes a single case design and repeatedly generates random cases based on that design. Each case is now analyzed with a given statistical method. The proportion of significant effects in these analyses is an estimator of the test-power. In a second step the design is stripped of the target effect and again multiple cases are generated on this changed design and analyzed with the same method. Now, the proportion of significant effects is the estimator for the alpha-error probability.

## 13.2 Set up a single-case design

The `design`

function sets up a single-case design. You can define various parameters of that design:

## 13.3 Conducting a power-test

When conduction a power test, you firstly need to define a design which you like to be tested.

```
<- design(
design n = 1,
phase_design = list(A = 10, B = 20),
level = list(A = 0, B = 1),
trend = 0.02,
distribution = "normal"
)
```

Then you have to choose the statistical method. The `power_test`

function applies three methods by default: *plm*, *randomization test*, and *Tau U*. These default values are only suitable when your design is a one case single-case study.

Let us start with the defaults and conduct a power analysis for our previously set design: *(This might take some time. Even in the default setting with 100 simulations you might wait a few seconds. For more precise estimations I recommend 1000 simulations - or even higher.)*

```
<- power_test(design)
res res
```

```
Test-Power in percent:
Method Power Alpha Error Alpha:Beta Correct
plm_level 87 4 1:3.2 92
rand 78 4 1:5.5 87
tauU 100 26 1:0.0 87
```

The results show that the plm test and the randomization test have similar power and alpha-error probabilities (the differences here may be due to outliers of the random samples. A more intensive computation with 1000 simulations shows slightly better values for the plm). The tau U test has an unacceptably high alpha-error which is due to the trend we put into the design. *Alpha:Beta* depicts the relation of the Alpha and Beta error (power = 1 - Beta). *Correct* is the overall proportion of correct categorizations and *p* is the results of a binomial-test of *Correct* against 50%.

## 13.4 Statistical methods

The `method`

argument takes a list where each element depicts a statistical method. Currently, the following character strings are predefined:

Name | Single/ multiple cases | What it means ... |
---|---|---|

plm_level | single | A complete plm model for normal distributed dependent variables. It checks for the level effect. |

plm_slope | single | A complete plm model for normal distributed dependent variables. It checks for the slope effect. |

plm_poisson_level | single | Like plm_level but for poisson distributed dependent variables. |

plm_poisson_slope | single | Like plm_slope but for poisson distributed dependent variables. |

hplm_level | multiple | A complete hplm model for normal distributed dependent variables. It checks for the level effect. |

hplm_slope | multiple | A complete hplm model for normal distributed dependent variables. It checks for the slope effect. |

tauU | sinlge | A tauU test with method complete and taub estimations. It checks the 'A vs. B - Trend A' variation. |

tauU_slope | sinlge | A tauU test with method complete and taub estimations. It checks the 'A vs. B - Trend A + Trend B' variation. |

tauU_meta | multiple | Like 'TauU' but with the results from a meta analyses (fixed effects). Very slow. |

tauU_slope_meta | multiple | Like 'TauU_slope' but with the results from a meta analyses (fixed effects). Very slow. |

base_tau | single | A baseline corrected tau test. |

rand | single and multiple | A randomization test for 'Mean B-A' with 100 permutations. |

## 13.5 Confidence intervals and binomial tests

With only 100 simulations you will have quite large confidence intervals for the power, alpha error probability, and correct estimations. You can calculate these intervals by setting the `ci`

argument. For 95% CI’s set `ci = 0.95`

for 99% `ci = 0.99`

.

`power_test(design, ci = 0.95)`

```
Test-Power in percent:
Method Power 2.5% 97.5% Alpha Error 2.5% 97.5% Alpha:Beta Correct 2.5%
plm_level 73 63 81 4 1 10 1:6.8 84 79
rand 73 63 81 8 4 15 1:3.4 82 77
tauU 100 96 100 22 14 31 1:0.0 89 84
97.5%
89
87
93
```

You can also test the power, alpha error, and correct estimates against predefined values. In order to do that, set `binom_test = TRUE`

. The power will be tested against being greater or equal to 80%, the alpha error against being less or equal 5%, and the correct proportion against being greater equal 87.5%.

`power_test(design, binom_test = TRUE)`

```
Test-Power in percent:
Method Power Alpha Error Alpha:Beta Correct p Power>=80 p Alpha Error<=5
plm_level 70 3 1:10.0 84 1 0
rand 74 3 1:8.7 86 1 0
tauU 100 22 1:0.0 89 0 1
p Correct>=87.5
1.0
0.8
0.3
```

If you want to define individual values for the three tests, set the `binom_test_power`

. `binom_test_alpha`

, and/or, the `binom_test_correct`

arguments.

## 13.6 Advanced methods

*Note: You need specific knowledge on how to create functions in R and on data structures to follow all aspects of this section.*

Instead of one of the predefined character strings you can also create you own functions and implement these. You function must take an scdf as the first argument and return a single numeric p-value.

Here is an example that implements a method for the significance of a NAP (nonoverlap of all pairs) test. This is statistically identical to a U-Test comparing phase A and B.

```
set.seed(1) # only needed to make this example replicable
<- function(scdf) {
mcmethod_nap nap(scdf)$nap[1, "p"]
}
power_test(design, method = list(nap = mcmethod_nap, "rand", "plm_level"))
```

```
Test-Power in percent:
Method Power Alpha Error Alpha:Beta Correct
nap 100 47 1:0.0 76
rand 73 5 1:5.4 84
plm_level 73 3 1:9.0 85
```

Here is another example for a fast plm function for poisson distributed data based on the `fastglm`

package:

```
<- function(data) {
plm_fast <- unlist(data, recursive = FALSE)
data <- data$values
y <- sum(data$phase == "A")
n1 <- sum(data$phase == "B")
n2 <- c(rep(0, n1), rep(1, n2))
D <- data$mt
mt <- (mt - mt[n1]) * D
inter <- matrix(
x c(rep(1, n1 + n2), mt, D, inter),
nrow = n1 + n2,
ncol = 4
)<- fastglm::fastglm(x = x, y = y, family = "poisson", method = 2)
full summary(full)$coef[3, 4]
}
```

`power_test(design, method = list("fast plm" = plm_fast))`

## 13.7 Computation duration

You can print the returning object of the `power_test`

function with added computation duration time by setting `duration = TRUE`

`print(res, duration = TRUE)`

```
Test-Power in percent:
Method Power Alpha Error Alpha:Beta Correct
plm_level 87 4 1:3.2 92
rand 78 4 1:5.5 87
tauU 100 26 1:0.0 87
Computation duration is 0.9 seconds.
```

The duration depends heavily on the applied test methods. Regressions are faster than randomization tests and tau U tests are quiet slow:

```
<- power_test(design, method = "plm_level")
res1 <- power_test(design, method = "rand")
res2 <- power_test(design, method = "tauU")
res3
# Elapsed time in seconds for each procedure
attr(res1, "computation_duration")[3]
## elapsed
## 0.074
attr(res2, "computation_duration")[3]
## elapsed
## 0.24
attr(res3, "computation_duration")[3]
## elapsed
## 0.53
```

… and what about our new fast-glm function?

```
set.seed(1)
<- design(
design n = 1,
phase_design = list(A = 10, B = 20),
level = list(A = 0, B = 1),
trend = 0.02,
distribution = "poisson"
)
<- power_test(design, method = list("fast plm" = plm_fast))
res1 <- power_test(design, method = "plm_poisson_level")
res2
attr(res1, "computation_duration")[3]
```

```
elapsed
0.238
```

`attr(res2, "computation_duration")[3]`

```
elapsed
0.138
```

… it is much faster!