8  Overlapping indices

8.1 Overlap overview

The overlap function call

overlap(data, dvar, pvar, mvar, decreasing = FALSE, phases = c(1, 2))

overlap provides a table with some of the most important overlap indices for each case of an scdf. For calculating overlap indicators it is important to know if a decrease or an increase of values is expected between phases. By default overlap assumes an increase in values. If the argument decreasing = TRUE is set, calculation will be based on the assumption of decreasing values.

overlap(exampleAB)
Overlap Indices

Comparing phase 1 against phase 2 

             Johanna Karolina  Anja
Design           A-B      A-B   A-B
PND              100       87    93
PEM              100      100   100
PET              100       93   100
NAP              100       97    98
NAP rescaled     100       93    96
PAND             100       90    90
Tau_U           0.77     0.78  0.64
Base_Tau        0.63     0.59  0.61
Diff_mean      19.53    21.67 20.47
Diff_trend      1.53     0.54  2.50
SMD             8.11     3.17  6.71
Hedges_g        2.35     2.26  2.87

Overlap measures refer to a comparison of two phases within a single-case data-set. By default, overlap compares the first to the second phase.

8.1.1 Select and recombine phases

The select_phases function call

select_phases(data, A, B)

The select_phases() function is needed if you like to compare specific phases or even like to combine several phases. select_phases() is designed to work within a pipe structure. So the first argument is an scdf and it returns an scdf.

scdf %>% select_phases(A = 1, B = 3) %>% ...

select_phases() has the arguments A and B. Each argument takes a vector with the names or the numbers of the phases to be selected. If you want to compare the first to the third phase you can set select_phases(scdf, 1,3). If the phases of your case are named ‘A’, ‘B’, and ‘C’ you could alternatively set select_phases(scdf, "A","C"). It is also possible to compare a combination of several cases against a combination of other phases. Each of the two list-elements could contain more than one phase which are concatenated with the c command. For example if you have an ABAB-Design and like to compare the two A-phases against the two B-phases select_phases(scdf, c(1,3), c(2,4) ) will do the trick.

(As an alternative approach you can set the phases argument within the overlap() function. This argument takes a list with two elements where the first element defines the phases for the A-phase and the second argument the phases for the B-phase.)

exampleA1B1A2B2 %>%
  select_phases(c("A1","A2"), c("B1","B2")) %>%
  overlap()
Overlap Indices

Comparing phase 1 against phase 2 

             Pawel Moritz Jannis
Design         A-B    A-B    A-B
PND             55     78     71
PEM            100    100    100
PET            100    100    100
NAP             94     97     98
NAP rescaled    89     94     97
PAND            82     85     90
Tau_U         0.45   0.46   0.38
Base_Tau      0.65   0.68   0.68
Diff_mean    12.25  13.58  15.27
Diff_trend   -0.05   0.00  -0.54
SMD           2.68   3.27   3.62
Hedges_g      2.07   2.72   2.98
# Alternatively:
# overlap(exampleA1B1A2B2, phases = list( c("A1","A2"), c("B1","B2")))

8.2 Standardized mean differences

The smd function call

smd(data, dvar, pvar, mvar, decreasing = FALSE, phases = c(1, 2))

Standardized mean differences can be calculated in various ways. They refer to the difference in the means of two phases. The smd function provides an overview of the most common parameters for each single-case:

smd(exampleAB_score)
Standardized mean differences

                            Christiano Lionel Neymar
mA                                2.70   3.10   2.30
mB                               15.35  15.35  15.60
sdA                               1.42   1.59   1.49
sdB                               2.13   1.60   2.19
sd cohen                          1.81   1.60   1.87
sd hedges                         1.93   1.60   1.99
Glass' delta                      8.92   7.68   8.90
Hedges' g                         6.54   7.67   6.68
Hedges' g correction              6.37   7.46   6.50
Hedges' g durlak correction       6.15   7.21   6.28
Cohen's d                         6.98   7.67   7.10

8.3 Percentage non-overlapping data (PND)

The pnd function call

pnd(data, dvar, pvar, decreasing = FALSE, phases = c(“A”, “B”))

The percentage of non-overlapping data (PND) effect size measure was described by Scruggs, Mastropieri, & Casto (1987) . It is the percentage of all data-points of the second phase of a single-case study exceeding the maximum value of the first phase. In case you have a study where you expect a decrease of values in the second phase, PND is calculated as the percentage of data-point of the second phase below the minimum of the first phase.

Illustration of PND. PND is 60% as 9 out of 15 datapoints of phase B are higher than the maximum of phase A

The function pnd provides the PND for each case as well as the mean of all PNDs of that scdf. When you expect decreasing values set decreasing = TRUE. When there are more than two phases or phases are not named A and B, use the phases argument as described at the beginning of this chapter.

pnd(exampleAB)
Percent Non-Overlapping Data

     Case    PND Total Exceeds
  Johanna   100%    15      15
 Karolina 86.67%    15      13
     Anja 93.33%    15      14

Mean  : 93.33 %

8.4 Percentage exceeding the median (PEM)

The pem function call

pem(data, dvar, pvar, decreasing = FALSE, binom.test = TRUE, chi.test = FALSE, FUN = median, phases = c(1, 2), …)

The pem function returns the percentage of phase B data exceeding the phase A median. Additionally, a binomial test against a 50/50 distribution is computed. Different measures of central tendency can be addressed for alternative analyses.

Illustration of PEM. PEM is 75% as 13 out of 15 datapoints of phase B are higher than the median of phase A

pem(exampleAB)
Percent Exceeding the Median

         PEM positives total binom.p
Johanna  100        15    15       0
Karolina 100        15    15       0
Anja     100        15    15       0

Alternative hypothesis: true probability > 50%

8.5 Percentage exceeding the regression trend (PET)

The pet function call

pet(data, dvar, pvar, mvar, ci = 0.95, decreasing = FALSE, phases = c(1, 2))

The pet function provides the percentage of phase B data points exceeding the prediction based on the phase A trend. A binomial test against a 50/50 distribution is computed. Furthermore, the percentage of phase B data points exceeding the upper (or lower) 95 percent confidence interval of the predicted progress is computed.

pet(exampleAB)
Percent Exceeding the Trend

N cases =  3 

             PET binom.p  PET CI
Johanna  100.000       0  86.667
Karolina  93.333       0   0.000
Anja     100.000       0 100.000

Binom.test: alternative hypothesis: true probability > 50%
PET CI: Percent of values greater than upper 95% confidence threshold (greater 1.645*se above predicted value)

Illustration of PET. PET is 66.7% as 10 out of 15 datapoints of phase B are higher than the projected trend-line of phase A

8.6 Percentage of all non-overlapping data (PAND)

The pand function call

pand(data, dvar, pvar, decreasing = FALSE, correction = TRUE, phases = c(1, 2))

The pand function calculates the percentage of all non-overlapping data (Richard I. Parker, Hagan-Burke, & Vannest, 2007), an index to quantify a level increase (or decrease) in performance after the onset of an intervention. The argument correction = TRUE makes pand use a frequency matrix, which is corrected for ties. A tie is counted as the half of a measurement in both phases. Set correction = FALSE to use the uncorrected matrix, which is not recommended.

pand(exampleAB)
Percentage of all non-overlapping data

PAND =  93.3 %
Φ =  0.822  ; Φ² =  0.676 

Number of cases: 3 
Total measurements: 60  (in phase A: 15; in phase B: 45)
n overlapping data per case: 0, 2, 2
Total overlapping data: n = 4 ; percentage = 6.7 

2 x 2 Matrix of proportions
    % expected
    A   B   total
%    A  21.7    3.3 25
real B  3.3 71.7    75
 total  25  75

2 x 2 Matrix of counts
    expected
    A   B   total
     A  13  2   15
real B  2   43  45
 total  15  45


Note. Matrix is corrected for ties

Correlation based analysis:

z = 6.316, p = 0.000, τ = 0.822 

PAND indicates nonoverlap between phase A and B data (like PND), but uses all data and is therefore not based on one single (probably unrepresentative) datapoint. Furthermore, PAND allows the comparison of real and expected associations (Chi-square test) and estimation of the effect size Phi, which equals Pearsons r for dichotomous data. Thus, phi-Square is the amount of explained variance. The original procedure for computing PAND does not account for ambivalent datapoints (ties). The newer NAP overcomes this problem and has better precision-power (Richard I. Parker, Vannest, & Davis, 2011a).

8.7 Nonoverlap of all pairs (NAP)

The nap function call

nap(data, dvar, pvar, decreasing = FALSE, phases = c(1, 2))

The nap function calculates the nonoverlap of all pairs (Richard I. Parker & Vannest, 2009). NAP summarizes the overlap between all pairs of phase A and phase B data points. If an increase of phase B scores is expected, a non-overlapping pair has a higher phase B data point. The NAP equals number of pairs showing no overlap / number of pairs. Because NAP can only take values between 50 and 100 percent, a rescaled and therefore more intuitive NAP (0-100%) is also displayed. NAP is equivalent to the the U-test and Wilcox rank sum test. Thus, a Wilcox test is conducted and reported for each case.

nap(exampleAB)
Nonoverlap of All Pairs

     Case NAP Rescaled Pairs Positives Ties   W       p
  Johanna 100      100    75        75    0 0.0 0.00062
 Karolina  97       93    75        72    1 2.5 0.00129
     Anja  98       96    75        73    1 1.5 0.00095

8.8 Tau-U

The tau_u function call

tau_u(data, dvar, pvar, tau_method = c(“b”, “a”), method = c(“complete”, “parker”), phases = c(1, 2), meta_analyses = TRUE, ci = 0.95, ci_method = c(“z”, “tau”), meta_weight_method = c(“z”, “tau”), continuity_correction = FALSE, meta_method = NULL)

The Tau-U statistic has been proposed by Richard I. Parker, Vannest, Davis, & Sauber (2011b) and is one of the more broadly used approach for reporting effect sizes of single case data. Unfortunately, various and ambiguous implementations of Tau-U exist (Brossart, Laird, & Armstrong, 2018; Pustejovsky, 2016). The tau_u function tries to cover several of these implementation. It takes an scdf and returns Tau-U calculations for each single-case within that file. Additionally, an overall Tau-U value is calculated for all cases based on a meta-analysis.

8.8.1 Variations of Tau-U

Several arguments an be set to define how Tau-U should be calculated. By setting the argument method = "parker", Tau-U is calculated as described in Richard I. Parker et al. (2011b). This procedure could lead to Tau-U values above 1 and below -1 which are difficult to interpret. method = "complete, which is the default, applies a correction that keeps the values within the -1 to 1 range and should be more appropriate. In the original method proposed by Richard I. Parker et al. (2011b) data, calculations are based on Kendall’s Tau A which does not correct for ties. Alternatively, Kendall’s Tau B has a correction for Tau in the presence of ties. The tau_method can be set to decide on the tau method to use "a" for Kendall’s Tau A and "b"` for Kendall’s Tau B.

Here is an example with setting that reconstruct the values from the original example in Richard I. Parker, Vannest, Davis, & Sauber (2011c) :

tau_u(Parker2011, method = "parker", tau_method = "a", continuity_correction = FALSE)
Tau-U
Method: parker 
Applied Kendall's Tau-a
95% CIs for tau are reported.
CI method: 

Case: Case1 
                             Tau CI lower CI upper SD_S    Z    p
A vs. B                     0.80     0.29     0.96 8.16 1.96  .05
A vs. B - Trend A           0.65    -0.02     0.92 9.59 1.36  .18
A vs. B + Trend B           0.77     0.21     0.95 9.59 2.40 <.05
A vs. B + Trend B - Trend A 0.56    -0.17     0.89 9.59 2.09 <.05

A different implementation of the method (provided at http://www.singlecaseresearch.org/calculators/tau-u)) uses Kendall’s Tau B:

tau_u(exampleAB$Johanna, method = "parker", tau_method = "b", continuity_correction = FALSE)
Tau-U
Method: parker 
Applied Kendall's Tau-b
95% CIs for tau are reported.
CI method: 

Case: Johanna 
                             Tau CI lower CI upper  SD_S    Z     p
A vs. B                     1.00      NaN      NaN 22.91 3.27 <.001
A vs. B - Trend A           0.59     0.20     0.82 23.26 3.22 <.001
A vs. B + Trend B           0.79     0.53     0.91 30.53 4.75 <.001
A vs. B + Trend B - Trend A 0.77     0.49     0.90 30.81 4.71 <.001

Another online calculator created by Rumen Manolov is available at https://manolov.shinyapps.io/Overlap/. It uses an R code developed by Kevin Tarlow to calculate Tau-U. This setting will replicate the results of this approach:

tau_u(exampleAB$Johanna, method = "complete", tau_method = "a", continuity_correction = FALSE)
Tau-U
Method: complete 
Applied Kendall's Tau-a
95% CIs for tau are reported.
CI method: 

Case: Johanna 
                             Tau CI lower CI upper  SD_S    Z     p
A vs. B                     1.00      NaN      NaN 22.91 3.27 <.001
A vs. B - Trend A           0.88     0.72     0.95 30.82 2.43  <.05
A vs. B + Trend B           0.81     0.56     0.92 30.82 4.70 <.001
A vs. B + Trend B - Trend A 0.76     0.48     0.90 30.82 4.70 <.001

The standard return of the tau_u function does not display all calculations. If you like to have more details, apply the print function with the additional argument complete = TRUE.

tau_u(exampleAB$Johanna) %>% print(complete = TRUE)
Tau-U
Method: complete 
Applied Kendall's Tau-b
95% CIs for tau are reported.
CI method: 

Case: Johanna 
                            pairs pos neg ties   S      D  Tau CI lower
A vs. B                        75  75   0    0  75  75.00 1.00      NaN
Trend A                        10   5   5    0   0  10.00 0.00    -0.88
Trend B                       105  87  17    1  70 104.50 0.67     0.24
A vs. B - Trend A              85  80   5    0  75 126.75 0.59     0.20
A vs. B + Trend B             180 162  17    1 145 184.45 0.79     0.53
A vs. B + Trend B - Trend A   190 167  22    1 145 189.50 0.77     0.49
                            CI upper  SD_S  VAR_S SE_Tau    Z     p  n
A vs. B                          NaN 22.91 525.00   0.31 3.27 <.001 20
Trend A                         0.88  4.08  16.67    NaN 0.00  1.00  5
Trend B                         0.88 20.21 408.33   0.19 3.46 <.001 15
A vs. B - Trend A               0.82 23.26 541.22   0.18 3.22 <.001 20
A vs. B + Trend B               0.91 30.53 932.39   0.17 4.75 <.001 20
A vs. B + Trend B - Trend A     0.90 30.81 949.00   0.16 4.71 <.001 20

8.8.2 Meta analyses

Note

The procedure for calculating the meta-analyses has changed with scan version 0.55.7. Please make sure you are using the latest scan version.

If you pass multiple cases to the tau-u function, it will calculate a Tau-U table for each case and an overall calculation via a meta-analysis.

Calculating a Tau-U meta analysis

The calculation of the Tau-U-meta-analyses involves the following steps:

  1. The tau values are Fisher-Z transformed to \(Tau_z\).

  2. The standard error for each transformed value is calculated as either:

    \(se_z = {1 \over \sqrt{n-3}}\) (Hotelling, 1953)

    or
    \(se_z = \sqrt{0.437 \over n-4}\) (Fieller, Hartley, & Pearson, 1957)

  3. The average \(tau_z\) is the mean of \(tau_z\) weighted by \(1 \over se_z^2\)

  4. The standard error of the average \(tau_z\) is \(se_{M_{tau_z}} = \sqrt{\frac{1}{\sum{weights}}}\) (Cooper, Hedges, & Valentine, 2009)

  5. The p value is calculated with a Z-test (form \(Z = \frac{M_{tau_z}}{se_{M_{tau_z}}}\) )

  6. The overall tau value is derived from an inverse-Fisher-Z-transformation.

8.8.3 Confidence intervals

Note

The default method for calculating the confidence interval has changed with scan version 0.55.7. Confidence intervals could have been outside the [-1, 1] in earlier versions. Set ci_method = "s" for a replication of results from scan version 0.55.6 or earlier.

By default, 95% percent intervals are calculated for each tau value. You can specify a different interval with the ci argument (ci = 0.90 will calculate a 90% interval). There are three alternative approaches to calculating the confidence intervals. When ci_method = "z" is set (the default), a general formula for calculating the standard-error of Fisher-Z values is used (Hotelling, 1953). If ci_method = "tau", a specific formula for Fisher-Z transformed tau values is applied (Fieller et al., 1957). Both approaches give similar results. A third approach is derived from the standard deviation of the S statistic1. Set ci_method = "s" for this method. The S method could lead to implausible values blow -1 or above 1. I recommend to use the common “z” method or the accurate “tau” method.

tau_u(exampleAB, ci = 0.90, ci_method = "tau")
Tau-U
Method: complete 
Applied Kendall's Tau-b
90% CIs for tau are reported.
CI method: 

Tau-U meta analyses:
Weight method: z
90% CIs are reported.

                       Model Tau_U   se CI lower CI upper   z       p
                     A vs. B  1.00 0.14     1.00     1.00 Inf 0.0e+00
           A vs. B - Trend A  0.59 0.14     0.42     0.72 4.8 1.3e-06
           A vs. B + Trend B  0.75 0.14     0.63     0.83 6.9 4.4e-12
 A vs. B + Trend B - Trend A  0.74 0.14     0.61     0.82 6.7 1.8e-11

Case: Johanna 
                             Tau CI lower CI upper  SD_S    Z     p
A vs. B                     1.00      NaN      NaN 22.91 3.27 <.001
A vs. B - Trend A           0.59     0.39     0.74 23.26 3.22 <.001
A vs. B + Trend B           0.79     0.66     0.87 30.53 4.75 <.001
A vs. B + Trend B - Trend A 0.77     0.63     0.86 30.81 4.71 <.001

Case: Karolina 
                             Tau CI lower CI upper  SD_S    Z     p
A vs. B                     0.94     0.90     0.96 22.91 3.06 <.001
A vs. B - Trend A           0.55     0.34     0.71 23.25 3.01 <.001
A vs. B + Trend B           0.80     0.69     0.88 30.52 4.85 <.001
A vs. B + Trend B - Trend A 0.78     0.65     0.87 30.79 4.81 <.001

Case: Anja 
                             Tau CI lower CI upper  SD_S    Z     p
A vs. B                     0.97     0.94     0.98 22.91 3.14 <.001
A vs. B - Trend A           0.62     0.43     0.76 23.21 3.36 <.001
A vs. B + Trend B           0.63     0.43     0.76 30.45 3.74 <.001
A vs. B + Trend B - Trend A 0.64     0.45     0.78 30.71 3.91 <.001

8.9 Baseline corrected tau

The corrected_tau function call

corrected_tau(data, dvar, pvar, mvar, phases = c(1, 2), alpha = 0.05, continuity = FALSE, repeated = FALSE)

This method has been proposed by Tarlow (2016). The baseline data are checked for a significant autocorrelation (based on Kendalls Tau). If so, a non-parameteric Theil-Sen regression is applied for the baseline data where the dependent values are regressed on the measurement time. The resulting slope information is then used to predict data of the B-phase. The dependent variable is now corrected for this baseline trend and the residuals of the Theil-Sen regression are taken for further calculations. Finally, Kendalls tau is calculated for the dependent variable and the dichotomous phase variable. The function here provides two extensions to this procedure: The alternative Siegel repeated median regression is applied when repeated = TRUE (Siegel, 1982) and a continuity correction is applied when continuity = TRUE (both not the defaults).

Here is a replication of an example provided by Tarlow (2016) :

Warning: Removed 11 rows containing missing values (`geom_line()`).

Baseline corrected tau

Method: Theil-Sen regression
Continuity correction not applied.

Case1 :
                           tau     z     p
Baseline autocorrelation -0.75 -2.31  <.05
Uncorrected tau          -0.58 -2.98  <.01
Baseline corrected tau    0.69  3.57 <.001

Baseline correction should be applied.

8.10 Reliable change index

The rci function call

rci(data, dvar, pvar, rel, ci = 0.95, graph = FALSE, phases = c(1, 2))

Basically, the reliable change index (rci) depicts if a post-test is above a pre-test value. Based on the reliability of the measurements and the standard-deviation the standard error is calculated. The mean difference between phase-A and phase-B is divided by the standard-error. Several authors proposed refined methods for calculating the rci.

The rci function computes three indices of reliable change (Wise, 2004) and corresponding descriptive statistics.

rci(exampleAB$Johanna, rel = 0.8, graph = TRUE)

Reliable Change Index

Mean Difference =  19.53333 
Standardized Difference =  1.678301 

Descriptives:
         n     mean       SD       SE
A-Phase  5 54.60000 2.408319 1.077033
B-Phase 15 74.13333 8.943207 3.999524

Reliability =  0.8 

95 % Confidence Intervals:
           Lower    Upper
A-Phase 52.48905 56.71095
B-Phase 66.29441 81.97226

Reliable Change Indices:
                             RCI
Jacobson et al.         18.13624
Christensen and Mendoza 12.82426
Hageman and Arrindell   18.49426

  1. S is the difference between concordant and discordant comparisons in a Kendall’s tau calculation. This is the same statistic used to calculate the p-value.↩︎