
What is a dictionary file?

When you conduct research based on questionnaires or psychometric tests (and you are working in R), you typically create a data.frame with one column (variable) for each item on that questionnaire and one row for each person who participated. You can only store a limited amount of additional information about each item within a data.frame (or tibble). You can give a variable a name and define a variable as a factor with appropriate levels. But basically, that is it. You cannot, at least not conveniently, include a longer label for each item, the name of the scale to which that item belongs, information about reverse coding, etc.

I call the collection of this additional information about items an item dictionary. A dictionary contains a short label, a longer description, scale affiliation, and more for each item.

A dictionary file

A dictionary file is a table with one row for each variable and one column for each attribute of those variables. The most convenient way to create a dictionary file is in a spreadsheet program for later use with data sets.

Here is an extract from an example dic-file:

item_name | item_label | scale | scale_label | subscale | subscale_label | values | value_labels | missing | type
itrf_I_1 | Verbringt zu viel Zeit alleine | ITRF | Integrated teacher report form | Int | Internalizing | 0:3 | 0 = not problematic; 1 = slightly problematic; 2 = problematic; 3 = strongly problematic | -99 | integer
itrf_I_2 | Beschwert sich über Krankheit oder Schmerzen | ITRF | Integrated teacher report form | Int | Internalizing | 0:3 | 0 = not problematic; 1 = slightly problematic; 2 = problematic; 3 = strongly problematic | -99 | integer
itrf_I_4 | Vermeidet soziale Interaktionen | ITRF | Integrated teacher report form | Int | Internalizing | 0:3 | 0 = not problematic; 1 = slightly problematic; 2 = problematic; 3 = strongly problematic | -99 | integer

A dictionary file can contain any additional attributes. This means that you can add a column with any name to store relevant information (e.g. the scale and scale label to which an item belongs, a translation of the item name). However, there are some predefined attributes with a specific meaning. The table below shows these attributes:

Parameter | Meaning | Example
item_name | A short item name | itrf_1
item_label | Full text of the item | Vermeidet die Teilnahme an Diskussionen im Unterricht
values | Valid response values in R syntax | 1:5 (for integers 1 to 5); 1,2,3 (for integers 1, 2, 3)
value_labels | Labels for each response value | 0 = nicht; 1 = leicht; 2 = mäßig; 3 = stark
missing | Missing values | -888, -999
type | Data type (factor, integer, float, real) | integer
weight | Reverse coding of the item and its weight | 1 (positive), -1 (reverse), 1.5 (positive, weighted 1.5 times)
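
As a minimal sketch, a dictionary with some of the predefined columns can also be built directly in R and saved as a CSV file, which can then be edited in a spreadsheet program. The object names (dic_sketch, dic_path, dic_reloaded) and the temporary file are illustrative assumptions, not part of scaledic; the two rows repeat items from the extract above.

```r
# Illustrative dictionary: one row per item, one column per attribute
dic_sketch <- data.frame(
  item_name    = c("itrf_I_1", "itrf_I_4"),
  item_label   = c("Verbringt zu viel Zeit alleine",
                   "Vermeidet soziale Interaktionen"),
  scale        = "ITRF",
  subscale     = "Int",
  values       = "0:3",
  missing      = "-99",
  type         = "integer",
  stringsAsFactors = FALSE
)

# Save to a temporary CSV file and read it back
dic_path <- tempfile(fileext = ".csv")
write.csv(dic_sketch, dic_path, row.names = FALSE)
dic_reloaded <- read.csv(dic_path, stringsAsFactors = FALSE)

# The "values" entry is written in R syntax, so it can be expanded
# into the vector of valid response values
valid_values <- eval(parse(text = dic_reloaded$values[1]))
print(valid_values)
```

Storing values as R syntax keeps the spreadsheet compact while still allowing the full set of valid responses to be recovered.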

Apply a dictionary file

When you combine a dataset with a dictionary file, each variable in the dataset that corresponds to a variable described in the dictionary is completed with the given dictionary information.
The resulting dataset is now ready for use with all other scaledic functions.

The apply_dic function takes a dataset and a dictionary and combines them. Values declared as missing in the dictionary are replaced by NA:

# Here we use the example dataset "dat_itrf" and the example dic file "dic_itrf"
dat <- apply_dic(dat_itrf, dic_itrf)
#> Found the following invalid values:
#> 
#> 'itrf_I_1' is 7 at row 3192
#> 'itrf_I_2' is 9 at row 4651
#> 'itrf_I_13' is 4 at row 3699
#> 'itrf_I_20' is 4 at row 2799
#> 'itrf_E_4' is 6 at row 2621
#> 'itrf_E_6' is 9 at row 2599
#> 'itrf_E_7' is 9 at row 2599
#> 'itrf_E_8' is 9 at row 2599
#> 'itrf_E_9' is 9 at row 2599
#> 'itrf_E_10' is 9 at row 2599
#> 'itrf_E_11' is 9 at row 2599
#> 'itrf_E_12' is 9 at row 2599
#> 'itrf_E_13' is 9, 11 at rows 2599, 4146
#> 'itrf_E_14' is 9 at row 2599
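
The replacement of missing codes can be pictured with a small base R sketch. This is not the code apply_dic runs internally; the vectors raw and missing_codes are hypothetical, and -99 is the missing code from the dictionary extract.

```r
# Hypothetical raw responses; -99 marks a missing answer
raw <- c(0, 3, -99, 1, 2, -99)
missing_codes <- c(-99)

# Replace every declared missing code with NA
recoded <- replace(raw, raw %in% missing_codes, NA)
print(recoded)
```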

Let us take a look at all the scales in the dataset:

list_scales(dat, paste0(c("scale", "subscale", "subscale_2"), "_label"))
scale_label | subscale_label | subscale_2_label
Integrated teacher report form | Internalizing | Socially Withdrawn
Integrated teacher report form | Internalizing | Anxious/Depressed
Integrated teacher report form | Externalizing | Oppositional/Disruptive
Integrated teacher report form | Externalizing | Academic Productivity/Disorganization

Clean raw data

Firstly, we check for invalid values in the dataset (e.g., typos) and replace them with NA:

dat <- check_values(dat, replace = NA)
#> No errors found.
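
Conceptually, such a check compares each response with the valid values listed in the dictionary. The following base R sketch illustrates the idea under that assumption; it is not the implementation of check_values, and the vectors responses and valid are made up for the example.

```r
# Hypothetical responses to an item whose valid values are 0:3
responses <- c(0, 2, 7, NA, 3, 9)
valid <- 0:3

# Positions holding a non-missing value outside the valid set
invalid_rows <- which(!is.na(responses) & !(responses %in% valid))

# Replace the invalid entries with NA
cleaned <- replace(responses, invalid_rows, NA)
print(invalid_rows)
print(cleaned)
```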

Now we impute missing values:

# Imputation for items of the subscale Ext
dat <- impute_missing(dat, subscale == "Ext")

# Imputation for items of the subscale Int
dat <- impute_missing(dat, subscale == "Int")

Select scales for analysis

Let us look at the descriptive statistics for the internalizing subscale:

dat |> 
  select_items(subscale == "Int") |> 
  wmisc::nice_descriptives(round = 1)
Table
Descriptive statistics
Variable Valid Missing Mean SD Min Max Range Median MAD
itrf_I_1 4772 4 0.4 0.7 0 3 3 0 0
itrf_I_2 4772 4 0.3 0.7 0 3 3 0 0
itrf_I_4 4772 4 0.3 0.6 0 3 3 0 0
itrf_I_5 4772 4 0.2 0.6 0 3 3 0 0
itrf_I_6 4772 4 0.2 0.5 0 3 3 0 0
itrf_I_7 4772 4 0.4 0.7 0 3 3 0 0
itrf_I_8 4772 4 0.3 0.7 0 3 3 0 0
itrf_I_9 4772 4 0.5 0.8 0 3 3 0 0
itrf_I_10 4772 4 0.3 0.7 0 3 3 0 0
itrf_I_11 4772 4 0.3 0.7 0 3 3 0 0
itrf_I_12 4772 4 0.4 0.7 0 3 3 0 0
itrf_I_13 4772 4 0.4 0.7 0 3 3 0 0
itrf_I_14 4772 4 0.3 0.7 0 3 3 0 0
itrf_I_15 4772 4 0.4 0.7 0 3 3 0 0
itrf_I_16 4772 4 0.4 0.8 0 3 3 0 0
itrf_I_17 4772 4 0.4 0.7 0 3 3 0 0
itrf_I_19 4772 4 0.2 0.6 0 3 3 0 0
itrf_I_23 4772 4 0.4 0.7 0 3 3 0 0
itrf_I_24 4772 4 0.4 0.7 0 3 3 0 0
Note. MAD is the median average deviation with a consistency adjustment.

See items instead of labels

It is more convenient to see the original items rather than the short labels:

dat |> 
  select_items(subscale == "Int") |> 
  rename_items() |> 
  wmisc::nice_descriptives(round = 1)
Table
Descriptive statistics
Variable Valid Missing Mean SD Min Max Range Median MAD
Verbringt zu viel Zeit alleine 4772 4 0.4 0.7 0 3 3 0 0
Beschwert sich über Krankheit oder Schmerzen 4772 4 0.3 0.7 0 3 3 0 0
Vermeidet soziale Interaktionen 4772 4 0.3 0.6 0 3 3 0 0
Spielt bevorzugt alleine 4772 4 0.2 0.6 0 3 3 0 0
Geht nicht auf Kontaktversuche der Mitschülerinnen und Mitschüler ein 4772 4 0.2 0.5 0 3 3 0 0
Macht sich Sorgen über unwichtige Details 4772 4 0.4 0.7 0 3 3 0 0
Beschwert sich über Kopfschmerzen oder Bauchschmerzen 4772 4 0.3 0.7 0 3 3 0 0
Wirkt unglücklich oder traurig 4772 4 0.5 0.8 0 3 3 0 0
Klammert sich an Erwachsene 4772 4 0.3 0.7 0 3 3 0 0
Verhält sich nervös 4772 4 0.3 0.7 0 3 3 0 0
Verhält sich ängstlich 4772 4 0.4 0.7 0 3 3 0 0
Behauptet sich nicht gegenüber anderen 4772 4 0.4 0.7 0 3 3 0 0
Verhält sich übermäßig schüchtern 4772 4 0.3 0.7 0 3 3 0 0
Beklagt sich oder jammert 4772 4 0.4 0.7 0 3 3 0 0
Beteiligt sich nicht an Gruppenaktionen 4772 4 0.4 0.8 0 3 3 0 0
Macht sich selbst schlecht 4772 4 0.4 0.7 0 3 3 0 0
Weint oder ist weinerlich 4772 4 0.2 0.6 0 3 3 0 0
Macht sich ständig Sorgen 4772 4 0.4 0.7 0 3 3 0 0
Lässt sich langsam auf neue Personen ein 4772 4 0.4 0.7 0 3 3 0 0
Note. MAD is the median average deviation with a consistency adjustment.

And then we analyse the factor structure. Here we use the rename_items() function to get a more convenient description.

dat |> 
  select_items(scale == "ITRF") |>
  rename_items(pattern = "({reverse}){subscale}_{subscale_2}: {label}", max_chars = 70) |> 
  psych::fa(nfactors = 4) |> 
  wmisc::nice_efa(cut = 0.4)
Table
Loading matrix
Variables Loadings (factors MR1, MR3, MR2, MR4) Communalities Complexity
(+)Ext_OPP: Verliert die Beherrschung 0.85 0.72 1.01
(+)Ext_OPP: Macht unangebrachte Bemerkungen 0.83 0.7 1.01
(+)Ext_OPP: Streitet und zankt mit Lehrkräften 0.8 0.66 1
(+)Ext_OPP: Hat Konflikte mit Mitschülerinnen und Mitschülern 0.8 0.7 1.02
(+)Ext_OPP: Kommandiert rum 0.78 0.55 1.05
(+)Ext_OPP: Verwendet unangemessene Sprache 0.78 0.65 1.02
(+)Ext_OPP: Ist schnell verärgert 0.76 0.67 1.09
(+)Ext_OPP: Stört andere 0.65 0.63 1.4
(+)Ext_OPP: Respektiert nicht die Privatsphäre anderer 0.64 0.49 1.05
(+)Ext_APD: Erledigt Hausaufgaben unvollständig 0.82 0.65 1.01
(+)Ext_APD: Zeigt Unterrichtsaufgaben nicht selbstständig vor 0.8 0.67 1.03
(+)Ext_APD: Stellt Unterrichtsaufgaben nicht rechtzeitig fertig 0.76 0.59 1.04
(+)Ext_APD: Kommt unvorbereitet zum Unterricht 0.73 0.59 1.07
(+)Ext_APD: Nimmt Materialien, die zu Hause benötigt werden, nicht mit 0.73 0.56 1.03
(+)Ext_APD: Kontrolliert seine eigene Arbeit nicht 0.73 0.61 1.03
(+)Ext_APD: Beginnt mit der Aufgabenbearbeitung nicht selbstständig 0.72 0.61 1.03
(+)Ext_APD: Beteiligt sich nicht am Unterricht 0.53 0.42 1.81
(+)Int_SW: Vermeidet soziale Interaktionen 0.86 0.72 1.01
(+)Int_SW: Geht nicht auf Kontaktversuche der Mitschülerinnen und Mits 0.79 0.6 1.04
(+)Int_SW: Spielt bevorzugt alleine 0.78 0.6 1.01
(+)Int_SW: Verbringt zu viel Zeit alleine 0.66 0.5 1.03
(+)Int_SW: Beteiligt sich nicht an Gruppenaktionen 0.66 0.55 1.18
(+)Int_SW: Verhält sich übermäßig schüchtern 0.55 0.45 1.84
(+)Int_SW: Behauptet sich nicht gegenüber anderen 0.5 0.43 1.99
(+)Int_SW: Lässt sich langsam auf neue Personen ein 0.47 1.48
(+)Int_AD: Beschwert sich über Kopfschmerzen oder Bauchschmerzen 0.73 0.5 1.15
(+)Int_AD: Macht sich ständig Sorgen 0.71 0.54 1.01
(+)Int_AD: Beschwert sich über Krankheit oder Schmerzen 0.7 0.47 1.1
(+)Int_AD: Beklagt sich oder jammert 0.65 0.53 1.22
(+)Int_AD: Macht sich Sorgen über unwichtige Details 0.65 0.47 1.09
(+)Int_AD: Weint oder ist weinerlich 0.59 1.06
(+)Int_AD: Verhält sich ängstlich 0.47 0.49 2.07
(+)Int_AD: Wirkt unglücklich oder traurig 0.47 0.5 1.82
(+)Int_AD: Macht sich selbst schlecht 0.44 1.63
(+)Int_AD: Klammert sich an Erwachsene 0.4 1.96
(+)Int_AD: Verhält sich nervös 0.4 1.71
Variances
Eigenvalues 6 4.8 4.53 4.35
Explained variance 0.17 0.13 0.13 0.12
Cumulative explained variance 0.17 0.3 0.43 0.55
Proportion explained variance 0.3 0.24 0.23 0.22
Cumulative proportion explained variance 0.3 0.55 0.78 1
Note. Extraction method is minres. Rotation method is oblimin. RMSEA is 0.078 CI90% [0.077, 0.079]. Loadings below |0.4| are not displayed.

We can also run item analyses:

scales <- ex_itrf |> get_scales(
  'Anxious_Depressed' = subscale_2 == "APD",
  'Oppositional_Disruptive' = subscale_2 == "OPP",
  "Socially_Withdrawn" = subscale_2 == "SW",
  "Academic_Productivity_Disorganization" = subscale_2 == "AD"
)
wmisc::nice_alpha_table(dat, scales = scales)
Table
Item analysis
Scale | N items | N cases | Alpha raw [95% CI] | Alpha standardized [95% CI] | Homogeneity | Discriminations | Means | SDs | Absolute loadings
Anxious_Depressed | 8 | [4776, 4776] | .91 [.90, .91] | .91 [.91, .91] | .56 | [.53, .78] | [0.38, 0.95] | [0.75, 1.04] | [.55, .82]
Oppositional_Disruptive | 9 | [4776, 4776] | .94 [.93, .94] | .94 [.94, .94] | .63 | [.68, .81] | [0.35, 0.83] | [0.72, 0.96] | [.69, .85]
Socially_Withdrawn | 8 | [4772, 4772] | .88 [.87, .88] | .88 [.88, .89] | .48 | [.53, .78] | [0.21, 0.43] | [0.51, 0.76] | [.56, .86]
Academic_Productivity_Disorganization | 11 | [4772, 4772] | .88 [.88, .89] | .88 [.88, .89] | .41 | [.52, .69] | [0.23, 0.48] | [0.59, 0.77] | [.55, .74]
Note. Values in brackets depict upper and lower bound of confidence intervals or [min,max] intervals. N cases is the min and max number of non-missing cases for the scale items.
wmisc::nice_item_analysis(dat, scales = scales)
Table
Item analysis
Labels M SD rit Loadings
Anxious_Depressed
n = 4776; Alpha = .91; Std. alpha = .91; Homogeneity = .56
itrf_E_1 0.95 1.04 0.74 0.77
itrf_E_2 0.76 0.98 0.75 0.78
itrf_E_3 0.58 0.90 0.75 0.80
itrf_E_4 0.57 0.86 0.78 0.82
itrf_E_5 0.93 0.96 0.74 0.78
itrf_E_6 0.53 0.87 0.69 0.74
itrf_E_14 0.38 0.75 0.71 0.75
itrf_E_15 0.70 0.91 0.53 0.55
Oppositional_Disruptive
n = 4776; Alpha = .94; Std. alpha = .94; Homogeneity = .63
itrf_I_20 0.55 0.91 0.76 0.79
itrf_E_7 0.35 0.75 0.79 0.82
itrf_E_8 0.49 0.90 0.81 0.84
itrf_E_9 0.83 0.96 0.73 0.75
itrf_E_10 0.40 0.79 0.78 0.81
itrf_E_11 0.77 0.95 0.81 0.84
itrf_E_12 0.46 0.81 0.71 0.73
itrf_E_13 0.47 0.85 0.81 0.85
itrf_E_16 0.35 0.72 0.68 0.69
Socially_Withdrawn
n = 4772; Alpha = .88; Std. alpha = .88; Homogeneity = .48
itrf_I_1 0.38 0.71 0.65 0.72
itrf_I_4 0.31 0.63 0.78 0.86
itrf_I_5 0.24 0.59 0.72 0.80
itrf_I_6 0.21 0.51 0.72 0.79
itrf_I_13 0.42 0.72 0.56 0.58
itrf_I_14 0.35 0.69 0.56 0.57
itrf_I_16 0.43 0.76 0.66 0.71
itrf_I_24 0.38 0.71 0.53 0.56
Academic_Productivity_Disorganization
n = 4772; Alpha = .88; Std. alpha = .88; Homogeneity = .41
itrf_I_2 0.35 0.69 0.60 0.65
itrf_I_7 0.41 0.72 0.64 0.69
itrf_I_8 0.35 0.69 0.61 0.66
itrf_I_9 0.48 0.77 0.64 0.68
itrf_I_10 0.28 0.66 0.52 0.56
itrf_I_11 0.33 0.68 0.52 0.55
itrf_I_12 0.37 0.69 0.56 0.60
itrf_I_15 0.37 0.71 0.65 0.70
itrf_I_17 0.38 0.71 0.56 0.60
itrf_I_19 0.23 0.59 0.57 0.62
itrf_I_23 0.36 0.68 0.69 0.74

We can even run a confirmatory factor analysis with the lavaan package:

model <- lavaan_model(scales, orthogonal = FALSE)
Anxious_Depressed =~ itrf_E_1 + itrf_E_2 + itrf_E_3 + itrf_E_4 + itrf_E_5 + itrf_E_6 + itrf_E_14 + itrf_E_15
Oppositional_Disruptive =~ itrf_I_20 + itrf_E_7 + itrf_E_8 + itrf_E_9 + itrf_E_10 + itrf_E_11 + itrf_E_12 + itrf_E_13 + itrf_E_16
Socially_Withdrawn =~ itrf_I_1 + itrf_I_4 + itrf_I_5 + itrf_I_6 + itrf_I_13 + itrf_I_14 + itrf_I_16 + itrf_I_24
Academic_Productivity_Disorganization =~ itrf_I_2 + itrf_I_7 + itrf_I_8 + itrf_I_9 + itrf_I_10 + itrf_I_11 + itrf_I_12 + itrf_I_15 + itrf_I_17 + itrf_I_19 + itrf_I_23

Anxious_Depressed ~~ Oppositional_Disruptive + Socially_Withdrawn + Academic_Productivity_Disorganization
Oppositional_Disruptive ~~ Socially_Withdrawn + Academic_Productivity_Disorganization
Socially_Withdrawn ~~ Academic_Productivity_Disorganization
fit <- lavaan::cfa(model = model, data = dat)
wmisc::nice_sem(fit)
Table
Structure equation model
parameter | Beta | 95% CI lower | 95% CI upper | se | z | p
Latent variables
Anxious_Depressed → itrf_E_1 0.76 0.75 0.77 0.01 110.77 <.001
Anxious_Depressed → itrf_E_2 0.78 0.76 0.79 0.01 120.02 <.001
Anxious_Depressed → itrf_E_3 0.80 0.79 0.81 0.01 135.58 <.001
Anxious_Depressed → itrf_E_4 0.82 0.81 0.83 0.01 148.15 <.001
Anxious_Depressed → itrf_E_5 0.78 0.76 0.79 0.01 120.22 <.001
Anxious_Depressed → itrf_E_6 0.75 0.74 0.76 0.01 106.50 <.001
Anxious_Depressed → itrf_E_14 0.76 0.74 0.77 0.01 109.90 <.001
Anxious_Depressed → itrf_E_15 0.55 0.53 0.57 0.01 51.36 <.001
Oppositional_Disruptive → itrf_I_20 0.80 0.79 0.81 0.01 141.76 <.001
Oppositional_Disruptive → itrf_E_7 0.82 0.81 0.83 0.01 155.03 <.001
Oppositional_Disruptive → itrf_E_8 0.85 0.84 0.85 0.00 181.97 <.001
Oppositional_Disruptive → itrf_E_9 0.75 0.74 0.77 0.01 112.49 <.001
Oppositional_Disruptive → itrf_E_10 0.82 0.81 0.83 0.01 154.79 <.001
Oppositional_Disruptive → itrf_E_11 0.83 0.82 0.84 0.00 168.90 <.001
Oppositional_Disruptive → itrf_E_12 0.72 0.71 0.74 0.01 97.22 <.001
Oppositional_Disruptive → itrf_E_13 0.84 0.83 0.85 0.00 176.69 <.001
Oppositional_Disruptive → itrf_E_16 0.69 0.67 0.71 0.01 85.80 <.001
Socially_Withdrawn → itrf_I_1 0.74 0.73 0.76 0.01 102.48 <.001
Socially_Withdrawn → itrf_I_4 0.86 0.85 0.87 0.00 179.03 <.001
Socially_Withdrawn → itrf_I_5 0.82 0.80 0.83 0.01 141.60 <.001
Socially_Withdrawn → itrf_I_6 0.79 0.78 0.81 0.01 127.41 <.001
Socially_Withdrawn → itrf_I_13 0.55 0.53 0.57 0.01 51.19 <.001
Socially_Withdrawn → itrf_I_14 0.53 0.51 0.56 0.01 48.38 <.001
Socially_Withdrawn → itrf_I_16 0.70 0.69 0.72 0.01 87.13 <.001
Socially_Withdrawn → itrf_I_24 0.56 0.54 0.58 0.01 52.14 <.001
Academic_Productivity_Disorganization → itrf_I_2 0.64 0.62 0.66 0.01 66.95 <.001
Academic_Productivity_Disorganization → itrf_I_7 0.68 0.67 0.70 0.01 79.32 <.001
Academic_Productivity_Disorganization → itrf_I_8 0.64 0.62 0.66 0.01 67.65 <.001
Academic_Productivity_Disorganization → itrf_I_9 0.70 0.68 0.71 0.01 83.66 <.001
Academic_Productivity_Disorganization → itrf_I_10 0.57 0.55 0.59 0.01 53.16 <.001
Academic_Productivity_Disorganization → itrf_I_11 0.56 0.54 0.58 0.01 52.39 <.001
Academic_Productivity_Disorganization → itrf_I_12 0.60 0.58 0.62 0.01 59.57 <.001
Academic_Productivity_Disorganization → itrf_I_15 0.70 0.68 0.71 0.01 83.08 <.001
Academic_Productivity_Disorganization → itrf_I_17 0.62 0.60 0.64 0.01 62.41 <.001
Academic_Productivity_Disorganization → itrf_I_19 0.61 0.59 0.63 0.01 61.29 <.001
Academic_Productivity_Disorganization → itrf_I_23 0.73 0.71 0.74 0.01 94.21 <.001
Covariances
Anxious_Depressed ↔ Oppositional_Disruptive 0.50 0.47 0.52 0.01 41.55 <.001
Anxious_Depressed ↔ Socially_Withdrawn 0.35 0.32 0.38 0.01 24.94 <.001
Anxious_Depressed ↔ Academic_Productivity_Disorganization 0.39 0.37 0.42 0.01 28.52 <.001
Oppositional_Disruptive ↔ Socially_Withdrawn 0.19 0.16 0.22 0.02 12.78 <.001
Oppositional_Disruptive ↔ Academic_Productivity_Disorganization 0.43 0.40 0.45 0.01 32.58 <.001
Socially_Withdrawn ↔ Academic_Productivity_Disorganization 0.57 0.55 0.59 0.01 49.49 <.001
Variances
itrf_E_1 0.42 0.40 0.44 0.01 40.81 <.001
itrf_E_2 0.40 0.38 0.42 0.01 39.50 <.001
itrf_E_3 0.36 0.34 0.38 0.01 37.60 <.001
itrf_E_4 0.33 0.31 0.35 0.01 36.24 <.001
itrf_E_5 0.40 0.38 0.42 0.01 39.47 <.001
itrf_E_6 0.44 0.42 0.46 0.01 41.47 <.001
itrf_E_14 0.43 0.41 0.45 0.01 40.94 <.001
itrf_E_15 0.70 0.67 0.72 0.01 59.02 <.001
itrf_I_20 0.36 0.34 0.37 0.01 39.32 <.001
itrf_E_7 0.33 0.31 0.35 0.01 38.18 <.001
itrf_E_8 0.28 0.27 0.30 0.01 36.23 <.001
itrf_E_9 0.43 0.41 0.45 0.01 42.55 <.001
itrf_E_10 0.33 0.31 0.35 0.01 38.20 <.001
itrf_E_11 0.31 0.29 0.32 0.01 37.13 <.001
itrf_E_12 0.48 0.46 0.50 0.01 44.92 <.001
itrf_E_13 0.29 0.28 0.31 0.01 36.59 <.001
itrf_E_16 0.52 0.50 0.55 0.01 47.22 <.001
itrf_I_1 0.45 0.43 0.47 0.01 41.30 <.001
itrf_I_4 0.26 0.24 0.28 0.01 31.64 <.001
itrf_I_5 0.34 0.32 0.35 0.01 35.78 <.001
itrf_I_6 0.37 0.35 0.39 0.01 37.55 <.001
itrf_I_13 0.70 0.67 0.72 0.01 58.43 <.001
itrf_I_14 0.71 0.69 0.74 0.01 60.45 <.001
itrf_I_16 0.50 0.48 0.53 0.01 44.39 <.001
itrf_I_24 0.69 0.67 0.71 0.01 57.81 <.001
itrf_I_2 0.59 0.57 0.62 0.01 49.12 <.001
itrf_I_7 0.53 0.51 0.55 0.01 45.04 <.001
itrf_I_8 0.59 0.57 0.61 0.01 48.85 <.001
itrf_I_9 0.51 0.49 0.53 0.01 43.87 <.001
itrf_I_10 0.68 0.65 0.70 0.01 55.87 <.001
itrf_I_11 0.68 0.66 0.71 0.01 56.36 <.001
itrf_I_12 0.64 0.61 0.66 0.01 52.34 <.001
itrf_I_15 0.51 0.49 0.54 0.01 44.02 <.001
itrf_I_17 0.62 0.60 0.64 0.01 51.01 <.001
itrf_I_19 0.63 0.60 0.65 0.01 51.52 <.001
itrf_I_23 0.47 0.45 0.49 0.01 41.42 <.001
Anxious_Depressed 1.00 1.00 1.00 0.00
Oppositional_Disruptive 1.00 1.00 1.00 0.00
Socially_Withdrawn 1.00 1.00 1.00 0.00
Academic_Productivity_Disorganization 1.00 1.00 1.00 0.00
Modelfit
N parameters 78.00
Observations 4772.00
Χ²(588) / p-value 21213.65 0.00
CFI 0.81
TLI 0.80
RMSEA 0.09
RMSEA ci lower 0.08
SRMR 0.08
AIC 304933.08
BIC 305437.78
Note. The estimation method employed in this analysis is Maximum Likelihood (ML). The Non-Linear Minimization, Bounded (nlminb) algorithm was applied for optimization. The analysis was performed with the lavaan package in R (Yves Rosseel, Terrence D. Jorgensen, Luc De Wilde, 2025).
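
The model syntax printed by lavaan_model above follows a simple pattern: one "=~" line per scale and one "~~" line per scale listing the covariances with all later scales. The base R sketch below rebuilds that pattern. It assumes that the scale definitions form a named list of item-name vectors; the list scales_sketch and its item names are hypothetical, and this is not the package's own implementation.

```r
# Hypothetical scale definitions: scale name -> item names
scales_sketch <- list(
  Withdrawn  = c("item_1", "item_2", "item_3"),
  Anxious    = c("item_4", "item_5"),
  Disruptive = c("item_6", "item_7")
)
scale_names <- names(scales_sketch)

# Measurement part: each latent variable is measured by its items
measurement <- paste0(
  scale_names, " =~ ",
  vapply(scales_sketch, paste, character(1), collapse = " + ")
)

# Covariance part: each scale covaries with all scales listed after it
covariances <- vapply(
  seq_len(length(scale_names) - 1),
  function(i) {
    paste0(scale_names[i], " ~~ ",
           paste(scale_names[-seq_len(i)], collapse = " + "))
  },
  character(1)
)

model_sketch <- paste(c(measurement, "", covariances), collapse = "\n")
cat(model_sketch, "\n")
```

A string built this way can be passed to lavaan::cfa() through its model argument, as in the call shown above.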

Build scale scores

Now we will create scores for the internalizing and externalizing scales.

dat$itrf_ext <- score_scale(dat, scale == "ITRF" & subscale == "Ext", label = "Externalizing")
dat$itrf_int <- score_scale(dat, scale == "ITRF" & subscale == "Int", label = "Internalizing")

Then we get descriptive statistics for those scores:

dat[, c("itrf_ext", "itrf_int")] |> 
  rename_items() |> 
  wmisc::nice_descriptives(round = 1)
Table
Descriptive statistics
Variable Valid Missing Mean SD Min Max Range Median MAD
Externalizing 4776 0 0.6 0.6 0 3.0 3.0 0.4 0.5
Internalizing 4772 4 0.3 0.4 0 2.6 2.6 0.2 0.3
Note. MAD is the median average deviation with a consistency adjustment.

Look up norms from a norm table

Many scales come with norm tables to convert raw scores to t-scores, percentile ranks, etc.

The lookup_norms function helps with this conversion.

First, you need a data frame (or an Excel table, etc.) that contains raw scores and the corresponding norm scores.

Here is an example of such a table:

ex_normtable_int |> dplyr::slice(1:10) |> wmisc::nice_table()
group raw T PR T_from_PR
all 0 42 26 43
all 1 43 35 46
all 2 44 42 48
all 3 46 49 50
all 4 47 55 51
all 5 48 60 52
all 6 49 64 54
all 7 50 68 55
all 8 52 72 56
all 9 53 75 57

Then we need raw scores from a scale. If they do not exist, you may use the score_scale function to add sum scores. To do so, set the sum argument to TRUE. By setting max_na = 0, we do not allow missing values in any scale item:

dat$raw_int <- score_scale(dat, subscale == "Int", sum = TRUE, max_na = 0)
dat$raw_ext <- score_scale(dat, subscale == "Ext", sum = TRUE, max_na = 0)
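
To make the effect of the two arguments concrete, here is a base R sketch of a sum score that tolerates at most max_na missing items per person. It only illustrates the idea and is not the code behind score_scale; the data frame items and its columns are hypothetical.

```r
# Hypothetical item responses for four persons
items <- data.frame(
  item_a = c(0, 1, NA, 3),
  item_b = c(3, 2, 1, 3),
  item_c = c(1, 1, 2, NA)
)
max_na <- 0

# Number of missing items per person
n_missing <- unname(rowSums(is.na(items)))

# Sum score; persons with more than max_na missing items get NA
sum_score <- unname(rowSums(items, na.rm = TRUE))
sum_score[n_missing > max_na] <- NA

# For comparison: a mean score over the available items
mean_score <- unname(rowMeans(items, na.rm = TRUE))

print(sum_score)
print(mean_score)
```

With max_na = 0, a single missing item is enough to make the sum score missing, which matters because raw sum scores are only comparable to a norm table when every item was answered.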

Look up T values:

dat$T_int <- lookup_norms(dat$raw_int, normtable = ex_normtable_int, to = "T")
dat$T_ext <- lookup_norms(dat$raw_ext, normtable = ex_normtable_ext, to = "T")

Or percentile ranks:

dat$PR_int <- lookup_norms(dat$raw_int, normtable = ex_normtable_int, to = "PR")
dat$PR_ext <- lookup_norms(dat$raw_ext, normtable = ex_normtable_ext, to = "PR")

The result can look like this:

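Internalizing Raw | Internalizing T | Internalizing PR | Externalizing Raw | Externalizing T | Externalizing PR
1 | 43 | 35 | 0 | 40 | 14
1 | 43 | 35 | 6 | 46 | 49
4 | 47 | 55 | 7 | 47 | 53
24 | 71 | 95 | 48 | 87 | 100
4 | 47 | 55 | 24 | 64 | 89
2 | 44 | 42 | 10 | 50 | 64
16 | 61 | 88 | 0 | 40 | 14
3 | 46 | 49 | 6 | 46 | 49
9 | 53 | 75 | 21 | 61 | 86
4 | 47 | 55 | 38 | 77 | 98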
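
At its core, such a lookup matches each raw score against the raw column of the norm table and returns the norm score from the same row. The base R sketch below shows this with the first ten rows of the internalizing norm table listed earlier. It is an illustration rather than the implementation of lookup_norms; the object names norm_sketch and raw_scores are assumptions.

```r
# First ten rows of the example norm table (group "all")
norm_sketch <- data.frame(
  raw = 0:9,
  T   = c(42, 43, 44, 46, 47, 48, 49, 50, 52, 53),
  PR  = c(26, 35, 42, 49, 55, 60, 64, 68, 72, 75)
)

# Hypothetical raw scores; NA and 12 have no entry in this excerpt
raw_scores <- c(1, 4, 9, 2, NA, 12)

# Find the matching table row for each raw score, then pick the norm value
row_index <- match(raw_scores, norm_sketch$raw)
T_scores  <- norm_sketch$T[row_index]
PR_scores <- norm_sketch$PR[row_index]

print(T_scores)
print(PR_scores)
```

Raw scores without an entry in the table, including missing raw scores, come back as NA in this sketch.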