# one package below requires that you use devtools to install it manually:# first install devtools by# install.packages('devtools')library(RISEkbmRasch) # devtools::install_github("pgmj/RISEkbmRasch")library(grateful)library(ggrepel)library(car)library(kableExtra)library(readxl)library(tidyverse)library(eRm)library(mirt)library(psych)library(psychotree)library(matrixStats)library(reshape)library(knitr)library(patchwork)library(formattable) library(glue)library(haven)library(labelled)### optional libraries#library(TAM)#library(skimr)### some commands exist in multiple packages, here we define preferred ones that are frequently usedselect <- dplyr::selectcount <- dplyr::countrecode <- car::recoderename <- dplyr::rename
1 Background
Data from (Langer et al. 2024). More comments on that paper will be added later. The only other Item Response Theory-based paper found analyzing the AAQ-2 is (Ong et al. 2019) but they do a rather limited analysis using the Graded Response Model, for example omitting test of unidimensionality, local independence of items, and ordering of response categories.
Code
### import datadf.all <-read_spss("data/DataBase AAQ-II_Criteria_rev.sav")df <- df.all %>%select(starts_with("AAQ_II"),GenderBirth)dif.sex <- df$GenderBirthdf$GenderBirth <-NULLnames(df) <-paste0("aaq_",c(1:7))#val_labels(dif.sex)itemlabels <-data.frame(stringsAsFactors =FALSE,itemnr =paste0("aaq_",c(1:7)),item =c("My painful experiences and memories make it difficult for me to live a life that I would value","I’m afraid of my feelings","I worry about not being able to control my worries and feelings","My painful memories prevent me from having a fulfilling life","Emotions cause problems in my life","It seems like most people are handling their lives better than I am","Worries get in the way of my success" ))source("RISE_theme.R")
2 All items in the analysis
Code
RIlistitems(df)
itemnr
item
aaq_1
My painful experiences and memories make it difficult for me to live a life that I would value
aaq_2
I’m afraid of my feelings
aaq_3
I worry about not being able to control my worries and feelings
aaq_4
My painful memories prevent me from having a fulfilling life
aaq_5
Emotions cause problems in my life
aaq_6
It seems like most people are handling their lives better than I am
aaq_7
Worries get in the way of my success
3 Demographics
Code
RIdemographics(dif.sex, "Gender")
Gender
n
Percent
1
571
33.5
2
1134
66.5
Code
hist(df.all$Age, title ="Age distribution", xlab ="Age", main ="Age distribution", col ="lightblue")
Code
#abline(v = mean(df.all$Age), col = "red", lwd = 2)
Very young sample, and not a lot of variation in age either.
4 Descriptives of raw data
Response distribution for all items together.
Code
RIallresp(df)
Response category
Number of responses
Percent
1
2541
21.3
2
2263
19.0
3
1443
12.1
4
2241
18.8
5
1525
12.8
6
1046
8.8
7
857
7.2
NA
19
0.2
We need the response scale to start at 0 instead of 1 for the Rasch analysis to work correctly.
Data is skewed towards the lower end of the scale. Most item responses are a bit oddly distributed. Response category 2 is consistently deviating from the expected (normal-ish) pattern. Makes one wonder about the data collection and the response category wording used. Let us check.
Code
val_labels(df.all$AAQ_II_1)
Nunca es verdad para mi Muy raramente es verdad para mi
1 2
Raramente es verdad para mi A veces es verdad para mi
3 4
Frecuentemente es verdad para mi Casi siempre es verdad para mi
5 6
Siempre es verdad para mi
7
My Spanish is not great, so here is the Google Translated version:
The eRm package, which uses Conditional Maximum Likelihood (CML) estimation, will be used primarily. For this analysis, the Partial Credit Model will be used.
itemnr
item
aaq_1
My painful experiences and memories make it difficult for me to live a life that I would value
aaq_2
I’m afraid of my feelings
aaq_3
I worry about not being able to control my worries and feelings
aaq_4
My painful memories prevent me from having a fulfilling life
aaq_5
Emotions cause problems in my life
aaq_6
It seems like most people are handling their lives better than I am
Values highlighted in red are above the chosen cutoff 0.5 logits. Background color brown and blue indicate the lowest and highest values among the DIF groups.
Item 5 has somewhat low item fit. PCA of residuals is below 2, but residual correlations show several issues. The strongest correlation is between items 1 and 4, followed by 2 and 3. Items 5 and 7 are also above the cutoff.
The tests for DIF indicates possible issues (p < .05), but the DIF table shows that the differences are small, at least on the item average level. The p-value is small due to the large sample.
All items show disordered response categories. The pattern is most clearly seen in the targeting figure. The consistency observed, with threshold 3 below threhold 2 for all items, makes one wonder if there was a mistake in the coding of responses to numbers, but we don’t know if the supplied datafile is the original.
We have several item pairs that correlate above the relative cutoff of 0.2 and will deal with them one at a time. The strongest correlation is between items 1 and 4.
item 1: My painful experiences and memories make it difficult for me to live a life that I would value
item 4: My painful memories prevent me from having a fulfilling life
These items are very much alike, apart from “fulfilling life” vs “a life that I would value” and item 1 adding “experiences” to “memories”, but even these differences are very similar. It’s not surprising that they correlate strongly.
Item 1 has better separation of item response thresholds. We’ll remove item 4.
Code
df2 <- df2 %>%select(!aaq_4)
6 Rasch analysis 2
itemnr
item
aaq_1
My painful experiences and memories make it difficult for me to live a life that I would value
aaq_2
I’m afraid of my feelings
aaq_3
I worry about not being able to control my worries and feelings
aaq_4
My painful memories prevent me from having a fulfilling life
aaq_5
Emotions cause problems in my life
aaq_6
It seems like most people are handling their lives better than I am
Relative cut-off value (highlighted in red) is -0.026, which is 0.2 above the average correlation.
Code
RIloadLoc(df2)
Code
# increase fig-height above as needed, if you have many itemsRItargeting(df2)
Code
RIitemHierarchy(df2)
There is a large gap in targeting due to the dysfunctional response categories that needed merging. Item 5 is low in item fit, “Emotions cause problems in my life”. It is a very general item, which may explain the low fit. We’ll keep it for now.
8 DIF-analysis
8.1 Gender
itemnr
item
aaq_1
My painful experiences and memories make it difficult for me to live a life that I would value
aaq_2
I’m afraid of my feelings
aaq_5
Emotions cause problems in my life
aaq_6
It seems like most people are handling their lives better than I am
Values highlighted in red are above the chosen cutoff 0.5 logits. Background color brown and blue indicate the lowest and highest values among the DIF groups.
Code
RIdifFigureLR(df2, dif.sex)
Code
RIdifThreshTblLR(df2, dif.sex)
Threshold locations
Standard errors
Item threshold
Male
Female
MaxDiff
All
SE_Male
SE_Female
SE_All
aaq_1
c1
-1.606
-1.394
0.212
-1.469
0.129
0.097
0.078
c2
0.693
0.656
0.037
0.673
0.129
0.091
0.074
c3
1.207
1.265
0.058
1.255
0.181
0.118
0.099
c4
2.296
1.555
0.741
1.709
0.313
0.148
0.132
c5
1.592
2.301
0.709
2.185
0.417
0.199
0.179
aaq_2
c1
-1.873
-1.678
0.195
-1.749
0.140
0.110
0.086
c2
0.373
0.138
0.235
0.223
0.128
0.092
0.074
c3
0.984
0.884
0.1
0.912
0.170
0.105
0.089
c4
1.289
1.466
0.177
1.428
0.221
0.128
0.111
c5
2.412
2.329
0.083
2.361
0.344
0.181
0.160
aaq_5
c1
-1.77
-1.763
0.007
-1.762
0.138
0.108
0.085
c2
0.338
0.41
0.072
0.389
0.126
0.093
0.075
c3
1.053
0.923
0.13
0.969
0.168
0.112
0.093
c4
1.754
1.377
0.377
1.469
0.250
0.135
0.118
c5
1.82
1.696
0.124
1.724
0.347
0.161
0.146
aaq_6
c1
-1.829
-1.658
0.171
-1.719
0.145
0.112
0.088
c2
0.239
0.18
0.059
0.203
0.138
0.100
0.081
c3
0.421
0.617
0.196
0.558
0.164
0.113
0.093
c4
1.014
0.948
0.066
0.977
0.191
0.125
0.104
c5
1.659
1.266
0.393
1.360
0.248
0.133
0.116
aaq_7
c1
-2.002
-1.979
0.023
-1.983
0.146
0.116
0.090
c2
0.332
0.129
0.203
0.198
0.132
0.092
0.075
c3
0.597
0.933
0.336
0.834
0.162
0.108
0.090
c4
1.491
1.246
0.245
1.320
0.216
0.130
0.111
c5
1.548
1.665
0.117
1.649
0.283
0.153
0.135
Note:
Values highlighted in red are above the chosen cutoff 0.5 logits. Background color brown and blue indicate the lowest and highest values among the DIF groups.
Code
RIdifThreshFigLR(df2, dif.sex)
Item 1 shows a surprising pattern for the two highest threholds.
9 Other metrics
9.1 Reliability
Code
RItif(df2)
9.2 Person location & infit ZSTD
Code
RIpfit(df)
9.3 Item parameters
Code
RIitemparams(df2)
Threshold 1
Threshold 2
Threshold 3
Threshold 4
Threshold 5
Item location
aaq_1
-1.47
0.67
1.25
1.71
2.18
0.87
aaq_2
-1.75
0.22
0.91
1.43
2.36
0.64
aaq_5
-1.76
0.39
0.97
1.47
1.72
0.56
aaq_6
-1.72
0.20
0.56
0.98
1.36
0.28
aaq_7
-1.98
0.20
0.83
1.32
1.65
0.4
Note:
Item location is the average of the thresholds for each item.
9.4 Transformation table
Code
RIscoreSE(df2)
Ordinal sum score
Logit score
Logit std.error
0
-4.000
1.487
1
-2.862
0.963
2
-2.172
0.793
3
-1.639
0.705
4
-1.182
0.643
5
-0.789
0.587
6
-0.460
0.537
7
-0.189
0.495
8
0.038
0.461
9
0.234
0.435
10
0.407
0.415
11
0.566
0.401
12
0.714
0.391
13
0.855
0.384
14
0.992
0.381
15
1.128
0.382
16
1.265
0.385
17
1.407
0.392
18
1.555
0.404
19
1.714
0.422
20
1.890
0.446
21
2.093
0.482
22
2.338
0.537
23
2.661
0.627
24
3.148
0.803
25
4.000
1.245
9.5 Ordinal/interval figure
Code
RIscoreSE(df2, output ="figure")
10 Summary of Rasch analysis
Two item pairs had strongly correlated residuals, and all items had issues with disordered thresholds related to the second lowest response category. There was no DIF for gender at birth.
Targeting is not great. Items have rather similar locations, and there is a large gap where the average person location is. If the intended use is clinical, there is decent reliability for those with above average locations.
11 Comparison of latent scores
The “standard” way to use the AAQ-2 is to sum/average the items, after recoding item categories to numerics. We will compare this to the Rasch estimated latent scores.
For ordinal sum scores, we will use the original 7 items with their original 7 response categories, even though the analysis does not support this. We do this to illustrate the difference that can be hidden behind data that is unjustifiedly sum scored and data based on psychometric analysis.
ggplot(df.viz, aes(x = latentscore, y = sumscore)) +geom_point() +labs(x ="Rasch estimated latent score", y ="Ordinal sum score") +theme_rise()
Code
pfit <-RIpfit(df2, output ="dataframe") %>% janitor::clean_names() %>%rename(id = rownumber)df.viz <-left_join(df.viz, pfit, by ="id")df.viz <- df.viz %>%mutate(person_fit =ifelse(person_infit_zstd >2| person_infit_zstd <-2, "outlier", "normal"))ggplot(df.viz, aes(x = latentscore, y = sumscore, color = person_fit)) +geom_point() +labs(x ="Rasch estimated latent score", y ="Ordinal sum score") +theme_rise() +scale_color_viridis_d() +geom_smooth(method ="loess", color ="darkgrey")
Code
hist(df.viz$latentscore, breaks =20, main ="Histogram of Rasch estimated (WL) latent scores", col ="lightpink")
Code
hist(df.viz$sumscore, breaks =20, main ="Histogram of ordinal sum scores", col ="lightblue")mtext(text ="Note: this is based on the original 7 items and 7 response categories.", side =3, line =0.4)
Huge residual correlations between items 1 and 4, and 2 and 3. Amusingly, GitHub Copilot now suggests to me that we add these to the model, which is a really bad practice that too often is used. Unidimensionality is a key assumption of this CFA, and we should not add items to the model just because they have high residual correlations.
Mair and Hatzinger (2007b); Mair and Hatzinger (2007a); Hatzinger and Rusch (2009); Rusch, Maier, and Hatzinger (2013); Koller, Maier, and Hatzinger (2015); Debelak and Koller (2019)
Trepte and Verbeet (2010); Strobl, Wickelmaier, and Zeileis (2011); Strobl, Kopf, and Zeileis (2015); Komboz, Zeileis, and Strobl (2018); Wickelmaier and Zeileis (2018)
Allaire, JJ, Yihui Xie, Christophe Dervieux, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, et al. 2024. rmarkdown: Dynamic Documents for r. https://github.com/rstudio/rmarkdown.
Chalmers, R. Philip. 2012. “mirt: A Multidimensional Item Response Theory Package for the R Environment.”Journal of Statistical Software 48 (6): 1–29. https://doi.org/10.18637/jss.v048.i06.
Debelak, Rudolf, and Ingrid Koller. 2019. “Testing the Local Independence Assumption of the Rasch Model with Q3-Based Nonparametric Model Tests.”Applied Psychological Measurement 44. https://doi.org/10.1177/0146621619835501.
Koller, Ingrid, Marco Maier, and Reinhold Hatzinger. 2015. “An Empirical Power Analysis of Quasi-Exact Tests for the Rasch Model: Measurement Invariance in Small Samples.”Methodology 11. https://doi.org/10.1027/1614-2241/a000090.
Komboz, Basil, Achim Zeileis, and Carolin Strobl. 2018. “Tree-Based Global Model Tests for Polytomous Rasch Models.”Educational and Psychological Measurement 78 (1): 128–66. https://doi.org/10.1177/0013164416664394.
Langer, Álvaro I., Fernando P. Ponce, Jorge L. Ordóñez-Carrasco, Reiner Fuentes-Ferrada, Scarlett Mac-Ginty, Jorge Gaete, and Daniel Núñez. 2024. “Psychometric Evidence of the Acceptance and Action Questionnaire-II (AAQ-II): An Item Response Theory Analysis in University Students from Chile.”BMC Psychology 12 (1): 111. https://doi.org/10.1186/s40359-024-01608-w.
Mair, Patrick, and Reinhold Hatzinger. 2007a. “CML Based Estimation of Extended Rasch Models with the eRm Package in r.”Psychology Science 49. https://doi.org/10.18637/jss.v020.i09.
———. 2007b. “Extended Rasch Modeling: The eRm Package for the Application of IRT Models in r.”Journal of Statistical Software 20. https://doi.org/10.18637/jss.v020.i09.
Ong, Clarissa W., Benjamin G. Pierce, Douglas W. Woods, Michael P. Twohig, and Michael E. Levin. 2019. “The Acceptance and Action Questionnaire II: An Item Response Theory Analysis.”Journal of Psychopathology and Behavioral Assessment 41 (1): 123–34. https://doi.org/10.1007/s10862-018-9694-2.
R Core Team. 2024. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Rosseel, Yves. 2012. “lavaan: An R Package for Structural Equation Modeling.”Journal of Statistical Software 48 (2): 1–36. https://doi.org/10.18637/jss.v048.i02.
Rusch, Thomas, Marco Maier, and Reinhold Hatzinger. 2013. “Linear Logistic Models with Relaxed Assumptions in r.” In Algorithms from and for Nature and Life, edited by Berthold Lausen, Dirk van den Poel, and Alfred Ultsch. Studies in Classification, Data Analysis, and Knowledge Organization. New York: Springer. https://doi.org/10.1007/978-3-319-00035-0_34.
Strobl, Carolin, Julia Kopf, and Achim Zeileis. 2015. “Rasch Trees: A New Method for Detecting Differential Item Functioning in the Rasch Model.”Psychometrika 80 (2): 289–316. https://doi.org/10.1007/s11336-013-9388-3.
Strobl, Carolin, Florian Wickelmaier, and Achim Zeileis. 2011. “Accounting for Individual Differences in Bradley-Terry Models by Means of Recursive Partitioning.”Journal of Educational and Behavioral Statistics 36 (2): 135–53. https://doi.org/10.3102/1076998609359791.
Thériault, Rémi. 2023. “lavaanExtra: Convenience Functions for Lavaan.”Journal of Open Source Software 8 (90): 5701. https://doi.org/10.21105/joss.05701.
Trepte, Sabine, and Markus Verbeet, eds. 2010. Allgemeinbildung in Deutschland – Erkenntnisse Aus Dem SPIEGELStudentenpisa-Test. Wiesbaden: VS Verlag.
Wickelmaier, Florian, and Achim Zeileis. 2018. “Using Recursive Partitioning to Account for Parameter Heterogeneity in Multinomial Processing Tree Models.”Behavior Research Methods 50 (3): 1217–33. https://doi.org/10.3758/s13428-017-0937-z.
Wickham, Hadley. 2007. “Reshaping Data with the Reshape Package.”Journal of Statistical Software 21 (12). https://www.jstatsoft.org/v21/i12/.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.”Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
William Revelle. 2024. psych: Procedures for Psychological, Psychometric, and Personality Research. Evanston, Illinois: Northwestern University. https://CRAN.R-project.org/package=psych.
Xie, Yihui. 2014. “knitr: A Comprehensive Tool for Reproducible Research in R.” In Implementing Reproducible Computational Research, edited by Victoria Stodden, Friedrich Leisch, and Roger D. Peng. Chapman; Hall/CRC.
———. 2015. Dynamic Documents with R and Knitr. 2nd ed. Boca Raton, Florida: Chapman; Hall/CRC. https://yihui.org/knitr/.
———. 2023. knitr: A General-Purpose Package for Dynamic Report Generation in r. https://yihui.org/knitr/.
Xie, Yihui, J. J. Allaire, and Garrett Grolemund. 2018. R Markdown: The Definitive Guide. Boca Raton, Florida: Chapman; Hall/CRC. https://bookdown.org/yihui/rmarkdown.
---title: "AAQ-2 Psychometric Analysis with Rasch Measurement Theory"title-block-banner: "#009ca6"title-block-banner-color: "#FFFFFF"author: name: Magnus Johansson affiliation: RISE Research Institutes of Sweden affiliation-url: https://www.ri.se/en/kbm orcid: 0000-0003-1669-592Xdate: last-modifieddate-format: isorepo-url: https://github.com/pgmj/aaq2_raschformat: html: toc: true toc-depth: 3 toc-title: "Table of contents" embed-resources: true standalone: true page-layout: full mainfont: 'Lato' monofont: 'Roboto Mono' code-overflow: wrap code-fold: true code-tools: true number-sections: true fig-dpi: 96 layout-align: left linestretch: 1.6 theme: - materia - custom.scss css: styles.css license: CC BYexecute: echo: true warning: false message: false cache: trueeditor_options: markdown: wrap: 72 chunk_output_type: consolebibliography: - grateful-refs.bib - references.bib---```{r}#| label: setup# one package below requires that you use devtools to install it manually:# first install devtools by# install.packages('devtools')library(RISEkbmRasch) # devtools::install_github("pgmj/RISEkbmRasch")library(grateful)library(ggrepel)library(car)library(kableExtra)library(readxl)library(tidyverse)library(eRm)library(mirt)library(psych)library(psychotree)library(matrixStats)library(reshape)library(knitr)library(patchwork)library(formattable) library(glue)library(haven)library(labelled)### optional libraries#library(TAM)#library(skimr)### some commands exist in multiple packages, here we define preferred ones that are frequently usedselect <- dplyr::selectcount <- dplyr::countrecode <- car::recoderename <- dplyr::rename```## BackgroundData from [@langer2024]. More comments on that paper will be addedlater. The only other Item Response Theory-based paper found analyzingthe AAQ-2 is [@ong2019] but they do a rather limited analysis using theGraded Response Model, for example omitting test of unidimensionality,local independence of items, and ordering of response categories.```{r}### import datadf.all <-read_spss("data/DataBase AAQ-II_Criteria_rev.sav")df <- df.all %>%select(starts_with("AAQ_II"),GenderBirth)dif.sex <- df$GenderBirthdf$GenderBirth <-NULLnames(df) <-paste0("aaq_",c(1:7))#val_labels(dif.sex)itemlabels <-data.frame(stringsAsFactors =FALSE,itemnr =paste0("aaq_",c(1:7)),item =c("My painful experiences and memories make it difficult for me to live a life that I would value","I’m afraid of my feelings","I worry about not being able to control my worries and feelings","My painful memories prevent me from having a fulfilling life","Emotions cause problems in my life","It seems like most people are handling their lives better than I am","Worries get in the way of my success" ))source("RISE_theme.R")```## All items in the analysis```{r}RIlistitems(df)```## Demographics```{r}#| layout-ncol: 2RIdemographics(dif.sex, "Gender")``````{r}hist(df.all$Age, title ="Age distribution", xlab ="Age", main ="Age distribution", col ="lightblue")#abline(v = mean(df.all$Age), col = "red", lwd = 2)```Very young sample, and not a lot of variation in age either.## Descriptives of raw dataResponse distribution for all items together.```{r}#| tbl-cap: "Total number of responses for all items"RIallresp(df)```We need the response scale to start at 0 instead of 1 for the Raschanalysis to work correctly.```{r}df <- df %>%mutate(across(starts_with("aaq_"), ~ .x -1))``````{r}allresp <-data.frame(Response.category =c(0L, 1L, 2L, 3L, 4L, 5L, 6L, NA),Number.of.responses =c(2541L, 2263L, 1443L, 2241L, 1525L, 1046L, 857L, 19L),Percent =c(21.3, 19, 12.1, 18.8, 12.8, 8.8, 7.2, 0.2))ggplot(allresp, aes(x = Response.category, y = Percent)) +geom_bar(stat ="identity", fill ="#009ca6") +geom_text(aes(label = Number.of.responses), vjust =-0.5, size =3) +labs(title ="Response distribution for all items",x ="Response category",y ="% of responses") +scale_x_continuous(breaks =c(0:6)) +theme_rise()```### Descriptives - item level```{r}#| column: margin#| echo: falseRIlistItemsMargin(df, fontsize =12)```::: panel-tabset#### Tile plot```{r}RItileplot(df)```#### Stacked bars```{r}RIbarstack(df)```#### Barplots```{r}#| layout-ncol: 2RIbarplot(df)```:::Data is skewed towards the lower end of the scale. Most item responsesare a bit oddly distributed. Response category 2 is consistentlydeviating from the expected (normal-ish) pattern. Makes one wonder aboutthe data collection and the response category wording used. Let uscheck.```{r}val_labels(df.all$AAQ_II_1)```My Spanish is not great, so here is the Google Translated version:1. It's never true for me2. It's very rarely true for me.3. It's rarely true for me4. Sometimes it's true for me5. It's often true for me.6. It's almost always true for me.7. It's always true for me### Missing data::: panel-tabset#### Missing responses/item```{r}RImissing(df)```#### Missing responses/person```{r}RImissingP(df, n =20)```:::One respondent with 100% missing, and 12 with 1 item missing. We'llremove all respondents with missing data since we have a large dataset.```{r}df <-na.omit(df)dif.sex <- df.all %>%select(GenderBirth,starts_with("AAQ")) %>%na.omit() %>%pull(GenderBirth)dif.sex <-factor(dif.sex, levels =c(1,2),labels =c("Male","Female"))```## Rasch analysis 1The eRm package, which uses Conditional Maximum Likelihood (CML)estimation, will be used primarily. For this analysis, the PartialCredit Model will be used.```{r}#| column: margin#| echo: falseRIlistItemsMargin(df, fontsize =13)```::: panel-tabset### Item fit```{r}RIitemfitPCM2(df, 250, 16 )```### PCA```{r}#| tbl-cap: "PCA of Rasch model residuals"RIpcmPCA(df)```### Residual correlations```{r}RIresidcorr(df, cutoff =0.2)```### 1st contrast loadings```{r}RIloadLoc(df)```### Response categories```{r}#| layout-ncol: 2RIitemCats(df)```### Targeting```{r}#| fig-height: 5# increase fig-height above as needed, if you have many itemsRItargeting(df)```### Item hierarchy```{r}#| fig-height: 5RIitemHierarchy(df)```### DIF 1```{r}erm.df <-PCM(df)LRtest(erm.df, dif.sex)RIdifTableLR(df, dif.sex)```:::Item 5 has somewhat low item fit. PCA of residuals is below 2, butresidual correlations show several issues. The strongest correlation isbetween items 1 and 4, followed by 2 and 3. Items 5 and 7 are also abovethe cutoff.The tests for DIF indicates possible issues (p \< .05), but the DIFtable shows that the differences are small, at least on the item average level. The p-value is small due tothe large sample.All items show disordered response categories. The pattern is mostclearly seen in the targeting figure. The consistency observed, withthreshold 3 below threhold 2 for all items, makes one wonder if therewas a mistake in the coding of responses to numbers, but we don't knowif the supplied datafile is the original.We'll merge categories 1 and 2:- It's very rarely true for me.- It's rarely true for me```{r}df2 <- df %>%mutate(across(starts_with("aaq_"), ~recode(., "2=1;3=2;4=3;5=4;6=5")))```### 1 and 2 merged```{r}mirt(df2, model=1, itemtype='Rasch', verbose =FALSE) %>%plot(type="trace", as.table =TRUE, theta_lim =c(-6,6))```Let's also try merging 2 and 3 for comparison.```{r}df2 <- df %>%mutate(across(starts_with("aaq_"), ~recode(., "3=2;4=3;5=4;6=5")))```### 2 and 3 merged instead```{r}mirt(df2, model=1, itemtype='Rasch', verbose =FALSE) %>%plot(type="trace", as.table =TRUE, theta_lim =c(-6,6))```It seems 1 and 2 works better, as expected.```{r}df2 <- df %>%mutate(across(starts_with("aaq_"), ~recode(., "2=1;3=2;4=3;5=4;6=5")))```### Residual correlationsWe have several item pairs that correlate above the relative cutoff of0.2 and will deal with them one at a time. The strongest correlation isbetween items 1 and 4.- item 1: My painful experiences and memories make it difficult for me to live a life that I would value- item 4: My painful memories prevent me from having a fulfilling lifeThese items are very much alike, apart from "fulfilling life" vs "a lifethat I would value" and item 1 adding "experiences" to "memories", buteven these differences are very similar. It's not surprising that theycorrelate strongly.Item 1 has better separation of item response thresholds. We'll removeitem 4.```{r}df2 <- df2 %>%select(!aaq_4)```## Rasch analysis 2```{r}#| column: margin#| echo: falseRIlistItemsMargin(df, fontsize =13)```::: panel-tabset### Item fit```{r}RIitemfitPCM2(df2, 250, 16)```### PCA```{r}#| tbl-cap: "PCA of Rasch model residuals"RIpcmPCA(df2)```### Residual correlations```{r}RIresidcorr(df2, cutoff =0.2)```### 1st contrast loadings```{r}RIloadLoc(df2)```### Targeting```{r}#| fig-height: 5# increase fig-height above as needed, if you have many itemsRItargeting(df2)```### Item hierarchy```{r}#| fig-height: 5RIitemHierarchy(df2)```:::Items 2 and 3 still correlate quite strongly.- item 2: I'm afraid of my feelings- item 3: I worry about not being able to control my worries and feelingsItem 3 has worse fit, and we'll remove it. It is also a dual question,"worries and feelings", which can explain the item fit.```{r}df2 <- df2 %>%select(!aaq_3)```## Rasch analysis 3```{r}#| column: margin#| echo: falseRIlistItemsMargin(df, fontsize =13)```::: panel-tabset### Item fit```{r}RIitemfitPCM2(df2, 250, 16)```### PCA```{r}#| tbl-cap: "PCA of Rasch model residuals"RIpcmPCA(df2)```### Residual correlations```{r}RIresidcorr(df2, cutoff =0.2)```### 1st contrast loadings```{r}RIloadLoc(df2)```### Targeting```{r}#| fig-height: 5# increase fig-height above as needed, if you have many itemsRItargeting(df2)```### Item hierarchy```{r}#| fig-height: 5RIitemHierarchy(df2)```:::There is a large gap in targeting due to the dysfunctional responsecategories that needed merging. Item 5 is low in item fit, "Emotionscause problems in my life". It is a very general item, which may explainthe low fit. We'll keep it for now.## DIF-analysis### Gender```{r}#| column: margin#| echo: falseRIlistItemsMargin(df2, fontsize =13)```::: panel-tabset#### Average locations```{r}RIdifTableLR(df2, dif.sex)RIdifFigureLR(df2, dif.sex)```#### Threshold locations```{r}RIdifThreshTblLR(df2, dif.sex)RIdifThreshFigLR(df2, dif.sex)```:::Item 1 shows a surprising pattern for the two highest threholds.## Other metrics### Reliability```{r}RItif(df2)```### Person location & infit ZSTD```{r}RIpfit(df)```### Item parameters```{r}RIitemparams(df2)```### Transformation table```{r}RIscoreSE(df2)```### Ordinal/interval figure```{r}RIscoreSE(df2, output ="figure")```## Summary of Rasch analysisTwo item pairs had strongly correlated residuals, and all items hadissues with disordered thresholds related to the second lowest responsecategory. There was no DIF for gender at birth.Targeting is not great. Items have rather similar locations, and thereis a large gap where the average person location is. If the intended useis clinical, there is decent reliability for those with above averagelocations.## Comparison of latent scoresThe "standard" way to use the AAQ-2 is to sum/average the items, afterrecoding item categories to numerics. We will compare this to the Raschestimated latent scores.For ordinal sum scores, we will use the original 7 items with theiroriginal 7 response categories, even though the analysis does notsupport this. We do this to illustrate the difference that can be hiddenbehind data that is unjustifiedly sum scored and data based onpsychometric analysis.```{r}df.viz <- df %>%na.omit() %>%mutate(sumscore =rowSums(across(starts_with("aaq_")))) %>%mutate(sumscore =as.numeric(sumscore)) %>%rownames_to_column("id") %>%mutate(id =as.numeric(id))df.viz$latentscore <-RIestThetas2(df2)``````{r}ggplot(df.viz, aes(x = latentscore, y = sumscore)) +geom_point() +labs(x ="Rasch estimated latent score", y ="Ordinal sum score") +theme_rise()``````{r}pfit <-RIpfit(df2, output ="dataframe") %>% janitor::clean_names() %>%rename(id = rownumber)df.viz <-left_join(df.viz, pfit, by ="id")df.viz <- df.viz %>%mutate(person_fit =ifelse(person_infit_zstd >2| person_infit_zstd <-2, "outlier", "normal"))ggplot(df.viz, aes(x = latentscore, y = sumscore, color = person_fit)) +geom_point() +labs(x ="Rasch estimated latent score", y ="Ordinal sum score") +theme_rise() +scale_color_viridis_d() +geom_smooth(method ="loess", color ="darkgrey")``````{r}hist(df.viz$latentscore, breaks =20, main ="Histogram of Rasch estimated (WL) latent scores", col ="lightpink")``````{r}hist(df.viz$sumscore, breaks =20, main ="Histogram of ordinal sum scores", col ="lightblue")mtext(text ="Note: this is based on the original 7 items and 7 response categories.", side =3, line =0.4)```## CFALet's make a brief comparison.```{r}library(lavaan)library(lavaanExtra)```Here is some info about the CFA methodology:<https://pgmj.github.io/ki_irt_mhcsf/cfa.html> It will be copied to thisdocument later.We will use both WLSMV and ML/MLR estimators.```{r}# Define latent variableslatent <-list(aaq2 =names(df))# Write the modelcfa.model <-write_lavaan(latent = latent)fit.wlsmv <-cfa(model = cfa.model, data = df,estimator ="WLSMV",ordered =TRUE,rotation ="oblimin")fit.ml <-cfa(model = cfa.model, data = df,estimator ="ML",rotation ="oblimin")fit.mlr <-cfa(model = cfa.model, data = df,estimator ="MLR",rotation ="oblimin")``````{r}fit_metrics_robust <-c("chisq.scaled", "df", "pvalue.scaled", "cfi.robust", "tli.robust", "rmsea.robust", "rmsea.ci.lower.robust","rmsea.ci.upper.robust","srmr")mfit1 <-fitmeasures(fit.wlsmv, fit_metrics_robust) %>%rbind() %>%as.data.frame() %>%mutate(across(where(is.numeric),~round(.x, 3))) %>%rename(Chi2 = chisq.scaled,p = pvalue.scaled,CFI = cfi.robust,TLI = tli.robust,RMSEA = rmsea.robust,CI_low = rmsea.ci.lower.robust,CI_high = rmsea.ci.upper.robust,SRMR = srmr) %>%add_column(Model ="WLSMV", .before ="Chi2")``````{r}fit_metrics.ml <-c("chisq", "df", "pvalue", "cfi", "tli", "rmsea", "rmsea.ci.lower","rmsea.ci.upper","srmr")mfit2 <-fitmeasures(fit.ml, fit_metrics.ml) %>%rbind() %>%as.data.frame() %>%mutate(across(where(is.numeric),~round(.x, 3))) %>%rename(Chi2 = chisq,p = pvalue,CFI = cfi,TLI = tli,RMSEA = rmsea,CI_low = rmsea.ci.lower,CI_high = rmsea.ci.upper,SRMR = srmr) %>%add_column(Model ="ML", .before ="Chi2")``````{r}fit_metrics.mlr <-c("chisq.scaled", "df.scaled", "pvalue", "cfi.scaled", "tli.scaled", "rmsea.scaled", "rmsea.ci.lower.scaled","rmsea.ci.upper.scaled","srmr")mfit3 <-fitmeasures(fit.mlr, fit_metrics.mlr) %>%rbind() %>%as.data.frame() %>%mutate(across(where(is.numeric),~round(.x, 3))) %>%rename(Chi2 = chisq.scaled,p = pvalue,df = df.scaled,CFI = cfi.scaled,TLI = tli.scaled,RMSEA = rmsea.scaled,CI_low = rmsea.ci.lower.scaled,CI_high = rmsea.ci.upper.scaled,SRMR = srmr) %>%add_column(Model ="MLR", .before ="Chi2")```### Model fit comparison```{r}mfit1 %>%bind_rows(mfit2) %>%bind_rows(mfit3) %>%rownames_to_column("remove") %>%select(!remove) %>%kbl_rise()```### Modification indicesWe'll look at the MLR output, since that model had the least bad fit.```{r}modificationIndices(fit.mlr,standardized = T) %>%as.data.frame(row.names =NULL) %>%filter(mi >30, op =="~~") %>%arrange(desc(mi)) %>%mutate(across(where(is.numeric),~round(.x, 3))) %>%kbl_rise(fontsize =14, tbl_width =75)```Huge residual correlations between items 1 and 4, and 2 and 3.Amusingly, GitHub Copilot now suggests to me that we add these to themodel, which is a really bad practice that too often is used.Unidimensionality is a key assumption of this CFA, and we should not additems to the model just because they have high residual correlations.### CFA based on Rasch model```{r}# Define latent variableslatent <-list(aaq2 =names(df2))# Write the modelcfa.model <-write_lavaan(latent = latent)fit.wlsmv <-cfa(model = cfa.model, data = df2,estimator ="WLSMV",ordered =TRUE,rotation ="oblimin")fit_metrics_robust <-c("chisq.scaled", "df", "pvalue.scaled", "cfi.robust", "tli.robust", "rmsea.robust", "rmsea.ci.lower.robust","rmsea.ci.upper.robust","srmr")mfit1 <-fitmeasures(fit.wlsmv, fit_metrics_robust) %>%rbind() %>%as.data.frame() %>%mutate(across(where(is.numeric),~round(.x, 3))) %>%rename(Chi2 = chisq.scaled,p = pvalue.scaled,CFI = cfi.robust,TLI = tli.robust,RMSEA = rmsea.robust,CI_low = rmsea.ci.lower.robust,CI_high = rmsea.ci.upper.robust,SRMR = srmr) %>%add_column(Model ="WLSMV", .before ="Chi2")kbl_rise(mfit1)``````{r}modificationIndices(fit.wlsmv,standardized = T) %>%as.data.frame(row.names =NULL) %>%#filter(mi > 30,# op == "~~") %>% arrange(desc(mi)) %>%mutate(across(where(is.numeric),~round(.x, 3))) %>%kbl_rise(fontsize =14, tbl_width =75)```## Software used```{r}pkgs <-cite_packages(cite.tidyverse =TRUE,output ="table",bib.file ="grateful-refs.bib",include.RStudio =TRUE,out.dir =getwd())formattable(pkgs, table.attr ='class=\"table table-striped\" style="font-size: 15px; font-family: Lato; width: 80%"')``````{r}sessionInfo()```## References