Computes conditional infit mean-square (MSQ) statistics for each item using
iarm::out_infit(), enriched with item locations relative to the sample
mean person location.
Arguments
- data
A data.frame or matrix of item responses. Items must be scored starting at 0 (non-negative integers). Missing values (NA) are allowed, but at least one complete case (a row with no NA) must be present.
- cutoff
Optional. Default NULL (no cutoff applied; behaviour is identical to the current version). Can be either of:
  - The return value of RMitemInfitCutoff (a list with $item_cutoffs): the data.frame is extracted automatically, and the number of simulation iterations, cutoff_method, and hdci_width are included in the kable caption.
  - The $item_cutoffs data.frame from RMitemInfitCutoff directly: must have columns Item, infit_low, and infit_high.
When provided, adds columns Infit_low, Infit_high, and Flagged (logical; TRUE when Infit_MSQ falls outside the credible range) to the result.
- output
Character string controlling the return value. Either "kable" (default) for a formatted knitr::kable() table, or "dataframe" for the underlying data.frame.
- sort
Optional character string. When sort = "infit", rows are sorted by Infit_MSQ in descending order before output.
Value
If output = "kable": a knitr_kable object (plain text table via format = "pipe") with columns "Item", "Infit MSQ", and "Relative location", and a caption showing the number of complete cases. When cutoff is provided, columns "Infit low", "Infit high", and "Flagged" are also included, and the caption notes the simulation-based cutoffs.
If output = "dataframe": a data.frame with columns Item, Infit_MSQ, and Relative_location. When cutoff is provided, columns Infit_low, Infit_high, and Flagged are also included (inserted after Infit_MSQ, before Relative_location).
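The Flagged column can be illustrated with a minimal base R sketch. It reuses the values printed in the Examples section and assumes, for illustration only, that an item is flagged when its infit MSQ lies strictly outside the interval given by infit_low and infit_high; how the package handles values exactly on a bound is not stated here. The object names infit, item_cutoffs, and res are hypothetical.

```r
# Infit MSQ values as printed in the Examples section
infit <- data.frame(
  Item = paste0("Item", 1:5),
  Infit_MSQ = c(1.049, 0.929, 0.828, 1.216, 0.931)
)

# Simulation-based bounds in the documented $item_cutoffs layout
item_cutoffs <- data.frame(
  Item = paste0("Item", 1:5),
  infit_low = c(0.576, 0.660, 0.547, 0.699, 0.689),
  infit_high = c(1.489, 1.537, 1.374, 1.349, 1.315)
)

# Join on Item, then flag values outside the interval
# (assumption: bounds themselves count as inside)
res <- merge(infit, item_cutoffs, by = "Item", sort = FALSE)
res$Flagged <- res$Infit_MSQ < res$infit_low | res$Infit_MSQ > res$infit_high
res
```

With these values no item is flagged, which agrees with the Flagged column shown in the Examples output.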
Details
Infit MSQ is a weighted fit statistic that emphasises deviations near the
item location. Values close to 1.0 indicate good fit. Values substantially
above 1.0 suggest underfit (unexpected responses), while values substantially
below 1.0 suggest overfit (overly predictable responses). The definition of
"substantially" depends on several factors such as sample size, and needs to
be determined by simulation using RMitemInfitCutoff. There is no general
rule-of-thumb value that is correct.
Conditional infit MSQ statistics are computed via iarm::out_infit(), which
uses the conditional distribution of the sufficient statistics (Müller, 2020).
Only complete cases (rows without any NA) are used in the conditional fit
calculation.
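To see how many rows would enter the conditional fit calculation, the complete cases can be counted beforehand. The following is a small base R sketch with a hypothetical response data.frame (resp); it is not the package's internal code.

```r
# Hypothetical responses with some missing values
resp <- data.frame(
  Item1 = c(0, 1, NA, 1),
  Item2 = c(1, 1, 0, NA),
  Item3 = c(0, 0, 1, 1)
)

# TRUE for rows without any NA
keep <- stats::complete.cases(resp)

# Number of complete cases and the reduced data set
n_complete <- sum(keep)
resp_complete <- resp[keep, , drop = FALSE]
n_complete
```

Here only the first two rows are complete, so n_complete is 2.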
For dichotomous data (maximum score = 1), a Rasch model is fitted via
eRm::RM(). Item locations are the negative beta parameters. Person
locations are estimated via eRm::person.parameter().
For polytomous data (maximum score > 1), a Partial Credit Model is
fitted via eRm::PCM(). Item average locations are taken from the
"Location" column of the threshold parameter table returned by
eRm::thresholds(); if that column is absent, row means of the threshold
columns are used instead. Person locations are estimated via
eRm::person.parameter().
Relative item location is defined as the item's average location minus the sample mean person location, providing a measure of item targeting.
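The arithmetic behind the relative location, including the row-mean fallback described for polytomous data, can be sketched in base R. The threshold matrix thr, its column names, and the person locations theta below are made-up illustration values, not output from eRm; the use of na.rm = TRUE for items with fewer thresholds is likewise an assumption.

```r
# Hypothetical threshold table without a "Location" column
thr <- matrix(
  c(-1.2, 0.3,
    -0.5, 0.9,
     0.1, 1.4),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("Item", 1:3), c("Threshold 1", "Threshold 2"))
)

# Use "Location" if present, otherwise average the threshold columns
if ("Location" %in% colnames(thr)) {
  item_loc <- thr[, "Location"]
} else {
  thr_cols <- grepl("^Threshold", colnames(thr))
  item_loc <- rowMeans(thr[, thr_cols, drop = FALSE], na.rm = TRUE)
}

# Hypothetical person locations (logits)
theta <- c(-0.8, 0.2, 0.5, 1.1)

# Relative location: item average location minus sample mean person location
relative_location <- item_loc - mean(theta)
round(relative_location, 2)
```

A negative value means the item sits below the sample mean person location, and a positive value means it sits above it.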
The iarm package must be installed (it is in Suggests, not Imports).
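Because iarm is only suggested, a script may want to check for it before calling the function. A minimal sketch using base R's requireNamespace() follows; the variable name iarm_available and the message text are illustrative.

```r
# Check whether the suggested package is available without attaching it
iarm_available <- requireNamespace("iarm", quietly = TRUE)

if (!iarm_available) {
  message("Package 'iarm' is not installed; install it with install.packages(\"iarm\").")
}
```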
References
Müller, M. (2020). Item fit statistics for Rasch analysis: Can we trust them? Journal of Statistical Distributions and Applications, 7(5). doi:10.1186/s40488-020-00108-7
Examples
# \donttest{
# Simulate binary item response data (5 items, 40 persons)
set.seed(42)
sim_data <- as.data.frame(
  matrix(sample(0:1, 40 * 5, replace = TRUE), nrow = 40, ncol = 5)
)
colnames(sim_data) <- paste0("Item", 1:5)
# Default kable output
RMitemInfit(sim_data)
#>
#>
#> Table: MSQ values based on conditional estimation (n = 40 complete cases).
#>
#> |Item | Infit MSQ| Relative location|
#> |:-----|---------:|-----------------:|
#> |Item1 | 1.049| -0.03|
#> |Item2 | 0.929| -0.54|
#> |Item3 | 0.828| -0.23|
#> |Item4 | 1.216| 0.26|
#> |Item5 | 0.931| -0.76|
# Sorted by infit MSQ descending
RMitemInfit(sim_data, sort = "infit")
#>
#>
#> Table: MSQ values based on conditional estimation (n = 40 complete cases).
#>
#> |Item | Infit MSQ| Relative location|
#> |:-----|---------:|-----------------:|
#> |Item4 | 1.216| 0.26|
#> |Item1 | 1.049| -0.03|
#> |Item5 | 0.931| -0.76|
#> |Item2 | 0.929| -0.54|
#> |Item3 | 0.828| -0.23|
# Return as data.frame for further processing
df <- RMitemInfit(sim_data, output = "dataframe")
# }
# \donttest{
# Simulation-based cutoffs (100 Monte-Carlo iterations)
cutoff_res <- RMitemInfitCutoff(sim_data, iterations = 100, parallel = FALSE,
                                seed = 42)
RMitemInfit(sim_data, cutoff = cutoff_res)
#>
#>
#> Table: MSQ values based on conditional estimation (n = 40 complete cases). Cutoff values based on 99 simulation iterations (99.9% HDCI).
#>
#> |Item | Infit MSQ| Infit low| Infit high|Flagged | Relative location|
#> |:-----|---------:|---------:|----------:|:-------|-----------------:|
#> |Item1 | 1.049| 0.576| 1.489|FALSE | -0.03|
#> |Item2 | 0.929| 0.660| 1.537|FALSE | -0.54|
#> |Item3 | 0.828| 0.547| 1.374|FALSE | -0.23|
#> |Item4 | 1.216| 0.699| 1.349|FALSE | 0.26|
#> |Item5 | 0.931| 0.689| 1.315|FALSE | -0.76|
RMitemInfit(sim_data, cutoff = cutoff_res, output = "dataframe")
#> Item Infit_MSQ Infit_low Infit_high Flagged Relative_location
#> 1 Item1 1.049 0.576 1.489 FALSE -0.03
#> 2 Item2 0.929 0.660 1.537 FALSE -0.54
#> 3 Item3 0.828 0.547 1.374 FALSE -0.23
#> 4 Item4 1.216 0.699 1.349 FALSE 0.26
#> 5 Item5 0.931 0.689 1.315 FALSE -0.76
# }