Build a rating-scale diagnostics report
Arguments
- fit
Output from fit_mfrm().
- diagnostics
Optional output from diagnose_mfrm().
- whexact
Use exact ZSTD transformation for category fit.
- drop_unused
If TRUE, remove categories with zero count from the displayed category table; summary and caveats still retain the omitted score-support warning.
Value
A named list with:
- category_table: category-level counts, expected counts, fit, and ZSTD
- threshold_table: model step/threshold estimates
- summary: one-row summary (usage and threshold monotonicity)
- caveats: structured score-support warning/review rows
- diagnostic_mode: character scalar carried from diagnostics$diagnostic_mode ("legacy", "both", or "marginal_fit"); used by downstream reporting helpers to pick the correct expected-count basis
- marginal_fit: list bundle from diagnostics$marginal_fit when strict marginal fit was computed, otherwise NULL. Carries the raw OverallRMSD / OverallMaxAbsStdResidual / per-cell tables that feed the MarginalOverallRMSD columns in summary.
Details
This helper provides category usage/fit statistics and threshold summaries
for reviewing score-category functioning.
The category usage portion is a global observed-score screen. In PCM fits
with a step_facet, threshold diagnostics should be interpreted within each
StepFacet rather than as one pooled whole-scale verdict.
Typical checks:
- sparse category usage (Count, ExpectedCount)
- category fit (Infit, Outfit, ZStd)
- threshold ordering within each StepFacet (threshold_table$Estimate, GapFromPrev)
Interpreting output
Start with summary:
- UsedCategories close to total Categories suggests that most score categories are represented in the observed data.
- A very small MinCategoryCount indicates potential instability.
- ThresholdMonotonic = FALSE indicates disordered thresholds within at least one threshold set. In PCM fits, inspect threshold_table by StepFacet before drawing scale-wide conclusions.
Then inspect:
- category_table for global category-level misfit/sparsity.
- threshold_table for adjacent-step gaps and ordering within each StepFacet.
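The per-StepFacet reading of threshold ordering can be sketched in base R. This is a minimal illustration, not the package's implementation: the data frame is a hypothetical stand-in for threshold_table with made-up values (only the column names StepFacet, Step, and Estimate come from this page), the helper is_ordered() is not an mfrmr function, and the strict-increase rule is an assumption.

```r
# Hypothetical stand-in for a threshold_table; values are made up for illustration.
thr <- data.frame(
  StepFacet = c("Item1", "Item1", "Item1", "Item2", "Item2", "Item2"),
  Step      = c("1-2", "2-3", "3-4", "1-2", "2-3", "3-4"),
  Estimate  = c(-1.8, 0.1, 1.7, -0.9, 1.2, 0.4),
  stringsAsFactors = FALSE
)

# Illustrative helper (not an mfrmr function): TRUE when the estimates of one
# threshold set are strictly increasing in the order given (assumed rule).
is_ordered <- function(est) all(diff(est) > 0)

# One verdict per StepFacet rather than a single pooled verdict.
ordered_by_facet <- vapply(
  split(thr$Estimate, thr$StepFacet),
  is_ordered,
  logical(1)
)
print(ordered_by_facet)

# Names of the threshold sets that are disordered.
disordered <- names(ordered_by_facet)[!ordered_by_facet]
print(disordered)
```

In this made-up example only one threshold set is disordered, so a pooled verdict would hide that the other set is ordered.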
Typical workflow
1. Fit model: fit_mfrm().
2. Build diagnostics: diagnose_mfrm().
3. Run rating_scale_table() and review summary().
4. Use plot() to visualize the category profile quickly.
Further guidance
For a plot-selection guide and a longer walkthrough, see
mfrmr_visual_diagnostics and
vignette("mfrmr-visual-diagnostics", package = "mfrmr").
Output columns
The category_table data.frame contains:
- Category
Score category value.
- Count, Percent
Observed count and percentage of total.
- AvgPersonMeasure
Mean person measure for respondents in this category.
- Infit, Outfit
Category-level fit statistics.
- InfitZSTD, OutfitZSTD
Standardized fit values.
- ExpectedCount, DiffCount
Expected count and observed-expected difference.
- LowCount
Logical; TRUE if the count is below the minimum threshold.
- InfitFlag, OutfitFlag, ZSTDFlag
Fit-based warning flags.
- ZeroCount, UnusedCategoryType, WeaklyIdentified, CategoryCaveat
Structured score-support caveats for retained zero-count categories.
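As a rough illustration of how count and fit flags of this kind can be derived, the sketch below builds comparable columns in base R. Everything here is an assumption made for illustration: the data values are made up, min_count = 10 is a hypothetical cutoff (this page does not state the package's minimum), and the 0.5-1.5 mean-square band is the one mentioned under References. The package's own flagging rules may differ.

```r
# Hypothetical category-level values; made up for illustration.
cat_tab <- data.frame(
  Category = 1:4,
  Count    = c(4, 120, 95, 0),
  Infit    = c(1.72, 0.98, 1.05, NA),
  Outfit   = c(1.60, 1.02, 0.44, NA)
)

# Assumed cutoff; not taken from the package.
min_count <- 10

# Illustrative helper (not an mfrmr function): TRUE when a mean-square value
# falls outside the band; missing values are never flagged.
outside_band <- function(msq, lo = 0.5, hi = 1.5) {
  !is.na(msq) & (msq < lo | msq > hi)
}

cat_tab$ZeroCount  <- cat_tab$Count == 0
cat_tab$LowCount   <- cat_tab$Count < min_count
cat_tab$InfitFlag  <- outside_band(cat_tab$Infit)
cat_tab$OutfitFlag <- outside_band(cat_tab$Outfit)
print(cat_tab)
```

A zero-count category has no fit statistics to flag, which is why it is reported through the separate ZeroCount column rather than through the fit flags.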
The threshold_table data.frame contains:
- Step
Step label (e.g., "1-2", "2-3").
- Estimate
Estimated threshold/step difficulty (logits).
- StepFacet
Threshold family identifier when the fit uses facet-specific threshold sets.
- GapFromPrev
Difference from the previous threshold within the same StepFacet when thresholds are facet-specific. Gaps below 1.4 logits may indicate category underuse; gaps above 5.0 may indicate wide unused regions (Linacre, 2002).
- ThresholdMonotonic
Logical flag repeated within each threshold set. For PCM fits, read this within StepFacet, not as a pooled item-bank verdict.
- LowerCategory, UpperCategory, WeaklyIdentified, ThresholdCaveat
Adjacent score-category support metadata. Thresholds adjacent to retained zero-count categories are flagged for cautious interpretation.
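The GapFromPrev screening described above can be sketched in base R as follows. This is a hedged illustration only: the threshold estimates are made up, the flag columns NarrowGap and WideGap are hypothetical names that do not exist in threshold_table, and treating the first threshold of each set as having a missing gap is an assumption about how the column is defined.

```r
# Hypothetical threshold estimates (logits) for two threshold sets; made-up values.
thr <- data.frame(
  StepFacet = c("A", "A", "A", "B", "B", "B"),
  Step      = c("1-2", "2-3", "3-4", "1-2", "2-3", "3-4"),
  Estimate  = c(-2.0, -1.0, 4.5, -3.0, -0.5, 1.5),
  stringsAsFactors = FALSE
)

# Gap from the previous threshold, restarted within each StepFacet.
# ave() applies the function per group and keeps the original row order.
thr$GapFromPrev <- ave(
  thr$Estimate, thr$StepFacet,
  FUN = function(x) c(NA, diff(x))
)

# Screening bounds quoted on this page (1.4 and 5.0 logits).
# NarrowGap and WideGap are illustrative column names only.
thr$NarrowGap <- !is.na(thr$GapFromPrev) & thr$GapFromPrev < 1.4
thr$WideGap   <- !is.na(thr$GapFromPrev) & thr$GapFromPrev > 5.0
print(thr)
```

Computing the gaps per StepFacet keeps the last threshold of one set from being compared with the first threshold of the next.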
References
Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43(4), 561-573. doi:10.1007/BF02293814
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149-174. doi:10.1007/BF02296272
Linacre, J. M. (2002). What do Infit and Outfit, mean-square and standardized mean? Rasch Measurement Transactions, 16(2), 878. (Source for the 0.5-1.5 mean-square acceptance band and the threshold-gap heuristics used in summary(t8)$summary.)
Wind, S. A. (2023). Detecting rating scale malfunctioning with the partial credit model and generalized partial credit model. Educational and Psychological Measurement, 83(5), 953-983. doi:10.1177/00131644221116292 (Recent simulation evidence on PCM- and GPCM-based rating-scale diagnostics; useful for interpreting the summary(t8)$summary flags in the bounded GPCM route.)