Integrates convergence, model fit, reliability, separation, element misfit, unexpected responses, category structure, connectivity, inter-rater agreement, and DIF/bias into a single pass/warn/fail report.

Usage

run_qc_pipeline(
  fit,
  diagnostics = NULL,
  threshold_profile = "standard",
  thresholds = NULL,
  rater_facet = NULL,
  include_bias = TRUE,
  bias_results = NULL
)

Arguments

fit

Output from fit_mfrm().

diagnostics

Output from diagnose_mfrm(). Computed automatically if NULL.

threshold_profile

Threshold preset: "strict", "standard" (default), or "lenient".

thresholds

Named list to override individual thresholds.

rater_facet

Character name of the rater facet for inter-rater check (auto-detected if NULL).

include_bias

If TRUE and bias results are available in diagnostics, run the DIF/bias check.

bias_results

Optional pre-computed bias results from estimate_bias().

Value

Object of class mfrm_qc_pipeline with verdicts, overall status, details, and recommendations.

Details

The pipeline evaluates 10 quality checks and assigns a verdict (Pass / Warn / Fail) to each. The overall status is the most severe verdict across all checks. Diagnostics are computed automatically via diagnose_mfrm() if not supplied.
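
The "most severe verdict" rule can be illustrated with a few lines of base R. This is a minimal sketch, not the package's implementation: the helper name overall_status() is hypothetical, and ignoring verdicts outside Pass / Warn / Fail (such as skipped checks) is an assumption made for the illustration.

```r
# Severity order assumed for this sketch: Pass < Warn < Fail.
severity <- c("Pass", "Warn", "Fail")

# Hypothetical helper: collapse per-check verdicts into one headline status.
overall_status <- function(verdicts) {
  # Assumption: anything outside the three graded verdicts is ignored.
  graded <- verdicts[verdicts %in% severity]
  if (length(graded) == 0) return(NA_character_)
  # match() maps each verdict to its rank; the largest rank wins.
  severity[max(match(graded, severity))]
}

overall_status(c("Pass", "Pass", "Warn"))          # "Warn"
overall_status(c("Pass", "Warn", "Fail", "Pass"))  # "Fail"
```

A single Fail is therefore enough to make the overall status Fail, regardless of how many checks pass.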

Reliability and separation are used here as QC signals. In mfrmr, Reliability / Separation are model-based facet indices and RealReliability / RealSeparation provide more conservative lower bounds. For MML, these rely on model-based ModelSE values for non-person facets; for JML, they remain exploratory approximations.

Three threshold presets are available via threshold_profile:

Aspect               strict   standard   lenient
Global fit warn      1.3      1.5        1.7
Global fit fail      1.5      2.0        2.5
Reliability pass     0.90     0.80       0.70
Separation pass      3.0      2.0        1.5
Misfit warn (pct)    3        5          10
Unexpected fail      3        5          10
Min cat count        15       10         5
Agreement pass       60       50         40
Bias fail (pct)      5        10         15

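Individual thresholds can be overridden via the thresholds argument, a named list keyed by the internal threshold names that correspond to the rows above. Thresholds that are not overridden keep the value from the selected preset, and the effective thresholds are recorded in $config.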
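
Overriding a single threshold on top of a preset behaves like merging two named lists. The sketch below shows the idea with utils::modifyList(). The list names (global_fit_warn, reliability_pass, and so on) are hypothetical placeholders, not the package's internal threshold names, and the merge itself is an assumption about how such overrides are typically applied.

```r
# Values taken from the "standard" column of the preset table.
# The element names are hypothetical; the package's internal names may differ.
standard <- list(
  global_fit_warn  = 1.5,
  global_fit_fail  = 2.0,
  reliability_pass = 0.80,
  separation_pass  = 2.0
)

# Tighten only the reliability threshold.
overrides <- list(reliability_pass = 0.85)

# Entries named in `overrides` replace the preset; all others are kept.
effective <- utils::modifyList(standard, overrides)

effective$reliability_pass  # 0.85
effective$separation_pass   # 2
```

After a real run, qc$config is the place to confirm which thresholds were actually applied.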

QC checks

The 10 checks are:

  1. Convergence: Did the model converge?

  2. Global fit: Infit/Outfit MnSq within the current review band.

  3. Reliability: Minimum non-person facet model reliability index.

  4. Separation: Minimum non-person facet model separation index.

  5. Element misfit: Percentage of elements with Infit/Outfit outside the current review band.

  6. Unexpected responses: Percentage of observations with large standardized residuals.

  7. Category structure: Minimum category count and threshold ordering.

  8. Connectivity: All observations in a single connected subset.

  9. Inter-rater agreement: Exact agreement percentage for the rater facet (if applicable).

  10. Functioning/Bias screen: Percentage of interaction cells that cross the screening threshold (if interaction results are available).
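
Several of these checks grade a single number against a pass cutoff and a warn cutoff. The following sketch shows that banding for "higher is better" metrics, using the standard-profile cutoffs printed in the example output (Pass>=0.80, Warn>=0.50 for reliability; Pass>=50%, Warn>=30% for agreement). The function name band_higher_is_better() is hypothetical and the logic is an illustration, not the package's code.

```r
# Hypothetical helper: grade a "higher is better" metric against two cutoffs.
band_higher_is_better <- function(value, pass_at, warn_at) {
  # Assumption: a missing value cannot be graded.
  if (is.na(value)) return(NA_character_)
  if (value >= pass_at) {
    "Pass"
  } else if (value >= warn_at) {
    "Warn"
  } else {
    "Fail"
  }
}

# Reliability from the example run: 0.953 against Pass>=0.80, Warn>=0.50.
band_higher_is_better(0.953, pass_at = 0.80, warn_at = 0.50)  # "Pass"

# Exact agreement from the example run: 36.2% against Pass>=50, Warn>=30.
band_higher_is_better(36.2, pass_at = 50, warn_at = 30)       # "Warn"
```

Checks where lower is better (element misfit, unexpected responses, bias screen) would use the mirrored comparison.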

Interpreting output

  • $overall: character string "Pass", "Warn", or "Fail".

  • $verdicts: tibble with columns Check, Verdict, Value, Threshold, and Detail, one row per check.

  • $details: character vector of human-readable detail strings.

  • $raw_details: named list of per-check numeric details for programmatic access.

  • $recommendations: character vector of actionable suggestions for checks that did not pass.

  • $config: records the threshold profile and effective thresholds.
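
For programmatic use, the verdict table can be filtered like any data frame. The sketch below uses a small stand-in for qc$verdicts (three rows copied from the example output on this page) so that it runs without a fitted model; with a real result, the same subsetting would apply to qc$verdicts directly.

```r
# Stand-in for qc$verdicts, limited to the Check and Verdict columns.
verdicts <- data.frame(
  Check   = c("Convergence", "Global Fit", "Inter-rater Agreement"),
  Verdict = c("Fail", "Pass", "Warn"),
  stringsAsFactors = FALSE
)

# Keep only the checks that need attention.
flagged <- verdicts[verdicts$Verdict != "Pass", ]

flagged$Check  # "Convergence" "Inter-rater Agreement"
```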

Typical workflow

  1. Fit a model: fit <- fit_mfrm(...).

  2. Optionally compute diagnostics and bias: diag <- diagnose_mfrm(fit); bias <- estimate_bias(fit, diag, ...).

  3. Run the pipeline: qc <- run_qc_pipeline(fit, diag, bias_results = bias).

  4. Check qc$overall for the headline verdict.

  5. Review qc$verdicts for per-check details.

  6. Follow qc$recommendations for remediation.

  7. Visualize with plot_qc_pipeline().

Examples

toy <- load_mfrmr_data("study1")
fit <- fit_mfrm(toy, "Person", c("Rater", "Criterion"), "Score",
                method = "JML", maxit = 25)
#> Warning: Optimizer did not fully converge (code = 1). Consider increasing maxit (current: 25) or relaxing reltol (current: 1e-06).
qc <- run_qc_pipeline(fit)
qc
#> --- QC Pipeline ---
#> Overall: Fail 
#> 
#>   [FAIL] Convergence               Model did NOT converge
#>   [PASS] Global Fit                Global Infit=0.997, Outfit=0.973
#>   [PASS] Reliability               Min non-person model reliability = 0.953
#>   [PASS] Separation                Min non-person model separation = 4.518
#>   [FAIL] Element Misfit            144 of 328 elements misfitting (43.9%)
#>   [FAIL] Unexpected Responses      5.4% unexpected responses
#>   [PASS] Category Structure        Thresholds ordered, min category count = 215
#>   [PASS] Connectivity              1 disjoint subset(s)
#>   [WARN] Inter-rater Agreement     Exact agreement = 36.2%
#>   [FAIL] Functioning/Bias Screen   80.0% of screened interactions crossed |screening t| > 2
#> 
#> Recommendations:
#>   - Model did not converge. Consider increasing maxit, simplifying the model, or checking data quality. 
#>   - Excessive element misfit detected. Review individual element fit statistics. 
#>   - High unexpected response rate. Inspect unexpected_response_table() for patterns. 
#>   - Many interaction cells were screen-positive. Review estimate_bias() or analyze_dff() before making substantive bias claims. 
summary(qc)
#> --- QC Pipeline Summary ---
#> Overall: Fail 
#> Pass: 5 | Warn: 1 | Fail: 4 | Skip: 0
#> 
#>                    Check Verdict                     Value
#>              Convergence    Fail                     FALSE
#>               Global Fit    Pass   Infit=1.00, Outfit=0.97
#>              Reliability    Pass                      0.95
#>               Separation    Pass                      4.52
#>           Element Misfit    Fail           144/328 (43.9%)
#>     Unexpected Responses    Fail                      5.4%
#>       Category Structure    Pass Ordered=Yes, MinCount=215
#>             Connectivity    Pass                         1
#>    Inter-rater Agreement    Warn                     36.2%
#>  Functioning/Bias Screen    Fail                     80.0%
#>                Threshold
#>         Converged = TRUE
#>             [0.50, 1.50]
#>   Pass>=0.80, Warn>=0.50
#>   Pass>=2.00, Warn>=1.00
#>       Pass<=5%, Fail>15%
#>        Pass<=2%, Fail>5%
#>      Ordered + count>=10
#>  Pass=1, Warn=2, Fail>=3
#>     Pass>=50%, Warn>=30%
#>       Pass<=0%, Fail>10%
#>                                                    Detail
#>                                    Model did NOT converge
#>                          Global Infit=0.997, Outfit=0.973
#>                  Min non-person model reliability = 0.953
#>                   Min non-person model separation = 4.518
#>                    144 of 328 elements misfitting (43.9%)
#>                                 5.4% unexpected responses
#>              Thresholds ordered, min category count = 215
#>                                      1 disjoint subset(s)
#>                                   Exact agreement = 36.2%
#>  80.0% of screened interactions crossed |screening t| > 2
#> 
#> Recommendations:
#>   - Model did not converge. Consider increasing maxit, simplifying the model, or checking data quality. 
#>   - Excessive element misfit detected. Review individual element fit statistics. 
#>   - High unexpected response rate. Inspect unexpected_response_table() for patterns. 
#>   - Many interaction cells were screen-positive. Review estimate_bias() or analyze_dff() before making substantive bias claims. 
qc$verdicts
#> # A tibble: 10 × 5
#>    Check                   Verdict Value                     Threshold    Detail
#>    <chr>                   <chr>   <chr>                     <chr>        <chr> 
#>  1 Convergence             Fail    FALSE                     Converged =… Model…
#>  2 Global Fit              Pass    Infit=1.00, Outfit=0.97   [0.50, 1.50] Globa…
#>  3 Reliability             Pass    0.95                      Pass>=0.80,… Min n…
#>  4 Separation              Pass    4.52                      Pass>=2.00,… Min n…
#>  5 Element Misfit          Fail    144/328 (43.9%)           Pass<=5%, F… 144 o…
#>  6 Unexpected Responses    Fail    5.4%                      Pass<=2%, F… 5.4% …
#>  7 Category Structure      Pass    Ordered=Yes, MinCount=215 Ordered + c… Thres…
#>  8 Connectivity            Pass    1                         Pass=1, War… 1 dis…
#>  9 Inter-rater Agreement   Warn    36.2%                     Pass>=50%, … Exact…
#> 10 Functioning/Bias Screen Fail    80.0%                     Pass<=0%, F… 80.0%…