Estimate legacy-compatible bias/interaction terms iteratively

Usage

estimate_bias(
  fit,
  diagnostics,
  facet_a = NULL,
  facet_b = NULL,
  interaction_facets = NULL,
  max_abs = 10,
  omit_extreme = TRUE,
  max_iter = 4,
  tol = 0.001
)

Arguments

fit: Output from fit_mfrm().
diagnostics: Output from diagnose_mfrm().
facet_a: First facet name.
facet_b: Second facet name.
interaction_facets: Character vector of two or more facets to model as one interaction effect. When supplied, this takes precedence over facet_a/facet_b.
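max_abs: Bound on the absolute bias size (logit scale).
omit_extreme: Logical; if TRUE, omit extreme-only elements.
max_iter: Maximum number of recalibration iterations.
tol: Convergence tolerance on the maximum absolute change in bias estimates between iterations.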

Value

An object of class mfrm_bias with:

table: interaction rows with effect size, SE, screening t/p metadata, reporting-use flags, and fit columns
summary: compact summary statistics
chi_sq: fixed-effect chi-square style screening summary
facet_a, facet_b: first two analyzed facet names (legacy compatibility)
interaction_facets, interaction_order, interaction_mode: full interaction metadata
iteration: iteration history/metadata

Details

Bias (interaction) in MFRM refers to a systematic departure from the additive model: a specific rater-criterion (or higher-order) combination produces scores that are consistently higher or lower than predicted by the main effects alone. For example, Rater A might be unexpectedly harsh on Criterion 2 despite being lenient overall.

Mathematically, the bias term $b_{jc}$ for rater $j$ on criterion $c$ modifies the linear predictor:

$$\eta_{njc} = \theta_n - \delta_j - \beta_c - b_{jc}$$

The function estimates $b_{jc}$ from the residuals of the fitted (additive) model using iterative recalibration in a legacy-compatible style (Myford & Wolfe, 2003, 2004):

$$b_{jc} = \frac{\sum_n (X_{njc} - E_{njc})} {\sum_n \mathrm{Var}_{njc}}$$

Each iteration updates expected scores using the current bias estimates, then re-computes the bias. Convergence is reached when the maximum absolute change in bias estimates falls below tol.
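
The recalibration loop can be sketched in base R for a single interaction cell. This is a minimal illustration, not the package's implementation: it assumes dichotomous scores, treats max_abs as a simple clamp, and adds the bias to the linear predictor so that a positive residual sum gives a positive estimate. The helper name estimate_cell_bias() and the simulated data are hypothetical.

```r
# Illustrative sketch only: one rater-by-criterion cell, dichotomous scores.
# Assumption: the bias is ADDED to the additive-model logit (sign convention
# chosen for the sketch; the package's reported sign may differ).
estimate_cell_bias <- function(x, eta_additive, max_iter = 4, tol = 0.001,
                               max_abs = 10) {
  b <- 0
  changes <- numeric(0)
  for (iter in seq_len(max_iter)) {
    p <- plogis(eta_additive + b)        # expected scores under current bias
    info <- sum(p * (1 - p))             # summed model variance
    step <- sum(x - p) / info            # residual sum over variance sum
    b_new <- max(-max_abs, min(max_abs, b + step))  # clamp (assumed role of max_abs)
    changes <- c(changes, abs(b_new - b))
    b <- b_new
    if (changes[length(changes)] < tol) break       # converged
  }
  list(bias = b, iterations = length(changes), max_change = changes)
}

set.seed(1)
eta <- rnorm(200)                        # additive-model logits for the cell
x <- rbinom(200, 1, plogis(eta + 0.8))   # simulated scores with true bias 0.8
res <- estimate_cell_bias(x, eta, max_iter = 20)
res$bias
res$max_change
```

On the first pass the bias is zero, so the step equals the residual formula above; later passes refine it using updated expected scores. A cell containing only extreme scores has no finite solution in this sketch and would run to the max_abs bound.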

For two-way mode, use facet_a and facet_b (or interaction_facets with length 2).
For higher-order mode, provide interaction_facets with length >= 3.
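
The precedence rule between the two ways of naming facets can be sketched as follows. The helper resolve_interaction() is hypothetical and is not part of the package; the "pairwise" label matches the printed summary in the Examples, while "higher_order" is a placeholder label chosen for this sketch.

```r
# Hypothetical helper: mirrors the documented rule that interaction_facets,
# when supplied, takes precedence over facet_a / facet_b.
resolve_interaction <- function(facet_a = NULL, facet_b = NULL,
                                interaction_facets = NULL) {
  facets <- if (!is.null(interaction_facets)) {
    interaction_facets
  } else {
    c(facet_a, facet_b)
  }
  if (length(facets) < 2) stop("Supply at least two facets.")
  list(
    facets = facets,
    order = length(facets),
    # "higher_order" is an assumed label, not taken from the package
    mode = if (length(facets) == 2) "pairwise" else "higher_order"
  )
}

two_way <- resolve_interaction("Rater", "Criterion")
three_way <- resolve_interaction("Rater", "Criterion",
                                 interaction_facets = c("Rater", "Criterion", "Task"))
two_way$mode
three_way$order
```

Here "Task" is an invented third facet used only to show the length >= 3 case.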

What this screening means

estimate_bias() summarizes interaction departures from the additive MFRM. It is best read as a targeted screening tool for potentially noteworthy cells or facet combinations that may merit substantive review.

What this screening does not justify

t and Prob. are screening metrics, not formal inferential quantities.
A flagged interaction cell is not, by itself, proof of rater bias or construct-irrelevant variance.
Non-flagged cells should not be over-read as evidence that interaction effects are absent.

Interpreting output

Use summary for global magnitude, then inspect table for cell-level interaction effects.

Prioritize rows with:

larger |Bias Size| (effect on logit scale; $> 0.5$ logits is typically noteworthy, $> 1.0$ is large)
larger |t| among the screening metrics ($|t| \ge 2$ suggests a screen-positive interaction cell)
smaller Prob. among the screening metrics

A positive Obs-Exp Average means the cell produced higher scores than the additive model predicts (unexpected leniency); negative means unexpected harshness.

iteration helps verify whether iterative recalibration stabilized. If the maximum change on the final iteration is still above tol, consider increasing max_iter.
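
The prioritization rules above can be applied to the table with ordinary data-frame subsetting. The sketch below uses a mock data frame with the column names shown in the printed output; the rows and values are invented for illustration and do not come from a fitted model.

```r
# Mock table: column names follow the printed summary, values are invented.
tbl <- data.frame(
  Pair = c("R01 | Accuracy", "R01 | Content", "R02 | Language"),
  `Bias Size` = c(0.776, -0.278, -1.120),
  t = c(2.4, -0.9, -3.1),
  `Prob.` = c(0.018, 0.370, 0.002),
  check.names = FALSE,        # keep "Bias Size" and "Prob." as-is
  stringsAsFactors = FALSE
)

# Keep cells that are both noteworthy in size (> 0.5 logits) and
# screen-positive (|t| >= 2); which() drops rows where t is NA.
keep <- which(abs(tbl$`Bias Size`) > 0.5 & abs(tbl$t) >= 2)
flagged <- tbl[keep, ]
flagged <- flagged[order(-abs(flagged$t)), ]   # largest |t| first
flagged$Pair
```

With a real result, the same subsetting would be applied to bias$table. Wrapping the condition in which() matters because screening columns can be NA, as in the example output in the Examples section.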

Typical workflow

Fit and diagnose model.
Run estimate_bias(...) for target interaction facets.
Review summary(bias) and bias$table.
Visualize/report via plot_bias_interaction() and build_fixed_reports().

Interpreting key output columns

In bias$table, the most-used columns are:

Bias Size: estimated interaction effect $b_{jc}$ (logit scale)
t and Prob.: screening metrics, not formal inferential quantities
Obs-Exp Average: direction and practical size of observed-vs-expected gap on the raw-score metric

The chi_sq element provides a fixed-effect heterogeneity screen across all interaction cells.
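
As a rough guide to what a fixed-effect heterogeneity screen measures, the sketch below computes a standard inverse-variance homogeneity statistic with degrees of freedom equal to the number of cells minus one (consistent with FixedDF = 15 for 16 cells in the Examples). Whether the package computes chi_sq in exactly this way is an assumption; fixed_chisq() is a hypothetical helper and the inputs are invented.

```r
# Hypothetical helper: tests whether all cell biases share one common value,
# weighting each cell by its inverse squared standard error.
fixed_chisq <- function(bias, se) {
  w <- 1 / se^2
  stat <- sum(w * bias^2) - sum(w * bias)^2 / sum(w)
  df <- length(bias) - 1
  c(chisq = stat, df = df, p = pchisq(stat, df, lower.tail = FALSE))
}

# Invented bias sizes and standard errors for four cells
out <- fixed_chisq(bias = c(0.8, -0.3, -0.2, -0.3), se = rep(0.25, 4))
out
```

A large statistic relative to its degrees of freedom suggests the cells differ more than their standard errors would explain, which is a screening signal rather than a formal test.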

Recommended next step

Use plot_bias_interaction() to inspect the flagged cells visually, then integrate the result with differential facet functioning (DFF) analysis, linking, or substantive scoring review before making formal claims about fairness or invariance.

Examples

toy <- load_mfrmr_data("example_bias")
fit <- fit_mfrm(toy, "Person", c("Rater", "Criterion"), "Score", method = "JML", maxit = 25)
diag <- diagnose_mfrm(fit, residual_pca = "none")
bias <- estimate_bias(fit, diag, facet_a = "Rater", facet_b = "Criterion", max_iter = 2)
summary(bias)
#> Many-Facet Rasch Bias Summary
#>   Interaction facets: Rater x Criterion | Cells: 16
#>   Order: 2 | Mode: pairwise
#>   Mean |Bias|: 0.31 | Max |Bias|: 1.103 | Screen-positive (p <= 0.050): 0
#> 
#> Fixed-effect chi-square
#>  FixedChiSq FixedDF FixedProb InferenceTier SupportsFormalInference
#>          NA      15        NA     screening                   FALSE
#>  FormalInferenceEligible PrimaryReportingEligible   ReportingUse
#>                    FALSE                    FALSE screening_only
#>                                 TestBasis InteractionFacets InteractionOrder
#>  conditional plug-in heterogeneity screen Rater x Criterion                2
#>  InteractionMode
#>         pairwise
#> 
#> Final iteration status
#>  Iteration MaxScoreResidual MaxScoreResidualPct MaxScoreResidualCategories
#>          2                0                  NA                         NA
#>  MaxLogitChange BiasCells
#>               0         0
#> 
#> Top |t| bias rows
#>                Pair Rater    Criterion Bias Size S.E.  t Prob. Obs-Exp Average
#>      R01 | Accuracy   R01     Accuracy     0.776   NA NA    NA              NA
#>       R01 | Content   R01      Content    -0.278   NA NA    NA              NA
#>      R01 | Language   R01     Language    -0.184   NA NA    NA              NA
#>  R01 | Organization   R01 Organization    -0.363   NA NA    NA              NA
#>      R02 | Accuracy   R02     Accuracy     0.246   NA NA    NA              NA
#>       R02 | Content   R02      Content    -0.031   NA NA    NA              NA
#>      R02 | Language   R02     Language    -0.209   NA NA    NA              NA
#>  R02 | Organization   R02 Organization    -0.023   NA NA    NA              NA
#>      R03 | Accuracy   R03     Accuracy    -0.055   NA NA    NA              NA
#>       R03 | Content   R03      Content     0.246   NA NA    NA              NA
#>  AbsT
#>    NA
#>    NA
#>    NA
#>    NA
#>    NA
#>    NA
#>    NA
#>    NA
#>    NA
#>    NA
#> 
#> Notes
#>  - No immediate warnings from bias summary.
p_bias <- plot_bias_interaction(bias, draw = FALSE)
class(p_bias)
#> [1] "mfrm_plot_data" "list"