Estimate legacy-compatible bias/interaction terms iteratively
Source: R/api-tables.R
estimate_bias.Rd
Estimate legacy-compatible bias/interaction terms iteratively
Usage
estimate_bias(
fit,
diagnostics,
facet_a = NULL,
facet_b = NULL,
interaction_facets = NULL,
max_abs = 10,
omit_extreme = TRUE,
max_iter = 4,
tol = 0.001
)
Arguments
- fit
Output from fit_mfrm().
- diagnostics
Output from diagnose_mfrm().
- facet_a
First facet name.
- facet_b
Second facet name.
- interaction_facets
Character vector of two or more facets to model as one interaction effect. When supplied, this takes precedence over facet_a/facet_b.
- max_abs
Bound for absolute bias size.
- omit_extreme
Omit extreme-only elements.
- max_iter
Iteration cap.
- tol
Convergence tolerance.
Value
An object of class mfrm_bias with:
table: interaction rows with effect size, SE, screening t/p metadata, reporting-use flags, and fit columns
summary: compact summary statistics
chi_sq: fixed-effect chi-square style screening summary
facet_a, facet_b: first two analyzed facet names (legacy compatibility)
interaction_facets, interaction_order, interaction_mode: full interaction metadata
iteration: iteration history/metadata
Details
Bias (interaction) in MFRM refers to a systematic departure from the additive model: a specific rater-criterion (or higher-order) combination produces scores that are consistently higher or lower than predicted by the main effects alone. For example, Rater A might be unexpectedly harsh on Criterion 2 despite being lenient overall.
Mathematically, the bias term \(b_{jc}\) for rater \(j\) on criterion \(c\) modifies the linear predictor:
$$\eta_{njc} = \theta_n - \delta_j - \beta_c - b_{jc}$$
The function estimates \(b_{jc}\) from the residuals of the fitted (additive) model using iterative recalibration in a legacy-compatible style (Myford & Wolfe, 2003, 2004):
$$b_{jc} = \frac{\sum_n (X_{njc} - E_{njc})} {\sum_n \mathrm{Var}_{njc}}$$
Each iteration updates expected scores using the current bias
estimates, then re-computes the bias. Convergence is reached when
the maximum absolute change in bias estimates falls below tol.
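The recalibration loop can be illustrated for a single interaction cell with dichotomous scores. This is a minimal base-R sketch, not the package's implementation. The toy data, the dichotomous Rasch form, the treatment of the residual ratio as an incremental correction, and the sign orientation (bias added to the linear predictor, so a positive value means higher-than-expected scores) are all assumptions made for illustration.

```r
# Toy single-cell example (all values are made up for illustration).
theta <- c(-1, 0, 1, 2)   # person measures
delta <- 0.5              # rater severity
beta  <- 0                # criterion difficulty
x     <- c(1, 1, 1, 0)    # observed dichotomous scores in this cell

eta_additive <- theta - delta - beta  # additive-model linear predictor
bias       <- 0
tol        <- 0.001
max_iter   <- 10
max_change <- numeric(0)

for (iter in seq_len(max_iter)) {
  # Expected scores and variances given the current bias estimate.
  # Assumption: bias enters with a plus sign in this sketch.
  expected <- plogis(eta_additive + bias)
  variance <- expected * (1 - expected)

  # Residual-based correction: sum(X - E) / sum(Var).
  step <- sum(x - expected) / sum(variance)
  bias <- bias + step
  max_change <- c(max_change, abs(step))

  # Stop once the change in the bias estimate falls below tol.
  if (abs(step) < tol) break
}

bias        # estimated cell bias on the logit scale
max_change  # per-iteration change, analogous to an iteration history
```

In this toy cell the observed total exceeds the additive expectation, so the estimate is positive and the per-iteration change shrinks until it drops below tol.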
For two-way mode, use facet_a and facet_b (or interaction_facets with length 2).
For higher-order mode, provide interaction_facets with length >= 3.
What this screening means
estimate_bias() summarizes interaction departures from the additive MFRM.
It is best read as a targeted screening tool for potentially noteworthy
cells or facet combinations that may merit substantive review.
What this screening does not justify
t and Prob. are screening metrics, not formal inferential quantities.
A flagged interaction cell is not, by itself, proof of rater bias or construct-irrelevant variance.
Non-flagged cells should not be over-read as evidence that interaction effects are absent.
Interpreting output
Use summary for global magnitude, then inspect table for cell-level
interaction effects.
Prioritize rows with:
larger |Bias Size| (effect on logit scale; \(> 0.5\) logits is typically noteworthy, \(> 1.0\) is large)
larger |t| among the screening metrics (\(|t| \ge 2\) suggests a screen-positive interaction cell)
smaller Prob. among the screening metrics
A positive Obs-Exp Average means the cell produced higher scores
than the additive model predicts (unexpected leniency); negative
means unexpected harshness.
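The prioritization rules above can be sketched as a simple filter. The data frame below is a hypothetical stand-in shaped like bias$table, with made-up values; only the column names Pair, Bias Size, t, and Prob. are taken from the printed output in Examples, and the thresholds are the screening guidelines quoted above.

```r
# Hypothetical table shaped like bias$table (values are made up).
bias_table <- data.frame(
  Pair = c("R01 | Accuracy", "R01 | Content", "R02 | Language", "R03 | Content"),
  `Bias Size` = c(0.78, -0.28, -1.12, 0.25),
  t = c(2.4, -0.9, -3.1, 0.7),
  `Prob.` = c(0.017, 0.369, 0.002, 0.485),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Screening thresholds quoted in the text above.
noteworthy <- abs(bias_table[["Bias Size"]]) > 0.5
large      <- abs(bias_table[["Bias Size"]]) > 1.0
screen_pos <- abs(bias_table$t) >= 2

# Keep cells that are both noteworthy in size and screen-positive,
# ordered by decreasing |t|.
flagged <- bias_table[noteworthy & screen_pos, ]
flagged <- flagged[order(-abs(flagged$t)), ]
flagged$Pair
```

Rows that pass this filter are candidates for substantive review, not confirmed instances of bias.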
iteration helps verify whether iterative recalibration stabilized.
If the maximum change on the final iteration is still above tol,
consider increasing max_iter.
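A stabilization check along these lines might look like the following sketch. The iteration history here is a hypothetical data frame with made-up values; the column names Iteration and MaxLogitChange are borrowed from the printed summary in Examples, and the actual layout of bias$iteration is an assumption.

```r
# Hypothetical iteration history (values are made up).
iteration <- data.frame(
  Iteration = 1:4,
  MaxLogitChange = c(0.412, 0.057, 0.006, 0.002)
)
tol <- 0.001

# Compare the change on the final iteration against the tolerance.
final_change <- iteration$MaxLogitChange[nrow(iteration)]
converged <- final_change < tol

if (!converged) {
  message("Final change ", final_change, " is above tol = ", tol,
          "; consider a larger max_iter.")
}
```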
Typical workflow
Fit and diagnose model.
Run estimate_bias(...) for target interaction facets.
Review summary(bias) and bias$table.
Visualize/report via plot_bias_interaction() and build_fixed_reports().
Interpreting key output columns
In bias$table, the most-used columns are:
Bias Size: estimated interaction effect \(b_{jc}\) (logit scale)
t and Prob.: screening metrics, not formal inferential quantities
Obs-Exp Average: direction and practical size of observed-vs-expected gap on the raw-score metric
The chi_sq element provides a fixed-effect heterogeneity screen across all
interaction cells.
Recommended next step
Use plot_bias_interaction() to inspect the flagged cells visually, then
integrate the result with DFF (differential facet functioning) analysis,
linking, or substantive scoring review before making formal claims about
fairness or invariance.
Examples
toy <- load_mfrmr_data("example_bias")
fit <- fit_mfrm(toy, "Person", c("Rater", "Criterion"), "Score", method = "JML", maxit = 25)
diag <- diagnose_mfrm(fit, residual_pca = "none")
bias <- estimate_bias(fit, diag, facet_a = "Rater", facet_b = "Criterion", max_iter = 2)
summary(bias)
#> Many-Facet Rasch Bias Summary
#> Interaction facets: Rater x Criterion | Cells: 16
#> Order: 2 | Mode: pairwise
#> Mean |Bias|: 0.31 | Max |Bias|: 1.103 | Screen-positive (p <= 0.050): 0
#>
#> Fixed-effect chi-square
#> FixedChiSq FixedDF FixedProb InferenceTier SupportsFormalInference
#> NA 15 NA screening FALSE
#> FormalInferenceEligible PrimaryReportingEligible ReportingUse
#> FALSE FALSE screening_only
#> TestBasis InteractionFacets InteractionOrder
#> conditional plug-in heterogeneity screen Rater x Criterion 2
#> InteractionMode
#> pairwise
#>
#> Final iteration status
#> Iteration MaxScoreResidual MaxScoreResidualPct MaxScoreResidualCategories
#> 2 0 NA NA
#> MaxLogitChange BiasCells
#> 0 0
#>
#> Top |t| bias rows
#> Pair Rater Criterion Bias Size S.E. t Prob. Obs-Exp Average
#> R01 | Accuracy R01 Accuracy 0.776 NA NA NA NA
#> R01 | Content R01 Content -0.278 NA NA NA NA
#> R01 | Language R01 Language -0.184 NA NA NA NA
#> R01 | Organization R01 Organization -0.363 NA NA NA NA
#> R02 | Accuracy R02 Accuracy 0.246 NA NA NA NA
#> R02 | Content R02 Content -0.031 NA NA NA NA
#> R02 | Language R02 Language -0.209 NA NA NA NA
#> R02 | Organization R02 Organization -0.023 NA NA NA NA
#> R03 | Accuracy R03 Accuracy -0.055 NA NA NA NA
#> R03 | Content R03 Content 0.246 NA NA NA NA
#> AbsT
#> NA
#> NA
#> NA
#> NA
#> NA
#> NA
#> NA
#> NA
#> NA
#> NA
#>
#> Notes
#> - No immediate warnings from bias summary.
p_bias <- plot_bias_interaction(bias, draw = FALSE)
class(p_bias)
#> [1] "mfrm_plot_data" "list"