Estimate legacy-compatible bias/interaction terms iteratively

Usage

estimate_bias(
  fit,
  diagnostics,
  facet_a = NULL,
  facet_b = NULL,
  interaction_facets = NULL,
  max_abs = 10,
  omit_extreme = TRUE,
  max_iter = 4,
  tol = 0.001
)

Arguments

fit

Output from fit_mfrm().

diagnostics

Output from diagnose_mfrm().

facet_a

Name of the first facet in a two-way interaction.

facet_b

Name of the second facet in a two-way interaction.

interaction_facets

Character vector of two or more facets to model as one interaction effect. When supplied, this takes precedence over facet_a/facet_b.

max_abs

Upper bound on the absolute bias size, in logits.

omit_extreme

Logical; if TRUE, omit extreme-only elements.

max_iter

Maximum number of recalibration iterations.

tol

Convergence tolerance. Iteration stops once the maximum absolute change in bias estimates falls below this value.

Value

An object of class mfrm_bias with:

  • table: interaction rows with effect size, SE, screening t/p metadata, reporting-use flags, and fit columns

  • summary: compact summary statistics

  • chi_sq: fixed-effect chi-square style screening summary

  • facet_a, facet_b: first two analyzed facet names (legacy compatibility)

  • interaction_facets, interaction_order, interaction_mode: full interaction metadata

  • iteration: iteration history/metadata

Details

Bias (interaction) in the many-facet Rasch model (MFRM) refers to a systematic departure from the additive model: a specific rater-criterion (or higher-order) combination produces scores that are consistently higher or lower than predicted by the main effects alone. For example, Rater A might be unexpectedly harsh on Criterion 2 despite being lenient overall.

Mathematically, the bias term \(b_{jc}\) for rater \(j\) on criterion \(c\) modifies the linear predictor:

$$\eta_{njc} = \theta_n - \delta_j - \beta_c - b_{jc}$$

The function estimates \(b_{jc}\) from the residuals of the fitted (additive) model using iterative recalibration in a legacy-compatible style (Myford & Wolfe, 2003, 2004):

$$b_{jc} = \frac{\sum_n (X_{njc} - E_{njc})} {\sum_n \mathrm{Var}_{njc}}$$

Each iteration updates expected scores using the current bias estimates, then re-computes the bias. Convergence is reached when the maximum absolute change in bias estimates falls below tol.

  • For two-way mode, use facet_a and facet_b (or interaction_facets with length 2).

  • For higher-order mode, provide interaction_facets with length >= 3.
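The recalibration loop described above can be sketched for a single interaction cell. This is a minimal illustration, not the package's implementation. It assumes dichotomous scores, treats eta0 (a hypothetical vector of additive-model logits) as fixed, accumulates each step into the running estimate, and adds the bias to the predictor so that the step has the same sign as the residual sum. The helper name estimate_cell_bias() is made up for this sketch.

```r
# Toy sketch for ONE interaction cell with dichotomous (0/1) scores.
# Assumptions, not taken from the package:
#   - eta0 holds the additive-model logits for the observations in the cell
#   - the bias is added to eta0, so a positive residual sum raises the estimate
estimate_cell_bias <- function(x, eta0, max_iter = 4, tol = 0.001) {
  b <- 0
  max_change <- numeric(0)
  for (iter in seq_len(max_iter)) {
    p <- plogis(eta0 + b)                  # expected scores under current bias
    step <- sum(x - p) / sum(p * (1 - p))  # residual sum over summed variance
    b <- b + step                          # recalibrate the bias estimate
    max_change <- c(max_change, abs(step))
    if (abs(step) < tol) break             # converged: change fell below tol
  }
  list(bias = b, max_change = max_change)
}

# Simulated cell with a true shift of 0.8 logits (illustrative data only)
set.seed(1)
eta0 <- rnorm(200)
x <- rbinom(200, 1, plogis(eta0 + 0.8))
res <- estimate_cell_bias(x, eta0, max_iter = 10)
res$bias
res$max_change
```

The max_change vector plays the role of the iteration history: if its last value is still above tol, the loop stopped at the iteration cap rather than at convergence.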

What this screening means

estimate_bias() summarizes interaction departures from the additive MFRM. It is best read as a targeted screening tool for potentially noteworthy cells or facet combinations that may merit substantive review.

What this screening does not justify

  • t and Prob. are screening metrics, not formal inferential quantities.

  • A flagged interaction cell is not, by itself, proof of rater bias or construct-irrelevant variance.

  • Non-flagged cells should not be over-read as evidence that interaction effects are absent.

Interpreting output

Use summary for global magnitude, then inspect table for cell-level interaction effects.

Prioritize rows with:

  • larger |Bias Size| (effect on logit scale; \(> 0.5\) logits is typically noteworthy, \(> 1.0\) is large)

  • larger |t| among the screening metrics (\(|t| \ge 2\) suggests a screen-positive interaction cell)

  • smaller Prob. among the screening metrics
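The prioritization rules above can be applied with ordinary data-frame subsetting. The sketch below uses a made-up table whose column names mimic the printed summary (Bias Size, t, Prob.); whether bias$table uses exactly these names is an assumption, so check names(bias$table) before adapting it.

```r
# Made-up rows for illustration; the real bias$table may carry additional
# or differently named columns.
tab <- data.frame(
  Pair = c("R01 | Accuracy", "R01 | Content", "R02 | Language"),
  `Bias Size` = c(0.78, -0.28, -1.12),
  t = c(2.4, -0.9, -1.6),
  `Prob.` = c(0.02, 0.37, 0.11),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

size_flag <- abs(tab[["Bias Size"]]) > 0.5  # noteworthy on the logit scale
t_flag <- abs(tab[["t"]]) >= 2              # screen-positive by |t|

# Keep rows meeting either rule, largest absolute bias first
flagged <- tab[size_flag | t_flag, ]
flagged <- flagged[order(-abs(flagged[["Bias Size"]])), ]
flagged$Pair
```

Combining the rules with | keeps a large effect that has a modest t; use & instead to require both conditions.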

A positive Obs-Exp Average means the cell produced higher scores than the additive model predicts (unexpected leniency); negative means unexpected harshness.

iteration helps verify whether iterative recalibration stabilized. If the maximum change on the final iteration is still above tol, consider increasing max_iter.
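A stabilization check along these lines might look as follows. It assumes the iteration history is a data frame with a MaxLogitChange column, as suggested by the "Final iteration status" block in the printed summary; the helper bias_converged() and the history values are hypothetical.

```r
# Hypothetical helper: TRUE when the last recorded change is below tol.
# Assumes a data frame with a MaxLogitChange column (one row per iteration).
bias_converged <- function(iteration, tol = 0.001) {
  final_change <- tail(iteration$MaxLogitChange, 1)
  isTRUE(abs(final_change) < tol)
}

# Made-up history for illustration only
history <- data.frame(
  Iteration = 1:4,
  MaxLogitChange = c(0.42, 0.08, 0.011, 0.004)
)

bias_converged(history)              # final change 0.004 is not below 0.001
bias_converged(history, tol = 0.01)  # a looser tolerance is satisfied
```

When the check fails at the default tolerance, rerunning estimate_bias() with a larger max_iter is the adjustment suggested above.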

Typical workflow

  1. Fit the model with fit_mfrm() and diagnose it with diagnose_mfrm().

  2. Run estimate_bias(...) for target interaction facets.

  3. Review summary(bias) and bias$table.

  4. Visualize/report via plot_bias_interaction() and build_fixed_reports().

Interpreting key output columns

In bias$table, the most-used columns are:

  • Bias Size: estimated interaction effect \(b_{jc}\) (logit scale)

  • t and Prob.: screening metrics, not formal inferential quantities

  • Obs-Exp Average: direction and practical size of observed-vs-expected gap on the raw-score metric

The chi_sq element provides a fixed-effect heterogeneity screen across all interaction cells.

Use plot_bias_interaction() to inspect the flagged cells visually, then integrate the result with differential facet functioning (DFF) analysis, linking, or substantive scoring review before making formal claims about fairness or invariance.

Examples

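# Load the bundled example data, then fit and diagnose the additive model
toy <- load_mfrmr_data("example_bias")
fit <- fit_mfrm(toy, "Person", c("Rater", "Criterion"), "Score", method = "JML", maxit = 25)
diag <- diagnose_mfrm(fit, residual_pca = "none")
# Screen the Rater x Criterion interaction with at most two recalibration iterations
bias <- estimate_bias(fit, diag, facet_a = "Rater", facet_b = "Criterion", max_iter = 2)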
summary(bias)
#> Many-Facet Rasch Bias Summary
#>   Interaction facets: Rater x Criterion | Cells: 16
#>   Order: 2 | Mode: pairwise
#>   Mean |Bias|: 0.31 | Max |Bias|: 1.103 | Screen-positive (p <= 0.050): 0
#> 
#> Fixed-effect chi-square
#>  FixedChiSq FixedDF FixedProb InferenceTier SupportsFormalInference
#>          NA      15        NA     screening                   FALSE
#>  FormalInferenceEligible PrimaryReportingEligible   ReportingUse
#>                    FALSE                    FALSE screening_only
#>                                 TestBasis InteractionFacets InteractionOrder
#>  conditional plug-in heterogeneity screen Rater x Criterion                2
#>  InteractionMode
#>         pairwise
#> 
#> Final iteration status
#>  Iteration MaxScoreResidual MaxScoreResidualPct MaxScoreResidualCategories
#>          2                0                  NA                         NA
#>  MaxLogitChange BiasCells
#>               0         0
#> 
#> Top |t| bias rows
#>                Pair Rater    Criterion Bias Size S.E.  t Prob. Obs-Exp Average
#>      R01 | Accuracy   R01     Accuracy     0.776   NA NA    NA              NA
#>       R01 | Content   R01      Content    -0.278   NA NA    NA              NA
#>      R01 | Language   R01     Language    -0.184   NA NA    NA              NA
#>  R01 | Organization   R01 Organization    -0.363   NA NA    NA              NA
#>      R02 | Accuracy   R02     Accuracy     0.246   NA NA    NA              NA
#>       R02 | Content   R02      Content    -0.031   NA NA    NA              NA
#>      R02 | Language   R02     Language    -0.209   NA NA    NA              NA
#>  R02 | Organization   R02 Organization    -0.023   NA NA    NA              NA
#>      R03 | Accuracy   R03     Accuracy    -0.055   NA NA    NA              NA
#>       R03 | Content   R03      Content     0.246   NA NA    NA              NA
#>  AbsT
#>    NA
#>    NA
#>    NA
#>    NA
#>    NA
#>    NA
#>    NA
#>    NA
#>    NA
#>    NA
#> 
#> Notes
#>  - No immediate warnings from bias summary.
p_bias <- plot_bias_interaction(bias, draw = FALSE)
class(p_bias)
#> [1] "mfrm_plot_data" "list"