Compares facet estimates across two or more calibration waves to identify elements whose difficulty/severity has shifted beyond acceptable thresholds. Useful for monitoring rater drift over time or checking the stability of item banks.
Usage
detect_anchor_drift(
fits,
facets = NULL,
drift_threshold = 0.5,
flag_se_ratio = 2,
reference = 1L,
include_person = FALSE
)
# S3 method for class 'mfrm_anchor_drift'
print(x, ...)
# S3 method for class 'mfrm_anchor_drift'
summary(object, ...)
# S3 method for class 'summary.mfrm_anchor_drift'
print(x, ...)
Arguments
- fits
Named list of mfrm_fit objects (e.g., list(Year1 = fit1, Year2 = fit2)).
- facets
Character vector of facets to compare (default: all non-Person facets).
- drift_threshold
Absolute drift threshold for flagging (logits, default 0.5).
- flag_se_ratio
Drift/SE ratio threshold for flagging (default 2.0).
- reference
Index or name of the reference fit (default: first).
- include_person
Logical; whether to include person estimates in the comparison (default FALSE).
- x
An mfrm_anchor_drift object.
- ...
Ignored.
- object
An mfrm_anchor_drift object (for summary).
Value
Object of class mfrm_anchor_drift with components:
- drift_table
Tibble of element-level drift statistics.
- summary
Drift summary aggregated by facet and wave.
- common_elements
Tibble of pairwise common-element counts.
- common_by_facet
Tibble of retained common-element counts by facet.
- config
List of analysis configuration.
Details
For each non-reference wave, the function extracts facet-level estimates
using make_anchor_table() and computes the element-by-element difference
against the reference wave. Standard errors are obtained from
diagnose_mfrm() applied to each fit. Only elements common to both the
reference and a comparison wave are included. Before reporting drift, the
function removes the weighted common-element link offset between the two
waves so that Drift represents residual instability rather than the
overall shift between calibrations. The function also records how many
common elements survive the screening step within each linking facet and
treats fewer than 5 retained common elements per facet as thin support.
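An element is flagged when either condition is met:
$$|\Delta_e| > \texttt{drift\_threshold}$$
$$|\Delta_e / SE_{\Delta_e}| > \texttt{flag\_se\_ratio}$$
Here \(\Delta_e\) is the drift of element e after the link offset has been removed, and \(SE_{\Delta_e}\) is its standard error. Because either criterion suffices, the rule catches small but precisely estimated shifts that the absolute threshold alone would miss. A large drift with a low drift/SE ratio is still flagged, so read Drift alongside Drift_SE_Ratio before acting on an imprecise estimate.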
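As a rough illustration of the two criteria, the base-R sketch below recomputes the drift statistics for three raters listed in the Examples output. It assumes that Drift equals Wave_Est minus Ref_Est minus LinkOffset and that the combined SE is the root sum of squares of SE_Ref and SE_Wave. Both relations are inferred from the printed values, not taken from the package source, and flag_drift() is a hypothetical helper that is not part of the package.

```r
# Hypothetical helper (not a package function): applies both flagging criteria.
flag_drift <- function(drift, se, drift_threshold = 0.5, flag_se_ratio = 2) {
  ratio <- abs(drift) / se
  abs(drift) > drift_threshold | ratio > flag_se_ratio
}

# Values copied from the Examples output (raters R07, R05, R10).
ref_est     <- c(R07 = 0.807,   R05 = -0.273, R10 = 0.303)
wave_est    <- c(R07 = -0.5009, R05 = 0.3491, R10 = 0.2999)
se_ref      <- c(R07 = 0.252,   R05 = 0.114,  R10 = 0.107)
se_wave     <- c(R07 = 0.0759,  R05 = 0.0736, R10 = 0.0732)
link_offset <- 0.271

# Assumed relations, inferred from the printed table rather than the source code.
drift <- wave_est - ref_est - link_offset
se    <- sqrt(se_ref^2 + se_wave^2)

flags <- flag_drift(drift, se)
print(round(drift, 3))
print(round(abs(drift) / se, 2))
print(flags)

# Made-up values for contrast: small drift, modest precision, so no flag.
print(flag_drift(0.2, 0.15))
```

Under these assumptions, R05 and R10 stay below 0.5 logits in absolute drift but exceed a drift/SE ratio of 2, so they are flagged through the second criterion alone. The recomputed ratios can differ from the printed ones in the last digit because the inputs are already rounded.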
When facets is NULL, all non-Person facets are compared. Providing a
subset (e.g., facets = "Criterion") restricts comparison to those facets
only.
Which function should I use?
Use anchor_to_baseline() when your starting point is raw new data plus a single baseline fit.
Use detect_anchor_drift() when you already have multiple fitted waves and want a reference-versus-wave comparison.
Use build_equating_chain() when the waves form a sequence and you need cumulative linking offsets.
Interpreting output
$drift_table: one row per element x wave combination, with columns Facet, Level, Wave, Ref_Est, Wave_Est, LinkOffset, Drift, SE_Ref, SE_Wave, SE, Drift_SE_Ratio, LinkSupportAdequate, and Flag. Large drift signals instability after alignment to the common-element link.
$summary: aggregated statistics by facet and wave: number of elements, mean/max absolute drift, and count of flagged elements.
$common_elements: pairwise common-element counts in tidy table form. Small overlap weakens the comparison and results should be interpreted cautiously.
$common_by_facet: retained common-element counts by linking facet for each reference-vs-wave comparison. LinkSupportAdequate = FALSE means the link rests on fewer than 5 retained common elements in at least one facet.
$config: records the analysis parameters for reproducibility.
A practical reading order is summary(drift) first, then drift$drift_table, then drift$common_by_facet if overlap looks thin.
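One way to triage the element-level table is sketched below in base R. The data frame is a mock stand-in for drift$drift_table, holding four rows and a subset of columns copied from the Examples output. The RatioOnly column is a hypothetical addition for illustration, and the 0.5 cut-off mirrors the default drift_threshold.

```r
# Mock stand-in for drift$drift_table (values copied from the Examples output).
drift_table <- data.frame(
  Facet = "Rater",
  Level = c("R05", "R07", "R10", "R09"),
  Wave  = "Wave2",
  Drift = c(0.351, -1.580, -0.274, 1.225),
  Drift_SE_Ratio = c(2.58, 6.01, 2.12, 7.37),
  Flag  = TRUE,
  stringsAsFactors = FALSE
)

# Keep flagged rows and sort by absolute drift, largest first.
flagged <- drift_table[drift_table$Flag, ]
flagged <- flagged[order(-abs(flagged$Drift)), ]

# Hypothetical helper column: TRUE when the flag rests on the SE ratio alone.
flagged$RatioOnly <- abs(flagged$Drift) <= 0.5

print(flagged[, c("Level", "Drift", "Drift_SE_Ratio", "RatioOnly")])
```

Rows where RatioOnly is TRUE are small shifts that are flagged only because they are precisely estimated, which may warrant a different follow-up than a shift of more than one logit.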
Typical workflow
Fit separate models for each administration wave.
Combine into a named list: fits <- list(Spring = fit_s, Fall = fit_f).
Call drift <- detect_anchor_drift(fits).
Review summary(drift) and plot_anchor_drift(drift).
Flagged elements may need to be removed from anchor sets or investigated for substantive causes (e.g., rater re-training).
Examples
d1 <- load_mfrmr_data("study1")
d2 <- load_mfrmr_data("study2")
fit1 <- fit_mfrm(d1, "Person", c("Rater", "Criterion"), "Score",
method = "JML", maxit = 15)
#> Warning: Optimizer did not fully converge (code = 1). Consider increasing maxit (current: 15) or relaxing reltol (current: 1e-06).
fit2 <- fit_mfrm(d2, "Person", c("Rater", "Criterion"), "Score",
method = "JML", maxit = 15)
#> Warning: Optimizer did not fully converge (code = 1). Consider increasing maxit (current: 15) or relaxing reltol (current: 1e-06).
drift <- detect_anchor_drift(list(Wave1 = fit1, Wave2 = fit2))
summary(drift)
#> --- Anchor Drift Screen ---
#> Reference: Wave1
#> Method: screened_common_element_alignment | Intended use: review_screen
#> Comparisons: 12 | Flagged: 9
#>
#> Drift summary by facet and wave:
#> Facet Wave N Mean_Drift Max_Drift N_Flagged
#> Rater Wave2 12 0.563 1.58 9
#>
#> Common elements:
#> Wave1 Wave2 N_Common
#> Wave1 Wave2 12
#>
#> Retained common elements by facet:
#> Reference Wave Facet N_Common N_Retained GuidelineMinCommon
#> Wave1 Wave2 Rater 12 7 5
#> LinkSupportAdequate
#> TRUE
#>
#> Flagged elements:
#> Facet Level Reference Wave Ref_Est Wave_Est LinkOffset Drift SE_Ref SE_Wave
#> Rater R07 Wave1 Wave2 0.807 -0.5009 0.271 -1.580 0.252 0.0759
#> Rater R09 Wave1 Wave2 -1.126 0.3703 0.271 1.225 0.148 0.0753
#> Rater R12 Wave1 Wave2 0.460 -0.0841 0.271 -0.816 0.152 0.0807
#> Rater R02 Wave1 Wave2 -0.076 -0.5774 0.271 -0.773 0.179 0.0830
#> Rater R01 Wave1 Wave2 0.796 0.4336 0.271 -0.634 0.165 0.0774
#> Rater R06 Wave1 Wave2 0.561 0.3115 0.271 -0.520 0.134 0.0750
#> Rater R05 Wave1 Wave2 -0.273 0.3491 0.271 0.351 0.114 0.0736
#> Rater R04 Wave1 Wave2 -0.913 -0.2931 0.271 0.348 0.153 0.0736
#> Rater R10 Wave1 Wave2 0.303 0.2999 0.271 -0.274 0.107 0.0732
#> SE Drift_SE_Ratio LinkSupportAdequate Flag
#> 0.263 6.01 TRUE TRUE
#> 0.166 7.37 TRUE TRUE
#> 0.172 4.73 TRUE TRUE
#> 0.197 3.92 TRUE TRUE
#> 0.182 3.48 TRUE TRUE
#> 0.154 3.39 TRUE TRUE
#> 0.136 2.58 TRUE TRUE
#> 0.170 2.05 TRUE TRUE
#> 0.130 2.12 TRUE TRUE
head(drift$drift_table[, c("Facet", "Level", "Wave", "Drift", "Flag")])
#> # A tibble: 6 × 5
#> Facet Level Wave Drift Flag
#> <chr> <chr> <chr> <dbl> <lgl>
#> 1 Rater R07 Wave2 -1.58 TRUE
#> 2 Rater R09 Wave2 1.23 TRUE
#> 3 Rater R12 Wave2 -0.816 TRUE
#> 4 Rater R02 Wave2 -0.773 TRUE
#> 5 Rater R01 Wave2 -0.634 TRUE
#> 6 Rater R06 Wave2 -0.520 TRUE
drift$common_elements
#> # A tibble: 1 × 3
#> Wave1 Wave2 N_Common
#> <chr> <chr> <int>
#> 1 Wave1 Wave2 12