Compares facet estimates across two or more calibration waves to identify elements whose difficulty/severity has shifted beyond acceptable thresholds. Useful for monitoring rater drift over time or checking the stability of item banks.
Usage
detect_anchor_drift(
fits,
facets = NULL,
drift_threshold = 0.5,
flag_se_ratio = 2,
reference = 1L,
include_person = FALSE
)
# S3 method for class 'mfrm_anchor_drift'
print(x, ...)
# S3 method for class 'mfrm_anchor_drift'
summary(object, ...)
# S3 method for class 'summary.mfrm_anchor_drift'
print(x, ...)Arguments
- fits
Named list of
mfrm_fitobjects (e.g.,list(Year1 = fit1, Year2 = fit2)).- facets
Character vector of facets to compare (default: all non-Person facets).
- drift_threshold
Absolute drift threshold for flagging (logits, default 0.5).
- flag_se_ratio
Drift/SE ratio threshold for flagging (default 2.0).
- reference
Index or name of the reference fit (default: first).
- include_person
Include person estimates in comparison.
- x
An
mfrm_anchor_driftobject.- ...
Ignored.
- object
An
mfrm_anchor_driftobject (forsummary).
Value
Object of class mfrm_anchor_drift with components:
- drift_table
Tibble of element-level drift statistics.
- summary
Drift summary aggregated by facet and wave.
- common_elements
Tibble of pairwise common-element counts.
- common_vs_reference
Tibble of common-element counts between each wave and the reference wave (i.e., which elements remain comparable across the entire chain).
- n_common_all_waves
Integer count of elements that are common across every wave; used by
summary()to gauge how robust the chain is to chained linking error.- common_by_facet
Tibble of retained common-element counts by facet.
- config
List of analysis configuration.
Details
For each non-reference wave, the function extracts facet-level estimates
using make_anchor_table() and computes the element-by-element difference
against the reference wave. Standard errors are obtained from
diagnose_mfrm() applied to each fit. Only elements common to both the
reference and a comparison wave are included. Before reporting drift, the
function removes the weighted common-element link offset between the two
waves so that Drift represents residual instability rather than the
overall shift between calibrations. The function also records how many
common elements survive the screening step within each linking facet and
treats fewer than 5 retained common elements per facet as thin support.
An element is flagged when either condition is met: $$|\Delta_e| > \texttt{drift\_threshold}$$ $$|\Delta_e / SE_{\Delta_e}| > \texttt{flag\_se\_ratio}$$ The dual-criterion approach guards against flagging elements with large but imprecise estimates, and against missing small but precisely estimated shifts.
When facets is NULL, all non-Person facets are compared. Providing a
subset (e.g., facets = "Criterion") restricts comparison to those facets
only.
Which function should I use?
Use
anchor_to_baseline()when your starting point is raw new data plus a single baseline fit.Use
detect_anchor_drift()when you already have multiple fitted waves and want a reference-versus-wave comparison.Use
build_equating_chain()when the waves form a sequence and you need cumulative linking offsets.
Interpreting output
$drift_table: one row per element x wave combination, with columnsFacet,Level,Wave,Ref_Est,Wave_Est,LinkOffset,Drift,SE_Ref,SE_Wave,SE,Drift_SE_Ratio,LinkSupportAdequate, andFlag. Large drift signals instability after alignment to the common-element link.$summary: aggregated statistics by facet and wave: number of elements, mean/max absolute drift, and count of flagged elements.$common_elements: pairwise common-element counts in tidy table form. Small overlap weakens the comparison and results should be interpreted cautiously.$common_by_facet: retained common-element counts by linking facet for each reference-vs-wave comparison.LinkSupportAdequate = FALSEmeans the link rests on fewer than 5 retained common elements in at least one facet.$config: records the analysis parameters for reproducibility.A practical reading order is
summary(drift)first, thendrift$drift_table, thendrift$common_by_facetif overlap looks thin.
Typical workflow
Fit separate models for each administration wave.
Combine into a named list:
fits <- list(Spring = fit_s, Fall = fit_f).Call
drift <- detect_anchor_drift(fits).Review
summary(drift)andplot_anchor_drift(drift).Flagged elements may need to be removed from anchor sets or investigated for substantive causes (e.g., rater re-training).
Examples
if (FALSE) { # \dontrun{
d1 <- load_mfrmr_data("study1")
d2 <- load_mfrmr_data("study2")
fit1 <- fit_mfrm(d1, "Person", c("Rater", "Criterion"), "Score",
method = "JML", maxit = 30)
fit2 <- fit_mfrm(d2, "Person", c("Rater", "Criterion"), "Score",
method = "JML", maxit = 30)
drift <- detect_anchor_drift(list(Wave1 = fit1, Wave2 = fit2))
summary(drift)
head(drift$drift_table[, c("Facet", "Level", "Wave", "Drift", "Flag")])
drift$common_elements
} # }