Build an inter-rater agreement report
Usage
interrater_agreement_table(
fit,
diagnostics = NULL,
rater_facet = NULL,
context_facets = NULL,
exact_warn = 0.5,
corr_warn = 0.3,
include_precision = TRUE,
top_n = NULL
)Arguments
- fit
Output from
fit_mfrm().- diagnostics
Optional output from
diagnose_mfrm().- rater_facet
Name of the rater facet. If
NULL, inferred from facet names.- context_facets
Optional context facets used to match observations for agreement. If
NULL, all remaining facets (includingPerson) are used.- exact_warn
Warning threshold for exact agreement.
- corr_warn
Warning threshold for pairwise correlation.
- include_precision
If
TRUE, append rater severity spread indices from the facet precision summary when available.- top_n
Optional maximum number of pair rows to keep.
Value
A named list with:
summary: one-row inter-rater summarypairs: pair-level agreement tablesettings: applied options and thresholds
Details
This helper computes pairwise rater agreement on matched contexts and returns both a pair-level table and a one-row summary. The output is package-native and does not require knowledge of legacy report numbering.
Interpreting output
summary: overall agreement level, number/share of flagged pairs.pairs: pairwise exact agreement, correlation, and direction/size gaps.settings: applied facet matching and warning thresholds.
Pairs flagged by both low exact agreement and low correlation generally deserve highest calibration priority.
Typical workflow
Run with explicit
rater_facet(andcontext_facetsif needed).Review
summary(ir)and top flagged rows inir$pairs.Visualize with
plot_interrater_agreement().
Output columns
The pairs data.frame contains:
- Rater1, Rater2
Rater pair identifiers.
- N
Number of matched-context observations for this pair.
- Exact
Proportion of exact score agreements.
- ExpectedExact
Expected exact agreement under chance.
- Adjacent
Proportion of adjacent (+/- 1 category) agreements.
- MeanDiff
Signed mean score difference (Rater1 - Rater2).
- MAD
Mean absolute score difference.
- Corr
Pearson correlation between paired scores.
- Flag
Logical;
TRUEwhen Exact <exact_warnor Corr <corr_warn.- OpportunityCount, ExactCount, ExpectedExactCount, AdjacentCount
Raw counts behind the agreement proportions.
The summary data.frame contains:
- RaterFacet
Name of the rater facet analyzed.
- TotalPairs
Number of rater pairs evaluated.
- ExactAgreement
Mean exact agreement across all pairs.
- AgreementMinusExpected
Observed exact agreement minus expected exact agreement.
- MeanCorr
Mean pairwise correlation.
- FlaggedPairs, FlaggedShare
Count and proportion of flagged pairs.
- RaterSeparation, RaterReliability
Severity-spread indices for the rater facet, reported separately from agreement.
Examples
if (FALSE) { # interactive()
toy <- load_mfrmr_data("example_core")
fit <- fit_mfrm(toy, "Person", c("Rater", "Criterion"), "Score", method = "JML", maxit = 30)
ir <- interrater_agreement_table(fit, rater_facet = "Rater")
# One-row overview: ExactAgreement, ExpectedExactAgreement, MeanCorr,
# RaterSeparation, and RaterReliability are the headline reportable
# statistics.
ir$summary
# Per-pair detail (Rater1 vs Rater2 with Exact, Adjacent, Corr, MAD).
head(ir$pairs)
p_ir <- plot(ir, draw = FALSE)
p_ir$data$plot
}