Analyze practical equivalence within a facet
Source: R/api-facet-equivalence.R
Usage
analyze_facet_equivalence(
  fit,
  diagnostics = NULL,
  facet = NULL,
  equivalence_bound = 0.5,
  conf_level = 0.95
)
Arguments
- fit
  Output from fit_mfrm().
- diagnostics
  Optional output from diagnose_mfrm(). When NULL, diagnostics are computed with residual_pca = "none".
- facet
  Character scalar naming the non-person facet to evaluate. If NULL, the function prefers a rater-like facet and otherwise uses the first model facet.
- equivalence_bound
  Practical-equivalence bound in logits. Default 0.5.
- conf_level
  Confidence level used for the forest-style interval view. Default 0.95.
Details
This function tests whether facet elements (e.g., raters) are similar enough to be treated as practically interchangeable, rather than merely testing whether they differ significantly. This is the key distinction from a standard chi-square heterogeneity test: absence of evidence for difference is not evidence of equivalence.
The function uses existing facet estimates and their standard errors
from diagnostics$measures; no re-estimation is performed.
The bundle combines four complementary views:
- Fixed chi-square test: tests \(H_0\): all element measures are equal. A non-significant result is necessary but not sufficient for interchangeability. It is reported as context, not as direct evidence of equivalence.
- Pairwise TOST (Two One-Sided Tests): for each pair of elements, tests whether the difference falls within \(\pm\) equivalence_bound. The TOST procedure (Schuirmann, 1987) rejects the null hypothesis of non-equivalence when both one-sided tests are significant at level \(\alpha\). A pair is declared "Equivalent" when the TOST p-value < 0.05.
- BIC-based Bayes-factor heuristic: an approximate screening tool (not full Bayesian inference) that compares the evidence for a common-facet model (all elements equal) against a heterogeneity model (elements differ). Values > 3 favour the common-facet model; < 1/3 favour heterogeneity.
- ROPE-style grand-mean proximity: the proportion of each element's normal-approximation confidence distribution that falls within \(\pm\) equivalence_bound of the weighted grand mean. This is a descriptive proximity summary, not a Bayesian ROPE decision rule around a prespecified null value.
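The chi-square, TOST, and grand-mean proximity views can be sketched in base R from a vector of estimates and standard errors. This is an illustration only, not the package's implementation: the values in est and se are hypothetical, the helper tost_p() is made up for this example, and the standard error of each pairwise difference assumes independent, normally distributed estimates.

```r
# Hypothetical element measures (logits) and standard errors; not package output.
est <- c(R01 = -0.40, R02 = -0.25, R03 = 0.30, R04 = 0.35)
se  <- c(R01 = 0.10, R02 = 0.10, R03 = 0.10, R04 = 0.10)
bound <- 0.5   # practical-equivalence bound in logits
alpha <- 0.05

# Fixed chi-square: precision-weighted squared deviations from the
# weighted grand mean, with (number of elements - 1) degrees of freedom.
w <- 1 / se^2
grand_mean <- sum(w * est) / sum(w)
chi_sq <- sum(w * (est - grand_mean)^2)
chi_p <- pchisq(chi_sq, df = length(est) - 1, lower.tail = FALSE)

# TOST p-value for one difference: the larger of the two one-sided p-values.
tost_p <- function(diff, se_diff, bound) {
  p_lower <- pnorm((diff + bound) / se_diff, lower.tail = FALSE)  # H0: diff <= -bound
  p_upper <- pnorm((diff - bound) / se_diff)                      # H0: diff >= +bound
  max(p_lower, p_upper)
}

# All pairwise contrasts (assumed independent estimates).
pairs <- t(combn(names(est), 2))
pairwise <- data.frame(ElementA = pairs[, 1], ElementB = pairs[, 2],
                       stringsAsFactors = FALSE)
pairwise$Diff <- est[pairwise$ElementA] - est[pairwise$ElementB]
pairwise$SE <- sqrt(se[pairwise$ElementA]^2 + se[pairwise$ElementB]^2)
pairwise$p_tost <- mapply(tost_p, pairwise$Diff, pairwise$SE,
                          MoreArgs = list(bound = bound))
pairwise$Equivalent <- pairwise$p_tost < alpha

# ROPE-style proximity: share of each element's normal distribution
# that falls within +/- bound of the weighted grand mean.
rope <- pnorm(grand_mean + bound, mean = est, sd = se) -
  pnorm(grand_mean - bound, mean = est, sd = se)

pairwise[, c("ElementA", "ElementB", "Equivalent")]
```

With these made-up inputs the chi-square test flags clear heterogeneity, while TOST still finds two equivalent pairs (R01/R02 and R03/R04). The views answer different questions, so neither result contradicts the other.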
Choosing equivalence_bound: the default of 0.5 logits is a
moderate criterion. For high-stakes certification, 0.3 logits may
be appropriate; for exploratory or low-stakes contexts, 1.0 logits
may suffice. The bound should reflect the smallest difference that
would be practically meaningful in your application.
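To see how the choice of bound changes the verdict for a single contrast, the sketch below uses the standard result that TOST at level \(\alpha\) agrees with checking whether the \(100(1 - 2\alpha)\%\) confidence interval lies entirely inside \(\pm\) the bound. The difference and standard error are hypothetical, and the labels strict, default, and lenient are only names for the three bounds mentioned above.

```r
# Hypothetical pairwise contrast (logits) and its standard error.
diff_ab <- 0.25
se_diff <- 0.10
alpha <- 0.05

# 90% confidence interval when alpha = 0.05 (normal approximation).
z <- qnorm(1 - alpha)
ci <- c(lower = diff_ab - z * se_diff, upper = diff_ab + z * se_diff)

# The three bounds discussed in the text.
bounds <- c(strict = 0.3, default = 0.5, lenient = 1.0)

# A contrast counts as practically equivalent only if the whole
# interval sits strictly inside (-bound, +bound).
within_bound <- vapply(
  bounds,
  function(b) unname(ci["lower"] > -b && ci["upper"] < b),
  logical(1)
)
within_bound
```

Here the same contrast fails the 0.3 criterion but passes at 0.5 and 1.0, which is why the bound should be fixed on substantive grounds before looking at the results.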
What this analysis means
analyze_facet_equivalence() is a practical-interchangeability screen. It
asks whether facet levels are close enough, under a user-defined logit
bound, to be treated as practically similar for the current use case.
What this analysis does not justify
A non-significant chi-square result is not evidence of equivalence.
Forest/ROPE displays are descriptive and do not replace the pairwise TOST decision rule.
The BIC-based Bayes-factor summary is a heuristic screen, not a full Bayesian equivalence analysis.
Interpreting output
Start with summary$Decision, which is a conservative summary of the
pairwise TOST results. Then use the remaining tables as context:
- chi_square: is there broad heterogeneity in the facet?
- pairwise: which specific pairs meet the practical-equivalence bound?
- rope/forest: how close is each level to the facet grand mean?
Smaller equivalence_bound values make the criterion stricter. If the
decision is "partial_pairwise_equivalence", that means some pairwise
contrasts satisfy the practical-equivalence bound but not all of them do.
Decision rule
The final Decision is a pairwise TOST summary rather than a global
equivalence proof. If all pairwise contrasts satisfy the practical-
equivalence bound, the facet is labeled "all_pairs_equivalent". If at
least one, but not all, pairwise contrasts are equivalent, the facet is
labeled "partial_pairwise_equivalence". If no pairwise contrasts meet the
practical-equivalence bound, the facet is labeled
"no_pairwise_equivalence_established". The chi-square, Bayes-factor, and
grand-mean proximity summaries are reported as descriptive context.
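The labeling rule can be written as a small function over the pairwise Equivalent flags. This is a sketch of the rule as described here, not the package's internal code; the function name summarize_pairwise_decision() is hypothetical.

```r
# Map a logical vector of pairwise TOST results to the decision label.
summarize_pairwise_decision <- function(equivalent) {
  stopifnot(is.logical(equivalent), length(equivalent) > 0, !anyNA(equivalent))
  if (all(equivalent)) {
    "all_pairs_equivalent"
  } else if (any(equivalent)) {
    "partial_pairwise_equivalence"
  } else {
    "no_pairwise_equivalence_established"
  }
}

# Two of six pairs equivalent: only some contrasts meet the bound.
summarize_pairwise_decision(c(TRUE, FALSE, FALSE, FALSE, FALSE, TRUE))
```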
How to read the main outputs
- summary: one-row pairwise-TOST decision summary and aggregate context.
- pairwise: pair-level TOST detail; use this for the primary inferential read.
- chi_square: broad heterogeneity screen.
- rope/forest: level-wise proximity to the weighted grand mean.
Recommended next step
If the result is borderline or high-stakes, re-run the analysis with a
tighter or looser equivalence_bound, then inspect pairwise and
plot_facet_equivalence() before deciding how strongly to claim
interchangeability.
Typical workflow
1. Fit a model with fit_mfrm().
2. Run analyze_facet_equivalence() for the facet you want to screen.
3. Read summary and chi_square first.
4. Use plot_facet_equivalence() to inspect which levels drive the result.
Output
The returned bundle has class mfrm_facet_equivalence and includes:
- summary: one-row overview with the pairwise-TOST decision and aggregate context
- chi_square: fixed chi-square / separation summary
- pairwise: pairwise TOST detail table
- rope: element-wise ROPE probabilities around the weighted grand mean
- forest: element-wise estimate, confidence interval, and ROPE status
- settings: applied facet and threshold settings
Examples
toy <- load_mfrmr_data("example_core")
fit <- fit_mfrm(toy, "Person", c("Rater", "Criterion"), "Score",
                method = "JML", maxit = 25)
eq <- analyze_facet_equivalence(fit, facet = "Rater")
eq$summary[, c("Facet", "Elements", "Decision", "MeanROPE")]
#>   Facet Elements                     Decision MeanROPE
#> 1 Rater        4 partial_pairwise_equivalence 97.86198
head(eq$pairwise[, c("ElementA", "ElementB", "Equivalent")])
#>   ElementA ElementB Equivalent
#> 1      R01      R02       TRUE
#> 2      R01      R03      FALSE
#> 3      R01      R04      FALSE
#> 4      R02      R03      FALSE
#> 5      R02      R04      FALSE
#> 6      R03      R04       TRUE