Produce a side-by-side comparison of multiple fit_mfrm() results using
information criteria, log-likelihood, and parameter counts. When exactly
two nested models are supplied and nested = TRUE, a likelihood-ratio test
is included.
Arguments
- ...: Two or more mfrm_fit objects to compare.
- labels: Optional character vector of labels for each model. If NULL, labels are generated from model/method combinations.
- warn_constraints: Logical. If TRUE (the default), emit a warning when models use different centering constraints (noncenter_facet or dummy_facets), which can make information-criterion comparisons misleading.
- nested: Logical. Set to TRUE only when the supplied models are known to be nested and fitted with the same likelihood basis on the same observations. The default is FALSE, in which case no likelihood-ratio test is reported. When TRUE, the function still runs a conservative structural audit and computes the LRT only for supported nesting patterns.
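For orientation, a full call might look like the following sketch; fit_rsm and fit_pcm stand for fitted mfrm_fit objects such as those created in the Examples below.

comp <- compare_mfrm(
  fit_rsm, fit_pcm,
  labels           = c("RSM", "PCM"),  # optional display labels
  warn_constraints = TRUE,             # warn on mismatched centering constraints
  nested           = FALSE             # default: no LRT without asserted nesting
)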
Value
An object of class mfrm_comparison (named list) with:
- table: data.frame of model-level statistics (LogLik, AIC, BIC, Delta_AIC, AkaikeWeight, Delta_BIC, BICWeight, npar, nobs, Model, Method, Converged, ICComparable).
- lrt: data.frame with the likelihood-ratio test result (present only when two models are supplied and nested = TRUE). Contains ChiSq, df, p_value.
- evidence_ratios: data.frame of pairwise Akaike-weight ratios (Model1, Model2, EvidenceRatio). NULL when weights cannot be computed.
- preferred: named list giving the preferred model label under each criterion.
- comparison_basis: list describing whether IC and LRT comparisons were considered comparable. Includes a conservative nesting_audit.
Details
Models should be fit to the same data (same rows, same person/facet columns) for the comparison to be meaningful. The function checks that observation counts match and warns otherwise.
Information-criterion ranking is reported only when all candidate models
use the package's MML estimation path, analyze the same observations, and
converge successfully. Raw AIC and BIC values are still shown for each
model, but Delta_*, weights, and preferred-model summaries are suppressed
when the likelihood basis is not comparable enough for primary reporting.
Nesting: Two models are nested when one is a special case of the other obtained by imposing equality constraints. The most common nesting in MFRM is RSM (shared thresholds) inside PCM (item-specific thresholds). Models that differ only in estimation method (MML vs JML) on the same specification are not nested in the usual sense—use information criteria rather than LRT for that comparison.
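For the RSM/PCM pair fitted in the Examples below, the nesting shows up directly in the parameter counts; a quick way to read off the implied LRT degrees of freedom from the comparison table:

# The PCM's item-specific thresholds generalise the RSM's shared ones,
# so the PCM carries extra parameters; their difference is the LRT df.
comp$table[, c("Label", "Model", "npar")]
diff(comp$table$npar)   # Delta p when the restricted model is listed first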
The likelihood-ratio test (LRT) is reported only when exactly two
models are supplied, nested = TRUE, the structural audit passes, and the
difference in the number of parameters is positive:
$$\Lambda = -2 (\ell_{\mathrm{restricted}} - \ell_{\mathrm{full}}) \sim \chi^2_{\Delta p}$$
The LRT is asymptotically valid when models are nested and the data are independent. With small samples or boundary conditions (e.g., variance components near zero), treat p-values as approximate.
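As a sanity check, the statistic can be reproduced by hand from the table's LogLik and npar columns; a minimal sketch, assuming the restricted model (RSM) occupies row 1 as in the Examples below:

ll     <- comp$table$LogLik
dp     <- comp$table$npar[2] - comp$table$npar[1]  # Delta p (extra parameters)
lambda <- -2 * (ll[1] - ll[2])                     # -2 (l_restricted - l_full)
pchisq(lambda, df = dp, lower.tail = FALSE)        # approximate p-value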
Information-criterion diagnostics
In addition to raw AIC and BIC values, the function computes:
- Delta_AIC / Delta_BIC: difference from the best (minimum) value. A Delta < 2 is typically considered negligible; 4–7 suggests moderate evidence; > 10 indicates strong evidence against the higher-scoring model (Burnham & Anderson, 2002).
- AkaikeWeight / BICWeight: model probabilities derived from exp(-0.5 * Delta), normalised across the candidate set (the arithmetic is sketched below). An Akaike weight of 0.90 means the model has a 90% probability of being the best in the candidate set.
- Evidence ratios: pairwise ratios of Akaike weights, quantifying the relative evidence for one model over another (e.g., an evidence ratio of 5 means the preferred model is 5 times more likely than the comparison model).
AIC penalises complexity less than BIC; when they disagree, AIC favours the more complex model and BIC the simpler one.
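The Delta/weight arithmetic is straightforward to reproduce from the raw AIC values; a minimal sketch using the comparison table:

aic   <- comp$table$AIC
delta <- aic - min(aic)      # Delta_AIC relative to the best model
w     <- exp(-0.5 * delta)
w     <- w / sum(w)          # Akaike weights, normalised to sum to 1
outer(w, w, "/")             # pairwise evidence ratios (w_i / w_j)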
What this comparison means
compare_mfrm() is a same-basis model-comparison helper. Its strongest
claims apply only when the models were fit to the same response data,
under a compatible likelihood basis, and with compatible constraint
structure.
What this comparison does not justify
- Do not treat AIC/BIC differences as primary evidence when table$ICComparable is FALSE (a defensive guard is sketched after this list).
- Do not interpret the LRT unless nested = TRUE and the structural audit in comparison_basis$nesting_audit passes.
- Do not compare models fit to different datasets, different score codings, or materially different constraint systems as if they were commensurate.
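A defensive guard along these lines keeps non-comparable IC results out of primary reporting (sketch only):

# Report criterion-preferred models only when the IC basis is comparable.
if (all(comp$table$ICComparable)) {
  comp$preferred
} else {
  message("IC basis not comparable; treat AIC/BIC as descriptive only.")
}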
Interpreting output
- Lower AIC/BIC values indicate a better parsimony-accuracy trade-off only when table$ICComparable is TRUE.
- A significant LRT p-value suggests the more complex model provides a meaningfully better fit only when the nesting assumption truly holds.
- preferred indicates the model preferred by each criterion.
- evidence_ratios gives pairwise Akaike-weight ratios (returned only when Akaike weights can be computed for at least two models).
- When comparing more than two models, interpret evidence ratios cautiously; they do not adjust for multiple comparisons.
How to read the main outputs
- table: the first-pass comparison table; start with ICComparable, Model, Method, AIC, and BIC.
- comparison_basis: records whether IC and LRT claims are defensible for the supplied models.
- lrt: nested-model test summary, present only when the requested and audited conditions are met.
- preferred: the candidate preferred by each criterion, when those summaries are available.
Recommended next step
Inspect comparison_basis before writing conclusions. If comparability is
weak, treat the result as descriptive and revise the model setup (for
example, explicit step_facet, common data, or common constraints) before
using IC or LRT results in reporting.
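In practice this is a one-line check before any reporting:

# Inspect the comparability record, including the conservative nesting_audit.
str(comp$comparison_basis)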
Typical workflow
1. Fit two models with fit_mfrm() (e.g., RSM and PCM).
2. Compare with compare_mfrm(fit_rsm, fit_pcm).
3. Inspect summary(comparison) for AIC/BIC diagnostics and, when appropriate, an LRT.
Examples
toy <- load_mfrmr_data("example_core")
fit_rsm <- fit_mfrm(toy, "Person", c("Rater", "Criterion"), "Score",
method = "MML", model = "RSM", maxit = 25)
fit_pcm <- fit_mfrm(toy, "Person", c("Rater", "Criterion"), "Score",
method = "MML", model = "PCM",
step_facet = "Criterion", maxit = 25)
comp <- compare_mfrm(fit_rsm, fit_pcm, labels = c("RSM", "PCM"))
comp$table
#> # A tibble: 2 × 14
#> Label Model Method nobs npar LogLik AIC BIC Converged ICComparable
#> <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <lgl> <lgl>
#> 1 RSM RSM MML 768 9 -899. 1817. 1858. TRUE TRUE
#> 2 PCM PCM MML 768 18 -892. 1821. 1905. TRUE TRUE
#> # ℹ 4 more variables: Delta_AIC <dbl>, AkaikeWeight <dbl>, Delta_BIC <dbl>,
#> # BICWeight <dbl>
comp$evidence_ratios
#> # A tibble: 1 × 3
#> Model1 Model2 EvidenceRatio
#> <chr> <chr> <dbl>
#> 1 RSM PCM 9.32
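To request the LRT for this pair, assert nesting explicitly (the RSM is the threshold-constrained special case of the PCM, per Details); output is omitted here:

comp_lrt <- compare_mfrm(fit_rsm, fit_pcm, labels = c("RSM", "PCM"),
                         nested = TRUE)
comp_lrt$lrt   # ChiSq, df, p_value, reported only if the structural audit passes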