Build an observed-data resampling specification

Usage

build_mfrm_resampling_spec(
  data,
  person,
  facets,
  score,
  strata = NULL,
  preserve_facets = NULL,
  design = c("stratified_subsample", "stratified_bootstrap"),
  reps = 20,
  sample_fraction = 0.5,
  sample_n = NULL,
  replace = NULL,
  seed = NULL,
  min_per_stratum = 1,
  topup_preserve_facets = TRUE
)

Arguments

data: A long-format observed MFRM data set.
person: Person/respondent identifier column.
facets: Non-person facet columns used by the target MFRM fit.
score: Ordered score column.
strata: Optional person-level stratification columns, for example a Region or first-language (L1) group column. Each person must map to at most one stratum combination.
preserve_facets: Optional facet columns whose level coverage should be reviewed and, when possible, topped up after the stratified person draw. A common choice is the rater facet.
design: Resampling design. "stratified_subsample" samples persons without replacement inside each stratum. "stratified_bootstrap" samples persons with replacement inside each stratum and re-keys duplicate person instances in the returned data.
reps: Number of resampling replicates to draw.
sample_fraction: Fraction of persons to draw within each stratum. Used only when sample_n is NULL.
sample_n: Optional target number of persons to draw per stratum. Supply either one scalar used for every stratum, or a named numeric vector whose names match the computed stratum labels.
replace: Optional logical override for replacement. By default, replacement is FALSE for "stratified_subsample" and TRUE for "stratified_bootstrap".
seed: Optional seed used by draw_mfrm_resamples().
min_per_stratum: Minimum target number of persons to draw from each represented stratum.
topup_preserve_facets: Logical; if TRUE, add extra person clusters when possible to recover missing levels of preserve_facets.
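
To make the interplay of sample_fraction, sample_n, min_per_stratum, and replace concrete, the following base-R sketch shows one way per-stratum target sizes could be resolved. It is an illustration only, not the package's internal code: the helper name resolve_stratum_targets is hypothetical, and the rounding rule (ceiling) and the cap at the stratum size when sampling without replacement are assumptions.

```r
# Hypothetical helper, not part of the package.
# Assumptions: fractions are rounded up, and targets are capped at the
# stratum size when sampling without replacement.
resolve_stratum_targets <- function(stratum_sizes, sample_fraction = 0.5,
                                    sample_n = NULL, min_per_stratum = 1,
                                    replace = FALSE) {
  if (is.null(sample_n)) {
    # Fraction-based target inside each stratum.
    target <- ceiling(stratum_sizes * sample_fraction)
  } else if (length(sample_n) == 1L) {
    # One scalar applied to every stratum.
    target <- rep(sample_n, length(stratum_sizes))
  } else {
    # Named vector matched to the stratum labels.
    target <- sample_n[names(stratum_sizes)]
  }
  # Enforce the per-stratum minimum.
  target <- pmax(target, min_per_stratum)
  # Without replacement, a stratum cannot supply more persons than it has.
  if (!replace) target <- pmin(target, stratum_sizes)
  stats::setNames(as.integer(target), names(stratum_sizes))
}

sizes <- c(A = 4, B = 5, C = 1)               # persons per stratum
by_fraction <- resolve_stratum_targets(sizes)  # sample_fraction = 0.5
by_name <- resolve_stratum_targets(sizes, sample_n = c(B = 2, A = 10, C = 1))
```

Under these assumptions, by_fraction gives 2, 3, and 1 persons for strata A, B, and C, and by_name caps the request of 10 for stratum A at the 4 persons available.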

Value

An object of class mfrm_resampling_spec.

Details

This helper defines a resampling design for observed-data stability checks. It is intentionally separate from build_mfrm_sim_spec() and evaluate_mfrm_recovery(). The full-data estimates used with these draws are reference estimates, not known truth, so downstream summaries should be described as estimation stability, reproducibility, or agreement with a full-data reference rather than strict parameter recovery.

The design is person-clustered: all rows for a selected person are kept together. For bootstrap draws, duplicated person clusters are re-keyed in the returned data while the original identifier is retained in .mfrm_original_person.
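
The person-clustered draw described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the implementation used by draw_mfrm_resamples(): the helper name draw_person_clusters and the "__dup" suffix used for re-keying are hypothetical. Only the .mfrm_original_person column name comes from this page.

```r
# Hypothetical helper, not part of the package.
draw_person_clusters <- function(data, person, strata, fraction = 0.5,
                                 replace = FALSE) {
  # One row per person with its stratum label.
  persons <- unique(data[, c(person, strata), drop = FALSE])
  by_stratum <- split(as.character(persons[[person]]), persons[[strata]])

  # Draw person identifiers inside each stratum.
  picked <- unlist(lapply(by_stratum, function(ids) {
    n <- max(1L, ceiling(length(ids) * fraction))
    ids[sample.int(length(ids), n, replace = replace)]
  }), use.names = FALSE)

  # Number repeated draws of the same person (1, 2, ...) and re-key the
  # repeats so each instance is a distinct cluster.
  # The suffix format is an assumption.
  instance <- ave(seq_along(picked), picked, FUN = seq_along)
  new_key <- ifelse(instance > 1L, paste0(picked, "__dup", instance), picked)

  # Keep all rows of each selected person together.
  pieces <- lapply(seq_along(picked), function(i) {
    rows <- data[as.character(data[[person]]) == picked[i], , drop = FALSE]
    rows$.mfrm_original_person <- picked[i]
    rows[[person]] <- new_key[i]
    rows
  })
  do.call(rbind, pieces)
}

set.seed(1)
toy_long <- data.frame(
  Person = rep(paste0("P", 1:6), each = 2),
  Region = rep(c("A", "A", "A", "B", "B", "B"), each = 2),
  Rater  = rep(c("R1", "R2"), times = 6),
  Score  = sample(0:2, 12, replace = TRUE)
)
boot <- draw_person_clusters(toy_long, "Person", "Region",
                             fraction = 1, replace = TRUE)
```

With fraction = 1 and replace = TRUE, three person instances are drawn per region. Every instance carries its own key in Person, so a person drawn twice enters the refit as two separate clusters, while .mfrm_original_person still points back to the source identifier.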

Examples

toy <- simulate_mfrm_data(n_person = 12, n_rater = 3, n_criterion = 2,
                          raters_per_person = 2, seed = 11)
region_map <- setNames(rep(c("A", "B", "C"),
                           length.out = length(unique(toy$Person))),
                       unique(toy$Person))
toy$Region <- unname(region_map[toy$Person])
spec <- build_mfrm_resampling_spec(
  toy, person = "Person", facets = c("Rater", "Criterion"),
  score = "Score", strata = "Region", preserve_facets = "Rater",
  reps = 2, sample_fraction = 0.5, seed = 99
)
draws <- draw_mfrm_resamples(spec)
summary(draws)$overview
#>                 Design Reps SamplesReturned Replace
#> 1 stratified_subsample    2               2   FALSE
#>   PreserveCoverageCompleteRate TopupReps GapOrFallbackReps
#> 1                            1         0                 0
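
The coverage review that preserve_facets refers to can be illustrated with a small base-R check. This is a sketch under assumptions, not the package's logic: the helper name missing_facet_levels is hypothetical, and treating "persons observed with a missing level" as top-up candidates is one plausible reading of topup_preserve_facets.

```r
# Hypothetical helper, not part of the package: which levels of a facet
# that occur in the full data are absent from a drawn subsample?
missing_facet_levels <- function(full, draw, facet) {
  setdiff(unique(as.character(full[[facet]])),
          unique(as.character(draw[[facet]])))
}

full <- data.frame(
  Person = c("P1", "P1", "P2", "P2", "P3", "P3"),
  Rater  = c("R1", "R2", "R1", "R3", "R2", "R3")
)
draw <- full[full$Person %in% "P1", ]  # a draw that happened to pick only P1

# Rater levels lost by the draw.
gap <- missing_facet_levels(full, draw, "Rater")

# Undrawn persons observed with a missing level; adding any one of these
# person clusters would restore coverage (assumed top-up rule).
candidates <- unique(full$Person[full$Rater %in% gap &
                                 !full$Person %in% draw$Person])
```

Here the draw loses rater R3, and either P2 or P3 could be added to recover it.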