Changelog
Source: NEWS.md
mfrmr 0.2.2
- The example-execution policy now keeps R CMD check time within the CRAN incoming budget (the 0.2.1 submission passed the content checks on Windows and Debian but tripped the overall check-time limit on the Windows incoming host before being accepted on review): long-running illustrations previously wrapped in \donttest are now \dontrun, and example pages whose executed examples measurably exceeded 0.25 seconds are gated with @examplesIf interactive(), keeping 129 lightweight pages plus the core fit_mfrm() example in the executed surface. All illustrations remain in the help pages and run in interactive sessions.
- Release metadata and reviewer-facing documentation now consistently use the 0.2.2 maintenance-candidate boundary: CITATION.cff, README routing, validation evidence-map references, and the bounded-GPCM roadmap all point to the current 0.2.2 surface rather than the previous CRAN 0.2.1 baseline.
- The README now starts with the public user workflow and keeps reproducibility notes journal-neutral, with fit_mfrm() -> mfrm_results() -> mfrm_report() -> export_mfrm_results() as the first-contact route.
- The workflow vignette now uses small generated CSV artifacts for representative output during CRAN-style builds. Heavy fitting and simulation chunks remain disabled unless NOT_CRAN=true, while the default vignette still shows result tables for the first-contact route.
- Reporting and visualization documentation now makes the publication boundary explicit: APA/report helpers provide conservative drafting scaffolds, claim-readiness and caveat-routing tables, and reproducible handoff files, while high-stakes journal claims still require study-specific design, literature, sensitivity, and substantive interpretation decisions.
- The visual diagnostics vignette and plot_data() help now show how to use draw = FALSE, plot_data_components(), and plot_data() to build custom figures while retaining reference lines, guidance, and interpretation metadata from the package-native plot.
- Reporting and visual help now make the fit-level HTML bundle route explicit: export_mfrm_bundle(fit, include = c(..., "html")) writes a local HTML/CSV/replay bundle without requiring a prior mfrm_results() object, while preset = "monochrome" is documented as the grayscale / print-friendly plot route.
- Model-comparison reporting is now routed through the same appendix/export contract: build_summary_table_bundle() accepts build_model_choice_review() objects and their summaries, and README / reporting-vignette guidance now separates same-data fit comparison, equal-weighting versus bounded-GPCM model roles, cautious wording, and latent-regression reporting boundaries.
- The pkgdown configuration now groups reference topics by primary workflow, bounded-GPCM boundary, diagnostics, reporting/export, simulation/recovery, linking/bias/DFF, FACETS migration, and advanced review surfaces.
- cran-comments.md has been rewritten as a CRAN maintenance-release note rather than a journal-preparation note; any remaining pkgdown URL NOTE must be resolved before release by publishing the prepared pkgdown route or by revising release URLs.
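The example and vignette gating described in the 0.2.2 notes can be sketched in a few lines of R. This is a minimal illustration, not the package's own source: the helper mean_by_rater() and its data are hypothetical, and the knitr call in the final comment is one common way to wire the flag into a vignette, assumed here rather than taken from the changelog.

```r
# Hypothetical helper; the name, documentation, and data are illustrative only.

#' Mean score by rater
#'
#' @param x A data frame with columns `rater` and `score`.
#' @examplesIf interactive()
#' # Evaluated only when interactive() is TRUE, so non-interactive
#' # R CMD check runs skip it while the help page still shows it.
#' ratings <- data.frame(rater = c("A", "A", "B"), score = c(1, 3, 2))
#' mean_by_rater(ratings)
mean_by_rater <- function(x) {
  tapply(x$score, x$rater, mean)
}

# Vignette-side gate: heavy chunks are evaluated only when NOT_CRAN=true.
run_heavy <- identical(Sys.getenv("NOT_CRAN"), "true")

# In an R Markdown setup chunk the flag could drive chunk evaluation, e.g.
# knitr::opts_chunk$set(eval = run_heavy)
```

Gating on interactive() keeps the illustration visible in the rendered help page while removing its run time from automated checks. The environment-variable gate does the same for vignette chunks.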
mfrmr 0.2.1
CRAN release: 2026-06-12
This release focuses on a clearer public workflow, a more readable reporting surface, and source-aware review outputs. The main user-facing entry points are mfrm_results(), mfrm_report(), export_mfrm_results(), launch_mfrmr_viewer(), and mfrmr_output_guide("public").
Highlights:
- A shorter public workflow now starts from fit_mfrm() -> mfrm_results() -> mfrm_report() -> export_mfrm_results().
- mfrm_report() and summary(mfrm_report(...)) provide first-screen report readiness tables, report routes, and cautious interpretation boundaries.
- Simulation and recovery summaries expose operating-characteristic, sparse-design, peer-review, and recovery-review tables for appendix and export workflows.
- Linking, anchor, bias, misfit/pathway, and network review outputs are routed through explicit result components and scoped follow-up helpers.
- Release-review artifacts now distinguish CRAN-time lightweight checks from full non-CRAN regression evidence, and the readiness gate checks that the local R CMD check log matches the target package version.
Detailed changes:
- The generated APA narrative no longer labels overall fit as “acceptable”/“elevated”. It now reports whether mean-square fit fell within or outside the active screening band, names the band, and states that band position is screening evidence rather than a model-validity decision. Element-level wording uses the same screening-band language.
- The APA narrative now separates separation reliability from observed inter-rater agreement explicitly: reliability sentences state that separation indices are not inter-rater agreement, and agreement sentences are introduced as a separate quantity.
- APA Methods text for MML fits now states the estimation basis: person measures are EAP estimates, and residual-based fit statistics are evaluated at those EAP measures rather than at JMLE estimates. A matching fit-basis caution routes external FACETS comparisons to method = "JML".
- The APA narrative and Table 1 note now report measure-level confidence intervals when available: the Results fit/precision section states the CI level, method, and CIEligible counts, and the facet-summary table note instructs reporting CI_Lower/CI_Upper for eligible rows.
- APA Wright-map and facet-summary notes now state the fitted sign convention explicitly (higher person values = higher ability; higher non-person facet values = greater severity/difficulty under the default negative orientation), including any positive_facets exceptions, instead of the previous “depending on facet orientation” wording. The pathway-map note and plot.mfrm_fit() help now state that the expected-score pathway display is distinct from the Bond-and-Fox-style measure-versus-fit bubble chart available via plot_bubble().
- The MML-vs-FACETS residual basis is now documented across the fit surfaces: diagnose_mfrm(), facets_fit_df_guide(), facets_fit_review() (including a new residual-basis guidance row and interpretation-guide rows), the package statistical-background help, and the FACETS migration vignette all state that MML fit statistics are computed at shrunken EAP person measures and route JMLE-style comparisons to method = "JML".
- Small-df ZSTD availability is now documented as an explicit boundary: mfrmr withholds ZSTD as NA when the applicable df falls below 1 (Wilson-Hilferty instability), while FACETS/Winsteps under WHEXACT can report a value on the same sparse cell. The fit guides and facets_fit_review() now describe NA-vs-finite ZSTD pairs as availability differences rather than fit differences.
- The MML person-row separation/reliability convention is now documented: person rows apply the separation formulas to EAP measures with posterior SDs, which yields a conservative summary that is lower than the IRT empirical-reliability convention and is not numerically comparable to JMLE-based FACETS person reliability. The APA reliability sentence carries the same note when a Person row is present under MML.
- preset = "monochrome" is now accepted by all preset-based plot helpers, including plot_fair_average(), plot_displacement(), plot_unexpected(), plot_interrater_agreement(), plot_marginal_fit(), plot_marginal_pairwise(), plot_facets_chisq(), plot_qc_dashboard(), plot_threshold_ladder(), and the network/secondary plot methods, not only the fit-family plots.
- assess_mfrm_recovery() now reports uncertainty evidence separately from recovery metrics. This makes unavailable SE/coverage evidence visible without turning it into an implied RMSE or bias failure.
- inst/validation/release-readiness.R now follows the target package version from DESCRIPTION, selects versioned evidence-map/checklist files when available, and reports stale R CMD check logs as package-check review items when the check-log package version does not match the target release.
- The release-readiness GPCM scope review now also works against an installed package: when the R/ capability-matrix source file is not present (for example inside R CMD check runs or CI artifact reviews), it reads gpcm_capability_matrix() and gpcm_runtime_guard_coverage() from the installed namespace instead of reporting a missing-source concern.
- precision_review_report() now includes a source-grounded fit/separation basis table. This keeps mean-square fit, ZSTD standardization, Rasch/FACETS-style separation, and package QC thresholds in separate reporting lanes.
- New mfrm_results() gives users a FACETS-style first-screen entry point: it accepts an existing fit, a run_mfrm_facets() object, or a standard long-format data frame, runs diagnostics automatically, gathers available tables/reviews/plot routes, and can emit a lightweight temporary HTML report. Its summary includes next-action routes and a replay-code scaffold. Existing table, report, analysis, and review helpers remain supported as detailed components. mfrm_results_interactive() adds an explicit opt-in column-selection wizard for interactive sessions only.
- summary(mfrm_results(...)) now includes a triage table that orders unavailable, review, information, and OK signals across diagnostics, plots, tables, precision/reliability, reporting, model scope, and network review surfaces. next_actions uses that triage layer for first-screen routing.
- mfrmr_output_guide("public") now gives the shortest top-level public API map: explicit fit, comprehensive results, report readiness, optional viewer, download export, scoped guide, and opt-in interactive routes. The guide also carries APILayer, ObjectRole, and DecisionBoundary columns so top-level public surfaces, specialist follow-ups, advanced design-review rows, and migration/integration routes are separated before users scan the broader namespace, and so users can see whether a route estimates, summarizes, displays, exports, or only points to the next helper.
- mfrmr_output_guide("entry") still gives the first-screen creation routes, including explicit-fit, comprehensive-results, purpose-specific guide, and opt-in interactive routes. The guide labels lifecycle, user level, recommended-entry status, and includes advanced simulation/network route rows.
- The README now starts from a shorter public-surface workflow: fit_mfrm() -> mfrm_results() -> mfrm_report() -> export_mfrm_results(). Detailed table, report, review, and export helpers are presented as scoped follow-ups rather than as functions to memorize before the first analysis.
- mfrmr_output_guide("viewer") now maps local Shiny viewer workflows back to the mfrm_results(include = ...) object that should be created first, including publication, validation, bias-screen, pathway/misfit, and combined review routes.
- plot.mfrm_diagnostic_screening() now gives diagnostic-screening simulation output an integrated visualization route. The default overview combines legacy ZSTD, strict marginal, strict pairwise, strict combined, and optional report-review rates; focused views cover report signals, scenario contrasts, and runtime summaries. The draw-free return is an mfrm_plot_data object, so plot_data(diag_eval, type = "overview", component = "plot_long") can be used directly for ggplot2, plotly, Quarto, or custom export workflows. Draw-free diagnostic-screening plot objects also carry overview, reading_order, next_actions, reporting_notes, and figure_recipes, keeping custom figure/report handoffs aligned with the same interpretation boundaries used by summary(diag_eval).
- summary(evaluate_mfrm_diagnostic_screening(...)) now provides an explicit diagnostic-screening report surface, and build_summary_table_bundle() / export_summary_appendix() can export its scenario, performance, report-signal, contrast, and draw-free plot-data tables.
- Diagnostic-screening summaries now include reading_order, next_actions, reporting_notes, and figure_recipes tables so users can separate first-read tables, follow-up actions, figure/caption planning, appendix-only tables, operating-characteristic claims, and validation-gate boundaries before writing report text.
- mfrmr_output_guide("simulation") now routes users to diagnostic-screening evaluation and appendix export alongside data generation, design/recovery evaluation, network review, and peer-review design review.
- mfrm_results(include = ...) now accepts purpose presets: "publication", "validation", "facets", "bias", "misfit_review", "network", and "gpcm_review" in addition to "standard" and "all". The bias preset surfaces facet-level bias-screen guidance while leaving interaction-bias facet-pair selection explicit; the misfit preset bundles unexpected responses, displacement review, and pathway-map fit annotations.
- mfrmr_output_guide("binary") and the README now make the ordinary person-item dichotomous route explicit: pass the item column as the single non-person facet and use model = "RSM"; with exactly two ordered categories this is the usual binary Rasch logit up to centering and threshold-identification conventions.
- New launch_mfrmr_viewer() provides an optional local Shiny reader for existing mfrm_results objects. It does not estimate models, read external web applications, or change diagnostics; it displays the already-built overview, triage, status, QC evidence, APA-style report text when available, bias screens, pathway/misfit review, tables, plots, and replay code. The QC, Report, Bias, and Pathway/Misfit tabs now include local section-status tables so unavailable or not-requested sections are visible at the point of inspection.
- New export_mfrm_results() writes a lightweight download folder from an existing mfrm_results object: summary CSVs, collected tables, HTML, RDS, replay code, and a written-files manifest, with optional PNG plot export and best-effort zip creation. This is the compact result-object handoff route; export_mfrm_bundle() remains the broader fit-centered analysis archive.
- export_mfrm_results() now accepts include = "report" to write mfrm_report() artifacts into the same download folder: report-table CSVs such as report_index, evidence summaries, and reporting templates, plus report Markdown and report HTML.
- New mfrm_report() turns an existing mfrm_results object into report-ready QC, APA, validation, reviewer, or technical section plans with claim-readiness, report-gap, evidence-boundary, and next-action tables. It is a reporting layer over existing results, not a new estimator, diagnostic, or acceptance rule.
- mfrm_report() now includes first_screen, a FACETS-like entry table with an overall row and one row per major evidence area. It reports status, readiness, the main issue, next action, and primary route before users open the detailed evidence and template tables.
- print(mfrm_report(...)), the help-page examples, and the README now follow the same short reading order: read summary(report) and report$first_screen first, use report$report_index and report$template_index as table indexes, and open detailed evidence tables only when those indexes point to them.
- summary(mfrm_report(...)) now provides a shorter reader-facing report summary: overview status, first-screen rows, immediate actions, optional not-requested sections, claim-readiness counts, report gaps, boundary rows, and standard routes. This keeps the comprehensive report object easier to read without creating a new diagnostic decision rule.
- mfrm_report(output = "html") now starts from the same reader-facing guidance and report-summary tables before the full Markdown report. This keeps browser output aligned with the summary(report) and first_screen workflow.
- mfrm_report() now includes a compact report_index table that lists the major evidence areas, evidence status, readiness label, review-signal count, and the primary/template tables to inspect next. This keeps the expanded report surface navigable without hiding the detailed tables.
- mfrm_report() report_index now also includes explicit evidence, template, plot, export, and mfrm_results(include = ...) routes. These columns keep report drafting, figure review, and download handoff connected to the same evidence surface without adding a new diagnostic decision rule.
- mfrm_report() now also exposes fit-specific fit_criteria, zstd_conventions, and fit_decision_policy tables. These make the selected MnSq threshold band, alternative published fit bands, and engine-vs-FACETS-style df/ZSTD conventions visible before users write fit, separation, or reliability claims.
- mfrm_report() now adds result-specific fit evidence tables from the stored mfrm_results() fit-measures component: observed fit-status counts, threshold-profile sensitivity, df/ZSTD sensitivity counts, and row-level df-sensitive prompts. This keeps FACETS-style ZSTD differences visible without turning them into a new acceptance rule or a different MnSq signal.
- mfrm_report() now includes fit_reporting_templates, a cautious wording scaffold for APA, QC, validation, reviewer, and technical reports. The templates summarize observed fit counts, threshold-profile sensitivity, ZSTD convention, and df sensitivity in separate sentences so fit, separation, and reliability are not collapsed into one pass/fail claim.
- All mfrm_report() reporting-template tables now share evidence and claim boundary columns: EvidenceTable, EvidenceRoute, BoundaryType, ClaimStrength, and RecommendedUse. This makes template wording easier to trace back to its source table and helps keep descriptive, caveated, and follow-up-only claims separate.
- mfrm_report() now includes template_index, a stacked template index over all fit, precision, bias, misfit/pathway, and linking/anchor reporting templates. It lets users review boundary type, claim strength, recommended use, and evidence route before opening full template text.
- mfrm_report() now also includes precision-specific reporting surfaces: precision_evidence_summary, precision_basis, and precision_reporting_templates. These summarize separation, reliability, strata, precision tier, and review/warn checks while keeping Rasch/FACETS-style separation reliability distinct from inter-rater agreement, model fit, and standalone validity evidence.
- mfrm_report() now includes bias-specific reporting surfaces when the source result was built with include = "bias": bias_evidence_summary and bias_reporting_templates. These keep facet-level bias screens, interaction-bias contrasts, DFF follow-up, and fairness conclusions separated so screen-positive rows are not reported as final fairness or invariance decisions.
- mfrm_report() now includes misfit/pathway reporting surfaces when the source result was built with include = "misfit_review": misfit_evidence_summary and misfit_reporting_templates. These keep unexpected-response rows, displacement review, pathway-map evidence, and case-review wording separate so local misfit prompts are not reported as automatic exclusion, fairness, or validity decisions.
- mfrm_report() now includes linking/anchor reporting surfaces when the source result was built with include = "linking": linking_evidence_summary and linking_reporting_templates. These keep anchor readiness, drift review, screened equating-chain review, and GPCM support boundaries separate so anchor evidence is not reported as automatic drift absence, completed equating, DFF support, or validity proof.
- mfrm_results(include = "linking") now adds the fitted object’s stored anchor-review evidence and an operational linking-readiness surface to the comprehensive results object. It exposes plot(res, type = "anchors") for anchor-readiness visualization and routes drift/equating follow-up to explicit multi-fit calls such as detect_anchor_drift() and build_equating_chain().
- Recovery review can now retain fit/separation operating characteristics as diagnostic context. These summaries help users inspect MnSq, ZSTD, separation, and reliability behavior without making them top-line recovery gates. Recovery assessment and validation summaries also expose diagnostic_reporting_notes so zero separation/reliability and ZSTD sensitivity are routed into report caveats rather than recovery or release decisions.
- assess_mfrm_recovery() now includes condition_review and condition_reporting_notes for recovery simulations. For bounded GPCM, these tables separate slope-regime context and generated score-category support from recovery metrics before users interpret stress cases.
- Bounded-GPCM simulation specifications now carry slope_regime metadata. These labels are documented as package recovery-review labels, not literature-derived fit or adequacy cut points.
- The optional recovery-validation summary now separates the core release recovery decision from extended sensitivity evidence. This keeps stress cases visible without treating them as top-line release failures by default.
- build_summary_table_bundle() can now convert recovery-validation summaries into appendix-ready tables, including top-line decision, case, condition, and diagnostic reporting tables. export_summary_appendix() accepts the same recovery-validation summaries for CSV/HTML appendix handoff, and export_mfrm_bundle(summary_tables = ...) can co-locate those tables with a fit-based export bundle.
- export_summary_appendix() and export_mfrm_bundle(summary_tables = ...) now accept person-fit summary objects consistently with build_summary_table_bundle().
- precision_review_report() can now be sent through build_summary_table_bundle() and appendix/export helpers. Its fit_separation_basis table remains a precision-review surface so fit, ZSTD, separation/reliability/strata, and QC thresholds are not mistaken for release or recovery success gates.
- fit_measures_table() and facets_fit_review() summaries can now be sent through the same appendix/export helpers. Their df/ZSTD sensitivity and optional external FACETS matching tables stay separate from MnSq fit status and top-line validation decisions.
- reporting_checklist() now includes a Global Fit row for the fit/separation reporting boundary, pointing users to precision_review_report(), fit_measures_table(), and facets_fit_review() before drafting fit, ZSTD, separation, or reliability claims.
- New observed-data resampling helpers, build_mfrm_resampling_spec() and draw_mfrm_resamples(), create person-clustered stratified subsample or bootstrap inputs with manifest tables for stratum representation and rater/facet coverage. These helpers are explicitly framed as stability or reproducibility inputs against a full-data reference, not true-parameter recovery evidence.
- build_mfrm_sim_spec() and simulate_mfrm_data() now support assignment = "sparse_linked" for planned-missing sparse rating designs. The generated data retain sparse-design metadata for design density, planned missingness, rater coverage, and rater-pair common-person links.
- evaluate_mfrm_design() and evaluate_mfrm_recovery() now carry sparse linked generators directly, including run-level and summary columns for planned missingness and rater-pair linking diagnostics.
- build_summary_table_bundle() now separates those sparse linked diagnostics into appendix-ready sparse_design tables for design-evaluation and recovery-simulation outputs, so planned missingness and rater linkage are not buried inside performance or recovery metrics. The table also labels zero common-person rater pairs and requested-link target shortfalls as design-review issues, not recovery failures.
- summary(evaluate_mfrm_design(...)), summary(evaluate_mfrm_recovery(...)), and build_summary_table_bundle() now include a compact sparse_review table when sparse linked designs are active. plot.mfrm_design_evaluation() also accepts sparse-design metrics such as plannedmissingrate, mincommonpersons, zerocommonpairs, and pairsshorttarget.
- New build_mfrm_network_review() synthesizes the existing design-network analysis into a reportable review surface: connectedness, articulation points, bridge edges, facet-level vulnerability, optional sparse-linking diagnostics, and a reporting map that keeps network evidence separate from MFRM estimates, fit, separation, and recovery gates.
- New build_peer_review_sim_spec() creates peer-review / peer-assessment simulation specifications where submissions and reviewers share the same ID universe, self-review can be structurally excluded, and common-link anchor submissions can be assigned many reviewers. Generated data carry peer-review design metadata for assignment density, reviewer load, reciprocal review pairs, and common submissions per reviewer pair. build_peer_review_design_review() turns the same metadata into appendix-ready assignment diagnostics, and build_mfrm_network_review() can include the metadata alongside graph connectedness review.
- README, help pages, and vignettes now show the recommended reading order for recovery review: summary(recovery_review), condition notes/review, diagnostic notes/review, status plot, metric plot, then row-level recovery rows that need follow-up. Recovery assessment and validation summaries now expose this as a reading_order table so users can find the next table to inspect without opening plot data.
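The small-df ZSTD boundary noted in the detailed changes can be illustrated with the textbook Wilson-Hilferty cube-root standardization of a mean-square. The sketch below is a plain base-R illustration under stated assumptions: the function name zstd_wh() is hypothetical, the df values are made up, and the changelog does not show the package's own implementation, so this should not be read as mfrmr's code.

```r
# Textbook Wilson-Hilferty standardization of a mean-square fit statistic.
# `mnsq` and `df` are numeric vectors of equal length.
# Values with df < 1 are returned as NA, mirroring the availability
# boundary described in the notes above (an illustrative choice here).
zstd_wh <- function(mnsq, df) {
  stopifnot(length(mnsq) == length(df))
  out <- rep(NA_real_, length(mnsq))
  ok <- is.finite(mnsq) & is.finite(df) & df >= 1
  q <- sqrt(2 / df[ok])  # model SD of a mean-square with df degrees of freedom
  out[ok] <- (mnsq[ok]^(1 / 3) - 1) * (3 / q) + q / 3
  out
}

# Third case has df below 1, so its standardized value is withheld.
z <- zstd_wh(mnsq = c(1.0, 1.4, 0.7), df = c(50, 50, 0.6))
```

Returning NA rather than a number for the sparse cell keeps an unavailable standardization from being read as a fit result, which is the distinction the notes draw between availability differences and fit differences.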
mfrmr 0.2.0
CRAN release: 2026-05-16
Documentation accuracy pass plus research-grounded visualization and GPCM bias-inference refinements. Documentation, citations, and band attributions are corrected against primary sources, with mathematical screening-SE corrections, Snijders-corrected person-fit reporting where the assumptions are met, and clearer plot data for review.
This release keeps the 0.1.6 defaults, but it is not only an infrastructure polish release. Public review helpers have been consolidated on the *_review* names documented below, and the former *_audit* public spellings, S3 compatibility classes, and duplicate top-level fields have been removed as a deliberate breaking cleanup.
Release overview
For most users, the main changes in 0.2.0 are:
- More defensible mathematics and inference: RSM/PCM/GPCM reductions, GPCM slope handling, fit df/ZSTD conventions, Snijders-corrected person fit, information curves, recovery simulations, and score-support edge cases are now covered by explicit regression tests.
- Clearer FACETS relationship: mfrmr is positioned as a package-native MFRM workflow with FACETS-style tables, review helpers, and migration routes, not as a promise to numerically reproduce every FACETS estimate.
- Better user-facing diagnostics: fit-measure tables, data-quality reports, category curves, residual-PCA follow-up, person-fit summaries, recovery checks, and reporting bundles now expose structured tables before asking users to interpret plots or console output.
- R-first visualization access: plot helpers increasingly return reusable draw = FALSE data, long-form plot tables, annotations, and style metadata so users can rebuild figures in ggplot2, plotly, Quarto, or other reporting workflows.
- Quieter, more reproducible workflows: routine preparation and rating-range messages are stored in fit/data objects and are opt-in at the console; long-running design evaluation shows progress only in interactive runs by default.
Breaking changes in 0.2.0 are intentional and concentrated around public naming clarity: former exported *_audit* helper names and their compatibility S3 classes were removed in favour of the canonical *_review* surface. Model defaults from 0.1.6 are retained.
The detailed notes below are organized as follows:
- user pathways, output contracts, visualization, and naming changes;
- mathematical/statistical corrections and regression-test coverage;
- recovery simulation and validation workflow;
- citation/documentation corrections;
- smaller feature additions, bug fixes, build hygiene, and deferred work.
User pathways, output contracts, and terminology
-
FACETS positioning guide: new
facets_positioning_guide()makes the package boundary explicit for reports and migration notes.mfrmris not presented as a FACETS numerical clone: estimates remain package-native unless external FACETS output is supplied for comparison, while FACETS-style wrappers, coverage tables, and output files serve transition, handoff, and report-organization purposes. -
Report-ready FACETS relationship wording:
reporting_checklist()andsummary(reporting_checklist(...))now carry afacets_positioningtable so Quarto, appendix, and handoff workflows can quote the same boundary language used byfacets_positioning_guide().build_summary_table_bundle()includes this table asfacets_relationship_wording. -
FACETS output-contract review naming:
facets_output_contract_review()is now the sole public helper for FACETS-style output-contract review. The returned bundle class ismfrm_facets_contract_review; the helper checks package output columns and derived metric consistency against the FACETS-style contract, and does not claim numerical FACETS equivalence. Public result components usecolumn_reviewandmetric_checks, not older bookkeeping labels. -
FACETS pathway for bias and Wright maps:
mfrmr_output_guide("facets")now explicitly routes FACETS users to Table 14-style bias/interactions viaestimate_bias(),bias_interaction_report(),bias_pairwise_report(), andplot_bias_interaction(), and to variable-map review viaplot(fit, type = "wright"),plot_wright_unified(), andplot_data(type = "wright"). -
FACETS pathway for anchors and category outputs:
mfrmr_output_guide("facets")now also exposes direct/group anchor routes throughreview_mfrm_anchors(),make_anchor_table(), andfit_mfrm(anchors = ..., group_anchors = ...), drift/linking review throughdetect_anchor_drift()andplot_anchor_drift(), and FACETS-style category and fair-average routes throughrating_scale_table(),category_structure_report(),category_curves_report(), andfair_average_table(). -
Standalone residual and subset file writers: new
write_mfrm_residual_file()writes observation-level observed, expected, residual, standardized residual, score-information, and optional category probability columns to CSV/TSV. Newwrite_mfrm_subset_file()writes connected-subset summaries plus node-membership files for external linking review, without forcing users through the legacy graph/score output bundle. -
Category-specific information curves:
category_curves_report()now includes acategory_informationtable andplot(..., type = "category_information"). Category contributions usea^2 P_k(theta) (k - E[X | theta])^2and sum to the total curve information at each theta value; for PCM/RSM this reduces to the unit-slope form. -
Cumulative probability curves:
category_curves_report()now also includescumulative_probabilitiesandcumulative_boundaries, withplot(..., type = "cumulative")for the FACETS/Winsteps-style accumulated category-probability view. BothP(X <= k)and flippedP(X >= k)directions are returned; boundary rows report approximate theta values whereP(X <= k) = .5, with crossing status columns so out-of-range or multiple-crossing boundaries are not over-interpreted. -
Category-curve overview plot:
plot(category_curves_report(...))now defaults to an overview panel that shows category probabilities, cumulative probabilities, total information, and category-specific information together. Existing focused views remain available throughtype = "ogive","ccc","cumulative","information", and"category_information". The plot now also supportspreset = "monochrome"with line-type separation and explicit cumulative.5boundary-line control throughboundary_status = "in_range","all", or"none".plot_data(..., component = "plot_long")returns a ggplot2/plotly-friendly long table spanning ogive, category-probability, cumulative-probability, total-information, and category-specific-information series, withcurve_stylecarrying the resolved color/line-type mapping. -
Quieter rating-range provenance:
fit_mfrm()no longer emits an informational message for routine observed-score range inference by default. The same provenance is retained infit$prep,summary(fit)$settings_overview, anddescribe_mfrm_data()so users can still tell whetherrating_min/rating_maxwere declared or inferred. Setoptions(mfrmr.show_inferred_rating_range = TRUE)to restore the one-time message during interactive checks. -
Structured data-preparation notes:
prepare_mfrm_data() now stores row retention and preparation notes in fit$prep$row_retention and fit$prep$preparation_notes. Row drops, whitespace trimming, duplicate person-by-facet cells, and single-level facets are therefore available to summary(fit) and summary(describe_mfrm_data(...)) instead of existing only as transient console messages. Routine row-drop, trim, and single-level-facet messages are quiet by default; set options(mfrmr.show_preparation_messages = TRUE) to show them during interactive checks. -
Fit-annotated pathway plot data:
plot(fit, type = "pathway", draw = FALSE) now returns R-friendly pathway_long and pathway_annotations tables alongside fit_measures, fit_status, curve_fit_status, and fit_measure_status. This lets users rebuild FACETS-style pathway maps in ggplot2, plotly, or Quarto while retaining the same underfit/overfit labels used by fit_measures_table(). -
R-first plot-data contracts for bias and information plots:
plot_bias_interaction(..., draw = FALSE) now exposes plot_long, plot_annotations, flag_summary, and plot_settings across scatter, ranked, heatmap, and facet-profile views. compute_information() now stores conditional_sem, information_long, and a precision/SEM summary, and plot_information(type = "sem") / "csem" are supported aliases for the conditional standard-error-of-measurement curve. -
Category-probability plot aliases and annotations:
plot(category_curves_report(...), type = "category_probability") and type = "conditional_probability" now route to the same category-probability curves as type = "ccc", matching FACETS/Winsteps terminology without changing the underlying probability data. Draw-free plot data now include plot_annotations and curve_summary alongside plot_long, curve_style, boundary_lines, and plot_settings. -
FACETS feature coverage matrix: new
facets_feature_coverage() gives a public, release-scoped map from the FACETS 64-bit output index to current mfrmr routes, separating implemented, partial, not_implemented, and not_targeted surfaces. This makes unsupported FACETS-specific outputs such as Winsteps control-file export, raw FACETS report parsing, and arbitrary Web/Excel menu plots explicit rather than implicit. -
G-study / D-study planning route: new
mfrm_d_study() extends mfrm_generalizability() from observed variance-component review to planned design comparison. It reports projected G and Phi under alternative numbers of raters, criteria, or other random measurement facets, and exposes residual-scaling sensitivity assumptions so simplified G-study residuals are not silently over-interpreted. plot(mfrm_d_study(...), draw = FALSE) and plot_data() expose reusable coefficient/error-variance series for custom design-planning visuals. plot.mfrm_d_study() now supports line plots with group_var, ggplot2-like panel_by / panel_grid small multiples, and two-axis heatmap / contour views for rater-by-task design grids. An optional surface3d view is available for exploration, while heatmap/contour remain the recommended reporting displays. Plot data labels these coefficients as MetricFamily = "G-theory" so they are not confused with IRT or classical-test-theory reliability coefficients. -
Connectivity network visualization:
subset_connectivity_report() now includes reusable node/edge tables, and plot(..., type = "network") provides an igraph-based co-observation graph when igraph is installed. With draw = FALSE, the returned plot data supports custom R visualization without depending on the base plotting default. -
MFRM design-network analysis: new
mfrm_network_analysis() treats the person/facet-level observation design as an undirected weighted co-observation graph and returns graph-level connectedness, node degree and strength, betweenness/closeness, articulation points, bridge edges, and facet-level vulnerability summaries. These are explicitly framed as design linking diagnostics, not as person ability, rater quality, or model-fit statistics. plot(..., type = "centrality"), plot(..., type = "facet_summary"), and plot(..., type = "network") provide immediate visual checks with draw-free plot data. -
Rater-effect network analysis: new
rater_network_analysis() adds a Lamprianou-style pairwise rater network route separate from design connectedness. It supports agreement, disagreement, and directed severity-direction networks, returns rater-level in/out strength, betweenness/closeness, a finite network severity index, retained edge tables, and all pairwise metrics used before thresholding. The help page states that these indices are descriptive network diagnostics rather than Rasch logit estimates or formal fit statistics. plot(..., type = "network"), "severity", "centrality", and "matrix" provide immediate visual checks and reusable plot data. -
Halo-effect network screening: new
rater_halo_network_analysis() reshapes observed ratings into rater-by-criterion nodes, computes Spearman/Pearson/Kendall node-pair correlations, labels same-rater cross-criterion edges as halo, and contrasts halo-edge weights with non-halo edges. The default Bonferroni-adjusted edge filter follows the conservative network-screening strategy used in Lamprianou’s halo example. The returned bundle includes summary, node_metrics, edge_metrics, pair_metrics, halo_summary_by_rater, and caveats that the Welch halo/non-halo comparison is descriptive because network edges are dependent. halo_summary_by_rater now includes ReviewStatus and ReviewReason based on same-rater cross-criterion mean weight, incident non-halo comparison weight, and retained halo-edge count, with labels framed as screening priorities rather than causal halo diagnoses. plot(..., type = "edge_distribution"), "halo_summary", "matrix", and "network" provide immediate visual review and draw-free plot data. -
Fit-measures review table: new
fit_measures_table() gives a direct FACETS-style fit-measure view for raters, criteria, or other facet elements. It returns both R-friendly columns and a facets_table with labels such as Infit MnSq, Outfit ZStd, Fit Status, and Review Reason; underfit, overfit, and mixed subsets are included for immediate review. -
Fit-threshold sensitivity summaries:
fit_measures_table() now returns profile_summary_by_facet and profile_summary_overall, reporting underfit, overfit, mixed, and any-flag rates under multiple literature-based MnSq bands from Linacre, Bond & Fox, and Wright & Linacre. The main table still uses the active review band, while threshold_profiles controls whether literature, active, all, or no profile summaries are returned. -
Fit-measure CI display:
fit_measures_table(ci_level = ...) now adds approximate measure confidence intervals to both the R-friendly table and facets_table. plot(fit_measures, type = "measure_ci", ci_level = ..., preset = "monochrome") draws an interval plot with the requested confidence level. -
FACETS df/ZSTD guide: new
facets_fit_df_guide() documents the engine-vs-FACETS-style degrees-of-freedom distinction and the MnSq-to-ZSTD transformation workflow. fit_measures_table(fit_df_method = "both") now exposes primary df plus FACETS-style companion df/ZSTD columns, and adds df_sensitivity, df_sensitive, and df_sensitivity_summary so users can identify rows whose |ZSTD| flag status or interpretation is convention-sensitive. The df-sensitivity screen exposes explicit df_zstd_tolerance, df_zstd_large_shift, and df_ratio_tolerance settings so FACETS-style reviews can be reproduced under stricter or more permissive rules. plot(fit_measures, type = "df_sensitivity") visualizes the largest engine-vs-FACETS-style ZSTD shifts. facets_fit_review() now uses the same row-level df-sensitivity engine and returns df_sensitivity, df_sensitive, and df_sensitivity_summary; the former internal_comparison component has been removed to keep the public output vocabulary consistent. External FACETS ZSTD tolerance is now named external_zstd_tolerance, separating external-table comparison from engine-vs-FACETS-style df sensitivity. The same plot(..., type = "df_sensitivity") route is available for facets_fit_review() bundles. -
Data quality score-support review:
data_quality_report() now keeps the full fitted score support in category_counts, adds score_support_review, and surfaces caveats for zero-frequency categories. Intermediate gaps such as a declared 1-5 scale with observed 1, 2, 4, 5 are flagged either as retained zero-count categories (keep_original = TRUE) or as original-label gaps hidden by internal recoding (keep_original = FALSE). plot(data_quality_report(...), type = "score_support") highlights those categories, with preset = "monochrome" available. -
Facet-level category-usage QC:
data_quality_report() now adds category_usage_by_facet and category_usage_summary, covering every fitted facet level crossed with the retained score support. This flags local zero and sparse category use, such as a rater who never uses the middle category even when the category appears elsewhere in the data. plot(data_quality_report(...), type = "facet_category_usage") provides a quick view of affected facet levels. -
Data quality dashboard:
plot(data_quality_report(...), type = "dashboard") now combines row review, score-support category use, facet-level category-use issues, and missing/invalid row counts in one base-R view. With draw = FALSE, the returned plot data contains all four panel tables for report handoff. -
Data quality flags:
data_quality_report() now includes quality_flags, a prioritized QC table that summarizes row exclusions, unknown design levels, score-support gaps, facet-level category-use cautions, and restricted facet response patterns with counts and next actions. summary(data_quality_report(...)) previews this table when any priority flag is present. -
Facet response-pattern QC:
data_quality_report() now adds facet_response_patterns, which flags facet levels that use only one score category, assign one category to at least the configured dominant-category cutoff, or use only boundary categories. This catches cases such as a rater assigning score 1 to all responses on a 1-5 scale. -
Data quality overview and score map plots:
data_quality_report() now adds quality_overview, a compact area-level status table for rows, score support, facet-level category use, facet response patterns, and design matching. plot(..., type = "quality_flags") summarizes priority QC flags by area, plot(..., type = "facet_response_patterns") shows dominant local category use by facet level, and plot(..., type = "score_map") shows original-to-internal score mappings when labels have been recoded. -
User-pathway and plot-data access:
mfrmr_output_guide("facets"), mfrmr_output_guide("conquest"), and mfrmr_output_guide("r") now give focused starting points for users arriving from FACETS, ConQuest, or R-first visualization workflows. New plot_data() extracts the full reusable plot-data list, or one named component, from any mfrm_plot_data object or mfrmr plot helper that supports draw = FALSE. New plot_data_components() lists each reusable plot-data component, its shape, role, accessor call, and custom-graphics notes so users can discover plot_long, annotation, settings, style, and review tables without reading the underlying list structure. -
Monochrome plot preset: plot helpers that use the package visual preset system now accept
preset = "monochrome". Color remains the default ("standard"), while monochrome supports print-oriented figures and color-independent review. -
Interval-aware visualization guide: new
mfrmr_interval_guide() maps public 95% CI / uncertainty routes across fit-measure tables, Wright maps, fair averages, bias screens, displacement, DFF/DIF summaries, anchor drift, rater severity profiles, rater trajectories, manuscript Figure 1 composites, shrinkage, and ICC review. The guide records the interval basis and interpretation boundary so CI displays remain precision or screening evidence rather than automatic fit, fairness, or validity decisions. -
Searchable help concepts: high-level route guides and CI-capable plot / table helpers now carry Rd concept tags such as
confidence intervals, visual diagnostics, reporting workflow, route selection, and GPCM boundaries. This makes help.search() useful for finding the right help page before users know the function name. -
Shrinkage CI plot data:
plot_shrinkage_funnel(show_ci = TRUE, ci_level = ...) now draws approximate raw and shrunken estimate whiskers and returns RawCI_Lower, RawCI_Upper, ShrunkCI_Lower, ShrunkCI_Upper, and CI_Level for downstream graphics. plot(fit, type = "shrinkage") now uses the requested ci_level for CI whiskers instead of a fixed 95% level. -
Shrinkage figure guidance:
visual_reporting_template() and the visual diagnostics vignette now include an empirical-Bayes shrinkage funnel route, caption skeleton, beginner check, and interpretation guardrails so users can report shrinkage movement without treating it as automatic rater-quality, bias, or validity evidence. -
Response-time diagnostic layer: new
response_time_review() and plot_response_time_review() summarize response-time metadata outside the fitted MFRM likelihood. The review returns rapid/slow thresholds, event-level flags, person/facet/score summaries, and grouped plot data so timing patterns can be reviewed as descriptive QC rather than joint speed-accuracy parameters or automatic exclusion rules. mfrmr_output_guide("response_time") and the R-user pathway now expose this route alongside other review and reusable plot-data helpers. mfrm_results(include = "response_time", response_time = ..., response_time_data = ...) can now carry the same descriptive review into the first-screen result object, summary(res)$next_actions, plot(res, type = "response_time"), the local viewer payload, and export_mfrm_results() table exports without changing fitted MFRM estimates. -
Bounded GPCM boundary text: README, vignettes, help pages, and unsupported-path messages now state the current
GPCM scope consistently: direct data generation, parameter recovery, fair averages, bias screening, summary-table bundles, appendix export, caveated APA/QC/export bundles, and exploratory linking review are available within documented caveats. Role-based design evaluation and population forecasting are also available as caveated bounded-GPCM sensitivity evidence when the requested design preserves the simulation specification’s slope structure. Role-based diagnostic and signal-detection design screening is available as caveated slope-aware operating-characteristic evidence with gpcm_boundary output. Full FACETS-style score-side contract review or score-side equivalence, posterior predictive checks, and heavy backend routes remain outside the validated route. The validation artifacts now include a bounded-GPCM roadmap so those supported_with_caveat, blocked, and deferred rows are tracked as explicit release-scope decisions instead of implicit support gaps. -
GPCM score-side boundary checks: blocked bounded-
GPCM score-side helpers now stop before producing partial outputs. The facets_output_file_bundle() and facets_output_contract_review() help pages also state that graph and package-native scorefile output are caveated bounded-GPCM routes while full FACETS output-contract review remains outside the bounded-GPCM boundary. gpcm_score_side_contract() records the estimand, uncertainty, reduction-test, schema, guard, and release-wording requirements that separate caveated scorefile output from full FACETS-style score-side review. -
GPCM route guidance:
gpcm_capability_matrix() now includes RecommendedRoute and NextValidationStep columns, so each supported, caveated, blocked, or deferred helper family states both the current substitute workflow and the validation evidence needed before the boundary can move. -
GPCM out-of-scope route guidance: blocked and deferred bounded-
GPCM routes now report the matching capability-matrix row, recommended substitute route, and next validation step before returning an error. The errors carry class mfrmr_gpcm_scope_error with helper, area, status, recommended-route, and next-validation-step fields for programmatic handling. -
GPCM route-boundary coverage: the capability-matrix tests and validation review now verify that every blocked or deferred bounded-
GPCM row is represented by a structured stop condition or an explicit future-scope entry, and that those messages stay synchronized with the matrix’s area, status, route, and next-validation-step fields. gpcm_runtime_guard_coverage() is exported as the public table for this route-boundary check. mfrmr_output_guide("gpcm") now points users to both gpcm_capability_matrix() and this coverage table. -
GPCM release-readiness alignment:
inst/validation/release-readiness.R now checks that blocked and deferred gpcm_capability_matrix() rows have non-empty route guidance, are represented in the installed bounded-GPCM scope notes, and are covered by future-scope rows in the release-evidence checklist. -
User-facing wording: public visualization and reporting documentation now uses “plot data”, “surface data”, or “data handoff” instead of implementation terminology where possible, while retaining actual field names such as
plot_payloads only where users need to access them. -
RSM/PCM wording review: package-level, fitting, information, bias, and export docs now distinguish the equal-weighting
RSM / PCM reference route from the broader ordered-response model surface. Stale “Rasch-only” and “legacy-compatible” labels were narrowed where they described helpers that now also serve bounded GPCM or direct summary-table workflows. -
Model-choice user guide: README and the GPCM/MML vignettes now give user-facing guidance for choosing
RSM, PCM, or bounded GPCM, including report wording templates and a warning that better GPCM fit is sensitivity evidence rather than an automatic operational-scoring decision. A documentation terminology regression test now guards against reintroducing the stale Rasch-only phrasing removed in this pass. -
Fit-level model-choice review: new
build_model_choice_review() bundles compare_mfrm(), model-role guidance, downstream route availability, report wording templates, the bounded-GPCM support matrix, and an optional build_weighting_review() run so users can review RSM / PCM / bounded-GPCM candidates from the actual fitted objects. -
Review-name migration completed as a breaking cleanup:
review_mfrm_anchors(), precision_review_report(), review_conquest_overlap(), facet_small_sample_review(), and build_weighting_review() are now the only exported review-name implementations. The former public *_audit* function spellings, their S3 compatibility classes, and old public component names such as orientation_audit, nesting_audit, hierarchical_audit, and shrinkage_audit have been removed or renamed rather than shown as user-facing migration artifacts. -
Compatibility registry narrowed:
compatibility_alias_table() now lists only retained compatibility names that remain part of the public package surface, such as mfrmRFacets, analyze_dif, JMLE, and long-standing output column aliases. It no longer advertises removed review-name migration artifacts. -
Public output review wording: linking-review tables and plot routes now use
anchor_review labels in user-facing source metadata, model-choice raw objects and summaries expose only weighting_review_status / weighting_review, and diagnostics summary-table bundles export precision_review without the old duplicate table. Bias reports now expose orientation_review; model-comparison output uses nesting_review; and reproducibility manifests use hierarchical_review and shrinkage_review. DFF subgroup refit rows now record anchor-review notes in LinkingReview. -
Review component accessors: new
anchor_review() and precision_review() helpers provide a stable route to package-native review components. They intentionally read only canonical *_review fields. -
Reference review naming:
reference_case_review() is now the canonical package-native report-completeness helper, and reporting-checklist / cheatsheet wording now uses review labels for hierarchical and complete-case follow-up items. -
Review-wording guardrail: current public guides and generated help now avoid exposing
audit as a user-facing package concept. Data-quality row-status output uses row_review; prediction provenance uses row_review / population_review; bias, model-comparison, and manifest components use *_review names; and ordinary user-facing guidance uses review/check/traceability terminology. -
Output helper guide: new
mfrmr_output_guide() gives users a compact purpose-to-helper map for choosing among *_table, *_report, *_review, *_bundle, export_*, and compatibility routes. The compatibility guide now also states that old *_audit helper and component names are not part of the 0.2.0 public surface.
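The cumulative-probability boundaries described in the category-curve entries above can be illustrated with a small base-R sketch. It is not the package implementation: category_probs(), prob_le(), and cumulative_boundary() are hypothetical helper names, the step values are made up, and the status labels only mimic the idea of a crossing-status column. It computes P(X <= k) for a partial-credit step structure and locates the theta value where each cumulative curve crosses .5.

```r
# Category probabilities for one step structure at ability theta.
# steps: step/threshold values (illustrative); a: slope (a = 1 is the PCM case).
category_probs <- function(theta, steps, a = 1) {
  log_num <- c(0, cumsum(a * (theta - steps)))
  num <- exp(log_num - max(log_num))  # subtract the max for numerical stability
  num / sum(num)
}

# P(X <= k) at theta, with categories scored 0, 1, ..., m.
prob_le <- function(theta, k, steps, a = 1) {
  sum(category_probs(theta, steps, a)[seq_len(k + 1)])
}

# Theta where P(X <= k) crosses .5 inside `range`, with a simple status label.
# Assumes at most one crossing, which holds for this monotone kernel.
cumulative_boundary <- function(k, steps, a = 1, range = c(-6, 6)) {
  f <- function(theta) prob_le(theta, k, steps, a) - 0.5
  if (f(range[1]) * f(range[2]) > 0) {
    return(data.frame(k = k, theta = NA_real_, status = "out_of_range"))
  }
  root <- uniroot(f, interval = range, tol = 1e-10)$root
  data.frame(k = k, theta = root, status = "in_range")
}

steps <- c(-1.2, 0.1, 1.1)  # hypothetical step values
boundaries <- do.call(
  rbind,
  lapply(0:(length(steps) - 1), cumulative_boundary, steps = steps)
)

# Both directions at a single theta: P(X <= k) and the flipped P(X >= k).
p <- category_probs(0.5, steps)
cum_le <- cumsum(p)
cum_ge <- rev(cumsum(rev(p)))

print(boundaries)
```

Narrowing range (for example range = c(2, 6)) shows the out-of-range branch: the boundary is reported as missing rather than extrapolated.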
Mathematical and inferential corrections
-
Identified step/threshold parameterization:
RSM, PCM, and bounded GPCM now optimize step/threshold profiles with the correct sum-to-zero degrees of freedom (steps - 1 per profile). Earlier pre-release implementations centered the step vector after optimization but still left the centered-away null direction in the optimizer, AIC/BIC parameter count, and Hessian. The point-estimate scale is unchanged; the likelihood parameter count and observed-information basis are now aligned with the stated identification constraint. -
MML joint covariance layer for structural parameters:
diagnose_mfrm() now reuses one observed-information covariance for non-person facet SEs and exposes the same covariance basis for step/threshold and bounded-GPCM slope uncertainty in diagnostics$parameter_uncertainty. Step rows get SE, normal CI_Lower / CI_Upper, and covariance status metadata. GPCM slope rows get log-slope SEs plus positive-scale delta-method SEs and log-normal confidence limits. fit_mfrm(..., attach_diagnostics = TRUE) attaches those structural SE columns to fit$steps and fit$slopes when the MML Hessian is available. -
Measure-level CI contract:
diagnose_mfrm()$measures now records CI_Level = 0.95 and CI_Method = "Normal approximation" alongside CI_Lower / CI_Upper. The interval calculation uses qnorm(0.975) rather than a rounded multiplier, while row-level CIEligible, CIBasis, and CIUse continue to distinguish primary reporting intervals from review or screening approximations. -
Weighted BIC transparency:
compare_mfrm() now reports WeightedN, ICSampleSize, and ICSampleSizeBasis in its comparison table. This makes the BIC penalty basis explicit: ordinary fits use row count, while weighted fits use the sum of weights already used by the fitted model summary. - Residual-PCA boundary handling: exploratory residual-PCA helpers now capture non-fatal PCA-engine warnings inside the returned PCA bundle instead of emitting them as loose warnings during diagnostics or plotting. Degenerate residual-correlation conditions therefore remain visible for review without looking like confirmatory test failures.
-
Residual-PCA parallel analysis:
analyze_residual_pca() now supports parallel = TRUE for residual-permutation parallel analysis. The null comparison permutes standardized residuals within residual columns, preserving column distributions and missingness while breaking residual association. PCA tables gain ParallelMean, ParallelCutoff, ExcessOverParallelCutoff, and ExceedsParallelCutoff, and parallel_status records availability and successful permutation counts. plot_residual_pca() adds parallel_scree and parallel_excess views. This is reported as exploratory follow-up evidence for dimensionality review, not as a standalone proof of unidimensionality or multidimensionality. -
GPCM fair-average structural SEs:
fair_average_table(fair_se = TRUE) now adds opt-in structural delta-method SE and CI columns for bounded GPCM fair averages when the MML observed-information covariance is available. The original SE / Model S.E. / Real S.E. columns keep their measure-SE meaning; fair-average uncertainty is exposed in distinct columns such as Fair(M) S.E., AdjustedAverageSE, and AdjustedAverageCI_Lower / AdjustedAverageCI_Upper. Person rows remain unavailable because MML person EAP estimates are conditioned on rather than included in the structural Hessian. summary(fair_average_table(...)) and its print method now surface whether fair-average SEs were requested, how many rows are available, and the resulting status mix. plot_fair_average(show_ci = TRUE) uses these columns automatically for bounded-GPCM fit objects. -
GPCM expected-score consistency: the internal
expected_score_table() route now uses the same response-probability bundle as diagnostics and category-count calculations, so bounded GPCM expected scores respect the fitted slope parameters instead of falling through to the PCM kernel. Fair-average documentation now also states that non-slope-facet rows use an identification-based reporting convention, not a FACETS score-side equivalence claim. -
GPCM invalid-slope guard: the low-level GPCM expected-score helper no longer treats non-finite, zero, or negative slopes as slope = 1. It returns unavailable expected scores instead, so a malformed bounded-
GPCM object cannot silently become a PCM-style calculation in fair-average internals. The internal iteration-state replay helper also now has an explicit GPCM probability-kernel branch, avoiding a latent PCM fallback if that route is later moved inside the supported GPCM boundary. -
RSM-to-PCM reduction checks: the test suite now pins the
RSM special case as common-threshold PCM. Under identical common thresholds, RSM and PCM must agree for category probabilities, unweighted and weighted log-likelihoods, response-bundle diagnostics, generated simulation data, and reconstructed simulation probabilities. The public compare_mfrm(..., nested = TRUE) path now also has a regression check that the reported LRT degrees of freedom equal the identified RSM-to-PCM step-structure difference. -
Boundary-safe LRT reporting:
compare_mfrm(..., nested = TRUE) now records comparison_basis$lrt_status and comparison_basis$lrt_reason. Non-finite log-likelihoods, equal parameter counts, unsupported nesting, or negative likelihood-ratio statistics no longer fail silently or imply a model-choice conclusion; the LRT is withheld and the print/summary path states why it was not reported. -
GPCM slope-scale consistency: GPCM simulation specifications now treat supplied slopes as relative discriminations and normalize them to the same geometric-mean-one log-slope identification used by
fit_mfrm(). Recovery summaries compare identified log slopes without an additional mean-alignment step, so absolute slope-scale bias is not hidden by the recovery table. -
PCM reduction check for GPCM simulation: the simulation tests now pin the special case in which bounded
GPCM has unit slopes. With the same simulation specification, seed, and step-facet thresholds, the generated visible data and reconstructed category-probability matrix must match PCM to numerical tolerance. This guards the intended mathematical reduction without implying that a freely estimated GPCM fit should equal a PCM fit. -
PCM reduction check for downstream diagnostics: the unit-slope
GPCM reduction is now also tested at the response-probability bundle layer. The bundle used by expected scores, variance-based fit diagnostics, and information calculations must match the PCM bundle for probabilities, expected category scores, score variances, fourth central moments, and score information. A companion sensitivity check verifies that non-unit slopes move the same quantities away from the PCM values, so the GPCM path is neither a hidden PCM fallback nor an unconstrained divergence. -
Mathematical consistency regression tests: the test suite now pins probability, expectation, variance, fourth-moment, information, and conditional-SEM identities across the low-level bounded-
GPCM response bundle, RSM / PCM / bounded-GPCM category-curve reports, draw-free CCC/pathway plot data, and compute_information(). The checks also require facet-level information-contribution curves to aggregate back to the total information curve. This guards user-visible visualization and reporting tables against drifting away from the probability kernels. -
Fit-measure consistency regression tests: the test suite now verifies that
fit_measures_table() preserves the documented df/ZSTD formulas, confidence-interval formulas, active fit-status labels, threshold-profile counts and rates, df-sensitivity status taxonomy, and draw-free measure_ci / df_sensitivity plot data. This pins the FACETS-style reporting surface to the same row-level calculations users see in the returned tables. -
Data-quality consistency regression tests: the test suite now verifies that
data_quality_report() summary counts are recomputable from returned detail tables, that quality_flags and quality_overview summarize the same QC evidence, and that draw-free quality_flags, facet_category_usage, facet_response_patterns, and score_map plot data preserve score-support gaps, facet-level category-use issues, restricted rater response patterns, and original-label gaps hidden by score recoding. -
GPCM bias SE (
estimate_bias()): the conditional plug-in SE for the additive bias shift now uses the correct GPCM information \(\sum_i a_i^2 \mathrm{Var}(X_i)\). The previous pre-release implementation optimized the point estimate with the slope-aware GPCM kernel but used the PCM information \(\sum_i \mathrm{Var}(X_i)\) for the S.E. / t / Prob. columns. The review label remains "screening". -
FACETS-style fit ZSTD df layer:
diagnose_mfrm() now accepts fit_df_method = "engine", "facets", or "both". The default keeps the existing package-native df convention (DF_Infit = sum(Var * Weight), DF_Outfit = sum(Weight)). The FACETS path adds the Wright-Masters/FACETS fourth-moment df approximation (df = 2 / q^2) and caps FACETS-style ZSTD values at +/-9. Use fit_df_method = "both" when comparing mfrmr fit flags with FACETS output: it preserves the engine ZSTD columns and adds DF_Infit_FACETS, DF_Outfit_FACETS, InfitZSTD_FACETS, and OutfitZSTD_FACETS. -
FACETS fit review helper: new
facets_fit_review() separates engine-level fit-standardization differences from optional external FACETS table comparisons. The engine-vs-FACETS review compares engine and FACETS-style df/ZSTD values row-by-row and flags cases where the df convention changes the usual |ZSTD| >= 2 screen. When a FACETS-like table is supplied, the external review matches rows by Facet / Level (or person labels for person-only tables) and classifies differences as same, rounding, df_or_whexact_difference, mnsq_or_measure_difference, or needs_review. This makes FACETS comparisons reproducible without treating mfrmr’s package-native df convention as an error. -
FACETS fit table import: new
read_facets_fit_table() / import_facets_fit_table() reads existing FACETS output into the Facet / Level / Infit / Outfit / ZSTD / df schema expected by facets_fit_review(). It supports already harmonized CSV/TSV-style tables, partial FACETS extracts with ZSTD and TCount but no MnSq/df columns, and FACETS score.N.txt files, including fixed-field score files using the FACETS manual column positions, with an optional facet_map for assigning score-file numbers to user-facing facet names. facets_fit_review() now returns external_table_quality so users can see duplicate Facet x Level rows and whether MnSq, ZSTD, df, and count columns were available in the supplied external table. -
GPCM fair-average CI display:
plot_fair_average(show_ci = TRUE) no longer fabricates CIs from measure-level SEs for bounded GPCM fits. It now uses the opt-in structural fair-average SE columns when a fit object is supplied and records an unavailable-CI note for precomputed fair-average bundles that lack those columns. -
GPCM bias likelihood checks:
estimate_bias() now adds conditional profile-likelihood columns for bounded GPCM rows: LR ChiSq, LR d.f., LR Prob., Profile CI Lower, Profile CI Upper, Profile CI Level, and Profile CI Status. These compare the fitted additive bias shift with zero while holding theta, steps, slopes, and other facet estimates fixed. They strengthen the GPCM screening evidence without turning it into standalone confirmatory fairness inference. summary(estimate_bias(...)) now surfaces the profile-LR screen-positive count and carries these columns through the top-row review table when they are available.
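The df-convention sensitivity described in the FACETS-style ZSTD entries above can be sketched in base R. This is an illustration under stated assumptions, not the package code: mnsq_to_zstd() and q_from_df() are hypothetical names, the cube-root form is the standard Wilson-Hilferty standardization (whether mfrmr applies exactly this form on both paths is an assumption), and the two df values are invented stand-ins for the engine and fourth-moment conventions. Only df = 2 / q^2, the +/-9 cap on the FACETS-style path, and the |ZSTD| >= 2 screen come from the entries themselves.

```r
# Cube-root (Wilson-Hilferty) standardization of a mean-square, given q,
# the model standard deviation of that mean-square. `cap` limits |ZSTD|.
mnsq_to_zstd <- function(mnsq, q, cap = Inf) {
  z <- (mnsq^(1 / 3) - 1) * (3 / q) + q / 3
  pmax(pmin(z, cap), -cap)
}

# The changelog states df = 2 / q^2, so q = sqrt(2 / df).
q_from_df <- function(df) sqrt(2 / df)

# Hypothetical row: one MnSq value evaluated under two df conventions.
mnsq      <- 1.6
df_engine <- 40  # stand-in for the package-native df
df_facets <- 25  # stand-in for the fourth-moment df

z_engine <- mnsq_to_zstd(mnsq, q_from_df(df_engine))
z_facets <- mnsq_to_zstd(mnsq, q_from_df(df_facets), cap = 9)

review <- data.frame(
  convention = c("engine", "facets"),
  df         = c(df_engine, df_facets),
  zstd       = c(z_engine, z_facets),
  flagged    = abs(c(z_engine, z_facets)) >= 2
)

# The row is convention-sensitive when the two flags disagree.
review$df_sensitive <- length(unique(review$flagged)) > 1
print(review)
```

With these made-up inputs the same mean-square is flagged under one df convention and not the other, which is the situation the df-sensitivity columns are meant to surface.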
Research-grounded visualization refinements
- Category-curve information output: category_curves_report() now carries per-curve ScoreVariance, Slope, and Information columns. For GPCM, Information is computed as \(a^2 \,\mathrm{Var}(X \mid \theta)\), matching the Muraki/Samejima polytomous information identity; for RSM / PCM, this reduces to the usual score variance. plot(category_curves_report(fit), type = "information") returns the corresponding curve-level information plot data.
- Bias heatmap review data: plot_bias_interaction(plot = "heatmap", draw = FALSE) now returns heatmap_cells, heatmap_matrix, flag/count matrices, interpretation guidance, and reference notes. The display is documented as a FACETS Table 13-style screening follow-up, not confirmatory evidence.
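The information identity in the category-curve item can be checked by hand. The following is a minimal base-R sketch, not the package's implementation: the function names gpcm_probs() and gpcm_information() are hypothetical, and scores are coded 0..m as an assumption (shifting the score support changes neither the variance nor the information).

```r
# Category probabilities for one GPCM step structure.
# Assumption: scores are coded 0..m and delta holds the m step parameters.
gpcm_probs <- function(theta, a, delta) {
  eta <- c(0, cumsum(a * (theta - delta)))  # cumulative linear predictors
  w <- exp(eta - max(eta))                  # subtract max for numerical stability
  w / sum(w)
}

# Score variance and information at a given theta (hypothetical helper).
gpcm_information <- function(theta, a, delta) {
  p <- gpcm_probs(theta, a, delta)
  k <- seq_along(p) - 1
  expected <- sum(k * p)
  score_var <- sum((k - expected)^2 * p)
  c(ScoreVariance = score_var, Slope = a, Information = a^2 * score_var)
}

# With a = 1 (the RSM / PCM case) Information equals ScoreVariance.
gpcm_information(theta = 0.3, a = 1.0, delta = c(-1, 0, 1))
# With a != 1 the score variance is rescaled by a^2.
gpcm_information(theta = 0.3, a = 1.5, delta = c(-1, 0, 1))
```

The sketch only illustrates why the RSM / PCM case collapses to the score variance; the column names mirror those listed above.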
Recovery simulation workflow
-
evaluate_mfrm_recovery()adds a dedicated parameter-recovery simulation route. It repeatedly simulates from a known MFRM generating setup, refits the requested model, and returns row-level truth/estimate comparisons plus summaries by parameter type. Location-like parameters are mean-aligned within replication before reporting recoveryBias,RMSE,MAE, correlation, and 95% coverage where standard errors are available; bounded-GPCMslopes are compared on the identified log-slope scale after the generator and fitter have both imposed the geometric-mean-one slope convention. The output also carries ADEMP-style simulation-study metadata so recovery checks are separated from broader design-evaluation claims. -
plot(evaluate_mfrm_recovery(...))adds review plots for recovery summaries, coverage, row-level error distributions, truth-estimate scatter, and replication status.draw = FALSEreturns anmfrm_plot_dataobject with reusable plot tables and notes. -
assess_mfrm_recovery()adds a user-facing adequacy checklist for recovery simulations. It separates run completion, convergence, uncertainty availability, coverage, Monte Carlo precision, and optional practical RMSE/Bias thresholds intook/review/concernstyle statuses with next-action text.plot(assess_mfrm_recovery(...))now provides checklist status-count and parameter-metric review plots so users can see which part of the assessment needs attention before reading the full tables. Thedraw = FALSEplot data includereading_order,guidance, and user-facing handoff tables such assection_statusso follow-up starts with review/concern rows rather than raw row-level output. -
Simulation refit score support: simulation-based refit helpers now pass the generator’s declared
1:score_levelsscore support intofit_mfrm(). This keeps zero-count boundary categories in the fitted support during recovery, design-evaluation, diagnostic-screening, and bias-screening runs, and avoids repeated rating-range inference messages in release-validation logs. -
Compact step-threshold specifications:
build_mfrm_sim_spec()andsimulate_mfrm_data()now accept step-facet-specific thresholds as a named list or row-named numeric matrix, in addition to the existing longStepFacet/StepIndex/Estimatetable. -
Design-evaluation progress control:
evaluate_mfrm_design(progress = interactive())now shows the progress bar only in interactive sessions by default. Non-interactive tests, Quarto rendering, and batch scripts stay quiet unless users setprogress = TRUE; users can also setprogress = FALSEfor fully silent exploratory runs. -
Release recovery-validation protocol:
inst/validation/recovery-validation.Rprovides an optional long-run validation script for release review. It defines structured review steps, coreRSM/PCM/ bounded-GPCMrecovery cases, an extended latent-regression case, practical thresholds, and a summary writer that produces top-line release-decision, case-level release-decision, review-step, case-plan, case-summary, metric-summary, overall decision-table, domain decision-table, run-note, RDS, and Markdown outputs without adding heavy Monte Carlo runs to routine package tests. The release decision uses recovery metrics, convergence, and Monte Carlo precision as the primary evidence; the domain decision table separately reports uncertainty status so missing JML coverage columns are not mistaken for recovery failure. Printing the validation object or callingsummary(validation)now shows the release-level decision and case-level statuses before the full tables. -
Recovery reporting handoff:
build_summary_table_bundle()now acceptsevaluate_mfrm_recovery()andassess_mfrm_recovery()outputs directly, including ADEMP-style methods metadata, replication status, checklist rows, metric review rows, thresholds, notes, and appendix-preset roles. -
Recovery appendix export:
export_summary_appendix()now recognizes recovery simulation and recovery assessment objects as direct inputs. The workflow vignette shows the full sequence from simulation specification to recovery plots, adequacy assessment, summary-table bundle, and appendix export. -
Research-grounded release evidence map:
inst/validation/release-evidence-map-0.2.0.mdgives a source-based review plan for 0.2.0. It links the release checks to Andrich’sRSM, Masters’PCM, Muraki’sGPCMand information-function work, FACETS/Winsteps fit conventions, and ADEMP-style simulation-study reporting, then separates release-gate checks from future-scope items. The companionrelease-evidence-checklist-0.2.0.csvprovides a structured required / caveat / future-scope checklist for release review.external-parameter-recovery-simulation-0.2.0.mdsummarizes the separate common-data recovery and cross-engine agreement workflow and its limits without bundling the generated simulation datasets; the validation bundle also includes a sourceable helper for re-reading a local external-output directory when that workflow is refreshed. -
Release-readiness protocol:
inst/validation/release-readiness.Rturns the evidence map into a reproducible review object. It records eight review steps, parses anR CMD checklog, checks the 0.2.0 version contract, verifies the CI workflow contract for warning failures and retained check artifacts, scans public docs for disallowed removed-helper wording, confirms evidence artifacts, and reports a top-lineok/review/concerngate summary without adding exported user-facing API.
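The recovery metrics described for evaluate_mfrm_recovery() can be illustrated with a small base-R sketch. It is an approximation under stated assumptions, not the package's code: recovery_summary() is a hypothetical name, estimates are assumed to arrive as a replications-by-parameters matrix for one location-like parameter type, and mean alignment is applied within each replication before errors are computed.

```r
# Hypothetical sketch: per-parameter recovery metrics after mean alignment.
# truth:     numeric vector of generating values (length P)
# estimates: R x P matrix, one row per replication
# se:        optional scalar or R x P matrix of standard errors
recovery_summary <- function(truth, estimates, se = NULL, level = 0.95) {
  truth_c <- truth - mean(truth)              # centre the generating values
  est_c <- estimates - rowMeans(estimates)    # centre each replication (row)
  err <- sweep(est_c, 2, truth_c)             # estimate minus truth, per column
  out <- data.frame(
    Bias = colMeans(err),
    RMSE = sqrt(colMeans(err^2)),
    MAE  = colMeans(abs(err))
  )
  if (!is.null(se)) {
    z <- qnorm(1 - (1 - level) / 2)
    out$Coverage <- colMeans(abs(err) <= z * se)  # share of intervals covering truth
  }
  out
}

# Two replications that differ from the truth only by a location shift:
truth <- c(-1, 0, 1)
estimates <- rbind(truth + 2, truth - 3)
recovery_summary(truth, estimates, se = 0.1)
```

Because location shifts are not identified, the alignment step removes them before Bias and RMSE are computed, which is why both replications in the usage example recover the truth exactly.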
Citation and attribution corrections
-
Muraki DOI consistency:
DESCRIPTIONnow cites Muraki (1992) using the GPCM article DOI,10.1177/014662169201600206. The Muraki (1993) Applied Psychological Measurement reference in GPCM information help pages now uses10.1177/014662169301700403. -
Wright (1998) page:
R/api-shrinkage.Rreferences corrected from Rasch Measurement Transactions, 12(2), 638 to 632-633 (page 638 in the same RMT issue is a different paper; verified at https://www.rasch.org/rmt/rmt122.htm). -
Linacre (1989, “2004”) in
reporting_checklist(): the bare “Linacre (1989, 2004)” tag inR/api-reporting-checklist.Ris now Linacre (1989, 2002). The 2002 paper is “Optimizing rating scale category effectiveness,” JAM, 3(1), 85-106 – the canonical Linacre reference for rating-scale guidance. (No bibliographic entry existed for “Linacre (2004)”.) -
Eckes (2011) full reference: the inline
(cf. Eckes, 2011; ...)caveat indif_report()now has a complete@referencesentry pointing to Introduction to Many-Facet Rasch Measurement (1st ed., Peter Lang). McNamara & Knoch (2012) is also fully cited. -
Mean-square fit ranges in
?mfrmr-packagepreviously attributed the context-specific bands (high-stakes / clinical / survey) to Linacre (2002). The actual source is Wright & Linacre (1994), RMT 8(3), 370. The band assignments were also swapped: high-stakes MCQ is 0.8-1.2 (not 0.6-1.4), survey is 0.6-1.4, clinical observation is 0.5-1.7. Corrected. -
Yen Q3 (
q3_statistic()): previously stated mfrmr’s Q3 uses standardized residuals as if matching Yen (1984). Yen’s eq. 7 (p. 127) uses raw residuals; mfrmr’s standardized-residual choice is now documented as a deliberate departure. The|Q3| > 0.20cutoff was attributed to Yen but is from Chen & Thissen (1997), JEBS, 22(3), 265-289. Re-attributed. -
Christensen et al. (2017) in
q3_statistic(): the central finding of Christensen et al. is that no single critical value is appropriate across designs and that a parametric bootstrap should be used. Documentation now states this clearly; the fixedrelative_offset = 0.20is described as a screening default rather than as a re-implementation ofQ3_*. -
Morris (1983) posterior-SE correction formula in
?apply_empirical_bayes_shrinkagewas dimensionally wrong: previously written2 B^2 (tau^2 + SE^2)^2 / (K - 3), which is SE^4-units. The actual Morris (1983, eq. 4.1-4.2, p. 51) correction is(2 / (K - r - 2)) * B^2 * delta^2. Corrected, with re-derived magnitude examples (SE understated by ~73% at K=3, ~29% at K=5, ~7% at K=15). -
Koo & Li (2016) ICC band boundary:
compute_facet_icc()previously placed ICC = 0.9 in Excellent (>= 0.9). Koo & Li (2016, p. 161) write “values greater than 0.90 indicate excellent reliability” – strict>. Code atR/api-hierarchical-audit.Rnow uses> 0.9for Excellent; ICC = 0.9 reads as Good.
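The ICC boundary change in the Koo & Li item can be sketched as follows. This is an illustration only: icc_band() is a hypothetical name, and the handling of the lower cut points (0.5 and 0.75 treated as inclusive lower bounds) is an assumption; only the strict > 0.9 rule for Excellent is taken from the entry above.

```r
# Hypothetical sketch of a reliability-band lookup with a strict upper boundary.
# Assumption: 0.5 and 0.75 are inclusive lower bounds for Moderate and Good.
icc_band <- function(icc) {
  ifelse(icc > 0.9, "Excellent",
    ifelse(icc >= 0.75, "Good",
      ifelse(icc >= 0.5, "Moderate", "Poor")))
}

icc_band(c(0.45, 0.60, 0.90, 0.91))
```

Under this rule an ICC of exactly 0.9 is labelled Good rather than Excellent.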
Documentation refinements
- Linacre FACETS / Winsteps manuals: cited years updated from 2023 / 2024 to 2026 (current FACETS 4.5.0 = April 2026, Winsteps 5.11.0 = March 2026 per https://www.winsteps.com/index.htm).
-
Bock & Aitkin (1981) clarification:
?mfrmr-packagenow notes that the defaultmml_engine = "direct"optimises the marginal log-likelihood by gradient methods (BFGS / L-BFGS-B), not by Bock & Aitkin’s signature EM. The"em"and"hybrid"engines follow the EM template but with a BFGS M-step (rather than B&A’s probit IRLS), because the target is the polytomous Rasch family rather than 2PL. -
Linacre (1994) sample-size bands:
mfrm_core.Randreporting.Rnow describe the bands as “adapted from Linacre (1994)” rather than “follow Linacre (1994)”. Only the 30-examinee floor is Linacre’s; the< 10 sparseand< 50 standardwatermarks are mfrmr-specific screening choices. -
Snijders (2001) lz\* correction:
compute_person_fit_indices()now computes the Snijders weight-projection correction for JML/fixed-effect person estimates, conditional on the fitted non-person calibration. The implementation uses the polytomous formw_tilde_k = log(P_k) - c_n d log(P_k) / d theta, withc_n = Cov(log P, score) / I(theta). MML/EAP person scores keeplz_star = NAwithlz_star_status = "not_applicable_eap"because EAP does not satisfy the ML/MAP/WLE estimating-equation setup. -
Report-ready person-fit output:
compute_person_fit_indices()now adds practical 5% / 1% flag columns and compactReportIndex,ReportValue,ReviewStatus,ReviewReason, andReportCaveatcolumns.ReportIndexuseslz_staronly when the Snijders correction was actually computed; otherwise it falls back to uncorrectedlzwith the status caveat left visible.plot_person_fit()now carries these person-fit indices in its draw-free plot data, adds reusableplot_longandflag_summarytables, and supportsfit_index = "loglik"pluspreset = "monochrome"for report-focused person-fit displays. -
Person-fit summary and table-bundle handoff:
compute_person_fit_indices()now returns anmfrm_person_fit_indicesdata-frame subclass.summary(person_fit)gives overview counts,ReviewStatus/ReportIndex/lz_star_statussummaries, top review rows, thresholds, caveats, and a reporting map. The same summary is now accepted bybuild_summary_table_bundle(), so person-fit review rows and Snijders-availability caveats can move into appendix/report workflows without custom table wrangling. -
Marais (2013)
|Q3| > 0.30: documented as a community convention Marais cites, not as her own recommendation; her actual recommendation is the relative-to-mean comparison.
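The sample-size bands mentioned in the Linacre (1994) item can be written as a small lookup. The sketch below uses the band labels and cut points listed for facet_small_sample_review() in the 0.1.6 notes; sample_category() itself is a hypothetical helper, and treating each cut point as the inclusive lower bound of the next band is an assumption.

```r
# Hypothetical sketch: classify per-level observation counts into bands.
# Cut points follow the documented "< 10", "< 30", "< 50", ">= 50" scheme.
sample_category <- function(n, cuts = c(sparse = 10, marginal = 30, standard = 50)) {
  as.character(cut(
    n,
    breaks = c(-Inf, cuts, Inf),
    labels = c(names(cuts), "strong"),
    right = FALSE                      # intervals are [lower, upper)
  ))
}

sample_category(c(9, 10, 29, 30, 49, 50))
```

Passing a different named cuts vector mimics the configurable thresholds described in the notes.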
Default changes
No defaults change between 0.1.6 and 0.2.0. The 0.1.6 defaults (quad_points = 31, diagnostic_mode = "both", plot.mfrm_fit(type = "wright"), keep_original = FALSE) are retained.
Note for users upgrading directly from CRAN 0.1.5 to 0.2.0 (skipping intermediate 0.1.6 builds): three defaults were flipped in 0.1.6 and remain on those values in 0.2.0 – diagnose_mfrm(diagnostic_mode) went from "legacy" to "both", plot(fit) returns the Wright map alone instead of a three-plot overview (the overview is still available via plot(fit, type = "bundle")), and fit_mfrm(quad_points) went from 15 to 31. See the “mfrmr 0.1.6” section below for the full description and revert paths.
New features
Continuous integration
New GitHub Actions workflows added alongside the existing pkgdown.yaml: R-CMD-check.yaml runs the matrix on Ubuntu (release / devel / oldrel-1) plus macos-latest and windows-latest (release), and test-coverage.yaml runs covr with artifact upload (no external service contacted).
Differential-functioning display controls
plot_dif_heatmap() gains display controls for cell labels (show_values, value_digits), absolute flag thresholds (flag_threshold, flag_color), and shared symmetric color limits (scale_limit) so several heatmaps can be drawn on a comparable scale.
plot_dif_summary() gains optional normal-approximation confidence intervals, effect-threshold guide lines, method-aware axis labels, and interpretation-guide data that downstream code can render alongside the figure.
Plot data printing
print.mfrm_plot_data() is now defined, so the headline draw = FALSE return value renders as a compact summary (name, title, reusable data shapes, legend / reference-line counts) instead of a raw list dump.
Bounded GPCM fair-average and bias unblock (slope-aware)
fair_average_table() and estimate_bias() no longer hard-stop on GPCM fits. Both helpers now use the slope-aware element-conditional GPCM construction:
- fair_average_table(): for slope-facet element rows, the fair-average uses that element’s own discrimination a_{j*} and threshold structure: FA_{p,j*} = sum_k k * P_GPCM(X = k | theta_p, a_{j*}, delta_{j*}). For non-slope facets (Person, Rater, …), the fair-average uses the geometric-mean-one slope by GPCM identification, so the construction is continuous with the PCM Linacre fair-average and reduces to it exactly when all slopes equal one (regression-tested at machine precision).
- estimate_bias(): the per-cell bias parameter is the additive shift on the linear predictor that maximises the per-cell GPCM log-likelihood. The dispatch routes the inner nll and the per-iteration category_prob calls through the GPCM kernel instead of the PCM kernel; SE / t / Prob columns retain the screening-tier semantics documented in ?estimate_bias.
Both helpers gain method = "GPCM-slope-aware" and a caveat field that names the slope convention. For fair averages, the original SE columns remain measure-level SEs, while fair_se = TRUE adds structural delta-method fair-average SEs for non-person rows when the MML Hessian is available. For bias values, the SE / t / Prob columns retain their conditional screening interpretation. See ?fair_average_table, ?estimate_bias, and gpcm_capability_matrix() for the full support contract.
build_apa_outputs(), facets_output_contract_review(), and facets_output_file_bundle(include = "score") remain blocked under GPCM in 0.2.0; they require the same SE infrastructure to ship as publication-quality outputs.
Bug fixes
- compute_person_fit_indices() now computes lz from the model category probability of the observed category directly (true Drasgow, Levine & Williams (1985) polytomous form), via three new intermediate columns PrObserved, ItemEntropy, and ItemVarLogP on compute_obs_table(). The previous Gaussian-residual approximation overstated Var[log P] by roughly a factor of five on a 4-category fixture and pulled lz toward zero.
- The ECI4 column is removed from compute_person_fit_indices(). The previous implementation was the standardized chi-square (sum StdSq - n) / sqrt(2 * n), which is the linear (Smith) approximation to OutfitZSTD, not the Tatsuoka & Tatsuoka (1983) extended-caution index. Users who want the equivalent statistic should use OutfitZSTD directly. lz_star now uses the Snijders (2001) weight-projection correction where its estimating-equation assumptions are met, and otherwise stays NA with an explicit status.
-
displacement_table()$summarynow returnsNA_real_forMaxAbsDisplacementandMaxAbsDisplacementTwhen every flagged level has zero information (so everyDisplacementisNA). Previously the helper calledmax(..., na.rm = TRUE)on an all-NAvector, which returned-Infand emitted a “no non-missing arguments to max; returning -Inf” warning. The guarded version is regression-tested intest-core-coverage-gaps.R. -
analyze_dff()anddif_interaction_table()now reject invalidp_adjust, non-integermin_obs, invalidfocalgroups, and all-missing group columns up front, instead of failing later inside the contrast computation. Missing or empty group rows are dropped with amessage().
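The displacement_table() fix above rests on a general base-R behaviour: max(..., na.rm = TRUE) on an all-NA vector returns -Inf with a warning. A minimal sketch of the guard pattern follows; max_abs_or_na() is a hypothetical name, not the package's internal helper.

```r
# Base-R behaviour that motivated the guard: all-NA input yields -Inf.
suppressWarnings(max(c(NA_real_, NA_real_), na.rm = TRUE))

# Hypothetical guarded version: return NA_real_ when nothing is observed.
max_abs_or_na <- function(x) {
  x <- abs(x[!is.na(x)])
  if (length(x) == 0L) {
    return(NA_real_)
  }
  max(x)
}

max_abs_or_na(c(NA_real_, NA_real_))  # no usable displacements
max_abs_or_na(c(-2, 1, NA))           # largest absolute displacement
```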
Documentation
- ?analyze_dff, ?plot_dif_summary, ?mfrmr_linking_and_dff, and ?mfrmr_visual_diagnostics now distinguish residual-method screening labels from refit-method ETS A/B/C classifications more explicitly and route users to both plot_dif_heatmap() and plot_dif_summary().
- ?compute_person_fit_indices now describes when lz_star is computed and when it is intentionally left NA: JML/fixed-effect person scores receive the Snijders (2001) correction, whereas MML/EAP scores remain uncorrected because EAP is outside the Snijders estimating-equation setup.
- ?mfrm_generalizability now discloses that the lme4 random-effects model is main-effects only (Score ~ 1 + (1|Person) + (1|Facet) + ... + Residual, no explicit (1|Person:Facet) interaction terms), which folds two-way interaction variance into Residual and can bias G downward. The companion mfrm_d_study() projects G and Phi under planned facet counts, but reports the residual-scaling assumption explicitly; users who need a full p x r x i decomposition should treat these projections as planning evidence, not as a substitute for separately estimated interaction components.
- ?q3_statistic now discloses that, when the chosen facet has multiple residual rows per (Person, Level) cell because of additional facets in the design, the standardized residuals are mean-aggregated to one value per cell before the Pearson correlation. Yen’s (1984) original definition takes the correlation over per-(Person, Item) residuals without aggregation, so the published |Q3| > 0.20 threshold and the Christensen et al. (2017) critical values were derived for the original formulation; the values returned here should be treated as a screening summary rather than a direct substitute for those thresholds.
- ?bias_pairwise_report now discloses that the contrast SE uses the independence approximation sqrt(SE_i^2 + SE_j^2). For same-facet bias values that share a sum-to-zero identification the true Cov(b_i, b_j) < 0, so the reported SE is an over-estimate and the t-statistic / p-value are conservative (the true significance is higher than reported). For across-facet contrasts the covariance term is approximately zero and the approximation is appropriate.
- Two new vignettes ship in the Migration and Scope section of the pkgdown article navigation: vignette("mfrmr-facets-migration") walks Facets users through the corresponding mfrmr workflow and numeric contract checks, and vignette("mfrmr-gpcm-scope") documents which downstream helpers the bounded GPCM route currently supports versus restricts and what to use as a substitute when a helper is restricted.
Build hygiene
.Rbuildignore tightened the inst/references/ source-package boundary. The two runtime / user-facing files in that directory – facets_column_contract.csv (read at runtime by facets_output_contract_review()) and FACETS_manual_mapping.md (the FACETS Table to mfrmr helper mapping cited in the README) – are preserved.
Performance note
The cpp11 MML backend (src/mml_backend.cpp, RSM and PCM only) is opt-in via options(mfrmr.use_cpp11_backend = TRUE) for this release. It is validated against the pure-R reference at tolerance = 1e-12 on a fixed regression fixture. The default flip to ON is planned for a follow-up release after a cycle of community testing.
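Because the backend is controlled by a global option, it can be convenient to switch it on for a single call and restore the previous state afterwards. The sketch below uses only base R; with_cpp11_backend() is a hypothetical wrapper, not an exported mfrmr function, and the fitting call is left as a placeholder comment.

```r
# Hypothetical wrapper: enable the opt-in backend for one expression only.
with_cpp11_backend <- function(code) {
  old <- options(mfrmr.use_cpp11_backend = TRUE)
  on.exit(options(old), add = TRUE)   # restore the previous setting on exit
  force(code)                         # evaluate the expression with the option set
}

# The option is TRUE while the expression is evaluated ...
with_cpp11_backend(getOption("mfrmr.use_cpp11_backend"))
# ... and is restored afterwards.
getOption("mfrmr.use_cpp11_backend")

# Intended use (placeholder, arguments omitted):
# fit <- with_cpp11_backend(fit_mfrm(...))
```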
Deferred to a follow-up release
Considered for 0.2.0 but not shipped; carried over to a follow-up release:
- User-facing GPCM unblock for build_apa_outputs(), facets_output_contract_review(), and facets_output_file_bundle(include = "score"). (fair_average_table() and estimate_bias() are unblocked above.)
- A classical-DIF helper (working title analyze_dif_classical()) covering Mantel-Haenszel, logistic regression, and SIBTEST.
- Five additional Rasch / IRT classic plots (KIDMAP, TCC, expected score curve, cumulative ICC, information surface).
- A native classical-DIF vignette (the migration and bounded-GPCM-scope vignettes ship in this release; see the Documentation section above).
mfrmr 0.1.6
This release adds empirical-Bayes shrinkage for small-N facets, a hierarchical-structure and sample-adequacy review layer, integrated missing-code pre-processing, APA output adapters for Word / HTML, model-estimated two-way non-person facet interactions, confidence-interval propagation through the plot surface and the ICC reporting family, and expanded reproducibility manifests. Six bug fixes close issues that affected bias statistics, ZSTD sign, input validation, and graphical state hygiene.
Default changes (three breaking flips)
Three default values change in this release. Scripts that explicitly pass the old value are unaffected; scripts that rely on the default should be reviewed.
-
diagnose_mfrm(diagnostic_mode = ...)default flips from"legacy"to"both". Strict marginal screens are produced automatically forRSM/PCMfits without the caller having to request them. Passdiagnostic_mode = "legacy"to restore the earlier behaviour. -
plot(fit)default output is now the Wright map alone, returned as anmfrm_plot_dataobject. The previous three-plot overview (Wright + pathway + CCC) remains available viaplot(fit, type = "bundle"), which returns anmfrm_plot_bundlewith the same three slots. -
fit_mfrm(quad_points = ...)default increases from15to31so a default MML fit is stable enough for direct manuscript reporting. Passquad_points = 15(or7) to restore the earlier iteration speed for exploratory scans.
New features
Model-estimated facet interactions
fit_mfrm() gains facet_interactions for confirmatory two-way interactions between non-person facets in RSM and PCM fits, for example facet_interactions = "Rater:Criterion". These terms are estimated simultaneously with the main MFRM parameters as fixed effects under zero marginal-sum constraints, contributing (A - 1) * (B - 1) free parameters for an A x B interaction block.
New supporting pieces:
-
interaction_effect_table(fit)returns one row per interaction cell, with estimates, weighted counts, sparse-cell flags, and the identification note. -
summary(fit)reports a compact interaction overview when interaction terms are present. -
compare_mfrm(..., nested = TRUE) now recognizes same-family additive-vs-interaction comparisons as nested when all other structural settings match and the smaller model’s interaction set is a subset of the larger model’s set.
The feature is intentionally narrow for the initial CRAN-facing release: person-involving interactions, higher-order interactions, GPCM interactions, and random-effect facet interactions are deferred. Residual bias screening via estimate_bias() and estimate_all_bias() remains separate from these model-estimated fixed effects.
Empirical-Bayes facet shrinkage
fit_mfrm(..., facet_shrinkage = "empirical_bayes") applies James-Stein / empirical-Bayes shrinkage to each non-person facet’s fixed-effect estimates. fit$facets$others gains ShrunkEstimate, ShrunkSE, and ShrinkageFactor columns, and fit$shrinkage_report summarises the per-facet prior variance, mean shrinkage, and effective degrees of freedom.
The estimator is the classical method-of-moments form (Efron & Morris, 1973):
- tau_hat^2 = max(0, mean(delta_hat_j^2) - mean(SE_j^2)), using the raw second moment under mfrmr’s sum-to-zero identification (the facet mean is exactly 0 by construction, so no degree of freedom is consumed).
- B_j = SE_j^2 / (tau_hat^2 + SE_j^2) (shrinkage factor).
- delta_hat_j^EB = (1 - B_j) * delta_hat_j and SE_j^EB = sqrt((1 - B_j) * SE_j^2) (posterior mean / SE; the posterior SE treats tau_hat^2 as known, omitting the Morris (1983) correction for tau_hat^2 uncertainty).
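The three estimator steps above translate directly into a few lines of base R. This is a sketch of the stated formulas for a single facet, not the package's code: eb_shrink() is a hypothetical name, and the output column names simply reuse the ShrunkEstimate / ShrunkSE / ShrinkageFactor labels mentioned for fit$facets$others.

```r
# Hypothetical sketch: method-of-moments empirical-Bayes shrinkage for one facet.
# delta_hat: fixed-effect estimates (sum-to-zero identified); se: their SEs.
eb_shrink <- function(delta_hat, se) {
  tau2 <- max(0, mean(delta_hat^2) - mean(se^2))  # prior variance (raw second moment)
  B <- se^2 / (tau2 + se^2)                       # shrinkage factor per level
  data.frame(
    Estimate        = delta_hat,
    SE              = se,
    ShrinkageFactor = B,
    ShrunkEstimate  = (1 - B) * delta_hat,
    ShrunkSE        = sqrt((1 - B) * se^2)        # treats tau2 as known
  )
}

eb_shrink(delta_hat = c(-1, 0, 1), se = c(0.5, 0.5, 0.5))
```

When the estimated prior variance is truncated at zero, every shrinkage factor becomes 1 and all levels are pulled to the facet mean of zero.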
Two post-hoc helpers make shrinkage available to existing fits:
-
apply_empirical_bayes_shrinkage(fit, facet_prior_sd = NULL, shrink_person = FALSE)augments an existingmfrm_fit. -
shrinkage_report(fit)returns the per-facet summary table.
The "laplace" alias currently routes to the empirical-Bayes path and is reserved for a future penalised-MML implementation.
Integration: summary(fit) exposes FacetShrinkage and FacetShrinkageTau2Mean; build_apa_outputs() adds a Method-section sentence naming the mode, mean tau_hat^2, and mean shrinkage with an Efron & Morris (1973) citation; build_mfrm_manifest() gains a shrinkage_audit table; reporting_checklist() gains an “Empirical-Bayes shrinkage” item.
Hierarchical structure and sample-adequacy review
Five new exported functions describe the observed design, flag small-N facet levels, and quantify ICC / design effect. Estimation remains fixed-effects MFRM; these helpers are purely descriptive and do not alter the fit.
-
detect_facet_nesting(data, facets, person)classifies every ordered pair of facets (plus Person, optionally) as Fully nested, Near-perfectly nested, Partially nested, or Crossed using the conditional-entropy index1 - H(B|A)/H(B). -
facet_small_sample_review(fit)returns per-levelN / Estimate / SE / Infit / Outfit / SampleCategoryfor every facet.SampleCategoryis one of"sparse"(< 10),"marginal"(< 30),"standard"(< 50),"strong"(>= 50). Thresholds follow Linacre (1994) and are configurable. -
compute_facet_icc(data, facets, score, person) fits lme4::lmer(Score ~ 1 + (1|Person) + (1|Facet1) + ...) and reports the variance-component share per facet. Person uses the Koo & Li (2016) reliability bands; other facets use a “variance share” label (Trivial / Small / Moderate / Large).
-
compute_facet_design_effect(data, facets, icc_table)computes the Kish (1965)Deff = 1 + (m - 1) * rhoand effective N per facet. -
analyze_hierarchical_structure(data, facets, ...)bundles the four helpers above and (whenigraphis available) a bipartite connectivity summary over Person * facet-level edges.
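Two of the descriptive quantities in the list above, the conditional-entropy index 1 - H(B|A)/H(B) and the Kish design effect, can be reproduced in base R. The sketch is illustrative only: entropy(), nesting_index(), and kish_design_effect() are hypothetical names, the inputs are assumed to contain no missing values, and effective N is taken as N / Deff.

```r
# Shannon entropy of a probability vector or table (zero cells are skipped).
entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

# Hypothetical sketch of the index 1 - H(B|A) / H(B) for two label vectors.
# Assumption: a and b have equal length and no missing values.
nesting_index <- function(a, b) {
  joint <- table(a, b) / length(a)
  h_b <- entropy(colSums(joint))
  if (h_b == 0) {
    return(NA_real_)                  # B has a single level: index undefined
  }
  h_b_given_a <- entropy(joint) - entropy(rowSums(joint))  # H(A,B) - H(A)
  1 - h_b_given_a / h_b
}

# Kish (1965) design effect and effective N; m is the observations per level.
kish_design_effect <- function(n, m, rho) {
  deff <- 1 + (m - 1) * rho
  c(Deff = deff, EffectiveN = n / deff)
}

nesting_index(c(1, 1, 2, 2), c("x", "x", "y", "y"))  # A fully determines B
nesting_index(c(1, 1, 2, 2), c("x", "y", "x", "y"))  # fully crossed
kish_design_effect(n = 100, m = 5, rho = 0.25)
```

The index is 1 when knowing A removes all uncertainty about B and 0 when A carries no information about B; the labels between those extremes depend on cut points not reproduced here.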
Fit- and reporting-stack integration:
-
fit$summarycarriesFacetSampleSizeFlag,FacetMinLevelN, andFacetSparseCount. -
reporting_checklist()gains two items: “Facet sample-size adequacy” (auto-ready when the flag is"standard"/"strong") and “Hierarchical structure review” (ready when the user passeshierarchical_structure = analyze_hierarchical_structure(...)). -
build_apa_outputs()adds a Method sentence naming the sample-adequacy band and linking tofacet_small_sample_review(). -
build_mfrm_manifest()gains ahierarchical_audittable. -
recommend_mfrm_design()$caveatsnow points users at the three post-fit audit functions.
Optional dependencies igraph and lme4 move to Suggests; when either is absent the relevant report is omitted with a clear message().
Missing-code pre-processing in the fit call
fit_mfrm() now accepts missing_codes = NULL | TRUE | "default" | <character vector>, forwarded to prepare_mfrm_data(), review_mfrm_anchors(), and describe_mfrm_data(). When active, the standard FACETS / SPSS / SAS sentinels ("99", "999", "-1", "N", "NA", "n/a", ".", "" by default, or any caller-supplied set) are converted to NA on the person, facets, and score columns before any downstream processing. Replacement counts are recorded in fit$prep$missing_recoding and surfaced through build_mfrm_manifest()$missing_recoding. The default (missing_codes = NULL) is strictly backward-compatible.
A standalone recode_missing_codes() helper is also exported for users who prefer to recode before calling fit_mfrm().
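For readers who want to see what sentinel recoding amounts to, here is a base-R approximation. It is not recode_missing_codes() itself: recode_sentinels() is a hypothetical stand-in, the default code set is copied from the list above, and exact, case-sensitive string matching is an assumption.

```r
# Default sentinel set as listed in the notes above.
default_codes <- c("99", "999", "-1", "N", "NA", "n/a", ".", "")

# Hypothetical sketch: convert sentinel values to NA and count replacements.
# Assumption: matching is exact and case-sensitive on the character form.
recode_sentinels <- function(data, columns, codes = default_codes) {
  counts <- integer(0)
  for (col in columns) {
    hit <- !is.na(data[[col]]) & as.character(data[[col]]) %in% codes
    data[[col]][hit] <- NA
    counts[col] <- sum(hit)           # replacement count per column
  }
  list(data = data, replaced = counts)
}

ratings <- data.frame(
  Person = c("P1", "P2", "NA"),
  Score  = c("3", "99", "."),
  stringsAsFactors = FALSE
)
recode_sentinels(ratings, columns = c("Person", "Score"))
```

The per-column counts play the same role as the replacement counts the notes describe under fit$prep$missing_recoding.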
APA output adapters
-
as_kable.apa_table()converts anapa_tableinto aknitr::kable()object with the caption above and the note below. WhenkableExtrais installed the note becomes a proper table footnote; otherwise it is appended as Markdown. -
as_flextable.apa_table()produces aflextable::flextable()with caption and note pre-wired, suitable forofficer/ Word / PowerPoint exports. - Two generics,
as_kable()andas_flextable(), are exported so other mfrmr classes (or third-party wrappers) can register compatible methods. -
build_apa_outputs(..., context = list(output_mode = "reflow"))now returns the Method / Results paragraphs as single long lines per sentence-joined paragraph, which is the format Word / Quarto / RMarkdown prefer. The default"wrapped"keeps the 92-column layout for console readability.
kableExtra and flextable join Suggests.
Shrinkage and review visualisations
-
plot(fit, type = "shrinkage")renders a horizontal forest-style dotplot of original and shrunk facet-level estimates, with arrows indicating shrinkage direction, optional 95 % CI error bars (show_ci = TRUE), and a reference line at zero. When shrinkage is not applied the plot shows an unavailable-state message inviting the user to re-fit withfacet_shrinkage = "empirical_bayes". -
plot.mfrm_facet_sample_review()draws a horizontal bar chart of per-level observation counts coloured by Linacre band, with dashed vertical lines at the thresholds. -
plot.mfrm_facet_nesting()renders the pairwise nesting index as a heatmap with numeric cell labels.
All three methods follow the existing preset = c("standard", "publication", "compact") convention and use base-R graphics.
Confidence intervals across the plot surface
-
plot_bias_interaction(show_ci = TRUE, ci_level = 0.95)drawsBiasSize +/- z * SEwhiskers on the scatter and ranked views. -
plot_displacement(show_ci = TRUE)drawsDisplacement +/- z * DisplacementSEwhiskers in the lollipop view. -
plot_fair_average(show_ci = TRUE)draws fair-average CI whiskers on the observed-score scale using a delta-method propagationSE_fair = Var(X | Measure) * ModelSEfrom the logitMeasureerror. Rows near a rating boundary (where the implied score variance is effectively zero) are excluded from the whiskers, drawn as open circles, and counted in the subtitle. -
compute_facet_icc(ci_method = "profile" | "boot")returns ICC confidence intervals in newICC_CI_Lower/ICC_CI_Upper/ICC_CI_Level/ICC_CI_Methodcolumns, propagated throughanalyze_hierarchical_structure()and drawn as whiskers onplot.mfrm_hierarchical_structure(type = "icc"). The defaultci_method = "none"keeps the point-estimate-only behaviour.
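The delta-method propagation SE_fair = Var(X | Measure) * ModelSE quoted for plot_fair_average() can be worked through for a single Rasch-family row. The sketch below is an assumption-laden illustration: pcm_probs() and fair_average_ci() are hypothetical names, scores are coded 0..m, and the whiskers use the same normal quantile as the `+/- z * SE` convention in the list above.

```r
# Category probabilities for a PCM-style step structure (scores 0..m assumed).
pcm_probs <- function(measure, thresholds) {
  eta <- c(0, cumsum(measure - thresholds))
  w <- exp(eta - max(eta))
  w / sum(w)
}

# Hypothetical sketch: expected score, delta-method SE, and CI whiskers.
fair_average_ci <- function(measure, model_se, thresholds, ci_level = 0.95) {
  p <- pcm_probs(measure, thresholds)
  k <- seq_along(p) - 1
  expected <- sum(k * p)
  score_var <- sum((k - expected)^2 * p)  # dE[X]/dMeasure for Rasch-family models
  se_fair <- score_var * model_se         # delta-method propagation
  z <- qnorm(1 - (1 - ci_level) / 2)
  c(Fair = expected, SE_fair = se_fair,
    Lower = expected - z * se_fair, Upper = expected + z * se_fair)
}

fair_average_ci(measure = 0.5, model_se = 0.3, thresholds = c(-1, 0, 1))
```

As the measure approaches a rating boundary the score variance shrinks toward zero, so the propagated SE collapses; that is the situation the notes describe as excluded from the whiskers.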
Additional visualisations
Fourteen additions across the plot surface, all base-R / additive (default behaviours unchanged):
- plot_threshold_ladder() (new): vertical ladder of Rasch-Andrich thresholds for RSM and PCM, with disordered-step crossings highlighted in the preset’s fail colour. The returned object includes per-step Group / Step / Threshold / Disordered rows.
- plot(fit, type = "ccc_overlay") (new branch on plot.mfrm_fit): observed category proportions binned by person measure and overlaid on the model CCC curves, for an at-a-glance model-data fit visual.
- plot_person_fit() (new): FACETS Table 6 style per-person Infit / Outfit bubble with the standard 0.5-1.5 acceptance band (Linacre, 2002).
- plot_bias_interaction(plot = "heatmap") (new mode): diverging Rater x Criterion grid coloured by bias size, with flagged cells outlined for emphasis.
- plot(fit, type = "wright", group = ..., group_data = ...) (new option): overlays per-group person-density curves on the Wright map’s left density column, useful for DIF / DFF screening.
- plot_rater_severity_profile() (new): per-rater severity ranking with CI whiskers and optional +/-0.5 (gentle) and +/-1.0 (strict) guidance bands for rater-training feedback.
- plot_anchor_drift(type = "forest") (new mode): per-wave anchor-element CI forest with point estimate +/- z * SE whiskers.
- plot.mfrm_equating_chain() (new S3 method): type = "common_anchors" (default bar chart of pairwise common-anchor counts) and type = "graph" (bipartite Wave x anchor element graph via igraph).
- plot_apa_figure_one() (new): 2x2 publication composite bundling Wright map, rater severity profile, threshold ladder, and a one-panel summary block.
- plot_dif_summary() (new): compact effect-size summary for analyze_dff() / analyze_dif() with ETS A / B / C colour coding.
- plot_guttman_scalogram() (new): Person x facet-level observed-category matrix, ordered by person measure and location measure, with unexpected cells highlighted.
- plot_residual_qq() (new): normal Q-Q plot of person-level standardized residuals for distributional misfit diagnostics.
- plot_rater_trajectory() (new): per-rater severity trajectory across an ordered wave / session variable with CI whiskers. Accepts a named list of fits.
- plot_rater_agreement_heatmap() (new): symmetric rater x rater agreement matrix coloured by exact agreement (default) or the Pearson-style Corr column from interrater_agreement_table(). Quadratic-weighted kappa is not currently computed by that helper and is therefore not exposed as a metric option.
igraph is already in Suggests; the equating-graph view falls back to the bar chart when igraph is not installed.
Expanded test coverage
Direct regression tests for the 0.1.6 additions:
- test-attach-diagnostics.R: 18 assertions covering the attach_diagnostics = TRUE merge, type validation, idempotence, and MML / JML agreement checks.
- test-icc-ci-method.R: 25 assertions covering compute_facet_icc(ci_method = "profile" / "boot"), bootstrap seed reproducibility, range validation, the deprecated icc_ci_method alias, and plot.mfrm_hierarchical_structure(type = "icc") integration.
- test-ci-api-consistency.R: 21 assertions covering the lifecycle::deprecate_warn() path for conf_level, show_ci / ci_level on plot_fair_average / plot_displacement / plot_bias_interaction, and CI column schema.
- test-messaging-and-guards.R: assertions covering the quiet-by-default inferred rating-range message, the opt-in one-time message, the analyze_dff(method = "refit") missing(diagnostics) guard, and missing_codes integration.
- test-lme4-confint-helper.R: 17 assertions covering .lme4_confint_components() across terse and verbose lme4 row-name conventions.
- test-plotting-extras.R + test-plotting-screening.R: 78 assertions covering all 14 new plot helpers.
Internal architecture
row_max_fast() and the three category_prob_* polytomous-response kernels are now in R/core-category-probabilities.R instead of inline in R/mfrm_core.R. Pure file-level reorganization; no behaviour change. The remaining structural split of mfrm_core.R (likelihood / optimizer / EM / gradients / prep / report tables) is scheduled for a future release.
Package-level MnSq misfit threshold
mfrm_misfit_thresholds() returns the lower / upper Linacre acceptance band that mfrmr screens use when flagging element-level Infit / Outfit MnSq misfit. Defaults are c(lower = 0.5, upper = 1.5) and can be overridden globally via R options:
options(mfrmr.misfit_lower = 0.7)
options(mfrmr.misfit_upper = 1.3)
Helpers that consume the band include summary(diagnose_mfrm(...)) (misfit_flagged block + key_warnings auto-flag), build_misfit_casebook() (the new element_fit source family), the bias / misfit narrative inside build_apa_outputs(), and facet_quality_dashboard() when misfit_warn = NULL. Setting the options once at the top of an analysis script therefore changes every downstream screen at once.
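How an option-backed band propagates to every screen can be sketched in base R. The option names come from the text above; misfit_band_sketch() and flag_misfit_sketch() are hypothetical stand-ins, not the package's own functions.

```r
# Hypothetical stand-in for an option-backed threshold getter.
misfit_band_sketch <- function() {
  c(lower = getOption("mfrmr.misfit_lower", 0.5),
    upper = getOption("mfrmr.misfit_upper", 1.5))
}

# The default argument is evaluated at call time, so it sees current options.
flag_misfit_sketch <- function(mnsq, band = misfit_band_sketch()) {
  mnsq < band[["lower"]] | mnsq > band[["upper"]]
}

infit <- c(0.62, 1.05, 1.42, 1.70)  # made-up Infit MnSq values
default_flags <- flag_misfit_sketch(infit)

old <- options(mfrmr.misfit_lower = 0.7, mfrmr.misfit_upper = 1.3)
strict_flags <- flag_misfit_sketch(infit)
options(old)  # restore the previous option values
```

Because each call reads the options lazily, any helper written this way picks up a band set once at the top of a script.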
Additional secondary plots
Four new public helpers extend the diagnostic plot family:
- plot_local_dependence_heatmap(fit) – N x N Q3-style pairwise residual correlation heatmap between facet levels. Complements plot_marginal_pairwise() by showing every pair on a shared color scale rather than a top-N bar list.
- plot_reliability_snapshot(fit) – compact facet x reliability / separation / strata bar overview built from diagnostics$reliability. Useful as a single small figure for “are persons / raters / criteria distinguishable?”.
- plot_residual_matrix(fit) – person x facet-level standardized residual heatmap. Complements plot_guttman_scalogram() by showing residual sign and magnitude rather than the raw response code.
- plot_shrinkage_funnel(fit) – empirical-Bayes shrinkage caterpillar / funnel for fits augmented via apply_empirical_bayes_shrinkage().
plot_bubble() gains a view = c("measure", "infit_outfit") argument. The default "measure" keeps the historical Measure (logit) x MnSq bubble layout; view = "infit_outfit" switches to the Winsteps Table 30 layout (Infit MnSq on x, Outfit MnSq on y, bubble size defaults to N). Both views return the same mfrm_plot_data contract.
plot_dif_heatmap(draw = FALSE) now returns an mfrm_plot_data object whose data$matrix is the metric matrix (was previously the bare matrix only).
plot_information(..., draw = FALSE) outputs now include a series field listing which curves the legend describes ("Information", "SE", or both for type = "both"), so downstream ggplot2 re-renderers can map the right column without inspecting type manually.
Reporting surface enrichments
- summary(diagnose_mfrm(...)) now prints the fixed-effect chi-square block (“are all elements equal?”) directly from diag$facets_chisq (Facet, Levels, MeanMeasure, SD, FixedChiSq, FixedDF, FixedProb, plus the random-effect counterparts when present) and the inter-rater agreement summary (Exact / Expected / Adjacent agreement, MeanAbsDiff, MeanCorr, RaterSeparation, RaterReliability) instead of leaving them in the diagnostics object only. The new summary(diag)$facets_chisq and summary(diag)$interrater slots also expose the same tables for programmatic use.
- summary(diagnose_mfrm(...))$key_warnings now names the worst MnSq-misfit elements (e.g. MnSq misfit: Person:P023 (Infit=1.70, Outfit=2.40; outside 0.5-1.5).) and prints a dedicated MnSq misfit block showing every flagged element. The threshold pair is exposed at summary(diag)$misfit_thresholds and is steered by mfrm_misfit_thresholds() (see above).
- summary(diagnose_mfrm(...)) now prints a category usage block (one row per observed score with Count, AvgMeasure, and a Disordering flag when the average measure decreases across adjacent categories). Exposed programmatically at summary(diag)$category_usage.
- summary(fit_mfrm(...)) now prints a targeting block (Person mean - Facet mean, plus PersonSD / FacetSD / SpreadRatio) for every non-person facet. Under the package’s sum-to-zero identification this collapses to the person mean by construction; the row labels make that explicit and the spread ratio surfaces whether persons or facets dominate the test scale.
- summary(estimate_bias(...)) now reports Bonferroni and Holm significant-cell counts alongside the raw screen-positive count. Both are exposed in summary(bias)$overview as BonferroniSignificant and HolmSignificant.
- print(fit) and print(summary(fit)) now show an “Attached diagnostics” line when fit_mfrm(..., attach_diagnostics = TRUE) has merged per-element fit columns onto fit$facets. The attach-diagnostics path now extends to the person-facet table, so per-person Infit, Outfit, InfitZSTD, OutfitZSTD, and PtMeaCorr columns are visible in summary(fit)$person_high and summary(fit)$person_low.
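The difference between the raw screen-positive count and the Bonferroni / Holm counts mentioned in the bias-summary item can be illustrated with stats::p.adjust(). The p-values are made up and ScreenPositive is a hypothetical label; whether the package computes its counts exactly this way is an assumption.

```r
p_raw <- c(0.001, 0.004, 0.012, 0.030, 0.200, 0.650)  # made-up cell p-values
alpha <- 0.05

counts <- c(
  ScreenPositive        = sum(p_raw < alpha),
  BonferroniSignificant = sum(p.adjust(p_raw, method = "bonferroni") < alpha),
  HolmSignificant       = sum(p.adjust(p_raw, method = "holm") < alpha)
)
counts
```

Holm is never more conservative than Bonferroni, so HolmSignificant is always at least as large as BonferroniSignificant.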
Internal architecture: file split
To improve navigability of the core estimation engine, four self-contained sections moved out of R/mfrm_core.R into focused files. All functions remain internal and the public API is unchanged.
- R/core-likelihood.R – polytomous Rasch likelihoods and cumulative response-probability helpers.
- R/core-data-prep.R – data validation, indexing, and small formatting utilities.
- R/core-anchor-review.R – anchor-table reading, normalization, and connectivity / overlap audit.
- R/core-optimizer.R – optim() / EM dispatch and MML-EM scaffolding.
R/api-simulation.R similarly grew an R/api-simulation-future-branch.R companion file holding the future-branch design-schema layer. Public simulation entry points (simulate_mfrm_data, evaluate_mfrm_design, evaluate_mfrm_diagnostic_screening, evaluate_mfrm_signal_detection) remain in R/api-simulation.R.
- evaluate_mfrm_diagnostic_screening(include_report = TRUE) can now retain mfrm_results() / mfrm_report() report_index signals at the replicate level and summarize them in report_signal_summary, keeping report-layer readiness separate from diagnostic-screening Type I and sensitivity proxies.
- plot.mfrm_diagnostic_screening() can now turn diagnostic-screening summaries into integrated plot-data bundles for overview rates, report signals, scenario contrasts, and runtime checks. The draw-free return follows the mfrm_plot_data / plot_data() contract.
R/api-plotting-extras2.R was renamed to R/api-plotting-screening.R to drop the numerical suffix in favour of a functional name; tests follow the same rename.
A new tests/testthat/helper-fixtures.R exposes make_toy_fit() / make_toy_diagnostics() / local_toy_fit() helpers so future tests can reuse the standard example_core fit without retyping the load_mfrmr_data() + fit_mfrm() + diagnose_mfrm() chain.
Replay-script overhaul
export_mfrm_bundle() and build_mfrm_replay_script() now write a self-contained replay package:
- The generated replay.R includes every argument that affected the original fit_mfrm() call. Earlier 0.1.x scripts silently dropped missing_codes, mml_engine, slope_facet, anchor_policy, min_common_anchors, min_obs_per_*, facet_shrinkage, facet_prior_sd, shrink_person, and attach_diagnostics, so fits that depended on those arguments did not actually replay.
- fit_mfrm() now records its inputs in fit$config$replay_inputs (post match.arg) so the bundle generator has a single source of truth.
- The replay script begins with a utils::packageVersion("mfrmr") guard that warns when the installed version differs from the recorded one.
- export_mfrm_bundle(..., data = ...) accepts the original analysis data; when supplied, the data is written into the bundle as <prefix>_replay_data.csv and the replay script reads from that co-located file. The recorded input hash is now computed against the user’s original data (not the package’s internal prep$data, which carries synthesised columns), so users can verify their CSV matches the recorded fingerprint.
- A new tests/testthat/test-replay-roundtrip.R sources the generated replay script in a fresh environment and compares the reproduced log-likelihood and person estimates to the original.
Performance: diagnose_mfrm() on large designs
calc_interrater_agreement() (the inter-rater agreement helper that diagnose_mfrm() calls when Person is part of facet_cols) previously used a list() for the per-context probability lookup and c(exp_vals, ...) accumulation inside a per-row loop. This gave near-quadratic scaling: 6,400 observations took ~2 s, but 72,000 observations took ~141 s. The lookup is now an environment (hash-backed for character keys) and exp_vals is preallocated and filled by index, so the helper now scales linearly in the number of observations. On the 72,000-observation benchmark in the review, diagnose_mfrm() drops from ~141 s to ~15 s.
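The two patterns contrasted above can be shown side by side in base R. This is a toy illustration with hypothetical context keys and values, not the package's calc_interrater_agreement() code.

```r
set.seed(42)
# Hypothetical per-context lookup: one expected value per context key.
keys <- sprintf("ctx_%05d", 1:2000)
lookup_env <- new.env(hash = TRUE, size = length(keys))
for (i in seq_along(keys)) assign(keys[i], i / 10, envir = lookup_env)

row_keys <- sample(keys, 5000, replace = TRUE)

# Slow pattern: growing a vector with c() copies it on every iteration.
grow <- function(row_keys) {
  out <- numeric(0)
  for (k in row_keys) out <- c(out, get(k, envir = lookup_env))
  out
}

# Faster pattern: preallocate once and fill by index.
prealloc <- function(row_keys) {
  out <- numeric(length(row_keys))
  for (i in seq_along(row_keys)) out[i] <- get(row_keys[i], envir = lookup_env)
  out
}
```

Both functions return the same vector; only the cost of building it differs, and the gap widens as the number of rows grows.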
The make_union_find() helper used by the connectivity audit was also rewritten with an iterative find_root (with path compression) instead of the previous recursive form. Designs whose union chain depth exceeded options(expressions) (default 5,000) no longer error out with “evaluation is too deeply nested”.
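A minimal sketch of an iterative find with path compression is shown below. make_union_find_sketch() is a hypothetical closure written for illustration; the package's make_union_find() may be structured differently.

```r
make_union_find_sketch <- function(n) {
  parent <- seq_len(n)
  find_root <- function(i) {
    root <- i
    while (parent[root] != root) root <- parent[root]  # walk up, no recursion
    # Path compression: point every visited node straight at the root.
    while (parent[i] != root) {
      nxt <- parent[i]
      parent[i] <<- root
      i <- nxt
    }
    root
  }
  union <- function(a, b) {
    ra <- find_root(a)
    rb <- find_root(b)
    if (ra != rb) parent[ra] <<- rb
    invisible(NULL)
  }
  list(find_root = find_root, union = union)
}

uf <- make_union_find_sketch(10000)
for (i in 1:9999) uf$union(i, i + 1)  # builds one chain of depth 9,999
root_of_first <- uf$find_root(1)      # a recursive find could hit the nesting limit here
```

After the first find_root(1) call, every node on the chain points directly at the root, so later lookups are a single step.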
Input validation: degenerate inputs surface earlier
prepare_mfrm_data() now:
- records how many rows it dropped due to missing values or non-positive weights, instead of dropping them silently;
- trims leading/trailing whitespace from Person and facet IDs and records the row count, so " P01 " and "P01" do not silently become two persons;
- issues a warning() when the input contains duplicate Person x facet rows (which violate MFRM’s conditional-independence assumption) but lets the fit continue rather than refusing it outright.
fit_mfrm() now treats NaN / Inf for maxit, reltol, and quad_points as invalid input with a localised English error, instead of falling through to R’s locale-dependent “missing value where TRUE/FALSE needed” message.
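An up-front guard of this kind might look like the sketch below. check_control_sketch() and its message wording are hypothetical, and the positivity requirement is an assumption made for the example.

```r
check_control_sketch <- function(x, name) {
  ok <- is.numeric(x) && length(x) == 1L && is.finite(x) && x > 0
  if (!ok) {
    stop(sprintf("`%s` must be a single finite positive number.", name),
         call. = FALSE)
  }
  invisible(x)
}

# Without such a guard, a later `if (NaN > 0)` fails with R's translated
# "missing value where TRUE/FALSE needed" message.
msg <- tryCatch(check_control_sketch(NaN, "maxit"),
                error = function(e) conditionMessage(e))
```

is.finite() returns FALSE for NaN, Inf, -Inf, and NA, so one test covers all the degenerate cases before any comparison runs.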
Pre-rendered cheatsheet PDF
The two-page landscape cheatsheet now ships in pre-rendered form at system.file("cheatsheet", "mfrmr-cheatsheet.pdf", package = "mfrmr") alongside the existing .Rmd source. Users without a working LaTeX toolchain can open the PDF directly; users who want to customize it can still knit the .Rmd with rmarkdown::render(). The README and ?mfrmr package help now point at both files.
Help-page examples: “what to look for” annotations
The most-visited help pages now embed concrete interpretation comments inside their @examples blocks. Each shipped example shows what value ranges or patterns indicate “good”, what threshold or rule of thumb applies, and what follow-up to run if the value is off. Coverage in 0.1.6 includes:
- ?fit_mfrm (convergence, person SD, targeting bands).
- ?diagnose_mfrm (key_warnings, MnSq misfit lines, facets_chisq, inter-rater agreement minus expected).
- ?summary.mfrm_fit and ?summary.mfrm_diagnostics (overview, person distribution, top_fit ZSTD bands, facets_chisq, targeting).
- ?estimate_bias, ?analyze_dff, ?compute_facet_icc, ?apply_empirical_bayes_shrinkage (effect-size bands, Penfield classification, Koo & Li 2016 reliability bands, shrinkage factor interpretation).
- ?build_apa_outputs, ?reporting_checklist, ?plot_qc_dashboard, ?plot.mfrm_fit (manuscript-readiness signals, dashboard panel status, Wright / pathway / CCC interpretation).
- ?plot_bubble, ?plot_dif_heatmap, ?plot_local_dependence_heatmap, ?plot_reliability_snapshot, ?plot_residual_matrix, ?plot_shrinkage_funnel, ?plot_guttman_scalogram, ?plot_residual_qq, ?plot_rater_trajectory, ?plot_rater_agreement_heatmap (cell / band thresholds, reference-line interpretation).
Help-page examples: lighter-weight \donttest{}
Several main entry points now expose a small fast-path block (a JML fit on example_core plus a single diagnostic / plot call) before the heavier \donttest{} block. The fast path is below R CMD check’s example-time budget and provides a regression net that runs every check, while the full \donttest{} block continues to showcase the larger MML / publication-route examples. A final CRAN-timing pass keeps the active examples at lightweight maxit = 30 smoke fits so the printed examples demonstrate converged objects without returning to the original long-running example surface. Affected pages: ?fit_mfrm, ?diagnose_mfrm, ?plot_qc_dashboard, ?reporting_checklist, ?build_apa_outputs.
Documentation
- ?mfrmr_visual_diagnostics adds a “Cross-reference to FACETS / Winsteps tables” section that lists the closest mfrmr helper for each canonical Rasch / MFRM table or figure family (Wright map, pathway / probability curves, test information, misfit, bias / interaction, DIF / DRF, inter-rater agreement, anchoring / linking).
- ?mfrmr_visual_diagnostics and the visual reporting template now enumerate the 4 secondary plot helpers and the 4 screening helpers added in 0.1.6.
- ?diagnose_mfrm cites Wright & Masters (1982) at the separation / strata / reliability section and reproduces the formulae (G = TrueSD / RMSE, R = G^2 / (1 + G^2), H = (4G + 1) / 3) so the reliability outputs are traceable to source.
- ?fit_mfrm example block now flags the quad_points = 7 opening fit as an exploratory speed setting.
- The README and ?mfrmr package help now point at the public cheatsheet (system.file("cheatsheet", "mfrmr-cheatsheet.Rmd", package = "mfrmr")).
- The bias / misfit APA narrative now spells out |ZSTD| (or |MnSq - 1| when ZSTD is unavailable) instead of the generic |metric| label.
- build_misfit_casebook() now also draws element-level Infit / Outfit MnSq misfit cases from diagnostics$fit (in addition to marginal cells, pairwise screens, unexpected responses, and displacement). The casebook therefore matches what its name implies.
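The separation / reliability / strata formulae quoted in the list above can be traced numerically with a short sketch. separation_sketch() is a hypothetical helper, and taking TrueSD as the square root of observed variance minus mean-square error is an assumption about how the inputs are formed.

```r
separation_sketch <- function(measures, se) {
  rmse <- sqrt(mean(se^2))
  # Assumed "true" variance: observed variance minus error variance.
  true_var <- max(0, stats::var(measures) - rmse^2)
  G <- sqrt(true_var) / rmse  # separation, G = TrueSD / RMSE
  c(Separation  = G,
    Reliability = G^2 / (1 + G^2),
    Strata      = (4 * G + 1) / 3)
}

measures <- c(-1.2, -0.4, 0.1, 0.6, 1.5)  # made-up logit measures
out <- separation_sketch(measures, se = rep(0.3, 5))
```

With these definitions, R = G^2 / (1 + G^2) equals the share of observed variance that is not error variance, which is why the two reliability expressions agree.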
Yen Q3 local-dependence statistic
q3_statistic(fit, diagnostics) returns the Yen (1984) Q3 index between every facet-level pair, with three published reporting thresholds (Yen 0.20, Marais 0.30, Christensen et al. relative 0.20) and a textual Interpretation column that names which flag(s) each pair triggered. The helper reuses the standardized-residual pivot that plot_local_dependence_heatmap() already draws, so the table and the heatmap stay numerically consistent.
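The idea behind a Q3 table can be sketched from a residual pivot alone. The matrix below is simulated, the column names FlagYen / FlagMarais / FlagRelative are hypothetical, and reading the relative rule as "Q3 minus the mean Q3 exceeds 0.20" is an assumption about how that threshold is applied.

```r
set.seed(1)
# Hypothetical person x facet-level matrix of standardized residuals.
resid <- matrix(rnorm(200 * 4), ncol = 4,
                dimnames = list(NULL, c("R1", "R2", "R3", "R4")))
resid[, "R2"] <- 0.6 * resid[, "R1"] + 0.8 * resid[, "R2"]  # induce dependence

q3 <- stats::cor(resid, use = "pairwise.complete.obs")
idx <- which(upper.tri(q3), arr.ind = TRUE)
pairs <- data.frame(
  A  = colnames(q3)[idx[, "row"]],
  B  = colnames(q3)[idx[, "col"]],
  Q3 = q3[upper.tri(q3)]
)
pairs$FlagYen      <- abs(pairs$Q3) > 0.20
pairs$FlagMarais   <- abs(pairs$Q3) > 0.30
pairs$FlagRelative <- (pairs$Q3 - mean(pairs$Q3)) > 0.20
```

Because the table and a heatmap would both be derived from the same q3 matrix, the two views cannot drift apart numerically.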
Extended person-fit indices
compute_person_fit_indices(diagnostics, fit) adds person-level fit detail on top of the Infit / Outfit / ZSTD columns that diagnose_mfrm() already exposes:
- lz (Drasgow, Levine & Williams, 1985): standardized log-likelihood under the fitted model.
- lz* (Snijders, 2001): estimated-ability correction computed for JML/fixed-effect person estimates conditional on the fitted non-person calibration; returned as NA for MML/EAP scores with an explanatory status.
The reported lz statistic is asymptotically standard normal under the conditional-independence assumption; |lz| > 1.96 / 2.58 are the 5% / 1% reporting flags.
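For readers who want to see the arithmetic, the standardized log-likelihood for one person can be sketched as follows. lz_sketch() is a hypothetical helper operating on made-up category probabilities; it does not include the Snijders correction and is not the package's implementation.

```r
# probs: items x categories matrix of model probabilities for one person.
# x: observed category (1-based column index) for each item.
lz_sketch <- function(probs, x) {
  logp <- log(probs)
  l0 <- sum(logp[cbind(seq_along(x), x)])       # observed log-likelihood
  e_item <- rowSums(probs * logp)               # E[log P] per item
  v_item <- rowSums(probs * logp^2) - e_item^2  # Var[log P] per item
  (l0 - sum(e_item)) / sqrt(sum(v_item))
}

probs <- rbind(c(0.10, 0.30, 0.60),
               c(0.20, 0.50, 0.30),
               c(0.60, 0.30, 0.10),
               c(0.25, 0.50, 0.25))
lz_typical  <- lz_sketch(probs, x = c(3, 2, 1, 2))  # most likely categories
lz_aberrant <- lz_sketch(probs, x = c(1, 1, 3, 1))  # unlikely categories
flag_5pct <- abs(c(lz_typical, lz_aberrant)) > 1.96
```

The aberrant pattern produces a large negative value and trips the 5% flag, while the typical pattern does not.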
Generalizability-theory adapter
mfrm_generalizability(fit) re-fits the rating data as a crossed random-effects model Score ~ 1 + (1 | Person) + (1 | Facet1) + ... via lme4::lmer and returns the canonical G / Phi coefficients plus per-source variance components. Useful when a reviewer asks for a generalizability-theory complement to the Rasch-style separation / reliability statistics that diagnose_mfrm() already emits.
Import adapters: mirt / TAM / eRm
Three thin importers expose external fit objects via the same mfrm_fit interface that the mfrmr plot and table helpers consume:
- import_mirt_fit(fit, model) accepts a mirt::mirt() result.
- import_tam_fit(fit, model) accepts TAM::tam.mml() / TAM::tam.jml().
- import_erm_fit(fit, model) accepts eRm::PCM() / eRm::RM() / eRm::RSM().
The imported objects carry the mfrm_imported_fit class and populate measurement-side slots (facets$person, facets$others, steps, summary) only. Bias / DIF / anchor / replay slots are explicitly not populated; full bundle import is planned for a future release.
Parallel parametric-bootstrap ICC
compute_facet_icc(ci_method = "boot") gains ci_boot_parallel ("no" / "multicore" / "snow") and ci_boot_ncpus arguments that are forwarded to lme4::bootMer(). The per-replicate cli progress bar is suppressed under parallel execution because worker processes hold their own copy of the progress state.
Parallel evaluate_mfrm_design (scaffold)
evaluate_mfrm_design() accepts a parallel = c("no", "future") argument. When "future" is requested and the future.apply Suggests package is installed, the rep loop within each design row honours whatever future::plan() is currently active; cross-design-row parallelism is planned for a future release. Without future.apply the call falls back to serial execution with an explicit message.
Resumable MML EM fits
fit_mfrm() accepts a checkpoint = list(file = ..., every_iter = ...) argument. When supplied to a mml_engine = "em" (or hybrid) fit, the EM scaffolding writes its state to file every every_iter outer iterations using saveRDS(). If the file exists when a subsequent call starts, the engine resumes from the recorded iteration. The direct optim() engine ignores the checkpoint; non-EM fits run unaffected.
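The resume logic can be illustrated with a toy iteration that has nothing to do with EM. The checkpoint list mirrors the file / every_iter fields named above; run_with_checkpoint() and its stop_after argument are hypothetical and exist only to simulate an interruption.

```r
# Toy iterative fit: each step halves the remaining gap to 1.
run_with_checkpoint <- function(max_iter, checkpoint = NULL, stop_after = Inf) {
  state <- list(iter = 0L, value = 0)
  if (!is.null(checkpoint) && file.exists(checkpoint$file)) {
    state <- readRDS(checkpoint$file)  # resume from the recorded iteration
  }
  steps <- 0L
  while (state$iter < max_iter && steps < stop_after) {
    state$value <- state$value + (1 - state$value) / 2
    state$iter <- state$iter + 1L
    steps <- steps + 1L
    if (!is.null(checkpoint) && state$iter %% checkpoint$every_iter == 0L) {
      saveRDS(state, checkpoint$file)
    }
  }
  state
}

ckpt <- list(file = tempfile(fileext = ".rds"), every_iter = 5L)
partial <- run_with_checkpoint(20L, ckpt, stop_after = 12L)  # simulated interruption
resumed <- run_with_checkpoint(20L, ckpt)  # restarts from the iteration-10 checkpoint
fresh   <- run_with_checkpoint(20L)        # uninterrupted reference run
unlink(ckpt$file)
```

The interrupted run loses only the two iterations after the last checkpoint, and the resumed run ends in the same state as the uninterrupted one.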
GPCM verification tests
A new tests/testthat/test-gpcm-verification.R exercises every "supported" and "supported_with_caveat" row of gpcm_capability_matrix() on a toy dataset and asserts the documented helper returns the expected shape. "blocked" and "deferred" rows have negative tests that confirm the helper either refuses to run or returns an explicit caveat. These tests make the GPCM scope a contract that future commits cannot silently shrink.
Optional FACETS Table 7 style fit output on fit$facets$others
fit_mfrm(attach_diagnostics = TRUE) runs diagnose_mfrm() once after the fit and merges the per-level SE, Infit, Outfit, and PtMeaCorr columns onto fit$facets$others. This makes the facet table look like a FACETS Table 7 summary without a separate call. The default FALSE preserves the minimal Facet / Level / Estimate layout from 0.1.5.
Reproducibility
build_mfrm_manifest() gains several new tables so replay bundles carry everything a deterministic re-run needs:
- environment now records RNGKind, RNGSeedDigest, Locale, and a UTC ISO-8601 timestamp in addition to the existing package and platform fields.
- dependencies (new) records the installed version of every Imports and Suggests dependency, with a Role column.
- input_hash (new) hashes the input data, anchors, group anchors, and score_map with SHA-256 (via digest, now in Suggests) or an MD5-of-RDS fallback. The hash is deterministic across sessions.
- session_info (new) unrolls utils::sessionInfo() into a long data frame (Scope / Package / Version).
- hierarchical_audit, missing_recoding, and shrinkage_audit (new) surface the three new audit layers in one place.
digest is added to Suggests.
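An MD5-of-RDS style fingerprint can be sketched with base R and the tools package. hash_object_sketch() is a hypothetical helper, and the serialization settings are assumptions; the sketch only demonstrates that identical inputs give identical hashes within one R installation, not the package's exact procedure.

```r
hash_object_sketch <- function(x) {
  path <- tempfile(fileext = ".rds")
  on.exit(unlink(path), add = TRUE)
  # Uncompressed, fixed serialization version keeps the written bytes simple.
  saveRDS(x, path, version = 2, compress = FALSE)
  unname(tools::md5sum(path))
}

dat <- data.frame(Person = c("P01", "P02"),
                  Rater  = c("R1", "R2"),
                  Score  = c(2L, 3L))
h1 <- hash_object_sketch(dat)
h2 <- hash_object_sketch(dat)  # same data, same fingerprint
dat$Score[2] <- 4L
h3 <- hash_object_sketch(dat)  # one changed score, different fingerprint
```

A user can compare such a fingerprint against a recorded one to confirm that the data being replayed is the data that was originally analysed.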
Bug fixes
-
Bias / interaction NA.
estimate_bias() and estimate_all_bias() previously returned NA for every cell’s S.E., t, Prob., Obs-Exp Average, Infit, and Outfit, and Significant counts collapsed to zero. The root cause was an nzchar(NA_character_) call (which returns TRUE) in an internal predicate. Downstream helpers such as bias_interaction_report() and plot_bias_interaction() are now populated again. -
estimate_bias() silent failure on typo’d facet names. A mis-spelled facet_a / facet_b (e.g. "Raters" with a trailing s) previously returned an empty list() with no warning. It now raises an informative error naming the available facets. A missing diagnostics argument likewise raises an explicit mfrmr error rather than falling through to R’s locale-dependent missing-argument message. -
ZSTD sign.
zstd_from_mnsq() was numerically unstable for very small degrees of freedom and could return large positive ZSTD when MnSq was close to zero, flipping the sign relative to the companion Outfit ZSTD for the same element. A df >= 1 guard returns NA in degenerate cells. -
Score out of range.
prepare_mfrm_data()now stops when any observedScorefalls outside the declared[rating_min, rating_max]range. Previously negativescore_kvalues passed throughm[cbind(i, 0)], silently dropping those rows from the likelihood whilen_obskept its original value. -
Silent facet-name mismatches.
sanitize_noncenter_facet(),sanitize_dummy_facets(), andbuild_facet_signs()now emit a warning when supplied facet names are not part of the fitted model. Previously typos such aspositive_facets = "rater"(lowercase) ornoncenter_facet = "Raters"could silently flip the sign convention of facet measures. -
Graphical state hygiene.
apply_plot_preset()and.draw_shrinkage_plot()now restore the user’spar()on exit, per “Writing R Extensions” 2.1. All plot methods that relied onapply_plot_preset()inherit this automatically. -
DFF contrast sign flip.
analyze_dff()adds aContrastDirectioncolumn to the residual and refit branches. The two methods use opposite sign conventions by design, so the new column spells out which interpretation applies. -
compute_facet_icc()singular fit. Total variance belowsqrt(.Machine$double.eps)is now reported asICC = NAwithInterpretation = "Non-identifiable"instead of a falsely meaningful value. The firstlme4convergence diagnostic surfaces as amessage()rather than being silently suppressed. -
Extreme-person flag persistence.
as.data.frame.mfrm_fit()now carries the newExtremecolumn through to ggplot2 / CSV pipelines instead of dropping it. -
as_kable(format = "pipe")output. Previously silently returned HTML whenkableExtrawas installed and theapa_tablecarried a non-emptynote."pipe"now consistently returns the Markdown table with an appendedNote.line. -
review_mfrm_anchors()false positives. Overlap-adequacy risk flags are skipped when no anchors or group anchors were supplied, so single-wave analyses no longer emit “high severity” warnings becauseOverlapLevels == 0everywhere. -
Fractional-score tolerance. Relaxed from
sqrt(.Machine$double.eps) (~1.5e-8) to 1e-6, so integer codes like 1.0000001 that round-trip through CSV floats are now accepted. Genuinely fractional scores (1.5, 2.75) are still caught. -
Rating-range inference output. When
rating_min/rating_maxare inferred from the observed scores, the provenance is now retained in fit summaries and data-description output rather than emitted as a routine message. Users who prefer the interactive reminder can setoptions(mfrmr.show_inferred_rating_range = TRUE);fit_mfrm()still limits that opt-in message to one per fit. -
Locale-independent error for
plot(fit, type = ...). Passing an unknowntypepreviously raised R’s locale-dependentmatch.arg()error. It now raises an English mfrmr-style error listing the valid choices. -
plot_dif_heatmap(draw = FALSE)return contract. The helper documented anmfrm_plot_dataobject but invisibly returned the barematrix, breaking the documented contract used by siblingplot_*helpers. It now returns anmfrm_plot_datawhosedataslot bundlesmatrix,pairs,metric, andvalue_column. Code that relied on the old shape should switch fromdim(heat)todim(heat$data$matrix). -
Approximate 95% CI whiskers on bias and displacement plots.
plot_bias_interaction(show_ci = TRUE, ci_level = 0.95) (scatter and ranked modes) now draws BiasSize ± z · SE whiskers using the per-cell SE from estimate_bias(). plot_displacement(show_ci = TRUE) (lollipop mode) draws Displacement ± z · DisplacementSE whiskers from the audit-table standard error. Both functions now populate CI_Lower / CI_Upper / CI_Level columns on the returned plot-data element so downstream pipelines can reuse the bounds. plot_fair_average() CI support (now implemented later in this release) uses a delta-method propagation because the fair-average SE lives on the logit scale while the plot uses the observed-score scale.
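The fractional-score tolerance item in the list above is easy to check by hand. The sketch below uses a hypothetical is_integer_like() helper to compare the two tolerances on the values quoted in that entry; it is not the package's validation code.

```r
is_integer_like <- function(x, tol = 1e-6) abs(x - round(x)) < tol

scores <- c(1, 1.0000001, 2, 1.5, 2.75)
ok_new <- is_integer_like(scores)                                   # tolerance 1e-6
ok_old <- is_integer_like(scores, tol = sqrt(.Machine$double.eps))  # about 1.5e-8
```

Only the CSV round-trip value 1.0000001 changes status between the two tolerances; 1.5 and 2.75 are rejected under both.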
Messaging improvements
- fit_mfrm() emits a one-time message() when called with anchor_policy = "silent" while the anchor review flags issues.
- prepare_mfrm_data() records whether rating_min / rating_max were inferred from the observed scores or supplied explicitly, and fit/data summaries surface that provenance. The former informational message is opt-in through options(mfrmr.show_inferred_rating_range = TRUE). Row drops, ID trimming, and facets with only one observed level are recorded in preparation_notes; routine preparation messages are opt-in through options(mfrmr.show_preparation_messages = TRUE).
- Non-numeric score labels ("low", "medium", "high") now raise a targeted error up front instead of surfacing as the opaque “No valid observations remain” message.
- detect_anchor_drift() and build_equating_chain() thin-linking warnings now list per-facet retained-vs-threshold counts (e.g. "Rater (3/5)").
- bias_interaction_report()$summary carries a FlagStatus column so empty ranked_table rows are no longer ambiguous between “nothing flagged” and “nothing computed”.
- Latent-regression fits warn when the design matrix is near-singular (rcond(mm) < 1e-8), catching numerically collinear covariates rather than only exact rank deficiency.
Documentation and citations
- apply_empirical_bayes_shrinkage() docstring and R comments now document the shrinkage-variance formula as max(0, K^{-1} * sum(delta^2) - mean(SE^2)), matching the implementation under sum-to-zero identification.
- ?fit_mfrm “Input requirements” now states the MFRM conditional independence assumption (Linacre, 1989) and points at diagnose_mfrm(..., diagnostic_mode = "both") / strict_pairwise_local_dependence as the exploratory follow-up.
- ?fit_mfrm example block now flags the quad_points = 7 opening fit as an exploratory speed setting and reminds readers that the package default quad_points = 31 is the publication tier, so the example no longer reads as a recommendation against the new default.
- ?fit_mfrm now presents the recommended quad_points tiers as a \tabular{} block (7 fast scan, 15 default, 31+ publication) so readers do not have to re-extract the recommendation from prose. The “adapted from Linacre (1994)” wording for the sample-size bands and the wall-clock cost of diagnostic_mode = "both" are also spelled out.
- ?fit_mfrm missing_codes docstring is now an itemised list of the three branches (NULL / TRUE / custom vector) instead of dense prose.
- ?run_mfrm_facets now notes that method = "JML" is the default for legacy FACETS-style output continuity and points users at fit_mfrm(..., method = "MML") for new analysis scripts.
- ?apply_empirical_bayes_shrinkage now states that EffectiveDF = Σ(1 − B_j) matches the “effective number of parameters” from Efron & Morris (1973).
- ?compute_facet_icc notes that Koo & Li (2016) recommend applying the reliability bands to the 95% confidence interval of the ICC, while the current implementation bands the point estimate only.
- ?analyze_facet_equivalence adds Kass & Raftery (1995) as the reference for the BIC-based Bayes-factor approximation BF_{01} ≈ exp((BIC_{H1} − BIC_{H0}) / 2).
- ZSTD docstring in ?mfrmr-package now cites Wilson & Hilferty explicitly alongside Wright & Linacre (1994).
- calc_displacement_table() now cites the Winsteps user guide for the combined |Displacement| > 0.5 logit and |t| > 2 flagging rule.
- analyze_facet_equivalence() docstring frames the equivalence_bound = 0.5 default as a starting point.
- Added print() S3 methods for 13 classes that previously fell back to the default list printer (mfrm_apa_outputs, mfrm_bias, mfrm_bundle, mfrm_design_evaluation, mfrm_diagnostics, mfrm_facet_dashboard, mfrm_future_branch_active_branch, mfrm_plausible_values, mfrm_population_prediction, mfrm_reporting_checklist, mfrm_signal_detection, mfrm_threshold_profiles, mfrm_unit_prediction). Each delegates to the existing summary() method.
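The shrinkage-variance and EffectiveDF formulae cited in the list above can be worked through on made-up numbers. eb_shrinkage_sketch() is a hypothetical helper, and the per-level factor B_j = SE_j^2 / (SE_j^2 + tau^2) is the usual empirical-Bayes form, assumed here rather than taken from the package source.

```r
eb_shrinkage_sketch <- function(delta, se) {
  K <- length(delta)
  tau2 <- max(0, sum(delta^2) / K - mean(se^2))  # shrinkage variance from the text
  B <- se^2 / (se^2 + tau2)                      # assumed per-level shrinkage factor
  list(tau2 = tau2,
       shrunk = (1 - B) * delta,                 # pulled toward the zero centre
       effective_df = sum(1 - B))                # EffectiveDF = sum(1 - B_j)
}

delta <- c(-0.8, -0.1, 0.3, 0.6)  # made-up facet-level estimates
se    <- c(0.2, 0.2, 0.3, 0.4)    # made-up standard errors
eb <- eb_shrinkage_sketch(delta, se)
```

Levels with larger standard errors get a larger B_j and are pulled harder toward zero, so the effective number of parameters falls below the number of levels.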
Reference citations corrected:
- Efron & Morris (1973) page range 379-402 (was 379-421).
- McEwen (2018) BYU dissertation year (was 2017).
- Wright (1998) RMT 12(2) on extreme scores (was Wright 1988, which does not exist).
- Jones & Wind (2018) JAM 19(2), 148-161 (was “Wind & Jones, 2018, JAM 19(1), 1-19”, which does not exist).
- Linacre (2023) A User’s Guide to Facets, Version 4.5 applied uniformly where comments had used 2024.
Plot polish
- plot_qc_dashboard() plots a signed ZSTD distribution combining Infit and Outfit ZSTD, with reference lines at -3 / -2 / 0 / 2 / 3. The previous absolute-value histogram collapsed over-fit and under-fit tails.
- plot_residual_pca(..., plot_type = "scree") draws Rasch secondary-dimension reference lines at 1.0 / 1.4 / 2.0 / 3.0, consistent with the Winsteps user guide. The legend and returned reference_lines record the new entries.
- The Residual-PCA content_checks entry uses a case-insensitive regex so the check passes whether the APA contract uses "Residual PCA" or the longer "Exploratory residual PCA".
Other additions
- fit$facets$person exposes PosteriorSD and SE aliases alongside the legacy SD. MML fits populate all three with the posterior SD under the Gauss-Hermite prior; JML fits set SE = NA_real_ and note that per-person SEs should be pulled from diagnose_mfrm()$measures.
- analyze_dff(method = "refit") subgroup fits return a LinkingReview column that captures the anchor-review messages emitted during the refit, replacing the previous hard-coded anchor_policy = "silent" silence.
- detect_anchor_drift() returns common_vs_reference and n_common_all_waves alongside the existing pairwise common_elements table, for 3+ wave linking reviews.
- analyze_residual_pca(..., pca_max_factors = "auto") caps the factor count at min(10, ncol - 1, nrow - 1) per matrix; this value was previously silently coerced to NA.
- describe_mfrm_data() returns two new components: missing_rate_summary (per-column missing / non-missing counts) and facet_crosstabs (long-format pairwise observation-count tables, suitable for heatmap plotting).
- DESCRIPTION removes the duplicated Author / Maintainer fields auto-generated by CRAN; Authors@R remains the single source of truth. The Description: field is now three sentences (was 10 lines of prose), improving CRAN web readability while retaining the two rating-scale / partial-credit DOI references.
- inst/CITATION now tracks meta$Version and meta$Title dynamically, so citation("mfrmr") prints the current installed version rather than a hard-coded string.
Test suite
6,380+ tests pass (up from 6,343 in 0.1.5), with 0 failures and 0 errors. New test files:
- test-shrinkage.R (40 tests) covers the closed-form math, edge cases (K < 3, tau^2 <= 0, user-supplied prior), fit_mfrm integration, reporting and manifest trails, and the three new plot methods.
- test-missing-codes-integration.R (17 tests) covers fit_mfrm, describe_mfrm_data, review_mfrm_anchors, and manifest paths.
- test-hierarchical-audit.R (10 tests) covers the five new hierarchical-audit helpers and their integration points.
Pre-existing test-harness errors unrelated to 0.1.5 behaviour have also been cleaned up (S3 dispatch, GPCM scope wording, internal-helper prefixing with mfrmr:::).
mfrmr 0.1.5
CRAN release: 2026-04-12
Maintenance release
First-use workflow
- Reworked print(fit), summary(fit), and summary(diagnose_mfrm(...)) so results start with Status, Key warnings, and Next actions.
- Added a clearer recommended workflow in the README and help pages: fit with MML, review diagnostics with diagnostic_mode = "both", then move to reporting helpers.
- Improved ordered-score handling and guidance, including binary two-category use, rejection of fractional score values, non-consecutive score-code mapping through score_map, and clearer warnings for retained zero-count categories.
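The score handling in the last item can be pictured with base R. The sketch below is illustrative only: remap_scores() and the named-vector form of the map are assumptions, and the actual score_map argument may take a different shape.

```r
# Hypothetical helper: reject fractional scores, then map observed score
# codes onto consecutive categories. The named-vector map format is an
# assumption, not the documented score_map interface.
remap_scores <- function(score, map) {
  if (any(score != round(score), na.rm = TRUE)) {
    stop("Fractional score values are not allowed.")
  }
  unname(map[as.character(score)])
}

raw <- c(1, 3, 5, 3, 1)              # non-consecutive score codes
map <- c("1" = 0, "3" = 1, "5" = 2)  # consecutive categories
remap_scores(raw, map)
```

Calling remap_scores(c(1, 2.5), map) stops with an error instead of silently rounding the fractional value.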
Estimation and scoring
- Added the first public latent-regression MML branch for ordered RSM/PCM fits with person covariates, including simulation and scoring support for the fitted population model.
- Added bounded GPCM support for the documented direct route, including core summaries, diagnostics, plots, posterior scoring, and information checks, while keeping unsupported downstream routes explicit.
- Extended ordered-response support and documentation for binary RSM/PCM use, fixed-calibration scoring after JML, and PCM information curves.
Diagnostics, reporting, and visualization
- Added strict marginal follow-up plots through plot_marginal_fit() and plot_marginal_pairwise().
- Strengthened the reporting surface with reporting_checklist(), build_summary_table_bundle(), export_summary_appendix(), and visual_reporting_template() for manuscript-oriented tables, appendix artifacts, and figure-placement guidance.
- Added structured caveats in summaries and appendix tables for retained zero-count score categories and latent-regression population-model omission/design issues.
- Added exploratory plot(fit, type = "ccc_surface", draw = FALSE) output for advanced visualization while keeping 2D Wright/pathway/category plots as the default reporting route.
mfrmr 0.1.3
CRAN resubmission
- Revised DESCRIPTION references to use the requested authors (year) <doi:...> format.
- Added a documented return-value section for facet_quality_dashboard(), including the output class, structure, and interpretation.
- Replaced \dontrun{} with \donttest{} for executable examples so CRAN can exercise those examples during checks.
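For context, the two Rd markers in the last item behave differently under checks. A minimal roxygen sketch, with add_one() as a hypothetical function used only for illustration (it is not part of the package):

```r
#' Add one to a number (hypothetical example function)
#'
#' @param x A numeric vector.
#' @return `x + 1`.
#' @examples
#' add_one(1)
#' \donttest{
#' # Run by example() in interactive sessions and by R CMD check --as-cran.
#' add_one(1:10)
#' }
#' \dontrun{
#' # Shown on the help page but not run by R CMD check by default.
#' add_one("a")
#' }
add_one <- function(x) {
  x + 1
}
```

Code inside \donttest{} therefore still has to run cleanly, whereas \dontrun{} suits code that cannot or should not be executed during checks.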
mfrmr 0.1.0
Initial release
- Native R implementation of many-facet Rasch model (MFRM) estimation without TAM/sirt backends.
- Supports arbitrary facet counts with fit_mfrm() and method selection (MML default, JML).
- Includes FACETS-style bias/interaction iterative estimation via estimate_bias().
- Provides fixed-width report helpers (build_fixed_reports()).
- Adds APA-style narrative output (build_apa_outputs()).
- Adds visual warning summaries (build_visual_summaries()) with configurable threshold profiles.
- Implements residual PCA diagnostics and visualization (analyze_residual_pca(), plot_residual_pca()).
- Bundles Eckes & Jin (2021)-inspired synthetic Study 1/2 datasets in both data/ and inst/extdata/.