Build a data quality summary report (preferred alias)
Source:R/api-reports.R
data_quality_report.RdBuild a data quality summary report (preferred alias)
Usage
data_quality_report(
fit,
data = NULL,
person = NULL,
facets = NULL,
score = NULL,
weight = NULL,
min_category_count = 10,
dominant_category_cutoff = 0.95,
include_fixed = FALSE
)Arguments
- fit
Output from
fit_mfrm().- data
Optional raw data frame used for row-level review.
- person
Optional person column name in
data.- facets
Optional facet column names in
data.- score
Optional score column name in
data.- weight
Optional weight column name in
data.- min_category_count
Minimum raw or weighted count used to label a non-zero facet-level score category as sparse. Default
10.- dominant_category_cutoff
Proportion in
(0, 1]used to flag a facet level whose responses are dominated by one score category. Default0.95.- include_fixed
If
TRUE, include a legacy-compatible fixed-width text block.
Details
summary(out) is supported through summary().
plot(out) is dispatched through plot() for class
mfrm_data_quality (type = "dashboard", "quality_flags",
"row_review", "category_counts", "score_support",
"facet_category_usage", "facet_response_patterns", "score_map",
"missing_rows").
Interpreting output
summary: retained/dropped row overview.quality_overview: area-level QC status for rows, score support, facet-category use, and design matching.quality_flags: prioritized QC flags with counts and recommended next actions. This is not an item/person/rater table.row_review: reason-level breakdown for data issues.category_counts: post-filter category usage, including retained zero-count score-support categories.score_support_review: quick view of zero-count boundary/intermediate categories and their threshold-functioning caveats.category_usage_by_facet: facet-level category counts over the retained score support.category_usage_summary: per-facet-level zero/sparse category summary.facet_response_patterns: facet-level response-pattern summaries, including single-category and dominant-category use.caveats: user-facing score-support warnings, including cases where non-consecutive original labels such as1, 2, 4, 5were recoded becausekeep_original = FALSE.score_map: original-to-internal score mapping used when labels are recoded.unknown_elements: facet levels in raw data but not in fitted design.
Typical workflow
Run
data_quality_report(...)with raw data.Check
summary(out)andplot(out, type = "dashboard"), then inspectquality_flags, score-support, score-map, facet-response-pattern, and missing/unknown element sections as needed.Resolve missing values, score-support gaps, and sparse categories before final estimation/reporting.
Examples
toy <- load_mfrmr_data("example_core")
fit <- fit_mfrm(toy, "Person", c("Rater", "Criterion"), "Score", method = "JML", maxit = 30)
out <- data_quality_report(
fit,
data = toy, person = "Person",
facets = c("Rater", "Criterion"), score = "Score"
)
summary(out)
#> mfrmr Data Quality Summary
#> Class: mfrm_data_quality
#> Components (14): summary, quality_overview, quality_flags, model_match, row_review, unknown_elements, category_counts, score_support_review, category_usage_by_facet, category_usage_summary, facet_response_patterns, caveats, score_map, settings
#>
#> Data quality overview
#> TotalLinesInData TotalDataLines TotalNonBlankResponsesFound MissingScoreRows
#> 768 768 768 0
#> MissingFacetRows MissingPersonRows InvalidWeightRows OutOfRangeScoreRows
#> 0 0 0 0
#> ValidResponsesUsedForEstimation ZeroCountScoreCategories
#> 768 0
#> IntermediateZeroCountScoreCategories FacetLevelsWithZeroCategories
#> 0 0
#> FacetLevelsWithIntermediateZeroCategories FacetLevelsWithSparseCategories
#> 0 0
#> FacetLevelsWithSingleCategoryUse FacetLevelsWithDominantCategoryUse
#> 0 0
#> FacetLevelsWithBoundaryOnlyUse ScoreSupportCaveats
#> 0 0
#>
#> Review rows: quality_overview
#> Area Status Count Unit PercentOfData
#> Design match ok 0 levels NA
#> Facet category use ok 0 facet levels NA
#> Facet response patterns ok 0 facet levels NA
#> Rows ok 0 rows 0
#> Score support ok 0 conditions NA
#> Message
#> No raw-data level outside the fitted design was found.
#> No facet-level zero or sparse category-use issue was found.
#> No single-category or dominant-category facet response pattern was found.
#> No row-level missingness, invalid weight, or out-of-range score issue was found.
#> No score-support gap was found over the fitted score scale.
#> NextStep QualityFlags
#> Proceed with estimation diagnostics and reporting checks. 0
#> Continue to design-match checks. 0
#> Continue to design-match checks. 0
#> Continue to score-support and facet-level category-use checks. 0
#> Continue to facet-level category-use checks. 0
#> HighSeverityFlags
#> 0
#> 0
#> 0
#> 0
#> 0
#>
#> Settings
#> Setting Value
#> min_category_count 10
#> dominant_category_cutoff 0.95
#>
#> Notes
#> - Data quality summary for missingness, row status, score support, and category usage.
#> - QC overview: 0 high-priority area(s), 0 review area(s).
#> - No priority QC flags were found in the supplied data-quality checks.
p_dq <- plot(out, draw = FALSE)
p_dq$data$plot
#> [1] "dashboard"