This vignette outlines a reproducible workflow for:
For a plot-first companion guide, see the separate
mfrmr-visual-diagnostics vignette.
For speed-sensitive work, a useful pattern is:
method = "JML" or
quad_points = 7diagnose_mfrm(..., residual_pca = "none") for the
first passmfrmr treats MML and JML
differently on purpose.
MML integrates over the person distribution with
Gauss-Hermite quadrature.MML branch optimizes the quadrature-based
marginal log-likelihood directly; it is not an EM implementation.JML is often useful for quick exploratory passes, but
MML remains the preferred route for final estimation and
fixed-calibration follow-up.For RSM and PCM, diagnostics now expose two
distinct evidence paths:
diagnostic_mode = "legacy" keeps the residual/EAP-based
stack.diagnostic_mode = "marginal_fit" adds the strict
latent-integrated screen.diagnostic_mode = "both" is the safest default when you
want to inspect both views side by side.The strict marginal branch is still screening-oriented in the current
release. Use summary(diag)$diagnostic_basis to separate the
legacy residual evidence from the strict marginal evidence rather than
pooling them into one decision.
Start with the packaged example_core dataset. It is
intentionally compact, category-complete, and generated from a single
latent trait plus facet main effects so that help-page examples stay
fast without relying on undersized toy data. The same object is also
available via
data("mfrmr_example_core", package = "mfrmr"):
data("mfrmr_example_core", package = "mfrmr")
toy <- mfrmr_example_core
fit_toy <- fit_mfrm(
data = toy,
person = "Person",
facets = c("Rater", "Criterion"),
score = "Score",
method = "JML",
model = "RSM",
maxit = 30
)
diag_toy <- diagnose_mfrm(fit_toy, residual_pca = "none")
summary(fit_toy)$overview
summary(diag_toy)$overview
names(plot(fit_toy, draw = FALSE))#> Model Method MethodUsed N Persons Facets FacetInteractions
#> 1 RSM JML JMLE 768 48 2 0
#> InteractionParameters InteractionCells InteractionSparseCells Categories
#> 1 0 0 0 4
#> LogLik AIC BIC Converged Iterations
#> 1 -820.948992262413 1753.89798452483 2013.95020958109 TRUE 71
#> IterationsBasis MMLEngineRequested MMLEngineUsed MMLEngineDetail
#> 1 function_evaluations
#> EMIterations EMConverged EMRelativeChange OptimizerMethod ConvergenceCode
#> 1 BFGS 0
#> ConvergenceBasis ConvergenceStatus ConvergenceReason ConvergenceSeverity
#> 1 optimizer_gradient converged tolerance_met pass
#> ConvergenceMessage ConvergenceDetail ReviewableWarning
#> 1 Optimizer returned convergence code 0. FALSE
#> GradientReviewTolerance FunctionEvaluations GradientEvaluations
#> 1 1e-04 71 24
#> TerminalGradientSupNorm TerminalGradientRMS FacetSampleSizeFlag
#> 1 0.0328356577663257 0.00837237974728151 strong
#> FacetMinLevelN FacetSparseCount ExtremeHighN ExtremeLowN
#> 1 192 0 0 0
#> Observations Persons Facets Categories Subsets ResidualPCA DiagnosticMode
#> 1 768 48 2 4 1 none both
#> Method PrecisionTier MarginalFit
#> 1 JML exploratory not_available
#> Component
#> 1 name
#> 2 data
The same fit can then move through the public first-contact route:
res_toy <- mfrm_results(fit_toy)
report_toy <- mfrm_report(res_toy, style = "qc")
summary(res_toy)$next_actions
summary(report_toy)$overview
export_dir <- file.path(tempdir(), "mfrmr-workflow-export")
export_toy <- export_mfrm_results(
res_toy,
output_dir = export_dir,
include = c("default", "report"),
overwrite = TRUE
)
head(export_toy$written_files)#> Priority Area
#> 1 1 Overview
#> 2 2 Triage
#> 3 3 Diagnostics
#> 4 4 Visual diagnostics
#> 5 5 Precision
#> 6 6 Reporting
#> 7 11 Tables
#> Action
#> 1 Read the compact results summary.
#> 2 Read the first-screen triage before branching.
#> 3 Review diagnostic key warnings before report drafting.
#> 4 Open the QC dashboard before drilling into individual tables.
#> 5 Inspect fit, separation, reliability, and ZSTD wording boundaries.
#> 6 Use the reporting checklist as the manuscript-routing surface.
#> 7 Create an appendix-ready summary-table bundle.
#> Route
#> 1 summary(res)
#> 2 summary(res)$triage
#> 3 summary(res$diagnostics)$key_warnings
#> 4 plot(res, type = "qc", preset = "publication")
#> 5 summary(res$components$precision_review)
#> 6 summary(res$components$reporting_checklist)
#> 7 build_summary_table_bundle(res)
#> Reason
#> 1 Confirms input mode, model, method, section status, table coverage, and plot routes.
#> 2 Triage orders unavailable, review, information, and OK signals across diagnostics, tables, plots, and reporting surfaces.
#> 3 Diagnostic warnings identify the highest-priority fit, precision, residual, or category follow-up surfaces.
#> 4 The QC route gives a first visual check of fit, residual, and category surfaces.
#> 5 Precision review keeps fit-size, standardized fit, and separation evidence in separate reporting lanes.
#> 6 Checklist rows identify report-ready, missing, and caveated sections.
#> 7 The bundle exposes table roles, plot readiness, and conservative appendix presets.
#> Style OverallStatus FirstAction ReviewAreas CaveatAreas OptionalAreas
#> 1 qc review Start with Fit. 2 0 3
#> UnavailableAreas OkAreas
#> 1 0 0
#> SourceInclude
#> 1 fit, diagnostics, tables, precision, reporting, categories, plots
#> Component Format Path Note
#> 1 summary_overview csv mfrmr_results_summary_overview.csv
#> 2 summary_status csv mfrmr_results_summary_status.csv
#> 3 summary_component_index csv mfrmr_results_summary_component_index.csv
#> 4 summary_table_index csv mfrmr_results_summary_table_index.csv
#> 5 summary_plot_map csv mfrmr_results_summary_plot_map.csv
#> 6 summary_triage csv mfrmr_results_summary_triage.csv
t4_toy <- unexpected_response_table(
fit_toy,
diagnostics = diag_toy,
abs_z_min = 1.5,
prob_max = 0.4,
top_n = 10
)
t12_toy <- fair_average_table(fit_toy, diagnostics = diag_toy)
t13_toy <- bias_interaction_report(
estimate_bias(fit_toy, diag_toy,
facet_a = "Rater", facet_b = "Criterion",
max_iter = 2),
top_n = 10
)
class(summary(t4_toy))
class(summary(t12_toy))
class(summary(t13_toy))
names(plot(t4_toy, draw = FALSE))
names(plot(t12_toy, draw = FALSE))
names(plot(t13_toy, draw = FALSE))
chk_toy <- reporting_checklist(fit_toy, diagnostics = diag_toy)
subset(
chk_toy$checklist,
Section == "Visual Displays",
c("Item", "DraftReady", "NextAction")
)#> Object SummaryClass
#> 1 unexpected_response_table summary.mfrm_bundle
#> 2 fair_average_table summary.mfrm_bundle
#> 3 bias_interaction_report summary.mfrm_bundle
#> Object Components
#> 1 unexpected_response_table name, data
#> 2 fair_average_table name, data
#> 3 bias_interaction_report name, data
#> Item DraftReady
#> 1 Wright map TRUE
#> 2 QC / facet dashboard TRUE
#> 3 Residual PCA visuals FALSE
#> 4 Connectivity / design-matrix visual TRUE
#> 5 Inter-rater / displacement visuals TRUE
#> 6 Strict marginal visuals FALSE
#> 7 Bias / DIF visuals FALSE
#> 8 Precision / information curves FALSE
#> 9 Fit/category visuals TRUE
#> NextAction
#> 1 Include a Wright map when the manuscript benefits from a shared-scale targeting display.
#> 2 Use the dashboard as a first-pass triage view, then move to the specific follow-up plot behind each flag.
#> 3 Run residual PCA if you want scree/loadings visuals for residual-structure follow-up.
#> 4 Use the design-matrix view to support linkage and comparability claims.
#> 5 Use displacement and inter-rater views to localize QC issues after dashboard screening.
#> 6 For MML reporting runs, call diagnose_mfrm(..., diagnostic_mode = "both") to enable strict marginal follow-up visuals where supported.
#> 7 Run bias or DIF screening before discussing interaction-level visuals.
#> 8 Use information curves descriptively and keep the current precision-tier caveat in the narrative.
#> 9 Use category curves and fit visuals as local descriptive follow-up after QC screening.
For a realistic analysis, use the packaged Study 1 dataset:
fit <- fit_mfrm(
data = ej2021_study1,
person = "Person",
facets = c("Rater", "Criterion"),
score = "Score",
method = "MML",
model = "RSM",
quad_points = 7
)
diag <- diagnose_mfrm(
fit,
residual_pca = "none"
)
summary(fit)
summary(diag)If you need residual-structure evidence for a final report, you can add residual PCA after the initial diagnostic pass. Treat this as an exploratory screen, not as a standalone unidimensionality test or as a DIMTEST/UNIDIM substitute. In MFRM reporting, a cautious claim should combine global residual fit, element-level fit, residual PCA, and local-dependence screens, for example: “evidence consistent with essential unidimensionality under the specified facet structure.”
For RSM and PCM, the package can now keep
the legacy residual path and the strict marginal path side by side:
fit_rsm_strict <- fit_mfrm(
data = toy,
person = "Person",
facets = c("Rater", "Criterion"),
score = "Score",
method = "MML",
model = "RSM",
quad_points = 7,
maxit = 30
)
diag_rsm_strict <- diagnose_mfrm(
fit_rsm_strict,
diagnostic_mode = "both",
residual_pca = "none"
)
fit_pcm_strict <- fit_mfrm(
data = toy,
person = "Person",
facets = c("Rater", "Criterion"),
score = "Score",
method = "MML",
model = "PCM",
step_facet = "Criterion",
quad_points = 7,
maxit = 30
)
diag_pcm_strict <- diagnose_mfrm(
fit_pcm_strict,
diagnostic_mode = "both",
residual_pca = "none"
)
summary(diag_rsm_strict)$diagnostic_basis[, c("DiagnosticPath", "Status", "Basis")]
summary(diag_pcm_strict)$diagnostic_basis[, c("DiagnosticPath", "Status", "Basis")]When you want a compact simulation-based screening check for the
strict branch, use evaluate_mfrm_diagnostic_screening() on
a small design:
screen_rsm <- evaluate_mfrm_diagnostic_screening(
design = list(person = 18, rater = 3, criterion = 3, assignment = 3),
reps = 1,
scenarios = c("well_specified", "local_dependence"),
model = "RSM",
maxit = 30,
quad_points = 7,
seed = 123
)
screen_pcm <- evaluate_mfrm_diagnostic_screening(
design = list(person = 18, rater = 3, criterion = 3, assignment = 3),
reps = 1,
scenarios = c("well_specified", "step_structure_misspecification"),
model = "PCM",
maxit = 30,
quad_points = 7,
seed = 123
)
screen_rsm$performance_summary[, c("Scenario", "EvaluationUse", "LegacyAnyFlagRate", "StrictAnyFlagRate")]
screen_pcm$performance_summary[, c("Scenario", "EvaluationUse", "LegacySensitivityProxy", "StrictSensitivityProxy", "DeltaStrictMinusLegacyFlagRate")]The same strict branch is now reflected in the reporting router:
chk_rsm_strict <- reporting_checklist(fit_rsm_strict, diagnostics = diag_rsm_strict)
subset(
chk_rsm_strict$checklist,
Section == "Visual Displays" &
Item %in% c("QC / facet dashboard", "Strict marginal visuals", "Precision / information curves"),
c("Item", "Available", "DraftReady", "NextAction")
)pca <- analyze_residual_pca(diag_pca, mode = "both")
plot_residual_pca(pca, mode = "overall", plot_type = "scree")data("mfrmr_example_bias", package = "mfrmr")
bias_df <- mfrmr_example_bias
fit_bias <- fit_mfrm(
bias_df,
person = "Person",
facets = c("Rater", "Criterion"),
score = "Score",
method = "MML",
model = "RSM",
quad_points = 7
)
diag_bias <- diagnose_mfrm(fit_bias, residual_pca = "none")
bias <- estimate_bias(fit_bias, diag_bias, facet_a = "Rater", facet_b = "Criterion")
fixed <- build_fixed_reports(bias)
apa <- build_apa_outputs(fit_bias, diag_bias, bias_results = bias)
mfrm_threshold_profiles()
vis <- build_visual_summaries(fit_bias, diag_bias, threshold_profile = "standard")
vis$warning_map$residual_pca_overallThe same example_bias dataset also carries a
Group variable so DIF-oriented examples can show a non-null
pattern instead of a fully clean result. It can be loaded either with
load_mfrmr_data("example_bias") or
data("mfrmr_example_bias", package = "mfrmr").
spec <- specifications_report(fit, title = "Study run")
data_qc <- data_quality_report(
fit,
data = ej2021_study1,
person = "Person",
facets = c("Rater", "Criterion"),
score = "Score"
)
iter <- estimation_iteration_report(fit, max_iter = 8)
subset_rep <- subset_connectivity_report(fit, diagnostics = diag)
facet_stats <- facet_statistics_report(fit, diagnostics = diag)
cat_structure <- category_structure_report(fit, diagnostics = diag)
cat_curves <- category_curves_report(fit, theta_points = 101)
bias_rep <- bias_interaction_report(bias, top_n = 20)
plot_bias_interaction(bias_rep, plot = "scatter")The package also supports a separate simulation/prediction layer. The key distinction is:
evaluate_mfrm_recovery() checks whether known
generating parameters are recovered under a stated simulation design. It
is the first simulation check to run when you are validating a model
specification or a planned design.evaluate_mfrm_design() and
predict_mfrm_population() are design-level helpers that
summarize expected operating characteristics under an explicit
simulation specification.mfrm_generalizability() and mfrm_d_study()
summarize observed univariate G-study components and analytic D-study
projections. Read IdentificationStatus,
GStatus, and PhiStatus before reporting
projected coefficients; boundary or singular mixed-model fits are
design-identification warnings rather than high-stakes-ready reliability
evidence.predict_mfrm_units() and
sample_mfrm_plausible_values() score future or partially
observed persons under a fixed MML calibration.if (requireNamespace("lme4", quietly = TRUE)) {
gt <- mfrm_generalizability(fit)
gt$coefficients[, c("G", "Phi", "GStatus", "PhiStatus",
"IdentificationStatus")]
ds <- mfrm_d_study(
gt,
data.frame(Rater = c(2, 3, 4), Criterion = 4),
residual_scaling = "sensitivity"
)
ds[, c("n_Rater", "n_Criterion", "ResidualScaling",
"G", "Phi", "GStatus", "PhiStatus", "IdentificationStatus")]
}sim_spec <- build_mfrm_sim_spec(
n_person = 30,
n_rater = 4,
n_criterion = 4,
raters_per_person = 2,
assignment = "rotating"
)
recovery <- suppressWarnings(
evaluate_mfrm_recovery(
sim_spec = sim_spec,
reps = 2,
maxit = 30,
include_diagnostics = TRUE,
diagnostic_fit_df_method = "both",
seed = 2
)
)
summary(recovery)$recovery_summary[, c("ParameterType", "Facet", "RMSE", "Bias")]
plot(recovery, type = "summary", metric = "rmse", draw = FALSE)$data$plot_table
recovery_review <- assess_mfrm_recovery(
recovery,
min_reps = 2,
min_se_available = NULL,
max_mcse_rmse_ratio = NULL,
max_rmse = c(facet = 1, step = 1, default = 1.5),
max_abs_bias = c(default = 0.75)
)
summary(recovery_review)$checklist[, c("Section", "Item", "Status")]
summary(recovery_review)$reading_order
summary(recovery_review)$condition_reporting_notes
summary(recovery_review)$condition_review
summary(recovery_review)$diagnostic_reporting_notes
summary(recovery_review)$diagnostic_review
status_plot <- plot(recovery_review, type = "status", draw = FALSE)
status_plot$data$section_status
status_plot$data$reading_order
metric_plot <- plot(recovery_review, type = "metrics", metric = "rmse", draw = FALSE)
metric_plot$data$plot_table
metric_plot$data$guidance
recovery_bundle <- build_summary_table_bundle(
recovery_review,
appendix_preset = "recommended"
)
recovery_bundle$table_index[, c("Table", "Rows", "Role")]
# For a release-scale check, use the optional validation protocol:
# source(system.file("validation", "recovery-validation.R", package = "mfrmr"))
# validation <- mfrmr_run_recovery_validation(tier = "core", output_dir = "recovery-validation")
# summary(validation)
# validation_summary <- summary(validation)
# validation_summary$reading_order
# validation_summary$topline_release_decision
# validation_summary$release_decision_table
# validation_summary$condition_reporting_notes
# validation_summary$condition_summary
# validation_summary$diagnostic_reporting_notes
# validation_summary$domain_decision_table
# validation_bundle <- build_summary_table_bundle(validation_summary)
# validation_bundle$tables$reading_order
# validation_bundle$tables$topline_release_decision
# validation_bundle$tables$condition_reporting_notes
# validation_bundle$tables$condition_summary
# validation_bundle$tables$diagnostic_reporting_notes
# The top-line table is the release-recovery conclusion. It uses recovery
# metrics, convergence, and Monte Carlo precision from core cases as the
# primary evidence, while ExtendedSensitivityStatus reports non-core stress
# cases separately. The condition notes and condition table keep generator
# stress and score support separate from recovery metrics, so OverallStatus = "review" is not
# read as a recovery-metric failure by itself. The diagnostic reporting notes
# turn fit/separation operating characteristics into report caveats rather
# than release gates.
pred_pop <- predict_mfrm_population(
sim_spec = sim_spec,
reps = 2,
maxit = 30,
seed = 1
)
summary(pred_pop)$forecast[, c("Facet", "MeanSeparation", "McseSeparation")]
keep_people <- unique(toy$Person)[1:18]
toy_mml <- suppressWarnings(
fit_mfrm(
toy[toy$Person %in% keep_people, , drop = FALSE],
person = "Person",
facets = c("Rater", "Criterion"),
score = "Score",
method = "MML",
quad_points = 5,
maxit = 30
)
)
new_units <- data.frame(
Person = c("NEW01", "NEW01"),
Rater = unique(toy$Rater)[1],
Criterion = unique(toy$Criterion)[1:2],
Score = c(2, 3)
)
pred_units <- predict_mfrm_units(toy_mml, new_units, n_draws = 0)
pv_units <- sample_mfrm_plausible_values(toy_mml, new_units, n_draws = 2, seed = 1)
summary(pred_units)$estimates[, c("Person", "Estimate", "Lower", "Upper")]
summary(pv_units)$draw_summary[, c("Person", "Draws", "MeanValue")]For a report or appendix handoff, pass the recovery objects through the same summary-table export route used by the rest of the package:
export_summary_appendix(
list(recovery = recovery, recovery_review = recovery_review),
output_dir = tempdir(),
prefix = "mfrmr_recovery_appendix",
preset = "recommended",
include_html = FALSE,
overwrite = TRUE
)For a quick smoke test, reps = 2 or another very small
value is useful only to check that the data-generating setup and refit
path run. For a study report, increase reps, keep the
ADEMP-style metadata in the exported tables, and set substantive
RMSE/Bias thresholds so that assess_mfrm_recovery() can
mark those rows as ok, review, or
concern rather than not_assessed.