mfrmr Workflow

This vignette outlines a reproducible workflow for:

  • loading packaged simulation data
  • fitting an MFRM with flexible facets
  • running diagnostics and residual PCA
  • generating APA and visual summary outputs
  • moving from fitted models into design simulation and fixed-calibration prediction

For a plot-first companion guide, see the separate mfrmr-visual-diagnostics vignette.

For speed-sensitive work, a useful pattern is:

  • start with method = "JML" or quad_points = 7
  • use diagnose_mfrm(..., residual_pca = "none") for the first pass
  • reuse the same diagnostics object in downstream reports and plots

MML and Diagnostic Modes

mfrmr treats MML and JML differently on purpose.

  • MML integrates over the person distribution with Gauss-Hermite quadrature.
  • The current MML branch optimizes the quadrature-based marginal log-likelihood directly; it is not an EM implementation.
  • JML is often useful for quick exploratory passes, but MML remains the preferred route for final estimation and fixed-calibration follow-up.

For RSM and PCM, diagnostics now expose two distinct evidence paths:

  • diagnostic_mode = "legacy" keeps the residual/EAP-based stack.
  • diagnostic_mode = "marginal_fit" adds the strict latent-integrated screen.
  • diagnostic_mode = "both" is the safest default when you want to inspect both views side by side.

The strict marginal branch is still screening-oriented in the current release. Use summary(diag)$diagnostic_basis to separate the legacy residual evidence from the strict marginal evidence rather than pooling them into one decision.

Load Data

library(mfrmr)

list_mfrmr_data()

data("ej2021_study1", package = "mfrmr")
head(ej2021_study1)

study1_alt <- load_mfrmr_data("study1")
identical(names(ej2021_study1), names(study1_alt))

Minimal Runnable Example

Start with the packaged example_core dataset. It is intentionally compact, category-complete, and generated from a single latent trait plus facet main effects so that help-page examples stay fast without relying on undersized toy data. The same object is also available via data("mfrmr_example_core", package = "mfrmr"):

data("mfrmr_example_core", package = "mfrmr")
toy <- mfrmr_example_core

fit_toy <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "JML",
  model = "RSM",
  maxit = 30
)
diag_toy <- diagnose_mfrm(fit_toy, residual_pca = "none")

summary(fit_toy)$overview
summary(diag_toy)$overview
names(plot(fit_toy, draw = FALSE))
#>   Model Method MethodUsed   N Persons Facets FacetInteractions
#> 1   RSM    JML       JMLE 768      48      2                 0
#>   InteractionParameters InteractionCells InteractionSparseCells Categories
#> 1                     0                0                      0          4
#>              LogLik              AIC              BIC Converged Iterations
#> 1 -820.948992262413 1753.89798452483 2013.95020958109      TRUE         71
#>        IterationsBasis MMLEngineRequested MMLEngineUsed MMLEngineDetail
#> 1 function_evaluations                                                 
#>   EMIterations EMConverged EMRelativeChange OptimizerMethod ConvergenceCode
#> 1                                                      BFGS               0
#>     ConvergenceBasis ConvergenceStatus ConvergenceReason ConvergenceSeverity
#> 1 optimizer_gradient         converged     tolerance_met                pass
#>   ConvergenceMessage                      ConvergenceDetail ReviewableWarning
#> 1                    Optimizer returned convergence code 0.             FALSE
#>   GradientReviewTolerance FunctionEvaluations GradientEvaluations
#> 1                   1e-04                  71                  24
#>   TerminalGradientSupNorm TerminalGradientRMS FacetSampleSizeFlag
#> 1      0.0328356577663257 0.00837237974728151              strong
#>   FacetMinLevelN FacetSparseCount ExtremeHighN ExtremeLowN
#> 1            192                0            0           0
#>   Observations Persons Facets Categories Subsets ResidualPCA DiagnosticMode
#> 1          768      48      2          4       1        none           both
#>   Method PrecisionTier   MarginalFit
#> 1    JML   exploratory not_available
#>   Component
#> 1      name
#> 2      data

The same fit can then move through the public first-contact route:

res_toy <- mfrm_results(fit_toy)
report_toy <- mfrm_report(res_toy, style = "qc")

summary(res_toy)$next_actions
summary(report_toy)$overview

export_dir <- file.path(tempdir(), "mfrmr-workflow-export")
export_toy <- export_mfrm_results(
  res_toy,
  output_dir = export_dir,
  include = c("default", "report"),
  overwrite = TRUE
)
head(export_toy$written_files)
#>   Priority               Area
#> 1        1           Overview
#> 2        2             Triage
#> 3        3        Diagnostics
#> 4        4 Visual diagnostics
#> 5        5          Precision
#> 6        6          Reporting
#> 7       11             Tables
#>                                                               Action
#> 1                                  Read the compact results summary.
#> 2                     Read the first-screen triage before branching.
#> 3             Review diagnostic key warnings before report drafting.
#> 4      Open the QC dashboard before drilling into individual tables.
#> 5 Inspect fit, separation, reliability, and ZSTD wording boundaries.
#> 6     Use the reporting checklist as the manuscript-routing surface.
#> 7                     Create an appendix-ready summary-table bundle.
#>                                            Route
#> 1                                   summary(res)
#> 2                            summary(res)$triage
#> 3          summary(res$diagnostics)$key_warnings
#> 4 plot(res, type = "qc", preset = "publication")
#> 5       summary(res$components$precision_review)
#> 6    summary(res$components$reporting_checklist)
#> 7                build_summary_table_bundle(res)
#>                                                                                                                      Reason
#> 1                                      Confirms input mode, model, method, section status, table coverage, and plot routes.
#> 2 Triage orders unavailable, review, information, and OK signals across diagnostics, tables, plots, and reporting surfaces.
#> 3               Diagnostic warnings identify the highest-priority fit, precision, residual, or category follow-up surfaces.
#> 4                                          The QC route gives a first visual check of fit, residual, and category surfaces.
#> 5                   Precision review keeps fit-size, standardized fit, and separation evidence in separate reporting lanes.
#> 6                                                     Checklist rows identify report-ready, missing, and caveated sections.
#> 7                                        The bundle exposes table roles, plot readiness, and conservative appendix presets.
#>   Style OverallStatus     FirstAction ReviewAreas CaveatAreas OptionalAreas
#> 1    qc        review Start with Fit.           2           0             3
#>   UnavailableAreas OkAreas
#> 1                0       0
#>                                                       SourceInclude
#> 1 fit, diagnostics, tables, precision, reporting, categories, plots
#>                 Component Format                                      Path Note
#> 1        summary_overview    csv        mfrmr_results_summary_overview.csv     
#> 2          summary_status    csv          mfrmr_results_summary_status.csv     
#> 3 summary_component_index    csv mfrmr_results_summary_component_index.csv     
#> 4     summary_table_index    csv     mfrmr_results_summary_table_index.csv     
#> 5        summary_plot_map    csv        mfrmr_results_summary_plot_map.csv     
#> 6          summary_triage    csv          mfrmr_results_summary_triage.csv

Diagnostics and Reporting

t4_toy <- unexpected_response_table(
  fit_toy,
  diagnostics = diag_toy,
  abs_z_min = 1.5,
  prob_max = 0.4,
  top_n = 10
)
t12_toy <- fair_average_table(fit_toy, diagnostics = diag_toy)
t13_toy <- bias_interaction_report(
  estimate_bias(fit_toy, diag_toy,
                facet_a = "Rater", facet_b = "Criterion",
                max_iter = 2),
  top_n = 10
)

class(summary(t4_toy))
class(summary(t12_toy))
class(summary(t13_toy))

names(plot(t4_toy, draw = FALSE))
names(plot(t12_toy, draw = FALSE))
names(plot(t13_toy, draw = FALSE))

chk_toy <- reporting_checklist(fit_toy, diagnostics = diag_toy)
subset(
  chk_toy$checklist,
  Section == "Visual Displays",
  c("Item", "DraftReady", "NextAction")
)
#>                      Object        SummaryClass
#> 1 unexpected_response_table summary.mfrm_bundle
#> 2        fair_average_table summary.mfrm_bundle
#> 3   bias_interaction_report summary.mfrm_bundle
#>                      Object Components
#> 1 unexpected_response_table name, data
#> 2        fair_average_table name, data
#> 3   bias_interaction_report name, data
#>                                  Item DraftReady
#> 1                          Wright map       TRUE
#> 2                QC / facet dashboard       TRUE
#> 3                Residual PCA visuals      FALSE
#> 4 Connectivity / design-matrix visual       TRUE
#> 5  Inter-rater / displacement visuals       TRUE
#> 6             Strict marginal visuals      FALSE
#> 7                  Bias / DIF visuals      FALSE
#> 8      Precision / information curves      FALSE
#> 9                Fit/category visuals       TRUE
#>                                                                                                                               NextAction
#> 1                                               Include a Wright map when the manuscript benefits from a shared-scale targeting display.
#> 2                              Use the dashboard as a first-pass triage view, then move to the specific follow-up plot behind each flag.
#> 3                                                  Run residual PCA if you want scree/loadings visuals for residual-structure follow-up.
#> 4                                                                Use the design-matrix view to support linkage and comparability claims.
#> 5                                                Use displacement and inter-rater views to localize QC issues after dashboard screening.
#> 6 For MML reporting runs, call diagnose_mfrm(..., diagnostic_mode = "both") to enable strict marginal follow-up visuals where supported.
#> 7                                                                 Run bias or DIF screening before discussing interaction-level visuals.
#> 8                                      Use information curves descriptively and keep the current precision-tier caveat in the narrative.
#> 9                                                 Use category curves and fit visuals as local descriptive follow-up after QC screening.

Fit and Diagnose with Full Data

For a realistic analysis, use the packaged Study 1 dataset:

fit <- fit_mfrm(
  data = ej2021_study1,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "RSM",
  quad_points = 7
)

diag <- diagnose_mfrm(
  fit,
  residual_pca = "none"
)

summary(fit)
summary(diag)

If you need residual-structure evidence for a final report, you can add residual PCA after the initial diagnostic pass. Treat this as an exploratory screen, not as a standalone unidimensionality test or as a DIMTEST/UNIDIM substitute. In MFRM reporting, a cautious claim should combine global residual fit, element-level fit, residual PCA, and local-dependence screens, for example: “evidence consistent with essential unidimensionality under the specified facet structure.”

diag_pca <- diagnose_mfrm(
  fit,
  residual_pca = "both",
  pca_max_factors = 6
)

summary(diag_pca)

Strict Diagnostics for RSM and PCM

For RSM and PCM, the package can now keep the legacy residual path and the strict marginal path side by side:

fit_rsm_strict <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "RSM",
  quad_points = 7,
  maxit = 30
)
diag_rsm_strict <- diagnose_mfrm(
  fit_rsm_strict,
  diagnostic_mode = "both",
  residual_pca = "none"
)

fit_pcm_strict <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "PCM",
  step_facet = "Criterion",
  quad_points = 7,
  maxit = 30
)
diag_pcm_strict <- diagnose_mfrm(
  fit_pcm_strict,
  diagnostic_mode = "both",
  residual_pca = "none"
)

summary(diag_rsm_strict)$diagnostic_basis[, c("DiagnosticPath", "Status", "Basis")]
summary(diag_pcm_strict)$diagnostic_basis[, c("DiagnosticPath", "Status", "Basis")]

When you want a compact simulation-based screening check for the strict branch, use evaluate_mfrm_diagnostic_screening() on a small design:

screen_rsm <- evaluate_mfrm_diagnostic_screening(
  design = list(person = 18, rater = 3, criterion = 3, assignment = 3),
  reps = 1,
  scenarios = c("well_specified", "local_dependence"),
  model = "RSM",
  maxit = 30,
  quad_points = 7,
  seed = 123
)
screen_pcm <- evaluate_mfrm_diagnostic_screening(
  design = list(person = 18, rater = 3, criterion = 3, assignment = 3),
  reps = 1,
  scenarios = c("well_specified", "step_structure_misspecification"),
  model = "PCM",
  maxit = 30,
  quad_points = 7,
  seed = 123
)

screen_rsm$performance_summary[, c("Scenario", "EvaluationUse", "LegacyAnyFlagRate", "StrictAnyFlagRate")]
screen_pcm$performance_summary[, c("Scenario", "EvaluationUse", "LegacySensitivityProxy", "StrictSensitivityProxy", "DeltaStrictMinusLegacyFlagRate")]

The same strict branch is now reflected in the reporting router:

chk_rsm_strict <- reporting_checklist(fit_rsm_strict, diagnostics = diag_rsm_strict)
subset(
  chk_rsm_strict$checklist,
  Section == "Visual Displays" &
    Item %in% c("QC / facet dashboard", "Strict marginal visuals", "Precision / information curves"),
  c("Item", "Available", "DraftReady", "NextAction")
)

Residual PCA and Reporting

pca <- analyze_residual_pca(diag_pca, mode = "both")
plot_residual_pca(pca, mode = "overall", plot_type = "scree")
data("mfrmr_example_bias", package = "mfrmr")
bias_df <- mfrmr_example_bias
fit_bias <- fit_mfrm(
  bias_df,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "RSM",
  quad_points = 7
)
diag_bias <- diagnose_mfrm(fit_bias, residual_pca = "none")
bias <- estimate_bias(fit_bias, diag_bias, facet_a = "Rater", facet_b = "Criterion")
fixed <- build_fixed_reports(bias)
apa <- build_apa_outputs(fit_bias, diag_bias, bias_results = bias)

mfrm_threshold_profiles()
vis <- build_visual_summaries(fit_bias, diag_bias, threshold_profile = "standard")
vis$warning_map$residual_pca_overall

The same example_bias dataset also carries a Group variable so DIF-oriented examples can show a non-null pattern instead of a fully clean result. It can be loaded either with load_mfrmr_data("example_bias") or data("mfrmr_example_bias", package = "mfrmr").

Human-Readable Reporting API

spec <- specifications_report(fit, title = "Study run")
data_qc <- data_quality_report(
  fit,
  data = ej2021_study1,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score"
)
iter <- estimation_iteration_report(fit, max_iter = 8)
subset_rep <- subset_connectivity_report(fit, diagnostics = diag)
facet_stats <- facet_statistics_report(fit, diagnostics = diag)
cat_structure <- category_structure_report(fit, diagnostics = diag)
cat_curves <- category_curves_report(fit, theta_points = 101)
bias_rep <- bias_interaction_report(bias, top_n = 20)
plot_bias_interaction(bias_rep, plot = "scatter")

Design Simulation and Prediction

The package also supports a separate simulation/prediction layer. The key distinction is:

  • evaluate_mfrm_recovery() checks whether known generating parameters are recovered under a stated simulation design. It is the first simulation check to run when you are validating a model specification or a planned design.
  • evaluate_mfrm_design() and predict_mfrm_population() are design-level helpers that summarize expected operating characteristics under an explicit simulation specification.
  • mfrm_generalizability() and mfrm_d_study() summarize observed univariate G-study components and analytic D-study projections. Read IdentificationStatus, GStatus, and PhiStatus before reporting projected coefficients; boundary or singular mixed-model fits are design-identification warnings rather than high-stakes-ready reliability evidence.
  • predict_mfrm_units() and sample_mfrm_plausible_values() score future or partially observed persons under a fixed MML calibration.
if (requireNamespace("lme4", quietly = TRUE)) {
  gt <- mfrm_generalizability(fit)
  gt$coefficients[, c("G", "Phi", "GStatus", "PhiStatus",
                      "IdentificationStatus")]

  ds <- mfrm_d_study(
    gt,
    data.frame(Rater = c(2, 3, 4), Criterion = 4),
    residual_scaling = "sensitivity"
  )
  ds[, c("n_Rater", "n_Criterion", "ResidualScaling",
         "G", "Phi", "GStatus", "PhiStatus", "IdentificationStatus")]
}
sim_spec <- build_mfrm_sim_spec(
  n_person = 30,
  n_rater = 4,
  n_criterion = 4,
  raters_per_person = 2,
  assignment = "rotating"
)

recovery <- suppressWarnings(
  evaluate_mfrm_recovery(
    sim_spec = sim_spec,
    reps = 2,
    maxit = 30,
    include_diagnostics = TRUE,
    diagnostic_fit_df_method = "both",
    seed = 2
  )
)

summary(recovery)$recovery_summary[, c("ParameterType", "Facet", "RMSE", "Bias")]
plot(recovery, type = "summary", metric = "rmse", draw = FALSE)$data$plot_table

recovery_review <- assess_mfrm_recovery(
  recovery,
  min_reps = 2,
  min_se_available = NULL,
  max_mcse_rmse_ratio = NULL,
  max_rmse = c(facet = 1, step = 1, default = 1.5),
  max_abs_bias = c(default = 0.75)
)

summary(recovery_review)$checklist[, c("Section", "Item", "Status")]
summary(recovery_review)$reading_order
summary(recovery_review)$condition_reporting_notes
summary(recovery_review)$condition_review
summary(recovery_review)$diagnostic_reporting_notes
summary(recovery_review)$diagnostic_review

status_plot <- plot(recovery_review, type = "status", draw = FALSE)
status_plot$data$section_status
status_plot$data$reading_order

metric_plot <- plot(recovery_review, type = "metrics", metric = "rmse", draw = FALSE)
metric_plot$data$plot_table
metric_plot$data$guidance

recovery_bundle <- build_summary_table_bundle(
  recovery_review,
  appendix_preset = "recommended"
)
recovery_bundle$table_index[, c("Table", "Rows", "Role")]

# For a release-scale check, use the optional validation protocol:
# source(system.file("validation", "recovery-validation.R", package = "mfrmr"))
# validation <- mfrmr_run_recovery_validation(tier = "core", output_dir = "recovery-validation")
# summary(validation)
# validation_summary <- summary(validation)
# validation_summary$reading_order
# validation_summary$topline_release_decision
# validation_summary$release_decision_table
# validation_summary$condition_reporting_notes
# validation_summary$condition_summary
# validation_summary$diagnostic_reporting_notes
# validation_summary$domain_decision_table
# validation_bundle <- build_summary_table_bundle(validation_summary)
# validation_bundle$tables$reading_order
# validation_bundle$tables$topline_release_decision
# validation_bundle$tables$condition_reporting_notes
# validation_bundle$tables$condition_summary
# validation_bundle$tables$diagnostic_reporting_notes
# The top-line table is the release-recovery conclusion. It uses recovery
# metrics, convergence, and Monte Carlo precision from core cases as the
# primary evidence, while ExtendedSensitivityStatus reports non-core stress
# cases separately. The condition notes and condition table keep generator
# stress and score support separate from recovery metrics, so OverallStatus = "review" is not
# read as a recovery-metric failure by itself. The diagnostic reporting notes
# turn fit/separation operating characteristics into report caveats rather
# than release gates.

pred_pop <- predict_mfrm_population(
  sim_spec = sim_spec,
  reps = 2,
  maxit = 30,
  seed = 1
)

summary(pred_pop)$forecast[, c("Facet", "MeanSeparation", "McseSeparation")]

keep_people <- unique(toy$Person)[1:18]
toy_mml <- suppressWarnings(
  fit_mfrm(
    toy[toy$Person %in% keep_people, , drop = FALSE],
    person = "Person",
    facets = c("Rater", "Criterion"),
    score = "Score",
    method = "MML",
    quad_points = 5,
    maxit = 30
  )
)

new_units <- data.frame(
  Person = c("NEW01", "NEW01"),
  Rater = unique(toy$Rater)[1],
  Criterion = unique(toy$Criterion)[1:2],
  Score = c(2, 3)
)

pred_units <- predict_mfrm_units(toy_mml, new_units, n_draws = 0)
pv_units <- sample_mfrm_plausible_values(toy_mml, new_units, n_draws = 2, seed = 1)

summary(pred_units)$estimates[, c("Person", "Estimate", "Lower", "Upper")]
summary(pv_units)$draw_summary[, c("Person", "Draws", "MeanValue")]

For a report or appendix handoff, pass the recovery objects through the same summary-table export route used by the rest of the package:

export_summary_appendix(
  list(recovery = recovery, recovery_review = recovery_review),
  output_dir = tempdir(),
  prefix = "mfrmr_recovery_appendix",
  preset = "recommended",
  include_html = FALSE,
  overwrite = TRUE
)

For a quick smoke test, reps = 2 or another very small value is useful only to check that the data-generating setup and refit path run. For a study report, increase reps, keep the ADEMP-style metadata in the exported tables, and set substantive RMSE/Bias thresholds so that assess_mfrm_recovery() can mark those rows as ok, review, or concern rather than not_assessed.