mfrmr Workflow

This vignette outlines a reproducible workflow for:

loading packaged simulation data
fitting an MFRM with flexible facets
running diagnostics and residual PCA
generating APA and visual summary outputs
moving from fitted models into design simulation and fixed-calibration prediction

For a plot-first companion guide, see the separate mfrmr-visual-diagnostics vignette.

For speed-sensitive work, a useful pattern is:

start with method = "JML" or quad_points = 7
use diagnose_mfrm(..., residual_pca = "none") for the first pass
reuse the same diagnostics object in downstream reports and plots

MML and Diagnostic Modes

mfrmr treats MML and JML differently on purpose.

MML integrates over the person distribution with Gauss-Hermite quadrature.
The current MML branch optimizes the quadrature-based marginal log-likelihood directly; it is not an EM implementation.
JML is often useful for quick exploratory passes, but MML remains the preferred route for final estimation and fixed-calibration follow-up.

For RSM and PCM, diagnostics now expose two distinct evidence paths:

diagnostic_mode = "legacy" keeps the residual/EAP-based stack.
diagnostic_mode = "marginal_fit" adds the strict latent-integrated screen.
diagnostic_mode = "both" is the safest default when you want to inspect both views side by side.

The strict marginal branch is still screening-oriented in the current release. Use summary(diag)$diagnostic_basis to separate the legacy residual evidence from the strict marginal evidence rather than pooling them into one decision.

Load Data

library(mfrmr)

list_mfrmr_data()

data("ej2021_study1", package = "mfrmr")
head(ej2021_study1)

study1_alt <- load_mfrmr_data("study1")
identical(names(ej2021_study1), names(study1_alt))

Minimal Runnable Example

Start with the packaged example_core dataset. It is intentionally compact, category-complete, and generated from a single latent trait plus facet main effects so that help-page examples stay fast without relying on undersized toy data. The same object is also available via data("mfrmr_example_core", package = "mfrmr"):

data("mfrmr_example_core", package = "mfrmr")
toy <- mfrmr_example_core

fit_toy <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "JML",
  model = "RSM",
  maxit = 30
)
diag_toy <- diagnose_mfrm(fit_toy, residual_pca = "none")

summary(fit_toy)$overview
summary(diag_toy)$overview
names(plot(fit_toy, draw = FALSE))

#>   Model Method MethodUsed   N Persons Facets FacetInteractions
#> 1   RSM    JML       JMLE 768      48      2                 0
#>   InteractionParameters InteractionCells InteractionSparseCells Categories
#> 1                     0                0                      0          4
#>              LogLik              AIC              BIC Converged Iterations
#> 1 -820.948992262413 1753.89798452483 2013.95020958109      TRUE         71
#>        IterationsBasis MMLEngineRequested MMLEngineUsed MMLEngineDetail
#> 1 function_evaluations                                                 
#>   EMIterations EMConverged EMRelativeChange OptimizerMethod ConvergenceCode
#> 1                                                      BFGS               0
#>     ConvergenceBasis ConvergenceStatus ConvergenceReason ConvergenceSeverity
#> 1 optimizer_gradient         converged     tolerance_met                pass
#>   ConvergenceMessage                      ConvergenceDetail ReviewableWarning
#> 1                    Optimizer returned convergence code 0.             FALSE
#>   GradientReviewTolerance FunctionEvaluations GradientEvaluations
#> 1                   1e-04                  71                  24
#>   TerminalGradientSupNorm TerminalGradientRMS FacetSampleSizeFlag
#> 1      0.0328356577663257 0.00837237974728151              strong
#>   FacetMinLevelN FacetSparseCount ExtremeHighN ExtremeLowN
#> 1            192                0            0           0
#>   Observations Persons Facets Categories Subsets ResidualPCA DiagnosticMode
#> 1          768      48      2          4       1        none           both
#>   Method PrecisionTier   MarginalFit
#> 1    JML   exploratory not_available
#>   Component
#> 1      name
#> 2      data

The same fit can then move through the public first-contact route:

res_toy <- mfrm_results(fit_toy)
report_toy <- mfrm_report(res_toy, style = "qc")

summary(res_toy)$next_actions
summary(report_toy)$overview

export_dir <- file.path(tempdir(), "mfrmr-workflow-export")
export_toy <- export_mfrm_results(
  res_toy,
  output_dir = export_dir,
  include = c("default", "report"),
  overwrite = TRUE
)
head(export_toy$written_files)

#>   Priority               Area
#> 1        1           Overview
#> 2        2             Triage
#> 3        3        Diagnostics
#> 4        4 Visual diagnostics
#> 5        5          Precision
#> 6        6          Reporting
#> 7       11             Tables
#>                                                               Action
#> 1                                  Read the compact results summary.
#> 2                     Read the first-screen triage before branching.
#> 3             Review diagnostic key warnings before report drafting.
#> 4      Open the QC dashboard before drilling into individual tables.
#> 5 Inspect fit, separation, reliability, and ZSTD wording boundaries.
#> 6     Use the reporting checklist as the manuscript-routing surface.
#> 7                     Create an appendix-ready summary-table bundle.
#>                                            Route
#> 1                                   summary(res)
#> 2                            summary(res)$triage
#> 3          summary(res$diagnostics)$key_warnings
#> 4 plot(res, type = "qc", preset = "publication")
#> 5       summary(res$components$precision_review)
#> 6    summary(res$components$reporting_checklist)
#> 7                build_summary_table_bundle(res)
#>                                                                                                                      Reason
#> 1                                      Confirms input mode, model, method, section status, table coverage, and plot routes.
#> 2 Triage orders unavailable, review, information, and OK signals across diagnostics, tables, plots, and reporting surfaces.
#> 3               Diagnostic warnings identify the highest-priority fit, precision, residual, or category follow-up surfaces.
#> 4                                          The QC route gives a first visual check of fit, residual, and category surfaces.
#> 5                   Precision review keeps fit-size, standardized fit, and separation evidence in separate reporting lanes.
#> 6                                                     Checklist rows identify report-ready, missing, and caveated sections.
#> 7                                        The bundle exposes table roles, plot readiness, and conservative appendix presets.
#>   Style OverallStatus     FirstAction ReviewAreas CaveatAreas OptionalAreas
#> 1    qc        review Start with Fit.           2           0             3
#>   UnavailableAreas OkAreas
#> 1                0       0
#>                                                       SourceInclude
#> 1 fit, diagnostics, tables, precision, reporting, categories, plots
#>                 Component Format                                      Path Note
#> 1        summary_overview    csv        mfrmr_results_summary_overview.csv     
#> 2          summary_status    csv          mfrmr_results_summary_status.csv     
#> 3 summary_component_index    csv mfrmr_results_summary_component_index.csv     
#> 4     summary_table_index    csv     mfrmr_results_summary_table_index.csv     
#> 5        summary_plot_map    csv        mfrmr_results_summary_plot_map.csv     
#> 6          summary_triage    csv          mfrmr_results_summary_triage.csv

Diagnostics and Reporting

t4_toy <- unexpected_response_table(
  fit_toy,
  diagnostics = diag_toy,
  abs_z_min = 1.5,
  prob_max = 0.4,
  top_n = 10
)
t12_toy <- fair_average_table(fit_toy, diagnostics = diag_toy)
t13_toy <- bias_interaction_report(
  estimate_bias(fit_toy, diag_toy,
                facet_a = "Rater", facet_b = "Criterion",
                max_iter = 2),
  top_n = 10
)

class(summary(t4_toy))
class(summary(t12_toy))
class(summary(t13_toy))

names(plot(t4_toy, draw = FALSE))
names(plot(t12_toy, draw = FALSE))
names(plot(t13_toy, draw = FALSE))

chk_toy <- reporting_checklist(fit_toy, diagnostics = diag_toy)
subset(
  chk_toy$checklist,
  Section == "Visual Displays",
  c("Item", "DraftReady", "NextAction")
)

#>                      Object        SummaryClass
#> 1 unexpected_response_table summary.mfrm_bundle
#> 2        fair_average_table summary.mfrm_bundle
#> 3   bias_interaction_report summary.mfrm_bundle
#>                      Object Components
#> 1 unexpected_response_table name, data
#> 2        fair_average_table name, data
#> 3   bias_interaction_report name, data
#>                                  Item DraftReady
#> 1                          Wright map       TRUE
#> 2                QC / facet dashboard       TRUE
#> 3                Residual PCA visuals      FALSE
#> 4 Connectivity / design-matrix visual       TRUE
#> 5  Inter-rater / displacement visuals       TRUE
#> 6             Strict marginal visuals      FALSE
#> 7                  Bias / DIF visuals      FALSE
#> 8      Precision / information curves      FALSE
#> 9                Fit/category visuals       TRUE
#>                                                                                                                               NextAction
#> 1                                               Include a Wright map when the manuscript benefits from a shared-scale targeting display.
#> 2                              Use the dashboard as a first-pass triage view, then move to the specific follow-up plot behind each flag.
#> 3                                                  Run residual PCA if you want scree/loadings visuals for residual-structure follow-up.
#> 4                                                                Use the design-matrix view to support linkage and comparability claims.
#> 5                                                Use displacement and inter-rater views to localize QC issues after dashboard screening.
#> 6 For MML reporting runs, call diagnose_mfrm(..., diagnostic_mode = "both") to enable strict marginal follow-up visuals where supported.
#> 7                                                                 Run bias or DIF screening before discussing interaction-level visuals.
#> 8                                      Use information curves descriptively and keep the current precision-tier caveat in the narrative.
#> 9                                                 Use category curves and fit visuals as local descriptive follow-up after QC screening.

Fit and Diagnose with Full Data

For a realistic analysis, use the packaged Study 1 dataset:

fit <- fit_mfrm(
  data = ej2021_study1,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "RSM",
  quad_points = 7
)

diag <- diagnose_mfrm(
  fit,
  residual_pca = "none"
)

summary(fit)
summary(diag)

If you need residual-structure evidence for a final report, you can add residual PCA after the initial diagnostic pass. Treat this as an exploratory screen, not as a standalone unidimensionality test or as a DIMTEST/UNIDIM substitute. In MFRM reporting, a cautious claim should combine global residual fit, element-level fit, residual PCA, and local-dependence screens, for example: “evidence consistent with essential unidimensionality under the specified facet structure.”

diag_pca <- diagnose_mfrm(
  fit,
  residual_pca = "both",
  pca_max_factors = 6
)

summary(diag_pca)

Strict Diagnostics for RSM and PCM

For RSM and PCM, the package can now keep the legacy residual path and the strict marginal path side by side:

fit_rsm_strict <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "RSM",
  quad_points = 7,
  maxit = 30
)
diag_rsm_strict <- diagnose_mfrm(
  fit_rsm_strict,
  diagnostic_mode = "both",
  residual_pca = "none"
)

fit_pcm_strict <- fit_mfrm(
  data = toy,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "PCM",
  step_facet = "Criterion",
  quad_points = 7,
  maxit = 30
)
diag_pcm_strict <- diagnose_mfrm(
  fit_pcm_strict,
  diagnostic_mode = "both",
  residual_pca = "none"
)

summary(diag_rsm_strict)$diagnostic_basis[, c("DiagnosticPath", "Status", "Basis")]
summary(diag_pcm_strict)$diagnostic_basis[, c("DiagnosticPath", "Status", "Basis")]

When you want a compact simulation-based screening check for the strict branch, use evaluate_mfrm_diagnostic_screening() on a small design:

screen_rsm <- evaluate_mfrm_diagnostic_screening(
  design = list(person = 18, rater = 3, criterion = 3, assignment = 3),
  reps = 1,
  scenarios = c("well_specified", "local_dependence"),
  model = "RSM",
  maxit = 30,
  quad_points = 7,
  seed = 123
)
screen_pcm <- evaluate_mfrm_diagnostic_screening(
  design = list(person = 18, rater = 3, criterion = 3, assignment = 3),
  reps = 1,
  scenarios = c("well_specified", "step_structure_misspecification"),
  model = "PCM",
  maxit = 30,
  quad_points = 7,
  seed = 123
)

screen_rsm$performance_summary[, c("Scenario", "EvaluationUse", "LegacyAnyFlagRate", "StrictAnyFlagRate")]
screen_pcm$performance_summary[, c("Scenario", "EvaluationUse", "LegacySensitivityProxy", "StrictSensitivityProxy", "DeltaStrictMinusLegacyFlagRate")]

The same strict branch is now reflected in the reporting router:

chk_rsm_strict <- reporting_checklist(fit_rsm_strict, diagnostics = diag_rsm_strict)
subset(
  chk_rsm_strict$checklist,
  Section == "Visual Displays" &
    Item %in% c("QC / facet dashboard", "Strict marginal visuals", "Precision / information curves"),
  c("Item", "Available", "DraftReady", "NextAction")
)

Residual PCA and Reporting

pca <- analyze_residual_pca(diag_pca, mode = "both")
plot_residual_pca(pca, mode = "overall", plot_type = "scree")

data("mfrmr_example_bias", package = "mfrmr")
bias_df <- mfrmr_example_bias
fit_bias <- fit_mfrm(
  bias_df,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score",
  method = "MML",
  model = "RSM",
  quad_points = 7
)
diag_bias <- diagnose_mfrm(fit_bias, residual_pca = "none")
bias <- estimate_bias(fit_bias, diag_bias, facet_a = "Rater", facet_b = "Criterion")
fixed <- build_fixed_reports(bias)
apa <- build_apa_outputs(fit_bias, diag_bias, bias_results = bias)

mfrm_threshold_profiles()
vis <- build_visual_summaries(fit_bias, diag_bias, threshold_profile = "standard")
vis$warning_map$residual_pca_overall

The same example_bias dataset also carries a Group variable so DIF-oriented examples can show a non-null pattern instead of a fully clean result. It can be loaded either with load_mfrmr_data("example_bias") or data("mfrmr_example_bias", package = "mfrmr").

Human-Readable Reporting API

spec <- specifications_report(fit, title = "Study run")
data_qc <- data_quality_report(
  fit,
  data = ej2021_study1,
  person = "Person",
  facets = c("Rater", "Criterion"),
  score = "Score"
)
iter <- estimation_iteration_report(fit, max_iter = 8)
subset_rep <- subset_connectivity_report(fit, diagnostics = diag)
facet_stats <- facet_statistics_report(fit, diagnostics = diag)
cat_structure <- category_structure_report(fit, diagnostics = diag)
cat_curves <- category_curves_report(fit, theta_points = 101)
bias_rep <- bias_interaction_report(bias, top_n = 20)
plot_bias_interaction(bias_rep, plot = "scatter")

Design Simulation and Prediction

The package also supports a separate simulation/prediction layer. The key distinction is:

evaluate_mfrm_recovery() checks whether known generating parameters are recovered under a stated simulation design. It is the first simulation check to run when you are validating a model specification or a planned design.
evaluate_mfrm_design() and predict_mfrm_population() are design-level helpers that summarize expected operating characteristics under an explicit simulation specification.
mfrm_generalizability() and mfrm_d_study() summarize observed univariate G-study components and analytic D-study projections. Read IdentificationStatus, GStatus, and PhiStatus before reporting projected coefficients; boundary or singular mixed-model fits are design-identification warnings rather than high-stakes-ready reliability evidence.
predict_mfrm_units() and sample_mfrm_plausible_values() score future or partially observed persons under a fixed MML calibration.

if (requireNamespace("lme4", quietly = TRUE)) {
  gt <- mfrm_generalizability(fit)
  gt$coefficients[, c("G", "Phi", "GStatus", "PhiStatus",
                      "IdentificationStatus")]

  ds <- mfrm_d_study(
    gt,
    data.frame(Rater = c(2, 3, 4), Criterion = 4),
    residual_scaling = "sensitivity"
  )
  ds[, c("n_Rater", "n_Criterion", "ResidualScaling",
         "G", "Phi", "GStatus", "PhiStatus", "IdentificationStatus")]
}

sim_spec <- build_mfrm_sim_spec(
  n_person = 30,
  n_rater = 4,
  n_criterion = 4,
  raters_per_person = 2,
  assignment = "rotating"
)

recovery <- suppressWarnings(
  evaluate_mfrm_recovery(
    sim_spec = sim_spec,
    reps = 2,
    maxit = 30,
    include_diagnostics = TRUE,
    diagnostic_fit_df_method = "both",
    seed = 2
  )
)

summary(recovery)$recovery_summary[, c("ParameterType", "Facet", "RMSE", "Bias")]
plot(recovery, type = "summary", metric = "rmse", draw = FALSE)$data$plot_table

recovery_review <- assess_mfrm_recovery(
  recovery,
  min_reps = 2,
  min_se_available = NULL,
  max_mcse_rmse_ratio = NULL,
  max_rmse = c(facet = 1, step = 1, default = 1.5),
  max_abs_bias = c(default = 0.75)
)

summary(recovery_review)$checklist[, c("Section", "Item", "Status")]
summary(recovery_review)$reading_order
summary(recovery_review)$condition_reporting_notes
summary(recovery_review)$condition_review
summary(recovery_review)$diagnostic_reporting_notes
summary(recovery_review)$diagnostic_review

status_plot <- plot(recovery_review, type = "status", draw = FALSE)
status_plot$data$section_status
status_plot$data$reading_order

metric_plot <- plot(recovery_review, type = "metrics", metric = "rmse", draw = FALSE)
metric_plot$data$plot_table
metric_plot$data$guidance

recovery_bundle <- build_summary_table_bundle(
  recovery_review,
  appendix_preset = "recommended"
)
recovery_bundle$table_index[, c("Table", "Rows", "Role")]

# For a release-scale check, use the optional validation protocol:
# source(system.file("validation", "recovery-validation.R", package = "mfrmr"))
# validation <- mfrmr_run_recovery_validation(tier = "core", output_dir = "recovery-validation")
# summary(validation)
# validation_summary <- summary(validation)
# validation_summary$reading_order
# validation_summary$topline_release_decision
# validation_summary$release_decision_table
# validation_summary$condition_reporting_notes
# validation_summary$condition_summary
# validation_summary$diagnostic_reporting_notes
# validation_summary$domain_decision_table
# validation_bundle <- build_summary_table_bundle(validation_summary)
# validation_bundle$tables$reading_order
# validation_bundle$tables$topline_release_decision
# validation_bundle$tables$condition_reporting_notes
# validation_bundle$tables$condition_summary
# validation_bundle$tables$diagnostic_reporting_notes
# The top-line table is the release-recovery conclusion. It uses recovery
# metrics, convergence, and Monte Carlo precision from core cases as the
# primary evidence, while ExtendedSensitivityStatus reports non-core stress
# cases separately. The condition notes and condition table keep generator
# stress and score support separate from recovery metrics, so OverallStatus = "review" is not
# read as a recovery-metric failure by itself. The diagnostic reporting notes
# turn fit/separation operating characteristics into report caveats rather
# than release gates.

pred_pop <- predict_mfrm_population(
  sim_spec = sim_spec,
  reps = 2,
  maxit = 30,
  seed = 1
)

summary(pred_pop)$forecast[, c("Facet", "MeanSeparation", "McseSeparation")]

keep_people <- unique(toy$Person)[1:18]
toy_mml <- suppressWarnings(
  fit_mfrm(
    toy[toy$Person %in% keep_people, , drop = FALSE],
    person = "Person",
    facets = c("Rater", "Criterion"),
    score = "Score",
    method = "MML",
    quad_points = 5,
    maxit = 30
  )
)

new_units <- data.frame(
  Person = c("NEW01", "NEW01"),
  Rater = unique(toy$Rater)[1],
  Criterion = unique(toy$Criterion)[1:2],
  Score = c(2, 3)
)

pred_units <- predict_mfrm_units(toy_mml, new_units, n_draws = 0)
pv_units <- sample_mfrm_plausible_values(toy_mml, new_units, n_draws = 2, seed = 1)

summary(pred_units)$estimates[, c("Person", "Estimate", "Lower", "Upper")]
summary(pv_units)$draw_summary[, c("Person", "Draws", "MeanValue")]

For a report or appendix handoff, pass the recovery objects through the same summary-table export route used by the rest of the package:

export_summary_appendix(
  list(recovery = recovery, recovery_review = recovery_review),
  output_dir = tempdir(),
  prefix = "mfrmr_recovery_appendix",
  preset = "recommended",
  include_html = FALSE,
  overwrite = TRUE
)

For a quick smoke test, reps = 2 or another very small value is useful only to check that the data-generating setup and refit path run. For a study report, increase reps, keep the ADEMP-style metadata in the exported tables, and set substantive RMSE/Bias thresholds so that assess_mfrm_recovery() can mark those rows as ok, review, or concern rather than not_assessed.