NEWS


mfrmr 0.2.0 (2026-05-16)

This is the first CRAN-facing release after 0.1.5. Version 0.1.6 was used as an unpublished development line, so most users will experience 0.2.0 as a direct upgrade from CRAN 0.1.5 to CRAN 0.2.0. Read this section as the public change log relative to 0.1.5; the later development-only "mfrmr 0.1.6" section is retained only as detailed development history.

0.2.0 is a substantial analysis and reporting release. It expands the bounded-GPCM route, adds shrinkage and design-audit workflows, improves APA/reporting handoff, adds residual dimensionality and classical DIF screening helpers, broadens the visualization surface, and corrects several statistical documentation and output contracts.

Upgrade notes from CRAN 0.1.5

Review these points before rerunning an existing 0.1.5 analysis script:

For package maintainers and users who installed GitHub snapshots during development: the 0.1.6 section below records what happened during that unpublished development line, but 0.2.0 is the public release boundary.

Citation and attribution corrections

Documentation refinements

Release overview

The subsections below give the detailed 0.2.0 release notes. They cover both the unpublished 0.1.6 development work and the final 0.2.0 changes because CRAN users are upgrading directly from 0.1.5.

Default changes

Relative to CRAN 0.1.5, three user-visible defaults change:

For reporting consistency, plot_person_fit(), plot_bubble(), and plot_facet_quality_dashboard() now inherit the active MnSq screening band from mfrm_misfit_thresholds() when no manual band is supplied. Pass explicit plot thresholds (for example lower = 0.5, upper = 1.5, fit_range = c(0.5, 1.5), or misfit_warn = 1.5) to freeze a manual review band.

New features

Residual dimensionality checks

check_residual_dimensionality() adds a parallel-analysis layer for residual PCA. It compares observed residual eigenvalues with null eigenvalues from independent residual matrices, column-wise residual permutations, or fitted-model parametric simulations. The companion plot_residual_dimensionality() returns the same mfrm_plot_data payload style as other visual helpers, and as.data.frame() exposes comparison, observed, and null-distribution tables for CSV export or custom plotting. The help page explicitly distinguishes this exploratory residual-structure diagnostic from FACETS ZSTD, TAM itemfit ZSTD, and mirt's S-X2 statistic.
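The column-wise permutation variant of parallel analysis can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: parallel_analysis_sketch() is a hypothetical helper, the residual matrix is simulated, and the 95% null quantile is an arbitrary choice for the example.

```r
# Hypothetical helper: permutation-based parallel analysis on a residual matrix
# (rows = persons, columns = items or facet levels).
parallel_analysis_sketch <- function(resid, n_perm = 200, probs = 0.95, seed = 1) {
  set.seed(seed)
  observed <- eigen(cor(resid), symmetric = TRUE, only.values = TRUE)$values
  # Each replicate shuffles every column independently, which keeps the
  # marginal distributions but destroys any between-column structure.
  null <- replicate(n_perm, {
    permuted <- apply(resid, 2, sample)
    eigen(cor(permuted), symmetric = TRUE, only.values = TRUE)$values
  })
  threshold <- apply(null, 1, quantile, probs = probs, names = FALSE)
  data.frame(
    component = seq_along(observed),
    observed  = observed,
    null_q    = threshold,
    exceeds   = observed > threshold
  )
}

# Simulated residuals with one injected secondary dimension on columns 1-3.
set.seed(42)
n <- 300
shared <- rnorm(n)
resid <- matrix(rnorm(n * 6), n, 6)
resid[, 1:3] <- resid[, 1:3] + 0.8 * shared
pa <- parallel_analysis_sketch(resid)
print(pa)
```

Components whose observed eigenvalue exceeds the null quantile are candidates for follow-up, which matches the exploratory framing of the diagnostic.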

Continuous integration

New GitHub Actions workflows were added alongside the existing pkgdown.yaml: R-CMD-check.yaml runs the check matrix on Ubuntu (release / devel / oldrel-1) plus macos-latest and windows-latest (release), and test-coverage.yaml runs covr with artifact upload (no external service is contacted).

Differential-functioning display controls

plot_dif_heatmap() gains display controls for cell labels (show_values, value_digits), absolute flag thresholds (flag_threshold, flag_color), and shared symmetric color limits (scale_limit) so several heatmaps can be drawn on a comparable scale.

plot_dif_summary() gains optional normal-approximation confidence intervals, effect-threshold guide lines, method-aware axis labels, and an interpretation-guide payload that downstream code can render alongside the figure.

Plot payload printing

print.mfrm_plot_data() is now defined, so the headline draw = FALSE return value renders as a compact summary (name, title, payload shapes, legend / reference-line counts) instead of a raw list dump.

Classical DIF and classic curve front doors

analyze_dif_classical() adds a limited classical screening route for long-format many-facet data. It supports generalized Mantel-Haenszel / Cochran-Mantel-Haenszel screening over ordered score categories and binary logistic DIF screening when the dichotomization is explicit through logistic_threshold. It does not implement SIBTEST, does not estimate subgroup MFRM parameters, and does not claim ETS A/B/C labels.
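For orientation, a generalized Cochran-Mantel-Haenszel screen over score categories can be run with base R's stats::mantelhaen.test(). The sketch below is not analyze_dif_classical() and makes no claim about its arguments or output; the data are simulated, the stratifying variable is an assumed matching score band, and mantelhaen.test() treats the score categories as nominal in its general-association form.

```r
# Simulated long-format ratings: group (reference / focal), score 0-3, stratum.
set.seed(11)
n <- 600
stratum <- factor(sample(paste0("S", 1:4), n, replace = TRUE))
group <- factor(sample(c("reference", "focal"), n, replace = TRUE),
                levels = c("reference", "focal"))

# The focal group is given a different category distribution in every stratum,
# so the screen should flag this simulated item.
is_focal <- group == "focal"
score <- integer(n)
score[is_focal]  <- sample(0:3, sum(is_focal),  replace = TRUE,
                           prob = c(0.1, 0.2, 0.3, 0.4))
score[!is_focal] <- sample(0:3, sum(!is_focal), replace = TRUE,
                           prob = c(0.4, 0.3, 0.2, 0.1))
score <- factor(score, levels = 0:3)

# 2 x 4 x 4 table: group by score category, stratified by matching band.
tab <- table(group, score, stratum)
cmh <- mantelhaen.test(tab)
print(cmh)
```

With two groups and four score categories the generalized statistic has three degrees of freedom.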

Four classic plot entry points are now exported: plot_expected_score_curve(), plot_test_characteristic_curve(), plot_cumulative_category_curve(), and plot_kidmap(). They reuse the package's existing category-curve, design-weighted expectation, and person-fit payloads while giving mirt/TAM/FACETS users familiar names.

Bounded GPCM fair-average and bias unblock (slope-aware)

fair_average_table() and estimate_bias() no longer hard-stop on GPCM fits. Both helpers now use the slope-aware element-conditional GPCM construction:

Both helpers gain method = "GPCM-slope-aware" and a caveat field that names the slope convention and reminds the user that the SE columns are not delta-method standard errors of the fair-average / bias values. A delta-method SE for both is planned for a future release; it requires a vcov() method on the joint covariance of (theta, a, delta), which is not yet exposed. See ?fair_average_table, ?estimate_bias, and gpcm_capability_matrix() for the full support contract.

build_apa_outputs(), build_mfrm_manifest(), build_mfrm_replay_script(), and export_mfrm_bundle() now route bounded GPCM fits through package-native outputs with explicit caveats. facets_parity_report() and facets_output_file_bundle(include = "score") remain blocked under GPCM in 0.2.0 because those FACETS-compatibility outputs are Rasch-family score-side contracts.

Bounded GPCM visual summaries and QC pipeline

diagnose_mfrm() now attaches the slope-aware GPCM fair-average table. build_visual_summaries() and run_qc_pipeline() now accept bounded GPCM fits and return support_status = "supported_with_caveat". Their caveat states that fair-average and bias checks are GPCM-specific exploratory screens, not Rasch-family invariance evidence.

Bug fixes

Documentation

Build hygiene

.Rbuildignore now tightens the inst/references/ source-package boundary. The two runtime / user-facing files in that directory -- facets_column_contract.csv (read at runtime by facets_parity_report()) and FACETS_manual_mapping.md (the FACETS Table to mfrmr helper mapping cited in the README) -- are preserved.

Test and check coverage

The 0.2.0 release line adds regression coverage for the public upgrade surface from 0.1.5: GPCM scope, empirical fit plots, adjusted fit p-value tables, classical DIF screens, classic curve front doors, JML lz_star, residual dimensionality, arbitrary-facet simulation, bias/signal detection simulation, APA/reporting output, namespace contracts, and exception datasets with missing values or sparse score categories. The release candidate was validated with the full local test suite, Rd parsing, R CMD build, and R CMD check.

Performance note

The cpp11 MML backend (src/mml_backend.cpp, RSM and PCM only) is opt-in via options(mfrmr.use_cpp11_backend = TRUE) for this release. It is validated against the pure-R reference at tolerance = 1e-12 on a fixed regression fixture. The default flip to ON is planned for a follow-up release after a cycle of community testing.
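A session-scoped opt-in might look like the following. The option name comes from the note above; when and how the package reads the option is not shown here, and the placeholder comment stands in for the user's own fitting code.

```r
# Opt in for this session and remember the previous setting.
old <- options(mfrmr.use_cpp11_backend = TRUE)
backend_on <- isTRUE(getOption("mfrmr.use_cpp11_backend"))

# ... run RSM / PCM MML fits here ...

# Restore whatever was set before.
options(old)
```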

Deferred to a follow-up release

Scoped during 0.2.0 prep but not shipped in 0.2.0; carried over to a later release:

These are scheduled for a follow-up release.

mfrmr 0.1.6

This unpublished development line added empirical-Bayes shrinkage for small-N facets, a hierarchical-structure and sample-adequacy audit layer, integrated missing-code pre-processing, APA output adapters for Word / HTML, model-estimated two-way non-person facet interactions, confidence-interval propagation through the plot surface and the ICC reporting family, and expanded reproducibility manifests. Its public changes ship as part of 0.2.0, not as a separate CRAN release. Some development-only entries below were later corrected before 0.2.0; the top 0.2.0 section is authoritative for current behavior.

Development default flips included in 0.2.0

Three default values were flipped during this development line and ship publicly in 0.2.0. Scripts that explicitly pass the old value are unaffected; scripts that rely on the 0.1.5 defaults should be reviewed.

New features

Model-estimated facet interactions

fit_mfrm() gains facet_interactions for confirmatory two-way interactions between non-person facets in RSM and PCM fits, for example facet_interactions = "Rater:Criterion". These terms are estimated simultaneously with the main MFRM parameters as fixed effects under zero marginal-sum constraints, contributing (A - 1) * (B - 1) free parameters for an A x B interaction block.
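The parameter count follows from the zero marginal-sum constraints: once an (A - 1) x (B - 1) core is free, the last row and last column are determined. The base-R sketch below illustrates that bookkeeping only; expand_interaction_sketch() is a hypothetical helper, and which cells the package treats as free is an assumption made for the example.

```r
# Hypothetical helper: expand (A - 1) * (B - 1) free parameters into a full
# A x B interaction block whose rows and columns each sum to zero.
expand_interaction_sketch <- function(free, A, B) {
  stopifnot(length(free) == (A - 1) * (B - 1))
  core <- matrix(free, nrow = A - 1, ncol = B - 1)
  full <- rbind(core, -colSums(core))   # last row closes each column
  full <- cbind(full, -rowSums(full))   # last column closes each row
  full
}

# A 4 x 3 block (for example 4 raters by 3 criteria) has 3 * 2 = 6 free terms.
block <- expand_interaction_sketch(c(0.4, -0.1, 0.2, 0.3, -0.5, 0.1), A = 4, B = 3)
print(round(block, 2))
print(rowSums(block))
print(colSums(block))
```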

New supporting pieces:

The feature is intentionally narrow in its 0.2.0 public form: person-involving interactions, higher-order interactions, GPCM interactions, and random-effect facet interactions are deferred. Residual bias screening via estimate_bias() and estimate_all_bias() remains separate from these model-estimated fixed effects.

Empirical-Bayes facet shrinkage

fit_mfrm(..., facet_shrinkage = "empirical_bayes") applies James-Stein / empirical-Bayes shrinkage to each non-person facet's fixed-effect estimates. fit$facets$others gains ShrunkEstimate, ShrunkSE, and ShrinkageFactor columns, and fit$shrinkage_report summarises the per-facet prior variance, mean shrinkage, and effective degrees of freedom.
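A generic method-of-moments shrinkage calculation can be sketched in base R. This is a textbook normal-normal illustration, not the package's code: eb_shrink_sketch() and its column names are hypothetical, and the prior mean (the facet mean), the truncation of the prior variance at zero, and the direction of the weight (1 = no shrinkage) are all assumptions.

```r
# Hypothetical helper: shrink facet-level estimates toward their mean.
eb_shrink_sketch <- function(estimate, se) {
  # Prior variance: observed spread minus average sampling variance, floored at 0.
  tau2 <- max(0, var(estimate) - mean(se^2))
  weight <- tau2 / (tau2 + se^2)          # 0 = full shrinkage, 1 = none
  center <- mean(estimate)
  data.frame(
    estimate  = estimate,
    shrunk    = center + weight * (estimate - center),
    shrunk_se = sqrt(weight) * se,        # posterior SD under the normal-normal model
    weight    = weight
  )
}

# Five facet levels; levels with large standard errors are pulled in the most.
est <- c(-1.2, -0.3, 0.1, 0.5, 0.9)
se  <- c(0.20, 0.25, 0.60, 0.30, 0.80)
out <- eb_shrink_sketch(est, se)
print(round(out, 3))
```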

The estimator is the classical method-of-moments form (Efron & Morris, 1973):

Two post-hoc helpers make shrinkage available to existing fits:

The "laplace" alias currently routes to the empirical-Bayes path and is reserved for a future penalised-MML implementation.

Integration: summary(fit) exposes FacetShrinkage and FacetShrinkageTau2Mean; build_apa_outputs() adds a Method-section sentence naming the mode, mean tau_hat^2, and mean shrinkage with an Efron & Morris (1973) citation; build_mfrm_manifest() gains a shrinkage_audit table; reporting_checklist() gains an "Empirical-Bayes shrinkage" item.

Hierarchical structure and sample-adequacy audit

Five new exported functions describe the observed design, flag small-N facet levels, and quantify ICC / design effect. Estimation remains fixed-effects MFRM; these helpers are purely descriptive and do not alter the fit.

Fit- and reporting-stack integration:

Optional dependencies igraph and lme4 move to Suggests; when either is absent the relevant report is omitted with a clear message().

Missing-code pre-processing in the fit call

fit_mfrm() now accepts missing_codes = NULL | TRUE | "default" | <character vector>, forwarded to prepare_mfrm_data(), audit_mfrm_anchors(), and describe_mfrm_data(). When active, the standard FACETS / SPSS / SAS sentinels ("99", "999", "-1", "N", "NA", "n/a", ".", "" by default, or any caller-supplied set) are converted to NA on the person, facets, and score columns before any downstream processing. Replacement counts are recorded in fit$prep$missing_recoding and surfaced through build_mfrm_manifest()$missing_recoding. The default (missing_codes = NULL) is strictly backward-compatible.

A standalone recode_missing_codes() helper is also exported for users who prefer to recode before calling fit_mfrm().
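The recoding step itself is simple to picture in base R. The sketch below is not recode_missing_codes(); recode_sentinels_sketch() is a hypothetical stand-in that uses the default sentinel set listed above, assumes exact string matching, and returns per-column replacement counts in a shape chosen for the example.

```r
# Default sentinel set as listed in the notes above.
default_codes <- c("99", "999", "-1", "N", "NA", "n/a", ".", "")

# Hypothetical helper: convert sentinel strings to NA on selected columns and
# count how many values were replaced in each column.
recode_sentinels_sketch <- function(data, columns, codes = default_codes) {
  counts <- integer(0)
  for (col in columns) {
    values <- as.character(data[[col]])
    hit <- !is.na(values) & values %in% codes
    data[[col]][hit] <- NA
    counts[col] <- sum(hit)
  }
  list(data = data, replaced = counts)
}

ratings <- data.frame(
  Person = c("P1", "P2", "n/a", "P4"),
  Rater  = c("R1", ".", "R2", "R1"),
  Score  = c("3", "99", "2", ""),
  stringsAsFactors = FALSE
)
out <- recode_sentinels_sketch(ratings, c("Person", "Rater", "Score"))
print(out$data)
print(out$replaced)
```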

APA output adapters

kableExtra and flextable join Suggests.

Shrinkage and audit visualisations

All three methods follow the existing preset = c("standard", "publication", "compact") convention and use base-R graphics.

Confidence intervals across the plot surface

Additional visualisations

Fourteen additions across the plot surface, all base-R / additive (default behaviours unchanged):

igraph is already in Suggests; the equating-graph view falls back to the bar chart when igraph is not installed.

Expanded test coverage

Direct regression tests for these development additions:

Internal architecture

row_max_fast() and the three category_prob_* polytomous-response kernels are now in R/core-category-probabilities.R instead of inline in R/mfrm_core.R. Pure file-level reorganization; no behaviour change. The remaining structural split of mfrm_core.R (likelihood / optimizer / EM / gradients / prep / report tables) is scheduled for a future release.

Package-level MnSq misfit threshold

mfrm_misfit_thresholds() returns the lower / upper active MnSq screening band that mfrmr screens use when flagging element-level Infit / Outfit MnSq misfit. Defaults are c(lower = 0.5, upper = 1.5) and can be overridden globally via R options:

Helpers that consume the band include summary(diagnose_mfrm(...)) (misfit_flagged block + key_warnings auto-flag), build_misfit_casebook() (the new element_fit source family), the bias / misfit narrative inside build_apa_outputs(), and facet_quality_dashboard() when misfit_warn = NULL. Setting the options once at the top of an analysis script therefore changes every downstream screen at once.

Additional secondary plots

Four new public helpers extend the diagnostic plot family:

plot_bubble() gains a view = c("measure", "infit_outfit") argument. The default "measure" keeps the historical Measure (logit) x MnSq bubble layout; view = "infit_outfit" switches to the Winsteps Table 30 layout (Infit MnSq on x, Outfit MnSq on y, bubble size defaults to N). Both views return the same mfrm_plot_data contract.

plot_dif_heatmap(draw = FALSE) now returns an mfrm_plot_data payload whose data$matrix is the metric matrix; previously it returned only the bare matrix.

plot_information(..., draw = FALSE) payloads now include a series field listing which curves the legend describes ("Information", "SE", or both for type = "both"), so downstream ggplot2 re-renderers can map the right column without inspecting type manually.

Reporting surface enrichments

Internal architecture: file split

To improve navigability of the core estimation engine, four self-contained sections moved out of R/mfrm_core.R into focused files. All functions remain internal and the public API is unchanged.

R/api-simulation.R similarly grew an R/api-simulation-future-branch.R companion file holding the future-branch design-schema layer. Public simulation entry points (simulate_mfrm_data, evaluate_mfrm_design, evaluate_mfrm_diagnostic_screening, evaluate_mfrm_signal_detection) remain in R/api-simulation.R.

R/api-plotting-extras2.R was renamed to R/api-plotting-screening.R to drop the numerical suffix in favour of a functional name; tests follow the same rename.

A new tests/testthat/helper-fixtures.R exposes make_toy_fit() / make_toy_diagnostics() / local_toy_fit() helpers so future tests can reuse the standard example_core fit without retyping the load_mfrmr_data() + fit_mfrm() + diagnose_mfrm() chain.

Replay-script overhaul

export_mfrm_bundle() and build_mfrm_replay_script() now write a self-contained replay package:

Performance: diagnose_mfrm() on large designs

calc_interrater_agreement() (the inter-rater agreement helper that diagnose_mfrm() calls when Person is part of facet_cols) previously used a list() for the per-context probability lookup and c(exp_vals, ...) accumulation inside a per-row loop. This gave near-quadratic scaling: 6,400 observations took ~2 s, but 72,000 observations took ~141 s. The lookup is now an environment (hash-backed for character keys) and exp_vals is preallocated and filled by index, so the helper now scales linearly in the number of observations. On the 72,000-observation benchmark in the audit, diagnose_mfrm() drops from ~141 s to ~15 s.
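The two patterns can be contrasted in a small base-R sketch. The function and variable names are hypothetical and the lookup table is simulated; the point is only the difference between growing a vector with c() against a named-list lookup and filling a preallocated vector from a hashed environment.

```r
# Slow pattern: named-list lookup plus c() growth inside the loop.
expected_slow <- function(keys, table_list) {
  out <- c()
  for (k in keys) out <- c(out, table_list[[k]])
  out
}

# Faster pattern: hashed environment lookup and a preallocated result vector.
expected_fast <- function(keys, table_list) {
  lookup <- list2env(table_list, hash = TRUE, parent = emptyenv())
  out <- numeric(length(keys))
  for (i in seq_along(keys)) out[i] <- lookup[[keys[i]]]
  out
}

# Simulated per-context values keyed by character labels.
set.seed(3)
contexts <- setNames(as.list(runif(500)), paste0("ctx", seq_len(500)))
keys <- sample(names(contexts), 5000, replace = TRUE)

slow <- expected_slow(keys, contexts)
fast <- expected_fast(keys, contexts)
```

Both functions return the same values; only the cost of each loop iteration differs, which is what changes the overall scaling.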

The make_union_find() helper used by the connectivity audit was also rewritten with an iterative find_root (with path compression) instead of the previous recursive form. Designs whose union chain depth exceeded the expressions option (default 5,000) no longer fail with R's "evaluation nested too deeply" error.
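An iterative find with path compression can be written as two loops: one to locate the root and one to repoint every visited node at it. The closure below is a minimal sketch with hypothetical names, not the package's make_union_find().

```r
# Hypothetical union-find over nodes 1..n with an iterative, compressing find.
make_union_find_sketch <- function(n) {
  parent <- seq_len(n)
  find_root <- function(i) {
    root <- i
    while (parent[root] != root) root <- parent[root]   # pass 1: walk up to the root
    while (parent[i] != root) {                         # pass 2: compress the path
      nxt <- parent[i]
      parent[i] <<- root
      i <- nxt
    }
    root
  }
  unite <- function(a, b) {
    ra <- find_root(a)
    rb <- find_root(b)
    if (ra != rb) parent[ra] <<- rb
    invisible(NULL)
  }
  list(find_root = find_root, unite = unite)
}

# Build a chain 1 -> 2 -> ... -> 10000. Every unite() call only touches roots,
# so the chain stays uncompressed until the first deep find.
uf <- make_union_find_sketch(10000)
for (i in 1:9999) uf$unite(i, i + 1)
root_of_first <- uf$find_root(1)    # walks 9,999 links without recursion
```

After the first deep find every node on the path points directly at the root, so later lookups along that path take one step.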

Input validation: degenerate inputs surface earlier

prepare_mfrm_data() now:

fit_mfrm() now treats NaN / Inf for maxit, reltol, and quad_points as invalid input with a localised English error, instead of falling through to R's locale-dependent "missing value where TRUE/FALSE needed" message.

Pre-rendered cheatsheet PDF

The two-page landscape cheatsheet now ships in pre-rendered form at system.file("cheatsheet", "mfrmr-cheatsheet.pdf", package = "mfrmr") alongside the existing .Rmd source. Users without a working LaTeX toolchain can open the PDF directly; users who want to customize it can still knit the .Rmd with rmarkdown::render(). The README and ?mfrmr package help now point at both files.

Help-page examples: "what to look for" annotations

The most-visited help pages now embed concrete interpretation comments inside their @examples blocks. Each shipped example shows what value ranges or patterns indicate "good", what threshold or rule of thumb applies, and what follow-up to run if the value is off. Coverage in this development line includes:

Help-page examples: lighter-weight \donttest{}

Several main entry points now expose a small fast-path block (a JML fit on example_core plus a single diagnostic / plot call) before the heavier \donttest{} block. The fast path is below R CMD check's example-time budget and provides a regression net that runs on every check, while the full \donttest{} block continues to showcase the larger MML / publication-route examples. Affected pages: ?fit_mfrm, ?diagnose_mfrm, ?plot_qc_dashboard, ?reporting_checklist, ?build_apa_outputs.

Documentation

Yen Q3 local-dependence statistic

q3_statistic(fit, diagnostics) returns the Yen (1984) Q3 index between every facet-level pair, with three published reporting thresholds (Yen 0.20, Marais 0.30, Christensen et al. relative 0.20) and a textual Interpretation column that names which flag(s) each pair triggered. The helper reuses the standardized-residual pivot that plot_local_dependence_heatmap() already draws, so the table and the heatmap stay numerically consistent.
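Conceptually, Q3 is the correlation between the standardized residuals of two elements, computed across persons. The base-R sketch below shows that computation on simulated residuals. It is not q3_statistic(): q3_sketch() and its column names are hypothetical, and applying the 0.20 and 0.30 cut-offs to the absolute value and the relative cut-off to Q3 minus the mean off-diagonal Q3 are assumptions about how the thresholds are used.

```r
# Hypothetical helper: pairwise Q3 from a persons-by-elements matrix of
# standardized residuals, with three screening flags per pair.
q3_sketch <- function(std_resid) {
  q3 <- cor(std_resid, use = "pairwise.complete.obs")
  idx <- which(upper.tri(q3), arr.ind = TRUE)   # same order as q3[upper.tri(q3)]
  values <- q3[upper.tri(q3)]
  relative <- values - mean(values)
  data.frame(
    a             = colnames(q3)[idx[, "row"]],
    b             = colnames(q3)[idx[, "col"]],
    Q3            = values,
    flag_abs_0.20 = abs(values) > 0.20,
    flag_abs_0.30 = abs(values) > 0.30,
    flag_rel_0.20 = relative > 0.20,
    stringsAsFactors = FALSE
  )
}

# Simulated residuals: I1 and I2 share an extra component (local dependence).
set.seed(7)
n <- 400
z <- matrix(rnorm(n * 5), n, 5, dimnames = list(NULL, paste0("I", 1:5)))
dep <- rnorm(n)
z[, "I1"] <- z[, "I1"] + dep
z[, "I2"] <- z[, "I2"] + dep
q3_table <- q3_sketch(z)
print(q3_table)
```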

Extended person-fit indices

compute_person_fit_indices(diagnostics, fit) was introduced during development as an extension to the Infit / Outfit / ZSTD columns that diagnose_mfrm() already exposes. Its final 0.2.0 public contract is:

The development-only ECI4 column was removed before the 0.2.0 public release because it duplicated the standardized chi-square / Outfit-ZSTD approximation rather than implementing Tatsuoka and Tatsuoka's extended-caution index. Use OutfitZSTD for the equivalent screen.

Generalizability-theory adapter

mfrm_generalizability(fit) re-fits the rating data as a crossed random-effects model Score ~ 1 + (1 | Person) + (1 | Facet1) + ... via lme4::lmer and returns the canonical G / Phi coefficients plus per-source variance components. Useful when a reviewer asks for a generalizability-theory complement to the Rasch-style separation / reliability statistics that diagnose_mfrm() already emits.

Import adapters: mirt / TAM / eRm

Three thin importers expose external fit objects via the same mfrm_fit interface that the mfrmr plot and table helpers consume:

The imported objects carry the mfrm_imported_fit class and populate measurement-side slots (facets$person, facets$others, steps, summary) only. Bias / DIF / anchor / replay slots are explicitly not populated; full bundle import is planned for a future release.

Parallel parametric-bootstrap ICC

compute_facet_icc(boot = "boot") gains ci_boot_parallel ("no" / "multicore" / "snow") and ci_boot_ncpus arguments that are forwarded to lme4::bootMer(). The per-replicate cli progress bar is suppressed under parallel execution because worker processes hold their own copy of the progress state.

Parallel evaluate_mfrm_design (scaffold)

evaluate_mfrm_design() accepts a parallel = c("no", "future") argument. When "future" is requested and the suggested package future.apply is installed, the rep loop within each design row honours whatever future::plan() is currently active; cross-design-row parallelism is planned for a future release. Without future.apply the call falls back to serial execution with an explicit message.

Resumable MML EM fits

fit_mfrm() accepts a checkpoint = list(file = ..., every_iter = ...) argument. When supplied to an mml_engine = "em" (or hybrid) fit, the EM scaffolding writes its state to file every every_iter outer iterations using saveRDS(). If the file exists when a subsequent call starts, the engine resumes from the recorded iteration. The direct optim() engine ignores the checkpoint; non-EM fits run unaffected.
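The save-and-resume pattern can be illustrated with a toy iterative loop in base R. Nothing here reflects the package's internal state format: run_with_checkpoint_sketch() is hypothetical, and the running sum stands in for one EM update per iteration.

```r
# Hypothetical loop that checkpoints with saveRDS() and resumes if a file exists.
run_with_checkpoint_sketch <- function(n_iter, file, every_iter = 5) {
  state <- if (file.exists(file)) readRDS(file) else list(iter = 0L, value = 0)
  while (state$iter < n_iter) {
    state$iter <- state$iter + 1L
    state$value <- state$value + 1 / state$iter^2   # stand-in for one EM update
    if (state$iter %% every_iter == 0L) saveRDS(state, file)
  }
  state
}

ckpt <- tempfile(fileext = ".rds")
partial <- run_with_checkpoint_sketch(12, ckpt)   # last checkpoint is written at iteration 10
resumed <- run_with_checkpoint_sketch(30, ckpt)   # picks up from the saved iteration 10
```

Because the second call restarts from the last saved state rather than from where the first call stopped, iterations 11 and 12 are recomputed; the result matches an uninterrupted run.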

GPCM verification tests

A new tests/testthat/test-gpcm-verification.R exercises every "supported" and "supported_with_caveat" row of gpcm_capability_matrix() on a toy dataset and asserts the documented helper returns the expected shape. "blocked" and "deferred" rows have negative tests that confirm the helper either refuses to run or returns an explicit caveat. These tests make the GPCM scope a contract that future commits cannot silently shrink.

Optional FACETS Table 7 style fit output on fit$facets$others

fit_mfrm(attach_diagnostics = TRUE) runs diagnose_mfrm() once after the fit and merges the per-level SE, Infit, Outfit, and PtMeaCorr columns onto fit$facets$others. This makes the facet table look like a FACETS Table 7 summary without a separate call. The default FALSE preserves the minimal Facet / Level / Estimate layout from 0.1.5.

Reproducibility

build_mfrm_manifest() gains several new tables so replay bundles carry everything a deterministic re-run needs:

digest is added to Suggests.

Bug fixes

Messaging improvements

Documentation and citations

Reference citations corrected:

Plot polish

Other additions

Test suite

At this development checkpoint, 6,380+ tests passed (up from 6,343 in 0.1.5), with 0 failures and 0 errors. Additional 0.2.0 release tests were added later for GPCM, empirical fit, classical DIF, residual dimensionality, arbitrary-facet simulation, and reporting routes. New test files at this checkpoint:

Pre-existing test-harness errors unrelated to 0.1.5 behaviour have also been cleaned up (S3 dispatch, GPCM scope wording, internal-helper prefixing with mfrmr:::).

mfrmr 0.1.5 (2026-04-12)

Maintenance release

First-use workflow

Estimation and scoring

Diagnostics, reporting, and visualization

External-software scope

mfrmr 0.1.4 (2026-03-30)

CRAN resubmission

mfrmr 0.1.3

CRAN resubmission

mfrmr 0.1.2

CRAN resubmission

mfrmr 0.1.1

CRAN resubmission

mfrmr 0.1.0

Initial release

Package operations and publication readiness