
The MAIHDA package provides a comprehensive toolkit for conducting Multilevel Analysis of Individual Heterogeneity and Discriminatory Accuracy (MAIHDA). This approach is particularly valuable for examining intersectional inequalities in health and social outcomes by considering the joint effects of multiple social categories (e.g., gender, race, socioeconomic status).
maihda() fits the
null and adjusted models, summarises the VPC/ICC and the PCV
decomposition (additive vs. intersectional), and (optionally) compares
across a higher-level group in a single callrun_maihda_app()) for no-code exploratory data analysis
and model fittinglme4
(frequentist), brms (Bayesian), WeMix
(design-weighted pseudo-likelihood), and ordinal
(ordinal::clmm, for cumulative/ordered-factor
outcomes)sampling_weights fit a population-weighted model using
WeMix pseudo-maximum likelihood via engine = "wemix" or a
brms pseudo-posterior via engine = "brms".compare_maihda_groups() contrasts the between-stratum
share of variance (VPC/ICC) and the between-/within-stratum
variance components across levels of a higher-level variable such as
country or region.context = "school" crosses the intersectional strata with a
place/institution level (school, hospital, region)You can install the development version from GitHub:
# install.packages("devtools")
devtools::install_github("hdbt/MAIHDA")or the current stable version from CRAN:
install.packages("MAIHDA")library(MAIHDA)
data("maihda_health_data")
# Everything in one call: null + adjusted fit, VPC/ICC summary, and PCV decomposition.
analysis <- maihda(BMI ~ Age + Gender + Race + Education + (1 | Gender:Race:Education),
data = maihda_health_data)
analysis
summary(analysis)
plot(analysis, type = "vpc")
plot(analysis, type = "effect_decomp")
plot(analysis, type = "predicted", highlight_interactions = TRUE)
plot(analysis) # plots all plot typesmaihda()A wrapper that runs the standard two-model MAIHDA workflow: it fits
the null model (covariates only) and the
adjusted model (plus the dimensions’ additive main
effects, summarises the null-model VPC/ICC, and reports the
PCV (the additive share of the intersectional
inequality). When a group is supplied it also runs this
decomposition within each group. Alternatively,
decomposition = "crossed-dimensions" reads the
additive/interaction split off a single model that enters each
dimension’s main effect as a random intercept. Returns one
maihda_analysis object with print(),
summary(), and plot() methods
maihda_table()Assembles the two standard MAIHDA write-up deliverables from a fitted
maihda() analysis (or a single fit_maihda()
model) in one call: (a) a model-results table
contrasting the null (Model 1) and adjusted (Model 2) fits — intercept,
between-stratum variance and SD, VPC/ICC, the PCV, and (for a binary
outcome) the AUC and Median Odds Ratio — and (b) a ranked-strata
table ordering every stratum by its predicted outcome, with
conditional intervals and the stratum random effect. The
$models data frame is numeric and export-ready
(write.csv() / knitr::kable());
print() renders the familiar “estimate [low, high]” layout
plus the top/bottom strata. It adapts to every fit type
(crossed-dimensions shares, contextual Context share (VPC))
and engine (lme4/brms/WeMix/ordinal); for an ordinal model the intercept
row is NA and the category thresholds are reported by
summary(), not the table. For a longitudinal (growth-curve)
fit each stratum is a trajectory rather than a single ranked value, so
$strata is NULL (with a
$strata_note pointing to
predict(type = "strata") /
plot(type = "trajectories")); the $models
table is still produced.
analysis <- maihda(BMI ~ Age + Gender + Race + (1 | Gender:Race), data = maihda_health_data)
tab <- maihda_table(analysis)
tab # printed: Model 1 vs Model 2 table + highest/lowest strata
tab$models # numeric, export-ready results table
tab$strata # all strata ranked by predicted BMImake_strata()Creates intersectional strata from multiple categorical variables with optional minimum count filtering.
fit_maihda()Fits multilevel models using the lme4 (default), brms, WeMix
(design-weighted, via sampling_weights), or ordinal
(ordinal::clmm, selected automatically for ordered-factor
outcomes) engine. Supports various families including gaussian,
binomial, poisson, negbinomial (overdispersed counts; theta estimated
via lme4::glmer.nb() or brms’s shape
parameter, with the VPC using the latent-scale level-1 variance of
Nakagawa, Johnson & Schielzeth 2017), and ordinal/cumulative
(proportional-odds models for ordered outcomes; latent-scale VPC with
pi^2/3 logit / 1 probit level-1 variance, response-scale predictions as
expected category scores).
context =)The cross-classified design crosses individuals’
intersectional strata with a higher-level context –
hospitals (patient survival), schools (student achievement),
neighbourhoods. Pass context = "school" to
fit_maihda() or maihda() to fit
outcome ~ covars + (1 | stratum) + (1 | school) in one
model; the summary then splits the unexplained variance into
between-stratum (intersectional, net of the context),
between-context (the general contextual effect), and
residual, and plot(type = "context_vpc")
visualises the partition.
data(maihda_country_data)
# Strata (gender x SES) crossed with country in ONE model -- contrast with
# group = "country", which instead fits an independent model per country.
a <- maihda(math ~ 1 + (1 | gender:ses), data = maihda_country_data,
context = "country")
a$summary$context$vpc_context_total # the country share of unexplained variance
plot(a, type = "context_vpc")A context with few levels (like these 6 countries) weakly identifies
its variance; prefer many-level contexts or engine = "brms"
for serious use.
summary()Provides comprehensive model summaries including:
predict_maihda()Makes predictions at two levels:
plot()Creates various visualizations:
plot_prediction_deviation_panels()Creates a two-panel dashboard for visualizing predicted values and highlighting the most notable cases or strata. What counts as notable depends on the model type — the largest deviation from the mean prediction (Gaussian/Poisson), the largest absolute deviance residual (binomial), or the most surprising observation (ordinal).
compare_maihda()Compares VPC/ICC across multiple models with optional bootstrap
confidence intervals, and (by default, ic = TRUE) appends
relative-fit information criteria — AIC/BIC for the likelihood engines,
WAIC/LOOIC for brms — for comparing model structures.
maihda_ic()Reports the relative-fit information criteria for one or more models
(or a maihda() analysis, expanded into its null and
adjusted models) to help choose between model structures — a
question the VPC/PCV do not address. AIC/BIC for the likelihood engines
(lme4, ordinal) and the Bayesian WAIC/LOOIC for brms, with a
delta column from the best model. REML lmer
fits are refitted with ML so AIC/BIC are comparable across models with
different fixed effects (the null-vs-adjusted case).
compare_maihda_groups()Compares the between-stratum share of variance (VPC/ICC) and the between-/within-stratum variance components across the levels of a higher-level grouping variable such as country, region, or survey wave, fitting a stratified MAIHDA model per group.
calculate_pvc()Calculates the proportional change in between-stratum variance (PCV) between two models.
lme4,
brms, WeMix, ordinal)stepwise_pcv()Evaluates multiple sequential models by iteratively adding covariates
step-by-step. Each step’s PCV is the change in between-stratum variance
contributed by a predictor given the variables already in the model. For
a binary outcome it also reports the discriminatory-accuracy trajectory
(AUC, the step/total change in AUC, and MOR)
alongside the PCV, so the strata’s discriminatory accuracy can be
tracked as covariates are added.
Supply id and time to
fit_maihda()/maihda() for a 3-level
growth-curve MAIHDA (occasions within individuals within strata, with
random intercept and slope on time at both levels), the
life-course extension of Bell, Evans, Holman & Leckie (2024). The
between-stratum VPC is then time-varying, and
maihda(decomposition = "longitudinal") reports the PCV
separately for the baseline (intercept) and the slope variance — the
additive-vs-multiplicative split of the intersectional
trajectory inequality.
data(maihda_long_data) # 600 persons x 5 waves; gender x ethnicity x education strata
# Time-varying VPC from a 3-level growth model
m <- fit_maihda(wellbeing ~ wave + (1 | gender:ethnicity:education),
data = maihda_long_data, id = "id", time = "wave")
summary(m) # baseline VPC + the VPC trajectory over waves
plot(m, type = "vpc_trajectory") # VPC(t) curve
plot(m, type = "trajectories") # predicted per-stratum mean trajectories
# Additive-vs-multiplicative PCV (null vs adjusted growth model)
a <- maihda(wellbeing ~ wave + (1 | gender:ethnicity:education),
data = maihda_long_data, id = "id", time = "wave",
decomposition = "longitudinal")
a$pcv # PCV_intercept (baseline) and PCV_slope (trajectory)
plot(a, type = "pcv_trajectory")run_maihda_app()Launches a locally-hosted, interactive Shiny Dashboard that exposes the core functionalities for data modeling, visualization, and summarization visually.
# Requires brms package
model_brms <- fit_maihda(
BMI ~ Age + (1 | Gender:Race:Education),
data = maihda_health_data,
engine = "brms",
chains = 4,
iter = 2000
)
summary_brms <- summary(model_brms)For complex-survey data (NHANES, PISA, …), pass the sampling-weight
column via sampling_weights. Survey weights are
not lme4 weights= (those are precision
weights), so the fit routes through WeMix::mix() – weighted
pseudo-maximum-likelihood (Rabe-Hesketh & Skrondal 2006) – and
engine = "lme4" with sampling weights is an error rather
than a silent misfit.
# One call: design-weighted null + adjusted models and PCV. The engine switches
# to "wemix" automatically (install WeMix from CRAN).
analysis <- maihda(
outcome ~ age + (1 | gender:race:education),
data = survey_data,
sampling_weights = "person_weight"
)
analysis # design-weighted VPC/ICC and PCV
summary(analysis) # design-consistent (sandwich) SEs for the fixed effects
# Works across the toolkit: stepwise PCV, group comparison, prediction, plots,
# and the design-weighted AUC for binary outcomes.
stepwise_pcv(strata_data, "outcome", c("gender", "race", "education"),
sampling_weights = "person_weight")
# Bayesian alternative: weights enter as likelihood weights (pseudo-posterior --
# point estimates target the population-weighted estimand, not full design-based
# inference; credible intervals are not design-based).
fit_maihda(outcome ~ age + (1 | gender:race:education), data = survey_data,
engine = "brms", sampling_weights = "person_weight")# Incrementally track the explained variables interactively
stepwise_results <- stepwise_pcv(
data = data,
outcome = "health_outcome",
vars = c("gender", "race", "age")
)
print(stepwise_results)broom)tidy() and glance() methods turn a fitted
model or analysis into tidy data for tables (gt,
flextable) and ggplot2, reading the quantities
summary() already computes (no new statistics, no
refit):
analysis <- maihda(BMI ~ Age + Gender + Race + Education + (1 | Gender:Race:Education),
data = maihda_health_data)
# One-row headline: VPC/ICC, PCV, AUC/MOR (binary), additive/interaction shares, n.
# This row has no equivalent in broom.mixed/easystats -- PCV needs the null+adjusted pair.
glance(analysis)
# Per-estimate tidy frames (broom's term/estimate/std.error/conf.low/conf.high shape):
tidy(analysis) # stratum (intersection) effects, with labels
tidy(analysis, component = "variance") # variance components
tidy(analysis, component = "fixed", which = "adjusted")
# A caterpillar plot in two lines:
library(ggplot2)
tidy(analysis) |>
ggplot(aes(reorder(label, estimate), estimate, ymin = conf.low, ymax = conf.high)) +
geom_pointrange() + coord_flip()You can access a live, cloud-hosted version of the MAIHDA interactive dashboard directly in your browser without installing R: https://hdbt.shinyapps.io/shiny/
Alternatively, you can run all analyses described above in the browser locally by typing:
library(MAIHDA)
run_maihda_app()For detailed documentation and examples, see the package vignette:
vignette("introduction", package = "MAIHDA")Required:
Optional:
sampling_weightsrun_maihda_app())Bootstrap confidence intervals use a parametric bootstrap via
lme4::simulate() / lme4::refit(); no external
bootstrap package is required.
Evans, C. R., Williams, D. R., Onnela, J. P., & Subramanian, S. V. (2018). A multilevel approach to modeling health inequalities at the intersection of multiple social identities. Social Science & Medicine, 203, 64-73.
Merlo, J. (2018). Multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA) within an intersectional framework. Social Science & Medicine, 203, 74-80.
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
If you use this package in your research, please cite:
Bulut (2025). *MAIHDA: Multilevel Analysis of Individual Heterogeneity and Discriminatory Accuracy.* R package version 0.2.0, https://github.com/hdbt/MAIHDA. doi: 10.32614/CRAN.package.MAIHDA
A BibTeX entry for LaTeX users is:
@Manual{Bulut2025MAIHDA,
title = {MAIHDA: Multilevel Analysis of Individual Heterogeneity and Discriminatory Accuracy},
author = {Hamid Bulut},
year = {2025},
note = {R package version 0.2.0},
url = {https://github.com/hdbt/MAIHDA},
doi = {10.32614/CRAN.package.MAIHDA}
}