def.gof() - the Directed Ebrahim-Farrington (DEF)
goodness-of-fit test. Projects grouped standardized residuals onto a
smooth calibration-shape basis ("poly2",
"poly3", "stukel") and calibrates the
statistic as a weighted sum of chi-square_1 variables (Satterthwaite by
default; Imhof via the suggested CompQuadForm).
basis = "ensemble" is a shortcut to
def.ensemble.gof().
def.ensemble.gof() - combines the three DEF bases
(optionally the omnibus EF, or extra p-values) into one decision via the
Cauchy combination test (default), with minp and
fisher offered for comparison.
ef.gof(), def.gof(), and
def.ensemble.gof() now accept either a
fitted glm or
(y, predicted_probs) as input. For def.gof,
supplying the design matrix X (with the
y/predicted_probs form) gives the exact
calibration; without it a conservative chi-square reference is used and
a warning is issued.
ef.gof() now defaults to the
chi-square reference (method = "chisq"):
the grouped statistic is referred to a chi-square_{G-2} distribution.
Use method = "normal" to reproduce the previous
(standardized-normal) p-value.
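The two reference distributions named in the entries above can be sketched in base R. This is a minimal illustration, not the package's code: `chisq_reference_p()` and `satterthwaite_p()` are hypothetical helper names, the statistic, `G`, and the weights `lambda` are made-up inputs, and an upper-tail p-value is assumed for the chi-square_{G-2} reference.

```r
# Upper-tail p-value for a grouped statistic referred to chi-square_{G-2}.
# `stat` and `G` are hypothetical inputs, not values produced by ef.gof().
chisq_reference_p <- function(stat, G) {
  stopifnot(G > 2, stat >= 0)
  pchisq(stat, df = G - 2, lower.tail = FALSE)
}

# Satterthwaite approximation for Q distributed as sum_i lambda_i * chi-square_1:
# match the mean and variance of Q with a scaled chi-square, a * chi-square_nu.
satterthwaite_p <- function(q, lambda) {
  stopifnot(q >= 0, all(lambda > 0))
  a  <- sum(lambda^2) / sum(lambda)    # scale factor
  nu <- sum(lambda)^2 / sum(lambda^2)  # effective degrees of freedom
  pchisq(q / a, df = nu, lower.tail = FALSE)
}

p_chisq <- chisq_reference_p(stat = 11.3, G = 10)
p_satt  <- satterthwaite_p(q = 6.2, lambda = c(1.4, 0.9, 0.3))
```

When all weights equal 1 the Satterthwaite approximation reduces to an ordinary chi-square with as many degrees of freedom as there are weights, which is a convenient sanity check.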
run.all.gof() - a one-shot runner that returns a
tidy data.frame, one row per test. Pass a fitted
glm for the whole battery, or
(y, predicted_probs) for the prediction-only tests. One
failing test never aborts the run. This build bundles Pearson, Deviance,
Osius-Rojek, Copas-RSS, Hosmer-Lemeshow (deciles and equal-width),
Pigeon-Heyse, EF, the three DEF bases, Stukel, the covariate-space tests
Tsiatis, Xie, and Pulkstenis-Robinson, and the two Cauchy-combination
ensemble rows. Osius-Rojek, Copas-RSS, Pigeon-Heyse, Tsiatis, and
Pulkstenis-Robinson were verified to match their original
implementations to ~1e-15 (Xie’s statistic also matches).
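The "one failing test never aborts the run" behaviour and the Cauchy-combination ensemble rows can be sketched as follows. This is an illustration under stated assumptions, not the code of run.all.gof(): the toy tests, the helper names `run_one()` and `cauchy_combine()`, the column names, and the equal weights in the combination are all hypothetical.

```r
# Cauchy combination of p-values with equal weights (an assumption here).
cauchy_combine <- function(p) {
  stopifnot(length(p) > 0, all(p > 0 & p < 1))
  0.5 - atan(mean(tan((0.5 - p) * pi))) / pi
}

# Hypothetical stand-ins for real tests; each returns a statistic and a p-value.
toy_tests <- list(
  "test-A" = function() list(statistic = 7.1, p.value = 0.13),
  "test-B" = function() stop("singular design matrix"),  # simulated failure
  "test-C" = function() list(statistic = 2.4, p.value = 0.49)
)

# Run one test; an error becomes an NA row with the message kept in Note.
run_one <- function(name, fn) {
  tryCatch({
    res <- fn()
    data.frame(Test = name, Statistic = res$statistic,
               p.value = res$p.value, Note = "", stringsAsFactors = FALSE)
  }, error = function(e) {
    data.frame(Test = name, Statistic = NA_real_, p.value = NA_real_,
               Note = conditionMessage(e), stringsAsFactors = FALSE)
  })
}

results <- do.call(rbind, Map(run_one, names(toy_tests), toy_tests))
rownames(results) <- NULL

# Ensemble row: combine only the p-values that were actually produced.
ok <- !is.na(results$p.value)
ensemble <- data.frame(Test = "Cauchy-ensemble", Statistic = NA_real_,
                       p.value = cauchy_combine(results$p.value[ok]),
                       Note = "", stringsAsFactors = FALSE)
results <- rbind(results, ensemble)
```

Wrapping each test in its own tryCatch() keeps the output shape fixed (one row per test) regardless of which tests fail, so downstream code can filter on NA p-values instead of handling errors.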
All run.all.gof() tests were verified to reproduce
the implementations used in the original thesis simulation. In
particular Osius-Rojek and Stukel now follow
LogisticDx::gof.glm (Stukel via
statmod::glm.scoretest; statmod added to
Suggests), matching it numerically; Copas-RSS matches
rms’s gof residual; HL matches
ResourceSelection::hoslem.test; and
HL-equalwidth, Pigeon-Heyse, Tsiatis, Xie, and
Pulkstenis-Robinson match their source scripts.
A second EF row, EF-normal, reports the omnibus EF
test with the normal reference used in the thesis simulation (the
EF row uses the chi-square default).
More opt-in slow (include_slow = TRUE) tests: the
GAM-based HL-GAM, PR-GAM, and
Xie-GAM (Xie et al. 2021; need mgcv; HL-GAM
and PR-GAM match the source gam_gof_tests exactly, Xie-GAM
uses a fixed clustering seed), and Stute-Zhu
(cumulative-residual parametric-bootstrap test; sequential, set reps via
control = list("Stute-Zhu" = list(B = ...)); statistic
matches the source exactly).
Lai-Liu-HL (Lai & Liu 2018, standardized-power
procedure for the Hosmer-Lemeshow test). It has no p-value: it resamples
to a target size, fits the model, estimates the HL rejection rate
(“standardized power”), and returns a randomized accept/reject decision.
The standardized power is reported as the statistic and the decision in
the Note (set n0/k via
control). Verified to match the source
lai_liu_test exactly.
Two further opt-in slow tests: eHL (the e-value
Hosmer-Lemeshow test of Henzi et al. 2024; base-R reimplementation, with
attribution, of the marius-cp/eHL code, matching it to ~1e-11; reported
as p = min(1, 1/e)), and BAGofT (the
binary-adaptive GOF test, wrapping the BAGofT package; set
nsim via
control = list(BAGofT = list(nsim = ...))).
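The eHL conversion p = min(1, 1/e) quoted above is a one-liner in base R. The helper name `e_to_p()` and the example e-values are hypothetical; only the formula comes from the entry.

```r
# Convert e-values to the reported p-values, p = min(1, 1/e).
# e-values at or below 1 carry no evidence against the model and map to p = 1.
e_to_p <- function(e) {
  stopifnot(all(e >= 0))
  pmin(1, 1 / e)
}

p_vals <- e_to_p(c(0.5, 1, 20, 400))  # 1, 1, 0.05, 0.0025
```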
An opt-in slow test, le-Cessie (le Cessie-van
Houwelingen 1995, general multivariate smoothed-residual test), runs
when include_slow = TRUE. It is O(n^2) to O(n^3).
Adapted with attribution from the USGS smwrStats package
(public domain); verified to match it exactly.
The Xie test uses the corrected degrees of freedom
G - k/2 - 1 with k the number of predictors.
(Earlier thesis runs used df = G - 0.5, an artifact of
coef() returning NULL on a
predicted-probability list; the statistic is the same, only the p-value
differs.)
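One way the df = G - 0.5 artifact can arise is sketched below. It assumes, as a guess rather than a statement about the original script, that k was computed as the number of coefficients minus the intercept; `xie_df()` is a hypothetical helper and the data are simulated.

```r
G <- 10  # hypothetical number of groups

# Degrees of freedom G - k/2 - 1, with k taken from the coefficient vector.
xie_df <- function(obj, G) {
  k <- length(coef(obj)) - 1  # assumed: predictors = coefficients minus intercept
  G - k / 2 - 1
}

set.seed(1)
d <- data.frame(x1 = rnorm(60), x2 = rnorm(60))
d$y <- rbinom(60, 1, plogis(0.5 * d$x1 - 0.3 * d$x2))
fit <- glm(y ~ x1 + x2, family = binomial, data = d)

# Prediction-only input: a plain list has no coefficients, so coef() is NULL.
pred_only <- list(y = d$y, predicted_probs = fitted(fit))

df_glm  <- xie_df(fit, G)        # k = 2, so G - 2
df_list <- xie_df(pred_only, G)  # length(NULL) is 0, k = -1, so G - 0.5
```

Because only the degrees of freedom change, the two inputs give the same statistic but different p-values, which matches the note in the entry.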
Added the Information-Matrix test (White 1982 / Orme
1988), the closed-form IM test; verified to match the thesis
IMtest_fast exactly.
Further include_slow = TRUE tests are planned for a later build:
the GAM-based tests (HL-GAM, PR-GAM, Xie-GAM; need mgcv),
the bootstrap tests (Hosmer bootstrap, Stute-Zhu), the e-value HL
(eHL; needs isotone), and BAGofT.
(McCullagh is not added: it appears only in the unused
goflogit macro, not in the thesis simulation.)
This is the first release of the ebrahim.gof package, implementing the Ebrahim-Farrington goodness-of-fit test for logistic regression models.
ef.gof() - performs the
Ebrahim-Farrington goodness-of-fit test.
Ebrahim Khaled Ebrahim (Alexandria University) Email: ebrahimkhaled@alexu.edu.eg