This vignette demonstrates the main workflow of the
OnlineSurr package:
fit.surr().summary(), visualize with
plot().time_homo_test().The package returns a fitted object of class
fitted_onlinesurr that stores point estimates and bootstrap
draws for treatment-effect trajectories and PTE-based summaries.
fit.surr() expects data in long format
with one row per subject-time measurement. Key requirements enforced by
the code:
id identifies subjects; there must be at most
one observation per subject-time combination.treat indicates treatment assignment; it is coerced to
a factor and is intended to represent two treatment
levels.time must be numeric and equally
spaced across observed time points. If time is
omitted, the function creates a within-subject index Time
assuming the data are already ordered and equally spaced.fit.surr() fits:
plot.fitted_onlinesurr() plots:
time_homo_test() tests the hypothesis that the PTE is
constant over time (implemented via a max-type statistic and Monte Carlo
approximation of the null).This section builds a minimal simulated dataset consistent with the
checks inside fit.surr().
set.seed(1)
# Dimensions
N <- 100 # subjects
T <- 6 # time points
times <- seq_len(T) # numeric, equally spaced
# Subject IDs and treatment assignment (two levels)
id <- rep(seq_len(N), each = T)
trt <- rep(rbinom(N, 1, 0.5), each = T) # 0/1
time <- rep(times, times = N)
# Simulate surrogate(s) and outcome
# Surrogate is affected by treatment and time
s <-
0.2 * time + # Trend
(0.2 + 0.4 * time) * trt + # Treatment effect
rnorm(N * T, sd = 0.1) # Noise
# Outcome depends on treatment effect, time trend, surrogate, and noise
y <-
0.2 + # intercept
0.1 * time + # Trend
(0.2 + 0.1 * time) * trt + # (direct) treatment effect
0.6 * s + # Surrogate effect
rnorm(N * T, sd = 0.1) # Noise
dat <- data.frame(
id = id,
trt = trt,
time = time,
s = s,
y = y
)
# Ensure the data are ordered by (id, time) (recommended)
dat <- dat[order(dat$id, dat$time), ]
head(dat)
#> id trt time s y
#> 1 1 0 1 0.2398106 0.4499785
#> 2 1 0 2 0.3387974 0.5209293
#> 3 1 0 3 0.6341120 1.0634402
#> 4 1 0 4 0.6870637 0.8692466
#> 5 1 0 5 1.1433024 1.4113951
#> 6 1 0 6 1.3980400 1.3448466fit.surr()fit.surr() requires:
formula: outcome mean model. The function will
internally add treatment-by-time fixed effects.id: subject identifier (unquoted).treat: treatment variable (unquoted).surrogate: surrogate structure (as a formula or a
string).time: numeric time variable (unquoted).library(OnlineSurr)
fit <- fit.surr(
formula = y ~ 1, # baseline fixed effects; trt*time terms added internally
id = id,
surrogate = ~s, # surrogate structure
treat = trt,
data = dat,
time = time,
N.boots = 2000, # bootstrap draws stored in the fitted object
verbose = 0 # hide progress
)The formulas for the fixed effects and the surrogate structures
accept any temporal structure available in the kDGLM
package (see its vignette for details). Functions that transform the
data are also supported.
In particular, we provide the lagged function, which
computes lagged values of its arguments and can be included in a model
formula to account for delayed or lingering effects of a predictor over
time. We also provide the s function, which generates a
spline basis for a numeric variable and can be used to model smooth,
potentially non-linear effects without having to specify the basis
expansion manually.
library(OnlineSurr)
fit <- fit.surr(
formula = y ~ 1, # baseline fixed effects; trt*time terms added internally
id = id,
surrogate = ~ s(s) + s(lagged(s, 1)) + s(lagged(s, 2)), # surrogate structure
treat = trt,
data = dat,
time = time,
verbose = 0 # hide progress
)fit.surr() storesThe returned object has class fitted_onlinesurr and is a
list with (at least):
fit$T: number of time pointsfit$N: number of subjectsfit$n.fixed: number of fixed-effect coefficients per
subject design (reference size)fit$Marginal$point: point estimates (vector) from the
marginal modelfit$Marginal$smp: bootstrap draws (matrix) from the
marginal modelfit$Conditional$point: point estimates (vector) from
the conditional modelfit$Conditional$smp: bootstrap draws (matrix) from the
conditional modelThe first T (in practice, the first
n.fixed) elements used by plotting/testing methods
correspond to the time-indexed treatment-effect parameters.
The package provides an S3 summary method
summary.fitted_onlinesurr().
t selects the time index.cumulative=TRUE reports cumulative effects up to time
t (when implemented by the method).cumulative=FALSE reports time-specific quantities at
time t only.summary(fit, t = 6, cumulative = TRUE)
#> Fitted Online Surrogate
#>
#> Cummulated effects at time 6:
#> Estimate Std. Error t value Pr(>|t|)
#> Delta 9.01931 0.05600 161.06739 0.0000e+00 ***
#> Delta.R 2.47893 0.41384 5.99008 2.0973e-09 ***
#> CPTE 0.72515 0.04604 - -
#>
#> Time homogeneity test:
#>
#> Test stat. Crit. value p-value
#> 1.12582 2.45246 0.61942
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1plot() dispatches to
plot.fitted_onlinesurr().
Interpretation notes:
time_homo_test() provides a max-type test, using a Monte
Carlo approximation of the null distribution.
test <- time_homo_test(fit, signif.level = 0.05, N.boots = 50000)
test
#> $T
#> [1] 1.125824
#>
#> $T.crit
#> 95%
#> 2.449603
#>
#> $p.value
#> [1] 0.61756Returned components:
T: observed test statisticT.crit: critical value at the requested significance
levelp.value: Monte Carlo p-valueTime index must be numeric and equally
spaced.
If you have missing measurements, include the missing time points with
NA outcomes rather than dropping those times, so spacing
remains consistent.
One row per subject-time.
If you have duplicates, aggregate first (e.g., average within a time
window) or decide which measurement to keep.
Bootstrap size tradeoff.
fit.surr(N.boots=...) controls stored bootstrap draws used
for confidence intervals associated with the treatment effect, LPTE and
CPTE; time_homo_test(N.boots=...) controls Monte Carlo
draws for the null distribution of the time homogeneity test.
See Santos Jr. and Parast (2026) for details about the theoretical aspects of the package.
sessionInfo()
#> R version 4.5.1 (2025-06-13 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 10 x64 (build 19045)
#>
#> Matrix products: default
#> LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_COLLATE=C
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> time zone: America/Chicago
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] OnlineSurr_0.0.3
#>
#> loaded via a namespace (and not attached):
#> [1] gtable_0.3.6 jsonlite_2.0.0 dplyr_1.1.4
#> [4] compiler_4.5.1 Rcpp_1.1.0 tidyselect_1.2.1
#> [7] tidyr_1.3.2 jquerylib_0.1.4 scales_1.4.0
#> [10] yaml_2.3.12 fastmap_1.2.0 ggplot2_4.0.2
#> [13] R6_2.6.1 labeling_0.4.3 generics_0.1.4
#> [16] knitr_1.51 rbibutils_2.4.1 tibble_3.3.1
#> [19] zigg_0.0.2 bslib_0.10.0 pillar_1.11.1
#> [22] RColorBrewer_1.1-3 rlang_1.1.7 cachem_1.1.0
#> [25] xfun_0.52 sass_0.4.10 S7_0.2.1
#> [28] RcppParallel_5.1.10 otel_0.2.0 kDGLM_1.2.14
#> [31] cli_3.6.5 withr_3.0.2 extraDistr_1.10.0
#> [34] magrittr_2.0.3 Rdpack_2.6.5 digest_0.6.37
#> [37] grid_4.5.1 rstudioapi_0.18.0 latex2exp_0.9.8
#> [40] lifecycle_1.0.5 vctrs_0.6.5 Rfast_2.1.5.1
#> [43] evaluate_1.0.5 glue_1.8.0 farver_2.1.2
#> [46] purrr_1.0.4 rmarkdown_2.31 tools_4.5.1
#> [49] pkgconfig_2.0.3 htmltools_0.5.8.1