pvarife implements the Panel VAR estimator with Interactive Fixed Effects (IFE) of Tuğan (2021). The model is
\[ y_{i,t} = \sum_{\ell=1}^{L} \Theta_\ell \, y_{i,t-\ell} + F_t \lambda_i + e_{i,t}, \quad i = 1,\ldots,I,\; t = 1,\ldots,T, \]
where \(y_{i,t}\) is a \(K\times 1\) vector of endogenous variables, \(F_t\) is an \(r\times 1\) vector of unobservable common factors, \(\lambda_i\) is a unit-specific loading vector, and \(e_{i,t}\) is an idiosyncratic error.
The estimator jointly identifies \(\beta = [\Theta_1,\ldots,\Theta_L]\), \(F = \{F_t\}\), and \(\Lambda = \{\lambda_i\}\) via an inner/outer EM algorithm (Bai 2009). Impulse responses under recursive (Cholesky) identification and asymptotic confidence bands (Theorem 2.3 of Tuğan 2021) are provided.
Reference: Tuğan, M. (2021). Panel VAR models with interactive fixed effects. Econometrics Journal, 24, 225–246. https://doi.org/10.1093/ectj/utaa021
The main input is a three-dimensional array y[i, t, k]:
| Dimension | Index | Meaning |
|---|---|---|
| 1 | i |
cross-sectional unit |
| 2 | t |
time period |
| 3 | k |
VAR variable |
NA values are allowed for unbalanced panels; the algorithm imputes missing entries via the EM step.
library(pvarife)
set.seed(1)
sim <- sim_pvarife(
n_units = 50, # I
n_time = 30, # T
n_vars = 2, # K
n_lags = 1, # lag order
n_factors = 1, # number of interactive fixed effects
seed = 42
)
dim(sim$y) # I x T x K
#> [1] 50 30 2
sim$beta_true # true coefficient vector
#> [1] 1.00 1.00 0.65 0.30 0.20 0.60
sim$sigma_true # true reduced-form covariance
#> [,1] [,2]
#> [1,] 1.0 0.5
#> [2,] 0.5 1.0The function generates data from the DGP of Tuğan (2021), Section S10:
fit <- pvarife(
y = sim$y,
n_lags = 1,
n_factors = 1,
n_out = 20, # outer GLS iterations (paper default: 50)
n_in = 5 # inner PCA/EM iterations (paper default: 10)
)
print(fit)
#> Panel VAR with Interactive Fixed Effects
#> Units (I): 50 | Time periods (T): 30 | Variables (K): 2
#> Lag order: 1 | Factors: 1
#>
#> Coefficients:
#> Intercepts:
#> intercept[1] intercept[2]
#> 1.111169 1.099592
#>
#> Theta_1 (VAR lag 1):
#> y[1,t-1] y[2,t-1]
#> y[1] 0.620220 0.317411
#> y[2] 0.224987 0.549718
#>
#> Reduced-form covariance (Sigma):
#> [,1] [,2]
#> [1,] 0.916247 0.474269
#> [2,] 0.474269 0.937656The n_out and n_in parameters control the convergence of the EM algorithm. For publication-quality results use the defaults (n_out = 50, n_in = 10); smaller values speed up experimentation.
coef(fit)
#> intercept[1] intercept[2] theta1[1,1] theta1[1,2] theta1[2,1] theta1[2,2]
#> 1.1111692 1.0995918 0.6202204 0.3174106 0.2249868 0.5497178The first \(K\) elements are equation-specific intercepts; the remaining \(K^2 L\) elements are the VAR lag matrices \(\Theta_1, \ldots, \Theta_L\) (stacked row-major).
ir <- compute_irf(
fit,
n_periods = 8,
shock = 1L # shock to first variable (Cholesky ordering)
)
dim(ir) # K x n_periods
#> [1] 2 8
plot(ir, var_names = c("Variable 1", "Variable 2"))The irf_bands() function draws from the joint asymptotic distribution of \(\hat\beta\) and the common component (Theorem 2.3 of Tuğan 2021).
For robustness checks that do not rely on asymptotic approximations:
bsb <- bootstrap_irf_bands(
fit,
n_periods = 8,
n_boot = 100, # computationally expensive
level = 0.95,
seed = 2
)
plot(bsb, var_names = c("Variable 1", "Variable 2"))By default compute_irf() uses short-run (Cholesky) identification: the impact matrix \(A_0 = \mathrm{chol}(\hat\Sigma)^\top\) is lower-triangular, so shock 1 has no contemporaneous effect on variable 2.
For long-run identification, the cumulative impact matrix \(C(1) = (I - \Theta_1)^{-1} A_0\) is constrained to be lower-triangular: shock 1 has no long-run (permanent) effect on variable 2. The impact matrix becomes
\[A_0 = (I - \Theta_1)\,\mathrm{chol}(D)^\top, \quad D = (I-\Theta_1)^{-1}\hat\Sigma(I-\Theta_1)^{-\top}.\]
This matches the MATLAB Monte Carlo code IRs_to_Shocks_LR_Identification.m.
fit_lr <- pvarife(sim_lr$y, n_lags = 1, n_factors = 1, n_out = 20, n_in = 5)
# Point estimate at estimated beta (uncorrected)
ir_lr <- compute_irf(
fit_lr,
n_periods = 8,
identification = "long_run",
diff_vars = sim_lr$diff_vars_suggested # cumulate variable 1
)
plot(ir_lr, var_names = c("Variable 1", "Variable 2"))
# Bias-corrected point estimate
ir_lr_bc <- compute_irf(
fit_lr,
n_periods = 8,
identification = "long_run",
diff_vars = 1L,
bias_correct = TRUE # apply Theorem 2.3 bias correction
)
# Full confidence bands (median is already bias-corrected)
bands_lr <- irf_bands(
fit_lr,
n_periods = 8,
identification = "long_run",
diff_vars = 1L,
n_draw = 200,
seed = 1
)
plot(bands_lr, var_names = c("Variable 1", "Variable 2"))The pvarife() estimator has a bias of order \(O(1/\sqrt{TC})\) documented in Theorem 2.3 of Tuğan (2021). Setting bias_correct = TRUE in compute_irf() subtracts the analytical bias from \(\hat\beta\) before computing the MA representation.
| Call | Description |
|---|---|
compute_irf(fit) |
Fast uncorrected point estimate |
compute_irf(fit, bias_correct = TRUE) |
Bias-corrected point estimate |
irf_bands(fit) |
Full inference; median is implicitly bias-corrected |
ir_unc <- compute_irf(fit, n_periods = 8)
ir_bc <- compute_irf(fit, n_periods = 8, bias_correct = TRUE)
# ir_bc is close to the median returned by irf_bands()avar <- asymptotic_var(fit)
se <- sqrt(diag(avar$variance))
# Bias-corrected 95% confidence intervals for each element of beta
beta_hat <- as.numeric(fit$beta)
beta_biasC <- beta_hat - avar$bias
ci_lower <- beta_biasC - 1.96 * se
ci_upper <- beta_biasC + 1.96 * se
data.frame(
parameter = names(coef(fit)),
estimate = round(beta_hat, 4),
std_err = round(se, 4),
ci_lower = round(ci_lower, 4),
ci_upper = round(ci_upper, 4)
)Simply set missing cells to NA before calling pvarife():
y_unbal <- sim$y
# Randomly set 15% of unit-time observations to NA
set.seed(10)
mask <- array(runif(prod(dim(y_unbal))) < 0.15, dim = dim(y_unbal))
y_unbal[mask] <- NA
fit_unbal <- pvarife(y_unbal, n_lags = 1, n_factors = 1, n_out = 10, n_in = 5)
print(fit_unbal)The EM step in the inner loop imputes missing observations using the current factor estimates, consistent with the approach of Bai (2009).
| Parameter | Recommended | Notes |
|---|---|---|
n_out |
50 | Increase if coefficients have not converged |
n_in |
10 | Increase for more precise factor extraction |
n_factors |
Use IC criteria | See paper Section 3 for selection |
n_draw |
500 | For irf_bands(); reduce for testing |
n_boot |
200 | For bootstrap_irf_bands(); very slow |
To check convergence, compare results with n_out = 50 and n_out = 100.
The paper’s empirical application uses panel data on government debt and capital formation from Liaqat (2019). The dataset is not bundled with this package due to redistribution restrictions, but the replication code is available in the MATLAB package at documentforrepl/replication package/Economic Application/.
The R workflow would be:
n_lags = 1, n_factors = 2 (for Panel B of Figure 1, rmax = 2)pvarife(), compute_irf(), and irf_bands() with n_draw = 500ir[1, 1] (shock to first variable at horizon 1)Tuğan, M. (2021). Panel VAR models with interactive fixed effects. Econometrics Journal, 24, 225–246. https://doi.org/10.1093/ectj/utaa021
Bai, J. (2009). Panel data models with interactive fixed effects. Econometrica, 77(4), 1229–1279.
Liaqat, Z. (2019). Does public debt cause crowding out? A cross-country empirical evidence. Economics Letters, 181, 1–4.