The package hgwrr is used to calibrate a Hierarchical and Geographically Weighted Regression (HGWR) model on spatial data. It requires a spatial hierarchical structure in the data; i.e., samples are grouped by their locations. Every variable is either at the group level or at the sample level. Group-level variables can have fixed effects (globally constant) or spatially weighted effects (varying with location). Sample-level variables can have fixed effects or random effects (varying among groups). We denote the fixed effects as \(\beta\), the group-level spatially weighted (GLSW) effects as \(\gamma\), and the sample-level random (SLR) effects as \(\mu\). The HGWR model combines these three kinds of effects and estimates them while accounting for spatial heterogeneity.
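The three kinds of effects can be summarized in a single equation. The following is only a sketch under assumed notation: the indices \(i\) (group) and \(j\) (sample within a group), the covariate vectors \(g_i\), \(x_{ij}\), \(z_{ij}\), the group coordinates \((u_i, v_i)\), and the error term \(\epsilon_{ij}\) are introduced here for illustration; only \(\beta\), \(\gamma\), and \(\mu\) are defined in the text above.

```latex
\[
  y_{ij} = g_{i}^{\top} \gamma(u_i, v_i)
         + x_{ij}^{\top} \beta
         + z_{ij}^{\top} \mu_i
         + \epsilon_{ij}
\]
```

Read this way, \(\gamma(u_i, v_i)\) changes with the location of group \(i\), \(\beta\) is shared by all samples, and \(\mu_i\) differs from group to group.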
library(hgwrr)
#> Loading required package: sf
#> Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
#> Loading required package: MASS
To calibrate an HGWR model, use the function hgwr().
hgwr(
  formula, data, ..., bw = "CV",
  kernel = c("gaussian", "bisquared"),
  alpha = 0.01, eps_iter = 1e-6, eps_gradient = 1e-6,
  max_iters = 1e6, max_retries = 1e6,
  ml_type = c("D_Only", "D_Beta"), verbose = 0
)
The following explains some important parameters.
formula
This parameter specifies the model form. Recall that the three kinds of effects are GLSW, fixed, and SLR effects. They are specified in different parts of the formula.
In the formula, L() is used to mark some effects as GLSW
effects, and ( | group) is used to set the SLR effects and
grouping indicator. Only group-level variables can have GLSW
effects.
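Because the model is written as an ordinary R formula, base R tools can be used to check which variables it refers to before fitting. The sketch below uses the formula that appears in the printed model output in this document; the data frame `toy` is hypothetical and exists only to demonstrate the column check.

```r
# Formula with all three kinds of effects.
f <- y ~ L(g1 + g2) + x1 + (z1 | group)
#        ^^^^^^^^^^   ^^   ^^^^^^^^^^^^
#        GLSW effects |    SLR effect of z1, grouped by `group`
#                     fixed effect

# all.vars() lists every variable the formula refers to; L is a function
# name inside the formula and is therefore not listed.
vars <- all.vars(f)
print(vars)

# Check that a data frame provides every required column before fitting.
# `toy` is a hypothetical one-row data frame used only for this check.
toy <- data.frame(y = 0, g1 = 0, g2 = 0, x1 = 0, z1 = 0, group = 1L)
missing_cols <- setdiff(vars, names(toy))
stopifnot(length(missing_cols) == 0)
```

A check like this catches misspelled column names early, before any model fitting starts.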
data
sf objects
From version 0.3-1, this parameter supports sf objects.
In this case, no further arguments in ... are required.
Here is an example.
data(wuhan.hp)
m_sf <- hgwr(
  formula = Price ~ L(d.Water + d.Commercial) + BuildingArea + (Floor.High | group),
  data = wuhan.hp,
  bw = 299
)
data.frame objects
If the data is a normal data.frame object, an extra
argument coords is required to specify the coordinates of
each group. Note that the row order of coords needs to
match that of the group variable. Here is an example.
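The code for this case is not reproduced here, so the following is a hedged sketch rather than the original example. The first part uses hypothetical toy data and base R only to show what "matching row order" means, assuming groups are labelled 1, 2, ..., n so that row i of coords is the location of group i. The second part reconstructs a plausible call from the model output printed in this document (formula and `mulsam.test$data`); that the coordinates are stored in `mulsam.test$coords` and that the bandwidth was selected with "CV" are assumptions.

```r
# Toy sample-level data: 6 samples in 3 groups (all values hypothetical).
samples <- data.frame(
  y     = c(1.2, 0.8, 2.1, 1.9, 3.0, 2.7),
  x1    = c(0.1, 0.4, 0.3, 0.9, 0.5, 0.2),
  group = c(1L, 1L, 2L, 2L, 3L, 3L)
)

# One coordinate row per group; row i is the location of group i.
coords <- cbind(
  x = c(10.0, 12.5, 15.0),
  y = c(30.0, 31.0, 29.5)
)

# Sanity checks: as many coordinate rows as groups, and every group index
# points at an existing row of `coords`.
group_ids <- sort(unique(samples$group))
stopifnot(nrow(coords) == length(group_ids))
stopifnot(all(group_ids == seq_len(nrow(coords))))

# Location of each sample, looked up through its group index.
sample_xy <- coords[samples$group, , drop = FALSE]

# A possible call on the bundled data. The guard only keeps this sketch
# runnable where hgwrr is not installed.
if (requireNamespace("hgwrr", quietly = TRUE)) {
  library(hgwrr)
  data(mulsam.test)
  m_df <- hgwr(
    formula = y ~ L(g1 + g2) + x1 + (z1 | group),
    data = mulsam.test$data,
    coords = mulsam.test$coords,  # assumed: coordinates stored in $coords
    bw = "CV"                     # assumed: bandwidth selected automatically
  )
}
```

If the two sanity checks fail on real data, reorder the rows of coords (or recode the group variable) before calling hgwr().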
bw and kernel
Argument bw is the bandwidth used to estimate GLSW effects. It can be either a numeric value, given as a number of nearest neighbours (as in bw = 299 above), or "CV", which lets the algorithm select one.
Argument kernel is the kernel function used to estimate GLSW effects. Currently, there are only two choices: "gaussian" and "bisquared".
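To give a feel for what the bandwidth and the kernel do, the sketch below computes spatial weights from distances in base R. It uses the textbook Gaussian and bisquare forms from the geographically weighted regression literature and a simple k-th-nearest-neighbour rule for the adaptive bandwidth; whether hgwrr uses exactly these expressions internally is an assumption, and all function names and distances here are hypothetical.

```r
# Gaussian kernel: weights decay smoothly and never reach exactly zero.
gaussian_kernel <- function(d, b) exp(-0.5 * (d / b)^2)

# Bisquare kernel: weights drop to exactly zero at distance b and beyond.
bisquare_kernel <- function(d, b) ifelse(d < b, (1 - (d / b)^2)^2, 0)

# Adaptive bandwidth: for a focal location, use the distance to its k-th
# nearest neighbour as b, so b grows where locations are sparse.
adaptive_bandwidth <- function(d, k) sort(d)[k]

# Distances from one focal group to five other groups (hypothetical).
d <- c(0.5, 1.0, 2.0, 3.5, 6.0)
b <- adaptive_bandwidth(d, k = 3)  # distance to the 3rd nearest neighbour

w_gauss <- gaussian_kernel(d, b)
w_bisq  <- bisquare_kernel(d, b)
print(round(cbind(d, w_gauss, w_bisq), 3))
```

In this form, the bisquare kernel ignores every group at or beyond the bandwidth distance, while the Gaussian kernel still gives distant groups a small positive weight.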
Printing the object returned by hgwr() shows the
estimates of the effects.
m_df
#> Hierarchical and geographically weighted regression model
#> =========================================================
#> Formula: y ~ L(g1 + g2) + x1 + (z1 | group)
#>  Method: Back-fitting and Maximum likelihood
#>    Data: mulsam.test$data
#> 
#> Fixed Effects
#> -------------
#>  Intercept        x1 
#>   4.056759  1.967648 
#> 
#> Group-level Spatially Weighted Effects
#> --------------------------------------
#> Bandwidth: 9.35816 (nearest neighbours)
#> 
#> Coefficient estimates:
#>  Coefficient        Min  1st Quartile     Median  3rd Quartile        Max 
#>    Intercept  -2.769060     -2.708289  -2.356463     -2.225995  -2.022646 
#>           g1   0.876505      1.253144   1.702822      1.939969   2.336628 
#>           g2   1.082775      1.279601   1.424307      1.607909   1.722892 
#> 
#> Sample-level Random Effects
#> ---------------------------
#>    Groups       Name  Std.Dev.      Corr 
#>     group  Intercept  1.032962           
#>                   z1  1.032962  0.000000 
#>  Residual             1.032962           
#> 
#> Other Information
#> -----------------
#> Number of Obs: 873
#>        Groups: group , 25
The summary() method shows some diagnostic
information.
summary(m_df)
#> Hierarchical and geographically weighted regression model
#> =========================================================
#> Formula: y ~ L(g1 + g2) + x1 + (z1 | group)
#>  Method: Back-fitting and Maximum likelihood
#>    Data: mulsam.test$data
#> 
#> Parameter Estimates
#> -------------------
#> Fixed effects:
#>             Estimated   Sd. Err      t.val  Pr(>|t|)      
#>  Intercept   4.056759  0.203079  19.976270  0.000000  *** 
#>         x1   1.967648  0.033827  58.168658  0.000000  *** 
#> 
#> Bandwidth: 9.35816 (nearest neighbours)
#> 
#> GLSW effects:
#>             Mean Est.  Mean Sd.     ***    **     *     . 
#>  Intercept  -2.421973  0.251700  100.0%  0.0%  0.0%  0.0% 
#>         g1   1.641343  1.823056    0.0%  0.0%  0.0%  0.0% 
#>         g2   1.435709  1.506236    0.0%  0.0%  0.0%  0.0% 
#> 
#> SLR effects:
#>    Groups       Name      Mean  Std.Dev.      Corr 
#>     group  Intercept  0.000000  1.032962           
#>                   z1  1.869552  1.032962  0.000000 
#>  Residual             0.088510  1.032962           
#> 
#> 
#> Diagnostics
#> -----------
#>  rsquared  0.905066 
#>    logLik       NaN 
#>       AIC       NaN 
#> 
#> Scaled Residuals
#> ----------------
#>        Min         1Q    Median        3Q       Max 
#>  -3.408088  -0.576387  0.100854  0.734105  3.036324 
#> 
#> Other Information
#> -----------------
#> Number of Obs: 873
#>        Groups: group , 25
The significance of spatial heterogeneity in GLSW effects can be tested with the following code.
summary(m_df, test_hetero = TRUE)
#> Hierarchical and geographically weighted regression model
#> =========================================================
#> Formula: y ~ L(g1 + g2) + x1 + (z1 | group)
#>  Method: Back-fitting and Maximum likelihood
#>    Data: mulsam.test$data
#> 
#> Parameter Estimates
#> -------------------
#> Fixed effects:
#>             Estimated   Sd. Err      t.val  Pr(>|t|)      
#>  Intercept   4.056759  0.203079  19.976270  0.000000  *** 
#>         x1   1.967648  0.033827  58.168658  0.000000  *** 
#> 
#> Bandwidth: 9.35816 (nearest neighbours)
#> 
#> GLSW effects:
#>             Mean Est.  Mean Sd.     ***    **     *     . 
#>  Intercept  -2.421973  0.251700  100.0%  0.0%  0.0%  0.0% 
#>         g1   1.641343  1.823056    0.0%  0.0%  0.0%  0.0% 
#>         g2   1.435709  1.506236    0.0%  0.0%  0.0%  0.0% 
#> 
#> SLR effects:
#>    Groups       Name      Mean  Std.Dev.      Corr 
#>     group  Intercept  0.000000  1.032962           
#>                   z1  1.869552  1.032962  0.000000 
#>  Residual             0.088510  1.032962           
#> 
#> 
#> Diagnostics
#> -----------
#>  rsquared  0.905066 
#>    logLik       NaN 
#>       AIC       NaN 
#> 
#> Scaled Residuals
#> ----------------
#>        Min         1Q    Median        3Q       Max 
#>  -3.408088  -0.576387  0.100854  0.734105  3.036324 
#> 
#> Other Information
#> -----------------
#> Number of Obs: 873
#>        Groups: group , 25
Some other methods are provided.
head(coef(m_df))
#>   Intercept       g1       g2       x1       z1
#> 1 0.9143066 2.336628 1.633698 1.967648 1.817139
#> 2 1.1269566 1.932128 1.626517 1.967648 2.305685
#> 3 1.8867179 2.027690 1.659433 1.967648 2.251592
#> 4 1.1245250 2.265663 1.536906 1.967648 1.591036
#> 5 1.7726751 2.219179 1.607909 1.967648 1.698600
#> 6 0.7008420 2.082628 1.421329 1.967648 1.855599
head(fitted(m_df))
#> [1]  3.659871  4.317510  6.929765  1.768491  0.511762 -2.023591
head(residuals(m_df))
#> [1] -0.5654830 -0.7380541  0.9197850  0.5707894 -0.3850239 -0.1648946
The following papers give more details on the mathematical basis of the HGWR model.