Title: | Distributed Online Covariance Matrix Tests |
Date: | 2025-09-02 |
Version: | 0.3 |
Description: | Distributed Online Covariance Matrix Tests ('Docovt') provides covariance matrix tests that can be run in an online, distributed manner, which makes it suitable for large-scale analysis where data are dispersed across multiple nodes or sources. The package targets researchers and practitioners working with high-dimensional data and offers covariance matrix estimation and hypothesis testing. The philosophy of 'Docovt' is described in Guo G. (2025) <doi:10.1016/j.physa.2024.130308>. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
Imports: | stats |
Suggests: | testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2025-09-02 08:21:57 UTC; lenovo |
Author: | Guangbao Guo |
Maintainer: | Guangbao Guo <ggb11111111@163.com> |
Depends: | R (≥ 3.5.0) |
Repository: | CRAN |
Date/Publication: | 2025-09-03 07:50:03 UTC |
Two-Sample Covariance Test by Cai, Liu and Xia (2013)
Description
Given two data matrices X and Y, where X is n1 by p and Y is n2 by p, this function tests whether the two samples share the same covariance matrix. The null hypothesis is:
H_0 : \Sigma_1 = \Sigma_2
where \Sigma_1 and \Sigma_2 are the population covariance matrices of X and Y, respectively. The test follows the method proposed by Cai, Liu and Xia (2013). The null hypothesis is rejected when pval is less than the significance level (typically 0.05).
Usage
CLX(X,Y)
Arguments
X |
A matrix of n1 by p |
Y |
A matrix of n2 by p |
Value
stat |
a test statistic value. |
pval |
the p-value of the test. |
References
Cai, T. T., Liu, W., and Xia, Y. (2013). Two-sample covariance matrix testing and support recovery in high-dimensional and sparse settings. Journal of the American Statistical Association, 108(501):265-277.
Examples
## generate X and Y.
p= 500; n1 = 100; n2 = 150
X=matrix(rnorm(n1*p), ncol=p)
Y=matrix(rnorm(n2*p), ncol=p)
## run test
CLX(X,Y)
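For intuition, the max-type statistic described in the reference can be sketched in base R. This is a minimal sketch under stated assumptions, not the code behind CLX: the helper name clx_sketch is hypothetical, and the standardization of the entrywise differences and the extreme-value limit used for the p-value follow one reading of Cai, Liu and Xia (2013), so the values may differ from what CLX returns.

```r
# Hypothetical helper, not part of 'Docovt': a sketch of a max-type
# two-sample covariance statistic in the spirit of Cai, Liu and Xia (2013).
clx_sketch <- function(X, Y) {
  stopifnot(is.matrix(X), is.matrix(Y), ncol(X) == ncol(Y))
  n1 <- nrow(X); n2 <- nrow(Y); p <- ncol(X)

  # Column-centred data and sample covariances (divisor n, an assumption).
  Xc <- scale(X, center = TRUE, scale = FALSE)
  Yc <- scale(Y, center = TRUE, scale = FALSE)
  S1 <- crossprod(Xc) / n1
  S2 <- crossprod(Yc) / n2

  # Entrywise variance estimates of the products Xc[, i] * Xc[, j]:
  # mean of squared products minus squared mean product.
  T1 <- crossprod(Xc^2) / n1 - S1^2
  T2 <- crossprod(Yc^2) / n2 - S2^2

  # Largest standardized squared difference between covariance entries.
  M <- max((S1 - S2)^2 / (T1 / n1 + T2 / n2))

  # Type I extreme-value approximation (assumed limiting distribution).
  t <- M - 4 * log(p) + log(log(p))
  pval <- 1 - exp(-exp(-t / 2) / sqrt(8 * pi))

  list(stat = M, pval = pval)
}

set.seed(1)
X <- matrix(rnorm(100 * 20), ncol = 20)
Y <- matrix(rnorm(150 * 20), ncol = 20)
res <- clx_sketch(X, Y)
res$stat
res$pval
```

Because the statistic is a maximum over all p^2 entries, a test of this form is most sensitive when the two covariance matrices differ in only a few entries.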
COVID19
Description
A COVID-19 data set from NCBI with ID GSE152641. The data set profiled peripheral blood from 24 healthy controls and 62 prospectively enrolled patients with community-acquired lower respiratory tract infection by SARS-CoV-2 within the first 24 hours of hospital admission using RNA sequencing.
Usage
data(COVID19)
Format
'COVID19'
A data frame with 86 observations in the following 2 groups.
- healthy group1
rows 2 to 19 and rows 82 to 87, 24 healthy controls in total
- patients group2
rows 20 to 81, 62 prospectively enrolled patients in total
Examples
data(COVID19)
dim(COVID19)
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
dim(group1)
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
dim(group2)
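The row arithmetic in the example above can be checked without the data. The sketch below uses a stand-in data frame (an assumption: 87 rows, as implied by the index 82:87, with an arbitrary number of columns) to show that the negative index -c(1:19, 82:87) selects exactly rows 20 to 81, and that row 1 falls in neither documented group.

```r
# Stand-in for COVID19: 87 rows is an assumption taken from the row ranges
# in the Format section; the 4 columns are arbitrary.
set.seed(1)
mock <- as.data.frame(matrix(rnorm(87 * 4), nrow = 87))

healthy_rows <- c(2:19, 82:87)   # rows documented for the healthy group
patient_rows <- 20:81            # rows documented for the patient group

group1 <- as.matrix(mock[healthy_rows, ])
group2 <- as.matrix(mock[-c(1:19, 82:87), ])   # same negative index as above

nrow(group1)   # number of healthy controls
nrow(group2)   # number of patients

# The negative index and the explicit range pick the same rows.
identical(setdiff(seq_len(87), c(1:19, 82:87)), patient_rows)
```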
Two-Sample Covariance Test by Li and Chen (2012)
Description
Given two data matrices X and Y, where X is n1 by p and Y is n2 by p, this function tests whether the two samples share the same covariance matrix. The null hypothesis is:
H_0 : \Sigma_1 = \Sigma_2
where \Sigma_1 and \Sigma_2 are the population covariance matrices of X and Y, respectively. The test follows the method proposed by Li and Chen (2012). The null hypothesis is rejected when pval is less than the significance level (typically 0.05).
Usage
LC(X,Y)
Arguments
X |
A matrix of n1 by p |
Y |
A matrix of n2 by p |
Value
stat |
a test statistic value. |
pval |
the p-value of the test. |
References
Li, J. and Chen, S. X. (2012). Two sample tests for high-dimensional covariance matrices. The Annals of Statistics, 40(2):908-940.
Examples
## generate X and Y.
p= 500; n1 = 100; n2 = 150
X=matrix(rnorm(n1*p), ncol=p)
Y=matrix(rnorm(n2*p), ncol=p)
## run test
LC(X,Y)
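The example above draws both samples from the same distribution, so the null hypothesis holds. To see how a test behaves when it does not, one sample can be given a different covariance. The following is a minimal sketch using base R only; the AR(1)-type matrix and the value of rho are illustrative assumptions, not something prescribed by the package.

```r
set.seed(1)
p <- 5; n1 <- 100; n2 <- 150
rho <- 0.5   # hypothetical correlation parameter, chosen for illustration

# AR(1)-type covariance: Sigma2[i, j] = rho^|i - j|
Sigma2 <- rho^abs(outer(seq_len(p), seq_len(p), "-"))
R <- chol(Sigma2)   # upper triangular factor with t(R) %*% R equal to Sigma2

X <- matrix(rnorm(n1 * p), ncol = p)         # rows have identity covariance
Y <- matrix(rnorm(n2 * p), ncol = p) %*% R   # rows have covariance Sigma2

# Check the factorization that makes Y have covariance Sigma2.
all.equal(crossprod(R), Sigma2)
```

Passing these X and Y to LC(X, Y) poses a case where the null hypothesis is false; a small pval is then expected more often than under the null, though not guaranteed for any single draw.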
Two-Sample Covariance Test by Yu, Li and Xue (2022)
Description
Given two data matrices X and Y, where X is n1 by p and Y is n2 by p, this function tests whether the two samples share the same covariance matrix. The null hypothesis is:
H_0 : \Sigma_1 = \Sigma_2
where \Sigma_1 and \Sigma_2 are the population covariance matrices of X and Y, respectively. The test follows the method proposed by Yu, Li and Xue (2022). The null hypothesis is rejected when pval is less than the significance level (typically 0.05).
Usage
PEC(X,Y)
Arguments
X |
A matrix of n1 by p |
Y |
A matrix of n2 by p |
Value
stat |
a test statistic value. |
pval |
the p-value of the test. |
References
Yu, X., Li, D., and Xue, L. (2022). Fisher's combined probability test for high-dimensional covariance matrices. Journal of the American Statistical Association, (in press):1-14.
Examples
## generate X and Y.
p= 500; n1 = 100; n2 = 150
X=matrix(rnorm(n1*p), ncol=p)
Y=matrix(rnorm(n2*p), ncol=p)
## run test
PEC(X,Y)
Two-Sample Covariance Test by Yu, Li, Xue and Li (2022)
Description
Given two data matrices X and Y, where X is n1 by p and Y is n2 by p, this function tests whether the two samples share the same covariance matrix. The null hypothesis is:
H_0 : \Sigma_1 = \Sigma_2
where \Sigma_1 and \Sigma_2 are the population covariance matrices of X and Y, respectively. The test follows the method proposed by Yu, Li, Xue and Li (2022). The null hypothesis is rejected when pval is less than the significance level (typically 0.05).
Usage
PECO(X,Y,delta = NULL)
Arguments
X |
A matrix of n1 by p |
Y |
A matrix of n2 by p |
delta |
A scalar threshold used to build the PE components; it can usually be left at its default (NULL). |
Value
stat |
a test statistic value. |
pval |
the p-value of the test. |
References
Yu, X., Li, D., Xue, L., and Li, R. (2022). Power-enhanced simultaneous test of high-dimensional mean vectors and covariance matrices with application to gene-set testing. Journal of the American Statistical Association, (in press):1-14.
Examples
## generate X and Y.
p= 500; n1 = 100; n2 = 150
X=matrix(rnorm(n1*p), ncol=p)
Y=matrix(rnorm(n2*p), ncol=p)
## run test
PECO(X,Y)
Two-Sample Covariance Test by Yu, Li and Xue (2022)
Description
Given two data matrices X and Y, where X is n1 by p and Y is n2 by p, this function tests whether the two samples share the same covariance matrix. The null hypothesis is:
H_0 : \Sigma_1 = \Sigma_2
where \Sigma_1 and \Sigma_2 are the population covariance matrices of X and Y, respectively. The test follows the method proposed by Yu, Li and Xue (2022). The null hypothesis is rejected when pval is less than the significance level (typically 0.05).
Usage
PEF(X,Y)
Arguments
X |
A matrix of n1 by p |
Y |
A matrix of n2 by p |
Value
stat |
a test statistic value. |
pval |
the p-value of the test. |
References
Yu, X., Li, D., and Xue, L. (2022). Fisher's combined probability test for high-dimensional covariance matrices. Journal of the American Statistical Association, (in press):1-14.
Examples
## generate X and Y.
p= 500; n1 = 100; n2 = 150
X=matrix(rnorm(n1*p), ncol=p)
Y=matrix(rnorm(n2*p), ncol=p)
## run test
PEF(X,Y)
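The reference for this function concerns Fisher's combined probability test. As general background, Fisher's method combines k independent p-values through -2 times the sum of their logarithms, which follows a chi-squared distribution with 2k degrees of freedom under the null. The sketch below shows only that generic combination rule; the helper name fisher_combine is hypothetical, and which component p-values PEF combines, and how, is not stated in this page.

```r
# Hypothetical helper, not part of 'Docovt': generic Fisher combination
# of independent p-values.
fisher_combine <- function(pvals) {
  stopifnot(is.numeric(pvals), all(pvals > 0), all(pvals <= 1))
  stat <- -2 * sum(log(pvals))
  # Under the null, stat is chi-squared with 2 * k degrees of freedom.
  pval <- pchisq(stat, df = 2 * length(pvals), lower.tail = FALSE)
  list(stat = stat, pval = pval)
}

# Two made-up component p-values, for illustration only.
fisher_combine(c(0.05, 0.05))
```

Two moderately small p-values combine into a smaller one, which is why a combination of this kind can gain power when evidence is spread across complementary statistics.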
One-Sample Covariance Test by Cai and Ma (2013)
Description
Given data, this function performs a one-sample test for the covariance matrix, where the null hypothesis is
H_0 : \Sigma_n = \Sigma_0
where \Sigma_n is the covariance of the data model and \Sigma_0 is a hypothesized covariance. The test is based on a procedure proposed by Cai and Ma (2013).
Usage
cm13(X,Sigma0, alpha)
Arguments
X |
an n by p data matrix |
Sigma0 |
a p by p hypothesized covariance matrix |
alpha |
level of significance. |
Value
a named list containing:
- statistic
a test statistic value.
- threshold
rejection criterion to be compared against test statistic.
- reject
a logical; TRUE to reject the null hypothesis, FALSE otherwise.
Examples
## generate data from a multivariate normal with identity covariance.
p = 5; n = 10
X = matrix(rnorm(n*p), ncol=p)
alpha=0.05
Sigma0=diag(ncol(X))
cm13(X,Sigma0, alpha)
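Before calling a one-sample test it can help to confirm that the hypothesized matrix is a usable covariance matrix for the data. The sketch below is a hypothetical helper (check_sigma0 is not part of 'Docovt') that checks dimension, symmetry and positive definiteness; whether cm13 performs any such checks internally is not stated here, and the tolerance is an arbitrary choice.

```r
# Hypothetical helper, not part of 'Docovt'.
check_sigma0 <- function(X, Sigma0, tol = 1e-8) {
  stopifnot(is.matrix(X), is.matrix(Sigma0))
  # Sigma0 must be p by p, where p is the number of columns of X.
  if (!all(dim(Sigma0) == ncol(X))) {
    stop("Sigma0 must be p by p, with p = ncol(X)")
  }
  if (!isSymmetric(Sigma0, tol = tol)) {
    stop("Sigma0 must be symmetric")
  }
  # Positive definiteness: smallest eigenvalue must exceed the tolerance.
  ev <- eigen(Sigma0, symmetric = TRUE, only.values = TRUE)$values
  if (min(ev) <= tol) {
    stop("Sigma0 must be positive definite")
  }
  invisible(TRUE)
}

set.seed(1)
p <- 5; n <- 10
X <- matrix(rnorm(n * p), ncol = p)
check_sigma0(X, diag(ncol(X)))   # passes silently for the identity matrix
```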
Two-Sample Covariance Test by Cai and Ma (2013)
Description
Given two sets of data, this function performs a two-sample test for equality of covariance matrices, where the null hypothesis is
H_0 : \Sigma_1 = \Sigma_2
where \Sigma_1 and \Sigma_2 represent the true (unknown) covariance of each dataset. The test is based on a procedure proposed by Cai and Ma (2013). If statistic > threshold, the null hypothesis is rejected.
Usage
cmtwo(X, Y, alpha)
Arguments
X |
an n1 by p data matrix |
Y |
an n2 by p data matrix |
alpha |
level of significance. |
Value
a named list containing
- statistic
a test statistic value.
- threshold
rejection criterion to be compared against test statistic.
- reject
a logical; TRUE to reject the null hypothesis, FALSE otherwise.
Examples
## generate 2 datasets from a multivariate normal with identical covariance.
p = 5; n1 = 100; n2 = 150; alpha = 0.05
X = matrix(rnorm(n1*p), ncol=p)
Y = matrix(rnorm(n2*p), ncol=p)
# run test
cmtwo(X, Y, alpha)
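The functions in this manual report their outcome in two ways: some return stat and pval, others return statistic, threshold and reject. A small wrapper can turn either into a single reject decision. This is a minimal sketch: decide is a hypothetical helper, it assumes the results are lists with exactly the element names given in the Value sections, and the numbers in the stand-in results are made up.

```r
# Hypothetical helper, not part of 'Docovt'.
decide <- function(res, alpha = 0.05) {
  if (!is.null(res[["pval"]])) {
    # p-value style: reject when pval is below the significance level.
    res[["pval"]] < alpha
  } else if (!is.null(res[["statistic"]]) && !is.null(res[["threshold"]])) {
    # threshold style: reject when the statistic exceeds the threshold.
    res[["statistic"]] > res[["threshold"]]
  } else {
    stop("unrecognized result structure")
  }
}

# Stand-in results with made-up numbers, only to show the two shapes.
pval_style <- list(stat = 3.2, pval = 0.01)
threshold_style <- list(statistic = 1.4, threshold = 2.0, reject = FALSE)

decide(pval_style)
decide(threshold_style)
```

For the threshold style, alpha is already baked into threshold when the test is run, so the alpha argument of decide only matters for the p-value style.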
corneal
Description
This dataset was acquired during a keratoconus study, a collaborative project involving Ms. Nancy Tripoli and Dr. Kenneth L. Cohen of the Department of Ophthalmology at the University of North Carolina, Chapel Hill. The fitted feature vectors for the complete corneal surface dataset are collected into a feature matrix with dimensions 150 × 2000.
Usage
data(corneal)
Format
'corneal'
A data frame with 150 observations in the following 4 groups.
- normal group1
rows 1 to 43 (43 rows of the feature matrix) are observations from the normal group
- unilateral suspect group2
rows 44 to 57 (14 rows of the feature matrix) are observations from the unilateral suspect group
- suspect map group3
rows 58 to 78 (21 rows of the feature matrix) are observations from the suspect map group
- clinical keratoconus group4
rows 79 to 150 (72 rows of the feature matrix) are observations from the clinical keratoconus group
Examples
data(corneal)
dim(corneal)
group1 <- as.matrix(corneal[1:43, ]) ## normal group
dim(group1)
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
dim(group2)
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
dim(group3)
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
dim(group4)
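The four row ranges above can also be encoded once as a grouping factor, which avoids repeating index ranges. The following is a minimal sketch on a stand-in matrix (an assumption: 150 rows as documented, with only 10 columns to keep it small); the group labels are hypothetical names chosen for illustration.

```r
# Group sizes taken from the Format section; the labels are hypothetical.
group_sizes <- c(normal = 43, unilateral_suspect = 14,
                 suspect_map = 21, clinical_keratoconus = 72)
group <- factor(rep(names(group_sizes), times = group_sizes),
                levels = names(group_sizes))

# Stand-in for corneal: 150 rows as documented, 10 arbitrary columns.
set.seed(1)
mock <- matrix(rnorm(150 * 10), nrow = 150)

# Split the row indices by group, then extract each sub-matrix.
groups <- lapply(split(seq_len(nrow(mock)), group),
                 function(idx) mock[idx, , drop = FALSE])

sapply(groups, nrow)   # rows per group, in the order of the Format section
```

With the real data, replacing mock by as.matrix(corneal) should give the same four matrices as the explicit row ranges in the example above, provided the rows are ordered as documented.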
miRNA
Description
A three factor level variable corresponding to cancer type
Usage
data(miRNA)
Format
Dataframe with 21 samples and 537 variables
- columns
variables
- rows
samples
Examples
data(miRNA)
One-Sample Covariance Test by Srivastava, Yanagihara, and Kubokawa (2014)
Description
Given data, this function performs a one-sample test for the covariance matrix, where the null hypothesis is
H_0 : \Sigma_n = \Sigma_0
where \Sigma_n is the covariance of the data model and \Sigma_0 is a hypothesized covariance. The test is based on a procedure proposed by Srivastava, Yanagihara, and Kubokawa (2014).
Usage
syk(data, Sigma0, alpha)
Arguments
data |
an n by p data matrix |
Sigma0 |
a p by p hypothesized covariance matrix |
alpha |
level of significance. |
Value
a named list containing
- statistic
a test statistic value.
- threshold
rejection criterion to be compared against test statistic.
- reject
a logical; TRUE to reject the null hypothesis, FALSE otherwise.
Examples
## generate data from a multivariate normal with identity covariance.
p = 5; n = 10
data = matrix(rnorm(n*p), ncol=p)
alpha=0.05
Sigma0=diag(ncol(data))
## run the test
syk(data, Sigma0, alpha)
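A one-sample hypothesis \Sigma_n = \Sigma_0 with a general \Sigma_0 can be restated as a hypothesis about the identity matrix: if \Sigma_0 = R'R (Cholesky factor R), the rows of X R^{-1} have identity covariance exactly when the rows of X have covariance \Sigma_0. The sketch below illustrates this transformation in base R; the AR(1)-type Sigma0 is an illustrative assumption, and whether syk applies such a transformation internally is not stated here.

```r
set.seed(1)
p <- 5; n <- 10

# Hypothetical non-identity Sigma0, chosen for illustration.
Sigma0 <- 0.5^abs(outer(seq_len(p), seq_len(p), "-"))
R <- chol(Sigma0)                      # Sigma0 equals t(R) %*% R

Z <- matrix(rnorm(n * p), ncol = p)    # rows with identity covariance
X <- Z %*% R                           # rows with covariance Sigma0

# Whitening: undo the factor so the null becomes "covariance = identity".
X_white <- X %*% solve(R)

all.equal(X_white, Z)                  # whitening recovers the original draws
```

Calling a one-sample test on X_white with diag(p) then poses the same null hypothesis as calling it on X with Sigma0.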