snpgdsPCACorr {SNPRelate} | R Documentation |
To calculate the SNP correlations between eigenvectors and SNP genotypes
snpgdsPCACorr(pcaobj, gdsobj, snp.id=NULL, eig.which=NULL, num.thread=1L, with.id=TRUE, outgds=NULL, verbose=TRUE)
pcaobj |
a |
gdsobj |
an object of class |
snp.id |
a vector of snp id specifying selected SNPs; if NULL, all SNPs are used |
eig.which |
a vector of integers specifying which eigenvectors to use |
num.thread |
the number of (CPU) cores used; if |
with.id |
if |
outgds |
|
verbose |
if TRUE, show information |
If an output file name is specified via outgds, "sample.id", "snp.id" and
"correlation" will be stored in the GDS file. The GDS node "correlation" is a
matrix of correlation coefficients, and it is stored in the packed real number
format ("packedreal16" preserving 4 digits; 0.0001 is the smallest number
greater than zero, see add.gdsn).
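The effect of keeping 4 digits can be sketched in base R. This is only an illustration: it assumes a uniform step of 0.0001 and does not reproduce the actual packedreal16 encoding in gdsfmt. The names x, step, stored and max_err are hypothetical.

```r
# Illustration of 4-digit quantisation (assumed step of 1e-4); not gdsfmt code.
set.seed(2)
x <- runif(1000, min = -1, max = 1)   # hypothetical correlation coefficients

step <- 1e-4                          # assumed resolution of the stored values
stored <- round(x / step) * step      # value as it would be read back

# Largest difference between the original and the quantised values;
# under this assumption it is no larger than step / 2.
max_err <- max(abs(stored - x))
max_err
```

Under this assumption, values read back from the GDS file may differ from the in-memory result by a small amount, which is why the example below compares the two with a tolerance rather than testing for equality.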
Returns a list if outgds=NULL:
sample.id |
the sample ids used in the analysis |
snp.id |
the SNP ids used in the analysis |
snpcorr |
a matrix of correlation coefficients, "# of eigenvectors" x "# of SNPs" |
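The layout of snpcorr can be illustrated with a conceptual base R sketch. It uses simulated genotypes and a plain eigen-decomposition, so it is not SNPRelate's implementation. All object names (geno, eigenvect, n_samp, n_snp) are hypothetical. Skipping missing genotypes pairwise is an assumption made here for illustration.

```r
# Conceptual sketch in base R; not the SNPRelate implementation.
set.seed(1)
n_samp <- 50    # hypothetical number of samples
n_snp  <- 200   # hypothetical number of SNPs

# Simulated genotypes coded 0/1/2, samples in rows and SNPs in columns
geno <- matrix(rbinom(n_samp * n_snp, size = 2, prob = 0.3),
               nrow = n_samp, ncol = n_snp)
geno[sample(length(geno), 100)] <- NA   # a few missing genotypes

# Simple PCA on centred genotypes, with missing values set to the SNP mean
g <- scale(geno, center = TRUE, scale = FALSE)
g[is.na(g)] <- 0
eig <- eigen(tcrossprod(g), symmetric = TRUE)
eigenvect <- eig$vectors[, 1:4]         # first four eigenvectors, one per column

# Correlate each eigenvector with each SNP, skipping missing genotypes
snpcorr <- cor(eigenvect, geno, use = "pairwise.complete.obs")

dim(snpcorr)   # eigenvectors in rows, SNPs in columns
```

Row i of the result holds the correlations for the i-th selected eigenvector, which matches how the example below indexes cr$snpcorr[3,] to plot the third component.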
Xiuwen Zheng
Patterson N, Price AL, Reich D (2006) Population structure and eigenanalysis. PLoS Genetics 2:e190.
snpgdsPCA, snpgdsPCASampLoading, snpgdsPCASNPLoading
# open an example dataset (HapMap)
genofile <- snpgdsOpen(snpgdsExampleFileName())

# get chromosome index
chr <- read.gdsn(index.gdsn(genofile, "snp.chromosome"))

pca <- snpgdsPCA(genofile)
cr <- snpgdsPCACorr(pca, genofile, eig.which=1:4)
plot(abs(cr$snpcorr[3,]), xlab="SNP Index", ylab="PC 3", col=chr)

# output to a gds file if limited memory
snpgdsPCACorr(pca, genofile, eig.which=1:4, outgds="test.gds")

(f <- openfn.gds("test.gds"))
m <- read.gdsn(index.gdsn(f, "correlation"))
closefn.gds(f)

# check
summary(c(m - cr$snpcorr))  # should < 1e-4

# close the file
snpgdsClose(genofile)

# delete the temporary file
unlink("test.gds", force=TRUE)