snpgdsGRM {SNPRelate} | R Documentation |
Calculate Genetic Relationship Matrix (GRM) using SNP genotype data.
snpgdsGRM(gdsobj, sample.id=NULL, snp.id=NULL, autosome.only=TRUE, remove.monosnp=TRUE, maf=NaN, missing.rate=NaN, method=c("GCTA", "Eigenstrat", "EIGMIX", "Weighted", "Corr", "IndivBeta"), num.thread=1L, with.id=TRUE, verbose=TRUE)
gdsobj |
an object of class |
sample.id |
a vector of sample id specifying selected samples; if NULL, all samples are used |
snp.id |
a vector of snp id specifying selected SNPs; if NULL, all SNPs are used |
autosome.only |
if |
remove.monosnp |
if TRUE, remove monomorphic SNPs |
maf |
to use the SNPs with ">= maf" only; if NaN, no MAF threshold |
missing.rate |
to use the SNPs with "<= missing.rate" only; if NaN, no missing threshold |
method |
"GCTA" – genetic relationship matrix defined in GCTA; "Eigenstrat" – genetic covariance matrix in EIGENSTRAT; "EIGMIX" – two times coancestry matrix defined in Zheng & Weir (2015); "Weighted" – weighted GCTA, the same as "EIGMIX"; "Corr" – scaled GCTA GRM (dividing each i,j element by the product of the square roots of the i,i and j,j elements); "IndivBeta" – two times individual beta estimate; see details |
num.thread |
the number of (CPU) cores used; if |
with.id |
if |
verbose |
if |
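The SNP filters described by remove.monosnp, maf and missing.rate can be illustrated with a minimal base-R sketch. It is not the package's internal code: the toy matrix `geno`, the thresholds `maf_min` and `miss_max`, and the choice to estimate allele frequencies from non-missing calls are all assumptions made for illustration.

```r
# Toy genotype matrix: rows = samples, columns = SNPs, entries = allele
# dosage (0, 1, 2), NA = missing call. All values are made up.
geno <- cbind(
  snp1 = c(0, 1, 2, 1, 0, 2),
  snp2 = c(0, 0, 0, 0, 0, 0),     # monomorphic
  snp3 = c(0, 0, 1, 0, NA, NA),   # rare allele, many missing calls
  snp4 = c(2, 1, NA, 1, 0, 2)
)

# Per-SNP allele frequency (from non-missing calls), minor allele
# frequency and missing rate
p        <- colMeans(geno, na.rm = TRUE) / 2
maf_obs  <- pmin(p, 1 - p)
miss_obs <- colMeans(is.na(geno))

# Hypothetical thresholds, playing the role of maf=0.2, missing.rate=0.2
maf_min  <- 0.2
miss_max <- 0.2

keep <- maf_obs > 0 &          # drop monomorphic SNPs
  maf_obs >= maf_min &         # keep SNPs with ">= maf"
  miss_obs <= miss_max         # keep SNPs with "<= missing.rate"

kept_snps <- colnames(geno)[keep]
print(kept_snps)
```

Setting a threshold to NaN in snpgdsGRM disables the corresponding filter; in this sketch that would amount to dropping the matching condition from `keep`.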
"GCTA": the genetic relationship matrix in GCTA is defined as G_ij = avg_l [(g_il - 2*p_l)*(g_jl - 2*p_l) / 2*p_l*(1 - p_l)] for individuals i,j and locus l;
"Eigenstrat": the genetic covariance matrix in EIGENSTRAT G_ij = avg_l [(g_il - 2*p_l)*(g_jl - 2*p_l) / 2*p_l*(1 - p_l)] for individuals i,j and locus l; the missing genotype is imputed by the dosage mean of that locus.
"EIGMIX" / "Weighted": it is the same as '2 * snpgdsEIGMIX(, ibdmat=TRUE, diagadj=FALSE)$ibd': G_ij = [sum_l (g_il - 2*p_l)*(g_jl - 2*p_l)] / [sum_l 2*p_l*(1 - p_l)] for individuals i,j and locus l;
"IndivBeta": it is the same as '2 * snpgdsIndivBeta(, with.id=FALSE)'.
Returns a list if with.id = TRUE:
sample.id |
the sample ids used in the analysis |
snp.id |
the SNP ids used in the analysis |
grm |
the genetic relationship matrix; different methods might have different meanings and interpretation for estimates |
If with.id = FALSE, this function returns the genetic relationship matrix (GRM) without sample and SNP IDs.
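As a rough illustration of working with the list form of the return value, the sketch below labels the matrix with the sample IDs and builds a table of pairwise values. The object `rv` is a hand-built stand-in with the same three components, not real output, and the sketch assumes the rows and columns of grm follow the order of sample.id.

```r
# Hand-built stand-in mimicking the documented list structure;
# IDs and matrix entries are made up for illustration.
rv <- list(
  sample.id = c("S1", "S2", "S3"),
  snp.id    = 1:5,
  grm       = matrix(c(1.0, 0.1, 0.0,
                       0.1, 0.9, 0.5,
                       0.0, 0.5, 1.1), nrow = 3, byrow = TRUE)
)

# Label rows and columns so that pairs can be looked up by sample ID
grm <- rv$grm
dimnames(grm) <- list(rv$sample.id, rv$sample.id)
print(grm["S2", "S3"])

# Long-format table of the distinct off-diagonal pairs
idx <- which(upper.tri(grm), arr.ind = TRUE)
pair_tab <- data.frame(
  id1   = rv$sample.id[idx[, "row"]],
  id2   = rv$sample.id[idx[, "col"]],
  value = grm[idx]
)
print(pair_tab)
```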
Xiuwen Zheng
Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006).
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. American journal of human genetics 88, 76-82 (2011).
Zheng X, Weir BS. Eigenanalysis on SNP Data with an Interpretation of Identity by Descent. Theoretical Population Biology. 2016 Feb;107:65-76. doi: 10.1016/j.tpb.2015.09.004
Weir BS, Zheng X. SNPs and SNVs in Forensic Science. Forensic Science International: Genetics Supplement Series. 2015. doi:10.1016/j.fsigss.2015.09.106
snpgdsPCA, snpgdsEIGMIX, snpgdsIndivBeta, snpgdsIndInb, snpgdsFst
# open an example dataset (HapMap)
genofile <- snpgdsOpen(snpgdsExampleFileName())

rv <- snpgdsGRM(genofile, method="GCTA")
eig <- eigen(rv$grm)  # Eigen-decomposition

pop <- factor(read.gdsn(index.gdsn(genofile, "sample.annot/pop.group")))
plot(eig$vectors[,1], eig$vectors[,2], col=pop)
legend("topleft", legend=levels(pop), pch=19, col=1:4)

# close the file
snpgdsClose(genofile)