snpgdsDiss {SNPRelate} | R Documentation |
Calculate the individual dissimilarities for each pair of individuals.
snpgdsDiss(gdsobj, sample.id=NULL, snp.id=NULL, autosome.only=TRUE, remove.monosnp=TRUE, maf=NaN, missing.rate=NaN, num.thread=1, verbose=TRUE)
gdsobj |
an object of class |
sample.id |
a vector of sample IDs specifying the selected samples; if NULL, all samples are used |
snp.id |
a vector of SNP IDs specifying the selected SNPs; if NULL, all SNPs are used |
autosome.only |
if |
remove.monosnp |
if TRUE, remove monomorphic SNPs |
maf |
use only the SNPs with minor allele frequency ">= maf"; if NaN, no MAF threshold is applied |
missing.rate |
use only the SNPs with missing rate "<= missing.rate"; if NaN, no missing-rate threshold is applied |
num.thread |
the number of (CPU) cores used; if |
verbose |
if TRUE, show information |
The minor allele frequency and missing rate for each SNP passed in snp.id are calculated over all the samples in sample.id.
Further details will be described in the future.
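The per-SNP filtering statistics mentioned above can be illustrated with a small base R sketch. This is not the package's internal code: the toy matrix `geno`, the selection `sel`, and the threshold values are hypothetical, and the 0/1/2 allele-count coding with NA for missing calls is an assumption made for illustration.

```r
# Toy genotype matrix: rows are SNPs, columns are samples.
# Entries count copies of one allele (0, 1, 2); NA marks a missing call.
geno <- matrix(
    c( 0,  1, 2, NA,
       2,  2, 2,  2,
       0,  0, 1,  0,
      NA, NA, 1,  2),
    nrow = 4, byrow = TRUE,
    dimnames = list(paste0("snp", 1:4), paste0("s", 1:4)))

# Hypothetical sample selection, playing the role of sample.id
sel <- c("s1", "s2", "s3")
g <- geno[, sel, drop = FALSE]

# Per-SNP missing rate, computed over the selected samples only
miss <- rowMeans(is.na(g))

# Per-SNP allele frequency over non-missing calls, folded to the minor allele
p <- rowMeans(g, na.rm = TRUE) / 2
maf.val <- pmin(p, 1 - p)

# Keep SNPs that are polymorphic and pass the (hypothetical) thresholds,
# mirroring remove.monosnp=TRUE, maf=0.05, missing.rate=0.5
keep <- maf.val > 0 & maf.val >= 0.05 & miss <= 0.5
rownames(g)[keep]
```

Because both statistics are computed over the selected samples, changing `sel` can change which SNPs pass the thresholds.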
Returns an object of class "snpgdsDissClass":
sample.id |
the sample ids used in the analysis |
snp.id |
the SNP ids used in the analysis |
diss |
a matrix of pairwise individual dissimilarities |
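As a rough sketch of how the three components might be used together, the following builds a mock object with made-up values and clusters it with base R `hclust` instead of the package's own `snpgdsHCluster`. It assumes that `diss` is a square, symmetric matrix whose rows and columns follow the order of `sample.id`; that ordering is an assumption here, not something stated above.

```r
# Mock object with the three documented components (all values are made up)
ids <- c("A", "B", "C", "D")
d <- matrix(c(0.0, 0.2, 0.8, 0.9,
              0.2, 0.0, 0.7, 0.8,
              0.8, 0.7, 0.0, 0.1,
              0.9, 0.8, 0.1, 0.0), nrow = 4, byrow = TRUE)
res <- structure(list(sample.id = ids, snp.id = 1:10, diss = d),
                 class = "snpgdsDissClass")

# Label the matrix with sample ids (assumes rows/columns follow sample.id)
m <- res$diss
dimnames(m) <- list(res$sample.id, res$sample.id)

# Base R average-linkage clustering on the dissimilarities
hc <- hclust(as.dist(m), method = "average")

# Cut the tree into two groups and show the assignment per sample
groups <- cutree(hc, k = 2)
groups
```

With these toy values, A and B fall into one group and C and D into the other, since their pairwise dissimilarities are the smallest.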
Xiuwen Zheng
Zheng, Xiuwen. 2013. Statistical Prediction of HLA Alleles and Relatedness Analysis in Genome-Wide Association Studies. PhD dissertation, the department of Biostatistics, University of Washington.
Weir BS, Zheng X. SNPs and SNVs in Forensic Science. 2015. Forensic Science International: Genetics Supplement Series.
# open an example dataset (HapMap)
genofile <- snpgdsOpen(snpgdsExampleFileName())

pop.group <- as.factor(read.gdsn(index.gdsn(
    genofile, "sample.annot/pop.group")))
pop.level <- levels(pop.group)

diss <- snpgdsDiss(genofile)
hc <- snpgdsHCluster(diss)

# close the genotype file
snpgdsClose(genofile)

# split
set.seed(100)
rv <- snpgdsCutTree(hc, label.H=TRUE, label.Z=TRUE)

# draw dendrogram
snpgdsDrawTree(rv, main="HapMap Phase II",
    edgePar=list(col=rgb(0.5,0.5,0.5, 0.75), t.col="black"))