PLINK: Key Functions for Data Analysis
- PMID: 30040203
- DOI: 10.1002/cphg.59
Abstract
Genetic data analysis of large numbers of single nucleotide variants (SNVs), including genome-wide association studies (GWAS), exome chips, and whole-exome (WES) or whole-genome (WGS) sequencing data, requires well-defined processing steps. As a result, several freely available analytic toolkits have been developed to streamline these processes. Among these, PLINK is the most comprehensive in terms of its quality control and analytic modules, although its focus remains on SNVs. PLINK fulfills two analytic needs: aiding quality control (QC) on large data sets and providing basic statistical tools to analyze the variants in genetic models. The current version of PLINK (v1.90b) has incorporated several sophisticated statistical modeling features, such as those that were introduced by GCTA (genome-wide complex trait analysis), including mixed-model association analysis and cluster-based algorithms. Although PLINK applies to a wide range of data management and analysis tasks, in some instances other available tools offer better options. Here we provide a practical overview of major PLINK features with respect to QC, data management, and association mapping, along with shortcuts we have learned and limitations to consider. In cases where PLINK features are limited, we provide alternative approaches using additional freely available pipelines. © 2018 by John Wiley & Sons, Inc.
Keywords: GWAS; NGS; QC; SNV; association; software.
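To make the two roles described in the abstract concrete (QC filtering, then basic association testing), the following is a minimal sketch that only assembles PLINK 1.9 command lines; it does not run them. The function names, file prefixes, the binary name `plink`, and every threshold value are illustrative assumptions, not recommendations taken from the article. The flags themselves (`--bfile`, `--maf`, `--geno`, `--mind`, `--hwe`, `--make-bed`, `--logistic`, `--covar`, `--out`) are standard PLINK 1.9 options.

```python
import shlex


def qc_command(bfile, out, maf=0.01, geno=0.02, mind=0.02, hwe=1e-6):
    """Build a PLINK 1.9 QC command (hypothetical helper, thresholds are placeholders).

    --maf   drops variants below a minor allele frequency
    --geno  drops variants above a missing-call rate
    --mind  drops samples above a missing-call rate
    --hwe   drops variants failing a Hardy-Weinberg test at this p-value
    """
    return [
        "plink", "--bfile", bfile,
        "--maf", str(maf),
        "--geno", str(geno),
        "--mind", str(mind),
        "--hwe", str(hwe),
        "--make-bed", "--out", out,
    ]


def assoc_command(bfile, out, covar=None):
    """Build a logistic-regression association command (hypothetical helper)."""
    cmd = ["plink", "--bfile", bfile, "--logistic", "--out", out]
    if covar is not None:
        # Optional covariate file, e.g. principal components (assumed file name).
        cmd += ["--covar", covar]
    return cmd


if __name__ == "__main__":
    # Print the commands so they can be inspected before being run by hand.
    print(shlex.join(qc_command("raw", "clean")))
    print(shlex.join(assoc_command("clean", "results", covar="pcs.txt")))
```

Building the argument list separately from execution keeps the thresholds visible and easy to adjust per study before anything touches the data.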