Genet Epidemiol. 2013 Nov;37(7):726-42.
doi: 10.1002/gepi.21757.

Functional linear models for association analysis of quantitative traits

Ruzong Fan et al. Genet Epidemiol. 2013 Nov.

Abstract

Functional linear models are developed in this paper for testing associations between quantitative traits and genetic variants, which can be rare variants, common variants, or a combination of the two. By treating multiple genetic variants of an individual in a human population as a realization of a stochastic process, the genome of an individual in a chromosome region is viewed as a continuum of sequence data rather than discrete observations. The genome of an individual is viewed as a stochastic function that contains both linkage and linkage disequilibrium (LD) information of the genetic markers. By using techniques of functional data analysis, both fixed and mixed effect functional linear models are built to test the association between quantitative traits and genetic variants, adjusting for covariates. Extensive simulation analysis shows that the F-distributed tests of the proposed fixed effect functional linear models have higher power than the sequence kernel association test (SKAT) and its optimal unified test (SKAT-O) in most cases for three scenarios: (1) the causal variants are all rare, (2) the causal variants are both rare and common, and (3) the causal variants are common. The superior performance of the fixed effect functional linear models is most likely due to their optimal utilization of both genetic linkage and LD information of multiple genetic variants in a genome and of similarity among different individuals, while SKAT and SKAT-O model only the similarities and pairwise LD and do not sufficiently model linkage and higher order LD information. In addition, the proposed fixed effect models generate accurate type I error rates in simulation studies. We also show that the functional kernel score tests of the proposed mixed effect functional linear models are preferable in candidate gene analysis and small sample problems. The methods are applied to analyze three biochemical traits in data from the Trinity Students Study.

Keywords: association mapping; common variants; complex traits; functional data analysis; quantitative trait loci; rare variants.
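To make the fixed effect idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes one simple variant of a functional linear model in which only the genetic effect function beta(t) is expanded in a small Fourier basis, no covariates other than an intercept are included, and the test of beta(t) = 0 is an ordinary nested-model F-test. The abstract does not specify how models (3), (4), and (6) are constructed, so the basis choice, the function names (`fourier_basis`, `flm_f_statistic`), and the simulated toy genotypes are all illustrative assumptions.

```python
import math
import random

def fourier_basis(t, k):
    """First k Fourier basis functions on [0, 1], evaluated at position t."""
    vals = [1.0]
    m = 1
    while len(vals) < k:
        vals.append(math.sin(2 * math.pi * m * t))
        if len(vals) < k:
            vals.append(math.cos(2 * math.pi * m * t))
        m += 1
    return vals

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def rss(design, y):
    """Residual sum of squares of the least-squares fit of y on design."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    coef = solve(xtx, xty)
    return sum((yi - sum(c * v for c, v in zip(coef, row))) ** 2
               for row, yi in zip(design, y))

def flm_f_statistic(genotypes, positions, y, k):
    """F statistic and degrees of freedom for H0: beta(t) = 0.

    Assumed model: y_i = a0 + sum_j x_i(t_j) * beta(t_j) + e_i,
    with beta(t) = sum_c b_c * phi_c(t) for k Fourier basis functions phi_c.
    """
    n = len(y)
    basis = [fourier_basis(t, k) for t in positions]
    # w[i][c] = sum_j x_i(t_j) * phi_c(t_j): genotypes projected on the basis
    w = [[sum(g[j] * basis[j][c] for j in range(len(positions)))
          for c in range(k)] for g in genotypes]
    full = [[1.0] + row for row in w]      # intercept plus basis scores
    null = [[1.0] for _ in range(n)]       # intercept only
    rss0, rss1 = rss(null, y), rss(full, y)
    df1, df2 = k, n - k - 1
    return ((rss0 - rss1) / df1) / (rss1 / df2), df1, df2

# Toy data (hypothetical, not COSI): 200 individuals, 20 variants in one region.
random.seed(1)
n_ind, n_var, maf = 200, 20, 0.3
positions = [(j + 0.5) / n_var for j in range(n_var)]  # scaled to [0, 1]
genotypes = [[sum(random.random() < maf for _ in range(2))
              for _ in range(n_var)] for _ in range(n_ind)]
# The first five variants are causal, each with effect 1.0, plus unit noise.
y = [sum(g[:5]) + random.gauss(0.0, 1.0) for g in genotypes]

f_stat, df1, df2 = flm_f_statistic(genotypes, positions, y, k=5)
print(f"F = {f_stat:.2f} on ({df1}, {df2}) degrees of freedom")
```

Under the null hypothesis and the usual normal-error assumptions, the statistic would be compared with an F distribution on (`df1`, `df2`) degrees of freedom. Because the genotypes enter the model only through `k` basis scores, the number of tested parameters stays fixed as more variants are added to the region, which is one plausible reading of how such a test can pool information across neighboring variants.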

Figures

Figure 1
The empirical power of the F-test statistics of the fixed effect models (3), (4), and (6), and SKAT and SKAT-O using both rare and common variants in analysis, when causal variants were both rare and common, and all causal variants had positive effects. The simulations were based on COSI sequence data.
Figure 2
The empirical power of the F-test statistics of the fixed effect models (3), (4), and (6), and SKAT and SKAT-O using both rare and common variants in analysis, when causal variants were both rare and common, and 20%/80% causal variants had negative/positive effects. The simulations were based on COSI sequence data.
Figure 3
The empirical power of the F-test statistics of the fixed effect models (3), (4), and (6), and SKAT and SKAT-O using both rare and common variants in analysis, when causal variants were both rare and common, and 50%/50% causal variants had negative/positive effects. The simulations were based on COSI sequence data.
Figure 4
The empirical power of the F-test statistics of the fixed effect models (3), (4), and (6), and SKAT and SKAT-O using rare variants in analysis, when causal variants were only rare, and all causal variants had positive effects. The simulations were based on COSI sequence data.
Figure 5
The empirical power of the F-test statistics of the fixed effect models (3), (4), and (6), and SKAT and SKAT-O using rare variants in analysis, when causal variants were only rare, and 20%/80% causal variants had negative/positive effects. The simulations were based on COSI sequence data.
Figure 6
The empirical power of the F-test statistics of the fixed effect models (3), (4), and (6), and SKAT and SKAT-O using rare variants in analysis, when causal variants were only rare, and 50%/50% causal variants had negative/positive effects. The simulations were based on COSI sequence data.
