Hum Hered. 2011;71(1):67-82.
doi: 10.1159/000324839. Epub 2011 Apr 8.

A rapid generalized least squares model for a genome-wide quantitative trait association analysis in families

Xiang Li et al. Hum Hered. 2011.

Abstract

Genome-wide association studies (GWAS) using family data involve association analyses between hundreds of thousands of markers and a trait for a large number of related individuals. The correlations among relatives bring statistical and computational challenges when performing these large-scale association analyses. Recently, several rapid methods accounting for both within- and between-family variation have been proposed. However, these techniques mostly model the phenotypic similarities in terms of genetic relatedness. The familial resemblances in many family-based studies, such as twin studies, are not only due to genetic relatedness but also derive from shared environmental effects and assortative mating. In this paper, we propose 2 generalized least squares (GLS) models for rapid association analysis of family-based GWAS, which accommodate both genetic and environmental contributions to familial resemblance. In our first model, we estimate the joint genetic and environmental variations. In our second model, we estimate the genetic and environmental components separately. Through simulation studies, we demonstrate that our proposed approaches are more powerful and computationally efficient than a number of existing methods. We show that estimating the residual variance-covariance matrix in the GLS models without SNP effects does not lead to an appreciable bias in the p values as long as the SNP effect is small (i.e. accounting for no more than 1% of trait variance).
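The two-step idea described in the abstract (estimate the residual variance-covariance matrix once, without SNP effects, then reuse it for every marker) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `rapid_gls_scan`, the restriction to equal-sized families stored contiguously, the intercept-only null model, the pooled unstructured covariance estimate, and the normal approximation for the p values are all assumptions made for illustration, and the toy data are simulated.

```python
import math
import numpy as np

def rapid_gls_scan(y, snps, fam_size):
    """Two-step GLS scan for equal-sized families stored contiguously.

    y: (n,) trait values; snps: (n, m) genotype codes; fam_size: members per family.
    Returns (z, p): Wald z statistics and two-sided normal p values, one per SNP.
    """
    n, m = snps.shape
    n_fam = n // fam_size
    assert n_fam * fam_size == n, "every family must have fam_size members"

    # Step 1: fit the null model (intercept only, no SNP) and pool the
    # within-family residual outer products into one unstructured covariance.
    resid = (y - y.mean()).reshape(n_fam, fam_size)
    sigma = resid.T @ resid / n_fam

    # Whitening matrix W with W @ sigma @ W.T = I, from the Cholesky factor.
    W = np.linalg.inv(np.linalg.cholesky(sigma))

    def whiten(v):
        # Apply W to each family's block of v.
        return (v.reshape(n_fam, fam_size) @ W.T).reshape(n)

    y_w = whiten(y)
    ones_w = whiten(np.ones(n))

    # Step 2: reuse the same whitening for every SNP, so GLS reduces to OLS.
    z = np.empty(m)
    for j in range(m):
        X = np.column_stack([ones_w, whiten(snps[:, j].astype(float))])
        xtx_inv = np.linalg.inv(X.T @ X)
        beta = xtx_inv @ (X.T @ y_w)
        z[j] = beta[1] / math.sqrt(xtx_inv[1, 1])
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return z, p

# Toy data (all values hypothetical): 500 twin pairs, 5 SNPs, SNP 0 causal.
rng = np.random.default_rng(0)
n_fam, fam_size, m = 500, 2, 5
snps = rng.binomial(2, 0.3, size=(n_fam * fam_size, m))
shared = np.repeat(rng.normal(0.0, math.sqrt(0.5), n_fam), fam_size)
unique = rng.normal(0.0, math.sqrt(0.5), n_fam * fam_size)
y = 0.5 * snps[:, 0] + shared + unique
z, p = rapid_gls_scan(y, snps, fam_size)
```

Because the covariance is estimated only once, the per-SNP cost is a small least squares fit on pre-whitened data. The sketch draws genotypes independently within families for simplicity, which real relatives would not satisfy.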

Figures

Fig. 1
−log10(p values) comparing the ‘gold standard’ GLS method with the other methods in the HomoG, HomoGE, and HetGE scenarios. There are 10,000 points on each panel. Rows correspond to the 6 methods, RFGLS-UN, FGLS, RFGLS-VC, GWAF, OLS, and GEE (from top to bottom); columns correspond to the 3 simulation scenarios, HomoG, HomoGE, and HetGE (from left to right).
Fig. 2
−log10(p values) comparing the rapid FGLS method (RFGLS-UN) and its corresponding full FGLS method. There are 10,000 points on each panel. a HomoG, b HomoGE, c HetGE.
Fig. 3
Comparison between the RFGLS-UN and the FGLS method by increasing the proportion of the total variance explained by each SNP from 0.1 to 5% (Simulation II). There are 10 SNPs being compared at each vertical line. Solid circles represent the FGLS method, and empty circles represent the RFGLS-UN method.
Fig. 4
Boxplots of the power to detect the multiple causal SNPs, by method (Simulation III). The y-axis is power; the x-axis is method. a HomoG, b HomoGE, c HetGE. Each of the 10 causal SNPs explains 0.6% of the total variance. α = 5 × 10⁻⁸.
Fig. 5
Manhattan plot of the genome scans for the MCTFR height phenotype using the 5 methods. The x-axes represent the chromosome number. The y-axes represent the −log10 of p values.
Fig. 6
Q-Q plots and log Q-log Q plots. a, b RFGLS-UN. c, d RFGLS-VC.
