Evol Bioinform Online. 2008 Jun 28:4:225-35.

doi: 10.4137/ebo.s756.

Estimation of genetic effects and genotype-phenotype maps

Arnaud Le Rouzic¹, José M Alvarez-Castro

PMID: 19204820
PMCID: PMC2614198
DOI: 10.4137/ebo.s756

Arnaud Le Rouzic et al. Evol Bioinform Online. 2008.

Affiliation

¹ Center for Ecological and Evolutionary Synthesis, University of Oslo, Oslo, Norway. a.p.s.lerouzic@bio.uio.no

Abstract

Determining the genetic architecture of complex traits is a necessary step to understand phenotypic changes in natural, experimental and domestic populations. However, this is still a major challenge for modern genetics, since the estimation of genetic effects tends to be complicated by genetic interactions, which lead to changes in the effect of allelic substitutions depending on the genetic background. Recent progress in statistical tools aiming to describe and quantify genetic effects meaningfully improves the efficiency and the availability of genotype-to-phenotype mapping methods. In this contribution, we facilitate the practical use of the recently published 'NOIA' quantitative framework by providing an implementation of linear and multilinear regressions, change of reference operation and genotype-to-phenotype mapping in a package ('noia') for the software R, and we discuss theoretical and practical benefits evolutionary and quantitative geneticists may find in using proper modeling strategies to quantify the effects of genes.

Figures

**Figure 1. Illustration of data formatting.**
Part a provides an example of a data set in which the genotype of each individual is either fully known or totally unknown (and treated as missing data); 1 and 3 stand for the homozygotes (e.g. ‘AA’ and ‘aa’) and 2 for the heterozygote. Part b illustrates a second kind of data set in which the genotypes are defined by their probabilities. In this example, part b is the exact equivalent of part a (so the probability of each ‘known’ genotype is always 1), but in practice, especially when the data result from a Haley-Knott regression, the probabilities, computed from the genotypes at flanking markers, may be intermediate. Missing values (‘NA’) are allowed in type a data sets, and are replaced by genotypic probabilities equal to the genotypic frequencies in the rest of the population (here, close to 0.25, 0.5, and 0.25 since the population is an F₂). The Z matrix used for the regression (equation 5) is computed from a ‘type b’ data set, meaning that if ‘type a’ data are provided, they are converted into ‘type b’ before the genetic regression.
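The conversion from ‘type a’ to ‘type b’ data described in this caption can be sketched as follows. This is a minimal illustration in plain Python, not the code of the ‘noia’ package; the function name `to_probabilities` and the use of `None` for ‘NA’ are assumptions made for the example.

```python
from collections import Counter

GENOTYPES = (1, 2, 3)  # 1 and 3: homozygotes, 2: heterozygote


def to_probabilities(genotypes):
    """Turn 'type a' codes (1, 2, 3, or None for NA) into 'type b' rows.

    Each row holds the probabilities of genotypes 1, 2 and 3 for one individual.
    """
    known = [g for g in genotypes if g is not None]
    if not known:
        raise ValueError("at least one known genotype is required")
    counts = Counter(known)
    # Genotypic frequencies among the individuals whose genotype is known.
    freqs = tuple(counts[g] / len(known) for g in GENOTYPES)
    rows = []
    for g in genotypes:
        if g is None:
            # Missing genotype: use the frequencies in the rest of the population.
            rows.append(freqs)
        else:
            # Known genotype: probability 1 for the observed class.
            rows.append(tuple(1.0 if g == k else 0.0 for k in GENOTYPES))
    return rows


locus = [1, 2, 2, 3, None, 2, 1, 3, 2]  # hypothetical F2-like sample
for row in to_probabilities(locus):
    print(row)
```

In this toy sample the known genotypes happen to occur in exact 1:2:1 proportions, so the missing individual receives the probabilities 0.25, 0.5 and 0.25.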

**Figure 2. Accuracy of GP map predictions.**
The estimates of genotypic values, as well as their 95% confidence intervals, are shown for two different two-locus genotype-phenotype (GP) maps (a: no epistasis, b: multilinear epistasis). Results are derived from simulated F₂ populations of size N = 200 (the script is provided in the Appendix). Predictions are satisfactory, except if the model cannot handle the complexity of the map (marginal effect model on an epistatic map). Confidence intervals are smaller when the genotypic value is estimated from a frequent genotype in the population (the most frequent genotype in an F₂ being 22), and when the model has fewer degrees of freedom (such as in one-locus models). The 95% confidence intervals are computed from the standard error (SE) as estimate ± 1.96 × SE.
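The confidence-interval rule in this caption amounts to a normal approximation around each estimate. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def confidence_interval(estimate, se, z=1.96):
    """Return the (lower, upper) bounds of estimate ± z × SE (z = 1.96 for 95%)."""
    half_width = z * se
    return (estimate - half_width, estimate + half_width)


# Hypothetical genotypic value of 10.0 estimated with a standard error of 0.5.
lower, upper = confidence_interval(10.0, 0.5)
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```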

**Figure 3. Impact of the quality of the data set on the results.**
The effect of the population size and the proportion of missing data on the quality of the results is illustrated by the standard deviation of the 2-locus GP map estimates. The amplitude of the uncertainty changes with the genotype considered: the more frequent a genotype is in the F₂ population, the better the estimate of its genotypic value. The results for the ‘best’ genotype (i.e. the fully heterozygous (‘htz’) genotype 22) and one of the ‘worst’ ones (fully homozygous (‘hmz’) 11) are displayed. a: improvement in the precision of the GP map when the size of the population under study is increased. b: effects of (randomly) replacing genotypic information (2 loci, N = 500) by missing data. In this example (*Var*(e) = 1, additive GP map), fairly good estimates of the genotypic values in a 2-locus GP map require N > 400, and these estimates appear to be quite robust to missing data. The corresponding script is available in the Appendix.
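The reason genotype 22 is estimated better than genotype 11 can be made concrete with a rough calculation. The sketch below assumes unlinked loci, no missing data, and a full model in which each genotypic value is estimated by the mean of the individuals carrying that genotype, so that its standard error is about sqrt(*Var*(e) / (N × frequency)). The function names are hypothetical, and this is not the simulation script of the Appendix.

```python
import math
from itertools import product

# Expected one-locus genotype frequencies in an F2 (1 and 3: homozygotes).
F2_ONE_LOCUS = {1: 0.25, 2: 0.5, 3: 0.25}


def f2_frequencies(n_loci):
    """Expected multilocus genotype frequencies in an F2, assuming unlinked loci."""
    freqs = {}
    for combo in product(F2_ONE_LOCUS, repeat=n_loci):
        label = "".join(str(g) for g in combo)
        freqs[label] = math.prod(F2_ONE_LOCUS[g] for g in combo)
    return freqs


def approx_se(var_e, n, freq):
    """Approximate SE of a genotype mean based on n * freq individuals."""
    return math.sqrt(var_e / (n * freq))


freqs = f2_frequencies(2)
for genotype in ("22", "11"):
    se = approx_se(1.0, 500, freqs[genotype])
    print(genotype, freqs[genotype], round(se, 3))
```

Under these assumptions the double heterozygote is four times as frequent as a double homozygote, so its standard error is about half as large at any given N.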

**Figure 4. Computational resource requirements.**
The complexity of the models increases with the number of loci. a) Time necessary for the linear regression, with full and marginal-effect models. The test was performed on a single AMD Athlon 4000+ processor, with the standard R software for Linux (32-bit) and its profiling module (Rprof). Multilinear regression (not shown) is always slower than the corresponding linear regression, since the linear regression is performed first to estimate the starting values. b) Increase of the S matrix size with the number of loci. The S matrix is the largest element in the model, and its size is proportional to the memory necessary to run the program. With a modern desktop PC, it is possible to run regressions with up to 10 loci, which is probably beyond the number of genes that can be located in a regular experimental procedure.
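The growth of the S matrix can be illustrated with a small sketch. It assumes that the multilocus S matrix of a full model is the Kronecker product of 3 × 3 one-locus matrices, as in the NOIA framework, and it uses the one-locus matrix of the F₂ statistical model as recalled from the NOIA papers; both should be checked against the original publications. The helper names are hypothetical, and a model that omits interaction terms would use only a subset of the columns.

```python
# One-locus S matrix for an F2 (rows: genotypes 1, 2, 3;
# columns: reference point, additive effect, dominance effect). Assumed values.
S_F2 = [
    [1.0, -1.0, -0.5],
    [1.0, 0.0, 0.5],
    [1.0, 1.0, -0.5],
]


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def full_s_matrix(n_loci):
    """Full-model S matrix for n_loci loci, built one locus at a time."""
    s = [[1.0]]
    for _ in range(n_loci):
        s = kron(S_F2, s)
    return s


def s_matrix_entries(n_loci):
    """Number of entries of the full S matrix: (3^n_loci) squared."""
    return (3 ** n_loci) ** 2


s2 = full_s_matrix(2)
print(len(s2), "x", len(s2[0]))  # dimensions for two loci
for n in range(1, 6):
    print(n, "loci:", s_matrix_entries(n), "entries")
```

Each additional locus multiplies both dimensions by three, so the number of entries grows ninefold per locus in the full model.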

**Figure 5. Illustration of the consequences of reducing the complexity of GP maps.**
An F₂ population (size N = 500, *Var*(e) = 1) has been simulated from an arbitrary 2-locus, 2-allele (a and A at the first locus, b and B at the other one) GP map (panel a). The inference of the GP map from this population with different regression options is displayed in panels b to f (see the Appendix for the corresponding R script). b) Full model (9 parameters), explains 77.7% of the total phenotypic variance; c) multilinear model (6 parameters, 74.3%); d) no dominance (i.e. only additive and additive-by-additive interactions) (4 parameters, 55.9%); e) no epistasis (5 parameters, 70.8%); f) additive effects only (3 parameters, 54.9%). The full model always performs best (results identical to the actual GP map except for sampling effects). The relative performance of the other models obviously depends on the shape of the actual GP map. If the decomposition is orthogonal, a model selection procedure can be performed to make a rational choice among all possible models.
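The parameter counts quoted for the linear models in this caption (9, 4, 5 and 3) follow from enumerating the genetic effects that each model retains. The sketch below is one way to do that enumeration; the labels (‘e’ for the reference point, ‘a’ for additive, ‘d’ for dominance) and the function name are hypothetical and are not the naming used by the ‘noia’ package. The multilinear model is not covered, because it is not a simple subset of the linear effects.

```python
from itertools import product


def effect_terms(n_loci, dominance=True, epistasis=True):
    """List the genetic effects kept in a linear model.

    Each term is a string with one symbol per locus:
    'e' (no effect at that locus), 'a' (additive) or 'd' (dominance).
    """
    symbols = "ead" if dominance else "ea"
    terms = []
    for combo in product(symbols, repeat=n_loci):
        n_active = sum(1 for s in combo if s != "e")
        # Without epistasis, keep only the reference and single-locus effects.
        if not epistasis and n_active > 1:
            continue
        terms.append("".join(combo))
    return terms


print(len(effect_terms(2)))                                   # full model
print(len(effect_terms(2, dominance=False)))                  # no dominance
print(len(effect_terms(2, epistasis=False)))                  # no epistasis
print(len(effect_terms(2, dominance=False, epistasis=False)))  # additive only
```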

Cited by

Genetic architecture of tameness in a rat model of animal domestication.
Albert FW, Carlborg O, Plyusnina I, Besnier F, Hedwig D, Lautenschläger S, Lorenz D, McIntosh J, Neumann C, Richter H, Zeising C, Kozhemyakina R, Shchepina O, Kratzsch J, Trut L, Teupser D, Thiery J, Schöneberg T, Andersson L, Pääbo S. Genetics. 2009 Jun;182(2):541-54. doi: 10.1534/genetics.109.102186. PMID: 19363126.
Dissection of the genetic architecture of body weight in chicken reveals the impact of epistasis on domestication traits.
Le Rouzic A, Alvarez-Castro JM, Carlborg O. Genetics. 2008 Jul;179(3):1591-9. doi: 10.1534/genetics.108.089300. PMID: 18622035.
Monotonicity is a key feature of genotype-phenotype maps.
Gjuvsland AB, Wang Y, Plahte E, Omholt SW. Front Genet. 2013 Nov 7;4:216. doi: 10.3389/fgene.2013.00216. PMID: 24223579.
Mapping QTLs with main and epistatic effects underlying grain yield and heading time in soft winter wheat.
Reif JC, Maurer HP, Korzun V, Ebmeyer E, Miedaner T, Würschum T. Theor Appl Genet. 2011 Jul;123(2):283-92. doi: 10.1007/s00122-011-1583-y. PMID: 21476040.
Directionality of epistasis in a murine intercross population.
Pavlicev M, Le Rouzic A, Cheverud JM, Wagner GP, Hansen TF. Genetics. 2010 Aug;185(4):1489-505. doi: 10.1534/genetics.110.118356. PMID: 20516493.
