BMC Bioinformatics. 2019 Mar 20;20(1):148.
doi: 10.1186/s12859-019-2703-y.

FitTetra 2.0 - improved genotype calling for tetraploids with multiple population and parental data support

Konrad Zych et al. BMC Bioinformatics.

Abstract

Background: Genetic studies in tetraploids lag behind those in diploids because the complex genetics of tetraploids requires much more elaborate computational methods. Recent advances in molecular techniques and computational tools enable new methods for automated, high-throughput genotype calling in tetraploid species. We report an upgrade of the widely used fitTetra software that aims to improve its accuracy, which to date has been hampered by technical artefacts in the data.

Results: Our upgrade of the fitTetra package is designed to model complex collections of samples more accurately. The package fits a mixture model in which some parameters are estimated separately for each sub-collection. When a full-sib family is analyzed, we use the parental genotypes to predict the expected segregation of allele dosages in the offspring. More accurate modelling and the use of parental data increase the accuracy of dosage calling. We tested the package on data obtained with an Affymetrix Axiom 60k array and compared its performance with that of the original version and of the recently published ClusterCall tool, showing that at least 20% more SNPs could be called with our updated software.
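
The idea of sharing component parameters across sub-collections while giving each sub-collection its own mixing proportions can be illustrated with a minimal sketch. This is not the fitTetra implementation or its API: the plain Gaussian components on an untransformed signal ratio, the function names, and all numeric values below are assumptions chosen only for illustration.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def dosage_posteriors(ratio, means, sd, mix):
    """Posterior probability of each dosage class (0..4) for one sample."""
    weighted = [w * normal_pdf(ratio, m, sd) for w, m in zip(mix, means)]
    total = sum(weighted)
    return [w / total for w in weighted]

def call_dosage(ratio, means, sd, mix):
    """Return the dosage class with the highest posterior probability."""
    post = dosage_posteriors(ratio, means, sd, mix)
    return max(range(len(post)), key=post.__getitem__)

# Hypothetical component parameters, shared by all sub-collections.
MEANS = [0.05, 0.28, 0.50, 0.72, 0.95]
SD = 0.06

# Hypothetical mixing proportions, one vector per sub-collection.
MIX_PANEL = [0.10, 0.25, 0.30, 0.25, 0.10]       # unstructured panel
MIX_FAMILY = [0.0, 0.0, 1 / 6, 4 / 6, 1 / 6]     # duplex x quadruplex family

ratio = 0.38  # a signal ratio lying between the dosage 1 and dosage 2 components
print("panel call: ", call_dosage(ratio, MEANS, SD, MIX_PANEL))
print("family call:", call_dosage(ratio, MEANS, SD, MIX_FAMILY))
```

With these made-up numbers the same signal ratio is assigned to different dosage classes in the two sub-collections, because dosage 1 has zero prior weight in the family. This is the mechanism by which population-specific mixing proportions can change a call.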

Conclusion: Our updated software package shows clearly improved genotype calling accuracy. Estimation of the mixing proportions of the underlying dosage distributions is handled separately for full-sib families (where the mixing proportions can be estimated from the parental dosages and the inheritance model) and unstructured populations (where they are based on the assumption of Hardy-Weinberg equilibrium). Additionally, because the distributions of signal ratios of the dosage classes can be assumed to be the same for all populations, including parental data for some subpopulations also helps to improve the fit for other populations. The R package fitTetra 2.0 is freely available under the GNU Public License as an Additional file with this article.
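
For an unstructured population, the Hardy-Weinberg assumption ties all five mixing proportions to a single allele frequency. A minimal sketch, assuming random mating in an autotetraploid with no double reduction, so that the dosage follows a binomial distribution with four trials; the function name is hypothetical and not part of fitTetra:

```python
from math import comb

def hwe_mixing_proportions(p, ploidy=4):
    """Expected frequency of each dosage class (0..ploidy) under
    Hardy-Weinberg equilibrium, given the frequency p of the counted allele.

    Assumes dosage ~ Binomial(ploidy, p), i.e. no double reduction.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return [comb(ploidy, d) * p**d * (1.0 - p) ** (ploidy - d)
            for d in range(ploidy + 1)]

# Example: an allele frequency of 0.5 gives the symmetric 1:4:6:4:1 pattern.
print(hwe_mixing_proportions(0.5))
```

Under this assumption only one free parameter (the allele frequency) has to be estimated for the mixing proportions of a panel, rather than four.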

Keywords: Autotetraploids; Genomics; Genotype calling; Genotyping; Polyploids; fitPoly.

Conflict of interest statement

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Histogram of genotype calling for probe_1028 (from the fitTetra 2.0 test dataset), an example where the results of fitTetra 2.0 differ from those of fitTetra 1.0. Two populations are present: an association panel labeled “PANEL” and a full-sib (FS) family with two parental genotypes (both replicated). The association panel is assumed to be in Hardy-Weinberg equilibrium. Right: Calling with fitTetra 2.0 (using parental genotype data). The parental genotypes are set to duplex (2) and quadruplex (4), leading to the segregation pattern 0:0:1:4:1, which is shown in the upper panel. Left: Calling with fitTetra 1.0. Without information on population structure, the parental samples are genotyped as simplex (1) and triplex (3). Such a combination should result in a 0:1:2:1:0 pattern in the FS family. However, if we consider only the samples from the FS family, the calling results suggest a 0:1:4:1:0 pattern, which does not match any parental combination.
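
The segregation patterns quoted in this caption can be checked with a short sketch. It assumes tetrasomic inheritance with random chromosome pairing and no double reduction; the function names are hypothetical and are not taken from fitTetra.

```python
from fractions import Fraction
from itertools import combinations, product

PLOIDY = 4

def gamete_distribution(dosage):
    """Probability that a diploid gamete carries 0, 1 or 2 copies of the
    counted allele, for a tetraploid parent with the given dosage."""
    chromosomes = [1] * dosage + [0] * (PLOIDY - dosage)
    counts = [0, 0, 0]
    for pair in combinations(chromosomes, 2):  # 6 equally likely pairs
        counts[sum(pair)] += 1
    total = sum(counts)
    return [Fraction(c, total) for c in counts]

def offspring_segregation(dosage_p1, dosage_p2):
    """Expected frequency of offspring dosages 0..4 for one parental pair."""
    g1 = gamete_distribution(dosage_p1)
    g2 = gamete_distribution(dosage_p2)
    seg = [Fraction(0)] * (PLOIDY + 1)
    for (d1, f1), (d2, f2) in product(enumerate(g1), enumerate(g2)):
        seg[d1 + d2] += f1 * f2
    return seg

def matching_parents(pattern):
    """Parental dosage pairs whose expected segregation equals a pattern
    written like '0:1:2:1:0'."""
    parts = [int(x) for x in pattern.split(":")]
    target = [Fraction(x, sum(parts)) for x in parts]
    return [(p1, p2)
            for p1 in range(PLOIDY + 1)
            for p2 in range(p1, PLOIDY + 1)
            if offspring_segregation(p1, p2) == target]

print(matching_parents("0:0:1:4:1"))  # duplex x quadruplex
print(matching_parents("0:1:2:1:0"))  # simplex x triplex
print(matching_parents("0:1:4:1:0"))  # no parental pair under this model
```

Under these assumptions, enumerating all parental dosage pairs finds no pair that yields 0:1:4:1:0, which agrees with the caption's statement that this pattern does not match any parental combination.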
