Resolving discrepancy between nucleotides and amino acids in deep-level arthropod phylogenomics: differentiating serine codons in 21-amino-acid models
- PMID: 23185239
- PMCID: PMC3502419
- DOI: 10.1371/journal.pone.0047450
Abstract
Background: In a previous study of higher-level arthropod phylogeny, analyses of nucleotide sequences from 62 protein-coding nuclear genes for 80 panarthropod species yielded significantly higher bootstrap support for selected nodes than did amino acids. This study investigates the cause of that discrepancy.
Methodology/principal findings: The hypothesis is tested that failure to distinguish the serine residues encoded by two disjunct clusters of codons (TCN, AGY) in amino acid analyses leads to this discrepancy. In one test, the two clusters of serine codons (Ser1, Ser2) are conceptually translated as separate amino acids. Analysis of the resulting 21-amino-acid data matrix shows striking increases in bootstrap support, in some cases matching that in nucleotide analyses. In a second approach, nucleotide and 20-amino-acid data sets are artificially altered through targeted deletions, modifications, and replacements, revealing the pivotal contributions of distinct Ser1 and Ser2 codons. We confirm that previous methods of coding nonsynonymous nucleotide change are robust and computationally efficient by introducing two new degeneracy coding methods. We demonstrate for degeneracy coding that neither compositional heterogeneity at the level of nucleotides nor codon usage bias between Ser1 and Ser2 clusters of codons (or their separately coded amino acids) is a major source of non-phylogenetic signal.
Conclusions: The incongruity in support between amino-acid and nucleotide analyses of the aforementioned arthropod data set is resolved by showing that "standard" 20-amino-acid analyses yield lower node support specifically when serine provides crucial signal. Separate coding of Ser1 and Ser2 residues yields support commensurate with that found by degenerated nucleotides, without introducing phylogenetic artifacts. While exclusion of all serine data leads to reduced support for serine-sensitive nodes, these nodes are still recovered in the ML topology, indicating that the enhanced signal from Ser1 and Ser2 is not qualitatively different from that of the other amino acids.
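The 21-amino-acid recoding described in the abstract can be illustrated with a minimal Python sketch. It assumes the standard genetic code and translates TCN codons (Ser1) to the usual "S" while giving AGY codons (Ser2) their own symbol. The function name `translate_21` and the placeholder symbol "J" for Ser2 are hypothetical choices made for illustration; the abstract does not say which symbol or software the authors used.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: "J" is an arbitrary placeholder for the 21st state (Ser2, AGY codons).
SER2_SYMBOL = "J"


def translate_21(nt_seq, ser2_symbol=SER2_SYMBOL):
    """Translate an in-frame nucleotide sequence into a 21-state alphabet.

    Ser1 (TCN) stays "S"; Ser2 (AGT, AGC) becomes `ser2_symbol`.
    Gap codons ("---") become "-"; any other unrecognized codon becomes "X".
    Trailing bases that do not fill a complete codon are ignored.
    """
    seq = nt_seq.upper().replace("U", "T")
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        codon = seq[i:i + 3]
        if codon == "---":
            residues.append("-")
            continue
        aa = STANDARD_CODE.get(codon, "X")
        # Only AGT and AGC translate to "S" among codons starting with "AG".
        if aa == "S" and codon.startswith("AG"):
            aa = ser2_symbol
        residues.append(aa)
    return "".join(residues)


if __name__ == "__main__":
    # TCT (Ser1), AGC (Ser2), ATG (Met)
    print(translate_21("TCTAGCATG"))
```

Under a conventional 20-state translation both serine codons in the example would be written "S", so a Ser1-to-Ser2 change would be invisible to an amino acid analysis; the recoded alignment keeps that difference as a distinct character state.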