Beginner's Guide on the Use of PAML to Detect Positive Selection
- PMID: 37096789
- PMCID: PMC10127084
- DOI: 10.1093/molbev/msad041
Abstract
The CODEML program in the PAML package has been widely used to analyze protein-coding gene sequences to estimate the synonymous and nonsynonymous rates (dS and dN) and to detect positive Darwinian selection driving protein evolution. For users not familiar with molecular evolutionary analysis, the program is known to have a steep learning curve. Here, we provide a step-by-step protocol to illustrate the commonly used tests available in the program, including the branch models, the site models, and the branch-site models, which can be used to detect positive selection driving adaptive protein evolution affecting particular lineages of the species phylogeny, affecting a subset of amino acid residues in the protein, and affecting a subset of sites along prespecified lineages, respectively. A data set of the myxovirus (Mx) genes from ten mammal and two bird species is used as an example. We discuss a new feature in CODEML that allows users to perform positive selection tests for multiple genes for the same set of taxa, as is common in modern genome-sequencing projects. The PAML package is distributed at https://github.com/abacus-gene/paml under the GNU license, with support provided at its discussion site (https://groups.google.com/g/pamlsoftware). Data files used in this protocol are available at https://github.com/abacus-gene/paml-tutorial.
Keywords: dN/dS; PAML; adaptive evolution; nonsynonymous substitutions; positive selection; synonymous substitutions.
© The Author(s) 2023. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution.
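As a rough illustration of how the three classes of test named in the abstract differ in practice, the sketch below writes minimal CODEML control files from Python. The option names (`seqfile`, `treefile`, `outfile`, `seqtype`, `CodonFreq`, `model`, `NSsites`, `fix_omega`, `omega`) and their values follow the PAML manual; the helper `make_ctl`, the dictionary `MODEL_SETTINGS`, the choice of `CodonFreq = 2`, and all file names are assumptions made for this example and are not taken from the protocol itself. It is not a complete control file, and the settings recommended in the tutorial repository may differ.

```python
# Settings for the three model classes named in the abstract, using option
# values from the PAML manual. The dictionary name is hypothetical.
MODEL_SETTINGS = {
    "branch": {"model": "2", "NSsites": "0"},        # omega varies among labelled branches
    "site": {"model": "0", "NSsites": "0 1 2 7 8"},  # site models M0, M1a, M2a, M7, M8
    "branch-site": {"model": "2", "NSsites": "2"},   # branch-site model A
}


def make_ctl(kind, seqfile, treefile, outfile, fix_omega_at_one=False):
    """Return the text of a minimal CODEML control file (hypothetical helper).

    Set fix_omega_at_one=True to fix omega = 1, as in the null hypothesis
    of the branch-site test.
    """
    options = {
        "seqfile": seqfile,
        # Branch and branch-site models need foreground branches
        # labelled in the tree file (for example with #1).
        "treefile": treefile,
        "outfile": outfile,
        "seqtype": "1",     # 1 = codon sequences
        "CodonFreq": "2",   # 2 = F3x4 codon frequencies (an assumed choice)
        **MODEL_SETTINGS[kind],
        "fix_omega": "1" if fix_omega_at_one else "0",
        "omega": "1" if fix_omega_at_one else "0.5",  # fixed value or initial value
    }
    return "\n".join(f"{key:>10} = {value}" for key, value in options.items()) + "\n"


if __name__ == "__main__":
    # Hypothetical file names; replace with the alignment and tree in use.
    print(make_ctl("branch-site", "Mx_aln.phy", "Mx_labelled.tree", "out_bsA.txt"))
```

Running the same call with `fix_omega_at_one=True` gives the matching null model, so the two log-likelihoods reported by CODEML can then be compared in a likelihood ratio test.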