Position-specific evolution in transcription factor binding sites, and a fast likelihood calculation for the F81 model
- PMID: 38269075
- PMCID: PMC10805598
- DOI: 10.1098/rsos.231088
Abstract
Transcription factor binding sites (TFBS), like other DNA sequences, evolve via mutation and selection related to their function. Models of nucleotide evolution describe DNA evolution via single-nucleotide mutation. A stationary vector of such a model is the long-term distribution of nucleotides, which is unchanged under the model. Neutrally evolving sites may have uniform stationary vectors, but one expects that sites within a TFBS instead have stationary vectors that reflect the fitness of the various nucleotides at those positions. We introduce 'position-specific stationary vectors' (PSSVs), the collection of stationary vectors at each site in a TFBS locus, analogous to the position weight matrix (PWM) commonly used to describe TFBS. We infer PSSVs for human TFs using two evolutionary models (Felsenstein 1981 and Hasegawa-Kishino-Yano 1985). We find that PSSVs reflect the nucleotide distribution from PWMs, but with reduced specificity. We infer ancestral nucleotide distributions at individual positions and calculate 'conditional PSSVs' conditioned on specific choices of majority ancestral nucleotide. We find that certain ancestral nucleotides exert a strong evolutionary pressure on neighbouring sequence, while others have a negligible effect. Finally, we present a fast likelihood calculation for the F81 model on moderate-sized trees that makes this approach feasible for large-scale studies of this kind.
Keywords: compensatory mutation; evolution; transcription factor binding site.
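To make the notion of a stationary vector concrete, the following is a minimal sketch of the standard F81 transition probabilities, under which a nucleotide either stays as it is or is redrawn from the stationary vector. It is not the fast likelihood calculation presented in the paper. The function names, the rate parameter `mu` and the example vector `pi` are illustrative assumptions and do not come from the article.

```python
import math

NUCLEOTIDES = "ACGT"


def f81_transition_matrix(pi, t, mu=1.0):
    """Return the 4x4 F81 transition matrix P(t) as nested lists.

    P[i][j] = exp(-mu * t) * [i == j] + (1 - exp(-mu * t)) * pi[j]
    `mu` is an assumed overall rate scaling, not a value from the paper.
    """
    decay = math.exp(-mu * t)
    return [
        [decay * (1.0 if i == j else 0.0) + (1.0 - decay) * pi[j] for j in range(4)]
        for i in range(4)
    ]


def evolve(dist, P):
    """Propagate a nucleotide distribution (row vector) through P."""
    return [sum(dist[i] * P[i][j] for i in range(4)) for j in range(4)]


# Hypothetical stationary vector for one TFBS position that favours A.
pi = [0.7, 0.1, 0.1, 0.1]

# Starting from pi, the distribution is unchanged after any branch length.
P_short = f81_transition_matrix(pi, t=0.5)
stationary_after = evolve(pi, P_short)

# Starting from a site fixed at G, a long branch converges towards pi.
P_long = f81_transition_matrix(pi, t=50.0)
from_g = evolve([0.0, 0.0, 1.0, 0.0], P_long)
```

In this picture, a PSSV would simply be a list of such `pi` vectors, one per position in the binding site.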
© 2024 The Authors.
Conflict of interest statement
We declare we have no competing interests.