Nullomers and High Order Nullomers in Genomic Sequences
- PMID: 27906971
- PMCID: PMC5132333
- DOI: 10.1371/journal.pone.0164540
Abstract
A nullomer is an oligomer that does not occur as a subsequence in a given DNA sequence, i.e. it is an absent word of that sequence. The importance of nullomers in several applications, from drug discovery to forensic practice, is now debated in the literature. Here, we investigated the nature of nullomers: whether their absence from genomes has a purely statistical explanation or is a peculiar feature of genomic sequences. We introduced an extension of the notion of nullomer, namely high order nullomers, which are nullomers whose mutated sequences are still nullomers. We studied different aspects of them: comparison with nullomers of random sequences, CpG distribution and mean helical rise. In agreement with previous results, we found that the number of nullomers in the human genome is much larger than expected by chance. Nevertheless, antithetical results were found when considering a random DNA sequence preserving dinucleotide frequencies. The analysis of CpG frequencies in nullomers and high order nullomers revealed, as expected, a high CpG content, but it also highlighted a strong dependence of CpG frequencies on the dinucleotide position, suggesting that nullomers have their own peculiar structure and are not simply sequences whose CpG frequency is biased. Furthermore, phylogenetic trees were built on eleven species based on both the similarities between the dinucleotide frequencies and the number of nullomers two species share, showing that nullomers are fairly conserved among closely related species. Finally, the study of the mean helical rise of nullomer sequences revealed significantly high mean rise values, reinforcing the hypothesis that those sequences have some peculiar structural features. The obtained results show that nullomers are the consequence of the peculiar structure of DNA (also including biased CpG frequency and CpG islands), so that the hypermutability model, even when taking CpG islands into account, seems insufficient to explain the nullomer phenomenon. Finally, high order nullomers could emphasize those features that already make simple nullomers useful in several applications.
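The two definitions in the abstract can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' method: only the given strand is scanned (no reverse complement), "mutated sequences" is read as all single-base substitutions (a first-order reading; the abstract does not specify the mutation model), and the function names are hypothetical. Enumerating all 4^k words is feasible only for small k.

```python
from itertools import product

ALPHABET = "ACGT"


def observed_kmers(seq, k):
    """Return the set of length-k words that occur in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def nullomers(seq, k):
    """Return all length-k words over ACGT that are absent from seq."""
    all_words = {"".join(p) for p in product(ALPHABET, repeat=k)}
    return all_words - observed_kmers(seq, k)


def single_substitutions(word):
    """Yield every word that differs from `word` at exactly one position."""
    for i, base in enumerate(word):
        for alt in ALPHABET:
            if alt != base:
                yield word[:i] + alt + word[i + 1:]


def first_order_nullomers(seq, k):
    """Nullomers whose single-base mutants are all nullomers as well.

    Assumption: this single-substitution rule is one possible reading of
    "high order nullomer"; the source abstract does not define it further.
    """
    absent = nullomers(seq, k)
    return {w for w in absent
            if all(m in absent for m in single_substitutions(w))}
```

For the toy sequence `AAAA` with k = 2, every word except `AA` is a nullomer, but a word such as `AC` does not qualify under the single-substitution rule because one mutation turns it into the observed word `AA`.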
Conflict of interest statement
The authors have declared that no competing interests exist.