mSystems. 2019 Feb 12;4(1):e00290-18.
doi: 10.1128/mSystems.00290-18. eCollection 2019 Jan-Feb.

Orphan Genes Shared by Pathogenic Genomes Are More Associated with Bacterial Pathogenicity


Sarah Entwistle et al. mSystems. 2019.

Abstract

Orphan genes (also known as ORFans [i.e., orphan open reading frames]) are new genes that enable an organism to adapt to its specific living environment. Our focus in this study is to compare ORFans between pathogens (P) and nonpathogens (NP) of the same genus. Using the pangenome idea, we have identified 130,169 ORFans in nine bacterial genera (505 genomes) and classified these ORFans into four groups: (i) SS-ORFans (P), which are only found in a single pathogenic genome; (ii) SS-ORFans (NP), which are only found in a single nonpathogenic genome; (iii) PS-ORFans (P), which are found in multiple pathogenic genomes; and (iv) NS-ORFans (NP), which are found in multiple nonpathogenic genomes. Within the same genus, pathogens do not always have more genes, more ORFans, or more pathogenicity-related genes (PRGs), including prophages, pathogenicity islands (PAIs), virulence factors (VFs), and horizontal gene transfers (HGTs), than nonpathogens. Interestingly, in pathogens of the nine genera, the percentages of PS-ORFans are consistently higher than those of SS-ORFans, which is not true in nonpathogens. Similarly, in pathogens of the nine genera, the percentages of PS-ORFans matching the four types of PRGs are also always higher than those of SS-ORFans, but this is not true in nonpathogens. All of these findings suggest the greater importance of PS-ORFans for bacterial pathogenicity.

IMPORTANCE Recent pangenome analyses of numerous bacterial species have suggested that each genome of a single species may have a significant fraction of its gene content unique or shared by a very few genomes (i.e., ORFans). We selected nine bacterial genera, each containing at least five pathogenic and five nonpathogenic genomes, to compare their ORFans in relation to pathogenicity-related genes. Pathogens in these genera are known to cause a number of common and devastating human diseases such as pneumonia, diphtheria, melioidosis, and tuberculosis. Thus, they are worthy of in-depth systems microbiology investigations, including the comparative study of ORFans between pathogens and nonpathogens. We provide direct evidence to suggest that ORFans shared by more pathogens are more associated with pathogenicity-related genes and thus are more important targets for development of new diagnostic markers or therapeutic drugs for bacterial infectious diseases.
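The four-group classification described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes each gene has already been established as an ORFan and that the set of genomes (within one genus) carrying it is known. The function name `classify_orfan`, the input layout, and the "mixed" fallback for genes present in both pathogenic and nonpathogenic genomes are all hypothetical; the abstract does not say how such genes are treated.

```python
from typing import Dict, Set


def classify_orfan(genomes_with_gene: Set[str], is_pathogen: Dict[str, bool]) -> str:
    """Assign an ORFan to one of the four groups named in the abstract.

    genomes_with_gene: IDs of the genomes (same genus) that carry the gene.
    is_pathogen: hypothetical lookup mapping genome ID -> True for pathogens.
    """
    if not genomes_with_gene:
        raise ValueError("an ORFan must be present in at least one genome")

    # Collect the pathogenicity labels of all genomes carrying the gene.
    labels = {is_pathogen[g] for g in genomes_with_gene}

    if len(genomes_with_gene) == 1:
        # Found in exactly one genome: strain-level (SS) ORFan.
        return "SS-ORFan (P)" if True in labels else "SS-ORFan (NP)"
    if labels == {True}:
        # Found in multiple genomes, all pathogenic.
        return "PS-ORFan (P)"
    if labels == {False}:
        # Found in multiple genomes, all nonpathogenic.
        return "NS-ORFan (NP)"
    # Assumption: genes shared across both classes fall outside the four groups.
    return "mixed"


# Toy genus with two pathogenic and two nonpathogenic genomes.
labels = {"g1": True, "g2": True, "g3": False, "g4": False}
print(classify_orfan({"g1"}, labels))
print(classify_orfan({"g1", "g2"}, labels))
print(classify_orfan({"g3", "g4"}, labels))
```

With the toy labels above, a gene found only in `g1` falls in the SS-ORFans (P) group, one shared by `g1` and `g2` in PS-ORFans (P), and one shared by `g3` and `g4` in NS-ORFans (NP).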

Keywords: ORFan; horizontal gene transfer; orphan gene; pathogenic island; pathogenicity; prophage; virulence factor.


Figures

FIG 1
Pangenome idea to define different groups of ORFan genes and non-ORFan genes.
FIG 2
The percentages of different groups of ORFans. The violin boxplots are shown with genomes represented as dots of different colors corresponding to four groups of ORFans. For each genome, the percentages of different ORFan groups are calculated as follows (using SS-ORFans as the example): % SS-ORFans = no. of SS-ORFans/total no. of proteins in the genome. Four pairs of Wilcoxon tests were performed: (i) SS-ORFans (P) versus SS-ORFans (NP), (ii) PS-ORFans (P) versus NS-ORFans (NP), (iii) PS-ORFans (P) versus SS-ORFans (P), and (iv) NS-ORFans (NP) versus SS-ORFans (NP). Only the statistically significant differences are indicated with vertical lines and asterisks (*). Red asterisks indicate a P value of <0.05, supporting higher SS-ORFans (P) in test pair i, higher PS-ORFans (P) in test pair ii, higher PS-ORFans (P) in test pair iii, and higher NS-ORFans (NP) in test pair iv. Blue asterisks indicate the opposite.
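The per-genome percentage in this caption, and the kind of two-group comparison it refers to, can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code. The caption says only "Wilcoxon tests", so the rank-sum (Mann-Whitney) form used here for comparing two independent sets of genomes is an assumption, and only the U statistic is computed, with no P value. The function names are hypothetical, and the factor of 100 is assumed from the word "percentages".

```python
from typing import Sequence


def orfan_percentage(n_orfans: int, n_proteins: int) -> float:
    """% ORFans = no. of ORFans in the group / total no. of proteins in the genome."""
    if n_proteins <= 0:
        raise ValueError("a genome must contain at least one protein")
    return 100.0 * n_orfans / n_proteins


def mann_whitney_u(xs: Sequence[float], ys: Sequence[float]) -> float:
    """U statistic for xs versus ys: number of (x, y) pairs with x > y, ties count 0.5.

    Assumption: the rank-sum form of the Wilcoxon test; the caption does not
    say which variant was applied.
    """
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Made-up counts for illustration: (no. of SS-ORFans, total no. of proteins).
pathogen_pct = [orfan_percentage(n, total) for n, total in [(50, 2000), (90, 3000)]]
nonpathogen_pct = [orfan_percentage(n, total) for n, total in [(20, 2000), (30, 3000)]]
print(pathogen_pct, nonpathogen_pct)
print(mann_whitney_u(pathogen_pct, nonpathogen_pct))
```

A U value equal to `len(xs) * len(ys)` means every value in the first group exceeds every value in the second; turning U into a P value would need a statistics library or an exact enumeration, which this sketch leaves out.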
FIG 3
More conserved PS-ORFans (but not NS-ORFans) are more likely to be found in prophages and PAIs. The x axis is the number of genera in which an ORFan has blastp hits. (The number is 1 for an ORFan restricted to its own genus.) The y axis is the percentage of ORFans (e.g., the number of ORFans located in prophages divided by the number of ORFans). The detailed numbers are available in Table S4.
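The quantity plotted in this figure can be approximated with a short Python sketch. It assumes each ORFan is recorded as a pair: the number of genera in which it has blastp hits, and a flag for whether it lies in a feature of interest such as a prophage. It also assumes the y-axis denominator is the number of ORFans at that x value, which the caption does not state explicitly. The function name and record layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def percent_in_feature_by_genera(orfans: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Map number of genera with blastp hits -> % of ORFans located in the feature.

    Each record is (n_genera_with_hits, in_feature); n_genera_with_hits is 1
    for an ORFan restricted to its own genus.
    """
    totals: Dict[int, int] = defaultdict(int)
    hits: Dict[int, int] = defaultdict(int)
    for n_genera, in_feature in orfans:
        totals[n_genera] += 1
        if in_feature:
            hits[n_genera] += 1
    # Assumption: the denominator is the ORFan count within each x-axis bin.
    return {n: 100.0 * hits[n] / totals[n] for n in sorted(totals)}


# Made-up records for illustration only.
records = [(1, True), (1, False), (1, False), (1, False), (2, True), (2, True)]
print(percent_in_feature_by_genera(records))
```

The same function could be applied separately to PS-ORFans and NS-ORFans, and to prophages and PAIs, to produce one curve per combination.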


