Int J Mol Sci. 2016 Jun 28;17(7):1020.
doi: 10.3390/ijms17071020.

Computational Identification of the Paralogs and Orthologs of Human Cytochrome P450 Superfamily and the Implication in Drug Discovery

Shu-Ting Pan et al. Int J Mol Sci.

Abstract

The human cytochrome P450 (CYP) superfamily, which consists of 57 functional genes, is the most important group of Phase I drug-metabolizing enzymes. These enzymes oxidize a large number of xenobiotics and endogenous compounds, including therapeutic drugs and environmental toxicants. The CYP superfamily has expanded through gene duplication, and some of the duplicated genes have become pseudogenes through mutation. Orthologs and paralogs are homologous genes resulting from speciation or duplication, respectively. To explore the evolutionary and functional relationships of human CYPs, we conducted this bioinformatic study to identify their paralogs, homologs, and orthologs, and we discuss the implications for CYP function, drug discovery, and evolutionary biology. GeneCards and Ensembl were used to identify the paralogs of human CYPs. A panel of online databases was used to identify the orthologs of human CYP genes: NCBI, Ensembl Compara, GeneCards, OMA ("Orthologous MAtrix") Browser, PANTHER, TreeFam, EggNOG, and Roundup. The results from GeneCards and Ensembl show that the number of paralogs and orthologs varies among human CYPs. For example, the paralogs of CYP2A6 include CYP2A7, 2A13, 2B6, 2C8, 2C9, 2C18, 2C19, 2D6, 2E1, 2F1, 2J2, 2R1, 2S1, 2U1, and 2W1; CYP11A1 has six paralogs: CYP11B1, 11B2, 24A1, 27A1, 27B1, and 27C1; CYP51A1 has only three paralogs: CYP26A1, 26B1, and 26C1; and CYP20A1 has no paralog. The majority of human CYPs are well conserved from plants, amphibians, fishes, or mammals to humans because of their important functions in physiology and xenobiotic disposition. The data from the different approaches were cross-validated and, where experimental data were available, checked against them. These findings facilitate our understanding of the evolutionary relationships and functional implications of the human CYP superfamily in drug discovery.

Keywords: bioinformatics; comparative genomics; drug metabolism; homolog; human CYP; ortholog; paralog.
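
The abstract's distinction between orthologs (separated by speciation) and paralogs (separated by duplication) can be illustrated with a minimal Python sketch. It classifies a pair of genes by the event type at their last common ancestor in a gene tree. The `Node` class, the `classify` function, and the four-leaf tree are hypothetical and chosen only for illustration; the topology is an assumption, not taken from the Ensembl gene trees described in the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A gene-tree node. Internal nodes carry an event label; leaves do not."""
    name: str
    event: Optional[str] = None  # "speciation" or "duplication" (internal nodes only)
    children: List["Node"] = field(default_factory=list)


def path_to(root: Node, leaf_name: str) -> Optional[List[Node]]:
    """Return the list of nodes from root down to the named leaf, or None."""
    if not root.children:
        return [root] if root.name == leaf_name else None
    for child in root.children:
        sub = path_to(child, leaf_name)
        if sub is not None:
            return [root] + sub
    return None


def classify(root: Node, gene_a: str, gene_b: str) -> str:
    """Label two distinct genes by the event at their last common ancestor."""
    if gene_a == gene_b:
        raise ValueError("need two distinct genes")
    path_a, path_b = path_to(root, gene_a), path_to(root, gene_b)
    if path_a is None or path_b is None:
        raise KeyError("gene not found in tree")
    lca = None
    for x, y in zip(path_a, path_b):
        if x is y:
            lca = x  # still on the shared part of both paths
        else:
            break
    return "orthologs" if lca.event == "speciation" else "paralogs"


# Assumed toy topology: one ancestral duplication, then a human/mouse
# speciation in each of the two resulting lineages.
toy_tree = Node("root", "duplication", [
    Node("1A1_lineage", "speciation",
         [Node("human_CYP1A1"), Node("mouse_Cyp1a1")]),
    Node("1A2_lineage", "speciation",
         [Node("human_CYP1A2"), Node("mouse_Cyp1a2")]),
])

print(classify(toy_tree, "human_CYP1A1", "mouse_Cyp1a1"))
print(classify(toy_tree, "human_CYP1A1", "human_CYP1A2"))
```

Under this toy topology, genes in different species can still be paralogs if their lineages split at the duplication node rather than at a speciation node.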

Figures

Figure 1
(A) Alignment of 57 human CYP proteins retrieved from Swiss-Prot. The multiple sequence alignment was carried out using Clustal W v2.0; (B) Phylogenetic tree of human CYPs, from which the evolutionary relationships among human CYPs can be inferred; (C) Conserved motifs in human CYP proteins, identified using MEME (Multiple EM for Motif Elicitation) version 4.10.1.
Figure 2
Gene tree for human CYP1A1, 1A2, 1B1, 17A1, and 21A2 built using Ensembl 84. These five genes are paralogs of each other, derived from the same ancestral gene via duplication events. The gene tree includes a total of 537 genes from various species. It contains 370 speciation nodes, 143 duplication nodes, 21 ambiguous nodes, and 2 gene split events.
Figure 3
Gene tree for human CYP2A6, 2A7, 2A13, 2B6, 2C8, 2C9, 2C19, 2D6, 2D7, 2E1, 2F1, 2J2, 2R1, 2S1, 2U1, and 2W1 built using Ensembl 84. These CYP2 family genes are paralogs of each other, derived from the same ancestral gene via duplication events. The gene tree includes a total of 1254 genes from various species. It contains 741 speciation nodes, 483 duplication nodes, 29 ambiguous nodes, and no gene split events.
Figure 4
Gene tree for human CYP3A4, 3A5, 3A7, 3A43, 4A11, 4A22, 4B1, 4F2, 4F3, 4F8, 4F11, 4F12, 4F22, 4V2, 4X1, 4Z1, 5A1/TBXAS1, and 46A1 built using Ensembl 84. These CYP3, 4, 5, and 46 family genes are paralogs of each other, derived from the same ancestral gene via duplication events. The gene tree includes a total of 1008 genes from various species. It contains 558 speciation nodes, 384 duplication nodes, 31 ambiguous nodes, and 4 gene split events.
Figure 5
Gene tree for human CYP7A1, 7B1, 8A1/PTGIS, 8B1, and 39A1 built using Ensembl 84. These CYP7, 8, and 39 family genes are paralogs of each other, derived from the same ancestral gene via duplication events. The gene tree includes a total of 340 genes from various species. It contains 287 speciation nodes, 41 duplication nodes, 10 ambiguous nodes, and 1 gene split event.
Figure 6
Gene tree for human CYP11A1, 11B1, 11B2, 24A1, 27A1, 27B1, and 27C1 built using Ensembl 84. These CYP11, 24, and 27 family genes are paralogs of each other, derived from the same ancestral gene via duplication events. The gene tree includes a total of 410 genes from various species. It contains 344 speciation nodes, 52 duplication nodes, 13 ambiguous nodes, and no gene split events.
Figure 7
Gene tree for human CYP26A1, 26B1, 26C1, and 51A1 built using Ensembl 84. These CYP26 and CYP51 family genes are paralogs of each other, derived from the same ancestral gene via duplication events. The gene tree includes a total of 260 genes from various species. It contains 232 speciation nodes, 12 duplication nodes, 15 ambiguous nodes, and no gene split events.
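
The node counts given in the captions of Figures 2 to 7 can be compared side by side with a short Python sketch. The counts below are copied from the captions; the "duplication fraction" (duplication nodes divided by speciation plus duplication nodes) and the function name `duplication_fraction` are assumptions introduced here as an illustrative summary, not a statistic reported by the authors.

```python
# Node counts per gene tree, copied from the captions of Figures 2-7:
# (total genes, speciation nodes, duplication nodes, ambiguous nodes, gene splits)
GENE_TREES = {
    "Figure 2": (537, 370, 143, 21, 2),
    "Figure 3": (1254, 741, 483, 29, 0),
    "Figure 4": (1008, 558, 384, 31, 4),
    "Figure 5": (340, 287, 41, 10, 1),
    "Figure 6": (410, 344, 52, 13, 0),
    "Figure 7": (260, 232, 12, 15, 0),
}


def duplication_fraction(speciation: int, duplication: int) -> float:
    """Share of unambiguous internal nodes that are duplications (illustrative metric)."""
    return duplication / (speciation + duplication)


fractions = {
    figure: duplication_fraction(spec, dup)
    for figure, (_genes, spec, dup, _ambiguous, _splits) in GENE_TREES.items()
}

# Print the trees from most to least duplication-rich.
for figure, frac in sorted(fractions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{figure}: {frac:.3f}")
```

By this measure the trees containing the CYP2 and CYP3/4/5/46 families are the most duplication-rich, while the CYP26/CYP51 tree has the fewest duplications relative to speciations.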