BMC Genomics. 2009 Dec 14;10:604.
doi: 10.1186/1471-2164-10-604.

The other side of comparative genomics: genes with no orthologs between the cow and other mammalian species

Raffaele Mazza et al. BMC Genomics. 2009.

Abstract

Background: With the rapid growth in the availability of genome sequence data, the automated identification of orthologous genes between species (orthologs) is of fundamental importance to facilitate functional annotation and studies on comparative and evolutionary genomics. Genes with no apparent orthologs between the bovine and human genomes may be responsible for major differences between the species; however, such genes are often neglected in functional genomics studies.

Results: A BLAST-based method was used to explore the current annotation and orthology predictions in Ensembl. Genes with no orthologs between the two genomes were classified into groups based on alignments, ontology, manual curation and publicly available information. Starting from a high-quality and specific set of orthology predictions, as provided by Ensembl, hidden relationships between genes and genomes of different mammalian species were unveiled using a highly sensitive approach based on sequence similarity and genomic comparison.

Conclusions: The analysis identified 3,801 bovine genes with no orthologs in human and 1,010 human genes with no orthologs in cow, among which 411 and 43 genes, respectively, had no match at all in the other species. Most of the apparently non-orthologous genes may have orthologs that were missed in the annotation process, despite a high percentage of identity, because of differences in gene length and structure. The comparative analysis reported here identified gene variants, new genes and species-specific features, and gave an overview of the other side of orthology, which may help to improve the annotation of the bovine genome and the knowledge of structural differences between species.


Figures

Figure 1
Venn diagram representation of the results obtained from the queries in Ensembl release 50. Each colored circle represents the gene set for one species. a) Query result returning 3,801 cow genes with no orthologs in human, mouse and dog. b) Query result returning 1,010 human genes representing core mammalian orthologs, with orthologs in mouse and dog but none in cow.
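The two queries in this caption amount to set operations over per-species ortholog lists. The following is a minimal sketch of that logic on toy data; the function names, gene identifiers and dictionary layout are illustrative assumptions and do not reproduce the actual Ensembl queries used in the study.

```python
def genes_without_orthologs(all_genes, orthologs_by_species):
    """Genes that have no ortholog in any of the listed species (panel a)."""
    with_any_ortholog = set().union(*orthologs_by_species.values())
    return set(all_genes) - with_any_ortholog


def core_genes_missing_in(target, orthologs_by_species):
    """Genes with an ortholog in every other species but none in `target` (panel b)."""
    others = [set(genes) for species, genes in orthologs_by_species.items()
              if species != target]
    core = set.intersection(*others)
    return core - set(orthologs_by_species[target])


# Toy data (hypothetical identifiers): each set holds the genes of the
# query species that have an ortholog in the named species.
cow_genes = {"cowA", "cowB", "cowC", "cowD"}
cow_orthologs = {"human": {"cowA"}, "mouse": {"cowA", "cowB"}, "dog": set()}
cow_only = genes_without_orthologs(cow_genes, cow_orthologs)

human_orthologs = {
    "cow": {"humA"},
    "mouse": {"humA", "humB", "humC"},
    "dog": {"humA", "humB"},
}
missing_in_cow = core_genes_missing_in("cow", human_orthologs)

print(sorted(cow_only))        # ['cowC', 'cowD']
print(sorted(missing_in_cow))  # ['humB']
```

Panel a is a set difference against the union of all ortholog sets, while panel b intersects the mouse and dog sets first and then removes genes that also have a cow ortholog.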
Figure 2
Schematic representation of the four categories for the aligned genes. Red boxes indicate the aligned transcript, blue boxes indicate exons of annotated genes on the genome. a) potential ortholog, b) new gene, c) gene variant, d) intronic.
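One way to read the four categories is as a decision on how the aligned blocks of a transcript fall relative to annotated genes and their exons. The sketch below encodes that reading; the rules, the `classify` function and the data layout are assumptions made for illustration and are not the authors' published criteria, which the caption does not spell out.

```python
def overlaps(a, b):
    """True if closed intervals a=(start, end) and b=(start, end) share a base."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify(blocks, genes):
    """Assign an alignment to one of the four Figure 2 categories.

    blocks: list of (start, end) aligned segments of the query transcript.
    genes:  list of dicts with "span" (start, end) and "exons" [(start, end), ...].
    Assumed rules (hypothetical):
      - every block hits an exon of one gene  -> "potential ortholog"
      - only some blocks hit exons            -> "gene variant"
      - blocks hit a gene span but no exon    -> "intronic"
      - blocks hit no annotated gene          -> "new gene"
    """
    inside_a_gene = False
    for gene in genes:
        if not any(overlaps(block, gene["span"]) for block in blocks):
            continue
        inside_a_gene = True
        on_exon = [any(overlaps(block, exon) for exon in gene["exons"])
                   for block in blocks]
        if all(on_exon):
            return "potential ortholog"
        if any(on_exon):
            return "gene variant"
    return "intronic" if inside_a_gene else "new gene"


# Toy annotation: one gene with three exons (coordinates are made up).
gene = {"span": (100, 900), "exons": [(100, 200), (400, 500), (800, 900)]}

print(classify([(110, 190), (410, 490)], [gene]))  # potential ortholog
print(classify([(110, 190), (250, 300)], [gene]))  # gene variant
print(classify([(250, 300)], [gene]))              # intronic
print(classify([(1000, 1100)], [gene]))            # new gene
```

A real pipeline would also need strand, coverage and identity thresholds; they are omitted here to keep the category logic visible.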
Figure 3
Statistical box plot representing the distribution of identity percentage and E-value of alignments in "cow vs. human" (a) and "human vs. cow" (b) comparisons.
Figure 4
Tool to display tree-like representation of the GO graphs. Main root for molecular function category is shown for human (a) and bovine (b) libraries. The main navigation features of the web site are indicated by grey boxes.
Figure 5
Web interface to display the results of alignments of the bovine and human libraries. Records can be retrieved through the search panel by Ensembl ID, percentage of identity of the alignment, classification and gene name. For each record, which represents a sequence, the Ensembl ID, the sequence name, the chromosome of the alignment, the percentage of identity, the E-value of the alignment and the classification flag are shown. In addition, a link takes the user to the corresponding region of the alignment in the Ensembl contig view, and the "detail" button expands a detail panel. In this panel the selected alignment is represented graphically to show the coverage and percentage of identity of the sequence as well as the position of the alignment on the genome scaffold, alongside genscan predictions and known gene sequences retrieved directly from the Ensembl database.
