Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2015;16 Suppl 14(Suppl 14):S3.
doi: 10.1186/1471-2105-16-S14-S3. Epub 2015 Oct 2.

Reconstruction of ancestral gene orders using intermediate genomes

Reconstruction of ancestral gene orders using intermediate genomes

Pedro Feijão. BMC Bioinformatics. 2015.

Abstract

Background: The problem of reconstructing ancestral genomes in a given phylogenetic tree arises in many different comparative genomics fields. Here, we focus on reconstructing the gene order of ancestral genomes, a problem that has been largely studied in the past 20 years, especially with the increasing availability of whole genome DNA sequences. There are two main approaches to this problem: event-based methods, that try to find the ancestral genomes that minimize the number of rearrangement events in the tree; and homology-based, that look for conserved structures, such as adjacent genes in the extant genomes, to build the ancestral genomes.

Results: We propose algorithms that use the concept of intermediate genomes, arising in optimal pairwise rearrangement scenarios. We show that intermediate genomes have combinatorial properties that make them easy to reconstruct, and develop fast algorithms with better reconstructed ancestral genomes than current event-based methods. The proposed framework is also designed to accept extra information, such as results from homology-based approaches, giving rise to combined algorithms with better results than the original methods.

PubMed Disclaimer

Figures

Figure 1
Figure 1
Genome G = {(○ 1 −2 3 ○), (○ 4 5 ○)} with adjacency set A = {○1t, 1h 2h, 2t 3t, 3h ○, ○4t, 4h 5t, 5h○}.
Figure 2
Figure 2
Breakpoint graph BP(A, B) of A = {○1t, 1h 2t, 2h 3t, 3h 4t, 4h ○, ○5t, 5h6t, 6h 7t, 7h ○} and B = {1h 2h, 2t 3h, 3t 4t, 4h 1t, ○6t, 6h 5t, 5h 7h, 7t○}. A-edges are drawn in green, and B-edges in blue.
Figure 3
Figure 3
Circular BP graph BP(A, B) of the genomes in Figure 2, with telomeric adjacencies. A-edges are drawn in green, and B-edges in blue.
Figure 4
Figure 4
Possible ways of applying a DCJ operation on adjacencies pq and rs. (a) The initial BP(A, B); (b) Applying DCJ {pq, rs} → {ps, qr} on genome A; (c) Applying {pq, rs} → {pr, qs} on genome A; The DCJ in (b) is optimal, increasing the number of cycles by one. The DCJ in (c) is not, and introduces crossing edges.
Figure 5
Figure 5
Scenario  S= {A = M0 , M1 , M2 , M3 = B} transforming A (green edges) into B (blue edges) with DCJ operations, shown in BP(A, B). Intermediate genomes Mi are drawn at each step with orange edges. At each step, a DCJ operation is applied on the edges labeled with ∗. Orange edges never cross in the intermediate genomes.
Figure 6
Figure 6
(a) A genome M , in orange, with exactly two crossing edges in BP(A, B), with A in green and B in blue. (b) A DCJ operation {pr, sq} → {pq, rs} on M that uncross the edges breaks one AM-cycle in two, keeping the other BM-cycle, meaning that M is not an intermediate genome, as shown in Lemma 2.
Figure 7
Figure 7
Number of AM- and BM-cycles in an AB-cycle of size 2n with n non-crossing M-edges. A-, B- and M -edges are drawn in green, blue and orange, respectively. (a) An AB-cycle of size 2 has one AM- and one BM-cycle. An AB-cycle of size n, shown in (b), has one more AM-cycle than an AB-cycle of size n − 2, with the same number of BM-cycles, shown in (c).
Figure 8
Figure 8
Solving the Guided Intermediate problem for BP(A, B) with two AB-cycles, and guide G = {pq, rs, uv}. At each step, an adjacency from G is taken, and if accepted its vertices and incident edges are removed, and two new edges are added to complete the resulting two smaller AB-cycles. The problem is then recursively solved on the smaller cycles. (a) Initial BP(A, B), where adjacency pq is accepted. (b) Second adjacency rs is also accepted. (c) The adjacency uv covers two components, because it crosses with a previously added adjacency, and it is rejected. (d) In the end of the algorithm, three extra adjacencies (shown as dotted orange edges) are added for each AB-cycle of size 2, since this is the only choice for a non-crossing edge belonging to an intermediate genome M. The final output is then the adjacency set {pq, rs} plus the three adjacencies corresponding to the dotted orange edges.
Figure 9
Figure 9
Percentage of correct adjacencies found by the Homology algorithms, for each tree diameter.
Figure 10
Figure 10
Percentage of correct adjacencies found by the Distance algorithms, for each tree diameter.
Figure 11
Figure 11
Total DCJ tree length for the distance algorithms.
Figure 12
Figure 12
Total DCJ distance between the simulated and reconstructed ancestors for the distance algorithms.

Similar articles

Cited by

References

    1. Sankoff D, Blanchette M. Multiple genome rearrangement and breakpoint phylogeny. Journal of Computational Biology. 1998;5(3):555–570. doi: 10.1089/cmb.1998.5.555. - DOI - PubMed
    1. Yancopoulos S, Attie O, Friedberg R. Efficient sorting of genomic permutations by translocation, inversion and block interchange. Bioinformatics. 2005;21(16):3340–6. doi: 10.1093/bioinformatics/bti535. - DOI - PubMed
    1. Bergeron A, Mixtacki J, Stoye J. A unifying view of genome rearrangements. Lecture Notes in Computer Science. 2006;4175:163–173. doi: 10.1007/11851561_16. - DOI
    1. Alekseyev MA, Pevzner PA. Breakpoint graphs and ancestral genome reconstructions. Genome Research. 2009;19(5):943–57. doi: 10.1101/gr.082784.108. - DOI - PMC - PubMed
    1. Zheng C, Sankoff D. On the PATHGROUPS approach to rapid small phylogeny. BMC bioinformatics. 2011;12(Suppl 1):4. doi: 10.1186/1471-2105-12-S1-S4. - DOI - PMC - PubMed

Publication types

LinkOut - more resources