BMC Genomics. 2015;16 Suppl 5(Suppl 5):S6.
doi: 10.1186/1471-2164-16-S5-S6. Epub 2015 May 26.

ProCARs: Progressive Reconstruction of Ancestral Gene Orders

Amandine Perrin et al. BMC Genomics. 2015.

Abstract

Background: In ancestral gene order reconstruction from extant genomes, there are two main computational approaches: rearrangement-based methods and homology-based methods. Rearrangement-based methods minimize a total rearrangement distance over the branches of a species tree. Homology-based methods detect a set of potential ancestral contiguity features and then assemble these features into Contiguous Ancestral Regions (CARs).

Results: In this paper, we present a new homology-based method that uses a progressive approach for both the detection of ancestral contiguity features and their assembly into CARs. The method iteratively detects a set of potential ancestral adjacencies, using the current set of CARs at each step, and constructs CARs progressively using a two-phase assembly method.

Conclusion: We show the usefulness of the method through a reconstruction of the boreoeutherian ancestral gene order, and a comparison with three other homology-based methods: AnGeS, InferCARs and GapAdj. The program, written in Python, and the dataset used in this paper are available at http://bioinfo.lifl.fr/procars/.
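The homology-based idea summarized in the abstract (detect candidate ancestral adjacencies, then assemble them into CARs) can be illustrated with a minimal sketch. This is not the ProCARs implementation: the conservation criterion (an adjacency seen in at least two of the genome groups I1, I2 and O), the greedy assembly, the function names and the toy genomes are all assumptions made for illustration. Genomes are taken to be single linear chromosomes written as lists of signed blocks, each block occurring once.

```python
from collections import Counter


def adjacency(left, right):
    """Canonical, orientation-independent form of an adjacency between two
    consecutive signed blocks: the pair of block extremities that touch."""
    a = (abs(left), "h" if left > 0 else "t")    # right end of the left block
    b = (abs(right), "t" if right > 0 else "h")  # left end of the right block
    return frozenset((a, b))


def adjacencies(genome):
    """All adjacencies of one linear chromosome (list of signed blocks)."""
    return {adjacency(x, y) for x, y in zip(genome, genome[1:])}


def conserved_adjacencies(groups, min_groups=2):
    """Map each adjacency to the number of genome groups containing it,
    keeping those seen in at least `min_groups` groups (assumed criterion)."""
    support = Counter()
    for genomes in groups:
        seen = set()
        for genome in genomes:
            seen |= adjacencies(genome)
        support.update(seen)
    return {adj: n for adj, n in support.items() if n >= min_groups}


def assemble(support):
    """Greedily accept adjacencies by decreasing support, skipping any that
    reuse a block extremity or would close a circular CAR."""
    parent = {}  # union-find over blocks: one component per growing CAR

    def find(block):
        parent.setdefault(block, block)
        while parent[block] != block:
            parent[block] = parent[parent[block]]
            block = parent[block]
        return block

    used, accepted = set(), []
    ranked = sorted(support.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    for adj, _ in ranked:
        if len(adj) != 2:  # degenerate adjacency of a block with itself
            continue
        ext1, ext2 = sorted(adj)
        root1, root2 = find(ext1[0]), find(ext2[0])
        if ext1 in used or ext2 in used or root1 == root2:
            continue  # conflicting, or both blocks already in the same CAR
        parent[root1] = root2
        used.update((ext1, ext2))
        accepted.append(adj)
    return accepted


# Toy data only: three groups of genomes around a hypothetical ancestor.
groups = [
    [[1, 2, 3, 4]],                # ingroup I1
    [[1, 2, -3, 4]],               # ingroup I2
    [[1, 2, 3, 4], [2, 1, 3, 4]],  # outgroup O
]
conserved = conserved_adjacencies(groups)
car_adjacencies = assemble(conserved)
```

Representing an adjacency by the two block extremities it joins makes (g h) and (−h −g) compare equal, so the orientation in which a chromosome is read does not matter.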

Figures

Figure 1
Example of a species tree. A species tree on five genomes A, B, C, D, and E. The black-colored ancestral node defines two ingroup sets each composed of a single genome, I1 = {D} and I2 = {E}, and an outgroup set O = {A, B, C}. The conserved adjacencies at the ancestral black-colored node are given.
Figure 2
Diagram of the method steps. Overall description of the steps of the ProCARs method.
Figure 3
Organization of the sets of adjacencies considered in Step a). A tree whose nodes represent sets of conserved adjacencies found at the current step of the method, and edges represent the inclusion relations between the sets: the root of the tree is the set S of all conserved adjacencies. Abbreviations: FS (Fully-conserved adjacencies), PS (Partly-conserved adjacencies), NC (Non-Conflicting), C (Conflicting), R (Retained), and D (Discarded). Non-conflicting sets are represented with square nodes. The sets of non-conflicting adjacencies added at the current step are represented with black-colored nodes. The sets of conflicting adjacencies saved for the next step b) are represented with gray-colored nodes.
Figure 4
Examples of minimum mutation cost labelings of the nodes of a species tree. The left and right trees show two minimum mutation cost labelings of the nodes of the species tree, for the adjacencies (g h) (left labeling) and (g −h) (right labeling) conserved at the ancestral node of the species tree depicted in Figure 1. For each labeling, the edges of the tree with a change of state are drawn as dashed lines. The cost of the left labeling is 3, and the cost of the right labeling is 2.
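A minimum mutation cost for a presence/absence character on a tree can be computed with Fitch's small-parsimony algorithm. The sketch below illustrates that general technique only. The tree topology and leaf states are hypothetical, since the excerpt does not give them, so it does not reproduce the costs of 3 and 2 quoted in the caption. The procedure in the paper may also impose constraints (for instance on the state at the ancestral node) that this unconstrained version ignores.

```python
def fitch(tree, leaf_states):
    """Fitch's algorithm on a rooted binary tree given as nested pairs.

    Returns (set of candidate states at the root, minimum number of edges
    on which the state changes)."""
    if isinstance(tree, str):  # leaf: its observed state, no change needed
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    common = left_set & right_set
    if common:  # children can agree: no extra change at this node
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical topology on the five genomes named in Figure 1.
tree = (("A", ("B", "C")), ("D", "E"))
# Hypothetical states: 1 if the adjacency is present in the genome, else 0.
present = {"A": 0, "B": 1, "C": 0, "D": 1, "E": 1}

root_states, cost = fitch(tree, present)
print(cost)  # 2
```

With these toy states, one change is needed on the edge leading to B and one between the root and the (D, E) subtree.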
Figure 5
Breakpoint distances between the sets of CARs. The label on each edge gives the breakpoint distance between the sets of CARs produced by the two methods at its end nodes.
Figure 6
Distribution of the number of blocks involved in each CAR. For each of the four methods, the number of CARs for which the number of blocks is in a given range is plotted.
Figure 7
Number of adjacencies shared by, or exclusive to, each of the four methods compared. AnGeS has no exclusive adjacency. For example, 635 adjacencies are shared by all the methods, and 16 are shared between AnGeS, InferCARs and ProCARs. For ProCARs and GapAdj (in italics), we also give the number of adjacencies added at each step. For example, there are 15 adjacencies exclusive to GapAdj, of which 9 were added at step 1, 5 at step 2 and 1 at step 3.
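Counts of shared and exclusive adjacencies such as those in Figure 7 amount to partitioning the union of all adjacencies by the exact subset of methods that report them. A minimal sketch follows, assuming each method's output is a list of CARs and each CAR is a list of signed blocks. The method names and CARs are toy placeholders, not data from the paper.

```python
from collections import Counter


def adjacency(left, right):
    """Canonical form of an adjacency: the two block extremities that touch,
    so that (g h) and (-h -g) are treated as the same adjacency."""
    a = (abs(left), "h" if left > 0 else "t")
    b = (abs(right), "t" if right > 0 else "h")
    return frozenset((a, b))


def car_adjacencies(cars):
    """All adjacencies contained in a list of CARs."""
    return {adjacency(x, y) for car in cars for x, y in zip(car, car[1:])}


def venn_regions(cars_by_method):
    """Count adjacencies per exact subset of methods that contain them."""
    adjs = {name: car_adjacencies(cars) for name, cars in cars_by_method.items()}
    regions = Counter()
    for adj in set().union(*adjs.values()):
        owners = frozenset(name for name, s in adjs.items() if adj in s)
        regions[owners] += 1
    return regions


# Toy placeholder output of two hypothetical methods.
regions = venn_regions({
    "method_X": [[1, 2, 3], [4, 5]],
    "method_Y": [[1, 2], [-5, -4], [3, 6]],
})
shared = regions[frozenset({"method_X", "method_Y"})]
only_x = regions[frozenset({"method_X"})]
only_y = regions[frozenset({"method_Y"})]
```

In this toy case the CAR [-5, -4] of method_Y carries the same adjacency as [4, 5] of method_X, so that adjacency counts as shared.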
