Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2015 Apr 1:10:13.
doi: 10.1186/s13015-015-0041-9. eCollection 2015.

On the family-free DCJ distance and similarity

Affiliations

On the family-free DCJ distance and similarity

Fábio V Martinez et al. Algorithms Mol Biol. .

Abstract

Structural variation in genomes can be revealed by many (dis)similarity measures. Rearrangement operations, such as the so called double-cut-and-join (DCJ), are large-scale mutations that can create complex changes and produce such variations in genomes. A basic task in comparative genomics is to find the rearrangement distance between two given genomes, i.e., the minimum number of rearragement operations that transform one given genome into another one. In a family-based setting, genes are grouped into gene families and efficient algorithms have already been presented to compute the DCJ distance between two given genomes. In this work we propose the problem of computing the DCJ distance of two given genomes without prior gene family assignment, directly using the pairwise similarities between genes. We prove that this new family-free DCJ distance problem is APX-hard and provide an integer linear program to its solution. We also study a family-free DCJ similarity and prove that its computation is NP-hard.

Keywords: DCJ; Family-free genome comparison; Genome rearrangement.

PubMed Disclaimer

Figures

Figure 1
Figure 1
The adjacency graph for the two unichromosomal and linear genomes A={(∘ −1 3 4 2 ∘)} and B={(∘ −2 1 4 3 ∘)} .
Figure 2
Figure 2
A possible gene similarity graph for the two unichromosomal linear genomes A={(∘ 1 2 3 4 5 ∘)} and B={(∘ 6 −7 −8 −9 10 11 ∘)} .
Figure 3
Figure 3
Reduced genomes and their weighted adjacency graph. Considering the genomes A={(∘ 1 2 3 4 5 ∘)} and B={(∘ 6 −7 −8 −9 10 11 ∘)} as in Figure 2, let M 1 (dotted edges) and M 2 (dashed edges) be two distinct matchings in G S σ(A,B), shown in the upper part. The two resulting weighted adjacency graphs AGσ(AM1,BM1), that has two odd paths and three cycles, and AGσ(AM2,BM2), that has two odd paths and two cycles, are shown in the lower part.
Figure 4
Figure 4
Gene similarity graph GS σ (A F ,B F ) constructed from the input genomes A={(∘ a c −b d ∘)} and B={(∘ −c d a c b −b ∘)} of (1,2) - EXDCJ-DISTANCE , where all edge weights are 1. Highlighted edges represent a maximal matching M in G S σ(A F,B F).
Figure 5
Figure 5
Gene similarity graph GS σ (A,B) for k=3 .

References

    1. Sankoff D. Proc. of CPM 1992. LNCS, vol. 644. Heidelberg: Springer Verlag; 1992. Edit distance for genome comparison based on non-local operations.
    1. Bergeron A, Mixtacki J, Stoye J. Proc. of WABI 2006. LNBI, vol. 4175. Heidelberg: Springer Verlag; 2006. A unifying view of genome rearrangements.
    1. Bafna V, Pevzner P. Genome rearrangements and sorting by reversals. In: Proc. of FOCS 1993: 1993. p. 148–57.
    1. Hannenhalli S, Pevzner P. Transforming men into mice (polynomial algorithm for genomic distance problem). In: Proc. of FOCS 1995: 1995. p. 581–92.
    1. Yancopoulos S, Attie O, Friedberg R. Efficient sorting of genomic permutations by translocation, inversion and block interchanges. Bioinformatics. 2005;21(16):3340–6. doi: 10.1093/bioinformatics/bti535. - DOI - PubMed

LinkOut - more resources