On the family-free DCJ distance and similarity

Fábio V Martinez¹, Pedro Feijão², Marília Dv Braga³, Jens Stoye²

Affiliations

¹ Faculdade de Computação, Universidade Federal de Mato Grosso do Sul, Avenida Costa e Silva, s-n, Campo Grande, 79070-900 MS Brazil ; Technische Fakultät and CeBiTec, Universität Bielefeld, Universitätsstr. 25, Bielefeld, 33615 Germany.
² Technische Fakultät and CeBiTec, Universität Bielefeld, Universitätsstr. 25, Bielefeld, 33615 Germany.
³ Inmetro - Instituto Nacional de Metrologia, Qualidade e Tecnologia, Av. Nossa Senhora das Graças, 50, Duque de Caxias, 25250-020 RJ Brazil.

PMID: 25859276
PMCID: PMC4391664
DOI: 10.1186/s13015-015-0041-9

Fábio V Martinez et al. Algorithms Mol Biol. 2015 Apr 1;10:13. doi: 10.1186/s13015-015-0041-9. eCollection 2015.

Abstract

Structural variation in genomes can be revealed by many (dis)similarity measures. Rearrangement operations, such as the so-called double-cut-and-join (DCJ), are large-scale mutations that can create complex changes and produce such variations in genomes. A basic task in comparative genomics is to find the rearrangement distance between two given genomes, i.e., the minimum number of rearrangement operations that transform one given genome into the other. In a family-based setting, genes are grouped into gene families, and efficient algorithms have already been presented to compute the DCJ distance between two given genomes. In this work we propose the problem of computing the DCJ distance of two given genomes without prior gene family assignment, directly using the pairwise similarities between genes. We prove that this new family-free DCJ distance problem is APX-hard and provide an integer linear program for its solution. We also study a family-free DCJ similarity and prove that its computation is NP-hard.

Keywords: DCJ; Family-free genome comparison; Genome rearrangement.

Figures

**Figure 1**
**The adjacency graph for the two unichromosomal and linear genomes A = {(∘ −1 3 4 2 ∘)} and B = {(∘ −2 1 4 3 ∘)}.**
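
To make the adjacency graph concrete, the following is a minimal sketch of how the family-based DCJ distance can be read off it, assuming the standard formula d = n − (c + i/2), where n is the number of genes, c the number of cycles, and i the number of odd paths. The function names (`extremities`, `adjacencies`, `dcj_distance`) and the list-of-signed-integers genome encoding are illustrative assumptions, not taken from the paper, and the sketch handles linear chromosomes with identical gene content only.

```python
from collections import Counter

def extremities(gene):
    """Return (left, right) extremities of a signed gene; 't' = tail, 'h' = head."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))

def adjacencies(genome):
    """List the telomeres (1-tuples) and adjacencies (2-tuples) of linear chromosomes."""
    result = []
    for chrom in genome:
        ends = [extremities(g) for g in chrom]
        result.append((ends[0][0],))                      # left telomere
        for (_, right), (left, _) in zip(ends, ends[1:]):
            result.append((right, left))                  # adjacency between neighbours
        result.append((ends[-1][1],))                     # right telomere
    return result

def dcj_distance(a, b):
    """Family-based DCJ distance n - (c + i/2) computed from the adjacency graph."""
    where_a = {e: ("A", i) for i, v in enumerate(adjacencies(a)) for e in v}
    where_b = {e: ("B", j) for j, v in enumerate(adjacencies(b)) for e in v}
    if where_a.keys() != where_b.keys():
        raise ValueError("genomes must have the same gene content")

    # Union-find over graph vertices; each shared extremity is one edge.
    parent = {v: v for v in list(where_a.values()) + list(where_b.values())}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for e in where_a:
        parent[find(where_a[e])] = find(where_b[e])

    vertices = Counter(find(v) for v in list(parent))
    edges = Counter(find(where_a[e]) for e in where_a)

    # A component with as many edges as vertices is a cycle; otherwise it is a path.
    cycles = sum(1 for r in vertices if edges[r] == vertices[r])
    odd_paths = sum(1 for r in vertices
                    if edges[r] == vertices[r] - 1 and edges[r] % 2 == 1)
    n = len(where_a) // 2
    return n - (cycles + odd_paths // 2)

# Genomes from Figure 1 (assumed encoding: one linear chromosome each).
A = [[-1, 3, 4, 2]]
B = [[-2, 1, 4, 3]]
print(dcj_distance(A, B))  # → 2 (one cycle and two odd paths, n = 4)
```

Under this encoding, the graph for the Figure 1 genomes has one cycle and two odd paths, so the sketch reports a distance of 4 − (1 + 2/2) = 2. The family-free problem studied in the paper is harder because a matching between genes must be chosen before such a graph can be built.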

**Figure 2**
**A possible gene similarity graph for the two unichromosomal linear genomes A = {(∘ 1 2 3 4 5 ∘)} and B = {(∘ 6 −7 −8 −9 10 11 ∘)}.**

**Figure 3**
**Reduced genomes and their weighted adjacency graph.** Considering the genomes A = {(∘ 1 2 3 4 5 ∘)} and B = {(∘ 6 −7 −8 −9 10 11 ∘)} as in Figure 2, let $M_1$ (dotted edges) and $M_2$ (dashed edges) be two distinct matchings in $GS_\sigma(A,B)$, shown in the upper part. The two resulting weighted adjacency graphs, $AG_\sigma(A^{M_1}, B^{M_1})$, which has two odd paths and three cycles, and $AG_\sigma(A^{M_2}, B^{M_2})$, which has two odd paths and two cycles, are shown in the lower part.

**Figure 4**
**Gene similarity graph $GS_\sigma(A_F, B_F)$ constructed from the input genomes A = {(∘ a c −b d ∘)} and B = {(∘ −c d a c b −b ∘)} of (1,2)-EXDCJ-DISTANCE, where all edge weights are 1.** Highlighted edges represent a maximal matching M in $GS_\sigma(A_F, B_F)$.

**Figure 5**
**Gene similarity graph $GS_\sigma(A,B)$ for k = 3.**
