2015 Jan 28;16(1):19.
doi: 10.1186/s12859-014-0442-7.

MDAT- Aligning multiple domain arrangements

Carsten Kemena et al. BMC Bioinformatics.

Abstract

Background: Proteins are composed of domains: protein segments that fold independently of the rest of the protein and have a specific function. During evolution, the arrangement of domains can change: domains are gained or lost, or their order is rearranged. To facilitate the analysis of these changes, we propose the use of multiple domain alignments.

Results: We developed an alignment program, called MDAT, which aligns multiple domain arrangements. MDAT extends earlier programs that perform pairwise alignments of domain arrangements. MDAT uses a domain similarity matrix to score domain pairs and aligns the domain arrangements using a consistency-supported progressive alignment method.
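To make the idea of scoring domain pairs with a similarity matrix concrete, the following is a minimal sketch of a pairwise global alignment (Needleman-Wunsch) over two domain arrangements. It is not MDAT's implementation: it covers only the pairwise case, without the consistency or progressive steps, and the function names, the domain labels, and the `match`, `mismatch`, and `gap` values are all assumptions chosen for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical similarity lookup: unordered domain pair -> score.
SimMatrix = Dict[FrozenSet[str], float]


def score(a: str, b: str, sim: SimMatrix,
          match: float = 10.0, mismatch: float = -5.0) -> float:
    """Score a domain pair; the default values are illustrative only."""
    if a == b:
        return match
    return sim.get(frozenset((a, b)), mismatch)


def align_arrangements(x: List[str], y: List[str], sim: SimMatrix,
                       gap: float = -3.0) -> Tuple[List[str], List[str], float]:
    """Globally align two domain arrangements with a linear gap penalty."""
    n, m = len(x), len(y)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j - 1] + score(x[i - 1], y[j - 1], sim),
                           dp[i - 1][j] + gap,
                           dp[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    ax: List[str] = []
    ay: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                dp[i][j] == dp[i - 1][j - 1] + score(x[i - 1], y[j - 1], sim)):
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return ax[::-1], ay[::-1], dp[n][m]


if __name__ == "__main__":
    # Domain labels are placeholders, not real Pfam accessions.
    print(align_arrangements(["A", "B", "C"], ["A", "C"], {}))
```

In this sketch a gap in the aligned arrangement corresponds to a domain gain or loss between the two proteins, which is the kind of change such an alignment is meant to expose.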

Conclusion: MDAT will be useful for analysing changes in domain arrangements within and between protein families and will thus provide valuable insights into the evolution of proteins and their domains. MDAT is coded in C++, and the source code is freely available for download at http://www.bornberglab.org/pages/mdat.

Figures

Figure 1
Domain similarity score distribution. The scores were calculated by HHsearch for all pairwise alignments of Pfam-A domains (version 27). The values have been divided into two groups depending on whether the two domains belong to the same clan or not (different or no clan). Values of self-alignments are not included.
Figure 2
Example MDA: The IPR021012 family, consisting of 90 sequences, is shown. All segments contain the Dscam domain. The pfam_scan script was used to perform the domain annotation. The first column shows the reference arrangement ID, and the second the number of times this arrangement was encountered.
Figure 3
Example of an MDA-based sequence alignment: An MSA of seven sequences from the BAliBASE3 benchmark with annotated domains is shown. The upper alignment was generated using MDAT, the lower one using MAFFT. Because it incorporates domain information, MDAT aligns all 5 occurring Fer2 domains correctly together, whereas MAFFT aligns only 4 of the domains and stretches them very widely.
Figure 4
Workflow of the MDA2MSA algorithm. Step 1: Sequences with identical domain arrangements (a) are split according to domain boundaries (b). Then each part is aligned separately (c), and finally all parts are merged back together (d) into a single alignment. Step 2: The MDA (a) is used as a guide. The sequences are split into parts according to the MDA (b). Cuts are performed at the borders of aligned domains, resulting in 5 parts. Each pair of sequence segments can now be aligned separately. If unaligned domains occur in the MDA (part 3), the dynamic programming algorithm is changed such that it maintains the order of domains (c). The striped area represents the region that is not calculated because the MDA forbids the alignment of the two domains.
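
Step 1 of the workflow described for Figure 4 (split at domain boundaries, align each part, merge) can be sketched roughly as follows. This is an illustration under stated assumptions, not the MDA2MSA code: domains are assumed to be non-overlapping and given as hypothetical `(id, start, end)` tuples with an exclusive end, and `pad_align` is a deliberate placeholder that only pads segments with gaps where a real sequence aligner would be called.

```python
from typing import Callable, List, Tuple

# Hypothetical annotation format: (domain id, start, end), end exclusive.
Domain = Tuple[str, int, int]


def split_at_domains(seq: str, domains: List[Domain]) -> List[str]:
    """Cut a sequence into linker, domain, linker, ..., linker segments."""
    parts: List[str] = []
    pos = 0
    for _, start, end in sorted(domains, key=lambda d: d[1]):
        parts.append(seq[pos:start])   # linker before the domain (may be empty)
        parts.append(seq[start:end])   # the domain itself
        pos = end
    parts.append(seq[pos:])            # trailing linker (may be empty)
    return parts


def pad_align(segments: List[str]) -> List[str]:
    """PLACEHOLDER aligner: right-pads with gaps instead of aligning."""
    width = max(len(s) for s in segments)
    return [s.ljust(width, "-") for s in segments]


def align_identical_arrangements(
        seqs: List[str],
        annotations: List[List[Domain]],
        align_part: Callable[[List[str]], List[str]] = pad_align) -> List[str]:
    """Split all sequences, align matching parts, and merge the results."""
    split = [split_at_domains(s, d) for s, d in zip(seqs, annotations)]
    n_parts = len(split[0])
    if any(len(p) != n_parts for p in split):
        raise ValueError("sequences must share the same domain arrangement")
    rows = [""] * len(seqs)
    for k in range(n_parts):
        aligned = align_part([p[k] for p in split])
        rows = [r + a for r, a in zip(rows, aligned)]
    return rows


if __name__ == "__main__":
    seqs = ["MKAAAAGGCCCCT", "AAAAGCCCC"]
    annotations = [[("D1", 2, 6), ("D2", 8, 12)],
                   [("D1", 0, 4), ("D2", 5, 9)]]
    for row in align_identical_arrangements(seqs, annotations):
        print(row)
```

Because each part is aligned independently, residues from one domain can never be placed opposite residues from another, which is the property the caption describes. Swapping `pad_align` for a call to a real aligner would leave the split and merge logic unchanged.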
