Springerplus. 2016 Jan 12:5:27.
doi: 10.1186/s40064-015-1609-z. eCollection 2016.

Structural homology guided alignment of cysteine rich proteins

Thomas M A Shafee et al. Springerplus.

Abstract

Background: Cysteine rich protein families are notoriously difficult to align due to low sequence identity and frequent insertions and deletions.

Results: Here we present an alignment method that ensures homologous cysteines align by assigning a unique 10-amino-acid barcode to each cysteine identified as structurally homologous by the DALI webserver. The free inter-cysteine regions of the barcoded sequences can then be aligned using any standard algorithm. Finally, the barcodes are replaced with the original columns to yield an alignment that requires minimal manual refinement.
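The barcode, align, and restore steps described above can be illustrated with a minimal Python sketch. This is not the CysBar implementation: the function names, the homopolymer barcode scheme, and the toy sequences are all assumptions made for illustration, and the alignment step itself is replaced by a hand-written alignment standing in for the output of a standard aligner.

```python
BARCODE_LEN = 10  # the abstract specifies 10-residue barcodes


def make_barcodes(n, pool="WHMYFKDE", length=BARCODE_LEN):
    """Return n mutually distinct barcodes (hypothetical scheme:
    one residue repeated `length` times per homologous column)."""
    if n > len(pool):
        raise ValueError("not enough distinct residues in pool")
    return [pool[i] * length for i in range(n)]


def add_barcodes(seq, cys_positions, barcodes):
    """Replace the cysteine at each 0-based position with its barcode.

    cys_positions[k] is the position of the k-th homologous cysteine,
    e.g. as read off a structure-based alignment.
    """
    out = seq
    # Work right to left so earlier indices stay valid after insertion.
    for pos, code in sorted(zip(cys_positions, barcodes), reverse=True):
        if out[pos] != "C":
            raise ValueError(f"position {pos} is not a cysteine")
        out = out[:pos] + code + out[pos + 1:]
    return out


def remove_barcodes(aligned, barcodes, original="C"):
    """Collapse each aligned barcode block back to a single column."""
    for code in barcodes:
        starts = [row.find(code) for row in aligned if code in row]
        if not starts:
            continue
        start, end = starts[0], starts[0] + len(code)
        restored = []
        for row in aligned:
            block = row[start:end]
            if block == code:
                column = original
            elif set(block) == {"-"}:
                column = "-"  # sequence lacks this homologous cysteine
            else:
                raise ValueError("barcode block was split by the aligner")
            restored.append(row[:start] + column + row[end:])
        aligned = restored
    return aligned


# Toy example: two sequences with two homologous cysteines each.
codes = make_barcodes(2)
barcoded = [
    add_barcodes("AKCGGCR", [2, 5], codes),
    add_barcodes("KCPCRR", [1, 3], codes),
]
# Stand-in for the output of any standard aligner run on `barcoded`.
aligned = [
    "AK" + codes[0] + "GG" + codes[1] + "R-",
    "-K" + codes[0] + "P-" + codes[1] + "RR",
]
final = remove_barcodes(aligned, codes)
print("\n".join(final))
```

Homopolymer barcodes are used here only because they are trivially distinct from one another; a real barcode set would also need to be unlikely to occur in, or align with, the natural sequences.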

Conclusions: Using structural homology information to constrain sequence alignments allows the alignment of highly divergent, repetitive sequences that existing algorithms handle poorly. Tools are provided to perform this method online using the CysBar web-tool (http://CysBar.science.latrobe.edu.au) and offline (Python script available from http://github.com/ts404/CysBar).

Keywords: Alignment; Barcode; Cysteine-rich proteins; Defensin.

Figures

Fig. 1
Overview of barcode alignment method. Lack of sequence conservation and abundance of cysteines prevent automatic alignment by standard methods. Homologous cysteines identified from structural alignment are replaced with 10-amino-acid barcodes to pin them in place. Standard algorithms are used to realign free loops between the barcoded columns. Barcodes are exchanged for the original columns for the final alignment and phylogeny calculation. Sequences are coloured with cysteines in yellow, any other residue in grey, gaps in light grey, and barcode sequences in blue.
Fig. 2
Identifying homologous cysteines by structural alignment. a The starting query structure (1MR4). b Overlay of aligned structures identified by DALI (Holm and Rosenström 2010). c The cysteine pairs indicated by DALI to be homologous in the structures. d Alignment of sequence based on structure by DALI. PDB accession numbers: 1MR4, 1N4N, 1UGL, 2LR5, 1I2V, 1FJN, 1SN1, 2PTA
Fig. 3
Final alignment. a Alignment of the sequences after barcodes have been replaced with original sequence columns by CysBar-r. Sequences coloured with cysteines in yellow, any other residue in grey, gaps in light grey. b Distribution of lengths of inter-cysteine loops. c-e Distribution of sequence length, hydrophobicity and net charge. Data from oop_statistics.csv processed by loopproperties.xlsx
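
The loop-length and net-charge statistics summarised in Fig. 3 can be approximated per sequence with a short Python sketch. The function names are hypothetical, and the charge model is a deliberate simplification (K and R count as +1, D and E as -1, with histidine and the termini ignored); it is not the calculation used to produce the figure.

```python
def loop_lengths(seq):
    """Number of residues between each pair of consecutive cysteines."""
    cys = [i for i, aa in enumerate(seq) if aa == "C"]
    return [b - a - 1 for a, b in zip(cys, cys[1:])]


def net_charge(seq):
    """Crude net charge: +1 for K/R, -1 for D/E (simplifying assumption)."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


# Toy sequences, not taken from the paper's dataset.
for seq in ["AKCGGCR", "KCPCRR"]:
    print(seq, len(seq), loop_lengths(seq), net_charge(seq))
```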
