Nat Biotechnol. 2015 Aug;33(8):831-8.
doi: 10.1038/nbt.3300. Epub 2015 Jul 27.

Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning

Babak Alipanahi et al. Nat Biotechnol. 2015 Aug.

Abstract

Knowing the sequence specificities of DNA- and RNA-binding proteins is essential for developing models of the regulatory processes in biological systems and for identifying causal disease variants. Here we show that sequence specificities can be ascertained from experimental data with 'deep learning' techniques, which offer a scalable, flexible and unified computational approach for pattern discovery. Using a diverse array of experimental data and evaluation metrics, we find that deep learning outperforms other state-of-the-art methods, even when training on in vitro data and testing on in vivo data. We call this approach DeepBind and have built a stand-alone software tool that is fully automatic and handles millions of sequences per experiment. Specificities determined by DeepBind are readily visualized as a weighted ensemble of position weight matrices or as a 'mutation map' that indicates how variations affect binding within a specific sequence.
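
The 'mutation map' idea mentioned in the abstract can be illustrated with a minimal sketch: score a sequence, substitute each base in turn, and record how the score changes. The code below is not the DeepBind model. It assumes a toy scoring function built from a single made-up position weight matrix, and the names `pwm_score`, `mutation_map` and `toy_pwm` are hypothetical, chosen only for this illustration.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (len, 4) one-hot matrix (columns A, C, G, T)."""
    idx = [BASES.index(b) for b in seq]
    return np.eye(4)[idx]

def pwm_score(seq, pwm):
    """Toy stand-in for a trained model: best PWM match over all windows."""
    x = one_hot(seq)
    k = pwm.shape[0]
    return max(float((x[i:i + k] * pwm).sum()) for i in range(len(seq) - k + 1))

def mutation_map(seq, score_fn):
    """Return a (4, len) array of score changes for every single-base substitution."""
    base = score_fn(seq)
    delta = np.zeros((4, len(seq)))
    for pos in range(len(seq)):
        for j, b in enumerate(BASES):
            mutant = seq[:pos] + b + seq[pos + 1:]
            delta[j, pos] = score_fn(mutant) - base
    return delta

# Hypothetical 3-mer motif preferring "TGA"; the weights are invented.
toy_pwm = np.array([
    [-1.0, -1.0, -1.0, 2.0],   # position 1 favors T
    [-1.0, -1.0, 2.0, -1.0],   # position 2 favors G
    [2.0, -1.0, -1.0, -1.0],   # position 3 favors A
])

seq = "CCTGACC"
delta = mutation_map(seq, lambda s: pwm_score(s, toy_pwm))
print(np.round(delta, 1))  # rows: A, C, G, T; columns: positions in seq
```

In this toy case, substitutions inside the "TGA" site lower the score while substitutions in the flanking bases leave it unchanged, which is the kind of pattern a mutation map is meant to make visible. With a real trained model, `score_fn` would be replaced by that model's prediction function.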
