A new exhaustive method and strategy for finding motifs in ChIP-enriched regions
- PMID: 24475069
- PMCID: PMC3901781
- DOI: 10.1371/journal.pone.0086044
Abstract
ChIP-seq, which combines chromatin immunoprecipitation (ChIP) with next-generation parallel sequencing, allows for the genome-wide identification of protein-DNA interactions. This technology poses new challenges for the development of novel motif-finding algorithms and methods for determining exact protein-DNA binding sites from ChIP-enriched sequencing data. State-of-the-art heuristic, exhaustive search algorithms have limited application for the identification of short (l, d) motifs (l ≤ 10, d ≤ 2) contained in ChIP-enriched regions. In this work we have developed a more powerful exhaustive method (FMotif) for finding long (l, d) motifs in DNA sequences. In conjunction with our method, we have adopted a simple ChIP-enriched sampling strategy for finding these motifs in large-scale ChIP-enriched regions. Empirical studies on synthetic samples and applications using several ChIP data sets including 16 TF (transcription factor) ChIP-seq data sets and five TF ChIP-exo data sets have demonstrated that our proposed method is capable of finding these motifs with high efficiency and accuracy. The source code for FMotif is available at http://211.71.76.45/FMotif/.
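For readers unfamiliar with the (l, d) notation used above, the following is a minimal sketch of the standard definition: a length-l string counts as an (l, d) motif if enough input sequences contain some l-mer within Hamming distance d of it. This is a naive brute-force illustration, not the FMotif algorithm or the sampling strategy described in the paper; the function names and the `quorum` parameter are hypothetical. Because it enumerates all 4^l candidates, it is only practical for very small l, which is the limitation that motivates more efficient exhaustive methods.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs(motif, seq, d):
    """True if seq contains an l-mer within Hamming distance d of motif."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))


def naive_ld_motifs(seqs, l, d, quorum=1.0):
    """Enumerate every DNA l-mer and keep those occurring (with at most d
    mismatches) in at least a `quorum` fraction of the sequences.

    `quorum` is an illustrative parameter, not taken from the paper.
    """
    need = quorum * len(seqs)
    hits = []
    for cand in product("ACGT", repeat=l):  # 4**l candidates: small l only
        motif = "".join(cand)
        support = sum(occurs(motif, s, d) for s in seqs)
        if support >= need:
            hits.append((motif, support))
    return hits


if __name__ == "__main__":
    # Toy sequences (made up for illustration), each holding a variant of ACGT.
    toy = ["TTACGTAA", "GGACCTGG", "CCACGAAT"]
    for motif, support in naive_ld_motifs(toy, l=4, d=1):
        print(motif, support)
```

In the toy input, each sequence carries a copy of ACGT with at most one substitution, so ACGT is among the reported (4, 1) motifs; other 4-mers that also lie within one mismatch of an l-mer in every sequence are reported as well.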
Figures
Figure 1 (caption fragment): ‘FMotif’ and ‘CisFinder’ indicate the closest matching motif logos found by these programs (all motif logos were generated using the web-based tool WebLogo [45]); ‘Literature’ indicates the corresponding motif logos published in the literature; a further column gives the number of binding sites found by either FMotif or CisFinder; and ‘Rank’ is the ranking number of a reported motif found by either FMotif or CisFinder.
Figure (caption fragment): transcription factors shown include FoxA1 (hepatocyte nuclear factor 3-alpha [42]), NRSF (neuron-restrictive silencer factor [2]), and STAT1 (signal transducer and activator of transcription protein [1]). Results from CisFinder and published motifs in the literature are shown for comparison. Column definitions are the same as those in Figure 1.
References
- Robertson G, Hirst M, Bainbridge M, Bilenky M, Zhao Y, et al. (2007) Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat Methods 4: 651–657.
- Johnson DS, Mortazavi A, Myers RM, Wold B (2007) Genome-wide mapping of in vivo protein-DNA interactions. Science 316: 1497–1502.
