PLoS One. 2011 May 10;6(5):e19633.
doi: 10.1371/journal.pone.0019633.

Using unsupervised patterns to extract gene regulation relationships for network construction

Yi-Tsung Tang et al. PLoS One.

Abstract

Background: Gene expression is usually described in the literature in terms of a transcription factor X that regulates a target gene Y. Previous studies have discovered gene regulation relationships from the biomedical literature, but most of them rely on human annotators to build the training dataset. Moreover, the volume of textual knowledge in the biomedical literature is growing rapidly, which makes creating patterns from it by hand increasingly difficult. There is a growing need to automate pattern generation.

Methodology/principal findings: In this article, we describe AutoPat, an unsupervised pattern generation method. AutoPat is a gene expression mining system that automatically generates patterns, without supervision, from a given set of seed patterns. The high scalability and low maintenance cost of these unsupervised patterns could help the system extract gene expression relationships from PubMed abstracts more precisely and effectively.
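
The general idea of growing new patterns from seed patterns can be illustrated with a toy sketch. This is not AutoPat's algorithm, which the abstract does not specify; it is a generic bootstrapping example under stated assumptions. The seed verbs, the example sentences, the gene list, the function names, and the frequency threshold are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Assumed seed patterns of the form "X <verb> Y" (illustrative, not from the paper).
SEED_VERBS = ("regulates", "activates", "represses")


def seed_pairs(sentences, genes):
    """Collect (regulator, target) pairs matched by the seed patterns."""
    names = "|".join(re.escape(g) for g in sorted(genes, key=len, reverse=True))
    verbs = "|".join(SEED_VERBS)
    seed = re.compile(rf"({names}) (?:{verbs}) ({names})")
    pairs = set()
    for sentence in sentences:
        for match in seed.finditer(sentence):
            pairs.add((match.group(1), match.group(2)))
    return pairs


def candidate_patterns(sentences, pairs):
    """Count the text that links each known pair in any sentence."""
    counts = Counter()
    for sentence in sentences:
        for x, y in pairs:
            match = re.search(rf"{re.escape(x)} (.+?) {re.escape(y)}", sentence)
            if match:
                counts[f"X {match.group(1)} Y"] += 1
    return counts


def new_patterns(counts, min_count=2):
    """Keep frequent candidates that are not already seed patterns."""
    seeds = {f"X {verb} Y" for verb in SEED_VERBS}
    return [p for p, c in counts.most_common() if c >= min_count and p not in seeds]


# Invented toy sentences; real input would be PubMed abstracts.
sentences = [
    "HIF-1 activates VEGF under hypoxia.",
    "HIF-1 induces the expression of VEGF in tumor cells.",
    "HIF-1 regulates EPO.",
    "HIF-1 induces the expression of EPO.",
]
genes = {"HIF-1", "VEGF", "EPO"}

pairs = seed_pairs(sentences, genes)
patterns = new_patterns(candidate_patterns(sentences, pairs))
print(patterns)
```

In this sketch the seed patterns first identify regulator and target pairs, and any other phrasing that links the same pairs often enough is promoted to a new pattern. A real system would also need gene name recognition and a way to score or rank the generated patterns.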

Conclusions/significance: Experiments on several regulators show reasonable precision and recall rates, which support AutoPat's practical applicability. The resulting regulation networks could also be built precisely and effectively. The system in this study is available at http://ikmbio.csie.ncku.edu.tw/AutoPat/.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. The overall architecture of AutoPat.
Figure 2. Example of combined weight of extracted sentences.
Figure 3. The frequency distribution of unsupervised patterns.
Figure 4. The extraction precision of different thresholds.
Figure 5. The precision of the Top N ranking results for unsupervised patterns.
Figure 6. Examples of the homonym and abbreviation issues.
Figure 7. The global network of HIF-1 TF.
Figure 8. The example of an indirect relationship between HIF-1 and p53.
