Review

Text Mining Gene Selection to Understand Pathological Phenotype Using Biological Big Data

In: Bioinformatics [Internet]. Brisbane (AU): Exon Publications; 2021 Mar 20. Chapter 1.

Christophe Desterke et al.

Excerpt

Whole-transcriptome omics experiments allow gene regulation to be studied at the cellular level. False discoveries can occur during the analysis and interpretation of omics data. To minimize them and identify truly significant cases, multiple-testing correction has been introduced into bioinformatics algorithms. The scientific literature offers a huge collection of information that can be parsed through a web Application Programming Interface (API). Gene selection by text mining can rank information according to its importance while taking into account the most recent updates in the scientific literature. Integrating text-mining selection into biological big data, such as transcriptome experiments including single-cell transcriptomes, can achieve a substantial dimensional reduction of the data without any statistical hypothesis, which avoids false discoveries regarding the molecules of interest. This chapter presents two examples, hydatidiform moles and focal segmental glomerulosclerosis (FSGS) nephropathy, as proofs of concept that demonstrate the considerable value of these analytical methods. The best expressed FSGS markers can be displayed in an interactive online web interface, built as a web resource based on the glomerular cell transcriptome. This chapter shows the value of integrating text mining with omics data analysis to discover specific molecules and determine their locations and functions in complex diseases.
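
The selection step described in the excerpt can be illustrated with a minimal sketch. It is not the chapter's implementation. It assumes that per-gene literature hit counts have already been retrieved from some literature API. The function names (`rank_genes_by_literature`, `subset_expression`), the `min_hits` threshold, the placeholder gene symbols, and all numbers are hypothetical and chosen only for illustration. The sketch ranks genes by hit count and then restricts an expression table to the selected genes, which reduces its dimension without any statistical test.

```python
from typing import Dict, List


def rank_genes_by_literature(hit_counts: Dict[str, int], min_hits: int = 1) -> List[str]:
    """Return gene symbols with at least `min_hits` literature hits,
    sorted by descending hit count (ties broken alphabetically)."""
    kept = [(gene, n) for gene, n in hit_counts.items() if n >= min_hits]
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [gene for gene, _ in kept]


def subset_expression(
    expression: Dict[str, List[float]], selected: List[str]
) -> Dict[str, List[float]]:
    """Keep only the rows (genes) of the expression table that were selected,
    preserving the literature-based ranking order."""
    return {gene: expression[gene] for gene in selected if gene in expression}


# Hypothetical inputs: placeholder gene symbols and made-up values.
hit_counts = {"GENE_A": 42, "GENE_B": 0, "GENE_C": 7, "GENE_D": 7}
expression = {
    "GENE_A": [1.0, 2.0],
    "GENE_B": [0.5, 0.4],
    "GENE_C": [3.1, 2.9],
    "GENE_E": [9.9, 9.8],
}

ranked = rank_genes_by_literature(hit_counts)
reduced = subset_expression(expression, ranked)

print(ranked)         # genes ordered by literature support
print(list(reduced))  # genes remaining in the reduced expression table
```

In this sketch the filter depends only on literature evidence, not on the expression values themselves, which is one way to read the excerpt's point about reducing the data "without any statistical hypothesis".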


