A model for indexing medical documents combining statistical and symbolic knowledge
- PMID: 18693792
- PMCID: PMC2655916
Abstract
Objectives: To develop and evaluate an information processing method based on terminologies, in order to index medical documents in any given documentary context.
Methods: We designed a model using both symbolic general knowledge extracted from the Unified Medical Language System (UMLS) and statistical knowledge extracted from a domain of application. Using statistical knowledge allowed us to contextualize the general knowledge for every particular situation. For each document studied, the extracted terms are ranked to highlight the most significant ones. The model was tested on a set of 17,079 French standardized discharge summaries (SDSs).
Results: The most important ICD-10 term of each SDS was ranked 1st or 2nd by the method in nearly 90% of the cases.
Conclusions: The use of several terminologies leads to more precise indexing. The improvement in the model's performance obtained by using semantic relationships is encouraging.
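The abstract does not give the scoring formula, so the following is only a minimal sketch of the two ideas it describes: ranking the terms extracted from a document by a corpus-derived statistical weight, and measuring how often the most important term lands in the top two positions. The TF-IDF-style weight, the function names `rank_terms` and `top_k_rate`, and the toy ICD-10 codes are all assumptions made for illustration. They are not the authors' method, and the symbolic (UMLS relationship) component is not modeled here.

```python
import math
from collections import Counter


def rank_terms(doc_terms, corpus_docs):
    """Rank a document's terms by an assumed TF-IDF-style weight.

    doc_terms:   list of terminology codes extracted from one document
    corpus_docs: list of such lists, one per document in the domain corpus
    """
    n_docs = len(corpus_docs)

    # Document frequency: in how many corpus documents each term appears.
    doc_freq = Counter()
    for terms in corpus_docs:
        doc_freq.update(set(terms))

    # Term frequency within the document being indexed.
    term_freq = Counter(doc_terms)

    # Smoothed inverse document frequency keeps the logarithm defined
    # for terms that never occur in the corpus.
    scores = {
        term: count * math.log((1 + n_docs) / (1 + doc_freq[term]))
        for term, count in term_freq.items()
    }

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))


def top_k_rate(rankings, gold_terms, k=2):
    """Fraction of documents whose gold term is among the first k ranked terms."""
    hits = sum(
        1 for ranked, gold in zip(rankings, gold_terms) if gold in ranked[:k]
    )
    return hits / len(gold_terms)


# Toy corpus (hypothetical data): each inner list is one summary's codes.
corpus = [
    ["I10", "E11", "I10"],
    ["I10", "J18"],
    ["I10", "N39"],
]
rankings = [rank_terms(doc, corpus) for doc in corpus]
gold = ["E11", "J18", "N39"]  # assumed "most important" code per summary
print(top_k_rate(rankings, gold, k=2))
```

In this sketch a code that occurs in every summary gets a weight of zero, which is one simple way statistical knowledge from the application domain can demote terms that are uninformative in that context.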