J Am Med Inform Assoc. 2002 Nov-Dec;9(6):637-52.
doi: 10.1197/jamia.m1075.

Methods for semi-automated indexing for high precision information retrieval

Daniel C Berrios et al. J Am Med Inform Assoc. 2002 Nov-Dec.

Abstract

Objective: To evaluate a new system, ISAID (Internet-based Semi-automated Indexing of Documents), and to generate textbook indexes that are more detailed and more useful to readers.

Design: Pilot evaluation: simple, nonrandomized trial comparing ISAID with manual indexing methods. Methods evaluation: randomized, cross-over trial comparing three versions of ISAID and usability survey.

Participants: Pilot evaluation: two physicians. Methods evaluation: twelve physicians, each of whom used three different versions of the system for a total of 36 indexing sessions.

Measurements: Total index term tuples generated per document per minute (TPM), with and without adjustment for concordance with other subjects; inter-indexer consistency; ratings of the usability of the ISAID indexing system.
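The measurements above can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: index term tuples are represented as Python tuples of strings, the function names are hypothetical, and inter-indexer consistency is computed as a Hooper-style overlap ratio (tuples shared by two indexers divided by all distinct tuples either one produced). The study's own definitions may differ.

```python
from typing import Set, Tuple

# Hypothetical representation: one index term tuple is a tuple of strings.
IndexTuple = Tuple[str, ...]


def tuples_per_minute(n_tuples: int, minutes: float) -> float:
    """Index term tuples generated per minute (TPM) for one session."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return n_tuples / minutes


def pairwise_consistency(a: Set[IndexTuple], b: Set[IndexTuple]) -> float:
    """Assumed Hooper-style consistency: shared tuples / all distinct tuples."""
    union = a | b
    if not union:
        return 1.0  # two empty indexes agree trivially
    return len(a & b) / len(union)


# Made-up example data, not taken from the study.
indexer_a = {
    ("treatment", "malaria", "chloroquine"),
    ("diagnosis", "malaria", "blood smear"),
    ("prevention", "malaria", "bed nets"),
}
indexer_b = {
    ("treatment", "malaria", "chloroquine"),
    ("diagnosis", "malaria", "blood smear"),
    ("treatment", "malaria", "quinine"),
}

tpm = tuples_per_minute(n_tuples=30, minutes=6)
consistency = pairwise_consistency(indexer_a, indexer_b)
```

In this made-up example the two indexers share two of four distinct tuples, so the assumed consistency score is 0.5.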

Results: Compared with manual methods, ISAID decreased indexing times greatly. Across the three versions of ISAID, inter-indexer consistency ranged from 15% to 65%, with means of 41%, 31%, and 40% for the three documents. Subjects using the full version of ISAID were faster (average TPM: 5.6) and had higher rates of concordant index generation. There were substantial learning effects, despite our use of a training/run-in phase. Subjects using the full version of ISAID were much faster by the third indexing session (average TPM: 9.1). There was a statistically significant increase in the three-subject concordant indexing rate using the full version of ISAID during the second indexing session (p < 0.05).

Summary: Users of the ISAID indexing system create complex, precise, and accurate indexing for full-text documents much faster than users of manual methods. Furthermore, the natural language processing methods that ISAID uses to suggest indexes contribute substantially to increased indexing speed and accuracy.

Figures

Figure 1
The ISAID indexing system’s role in ELBook, a high precision information retrieval system. ISAID requires syntactic, semantic, and other domain-specific knowledge from query and concept models. Typically, a domain expert will first author a query model using the QueryEditor (upper right). The domain expert then exports the query model in XML and HTML formats for use in ISAID (upper left) and a search interface (bottom), respectively. Next, a system administrator will prepare one or more documents for indexing using the ISAID Text Analyzer. Then, one or more users of the ISAID indexing interface (Figure 2) can generate indexes for these documents, which are stored in a networked Index Database. Once such indexes have been generated, an end-user can search them using a dynamically generated Query Interface. Arrows indicate data flow: heavy arrows represent data input to and output from ISAID.
Figure 2
The ISAID indexing interface. The interface consists of a main window with four frames. At the top of the window is the query model frame and at the bottom is the document view frame. The two remaining frames, the markup view frame and the query template frame, have tabs (labeled “markup” and “template”) and can be “pushed off” the display. The query template frame (magnified in B) displays which generic query and which concept values ISAID suggests for indexing the document element shown in B.
Figure 3
A–F, Interaction with the ISAID indexing interface. The steps shown are those a user would take to create a new instance of an index for a sentence from the Special Operations Forces Handbook of Medicine (U.S. Armed Forces, used with permission).
Figure 4
The term-vector-space model adapted for information indexing. A given document is compared with several queries (only two queries are shown) using each component of a term vector (only two components are shown).
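The comparison described in this caption can be sketched as follows. This is only an illustration under stated assumptions, not ISAID's actual method: the caption does not say how term vectors are weighted or compared, so the sketch assumes raw term counts and cosine similarity, and the function names, query names, and example texts are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def term_vector(text: str) -> Counter:
    """Assumed weighting: raw counts of lowercased, whitespace-split terms."""
    return Counter(text.lower().split())


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_queries(document: str, queries: Dict[str, str]) -> List[Tuple[str, float]]:
    """Compare one document with every query and sort by similarity."""
    doc_vec = term_vector(document)
    scored = [(name, cosine(doc_vec, term_vector(q))) for name, q in queries.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up document element and generic queries, for illustration only.
queries = {
    "treatment": "treat disease with drug",
    "diagnosis": "diagnose disease with test",
}
ranked = rank_queries("treat malaria with chloroquine", queries)
```

Here the document is compared with several queries rather than the reverse, which is the adaptation the caption describes; the top-ranked query would be the one suggested to the indexer.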
Figure 5
The average rate (in index term tuples per minute) at which subjects generated index tuples using three versions of the indexing interface: one that proposes concept values, one that proposes concept values and selects query templates, and one that neither proposes concept values nor selects query templates. Rates are shown by concordance index and by indexing session. See Methods for calculation of concordance index. *Statistically significant in ANOVA, p < 0.05.
