JMIR Form Res. 2021 May 26;5(5):e22461.
doi: 10.2196/22461.

Neural Machine Translation-Based Automated Current Procedural Terminology Classification System Using Procedure Text: Development and Validation Study

Hyeon Joo et al. JMIR Form Res. 2021.

Abstract

Background: Administrative costs for billing and insurance-related activities in the United States are substantial. Medical billing errors are one critical cause of this high overhead. Advances in deep learning have made it feasible to develop models that predict hospital and professional billing codes. Such models can be used to reduce administrative costs and improve billing processes.

Objective: In this study, we aim to develop an automated anesthesiology current procedural terminology (CPT) prediction system that translates manually entered surgical procedure text into standard forms using neural machine translation (NMT) techniques. The standard forms are then matched using similarity scores to predict the most appropriate CPT codes. This system aims to improve medical billing coding accuracy and thereby reduce administrative costs; we also compare its performance with that of previously developed machine learning algorithms.
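The second step described above, scoring a translated standard form against candidate code descriptions, can be sketched as follows. This is a minimal illustration only: the abstract does not state which similarity measure the authors used, so bag-of-words cosine similarity is an assumption, and the codes and descriptor strings are hypothetical placeholders rather than real CPT entries.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two strings (assumed measure)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def rank_codes(standard_form, descriptors, k=3):
    """Return the k (code, score) pairs most similar to the standard form."""
    scored = [(code, cosine_similarity(standard_form, text))
              for code, text in descriptors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical placeholder codes and descriptors, not real CPT entries.
DESCRIPTORS = {
    "CODE_A": "anesthesia for procedures on the knee joint",
    "CODE_B": "anesthesia for procedures on the shoulder joint",
    "CODE_C": "anesthesia for intracranial procedures",
}

# Stand-in for the output of the translation step (step 1).
translated = "anesthesia for arthroscopic procedures on the knee joint"

for code, score in rank_codes(translated, DESCRIPTORS):
    print(f"{code}: {score:.3f}")
```

The first entry of the ranking corresponds to a top-1 prediction and the first three entries to a top-3 prediction, which is how the accuracy figures in the Results are framed.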

Methods: We collected and analyzed all operative procedures performed at Michigan Medicine between January 2017 and June 2019 (2.5 years). The first 2 years of data were used to train and validate the existing models and to compare their results with those of the NMT-based model. Data from 2019 (6-month follow-up period) were then used to measure the accuracy of the CPT code prediction. Three experimental settings with different data types were designed to evaluate the models. Experiment 1 used the surgical procedure text entered manually in the electronic health record. Experiment 2 used the preprocessed procedure text. Experiment 3 used the preprocessed combination of procedure text and preoperative diagnoses. The NMT-based model was compared with the support vector machine (SVM) and long short-term memory (LSTM) models.
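The date-based split and the three input settings described above can be sketched as below. The record fields (`date`, `procedure`, `diagnosis`), the sample records, and the `preprocess` rules (lowercasing and stripping punctuation) are all assumptions made for illustration; the abstract does not specify the preprocessing steps or the data schema.

```python
import re
from datetime import date

STUDY_START = date(2017, 1, 1)    # January 2017, per the abstract
HOLDOUT_START = date(2019, 1, 1)  # 2019 data form the holdout period
STUDY_END = date(2019, 6, 30)     # June 2019, per the abstract


def split_by_date(records):
    """Split records into (development, holdout) by procedure date."""
    development, holdout = [], []
    for rec in records:
        if not (STUDY_START <= rec["date"] <= STUDY_END):
            continue  # outside the study window
        (holdout if rec["date"] >= HOLDOUT_START else development).append(rec)
    return development, holdout


def preprocess(text):
    """Hypothetical cleanup: lowercase, drop punctuation, collapse spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def build_input(rec, experiment):
    """Build the model input text for experiment 1, 2, or 3."""
    if experiment == 1:
        return rec["procedure"]              # raw manually entered text
    if experiment == 2:
        return preprocess(rec["procedure"])  # preprocessed procedure text
    if experiment == 3:                      # procedure text plus diagnosis
        return preprocess(rec["procedure"] + " " + rec["diagnosis"])
    raise ValueError(f"unknown experiment: {experiment}")


# Hypothetical records for illustration only.
records = [
    {"date": date(2017, 3, 14), "procedure": "Lap. Chole w/ IOC", "diagnosis": "Cholelithiasis"},
    {"date": date(2018, 11, 2), "procedure": "R TKA", "diagnosis": "Osteoarthritis, right knee"},
    {"date": date(2019, 2, 20), "procedure": "L Shoulder Arthroscopy", "diagnosis": "Rotator cuff tear"},
]

development, holdout = split_by_date(records)
print(len(development), len(holdout))
print(build_input(records[0], 3))
```

Splitting by date rather than at random keeps the 2019 holdout period strictly later than the data the models were trained and validated on, which matches the 6-month follow-up design in the abstract.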

Results: The NMT model yielded the highest top-1 accuracy in experiments 1 and 2, at 81.64% and 81.71%, respectively, compared with the SVM model (81.19% and 81.27%, respectively) and the LSTM model (80.96% and 81.07%, respectively). The SVM model yielded the highest top-1 accuracy of 84.30% in experiment 3, followed by the LSTM model (83.70%) and the NMT model (82.80%). In experiment 3, the addition of preoperative diagnoses produced relative increases in top-1 accuracy of 3.7%, 3.2%, and 1.3% for the SVM, LSTM, and NMT models, respectively, over experiment 2. For top-3 accuracy, the SVM, LSTM, and NMT models achieved 95.64%, 95.72%, and 95.60% for experiment 1; 95.75%, 95.67%, and 95.69% for experiment 2; and 95.88%, 95.93%, and 95.06% for experiment 3, respectively.
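The two quantities reported above can be checked with a short sketch. `top_k_accuracy` shows the usual definition of top-k accuracy on toy data with placeholder labels, and `relative_increase` recomputes the experiment 2 to experiment 3 gains from the top-1 figures in this abstract. Both function names are illustrative, not taken from the study.

```python
def top_k_accuracy(ranked_predictions, true_codes, k):
    """Percentage of cases whose true code is among the first k predictions."""
    hits = sum(1 for preds, truth in zip(ranked_predictions, true_codes)
               if truth in preds[:k])
    return 100.0 * hits / len(true_codes)


def relative_increase(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Toy ranked predictions with placeholder labels (not real CPT codes).
ranked = [["A", "B", "C"], ["B", "A", "C"], ["C", "B", "A"]]
truth = ["A", "A", "A"]
print(top_k_accuracy(ranked, truth, 1), top_k_accuracy(ranked, truth, 3))

# Top-1 accuracy (experiment 2, experiment 3) as reported in the abstract.
TOP1 = {"SVM": (81.27, 84.30), "LSTM": (81.07, 83.70), "NMT": (81.71, 82.80)}
for model, (exp2, exp3) in TOP1.items():
    print(f"{model}: +{exp3 - exp2:.2f} points, "
          f"{relative_increase(exp2, exp3):.1f}% relative")
```

The relative figures come out at 3.7%, 3.2%, and 1.3%, matching the abstract, while the absolute gains are 3.03, 2.63, and 1.09 percentage points. The reported increases therefore appear to be relative changes rather than percentage-point differences.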

Conclusions: This study demonstrates the feasibility of creating an automated anesthesiology CPT classification system based on NMT techniques using surgical procedure text and preoperative diagnoses. Our results show that the performance of the NMT-based CPT prediction system is equivalent to that of the SVM and LSTM prediction models. Importantly, we found that including preoperative diagnoses improved accuracy over using the procedure text alone.

Keywords: CPT classification; machine learning; natural language processing; neural machine translation.

Conflict of interest statement

Conflicts of Interest: None declared.

Figures

Figure 1. The architecture of an automated current procedural terminology coding system based on the Transformer model. CPT: current procedural terminology.
Figure 2. The flowchart of data selection and rules to split the training, validation, and holdout sets. CPT: current procedural terminology.
Figure 3. The distribution of current procedural terminology codes in the training, testing, and holdout sets, sorted by most to least frequent codes.
Figure 4. The top-1 and top-3 accuracy comparison based on the training sample size. LSTM: long short-term memory; NMT: neural machine translation; SVM: support vector machine.
Figure 5. Bilingual Evaluation Understudy scores of imbalanced labels for translating manually entered procedure text into preferred terms in step 1 of the neural machine translation–based model. BLEU: Bilingual Evaluation Understudy.
