Classification of user queries according to a hierarchical medical procedure encoding system using an ensemble classifier

Yihan Deng¹, Kerstin Denecke¹

PMID: 36406473
PMCID: PMC9672500
DOI: 10.3389/frai.2022.1000283

Yihan Deng et al. Front Artif Intell. 2022 Nov 4;5:1000283. doi: 10.3389/frai.2022.1000283. eCollection 2022.

Affiliation

¹ Department of Technology and Computer Science, Institute for Medical Informatics, Bern University of Applied Sciences, Biel/Bienne, Switzerland.

Abstract

Physicians must use the Swiss classification of surgical interventions (CHOP) in daily practice to classify clinical procedures. Its purpose is to encode the delivered healthcare services for quality assurance and billing. To encode a procedure, a code of at most six digits has to be selected from the classification system; this is currently done with a rule-based system that combines encoding experts and a manual search in the CHOP catalog. In this paper, we investigate whether CHOP codes can be generated automatically from a short query in order to support manual classification. The wide and deep hierarchy of CHOP and the differences between the text used in queries and in catalog descriptions are two apparent obstacles to training and deploying a learning-based algorithm, and they call for an appropriate classification approach. We evaluate different strategies (multi-class non-terminal and per-node classification) with different configurations in order to provide a flexible, modular solution with high accuracy and efficiency. The results show that per-node binary classification outperforms non-terminal multi-class classification, with a micro-F1 measure between 92.6 and 94%. Hierarchical prediction based on per-node binary classifiers achieved a high exact-match rate for single code assignment in 5-fold cross-validation. In conclusion, the hierarchical context of the CHOP encoding can be exploited in both classifier training and representation learning. The hierarchical features improved classification performance under the different configurations: the stacked autoencoder, the aggregation of training examples using the true path rule, and the unified vocabulary space all substantially increased the utility of hierarchical features. Additionally, threshold adaption through Bayesian aggregation substantially increased the vertical reachability of the per-node classification: all trainable nodes can be triggered after the threshold adaption, and the F1 measures at code levels 3-6 increased from 6 to 89%.
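
The combination of the true path rule and per-node binary classifiers mentioned above can be illustrated with a small sketch. It is not the authors' implementation: the codes are invented dot-separated paths rather than real CHOP codes, the token-overlap scorer merely stands in for a trained binary classifier, and all function names and the threshold value are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (query, code). The codes are made up for illustration.
TRAIN = [
    ("naht riss damm", "75.6.1"),
    ("naht riss zervix", "75.5.1"),
    ("kaiserschnitt", "74.1"),
]


def ancestors_and_self(code):
    """Return every node on the path from the top level down to `code`."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def build_positives(train):
    """True path rule: a query labeled with a code is also a positive
    example for every ancestor of that code."""
    positives = defaultdict(set)
    for query, code in train:
        for node in ancestors_and_self(code):
            positives[node].update(query.split())
    return positives


def node_score(node, query, positives):
    """Stand-in for a per-node binary classifier (assumption):
    the share of query tokens seen among the node's positive examples."""
    tokens = query.split()
    if not tokens:
        return 0.0
    return sum(t in positives[node] for t in tokens) / len(tokens)


def children(node, positives):
    """Direct children of `node`; the empty string denotes the root."""
    depth = node.count(".") + 1 if node else 0
    return sorted(
        n for n in list(positives)
        if n.count(".") == depth and (not node or n.startswith(node + "."))
    )


def predict(query, positives, threshold=0.7):
    """Descend from the root and follow only children whose score
    passes the threshold; report the deepest node on each path."""
    results, frontier = [], [""]
    while frontier:
        node = frontier.pop()
        passed = [
            c for c in children(node, positives)
            if node_score(c, query, positives) >= threshold
        ]
        if node and not passed:
            results.append(node)
        frontier.extend(passed)
    return sorted(results)


positives = build_positives(TRAIN)
print(predict("naht riss damm", positives))
```

In this sketch a fixed threshold decides whether the search continues below a node. A threshold that is too strict stops the descent early, which is the kind of limited vertical reachability that the threshold adaption in the abstract is meant to address.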

Keywords: CHOP; ensemble classifier; feature selection; hierarchical classification; medical procedure.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

**Figure 1**
Working example for the query log “Naht geburtsbedingter Riss” (German: suture of a birth-related tear) and the relevant and irrelevant codes within CHOP. Given the log as input, the model generates a list of relevant codes as output. The green codes are relevant to the query (positive examples), while the brown codes are irrelevant categories (negative examples). The resulting classification model should be able to determine the corresponding category and generate the relevant CHOP code for a query.

**Figure 2**
The architecture of the proposed code assigning pipeline.

**Figure 3**
Unsupervised nested denoising autoencoders. The circled layer is the representation vector (125 dimensions) that we used for the downstream classification task.

**Figure 4**
Self attentive autoencoder: The input can be all queries or grouped queries on all true paths.

**Figure 5**
Hierarchical search based on non-terminal nodes and leaf nodes. The target query representation is learned with all query examples on its true path, so that the representation can be adapted according to all its relevant representations.

**Figure 6**
Algorithm for the hierarchical prediction of CHOP codes with a base classifier. “Adaptions” refers to the bottom-up average and Bayesian aggregation. Both adaption methods are described in more detail in Section 6.5.

**Figure 7**
The bottom-up average for the threshold adaption.

**Figure 8**
Bayesian aggregation for the threshold adaption. The blue circles represent the binary values of a CHOP node. The green circles represent the observed classifier outputs.

**Figure 9**
Micro F1 measure of the multi-class non-terminal (NT) classifiers and the per-node (PN) ensembled binary classifiers using random forest, AdaBoost, and feedforward neural networks (DNN).

**Figure 10**
Comparison of the average hierarchical F1 performance of threshold adaption with the bottom-up average vs. threshold adaption with Bayesian aggregation, based on 20 selected CHOP codes from out-of-sample test sets.
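
The hierarchical F1 reported in Figure 10 is not defined in this excerpt. A common definition extends both the true and the predicted code sets with all of their ancestors and then computes precision and recall on the extended sets; the sketch below follows that definition as an assumption. The dot-separated codes and the function names are hypothetical, and the authors' exact variant may differ.

```python
def with_ancestors(code):
    """Return the code together with all of its ancestors
    (assumes codes are dot-separated paths, which is illustrative only)."""
    parts = code.split(".")
    return {".".join(parts[:i]) for i in range(1, len(parts) + 1)}


def hierarchical_f1(true_codes, predicted_codes):
    """Hierarchical F1: F1 computed on ancestor-extended label sets."""
    true_set = set().union(*(with_ancestors(c) for c in true_codes))
    pred_set = set().union(*(with_ancestors(c) for c in predicted_codes))
    overlap = len(true_set & pred_set)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_set)
    recall = overlap / len(true_set)
    return 2 * precision * recall / (precision + recall)


# A prediction in the right subtree but with the wrong leaf still earns
# partial credit, unlike an exact-match measure.
print(hierarchical_f1(["75.6.1"], ["75.6.2"]))
print(hierarchical_f1(["75.6.1"], ["74.1"]))
```

Under this definition, a prediction that shares two of three levels with the true code scores 2/3, while a prediction in an unrelated branch scores 0.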
