Comparative Study

J Orofac Orthop. 2025 May;86(3):145-160.

doi: 10.1007/s00056-023-00491-1. Epub 2023 Aug 29.

Assessment of the quality of different commercial providers using artificial intelligence for automated cephalometric analysis compared to human orthodontic experts

Felix Kunz¹, Angelika Stellzig-Eisenhauer², Lisa Marie Widmaier², Florian Zeman³, Julian Boldt⁴

Affiliations

¹ Department of Orthodontics, University Hospital of Würzburg, Pleicherwall 2, 97070, Würzburg, Germany. kunz_f@ukw.de.
² Department of Orthodontics, University Hospital of Würzburg, Pleicherwall 2, 97070, Würzburg, Germany.
³ Centre for Clinical Studies, University Hospital of Regensburg, Regensburg, Germany.
⁴ Department of Prosthetic Dentistry, University Hospital of Würzburg, Würzburg, Germany.

PMID: 37642657
PMCID: PMC12043786
DOI: 10.1007/s00056-023-00491-1

Felix Kunz et al. J Orofac Orthop. 2025 May.


Abstract
in English, German

Purpose: The aim of this investigation was to evaluate the accuracy of various skeletal and dental cephalometric parameters as produced by different commercial providers that make use of artificial intelligence (AI)-assisted automated cephalometric analysis and to compare their quality to a gold standard established by orthodontic experts.

Methods: Twelve experienced orthodontic examiners pinpointed 15 radiographic landmarks on a total of 50 cephalometric X-rays. The landmarks were used to generate 9 parameters for orthodontic treatment planning. The "humans' gold standard" was defined as the median of all 12 human assessments for each parameter; these medians served as reference values for comparisons with the results of four different commercial providers of automated cephalometric analyses (DentaliQ.ortho [CellmatiQ GmbH, Hamburg, Germany], WebCeph [AssembleCircle Corp, Seongnam-si, Korea], AudaxCeph [Audax d.o.o., Ljubljana, Slovenia], CephX [Orca Dental AI, Herzliya, Israel]). Repeated measures analyses of variance (ANOVAs) were calculated and Bland-Altman plots were generated for comparisons.
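As a rough illustration of the reference-value and Bland-Altman steps described in the Methods, the following Python sketch computes a median-based gold standard and the bias with 95% limits of agreement. The function names, the SNA values, and the use of 1.96 standard deviations for the limits are assumptions made for illustration; they are not data or code from the study.

```python
from statistics import mean, median, stdev


def gold_standard(ratings_per_image):
    """Return the median across examiners for each radiograph."""
    return [median(ratings) for ratings in ratings_per_image]


def bland_altman(method, reference):
    """Return (bias, lower limit, upper limit) of agreement.

    Uses the conventional bias +/- 1.96 * SD of the paired differences.
    """
    diffs = [m - r for m, r in zip(method, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical SNA angles in degrees: 4 radiographs, 3 examiners each
# (the study used 50 radiographs and 12 examiners).
human = [
    [82.0, 81.5, 82.5],
    [79.0, 80.0, 79.5],
    [85.0, 84.0, 84.5],
    [77.5, 78.0, 78.5],
]
# Hypothetical predictions of one automated provider for the same images.
ai = [82.4, 79.1, 85.2, 78.3]

reference = gold_standard(human)
bias, lower, upper = bland_altman(ai, reference)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias close to zero with narrow limits would indicate good agreement with the human reference; wide limits indicate low precision even when the mean difference is small.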

Results: The repeated measures ANOVAs indicated significant differences between the commercial providers' predictions and the humans' gold standard for all nine investigated parameters. However, the pairwise comparisons also demonstrated major differences among the four commercial providers. While there were no significant mean differences between the values of DentaliQ.ortho and the humans' gold standard, the predictions of AudaxCeph showed significant deviations in seven out of nine parameters. In addition, the Bland-Altman plots demonstrated that reduced precision of AI predictions must be expected, especially for parameters describing the inclination of the incisors.
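For readers unfamiliar with the test behind these results, the sketch below computes the F statistic of a one-way repeated measures ANOVA in plain Python. It assumes that each radiograph is the repeated unit and that the raters (gold standard plus providers) form the within-subject factor, a design the abstract does not spell out. The function name and the data are hypothetical. A p-value would additionally require the F distribution, which the Python standard library does not provide.

```python
from statistics import mean


def rm_anova_f(data):
    """One-way repeated measures ANOVA.

    data[i][j] is the value for subject i (radiograph) under
    condition j (rater). Returns (F, df_conditions, df_error).
    """
    n = len(data)      # number of subjects
    k = len(data[0])   # number of conditions
    grand = mean(v for row in data for v in row)
    cond_means = [mean(row[j] for row in data) for j in range(k)]
    subj_means = [mean(row) for row in data]

    # Partition the total sum of squares.
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((v - grand) ** 2 for row in data for v in row)
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (n - 1) * (k - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error


# Hypothetical angles in degrees; columns: gold standard, provider A, provider B.
data = [
    [82.0, 82.5, 84.0],
    [79.5, 79.0, 81.5],
    [84.5, 85.0, 86.0],
    [78.0, 78.5, 80.5],
]
f_stat, df_cond, df_error = rm_anova_f(data)
print(f"F({df_cond}, {df_error}) = {f_stat:.2f}")
```

A significant overall F only shows that at least one rater differs; pairwise comparisons against the gold standard, as reported above, are needed to identify which provider deviates.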

Conclusion: Fully automated cephalometric analyses are promising in terms of time savings and avoidance of individual human errors. At present, however, they should only be used under the supervision of experienced clinicians.

Summary (German abstract in translation): Purpose: The aim of this study was to examine the analysis quality of different commercial providers of AI (artificial intelligence)-based lateral cephalogram analyses and to compare their evaluations with a gold standard defined by experts.

Methods: On 50 lateral cephalograms, 12 experienced examiners identified 15 landmarks, from which 9 parameters relevant to orthodontic treatment planning were measured. The "human gold standard" was defined by calculating the median of all 12 human assessments for each parameter. This served as the reference value for comparison with the results of 4 different commercial providers of automated cephalometric analyses (DentaliQ.ortho [CellmatiQ GmbH, Hamburg, Germany], WebCeph [AssembleCircle Corp, Seongnam-si, Korea], AudaxCeph [Audax d.o.o., Ljubljana, Slovenia], CephX [Orca Dental AI, Herzliya, Israel]). Statistical evaluation was performed using repeated measures ANOVAs and Bland-Altman plots.

Results: The repeated measures ANOVAs showed significant differences between the predictions of the commercial providers and the human gold standard for all 9 investigated parameters, and the subsequent pairwise comparisons revealed large differences among the 4 commercial providers. While no significant differences were found between the values of DentaliQ.ortho and the gold standard, the predictions of AudaxCeph deviated significantly in 7 of 9 parameters. In addition, the Bland-Altman plots showed that lower precision of the AI predictions is generally to be expected for the parameters describing the inclination of the anterior teeth.

Conclusion: Fully automated cephalometric analyses are promising with regard to time savings and the avoidance of individual human errors. At present, however, they should only be used under the supervision of experienced clinicians.

Keywords: Cephalometric landmarks; Deep Learning; Human gold standard; Machine Learning; Orthodontic parameters.


Conflict of interest statement

Declarations. Conflict of interest: F. Kunz, A. Stellzig-Eisenhauer, L. M. Widmaier, F. Zeman and J. Boldt declare that they have no competing interests. Ethical standards: All procedures performed in studies involving human participants or on human tissue were in accordance with the ethical standards of the institutional and/or national research committee and with the 1975 Helsinki declaration and its later amendments or comparable ethical standards.

Figures

**Fig. 1**
Bland–Altman plots comparing the analyses of the commercial providers to the humans’ gold standard: SNA

**Fig. 2**
Bland–Altman plots comparing the analyses of the commercial providers to the humans’ gold standard: SNB

**Fig. 3**
Bland–Altman plots comparing the analyses of the commercial providers to the humans’ gold standard: ANB

**Fig. 4**
Bland–Altman plots comparing the analyses of the commercial providers to the humans’ gold standard: SN-PP

**Fig. 5**
Bland–Altman plots comparing the analyses of the commercial providers to the humans’ gold standard: SN-MeGo

**Fig. 6**
Bland–Altman plots comparing the analyses of the commercial providers to the humans’ gold standard: PP-MeGo

**Fig. 7**
Bland–Altman plots comparing the analyses of the commercial providers to the humans’ gold standard: Facial height ratio

**Fig. 8**
Bland–Altman plots comparing the analyses of the commercial providers to the humans’ gold standard: U1-SN

**Fig. 9**
Bland–Altman plots comparing the analyses of the commercial providers to the humans’ gold standard: L1-MeGo


