Reading Akkadian cuneiform using natural language processing
- PMID: 33112872
- PMCID: PMC7592802
- DOI: 10.1371/journal.pone.0240511
Abstract
In this paper we present a new method for automatic transliteration and segmentation of Unicode cuneiform glyphs using Natural Language Processing (NLP) techniques. Cuneiform is one of the earliest known writing systems in the world, documenting millennia of human civilizations in the ancient Near East. Hundreds of thousands of cuneiform texts were found in the nineteenth and twentieth centuries CE, most of which are written in Akkadian. However, tens of thousands of texts remain unpublished. We use models based on machine learning algorithms such as recurrent neural networks (RNN), with an accuracy reaching up to 97%, to automatically transliterate standard Unicode cuneiform glyphs and segment them into words. Our method and results therefore form a major step towards a human-machine interface for creating digitized editions. Our code, Akkademia, is made publicly available for use via a web application, a Python package, and a GitHub repository.
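To make the input side of this task concrete, the following minimal sketch shows what "Unicode cuneiform glyphs" look like to a program. It is not the Akkademia code and not the RNN method described above. The helper name `sign_names` is hypothetical, and the code only looks up the standard Unicode sign name of each glyph using Python's built-in `unicodedata` module. A sign name is not a transliteration: most signs have several possible readings, and choosing among them in context and grouping signs into words is the problem the models address.

```python
import unicodedata

# Main Unicode "Cuneiform" block (U+12000..U+123FF).
CUNEIFORM_START = 0x12000
CUNEIFORM_END = 0x123FF
PREFIX = "CUNEIFORM SIGN "


def sign_names(text):
    """Return the Unicode sign name of each cuneiform glyph in `text`.

    Characters outside the main cuneiform block are skipped. This is a
    hypothetical illustration helper, not part of any published package.
    """
    names = []
    for ch in text:
        if CUNEIFORM_START <= ord(ch) <= CUNEIFORM_END:
            # Fall back to an empty string for unassigned code points.
            name = unicodedata.name(ch, "")
            if name.startswith(PREFIX):
                names.append(name[len(PREFIX):])
            else:
                names.append("UNKNOWN")
    return names


if __name__ == "__main__":
    # U+12000 is CUNEIFORM SIGN A; U+1202D is CUNEIFORM SIGN AN.
    line = "\U00012000\U0001202D"
    print(sign_names(line))
```

A lookup like this is context-free, so it cannot decide which reading of a sign applies or where word boundaries fall, which is why sequence models are used for the task.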
Conflict of interest statement
The authors have declared that no competing interests exist.