HMPLMD: Handwritten Malayalam palm leaf manuscript dataset
- PMID: 36820128
- PMCID: PMC9938152
- DOI: 10.1016/j.dib.2023.108960
Abstract
High recognition rates on degraded documents such as palm leaf manuscripts rely primarily on document enhancement. Among enhancement approaches, which also include non-deep-learning models and thresholding methods, advances in deep learning models play a major role. Readily available ground truth data is of paramount importance for building deep learning models, because preparing it is a highly time-consuming task. Ground truth preparation is further complicated because ancient documents are affected by degradations such as fungi, humidity, uneven illumination, discoloration, holes, cracks, and other damage. We propose a Handwritten Malayalam Palm Leaf Manuscript Dataset (HMPLMD) and its ground truth data, aiming to advance the field of palm leaf image analysis. We use the palm leaf manuscripts of Kambaramayanam and Jathakas for experimentation. The proposed ground truth samples of degraded palm leaves play a crucial role in creating specialized deep/transfer learning models to handle challenges related to binarization.
Keywords: Binarization; Ground truth; Malayalam; Photoshop; Sauvola.
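The abstract does not describe how binarization is carried out, but the keywords name Sauvola. As a rough illustration only, the sketch below applies the standard Sauvola local threshold, T = m * (1 + k * (s / R - 1)), where m and s are the mean and standard deviation of a pixel's neighborhood. The function name, the window size, and the parameter values (k = 0.5 and R = 128, as suggested in Sauvola and Pietikäinen's paper) are assumptions chosen for the example, not the authors' pipeline or settings.

```python
import math


def sauvola_threshold(image, window=3, k=0.5, r=128.0):
    """Binarize a grayscale image with Sauvola's local threshold.

    image  : list of rows of grayscale values in 0..255
    window : odd side length of the square neighborhood (assumed value)
    k, r   : Sauvola parameters (assumed values; tune per dataset)
    Returns a list of rows where 1 = background and 0 = ink.
    """
    height, width = len(image), len(image[0])
    half = window // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Gather the neighborhood, clipped at the image borders.
            values = [
                image[j][i]
                for j in range(max(0, y - half), min(height, y + half + 1))
                for i in range(max(0, x - half), min(width, x + half + 1))
            ]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            std = math.sqrt(variance)
            # Sauvola threshold: lower in flat regions, closer to the
            # local mean where contrast (std) is high.
            threshold = mean * (1.0 + k * (std / r - 1.0))
            row.append(1 if image[y][x] > threshold else 0)
        result.append(row)
    return result


if __name__ == "__main__":
    # Tiny synthetic patch: bright leaf surface with one dark ink pixel.
    patch = [
        [200, 200, 200],
        [200, 20, 200],
        [200, 200, 200],
    ]
    for line in sauvola_threshold(patch):
        print(line)
```

This pure-Python version favors readability over speed. On real manuscript scans, a vectorized implementation with a much larger window would be the practical choice.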
© 2023 The Author(s).
Conflict of interest statement
The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.
