Sensors (Basel). 2020 Oct 17;20(20):5884.
doi: 10.3390/s20205884.

Recognition of Pashto Handwritten Characters Based on Deep Learning

Muhammad Sadiq Amin et al. Sensors (Basel).

Abstract

Handwritten character recognition is increasingly important in a variety of automation fields, for example, authentication of bank signatures, identification of ZIP codes on letter addresses, and forensic evidence. Despite improved object recognition technologies, Pashto handwritten character recognition (PHCR) remains largely unsolved due to the presence of many enigmatic handwritten characters, the highly cursive nature of Pashto script, and a lack of research attention. We propose, for the first time, a convolutional neural network (CNN) model for recognition of Pashto handwritten characters in an unrestricted environment. First, a novel Pashto handwritten character dataset, "Poha", covering 44 characters is constructed. For preprocessing, deep fusion image processing techniques and noise reduction for text optimization are applied. A CNN model optimized in the number of convolutional layers and their parameters outperformed common deep models in terms of accuracy. Moreover, a set of popular benchmark CNN models is applied to Poha, evaluated, and compared with the proposed model. The experimental results show that the proposed model is superior to the other models, with a test accuracy of 99.64 percent for PHCR. The results indicate that our model may be a strong candidate for handwritten character recognition and automated PHCR applications.
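
The noise-reduction step is not specified in the abstract, but the caption of Figure 3 mentions a Gaussian blur filter. As a rough illustration only, a Gaussian blur over a grayscale image could be sketched in plain Python as follows. The kernel size, the sigma value, the edge-clamping rule, and the function names `gaussian_kernel` and `gaussian_blur` are assumptions made for this sketch, not details taken from the paper.

```python
import math


def gaussian_kernel(size=3, sigma=1.0):
    """Return a normalized size x size Gaussian kernel (size must be odd)."""
    half = size // 2
    kernel = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
         for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


def gaussian_blur(image, size=3, sigma=1.0):
    """Blur a grayscale image given as a list of rows of floats.

    Border pixels are handled by clamping coordinates to the image edge
    (an assumption of this sketch; other padding rules are possible).
    """
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    height, width = len(image), len(image[0])
    blurred = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            acc = 0.0
            for kr in range(size):
                for kc in range(size):
                    rr = min(max(r + kr - half, 0), height - 1)
                    cc = min(max(c + kc - half, 0), width - 1)
                    acc += kernel[kr][kc] * image[rr][cc]
            blurred[r][c] = acc
    return blurred


# Hypothetical 5 x 5 patch with a single bright noise pixel in the centre.
patch = [[0.0] * 5 for _ in range(5)]
patch[2][2] = 255.0
smoothed = gaussian_blur(patch)
```

In this toy patch the isolated bright pixel is spread over its neighbours, so its peak intensity drops while the total intensity of the patch is preserved. In practice a library routine would normally be used instead of nested Python loops.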

Keywords: Pashto handwritten character recognition; computer vision; convolutional neural networks; deep features fusion; deep learning.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure A1
A confusion matrix for 44 classes of Pashto handwritten characters.
Figure A2
Confusion matrix for the Urdu dataset that contains 40 classes of characters [42].
Figure A3
The confusion matrix for the Devanagari dataset that contains 36 classes of characters [54].
Figure 1
Block diagram of our proposed methodology.
Figure 2
Form distributed to authors and used to compile the dataset: (a) the form completed by native Pashto speakers in Khyber Pukhtunkhwa (KPK), Pakistan; (b) the form completed by non-native speakers in South Korea; (c) the individual letter “Khin” written by one user.
Figure 3
Image dataset denoising steps: (a) scanner-induced noise; (b) denoised image; (c) application of Gaussian blur filter.
Figure 4
Graph depicting the accuracy of the proposed model, ResNet18, and ResNet34 on the Poha dataset: (a) training and validation accuracy; (b) training and validation loss; (c) ResNet18 training and validation accuracy; (d) ResNet18 training and validation loss; (e) ResNet34 training and validation accuracy; (f) ResNet34 training and validation loss.
Figure 5
A sample of input images for the proposed model from the Urdu dataset [42].
Figure 6
Correct classification of Poha dataset characters by the proposed model.
Figure 7
Observed errors in recognition of the Poha dataset by the proposed model.
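
The confusion matrices in Figures A1 to A3 and the reported test accuracy can be related through a short calculation. The following is a minimal sketch in plain Python, assuming integer class labels; the function names and the three-class toy labels are hypothetical and are not data from the paper.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Build a matrix where cell [t][p] counts samples of true class t
    that were predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def most_confused(matrix):
    """Return (true_class, predicted_class, count) for the largest
    off-diagonal cell, or None if there are no errors."""
    best = None
    for t, row in enumerate(matrix):
        for p, count in enumerate(row):
            if t != p and count > 0 and (best is None or count > best[2]):
                best = (t, p, count)
    return best


# Hypothetical toy labels for 3 classes (the Poha dataset has 44).
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 0]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = accuracy(cm)
worst = most_confused(cm)
```

For a 44-class problem such as Poha, the same functions would be called with `num_classes=44`; the largest off-diagonal cell points to the pair of characters that is confused most often.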

References

    1. Fujisawa H. Forty years of research in character and document recognition—An industrial perspective. Pattern Recognit. 2008;41:2435–2446. doi: 10.1016/j.patcog.2008.03.015.
    2. Steinherz T., Rivlin E., Intrator N. Offline cursive script word recognition—A survey. Int. J. Doc. Anal. Recognit. 1999;2:90–110. doi: 10.1007/s100320050040.
    3. Plamondon R., Srihari S.N. Online and off-line handwriting recognition: A comprehensive survey. IEEE Trans. Pattern Anal. Mach. Intell. 2000;22:63–84. doi: 10.1109/34.824821.
    4. Arica N., Yarman-Vural F.T. An overview of character recognition focused on off-line handwriting. IEEE Trans. Syst. Man Cybern. Part C. 2001;31:216–233. doi: 10.1109/5326.941845.
    5. Khan N.H., Adnan A., Basar S. An analysis of off-line and on-line approaches in Urdu character recognition. Proceedings of the 15th International Conference on Artificial Intelligence, Knowledge Engineering and Data Bases (AIKED 16); Venice, Italy; 29–31 January 2016.