Sensors (Basel). 2023 Aug 14;23(16):7156.
doi: 10.3390/s23167156.

Signer-Independent Arabic Sign Language Recognition System Using Deep Learning Model

Kanchon Kanti Podder et al. Sensors (Basel). 2023.

Abstract

Each of us has a unique manner of communicating to explore the world, and such communication helps to interpret life. Sign language is the primary language of communication for people with hearing and speech disabilities. When a sign language user interacts with a non-signer, it becomes difficult for the signer to express themselves. A sign language recognition system can help a non-sign language user interpret the signs of a signer. This study presents a sign language recognition system capable of recognizing Arabic Sign Language from recorded RGB videos. To achieve this, two datasets were considered: (1) a raw dataset and (2) a face-hand region-based segmented dataset produced from the raw dataset. Moreover, an operational layer-based multi-layer perceptron, "SelfMLP", is proposed in this study to build CNN-LSTM-SelfMLP models for Arabic Sign Language recognition. MobileNetV2- and ResNet18-based CNN backbones and three SelfMLPs were used to construct six different models of the CNN-LSTM-SelfMLP architecture for a performance comparison on Arabic Sign Language recognition. This study examined the signer-independent mode to reflect real-time application circumstances. As a result, MobileNetV2-LSTM-SelfMLP on the segmented dataset achieved the best accuracy of 87.69%, with 88.57% precision, 87.69% recall, 87.72% F1 score, and 99.75% specificity. Overall, face-hand region-based segmentation and the SelfMLP-infused MobileNetV2-LSTM-SelfMLP surpassed previous findings on Arabic Sign Language recognition by 10.970% in accuracy.

Keywords: Arabic Sign Language; MediaPipe; deep learning; dynamic sign language; segmentation.


Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
System architecture of the segmented Arabic sign recognition system.
Figure 2
Distribution of the number of frames across inter- and intra-class Arabic sign color video data in [16].
Figure 3
Detecting landmarks and connections of the face, hands, and upper-body pose on human subjects: (a) raw frame; (b) frame with landmarks and connections.
Figure 4
Segmented frame/image data creation using the MediaPipe Holistic module.
Figure 5
Pre-processing of sign video data: clips are upsampled to 20 frames by repeating the last frame.
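The padding step described in this caption can be sketched in plain Python. A minimal sketch, assuming clips shorter than 20 frames are padded by repeating their final frame; the function name and frame representation are illustrative, not from the paper:

```python
def upsample_clip(frames, target_len=20):
    """Pad a clip to `target_len` frames by repeating its last frame.

    `frames` is any list of frame objects (e.g. decoded RGB arrays);
    clips already at or above `target_len` are returned unchanged.
    """
    if not frames:
        raise ValueError("cannot upsample an empty clip")
    padding = [frames[-1]] * max(0, target_len - len(frames))
    return frames + padding
```

For example, a 17-frame clip gains three copies of its final frame, giving every video a uniform 20-frame length for the sequence model's input.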
Figure 6
CNN-LSTM-SelfMLP-based Arabic Sign Language recognition system architecture: (A) spatial feature extractor; (B) two-layer LSTM; (C) two-layer SelfMLP.
Figure 7
Illustration of the q-order ONN operation on weights, input features, and biases.
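In the Self-ONN literature, the q-order operation generalizes the perceptron via a truncated Maclaurin series: each connection carries q coefficients applied to successive powers of its input. A minimal scalar sketch of one such neuron in plain Python; the function name and flat-list formulation are illustrative assumptions, not the paper's implementation:

```python
def q_order_neuron(inputs, weights, bias, q=3):
    """One q-order operational neuron.

    Computes bias + sum_i sum_{k=1..q} weights[i][k-1] * inputs[i]**k,
    i.e. each connection contributes a degree-q polynomial of its input
    rather than a single linear term.
    """
    total = bias
    for x, w in zip(inputs, weights):
        for k in range(1, q + 1):
            total += w[k - 1] * x ** k
    return total
```

With q = 1 this reduces to an ordinary linear (MLP) neuron; higher q lets each connection learn a non-linear transformation of its input, which is the motivation for the SelfMLP layers.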
Figure 8
Architecture of the SelfMLP classifier.
Figure 9
Receiver operating characteristic (ROC) curves of the six CNN-LSTM-SelfMLP models on (a) the raw dataset and (b) the segmented dataset.
Figure 10
Signer-independent accuracy comparison of the twelve CNN-LSTM-SelfMLP models on the raw and segmented datasets against the counterpart literature [16].

References

    1. Galindo N.M., Sá G.G.d.M., Pereira J.d.C.N., Barbosa L.U., Barros L.M., Caetano J.Á. Information about COVID-19 for deaf people: An analysis of YouTube videos in Brazilian sign language. Rev. Bras. Enferm. 2021;74:e20200291. doi: 10.1590/0034-7167-2020-0291. - DOI - PubMed
    2. Makhashen G.M.B., Luqman H.A., El-Alfy E.S.M. Using Gabor filter bank with downsampling and SVM for visual sign language alphabet recognition; Proceedings of the 2nd Smart Cities Symposium (SCS 2019); Bahrain, Bahrain. 24–26 March 2019; pp. 1–6.
    3. Luqman H., Mahmoud S.A. Transform-based Arabic sign language recognition. Procedia Comput. Sci. 2017;117:2–9.
    4. Chowdhury M.E.H., Rahman T., Khandakar A., Ayari M.A., Khan A.U., Khan M.S., Al-Emadi N., Reaz M.B.I., Islam M.T., Ali S.H.M. Automatic and Reliable Leaf Disease Detection Using Deep Learning Techniques. AgriEngineering. 2021;3:294–312. doi: 10.3390/agriengineering3020020. - DOI
    5. Podder K.K., Chowdhury M.E.H., Tahir A.M., Mahbub Z.B., Khandakar A., Hossain M.S., Kadir M.A. Bangla Sign Language (BdSL) Alphabets and Numerals Classification Using a Deep Learning Model. Sensors. 2022;22:574. doi: 10.3390/s22020574. - DOI - PMC - PubMed