Data Brief. 2023 Dec 15;52:109961.
doi: 10.1016/j.dib.2023.109961. eCollection 2024 Feb.

TLFS23 Tamil language fingerspelling dataset

Bavesh Ram S et al. Data Brief.

Abstract

Tamil is one of the oldest living languages, spoken by around 65 million people across India, Sri Lanka and South-East Asia. Countries such as Fiji and South Africa also have significant populations of Tamil ancestry. Tamil is a complex language with 247 characters. A labelled dataset for Tamil fingerspelling, named TLFS23, has been created for research on vision-based fingerspelling translators for people with speech and hearing impairments. The dataset opens up avenues for developing automated translators and interpreters, based on computer vision and deep learning algorithms, that enable effective communication between fingerspelling users and non-users. One thousand images representing each unique finger flexion motion for every Tamil character were collected, constituting a large dataset of 248 classes with a total of 2,55,155 images. The images were contributed by 120 individuals from different age groups. The dataset is publicly available at: https://data.mendeley.com/datasets/39kzs5pxmk/2.

Keywords: Computer vision; Image dataset; Indian sign language; Tamil.
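
For readers who download the dataset, the class and image counts quoted in the abstract can be checked with a short script. The following is a minimal sketch only: it assumes the images are stored in one subfolder per class under a single root directory, and that they use common image file extensions. Neither detail is stated in the abstract, so the function names, the extension list, and the root path are all hypothetical and may need adjusting to the actual folder layout.

```python
from collections import Counter
from pathlib import Path

# Assumption: image files use one of these extensions (not stated in the abstract).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Map each class folder directly under `root` to its number of image files.

    Assumes a one-subfolder-per-class layout; nested folders are searched too.
    """
    counts = Counter()
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip loose files such as a README at the top level
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def summarize(counts):
    """Return the number of classes and the total number of images."""
    return {"classes": len(counts), "images": sum(counts.values())}
```

Calling `summarize(count_images_per_class("path/to/TLFS23"))`, where the path is a placeholder for the local download location, gives totals that can be compared with the 248 classes and 2,55,155 images reported above.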

Figures

Fig. 1
Folder layout of the TLFS23 Tamil language fingerspelling dataset.
Fig. 2
Setup used for collecting images.
Fig. 3
Data collection examples: images logged in the dataset are shown in (a), (c), (e) and (g), and the corresponding reference images in (b), (d), (f) and (h), respectively. (a) and (b) represent the Tamil character Ñi; (c) and (d) represent Ngā; (e) and (f) represent Sē; (g) and (h) represent Ni.
Fig. 4
Flowchart of the code used for data acquisition.

