Ghadeer-speech-crowd-corpus: Speech dataset

Ghadeer Qasim Ali¹, Husam Ali Abdulmohsin¹

PMID: 39850367
PMCID: PMC11754659
DOI: 10.1016/j.dib.2024.111201

Ghadeer Qasim Ali et al. Data Brief. 2024 Dec 9:58:111201. doi: 10.1016/j.dib.2024.111201. eCollection 2025 Feb.

Affiliation

¹ Computer Science Department, College of Science, University of Baghdad, Iraq.

Abstract

The availability of raw data is a considerable challenge across most branches of science: without data, neither experiments nor development can proceed. A literature survey conducted at the beginning of our study indicated a significant lack of Arabic speech datasets. This study addresses that gap by proposing a new Arabic and English dataset called Ghadeer-Speech-Crowd-Corpus. The dataset was designed to serve several speech-processing applications, such as crowd speaker identification, speech synthesis (text-to-speech), and speech recognition (speech-to-text). Speech samples were recorded over three months from 210 Iraqi Arab citizens living in different parts of Iraq and cover more than one accent. The dataset is fully balanced with respect to speaker sex and recording language (the same number of Arabic and English recordings). It contains 15,626 mono audio samples recorded at a sampling rate of 44,100 Hz, a bit depth of 16 bits, and a bit rate of 705.6 kb/s. The recordings were made at the Academy for Media Training of the College of Media, University of Baghdad.
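The stated bit rate can be checked against the other recording parameters. The sketch below assumes the files are uncompressed PCM (the abstract does not name the encoding) and that 1 kb means 1,000 bits; the function name `pcm_bit_rate_kbps` is hypothetical and used only for illustration.

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Bit rate of uncompressed PCM audio in kilobits per second.

    Assumption: 1 kb = 1000 bits, and no compression is applied.
    """
    return sample_rate_hz * bit_depth * channels / 1000


# Values stated in the abstract: 44,100 Hz, 16-bit, mono (1 channel).
rate = pcm_bit_rate_kbps(44_100, 16, 1)
print(f"{rate:.1f} kb/s")  # prints "705.6 kb/s"
```

Under these assumptions, 44,100 samples/s × 16 bits × 1 channel gives 705,600 bits/s, which agrees with the 705.6 kb/s reported for the dataset.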

Keywords: Arabic phrase; English phrase; Low-resource languages; Speech recognition.

Figures

**Fig. 1**
The recording devices used: mixer and microphone.

**Fig. 3**
Coding system design. (a) solo speaker, (b) 2 crowd speakers, (c) 3 crowd speakers, (d) 4 crowd speakers, (e) 5 crowd speakers.
