Ghadeer-speech-crowd-corpus: Speech dataset
- PMID: 39850367
- PMCID: PMC11754659
- DOI: 10.1016/j.dib.2024.111201
Abstract
The availability of raw data is a considerable challenge across most branches of science. Without data, neither experiments nor development can proceed, yet raw data are still lacking in many scientific fields. A literature survey conducted at the beginning of our study indicated a significant lack of Arabic speech datasets. This study addresses that gap by presenting a new Arabic and English dataset called Ghadeer-Speech-Crowd-Corpus. The dataset was designed to serve more than one branch of speech processing, including crowd speaker identification, speech synthesis (text-to-speech), and speech recognition (speech-to-text). Speech samples were recorded over three months from 210 Iraqi Arab citizens living in different parts of Iraq and cover more than one accent. The dataset is fully balanced with respect to speaker sex and recording language (the same number of Arabic and English recordings). It is a mono dataset containing 15,626 audio samples recorded at a sampling rate of 44,100 Hz, a bit depth of 16 bits, and a bit rate of 705.6 kb/s. The recordings were made at the Academy for Media Training of the College of Media, University of Baghdad.
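The stated bit rate follows from the other recording parameters. The sketch below is a minimal check, assuming the audio is stored as uncompressed PCM (the abstract does not name the file format); the function name `pcm_bit_rate_kbps` is hypothetical and used only for illustration.

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Bit rate of uncompressed PCM audio in kb/s (1 kb = 1000 bits).

    Assumption: uncompressed PCM, so
    bit rate = sample rate x bit depth x number of channels.
    """
    return sample_rate_hz * bit_depth * channels / 1000


# Parameters reported in the abstract: 44,100 Hz, 16-bit, mono (1 channel).
rate = pcm_bit_rate_kbps(sample_rate_hz=44_100, bit_depth=16, channels=1)
print(f"{rate} kb/s")  # prints "705.6 kb/s"
```

Under that assumption, 44,100 samples/s × 16 bits × 1 channel gives 705,600 b/s, which matches the 705.6 kb/s reported for the dataset.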
Keywords: Arabic phrase; English phrase; Low-resource languages; Speech recognition.
© 2024 Published by Elsevier Inc.
