A dataset for voice-based human identity recognition
- PMID: 35356317
- PMCID: PMC8958529
- DOI: 10.1016/j.dib.2022.108070
Abstract
This paper introduces a new English speech dataset suitable for training and evaluating speaker recognition systems. Samples were collected over two months from non-native English speakers from the Arab region. The dataset is divided into two sub-datasets, with ten samples collected from each speaker for each sub-dataset. The first sub-dataset contains samples of speakers repeating the phrase "Machine learning 1, 2, 3, 4, 5, 6, 7, 8, 9, 10". The second sub-dataset contains samples of the same speakers speaking randomly for five to ten seconds per sample. In total, the dataset covers 150 speakers, with 3,000 samples and about six hours of speech.
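The totals quoted above can be cross-checked with a little arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, and the six-hour figure is approximate ("about six hours"), so the resulting mean duration is an estimate across both sub-datasets rather than a value reported by the authors.

```python
# Figures taken from the abstract; names are hypothetical, for illustration only.
SPEAKERS = 150
SUB_DATASETS = 2           # same-phrase and random-speech sub-datasets
SAMPLES_PER_SPEAKER = 10   # per speaker, per sub-dataset
TOTAL_HOURS = 6.0          # "about six hours", so this is approximate


def total_samples(speakers: int, sub_datasets: int, samples_per_speaker: int) -> int:
    """Number of recordings implied by the collection protocol."""
    return speakers * sub_datasets * samples_per_speaker


def mean_sample_seconds(total_hours: float, n_samples: int) -> float:
    """Average recording length in seconds, given total speech time."""
    return total_hours * 3600 / n_samples


if __name__ == "__main__":
    n = total_samples(SPEAKERS, SUB_DATASETS, SAMPLES_PER_SPEAKER)
    print(n)                                    # 3000
    print(mean_sample_seconds(TOTAL_HOURS, n))  # roughly 7.2 seconds
```

The product matches the stated 3,000 samples, and the implied average of roughly seven seconds per sample is consistent with the five-to-ten-second range given for the second sub-dataset.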
Keywords: Applied machine learning; Audio dataset; Different phrase; FLAC; Same phrase; Voice recognition.
© 2022 The Author(s). Published by Elsevier Inc.
Conflict of interest statement
The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.