Data Brief. 2022 Mar 18;42:108070.
doi: 10.1016/j.dib.2022.108070. eCollection 2022 Jun.

A dataset for voice-based human identity recognition

Baha' A Alsaify et al. Data Brief.

Abstract

This paper introduces a new English speech dataset suitable for training and evaluating speaker recognition systems. Samples were collected over two months from non-native English speakers from the Arab region. The dataset is divided into two sub-datasets, with ten samples collected from each speaker for each sub-dataset. The first sub-dataset contains samples of speakers repeating the phrase "Machine learning 1, 2, 3, 4, 5, 6, 7, 8, 9, 10". The second sub-dataset contains samples of the same speakers speaking randomly for five to ten seconds per sample. In total, the dataset comprises 150 speakers, 3,000 samples, and about six hours of speech.

Keywords: Applied machine learning; Audio dataset; Different phrase; FLAC; Same phrase; Voice recognition.
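
The counts given in the abstract can be cross-checked with a short calculation. The sketch below is illustrative only: the constants come from the abstract, the sub-dataset names come from the figure captions, and the `(speaker, sub-dataset, sample)` index is a hypothetical enumeration that makes no assumption about how the published files are actually named or laid out. Because the abstract says "about six hours", the mean duration is approximate.

```python
from itertools import product

# Figures stated in the abstract.
NUM_SPEAKERS = 150
SAMPLES_PER_SPEAKER_PER_SUBSET = 10
TOTAL_SPEECH_HOURS = 6  # "about six hours", so treat as approximate

# Sub-dataset names as they appear in the figure captions.
SUBSETS = ("samePhrase", "differentPhrase")


def total_samples() -> int:
    """Speakers x sub-datasets x samples per speaker per sub-dataset."""
    return NUM_SPEAKERS * len(SUBSETS) * SAMPLES_PER_SPEAKER_PER_SUBSET


def mean_sample_seconds() -> float:
    """Approximate mean duration of one sample, in seconds."""
    return TOTAL_SPEECH_HOURS * 3600 / total_samples()


def build_index() -> list[tuple[int, str, int]]:
    """Hypothetical (speaker_id, subset, sample_id) index, all 1-based.

    This is an enumeration for illustration, not the dataset's file layout.
    """
    speakers = range(1, NUM_SPEAKERS + 1)
    samples = range(1, SAMPLES_PER_SPEAKER_PER_SUBSET + 1)
    return list(product(speakers, SUBSETS, samples))


if __name__ == "__main__":
    print(total_samples())        # 150 * 2 * 10
    print(mean_sample_seconds())  # 21600 s / 3000 samples
    print(len(build_index()))
```

The total works out to 3,000 samples, matching the abstract, and the implied mean of roughly 7.2 seconds per sample falls inside the five-to-ten-second range stated for the second sub-dataset.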

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships which have or could be perceived to have influenced the work reported in this article.

Figures

Fig. 1. Architecture of the dataset.
Fig. 2. Different speakers saying the same phrase in the samePhrase sub-dataset.
Fig. 3. Different speakers saying different phrases in the differentPhrase sub-dataset.
Fig. 4. The same speaker saying the same phrase in the samePhrase sub-dataset.
Fig. 5. The same speaker saying different phrases in the differentPhrase sub-dataset.
