Sensors (Basel). 2024 Apr 3;24(7):2281.
doi: 10.3390/s24072281.

Detection and Recognition of Voice Commands by a Distributed Acoustic Sensor Based on Phase-Sensitive OTDR in the Smart Home Concept

Tatyana V Gritsenko et al. Sensors (Basel). 2024.

Abstract

In recent years, attention to the realization of a distributed fiber-optic microphone for the detection and recognition of the human voice has increased, whereby the most popular schemes are based on φ-OTDR. Many issues related to the selection of optimal system parameters and the recognition of registered signals, however, are still unresolved. In this research, we conducted theoretical studies of these issues based on the φ-OTDR mathematical model and verified them with experiments. We designed an algorithm for fiber sensor signal processing, applied a testing kit, and designed a method for the quantitative evaluation of our obtained results. We also proposed a new setup model for lab tests of φ-OTDR single coordinate sensors, which allows for the quick variation of their parameters. As a result, it was possible to define requirements for the best quality of speech recognition; estimation using the percentage of recognized words yielded a value of 96.3%, and estimation with Levenshtein distance provided a value of 15.
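
The two quality measures named in the abstract can be illustrated with a minimal sketch. The abstract does not say how either was computed, so the definitions below are assumptions: the Levenshtein distance is taken at the character level between a reference transcript and the recognized text, and the percentage of recognized words is taken as the share of reference words that also occur in the recognized text. The function names `levenshtein` and `percent_words_recognized` are hypothetical and do not come from the paper.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn `a` into `b`.
    Assumption: character-level comparison; the paper may differ."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def percent_words_recognized(reference: str, hypothesis: str) -> float:
    """Share of reference words found in the hypothesis, in percent.
    Assumption: case-insensitive, order-independent matching in which
    each hypothesis word can match at most one reference word."""
    ref_words = reference.lower().split()
    if not ref_words:
        return 0.0
    available = Counter(hypothesis.lower().split())
    matched = 0
    for word in ref_words:
        if available[word] > 0:
            available[word] -= 1
            matched += 1
    return 100.0 * matched / len(ref_words)
```

Under these assumptions, a higher percentage and a lower distance both indicate better recognition, so the two numbers are read in opposite directions when comparing sensing fiber configurations.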

Keywords: acoustic monitoring; distributed fiber-optic sensor; fiber-optic sensor; machine learning; phi-OTDR; speech recognition.

Conflict of interest statement

The authors declare no conflicts of interest. The funders had no role in the design of the study, in the collection, analysis, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

Figures

Figure 1
Scheme of DAS interrogation for a smart city (left) and smart home (right).
Figure 2
(a) Scheme of a distributed fiber microphone based on a φ-OTDR; (b) waterfall of backscattered intensity (in a false-color scale) as a function of time and coordinate; and (c) backscattered intensity as a function of time for a specific coordinate.
Figure 3
(a) Interference signal before preprocessing; (b) signal after preprocessing; (c) spectrum of the simulated interference signal before (blue trace) and after (red trace) filtering; (d) spectrogram of the speech signal after preprocessing.
Figure 4
Block diagram of the algorithm used for processing the φ-OTDR setup signals.
Figure 5
Generalized experimental setup.
Figure 6
Experimental setup for studies of the quality of speech recognition depending on the sampling frequency for the hollow PZT cylinder with sensing fiber and the coiled sensing fiber configurations: (a) components and their interconnection in the experimental setup; (b) photo of the experimental setup.
Figure 7
Experimental setup for studies of the quality of speech recognition depending on the sampling frequency for the sensing fiber placed simply on a table: (a) the experimental setup circuit; (b) the use of a metal plate to increase the sensitivity of the system.
Figure 8
Experimental setup for studies of the quality of speech recognition depending on the sampling frequency using a sensing fiber with a length of 2.5 m wound around an elastic horn-like core, influenced by speakers with a sound volume of 72 dB(C): (a) with the bottle bottom influenced by the speakers; (b) with the bottle sidepiece influenced by the speakers; (c) with the horn-like bottle without a bottom influenced by the speakers from inside.
Figure 9
Experimental setup circuit for studies of the quality of speech recognition depending on the sampling frequency for the sensing fiber with a pair of wFBGs.
Figure 10
Spectral characteristics of the original audio recording: (a) spectrum; (b) spectrogram.
Figure 11
Spectrograms of signals obtained with a sampling frequency of 40 kHz: (a) a PZT-actuated disturbance; (b) a coiled sensing fiber, with a volume of 92 dB(C); (c) a sensing fiber placed simply on the table, with a volume of 89 dB(C); (d) a sensing fiber section 0.8 m long glued to a metal plate, with a volume of 89 dB(C); (e) a bottle bottom influenced by the speakers, with a volume of 72 dB(C); (f) a bottle sidepiece influenced by the speakers, with a volume of 72 dB(C); (g) a horn-like bottle without a bottom influenced by the speakers from inside, with a volume of 72 dB(C); (h) a sensing fiber with wFBGs 1 m apart with a preamplifier, influenced by speakers, with a volume of 108 dB(C); (i) a sensing fiber with wFBGs 1 m apart without a preamplifier, influenced by speakers, with a volume of 108 dB(C).
Figure 12
Dependence of speech recognition quality on ADC sampling frequency for different sensing fiber configurations: (a) the percentage of words recognized by Yandex SpeechKit; (b) the Levenshtein distance of the words recognized with Yandex SpeechKit; (c) the percentage of words recognized by Whisper NN; and (d) the Levenshtein distance of the words recognized with Whisper NN.
