Neural network algorithm for detection tonal, noise and pauses parts of continuous speech

Ivan Yuriiovych Bondarenko
Olha Mykolaivna Ladoshko

Abstract

The problem of automatically detecting tonal, noise and pause segments of speech is considered. To solve this problem, we propose a neural network algorithm that classifies sequences of the frames into which the speech signal is divided. Experiments on the TIMIT and NTIMIT speech corpora were carried out to evaluate the quality, reliability and speed of the algorithm in speaker-independent mode, including under non-stationary noise introduced by the telephone channel.
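As an illustration of the frame-based scheme described in the abstract, the following is a minimal sketch, not the authors' implementation. Everything specific in it is an assumption: the frame length and hop, the two time-domain features (log energy and zero-crossing rate), the context width, the single sigmoid hidden layer, and all function names. The weights are random and untrained, so the predicted labels are meaningless; the sketch only shows how a signal can be cut into frames, how neighbouring frames can be joined into one input vector, and how a small network can map that vector to one of three classes.

```python
import numpy as np

CLASSES = ("tonal", "noise", "pause")

def split_into_frames(signal, frame_len=256, hop=128):
    """Cut a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def frame_features(frames):
    """Assumed features: log energy and zero-crossing rate of each frame."""
    log_energy = np.log(np.sum(frames ** 2, axis=1) + 1e-10)
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    return np.column_stack([log_energy, zcr])

def stack_context(features, context=2):
    """Join each frame with `context` neighbours on each side (edges repeated)."""
    padded = np.pad(features, ((context, context), (0, 0)), mode="edge")
    n = len(features)
    return np.hstack([padded[i:i + n] for i in range(2 * context + 1)])

def mlp_forward(x, w1, b1, w2, b2):
    """One sigmoid hidden layer followed by a softmax over the three classes."""
    hidden = 1.0 / (1.0 + np.exp(-(x @ w1 + b1)))
    logits = hidden @ w2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

# Synthetic 1-second signal at 16 kHz (placeholder for real speech).
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)

frames = split_into_frames(signal)
inputs = stack_context(frame_features(frames))

# Untrained random weights, for shape illustration only.
w1 = rng.standard_normal((inputs.shape[1], 8)) * 0.1
b1 = np.zeros(8)
w2 = rng.standard_normal((8, len(CLASSES))) * 0.1
b2 = np.zeros(len(CLASSES))

probs = mlp_forward(inputs, w1, b1, w2, b2)
labels = [CLASSES[i] for i in probs.argmax(axis=1)]
```

In a real system the weights would be fitted on labelled frames, and the features and context width would be chosen to match the corpus and channel conditions.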

References: 11, figures: 2, tables: 3.

Article Details

How to Cite
Bondarenko, I. Y., & Ladoshko, O. M. (2013). Neural network algorithm for detection tonal, noise and pauses parts of continuous speech. Electronics and Communications, 17(6), 19–25. https://doi.org/10.20535/2312-1807.2012.17.6.11392
Section
Theory of signals and systems

References

Atal B., Rabiner L. A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition // Acoustics, Speech and Signal Processing. – 1976. – Vol.24, Issue 3. – P.201-212.

Jamal Ghasemi, Amard Afzalian, M.R. Karami Mollaei. A Combined Voice Activity Detector Based On Singular Value Decomposition and Fourier Transform // Signal Processing. – 2010. – Vol.4, Issue 1. – P.54-61.

Jankowski C., Kalyanswamy A., Basson S., Spitz J. NTIMIT: A Phonetically Balanced, Continuous Speech, Telephone Bandwidth Speech Database // Proc. of ICASSP-90. – 1990. – P.109-112.

LeCun Y., Bottou L., Orr G., Muller K. Efficient BackProp // Neural Networks: Tricks of the trade. – Springer Verlag, 1998. – P. 5-50.

Martin A., Charlet D., Mauuary L. Robust speech/non-speech detection using LDA applied to MFCC // Proc. of ICASSP'01. – 2001. – Vol.1. – P.237-240.

Wilson D.R., Martinez T.R. The general inefficiency of batch training for gradient descent learning // Neural Networks. – 2003. – Vol.16, Issue 10. – P.1429-1451.

Zue V., Seneff S., Glass J. Speech database development at MIT: TIMIT and beyond // Speech Communication. – 1990. – Vol. 9, № 4. – P.351-356.

Arkhipov I.A., Gitlin V.B., Luzin D.A. An adaptive algorithm for the "tone / not tone" decision, synchronous with the fundamental tone // Speech Technologies. – 2009. – № 1. – P.80-93. (Rus)

Gorban A.N. Generalized approximation theorem and the computational capabilities of neural networks // Siberian Journal of Numerical Mathematics. – 1998. – Vol.1, № 1. – P.12-24. (Rus)

Rabiner L.R., Schafer R. Methods of processing speech signals in the time domain // Digital processing of speech signals. Transl. from English. – Moscow: Radio and Communication, 1981. – P.110-160. (Rus)

Osovsky C. Unidirectional multilayered network of sigmoidal type // Neural networks for information processing. Transl. from Polish. – Moscow: Finance and Statistics, 2004. – P.46-88. (Rus)