THE IMPACT OF WINDOW FUNCTION ON IDENTIFICATION OF SPEAKER EMOTIONAL STATE

Paweł Powroźnik

pawel.powroznik@pollub.edu.pl
Politechnika Lubelska, Instytut Informatyki (Poland)

Dariusz Czerwiński


Politechnika Lubelska, Instytut Informatyki (Poland)

Abstract

The article examines how the window function used to compute the spectrogram affects the identification of emotional state in Polish speech. The following window functions were tested: Hamming, Gauss, Dolph–Chebyshev, Blackman, Nuttall, and Blackman–Harris. The article also describes how the spectrograms were processed by an artificial neural network (ANN). The results made it possible to assess the effectiveness of ANN-based identification. The average efficiency ranged from 70% to more than 87%.
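To make the role of the window function concrete, the following is a minimal sketch of how a magnitude spectrogram can be computed with several of the windows named above. It is not the authors' implementation. The frame length, hop size, Gaussian width, test tone, and the helper names `make_window` and `spectrogram` are all assumptions chosen for illustration. The cosine-sum coefficients are commonly published values, which the abstract does not confirm were the ones used, and the Dolph–Chebyshev window is omitted because it needs a separate Chebyshev-polynomial construction.

```python
import cmath
import math

# Commonly published cosine-sum coefficients (assumed; the abstract gives none).
COSINE_SUM = {
    "hamming": (0.54, 0.46),
    "blackman": (0.42, 0.50, 0.08),
    "nuttall": (0.3635819, 0.4891775, 0.1365995, 0.0106411),
    "blackman-harris": (0.35875, 0.48829, 0.14128, 0.01168),
}


def make_window(name, n, sigma=None):
    """Return a symmetric window of length n as a list of floats."""
    if name == "gauss":
        # The width is an assumption: sigma defaults to one sixth of the frame.
        sigma = n / 6.0 if sigma is None else sigma
        mid = (n - 1) / 2.0
        return [math.exp(-0.5 * ((i - mid) / sigma) ** 2) for i in range(n)]
    coeffs = COSINE_SUM[name]
    # w[i] = a0 - a1*cos(2*pi*i/(n-1)) + a2*cos(4*pi*i/(n-1)) - ...
    return [
        sum((-1) ** k * a * math.cos(2 * math.pi * k * i / (n - 1))
            for k, a in enumerate(coeffs))
        for i in range(n)
    ]


def spectrogram(signal, window, hop):
    """Magnitude spectrogram: one list of DFT magnitudes per windowed frame."""
    n = len(window)
    bins = n // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - n + 1, hop):
        seg = [signal[start + i] * window[i] for i in range(n)]
        # Naive DFT, adequate for a short illustrative frame.
        frames.append([
            abs(sum(seg[i] * cmath.exp(-2j * math.pi * b * i / n)
                    for i in range(n)))
            for b in range(bins)
        ])
    return frames


# Illustrative input: a 1 kHz tone sampled at 8 kHz (both values assumed).
fs, frame_len, hop = 8000, 64, 32
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(512)]

peak_bins = {}
for name in ("hamming", "gauss", "blackman", "nuttall", "blackman-harris"):
    spec = spectrogram(tone, make_window(name, frame_len), hop)
    first = spec[0]
    peak_bins[name] = max(range(len(first)), key=first.__getitem__)
```

Every window places the tone's peak in the same frequency bin. What differs between them is how much energy leaks into neighbouring bins, and so how the spectrogram image passed to a classifier looks.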


Keywords:

window function, artificial neural networks, Polish emotional speech recognition


Published
2017-12-21

Powroźnik, P., & Czerwiński, D. (2017). THE IMPACT OF WINDOW FUNCTION ON IDENTIFICATION OF SPEAKER EMOTIONAL STATE. Informatyka, Automatyka, Pomiary W Gospodarce I Ochronie Środowiska, 7(4), 96–100. https://doi.org/10.5604/01.3001.0010.7371
