
Proceedings of the Southwest State University. Series: IT Management, Computer Science, Computer Engineering. Medical Equipment Engineering


Development of a method for recognizing emotions from a speech signal

https://doi.org/10.21869/2223-1536-2024-14-2-72-80

Abstract

The purpose of this research is the automatic recognition of a speaker's emotions from audio recordings, for use in alarm systems serving operators of locomotive crews and dispatch services.
Methods. Human emotion recognition has been a rapidly developing research area in recent years. Vocal-tract features, such as signal power and formant frequencies, allow certain emotions to be detected with good accuracy. The method used here determines the signal energy and extracts the dominant frequency. Program code was developed and applied to the analysis of four emotions: anger, joy, fear and calm. The most important and difficult steps are selecting the features best suited to distinguishing emotions and obtaining suitable databases. Collecting a database is a complex task that requires sincere expression of emotions: recordings are often made in an artificial environment, so the speech may sound staged; to avoid such problems, call-center recordings can be used.
Results. Recordings of the basic emotional states anger, joy, sadness, fear and surprise, the set most commonly studied, were obtained and processed. The developed software code brings automatic determination of emotions from a speech signal closer. The speech recordings in the samples were analyzed using signal-energy indicators and identification of the dominant frequency.
Conclusion. The implemented method of monitoring the emotional state of a human operator from the speech signal can be widely applied to maintaining and improving indicators of the psychophysiological professional suitability of locomotive crew workers and preserving their professional health. Distinct differences are observed in the characteristics of all the emotion types studied.
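The two indicators named in the abstract, short-time signal energy and the dominant frequency, can be sketched as follows. This is a minimal NumPy illustration, not the author's program code; the function names, frame length, hop size and Hann windowing are assumptions for the example.

```python
import numpy as np

def frame_energy(signal, frame_len=1024, hop=512):
    """Short-time energy: sum of squared samples per frame.
    frame_len and hop are illustrative choices, not values from the paper."""
    signal = np.asarray(signal, dtype=np.float64)
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array([np.sum(signal[i:i + frame_len] ** 2) for i in starts])

def dominant_frequency(signal, fs):
    """Frequency (Hz) of the largest peak in the magnitude spectrum."""
    signal = np.asarray(signal, dtype=np.float64)
    windowed = signal * np.hanning(len(signal))   # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return freqs[np.argmax(spectrum)]

# Synthetic check: a pure 440 Hz tone sampled at 8 kHz
fs = 8000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)
print(dominant_frequency(tone, fs))   # ≈ 440 Hz
print(frame_energy(tone).shape)       # one energy value per frame
```

For real speech, these indicators would be computed per frame over a recording, and their trajectories compared across the emotion classes.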

About the Author

D. A. Kravchuk
Institute of Nanotechnology, Electronics and Instrumentation of the Southern Federal University
Russian Federation

Denis A. Kravchuk, Doctor of Sciences (Engineering)

2/E Shevchenko Str., Taganrog 347922, Russian Federation 

 





For citations:


Kravchuk D.A. Development of a method for recognizing emotions from a speech signal. Proceedings of the Southwest State University. Series: IT Management, Computer Science, Computer Engineering. Medical Equipment Engineering. 2024;14(2):72-80. (In Russ.) https://doi.org/10.21869/2223-1536-2024-14-2-72-80



This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2223-1536 (Print)