AI Learns to Analyze Voice, Facial Expressions, and Speech Simultaneously, Developed by HSE University and Sberbank Specialists

Unique algorithm proves 10% more accurate than existing counterparts

A new artificial intelligence system developed in Russia recognizes human emotions more accurately than existing counterparts. Its distinguishing feature is the simultaneous analysis of three sources of information: facial expressions, voice, and speech. This comprehensive assessment makes the system 10% more accurate than the best existing algorithms, which rely on a single source of data.

Andrey Savchenko, Scientific Director of the Sberbank Center for Practical Artificial Intelligence, said that the new technology is already demonstrating impressive results in tests. In the future, it could be adapted for use in virtual assistants, security systems, and telemedicine. A key advantage of the system is its flexibility: it works even when data is scarce, for example, when the user's face is not visible or the voice is hard to hear.
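The robustness to missing data described above is typical of so-called late-fusion approaches, where each modality produces its own emotion probabilities and the system combines whatever is available. The sketch below illustrates this general idea only; the function names, emotion labels, and simple averaging are assumptions for illustration, not the authors' actual model.

```python
import numpy as np

# Hypothetical emotion labels for illustration.
EMOTIONS = ["neutral", "happy", "sad", "angry"]

def fuse_predictions(face=None, voice=None, text=None):
    """Average per-modality probability vectors, skipping any missing modality.

    Each argument is a probability vector over EMOTIONS (or None if that
    channel is unavailable, e.g. the face is not visible).
    """
    available = [np.asarray(p, dtype=float) for p in (face, voice, text) if p is not None]
    if not available:
        raise ValueError("at least one modality is required")
    fused = np.mean(available, axis=0)
    return EMOTIONS[int(np.argmax(fused))], fused

# Example: the face is hidden, so only voice and text are fused.
voice_p = np.array([0.1, 0.2, 0.1, 0.6])
text_p = np.array([0.2, 0.1, 0.1, 0.6])
label, probs = fuse_predictions(voice=voice_p, text=text_p)
print(label)  # angry
```

With simple averaging, dropping a channel degrades the estimate gracefully instead of breaking it, which is one common way such systems handle data scarcity.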

The development was carried out by Andrey Savchenko and his colleague Alexey Andreev from HSE University (Nizhny Novgorod). The system's architecture takes into account changes in emotional state over time, which improves its effectiveness. Unlike most emotion recognition technologies, it processes several channels of information at once: facial expressions, voice characteristics, and the structure of speech.
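Tracking emotional state over time, as mentioned above, can be illustrated with a minimal temporal-smoothing sketch: frame-by-frame predictions are blended so that a single noisy frame shifts the estimate gradually rather than abruptly. This is an illustrative assumption, not the authors' architecture, which may use recurrent or attention-based models instead.

```python
import numpy as np

def smooth_over_time(frame_probs, alpha=0.3):
    """Exponentially smooth per-frame emotion probabilities.

    frame_probs: list of probability vectors, one per video/audio frame.
    alpha: weight of the newest frame (higher = reacts faster).
    Returns the smoothed probability vector for each frame.
    """
    state = np.asarray(frame_probs[0], dtype=float)
    smoothed = [state.copy()]
    for p in frame_probs[1:]:
        # Blend the new observation with the accumulated state.
        state = alpha * np.asarray(p, dtype=float) + (1 - alpha) * state
        smoothed.append(state.copy())
    return smoothed

# Example: a sudden flip in frame 3 is absorbed gradually.
frames = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
result = smooth_over_time(frames)
```

Smoothing of this kind is one simple way a system can reflect how an emotional state evolves rather than reacting to each frame in isolation.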

According to the scientists, their development can be useful not only in marketing, but also in the field of security, where AI can help identify aggression or panic.
