Russian scientists have developed AI that recognizes emotions in speech with high accuracy

The CA-SER algorithm was developed by a group of researchers from the Sberbank AI laboratory, the AIRI Institute, and MIPT

Russian scientists have developed a new artificial intelligence model that recognizes emotions in human speech with a high level of accuracy. In tests, the model significantly outperformed almost all comparable systems when working with the most complex forms of content.

"The source code of the model is publicly available, so other scientists can use the tool in their research to conduct additional experiments, test the model's performance with other languages and datasets, and increase its versatility and applicability in real-world conditions. Thus, the model can be trained on Russian-language emotional corpora and then used in voice assistants and contact centers," the statement says.

The new algorithm, called CA-SER, was developed by a group of researchers from the Sberbank Artificial Intelligence Laboratory, the AIRI Institute, and MIPT. The artificial intelligence system they created is based on the self-supervised learning (SSL) paradigm and combines several approaches that are actively used today in the analysis of spoken language and for emotion recognition.

First, the system extracts important characteristics of speech and then supplements them with data about the sound of the voice, including its volume and tone, taking into account which part of the audio spectrum humans perceive best. A special mechanism combines these two types of information, linking the general characteristics of speech with its detailed features, which helps determine the speaker's emotions more accurately.
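The two-stream fusion described above can be illustrated with a short sketch. This is not the CA-SER source code: it assumes that the "special mechanism" is a single-head cross-attention layer and that the perceptual weighting of the spectrum resembles the mel scale, neither of which the statement confirms. All array shapes, weight matrices, and names such as `ssl_feats` and `spec_feats` are hypothetical, and the inputs are random numbers rather than real audio.

```python
import numpy as np


def hz_to_mel(freq_hz):
    """Map frequency in Hz to the mel scale (assumed perceptual weighting)."""
    return 2595.0 * np.log10(1.0 + np.asarray(freq_hz, dtype=float) / 700.0)


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention.

    queries: (T_q, d_q) frames of one stream (here: general speech features)
    context: (T_c, d_c) frames of the other stream (here: spectral features)
    Returns the fused frames (T_q, d_model) and the attention weights (T_q, T_c).
    """
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)

# Hypothetical inputs: 50 frames of 768-dim SSL features and
# 120 frames of 40-dim spectral features (random stand-ins for real audio).
ssl_feats = rng.normal(size=(50, 768))
spec_feats = rng.normal(size=(120, 40))

# Hypothetical projection matrices; a trained model would learn these.
d_model = 64
w_q = rng.normal(size=(768, d_model)) * 0.02
w_k = rng.normal(size=(40, d_model)) * 0.02
w_v = rng.normal(size=(40, d_model)) * 0.02

# Each general-feature frame attends over all spectral frames.
fused, weights = cross_attention(ssl_feats, spec_feats, w_q, w_k, w_v)

# Average over time to get one vector per utterance for an emotion classifier.
utterance_vec = fused.mean(axis=0)
```

In this sketch the general speech features act as queries and the spectral features as keys and values, so each frame of the first stream is enriched with the detailed acoustic information most relevant to it. A real system would learn the projections during training and feed `utterance_vec` to a classifier.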

The researchers tested this AI system and nine other similar systems on samples from the IEMOCAP database. It includes an extensive set of audio recordings, video clips, text transcripts, and other multimedia data covering a wide range of human emotions.

These tests showed that the Russian scientists' model significantly outperformed almost all other AI systems and performed comparably to the more complex HuBERT transformer neural network.

Sources
TASS
