Scientists from Russia and the UK have developed a tool that measures how well artificial intelligence systems actually perform when processing long texts in Russian and English. Here are the details.
The new benchmark assesses the quality of a neural network's responses and how the accuracy of its answers depends on the length of the text. To build it, the scientists took short-text comprehension tasks from the bAbI dataset and embedded them in excerpts from works of fiction.
The tests showed that popular language models effectively use only about 20% of their context length, and that quality drops as the task becomes more complex and the amount of data grows. The scientists believe that data processing needs to be improved, according to the press service of the AIRI Institute.
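The setup described above can be illustrated with a minimal sketch. This is not the researchers' code: the function names `build_sample` and `accuracy_by_length`, the task dictionary layout, and the use of a sentence count as a stand-in for context length are all assumptions made for illustration. The idea is to scatter the facts of a short task among unrelated filler sentences and then check how often a model still answers correctly as the amount of filler grows.

```python
import random


def build_sample(facts, question, answer, filler_sentences, target_len, seed=0):
    """Hide task facts, in their original order, among filler sentences.

    target_len is the number of filler sentences, used here as a
    simple (assumed) proxy for context length.
    """
    rng = random.Random(seed)
    filler = [rng.choice(filler_sentences) for _ in range(target_len)]
    # Sorted insertion points keep the facts in their original order.
    positions = sorted(rng.randint(0, len(filler)) for _ in facts)
    # Insert from the back so earlier positions remain valid.
    for pos, fact in reversed(list(zip(positions, facts))):
        filler.insert(pos, fact)
    return {"context": " ".join(filler), "question": question, "answer": answer}


def accuracy_by_length(model, task, filler_sentences, lengths, trials=20):
    """Return {filler length: share of correct answers} for a model.

    `model` is any callable taking (context, question) and returning a string;
    it is a hypothetical placeholder for a real language model call.
    """
    results = {}
    for n in lengths:
        correct = 0
        for trial in range(trials):
            sample = build_sample(
                task["facts"], task["question"], task["answer"],
                filler_sentences, n, seed=trial,
            )
            prediction = model(sample["context"], sample["question"])
            correct += prediction.strip().lower() == sample["answer"].lower()
        results[n] = correct / trials
    return results
```

Calling `accuracy_by_length` with a growing list of lengths gives an accuracy curve of the kind the benchmark reports: a model that genuinely uses its whole context should stay close to its short-text accuracy, while one that loses track of the hidden facts will decline as the filler grows.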
The AIRI Institute, MIPT, the London Institute for Mathematical Sciences, and SberDevices participated in the study.
Yuri Kuratov, head of the "Models with Memory" group at the AIRI Institute, is confident that the benchmark will help language model developers understand where improvement is needed.