Yandex invents a new way to compress neural networks

The method will help businesses cut the cost of deploying AI systems

Developers at Yandex have made new methods for compressing language models publicly available. Adopting them will allow businesses to cut the computing costs of running neural networks by a factor of eight. The Yandex Research solution is aimed at companies that run neural networks on their own infrastructure.

A language model typically needs many powerful GPUs to run quickly. Yandex specialists have created a solution that reduces the computing power required. It also includes a mechanism for correcting the errors introduced when a large language model is compressed.

The company says the solution preserves 95% of the quality of the neural network's responses. The code has been published on GitHub.
