Researchers from HSE University (the Higher School of Economics) and the AIRI Artificial Intelligence Research Institute have developed a method for fine-tuning neural networks that speeds up the adaptation of models to new tasks. The technique, called GSOFT, is built on grouping and shuffling: the computation is split into groups that are processed separately and then mixed, which reduces computational costs without sacrificing quality.
Established approaches to fine-tuning neural networks, such as LoRA or BOFT, require significant resources, especially when working with large models. The Russian scientists have proposed an alternative: Group-and-Shuffle (GS) matrices, which divide the values being transformed into groups, process each group separately, and then shuffle and recombine the results.
"We figured out how to form orthogonal matrices using only two matrices of a special type, instead of five or six as in previous approaches. This saves resources and training time," the researchers said.
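The idea of building one orthogonal matrix from two structured factors can be illustrated with a short NumPy sketch. It is not the authors' implementation. It assumes that the "matrices of a special type" are block-diagonal matrices with orthogonal blocks and that the shuffle is a fixed permutation placed between them. The helper names (`cayley_orthogonal`, `block_diag_orthogonal`, `shuffle_matrix`) and the use of the Cayley transform to obtain orthogonal blocks are choices made for this illustration only.

```python
import numpy as np

def cayley_orthogonal(a):
    """Turn any square matrix into an orthogonal one via the Cayley transform."""
    skew = a - a.T                       # skew-symmetric matrix built from a
    eye = np.eye(a.shape[0])
    # (I + S)^{-1} (I - S) is orthogonal whenever S is skew-symmetric.
    return np.linalg.solve(eye + skew, eye - skew)

def block_diag_orthogonal(rng, n_blocks, block_size):
    """Block-diagonal matrix whose diagonal blocks are each orthogonal (the 'group' step)."""
    n = n_blocks * block_size
    out = np.zeros((n, n))
    for i in range(n_blocks):
        s = slice(i * block_size, (i + 1) * block_size)
        out[s, s] = cayley_orthogonal(rng.standard_normal((block_size, block_size)))
    return out

def shuffle_matrix(n_blocks, block_size):
    """Fixed permutation that spreads the entries of each group across all groups."""
    n = n_blocks * block_size
    perm = np.arange(n).reshape(n_blocks, block_size).T.reshape(-1)
    return np.eye(n)[perm]

rng = np.random.default_rng(0)
n_blocks, block_size = 4, 4

left = block_diag_orthogonal(rng, n_blocks, block_size)    # first structured factor
right = block_diag_orthogonal(rng, n_blocks, block_size)   # second structured factor
q = left @ shuffle_matrix(n_blocks, block_size) @ right    # group, shuffle, group

# The product of orthogonal matrices is orthogonal.
print(np.allclose(q.T @ q, np.eye(n_blocks * block_size)))

# Trainable entries: two block-diagonal factors versus one dense matrix.
print(2 * n_blocks * block_size**2, "vs", (n_blocks * block_size) ** 2)
```

In this toy configuration the two factors hold 128 entries, compared with 256 for a dense 16 by 16 matrix, and the shuffle between them lets information from any group reach any other group.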
The GSOFT method was tested on a range of tasks, including fine-tuning the RoBERTa language model and image generation. Compared with its counterparts, it showed higher accuracy at lower memory and time costs. An additional variant, Double GSOFT, adjusts the parameters from two sides, which makes the fine-tuned model more flexible.
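A two-sided adjustment can be pictured with the following NumPy sketch. It assumes that "from two sides" means multiplying a frozen weight matrix by one orthogonal matrix on the left and another on the right. The random orthogonal matrices here are stand-ins for learned ones and are not GS matrices, and all names (`random_orthogonal`, `w`, `q_out`, `q_in`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_orthogonal(n):
    """Stand-in for a learned orthogonal matrix: the Q factor of a random Gaussian matrix."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

def singular_values(m):
    return np.linalg.svd(m, compute_uv=False)

w = rng.standard_normal((6, 4))   # hypothetical frozen pretrained weight, shape (out, in)
q_out = random_orthogonal(6)      # acts on the output side
q_in = random_orthogonal(4)       # acts on the input side

w_one_sided = q_out @ w           # adjustment from one side only
w_two_sided = q_out @ w @ q_in    # adjustment from both sides

# Orthogonal factors rotate the weight matrix without changing its singular values.
print(np.allclose(singular_values(w), singular_values(w_two_sided)))
```

The second factor adds a rotation on the input side that a one-sided update cannot express, while the singular values of the original weights stay unchanged in both cases.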
"We tested the method in various scenarios, from language and generative models to robust convolutional networks. In each of them, it worked reliably and with lower resource costs. This confirms that the method can be used for different purposes," the authors noted.
The researchers also tested their method on convolutional neural networks, which are commonly used for image and video analysis, for example, in face recognition systems. They developed GS matrices that can be used even in situations where the model needs to be resistant to noise and distortion.
The versatility of the approach allows it to be applied in a variety of fields, from improving language models to building robust image recognition systems. This opens up new opportunities for developers who need to adapt AI solutions quickly to changing tasks.