Sber researchers have developed GigaEmbeddings, a text-embedding model that improves the handling of Russian-language texts. It is built on GigaChat-3B and trained in three stages: pre-training, fine-tuning, and multi-task learning. Architectural optimizations reduced the model's parameter count by 25% without compromising quality.
Until now, businesses have lacked effective tools for analyzing Russian-language text. Existing solutions either required significant computing power or performed poorly on search and classification tasks. GigaEmbeddings is designed to address both problems. The model is suited to smart search in e-commerce, building chatbots with advanced capabilities, analyzing customer requests, and generating recommendations.
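Embedding-based search of the kind mentioned above generally works by turning the query and each document into a vector and ranking documents by how close their vectors are to the query's. The sketch below illustrates only that ranking step. It assumes small hand-written toy vectors in place of real model output, and the names `cosine_similarity`, `rank_documents`, `TOY_DOCS`, and `TOY_QUERY` are hypothetical; nothing here calls GigaEmbeddings or reflects its actual interface.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar to the query."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real embeddings (assumption:
# a real embedding model would produce much longer vectors).
TOY_DOCS = {
    "red sneakers": [0.9, 0.1, 0.0],
    "blue running shoes": [0.7, 0.6, 0.1],
    "coffee maker": [0.0, 0.1, 0.9],
}
TOY_QUERY = [1.0, 0.2, 0.0]  # stands in for an embedded shopper query

ranking = rank_documents(TOY_QUERY, TOY_DOCS)
for doc_id, score in ranking:
    print(f"{score:.3f}  {doc_id}")
```

In a real deployment the toy vectors would be replaced by embeddings produced by the model, and the linear scan would usually give way to a vector index once the catalog grows large.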
"Today, we are addressing a critical market need for high-quality NLP solutions for the Russian language. Our comprehensive platform allows businesses to radically optimize all text-related processes, from basic search and recommendation algorithms to advanced RAG systems in chatbots. [...] Companies are finally getting a unified solution: they no longer need to assemble functionality piecemeal from foreign products."
The model is available on GitVerse and Hugging Face. Its developers expect it to become the standard for the financial sector, retail, and government services.