Now it speaks Russian: Kandinsky Image has learned to generate images with Cyrillic inscriptions

The neural network has learned to write without errors and on any surface

Sber has released an update to its Kandinsky image generation model, which can now render inscriptions in Cyrillic that take the texture of the surface into account.

The developers trained the neural network on more than 10 million images containing Russian text written in various ways. This taught the model to distinguish between printed and cursive letters.

Kandinsky was first trained to generate Cyrillic text natively, without additional modules, and was then fine-tuned on an expert dataset carefully selected and verified by designers and artists.

The new model still struggles with several categories of requests: long inscriptions, inscriptions that mix Cyrillic and Latin characters, and detailed descriptions of the subject or background may not work on the first try. Short requests that do not specify the background, scale, or angle are processed faster and give better quality, but the model then fills in those details on its own, which often improves the result.

Specifying texture and lighting helps create interesting variations of inscriptions: stone, water, ice, glass, marmalade, old wood, moss, patent leather, a glossy table. Letters can be given relief or volume. For transparent textures, you can specify "backlight" or "contre-jour", and the letters will become translucent. For extra effect, you can add smoke or fog.
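
For readers who want to try these tips systematically, here is a minimal sketch of how a short prompt could be assembled from an inscription, a texture, a lighting term, and an effect. The function name `build_inscription_prompt` and the exact wording of the prompt are our own assumptions for illustration. They are not part of Kandinsky or GigaChat, and the model may respond better to different phrasing.

```python
def build_inscription_prompt(text, texture=None, lighting=None, effect=None):
    """Compose a short prompt for an image with an inscription.

    Hypothetical helper: the phrase template below is an assumption,
    not an official Kandinsky prompt format.
    """
    # Start with the inscription itself, kept short as the article advises.
    parts = [f'inscription "{text}"']
    if texture:
        parts.append(f"made of {texture}")   # e.g. ice, moss, old wood
    if lighting:
        parts.append(lighting)               # e.g. backlight, contre-jour
    if effect:
        parts.append(f"with {effect}")       # e.g. smoke, fog
    return ", ".join(parts)


prompt = build_inscription_prompt(
    "Привет", texture="ice", lighting="backlight", effect="fog"
)
print(prompt)
```

The resulting string can be pasted into any of the bots or the web version. Leaving out the optional arguments gives the kind of short request that the model handles most reliably.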

You can test the model yourself in the Kandinsky Telegram bot and in all GigaChat bots (Telegram, VKontakte, Odnoklassniki, Max), as well as in the web version.

We also tried it and liked the result.
