Tiny AI: Yandex Prepares Voice Model for Headphones and Smartwatches

The system will recognize speech on the device itself without excessive battery drain

Yandex has developed a neural network model for voice control that is approximately 200 KB in size. According to Dmitry Solodukha, head of voice activation, that is smaller than a single photo on a smartphone.

The technology is designed for wearable devices such as headphones, smartwatches, and other gadgets where low power consumption, fast response, and operation without constant processor load are crucial. In such devices, the system must listen to ambient sound continuously and process it locally, without draining the battery or introducing delays.

To achieve this, Yandex engineers applied a two-stage scheme. First, a lightweight model determines whether there is speech in the audio stream. Only when speech is detected does the main neural network, which is responsible for command recognition, activate. Because the heavier model stays idle while no one is speaking, this approach reduces the load on the device.
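The two-stage idea can be illustrated with a minimal Python sketch. It is not Yandex's implementation: the energy threshold stands in for the lightweight speech detector (which the article describes as a model), and the names `is_speech`, `run_cascade`, and the `recognize` callback are hypothetical. The point is only that the expensive recognizer is called on speech segments and never on silence.

```python
from typing import Callable, Iterable, List, Sequence

Frame = Sequence[float]  # one short chunk of audio samples in [-1.0, 1.0]


def frame_energy(frame: Frame) -> float:
    """Mean squared amplitude of one audio frame."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def is_speech(frame: Frame, threshold: float = 0.01) -> bool:
    """Stage 1 stand-in (assumption): a cheap energy gate.

    A real detector would be a small neural model; the threshold
    value here is arbitrary and chosen only for the demo.
    """
    return frame_energy(frame) > threshold


def run_cascade(
    frames: Iterable[Frame],
    recognize: Callable[[List[Frame]], object],
    threshold: float = 0.01,
) -> list:
    """Call the stage-2 recognizer once per contiguous run of speech frames."""
    results = []
    segment: List[Frame] = []
    for frame in frames:
        if is_speech(frame, threshold):
            segment.append(frame)  # keep buffering while speech continues
        elif segment:
            results.append(recognize(segment))  # speech ended: wake stage 2
            segment = []
    if segment:  # stream ended in the middle of speech
        results.append(recognize(segment))
    return results


if __name__ == "__main__":
    silence = [0.0] * 160
    speech = [0.5, -0.5] * 80
    stream = [silence, silence, speech, speech, speech, silence, speech]
    # Hypothetical recognizer: just reports how many frames it was given.
    print(run_cascade(stream, lambda seg: f"command({len(seg)} frames)"))
```

In this toy stream of seven frames, the recognizer runs twice (once per speech segment) and is never invoked for the silent frames, which is where the power saving would come from.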

Additionally, the developers cut the number of model parameters roughly tenfold by moving to a new architecture. There are also plans to run such solutions on chips with an NPU, a neural processor that accelerates AI computations and consumes less energy than a conventional CPU.
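For a sense of scale, the short Python sketch below estimates how many parameters fit in the roughly 200 KB reported above. The article does not state the model's weight precision or file format, so the bit widths are assumptions, as are the 1 KB = 1024 bytes convention and the choice to ignore any file overhead. The function name `params_that_fit` is hypothetical.

```python
def params_that_fit(model_bytes: int, bits_per_weight: int) -> int:
    """Upper bound on parameter count if the file held nothing but weights."""
    return (model_bytes * 8) // bits_per_weight


# Assumption: "200 KB" means 200 * 1024 bytes.
MODEL_BYTES = 200 * 1024

if __name__ == "__main__":
    # Assumed precisions; the article does not say which one is used.
    for bits in (32, 16, 8):
        count = params_that_fit(MODEL_BYTES, bits)
        print(f"{bits}-bit weights: about {count:,} parameters")
```

Under these assumptions the budget ranges from about 51,200 parameters at 32 bits to about 204,800 at 8 bits.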

The new model could become part of Yandex's future line of wearable AI devices. The first such gadgets are expected to be Yandex Drops headphones with "Alisa AI" and the "My Memory" function.
