AI service for creating audiobooks developed at NSU

A neural network narrates texts from the university's electronic library

Researchers at Novosibirsk State University (NSU) have launched a service that automatically creates audio versions of books from the university library's digitized collection. The project builds on work done at NSU's Research Center for Artificial Intelligence (AI Center), the university's press service said.

The text is extracted from a PDF file, processed, and then narrated by a neural network. The university plans to convert about 7,000 publications from the electronic library into audio format.
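The article does not say how the "processed" step works. As a rough illustration only, text pulled from a PDF usually needs cleanup before it is passed to a speech model: words hyphenated across line breaks are rejoined, stray line breaks are removed, and the text is cut into short pieces. The function names `clean_extracted_text` and `split_into_chunks` and the 500-character limit below are assumptions made for this sketch, not details of the NSU service.

```python
import re


def clean_extracted_text(raw: str) -> str:
    """Tidy raw PDF text (hypothetical helper, not part of the NSU service)."""
    # Rejoin words hyphenated across a line break: "univer-\nsity" -> "university".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Blank lines mark paragraph boundaries; single line breaks become spaces.
    paragraphs = re.split(r"\n\s*\n", text)
    paragraphs = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in paragraphs if p)


def split_into_chunks(text: str, max_chars: int = 500) -> list[str]:
    """Group whole sentences into chunks of at most max_chars where possible."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}" if current else sentence
    if current:
        chunks.append(current)
    return chunks


raw = "The univer-\nsity library\nholds books.\n\nSecond para."
cleaned = clean_extracted_text(raw)
print(split_into_chunks(cleaned, max_chars=40))
```

Each resulting chunk would then be handed to whatever speech model is in use. A sentence longer than `max_chars` is kept whole rather than cut mid-sentence.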

  "For large-scale use, it is important that the solution is not resource-intensive: each book takes about half an hour of processor time, and that is on a 16-core processor with no graphics card at all."
Evgeny Pavlovsky, leading researcher at the NSU Artificial Intelligence Center

The AI service is built on the "Kappa" framework, developed by the NSU AI Center, which is used to manage datasets and AI models. The framework checks models for correctness and reduces the risk of errors.

During the pilot phase, 100 books were narrated. The project team is now waiting for feedback from the university library and from listeners.

The developers estimate that about 7,000 books can be converted into audio format within a month. Checking the results, however, will take at least a year. In the future, they plan to extend the project to other electronic libraries.
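The one-month estimate can be checked against the figures quoted above with a short calculation. This is only a back-of-the-envelope sketch: it assumes that "half an hour of processor time" means half an hour of wall-clock time on one 16-core machine and that machines run around the clock for 30 days, neither of which the article states. The function name `conversion_estimate` is hypothetical.

```python
import math


def conversion_estimate(books, hours_per_book, deadline_days, hours_per_day=24.0):
    """Return (total machine-hours, machines needed to finish by the deadline)."""
    total_hours = books * hours_per_book
    hours_available = deadline_days * hours_per_day  # per machine
    return total_hours, math.ceil(total_hours / hours_available)


# Figures from the article: about 7,000 books, about 0.5 hours each.
# The 30-day, 24-hours-a-day schedule is an assumption.
total, machines = conversion_estimate(books=7000, hours_per_book=0.5, deadline_days=30)
print(total, machines)  # 3500.0 5
```

Under these assumptions the full collection needs about 3,500 machine-hours, which fits into a month on roughly five such machines running in parallel.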
