Researchers in Russia have developed a neural network, named GENATATOR, that can "read" DNA and automatically build a gene map. It could speed up work on the genomes of organisms for which little biological data has been available.
The system examines a DNA sequence and tries to determine where genes begin and end, what types they are, and how they are structured internally. The task is difficult because genes have no clear "markers" that would immediately reveal their boundaries.
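To make the idea of a "gene map" concrete, here is a minimal sketch of how one annotated gene could be represented: its boundaries, its type, and its internal exon structure. The `Gene` class, its field names, and the coordinates are hypothetical illustrations and are not taken from GENATATOR's actual output format.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    """One annotated gene (hypothetical record; 0-based, end-exclusive coordinates)."""
    start: int
    end: int
    gene_type: str  # e.g. "protein_coding" or "lncRNA"
    exons: list[tuple[int, int]] = field(default_factory=list)

    def introns(self) -> list[tuple[int, int]]:
        """Derive introns as the gaps between consecutive exons."""
        ordered = sorted(self.exons)
        return [
            (prev_end, next_start)
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
        ]


# Made-up example: a gene spanning positions 100..900 with three exons.
gene = Gene(
    start=100,
    end=900,
    gene_type="protein_coding",
    exons=[(100, 250), (400, 520), (700, 900)],
)
print(gene.introns())  # [(250, 400), (520, 700)]
```

An annotation tool would produce many such records per genome; the hard part, which the model addresses, is predicting the coordinates and types from raw sequence alone.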
Unlike older methods, which rely on predefined rules, the model was trained on a large number of genomes. As a result, it can find not only ordinary protein-coding genes but also more complex ones, such as genes for long non-coding RNAs.
The technology is especially useful for "non-model" organisms, for which almost no detailed data exists and only raw genomic sequences are available.
During testing, GENATATOR, which was trained on the genomes of humans and 38 mammalian species, successfully analyzed data from other organisms, including fruit flies (Drosophila), plants, and yeast. It also found rare "poison" exons, whose inclusion can cause an RNA transcript to be destroyed.
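A poison exon is commonly described as an exon that introduces a premature stop codon into the transcript, which marks the RNA for degradation. The sketch below illustrates only that basic idea by scanning an exon for an in-frame stop codon. It is a deliberately simplified assumption and not how GENATATOR detects such exons; the function name, the `phase` parameter, and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(exon_seq: str, phase: int = 0):
    """Return the 0-based offset of the first in-frame stop codon, or None.

    `phase` is the number of leading bases (0, 1, or 2) that complete a codon
    started in the previous exon, so codon reading begins at that offset.
    """
    seq = exon_seq.upper()
    for i in range(phase, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Made-up sequence: read in frame 0, the third codon (TGA) is a stop.
print(first_in_frame_stop("ATGGCCTGAAAC"))           # 6
# The same bases read with phase 1 contain no in-frame stop.
print(first_in_frame_stop("ATGGCCTGAAAC", phase=1))  # None
```

Real annotation has to account for splicing context and the position of the stop codon within the full transcript, which is why a simple scan like this is only a starting point.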
An open leaderboard was created to evaluate the model's quality, and GENATATOR performs well on it. The model itself is available on the Hugging Face platform.