Russian scientists from AIRI, the Artificial Intelligence Research Institute, have developed a universal machine vision system for recognizing three-dimensional objects that performs equally well across different test sets. The development can be used in robotics, augmented reality and 3D scanning.
Previously, a separate model had to be created for each data set, which slowed down development. The new system, based on a pure transformer encoder, simplifies and speeds up this process.
The creation of three-dimensional machine vision systems is held back by small and heterogeneous data sets. The largest data set contains only about 7 thousand scenes, far fewer than the millions of images available for generative models.
To solve the problem, the Russian scientists created a universal neural network based on a transformer encoder, with no optimizations for specific data sets, and carried out a large-scale relabeling to reduce the number of object classes.
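To give a sense of what relabeling across data sets can look like, here is a minimal sketch in Python. It is only an illustration, not the scientists' actual procedure: the data set names, the label names, the shared class list and the `unify_labels` helper are all hypothetical.

```python
# Hypothetical shared class list; not taken from the AIRI work.
SHARED_CLASSES = ["chair", "table", "sofa"]

# Hypothetical per-data-set label names mapped onto the shared classes.
LABEL_MAP = {
    "dataset_a": {"armchair": "chair", "office chair": "chair", "desk": "table"},
    "dataset_b": {"stool": "chair", "dining table": "table", "couch": "sofa"},
}


def unify_labels(dataset, labels):
    """Map data-set-specific labels to shared class indices.

    Labels with no counterpart in the shared list are returned as -1.
    """
    mapping = LABEL_MAP[dataset]
    result = []
    for label in labels:
        # Labels that already use a shared name pass through unchanged.
        shared = mapping.get(label, label)
        if shared in SHARED_CLASSES:
            result.append(SHARED_CLASSES.index(shared))
        else:
            result.append(-1)
    return result


print(unify_labels("dataset_a", ["armchair", "desk", "lamp"]))
print(unify_labels("dataset_b", ["couch", "chair"]))
```

In this sketch, several fine-grained labels from different data sets collapse into one shared class, so a single model can be trained on all of them with a smaller, common set of object classes.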
Experiments have shown that the new model works effectively with a large number of heterogeneous data sets and recognizes objects in different types of "point clouds" from lidars and three-dimensional scanners. The scientists hope that the development will accelerate the creation of three-dimensional vision systems and improve their quality.