A research study introduces RingMo, a remote sensing foundation model that uses self-supervised learning to improve the accuracy of remote sensing image interpretation.
Conventional remote sensing methods are limited by the domain gap between natural and remote sensing scenes and by the poor generalization capacity of remote sensing models. This created a need for a foundation model with a general remote sensing feature representation. Self-supervised methods are better suited to remote sensing than fully supervised ones because they can make use of the large amount of unlabeled data available. According to the Aerospace Information Research Institute (AIR), Chinese Academy of Sciences (CAS), the RingMo remote sensing foundation model framework was developed to bring the benefits of generative self-supervised learning to remote sensing images and to improve the accuracy of remote sensing image interpretation.
Remote sensing images are essential in various fields such as classification and change detection, and deep learning approaches have been adopted to rapidly advance remote sensing image interpretation. The most widely used training pattern applies ImageNet pre-trained models to remote sensing data for specified tasks. However, these approaches have various performance limitations.
RingMo features a large-scale dataset compiled from 2 million remote sensing images from satellite and aerial platforms, covering multiple scenes and objects around the world. Its foundation model training method is designed for dense and small objects in complex remote sensing scenes. RingMo is described as the first generative foundation model for cross-modal remote sensing data that is built on masked image modeling. The researchers envision applying the model to 3D reconstruction, residential construction, transportation, and other fields.
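To give a rough sense of what masked image modeling involves, the sketch below hides a random subset of image patches and scores a reconstruction only on the hidden pixels. It is a generic illustration, not RingMo's own masking strategy or code; the function names `random_patch_mask` and `reconstruction_loss`, the 16-pixel patch size, and the 0.5 mask ratio are all assumptions chosen for the example.

```python
import numpy as np


def random_patch_mask(image, patch_size=16, mask_ratio=0.5, seed=0):
    """Zero out a random subset of non-overlapping square patches.

    Returns the masked image and a boolean grid (rows x cols of patches)
    that is True where a patch was hidden. All defaults are illustrative.
    """
    h, w = image.shape[:2]
    if h % patch_size or w % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    rows, cols = h // patch_size, w // patch_size
    num_patches = rows * cols
    num_masked = int(round(mask_ratio * num_patches))

    # Pick which patches to hide, without repetition.
    rng = np.random.default_rng(seed)
    masked_idx = rng.choice(num_patches, size=num_masked, replace=False)
    mask = np.zeros(num_patches, dtype=bool)
    mask[masked_idx] = True
    mask = mask.reshape(rows, cols)

    # Expand the patch-level mask to pixel resolution and apply it.
    pixel_mask = np.repeat(np.repeat(mask, patch_size, axis=0), patch_size, axis=1)
    masked = image.copy()
    masked[pixel_mask] = 0
    return masked, mask


def reconstruction_loss(pred, target, mask, patch_size=16):
    """Mean squared error computed only over the hidden pixels."""
    pixel_mask = np.repeat(np.repeat(mask, patch_size, axis=0), patch_size, axis=1)
    diff = (pred.astype(np.float64) - target.astype(np.float64)) ** 2
    return float(diff[pixel_mask].mean())


# Usage: a 64x64 RGB image splits into a 4x4 grid of 16 patches, 8 of them hidden.
image = np.ones((64, 64, 3), dtype=np.float32)
masked, mask = random_patch_mask(image)
loss = reconstruction_loss(masked, image, mask)  # score the masked input as a trivial "prediction"
```

In a real self-supervised setup, a neural network would receive the masked image and be trained to minimize this kind of loss, which is how the model learns from unlabeled images.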