The quality of 3D reconstruction is inherently influenced by the quality of the input images, specifically their color and depth information. However, these two types of data contribute differently to the tracking and mapping processes. In this paper, we propose a predictive uncertainty framework to identify valuable pixels for these two processes. Additionally, we leverage this predictive uncertainty to guide a strategic bundle adjustment, thereby enhancing the overall accuracy and robustness of the reconstruction process.
Neural implicit fields have recently emerged as a powerful representation for multi-view surface reconstruction due to their simplicity and state-of-the-art performance. However, reconstructing thin structures in indoor scenes while ensuring real-time performance remains a challenge for dense visual SLAM systems. Previous methods do not account for the varying quality of the input RGB-D data and employ a fixed-frequency mapping process to reconstruct the scene, which can result in the loss of valuable information in some frames.
In this paper, we propose Uni-SLAM, a decoupled 3D spatial representation based on hash grids for indoor reconstruction. We introduce a novel predictive uncertainty formulation to reweight the loss function, along with a strategic local-to-global bundle adjustment. Experiments on synthetic and real-world datasets demonstrate that our system achieves state-of-the-art tracking and mapping accuracy while maintaining real-time performance. It significantly improves over current methods, with a 25% reduction in depth L1 error and a 66.86% completion rate within 1 cm on the Replica dataset, reflecting more accurate reconstruction of thin structures.
Uni-SLAM consists of two threads: tracking and mapping. Tracking is performed on every frame of the RGB-D stream, while constant mapping runs every n frames with global BA. In addition, an activated mapping process is executed to capture local scene information, triggered by an uncertainty and co-visibility check, and refined with local BA and local loop closure optimization (LLCO). Our proposed pixel-level uncertainty method adaptively filters outlier pixels and reweights the loss function, enabling more precise localization during tracking and more faithful reconstruction of color and geometric information during mapping.
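The two mechanisms above can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration (not the paper's implementation): a heteroscedastic-style loss that divides per-pixel residuals by the predicted variance so uncertain pixels are down-weighted, plus a toy trigger that activates the additional local mapping step when a frame is uncertain or shares little co-visible content with existing keyframes. The function names, thresholds, and the exact weighting form are assumptions for illustration.

```python
import numpy as np

def uncertainty_weighted_loss(residuals, variances, eps=1e-6):
    """Reweight per-pixel squared residuals by predicted variance.

    Hypothetical heteroscedastic-style form: high-variance (uncertain)
    pixels contribute less; the log-variance term discourages the model
    from inflating variance everywhere. Not the paper's exact loss.
    """
    w = 1.0 / (variances + eps)
    return float(np.mean(w * residuals ** 2 + np.log(variances + eps)))

def should_activate_local_mapping(pixel_uncertainty, covisibility_ratio,
                                  unc_thresh=0.5, covis_thresh=0.85):
    """Toy trigger for the additional mapping process.

    Fires when mean predicted uncertainty is high or the frame overlaps
    too little with existing keyframes. Thresholds are illustrative.
    """
    return (float(np.mean(pixel_uncertainty)) > unc_thresh
            or covisibility_ratio < covis_thresh)
```

In practice, separate variances would be predicted for the color and depth residuals, since (as noted above) the two modalities contribute differently to tracking and mapping.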
@misc{wang2024unislamuncertaintyawareneuralimplicit,
  title={Uni-SLAM: Uncertainty-Aware Neural Implicit SLAM for Real-Time Dense Indoor Scene Reconstruction},
  author={Shaoxiang Wang and Yaxu Xie and Chun-Peng Chang and Christen Millerdurai and Alain Pagani and Didier Stricker},
  year={2024},
  eprint={2412.00242},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2412.00242},
}