3D object spatial mapping from videos - Kejie Li
posted on 22 February, 2022


Abstract: Localising objects and estimating their geometry in 3D is an important step towards high-level 3D scene understanding, which has many applications in Augmented Reality and Robotics. In this talk, I will present our approaches to 3D object mapping with different focuses. Assuming a relatively simple and static environment, FroDO (CVPR 2020) is the first-of-its-kind framework to localise and reconstruct the detailed shape of multiple objects in a scene given a sequence of posed RGB video. We later developed MOLTR (RAL and ICRA 2021) to remove the assumption of a static environment. After realising data association is the bottleneck in the pipeline, we proposed ODAM (ICCV 2021) that focuses on associating object detections from different frames using an attentional graph neural network. The volume and location of objects are further refined using multi-view optimisation.