Foundations of Spatial Perception for Robotics: Hierarchical Representations and Real-Time Systems


The paper (missing reference) lays the foundations of hierarchical metric-semantic map representations for robotics. While hierarchical representations have been popular in robotics since its inception, the paper shows that these representations (i) lead to graphs with small treewidth (hence enabling efficient inference), (ii) allow combining traditional SLAM techniques with neuro-symbolic reasoning based on graph neural networks (GNN), and (iii) enable novel GNN architectures, namely Neural Trees, that are provably expressive and efficient. The paper also provides the first approach to build hierarchical representations, namely 3D scene graphs, of indoor environments in real-time. The approach combines 3D geometry, topology, and geometric deep learning, and reconstructs a hierarchical map of objects, places, rooms, buildings, and their relations.

The code has been released open-source at: