uHumans Dataset

We use a photo-realistic Unity-based simulator to test our spatial perception engine in a 65 m x 65 m simulated office environment. The simulator also provides the 2D panoptic semantic segmentation for Kimera. Humans are simulated using standard graphics assets, in particular the realistic 3D models provided by the SMPL project. A ROS service lets us spawn objects and agents into the scene on demand. The simulator provides ground-truth poses of humans and objects, which we use for benchmarking (Rosinol et al., 2020). Using this setup, we create several large visual-inertial datasets.

We release two dataset versions:

V1.0: the dataset used in our RSS2020 paper, described below.
V2.0: an extended version, available as uHumans2.

Dataset V1.0

This is the original dataset used for evaluation in our RSS2020 paper.

It consists of three datasets, with 12, 24, and 60 humans, respectively.

Specifications

Stereo cameras
Depth camera
2D Semantic Segmentation
IMU
Odometry

Message types and topics recorded in one of the bags:

types:       nav_msgs/Odometry      [cd5e73d190d741a2f92e81eda573aca7]
             sensor_msgs/CameraInfo [c9a58c1b0b154e0e6da7578cb991d214]
             sensor_msgs/Image      [060021388200f6f0f447d0fcd9c64743]
             sensor_msgs/Imu        [6a62c6daae103f4ff57a132d6f95cec2]
             tf2_msgs/TFMessage     [94810edda583a504dfda3829e70d7eec]
topics:      /tesse/depth/camera_info            1073 msgs    : sensor_msgs/CameraInfo
             /tesse/depth/image_raw              1073 msgs    : sensor_msgs/Image
             /tesse/imu                         40241 msgs    : sensor_msgs/Imu
             /tesse/left_cam/camera_info         1073 msgs    : sensor_msgs/CameraInfo
             /tesse/left_cam/image_raw           1073 msgs    : sensor_msgs/Image
             /tesse/odom                        40240 msgs    : nav_msgs/Odometry
             /tesse/right_cam/camera_info        1067 msgs    : sensor_msgs/CameraInfo
             /tesse/right_cam/image_raw          1067 msgs    : sensor_msgs/Image
             /tesse/segmentation/camera_info     1067 msgs    : sensor_msgs/CameraInfo
             /tesse/segmentation/image_raw       1067 msgs    : sensor_msgs/Image
             /tf                               105753 msgs    : tf2_msgs/TFMessage
             /tf_static                             1 msg     : tf2_msgs/TFMessage
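
The topic listing can be checked programmatically before running a pipeline on a bag. The following is a minimal sketch, not part of the dataset tooling: it parses the listing (pasted in as a string) using only the Python standard library, and `parse_topics` is a hypothetical helper introduced for illustration. It shows two things worth knowing about this bag: there are roughly 37.5 IMU messages per left camera frame, and the left and right image counts differ, so a stereo consumer should probably pair frames by timestamp rather than by index.

```python
import re

# Topic listing copied from the bag summary above.
BAG_TOPICS = """\
/tesse/depth/camera_info            1073 msgs    : sensor_msgs/CameraInfo
/tesse/depth/image_raw              1073 msgs    : sensor_msgs/Image
/tesse/imu                         40241 msgs    : sensor_msgs/Imu
/tesse/left_cam/camera_info         1073 msgs    : sensor_msgs/CameraInfo
/tesse/left_cam/image_raw           1073 msgs    : sensor_msgs/Image
/tesse/odom                        40240 msgs    : nav_msgs/Odometry
/tesse/right_cam/camera_info        1067 msgs    : sensor_msgs/CameraInfo
/tesse/right_cam/image_raw          1067 msgs    : sensor_msgs/Image
/tesse/segmentation/camera_info     1067 msgs    : sensor_msgs/CameraInfo
/tesse/segmentation/image_raw       1067 msgs    : sensor_msgs/Image
/tf                               105753 msgs    : tf2_msgs/TFMessage
/tf_static                             1 msg     : tf2_msgs/TFMessage
"""

# Matches "<topic> <count> msg(s) : <type>".
LINE_RE = re.compile(r"^\s*(\S+)\s+(\d+)\s+msgs?\s*:\s*(\S+)\s*$")


def parse_topics(text):
    """Return {topic: (message_count, message_type)} (hypothetical helper)."""
    topics = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            topic, count, msg_type = match.groups()
            topics[topic] = (int(count), msg_type)
    return topics


topics = parse_topics(BAG_TOPICS)
left, _ = topics["/tesse/left_cam/image_raw"]
right, _ = topics["/tesse/right_cam/image_raw"]
imu, _ = topics["/tesse/imu"]

# Average number of IMU messages available between consecutive left frames.
print(f"IMU messages per left frame: {imu / left:.1f}")
# A count mismatch means some left frames cannot have a unique right partner.
print(f"Left frames without a right counterpart: at least {left - right}")
```

The message counts only bound the mismatch; finding which frames are unpaired requires comparing the header timestamps in the bag itself.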

References

Rosinol, A., Gupta, A., Abate, M., Shi, J., & Carlone, L. (2020). 3D Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans. Robotics: Science and Systems (RSS). https://doi.org/10.15607/RSS.2020.XVI.079

@inproceedings{Rosinol20rss-dynamicSceneGraphs,
  title = {{3D} Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans},
  author = {Rosinol, A. and Gupta, A. and Abate, M. and Shi, J. and Carlone, L.},
  booktitle = {Robotics: Science and Systems (RSS)},
  year = {2020},
  pdf = {https://arxiv.org/pdf/2002.06289.pdf},
  url = {http://news.mit.edu/2020/robots-spatial-perception-0715},
  video = {https://www.youtube.com/watch?v=SWbofjhyPzI},
  doi = {10.15607/RSS.2020.XVI.079},
  note = {\linkToPdf{https://arxiv.org/pdf/2002.06289.pdf},
  	\linkToMedia{http://news.mit.edu/2020/robots-spatial-perception-0715},
  	\linkToVideo{https://www.youtube.com/watch?v=SWbofjhyPzI&feature=youtu.be}}
}