¹Columbia University
²Toyota Research Institute
First, we introduce the task of dynamic scene completion: given a monocular video as input, our model produces a 4D representation that captures the entire scene content, including all static and dynamic objects within it, over time. Second, in order to train and benchmark learning-based systems for object permanence, we contribute two large-scale synthetic datasets with rich annotations. Third, we propose a framework that integrates transformer mechanisms within a spatiotemporal neural field to densely predict scene features, and we demonstrate that it exhibits occlusion reasoning capabilities. Our representation is conditioned on both local and global context via cross-attention, allowing it to generalize across different scenes.
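To make the conditioning step concrete, here is a minimal sketch of how a single continuous query point (x, y, z, t) might attend to context features via cross-attention. It is an illustration under stated assumptions, not the paper's implementation. The names `embed_query` and `cross_attend`, the identity embedding, and the toy context tokens are all hypothetical. An actual model would use learned projections and features encoded from the input video.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attend(query, keys, values):
    """Single-head scaled dot-product cross-attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, k) * scale for k in keys])
    dim = len(values[0])
    out = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return out, weights

def embed_query(point_4d):
    # Hypothetical embedding of a continuous (x, y, z, t) query.
    # Assumption: identity mapping, purely for illustration.
    return list(point_4d)

# Toy context tokens standing in for features derived from the input video
# (assumed values; a real system would compute these with an encoder).
context_keys = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
context_values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

query = embed_query((2.0, 0.0, 0.0, 0.0))
feature, weights = cross_attend(query, context_keys, context_values)
print(feature, weights)
```

In this toy setup the query aligns most strongly with the first context token, so the predicted feature is dominated by that token's value. Because the query is a continuous coordinate, the same procedure can be evaluated at any point in space and time, which is what makes the prediction dense.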
Click here to view more long-term mesh visualizations for our model trained on the GREATER dataset.
Click here to view short-term visualizations that specifically focus on occlusion scenarios, primarily involving the yellow snitch.
Click here to view more long-term mesh visualizations for our model trained on the CARLA-4D dataset.
Click here to view short-term visualizations that specifically focus on occlusion scenarios involving pedestrians, cars, and motorcycles.
We contribute two multi-view RGB-D datasets: GREATER and CARLA-4D.
Please see this Google Form link to request access to both datasets, or this repository to view the underlying generation code.
@inproceedings{vanhoorick2022revealing,
title={Revealing Occlusions with 4D Neural Fields},
author={Van Hoorick, Basile and Tendulkar, Purva and Sur\'is, D\'idac and Park, Dennis and Stent, Simon and Vondrick, Carl},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2022}
}
This research is based on work supported by Toyota Research Institute, the NSF CAREER Award #2046910, and the DARPA MCS program under Federal Agreement No. N660011924032. DS is supported by the Microsoft PhD Fellowship. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies, either expressed or implied, of the sponsors. The webpage template was inspired by this project page.