Our model takes a 12-frame point cloud video clip as input and uses it to condition a spatiotemporal neural field, which then predicts the point cloud of the complete dynamic scene at a chosen moment in time. These results are not cherry-picked: they comprise the first 20 scenes of the test set.
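The pipeline described above (encode the 12-frame clip, condition a field on it, then query the field at a chosen time) can be illustrated with a toy sketch. This is not the model's actual architecture: the centroid-pooling encoder, the randomly initialized two-layer field, and every name here (`encode_clip`, `make_field`, `predict_point_cloud`, the occupancy threshold, the number of classes) are hypothetical stand-ins chosen only to show the data flow.

```python
import math
import random

NUM_FRAMES = 12  # clip length stated in the text


def encode_clip(clip):
    """Toy encoder (assumption): concatenate the centroid of each frame.

    `clip` is a list of NUM_FRAMES frames; each frame is a list of (x, y, z).
    Returns a flat feature vector of length 3 * NUM_FRAMES.
    """
    assert len(clip) == NUM_FRAMES
    feature = []
    for frame in clip:
        n = len(frame)
        feature.extend(sum(p[k] for p in frame) / n for k in range(3))
    return feature


def make_field(feature_dim, hidden=16, num_classes=3, seed=0):
    """Toy conditioned field (assumption): a random-weight two-layer MLP.

    Maps (x, y, z, t, feature) to (occupancy in (0, 1), semantic label).
    """
    rng = random.Random(seed)
    in_dim = 4 + feature_dim
    w1 = [[rng.gauss(0, 0.3) for _ in range(in_dim)] for _ in range(hidden)]
    w2 = [[rng.gauss(0, 0.3) for _ in range(hidden)]
          for _ in range(1 + num_classes)]

    def field(x, y, z, t, feature):
        inp = [x, y, z, t] + list(feature)
        h = [math.tanh(sum(w * v for w, v in zip(row, inp))) for row in w1]
        out = [sum(w * v for w, v in zip(row, h)) for row in w2]
        occupancy = 1.0 / (1.0 + math.exp(-out[0]))
        logits = out[1:]
        label = max(range(num_classes), key=logits.__getitem__)
        return occupancy, label

    return field


def predict_point_cloud(field, feature, query_points, t, threshold=0.5):
    """Query the field at time t; keep points whose occupancy passes the threshold."""
    cloud = []
    for (x, y, z) in query_points:
        occupancy, label = field(x, y, z, t, feature)
        if occupancy >= threshold:
            cloud.append((x, y, z, label))
    return cloud


# Usage with synthetic data: a random 12-frame clip and random query points.
rng = random.Random(1)
clip = [[(rng.random(), rng.random(), rng.random()) for _ in range(50)]
        for _ in range(NUM_FRAMES)]
feature = encode_clip(clip)
field = make_field(feature_dim=len(feature))
queries = [(rng.random(), rng.random(), rng.random()) for _ in range(200)]
output_cloud = predict_point_cloud(field, feature, queries, t=0.5)
```

Because the field is a function of continuous `(x, y, z, t)`, the same conditioned field can be queried at any timestamp and at any set of locations, including regions not visible in the input clip. A trained model would learn the encoder and field weights rather than drawing them at random.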
Test Scenes 1 to 20: each scene is shown as four video panels, from left to right:
| Input RGB Video | Input Point Cloud | Output Semantic Point Cloud | Target Semantic Point Cloud |