CFCS Youth Talks

Learning Dynamic 3D Objects from Images in the Wild

  • Shangzhe Wu, University of Oxford
  • Time: 2022-12-16 10:00
  • Host: Dr. Hao Dong
  • Venue: Online Talk


We all live in a physical 3D world, full of dynamic 3D objects. Being able to perceive the dynamic world in 3D from 2D visual observations is not only key to many AR and robotics applications, but also fundamental to the goal of visual understanding. However, existing learning-based image understanding models often treat images simply as compositions of 2D patterns, ignoring the fact that they arise from a 3D world. The major obstacle is the lack of large-scale 3D data and annotations, which are typically required to train 3D perception systems but are prohibitively expensive to collect. Natural intelligences, on the other hand, develop a comprehensive 3D understanding of the world primarily by observing 2D projections, without relying on extensive 3D supervision. This raises the question that I attempt to answer in my research: "Can machines learn to perceive the dynamic 3D world without explicit 3D supervision?" In this talk, I will present some of our recent efforts to address this question. In particular, I will show how we can learn disentangled, explicit 3D representations of objects from just raw online photos and videos through inverse rendering, without explicit 3D labels for training.



Shangzhe Wu is a PhD student in the Visual Geometry Group (VGG) at the University of Oxford, advised by Andrea Vedaldi and Christian Rupprecht. His current research focuses on unsupervised 3D learning and inverse rendering. He also spent time interning at Google Research with Noah Snavely's team. His work on unsupervised learning of symmetric 3D objects received the Best Paper Award at CVPR 2020. Homepage:


  • Admission


VooV Meeting ID: 655 232 512