Beyond Labels: From self- to language-supervised learning, and the 3D World - Iro Laina (University of Oxford)
posted on 21 November 2023


Abstract: Over the past few years, significant progress has been made in reducing the amount of manual annotation required to train models for computer vision tasks. Self-supervised models and generalist (foundation) models have proven extremely powerful on multiple existing benchmarks and have paved the way towards new applications. In this talk, I will discuss our past and current efforts in the domains of self-supervised and language-supervised learning, focusing on understanding how visual concepts are represented in such models and how we can leverage these representations to extract meaningful information from images, for example to segment objects. Finally, I will discuss how semantic information and priors from the 2D domain can be lifted to the 3D domain via neural rendering.