Self-supervised Learning from images, videos and augmentations - Yuki Asano (University of Amsterdam)
posted on 9 February, 2023


Abstract: In this talk I will discuss pushing the limits of what can be learnt without any human annotations. After an overview of what self-supervised learning is, we will dive into how clustering can be combined with representation learning using optimal transport ([1] @ ICLR’20), a paradigm still relevant in current SoTA models such as SwAV/DINO/MSN. Next, I will show how self-supervised clustering can be used for unsupervised segmentation in images ([2] @ CVPR’22) and in videos (unpublished research). Finally, we analyse one of the key ingredients of self-supervised learning: the augmentations. Here, I will show that it is possible to extrapolate to semantic classes such as those of ImageNet or Kinetics using just a single datum as visual input, when combined with strong augmentations and a pretrained teacher ([3] @ ICLR’23).