Image Segmentation with Semantic Equivariance for Unsupervised Adaptation and Tracking - Nikita Araslanov
posted on 8 February 2022


Abstract: The high accuracy of modern semantic segmentation models hinges on expensive, high-quality dense annotation. Designing unsupervised objectives for learning semantic representations is therefore of high practical relevance. This talk will focus on one principle towards this goal: semantic equivariance. The underlying idea is to exploit the equivariance of semantic maps under similarity transformations of the input image. We will consider specific implementations and extensions of this technique in three problem domains. First, we will look at unsupervised domain adaptation, where we adapt a model trained on annotated synthetic data to unlabelled real-world images. In the second example, we will leverage equivariance to develop an approach that substantially improves model generalisation. In this setting, no target distribution is available for model adaptation as before; only a single datum from that distribution is given. The third example will present an unsupervised learning framework for extracting dense and semantically meaningful object-level correspondences from unlabelled videos. Here, we will exploit equivariance to sidestep trivial solutions while learning dense semantic representations efficiently. We will highlight some of the limitations common to the discussed methods and conclude the presentation with an outlook on follow-up research directions.
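
To make the equivariance principle concrete, below is a minimal sketch (not the speaker's exact formulation) of a consistency objective in PyTorch: the prediction on a transformed image should match the transformed prediction on the original image. The network name `model`, the helper functions, and the specific flip-and-rescale transform are illustrative assumptions.

```python
# A minimal sketch of a semantic-equivariance consistency loss, assuming a
# generic fully convolutional PyTorch segmentation network `model` that maps
# B x 3 x H x W images to B x C x H x W class logits. The flip-and-rescale
# transform stands in for a generic similarity transformation.

import torch
import torch.nn.functional as F


def similarity_transform(x, scale=0.75):
    """Horizontally flip and rescale a batch of image- or map-shaped tensors."""
    x = torch.flip(x, dims=[-1])
    h, w = x.shape[-2:]
    size = (int(h * scale), int(w * scale))
    return F.interpolate(x, size=size, mode="bilinear", align_corners=False)


def equivariance_loss(model, images):
    """Penalise disagreement between 'segment-then-transform' and
    'transform-then-segment' predictions."""
    with torch.no_grad():
        # Predictions on the original images act as a (pseudo-)target;
        # gradients flow only through the transformed branch.
        probs = F.softmax(model(images), dim=1)
        target = similarity_transform(probs)

    # Predictions on the transformed images, resized to the target resolution.
    logits_t = model(similarity_transform(images))
    logits_t = F.interpolate(logits_t, size=target.shape[-2:],
                             mode="bilinear", align_corners=False)

    # Cross-entropy between the soft target and the transformed-branch prediction.
    return -(target * F.log_softmax(logits_t, dim=1)).sum(dim=1).mean()
```

Given a batch of unlabelled images, `loss = equivariance_loss(model, images)` can be added to the training objective; stopping gradients through the untransformed branch, as done here, is one common way to stabilise this kind of self-supervised consistency training.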