Yasser Benigmin - Domain Adaptation in the Era of Foundation Models
Video description
In this presentation, we address domain adaptation in semantic segmentation, where deep learning models rely heavily on large labeled datasets and struggle with domain shift, limiting real-world generalization. We show how Foundation Models (FMs) can be adapted to overcome these challenges under resource constraints through three key contributions. First, we present DATUM, a one-shot unsupervised domain adaptation approach that personalizes text-to-image diffusion models to generate diverse, style-consistent training data from a single target image. Next, we introduce CLOUDS, a collaborative framework in which multiple foundation models, such as CLIP, large language models, diffusion models, and the Segment Anything Model, work together to generate synthetic data and automate the creation of high-quality pseudo-labels for self-training, enabling improved domain generalization. Finally, we discuss FLOSS, a training-free strategy for open-vocabulary segmentation that enhances CLIP’s performance by automatically discovering class-specific “expert” text templates.
Yasser Benigmin is a recent PhD graduate in Computer Vision within the Multimedia team at Telecom Paris and the VISTA team at LIX (Laboratoire d'Informatique de l'X) at École Polytechnique, supervised by Stéphane Lathuilière, Vicky Kalogeiton, and Slim Essid. His research focuses on domain adaptation for semantic segmentation leveraging foundation models, with a particular emphasis on resource-constrained scenarios. Previously, he interned at INRIA Paris in the Astra-Vision team, working on open-vocabulary semantic segmentation under Raoul de Charette. Yasser holds an engineering degree from École des Mines de Saint-Étienne and completed an exchange year at EURECOM.
This session is brought to you by the Cohere Labs Open Science Community - a space where ML researchers, engineers, linguists, social scientists, and lifelong learners connect and collaborate with each other. We'd like to extend a special thank you to Benedict Emoekabu and Mayank Bhaskar, Leads of our Computer Vision group, for their dedication in organizing this event.
If you’re interested in sharing your work, we welcome you to join us! Simply fill out the form at https://forms.gle/ALND9i6KouEEpCnz6 to express your interest in becoming a speaker.
Join the Cohere Labs Open Science Community to see a full list of upcoming events (https://tinyurl.com/CohereLabsCommunityApp).