In this paper, we tackle the problem of Generalized Category Discovery (GCD). Given a dataset containing both labelled and unlabelled images, the objective is to categorize all images in the unlabelled subset, irrespective of whether they are from known or unknown classes.
In GCD, an inherent label bias exists between known and unknown classes, since ground-truth labels are available only for the former. State-of-the-art GCD methods rely on parametric classifiers trained through self-distillation with soft labels, leaving this bias unaddressed. They also treat all unlabelled samples uniformly, ignoring differences in certainty and leading to suboptimal learning. Moreover, the explicit identification of the semantic distribution shift between known and unknown classes, a vital aspect of effective GCD, has been overlooked. To address these challenges, we introduce DebGCD, a Debiased learning with distribution guidance framework for GCD. First, DebGCD co-trains an auxiliary debiased classifier in the same feature space as the GCD classifier, progressively enhancing the GCD features. Second, we introduce a semantic distribution detector in a separate feature space that implicitly boosts the learning efficacy of GCD. Finally, we employ a curriculum learning strategy based on semantic distribution certainty to steer the debiased learning at an optimized pace.
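To make the curriculum idea above concrete, here is a minimal NumPy sketch of certainty-based sample weighting. The specific schedule (a threshold that relaxes linearly over training) and the `gamma` sharpening parameter are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def curriculum_weights(certainty, epoch, total_epochs, gamma=2.0):
    """Hypothetical certainty-based curriculum weight.

    Samples whose semantic-distribution certainty exceeds a threshold that
    decays over training receive larger weights in the debiased loss;
    low-certainty samples are down-weighted early on.
    """
    # Relax the admission threshold as training proceeds (assumed schedule).
    threshold = 1.0 - (epoch / total_epochs)
    # Rescale certainty above the threshold into [0, 1], then sharpen.
    w = np.clip((certainty - threshold) / max(1e-8, 1.0 - threshold), 0.0, 1.0)
    return w ** gamma

# Usage: mid-training, high-certainty samples dominate the debiased loss.
c = np.array([0.9, 0.5, 0.99])
print(curriculum_weights(c, epoch=5, total_epochs=10))
```

At the start of training only near-certain samples contribute; as the threshold decays, progressively less certain samples are admitted, pacing the debiased learning.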
Thorough evaluations on GCD benchmarks demonstrate the consistent state-of-the-art performance of our framework, highlighting its superiority.
Overview of the DebGCD framework. In the upper branch, raw features are transformed by an MLP and then normalized; these normalized features are used for semantic distribution learning with a one-vs-all classifier. In the lower branch, a GCD classifier is trained on the normalized raw features. The predictions from both branches are combined to train the debiased classifier. As the representation learning component of DebGCD follows prior work, it is not explicitly depicted here.
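The two-branch forward pass described in the caption can be sketched as follows. This is an illustration only, not the released implementation: the dimensions, the single-layer MLP, the ReLU activation, and the way the two branches' predictions are combined (element-wise product, renormalized) are all our assumptions:

```python
import numpy as np

# Hypothetical dimensions; the actual DebGCD configuration is not specified here.
FEAT_DIM, PROJ_DIM, NUM_CLASSES = 768, 256, 100  # known + unknown classes

rng = np.random.default_rng(0)

def l2_normalize(x, axis=-1, eps=1e-12):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

# Upper branch: MLP projection of raw features, then L2 normalization,
# followed by a one-vs-all (per-class sigmoid) semantic-distribution classifier.
W_mlp = rng.standard_normal((FEAT_DIM, PROJ_DIM)) * 0.02
W_ova = rng.standard_normal((PROJ_DIM, NUM_CLASSES)) * 0.02

# Lower branch: GCD classifier on the normalized raw features.
W_gcd = rng.standard_normal((FEAT_DIM, NUM_CLASSES)) * 0.02

def forward(raw_feats):
    # Upper branch: MLP + ReLU, normalize, one-vs-all sigmoid scores.
    proj = l2_normalize(np.maximum(raw_feats @ W_mlp, 0.0))
    ova_scores = 1.0 / (1.0 + np.exp(-(proj @ W_ova)))
    # Lower branch: softmax GCD classifier on normalized raw features.
    gcd_logits = l2_normalize(raw_feats) @ W_gcd
    gcd_probs = np.exp(gcd_logits) / np.exp(gcd_logits).sum(-1, keepdims=True)
    # Combine both branches' predictions to supervise the debiased classifier
    # (product-then-renormalize is an assumed combination rule).
    combined = gcd_probs * ova_scores
    combined /= combined.sum(-1, keepdims=True)
    return ova_scores, gcd_probs, combined

feats = rng.standard_normal((4, FEAT_DIM))
ova, gcd, target = forward(feats)
```

The key point the sketch captures is that the semantic-distribution branch and the GCD branch operate in separate feature spaces, and only their output predictions are fused to guide the debiased classifier.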
We compare DebGCD with previous state-of-the-art GCD methods on the SSB benchmark. All methods use the DINO pre-trained backbone. Our method consistently outperforms the previous state of the art.
The results on three coarse-grained datasets are shown below.
t-SNE visualization of CIFAR-100 features from the baseline and our method. We randomly select 20 classes, 10 from the 'Old' categories and 10 from the 'New' categories. The clearly separated clusters show that features learned by our framework form notably more cohesive groupings than those of the baseline, demonstrating the improvement our method brings to the clustering feature space.
Visualization of attention maps. Our method successfully directs its attention towards foreground objects, irrespective of whether they belong to the 'Old' or 'New' classes. The baseline denotes the pre-trained DINO.
@inproceedings{Liu2025DebGCD,
author = {Liu, Yuanpei and Han, Kai},
title = {DebGCD: Debiased Learning with Distribution Guidance for Generalized Category Discovery},
booktitle = {International Conference on Learning Representations (ICLR)},
year = {2025}
}