LooC: Effective Low-Dimensional Codebook for Compositional Vector Quantization

Abstract

Vector quantization (VQ) is a prevalent and fundamental technique that discretizes continuous feature vectors by approximating them using a codebook. As the diversity and complexity of data and models continue to increase, there is an urgent need for high-capacity, yet more compact VQ methods.

This paper aims to reconcile this conflict by presenting a new approach called LooC, which utilizes an effective Low-dimensional codebook for Compositional vector quantization. Firstly, LooC introduces a parameter-efficient codebook by reframing the relationship between codevectors and feature vectors, significantly expanding its solution space. Instead of individually matching codevectors with feature vectors, LooC treats them as lower-dimensional compositional units within feature vectors and combines them, resulting in a more compact codebook with improved performance. Secondly, LooC incorporates a parameter-free extrapolation-by-interpolation mechanism to enhance and smooth features during the VQ process, which allows for better preservation of details and fidelity in feature approximation. The design of LooC leads to full codebook usage, effectively utilizing the compact codebook while avoiding the problem of collapse. Thirdly, LooC can serve as a plug-and-play module for existing methods for different downstream tasks based on VQ. Finally, extensive evaluations on different tasks, datasets, and architectures demonstrate that LooC outperforms existing VQ methods, achieving state-of-the-art performance with a significantly smaller codebook.

Framework

Framework of Low-dimensional codebook for Compositional vector quantization (LooC). The encoder transforms the input image into a continuous latent feature map z. z is then upsampled using bilinear interpolation with scale factor β. Simultaneously, each feature vector in z is divided into m units and quantized using a shared codebook C containing K codevectors of dimension d∗ = d/m. The quantized units are then reassembled and smoothed using average pooling to restore the shape of z. Finally, the decoder converts the feature map back to the image.
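The quantization path described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the nearest-codevector search uses squared Euclidean distance, the bilinear upsampling uses a corner-aligned sampling grid, and quantization is assumed to act on the upsampled map before average pooling restores the original resolution.

```python
import numpy as np

def upsample_bilinear(z, beta):
    """Bilinear upsampling of an (h, w, d) map by an integer factor beta.
    Assumption: corner-aligned sampling grid."""
    h, w, _ = z.shape
    ys = np.linspace(0, h - 1, h * beta)
    xs = np.linspace(0, w - 1, w * beta)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None, None]   # vertical interpolation weights
    wx = (xs - x0)[None, :, None]   # horizontal interpolation weights
    top = z[y0][:, x0] * (1 - wx) + z[y0][:, x1] * wx
    bot = z[y1][:, x0] * (1 - wx) + z[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

def quantize_units(z, codebook, m):
    """Split each d-dim feature vector into m units of dimension d/m and
    replace every unit with its nearest codevector from a shared codebook."""
    h, w, d = z.shape
    k, d_unit = codebook.shape
    assert d == m * d_unit, "feature dimension must equal m * (d / m)"
    units = z.reshape(-1, d_unit)                       # (h*w*m, d/m)
    # Squared Euclidean distance from every unit to every codevector.
    dists = ((units[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = dists.argmin(axis=1)
    z_q = codebook[idx].reshape(h, w, d)                # reassemble the units
    return z_q, idx.reshape(h, w, m)

def avg_pool(z, beta):
    """Non-overlapping beta x beta average pooling of an (H, W, d) map."""
    H, W, d = z.shape
    return z.reshape(H // beta, beta, W // beta, beta, d).mean(axis=(1, 3))

def quantize_with_smoothing(z, codebook, m, beta):
    """Upsample, quantize unit by unit, then pool back to the shape of z."""
    z_up = upsample_bilinear(z, beta)
    z_q_up, idx = quantize_units(z_up, codebook, m)
    return avg_pool(z_q_up, beta), idx

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 4, 8))        # toy latent map: h=4, w=4, d=8
codebook = rng.normal(size=(16, 4))   # K=16 codevectors of dimension d/m=4
z_q, idx = quantize_with_smoothing(z, codebook, m=2, beta=2)
print(z_q.shape, idx.shape)           # (4, 4, 8) (8, 8, 2)
```

Because every unit is matched independently against the same K codevectors, one feature vector can take up to K^m distinct quantized values, compared with K when whole vectors are matched one-to-one. The toy sizes here (K=16, m=2, β=2) are chosen only to keep the example small.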

Performance

Image reconstruction results on the low-resolution datasets MNIST [27] and CIFAR10 [25]. LooC outperforms other SOTA methods with a significantly reduced codebook size of 32 × 4, which is 1024× smaller than the 1024 × 128 codebook used by most SOTA methods.

Image reconstruction on the high-resolution datasets FFHQ [22] and ImageNet [4]. LooC has a compact codebook size of 256 × 4, which is 256× smaller than the 1024 × 256 codebook used by most SOTA methods.
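The size ratios quoted in the two captions follow from counting codebook entries as K × d. The short check below reproduces that arithmetic; the helper name `codebook_params` is hypothetical and used only for illustration.

```python
def codebook_params(num_codevectors, dim):
    """Number of scalar parameters in a codebook with K codevectors of dimension dim."""
    return num_codevectors * dim

# Low-resolution setting: baseline 1024 x 128 versus LooC 32 x 4.
low_res_ratio = codebook_params(1024, 128) // codebook_params(32, 4)

# High-resolution setting: baseline 1024 x 256 versus LooC 256 x 4.
high_res_ratio = codebook_params(1024, 256) // codebook_params(256, 4)

print(low_res_ratio, high_res_ratio)  # 1024 256
```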

Visualization

Qualitative results. Reconstructed images using VQGAN [7], CVQ [53], and LooC. LooC significantly enhances reconstruction quality by preserving image details and restoring texture structures, as highlighted in the red boxes (best viewed in PDF with zoom).

Unconditional image generation on LSUN [48] and class-conditional image generation on ImageNet [4].

BibTeX

@inproceedings{li26wacv,
    author    = {Li, Jie and Wong, Kwan-Yee~K. and Han, Kai},
    title     = {LooC: Effective Low-Dimensional Codebook for Compositional Vector Quantization},
    booktitle = {Proc. Winter Conference on Applications of Computer Vision},
    volume    = {},
    pages     = {},
    address   = {Tucson, Arizona, USA},
    month     = {March},
    year      = {2026}
}
