Wukong's 72 Transformations: High-fidelity Textured 3D Morphing via Flow Models

Minghao Yin¹, Yukang Cao², Kai Han¹
¹The University of Hong Kong, ²Nanyang Technological University

Figure 1. High-fidelity textured 3D morphing of Wukong. Taking an image of Wukong (bottom left) as the source and an image of another character (bottom right) as the target, we demonstrate two types of textured 3D morphing by our method: (i) Purple arrows indicate texture-controlled morphing, where the geometric structure changes while preserving detailed textures from the source; (ii) Green arrows indicate textured 3D morphing guided by the target prompt.

Abstract

We present WUKONG, a novel training-free framework for high-fidelity textured 3D morphing that takes a pair of source and target prompts (image or text) as input. Conventional methods rely on manual correspondence matching and deformation trajectory estimation, which limits generalization and requires costly preprocessing. WUKONG instead leverages the generative prior of flow-based transformers to produce high-fidelity 3D transitions with rich texture details. To ensure smooth shape transitions, we exploit the inherent continuity of flow-based generative processes and formulate morphing as an optimal transport barycenter problem. We further introduce a sequential initialization strategy to prevent abrupt geometric distortions and preserve identity coherence. For faithful texture preservation, we propose a similarity-guided semantic consistency mechanism that selectively retains high-frequency details and enables precise control over blending dynamics, avoiding common artifacts such as oversmoothing while maintaining semantic fidelity. Extensive quantitative and qualitative evaluations demonstrate that WUKONG significantly outperforms state-of-the-art methods across diverse geometry and texture variations.

Method Framework

Figure 2. Architecture. Given a source and a target (image or text), we extract features using pretrained encoders and treat the condition tokens as empirical distributions. We compute their Wasserstein barycenter to obtain interpolated condition tokens. These are fed into a shared geometry flow model and texture flow model to generate 3D outputs at different α values, producing textured 3D morphs. The top-right shows our texture-controlled morphing branch, and the bottom-right illustrates the recursive initialization.
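
The barycenter step in Figure 2 can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes both prompts yield the same number of condition tokens, that each token set is a uniform empirical distribution, and that the transport cost is squared Euclidean distance. Under those assumptions the two-measure Wasserstein barycenter reduces to an optimal one-to-one matching followed by linear interpolation of matched tokens. The function name `barycenter_tokens` is hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def barycenter_tokens(src, tgt, alpha):
    """Interpolate two token sets along the 2-Wasserstein geodesic.

    src, tgt: arrays of shape (n, d), each read as a uniform empirical
    distribution over n tokens (an assumption of this sketch).
    alpha: blend weight in [0, 1]; 0 gives the source tokens, 1 gives the
    target tokens reordered to follow their matched source tokens.
    """
    src = np.asarray(src, dtype=float)
    tgt = np.asarray(tgt, dtype=float)
    if src.shape != tgt.shape:
        raise ValueError("this sketch assumes equal token counts and dims")

    # Pairwise squared Euclidean cost: |s|^2 + |t|^2 - 2 s.t
    cost = (
        (src ** 2).sum(axis=1)[:, None]
        + (tgt ** 2).sum(axis=1)[None, :]
        - 2.0 * src @ tgt.T
    )

    # For uniform weights and equal sizes, an optimal transport plan is a
    # permutation, so an assignment solver recovers it exactly.
    rows, cols = linear_sum_assignment(cost)

    # Displacement interpolation: move each source token a fraction alpha
    # of the way toward the target token it is matched to.
    return (1.0 - alpha) * src[rows] + alpha * tgt[cols]


if __name__ == "__main__":
    src = np.array([[0.0, 0.0], [10.0, 0.0]])
    tgt = np.array([[10.0, 1.0], [0.0, 1.0]])
    print(barycenter_tokens(src, tgt, 0.5))
```

Sweeping `alpha` from 0 to 1 yields a sequence of interpolated token sets, each of which would then condition the geometry and texture flow models. Unequal token counts or non-uniform weights would need a general optimal transport solver rather than an assignment.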

Interactive Results

[Interactive 3D viewers: each case was presented as two sequences of 3D models running from source to target, labeled Blend and Preserve.]

Case 1. Morphing from Wukong to a dragon.

Case 2. Morphing from Wukong to a fox.

Case 3. Morphing from Wukong to a dragonborn.

Case 4. Morphing from Wukong to a dinosaur.

Case 5. Morphing from a television to a dragon.

Case 6. Morphing from a magic book to a castle.

Case 7. Morphing from a mailbox to a robot.

Case 8. Morphing from a cottage to an elephant.

Case 9. Morphing from a bulldozer to a stoneman.

Case 10. Morphing from a printing machine to a robot.

Case 11. Morphing from a building to a dinosaur.

Case 12. Morphing from a castle to a crab.

Citation

@inproceedings{yinwukong,
  title={Wukong's 72 Transformations: High-fidelity Textured 3D Morphing via Flow Models},
  author={Yin, Minghao and Cao, Yukang and Han, Kai},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems}
}