Stylized 3D Avatar Creation via Cascaded Domain Bridging

SIGGRAPH Asia, 2022

Shen Sang    Tiancheng Zhi    Guoxian Song    Minghao Liu    Chunpong Lai   

Jing Liu    Xiang Wen    James Davis    Linjie Luo   

Figure 1: (a) Given a front-facing user image as input, (b) our method progressively bridges the domain gap between real faces and 3D avatars through three stages: (b.1) The stylization stage performs an image-space translation to generate a stylized portrait while normalizing expressions. (b.2) The parameterization stage uses a learned model to find avatar parameters that match the results of stylization. (b.3) The conversion stage searches for a valid avatar vector that matches the parameterization and can be rendered by the graphics engine. (c) The output is a user-editable 3D model that can be animated and used in various applications, for example personalized emoji.


Stylized 3D avatars have become increasingly prominent in our modern life. Creating these avatars manually usually involves laborious selection and adjustment of continuous and discrete parameters and is time-consuming for average users. Self-supervised approaches to automatically create 3D avatars from user selfies promise high quality with little annotation cost but fall short in application to stylized avatars due to a large style domain gap. We propose a novel self-supervised learning framework to create high-quality stylized 3D avatars with a mix of continuous and discrete parameters. Our cascaded domain bridging framework first leverages a modified portrait stylization approach to translate input selfies into stylized avatar renderings as the targets for desired 3D avatars. Next, we find the best parameters of the avatars to match the stylized avatar renderings through a differentiable imitator we train to mimic the avatar graphics engine. To ensure we can effectively optimize the discrete parameters, we adopt a cascaded relaxation-and-search pipeline. We use a human preference study to evaluate how well our method preserves user identity compared to previous work as well as manual creation. Our results achieve much higher preference scores than previous work and close to those of manual creation. We also provide an ablation study to justify the design choices in our pipeline.
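
The relaxation-and-search idea for discrete parameters can be illustrated with a toy sketch. This is not the authors' implementation: the linear `imitator`, the `base` and `prototypes` vectors, and the `fit_avatar` helper are all hypothetical stand-ins chosen so the example runs with the Python standard library alone. The discrete avatar choice is first relaxed to softmax weights and optimized by gradient descent together with one continuous parameter; every valid one-hot option is then scored and the best one kept.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def imitator(scale, weights, base, prototypes):
    """Toy linear stand-in for a differentiable renderer (hypothetical)."""
    return [
        scale * b + sum(w * p[d] for w, p in zip(weights, prototypes))
        for d, b in enumerate(base)
    ]

def fit_avatar(target, base, prototypes, steps=500, lr=0.5):
    k = len(prototypes)
    scale, logits = 0.0, [0.0] * k
    # Relaxation: treat the discrete choice as softmax weights and descend
    # the squared error 0.5 * ||imitator(...) - target||^2.
    for _ in range(steps):
        w = softmax(logits)
        out = imitator(scale, w, base, prototypes)
        residual = [o - t for o, t in zip(out, target)]
        grad_scale = dot(residual, base)
        grad_w = [dot(p, residual) for p in prototypes]
        # Chain rule through softmax: dL/dz_i = w_i * (g_i - sum_j w_j g_j).
        weighted = dot(w, grad_w)
        grad_logits = [wi * (gi - weighted) for wi, gi in zip(w, grad_w)]
        scale -= lr * grad_scale
        logits = [z - lr * g for z, g in zip(logits, grad_logits)]

    # Search: score every valid one-hot option and keep the best one.
    def loss(i):
        one_hot = [1.0 if j == i else 0.0 for j in range(k)]
        out = imitator(scale, one_hot, base, prototypes)
        return sum((o - t) ** 2 for o, t in zip(out, target))

    best = min(range(k), key=loss)
    return scale, best, softmax(logits)

# Toy data: one continuous parameter and a discrete choice among 4 options.
base = [1.0, 0.0, 0.0, 0.0, 0.0]
prototypes = [[1.0 if d == i + 1 else 0.0 for d in range(5)] for i in range(4)]
target = imitator(0.7, [0.0, 0.0, 1.0, 0.0], base, prototypes)

scale, type_index, relaxed = fit_avatar(target, base, prototypes)
print(type_index, round(scale, 3))
```

In this toy setup the target is rendered from option 2 with a continuous value of 0.7, and the sketch recovers both. The relaxed weights alone never become exactly one-hot, which is why a final search over valid options is needed before handing parameters to an engine that accepts only discrete choices.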




    title = {AgileAvatar: Stylized 3D Avatar Creation via Cascaded Domain Bridging},
    author = {Sang, Shen and Zhi, Tiancheng and Song, Guoxian and Liu, Minghao and Lai, Chunpong and Liu, Jing and Wen, Xiang and Davis, James and Luo, Linjie},
    booktitle = {ACM SIGGRAPH Asia 2022 Conference Proceedings},
    numpages = {8},
    year = {2022},
    publisher = {Association for Computing Machinery},
    address = {New York, NY, USA},
    location = {Daegu, Republic of Korea},
    series = {SIGGRAPH Asia '22},
    url = {},
    doi = {10.1145/3550469.3555402}