[CVPR 2026] Official implementation of the paper "HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images".

Correr-Zhou/HiFi-Inpaint

HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images

Yichen Liu1,*, Donghao Zhou2,*, Jie Wang3, Xin Gao3, Guisheng Liu3, Jiatong Li3,†, Quanwei Zhang4,
Qiang Lyu1, Lanqing Guo5, Shilei Wen3,§, Weiqiang Wang1,§, Pheng-Ann Heng2,§

1University of Chinese Academy of Sciences, 2The Chinese University of Hong Kong, 3ByteDance,
4Zhejiang University, 5UT Austin

*Equal contribution, †Project Lead, §Corresponding Author

🔥 Updates

  • 2026.02: Our paper has been accepted to CVPR 2026!

📑 Open-Source Plan

We will release the code, dataset, and model after internal review. Please stay tuned!

  • HP-Image-40K Dataset
  • HiFi-Inpaint Inference Code
  • HiFi-Inpaint Model

๐ŸŒ Abstract

Human-product images, which showcase the integration of humans and products, play a vital role in advertising, e-commerce, and digital marketing. The essential challenge of generating such images lies in ensuring the high-fidelity preservation of product details. Among existing paradigms, reference-based inpainting offers a targeted solution by leveraging product reference images to guide the inpainting process. However, limitations remain in three key aspects: the lack of diverse large-scale training data, the struggle of current models to focus on product detail preservation, and the inability of coarse supervision to provide precise guidance. To address these issues, we propose HiFi-Inpaint, a novel high-fidelity reference-based inpainting framework tailored for generating human-product images. HiFi-Inpaint introduces Shared Enhancement Attention (SEA) to refine fine-grained product features and Detail-Aware Loss (DAL) to enforce precise pixel-level supervision using high-frequency maps. Additionally, we construct a new dataset, HP-Image-40K, with samples curated from self-synthesized data and processed with automatic filtering. Experimental results show that HiFi-Inpaint achieves state-of-the-art performance, delivering detail-preserving human-product images.

[Figure: teaser]

We propose HiFi-Inpaint, a DiT-based framework that can seamlessly integrate product reference images into masked human images, generating high-quality human-product images with high-fidelity detail preservation.
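
The abstract describes Detail-Aware Loss (DAL) as pixel-level supervision guided by high-frequency maps, and the official code has not been released yet. The sketch below is only an illustration of that general idea, not the authors' formulation: it assumes a 4-neighbour Laplacian as the high-frequency extractor and a weighted mean squared error as the loss. The function names `high_frequency_map` and `detail_weighted_mse` and the parameter `alpha` are hypothetical.

```python
import numpy as np


def high_frequency_map(image: np.ndarray) -> np.ndarray:
    """Absolute 4-neighbour Laplacian of a 2-D grayscale image, scaled to [0, 1].

    Assumption: a Laplacian stands in for whatever high-frequency
    extractor the paper actually uses.
    """
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    laplacian = (
        padded[:-2, 1:-1] + padded[2:, 1:-1]      # up + down neighbours
        + padded[1:-1, :-2] + padded[1:-1, 2:]    # left + right neighbours
        - 4.0 * padded[1:-1, 1:-1]                # centre pixel
    )
    hf = np.abs(laplacian)
    peak = hf.max()
    return hf / peak if peak > 0 else hf


def detail_weighted_mse(pred: np.ndarray, target: np.ndarray, alpha: float = 1.0) -> float:
    """Mean squared error with extra weight on high-frequency target pixels.

    Assumption: `alpha` (hypothetical) controls how strongly detail
    regions are emphasized; alpha = 0 reduces to plain MSE.
    """
    weights = 1.0 + alpha * high_frequency_map(target)
    return float(np.mean(weights * (pred - target) ** 2))


if __name__ == "__main__":
    # Toy target with one vertical edge between columns 3 and 4.
    target = np.zeros((8, 8))
    target[:, 4:] = 1.0

    # The same error placed on the edge and in a flat region.
    pred_edge = target.copy()
    pred_edge[2, 4] += 0.5
    pred_flat = target.copy()
    pred_flat[2, 0] += 0.5

    print("error on edge:", detail_weighted_mse(pred_edge, target))
    print("error on flat:", detail_weighted_mse(pred_flat, target))
```

In this toy setup an identical error costs more when it falls on an edge than in a flat region, which is the intuition behind supervising fine product details such as text and logos more strictly than smooth areas.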

🔗 Citation

If you find HiFi-Inpaint useful for your research and applications, please cite:

@inproceedings{hifi_inpaint_2026,
  title={HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images},
  author={Liu, Yichen and Zhou, Donghao and Wang, Jie and Gao, Xin and Liu, Guisheng and Li, Jiatong and Zhang, Quanwei and Lyu, Qiang and Guo, Lanqing and Wen, Shilei and Wang, Weiqiang and Heng, Pheng-Ann},
  booktitle={CVPR},
  year={2026}
}
