Diffusion Project

2024


Diffusion Prism: Enhancing Diversity and Morphology Consistency in Mask-to-Image Diffusion

News: This paper has been accepted to the WACV 2024 4th Workshop on Image/Video/Audio Quality in Computer Vision and Generative AI.

For more details, visit the project page: GitHub - Diffusion Prism.

Introduction

Diffusion Prism is a training-free framework that efficiently transforms binary masks into realistic and diverse samples while preserving morphological features. We found that a small amount of artificial noise significantly assists the image-denoising process. To demonstrate this mask-to-image concept, we use nano-dendritic patterns as an example and compare our method against existing controllable diffusion models. We also extend the proposed framework to other biological patterns, highlighting its potential applications across various fields.

Sample Dataset

Sample of Dataset

Key Features

  • Training-Free Diffusion Framework: Generates images from binary skeletons without the need for model training or fine-tuning.
  • Diverse Backgrounds: Creates images with varied, realistic backgrounds, which helps models trained on the synthesized data generalize.

Methodology

Diffusion Process:

  • Each binary mask is combined with a controllable amount of noise and encoded by a Variational Autoencoder (VAE) into latent variables.
  • A denoising U-Net, guided by text prompts, refines these latent variables to produce realistic images.
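
The first step above, mixing a mask with controllable noise, can be sketched as follows. This is only an illustration under stated assumptions: the Gaussian noise model, the linear blend, and the names add_controllable_noise and strength are hypothetical and are not taken from the paper or the project code. The VAE encoding and U-Net denoising steps are not shown.

```python
import numpy as np


def add_controllable_noise(mask, strength=0.2, seed=None):
    """Blend a binary mask with clipped Gaussian noise.

    mask:     2-D array with values in {0, 1}.
    strength: hypothetical knob in [0, 1]; 0 returns the mask unchanged.
    seed:     optional seed for reproducible noise.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    rng = np.random.default_rng(seed)
    mask = np.asarray(mask, dtype=np.float32)
    # Assumed noise model: Gaussian centered at mid-gray, clipped to [0, 1].
    noise = rng.normal(loc=0.5, scale=0.25, size=mask.shape)
    noise = np.clip(noise, 0.0, 1.0).astype(np.float32)
    # Linear blend: the mask dominates when strength is small.
    return (1.0 - strength) * mask + strength * noise


# Usage: a toy 8x8 mask with a single horizontal stroke.
demo_mask = np.zeros((8, 8), dtype=np.float32)
demo_mask[3, :] = 1.0
noisy = add_controllable_noise(demo_mask, strength=0.3, seed=0)
```

With this particular blend, any strength below 0.5 keeps foreground pixels above 0.5 and background pixels below it, so thresholding the noisy input recovers the original mask. The noisy image would then be handed to an image-to-image diffusion pipeline together with a text prompt.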

Experimental Results

  • High Quality: Achieves the lowest FID score among the compared methods, indicating more realistic image styles.
  • Consistency: Preserves morphology; the skeleton shape is well preserved in the synthesized images.
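
For readers unfamiliar with the FID metric mentioned above, the sketch below computes the Fréchet distance between Gaussians fitted to two sets of feature vectors. It is a minimal illustration, not the evaluation code used for the reported results: the function name frechet_distance is hypothetical, and in standard FID the features come from a pretrained Inception network, which is not included here.

```python
import numpy as np


def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_real, feats_gen: arrays of shape (n_samples, n_features).
    Lower values mean the two feature distributions are closer.
    """
    feats_real = np.asarray(feats_real, dtype=np.float64)
    feats_gen = np.asarray(feats_gen, dtype=np.float64)
    mu_r = feats_real.mean(axis=0)
    mu_g = feats_gen.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    cov_g = np.atleast_2d(np.cov(feats_gen, rowvar=False))
    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative for
    # covariance matrices (tiny negative values are numerical noise).
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt)


# Usage with random stand-in features (real FID uses Inception features).
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
shifted = real + np.array([3.0, 0.0, 0.0, 0.0])
score = frechet_distance(real, shifted)
```

In the usage example the two sets share the same covariance and differ only by a mean shift of length 3, so the distance reduces to the squared shift.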

For more details, visit the Project Page.

Citation

Diffusion Prism: Enhancing Diversity and Morphology Consistency in Mask-to-Image Diffusion

@article{wang2025diffusion,
  title={Diffusion Prism: Enhancing Diversity and Morphology Consistency in Mask-to-Image Diffusion},
  author={Wang, Hao and Chen, Xiwen and Bastola, Ashish and Qin, Jiayou and Razi, Abolfazl},
  journal={arXiv preprint arXiv:2501.00944},
  year={2025}
}

FLAME Diffuser: Wildfire Image Synthesis using Mask Guided Diffusion





Contributing

Contributors to this project:

Hao Wang

Xiwen Chen

Ashish Bastola

Jiayou Qin

Abolfazl Razi