Diffusion Project

2024

Diffusion Prism: Enhancing Diversity and Morphology Consistency in Mask-to-Image Diffusion

News: This paper has been accepted to the WACV 2024 4th Workshop on Image/Video/Audio Quality in Computer Vision and Generative AI.

For more details, visit the project page: GitHub - Diffusion Prism.

Introduction

Diffusion Prism is a training-free framework that efficiently transforms binary masks into realistic and diverse samples while preserving morphological features. We found that adding a small amount of artificial noise to the mask significantly assists the image-denoising process. To demonstrate this mask-to-image concept, we use nano-dendritic patterns as an example and compare our method against existing controllable diffusion models. We also extend the framework to other biological patterns, highlighting its potential applications across various fields.

Sample Dataset

Sample of Dataset

Dataset: Download from Google Drive

Key Features

Training-Free Diffusion Framework: Generates images from binary skeletons without the need for model training or fine-tuning.
Diverse Backgrounds: Creates images with varied and realistic backgrounds, enhancing model generalizability.

Methodology

Diffusion Process:

The binary mask is combined with controllable noise, and the result is encoded by a Variational Autoencoder (VAE) into latent variables.
The denoising U-Net then refines these latent variables, guided by text prompts, to produce realistic images.
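
The first step above, combining a mask with controllable noise, can be illustrated with a minimal sketch. The linear blending rule, the function name noisy_mask, and the noise_strength and colored parameters are assumptions made for illustration; they are not taken from the paper, which may combine mask and noise differently.

```python
import numpy as np


def noisy_mask(mask, noise_strength=0.2, colored=True, seed=0):
    """Blend a binary mask with a small amount of uniform random noise.

    mask: 2-D array of 0/1 values with shape (H, W).
    noise_strength: hypothetical knob in [0, 1]; 0 returns the plain mask.
    colored: if True, draw independent noise per RGB channel.
    Returns a float32 RGB image in [0, 1] with shape (H, W, 3).
    """
    rng = np.random.default_rng(seed)
    mask = np.asarray(mask, dtype=np.float32)
    # Replicate the mask across three channels so it looks like an RGB image.
    base = np.repeat(mask[..., None], 3, axis=-1)
    channels = 3 if colored else 1
    noise = rng.random((*mask.shape, channels), dtype=np.float32)
    # Assumed blending rule: a convex combination of mask and noise.
    blended = (1.0 - noise_strength) * base + noise_strength * noise
    return np.clip(blended, 0.0, 1.0).astype(np.float32)


# Example: a short vertical stroke as a stand-in for a skeleton mask.
mask = np.zeros((8, 8))
mask[2:6, 4] = 1
init_image = noisy_mask(mask, noise_strength=0.2)
```

In a full pipeline, an image like init_image would then be encoded by the VAE and denoised by the U-Net under a text prompt, for instance through an off-the-shelf image-to-image diffusion pipeline. That part is omitted here because it requires pretrained weights.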

Experimental Results

High Quality: Achieves the lowest FID score among the compared methods, indicating more realistic image styles.
Consistency: Morphology is preserved; the skeleton shape of the input mask is retained in the synthesized images.
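
For readers unfamiliar with the FID score mentioned above, the sketch below computes the underlying Fréchet distance between two sets of feature vectors. It is a generic illustration, not the evaluation code used for the reported results. In standard FID the features come from a pretrained Inception-v3 network, while here they are plain arrays, and the function name frechet_distance is an assumption.

```python
import numpy as np


def frechet_distance(feats_real, feats_fake):
    """Fréchet distance between Gaussians fitted to two feature sets.

    feats_real, feats_fake: arrays of shape (N, D) with D >= 2.
    Lower values mean the two feature distributions are closer.
    """
    mu_r = feats_real.mean(axis=0)
    mu_f = feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    # Tr((cov_r cov_f)^{1/2}) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_f, which are real and non-negative for
    # covariance matrices (up to numerical error, hence .real and clip).
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_f) - 2.0 * tr_sqrt)


# Example with synthetic features: identical sets versus a shifted copy.
rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 4))
same = frechet_distance(feats, feats)
shifted = frechet_distance(feats, feats + 3.0)
```

Because shifting every feature by a constant leaves the covariance unchanged, the second distance reduces to the squared distance between the means, while the first is zero up to rounding.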

For more details, visit the Project Page.

Citation

Diffusion Prism: Enhancing Diversity and Morphology Consistency in Mask-to-Image Diffusion

@article{wang2025diffusion,
  title={Diffusion Prism: Enhancing Diversity and Morphology Consistency in Mask-to-Image Diffusion},
  author={Wang, Hao and Chen, Xiwen and Bastola, Ashish and Qin, Jiayou and Razi, Abolfazl},
  journal={arXiv preprint arXiv:2501.00944},
  year={2025}
}

FLAME Diffuser: Wildfire Image Synthesis using Mask Guided Diffusion