
Diffusion Model

A generative AI model that creates data by learning to gradually remove noise from random static

#Diffusion #Image Generation #Generative AI

What is a Diffusion Model?

A diffusion model is a type of generative AI that produces high-quality images (or other data) by learning to reverse a noise-adding process. It starts with pure random noise and step by step refines it into a coherent output -- much like a sculptor chipping away at a block of marble to reveal a statue inside.

Imagine spilling a drop of ink into a glass of water. Over time the ink diffuses until the water is uniformly cloudy. A diffusion model learns to run this process backwards: starting from the cloudy water and reconstructing the original ink drop. In practice, "ink" is a real image and "cloudy water" is random noise.

How Does It Work?

  1. Forward process (adding noise) -- During training, the model takes real images and gradually adds Gaussian noise over many steps until the image becomes pure static.
  2. Reverse process (removing noise) -- The model learns to predict and remove the noise at each step. Given a noisy image, it estimates what the slightly less noisy version should look like.
  3. Generation -- To create a new image, the model starts from random noise and iteratively denoises it, guided by a text prompt or other conditioning signal.
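
The three steps above can be sketched in a few lines of NumPy. This is a toy, DDPM-style illustration under simplified assumptions, not a real implementation: the noise schedule is a standard linear one, the forward step uses the usual closed-form expression, and `predict_noise` is a hypothetical stand-in for the trained neural network that a real diffusion model would use.

```python
import numpy as np

T = 1000                                  # number of diffusion steps
betas = np.linspace(1e-4, 0.02, T)        # linear noise schedule (assumption)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)           # cumulative signal-retention factors

def forward_noise(x0, t, noise):
    """Forward process: jump from a clean sample x0 to the noisy x_t in one
    closed-form step, mixing signal and Gaussian noise."""
    ab = alpha_bars[t]
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * noise

def reverse_step(x_t, t, predicted_noise, rng):
    """Reverse process: estimate the slightly less noisy x_{t-1} from x_t,
    given the model's prediction of the noise present in x_t."""
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * predicted_noise) \
           / np.sqrt(alphas[t])
    if t > 0:                             # all but the last step add fresh noise
        mean = mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)
    return mean

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))          # a tiny stand-in "image"
noise = rng.standard_normal(x0.shape)

x_T = forward_noise(x0, T - 1, noise)     # after T steps: almost pure static
x_prev = reverse_step(x_T, T - 1, noise, rng)  # one denoising step back
```

Note how `alpha_bars` shrinks toward zero as `t` grows: by the final step almost no signal from the original image survives, which is why generation can start from pure random noise. In a real system, `predict_noise` is a large neural network trained to guess the noise at each step, and the generation loop simply calls `reverse_step` repeatedly from `t = T-1` down to `t = 0`.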

Why Does It Matter?

Diffusion models power some of the most impressive image generation systems available today and have expanded into video, audio, and 3D content generation.

Key Examples

  • Stable Diffusion -- an open-source image generation model by Stability AI.
  • DALL-E (OpenAI) -- generates images from text descriptions.
  • Midjourney -- a popular AI art generation service.
