StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces

Kyeongmin Yeo*

KAIST

ICLR 2025

*Equal contribution

Assorted images generated by StochSync
Assorted mesh textures and panoramas generated using StochSync. The background (environment map) is itself a 360° panorama generated by StochSync. StochSync extends the capabilities of image diffusion models trained in square spaces to produce images in arbitrary spaces such as cylinders, spheres, tori, and mesh surfaces.

✨Abstract

We propose StochSync, a method for generating images in arbitrary spaces, such as 360° panoramas or textures on 3D surfaces, using a pretrained image diffusion model. The main challenge is bridging the gap between the 2D images understood by the diffusion model (instance space $\mathcal{X}$) and the target space for image generation (canonical space $\mathcal{Z}$). Unlike previous methods that struggle without strong conditioning or lack fine details, StochSync combines the strengths of Diffusion Synchronization and Score Distillation Sampling to perform effectively even with weak conditioning. Our experiments show that StochSync outperforms prior finetuning-based methods, especially in 360° panorama generation.


💡Overall Idea

Our approach combines the strengths of two existing methods:

Diffusion Synchronization (DS): Excels in producing detailed images but struggles with coherence across views when instance and canonical spaces are not pixel-aligned, often requiring strong guidance like depth maps.

Score Distillation Sampling (SDS): Achieves coherent results, but lacks fine details and high-quality textures.
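To make the two spaces concrete, here is a minimal toy sketch of the synchronization idea, not the authors' implementation. It assumes the canonical space is a cylindrical panorama stored as a 2D array, that each instance-space view is a horizontal window with wrap-around, and that views are synchronized by averaging where they overlap. The names `project` and `unproject_and_average` are hypothetical, and real projections (perspective cameras, mesh UV maps) are generally not pixel-aligned like this one.

```python
import numpy as np

def project(canvas, start, width):
    """Canonical -> instance: crop a window, wrapping around horizontally."""
    cols = (start + np.arange(width)) % canvas.shape[1]
    return canvas[:, cols]

def unproject_and_average(views, starts, canvas_width):
    """Instance -> canonical: scatter views back and average the overlaps."""
    height, width = views[0].shape
    total = np.zeros((height, canvas_width))
    count = np.zeros(canvas_width)
    for view, start in zip(views, starts):
        cols = (start + np.arange(width)) % canvas_width
        total[:, cols] += view   # columns within one view are unique
        count[cols] += 1.0
    # Columns not covered by any view stay zero instead of dividing by zero.
    return total / np.maximum(count, 1.0)

# Four overlapping views of a 16-column panorama; the last one wraps around.
canvas = np.arange(3 * 16, dtype=float).reshape(3, 16)
starts = [0, 4, 8, 12]
views = [project(canvas, s, 8) for s in starts]
merged = unproject_and_average(views, starts, canvas.shape[1])
```

In this pixel-aligned toy case, consistent views average back to the original canvas exactly. The coherence problem described above arises when the mapping between the two spaces is not this simple.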

We observed that the coherence in SDS comes from using maximum stochasticity in the denoising process: specifically, setting the noise level $\sigma_t$ to its highest value during each reverse diffusion step in DDIM.

By incorporating maximum stochasticity into the reverse diffusion process of DS, along with other methods to further enhance quality, we achieve the coherence benefits of SDS while retaining the detailed quality of DS.
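The effect of maximum stochasticity can be illustrated with the standard generalized DDIM update. This is a minimal sketch, not the authors' code. The function names `ddim_step` and `max_sigma` and the variables `abar_t` and `abar_prev` (cumulative alpha products at the current and previous timesteps) are assumptions introduced for illustration. In this formulation the largest admissible noise level is $\sigma_t = \sqrt{1 - \bar\alpha_{t-1}}$, at which the weight on the predicted noise becomes zero.

```python
import numpy as np

def ddim_step(x_t, eps_pred, abar_t, abar_prev, sigma_t, rng):
    """One generalized DDIM reverse step with noise level sigma_t."""
    # Estimate of the clean sample implied by the noise prediction.
    x0_pred = (x_t - np.sqrt(1.0 - abar_t) * eps_pred) / np.sqrt(abar_t)
    # Weight on the predicted noise; clamped at 0 to absorb rounding error.
    dir_coeff = np.sqrt(max(1.0 - abar_prev - sigma_t**2, 0.0))
    fresh_noise = rng.standard_normal(x_t.shape)
    return (np.sqrt(abar_prev) * x0_pred
            + dir_coeff * eps_pred
            + sigma_t * fresh_noise)

def max_sigma(abar_prev):
    """Largest admissible sigma_t: the predicted-noise term vanishes."""
    return np.sqrt(1.0 - abar_prev)

# Toy inputs standing in for a noisy sample and a model's noise prediction.
rng = np.random.default_rng(0)
x_t = rng.standard_normal((4, 4))
eps_pred = rng.standard_normal((4, 4))
x_prev = ddim_step(x_t, eps_pred, abar_t=0.5, abar_prev=0.9,
                   sigma_t=max_sigma(0.9), rng=rng)
```

With `sigma_t = max_sigma(abar_prev)`, the step reduces to re-noising the predicted clean sample with fresh Gaussian noise, which resembles the forward noising that SDS applies at every iteration. With `sigma_t = 0`, the step is fully deterministic.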

🌍360° Panorama Generation

Comparison of 360° panorama generation using StochSync (Ours) and each baseline method: SDS, SDI, ISM, MVDiffusion, Panfusion, and L-Magic.

🎨3D Mesh Texturing


🌀Non-Euclidean Image Generation

⚽Sphere

🍩Torus