Abstract
Diffusion models have demonstrated strong generative capabilities across domains ranging from image synthesis to complex reasoning tasks. However, most inference-time scaling methods rely on fixed denoising schedules, limiting their ability to adaptively allocate computation based on instance difficulty or task-specific demands. We introduce the challenge of adaptive inference-time scaling—dynamically adjusting computational effort during inference—and propose Adaptive Bi-directional Cyclic Diffusion (ABCD), a flexible, search-based inference framework. ABCD refines outputs through bi-directional diffusion cycles while adaptively controlling exploration depth and termination. It comprises three components: Cyclic Diffusion Search, Automatic Exploration-Exploitation Balancing, and Adaptive Thinking Time. Experiments show that ABCD improves performance across diverse tasks while maintaining computational efficiency.
Background
Diffusion models show strong generative power. However, most inference-time scaling methods rely on fixed denoising schedules, limiting their ability to adapt computation to instance difficulty or task-specific demands. This rigidity restricts performance on challenging inputs and can lead to inefficient use of computation.
Proposed Solution: We introduce the challenge of adaptive inference-time scaling—dynamically adjusting computational effort during inference—and propose the Adaptive Bi-directional Cyclic Diffusion (ABCD) framework.
Adaptive Bi-directional Cyclic Diffusion (ABCD)
ABCD reframes diffusion model inference as a flexible and efficient search process. The method is composed of three components, which operate in sequence during each iteration (or cycle) of the search:
1. Cyclic Diffusion Search (CDS)
Enables iterative refinement by cycling bi-directionally through the diffusion timeline. Each cycle consists of three stages:
- Denoising: $N$ particles are quickly denoised from $t=T$ to $t=0$ using accelerated DDIM sampling, yielding initial candidates $x_0$
- Selection and Copy: the Top-$K$ particles are selected using a reward function and each is replicated $J$ times
- Noising: the $K \times J$ replicas are sent back to a go-back timestep $t'$ via the forward process $q(x_{t'}|x_0)$, as sketched below
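Below is a minimal sketch of one such cycle. The helpers `ddim_denoise` (the accelerated DDIM sampler), `reward` (the task-specific reward), and `q_forward` (the forward process $q(x_{t'}|x_0)$) are hypothetical stand-ins, not APIs from the paper:

```python
import numpy as np

def cds_cycle(particles, ddim_denoise, reward, q_forward, K, J, t_go_back):
    """One Cyclic Diffusion Search cycle: denoise -> select/copy -> noise."""
    # 1) Denoising: fast DDIM pass taking all N particles from t=T down to t=0.
    x0 = ddim_denoise(particles)                # (N, dim) candidate x_0's

    # 2) Selection and copy: keep the Top-K particles by reward, J copies each.
    top_k = x0[np.argsort(reward(x0))[-K:]]     # (K, dim)
    replicas = np.repeat(top_k, J, axis=0)      # (K*J, dim)

    # 3) Noising: send every replica back to timestep t' via q(x_{t'} | x_0).
    return q_forward(replicas, t_go_back)       # (K*J, dim) noisy particles
```

Repeating `cds_cycle` on its own output yields the bi-directional refinement loop; how $t'$ is chosen is the subject of the next component.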
2. Automatic Exploration-Exploitation Balancing (AEEB)
Dynamically controls exploration depth using a "temperature pool" $\mathcal{T}=(t_1, t_2, \ldots, t_M)$ instead of a fixed go-back timestep. The $K \times J$ replicas are distributed across all temperatures, allowing parallel exploration at multiple depths; suboptimal temperatures are automatically discarded by the subsequent Top-$K$ selection.
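As a rough illustration under the same hypothetical helpers as above, the replicas can be spread round-robin over the pool (the paper's exact allocation scheme may differ), with each particle tagged by its originating temperature:

```python
import numpy as np

def aeeb_cycle(particles, ddim_denoise, reward, q_forward, K, J, temp_pool):
    """One cycle where replicas fan out across a pool of go-back temperatures."""
    x0 = ddim_denoise(particles)                # denoise everything to t=0
    top_k = x0[np.argsort(reward(x0))[-K:]]     # Top-K survivors by reward
    replicas = np.repeat(top_k, J, axis=0)      # K*J replicas in total

    # Assign replicas round-robin over the M temperatures, so every exploration
    # depth t_m gets a share of candidates to try in parallel this cycle.
    next_particles, origins = [], []
    for i, x in enumerate(replicas):
        t_m = temp_pool[i % len(temp_pool)]
        next_particles.append(q_forward(x[None, :], t_m)[0])
        origins.append(t_m)

    # `origins` tags each particle with its go-back temperature; temperatures
    # that yield weak candidates are pruned by the next cycle's Top-K step.
    return np.stack(next_particles), origins
```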
3. Adaptive Thinking Time (ATT)
Provides a principled stopping criterion by monitoring how solution quality evolves. Termination occurs once all Top-$K$ particles have originated from the lowest temperature ($t=0$) for $\kappa$ consecutive cycles, indicating that global exploration is no longer needed and the search has converged to local refinement.
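A minimal sketch of this check, reusing the hypothetical per-particle `origins` bookkeeping from the AEEB sketch above (each per-cycle list here would hold the temperatures that produced that cycle's selected Top-$K$ particles):

```python
def att_should_stop(origin_history, kappa):
    """Adaptive Thinking Time stopping rule (sketch).

    origin_history: one list per completed cycle, each holding the go-back
    temperatures that produced that cycle's Top-K particles.
    """
    if len(origin_history) < kappa:
        return False
    # Stop once every Top-K particle has come from the lowest temperature
    # (t = 0) for kappa consecutive cycles: global exploration no longer helps.
    return all(all(t == 0 for t in cycle) for cycle in origin_history[-kappa:])
```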
Experimental Results
ABCD's effectiveness is demonstrated across a diverse suite of challenging tasks, where it delivers superior performance and computational efficiency compared to fixed-schedule baselines: standard diffusion sampling, Best-of-N (BoN), Sequential Monte Carlo (SMC), Beam Search (BS), and Search over Path (SoP).
The experiments cover a wide range of tasks, including the Mixture of Gaussian (MoG) proof-of-concept, Sudoku Puzzle Completion, Pixel Maze Path Generation, OGBench Point Maze Navigation, QM9 Molecular 3D Structure Prediction, and Text-to-Image Generation. Overall, ABCD consistently achieves significant gains in performance and efficiency across all domains.
Key Experimental Findings
Mixture of Gaussian (MoG)
The Mixture of Gaussian (MoG) toy task serves as a proof-of-concept to illustrate the necessity of multiple go-back noise levels and an adaptive terminal condition during iterative refinement. ABCD achieved a 100% success rate and minimal final distance on both Dataset 1 (locally clustered modes) and Dataset 2 (distant modes).
- Small go-back fails: a small go-back step (e.g., GB 20) cannot escape poor initial predictions on the more complex Dataset 2.
- Large go-back risks: a large go-back step can over-cycle and diverge on the easier Dataset 1, losing useful information.
- Adaptive balance: ABCD adaptively balances exploration and exploitation, enabling effective scaling.
Overall Performance on Key Tasks
ABCD consistently delivered state-of-the-art results across these domains, with superior time-accuracy trade-offs:
Sudoku Puzzle Completion
A logical reasoning benchmark requiring both global exploration and local refinement. ABCD consistently outperforms all baselines, achieving 100% accuracy on harder test sets where the strongest baseline (SoP) reaches only 95.5%, and it maintains stable accuracy across all difficulty levels.
Pixel Maze Path Generation (OOD)
Tests generalization to unseen, larger maze structures. ABCD achieves near-perfect success rates significantly faster than all baselines, and the performance gap widens as maze size increases (sizes 11-15), highlighting superior adaptive exploration.
Molecular 3D Structure Prediction (QM9)
Requires generating physically and chemically valid 3D conformations. ABCD significantly outperforms all baselines, achieving peak molecular stability of ~0.99 (vs. SoP's 0.94), emphasizing robustness in complex generation tasks.
Text-to-Image Generation
Evaluated on high-dimensional outputs using compressibility and aesthetic scores. ABCD consistently outperforms BoN and SoP, efficiently identifying high-reward samples. ABCD achieves the same compression level that SoP reaches at 272s in less than a quarter of the time.
OGBench Point Maze Navigation
A long-horizon planning task (1000+ steps). ABCD consistently surpassed all baselines and was the only method to achieve perfect performance on both Large and Giant mazes.
Ablation Studies
Ablation studies confirm the effectiveness and necessity of ABCD's adaptive components. Key findings are summarized below.
(a)–(b) Task-Specific Optimal Exploration Depth
The optimal go-back noise level for iterative refinement varies substantially by task. For example, the best go-back step size was observed to be approximately 3/4 of the total denoising steps for OGBench, 2/5 for Sudoku, and 4/5 for Pixel Maze. This variation motivates ABCD's Automatic Exploration–Exploitation Balancing (AEEB), which probes multiple exploration depths in parallel to find the task-specific optimum.
(c) Adaptive Terminal Condition Necessity
An ablation on terminal criteria shows that using an adaptive stopping rule substantially improves overall time–success trade-offs by avoiding wasted computation on easy instances while allocating additional cycles to difficult ones. The adaptive terminal condition therefore plays a critical role in ABCD's efficiency.
(d) Dynamic Compute Allocation (Adaptive Thinking Time)
ABCD's Adaptive Thinking Time (ATT) criterion automatically scales computational effort to instance difficulty. Harder instances (e.g., Sudoku puzzles with fewer provided clues) receive more thinking cycles, while simpler instances (those with more provided clues) terminate earlier, producing consistently superior time–success trade-offs compared to fixed-cycle baselines.
Adaptive Exploration and Refinement Mechanism
ABCD dynamically controls the balance between global exploration and local refinement across inference cycles. This mechanism is crucial for achieving high performance with optimal computational efficiency, especially in complex reasoning tasks such as Sudoku.
- Behavior Varies by Difficulty: Search behavior varies significantly across Sudoku cases; harder instances (fewer provided clues) tend to require longer searches (more cycles).
- Dynamic Exploration/Exploitation: Early cycles focus on large modifications (higher go-back temperatures, i.e., larger noise addition), emphasizing global exploration; later cycles concentrate on smaller refinements (lower go-back temperatures, less noise addition), shifting toward local exploitation.
- Instance-Specific Graph: This dynamic adjustment means the model automatically expands and adjusts the diffusion generation process, resulting in a different generation graph for each final $x_0$ prediction.
Conclusion
This work addresses the critical limitation of fixed computational allocation in Diffusion Models by introducing the challenge of Adaptive Inference-Time Scaling. We proposed Adaptive Bi-directional Cyclic Diffusion (ABCD), a flexible, search-based inference framework that dynamically adjusts computational effort.
BibTeX
@misc{lee2025adaptiveinferencetimescalingcyclic,
title={Adaptive Inference-Time Scaling via Cyclic Diffusion Search},
author={Gyubin Lee and Truong Nhat Nguyen Bao and Jaesik Yoon and Dongwoo Lee and Minsu Kim and Yoshua Bengio and Sungjin Ahn},
year={2025},
eprint={2505.14036},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2505.14036},
}