Efficient Image Sampling in Diffusion Models via Rough Set Selection and Mamba State Spaces

Jianxing Yu, Zhiyong She, Yijin Shi

Abstract


The Denoising Diffusion Probabilistic Model (DDPM) suffers from low sampling efficiency, high computational complexity, and substantial resource demands when processing long sequences. To address these issues, this study introduces a framework that integrates Rough Set theory with a Mamba-based state-space model (SSM). In our approach, each timestep in the diffusion process is treated as an object within a Rough Set domain, where temporal intervals define equivalence relations. These relations yield equivalence classes, and for any subset of the domain we compute its roughness with respect to these classes. The sampling sub-sequence is then chosen as the subset with minimal roughness, which gives a more stable and representative sampling path than random selection. During training, the data object at each timestep initializes the input and state matrices of the SSM. The state transition from the previous timestep and the contribution of the current input are computed dynamically from the temporal intervals. This allows the SSM to perform selective state updates, which improves the training efficiency of the proposed Mamba-DDPM. We evaluated our method against DDPM, DDIM, VAE-DDPM, and ViT-DDPM on the ImageNet and FFHQ datasets. Experimental results show that our approach performs better across multiple metrics, including FID, SSIM, PSNR, and image generation time, at various resolutions. Specifically, compared to ViT-DDPM, our method achieved relative improvements of 0.30% to 15.39% in FID, 2.44% to 21.13% in SSIM, and 0.35% to 3.34% in PSNR for 128x128 image generation. For 512x512 generation, the gains were 2.12% to 13.06% in FID, 1.27% to 14.29% in SSIM, and 0.62% to 2.14% in PSNR. We conclude that the proposed method mitigates the inherent limitations of DDPM and outperforms the other diffusion models compared.
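
The roughness-based selection described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulas, so this example assumes the standard Pawlak definition, roughness = 1 - |lower approximation| / |upper approximation|, and assumes that equivalence classes are consecutive blocks of timesteps of fixed width. The function names (equivalence_classes, roughness, select_min_roughness) and the candidate sets are hypothetical and are not taken from the paper.

```python
def equivalence_classes(num_steps, interval):
    """Partition timesteps 0..num_steps-1 into consecutive blocks.

    Assumption: a fixed-width temporal interval defines the equivalence
    relation; the paper may define the intervals differently.
    """
    return [
        set(range(start, min(start + interval, num_steps)))
        for start in range(0, num_steps, interval)
    ]


def roughness(subset, classes):
    """Pawlak roughness of `subset` with respect to `classes`."""
    subset = set(subset)
    # Lower approximation: union of classes fully contained in the subset.
    lower = set().union(*[c for c in classes if c <= subset])
    # Upper approximation: union of classes that intersect the subset.
    upper = set().union(*[c for c in classes if c & subset])
    if not upper:
        # Convention: the empty set is exactly definable, so roughness is 0.
        return 0.0
    return 1.0 - len(lower) / len(upper)


def select_min_roughness(candidates, classes):
    """Return the candidate sub-sequence with the smallest roughness."""
    return min(candidates, key=lambda s: roughness(s, classes))


# Eight timesteps grouped into intervals of two: {0,1}, {2,3}, {4,5}, {6,7}.
classes = equivalence_classes(8, 2)

# Hypothetical candidate sampling sub-sequences.
candidates = [
    {0, 2, 4, 6},  # touches every class, contains none: roughness 1.0
    {0, 1, 2},     # lower = {0,1}, upper = {0,1,2,3}: roughness 0.5
    {0, 1, 4, 5},  # union of whole classes: roughness 0.0
]

best = select_min_roughness(candidates, classes)
print(sorted(best), roughness(best, classes))
```

Under this plain definition, any union of whole equivalence classes has roughness 0, so a practical selection rule would presumably restrict the candidates (for example, to sub-sequences of a fixed length). That restriction is not specified in the abstract and is left out of the sketch.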



DOI: https://doi.org/10.31449/inf.v49i30.12067

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.