Diffusion Rewards Guided Adversarial Imitation Learning

Abstract

Inspired by the recent dominance of diffusion models in generative modeling, this work proposes diffusion rewards guided adversarial imitation learning (DRAIL), which integrates a diffusion model into generative adversarial imitation learning (GAIL) to yield more precise and smoother rewards for policy learning. Specifically, we propose a diffusion discriminative classifier to construct an enhanced discriminator, and we then design diffusion rewards based on the classifier's output for policy learning. Our method matches or outperforms the baselines across various continuous control domains, including navigation, robot arm manipulation, and locomotion.
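
The abstract does not spell out how the classifier's output becomes a reward, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes a diffusion model that can be conditioned on an "expert" or an "agent" label and that returns a denoising loss for a state-action pair under each label. It then assumes the discriminator output is the sigmoid of the loss difference and that the reward takes the form log D - log(1 - D), a common choice in adversarial imitation learning. The function names and the two loss inputs are hypothetical.

```python
import math


def discriminator_from_losses(loss_expert: float, loss_agent: float) -> float:
    """Map two conditional denoising losses to a score in (0, 1).

    Assumption: a lower loss under the "expert" label than under the
    "agent" label means the pair looks more expert-like, so the score is
    sigmoid(loss_agent - loss_expert).
    """
    z = loss_agent - loss_expert
    # Numerically stable sigmoid for either sign of z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def adversarial_reward(d: float, eps: float = 1e-8) -> float:
    """Assumed reward form: log D - log(1 - D), with clipping for safety."""
    d = min(max(d, eps), 1.0 - eps)
    return math.log(d) - math.log(1.0 - d)


# Hypothetical losses for one state-action pair.
d = discriminator_from_losses(loss_expert=0.1, loss_agent=2.0)
r = adversarial_reward(d)
```

Under these assumptions the reward reduces to the difference between the two losses, so it varies continuously with how much better the expert-conditioned model explains the pair. That is one way to read the abstract's claim of smoother rewards.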

Publication
GenAI4DM Workshop at International Conference on Learning Representations (ICLR) 2024.

Related