AdaGAN Summary

With the advent of Generative Adversarial Networks (GANs), state-of-the-art results have been produced on a variety of generative tasks such as image synthesis. However, one of the main issues that troubles GANs is training instability, along with their propensity to miss modes of the data distribution. The AdaGAN paper introduces a novel method that helps alleviate this issue without adding much overhead to the GAN itself. AdaGAN is an iterative procedure in which a new component is added to a mixture model at each step, with each component trained by running a GAN on a reweighted sample of the training data. This is inspired by boosting algorithms such as AdaBoost (from which the paper derives its name), which apply the same idea at the level of weak learners rather than entire GANs. The paper does not aim to improve the performance of the GANs themselves; rather, it acts as a meta-algorithm that can be wrapped around existing GAN implementations to achieve faster convergence and better mode coverage, so the required code changes should not be extensive.

The AdaGAN algorithm consists of several important steps. First, the GAN algorithm (or some other generative model) is run in the usual way to initialize the generative model, producing the generator G_1. Then, at every t-th step, the following are performed (see the code sketch after this list):

o pick the mixture weight β_t for the next component;
o update the weights W_t of the training examples so as to bias the next component towards "hard" examples not yet covered by the current mixture of generators G_{t−1};
o run the GAN algorithm, this time importance sampling mini-batches according to the updated weights W_t, resulting in a new generator G_t^c; and finally
o update the mixture of generators G_t = (1 − β_t) G_{t−1} + β_t G_t^c (notation expressing the mixture of G_{t−1} and G_t^c with probabilities 1 − β_t and β_t).

Following the specifics of the algorithm, the paper introduces a measure of error called the f-divergence. It then presents various proofs regarding the validity of the algorithm and the optimal β_t value at each step; I will skip these sections as they are not the main focus of our class, which is the implementation and application of neural networks. After the proof of concept for the algorithm itself, the paper compares the AdaGAN-boosted GAN to a couple of more naive baselines, Best-of-N and Ensemble, along with a vanilla GAN. Best-of-N runs N independent GAN instances and keeps the run that returns the best result on the validation set. Ensemble is a mixture of T GANs trained independently and combined with equal weights. In their tests, the GAN boosted using AdaGAN performed significantly better than the vanilla GAN and produced the best results and faster convergence rates compared to both the Best-of-N and Ensemble GANs.

The reason this paper interested me is that AdaGAN can essentially be applied on top of ANY current GAN implementation to reduce missed modes. Thus it pertains not just to our own individual group projects, but also to every other project within the class. One case where AdaGAN could see use is face-generation GANs, which often have convergence issues in which only one face ends up being generated.
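To make the loop above concrete, here is a minimal Python sketch of the meta-algorithm under simplifying assumptions. The train_gan routine, the generator objects with a .sample() method, the fixed choice of β_t, and the discriminator-based reweighting heuristic are all hypothetical placeholders: they stand in for whatever existing GAN implementation is being wrapped and for the paper's principled reweighting scheme and optimal-β_t derivation, which are not reproduced here.

import numpy as np

def adagan(data, train_gan, num_steps=10, beta=0.5):
    # Sketch of the AdaGAN outer loop, NOT the paper's exact formulas.
    # train_gan(data, weights) is assumed to train a base GAN with
    # mini-batches importance-sampled according to `weights`, and to return
    # (generator, d_scores), where d_scores[i] in (0, 1) is the final
    # discriminator's "realness" score for training example i.
    n = len(data)
    weights = np.full(n, 1.0 / n)           # W_1: start uniform
    generators, mixture_weights = [], []

    for t in range(num_steps):
        # Train the next component G_t^c on the reweighted data.
        generator, d_scores = train_gan(data, weights)

        # Mixture weight beta_t. The paper derives an optimal choice; a fixed
        # constant (with beta_1 = 1 for the first component) is used here as
        # a simple heuristic.
        beta_t = 1.0 if t == 0 else beta
        mixture_weights = [w * (1.0 - beta_t) for w in mixture_weights]
        mixture_weights.append(beta_t)
        generators.append(generator)

        # Re-weight training examples towards "hard" ones the current mixture
        # does not cover well. Here we simply up-weight examples the
        # discriminator still confidently recognizes as real; the paper uses
        # a principled scheme based on density ratios instead.
        weights = np.maximum(d_scores, 1e-8)
        weights /= weights.sum()

    return generators, mixture_weights

def sample_from_mixture(generators, mixture_weights, num_samples, rng=None):
    # G_t = (1 - beta_t) G_{t-1} + beta_t G_t^c: pick a component with
    # probability equal to its mixture weight, then sample from it.
    rng = rng or np.random.default_rng()
    samples = []
    for _ in range(num_samples):
        k = rng.choice(len(generators), p=mixture_weights)
        samples.append(generators[k].sample(1))
    return samples

The sampling helper shows why the overhead is small: the base GAN code is untouched, and the only additions are the per-example weights fed into mini-batch sampling and the bookkeeping for the mixture weights.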