Coding a GAN in Keras

Of course, the best way to learn is by doing, so let's jump in and start coding our first GAN. In this example, we will build a basic DCGAN and then modify it later for our purposes. Open up Chapter_3_2.py and follow these steps:

This code was originally pulled from https://github.com/eriklindernoren/Keras-GAN, which is one of the best collections of GAN implementations in Keras anywhere, all thanks to Erik Linder-Norén. Great job, and thanks for the hard work, Erik.

An alternate listing of a vanilla GAN has been added as Chapter_3_1.py for your learning pleasure.
  1. We start by importing libraries:
from __future__ import print_function, division
from keras.datasets import mnist
from keras.layers import Input, Dense, Reshape, Flatten, Dropout
from keras.layers import BatchNormalization, Activation, ZeroPadding2D
from keras.layers.advanced_activations import LeakyReLU
from keras.layers.convolutional import UpSampling2D, Conv2D
from keras.models import Sequential, Model
from keras.optimizers import Adam
import matplotlib.pyplot as plt
import sys
import numpy as np
  2. There are a few new types introduced in the preceding code: Reshape, BatchNormalization, ZeroPadding2D, LeakyReLU, Model, and Adam. We will explore each of these types in more detail next.
  3. Most of our previous examples worked with basic scripts. We are now at a point where we want to build types (classes) of our own for reuse later. That means we now start by defining our class like so:
class DCGAN():
  4. So, we create a new class (type) called DCGAN for our implementation of a deep convolutional GAN.
  5. By Python convention, we would normally define our __init__ function next. However, for our purposes, let's first look at the generator function:
def build_generator(self):
    model = Sequential()
    model.add(Dense(128 * 7 * 7, activation="relu", input_dim=self.latent_dim))
    model.add(Reshape((7, 7, 128)))
    model.add(UpSampling2D())
    model.add(Conv2D(128, kernel_size=3, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Activation("relu"))
    model.add(UpSampling2D())
    model.add(Conv2D(64, kernel_size=3, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Activation("relu"))
    model.add(Conv2D(self.channels, kernel_size=3, padding="same"))
    model.add(Activation("tanh"))
    model.summary()

    noise = Input(shape=(self.latent_dim,))
    img = model(noise)
    return Model(noise, img)

  6. The build_generator function builds the art-forger model: it takes a sample of noise and tries to convert it into an image the discriminator will believe is real. It still uses the principle of convolution for efficiency, but in reverse; instead of extracting feature maps to recognize an image, the generator starts from feature maps of noise and builds an image out of them.
    In the preceding block of code, note how the input starts as 128 7 x 7 feature maps of noise, with a Reshape layer turning the flat Dense output into that layout. The model then up-samples (the reverse of pooling or down-sampling) the feature maps to twice the size (14 x 14), applies another convolutional layer, and then up-samples again (2x, to 28 x 28) until the correct image size (28 x 28 for MNIST) is generated. We also see the use of a new layer type called BatchNormalization, which we will cover in more detail shortly. A quick shape check is sketched after this step.
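To make that shape flow concrete, here is a minimal sketch, not part of the original listing, that pushes a single noise vector through the finished generator and prints the resulting image shape. It assumes the DCGAN class is complete, including the __init__ function we define in step 9:

import numpy as np

# Build the full DCGAN; __init__ (step 9) sets latent_dim = 100 and
# constructs self.generator for us.
gan = DCGAN()

# Draw one 100-dimensional noise vector from a standard normal.
noise = np.random.normal(0, 1, (1, gan.latent_dim))

# The generator maps the (1, 100) noise to a (1, 28, 28, 1) grayscale
# image with pixel values in [-1, 1], thanks to the final tanh.
img = gan.generator.predict(noise)
print(img.shape)  # (1, 28, 28, 1)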
  7. Next, we will build the build_discriminator function like so:
def build_discriminator(self):
    model = Sequential()
    model.add(Conv2D(32, kernel_size=3, strides=2, input_shape=self.img_shape, padding="same"))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    model.add(Conv2D(64, kernel_size=3, strides=2, padding="same"))
    model.add(ZeroPadding2D(padding=((0,1),(0,1))))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    model.add(Conv2D(128, kernel_size=3, strides=2, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    model.add(Conv2D(256, kernel_size=3, strides=1, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))
    model.summary()

    img = Input(shape=self.img_shape)
    validity = model(img)
    return Model(img, validity)

  8. This time, the discriminator tests image inputs and determines whether they are fake. It uses convolution to identify features, and in this example it uses ZeroPadding2D to pad the 7 x 7 feature maps out to 8 x 8 with a border of zeros, so that the next stride-2 convolution can halve them evenly. The opposite form of this layer would be Cropping2D, which crops an image. Note how this model uses no pooling layers with its convolutions: following the DCGAN guidelines, the stride-2 convolutions do the down-sampling themselves, while the generator grows the spatial dimensions back up with its UpSampling2D layers. Also notice that the convolutional layers use an odd kernel size (3) combined with strides of 2. We will explore the other new special layers, LeakyReLU and BatchNormalization, in the coming sections. The shape flow through this model is sketched below.
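As a quick sanity check, here is another minimal sketch, again not part of the original listing, that traces the spatial dimensions through the discriminator and confirms it reduces a 28 x 28 image to a single probability:

# Spatial shape trace (stride-2 convolutions halve the feature maps;
# no pooling layers are used anywhere):
#   28x28x1 -> Conv2D(32, strides=2)        -> 14x14x32
#           -> Conv2D(64, strides=2)        -> 7x7x64
#           -> ZeroPadding2D((0,1),(0,1))   -> 8x8x64
#           -> Conv2D(128, strides=2)       -> 4x4x128
#           -> Conv2D(256, strides=1)       -> 4x4x256
#           -> Flatten -> Dense(1, sigmoid) -> one real/fake probability
import numpy as np

gan = DCGAN()
fake_img = np.random.uniform(-1, 1, (1, 28, 28, 1))  # a random "image"
validity = gan.discriminator.predict(fake_img)
print(validity.shape)  # (1, 1): a single probability in [0, 1]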
  9. We will now circle back and define the __init__ function like so:
def __init__(self):
    self.img_rows = 28
    self.img_cols = 28
    self.channels = 1
    self.img_shape = (self.img_rows, self.img_cols, self.channels)
    self.latent_dim = 100
    optimizer = Adam(0.0002, 0.5)

    self.discriminator = self.build_discriminator()
    self.discriminator.compile(loss='binary_crossentropy',
                               optimizer=optimizer, metrics=['accuracy'])

    self.generator = self.build_generator()
    z = Input(shape=(self.latent_dim,))
    img = self.generator(z)

    self.discriminator.trainable = False
    valid = self.discriminator(img)

    self.combined = Model(z, valid)
    self.combined.compile(loss='binary_crossentropy', optimizer=optimizer)
  10. This initialization code sets up the size of our input images (28 x 28 x 1; one channel for grayscale) and a latent dimension of 100 for the noise input. It then sets up an Adam optimizer, something else we will review in the later section on optimizers. After this, it builds and compiles the discriminator, and then builds the generator. Finally, it combines the two sub-networks (generator and discriminator) into a single combined model that feeds noise z through the generator and into the discriminator. Note that self.discriminator.trainable is set to False just before the combined model is built: the discriminator's weights are frozen inside combined, so training the combined model updates only the generator. This allows the networks to work in tandem while keeping their training balanced, a concept we will look at more closely under optimizers. The sketch after this step shows the combined model in action.
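To see why the trainable = False line matters, here is a one-batch sketch, anticipating the training loop covered in the next section, of training the generator through the combined model:

import numpy as np

gan = DCGAN()

# A batch of 32 noise vectors, labeled 1 ("real"): the generator is
# rewarded when the frozen discriminator is fooled into answering real.
noise = np.random.normal(0, 1, (32, gan.latent_dim))
valid = np.ones((32, 1))

# Because self.discriminator.trainable was False when combined was
# compiled, this step updates only the generator's weights.
g_loss = gan.combined.train_on_batch(noise, valid)
print(g_loss)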
  11. Before we get too deep, take some time to run this example. The sample can take an extensive amount of time to run, so start it now, keep it running, and return to the book while it works.
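The entry point at the bottom of the script looks something like the following; the train method and the epochs, batch_size, and save_interval arguments come from Erik Linder-Norén's original listing, and we will cover them in the next section:

if __name__ == '__main__':
    # Build the networks, then train; train() saves a grid of sample
    # images to the images folder every save_interval epochs.
    dcgan = DCGAN()
    dcgan.train(epochs=4000, batch_size=32, save_interval=50)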
  12. As the sample runs, you will be able to see the generated output placed into a folder called images, within the same folder as your running Python file. Go ahead and watch as a new image is saved every 50 epochs, as shown in the following example:

Example of output generated from a GAN

The preceding image shows the results after around 3,900 epochs. When you first start training, it will take a while to get results this good.

That covers the basics of setting up the models; all the real work is in the training, which we will cover in the next section.
