We can modify the ReLU function so that it has a small gradient when the input is negative. This can be accomplished very simply, as follows:
// leaky_relu returns x for non-negative inputs and a small
// fraction of x (slope 0.01) for negative inputs.
func leaky_relu(x float64) float64 {
    if x >= 0 {
        return x
    }
    return 0.01 * x
}
The chart for the preceding function will look like the following diagram:
Note that this chart has been exaggerated for emphasis: the slope of y with respect to x on the negative side is drawn as 0.1, rather than the 0.01 that is typical for a leaky ReLU.
Because the function always produces a non-zero gradient, it should help prevent the neuron from dying permanently while still giving us many of the benefits of ReLU.
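To make the gradient argument concrete, here is a minimal sketch of the derivative that backpropagation would see; the leakyReLUGrad helper is hypothetical and not part of the original example:
// leakyReLUGrad returns the derivative of leaky_relu with respect to x.
// For non-negative inputs the slope is 1; for negative inputs it is 0.01,
// so the gradient is small but never zero.
func leakyReLUGrad(x float64) float64 {
    if x >= 0 {
        return 1
    }
    return 0.01
}
With a plain ReLU, the same helper would return 0 for every negative input, which is what allows a neuron stuck in the negative region to stop learning entirely.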