1. Introduction
Deep neural networks contain multiple non-linear hidden layers, which makes them very expressive models that can learn complicated relationships between their inputs and outputs.
2. Motivation
A motivation for dropout comes from a theory of the role of sex in evolution. Sexual reproduction involves taking half the genes of one parent and half of the other, then adding a very small amount of random mutation.
3. Model Description
Dropout is used to prevent a neural network from overfitting. Standard backpropagation learning builds up brittle co-adaptations that work for the training data but do not generalize to unseen data.
4. Learning with Dropout
Dropout training is similar to standard stochastic gradient descent. The only difference is that for each training case in a mini-batch, we sample a thinned network by dropping out units.
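The per-case sampling described above can be illustrated with a minimal sketch in plain Python. It covers only the forward masking step for one hidden layer, not a full training loop. The function names, the `keep_prob` parameter (taken here to be the probability of retaining a unit), and the example activations are all assumptions introduced for illustration; they do not come from the text above.

```python
import random


def sample_dropout_masks(batch_size, num_units, keep_prob, rng):
    """Draw one independent 0/1 mask per training case.

    keep_prob is assumed to be the probability of retaining a unit.
    """
    return [
        [1.0 if rng.random() < keep_prob else 0.0 for _ in range(num_units)]
        for _ in range(batch_size)
    ]


def apply_dropout(activations, masks):
    """Zero out dropped units; each row (training case) uses its own mask."""
    return [
        [a * m for a, m in zip(row, mask_row)]
        for row, mask_row in zip(activations, masks)
    ]


# Hypothetical mini-batch: 3 training cases, 4 hidden units each.
rng = random.Random(0)
hidden = [
    [0.5, 1.2, -0.3, 0.8],
    [0.1, 0.9, 0.4, -0.7],
    [1.0, 1.0, 1.0, 1.0],
]

# A fresh mask per training case means each case sees its own thinned network.
masks = sample_dropout_masks(len(hidden), len(hidden[0]), keep_prob=0.5, rng=rng)
thinned = apply_dropout(hidden, masks)
```

Because the mask is drawn independently for every row, two cases in the same mini-batch will generally pass through different thinned networks. In a full implementation, the same mask would also be applied during the backward pass so that dropped units receive no gradient for that case.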
5. Experiments on MNIST
We performed classification experiments on MNIST, a standard toy dataset of handwritten digits, to test dropout's ability to prevent overfitting.