Eroxl's Notes
Dropout Layer
aliases
Dilution Layer

The dropout layer helps to reduce overfitting in a neural network by randomly disabling some neurones, setting their outputs to 0 during the forward pass. These "disabled" neurones are chosen at random, based on a probability, on every forward pass. Each dropout neurone is connected to exactly one neurone from the previous layer, so the layer otherwise passes values straight through.

Note

Dropout layers are only active during training; during testing / application they simply pass their inputs on unchanged.
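As a sketch of this behaviour (assuming NumPy; the function name and `training` flag are illustrative, not from these notes):

```python
import numpy as np

def dropout_forward(x, p, training):
    """Zero each neurone's output with probability p, but only while training."""
    if not training:
        # Testing / application: pass inputs on unchanged.
        return x
    # One random number per neurone; a neurone is disabled when its
    # random draw falls at or below the dropout rate p.
    r = np.random.rand(*x.shape)
    return np.where(r > p, x, 0.0)

x = np.array([1.0, 2.0, 3.0, 4.0])
print(dropout_forward(x, 0.5, training=False))  # identical to x
```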

Definition
Forward Pass

A single dropout neurone can be defined as

$$y_i = \begin{cases} x_i & \text{if } r_i > p \\ 0 & \text{if } r_i \le p \end{cases}$$

  • Definitions
    • $y_i$ is the output of the $i^{\text{th}}$ dropout neurone
    • $x_i$ is the output of the $i^{\text{th}}$ neurone in the previous layer
    • $r_i$ is a randomly generated number (between 0 and 1)
    • $p$ is the "dropout rate", or how many neurones to "disable" (between 0 and 1, where 0 means 0% and 1 means 100% are disabled)

This could be re-written using a Hadamard product of two matrices to describe a full layer of dropout neurones as follows

$$Y = X \odot M$$

  • Definitions
    • $X$ are the outputs of the neurones in the previous layer
    • $M$ is a matrix of the same dimensions as $X$, populated randomly with 0's and 1's until $p$ (the "dropout rate") percent of the total input neurones are 0
    • $Y$ are the outputs of the dropout layer
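A sketch of building such a mask matrix (assuming NumPy, and zeroing exactly that fraction of entries by shuffling; this is one of several ways to construct it):

```python
import numpy as np

def dropout_mask(shape, p):
    """Mask of 0's and 1's where p percent of the entries are 0."""
    n = int(np.prod(shape))
    mask = np.ones(n)
    mask[: int(p * n)] = 0.0     # disable p percent of the neurones...
    np.random.shuffle(mask)      # ...at random positions
    return mask.reshape(shape)

X = np.arange(1.0, 9.0).reshape(2, 4)
M = dropout_mask(X.shape, 0.25)
Y = X * M                        # the Hadamard product of X and M
print(int((M == 0).sum()))       # → 2 (25% of 8 entries)
```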
Backward Pass

Because a disabled neurone contributed nothing to the output, the gradient is masked in the same way:

$$\frac{\partial C}{\partial X} = \frac{\partial C}{\partial Y} \odot M$$

  • Definitions
    • $Y$ are the outputs of the layer
    • $M$ is the matrix populated with 0's and 1's randomly during the forward pass
    • $C$ is the cost function which is being minimised
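A sketch of the backward pass (assuming NumPy; `dC_dY` stands for the upstream gradient of the cost with respect to the layer's outputs):

```python
import numpy as np

def dropout_backward(dC_dY, M):
    """Route the gradient only through neurones kept in the forward pass."""
    return dC_dY * M  # Hadamard product with the saved mask

M = np.array([1.0, 0.0, 1.0, 0.0])      # mask saved from the forward pass
dC_dY = np.array([0.1, 0.2, 0.3, 0.4])  # upstream gradient
print(dropout_backward(dC_dY, M))       # gradient zeroed where the mask is 0
```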