Eroxl's Notes
Learning Rate Decay

Learning rate decay is a method for slowly reducing the learning rate of an algorithm over time. This helps the algorithm "settle" into a local minimum instead of jumping back and forth over it.

There are three main types of learning rate decay:

Time Based Learning Rate Decay

Time based decay is a type of learning rate decay which decreases the learning rate relative to the last learning rate.

Time based learning rate decay can be defined by the following equation:

$$\eta_{n + 1} = \frac{\eta_n}{1 + d n}$$

  • Definitions
    • $\eta_n$ is the learning rate for the $n$th iteration
    • $n$ is the current iteration step
    • $d$ is the "decay parameter" and controls how quickly the learning rate will change.
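The rule above can be sketched in a few lines of Python (the function name and the example hyperparameters are my own, not from a particular library):

```python
def time_based_decay(initial_lr, d, steps):
    """Apply time-based decay: eta_{n+1} = eta_n / (1 + d * n).

    Returns the list of learning rates [eta_0, eta_1, ..., eta_steps].
    """
    lr = initial_lr
    history = [lr]
    for n in range(steps):
        # Each new rate is computed relative to the previous one
        lr = lr / (1 + d * n)
        history.append(lr)
    return history
```

Because each step divides the previous rate, the schedule never increases; with $n = 0$ the first update leaves the rate unchanged.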

Exponential Learning Rate Decay

Exponential decay is similar to time based decay except that it uses an exponential function to model the decay.

Exponential learning rate decay can be defined by the following equation:

$$\eta_n = \eta_0 e^{-d n}$$

  • Definitions
    • $\eta_n$ is the learning rate for the $n$th iteration ($\eta_0$ being the initial learning rate)
    • $n$ is the current iteration step
    • $d$ is the "decay parameter" and controls how quickly the learning rate will change.
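A minimal Python sketch of the exponential rule (names and example values are my own):

```python
import math

def exponential_decay(initial_lr, d, n):
    """Exponential decay: eta_n = eta_0 * exp(-d * n)."""
    return initial_lr * math.exp(-d * n)
```

Unlike the time based rule, this is a closed-form schedule: $\eta_n$ depends only on the initial rate and the current step, not on the previous rate.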

Step Based Learning Rate Decay

Step based decay is a type of learning rate decay which decreases the learning rate every $r$ steps.

Step based learning rate decay can be defined by the following equation:

$$\eta_n = \eta_0 d^{\left\lfloor \frac{1 + n}{r} \right\rfloor}$$

  • Definitions
    • $\eta_n$ is the learning rate for the $n$th iteration ($\eta_0$ being the initial learning rate)
    • $n$ is the current iteration step
    • $r$ is the number of iterations between learning rate changes (ie. $r = 1$ means that the learning rate will drop every iteration)
    • $d$ is the "decay parameter" and controls how quickly the learning rate will change.
    • $\lfloor \cdot \rfloor$ denotes the floor operator which rounds the input down (ie. $2.9$ is rounded to $2$)
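The step based rule can be sketched like this (function name and example hyperparameters are my own):

```python
import math

def step_decay(initial_lr, d, r, n):
    """Step based decay: eta_n = eta_0 * d ** floor((1 + n) / r)."""
    return initial_lr * d ** math.floor((1 + n) / r)
```

The floor term stays constant for $r$ iterations at a time, so the learning rate holds steady and then drops by a factor of $d$ all at once, producing a staircase-shaped schedule.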