A Guide to PyTorch Learning Rate Schedulers

In deep learning, choosing an appropriate learning rate is critical. A fixed learning rate may suffice for simple tasks, but a robust training pipeline usually relies on a Learning Rate Scheduler (LR Scheduler) to change the rate over the course of training, which tends to improve convergence and final model performance.

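An LR Scheduler works in tandem with an optimizer. Each time the scheduler's step() method is called, typically once per epoch, it updates the learning rate stored in the optimizer according to a predefined schedule. Some schedulers, such as ReduceLROnPlateau, instead adjust the rate in response to a monitored metric.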
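To make the relationship between the two objects concrete, here is a minimal sketch. It assumes a single dummy parameter in place of a real model and uses ExponentialLR with gamma=0.5 purely for illustration; neither choice comes from the text above. The point is only that the scheduler writes the new learning rate into the optimizer's param_groups.

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import ExponentialLR

# A single dummy parameter stands in for a real model (illustrative assumption).
param = torch.nn.Parameter(torch.zeros(1))
optimizer = SGD([param], lr=0.1)

# ExponentialLR multiplies the learning rate by gamma on every scheduler.step().
scheduler = ExponentialLR(optimizer, gamma=0.5)

lr_before = optimizer.param_groups[0]["lr"]

optimizer.step()   # no gradients were computed, so the parameter is unchanged
scheduler.step()   # the scheduler updates the LR held by the optimizer

lr_after = optimizer.param_groups[0]["lr"]
print(lr_before, lr_after)
```

The optimizer is stepped before the scheduler, which is the order PyTorch expects.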

PyTorch provides several LR Schedulers in torch.optim.lr_scheduler, including StepLR, MultiStepLR, ExponentialLR, CosineAnnealingLR, and ReduceLROnPlateau.

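StepLR, for example, multiplies the learning rate by gamma every step_size epochs (i.e., lr = lr * gamma). With an initial learning rate of 0.1, step_size=10, and gamma=0.1, the rate is 0.1 for epochs 0 to 9 and 0.01 for epochs 10 to 19. Below is a standard implementation pattern: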

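import torch
from torch import nn
from torch.optim import SGD
import torch.optim.lr_scheduler as lr_scheduler

# Placeholder model, loss, and data so the loop runs; substitute your own.
model = nn.Linear(10, 1)
loss_fn = nn.MSELoss()
dataset = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(8)]

# The optimizer takes the model's parameters, not the model itself.
optimizer = SGD(model.parameters(), lr=0.1)
scheduler = lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(20):
    for inputs, target in dataset:
        optimizer.zero_grad()
        output = model(inputs)
        loss = loss_fn(output, target)
        loss.backward()
        optimizer.step()
    # Step the scheduler once per epoch, after the optimizer updates.
    scheduler.step()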
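
One way to check that a schedule behaves as intended is to record the learning rate at each epoch and compare it with the closed form base_lr * gamma ** (epoch // step_size). The sketch below does this for the same StepLR settings. It assumes a dummy parameter instead of a real model and replaces the inner batch loop with a single optimizer.step() call. The names observed and expected are illustrative.

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import StepLR

# Dummy parameter in place of a real model (illustrative assumption).
param = torch.nn.Parameter(torch.zeros(1))

base_lr, step_size, gamma = 0.1, 10, 0.1
optimizer = SGD([param], lr=base_lr)
scheduler = StepLR(optimizer, step_size=step_size, gamma=gamma)

observed = []
for epoch in range(20):
    # get_last_lr() returns one value per parameter group;
    # record the LR that is in effect during this epoch.
    observed.append(scheduler.get_last_lr()[0])
    optimizer.step()   # stands in for the inner batch loop
    scheduler.step()

# Closed form of the same schedule.
expected = [base_lr * gamma ** (epoch // step_size) for epoch in range(20)]

for epoch, (got, want) in enumerate(zip(observed, expected)):
    print(f"epoch {epoch:2d}  lr={got:.4f}  closed form={want:.4f}")
```

The recorded values agree with the closed form up to floating-point rounding, so compare them with a small tolerance rather than exact equality.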