Dropout is a regularization technique that randomly “drops” a subset of neurons (sets their activations to zero) during each training iteration, preventing co-adaptation of feature detectors and reducing overfitting (Wikipedia). At each forward pass, each neuron is retained with a fixed probability p, and during backpropagation only the retained neurons contribute gradients, so every iteration effectively trains one of many “thinned” sub-networks. At test time, all neurons are used and their outputs are scaled by p to match the expected activations seen during training, keeping training and inference consistent. Dropout’s stochastic masking makes models more robust to input variations, encourages redundancy in feature representations, and has been shown to improve generalization across vision and NLP tasks alike.
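
As a minimal sketch of the behaviour described above (NumPy is assumed; the function names, shapes, and retain probability are illustrative, not a specific library API): retain each unit with probability p and mask gradients during training, and scale all outputs by p at inference.

```python
import numpy as np

def dropout_forward(x, p, training, rng):
    """Classic dropout: retain each unit with probability p during training,
    scale outputs by p at test time (illustrative sketch)."""
    if training:
        # Sample a binary mask; each activation survives with probability p.
        mask = rng.random(x.shape) < p
        return x * mask, mask          # mask is reused in the backward pass
    # Inference: use all units, scaled by p to match expected training output.
    return x * p, None

def dropout_backward(grad_out, mask):
    # Only the retained units contribute gradients.
    return grad_out * mask

# Illustrative usage on a batch of hidden activations.
rng = np.random.default_rng(0)
h = rng.standard_normal((4, 8))
h_train, mask = dropout_forward(h, p=0.8, training=True, rng=rng)
h_test, _ = dropout_forward(h, p=0.8, training=False, rng=rng)
```

Many frameworks instead use “inverted” dropout, which divides by p during training so that no scaling is needed at inference; the expected activations are equivalent.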