- When viewed as an ensemble training method, swapout samples from a much richer set of architectures than existing methods such as dropout or stochastic depth.
- We propose a parameterization that reveals connections to existing architectures and suggests a much richer set of architectures to be explored.
- Swapout samples from a rich set of architectures including dropout, stochastic depth and residual architectures as special cases.
- When viewed as a regularization method, swapout not only inhibits co-adaptation of units within a layer, similar to dropout, but also across network layers.
- We conjecture that swapout achieves strong regularization by implicitly tying the parameters across layers.
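The unifying rule behind these special cases is that a swapout unit mixes its input and its block output with two independent per-unit Bernoulli masks, y = Θ₁ ⊙ x + Θ₂ ⊙ F(x). Below is a minimal NumPy sketch of one stochastic forward pass; the function name, keep-probability parameters, and defaults are illustrative, not from the paper's code.

```python
import numpy as np

def swapout(x, f_x, p1=0.5, p2=0.5, rng=None):
    """One stochastic forward pass of a swapout unit.

    Computes y = theta1 * x + theta2 * f_x, where theta1 and theta2 are
    independent per-unit Bernoulli masks with keep probabilities p1 and p2.
    Special cases recovered by fixing the masks:
      theta1 = 0, theta2 = 1  -> plain feedforward layer
      theta1 = 1, theta2 = 1  -> residual connection
      theta1 = 1, theta2 ~ B  -> dropout applied to the block output F(x)
    (Parameter names here are illustrative.)
    """
    rng = rng or np.random.default_rng()
    theta1 = rng.random(np.shape(x)) < p1   # per-unit mask on the input
    theta2 = rng.random(np.shape(x)) < p2   # per-unit mask on F(x)
    return theta1 * x + theta2 * f_x
```

Because the masks are sampled per unit rather than per layer, a single layer can behave like a residual unit for some units and a dropped or skipped unit for others, which is what yields the rich architecture ensemble described above.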
[1605.06465] Swapout: Learning an ensemble of deep architectures