- From Andrej Karpathy’s course CS231n: Convolutional Neural Networks for Visual Recognition
All the plots were generated with one full forward pass across all the layers of the network, using the same activation function in every layer.
The network has multiple layers, each with the same number of units.
- The training examples are random data points drawn from a univariate normal (Gaussian) distribution with a fixed mean and variance.
- The weights for each layer were initially drawn from the same distribution as the data, then varied to obtain the different plots.
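
The setup described above can be sketched roughly as follows. This is a minimal NumPy illustration, not the original code behind the plots: the layer count, units per layer, number of examples, the zero-mean unit-variance Gaussian, the `tanh` activation, and the names `forward_pass_stats` and `weight_scale` are all assumptions chosen for the example.

```python
import numpy as np


def forward_pass_stats(n_layers=10, n_units=500, n_examples=500,
                       weight_scale=0.01, activation=np.tanh, seed=0):
    """Run one forward pass and return (mean, std) of each layer's activations.

    All default values are assumptions for illustration only.
    """
    rng = np.random.default_rng(seed)
    # Input data: random points from a standard normal (assumed mean 0, variance 1).
    h = rng.standard_normal((n_examples, n_units))
    stats = []
    for _ in range(n_layers):
        # Weights: zero-mean Gaussian, rescaled by weight_scale.
        # Varying weight_scale is what changes the activation statistics.
        W = rng.standard_normal((n_units, n_units)) * weight_scale
        # Same activation function in every layer.
        h = activation(h @ W)
        stats.append((float(h.mean()), float(h.std())))
    return stats


if __name__ == "__main__":
    for scale in (0.01, 1.0):
        print(f"weight_scale={scale}")
        for i, (m, s) in enumerate(forward_pass_stats(weight_scale=scale), start=1):
            print(f"  layer {i}: mean={m:+.5f} std={s:.5f}")
```

With a small `weight_scale`, the standard deviation of the activations should shrink from layer to layer, while a large one should push `tanh` toward saturation at ±1. Plotting a histogram of `h` per layer would give figures in the spirit of the ones described here.
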
DeepNets – How weight initialization affects forward and backward passes of a deep neural network