A normalization layer (or norm layer) is a type of neural-network layer that normalizes the activations of the preceding layer, typically rescaling them to fixed statistics such as zero mean and unit variance.
This can improve the stability and convergence of a neural network during training.
Normalization layers are commonly used in convolutional neural networks (CNNs) to improve performance and reduce overfitting.
How it is calculated
One common type of normalization layer is batch normalization, which normalizes the input data to have zero mean and unit variance across the batch dimension.
The equation for batch normalization is:
$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$$
where $x_i$ is an input activation, $\mu_B$ and $\sigma_B^2$ are the mean and variance computed over the mini-batch (per feature), and $\epsilon$ is a small constant (e.g. $10^{-5}$) that avoids division by zero.
The normalized output $\hat{x}_i$ is then scaled and shifted by learnable parameters $\gamma$ and $\beta$:
$$y_i = \gamma \hat{x}_i + \beta$$
where $y_i$ is the final output of the layer.
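To make the two equations concrete, here is a minimal NumPy sketch of the batch-normalization forward pass in training mode; the function name `batch_norm_forward` and the 2-D `(batch, features)` input layout are illustrative assumptions, not a reference implementation.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Batch normalization forward pass (training mode), illustrative sketch.

    x:     input of shape (batch, features)
    gamma: learnable scale, shape (features,)
    beta:  learnable shift, shape (features,)
    """
    mu = x.mean(axis=0)                     # per-feature batch mean (mu_B)
    var = x.var(axis=0)                     # per-feature batch variance (sigma_B^2)
    x_hat = (x - mu) / np.sqrt(var + eps)   # normalize to ~zero mean, unit variance
    return gamma * x_hat + beta             # scale and shift: y = gamma * x_hat + beta

# Example: a batch of 4 samples with 3 features, far from zero mean / unit variance
x = np.random.randn(4, 3) * 5.0 + 2.0
y = batch_norm_forward(x, gamma=np.ones(3), beta=np.zeros(3))
print(y.mean(axis=0))  # approximately 0 for each feature
print(y.std(axis=0))   # approximately 1 for each feature
```

With $\gamma = 1$ and $\beta = 0$ as above, the output is simply the normalized $\hat{x}_i$; during training, $\gamma$ and $\beta$ are learned so the network can recover any scale and shift it needs.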
Batch normalization has proven effective at improving the training of deep neural networks and is widely used in modern architectures.
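As a usage sketch, here is one common placement of batch normalization in a small convolutional block (convolution, then batch norm, then ReLU) using PyTorch; the channel counts and input size are illustrative.

```python
import torch
import torch.nn as nn

# A small CNN block: convolution followed by batch normalization and ReLU,
# a placement used throughout many modern architectures.
block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),  # gamma and beta are this layer's learnable parameters
    nn.ReLU(),
)

x = torch.randn(8, 3, 32, 32)  # batch of 8 RGB images, 32x32 pixels
y = block(x)
print(y.shape)  # torch.Size([8, 16, 32, 32])
```

Note that `nn.BatchNorm2d` computes the mean and variance per channel over the batch and spatial dimensions, and also tracks running statistics so the layer can normalize consistently at inference time, when no batch statistics are available.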