Applies RMS (Root Mean Square) normalization over a mini-batch of inputs, as described in the paper Root Mean Square Layer Normalization.
The mean is calculated over the trailing dimensions starting at axis. For example, if axis = -2, the mean is computed over the last 2 dimensions of the input.
SkipRMSNorm is performed by the formula below:
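A sketch of that formula, reconstructed from the RMSNorm paper and from the input and output descriptions on this page. The symbols n (number of normalized elements) and ε (a small stabilizing constant) are assumptions; this page does not name an epsilon attribute.

```latex
$$
\mathrm{SkipOut} =
\begin{cases}
X + \mathrm{SkipIn}, & \text{if SkipIn is provided} \\
X, & \text{otherwise}
\end{cases}
$$

$$
Y = \frac{\mathrm{SkipOut}}{\sqrt{\frac{1}{n}\sum_{i=1}^{n} \mathrm{SkipOut}_i^{2} + \epsilon}} \odot W
$$
```

Here the sum runs over the normalized dimensions, and W is applied elementwise along those same dimensions.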
The axis that splits off the normalization dimensions: dimensions from axis onward are normalized.
Whether to apply SkipRMSNorm.
Input features.
Shape:
Transformation weight.
Shape:
Skip input.
Shape: same as X
Output features.
Shape: same as X
Skip output. If SkipIn is not provided, SkipOut is a copy of X.
Shape: same as X
If the input is float16, the data is converted to float32 before RMSNorm is applied.
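The behavior described above can be illustrated with a minimal pure-Python sketch. It is not the operator's implementation: the function name `skip_rms_norm`, the `eps` parameter and its default value are hypothetical, and the sketch handles only a 2-D input (a list of rows) normalized over its last axis.

```python
import math


def skip_rms_norm(x, w, skip_in=None, eps=1e-6):
    """Illustrative SkipRMSNorm over the last axis of a 2-D list of rows.

    `eps` and its default are assumptions for this sketch.
    Returns (y, skip_out), mirroring the two outputs described above.
    """
    y, skip_out = [], []
    for i, row in enumerate(x):
        # Add the skip input when present; otherwise SkipOut is a copy of X.
        if skip_in is not None:
            s = [a + b for a, b in zip(row, skip_in[i])]
        else:
            s = list(row)
        # Root mean square over the normalized dimension.
        rms = math.sqrt(sum(v * v for v in s) / len(s) + eps)
        # Normalize, then scale elementwise by the weight.
        y.append([v / rms * g for v, g in zip(s, w)])
        skip_out.append(s)
    return y, skip_out


x = [[1.0, 2.0, 3.0, 4.0]]
w = [1.0, 1.0, 1.0, 1.0]

# Without a skip input, skip_out equals x.
y, skip_out = skip_rms_norm(x, w)

# With a skip input, normalization is applied to x + skip_in.
y2, skip_out2 = skip_rms_norm(x, w, skip_in=[[1.0, 1.0, 1.0, 1.0]])
```

Python floats are already double precision, so the float16-to-float32 conversion mentioned above has no counterpart in this sketch; in a typed-array setting the cast would happen before the squared sum is accumulated.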