Minor correction in 'Add & Norm' logic in Block Class in gpt.py #22

AbhishekAshokDubey · 2023-07-13T10:27:18Z

This PR updates the forward function in the Transformer block.

The change is simple, but I'll do my best to explain it below:

As per the original paper: in the 'Add & Norm' block of the Transformer, Layer Norm is applied on top of the sum of the input/residual and the output of self-attention. In the current code, Layer Norm is applied first, and the output is then added back to the input/residual.
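To make the two orderings concrete, here is a minimal sketch in plain Python rather than the PyTorch used in gpt.py. It is not the repository's code: `layer_norm` omits the learned scale and shift, and `toy_sublayer`, `post_ln` and `pre_ln` are hypothetical names introduced only for illustration.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no learned scale/shift)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def toy_sublayer(x):
    """Hypothetical stand-in for self-attention or the feed-forward layer."""
    return [2.0 * v + 1.0 for v in x]

def post_ln(x, sublayer):
    # Ordering in the original paper's 'Add & Norm': LayerNorm(x + Sublayer(x))
    added = [a + b for a, b in zip(x, sublayer(x))]
    return layer_norm(added)

def pre_ln(x, sublayer):
    # Ordering described above for the current code: x + Sublayer(LayerNorm(x))
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]

x = [1.0, 2.0, 4.0]
post_out = post_ln(x, toy_sublayer)
pre_out = pre_ln(x, toy_sublayer)
```

In the first ordering the block's output is itself normalized, so its mean is zero. In the second, the residual path carries `x` through unnormalized, so the two functions generally produce different values for the same input.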


reallyigor · 2023-09-26T15:42:27Z

See 1:35:33

AbhishekAshokDubey changed the title ~~Update gpt.py~~ Minor correction in 'Add & Norm' logic in Block Class in gpt.py Jul 13, 2023
