## Chapter 16: Transformers – Improving Natural Language Processing with Attention Mechanisms (Part 1/3)

### Chapter Outline

- Adding an attention mechanism to RNNs
  - Attention helps RNNs with accessing information
  - The original attention mechanism for RNNs
  - Processing the inputs using a bidirectional RNN
  - Generating outputs from context vectors
  - Computing the attention weights
- Introducing the self-attention mechanism
  - Starting with a basic form of self-attention
  - Parameterizing the self-attention mechanism: scaled dot-product attention
- Attention is all we need: introducing the original transformer architecture
  - Encoding context embeddings via multi-head attention
  - Learning a language model: decoder and masked multi-head attention
  - Implementation details: positional encodings and layer normalization
- Building large-scale language models by leveraging unlabeled data
  - Pre-training and fine-tuning transformer models
  - Leveraging unlabeled data with GPT
  - Using GPT-2 to generate new text
  - Bidirectional pre-training with BERT
  - The best of both worlds: BART
- Fine-tuning a BERT model in PyTorch
  - Loading the IMDb movie review dataset
  - Tokenizing the dataset
  - Loading and fine-tuning a pre-trained BERT model
  - Fine-tuning a transformer more conveniently using the Trainer API
- Summary

**Please refer to the [README.md](../ch01/README.md) file in [`../ch01`](../ch01) for more information about running the code examples.**
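
### A few minimal code sketches

As a quick preview of the self-attention topics in the outline above, the following is a minimal sketch of scaled dot-product self-attention in plain PyTorch. The toy input and the random projection matrices are illustrative placeholders, not trained weights from the chapter's notebooks.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(1)

# Toy input: a sequence of 5 token embeddings with dimensionality d = 16
d = 16
x = torch.randn(5, d)

# Projection matrices for queries, keys, and values
# (random here, purely for illustration)
U_q = torch.randn(d, d)
U_k = torch.randn(d, d)
U_v = torch.randn(d, d)

queries = x @ U_q   # (5, d)
keys = x @ U_k      # (5, d)
values = x @ U_v    # (5, d)

# Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
scores = queries @ keys.T / d ** 0.5   # (5, 5) pairwise attention scores
weights = F.softmax(scores, dim=-1)    # each row sums to 1
context = weights @ values             # (5, d) context-aware embeddings

print(weights.sum(dim=-1))  # tensor of ones
print(context.shape)        # torch.Size([5, 16])
```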
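Because self-attention alone is order-agnostic, the outline also lists positional encodings as an implementation detail of the original transformer. Below is a short sketch of the sinusoidal encoding from the original transformer paper; the helper name `sinusoidal_positional_encoding` and the toy dimensions are assumptions made for illustration.

```python
import torch

def sinusoidal_positional_encoding(seq_len, d_model):
    """Sinusoidal positional encodings (d_model assumed even):
    PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
    PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
    """
    positions = torch.arange(seq_len).unsqueeze(1).float()          # (seq_len, 1)
    div_terms = torch.pow(
        10000.0, torch.arange(0, d_model, 2).float() / d_model      # (d_model/2,)
    )
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(positions / div_terms)
    pe[:, 1::2] = torch.cos(positions / div_terms)
    return pe

pe = sinusoidal_positional_encoding(seq_len=10, d_model=16)
print(pe.shape)  # torch.Size([10, 16])
```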
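For the "Using GPT-2 to generate new text" part of the outline, a pre-trained GPT-2 model can be driven through Hugging Face's `pipeline` interface. A minimal sketch, assuming the `transformers` package is installed; the prompt and sampling settings are only examples:

```python
from transformers import pipeline, set_seed

# Download a pre-trained GPT-2 checkpoint and wrap it in a generation pipeline
generator = pipeline('text-generation', model='gpt2')
set_seed(123)  # make the sampled continuations reproducible

outputs = generator(
    "Hey readers, today is",
    max_length=30,           # total length of prompt + continuation (in tokens)
    num_return_sequences=3,  # sample three alternative continuations
)

for out in outputs:
    print(out['generated_text'])
```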
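Finally, the fine-tuning sections of the outline work with the IMDb movie reviews and a pre-trained DistilBERT checkpoint. The sketch below covers only the Trainer-API route; it assumes the `transformers` and `datasets` packages, uses a small subset of IMDb so it runs quickly, and picks illustrative hyperparameters rather than the chapter's exact settings.

```python
from datasets import load_dataset
from transformers import (
    DataCollatorWithPadding,
    DistilBertForSequenceClassification,
    DistilBertTokenizerFast,
    Trainer,
    TrainingArguments,
)

# Load the IMDb movie review dataset and a pre-trained DistilBERT checkpoint
imdb = load_dataset('imdb')
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')
model = DistilBertForSequenceClassification.from_pretrained(
    'distilbert-base-uncased', num_labels=2
)

def tokenize(batch):
    # Truncate long reviews; padding is done per batch by the data collator
    return tokenizer(batch['text'], truncation=True)

# Small subsets keep this sketch quick to run
train_ds = imdb['train'].shuffle(seed=1).select(range(2000)).map(tokenize, batched=True)
test_ds = imdb['test'].shuffle(seed=1).select(range(500)).map(tokenize, batched=True)

training_args = TrainingArguments(
    output_dir='distilbert-imdb-sketch',  # hypothetical output directory
    num_train_epochs=1,
    per_device_train_batch_size=16,
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=test_ds,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),  # dynamic padding
)

trainer.train()
print(trainer.evaluate())  # accuracy/loss on the held-out subset
```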