## Chapter 16: Transformers – Improving Natural Language Processing with Attention Mechanisms (Part 1/3)

### Chapter Outline

- Adding an attention mechanism to RNNs
  - Attention helps RNNs with accessing information
  - The original attention mechanism for RNNs
  - Processing the inputs using a bidirectional RNN
  - Generating outputs from context vectors
  - Computing the attention weights
- Introducing the self-attention mechanism
  - Starting with a basic form of self-attention
  - Parameterizing the self-attention mechanism: scaled dot-product attention
- Attention is all we need: introducing the original transformer architecture
  - Encoding context embeddings via multi-head attention
  - Learning a language model: decoder and masked multi-head attention
  - Implementation details: positional encodings and layer normalization
- Building large-scale language models by leveraging unlabeled data
  - Pre-training and fine-tuning transformer models
  - Leveraging unlabeled data with GPT
  - Using GPT-2 to generate new text
  - Bidirectional pre-training with BERT
  - The best of both worlds: BART
- Fine-tuning a BERT model in PyTorch
  - Loading the IMDb movie review dataset
  - Tokenizing the dataset
  - Loading and fine-tuning a pre-trained BERT model
  - Fine-tuning a transformer more conveniently using the Trainer API
- Summary

**Please refer to the [README.md](../ch01/README.md) file in [`../ch01`](../ch01) for more information about running the code examples.**
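
### A few minimal code sketches

As a quick preview of the self-attention topics in the outline above, the following is a minimal sketch of scaled dot-product self-attention in plain PyTorch. The toy input and the random projection matrices are illustrative placeholders, not trained weights from the chapter's notebooks.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(1)

# Toy input: a sequence of 5 token embeddings with dimensionality d = 16
d = 16
x = torch.randn(5, d)

# Projection matrices for queries, keys, and values
# (random here, purely for illustration)
U_q = torch.randn(d, d)
U_k = torch.randn(d, d)
U_v = torch.randn(d, d)

queries = x @ U_q   # (5, d)
keys = x @ U_k      # (5, d)
values = x @ U_v    # (5, d)

# Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
scores = queries @ keys.T / d ** 0.5   # (5, 5) pairwise attention scores
weights = F.softmax(scores, dim=-1)    # each row sums to 1
context = weights @ values             # (5, d) context-aware embeddings

print(weights.sum(dim=-1))  # tensor of ones
print(context.shape)        # torch.Size([5, 16])
```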
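Because self-attention alone is order-agnostic, the outline also lists positional encodings as an implementation detail of the original transformer. Below is a short sketch of the sinusoidal encoding from the original transformer paper; the helper name `sinusoidal_positional_encoding` and the toy dimensions are assumptions made for illustration.

```python
import torch

def sinusoidal_positional_encoding(seq_len, d_model):
    """Sinusoidal positional encodings (d_model assumed even):
    PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
    PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
    """
    positions = torch.arange(seq_len).unsqueeze(1).float()          # (seq_len, 1)
    div_terms = torch.pow(
        10000.0, torch.arange(0, d_model, 2).float() / d_model      # (d_model/2,)
    )
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(positions / div_terms)
    pe[:, 1::2] = torch.cos(positions / div_terms)
    return pe

pe = sinusoidal_positional_encoding(seq_len=10, d_model=16)
print(pe.shape)  # torch.Size([10, 16])
```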
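For the "Using GPT-2 to generate new text" part of the outline, a pre-trained GPT-2 model can be driven through Hugging Face's `pipeline` interface. A minimal sketch, assuming the `transformers` package is installed; the prompt and sampling settings are only examples:

```python
from transformers import pipeline, set_seed

# Download a pre-trained GPT-2 checkpoint and wrap it in a generation pipeline
generator = pipeline('text-generation', model='gpt2')
set_seed(123)  # make the sampled continuations reproducible

outputs = generator(
    "Hey readers, today is",
    max_length=30,           # total length of prompt + continuation (in tokens)
    num_return_sequences=3,  # sample three alternative continuations
)

for out in outputs:
    print(out['generated_text'])
```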
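Finally, the fine-tuning sections of the outline work with the IMDb movie reviews and a pre-trained DistilBERT checkpoint. The sketch below covers only the Trainer-API route; it assumes the `transformers` and `datasets` packages, uses a small subset of IMDb so it runs quickly, and picks illustrative hyperparameters rather than the chapter's exact settings.

```python
from datasets import load_dataset
from transformers import (
    DataCollatorWithPadding,
    DistilBertForSequenceClassification,
    DistilBertTokenizerFast,
    Trainer,
    TrainingArguments,
)

# Load the IMDb movie review dataset and a pre-trained DistilBERT checkpoint
imdb = load_dataset('imdb')
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')
model = DistilBertForSequenceClassification.from_pretrained(
    'distilbert-base-uncased', num_labels=2
)

def tokenize(batch):
    # Truncate long reviews; padding is done per batch by the data collator
    return tokenizer(batch['text'], truncation=True)

# Small subsets keep this sketch quick to run
train_ds = imdb['train'].shuffle(seed=1).select(range(2000)).map(tokenize, batched=True)
test_ds = imdb['test'].shuffle(seed=1).select(range(500)).map(tokenize, batched=True)

training_args = TrainingArguments(
    output_dir='distilbert-imdb-sketch',  # hypothetical output directory
    num_train_epochs=1,
    per_device_train_batch_size=16,
    logging_steps=50,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=test_ds,
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),  # dynamic padding
)

trainer.train()
print(trainer.evaluate())  # accuracy/loss on the held-out subset
```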