
Commit 53bb4ef

2 parents 6f495d7 + d2f28c2 commit 53bb4ef

File tree

1 file changed: +57, -80 lines


beginner_source/torchtext_translation_tutorial.py

Lines changed: 57 additions & 80 deletions
@@ -1,57 +1,48 @@
 """
-Language Translation with TorchText
+Language Translation with TorchText
 ===================================
 
-This tutorial shows how to use several convenience classes of ``torchtext`` to preprocess
-data from a well-known dataset containing sentences in both English and German and use it to
-train a sequence-to-sequence model with attention that can translate German sentences
-into English.
+In this tutorial we will use several convenience classes of ``torchtext`` and a sequence-to-sequence (seq2seq) model
+to translate German sentences into English, using a well-known dataset containing sentences in both English and German.
 
-It is based off of
-`this tutorial <https://github.com/bentrevett/pytorch-seq2seq/blob/master/3%20-%20Neural%20Machine%20Translation%20by%20Jointly%20Learning%20to%20Align%20and%20Translate.ipynb>`__
-from PyTorch community member `Ben Trevett <https://github.com/bentrevett>`__
-and was created by `Seth Weidman <https://github.com/SethHWeidman/>`__ with Ben's permission.
+This tutorial is based on
+`this tutorial <https://github.com/bentrevett/pytorch-seq2seq/blob/master/3%20-%20Neural%20Machine%20Translation%20by%20Jointly%20Learning%20to%20Align%20and%20Translate.ipynb>`__
+written by PyTorch community member `Ben Trevett <https://github.com/bentrevett>`__,
+and was created by `Seth Weidman <https://github.com/SethHWeidman/>`__ with Ben's permission.
 
-By the end of this tutorial, you will be able to:
+By the end of this tutorial, you will be able to:
 
-- Preprocess sentences into a commonly-used format for NLP modeling using the following ``torchtext`` convenience classes:
+- Preprocess sentences into a format commonly used for NLP modeling, using the following convenience classes of ``torchtext``:
 
 - `TranslationDataset <https://torchtext.readthedocs.io/en/latest/datasets.html#torchtext.datasets.TranslationDataset>`__
 - `Field <https://torchtext.readthedocs.io/en/latest/data.html#torchtext.data.Field>`__
 - `BucketIterator <https://torchtext.readthedocs.io/en/latest/data.html#torchtext.data.BucketIterator>`__
 """
 
 ######################################################################
-# `Field` and `TranslationDataset`
+# `Field` and `TranslationDataset`
 # ----------------
-# ``torchtext`` has utilities for creating datasets that can be easily
-# iterated through for the purposes of creating a language translation
-# model. One key class is a
-# `Field <https://github.com/pytorch/text/blob/master/torchtext/data/field.py#L64>`__,
-# which specifies the way each sentence should be preprocessed, and another is the
-# `TranslationDataset` ; ``torchtext``
-# has several such datasets; in this tutorial we'll use the
-# `Multi30k dataset <https://github.com/multi30k/dataset>`__, which contains about
-# 30,000 sentences (averaging about 13 words in length) in both English and German.
+# ``torchtext`` has various tools for creating datasets that can easily be used when building
+# a language translation model. One key class, `Field <https://github.com/pytorch/text/blob/master/torchtext/data/field.py#L64>`__,
+# specifies how each sentence should be preprocessed; another important class is `TranslationDataset`.
+# ``torchtext`` has several other such datasets; in this tutorial we will use the
+# `Multi30k dataset <https://github.com/multi30k/dataset>`__, which contains about
+# 30,000 sentences (averaging about 13 words in length) in both English and German.
 #
-# Note: the tokenization in this tutorial requires `Spacy <https://spacy.io>`__
-# We use Spacy because it provides strong support for tokenization in languages
-# other than English. ``torchtext`` provides a ``basic_english`` tokenizer
-# and supports other tokenizers for English (e.g.
-# `Moses <https://bitbucket.org/luismsgomes/mosestokenizer/src/default/>`__)
-# but for language translation - where multiple languages are required -
-# Spacy is your best bet.
+# Note: the tokenization in this tutorial requires `Spacy <https://spacy.io>`__.
+# We use Spacy because it provides strong tokenization support for languages other than English.
+# ``torchtext`` not only provides the ``basic_english`` tokenizer but also supports other
+# tokenizers for English (e.g. `Moses <https://bitbucket.org/luismsgomes/mosestokenizer/src/default/>`__);
+# for language translation, however, where multiple languages must be handled, Spacy is your best bet.
 #
-# To run this tutorial, first install ``spacy`` using ``pip`` or ``conda``.
-# Next, download the raw data for the English and German Spacy tokenizers:
+# To run this tutorial, first install ``spacy`` using ``pip`` or ``conda``. Next,
+# download the English and German data that the Spacy tokenizers will use:
 #
 # ::
 #
 #    python -m spacy download en
 #    python -m spacy download de
 #
-# With Spacy installed, the following code will tokenize each of the sentences
-# in the ``TranslationDataset`` based on the tokenizer defined in the ``Field``
-
+# With Spacy installed, the following code will tokenize each sentence in the ``TranslationDataset``
+# based on what is defined in the ``Field``.
 from torchtext.datasets import Multi30k
 from torchtext.data import Field, BucketIterator
 
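The ``Field`` definitions and the ``Multi30k.splits`` call themselves fall between this hunk and the next. For orientation, here is a minimal sketch of what that elided setup typically looks like; the ``SRC``/``TRG`` names come from the diff, while the keyword-argument values are assumptions, not part of this commit:

    # Hypothetical sketch of the elided setup; argument values are assumed.
    SRC = Field(tokenize="spacy", tokenizer_language="de",
                init_token="<sos>", eos_token="<eos>", lower=True)
    TRG = Field(tokenize="spacy", tokenizer_language="en",
                init_token="<sos>", eos_token="<eos>", lower=True)

    train_data, valid_data, test_data = Multi30k.splits(exts=(".de", ".en"),
                                                        fields=(SRC, TRG))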
@@ -71,30 +62,24 @@
 fields = (SRC, TRG))
 
 ######################################################################
-# Now that we've defined ``train_data``, we can see an extremely useful
-# feature of ``torchtext``'s ``Field``: the ``build_vocab`` method
-# now allows us to create the vocabulary associated with each language
+# Now that we have defined ``train_data``, we can see an extremely useful feature of
+# ``torchtext``'s ``Field``: the ``build_vocab`` method lets us create the vocabulary
+# associated with each language.
 
 SRC.build_vocab(train_data, min_freq = 2)
 TRG.build_vocab(train_data, min_freq = 2)
 
 ######################################################################
-# Once these lines of code have been run, ``SRC.vocab.stoi`` will be a
-# dictionary with the tokens in the vocabulary as keys and their
-# corresponding indices as values; ``SRC.vocab.itos`` will be the same
-# dictionary with the keys and values swapped. We won't make extensive
-# use of this fact in this tutorial, but this will likely be useful in
-# other NLP tasks you'll encounter.
+# Once the code above has run, ``SRC.vocab.stoi`` will be a dictionary whose keys are the
+# tokens in the vocabulary and whose values are the corresponding indices; ``SRC.vocab.itos``
+# is the same mapping with keys and values swapped. This is not crucial for this tutorial,
+# but it will likely prove useful in other NLP tasks you encounter.
 
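As a quick illustration of the two mappings (a hypothetical lookup; ``'<pad>'`` is one of the special tokens in the vocabulary):

    pad_index = SRC.vocab.stoi['<pad>']    # token -> integer index
    pad_token = SRC.vocab.itos[pad_index]  # integer index -> token, '<pad>' again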
 ######################################################################
 # ``BucketIterator``
 # ----------------
-# The last ``torchtext`` specific feature we'll use is the ``BucketIterator``,
-# which is easy to use since it takes a ``TranslationDataset`` as its
-# first argument. Specifically, as the docs say:
-# Defines an iterator that batches examples of similar lengths together.
-# Minimizes amount of padding needed while producing freshly shuffled
-# batches for each new epoch. See pool for the bucketing procedure used.
+# The last ``torchtext``-specific feature we will use is the ``BucketIterator``,
+# which is easy to use since it takes a ``TranslationDataset`` as its first argument.
+# As the docs say, it defines an iterator that batches examples of similar lengths together,
+# minimizing the amount of padding needed while producing freshly shuffled batches for each
+# new epoch. See ``pool`` for the bucketing procedure used.
 
 import torch
 
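The iterator construction itself is elided between this hunk and the next, whose first context line is the closing ``device = device)`` of that call. A sketch of what it typically looks like; the batch size of 128 is an assumption, not part of this commit:

    # Hypothetical sketch of the elided BucketIterator.splits call.
    train_iterator, valid_iterator, test_iterator = BucketIterator.splits(
        (train_data, valid_data, test_data),
        batch_size = 128,
        device = device)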
@@ -108,40 +93,36 @@
 device = device)
 
 ######################################################################
-# These iterators can be called just like ``DataLoader``s; below, in
-# the ``train`` and ``evaluate`` functions, they are called simply with:
-#
+# These iterators can be called just like a ``DataLoader``; in the ``train`` and
+# ``evaluate`` functions below, you can see that they are called simply with:
 # ::
 #
 #     for i, batch in enumerate(iterator):
 #
-# Each ``batch`` then has ``src`` and ``trg`` attributes:
+# Each ``batch`` then has ``src`` and ``trg`` attributes:
 #
 # ::
 #
 #     src = batch.src
 #     trg = batch.trg
 
 ######################################################################
-# Defining our ``nn.Module`` and ``Optimizer``
+# Defining our ``nn.Module`` and ``Optimizer``
 # ----------------
-# That's mostly it from a ``torchtext`` perspecive: with the dataset built
-# and the iterator defined, the rest of this tutorial simply defines our
-# model as an ``nn.Module``, along with an ``Optimizer``, and then trains it.
+# ``torchtext`` takes care of most of this for us: with the dataset built and the iterator
+# defined, all that remains in this tutorial is to define our model as an ``nn.Module``,
+# along with an ``Optimizer``, and then to train it.
+#
 #
-# Our model specifically, follows the architecture described
-# `here <https://arxiv.org/abs/1409.0473>`__ (you can find a
-# significantly more commented version
-# `here <https://github.com/SethHWeidman/pytorch-seq2seq/blob/master/3%20-%20Neural%20Machine%20Translation%20by%20Jointly%20Learning%20to%20Align%20and%20Translate.ipynb>`__).
-#
-# Note: this model is just an example model that can be used for language
-# translation; we choose it because it is a standard model for the task,
-# not because it is the recommended model to use for translation. As you're
-# likely aware, state-of-the-art models are currently based on Transformers;
-# you can see PyTorch's capabilities for implementing Transformer layers
-# `here <https://pytorch.org/docs/stable/nn.html#transformer-layers>`__; and
-# in particular, the "attention" used in the model below is different from
-# the multi-headed self-attention present in a transformer model.
+# The model in this tutorial follows the architecture described
+# `here <https://arxiv.org/abs/1409.0473>`__; for a significantly more commented version, see
+# `here <https://github.com/SethHWeidman/pytorch-seq2seq/blob/master/3%20-%20Neural%20Machine%20Translation%20by%20Jointly%20Learning%20to%20Align%20and%20Translate.ipynb>`__.
+#
+# Note: the model in this tutorial is just an example model for language translation;
+# we chose it because it is a standard model for the task, not because it is the recommended
+# model to use for translation. As you are likely aware, the current state-of-the-art models
+# are based on Transformers; you can see PyTorch's capabilities for implementing Transformer
+# layers `here <https://pytorch.org/docs/stable/nn.html#transformer-layers>`__, and note that
+# the "attention" used in the model below is different from the multi-headed self-attention
+# present in a Transformer model.
 
 
 import random
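The model assembly and optimizer setup sit outside these hunks. A minimal sketch of the optimizer step, assuming the Adam optimizer used in the referenced tutorial:

    import torch.optim as optim

    # Assumed: 'model' is the Seq2Seq nn.Module assembled in the elided code.
    optimizer = optim.Adam(model.parameters())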
@@ -316,7 +297,7 @@ def forward(self,
 
         encoder_outputs, hidden = self.encoder(src)
 
-        # first input to the decoder is the <sos> token
+        # the first input to the decoder is the <sos> token
         output = trg[0,:]
 
         for t in range(1, max_len):
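The body of this decoding loop is elided from the hunk. In the referenced tutorial it implements teacher forcing, roughly as follows (a hypothetical reconstruction; the names ``outputs`` and ``teacher_forcing_ratio`` are assumed from that tutorial):

    # Hypothetical sketch of the elided loop body (teacher forcing).
    output, hidden = self.decoder(output, hidden, encoder_outputs)
    outputs[t] = output
    teacher_force = random.random() < teacher_forcing_ratio
    top1 = output.max(1)[1]                       # greedy: highest-scoring token
    output = trg[t] if teacher_force else top1    # feed ground truth or own prediction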
@@ -376,16 +357,15 @@ def count_parameters(model: nn.Module):
 print(f'The model has {count_parameters(model):,} trainable parameters')
 
 ######################################################################
-# Note: when scoring the performance of a language translation model in
-# particular, we have to tell the ``nn.CrossEntropyLoss`` function to
-# ignore the indices where the target is simply padding.
+# Note: when scoring the performance of a language translation model in particular,
+# we have to tell the ``nn.CrossEntropyLoss`` function to ignore the indices where
+# the target is simply padding.
 
 PAD_IDX = TRG.vocab.stoi['<pad>']
 
 criterion = nn.CrossEntropyLoss(ignore_index=PAD_IDX)
 
 ######################################################################
-# Finally, we can train and evaluate this model:
+# Finally, we train and evaluate this model:
 
 import math
 import time
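Inside the elided ``train`` loop, the criterion is applied to flattened model outputs. A sketch of that step, with shapes assumed from the seq2seq model above:

    # output: [trg_len, batch_size, vocab_size]; trg: [trg_len, batch_size]
    # Skip position 0 (the <sos> token), then flatten for CrossEntropyLoss.
    loss = criterion(output[1:].view(-1, output.shape[-1]),
                     trg[1:].view(-1))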
@@ -486,11 +466,8 @@ def epoch_time(start_time: int,
 print(f'| Test Loss: {test_loss:.3f} | Test PPL: {math.exp(test_loss):7.3f} |')
 
 ######################################################################
-# Next steps
+# Next steps
 # --------------
 #
-# - Check out the rest of Ben Trevett's tutorials using ``torchtext``
-#   `here <https://github.com/bentrevett/>`__
-# - Stay tuned for a tutorial using other ``torchtext`` features along
-#   with ``nn.Transformer`` for language modeling via next word prediction!
-#
+# - You can check out the rest of Ben Trevett's tutorials using ``torchtext`` `here <https://github.com/bentrevett/>`__.
+# - Stay tuned for a tutorial on language modeling via next-word prediction, using ``nn.Transformer`` and other ``torchtext`` features!
