
Updating "Classifying Names with a Character-Level RNN" #2954

Open: wants to merge 20 commits into base: main

Conversation

@mgs28 commented Jun 24, 2024

Fixes #1166

Description

Updating Sean's excellent RNN classification tutorial, which is now eight years old and missing some newer PyTorch functionality.

  • The default DataLoader cannot be used to select batch sizes because "stack expects each tensor to be of equal size" and the names are of different lengths. The code is updated to use mini-batches without DataLoader functionality (see the sketch after this list).
  • Introducing PyTorch's Dataset class, we show how to split the data into train and test datasets, which changes the training explanation.
  • Rewrote pieces of the tutorial to use three classes to improve re-use (Data, DataSet, and RNN).
  • Added a little more explanation of how RNNs score multi-character strings and of their 2D matrices of tensors.
  • Changed evaluation from random training examples to the entire test set.
  • Removed some of the command-line explanations since notebooks are used more often.
  • Tried to preserve as much of the original text, functions, and style as possible.
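
As a rough illustration of the Dataset and train/test-split approach described above (the class and variable names here are illustrative, not the PR's exact code; lineToTensor is the tutorial's existing helper):

import torch
from torch.utils.data import Dataset, random_split

class NamesDataset(Dataset):
    # wraps (name, language-index) pairs; tensors are built per item because
    # variable-length name tensors cannot be stacked by the default DataLoader collate
    def __init__(self, names, labels):
        self.names = names
        self.labels = labels

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        return lineToTensor(self.names[idx]), torch.tensor(self.labels[idx])

# dataset = NamesDataset(all_names, all_labels)
# n_train = int(0.85 * len(dataset))
# train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])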

Checklist

  • The issue that is being fixed is referenced in the description (see above, "Fixes #ISSUE_NUMBER")
  • Only one issue is addressed in this pull request
  • Labels from the issue that this PR is fixing are added to this pull request
  • No unnecessary issues are included in this pull request

cc @albanD

pytorch-bot bot commented Jun 24, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/2954

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d9d81a1 with merge base 7490332:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot (Contributor) commented:

Hi @mgs28!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@mgs28 closed this Jun 24, 2024
@mgs28 reopened this Jun 24, 2024
@mgs28 marked this pull request as ready for review June 24, 2024 19:08

@mgs28 (Author) commented Jun 26, 2024

@svekars - it looks like you are very active in this repo; any chance you could help me with this? Thanks!

@mgs28 (Author) commented Jun 28, 2024

Added functionality to process the training data in mini-batches, as the original issue requested. However, I had to use numpy + random to create batch indices from a given dataset.

Also, simplified training so it was a closer match to https://pytorch.org/tutorials/beginner/basics/optimization_tutorial.html
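
A minimal sketch of that mini-batching approach (function and variable names here are assumptions, not the PR's exact code):

import random
import numpy as np

def make_batches(dataset, batch_size):
    # shuffle all indices, then split them into roughly equal-sized batches
    indices = list(range(len(dataset)))
    random.shuffle(indices)
    return np.array_split(indices, max(1, len(indices) // batch_size))

# usage: iterate one sample at a time inside each batch, since the
# variable-length name tensors cannot be stacked into a single tensor
# for batch in make_batches(train_set, batch_size=64):
#     for idx in batch:
#         line_tensor, label = train_set[int(idx)]
#         ...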

@mgs28 (Author) commented Jul 8, 2024

@svekars - would you please help me with this tutorial update pull request or point me to someone who could?

@svekars (Contributor) commented Jul 8, 2024

cc: @spro

@mgs28 (Author) commented Jul 10, 2024

Sorry about the spelling errors! I ran pyspelling and re-ran make html for the tutorial. This should pass those CI steps now.

I also filed an issue (#2969) to come back and update CONTRIBUTING.md to include some of these checks.

Thanks @spro @svekars !

@mgs28 (Author) commented Jul 11, 2024

@spro and @svekars - I significantly cut the training time, although it currently runs faster on my CPU than on my GPU: about 72 seconds on my local CPU. I also added default-device selection so the tutorial picks up CUDA on your build machines, which will hopefully make it faster there.
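
A minimal sketch of that default-device selection (assuming a recent PyTorch with torch.set_default_device; the PR's exact code may differ):

import torch

# prefer CUDA when available, otherwise fall back to CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
torch.set_default_device(device)
print(f"Using device: {device}")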

Thanks!

@@ -3,6 +3,7 @@
NLP From Scratch: Classifying Names with a Character-Level RNN
**************************************************************
**Author**: `Sean Robertson <https://github.com/spro>`_
**Updated**: `Matthew Schultz <https://github.com/mgs28>`_
Contributor: We typically don't add "Updated". Please remove this.

Author: Done!

Author: @svekars - anything else I can do to make this better? Thanks!

Author: @svekars - hello, are there other items I should address here? I appreciate your help with this!

@svekars added the core label (Tutorials of any level of difficulty related to the core pytorch functionality) on Sep 4, 2024

@jbschlosser (Contributor) left a comment:

Thanks for the PR! At a high level, I have the following comments / concerns:

  • I think it's a good idea to update the tutorial to utilize PyTorch's Dataset and DataLoader abstractions.
    • I'm not sold on the need for the NameData class. It adds what I consider unnecessary complexity / code. It's perfectly simple to do any conversions between names <-> tensors as simple standalone functions utilized by the dataset.
  • I agree that proper train / test / validation splits should be done in the tutorial, so that's a nice addition.
  • I'm okay with defining a custom RNN module for illustration purposes, although in practice we'd encourage the use of nn.RNN; I'm not sure it existed when the tutorial was originally written. One thing the official torch.nn module provides is better performance, due to e.g. cuDNN-accelerated kernels. If we don't use it within the tutorial, I think it should at least be mentioned and recommended for the performance benefits. All that said, I have some comments on the RNN module defined in the tutorial:
    • It's a bit confusing to redefine it multiple times in the tutorial, adding stuff to it each time. I'd recommend a single definition.
    • The learn() API does not belong on RNN; I suggest leaving the training logic in a standalone train() function (see the sketch after this list). This way, it's more PyTorch-idiomatic and easier for users to switch to a third-party training API (e.g. ignite, PyTorch Lightning, etc.).
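
A minimal sketch of the standalone train() shape being suggested here (names, loss, and optimizer choices are illustrative, not the PR's final code):

import torch
import torch.nn as nn

def train(rnn, training_data, n_epochs=10, learning_rate=0.005):
    # keep the training loop outside the model so the model stays a plain nn.Module
    criterion = nn.NLLLoss()  # pairs with a LogSoftmax model output
    optimizer = torch.optim.SGD(rnn.parameters(), lr=learning_rate)
    rnn.train()
    for epoch in range(n_epochs):
        total_loss = 0.0
        for line_tensor, label_tensor in training_data:
            optimizer.zero_grad()
            output = rnn(line_tensor)
            loss = criterion(output, label_tensor)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        print(f"epoch {epoch}: avg loss {total_loss / len(training_data):.4f}")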

# ASCII).
#
# The first thing we need to define is our data items. In this case, we will create a class called NameData
# which will have an __init__ function to specify the input fields and some helper functions. Our first
Contributor:

Suggested change
# which will have an __init__ function to specify the input fields and some helper functions. Our first
# which will have an ``__init__`` function to specify the input fields and some helper functions. Our first

#
# The first thing we need to define is our data items. In this case, we will create a class called NameData
# which will have an __init__ function to specify the input fields and some helper functions. Our first
# helper function will be __str__ to convert objects to strings for easy printing
Contributor:

Suggested change
# helper function will be __str__ to convert objects to strings for easy printing
# helper function will be ``__str__`` to convert objects to strings for easy printing

# ``all_categories`` (just a list of languages) and ``n_categories`` for
# later reference.
#########################
#Now we can use that class to create a singe piece of data.
Contributor:

Suggested change
#Now we can use that class to create a singe piece of data.
#Now we can use that class to create a single piece of data.

@@ -181,21 +255,22 @@ def lineToTensor(line):
#
# This RNN module implements a "vanilla RNN" an is just 3 linear layers
# which operate on an input and hidden state, with a ``LogSoftmax`` layer
# after the output.
# after the output.s
Contributor:

Suggested change
# after the output.s
# after the output.

…ng all RNN definition into one, moving RNN.learn() to separate train()

@mgs28 (Author) commented Sep 13, 2024

@jbschlosser - thank you for the lovely suggestions. If possible, I'd like to split my response into two parts:

First, the edits to my existing content.

  1. Excellent point on NameData. I removed it and used helper functions.
  2. I'm glad you like the Dataset addition - that was my prompt for doing this.
  3. Thanks for letting me know the multiple definitions were confusing. I consolidated it into a single, simple definition, particularly since training was split out of the class.
  4. I added training as a separate function. I will go learn more about those third party trainers since I haven't used them.

Secondly, I would really like to use nn.RNN if possible. There are very few tutorials that mention it, and everyone seems to base their RNN builds on this tutorial. However, to solve this task I think I need a network with layer sizes like [57, 128, 18], and it looks like a default Elman network is limited to something like [57, 18].

Is it best practice to inherit from nn.RNN and add my own fully connected output layer, or am I misunderstanding something?

Thanks!

@mgs28 (Author) commented Sep 13, 2024

To make it simpler, I assume extending the nn.RNN class might look like the following (which runs about 40% faster):

import torch.nn as nn

class MyRNN(nn.RNN):
    def __init__(self, input_size, hidden_size, output_size):
        super(MyRNN, self).__init__(input_size, hidden_size)

        self.h2o = nn.Linear(hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, line_tensor):
        # Pass the input through the RNN layers
        rnn_out, hidden = super(MyRNN, self).forward(line_tensor)
        output = self.h2o(hidden[0])
        output = self.softmax(output)

        return output

@jbschlosser (Contributor) commented Sep 16, 2024

Is it best practice to inherit from nn.RNN and add my own fully connected output layer, or am I misunderstanding something?

Rather than inherit, we generally encourage composition. In this case, something like:

# better name needed :)
class MyRNN(nn.Module):
    def __init__(self, ...):
        super().__init__()
        self.rnn = nn.RNN(...)
        self.h2o = nn.Linear(...)

    ...
    def forward(self, x):
        _, hidden = self.rnn(x)
        output = self.h2o(hidden[0])
        return F.log_softmax(output, dim=1)
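
A filled-in, self-contained version of that composition sketch (class name and layer sizes are illustrative, not necessarily the PR's final code) might look like:

import torch
import torch.nn as nn
import torch.nn.functional as F

class CharRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size)
        self.h2o = nn.Linear(hidden_size, output_size)

    def forward(self, line_tensor):
        # line_tensor: (seq_len, 1, input_size) one-hot encoded characters
        _, hidden = self.rnn(line_tensor)
        output = self.h2o(hidden[0])
        return F.log_softmax(output, dim=1)

# e.g. 57 one-hot character features, 128 hidden units, 18 language classes
# model = CharRNN(57, 128, 18)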

@mgs28 (Author) commented Sep 17, 2024

Thanks @jbschlosser! I used nn.RNN via composition and changed some of the surrounding text. That forced the addition of a few terms to the repo dictionary. I appreciate you teaching me something again, and hopefully the tutorial is better for it.

Labels: cla signed, core (Tutorials of any level of difficulty related to the core pytorch functionality)
Projects: None yet
Development: Successfully merging this pull request may close these issues: "Char RNN classification with batch size"
4 participants