[BUG] - Training Transformer models using Distributed Data Parallel and Pipeline Parallelism Tutorial broken #2916

Open
loganthomas opened this issue Jun 8, 2024 · 1 comment
@loganthomas
Contributor

Add Link

@pritamdamania87 for awareness

Describe the bug

Related to #2895 and #2910

The tutorial's training process uses the WikiText-2 dataset from torchtext, which is deprecated and no longer supported. In #2895, the proposed solution was to use Hugging Face's version of WikiText-2.

/usr/local/lib/python3.10/dist-packages/torchtext/datasets/__init__.py:4: UserWarning: 
/!\ IMPORTANT WARNING ABOUT TORCHTEXT STATUS /!\ 
Torchtext is deprecated and the last released version will be 0.18 (this one). You can silence this warning by calling the following at the beginnign of your scripts: `import torchtext; torchtext.disable_torchtext_deprecation_warning()`
  warnings.warn(torchtext._TORCHTEXT_DEPRECATION_MSG)
/usr/local/lib/python3.10/dist-packages/torchtext/data/__init__.py:4: UserWarning: 
/!\ IMPORTANT WARNING ABOUT TORCHTEXT STATUS /!\ 
Torchtext is deprecated and the last released version will be 0.18 (this one). You can silence this warning by calling the following at the beginnign of your scripts: `import torchtext; torchtext.disable_torchtext_deprecation_warning()`
  warnings.warn(torchtext._TORCHTEXT_DEPRECATION_MSG)
/usr/local/lib/python3.10/dist-packages/torchtext/vocab/__init__.py:4: UserWarning: 
/!\ IMPORTANT WARNING ABOUT TORCHTEXT STATUS /!\ 
Torchtext is deprecated and the last released version will be 0.18 (this one). You can silence this warning by calling the following at the beginnign of your scripts: `import torchtext; torchtext.disable_torchtext_deprecation_warning()`
  warnings.warn(torchtext._TORCHTEXT_DEPRECATION_MSG)
/usr/local/lib/python3.10/dist-packages/torchtext/utils.py:4: UserWarning: 
/!\ IMPORTANT WARNING ABOUT TORCHTEXT STATUS /!\ 
Torchtext is deprecated and the last released version will be 0.18 (this one). You can silence this warning by calling the following at the beginnign of your scripts: `import torchtext; torchtext.disable_torchtext_deprecation_warning()`
  warnings.warn(torchtext._TORCHTEXT_DEPRECATION_MSG)
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-6-95aeeff963a5> in <cell line: 9>()
      7 from torchtext.vocab import build_vocab_from_iterator
      8 
----> 9 train_iter = WikiText2(split='train')
     10 tokenizer = get_tokenizer('basic_english')
     11 vocab = build_vocab_from_iterator(map(tokenizer, train_iter), specials=["<unk>"])

2 frames
/usr/local/lib/python3.10/dist-packages/torchtext/datasets/wikitext2.py in WikiText2(root, split)
     67     """
     68     if not is_module_available("torchdata"):
---> 69         raise ModuleNotFoundError(
     70             "Package `torchdata` not found. Please install following instructions at https://github.com/pytorch/data"
     71         )

ModuleNotFoundError: Package `torchdata` not found. Please install following instructions at https://github.com/pytorch/data

---------------------------------------------------------------------------
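For reference, here is a minimal sketch of what the replacement proposed in #2895 might look like. It is not the tutorial's actual fix. The helper names (`load_wikitext2_lines`, `simple_tokenize`, `build_vocab`, `encode`) are hypothetical, the `wikitext-2-v1` config name on the Hugging Face Hub is assumed to match the data torchtext downloaded, and the whitespace tokenizer is a simplification of torchtext's `basic_english`, not an equivalent.

```python
from collections import Counter


def load_wikitext2_lines(split="train"):
    """Load WikiText-2 lines from the Hugging Face Hub.

    Assumption: requires `pip install datasets` and network access.
    The config name "wikitext-2-v1" is assumed to be the pre-tokenized
    variant that torchtext's WikiText2 used.
    """
    from datasets import load_dataset  # imported lazily so the helpers below work offline

    dataset = load_dataset("wikitext", "wikitext-2-v1", split=split)
    return dataset["text"]


def simple_tokenize(line):
    """Lowercase and split on whitespace (a simplification of 'basic_english')."""
    return line.lower().split()


def build_vocab(lines, specials=("<unk>",)):
    """Map each token to an integer id; specials come first, then tokens by frequency."""
    counter = Counter(tok for line in lines for tok in simple_tokenize(line))
    # Sort by descending count, then alphabetically, so ids are deterministic.
    ordered = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
    itos = list(specials) + [tok for tok, _ in ordered if tok not in specials]
    return {tok: idx for idx, tok in enumerate(itos)}


def encode(line, vocab, unk_token="<unk>"):
    """Convert a line to token ids, falling back to the <unk> id for unseen tokens."""
    unk_id = vocab[unk_token]
    return [vocab.get(tok, unk_id) for tok in simple_tokenize(line)]
```

With these helpers, `build_vocab(load_wikitext2_lines("train"))` would stand in for the `WikiText2(split='train')` / `build_vocab_from_iterator(...)` lines in the failing cell, and the id lists returned by `encode` could be wrapped in `torch.tensor(..., dtype=torch.long)` wherever the tutorial builds its tensors.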

Describe your environment

Colab tutorial notebook

@loganthomas
Contributor Author

@pritamdamania87 how would you like to proceed?
