Development #1

Open · wants to merge 1,839 commits into master
Conversation

albertoandreottiATgmail
What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

regisss and others added 30 commits June 6, 2022 22:41
* Fix circular import in onnx.utils

* Add comment for test fetcher

* Here too

* Style
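
For context, the usual fix for this kind of circular import is to defer the import into the function that needs it. A minimal sketch of the pattern (names are illustrative, not the actual `onnx.utils` code):

```python
# Importing at call time instead of module-import time breaks the cycle:
# by the time the function runs, both modules are fully initialized.
def get_preprocessor(model_name: str):
    from transformers import AutoTokenizer  # deferred import avoids the cycle

    return AutoTokenizer.from_pretrained(model_name)
```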
* Add examples telemetry

* Alternative approach

* Add to all other examples

* Add to templates as well

* Put framework separately

* Same for TensorFlow
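
As a rough sketch of what these telemetry commits describe, assuming a `send_example_telemetry`-style helper with the framework passed separately:

```python
from transformers.utils import send_example_telemetry

# Reported once at example startup so usage of the official examples can
# be counted; the framework argument is what "Put framework separately"
# refers to (exact signature assumed here).
send_example_telemetry("run_glue", framework="tensorflow")
```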
* Support for deberta and deberta-v2

* Support for LXMert

* Support for Hubert

* Fix for pt1.11

* Trigger CI
* Quicktour Portuguese Translation

Translated quicktour.mdx up to line 161

* Finished translating quicktour.mdx

Ready to upload; any remaining .mdx or translation mistakes will be adjusted.

* Add _toctree.yml and fix nits

* Fixed pt-br mdx syntax problem

Closed <frameworkcontent> instance

* Changed </frameworkcontent> line

* Copied missing block from english version of quicktour.mdx

* Reviewed the entire file once again. It should be working now.

Co-authored-by: Omar U. Espejel <[email protected]>
* added cbs to notebooks, made copy-paste error fix in generation_utils

* initial push for mctc model

* mctc feature extractor done

* added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.

* added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.

* passing attention, now struggling to figure out how attention masks make sense here

* works when excluding attention masks; ask later how one would integrate attention masks here

* bizarre configuration error (model prefix comes first in config dict json and messes up the order)

* all passing but bizarre config dict ordering issue when to_dict

* passing all major tests

* feature extraction, processor, tokenizer added & tests passing

* style & consistency & other logistical fixes

* copy paste fix

* model after feature extraction working

* committing final feature extraction results; need to fix normalization

* feature extraction passing tests; probably should add tests on the specific flashlight-copied functions?

* delete print ; format code a bit

* fixing tests

* passing major tests

* fixing styles

* completed tokenization test with real example; not sure if these values are entirely correct.

* last test fixes from local

* reverting accidentally included custom setup configs

* remove load tf weights; fix config error

* testing: couldn't import feature extractor

* fix docs

* fix docs

* resolving comments

* style fixes

* style fixes

* Update to MCTCConv1dSubSampler

Co-authored-by: Patrick von Platen <[email protected]>

* relposemb fixes

* conv1d name issue; expecting config fail with parentheses

* fix config issue

* fix config issue

* fix config issue

* change everything to MCTCT

* fixing naming change errors

* archive list

* copyrights and docs

* copyrights and docs

* copyrights and docs

* merge resolution

* move tests, fix to changed optionaldependency structure

* test directories changed

* fixing tests

* how to avoid tf tests?

* how to avoid tf tests?

* tests passing locally

* allow MCTCTProcessor to be imported in any env

* allow MCTCTProcessor to be imported in any env

* fixed second round of feedback, need to fix docs

* doc changes not being applied

* all fixed

* style fix

* feedback fixes

* fix copies and feature extraction style fix

* Update tests/models/visual_bert/test_modeling_visual_bert.py

Co-authored-by: Sylvain Gugger <[email protected]>

* copy paste huggingface:main visual bert

* added eof newline to visual bert; all tests are passing otherwise

* fix slow tests by adding attention mask

* change model id to speechbrain

* make fix-copies

* fix readme unwanted deletes

* fixing readmes, make fix-copies

* consistent M-CTC-T naming

* Update src/transformers/models/mctct/__init__.py

Co-authored-by: Patrick von Platen <[email protected]>

* all fixed but variable naming

* adjust double quotes

* fixed variable names

* copyright and mr quilter

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* correct slow tests

* make fix-copies

* Update src/transformers/models/mctct/configuration_mctct.py

Co-authored-by: Sylvain Gugger <[email protected]>

* Update src/transformers/models/mctct/configuration_mctct.py

Co-authored-by: Sylvain Gugger <[email protected]>

* m-ctc-t not mctct

Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: ydshieh <[email protected]>
* Stricter pt-to-tf checks; Update docker image for related tests

* check all attributes in the output

Co-authored-by: Sylvain Gugger <[email protected]>
* feat: initial implementation of data2vec segmentation model in TF.

* chore: minor corrections to make the segmenter work.

* chore: removed unncessary files.

* chore: add tests and other modifications.

* fix: loss computation for segmentation.

* chore: remove unused variable.

* chore: formatting.

* added a dummy adaptive pooling layer.

* removed unnecessary file.

* potentially add identifiers to layer names.

* fix: layer naming.

* chore: removed unnecessary print.

* Skipping unneeded test

* chore: add logging to debug tolerance.

* fix: segmentation tests for tfdata2vecvision

* chore: make style.

* fix: layer names, assertion to be resolved.

* Bumping test tolerance a bit

* chore: bump the tol in PT test.

Co-authored-by: matt <[email protected]>
* Update docker file

Co-authored-by: ydshieh <[email protected]>
…Intel Extension for PyTorch (huggingface#17138)

* init PR

* fix import ipex

* minor fix on bf16

* refine optimizer

* refine args notes

* refine code

* refine ipex optimize args

* refine half_precision_backend

* black format

* isort format

* isort format files

* flake8 format

* doc builder format

* refine code

* remove jit and optim bits

* black preview format

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <[email protected]>

* refine code

* refine notes

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <[email protected]>

* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <[email protected]>

* code refine

* add ipex ut

* add performance cpu doc

* link to the cpu doc from main perf doc

* install ipex into CI's docker

* Update perf_train_cpu.mdx

* Update docs/source/en/perf_train_cpu.mdx

Co-authored-by: Stas Bekman <[email protected]>

* Update perf_train_cpu.mdx

* Update perf_train_cpu.mdx

Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: Stas Bekman <[email protected]>
Co-authored-by: Stas Bekman <[email protected]>
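
For reference, a minimal sketch of what the IPEX integration looks like from the user side, assuming the `use_ipex`/`bf16` training arguments these commits wire up and the standard `ipex.optimize` entry point:

```python
import torch
import intel_extension_for_pytorch as ipex
from transformers import AutoModelForSequenceClassification, TrainingArguments

# Via Trainer: flags as assumed from the commits above.
args = TrainingArguments(output_dir="out", use_ipex=True, bf16=True, no_cuda=True)

# Or directly: ipex.optimize applies the operator fusions itself.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
model = ipex.optimize(model.eval(), dtype=torch.bfloat16)
```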
* Fix link for community notebooks

This fixes the link for community notebooks due to reorganization.

* Replace old link with fully link to the doc page

Co-authored-by: Sylvain Gugger <[email protected]>

Co-authored-by: Sylvain Gugger <[email protected]>
* adding template

* update model

* model update

* update conf for debug model

* update conversion

* update conversion script

* update conversion script

* fix missing keys check

* add tests to test the tokenizer in the local machine

* Change variable name

* add tests on xnli dataset

* add more description

* add descriptions + clearer code

* clearer code

* adding new tests + skipping a few tests because of env problems

* change comment

* add dtype on the configuration

* add test embeddings

* add hardcoded test

* fix dtype issue

* adding torch.float16 to config

* adding more metrics (min, max, mean)

* add sum

* now the test passes with almost equal

* add files for conversion - test passes on cpu and gpu

* add final changes

* cleaning code

* add new args in the docstring

* fix one liner function

* remove macros

* remove forward attention

* clean up init function

* add comments on the issue

* rm scale mask softmax

* do make style

* fix dtype in init

* fixing for loop on att probs

* fix style with black

* fix style + doc error

* fix and debug CI errors (docs + style)

* some updates

- change new operations
- finally add scaled softmax
- added new args in the config

* make use cache working

* add changes

- save sharded models
- final changes on the modeling script

* add changes

- comment on alibi
- add TODO on seq length

* test commit

- added a text to test the commit

Co-authored-by: thomasw21 <[email protected]>

* final changes

- attention mask change
- generation works on BS176b

Co-authored-by: thomasw21 <[email protected]>

* changes - model + conversion

* move to correct dir

* put ,

* few fixes

* fix tokenizer autodoc

* fix minor CI issues

* fix minor CI issues

* fix minor CI issues

* fix style issue

* fix minor import issues

* fix few issues

* remove def main on the test

* add require torch

* replace decorator with 'with'

* fix style

* change to bloom

* add quick fix tokenizer

* fix tokenizer file

* fix tokenizer

- merge tests
- small fixes

* fix import issue

* add bloom to readme

* fix consistency

* Update docs/source/en/model_doc/bloom.mdx

Co-authored-by: Sylvain Gugger <[email protected]>

* Apply suggestions from code review

fix comment issues on file headers

Co-authored-by: Sylvain Gugger <[email protected]>

* fix doc issue

* small fix - modeling test

* some changes

- refactor some code
- taking into account reviews
- more tests should pass
- removed pruning tests

* remove useless division

* more tests should pass

* more tests should pass

* more tests should pass

* let's try this one

- add alibi offset
- remove all permutes to make the grad operations work
- fingers crossed

* refactor

- refactor code
- style changes
- add new threshold for test

* major changes

- change BLOOM to Bloom
- add quick doc on bloom.mdx
- move embeddings test on modeling test

* modify readme

* small fixes

* small fix

- better threshold for a test

* remove old test file from fetcher

* fix small typo

* major change

- change BloomLMHead to BloomForCausalLM

* remove onnx config

* major changes

- refactor the code
- remove asserts
- change tol for test

* make style

* small change

* adding a slow test + commenting old ones for now

* make style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* make style

* fix duplicates

* cleaning comments on config

* clean a bit conversion file

* refactor a bit the modeling file

* refactor tokenizer file

* fix tokenization test issue

* fix tokenization issue #2

* fix tokenization issue second try

* fix test issue

* make style + add suggestions

* change test fetcher

* try this one

- slow tests should pass
- fingers crossed

* possible final changes

* make style

* try fix padding side issue

* fix side

* fix padding issue

* fix ko-readme

* fix config auto

* cleaning modeling file

* keep bloom in caps in ko

* update config docs

* remove pretraining_pp

* remove model parallel

* update config

- add correct config files

* fix duplicates

* fix fetcher

* fix refactor issue

- remove divide function

* try to remove alibi

* small fixes

- fix alibi
- remove seq length
- refactor a bit the code

* put correct values

- fix bos and eos token ids

* fix attention mask loop

Co-authored-by: thomasw21 <[email protected]>

* small fixes:

- remove skip bias add

* small fixes

- fix typo in readme
- fix typos in config

* small changes

- remove a test
- add reconstruction test
- change config

* small changes

- change Scaled Softmax to BloomScaledSoftmax

* small fixes

- fix alibi dtype

* major changes

- removing explicit dtype when loading modules
- fixing test args (torch_dtype=auto)
- add docstring

* fix readmes

* major changes

- now bloom supports alibi shifting
- refactor a bit the code
- better test tolerance now

* refactor a bit

* refactor a bit

* put correct name on test

* change docstring

* small changes

- fix docstring modeling
- fix test tolerance

* fix small nit

- take dtype from tensors in the conversion script

* minor fix

- fix mdx issue

* minor fix

- change config docstring

* forward contrib credits from PR14084

* Apply suggestions from code review

Co-authored-by: Stas Bekman <[email protected]>

* apply modifications

Co-authored-by: Stas Bekman <[email protected]>

* resolve softmax upcast

* Apply suggestions from code review

Co-authored-by: Stas Bekman <[email protected]>

* Update src/transformers/models/bloom/modeling_bloom.py

Co-authored-by: Niklas Muennighoff <[email protected]>

* final changes modeling

Co-authored-by: Stas Bekman <[email protected]>

* Merge commit 'd156898f3b9b2c990e5963f5030a7143d57921a2'

* merge commit

* Apply suggestions from code review

Co-authored-by: Stas Bekman <[email protected]>

* apply suggestions

Apply suggestions from Stas comments
Co-authored-by: Stas Bekman <[email protected]>

* Fix gradient checkpointing

Co-authored-by: Stas Bekman <[email protected]>

* add slow but exact

* add accelerate compatibility

Co-authored-by: Nicolas Patry <[email protected]>

* forward contrib credits

Co-authored-by: thomasw21 <[email protected]>
Co-authored-by: sgugger <[email protected]>
Co-authored-by: patrickvonplaten <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]>
Co-authored-by: LysandreJik <[email protected]>

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <[email protected]>

* fix torch device on tests

* make style

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <[email protected]>

* fix nits

Co-authored-by: patrickvonplaten <[email protected]>

* remove final nits

* fix doc

- add more details on the doc
- add links to checkpoints

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <[email protected]>

* Update src/transformers/models/bloom/modeling_bloom.py

Co-authored-by: Sylvain Gugger <[email protected]>

* apply suggestions

Co-authored-by: sgugger <[email protected]>

* put test torchscript to false

* Update src/transformers/models/bloom/modeling_bloom.py

Co-authored-by: justheuristic <[email protected]>

* fix alibi

- create alibi only once
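
Since several commits above touch ALiBi, a minimal sketch of building the bias once and reusing it (illustrative, not the exact Bloom code):

```python
import torch

def alibi_slopes(num_heads: int) -> torch.Tensor:
    # ALiBi gives head i the slope 2^(-8i/num_heads) (num_heads a power of 2).
    base = 2.0 ** (-8.0 / num_heads)
    return torch.tensor([base ** (i + 1) for i in range(num_heads)])

def build_alibi(num_heads: int, seq_len: int) -> torch.Tensor:
    # The bias grows linearly with key position; shape (num_heads, 1, seq_len)
    # broadcasts over query positions when added to attention scores, so it
    # can be built once per (num_heads, seq_len) and cached.
    positions = torch.arange(seq_len)
    return alibi_slopes(num_heads)[:, None, None] * positions[None, None, :]
```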

* add small doc

* make quality

* replace torch.nn

* remove token type emb

* fix fused op + output bias

* add fused op

- now can control fused operation from config

* remove fused op

* make quality

* small changes

- remove unused args on config
- removed bias gelu file
- make the model torchscriptable
- add torchscript slow tests

* Update src/transformers/models/bloom/modeling_bloom.py

* fix slow

* make style

* add accelerate support

* add bloom to deepspeed tests

* minor changes

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <[email protected]>

* minor change

* slow tests pass

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <[email protected]>

* Update docs/source/en/model_doc/bloom.mdx

Co-authored-by: Sylvain Gugger <[email protected]>

* minor changes:

- change docstring
- add link to paper

Co-authored-by: Thomwolf <[email protected]>
Co-authored-by: Thomas Wolf <[email protected]>
Co-authored-by: thomasw21 <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
Co-authored-by: sIncerass <[email protected]>
Co-authored-by: Stas Bekman <[email protected]>
Co-authored-by: Niklas Muennighoff <[email protected]>
Co-authored-by: Nicolas Patry <[email protected]>
Co-authored-by: thomasw21 <[email protected]>
Co-authored-by: sgugger <[email protected]>
Co-authored-by: patrickvonplaten <[email protected]>
Co-authored-by: LysandreJik <[email protected]>
Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: justheuristic <[email protected]>
Co-authored-by: Stas Bekman <[email protected]>
* Add ONNX support for ResNet

* Add ONNX test

* make fix-copies
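
As a rough illustration of what the added ONNX support enables, an export of a ResNet checkpoint with plain `torch.onnx` (the PR itself goes through `transformers.onnx`; the output path and opset below are arbitrary):

```python
import torch
from transformers import ResNetForImageClassification

model = ResNetForImageClassification.from_pretrained("microsoft/resnet-50").eval()
model.config.return_dict = False  # tuple outputs trace more cleanly

# Dummy input with the checkpoint's expected (batch, channels, height, width).
dummy = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model,
    (dummy,),
    "resnet50.onnx",
    input_names=["pixel_values"],
    output_names=["logits"],
    dynamic_axes={"pixel_values": {0: "batch"}, "logits": {0: "batch"}},
    opset_version=13,
)
```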
* Use shape_list to safely get shapes

* Add relevant test

* Tidy and add metrics

* Resolve dynamic shaping issues and move test

* Tidy up and all samples in batch

* Formatting
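
The helper mentioned above mixes static and dynamic shape information; a minimal re-statement of the idea (transformers ships the real `shape_list`, this is just its shape):

```python
import tensorflow as tf

def shape_list(tensor):
    # Prefer dims that are statically known; fall back to the dynamic
    # tf.shape entry wherever the static dim is None.
    static = tensor.shape.as_list()
    dynamic = tf.shape(tensor)
    return [dynamic[i] if dim is None else dim for i, dim in enumerate(static)]

@tf.function(input_signature=[tf.TensorSpec([None, None, 64], tf.float32)])
def flatten(x):
    batch, seq_len, hidden = shape_list(x)  # batch/seq dynamic, hidden static
    return tf.reshape(x, (batch * seq_len, hidden))
```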
…(huggingface#17606)

* Adding `top_k` and `sort` arguments to `text-classification` pipeline.

- Deprecate `return_all_scores` as `top_k` is more uniform with other
  pipelines, and a superset of what `return_all_scores` can do.
  BC is maintained though.
  `return_all_scores=True` -> `top_k=None`
  `return_all_scores=False` -> `top_k=1`

- Using `top_k` will imply sorting the results, but using no argument
  will keep the results unsorted for backward compatibility.

* Remove `sort`.

* Fixing the test.

* Remove bad doc.
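
A short usage sketch of the new argument (the checkpoint is just a common example):

```python
from transformers import pipeline

clf = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

clf("I love this movie.")              # unchanged default: best label only
clf("I love this movie.", top_k=None)  # all labels, like return_all_scores=True
clf("I love this movie.", top_k=2)     # two best labels, sorted by score
```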
* Fix very long job failure text in Slack report

Co-authored-by: ydshieh <[email protected]>
When preparing tensors for CPU postprocessing, we need to upcast `float16` to `float32`, since CPUs don't have instructions for `[b]float16`.
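
A minimal sketch of the idea (the helper name is illustrative):

```python
import torch

def to_cpu_for_postprocessing(t: torch.Tensor) -> torch.Tensor:
    # CPUs lack fast (b)float16 kernels, so upcast half-precision tensors
    # to float32 before moving them off the accelerator.
    if t.dtype in (torch.float16, torch.bfloat16):
        t = t.float()
    return t.cpu()
```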
amyeroberts and others added 30 commits July 4, 2022 10:59
* Rough TF conversion outline

* Tidy up

* Fix padding differences between layers

* Add back embedder - whoops

* Match test file to main

* Match upstream test file

* Correctly pass and assign image_size parameter

Co-authored-by: Sayak Paul <[email protected]>

* Add in MainLayer

* Correctly name layer

* Tidy up AdaptivePooler

* Small tidy-up

More accurate type hints and remove whitespaces

* Change AdaptiveAvgPool

Use the AdaptiveAvgPool implementation by @Rocketknight1, which correctly pools if the output shape does not evenly divide by input shape c.f. https://github.com/huggingface/transformers/pull/17554/files/9e26607e22aa8d069c86b50196656012ff0ce62a#r900109509

Co-authored-by: matt <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>
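
For intuition, a sketch of the bin arithmetic such an adaptive pool uses when the output size does not evenly divide the input length (illustrative, not the PR's implementation):

```python
import tensorflow as tf

def adaptive_avg_pool_1d(x: tf.Tensor, output_size: int) -> tf.Tensor:
    # Output bin i averages input positions [floor(i*L/out), ceil((i+1)*L/out)),
    # so uneven divisions yield slightly different bin sizes instead of
    # dropped positions.
    length = x.shape[1]
    bins = []
    for i in range(output_size):
        start = (i * length) // output_size
        end = -(-((i + 1) * length) // output_size)  # ceiling division
        bins.append(tf.reduce_mean(x[:, start:end], axis=1))
    return tf.stack(bins, axis=1)

pooled = adaptive_avg_pool_1d(tf.random.normal((2, 7, 16)), 3)  # (2, 3, 16)
```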

* Use updated AdaptiveAvgPool

Co-authored-by: matt <[email protected]>

* Make AdaptiveAvgPool compatible with CPU

* Remove image_size from configuration

* Fixup

* Tensorflow -> TensorFlow

* Fix pt references in tests

* Apply suggestions from code review - grammar and wording

Co-authored-by: NielsRogge <[email protected]>

Co-authored-by: NielsRogge <[email protected]>

* Add TFResNet to doc tests

* PR comments - GlobalAveragePooling and clearer comments

* Remove unused import

* Add in keepdims argument

* Add num_channels check

* grammar fix: by -> of

Co-authored-by: matt <[email protected]>

Co-authored-by: Matt <[email protected]>

* Remove transposes - keep NHWC throughout forward pass

* Fixup look sharp

* Add missing layer names

* Final tidy up - remove from_pt now weights on hub

Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: matt <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
Co-authored-by: Matt <[email protected]>
…(huggingface#17501)

* Refactor to inherit from nn.Module instead of nn.ModuleList

* Fix typo

* Empty to trigger CI re-run

Blender Bot tests are failing (they should be unrelated to this PR and pass locally). I don't have sufficient permissions to re-run the CI workflow (in full or from failed).
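
A minimal sketch of the refactor's shape, assuming a layer-stack class like the one described:

```python
import torch.nn as nn

class LayerStack(nn.Module):
    # Holding the layers in an nn.ModuleList attribute, instead of
    # subclassing nn.ModuleList, gives the stack an explicit forward()
    # and keeps its public surface a plain nn.Module.
    def __init__(self, num_layers: int, hidden: int):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(hidden, hidden) for _ in range(num_layers))

    def forward(self, hidden_states):
        for layer in self.layers:
            hidden_states = layer(hidden_states)
        return hidden_states
```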
* Return scalar losses instead of per-sample means

* Make loss shape (1,) instead of scalar

* Allow scalar losses in test_loss_computation

* Allow scalar losses in test_loss_computation

* Allow scalar losses in test_loss_computation

* Remove XLA loss function for RAG
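
A sketch of the change's effect, assuming a standard masked cross-entropy: per-token losses are reduced to one value of shape `(1,)` before being handed to Keras:

```python
import tensorflow as tf

def scalar_masked_loss(labels, logits):
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
        from_logits=True, reduction=tf.keras.losses.Reduction.NONE
    )
    # -100 marks ignored positions; clamp them for the loss call, then mask.
    per_token = loss_fn(tf.nn.relu(labels), logits)
    mask = tf.cast(labels != -100, per_token.dtype)
    total = tf.reduce_sum(per_token * mask) / tf.reduce_sum(mask)
    return tf.reshape(total, (1,))  # shape (1,) rather than per-sample means
```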
…example (huggingface#18002)

* Add ALL_LAYERNORM_LAYERS for LayerNorm

* fix bug of appending layer norm
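
The constant exists so optimizers can skip weight decay on LayerNorm parameters; a sketch of that grouping, with a local stand-in for the real list:

```python
import torch.nn as nn

LAYERNORM_LAYERS = (nn.LayerNorm,)  # stand-in for transformers' ALL_LAYERNORM_LAYERS

def weight_decay_param_groups(model: nn.Module, weight_decay: float = 0.01):
    # Collect fully qualified names of parameters living inside LayerNorm
    # modules; those (and biases) get no weight decay.
    ln_params = set()
    for mod_name, mod in model.named_modules():
        if isinstance(mod, LAYERNORM_LAYERS):
            ln_params.update(f"{mod_name}.{p}" for p, _ in mod.named_parameters())
    decay, no_decay = [], []
    for name, param in model.named_parameters():
        (no_decay if name in ln_params or name.endswith("bias") else decay).append(param)
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]
```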
* Link to the Datasets doc

* Remove unwanted file
* Add script to sort doc ToC

* Style and fixes

* Add check to quality job
…(huggingface#18008)

* Added command for Windows venv activation

* changed Linux and macOS specification
…(huggingface#17967)

* Drop columns after loading samples, rather than before, to avoid breaking transforms

* make fixup

* Add workaround so this PR can work with current datasets version
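
A sketch of the ordering issue with `datasets` transforms (dataset and column names are illustrative):

```python
from datasets import load_dataset

ds = load_dataset("beans", split="train")

def transform(batch):
    # The transform reads the raw "image" column at access time, so that
    # column must still exist when a sample is fetched.
    batch["pixel_values"] = [img.convert("RGB") for img in batch["image"]]
    return batch

ds.set_transform(transform)

# Keep only the model's columns per sample, *after* the transform has run;
# calling ds.remove_columns(["image"]) up front would break the transform.
sample = {k: v for k, v in ds[0].items() if k in ("pixel_values", "labels")}
```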
* Fix slow CI by pinning resampy

* Actually put it in the speech dependencies
* Fix type issue in using bucketing with Trainer

  - Fix type issues in LengthGroupedSampler,
  DistributedLengthGroupedSampler

refs: huggingface#18003

* Change logging type in LengthGroupedSampler

- Change `logger.warning` to `logger.info`

Co-authored-by: Sylvain Gugger <[email protected]>

* Change logging type in DistributedLengthGroupedSampler

- Change `logger.warning` to `logger.info`

Co-authored-by: Sylvain Gugger <[email protected]>

* Remove redundant clause in LengthGroupedSampler

- Use `elif`

Co-authored-by: Sylvain Gugger <[email protected]>

* Remove redundant clause in DistributedLengthGroupedSampler

- Use `elif`

Co-authored-by: Sylvain Gugger <[email protected]>

* Apply black, isort to modified codes in the script

Co-authored-by: Sylvain Gugger <[email protected]>
…(huggingface#18078)

* Make Trainer.predict call on_evaluate (huggingface#17952)

* Add on_predict

* Small fix

* Small and different fix

* Add tests
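
A minimal sketch of hooking the new event, assuming the callback method added here is named `on_predict`:

```python
from transformers import TrainerCallback

class PredictReporter(TrainerCallback):
    def on_predict(self, args, state, control, metrics=None, **kwargs):
        # Fires once Trainer.predict has computed its metrics.
        print("predict metrics:", metrics)

# trainer.add_callback(PredictReporter()); trainer.predict(test_dataset)
```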