Possible feature and bugfix contributions from Microsoft research team's fork of Metaseq #726

mattmazzola · 2023-06-01T22:15:57Z

We are a team at @microsoft Research that maintains a fork of the Metaseq repo with these additional features:

1. New pipeline task to perform Knowledge Distillation via Log Probabilities using a modified Cross Entropy implementation.
2. Improved inference script with added functionality such as ability to output logprobs/logits.
3. Improvements to Training Stop Conditions
4. Scripts to support Teacher data generation using Open AI Service
5. Documentation system using Sphinx
   1. Documentation of Co-Teaching training process (https://arxiv.org/pdf/2305.02031.pdf)
6. Improved evaluation configuration to evaluate with different metrics depending on dataset
7. Miscellaneous Bug Fixes
   1. jsonl_dataset.py#_build_index properly accounts for multi-byte characters.
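
To illustrate the multi-byte issue mentioned in the last item: an index of line start positions has to be computed in bytes, because `seek` on a binary file works in bytes while `len` on a decoded string counts characters. The following is only a minimal sketch of that idea, not the actual `_build_index` code from Metaseq or from the fork; the function names `build_line_offsets`, `read_line_at`, and `demo` are hypothetical.

```python
import os
import tempfile


def build_line_offsets(path):
    """Return the byte offset at which each line of the file starts."""
    offsets = []
    position = 0
    # Binary mode: len(line) counts bytes, so multi-byte UTF-8
    # characters advance the position by their true encoded size.
    with open(path, "rb") as f:
        for line in f:
            offsets.append(position)
            position += len(line)
    return offsets


def read_line_at(path, offset):
    """Seek to a byte offset and decode the single line found there."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.readline().decode("utf-8").rstrip("\n")


def demo():
    """Write a small JSONL file with multi-byte text and index it."""
    lines = ['{"text": "héllo"}', '{"text": "日本語"}', '{"text": "plain"}']
    with tempfile.NamedTemporaryFile("wb", suffix=".jsonl", delete=False) as tmp:
        tmp.write(("\n".join(lines) + "\n").encode("utf-8"))
        path = tmp.name
    try:
        offsets = build_line_offsets(path)
        recovered = [read_line_at(path, o) for o in offsets]
        return lines, recovered, offsets
    finally:
        os.remove(path)
```

If the offsets were accumulated from character counts instead, every line after the first non-ASCII character would be sought at the wrong position, and the decoded record would start mid-line.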

Questions

Which of the features above would you be interested in us contributing back to Metaseq?
Would you be able to offer assistance with the merge process?
- For example, testing and verification of functionality for a feature PR.

We would be happy to answer any questions you have about the above components.

@tupini07


suchenzang · 2023-06-08T17:49:51Z

@mattmazzola Sorry for the delay - I've been on PTO. I would be interested in all of the above contributions as they come online (deferring to you on what the best ordering here would be)!

mattmazzola · 2023-06-08T18:23:41Z

> interested in all of the above contributions

Ok! I will talk with the rest of the team and see what we want to do.

We are trying to roll off our current work and transition to another project, so it is not clear how much time we will be able to spend on these contributions. This creates a trade-off between wanting the larger items for impact and the smaller items for less commitment.

> deferring to you on what the best ordering here would be

These fixes and features from our fork have some non-trivial divergence from metaseq main, so it is hard to judge how much work is involved until we see how many merge conflicts there are. It also makes testing difficult or impossible, since our infrastructure used a different dependency set running in an Azure Machine Learning environment.

The list above was ordered by our estimate of how impactful each PR contribution would be to Metaseq. However, given the difficulties, I was trying to create the PRs in the inverse order to increase the likelihood that they merge, beginning with the smallest / easiest, since they were least likely to break something and would not rely on as much help.

I think I may be able to at least submit PRs to share the ideas, but they may not be directly mergeable.
To be safest, I think the PR or branch could be taken over by a core maintainer and verified.

suchenzang · 2023-06-08T19:40:31Z

> These fixes and features from our fork have some non-trivial divergence from metaseq main... I think I may be able to at least submit PRs to share the ideas, but they may not be directly mergeable.

That makes a lot of sense - feel free to open up PRs in whatever state you have them; they will be a useful starting point for figuring out how to merge / test them and pull into main over time.

mattmazzola · 2023-06-13T16:05:37Z

I have created PRs for all of the items on the initial issue list (except for item 4) and referenced this issue.
Hopefully these can help improve Metaseq. Perhaps someone will continue exploration of the "soft" distillation technique in the future.
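
For readers unfamiliar with the "soft" distillation idea referred to here, one common formulation replaces the one-hot target in cross entropy with the teacher's probability distribution, recovered from its log probabilities. The sketch below shows that formulation for a single token position in plain Python. It is an assumption about the general technique, not the fork's modified Cross Entropy implementation, and the function names are hypothetical. It also assumes the teacher provides log probabilities over the full vocabulary, which a hosted API may not do.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def soft_cross_entropy(student_logits, teacher_logprobs):
    """Cross entropy against a soft target for one token position.

    Computes -sum_v p_teacher(v) * log q_student(v), where
    p_teacher(v) = exp(teacher_logprobs[v]). With a one-hot teacher
    this reduces to the usual hard-label cross entropy.
    """
    student_logprobs = log_softmax(student_logits)
    return -sum(
        math.exp(t) * s for t, s in zip(teacher_logprobs, student_logprobs)
    )


# Example: the loss is smallest when the student matches the teacher.
teacher = log_softmax([1.0, 2.0, 3.0])
loss_matched = soft_cross_entropy([1.0, 2.0, 3.0], teacher)
loss_mismatched = soft_cross_entropy([3.0, 2.0, 1.0], teacher)
```

When the student distribution equals the teacher distribution, the loss equals the teacher's entropy, which is the lower bound for this objective.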

mattmazzola added the enhancement label Jun 1, 2023

This was referenced Jun 7, 2023

fix: add support for wide characters when building index of dataset files #728

Merged

fix: ensure last checkpoint is always saved, refactor training stop conditions to be computed in single location #729

Open

suchenzang self-assigned this Jun 8, 2023
