-
Notifications
You must be signed in to change notification settings - Fork 438
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ConcatDataset allows Packed sub datasets #1708
Comments
I see that this issue without "community welcome" label, but because it became little bit stale here are few thoughts. Probably something like:
While initializing ConcatDataset will solve it, but knowing that we can import PackedDataset to ConcatDataset due to circular import, probably the better idea is to do something from this list:
So final solution will consist of:
If you not mind, I might try to do a PR. |
There shouldn't be a circular import if ConcatDataset imports PackedDataset, as |
Ok, will recheck this, as I see torchtune does not have any custom exceptions, so I assume that raising some |
Closed by #1796 |
Currently we don’t allow data packing with concat dataset. We can discuss enabling that, but in the meantime there’s a big in recipes wheee a user can set packed =True to each dataset in the list . This will individually pack each one but not use the right collate function, also, all of the datasets in the list should probably be packed together. To fix this for now, we can have ConcatDataset check that there are no packed datasets in the list.
The text was updated successfully, but these errors were encountered: