Stable hosting (+long-term archiving) of generated models (acoustic models etc) #228
Comments
An interesting example is set by a release of 1,008 machine translation models, covering 140 different languages. (From Hugging Face.)
I think this is definitely the way to go. As part of the Elpis-ESPnet integration it'll be good to prepare a multilingual model that can be fine-tuned to target languages. Making such a model pip-installable, or easy to get by other means, would be useful for the reasons you mentioned.
Great. Laurent Besacier (@besacier) will look into GitLab possibilities at 'his' place, and we will also look with @sguillaume at GitLab possibilities at Huma-Num.
@benfoley notes that ESPnet uses Zenodo: see https://colab.research.google.com/drive/1gnSuuFMNHvg1Tfli0bhhOMyfgQKkU3bu. See a list of deposits here. An example, fresh from this month (October 2020):
This issue and #228 both stem from a related concern: stable storage (and long-term archiving) not just of primary data, but also of 'intermediate' states of the data sets (preprocessed data sets) and of 'computational outputs' such as acoustic models trained on a given data set.
Even if the tool and the data are available online, if training takes days, wouldn't it make good sense to make the models available for download too (to the extent that the colleagues who produced them wish to make them available, of course)?
Possible benefits: