Hi,

First of all, many thanks for making this repository accessible; great job! I have two questions/comments:
1. It would be nice to have an indication of roughly how much memory is needed to load/prepare the datasets. I started with your SlimPajama-6B example and had some initial problems because I ran out of memory. Increasing the number of CPUs did the job, but it was hard to figure out how much is actually needed (and you have to redo the download every time). A sketch of one way to check this is below the list.
2. It would also be super helpful to have some benchmarks, if you have them available: for example, for a given model and dataset, the best train/val loss you have reached so far, and the optimizer settings that reached it. I don't know how many runs you did yourself, but if you have this information, it would be awesome to put it in an overview so that everyone can configure a good baseline without having to do the tuning themselves.
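In case it helps others hitting the same problem, here is a minimal sketch of how one could measure peak memory during preparation, assuming the data is loaded via Hugging Face `datasets` (the repo ID, the `text` column, and the `num_proc` value are placeholders on my part, not necessarily what your prepare script actually uses):

```python
import resource

from datasets import load_dataset


def report_peak_rss(tag: str) -> None:
    """Print the peak resident set size of this process so far."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is in KiB on Linux and in bytes on macOS.
    print(f"[{tag}] peak RSS: {peak / 1024:.1f} MiB")


# Placeholder repo ID; substitute whatever the prepare script downloads.
ds = load_dataset("DKYoon/SlimPajama-6B", split="train")
report_peak_rss("after load_dataset")

# Preprocessing step sketched with a dummy map function; a larger num_proc
# trades extra CPU (and per-worker memory) for faster preparation.
ds = ds.map(lambda ex: {"n_chars": len(ex["text"])}, num_proc=8)
report_peak_rss("after map")
```

One caveat: `ru_maxrss` with `RUSAGE_SELF` only covers the current process, so with `num_proc > 1` the worker processes' memory is not included and would have to be tracked separately (e.g. via `psutil`).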
Thanks, and kind regards,
Fabian