Where can I find the codeup_190k.json file? I want to train with this data. Thanks.
@manojitrtc1in Sorry for the late reply. I obtained codeup_190k.json by filtering existing data from Hugging Face. You can download the original 200k dataset here: https://huggingface.co/datasets/rombodawg/Legacy_MegaCodeTraining200k
Then run the following commands to produce the higher-quality instruction data, i.e., codeup_190k.json (I did not put it on GitHub because of its large size).
cd data
python preprocess.py
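For the download step, a minimal sketch using the `huggingface_hub` package is shown here. It assumes the package is installed and that the dataset repository is still hosted. The target directory `data/raw` is a hypothetical choice, and the file name and location that preprocess.py expects are not stated in this thread, so check the script before running it.

```python
# Sketch only: fetch the raw dataset files from the Hugging Face Hub.
# REPO_ID comes from the link in this thread; LOCAL_DIR is a hypothetical path.
REPO_ID = "rombodawg/Legacy_MegaCodeTraining200k"
LOCAL_DIR = "data/raw"


def download_raw(local_dir: str = LOCAL_DIR) -> str:
    """Download every file of the dataset repo and return the local folder path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the repo is a dataset, not a model
        local_dir=local_dir,
    )


if __name__ == "__main__":
    print("Downloaded to:", download_raw())
```

If the download succeeds, move or rename the files as preprocess.py requires, then run the commands above.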
Could you please share the data? rombodawg/Legacy_MegaCodeTraining200k is missing.