Issues: karpathy/minbpe
Issues list
What is the difference about the bbpe vocab decode method in minbpe against huggingface transformers?
#15
opened Feb 19, 2024 by
lovekittynine
updated Feb 19, 2024
A probably faster way for training the tokenizer (pure Python)
#25
opened Feb 20, 2024 by
ReinforcedKnowledge
updated Feb 21, 2024
A thanks from self-learners community
#45
opened Feb 24, 2024 by
IamExperimenting
updated Feb 24, 2024
how to deal with special tokens for multiple files
#44
opened Feb 24, 2024 by
IamExperimenting
updated Feb 24, 2024
Alternative to „Vector Representation Pre-training“ possible?
#30
opened Feb 21, 2024 by
Heavy02011
updated Feb 29, 2024
Using minBPE token encoded sentence vectors need to be padded
#56
opened Mar 19, 2024 by
elevateclub
updated Mar 19, 2024
Implementation of LlamaTokenizer (without sentencepiece)
#60
opened Mar 26, 2024 by
MaveriQ
updated Mar 26, 2024
Would using prompts that contain concatenated words to reduce token count negatively affect results
#61
opened Mar 28, 2024 by
hatgit
updated Apr 3, 2024
decode() method in GPT4Tokenizer does not handle special tokens
#64
opened Apr 7, 2024 by
Vakarva
updated Apr 7, 2024
minbpe-rs: A pure Rust implementation of minbpe
#66
opened Apr 21, 2024 by
shubham0204
updated Apr 22, 2024
Amplifying your courses with my digital notes
#70
opened Apr 30, 2024 by
AayushSameerShah
updated May 1, 2024
The regular expressions break all scripts with combining marks in the middle of the syllable
#73
opened May 12, 2024 by
ajaykg
updated May 22, 2024
Huggingface already has an efficient implementation of this?
#58
opened Mar 19, 2024 by
laurislopata
updated May 29, 2024
Instead of finding the one pair with the highest frequency and merging it at each step, do the highest N pairs
#69
opened Apr 23, 2024 by
hippietrail
updated Jun 7, 2024