Commit

Merge pull request #24 from prescient-design/smiles-tokenizer
SMILES regex fast tokenizer
karinazad authored Jan 29, 2025
2 parents 1e87be3 + 6825097 commit 5eb017c
Showing 13 changed files with 1,610 additions and 33 deletions.
8 changes: 8 additions & 0 deletions src/lobster/assets/smiles_tokenizer/special_tokens_map.json
@@ -0,0 +1,8 @@
{
  "cls_token": "<cls>",
  "eos_token": "<eos>",
  "mask_token": "<mask>",
  "pad_token": "<pad>",
  "sep_token": "<sep>",
  "unk_token": "<unk>"
}
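
The map above assigns one token string to each special-token role. As a minimal sketch of how such a file might be consumed, the snippet below parses the same JSON with the standard library and checks that no two roles share a token string. The helper name `load_special_tokens` and the uniqueness check are illustrative assumptions, not code from this commit, and the JSON is embedded as a string so the example runs without the repository checked out.

```python
import json

# Contents of special_tokens_map.json, copied from the diff above.
SPECIAL_TOKENS_MAP_JSON = """
{
  "cls_token": "<cls>",
  "eos_token": "<eos>",
  "mask_token": "<mask>",
  "pad_token": "<pad>",
  "sep_token": "<sep>",
  "unk_token": "<unk>"
}
"""


def load_special_tokens(raw: str) -> dict[str, str]:
    """Hypothetical helper: parse the map and reject duplicate token strings."""
    tokens = json.loads(raw)
    values = list(tokens.values())
    if len(set(values)) != len(values):
        raise ValueError("each special-token role must map to a distinct string")
    return tokens


special_tokens = load_special_tokens(SPECIAL_TOKENS_MAP_JSON)
for role, token in special_tokens.items():
    print(f"{role}: {token}")
```

In practice a tokenizer library would typically read this file from the tokenizer directory itself; the sketch only illustrates the shape of the data.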