Update to huggingface/tokenizers v0.20.0 (#23)
* update tokenizers lib to v0.20.0
* benchmark new release
* update bazel builds
Showing 6 changed files with 103 additions and 11 deletions.
45 changes: 45 additions & 0 deletions
test/benchmark/1b502b65573ea00125eac62fa301c480402be19c.txt
@@ -0,0 +1,45 @@
goos: darwin
goarch: arm64
pkg: github.com/daulet/tokenizers
BenchmarkEncodeNTimes-10 95174 12667 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 94437 12580 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 93362 12583 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 94240 13372 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 92844 12868 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 92984 12766 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 92055 12654 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 91874 13204 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 93130 12686 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 93288 12528 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNChars-10 1000000000 2.374 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 2.651 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 1.993 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 2.169 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 2.282 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 2.348 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 2.028 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 2.013 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 2.200 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 1.957 ns/op 0 B/op 0 allocs/op
BenchmarkDecodeNTimes-10 250281 4474 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 268866 4501 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 260468 4422 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 264583 4455 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 262168 4552 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 262182 4455 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 262510 4511 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 263491 4524 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 265724 4396 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 259940 4430 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTokens-10 1804423 678.7 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1827415 654.8 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1850868 648.1 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1838286 650.1 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1853236 655.6 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1835120 657.1 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1838400 652.3 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1847911 659.2 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1808113 654.2 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1820958 666.3 ns/op 7 B/op 0 allocs/op
PASS
ok github.com/daulet/tokenizers 245.425s
45 changes: 45 additions & 0 deletions
test/benchmark/7bb47dd52e68ae3349c0461d494921d6a07f7181.txt
@@ -0,0 +1,45 @@
goos: darwin
goarch: arm64
pkg: github.com/daulet/tokenizers
BenchmarkEncodeNTimes-10 91389 12616 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 94416 12608 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 95833 12702 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 93657 12692 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 95575 12565 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 95866 12700 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 95568 12502 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 95286 12625 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 95224 12739 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNTimes-10 93948 12949 ns/op 232 B/op 12 allocs/op
BenchmarkEncodeNChars-10 1000000000 2.254 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 3.099 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 2.273 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 2.722 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 1.965 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 2.024 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 1.997 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 2.320 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 1.866 ns/op 0 B/op 0 allocs/op
BenchmarkEncodeNChars-10 1000000000 4.136 ns/op 0 B/op 0 allocs/op
BenchmarkDecodeNTimes-10 239275 4575 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 243561 4515 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 258657 4480 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 262723 4597 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 263178 4466 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 266382 4442 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 266616 4498 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 266132 4544 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 266750 4780 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 266880 4454 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTokens-10 1808430 655.3 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1832203 649.4 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1851890 648.7 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1836775 649.1 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1839984 650.7 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1854864 643.8 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1854836 647.9 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1866586 643.4 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1794544 666.8 ns/op 7 B/op 0 allocs/op
BenchmarkDecodeNTokens-10 1768803 666.9 ns/op 7 B/op 0 allocs/op
PASS
ok github.com/daulet/tokenizers 226.796s
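Each file above holds ten runs per benchmark, so the two can be compared by averaging the ns/op column per benchmark name. The following is a minimal sketch of that step, not part of this commit: `meanNsPerOp` is a hypothetical helper, and it assumes the standard `go test -bench` line layout (name, iteration count, value, `ns/op`). For anything beyond a quick check, the `benchstat` tool from `golang.org/x/perf` gives a statistically sounder comparison.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// meanNsPerOp is a hypothetical helper (not from the repository): it scans
// `go test -bench` output and returns the arithmetic mean of the ns/op
// column for each benchmark name.
func meanNsPerOp(output string) map[string]float64 {
	sums := map[string]float64{}
	counts := map[string]int{}
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Result lines look like: Name-N <iterations> <value> ns/op ...
		// Header lines (goos, goarch, pkg) and PASS/ok lines are skipped.
		if len(fields) < 4 || !strings.HasPrefix(fields[0], "Benchmark") || fields[3] != "ns/op" {
			continue
		}
		v, err := strconv.ParseFloat(fields[2], 64)
		if err != nil {
			continue
		}
		sums[fields[0]] += v
		counts[fields[0]]++
	}
	means := make(map[string]float64, len(sums))
	for name, sum := range sums {
		means[name] = sum / float64(counts[name])
	}
	return means
}

func main() {
	// Two result lines copied from the first benchmark file above.
	sample := `BenchmarkDecodeNTimes-10 250281 4474 ns/op 96 B/op 3 allocs/op
BenchmarkDecodeNTimes-10 268866 4501 ns/op 96 B/op 3 allocs/op`
	for name, mean := range meanNsPerOp(sample) {
		fmt.Printf("%s mean: %.1f ns/op\n", name, mean)
	}
}
```

Running the helper over the full contents of both files gives one mean per benchmark per file, which can then be compared side by side.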