Refactor async engine & turbomind IO #2968
Merged
44 commits
7b8a841
refactor
lzhangzz 160ba9e
Merge remote-tracking branch 'origin/main' into refactor-1
lzhangzz 382f92b
async interface
lzhangzz 9c56be8
Merge remote-tracking branch 'origin/main' into refactor-3
lzhangzz 2cf49bd
update perf metrics & adaptive tokens per tick
lzhangzz aa5573d
wait-free
lzhangzz 6378aaa
refactor gateway
lzhangzz 9812538
optimize throughput
lzhangzz 8baa784
add cancel cb
lzhangzz 1bc68d1
simplify async engine
lzhangzz f220762
simplify async engine
lzhangzz 31c6223
fix end session
lzhangzz b3d15b1
faster synchronization
lzhangzz c6fd260
fix async engine
lzhangzz 8fa85dc
refactor async engine
lzhangzz 3f07733
fix semaphore
lzhangzz 2382d7e
refactor inference API
lzhangzz 747252c
remove turbomind sync interface
lzhangzz 54df9f1
Merge remote-tracking branch 'origin/main' into refactor-3
lzhangzz 5266f27
fix msvc build
lzhangzz 33ad2be
fix msvc build
lzhangzz 1c20608
fix msvc build
lzhangzz 6d1d209
Merge remote-tracking branch 'origin/main' into refactor-3
lzhangzz 43020b5
add extra outputs
lzhangzz 8412518
skip stop tokens
lzhangzz 3409742
exit gracefully
lzhangzz 21a7553
cancel all tasks atexit
lzhangzz 49701df
refactor profiler
lzhangzz f4b37af
fix id2step for api server
lzhangzz 2644fb7
save csv
lzhangzz 6029a2e
fix interactive
lzhangzz 50fdb68
fix lint
lzhangzz e2ed1a2
fix generate_token_len
lzhangzz 21432bf
fix async_end
lzhangzz ad0e07c
update pipeline ut
lzhangzz 4186da5
fix ignore eos
lzhangzz bee78b6
minor
lzhangzz 5f02cad
refactor profile pipeline api
lzhangzz 1965327
fix stop ids
lzhangzz 7b513cb
fix duplication
lzhangzz 2e3a17d
control output range of logits & last hidden states
lzhangzz 80108df
fix lint & typo
lzhangzz 6c2f901
fix blank response
lzhangzz 31b01f1
export batch & num prompts
lzhangzz
Why is do_sample set to True?