Issues: deepjavalibrary/djl-serving
[Doubt] Inflight batching support in T5 - enhancement #2417, opened Oct 3, 2024 by vguruju
Upgrade to support latest vLLM version (max_lora_rank) - enhancement #2389, opened Sep 16, 2024 by dreamiter
Support for newer Vision LMs through vllm 0.6.1 - enhancement #2387, opened Sep 14, 2024 by rdzotz
docker 0.29.0-pytorch-inf2 with meta-llama/Meta-Llama-3.1-8B-Instruct fails - bug #2385, opened Sep 13, 2024 by yaronr
NeuronX compiler: specify data type - enhancement #2378, opened Sep 11, 2024 by CoolFish88
Transformers NeuronX continuous batching support for Mistral 7b Instruct V3 - enhancement #2377, opened Sep 11, 2024 by CoolFish88
Model conversion process failed. Unable to find bin files - bug #2365, opened Sep 5, 2024 by joshight
Mistral7b custom inference with LMI not working: java.lang.IllegalStateException: Read chunk timeout - bug #2362, opened Sep 5, 2024 by jeremite
Strange generation with Llama-3.1-70B on ml.inf2.48xlarge - bug #2354, opened Sep 3, 2024 by juliensimon
awscurl: missing token metrics when -t option specified - bug #2340, opened Aug 25, 2024 by CoolFish88
awscurl: WARN maxLength is not explicitly specified, use modelMaxLength: 512 - bug #2339, opened Aug 25, 2024 by CoolFish88
djl-inference:0.29.0-tensorrtllm0.11.0-cu124 regression: has no attribute 'to_word_list_format' - bug #2293, opened Aug 7, 2024 by lxning
Llama 2 7b chat model output quality is low - bug #2093, opened Jun 21, 2024 by ghost
Error running multi-model endpoints in SageMaker - bug #1911, opened May 15, 2024 by Najib-Haq
Document the /invocations endpoint - bug #1905, opened May 14, 2024 by tenpura-shrimp
Better support for Prometheus metrics and/or allow custom Prometheus metrics - enhancement #1827, opened Apr 27, 2024 by glennq
DJL-TensorRT-LLM bug: TypeError: Got unsupported ScalarType BFloat16 - bug #1816, opened Apr 25, 2024 by rileyhun
DJL-TRTLLM: error while detokenizing output response of teknium/OpenHermes-2.5-Mistral-7B on SageMaker - bug #1792, opened Apr 20, 2024 by omarelshehy
Question about "Model conversion process failed" error - bug #1785, opened Apr 17, 2024 by geraldstanje
Token accuracy not as expected for starcoderbase-15b model with rolling batch type vllm - bug #1720, opened Apr 2, 2024 by sreka
snap installer for djlbench doesn't work for arm64 platform - bug #1532, opened Feb 6, 2024 by snadampal
Plan to use Attention Sinks? - enhancement #1470, opened Jan 10, 2024 by spring1915
Add support for Nacos of Spring Cloud Alibaba - enhancement #1436, opened Dec 31, 2023 by litongjava
Streaming with rolling batch for starcoderbase model not working - bug #1352, opened Nov 30, 2023 by prgawade