[Bug]: When use guided choice feature, vllm.engine.async_llm_engine.AsyncEngineDeadError #8100

Open
TangJiakai opened this issue Sep 3, 2024 · 4 comments
Labels
bug (Something isn't working), unstale

Comments

@TangJiakai

Your current environment

ERROR:asyncio:Exception in callback functools.partial(<function _raise_exception_on_finish at 0x7fd321802200>, error_callback=<bound method AsyncLLMEngine._error_callback of <vllm.engine.async_llm_engine.AsyncLLMEngine object at 0x7fd316450310>>)
handle: <Handle functools.partial(<function _raise_exception_on_finish at 0x7fd321802200>, error_callback=<bound method AsyncLLMEngine._error_callback of <vllm.engine.async_llm_engine.AsyncLLMEngine object at 0x7fd316450310>>)>
Traceback (most recent call last):
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 40, in _raise_exception_on_finish
    task.result()
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 521, in run_engine_loop
    has_requests_in_progress = await asyncio.wait_for(
                               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/asyncio/tasks.py", line 479, in wait_for
    return fut.result()
           ^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 495, in engine_step
    request_outputs = await self.engine.step_async()
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 226, in step_async
    output = await self.model_executor.execute_model_async(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/vllm/executor/gpu_executor.py", line 117, in execute_model_async
    output = await make_async(self.driver_worker.execute_model
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/vllm/worker/worker.py", line 272, in execute_model
    output = self.model_runner.execute_model(seq_group_metadata_list,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/vllm/worker/model_runner.py", line 731, in execute_model
    logits = self.model.compute_logits(hidden_states, sampling_metadata)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/vllm/model_executor/models/llama.py", line 369, in compute_logits
    logits = self.logits_processor(self.lm_head.weight, hidden_states,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/vllm/lora/layers.py", line 1195, in forward
    return type(self.base_layer).forward(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/vllm/model_executor/layers/logits_processor.py", line 55, in forward
    logits *= self.scale
RuntimeError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "uvloop/cbhandles.pyx", line 63, in uvloop.loop.Handle._run
  File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/vllm/engine/async_llm_engine.py", line 47, in _raise_exception_on_finish
    raise AsyncEngineDeadError(
vllm.engine.async_llm_engine.AsyncEngineDeadError: Task finished unexpectedly. This should never happen! Please open an issue on Github. See stack trace above for the actual cause.
ERROR:    Exception in ASGI application
  + Exception Group Traceback (most recent call last):
  |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/starlette/_utils.py", line 77, in collapse_excgroups
  |     yield
  |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/starlette/middleware/base.py", line 186, in __call__
  |     async with anyio.create_task_group() as task_group:
  |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 680, in __aexit__
  |     raise BaseExceptionGroup(
  | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 401, in run_asgi
    |     result = await app(  # type: ignore[func-returns-value]
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
    |     return await self.app(scope, receive, send)
    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    |     await super().__call__(scope, receive, send)
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/starlette/applications.py", line 113, in __call__
    |     await self.middleware_stack(scope, receive, send)
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/starlette/middleware/errors.py", line 187, in __call__
    |     raise exc
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/starlette/middleware/errors.py", line 165, in __call__
    |     await self.app(scope, receive, _send)
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/starlette/middleware/base.py", line 185, in __call__
    |     with collapse_excgroups():
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/contextlib.py", line 155, in __exit__
    |     self.gen.throw(typ, value, traceback)
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/starlette/_utils.py", line 83, in collapse_excgroups
    |     raise exc
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/starlette/middleware/base.py", line 187, in __call__
    |     response = await self.dispatch_func(request, call_next)
    |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 164, in authentication
    |     return await call_next(request)
    |            ^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/starlette/middleware/base.py", line 163, in call_next
    |     raise app_exc
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/starlette/middleware/base.py", line 187, in __call__
    |     response = await self.dispatch_func(request, call_next)
    |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/data/tangjiakai/anaconda3/lib/python3.11/site-packages/vllm/entrypoints/openai/api_server.py", line 164, in authentication
    |     return await call_next(request)
    |            ^^^^^^^^^^^^^^^^^^^^^^^^
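
The CUDA error in the log suggests rerunning with CUDA_LAUNCH_BLOCKING=1 so that the failing kernel is reported at the correct frame. A minimal sketch of the relaunch, assuming the same flags as the command under "Describe the bug" below:

# Force synchronous CUDA kernel launches so the illegal access is
# attributed to the right stack frame (slower, debugging only).
CUDA_LAUNCH_BLOCKING=1 python -m vllm.entrypoints.openai.api_server \
        --model /data/pretrain_dir/Meta-Llama-3-8B-Instruct \
        --trust-remote-code \
        --port 8083 \
        --dtype auto \
        --pipeline-parallel-size 1 \
        --enforce-eager \
        --enable-prefix-caching \
        --enable-lora \
        --gpu-memory-utilization 0.6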

🐛 Describe the bug

I run the API server with the following command:

python -m vllm.entrypoints.openai.api_server \
        --model /data/pretrain_dir/Meta-Llama-3-8B-Instruct \
        --trust-remote-code \
        --port 8083 \
        --dtype auto \
        --pipeline-parallel-size 1 \
        --enforce-eager \
        --enable-prefix-caching \
        --enable-lora \
        --gpu-memory-utilization 0.6
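
The crash is triggered by a guided choice request. The exact payload is not preserved above, so here is a minimal sketch of such a request against this server (hypothetical prompt and choices; the port and model path match the command above):

# Guided choice via vLLM's OpenAI-compatible completions endpoint:
# "guided_choice" constrains the output to one of the listed strings.
curl http://localhost:8083/v1/completions \
        -H "Content-Type: application/json" \
        -d '{
            "model": "/data/pretrain_dir/Meta-Llama-3-8B-Instruct",
            "prompt": "Is the following review positive or negative? Great product!",
            "max_tokens": 5,
            "guided_choice": ["positive", "negative"]
        }'

Guided decoding installs a logits processor that masks every token that cannot extend one of the listed choices, which matches the failure point in the trace above (compute_logits → logits_processor).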

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
TangJiakai added the bug label Sep 3, 2024

flawaetz commented Sep 4, 2024

This might be fixed already in 0.5.5 via #6594
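
If it still reproduces, it may be worth double-checking which version the serving process actually runs, e.g.:

# Print the vLLM version installed in the serving environment.
python -c "import vllm; print(vllm.__version__)"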

@TangJiakai (Author)

No, it still happens in version 0.6.0

TangJiakai changed the title from "[Bug]: 0.4.3 vllm vllm.engine.async_llm_engine.AsyncEngineDeadError" to "[Bug]: vllm.engine.async_llm_engine.AsyncEngineDeadError" Sep 9, 2024
TangJiakai changed the title from "[Bug]: vllm.engine.async_llm_engine.AsyncEngineDeadError" to "[Bug]: When use guided choice feature, vllm.engine.async_llm_engine.AsyncEngineDeadError" Sep 9, 2024

github-actions bot commented Dec 9, 2024

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

github-actions bot added the stale label Dec 9, 2024
@gongjl123

No, it still happens in version 0.6.0

How was this problem solved?

github-actions bot added the unstale label and removed the stale label Dec 25, 2024