Optimum Intel OpenVino fails with segmentation fault #3066

Closed
yifanmai opened this issue Oct 16, 2024 · 2 comments · Fixed by #3153
Labels
bug (Something isn't working), models

Comments

Collaborator

yifanmai commented Oct 16, 2024

Hi @NoushNabi,

The Optimum Intel OpenVino tests have recently been failing intermittently. The cause appears to be a race condition triggered by multiple concurrent inference calls: requests first fail with "Infer Request is busy", and the run then exits with a segmentation fault. Could you take a look?

Example logs from this run (https://github.com/stanford-crfm/helm/actions/runs/11369018353/job/31625461750):

Executor.execute {
      Parallelizing computation on 10 items over 4 threads {
        Created cache with config: SqliteCacheConfig(path='prod_env/cache/hf-internal-testing.sqlite')
        Created cache with config: SqliteCacheConfig(path='prod_env/cache/hf-internal-testing.sqlite')
        Loading hf-internal-testing/tiny-random-MistralForCausalLM (kwargs={'openvino': True}) for HELM model hf-internal-testing/tiny-random-MistralForCausalLM with Hugging Face Transformers {
          Hugging Face device set to "cpu" because CUDA is unavailable.
          Loading Hugging Face model hf-internal-testing/tiny-random-MistralForCausalLM {
            Created cache with config: SqliteCacheConfig(path='prod_env/cache/hf-internal-testing.sqlite')
            Created cache with config: SqliteCacheConfig(path='prod_env/cache/hf-internal-testing.sqlite')
            Created cache with config: SqliteCacheConfig(path='prod_env/cache/hf-internal-testing.sqlite')
We detected that you are passing `past_key_values` as a tuple of tuples. This is deprecated and will be removed in v4.47. Please convert your cache or use an appropriate `Cache` class (https://huggingface.co/docs/transformers/kv_cache#legacy-cache-format)
/opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages/transformers/cache_utils.py:447: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
  or len(self.key_cache[layer_idx]) == 0  # the layer has no cache
/opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages/transformers/modeling_attn_mask_utils.py:281: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  elif sliding_window is None or key_value_length < sliding_window:
/opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages/transformers/cache_utils.py:432: TracerWarning: Using len to get tensor shape might cause the trace to be incorrect. Recommended usage would be tensor.shape[0]. Passing a tensor of different shape might lead to errors or silently give incorrect results.
  elif len(self.key_cache[layer_idx]) == 0:  # fills previously skipped layers; checking for tensor causes errors
Starting from v4.46, the `logits` model output will have the same type as the model (except at train time, where it will always be FP32)
          } [7.965s]
        } [7.966s]
        HuggingFace error: Infer Request is busy
        Request failed. Retrying (attempt #2) in 10 seconds... (See above for error details)
        HuggingFace error: Infer Request is busy
        Request failed. Retrying (attempt #2) in 10 seconds... (See above for error details)
        HuggingFace error: Infer Request is busy
        Request failed. Retrying (attempt #2) in 10 seconds... (See above for error details)
/home/runner/work/_temp/3b3f1c68-38a5-4e0d-ba66-80ecc08f0297.sh: line 1:  2069 Segmentation fault      (core dumped) helm-run --run-entries boolq:model=hf-internal-testing/tiny-random-MistralForCausalLM --enable-huggingface-models hf-internal-testing/tiny-random-MistralForCausalLM --suite v1 --max-eval-instances 10 --openvino
yifanmai added the bug and models labels Oct 16, 2024
Collaborator Author

Hi @NoushNabi, have you had time to take a look at the segfault issue in the OpenVino codepath?

Collaborator Author

yifanmai commented Nov 6, 2024

Hi @NoushNabi, let me know if you have time to look at the segfault issue in the OpenVino codepath. If not, my plan is to remove the OpenVino codepath temporarily until the issue is resolved upstream.
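
For reference, here is a minimal sketch of one possible stopgap. It assumes the "Infer Request is busy" errors come from several worker threads calling the same model object at once, and it serializes those calls behind a lock. `FakeInferRequest`, `SerializedModel`, and `run_all` are hypothetical names for illustration only; they are not HELM, Optimum Intel, or OpenVINO APIs, and this is not necessarily how the issue was eventually fixed.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class FakeInferRequest:
    """Hypothetical stand-in for an inference request that is not reentrant."""

    def __init__(self):
        self._busy = False
        self._state_lock = threading.Lock()

    def infer(self, prompt):
        # Reject overlapping calls, mimicking the error seen in the logs.
        with self._state_lock:
            if self._busy:
                raise RuntimeError("Infer Request is busy")
            self._busy = True
        try:
            time.sleep(0.01)  # simulate inference work
            return prompt.upper()
        finally:
            with self._state_lock:
                self._busy = False


class SerializedModel:
    """Wraps a shared request so only one thread runs inference at a time."""

    def __init__(self, request):
        self._request = request
        self._lock = threading.Lock()

    def generate(self, prompt):
        with self._lock:
            return self._request.infer(prompt)


def run_all(model, prompts, num_threads=4):
    # Mirrors "Parallelizing computation on 10 items over 4 threads".
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(model.generate, prompts))


if __name__ == "__main__":
    model = SerializedModel(FakeInferRequest())
    print(run_all(model, [f"item {i}" for i in range(10)]))
```

The lock trades throughput for safety: requests to the wrapped model run one at a time, so it would only make sense as a temporary measure while the underlying concurrency issue is investigated.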
