
[Bug]: OpenVINO will only use half of my available system threads on 2 socket configuration #27581

Open · 3 tasks done
BlohoJo opened this issue Nov 16, 2024 · 14 comments
Labels: bug (Something isn't working), support_request
BlohoJo commented Nov 16, 2024

OpenVINO Version

2024.0.0 - Current

Operating System

Windows 10 Professional 2004 [Version 10.0.19041.1415]

Device used for inference

CPU (Intel Xeon E-2288G CPU [Coffee Lake / 9th Generation Core])

Framework

Any

Model used

Any

Issue description

This is related to my previous issue here:

#22678 (comment)

I use the OpenVINO backend of Topaz Video AI, and its current version uses only half of my total CPU threads. It ships with OpenVINO 2024.3.0, but the problem I'm describing is specific to OpenVINO itself: it started with 2024.0.0 and has history going back to at least 2023.0.1.

The system is a VM (VPS) running on a Hypervisor configured with two CPU sockets, each with 8 cores and 8 threads. The actual hardware and VM CPU is the Intel Xeon E-2288G CPU (Coffee Lake / 9th Generation Core).

The Intel Xeon E-2288G is listed as a supported processor for Windows 10 2004.

Intel Processor Diagnostic Tool v4.1.9.41 passes. TESTRESULTS.TXT contents: https://pastebin.com/zeWjbVur

CPU-Z Report: https://pastebin.com/uG071g8r

I used benchmark_app to track down when the problem started in OpenVINO. I believe the problem is that OpenVINO "thinks" the total number of available threads on my system is only 8, when it is actually 16. I suspect it currently isn't able to interpret and handle more than one CPU socket.
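The suspected mismatch can be made concrete with a minimal check (not part of the original report; a sketch using only the Python standard library) that compares the OS-visible logical CPU count against whatever thread count the runtime reports:

```python
import os

def runtime_sees_fewer_threads(runtime_reported: int) -> bool:
    """Return True if the runtime reports fewer threads than the OS exposes.

    Hypothetical helper: on the system described above, the OS exposes
    16 logical CPUs (2 sockets x 8 threads) while benchmark_app prints
    INFERENCE_NUM_THREADS: 8, so this check would flag the mismatch.
    """
    os_visible = os.cpu_count() or 1
    return runtime_reported < os_visible
```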

Here is what I discovered:

In OpenVINO 2023.0.1, the system uses all available 16 threads (output shows INFERENCE_NUM_THREADS: 16) when the following command is used: benchmark_app -api sync -hint latency -t 10 -m "C:\OpenVINO\intel\face-detection-0206\FP16\face-detection-0206.xml". It hangs and does nothing if -api async is used.

In OpenVINO 2023.1.0 & 2023.2.0, OpenVINO doesn't work at all; it crashes (as noted in the above Github issue #22678).

In OpenVINO 2023.3.0, both -api sync and -api async now work, but -api sync runs using only 8 threads (output shows INFERENCE_NUM_THREADS: 8; using -hint none -nthreads 16 doesn't help and output still shows INFERENCE_NUM_THREADS: 8). -api async works correctly and uses all 16 threads.

Starting with OpenVINO 2024.0.0, both -api sync and -api async are only able to use 8 cores, and there doesn't seem to be any way to get OpenVINO to use all 16.

Step-by-step reproduction

To reproduce, someone will need a Windows 10 system with the Intel Xeon E-2288G CPU configured as two sockets with 8 cores and 8 threads per socket. Alternatively, this problem may manifest on any system configured with two CPU sockets and a similar Intel Xeon CPU model that has an equal number of cores and threads per socket.

Install Python 3.9

Create a directory for the virtual environment, e.g. C:\OpenVINO, and open a command prompt in this directory.

python -m venv openvino_env
openvino_env\Scripts\activate
python -m pip install --upgrade pip
python -m pip install openvino-dev==2023.3.0

omz_downloader --all
omz_converter --all

benchmark_app -api sync -hint latency -t 10 -m "C:\OpenVINO\intel\face-detection-0206\FP16\face-detection-0206.xml"

benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\intel\face-detection-0206\FP16\face-detection-0206.xml"

Relevant log output

(openvino_env) C:\OpenVINO\test>benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2023.3.0-13775-ceeafaf64f3-releases/2023/3
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2023.3.0-13775-ceeafaf64f3-releases/2023/3
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 109.38 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 1203.11 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 2
[ INFO ]   NUM_STREAMS: 2
[ INFO ]   AFFINITY: Affinity.NONE
[ INFO ]   INFERENCE_NUM_THREADS: 16
[ INFO ]   PERF_COUNT: NO
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   ENABLE_HYPER_THREADING: False
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[ INFO ]   CPU_DENORMALS_OPTIMIZATION: False
[ INFO ]   CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 2 inference requests, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 619.90 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            26 iterations
[ INFO ] Duration:         11093.38 ms
[ INFO ] Latency:
[ INFO ]    Median:        844.79 ms
[ INFO ]    Average:       853.04 ms
[ INFO ]    Min:           827.79 ms
[ INFO ]    Max:           945.26 ms
[ INFO ] Throughput:   2.34 FPS

(openvino_env) C:\OpenVINO\test>
(openvino_env) C:\OpenVINO\test>benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.4.0-16579-c3152d32c9c-releases/2024/4
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 109.35 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 937.45 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   INFERENCE_NUM_THREADS: 8
[ INFO ]   PERF_COUNT: NO
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   MODEL_DISTRIBUTION_POLICY: set()
[ INFO ]   ENABLE_HYPER_THREADING: False
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[ INFO ]   CPU_DENORMALS_OPTIMIZATION: False
[ INFO ]   LOG_LEVEL: Level.NO
[ INFO ]   CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[ INFO ]   DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ]   KV_CACHE_PRECISION: <Type: 'float16'>
[ INFO ]   AFFINITY: Affinity.NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 619.75 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            17 iterations
[ INFO ] Duration:         10680.20 ms
[ INFO ] Latency:
[ INFO ]    Median:        629.98 ms
[ INFO ]    Average:       628.07 ms
[ INFO ]    Min:           570.90 ms
[ INFO ]    Max:           702.74 ms
[ INFO ] Throughput:   1.59 FPS

(openvino_env) C:\OpenVINO\test>

Relevant log sections:

2023.3.0:

[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 2
[ INFO ]   NUM_STREAMS: 2
[ INFO ]   INFERENCE_NUM_THREADS: 16

2024.4.0:

[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   INFERENCE_NUM_THREADS: 8

Additional lines that appear only in the 2024.4.0 output:

[ INFO ]   MODEL_DISTRIBUTION_POLICY: set()
[ INFO ]   LOG_LEVEL: Level.NO
[ INFO ]   DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ]   KV_CACHE_PRECISION: <Type: 'float16'>

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
BlohoJo added the bug (Something isn't working) and support_request labels on Nov 16, 2024
@wangleis (Contributor)

Hi @BlohoJo, the default latency behavior has changed, as the logs show:

  1. 2023.3 enabled 2 streams with 16 threads in total (1 stream with 8 threads per socket).
  2. 2024.4 enables only 1 stream with 8 threads, on one socket.

If you want to use 1 stream with 16 threads in 2024.4, please try -hint none -nstreams 1 -nthreads 16.
If you want to use 2 streams with 16 threads in 2024.4, as in 2023.3, please try -hint none -nstreams 2 -nthreads 16.
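For applications that embed OpenVINO rather than calling benchmark_app, the same override can be expressed as a compile-time config. This is a sketch, not from the thread: it assumes OpenVINO's standard CPU property names, and the dict would be passed as the third argument to Core.compile_model(model, "CPU", config).

```python
# Sketch of the equivalent of `-hint none -nstreams 2 -nthreads 16`
# as a config dict (property names assumed per OpenVINO's CPU device
# properties; verify against your installed version).
config = {
    "NUM_STREAMS": "2",             # like -nstreams 2
    "INFERENCE_NUM_THREADS": "16",  # like -nthreads 16
    # no PERFORMANCE_HINT key: leaving the hint unset plays the role
    # of -hint none
}
```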

@BlohoJo (Author)

BlohoJo commented Nov 18, 2024

Is there any way to automate the generation of those variables on different systems, so that applications that use OpenVINO (like Topaz Video AI) can automatically use all available cores & threads on the system in which it happens to be running? 😕

@peterchen-intel (Contributor)

peterchen-intel commented Nov 20, 2024

@BlohoJo -hint throughput will use all the available CPU cores on the system,
or -hint none -nstreams 1 -nthreads <NUM of CPU cores>
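The <NUM of CPU cores> placeholder can be filled in programmatically, which also speaks to the automation question above. A minimal sketch (not from the thread), assuming only the Python standard library and that the caller knows the socket count, since the runtime's own socket detection is exactly what's at issue here:

```python
import os

def latency_flags(sockets: int = 1) -> str:
    """Hypothetical helper: build benchmark_app-style flags that use
    every logical CPU the OS reports. The socket count must be supplied
    by the caller (e.g. from the hypervisor configuration)."""
    nthreads = os.cpu_count() or 1
    return f"-hint none -nstreams {sockets} -nthreads {nthreads}"
```

On the 2-socket, 16-thread system above, latency_flags(2) would yield "-hint none -nstreams 2 -nthreads 16".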

@BlohoJo (Author)

BlohoJo commented Nov 20, 2024

-hint throughput crashes Python on step 7 (model load) on my system (exception 0xc0000005, access violation).

Faulting application name: python.exe, version: 3.9.18150.1013, time stamp: 0x64f598e1
Faulting module name: openvino_intel_cpu_plugin.dll, version: 2024.4.0.16579, time stamp: 0x66d9c0bf
Exception code: 0xc0000005
Fault offset: 0x0000000000038d24
Faulting process id: 0x3130
Faulting application start time: 0x01db3b5cde0d4883
Faulting application path: C:\Program Files\Python39\python.exe
Faulting module path: C:\Program Files\Python39\lib\site-packages\openvino\libs\openvino_intel_cpu_plugin.dll
Report Id: 62be2ad1-47d2-4936-b694-5818b0d0a0b1
Faulting package full name: 
Faulting package-relative application ID: 

-hint none -nstreams 1 -nthreads 16 --> Only uses 8 cores.

-hint none -nstreams 2 -nthreads 16 --> Crashes Python on step 7 model load (exception 0xc0000094, divide by zero).

Faulting application name: python.exe, version: 3.9.18150.1013, time stamp: 0x64f598e1
Faulting module name: openvino_intel_cpu_plugin.dll, version: 2024.4.0.16579, time stamp: 0x66d9c0bf
Exception code: 0xc0000094
Fault offset: 0x0000000000037d03
Faulting process id: 0x28f8
Faulting application start time: 0x01db3b5db71a138d
Faulting application path: C:\Program Files\Python39\python.exe
Faulting module path: C:\Program Files\Python39\lib\site-packages\openvino\libs\openvino_intel_cpu_plugin.dll
Report Id: 37ae84e2-eec8-4795-8466-8f1ddd65ff39
Faulting package full name: 
Faulting package-relative application ID: 

-api sync and -api async don't make a difference for any of the above.

Dead in the water it seems for using all cores in OpenVINO 24.0.0 and above. 😞

@BlohoJo (Author)

BlohoJo commented Nov 20, 2024

I tried to get some additional info by running x64dbg (it sometimes shows something useful), but it won't work. The process terminates before benchmark_app can do anything.

Batch file:

@echo off
call C:\OpenVINO\Test\openvino_env\Scripts\activate.bat
start "" "C:\Program Files\x64dbg\x64\x64dbg.exe" -run -exe "C:\OpenVINO\Test\benchmark_app.exe" -arg "-api async -hint latency -t 10 -m C:\OpenVINO\Test\intel\face-detection-0206\FP16\face-detection-0206.xml"
xAnalyzer 2.5.6 Plugin by ThunderCls 2021
Extended analysis for static code
-> For latest release, issues, etc....
-> For help type command "xanal help"
-> code: http://github.com/ThunderCls/xAnalyzer
-> blog: http://reversec0de.wordpress.com

Initializing wait objects...
Initializing debugger...
Initializing debugger functions...
Setting JSON memory management functions...
Getting directory information...
Start file read thread...
Retrieving syscall indices...
Symbol Path: C:\Program Files\x64dbg\x64\symbols
Allocating message stack...
Initializing global script variables...
Registering debugger commands...
Registering GUI command handler...
Registering expression functions...
Registering format functions...
Registering Script DLL command handler...
Starting command loop...
Initialization successful!
Loading plugins...
[pluginload] xAnalyzer
Syscall indices loaded!
Error codes database loaded!
Exception codes database loaded!
NTSTATUS codes database loaded!
Windows constant database loaded!
Reading notes file...
File read thread finished!
[PLUGIN, xAnalyzer] Command "xanal" registered!
[PLUGIN, xAnalyzer] Command "xanalremove" registered!
[PLUGIN] xAnalyzer v2 Loaded!
Handling command line...
  "C:\Program Files\x64dbg\x64\x64dbg.exe"  -run -exe "C:\OpenVINO\Test\benchmark_app.exe" -arg "-api async -hint latency -t 10 -m C:\OpenVINO\Test\intel\face-detection-0206\FP16\face-detection-0206.xml"
Debugging: C:\OpenVINO\test\openvino_env\Scripts\benchmark_app.exe
Database file: C:\Program Files\x64dbg\x64\db\benchmark_app.exe.dd64
Loading commandline...
Loading database from C:\Program Files\x64dbg\x64\db\benchmark_app.exe.dd64 31ms
Process Started: [00007FF61A950000](x64dbg://localhost/address64#00007FF61A950000) C:\OpenVINO\test\openvino_env\Scripts\benchmark_app.exe
  "C:\OpenVINO\test\openvino_env\Scripts\benchmark_app.exe" -api async -hint latency -t 10 -m "C:\OpenVINO\Test\intel\face-detection-0206\FP16\face-detection-0206.xml"
  argv[0]: C:\OpenVINO\test\openvino_env\Scripts\benchmark_app.exe
  argv[1]: -api
  argv[2]: async
  argv[3]: -hint
  argv[4]: latency
  argv[5]: -t
  argv[6]: 10
  argv[7]: -m
  argv[8]: C:\OpenVINO\Test\intel\face-detection-0206\FP16\face-detection-0206.xml
Breakpoint at [00007FF61A95427C](x64dbg://localhost/address64#00007FF61A95427C) (entry breakpoint) set!
DLL Loaded: [00007FFFBC850000](x64dbg://localhost/address64#00007FFFBC850000) C:\Windows\System32\ntdll.dll
DLL Loaded: [00007FFFBBE30000](x64dbg://localhost/address64#00007FFFBBE30000) C:\Windows\System32\kernel32.dll
DLL Loaded: [00007FFFB9F60000](x64dbg://localhost/address64#00007FFFB9F60000) C:\Windows\System32\KernelBase.dll
DLL Loaded: [00007FFFB7020000](x64dbg://localhost/address64#00007FFFB7020000) C:\Windows\System32\apphelp.dll
DLL Loaded: [00007FFFBC250000](x64dbg://localhost/address64#00007FFFBC250000) C:\Windows\System32\shlwapi.dll
DLL Loaded: [00007FFFBBC20000](x64dbg://localhost/address64#00007FFFBBC20000) C:\Windows\System32\msvcrt.dll
Thread 8608 created, Entry: ntdll.[00007FFFBC8A2AD0](x64dbg://localhost/address64#00007FFFBC8A2AD0), Parameter: [0000000001138920](x64dbg://localhost/address64#0000000001138920)
Thread 8420 created, Entry: ntdll.[00007FFFBC8A2AD0](x64dbg://localhost/address64#00007FFFBC8A2AD0), Parameter: [0000000001138920](x64dbg://localhost/address64#0000000001138920)
System breakpoint reached!
[xAnalyzer]: Analysis retrieved from data base 
INT3 breakpoint "entry breakpoint" at <benchmark_app.OptionalHeader.AddressOfEntryPoint> ([00007FF61A95427C](x64dbg://localhost/address64#00007FF61A95427C))!
Thread 8420 exit
Thread 8608 exit
Process stopped with exit code 0x1 (1)
Saving database to C:\Program Files\x64dbg\x64\db\benchmark_app.exe.dd64 16ms
Debugging stopped!

@BlohoJo (Author)

BlohoJo commented Nov 20, 2024

The oldest OpenVINO version that works with any of the Open Model Zoo models is 2023.0.1. As mentioned above, that version works in Topaz Video AI with all 16 cores.

With benchmark_app, it crashes as above with -hint throughput.

With -api async, it hangs on step 10.

(openvino_env) C:\OpenVINO\test>benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2023.0.1-11005-fa1c41994f3-releases/2023/0
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2023.0.1-11005-fa1c41994f3-releases/2023/0
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 120.90 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 1093.97 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 0
[ INFO ]   NUM_STREAMS: 0
[ INFO ]   AFFINITY: Affinity.NONE
[ INFO ]   INFERENCE_NUM_THREADS: 0
[ INFO ]   PERF_COUNT: False
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: PerformanceMode.LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   ENABLE_HYPER_THREADING: True
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 0 inference requests, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
^C
(openvino_env) C:\OpenVINO\test>

With -api sync, it runs using all 16 cores:

(openvino_env) C:\OpenVINO\test>benchmark_app -api sync -hint latency -t 10 -m "
C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2023.0.1-11005-fa1c41994f3-releases/2023/0
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2023.0.1-11005-fa1c41994f3-releases/2023/0
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 115.96 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 1102.89 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 0
[ INFO ]   NUM_STREAMS: 0
[ INFO ]   AFFINITY: Affinity.NONE
[ INFO ]   INFERENCE_NUM_THREADS: 0
[ INFO ]   PERF_COUNT: False
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: PerformanceMode.LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   ENABLE_HYPER_THREADING: True
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference synchronously, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 452.58 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            22 iterations
[ INFO ] Duration:         10150.12 ms
[ INFO ] Latency:
[ INFO ]    Median:        450.61 ms
[ INFO ]    Average:       461.57 ms
[ INFO ]    Min:           445.08 ms
[ INFO ]    Max:           647.56 ms
[ INFO ] Throughput:   2.22 FPS

(openvino_env) C:\OpenVINO\test>

What's interesting is that in 2023.0.1, it shows 0 for INFERENCE_NUM_THREADS, NUM_STREAMS, and other parameters, both for -api sync and -api async.

I'm not sure if any of that info is helpful or not. With OpenVINO versions 2024.0.0 and above (including the new 2024.5.0), it's as if it only sees the CPU in socket #1.

@wangleis (Contributor)

@BlohoJo Could you run the attached test_info.zip on your Windows platform and share the log with us?

@BlohoJo (Author)

BlohoJo commented Nov 28, 2024

Thanks very much for the reply and the help! 😄

*********test data*******************

"0300000030000000000000000000000000000000000000000000000000000100ff0000000000000
00000000000000000000000003000000000000000000000000000000000000000000000000000010
00100000000000000000000000000000002000000380000000108400000800000020000000000000
000000000000000000000000000000000ff000000000000000000000000000000020000003800000
00108400000800000010000000000000000000000000000000000000000000000ff0000000000000
00000000000000000020000003800000002044000000004000000000000000000000000000000000
00000000000000000ff0000000000000000000000000000000200000038000000031040000000000
1000000000000000000000000000000000000000000000000ff00000000000000000000000000000
00000000030000000000000000000000000000000000000000000000000000100020000000000000
00000000000000000000000003000000000000000000000000000000000000000000000000000010
00400000000000000000000000000000000000000300000000000000000000000000000000000000
00000000000000100080000000000000000000000000000000000000030000000000000000000000
00000000000000000000000000000010010000000000000000000000000000000000000003000000
00000000000000000000000000000000000000000000001002000000000000000000000000000000
00000000030000000000000000000000000000000000000000000000000000100400000000000000
00000000000000000000000003000000000000000000000000000000000000000000000000000010
08000000000000000000000000000000003000000300000000000000000000000000000000000000
0000000000000010000ff00000000000000000000000000000000000030000000000000000000000
00000000000000000000000000000010000010000000000000000000000000000020000003800000
0010840000080000002000000000000000000000000000000000000000000000000ff00000000000
00000000000000000020000003800000001084000008000000100000000000000000000000000000
0000000000000000000ff00000000000000000000000000000200000038000000020440000000040
000000000000000000000000000000000000000000000000000ff000000000000000000000000000
00200000038000000031040000000000100000000000000000000000000000000000000000000000
000ff000000000000000000000000000000000000300000000000000000000000000000000000000
00000000000000100000200000000000000000000000000000000000030000000000000000000000
00000000000000000000000000000010000040000000000000000000000000000000000003000000
00000000000000000000000000000000000000000000001000008000000000000000000000000000
00000000030000000000000000000000000000000000000000000000000000100001000000000000
00000000000000000000000003000000000000000000000000000000000000000000000000000010
00020000000000000000000000000000000000000300000000000000000000000000000000000000
00000000000000100004000000000000000000000000000000000000030000000000000000000000
00000000000000000000000000000010000800000000000000000000000000000010000003000000
0000000000000000000000000000000000000000000000000ffff000000000000000000000000000
00400000050000000010001000000000000000000000000000000000000000000101000000000000
00000000000000000000000000000000000000000000000000000000000000000ffff00000000000
0"

*********test data*******************


@BlohoJo (Author)

BlohoJo commented Dec 3, 2024

Is the "Merging is blocked" status likely to change? 😟

(I'm not entirely familiar with how these things go on the OpenVINO repo so I apologize if the answer to this question is obvious.)

github-merge-queue bot pushed a commit that referenced this issue Dec 16, 2024

### Details:
- *support new Windows platform which is a VM (VPS) running on a Hypervisor*
- *using one stream on two sockets*

### Tickets:
- *issues-27581 (#27581)*
@wangleis (Contributor)

@BlohoJo The PR has been merged. Could you please try the master branch?

@BlohoJo (Author)

BlohoJo commented Dec 21, 2024

Sorry, things have gotten extremely busy and stressful for me lately! 🥴

I tried the new OpenVINO 2024.6.0. (Is that what I should be trying at this point? 😕)

Unfortunately, it didn't work.

It no longer crashes using -api async -hint latency, so that's a good change! 🙂

But, apart from that, the rest is still the same. It still only uses half of my CPU cores (one 8 core socket instead of both 8 core sockets) using either -api async -hint latency or -api sync -hint latency.

And, it still crashes using -hint throughput or -hint none -nstreams 2 -nthreads 16. It also still uses only half of my available 16 cores (again, 8 cores x 2 CPU sockets) using -hint none -nstreams 1 -nthreads 16.

I know it is possible for it to use all 16 cores, because in OpenVINO 2023.0.1 it does use all 16 cores with -api async -hint latency.

Commands tried (below, output is in order listed):
benchmark_app -api sync -hint latency -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
Only uses 8 cores.

benchmark_app -api sync -hint throughput -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
benchmark_app -api async -hint throughput -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
Crashes Python on step 7 (model load) on my system (exception 0xc0000005, access violation).

benchmark_app -api sync -hint none -nstreams 1 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
benchmark_app -api async -hint none -nstreams 1 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
Only uses 8 cores.

benchmark_app -api sync -hint none -nstreams 2 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
benchmark_app -api async -hint none -nstreams 2 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
Crashes Python on step 7 (model load) on my system (exception 0xc0000094, divide by zero).
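
For context, exception 0xc0000094 is an integer divide-by-zero. One way such a fault can arise (a purely illustrative sketch, not the actual plugin code) is when per-stream thread counts are derived by dividing by a detected quantity that comes back as zero on a mis-probed topology:

```python
def threads_per_stream(num_threads: int, streams_on_socket: int) -> int:
    """Illustrative only: splitting threads across the streams of one socket.

    If a topology probe reports zero usable streams/processors for a socket
    (e.g. a second socket it failed to enumerate), the division faults:
    0xc0000094 in native code, ZeroDivisionError here.
    """
    return num_threads // streams_on_socket

print(threads_per_stream(16, 2))  # 8 when the probe is correct
```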

(openvino_env) C:\OpenVINO>benchmark_app -api sync -hint latency -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 114.06 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 952.27 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   INFERENCE_NUM_THREADS: 8
[ INFO ]   PERF_COUNT: NO
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   MODEL_DISTRIBUTION_POLICY: set()
[ INFO ]   ENABLE_HYPER_THREADING: False
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[ INFO ]   CPU_DENORMALS_OPTIMIZATION: False
[ INFO ]   LOG_LEVEL: Level.NO
[ INFO ]   CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[ INFO ]   DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ]   KV_CACHE_PRECISION: <Type: 'uint8_t'>
[ INFO ]   AFFINITY: Affinity.NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference synchronously, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 718.90 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            16 iterations
[ INFO ] Duration:         10021.07 ms
[ INFO ] Latency:
[ INFO ]    Median:        616.14 ms
[ INFO ]    Average:       626.29 ms
[ INFO ]    Min:           576.80 ms
[ INFO ]    Max:           718.93 ms
[ INFO ] Throughput:   1.60 FPS
(openvino_env) C:\OpenVINO>benchmark_app -api async -hint latency -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 100.07 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 923.88 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   INFERENCE_NUM_THREADS: 8
[ INFO ]   PERF_COUNT: NO
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   MODEL_DISTRIBUTION_POLICY: set()
[ INFO ]   ENABLE_HYPER_THREADING: False
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[ INFO ]   CPU_DENORMALS_OPTIMIZATION: False
[ INFO ]   LOG_LEVEL: Level.NO
[ INFO ]   CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[ INFO ]   DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ]   KV_CACHE_PRECISION: <Type: 'uint8_t'>
[ INFO ]   AFFINITY: Affinity.NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 617.26 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            18 iterations
[ INFO ] Duration:         11165.38 ms
[ INFO ] Latency:
[ INFO ]    Median:        620.57 ms
[ INFO ]    Average:       620.23 ms
[ INFO ]    Min:           535.83 ms
[ INFO ]    Max:           703.90 ms
[ INFO ] Throughput:   1.61 FPS
(openvino_env) C:\OpenVINO>benchmark_app -api sync -hint throughput -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 111.56 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device

------------------------------

Faulting application name: python.exe, version: 3.9.18150.1013, time stamp: 0x64f598e1
Faulting module name: openvino_intel_cpu_plugin.dll, version: 2024.6.0.17404, time stamp: 0x675afe5a
Exception code: 0xc0000005
Fault offset: 0x0000000000039411
Faulting process id: 0xb5c
Faulting application start time: 0x01db53b038bce5ec
Faulting application path: C:\Program Files\Python39\python.exe
Faulting module path: C:\OpenVINO\openvino_env\lib\site-packages\openvino\libs\openvino_intel_cpu_plugin.dll
Report Id: ad3f0c36-5b10-4367-b7aa-610f69f7cef3
Faulting package full name: 
Faulting package-relative application ID: 
(openvino_env) C:\OpenVINO>benchmark_app -api async -hint throughput -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 108.00 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device

------------------------------

Faulting application name: python.exe, version: 3.9.18150.1013, time stamp: 0x64f598e1
Faulting module name: openvino_intel_cpu_plugin.dll, version: 2024.6.0.17404, time stamp: 0x675afe5a
Exception code: 0xc0000005
Fault offset: 0x0000000000039411
Faulting process id: 0x2948
Faulting application start time: 0x01db53b07d00fa60
Faulting application path: C:\Program Files\Python39\python.exe
Faulting module path: C:\OpenVINO\openvino_env\lib\site-packages\openvino\libs\openvino_intel_cpu_plugin.dll
Report Id: ddec23e7-3b47-4aae-80f7-1a38f39041fa
Faulting package full name: 
Faulting package-relative application ID: 
(openvino_env) C:\OpenVINO>benchmark_app -api sync -hint none -nstreams 1 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 112.00 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 917.12 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   INFERENCE_NUM_THREADS: 8
[ INFO ]   PERF_COUNT: NO
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   MODEL_DISTRIBUTION_POLICY: set()
[ INFO ]   ENABLE_HYPER_THREADING: False
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[ INFO ]   CPU_DENORMALS_OPTIMIZATION: False
[ INFO ]   LOG_LEVEL: Level.NO
[ INFO ]   CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[ INFO ]   DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ]   KV_CACHE_PRECISION: <Type: 'uint8_t'>
[ INFO ]   AFFINITY: Affinity.NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference synchronously, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 630.01 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            17 iterations
[ INFO ] Duration:         10519.89 ms
[ INFO ] Latency:
[ INFO ]    Median:        607.83 ms
[ INFO ]    Average:       618.73 ms
[ INFO ]    Min:           559.84 ms
[ INFO ]    Max:           705.97 ms
[ INFO ] Throughput:   1.62 FPS
(openvino_env) C:\OpenVINO>benchmark_app -api async -hint none -nstreams 1 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 104.00 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device
[ INFO ] Compile model took 947.48 ms
[Step 8/11] Querying optimal runtime parameters
[ INFO ] Model:
[ INFO ]   NETWORK_NAME: torch-jit-export
[ INFO ]   OPTIMAL_NUMBER_OF_INFER_REQUESTS: 1
[ INFO ]   NUM_STREAMS: 1
[ INFO ]   INFERENCE_NUM_THREADS: 8
[ INFO ]   PERF_COUNT: NO
[ INFO ]   INFERENCE_PRECISION_HINT: <Type: 'float32'>
[ INFO ]   PERFORMANCE_HINT: LATENCY
[ INFO ]   EXECUTION_MODE_HINT: ExecutionMode.PERFORMANCE
[ INFO ]   PERFORMANCE_HINT_NUM_REQUESTS: 0
[ INFO ]   ENABLE_CPU_PINNING: False
[ INFO ]   SCHEDULING_CORE_TYPE: SchedulingCoreType.ANY_CORE
[ INFO ]   MODEL_DISTRIBUTION_POLICY: set()
[ INFO ]   ENABLE_HYPER_THREADING: False
[ INFO ]   EXECUTION_DEVICES: ['CPU']
[ INFO ]   CPU_DENORMALS_OPTIMIZATION: False
[ INFO ]   LOG_LEVEL: Level.NO
[ INFO ]   CPU_SPARSE_WEIGHTS_DECOMPRESSION_RATE: 1.0
[ INFO ]   DYNAMIC_QUANTIZATION_GROUP_SIZE: 32
[ INFO ]   KV_CACHE_PRECISION: <Type: 'uint8_t'>
[ INFO ]   AFFINITY: Affinity.NONE
[Step 9/11] Creating infer requests and preparing input tensors
[ WARNING ] No input files were given for input 'image'!. This input will be filled with random values!
[ INFO ] Fill input 'image' with random values
[Step 10/11] Measuring performance (Start inference asynchronously, 1 inference requests using 1 streams for CPU, limits: 10000 ms duration)
[ INFO ] Benchmarking in inference only mode (inputs filling are not included in measurement loop).
[ INFO ] First inference took 648.89 ms
[Step 11/11] Dumping statistics report
[ INFO ] Execution Devices:['CPU']
[ INFO ] Count:            17 iterations
[ INFO ] Duration:         10670.66 ms
[ INFO ] Latency:
[ INFO ]    Median:        612.50 ms
[ INFO ]    Average:       627.72 ms
[ INFO ]    Min:           591.16 ms
[ INFO ]    Max:           722.21 ms
[ INFO ] Throughput:   1.59 FPS
(openvino_env) C:\OpenVINO>benchmark_app -api sync -hint none -nstreams 2 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 105.03 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device

------------------------------

Faulting application name: python.exe, version: 3.9.18150.1013, time stamp: 0x64f598e1
Faulting module name: openvino_intel_cpu_plugin.dll, version: 2024.6.0.17404, time stamp: 0x675afe5a
Exception code: 0xc0000094
Fault offset: 0x0000000000038490
Faulting process id: 0x2d68
Faulting application start time: 0x01db53b0b4cbec91
Faulting application path: C:\Program Files\Python39\python.exe
Faulting module path: C:\OpenVINO\openvino_env\lib\site-packages\openvino\libs\openvino_intel_cpu_plugin.dll
Report Id: 1f3132bf-76c1-44c0-ae47-49626090c1aa
Faulting package full name: 
Faulting package-relative application ID: 
(openvino_env) C:\OpenVINO>benchmark_app -api async -hint none -nstreams 2 -nthreads 16 -t 10 -m "C:\OpenVINO\test\intel\face-detection-0206\FP16\face-detection-0206.xml"
[Step 1/11] Parsing and validating input arguments
[ INFO ] Parsing input parameters
[Step 2/11] Loading OpenVINO Runtime
[ INFO ] OpenVINO:
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ] Device info:
[ INFO ] CPU
[ INFO ] Build ................................. 2024.6.0-17404-4c0f47d2335-releases/2024/6
[ INFO ]
[ INFO ]
[Step 3/11] Setting device configuration
[Step 4/11] Reading model files
[ INFO ] Loading model files
[ INFO ] Read model took 104.00 ms
[ INFO ] Original model I/O parameters:
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : f32 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 5/11] Resizing model to match image sizes and given batch
[ INFO ] Model batch size: 1
[Step 6/11] Configuring input of the model
[ INFO ] Model inputs:
[ INFO ]     image (node: image) : u8 / [N,C,H,W] / [1,3,640,640]
[ INFO ] Model outputs:
[ INFO ]     boxes (node: boxes) : f32 / [...] / [..750,5]
[ INFO ]     labels (node: labels) : i64 / [...] / [..750]
[Step 7/11] Loading the model to the device

------------------------------

Faulting application name: python.exe, version: 3.9.18150.1013, time stamp: 0x64f598e1
Faulting module name: openvino_intel_cpu_plugin.dll, version: 2024.6.0.17404, time stamp: 0x675afe5a
Exception code: 0xc0000094
Fault offset: 0x0000000000038490
Faulting process id: 0x2930
Faulting application start time: 0x01db53b0ec2ee46f
Faulting application path: C:\Program Files\Python39\python.exe
Faulting module path: C:\OpenVINO\openvino_env\lib\site-packages\openvino\libs\openvino_intel_cpu_plugin.dll
Report Id: b617c895-36e7-4b88-bbd6-1a667179f044
Faulting package full name: 
Faulting package-relative application ID: 

11happy pushed a commit to 11happy/openvino that referenced this issue Dec 23, 2024
### Details:
- *support new windows platform which is a VM (VPS) running on a
Hypervisor*
 - *using one stream on two sockets*

### Tickets:
- *[issues-27581](openvinotoolkit#27581)*
@wangleis
Contributor

@BlohoJo Please try master branch. The fix is not part of OpenVINO 2024.6.0.

@BlohoJo
Author

BlohoJo commented Dec 24, 2024

I greatly apologize, but compiling OpenVINO is beyond my skill set and capability. I can get as far as opening Git Bash and running git clone https://github.com/openvinotoolkit/openvino.git in a directory, which just clones the master branch from GitHub. But compiling it means installing and configuring CMake, Microsoft Visual Studio 2019, Intel Graphics Drivers, etc.

If someone can build the master branch for me and link me to an archive (which has the contents of Lib\site-packages\openvino), I can definitely try it.

Unless there is a much simpler or more automated command for compiling the master branch that I'm missing.

Again I apologize for my lack of knowledge and skill. 🙁
