update doc for using one socket with latency hint instead of one numa node #28227

Merged
@@ -63,19 +63,19 @@ the model precision and the ratio of P-cores and E-cores.

Then the default settings for low-level performance properties on Windows and Linux are as follows:

-+--------------------------------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+
-| Property                             | Windows                                                               | Linux                                                                 |
-+======================================+=======================================================================+=======================================================================+
-| ``ov::num_streams``                  | 1                                                                     | 1                                                                     |
-+--------------------------------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+
-| ``ov::inference_num_threads``        | is equal to the number of P-cores or P-cores+E-cores on one numa node | is equal to the number of P-cores or P-cores+E-cores on one numa node |
-+--------------------------------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+
-| ``ov::hint::scheduling_core_type``   | :ref:`Core Type Table of Latency Hint <core_type_latency>`            | :ref:`Core Type Table of Latency Hint <core_type_latency>`            |
-+--------------------------------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+
-| ``ov::hint::enable_hyper_threading`` | No                                                                    | No                                                                    |
-+--------------------------------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+
-| ``ov::hint::enable_cpu_pinning``     | No / Not Supported                                                    | Yes except using P-cores and E-cores together                         |
-+--------------------------------------+-----------------------------------------------------------------------+-----------------------------------------------------------------------+
++--------------------------------------+--------------------------------------------------------------------+--------------------------------------------------------------------+
+| Property                             | Windows                                                            | Linux                                                              |
++======================================+====================================================================+====================================================================+
+| ``ov::num_streams``                  | 1                                                                  | 1                                                                  |
++--------------------------------------+--------------------------------------------------------------------+--------------------------------------------------------------------+
+| ``ov::inference_num_threads``        | is equal to the number of P-cores or P-cores+E-cores on one socket | is equal to the number of P-cores or P-cores+E-cores on one socket |
++--------------------------------------+--------------------------------------------------------------------+--------------------------------------------------------------------+
+| ``ov::hint::scheduling_core_type``   | :ref:`Core Type Table of Latency Hint <core_type_latency>`         | :ref:`Core Type Table of Latency Hint <core_type_latency>`         |
++--------------------------------------+--------------------------------------------------------------------+--------------------------------------------------------------------+
+| ``ov::hint::enable_hyper_threading`` | No                                                                 | No                                                                 |
++--------------------------------------+--------------------------------------------------------------------+--------------------------------------------------------------------+
+| ``ov::hint::enable_cpu_pinning``     | No / Not Supported                                                 | Yes except using P-cores and E-cores together                      |
++--------------------------------------+--------------------------------------------------------------------+--------------------------------------------------------------------+
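
As a rough illustration of what "on one socket" means for the ``ov::inference_num_threads`` default, the sketch below sums the physical cores across every NUMA node that belongs to a single socket. The ``NumaNode`` type, the ``default_latency_threads`` helper, and the core counts are hypothetical and are not part of the OpenVINO API. Whether E-cores are counted is left as a parameter here, because the real choice follows the core type table referenced above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumaNode:
    """Hypothetical description of one NUMA node (not an OpenVINO type)."""
    socket_id: int
    p_cores: int
    e_cores: int = 0


def default_latency_threads(nodes, socket_id=0, use_e_cores=False):
    """Count physical cores on one socket: P-cores, optionally plus E-cores."""
    on_socket = [n for n in nodes if n.socket_id == socket_id]
    total = sum(n.p_cores for n in on_socket)
    if use_e_cores:
        total += sum(n.e_cores for n in on_socket)
    return total


# Assumed topology: 2 sockets, each split into 2 NUMA nodes of 16 P-cores.
topology = [NumaNode(0, 16), NumaNode(0, 16), NumaNode(1, 16), NumaNode(1, 16)]

threads_one_socket = default_latency_threads(topology)  # whole socket 0
threads_one_node = topology[0].p_cores                  # a single NUMA node

print(threads_one_socket, threads_one_node)
```

With this assumed topology, counting per socket covers both NUMA nodes of socket 0, while counting per NUMA node would cover only one of them.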

.. note::

@@ -96,7 +96,7 @@ Then the default settings for low-level performance properties on Windows and Li
Starting from 5th Gen Intel Xeon Processors, new microarchitecture enabled new sub-NUMA clusters
feature. A sub-NUMA cluster (SNC) can create two or more localization domains (numa nodes)
within a socket by BIOS configuration.
-By default OpenVINO with latency hint uses single NUMA node for inference. Although such
+By default OpenVINO with latency hint uses single socket for inference. Although such
behavior achieves the best performance for most models, there might be corner
cases which require manual tuning of the ``ov::num_streams`` and ``ov::hint::enable_hyper_threading`` parameters.
Please find more detail about `Sub-NUMA Clustering <https://www.intel.com/content/www/us/en/developer/articles/technical/xeon-processor-scalable-family-technical-overview.html>`__
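
For the corner cases mentioned above, a minimal sketch of a manual override could look like the following. The ``latency_tuning_config`` helper is hypothetical. The string keys are assumed to be the string forms of ``ov::hint::performance_mode``, ``ov::num_streams``, and ``ov::hint::enable_hyper_threading``, and the ``"YES"``/``"NO"`` values are assumed to be accepted by the installed OpenVINO version, so check both against its documentation.

```python
def latency_tuning_config(num_streams=1, enable_hyper_threading=False):
    """Build a CPU plugin config dict (hypothetical helper, assumed key names)."""
    if num_streams < 1:
        raise ValueError("num_streams must be at least 1")
    return {
        "PERFORMANCE_HINT": "LATENCY",
        "NUM_STREAMS": str(num_streams),
        "ENABLE_HYPER_THREADING": "YES" if enable_hyper_threading else "NO",
    }


# Example: two streams with hyper-threading enabled.
config = latency_tuning_config(num_streams=2, enable_hyper_threading=True)

# Assumed usage with OpenVINO installed ("model.xml" is a placeholder path):
# import openvino as ov
# compiled_model = ov.Core().compile_model("model.xml", "CPU", config)
```

Which values help depends on the model and on the SNC configuration, so any override is best validated by benchmarking against the defaults.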