# Runtime Extension
Intel® Extension for PyTorch\* Runtime Extension provides a couple of PyTorch frontend APIs for users to get finer-grained control of the thread runtime. It provides:
1. Multi-stream inference via the Python frontend module `ipex.cpu.runtime.MultiStreamModule`.
2. Spawning asynchronous tasks via the Python frontend module `ipex.cpu.runtime.Task`.
3. Programming core bindings for OpenMP threads via the Python frontend `ipex.cpu.runtime.pin`.
**Note**: Intel® Extension for PyTorch\* Runtime Extension is in the **experimental** stage. The API is subject to change. More detailed descriptions are available at the [API Documentation page](../api_doc.rst).
Let's create some ExampleNets that will be used in the examples below:
```
import torch
import intel_extension_for_pytorch as ipex

class ExampleNet1(torch.nn.Module):
    def __init__(self):
        super(ExampleNet1, self).__init__()
```
Here is an example of a model with a single tensor input/output. We create a `CPUPool` with all the cores available on NUMA node 0, and create a `MultiStreamModule` with a stream number of 2 to do inference.
When creating a `MultiStreamModule`, the default settings are `num_streams` ("AUTO") and `cpu_pool` (all the cores available on NUMA node 0). When `num_streams` is "AUTO", there are limitations with the int8 datatype, as mentioned in the performance recipes section below.
For a module such as ExampleNet2 with structured input/output tensors, the user needs to create a `MultiStreamModuleHint` as the input hint and output hint. `MultiStreamModuleHint` tells `MultiStreamModule` how to automatically split the input into streams and concatenate the output from each stream.
Here is an example of using asynchronous tasks. With the support of the runtime API, you can run 2 modules simultaneously, each on its corresponding CPU pool.
```
# Create the cpu pool and numa aware memory allocator
```
Runtime Extension provides the `ipex.cpu.runtime.pin` API to bind physical cores to a CPU pool. It can be used without the async task feature. Here is an example of using `ipex.cpu.runtime.pin` in a `with` context.