This file is automatically generated using the script generate_controls_doc.py
.
Please do not edit it manually!
By default, the Intercept Layer for OpenCL Applications will not modify any OpenCL calls. You may notice some status messages being printed to stderr, but otherwise your application should run exactly as it does without the Intercept Layer for OpenCL Applications.
The Intercept Layer for OpenCL Applications is controlled using the Windows registry, Linux configuration files, or environment variables on all OSes.
On Windows, the Intercept Layer for OpenCL Applications reads its registry keys from:
HKEY_CURRENT_USER\SOFTWARE\INTEL\IGFX\CLIntercept
This is the recommended registry location as it has several advantages over HKEY_LOCAL_MACHINE: modifying the registry keys does not require Administrator access, registry keys do not need to be set in multiple places, and each user can set their own registry keys without affecting other users.
For backwards compatibility, the Intercept Layer for OpenCL applications will still read registry keys from:
// For 32-bit systems, or 64-bit applications on a 64-bit system:
HKEY_LOCAL_MACHINE\SOFTWARE\INTEL\IGFX\CLIntercept
// For 32-bit applications on a 64-bit system:
HKEY_LOCAL_MACHINE\SOFTWARE\WoW6432Node\INTEL\IGFX\CLIntercept
If a registry is set in both HKCU and HKLM, the setting in HKCU will "win".
On Linux, the Intercept Layer for OpenCL Applications will read control values from a config file named clintercept.conf. Controls in a config file may be set by putting the control on its own line, followed by an equals sign, followed by the value to set the control to. Lines that begin with a semi-colon (";"), a hash mark ("#"), or a C++-style comment ("//") are ignored. For example, to enable CallLogging, put a line in your clintercept.conf that looks like:
// Enable CallLogging:
CallLogging=1
The Intercept Layer for OpenCL Applications will search for config files in the following directories, in order:
- The current user's HOME directory (
~
). - The
sdcard
directory (for Android only). - The system directory
/etc/OpenCL
.
The Intercept Layer for OpenCL may be controlled using environment variables. The name of the environment variable control is "CLI_" and the control name, to distinguish controls from other environment variables, and to make it easy to list all of the environment variable controls. So, to enable CallLogging, you could type:
export CLI_CallLogging=1
To disable CallLogging, you could type:
unset CLI_CallLogging
To list all environment variable controls, you could type:
env | grep CLI_
Controls should generally be considered case sensitive. Some methods of setting controls on some operating systems may treat controls as case insensitive, but it is unsafe to rely on case insensitive behavior.
To verify that a control has been set correctly and is taking effect, please check the Intercept Layer for OpenCL Applications log file, which will record any controls that are set to non-default values.
Used to control the DLL or Shared Library that the Intercept Layer for OpenCL Applications loads to make real OpenCL calls. If present, only this DLL name is loaded. If omitted, the Intercept Layer for OpenCL Applications tries to load the real OpenCL DLL from file names in this order:
- real_OpenCL.dll (anywhere in the system path)
- %WINDIR%\SysWOW64\OpenCL.dll (32-bit DLLs only)
- %WINDIR%\System32\OpenCL.dll
If set to a nonzero value, the Intercept Layer for OpenCL Applications will break into the debugger when it is loaded.
If set to a nonzero value, suppresses all logging output from the Intercept Layer for OpenCL Applications. This is particularly useful for tools that only want report data.
By default, the Intercept Layer for OpenCL Applications log files will be created from scratch when the intercept DLL is loaded, and any Intercept Layer for OpenCL Applications report files will be created from scratch when the intercept DLL is unloaded. If AppendFiles is set to a nonzero value, the Intercept Layer for OpenCL Applications will append to an existing file instead of recreating it. This can be useful if an application loads and unloads the intercept DLL multiple times, or to simply preserve log or report data from run-to-run.
If set to a nonzero value, sends log information to the file "clintercept_log.txt" instead of to stderr.
If set to a nonzero value, sends log information to the debugger instead of to stderr. If both LogToFile and LogToDebugger are nonzero then log information will be sent both to a file and to the debugger.
Indents each log entry by this many spaces.
If set to a nonzero value, logs the program build log after each call to clBuildProgram(). This will likely only function correctly for synchronous builds. Note that the build log is logged regardless of whether the program built successfully, which allows compiler warnings to be logged for successful compiles.
If set to a nonzero value, logs the preferred work group size multiple for each kernel after each call to clCreateKernel(). On some devices this is the equivalent of the SIMD size for this kernel.
If set to a nonzero value, logs function entry and exit information for every OpenCL call. This can be used to easily determine which OpenCL call is causing an application to crash or fail or if a crash occurs outside of an OpenCL call. This setting is best used with LogToFile or LogToDebugger as it can generate a lot of log data.
If set to a nonzero value, logs the enqueue counter in addition to function entry and exit information for every OpenCL call. This can be used to determine appropriate limits for DumpBuffersMinEnqueue, DumpBuffersMaxEnqueue, DumpImagesMinEnqueue, or DumpBuffersMaxEnqueue. If CallLogging is disabled then this control will have no effect.
If set to a nonzero value, logs the ID of the calling thread in addition to function entry and exit information for every OpenCL call. This can be helpful when debugging multi-threading issues.
If set to a nonzero value, logs the symbolic number of the calling thread in addition to function entry and exit information for every OpenCL call. This can be helpful when debugging multi-threading issues.
If set to a nonzero value, logs the elapsed time in microseconds in addition to function entry and exit information for every OpenCL call, starting from the time the intercept DLL is loaded.
If set to a nonzero value, logs function entry and exit information for every OpenCL call using the ITT APIs. This feature will only function if the Intercept Layer for OpenCL Applications is built with ITT support.
If set to a nonzero value, logs function entry and exit information for every OpenCL call to a JSON file that may be used for Chrome Tracing.
If set to a nonzero value, logs all OpenCL errors and the function name that caused the error.
If set to a nonzero value, breaks into the debugger when an OpenCL error occurs.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will install a callback for every context and log any calls to the context callback. The application's context callback, if any, will be invoked after the Intercept Layer for OpenCL Applications' context callback.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will attempt to create contexts with the CL_CONTEXT_SHOW_DIAGNOSTICS_INTEL property set to the specified value. If this property is specified by the application, the Intercept Layer for OpenCL Applications will overwrite it with the specified value, otherwise the property and the specified value will be added to the list of context creation properties. This functionality is only available for OpenCL implementations that support the cl_intel_driver_diagnostics extension. If this functionality is not available in the underlying OpenCL implementation, the unmodified list of context properties will be used to create the context instead. More information about this feature, including valid values and their meaning, can be found in the cl_intel_driver_diagnostics extension specification.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will install its own callback for every event callback and log the call to the event callback. The application's event callback will be invoked after the Intercept Layer for OpenCL Applications' event callback.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will check and log any events in an event wait list that are invalid or in an error state. This can help to debug complex event dependency issues.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will check for leaks of various OpenCL objects, such as memory objects and events.
If set to a nonzero value, logs information about the platforms and devices in the system on the first call to clGetPlatformIDs().
If set, the Intercept Layer for OpenCL Applications will emit logs and dumps to this directory instead of the default directory. The default log and dump directory is "%SYSTEMDRIVE%\Intel\CLIntercept_Dump\<Process Name>" on Windows and "~/CLIntercept_Dump/<Process Name>" on other operating systems.
If set, the Intercept Layer for OpenCL Applications will append process ID to the log directory name.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will append the program and build option hashes to the kernel name in logs and reports.
If an OpenCL application uses kernels with very long names, the Intercept Layer for OpenCL Applications can substitute a "short" kernel identifier for a "long" kernel name in logs and reports. This control defines how long a kernel name must be (in characters) before it is replaced by a "short" kernel identifier.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will emit reports to stderr.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will write results to the file "clintercept_report.txt".
If set to a nonzero value, the Intercept Layer for OpenCL Applications will track the minimum, maximum, and average host CPU time for each OpenCL entry point. When the process exits, this information will be included in the file "clIntercept_report.txt".
If set to a nonzero value, the Intercept Layer for OpenCL Applications will add event profiling to track the minimum, maximum, and average device time for each OpenCL command. This operation may be fairly intrusive and may have side effects; in particular it forces all command queues to be created with PROFILING_ENABLED and may increment the reference count for application events. When the process exits, this information will be included in the file "clIntercept_report.txt".
If set to a nonzero value, the Intercept Layer for OpenCL Applications will distinguish between OpenCL NDRange kernels using information such as the kernel's Preferred Work Group Size Multiple (AKA SIMD size).
If set to a nonzero value, the Intercept Layer for OpenCL Applications will distinguish between OpenCL NDRange kernels with different global work offsets for the purpose of device performance timing.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will distinguish between OpenCL NDRange kernels with different global work sizes for the purpose of device performance timing.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will distinguish between OpenCL NDRange kernels with different local work sizes for the purpose of device performance timing.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will skip device performance timing for unmap operations. This is a workaround for a bug in some OpenCL implementations, where querying events created from unmap operations results in driver crashes.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will log the host elapsed time for each OpenCL entry point. This can be useful to identify OpenCL entry points that execute significantly slower or faster than average on the host.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will log the device execution time deltas for each OpenCL command. This can be useful to identify specific OpenCL commands that execute significantly slower or faster than average on the device. If DevicePerformanceTiming is disabled then this control will have no effect.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will log the device execution times for each OpenCL command. This can be useful to visualize the execution timeline of OpenCL commands that execute on the device. If DevicePerformanceTiming is disabled then this control will have no effect.
If set, the Intercept Layer for OpenCL Applications will collect MDAPI metrics for the Metric Set corresponding to this value for each OpenCL command. Frequently used Metric Sets include: ComputeBasic, ComputeExtended, L3_1, Sampler. The output file has the potential to be very big depending on the work load. This operation may be fairly intrusive and may have side effects; in particular it forces all command queues to be created with PROFILING_ENABLED and may increment the reference count for application events. When the process exits, this information will be included in the file "clintercept_perfcounter_dump_<Set Name>.txt". This feature will only function if the Intercept Layer for OpenCL Applications is built with MDAPI support.
Full path to a custom MDAPI file. This can be used to add custom Metric Sets.
If set to a nonzero value and DevicePerfCounterCustom is set, the Intercept Layer for OpenCL Applications will enable Intel GPU Performance Counters to track the minimum, maximum, and average performance counter deltas for each OpenCL command. This operation may be fairly intrusive and may have side effects; in particular it forces all command queues to be created with PROFILING_ENABLED and may increment the reference count for application events. When the process exits, this information will be included in the file "clIntercept_report.txt". This feature will only function if the Intercept Layer for OpenCL Applications is built with MDAPI support.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will collect also max values of target platform to .csv with MDAPI counters as a column next to each metric.
[Note: This control makes ITT calls, but they appear to do nothing!] If set to a nonzero value, the Intercept Layer for OpenCL Applications will generate ITT-compatible performance timing data. Similar to DevicePerformanceTiming, this operation may be fairly intrusive and may have side effects; in particular it forces all command queues to be created with PROFILING_ENABLED and may increment the reference count for application events. ITTPerformanceTiming will also silently create OpenCL command queues that support advanced performance counters if this functionality is available. This feature will only function if the Intercept Layer for OpenCL Applications is built with ITT support.
[Note: This control makes ITT calls, but they appear to do nothing!] By default, when ITTPerformanceTiming is enabled, the Intercept Layer for OpenCL Applications will generate ITT-compatible information for all states of an OpenCL event: when the command was queued, when it was submitted, when it started executing, and when it finished executing. If ITTShowOnlyExecutingEvents is set to a nonzero value, the Intercept Layer for OpenCL Applications will only generate ITT-compatible instrumentation when an event begins executing and when an event ends executing. Since no information will be displayed about when a command is queued or submitted, this can sometimes make it easier to identify times when the device is idle. This feature will only function if the Intercept Layer for OpenCL Applications is built with ITT support.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will generate device performance timing information in a JSON file that may be used for Chrome Tracing.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will omit the program number from dumped file names and hash tracking. This can produce deterministic results even if programs are built in a non-deterministic order (say, by multiple threads).
If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump the last string(s) passed to clCreateProgramWithSource() to the file kernel.cl, and the last program options passed to clBuildProgram() to the file kernel.txt. These files will be dumped to the application's working directory. If an application fails to compile a program and exits the program immediately after detecting a compile failure SimpleDumpProgram may be all that is needed to identify the program and program options that are failing to compile.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump every string passed to clCreateProgramWithSource() to its own file. The directory names and file names for the dumped files match the directory names and file names expected by a modified OpenCL conformance test script to capture kernels. This setting overrides SimpleDumpProgramSource, and if it is set to a nonzero value then the value of SimpleDumpProgramSource is ignored.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump every string passed to clCreateProgramWithSource() to its own file. The filename will have the form "CLI_<Program Number>_<Unique Program Hash Code>_source.cl". Program options that are passed to clBuildProgram() or clCompileProgram() will be dumped to the same directory with the filename "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>_options.txt". This setting can be used for information purposes to see all kernels that are used by an application or to dump programs for program injection. This setting overrides DumpProgramSourceScript and SimpleDumpProgramSource, and if it is set to a nozero value then the values of DumpProgramSourceScript and SimpleDumpProgramSource will be ignored.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump every program binary that is passed to clCreateProgramWithBinary() to its own file. The filename will have the form "CLI_<Program Number>_<Unique Program Hash Code>_<Device Type>.bin". This is the input program binary provided by the application, and not a device binary queried from the OpenCL implementation. In particular, note that it may be a SPIR 1.2 binary.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump every program binary that was successfully built with clBuildProgram() to its own file. The filename will have the form "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>_<Device Type>.bin". Program options that are passed to clBuildProgram() or clCompileProgram() will be dumped to the same directory with the filename "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>_options.txt". This setting can be used to examine compiled program binaries or to dump program binaries for program binary injection. Note that this option dumps the output binary, which is a device binary, after calling clBuildProgram().
If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump every program IL binary passed to clCreateProgramWithIL() to its own file. The filename will have the form "CLI_<Program Number>_<Unique Program Hash Code>_0000.spv" - for now at least!. Program options that are passed to clBuildProgram() or clCompileProgram() will be dumped to the same directory with the filename "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>_options.txt". This setting can be used for information purposes to see all kernels that are used by an application or to dump SPIRV programs for SPIRV injection.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will look to inject potentially modified kernel source to clCreateProgramWithSource() and/or potentially modified options to clBuildProgram().
If set to a nonzero value, the Intercept Layer for OpenCL Applications will look to inject potentially modified kernel binaries via clCreateProgramWithBinary() in place of program text for each call to clCreateProgramWithSource(). This is typically done to reduce program compilation time or to use known good program binaries.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will reject kernel binaries passed via clCreateProgramWithBinary() and return CL_INVALID_BINARY. This can be used to force an application to re-compile program binaries from source.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will look to inject potentially modified kernel SPIR-V binaries via clCreateProgramWithIL() in place of program text for each call to clCreateProgramWithSource().
If set to a nonzero value, the Intercept Layer for OpenCL Applications will look to prepend kernel code from a file to the application provided kernel source passed to clCreateProgramWithSource(). The Intercept Layer for OpenCL Applications will look for kernel source to prepend in the dump and log directory. The files that are searched for are (in order) "CLI_<Program Number>_<Unique Program Hash Code>_prepend.cl", "CLI_<Unique Program Hash Code>_prepend.cl", and "CLI_prepend.cl".
If set, the Intercept Layer for OpenCL Applications will add these build options to the end of any application provided or injected build options for each call to clBuildProgram().
If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump build logs for every device a program is built for to a separate file. The filename will have the form "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>_<Device Type>_build_log.txt".
If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump kernel ISA binaries for every kernel, if supported. An ISA binaries can decoded into ISA text with a disassembler. The filename will have the form "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>_<Device Type>_<Kernel Name>.isabin".
If set to a nonzero value, the Intercept Layer for OpenCL Applications will automatically create SPIR-V modules by invoking CLANG each time a program is built. The filename will have the form "CLI_<Program Number>_<Unique Program Hash Code>_<Compile Count>_<Unique Build Options Hash Code>.spv". Because invoking CLANG requires a file containing the OpenCL C source, setting this option implicitly sets DumpProgramSource as well. Additionally, this feature is not available for injected program source.
The clang executable used to compile an OpenCL C program to a SPIR-V module. This can be an executable in the system path, a relative path, or a full absolute path.
The OpenCL header file used to compile an OpenCL C program to a SPIR-V module. This must be a relative path or a full absolute path.
The spirv-dis executable used to optionally disassemble the compiled SPIR-V module to a SPIR-V text representation. This can be an executable in the system path, a relative path, or a full absolute path.
This is the list of options that is implicitly passed to CLANG to build a non-OpenCL 2.0 SPIR-V module. Any application-provided build options will be appended to these build options.
This is the list of options that is implicitly passed to CLANG to build an OpenCL 2.0 SPIR-V module. Any application-provided build options will be appended to these build options.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump the argument value on calls to clSetKernelArg(). Arguments are dumped as raw binary data. The filenames will have the form "SetKernelArg_<Enqueue Number>_Kernel_<Kernel Name>_Arg_<Argument Number>.bin".
If set, the Intercept Layer for OpenCL Applications will dump buffers to a file after creation. This control still honors the enqueue counter limits, even though no enqueues are involved during buffer creation. Currently only works for cl_mem buffers created from host pointers.
If set, the Intercept Layer for OpenCL Applications will dump the contents of a buffer to a file after the buffer is mapped. Only valid if the buffer is NOT mapped with CL_MAP_WRITE_INVALIDATE_REGION. If the buffer was mapped non-blocking, this may insert a clFinish() into the command queue, which may have functional or performance implications.
If set, the Intercept Layer for OpenCL Applications will dump the contents of a buffer to a file immediately before the buffer is unmapped. This is done by inserting a blocking clEnqueueMapBuffer() (and matching clEnqueueUnmapMemObject()) into the command queue, which may have functional or performance implications.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump buffers before calls to clEnqueueNDRangeKernel(). Only buffers that are kernel arguments for the kernel being enqueued are dumped. Buffers are dumped as raw binary data to a "memDumpPreEnqueue" subdirectory of the dump directory. The filenames will have the form "Enqueue_<Enqueue Number>_Kernel_<Kernel Name>_Arg_<Argument Number>_Buffer_<Unique Memory Object Number>.bin".
If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump buffers after calls to clEnqueueNDRangeKernel(). Only buffers that are kernel arguments for the kernel being enqueued are dumped. Buffers are dumped as raw binary data to a "memDumpPostEnqueue" subdirectory of the dump directory. The filenames will have the form "Enqueue_<Enqueue Number>_Kernel_<Kernel Name>_Arg_<Argument Number>_Buffer_<Unique Memory Object Number>.bin". Note that this is the same naming convention as with DumpBuffersBeforeEnqueue, so the changes resulting from an enqueue can be determined by diff'ing the preEnqueue folder with the postEnqueue folder.
If set, the Intercept Layer for OpenCL Applications will only dump buffers when the specified kernel is enqueued. This control is ignored unless DumpBuffersBeforeEnqueue or DumpBuffersAfterEnqueue are enabled.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump images before calls to clEnqueueNDRangeKernel(). Only images that are kernel arguments for the kernel being enqueued are dumped. Images are dumped as raw binary data to a "memDumpPreEnqueue" subdirectory of the dump directory. The filenames will have the form "Enqueue_<Enqueue Number>_Kernel_<Kernel Name>_Arg_<Argument Number>_Image_<Unique Memory Object Number>_<Width>x<Height>x<Depth>_<Element Size>bpp.raw".
If set to a nonzero value, the Intercept Layer for OpenCL Applications will dump images after calls to clEnqueueNDRangeKernel(). Only images that are kernel arguments for the kernel being enqueued are dumped. Images are dumped as raw binary data to a "memDumpPostEnqueue" subdirectory of the dump directory. The filenames will have the form "Enqueue_<Enqueue Number>_Kernel_<Kernel Name>_Arg_<Argument Number>_Image_<Unique Memory Object Number>_<Width>x<Height>x<Depth>_<Element Size>bpp.raw". Note that this is the same naming convention as with DumpImagesBeforeEnqueue, so the changes resulting from an enqueue can be determined by diff'ing the preEnqueue folder with the postEnqueue folder.
If set, the Intercept Layer for OpenCL Applications will only dump image when the specified kernel is enqueued. This control is ignored unless DumpImagesBeforeEnqueue or DumpImagesAfterEnqueue are enabled.
The Intercept Layer for OpenCL Applications will only dump buffers when the enqueue counter is greater than this value, inclusive.
The Intercept Layer for OpenCL Applications will only dump buffers when the enqueue counter is less than this value, inclusive.
The Intercept Layer for OpenCL Applications will only dump images when the enqueue counter is greater than this value, inclusive.
The Intercept Layer for OpenCL Applications will only dump images when the enqueue counter is less than this value, inclusive.
This is the master control for aub capture. The Intercept Layer for OpenCL Applications doesn't implement aub capture itself, but can be used to selectively enable and disable aub capture via other methods.
If set, the Intercept Layer for OpenCL Applications will use the older kdc.exe method of aub capture. By default, the newer NEO method of aub capture will be used. This control is ignored for all non-Windows operating systems.
If set, the Intercept Layer for OpenCL Applications will start aub capture before a kernel enqueue, and will also stop aub capture immediately after the kernel enqueue. Each file will have the form "AubCapture_Enqueue_<Enqueue Number>_kernel_<Kernel Name>". Note that non-kernel enqueues such as calls to clEnqueueReadBuffer() and clEnqueueWriteBuffer() will NOT be aub captured when this control is set. The AubCaptureMinEnqueue and AubCaptureMaxEnqueue controls are still honored when AubCaptureIndividualEnqueues is set.
The Intercept Layer for OpenCL Applications will only enable aub capture when the enqueue counter is greater than this value, inclusive.
The Intercept Layer for OpenCL Applications will stop aub capture when the encounter is greater than this value, meaning that only enqueues less than this value, inclusive, will be captured. If the enqueue counter never reaches this value, the Intercept Layer for OpenCL Applications will stop aub capture when the it is unloaded.
If set, the Intercept Layer for OpenCL Applications will only enable aub capture when the kernel name equals this name.
If set, the Intercept Layer for OpenCL Applications will only enable aub capture when the NDRange global work size matches this string. The string should have the form "XxYxZ". The wildcard "*" matches all global work sizes.
If set, the Intercept Layer for OpenCL Applications will only enable aub capture when the NDRange local work size matches this string. The string should have the form "XxYxZ". The wildcard "*" matches all local work sizes, and the string "NULL" matches a NULL local work size.
If set, the Intercept Layer for OpenCL Applications will only enable aub capture if the kernel signature (i.e. hash + kernelname + gws + lws) has not been seen already. The behavior of this control is well-defined when AubCaptureIndividualEnqueues is not set, but it doesn't make much sense without AubCaptureIndividualEnqueues.
The Intercept Layer for OpenCL Applications will skip this many kernel enqueues before enabling aub capture. The behavior of this control is well-defined when AubCaptureIndividualEnqueues is not set, but it doesn't make much sense without AubCaptureIndividualEnqueues.
The Intercept Layer for OpenCL Applications will only capture this many kernel enqueues. The behavior of this control is well-defined when AubCaptureIndividualEnqueues is not set, but it doesn't make much sense without AubCaptureIndividualEnqueues.
The Intercept Layer for OpenCL Applications will wait for this many milliseconds before beginning aub capture.
The Intercept Layer for OpenCL Applications will wait for this many milliseconds before ending aub capture.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will cause all OpenCL APIs to return a successful error status.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will force the context callback to be NULL. With both context callback logging and NULL context callback set, the context callback will still be logged, but any application context callback will not be called.
If set to a nonzero value, the Intercept Layer for OpenCL Applications inserts a call to clFinish() after every enqueue. The command queue that the command was just enqueued to is passed to clFinish(). This can be used to debug possible timing or resource management issues and will likely impact performance.
If set to a nonzero value, the Intercept Layer for OpenCL Applications inserts a call to clFlush() after every enqueue. The command queue that the command was just enqueued to is passed to clFlush(). This can also be used to debug possible timing or resource management issues and is slightly less obtrusive than FinishAfterEnqueue but still will likely impact performance. If both FinishAfterEnqueue and FlushAfterEnqueue are nonzero then the Intercept Layer for OpenCL Applications will only insert a call to clFinish() after every enqueue, because clFinish() implies clFlush().
If set to a nonzero value, the Intercept Layer for OpenCL Applications inserts a call to clFlush() after every barrier enqueue. The command queue that the command was just enqueued to is passed to clFlush(). This has been useful to debug out-of-order queue issues.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will force all queues to be created in-order. This can be used for performance analysis, but may lead to deadlocks in some cases.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will silently ignore any enqueue. This can be used for performance analysis, but will likely cause errors if the application relies on any sort of information from OpenCL events and should be used carefully.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will force the local work size argument to clEnqueueNDRangeKernel() to be NULL, which causes the OpenCL implementation to pick the local work size. Note that this control takes effect before NullLocalWorkSizeX / NullLocalWorkSizeY / NullLocalWorkSizeZ (see below), so enabling both controls will have the effect of forcing a specific local work size.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will set the local work size that will be used if an application passes NULL as the local work size to clEnqueueNDRangeKernel(). 1D dispatches will only look at NullLocalWorkSizeX, 2D dispatches will only look at NullLocalWorkSizeX and NullLocalWorkSizeY, while 3D dispatches will look at NullLocalWorkSizeX, NullLocalWorkSizeY, and NullLocalWorkSizeZ. If the specified values for NullLocalWorkSize do not evenly divide the global work size then the specified values of NullLocalWorkSize will not take effect.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will set the local work size that will be used if an application passes NULL as the local work size to clEnqueueNDRangeKernel(). 1D dispatches will only look at NullLocalWorkSizeX, 2D dispatches will only look at NullLocalWorkSizeX and NullLocalWorkSizeY, while 3D dispatches will look at NullLocalWorkSizeX, NullLocalWorkSizeY, and NullLocalWorkSizeZ. If the specified values for NullLocalWorkSize do not evenly divide the global work size then the specified values of NullLocalWorkSize will not take effect.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will set the local work size that will be used if an application passes NULL as the local work size to clEnqueueNDRangeKernel(). 1D dispatches will only look at NullLocalWorkSizeX, 2D dispatches will only look at NullLocalWorkSizeX and NullLocalWorkSizeY, while 3D dispatches will look at NullLocalWorkSizeX, NullLocalWorkSizeY, and NullLocalWorkSizeZ. If the specified values for NullLocalWorkSize do not evenly divide the global work size then the specified values of NullLocalWorkSize will not take effect.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will initialize the contents of allocated buffers with zero. Only valid for non-COPY_HOST_PTR and non-USE_HOST_PTR allocations.
If set to a nonzero value, and if no other priority hint is specified by the application, the Intercept Layer for OpencL Applications will attempt to create a command queue with this priority hint value. Note: HIGH priority is 1, MED priority is 2, and LOW priority is 4.
If set to a nonzero value, and if no other throttle hint is specified by the application, the Intercept Layer for OpencL Applications will attempt to create a command queue with this throttle hint value. Note: HIGH throttle is 1, MED throttle is 2, and LOW throttle is 4.
If set to a non-empty value, the clGetPlatformInfo() query for CL_PLATFORM_NAME will return this string instead of the true platform name.
If set to a non-empty value, the clGetPlatformInfo() query for CL_PLATFORM_VENDOR will return this string instead of the true platform vendor.
If set to a non-empty value, the clGetPlatformInfo() query for CL_PLATFORM_PROFILE will return this string instead of the true platform profile.
If set to a non-empty string, the clGetPlatformInfo() query for CL_PLATFORM_VERSION will return this string instead of the true platform version.
Hides all device types that are not in the filter. Note: CL_DEVICE_TYPE_CPU = 2, CL_DEVICE_TYPE_GPU = 4, CL_DEVICE_TYPE_ACCELERATOR = 8, CL_DEVICE_TYPE_CUSTOM = 16.
If set to a non-zero value, the clGetDeviceInfo() query for CL_DEVICE_TYPE will return this value instead of the true device type. In addition, calls to clGetDeviceIDs() for this device type will return all devices, not just devices of the requested type. This can be used to enumerate all devices (even CPUs) as GPUs, or vice versa.
If set to a non-empty string, the clGetDeviceInfo() query for CL_DEVICE_NAME will return this value instead of the true device name.
If set to a non-empty string, the clGetDeviceInfo() query for CL_DEVICE_VENDOR will return this value instead of the true device vendor.
If set to a non-empty string, the clGetDeviceInfo() query for CL_DEVICE_PROFILE will return this value instead of the true device profile.
If set to a non-empty string, the clGetDeviceInfo() query for CL_DEVICE_VERSION will return this value instead of the true device version.
If set to a non-empty string, the clGetDeviceInfo() query for CL_DEVICE_OPENCL_C_VERSION will return this value instead of the true device version.
If set to a non-empty string, the clGetDeviceInfo() query for CL_DEVICE_EXTENSIONS will return this value instead of the true device extensions string.
If set to a non-zero value, the clGetDeviceInfo() query for CL_DEVICE_VENDOR will return this value instead of the true device vendor ID.
If set to a non-zero value, the clGetDeviceInfo() query for CL_DEVICE_MAX_COMPUTE_UNITS will return this value instead of the true device max compute units.
If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR will return this value instead of the true device preferred vector width.
If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT will return this value instead of the true device preferred vector width.
If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT will return this value instead of the true device preferred vector width.
If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG will return this value instead of the true device preferred vector width.
If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_HALF will return this value instead of the true device preferred vector width.
If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT will return this value instead of the true device preferred vector width.
If set to a non-negative value, the clGetDeviceInfo() query for CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE will return this value instead of the true device preferred vector width.
If set to a nonzero value, each of the buffer functions that are overridden (via one or more of the keys below) will use a byte-wise operation to read/write/copy the buffer (default behavior is to try to copy multiple bytes at a time, if possible). Note: Requires OpenCL 1.1 or the "byte addressable store" extension.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will use a kernel to implement clEnqueueReadBuffer() instead of the implementation's clEnqueueReadBuffer(). Note: Requires OpenCL 1.1 or the "byte addressable store" extension.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will use a kernel to implement clEnqueueWriteBuffer() instead of the implementation's clEnqueueWriteBuffer(). Note: Requires OpenCL 1.1 or the "byte addressable store" extension.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will use a kernel to implement clEnqueueCopyBuffer() instead of the implementation's clEnqueueCopyBuffer(). Note: Requires OpenCL 1.1 or the "byte addressable store" extension.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will use a kernel to implement clEnqueueReadImage() instead of the implementation's clEnqueueReadImage(). Only 2D images are currently supported.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will use a kernel to implement clEnqueueWriteImage() instead of the implementation's clEnqueueWriteImage(). Only 2D images are currently supported.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will use a kernel to implement clEnqueueCopyImage() instead of the implementation's clEnqueueCopyImage(). Only 2D images are currently supported.
If set to a nonzero value, the Intercept Layer for OpenCL Applications will use its own version of the built-in OpenCL kernels that may be accessed via clCreateProgramWithBuiltInKernels(). At present, only the VME block_motion_estimate_intel kernel is implemented.
Executes a SIMD survey state machine. The general idea of the SIMD survey state machine is to create and manage three additional kernels for each actual OpenCL kernel, one for each SIMD size. Then, execute and time the three kernels, and choose the fastest for subsequent executions.
This is the number of NDRanges that the SIMD survey state machine ignores before starting to time the SIMD survey.
This is the build option that is pre-pended to the application-specified build options to create the SIMD8 kernel.
This is the build option that is pre-pended to the application-specified build options to create the SIMD16 kernel.
This is the build option that is pre-pended to the application-specified build options to create the SIMD32 kernel.
[Note: Not currently implemented, but the idea behind the SIMD oracle is to save the best SIMD size from run-to-run, so the full SIMD survey does not need to be re-executed.]
* Other names and brands may be claimed as the property of others.
Copyright (c) 2018-2019, Intel(R) Corporation