[RFC] integrate ruapu for runtime cpu isa extension detection #4573
Comments
Just that it cannot tell apart Haswell from Zen
Thanks, interesting project for sure. (Though we tend to use cpuinfo and similar only for direct identification of the CPU model. I'm not sure whether instruction trapping offers an advantage over querying CPU capability registers for instruction set extensions?)
https://github.com/nihui/ruapu?tab=readme-ov-file#features ruapu is not intended to replace cpuinfo or the register-based method of obtaining information; it is a complementary detection method. Its main purpose is to provide a unified way to detect extensions where conventional methods such as cpuinfo cannot be used, for example on the Windows Arm platform or when detecting RISC-V vendor extensions. ruapu currently cannot identify CPU core architectures such as Skylake, Zen 3, or Cortex-A75. I plan to complete the CPU ISA extension detection first, and then add other information as needed.
You always need CPUID bits.
I must admit I am not aware of the situation around Windows on Arm - currently waiting for a CI solution to become available for that platform. But from what I've seen it would probably be sufficient for OpenBLAS to support a generic
Hello
OpenBLAS uses operating-system-specific methods (parsing /proc/cpuinfo) and architecture-specific methods (x86 cpuid) to obtain the CPU's ISA extension information at runtime and dynamically select the optimized code path.
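To make the /proc/cpuinfo approach concrete, here is a minimal sketch of checking one feature token in a Linux x86-style `flags` line. This is not OpenBLAS's actual code; the function name `cpuinfo_line_has_flag` is made up for illustration, and it assumes the caller has already read the relevant line from /proc/cpuinfo.

```c
#include <string.h>

/* Hypothetical helper (not from OpenBLAS or ruapu).
 * Returns 1 if `name` appears as a whole whitespace-delimited token
 * in the value part of a "key : value" cpuinfo line, else 0. */
static int cpuinfo_line_has_flag(const char *line, const char *name)
{
    size_t n = strlen(name);
    const char *start = strchr(line, ':');
    const char *p;

    if (n == 0)
        return 0;

    /* Skip the key ("flags") so that it is never matched itself. */
    start = start ? start + 1 : line;

    for (p = start; (p = strstr(p, name)) != NULL; p++) {
        int starts = (p == start) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
        if (starts && ends)
            return 1;
    }
    return 0;
}
```

The whole-token check matters because a plain substring search would report `avx` as present on a line that only lists `avx2`. This kind of check depends on the OS exposing the file at all, which is the limitation the proposal is about.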
The neural network acceleration library ncnn ( https://github.com/Tencent/ncnn ) uses similar strategies, but these alone may not be enough to cover more systems and architectures.
Therefore, I recommend integrating ruapu ( https://github.com/nihui/ruapu ) into OpenBLAS. ruapu is a single-header C implementation. It detects CPU ISA extension support by executing candidate instructions and catching SIGILL. This approach works on many operating systems, such as Linux, Windows, and macOS, and can detect support more directly and accurately. Sometimes /proc/cpuinfo or x86 cpuid may lie to us ;)
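For readers unfamiliar with the SIGILL technique, below is a rough POSIX-only sketch of the general idea, not ruapu's actual implementation. All names here (`probe_supported`, `probe_noop`, `probe_raises_sigill`) are hypothetical. A real probe would execute one instruction from the extension under test; to keep the sketch portable, the "unsupported" case is simulated with `raise(SIGILL)`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

typedef void (*probe_fn)(void);

static sigjmp_buf probe_jmp;

/* SIGILL handler: abandon the probe and jump back to the caller. */
static void on_sigill(int signo)
{
    (void)signo;
    siglongjmp(probe_jmp, 1);
}

/* Runs `probe` with a temporary SIGILL handler installed.
 * Returns 1 if the probe ran to completion, 0 if it trapped. */
static int probe_supported(probe_fn probe)
{
    struct sigaction sa, old;
    volatile int ok = 0;

    assert(probe != NULL);

    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGILL, &sa, &old);

    /* savemask = 1 so the signal mask is restored after the jump. */
    if (sigsetjmp(probe_jmp, 1) == 0) {
        probe();
        ok = 1;
    }

    sigaction(SIGILL, &old, NULL); /* restore the previous handler */
    return ok;
}

/* Stand-in for an instruction the CPU supports. */
static void probe_noop(void)
{
}

/* Stand-in for an unsupported instruction (simulated trap). */
static void probe_raises_sigill(void)
{
    raise(SIGILL);
}
```

With these stand-ins, `probe_supported(probe_noop)` returns 1 and `probe_supported(probe_raises_sigill)` returns 0. The sketch uses a single global jump buffer, so it is not thread-safe, and other platforms such as Windows need a different trapping mechanism.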
Comments are welcome on whether ruapu is suitable for the project, as are any other suggestions.