-
Notifications
You must be signed in to change notification settings - Fork 12.1k
ggml: introduce GGML_NUMA_MIGRATE to optimize cross NUMA op computation #14232
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -152,6 +152,11 @@ set(GGML_BLAS_VENDOR ${GGML_BLAS_VENDOR_DEFAULT} CACHE STRING | |
"ggml: BLAS library vendor") | ||
option(GGML_LLAMAFILE "ggml: use LLAMAFILE" ${GGML_LLAMAFILE_DEFAULT}) | ||
|
||
option(GGML_NUMA_MIGRATE "ggml: use NUMA_MIGRATE" OFF) | ||
set(GGML_NUMA_MIGRATE_NODES "2" CACHE STRING | ||
"ggml: the number of NUMA nodes during page migration") | ||
option(GGML_NUMA_MIGRATE_DEBUG "ggml: enable debugging of NUMA_MIGRATE" OFF) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Avoid this by checking an environment variable instead. See |
||
|
||
option(GGML_CUDA "ggml: use CUDA" OFF) | ||
option(GGML_MUSA "ggml: use MUSA" OFF) | ||
option(GGML_CUDA_FORCE_MMQ "ggml: use mmq kernels instead of cuBLAS" OFF) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -348,6 +348,9 @@ extern "C" { | |
// CPU buffer types are always available | ||
GGML_API ggml_backend_buffer_t ggml_backend_cpu_buffer_from_ptr(void * ptr, size_t size); | ||
GGML_API ggml_backend_buffer_type_t ggml_backend_cpu_buffer_type(void); | ||
#ifdef GGML_USE_NUMA_MIGRATE | ||
GGML_API size_t ggml_backend_get_page_size(void); | ||
#endif | ||
Comment on lines
+351
to
+353
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This seems like should be moved to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tried to move the function definition from ggml-backend.cpp to ggml-cpu.c, but there're lots of link issues since the ggml-cpu.c is in the subdirectories of ggml-cpu while it can't be referenced by files under parent directories under ggml:
moved here:
|
||
|
||
#ifdef __cplusplus | ||
} | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -12,6 +12,9 @@ extern "C" { | |
struct ggml_cplan { | ||
size_t work_size; // size of work buffer, calculated by `ggml_graph_plan()` | ||
uint8_t * work_data; // work buffer, to be allocated by caller before calling to `ggml_graph_compute()` | ||
#ifdef GGML_USE_NUMA_MIGRATE | ||
uint8_t * work_data_numa[GGML_NUMA_MIGRATE_NODES]; | ||
#endif | ||
Comment on lines
+15
to
+17
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We have to find a way to avoid so many There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yeah, it's a valid concern, i'll need to find a better way. |
||
|
||
int n_threads; | ||
struct ggml_threadpool * threadpool; | ||
|
@@ -28,6 +31,9 @@ extern "C" { | |
GGML_NUMA_STRATEGY_ISOLATE = 2, | ||
GGML_NUMA_STRATEGY_NUMACTL = 3, | ||
GGML_NUMA_STRATEGY_MIRROR = 4, | ||
#ifdef GGML_USE_NUMA_MIGRATE | ||
GGML_NUMA_STRATEGY_MIGRATE = 5, | ||
#endif | ||
GGML_NUMA_STRATEGY_COUNT | ||
}; | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should become
GGML_CPU_NUMA
. AFAIU, these changes require linkinglibnuma
, is that correct?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I'm OK with the name change since it's targeted for CPU:)