Fix memory_coloring pass when MIGRAPHX_NSTREAMS > 2 #3757

kahmed10 · 2025-01-11T12:13:00Z

I noticed allocation segments reaching values close to the uint64 maximum, which would clearly cause an out-of-memory error when allocating on the GPU. This happened when `MIGRAPHX_NSTREAMS` was set to 2 or greater and the model was large enough to trigger it.

Changing from `auto` to `size_t` seems to fix the issue.

After further investigation, the culprit is the segment start value, which becomes int32. I've also declared `n` as `size_t` for clarity.
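For readers unfamiliar with this class of bug, here is a minimal sketch of how an `auto`-deduced `int` can turn a large offset into a value near the uint64 maximum. This is not the actual `memory_coloring` source: the function names and the summing loop are hypothetical, and the sketch assumes a 32-bit `int`, a 64-bit `size_t`, and the wrap-around narrowing that mainstream compilers implement (guaranteed only from C++20).

```cpp
#include <cstddef>
#include <vector>

// Assumptions of this sketch, not facts about MIGraphX.
static_assert(sizeof(int) == 4, "sketch assumes 32-bit int");
static_assert(sizeof(std::size_t) == 8, "sketch assumes 64-bit size_t");

// Hypothetical helper: computes the start offset that follows the given segments.
// `auto start = 0;` deduces int, so the running total is narrowed to 32 bits.
std::size_t next_start_auto(const std::vector<std::size_t>& sizes)
{
    auto start = 0; // deduced as int, not std::size_t
    for(std::size_t s : sizes)
        start += s; // sum is computed as size_t, then narrowed back to int
    // A negative int converts to a value near the uint64 maximum.
    return static_cast<std::size_t>(start);
}

// Same loop with the accumulator declared as std::size_t.
std::size_t next_start_size_t(const std::vector<std::size_t>& sizes)
{
    std::size_t start = 0;
    for(std::size_t s : sizes)
        start += s;
    return start;
}
```

With a single 2 GiB segment, `next_start_auto` returns `0xFFFFFFFF80000000` under these assumptions, while `next_start_size_t` returns `2147483648`, which matches the symptom described above.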


codecov · 2025-01-11T13:50:54Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.16%. Comparing base (6d02806) to head (2e23878).
Report is 2 commits behind head on develop.

Additional details and impacted files

@@           Coverage Diff            @@
##           develop    #3757   +/-   ##
========================================
  Coverage    92.16%   92.16%           
========================================
  Files          515      515           
  Lines        21978    21978           
========================================
  Hits         20256    20256           
  Misses        1722     1722


TedThemistokleous · 2025-01-13T22:38:57Z

I'd say pull this out of draft, I've confirmed this works across all the models we were testing this with.


migraphx-bot · 2025-01-16T09:56:37Z

Test	Batch	Rate new eb3d71	Rate old f56b1b	Diff	Compare
torchvision-resnet50	64	3,251.42	3,255.67	-0.13%	✅
torchvision-resnet50_fp16	64	6,929.27	6,983.58	-0.78%	✅
torchvision-densenet121	32	2,452.93	2,431.65	0.88%	✅
torchvision-densenet121_fp16	32	4,180.73	4,074.03	2.62%	✅
torchvision-inceptionv3	32	1,629.65	1,628.91	0.05%	✅
torchvision-inceptionv3_fp16	32	2,716.70	2,746.14	-1.07%	✅
cadene-inceptionv4	16	763.00	764.54	-0.20%	✅
cadene-resnext64x4	16	812.63	813.45	-0.10%	✅
slim-mobilenet	64	7,456.51	7,469.86	-0.18%	✅
slim-nasnetalarge	64	208.59	209.05	-0.22%	✅
slim-resnet50v2	64	3,444.88	3,440.80	0.12%	✅
bert-mrpc-onnx	8	1,145.37	1,145.22	0.01%	✅
bert-mrpc-tf	1	482.30	476.55	1.21%	✅
pytorch-examples-wlang-gru	1	516.57	422.09	22.38%	🔆
pytorch-examples-wlang-lstm	1	432.10	394.94	9.41%	🔆
torchvision-resnet50_1	1	808.43	769.40	5.07%	🔆
cadene-dpn92_1	1	427.09	398.97	7.05%	🔆
cadene-resnext101_1	1	384.39	383.77	0.16%	✅
onnx-taau-downsample	1	374.16	345.22	8.38%	🔆
dlrm-criteoterabyte	1	33.31	33.32	-0.05%	✅
dlrm-criteoterabyte_fp16	1	52.67	52.72	-0.09%	✅
agentmodel	1	8,783.95	8,109.71	8.31%	🔆
unet_fp16	2	58.37	58.87	-0.85%	✅
resnet50v1_fp16	1	1,033.67	930.54	11.08%	🔆
resnet50v1_int8	1	1,032.25	1,002.61	2.96%	✅
bert_base_cased_fp16	64	1,180.91	1,168.63	1.05%	✅
bert_large_uncased_fp16	32	365.08	363.25	0.51%	✅
bert_large_fp16	1	200.87	198.22	1.34%	✅
distilgpt2_fp16	16	2,224.86	2,197.80	1.23%	✅
yolov5s	1	527.28	532.80	-1.04%	✅
tinyllama	1	43.83	43.43	0.93%	✅
vicuna-fastchat	1	171.33	174.20	-1.64%	✅
whisper-tiny-encoder	1	417.99	418.04	-0.01%	✅
whisper-tiny-decoder	1	433.35	433.15	0.05%	✅

Check results before merge 🔆

migraphx-bot · 2025-01-16T09:56:38Z

✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance

✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

✅ resnet50v1: PASSED: MIGraphX meets tolerance

✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

✅ bert_large: PASSED: MIGraphX meets tolerance

✅ yolov5s: PASSED: MIGraphX meets tolerance

✅ tinyllama: PASSED: MIGraphX meets tolerance

✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

TedThemistokleous self-requested a review January 13, 2025 22:39

TedThemistokleous added the `bugfix` label (Fixes a bug found in the code.) Jan 13, 2025

TedThemistokleous assigned kahmed10 Jan 13, 2025

kahmed10 added 3 commits January 15, 2025 16:22

update copyright and add test case

be58a75

update test copyright

031d500

Merge branch 'develop' of https://github.com/ROCm/AMDMIGraphX into me…

eb3d71e

…m_coloring_fix

kahmed10 marked this pull request as ready for review January 15, 2025 16:24

kahmed10 requested a review from causten as a code owner January 15, 2025 16:24

TedThemistokleous approved these changes Jan 15, 2025


TedThemistokleous added the `simple` label (small or simple changes) Jan 15, 2025

kahmed10 requested review from ahsan-ca and shivadbhavsar January 15, 2025 16:27

causten merged commit 889fabc into develop Jan 16, 2025
34 of 35 checks passed

causten deleted the mem_coloring_fix branch January 16, 2025 14:24
