ramin-raeisi/HIP-Optimization-Bug

HIP-Optimization-Bug


The code comes from the Lotus project, which can be found at https://github.com/filecoin-project/lotus.

LICENSE-MIT and LICENSE-APACHE are two of the Lotus license files.


The GPU code was extracted from the bellperson project, a sub-project of Lotus. I removed everything that is not needed to test the g1_add function under HIP/ROCm. The GPU code is in the g1_add.cu source file.

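The main problem is that building with an optimization flag such as -O3 produces results that differ significantly from a -O0 build: in the runs below, all six tests pass at -O0, while g1_double_test, g2_add_test, g2_add_mixed_test, and g2_double_test fail at -O3.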

I have also built the code with the CUDA compiler, and its result matches the result obtained with the -O0 flag.

Test 1 (gfx908)

  • gpu: gfx908
  • os: ubuntu 20.04
  • hip_version: RC

-O0 flag

$ ./run.sh 
rm -f g1_add *.o
/opt/rocm/hip/bin/hipcc -std=c++14 -O0 -I/opt/include/ -o g1_add g1_add.cu
-----g1_add_test--------
x:695163763, 838428337, 867136025, 3916970060, 1083605276, 2882035772, 3603006931, 2269309842, 422274527, 1169772790, 1990394245, 416975321, 
y:1229022948, 3366429108, 670218974, 1658335027, 392632874, 1379067484, 798160530, 3656524164, 3793686573, 2144155088, 2721370348, 298035558, 
z:413203031, 3318893592, 1282426328, 1145762026, 1542369093, 485346739, 1679000480, 4026228341, 2371190916, 3558967189, 3094593878, 414846589, 


-----g1_double_test--------
x:1632659750, 1265485755, 1750545604, 4265521061, 3934928281, 4157162780, 4059373350, 175259380, 1116337943, 828144285, 4015370401, 394235955, 
y:1337058212, 2000076061, 4167958530, 3201785366, 190644621, 1217337187, 1281056040, 2380457106, 1843262342, 2768612874, 1950340553, 89436721, 
z:459531309, 2907844289, 4049224835, 4075394049, 4006033233, 2142954301, 1167773149, 2944676389, 2627347026, 2069553728, 951579129, 432622738, 


-----g1_add_mixed_test--------
x:2210944813, 3909903087, 603786564, 3883152423, 2173389250, 2956470872, 809479795, 2862133481, 1088430826, 3104844840, 44451588, 226701902, 
y:2426309915, 1558163374, 4125809737, 2860931858, 1688959796, 3877074395, 4208292625, 4034170426, 2725679345, 1152044552, 819326913, 253933226, 
z:196605, 1980301312, 3289120770, 3958636555, 1405573306, 1598593111, 1884444485, 2010011731, 2723605613, 1543969431, 4202751123, 368467651, 


-----g2_add_test--------
x:
  c0: 208598400, 2694991227, 1047077762, 2846011457, 3072180666, 2586524079, 1102481180, 2748217909, 915908624, 1428853440, 185053769, 280202458, 
  c1: 896785299, 3995963242, 1823751211, 767703503, 1304145148, 662952197, 2882865842, 534777519, 1901401509, 1872043708, 2798653961, 293640618, 
y:
  c0: 3539470131, 2137220391, 1597398656, 1117125724, 730634957, 1078887918, 1321721234, 1724952331, 613546324, 2748204274, 2525111799, 289271208, 
  c1: 3887607577, 2510766255, 737788432, 3570942326, 2879936206, 962108494, 3792765839, 2585296344, 1640276112, 2758685254, 2846912960, 355409566, 
z:
  c0: 532666578, 2246343696, 2174845458, 1995569759, 1202232624, 3706636086, 748146386, 2090841127, 1842967206, 618837610, 1931725633, 318850490, 
  c1: 519291090, 1999124522, 2412840922, 2084572850, 1743581217, 2719486829, 567872724, 45755210, 2616696815, 3919313717, 4275453181, 15054798, 


-----g2_add_mixed_test--------
x:
  c0: 630543538, 2724914287, 2928286071, 1206988501, 2890832672, 1957021562, 428270052, 2832959188, 454800231, 280460773, 2371362276, 278770375, 
  c1: 1711034215, 786080655, 2132723336, 534742201, 1414172201, 793650756, 3273312053, 878272217, 1768425118, 3070428044, 3865254552, 313376622, 
y:
  c0: 1439687478, 1699762003, 2616701347, 58055755, 2869644811, 210378998, 3185430847, 563614816, 2438712083, 188035116, 717301920, 93025059, 
  c1: 2127133146, 498565606, 2156680602, 1824566867, 1488922578, 1021153846, 3024742101, 4131599155, 3962274867, 1026693349, 1379019381, 285018239, 
z:
  c0: 126944175, 404342239, 2527468170, 2956748879, 207986017, 4181688848, 1828342311, 4038727245, 1047828733, 654618778, 3419187055, 17126065, 
  c1: 557237479, 2276220765, 353482893, 2614146573, 410598167, 1993248169, 143346264, 1438889812, 1133180384, 476253848, 2787769590, 207208886, 


-----g2_double_test--------
x:
  c0: 3384291560, 884500155, 562551001, 1506254693, 3417674841, 1929197071, 1630878619, 728272718, 4220131018, 2130229034, 3504016523, 273936812, 
  c1: 2948027432, 638423744, 2081033080, 1489027210, 4085260338, 3827269722, 1523607993, 781941773, 1144299051, 1113946061, 2460346501, 31450159, 
y:
  c0: 3653240316, 3446957739, 240118700, 3194897906, 1162413476, 371534618, 1424033393, 2544013989, 399862425, 2007398811, 987599719, 244151954, 
  c1: 60547311, 3114605181, 2302982681, 2513243313, 560584245, 2572847200, 3903698591, 3621260444, 2118880666, 103308072, 3065387905, 35064384, 
z:
  c0: 1490758054, 1729390021, 1188585995, 416601573, 851875930, 801261935, 2505862590, 990050998, 774293611, 3700995950, 1470808578, 268104591, 
  c1: 334778596, 2987531052, 4094000946, 2653606598, 2870841776, 1235264205, 2792671188, 829949857, 977248672, 3206164523, 4008036371, 213969837, 


g1_add_test-> success
g1_double_test-> success
g1_add_mixed_test-> success
g2_add_test-> success
g2_add_mixed_test-> success
g2_double_test-> success

-O3 flag

$ ./run.sh 
rm -f g1_add *.o
/opt/rocm/hip/bin/hipcc -std=c++14 -O3 -I/opt/include/ -o g1_add g1_add.cu
-----g1_add_test--------
x:695163763, 838428337, 867136025, 3916970060, 1083605276, 2882035772, 3603006931, 2269309842, 422274527, 1169772790, 1990394245, 416975321, 
y:1229022948, 3366429108, 670218974, 1658335027, 392632874, 1379067484, 798160530, 3656524164, 3793686573, 2144155088, 2721370348, 298035558, 
z:413203031, 3318893592, 1282426328, 1145762026, 1542369093, 485346739, 1679000480, 4026228341, 2371190916, 3558967189, 3094593878, 414846589, 


-----g1_double_test--------
x:2713411487, 2426798798, 2080335201, 0, 2103612646, 3524251821, 2710683041, 2873179322, 1158341063, 2066016386, 3710732705, 142415244, 
y:3518176611, 3120496636, 2975072255, 514588670, 4138792484, 2487468688, 355143834, 2129682019, 1822889211, 2468734455, 964683415, 436277738, 
z:459531309, 2907844289, 4049224835, 0, 4006033233, 2142954301, 1167773149, 0, 2627347026, 2069553728, 951579129, 0, 


-----g1_add_mixed_test--------
x:2210944813, 3909903087, 603786564, 3883152423, 2173389250, 2956470872, 809479795, 2862133481, 1088430826, 3104844840, 44451588, 226701902, 
y:2426309915, 1558163374, 4125809737, 2860931858, 1688959796, 3877074395, 4208292625, 4034170426, 2725679345, 1152044552, 819326913, 253933226, 
z:196605, 1980301312, 3289120770, 3958636555, 1405573306, 1598593111, 1884444485, 2010011731, 2723605613, 1543969431, 4202751123, 368467651, 


-----g2_add_test--------
x:
  c0: 707082032, 2905540086, 2012561348, 1244716469, 3369864621, 1250858003, 1734953183, 2473207636, 4118007264, 3088542264, 2825020113, 157586477, 
  c1: 2052660064, 1054054625, 181907587, 3305410196, 1752922361, 1253055382, 401216195, 840442564, 3549571258, 3676157395, 2819154801, 160825318, 
y:
  c0: 3661892809, 1964288723, 2930949208, 1029177341, 3982617672, 3462505793, 1317959127, 456557484, 3352912525, 1041639032, 3864055902, 0, 
  c1: 3968149393, 155696932, 886518191, 0, 872114755, 1403314990, 888630324, 1685539717, 972278977, 1417881456, 2549162979, 0, 
z:
  c0: 1665836619, 2192807297, 2448451751, 514588671, 4221093372, 1731252895, 4085584575, 445821968, 1129032920, 3905432842, 964683418, 0, 
  c1: 2407720837, 3279300024, 919738785, 0, 3139694966, 1109163563, 4180549552, 0, 553857298, 3927224864, 1397668013, 60142698, 


-----g2_add_mixed_test--------
x:
  c0: 21845, 1311860077, 4284875367, 2022857026, 621041199, 1516893740, 1277070405, 2954527494, 3807975235, 126314879, 1532758244, 182867985, 
  c1: 2888078089, 1754250062, 2497096416, 987334444, 4019128359, 1917707305, 1473425019, 2982818062, 4058410156, 1867022434, 1989760357, 355929815, 
y:
  c0: 0, 2343470949, 2361446213, 436357850, 2810726791, 2719425411, 1240268329, 4294967295, 708757780, 4070049009, 3892750047, 349450576, 
  c1: 4252398389, 2763252505, 833966575, 4152384109, 3311259530, 2020529306, 622091467, 2323028873, 1794683397, 1513413470, 2227117397, 342556849, 
z:
  c0: 619158072, 2221342387, 1932327520, 2697526636, 75846095, 1026692193, 2348234968, 2081986644, 305681949, 3047311208, 35271182, 280919825, 
  c1: 0, 3908162606, 485828214, 1739249248, 3075111737, 2020432959, 3966748312, 2076918238, 479200887, 3951261620, 1552030366, 0, 


-----g2_double_test--------
x:
  c0: 1213647047, 1292745482, 1107076159, 3584885726, 583855519, 4077584544, 3402348304, 3115677270, 3214569210, 3424077555, 964683416, 0, 
  c1: 1747090166, 2892514921, 3103684097, 0, 1650911837, 1710986809, 1554290611, 0, 3758267726, 533533599, 3512322301, 0, 
y:
  c0: 2853816257, 4208013995, 4009892814, 1091864420, 4041560588, 3812078504, 1000346239, 213515632, 3849423110, 1403730206, 128575529, 137589676, 
  c1: 1556509703, 96956990, 2820191797, 3310922790, 3960973582, 587304306, 4216081535, 4037096771, 1299561063, 1571922778, 3187532784, 288968693, 
z:
  c0: 459509464, 1729390023, 1, 0, 3849858422, 801261936, 0, 0, 1312880962, 4294967295, 1916262545, 0, 
  c1: 3598562837, 2215975725, 3733825069, 1283269862, 2042381408, 336472812, 2831737616, 4043711920, 1346529860, 2384172423, 1559440087, 374832984, 


g1_add_test-> success
g1_double_test-> failed
g1_add_mixed_test-> success
g2_add_test-> failed
g2_add_mixed_test-> failed
g2_double_test-> failed
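
One way to narrow down where the two builds diverge is to compare the printed limbs position by position. The following is a minimal host-side C++ sketch, not code from the repository: `diff_limbs`, `kZ_O0`, and `kZ_O3` are hypothetical names introduced here for illustration, and the values are the z limbs of g1_double_test copied from the -O0 and -O3 outputs above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the repository): returns the indices
// at which two limb sequences of equal length hold different values.
std::vector<std::size_t> diff_limbs(const std::vector<std::uint32_t>& expected,
                                    const std::vector<std::uint32_t>& actual) {
    assert(expected.size() == actual.size());
    std::vector<std::size_t> mismatches;
    for (std::size_t i = 0; i < expected.size(); ++i) {
        if (expected[i] != actual[i]) {
            mismatches.push_back(i);
        }
    }
    return mismatches;
}

// z of g1_double_test as printed by the -O0 build.
const std::vector<std::uint32_t> kZ_O0 = {
    459531309u, 2907844289u, 4049224835u, 4075394049u,
    4006033233u, 2142954301u, 1167773149u, 2944676389u,
    2627347026u, 2069553728u, 951579129u, 432622738u};

// z of g1_double_test as printed by the -O3 build.
const std::vector<std::uint32_t> kZ_O3 = {
    459531309u, 2907844289u, 4049224835u, 0u,
    4006033233u, 2142954301u, 1167773149u, 0u,
    2627347026u, 2069553728u, 951579129u, 0u};
```

For these two arrays, `diff_limbs(kZ_O0, kZ_O3)` returns the indices 3, 7, and 11. Those are exactly the limbs printed as 0 in the -O3 run; the other nine limbs match the -O0 run. The same comparison can be applied to the other coordinates and tests by pasting in their printed values.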
