compiled code is segfaulting #32

Open
orchid-hybrid opened this issue Nov 21, 2014 · 22 comments
Comments

@orchid-hybrid

Hello

I have built UHC from source using GHC on a 64-bit computer:

$ uhc --version
ehc-1.1.7.2, revision master@05180a3aff, timestamp 20141121 +0000 030619

I then compiled the demo with uhc hello.hs. This seems to work and creates output files, but the resulting program crashes when I run it:

(gdb) r
Program received signal SIGSEGV, Segmentation fault.
0x000000000041c748 in gb_InitTables ()
(gdb) bt
#0  0x000000000041c748 in gb_InitTables ()
#1  0x00000000004065a0 in UHC_Base_initModule ()
#2  0x0000000000404935 in main ()

I looked around and found that gb_InitTables is defined in the bytecode interpreter's interpreter.c file, but I don't know what might be going wrong.

@atzedijkstra
Member

Hi, which OS/GHC combination are you running? For example, can you provide the output of ./configure followed by a clean build?

@orchid-hybrid
Author

Hello, I'm using 64-bit Arch Linux with GHC 7.8.3, Shuffle 0.1.3.1 and uuagc 0.9.51.

$ uname -a
Linux arch 3.17.1-1-ARCH #1 SMP PREEMPT Wed Oct 15 15:04:35 CEST 2014 x86_64 GNU/Linux
$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.8.3
$ shuffle --version
0.1.3.1
$ uuagc --version  
Attribute Grammar compiler / HUT project. Version 0.9.51

I can change to any other version if it helps!

$ git clone https://github.com/UU-ComputerScience/uhc --depth 1
$ cd uhc/EHC/
$ ./configure > configure.txt
$ make > make.txt 2>&1

Here are the two files: https://gist.github.com/orchid-hybrid/5df8e23a5d13d6f87db3

If I can give any more useful info just ask!

@atzedijkstra
Member

Hi,

I cannot find anything suspicious in the dumps... The next step is to turn on tracing, which is available only in variant 99 of uhc, so please run:

make 99/ehc
make 99/ehclib
install/99/bin/ehc --gen-trace=1 -Operwholecore hw
./hw

where hw is your hello world program. This should give you an execution trace I'd like to see.

Atze

            - Atze -

Atze Dijkstra, Department of Information and Computing Sciences. /|
Utrecht University, PO Box 80089, 3508 TB Utrecht, Netherlands. / |
Tel.: +31-30-2534118/1454 | WWW : http://www.cs.uu.nl/~atze . /--|
Fax : +31-30-2513971 .... | Email: [email protected] ............... / |___\

@orchid-hybrid
Author

Sure Atze. Here is the trace from the same program demo/hello.hs which again segfaults: https://gist.github.com/orchid-hybrid/b2b1c07b7b58fae0ae9b

@atzedijkstra
Member

Hi,

The internal setup seems OK, but it never gets to run. Further digging needs to be done on your platform (I cannot reproduce it in my environment and I don't have a virtual image for your platform readily available)... If you have time for it, could you please figure out where in main_GB_Run in file src/rts/rts.cc (and/or perhaps in main_GB_Init1) it crashes?

Atze

@orchid-hybrid
Author

Hello,

I didn't see main_GB_Run or main_GB_Init1 in the backtraces, and I'm not sure how to find out which information about the crash would be useful. If there is anything you'd like me to try, I would be happy to.

I did attempt to investigate this further. I tried adding the -g compile flag in various places to get line numbers in the backtrace, but I couldn't work out how to get line numbers for gb_InitTables, which is compiled from interpreter.c:

(gdb) bt
#0  0x00000000004203c2 in gb_InitTables ()
#1  0x00000000004078eb in UHC_Base_initModule (
    modTbl=0x648940 <Main_moduleEntries>, modTblInx=0)
    at install/99/lib/pkg//uhcbase-1.1.7.2/99/bc/plain/UHC/Base.c:57350
#2  0x00000000004058d9 in main (argc=1, argv=0x7fffffffe528)
    at demo/hello.c:834

With layout asm I could see that it crashes on the instruction at +1426:

0x4203c2 <gb_InitTables+1426>   paddq  -0x10(%r8),%xmm1
0x4203c8 <gb_InitTables+1432>   paddq  -0x20(%r8),%xmm0
0x4203ce <gb_InitTables+1438>   movaps %xmm1,-0x10(%rsi)
0x4203d2 <gb_InitTables+1442>   movaps %xmm0,-0x20(%rsi)
0x4203d6 <gb_InitTables+1446>   cmp    %rdi,%rbx
0x4203d9 <gb_InitTables+1449>   jae    0x420471 <gb_InitTables+1601>
0x4203df <gb_InitTables+1455>   movdqa %xmm3,%xmm0
0x4203e3 <gb_InitTables+1459>   jmp    0x420384 <gb_InitTables+1364>
0x4203e5 <gb_InitTables+1461>   lea    0x8(%rbx,%r15,1),%rdx
0x4203ea <gb_InitTables+1466>   mov    %rdx,(%rbx)
0x4203ed <gb_InitTables+1469>   jmpq   0x420110 <gb_InitTables+736>

I tried the build and the hello world demo on my friend's computer, which also runs Arch Linux, and the crash occurred at a different location, +1112:

0x41c748 <gb_InitTables+1112>   paddq  (%r12),%xmm0
0x41c74e <gb_InitTables+1118>   paddq  %xmm5,%xmm1
0x41c752 <gb_InitTables+1122>   add    $0x20,%r12
0x41c756 <gb_InitTables+1126>   paddq  -0x10(%r12),%xmm1
0x41c75d <gb_InitTables+1133>   movaps %xmm0,-0x20(%rcx)
0x41c761 <gb_InitTables+1137>   paddq  %xmm4,%xmm6
0x41c765 <gb_InitTables+1141>   movaps %xmm1,-0x10(%rcx)
0x41c769 <gb_InitTables+1145>   cmp    %r10,%r15
0x41c76c <gb_InitTables+1148>   jae    0x41c7c3 <gb_InitTables+1235>
0x41c76e <gb_InitTables+1150>   movdqa %xmm6,%xmm0
0x41c772 <gb_InitTables+1154>   jmp    0x41c722 <gb_InitTables+1074>

Finally, I tried it on an Ubuntu machine and there was no problem.

So I am a little worried that this is a problem with Arch itself rather than with the compiler. If so, I am very sorry to take up your time on it! I will report back if I find out anything about that.

@atzedijkstra
Member

Hi,

I am afraid that in a similar situation I'd have to fall back on adding tracing printf statements etc. Indeed, installing/running uhc on Ubuntu 32/64 bit is OK for both, as it is on Unix/Linux in general. If you can find the cause, please let me know, but again, I am afraid that it probably is something very silly like a small glitch in the C compiler (this has happened before, where it turned out that an expression had to be rewritten so that gcc would generate different (correct) code).

regards,
Atze

@orchid-hybrid
Author

I see!

I tried adding debug prints to each line of main_GB_Init1 and main_GB_Run in src/rts/rts.cc. Every one of the main_GB_Init1 prints appeared, but none of the main_GB_Run ones did, so I don't think that execution reaches main_GB_Run.

main_GB_Init1 0
main_GB_Init1 1
main_GB_Init1 2
main_GB_Init1 3
main_GB_Init1 4
main_GB_Init1 5
*** module Main
  *** entry UHC.Handle.newEmptyBuffer
...
Segmentation fault (core dumped)
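
For reference, debug prints of this kind could be written roughly as follows. This is only a sketch: format_trace and TRACE_STEP are hypothetical names, not part of the UHC runtime. The explicit flush matters when chasing a segfault, since buffered output can otherwise be lost when the process dies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats one trace line such as "main_GB_Init1 3\n". */
static int format_trace(char *buf, size_t size, const char *func, int step)
{
    assert(buf != NULL && func != NULL);
    return snprintf(buf, size, "%s %d\n", func, step);
}

/* Hypothetical macro: prints the enclosing function name and a step number
 * to stderr, then flushes so the line survives a crash in the next statement. */
#define TRACE_STEP(step)                                                  \
    do {                                                                  \
        char trace_line_[128];                                            \
        format_trace(trace_line_, sizeof trace_line_, __func__, (step)); \
        fputs(trace_line_, stderr);                                       \
        fflush(stderr);                                                   \
    } while (0)
```

Placing TRACE_STEP(0), TRACE_STEP(1), ... between the statements of a function then shows the last step reached before the crash.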

@atzedijkstra
Member

Perhaps comparing these trace outputs as generated on your OS and on (say) Ubuntu will give a clue as to where things break... Your output suggests that setting up the tables for modules etc. is going wrong. 64-bit Linux machines (I think) have roughly the same memory layout, so differences should not be that difficult to spot, apart from obvious offset differences.

@phile314
Member

I ran into a very similar problem today... I upgraded the OS (Linux, Fedora) a few days ago, and since then all produced executables crash. The trace ( https://gist.github.com/phile314/826024d9bd5f72134fa0 ) looks similar to the one already in this bug report.

I am using an amd64 machine. UHC worked on Fedora 20 with gcc 4.8.3; now, with Fedora 21 and gcc 4.9.2, the produced executables crash.

@atzedijkstra we can discuss it at the meeting tomorrow.

@atzedijkstra
Member

I fear that the only way to find out where the problem originates is to have traces from both a working and a non-working OS environment available for comparison. Otherwise it is too much of a needle-in-a-haystack (time-consuming) search. Now that we have at least two known environments distinguishing the working case from the crashing one, would it be doable to set both up as VMs and obtain traces for the same program run? My initial guess is that the problem is not the instruction before the crash but a difference in the setup of the runtime environment, tables etc. It might be that gcc is making something different from the generated code...

A

@phile314
Member

I tried to debug the generated executables, but that hasn't really helped so far... Interestingly, inside the debugger (gdb) the C executable crashes everywhere. I can also see in the debugger that the two versions are not optimized in exactly the same way.

The crash seems to happen at interpreter.c:1887, backtrace:

#0  gb_InitTables (byteCodes=, byteCodesSz=, cafGlEntryIndices=0xd80000 , 
    cafGlEntryIndicesSz=, globalEntries=, globalEntriesSz=, 
    consts=0xd7fbc0 , gcStackInfos=0xc1a9c0 , linkChainInds=0x12b4680, 
    callinfos=0xc20a40 , callinfosSz=2350, functionInfos=0xc1b200 , functionInfosSz=2053, 
    bytePool=0xbf8380  "UHC.Base._'Dict_Constructor", linkChainOffset=122553, impModules=0x12b4680, impModulesSz=0, 
    expNode=0x12b4880 , expNodeSz=604, expNodeOffs=0xd84c40 , 
    modTbl=0xbf7380 , modTblInx=0) at build/99/rts/bc/bc/interpreter.c:1887
#1  0x000000000040793e in UHC_Base_initModule (modTbl=0xbf7380 , modTblInx=0)
    at install/99/lib/pkg//uhcbase-1.1.8.4/99/bc/plain/UHC/Base.c:103203
#2  0x000000000040592c in main (argc=1, argv=0x7fffffffe3f8) at Test.c:890

(To get debug information for the RTS, one can use the --with-gcc-ehc-options=-g configure argument.)

The traces are quite big (~30MB), so opening them in the browser may not be the best idea...
Working f20 trace:
http://files.314.ch/trace_B_f20.txt
Working f20 trace, cut off after where the f21 version crashes:
http://files.314.ch/trace_B_f20_cut.txt
Crashing f21 trace:
http://files.314.ch/trace_B_f21.txt

I also compiled UHC using gcc 3.4 on my notebook, and then the generated executables work.

@Atze Out of curiosity, what GCC version are you using?

@atzedijkstra
Member

Ok,

At the interpreter.c:1887 location I have (a long time ago) inserted code to work around a gcc/Ubuntu bug which I did not understand at the time (and still don't). It might well be that fixing that piece of code solves the problem, as maybe gcc is fixed now. An #ifdef on the compiler version around it, then...

This is the code:

            {
                WPtr p = loc+1;
                int j ; // this must be int, otherwise gcc under ubuntu 11.10 makes following code crash. dont ask why, I do not know
                for ( j = 0 ; j < info ; j++ ) {
                    WPtr pp = &p[j+1];
                    p[j] += (Word)pp ;
                }
            }

Might be that changing 'int j' into 'Word j' fixes it... Just 'guessing'...
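
A compiler-version guard of the kind mentioned above might look like the sketch below. The macro names VERSION_AT_LEAST and GCC_AT_LEAST and the LoopIndex typedef are hypothetical and only illustrate the mechanism; the 4.9 threshold is taken from the versions reported in this thread, and whether switching the index type helps at all is exactly the open guess.

```c
#include <stdint.h>

/* Nonzero when version (maj, min) is at least (want_maj, want_min).
 * Pure integer arithmetic, so it can be used inside #if. */
#define VERSION_AT_LEAST(maj, min, want_maj, want_min) \
    ((maj) > (want_maj) || ((maj) == (want_maj) && (min) >= (want_min)))

/* clang also defines __GNUC__, so it is excluded explicitly. */
#if defined(__GNUC__) && !defined(__clang__)
#  define GCC_AT_LEAST(want_maj, want_min) \
     VERSION_AT_LEAST(__GNUC__, __GNUC_MINOR__, want_maj, want_min)
#else
#  define GCC_AT_LEAST(want_maj, want_min) 0
#endif

/* Hypothetical use: pick the loop index type per compiler version. */
#if GCC_AT_LEAST(4, 9)
typedef uint64_t LoopIndex;   /* stands in for Word */
#else
typedef int LoopIndex;        /* the existing workaround */
#endif
```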

A

@phile314
Member

I will give your suggestion a try next week.

If it doesn't fix the problem, I am probably just going to use an older GCC version. I am not that keen on spending much more time on this right now... It's probably also a good idea to mention this in the README if it stays broken, but let's first see if the fix works.

To summarize what we know right now:
GCC <= 4.8 works
GCC == 4.9 is broken
LLVM/clang may work

@orchid-hybrid
Author

I tried changing int j ; to Word j; in src/rts/bc/interpreter.cc in a fresh git clone, then compiled and tested it. Sorry to say that it still crashes in the same way. I have gcc 4.9.2.

@atzedijkstra
Member

Ok, was just a guess...

I am installing ubuntu as a VM, will try to see what happens there now...

A

@phile314
Member

Not really a solution, but it might be helpful for other people: using LLVM/clang, the generated executables work and don't segfault.

clang version 3.5.0 (tags/RELEASE_350/final)
Target: x86_64-redhat-linux-gnu
Thread model: posix

configure: ./configure --with-gcc=/usr/bin/clang

@orchid-hybrid
Author

I confirm that compiling with clang allows me to build and run hello world successfully with ./install/101/bin/ehc demo/hello.hs.

Thank you!

@ghost

ghost commented Jul 20, 2017

So, about that one switch branch in interpreter.c... something looks really off about all those +1s.

This is a more complete sketch of the situation:

typedef uint64_t Word64;
typedef Word64 Word;
typedef Word *WPtr;
Word info = somehowInitialised();
WPtr loc = somehowInitialised();

*loc = info ;
{
    WPtr p = loc+1;
    int j;
    for (j = 0; j < info; j++) {
        WPtr pp = &p[j+1];
        p[j] += (Word)pp ;
    }
}

After eliminating the typedefs, p, and the array notation, I get the following bit of code:

uint64_t info = somehowInitialised();
uint64_t *loc = somehowInitialised();
*loc = info;
int j;

for (j = 0; j < info; j++) {
    uint64_t *pp = loc + 1 + j + 1; // (sic)
    *(loc + 1 + j) += (uint64_t)pp;
}

I get the impression that the extra + 1 (from p[j+1]) should not be there. However, I don't know where to find documentation on the meaning of GB_LinkChainKind_Offsets, nor are the variable names of any help, so I can't really check it. I've tried a few permutations of solving possible off-by-one errors here, but none of them resolve the fault.
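
The two forms can be checked against each other mechanically. Below is a minimal sketch, assuming a 64-bit Word as in the typedefs above. The function names relocate_original and relocate_flat are made up for illustration, and the cast on j is added only to avoid a signed/unsigned comparison warning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Word;
typedef Word *WPtr;

/* The loop as quoted from interpreter.c. */
static void relocate_original(WPtr loc, Word info)
{
    assert(loc != NULL);
    *loc = info;
    WPtr p = loc + 1;
    int j;
    for (j = 0; (Word)j < info; j++) {
        WPtr pp = &p[j + 1];
        p[j] += (Word)pp;
    }
}

/* The same loop with the typedefs, p, and the array notation eliminated. */
static void relocate_flat(uint64_t *loc, uint64_t info)
{
    assert(loc != NULL);
    *loc = info;
    int j;
    for (j = 0; (uint64_t)j < info; j++) {
        uint64_t *pp = loc + 1 + j + 1;
        *(loc + 1 + j) += (uint64_t)pp;
    }
}
```

Running both on the same (properly aligned) buffer from the same starting contents should leave identical results: each of the info words after the count gets the address of its successor added to it.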

@ghost

ghost commented Jul 20, 2017

Well, I think I found the problem after some debugging: gb_InitTables exhibits undefined behaviour. In particular, some part of UHC commits the following sin (says C99 in Appendix J.2):

Conversion between two pointer types produces a result that is incorrectly aligned (6.3.2.3).

The segfault occurs in a PADDQ instruction, which requires 128-bit alignment for memory operands. When I disable loop vectorisation for gb_InitTables (by adding __attribute__((optimize("no-tree-vectorize"))) to gb_InitTables, no other source changes), the problem disappears: GCC now emits code that uses ADD, which has no alignment requirements -- unless asked. And when I enable alignment checking, the program crashes with a bus error:

(gdb) b gb_InitTables 
Breakpoint 1 at 0xc5c70: file build/101/rts/bc/bc/interpreter.c, line 1799.
(gdb) run
Starting program: /tmp/a 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Breakpoint 1, gb_InitTables (byteCodes=0x555555856800 <UHC_Base_bytecode> "\376\377\341\002", byteCodesSz=238997, 
    cafGlEntryIndices=0x555555890da0 <UHC_Base_cafGlEntryIndices>, cafGlEntryIndicesSz=324, 
    globalEntries=0x5555558a4060 <UHC_Base_globalEntries>, globalEntriesSz=1642, consts=0x5555558a73c0 <UHC_Base_constants>, 
    gcStackInfos=0x555555891b20 <UHC_Base_gcStackInfos>, linkChainInds=0x555555a52040 <Unsafe_Coerce_expNode_size>, 
    callinfos=0x5555558969e0 <UHC_Base_callinfos>, callinfosSz=2193, functionInfos=0x555555892340 <UHC_Base_functionInfos>, 
    functionInfosSz=1642, bytePool=0x555555836d40 <UHC_Base_bytePool> "UHC.Base.primAsinDouble", linkChainOffset=2, 
    impModules=0x555555a52040 <Unsafe_Coerce_expNode_size>, impModulesSz=0, expNode=0x555555a52200 <UHC_Base_expNode>, 
    expNodeSz=529, expNodeOffs=0x5555558912c0 <UHC_Base_expNode_offs>, modTbl=0x555555836260 <a_moduleEntries>, modTblInx=0)
    at build/101/rts/bc/bc/interpreter.c:1799
1799	{
(gdb) set $ps |= (1<<18)
(gdb) continue
Continuing.

Program received signal SIGBUS, Bus error.
gb_InitTables (byteCodes=0x555555856800 <UHC_Base_bytecode> "\376\377\341\002", byteCodesSz=<optimized out>, 
    cafGlEntryIndices=0x555555890da0 <UHC_Base_cafGlEntryIndices>, cafGlEntryIndicesSz=324, globalEntries=<optimized out>, 
    globalEntriesSz=<optimized out>, consts=0x5555558a73c0 <UHC_Base_constants>, 
    gcStackInfos=0x555555891b20 <UHC_Base_gcStackInfos>, linkChainInds=0x555555a52040 <Unsafe_Coerce_expNode_size>, 
    callinfos=0x5555558969e0 <UHC_Base_callinfos>, callinfosSz=2193, functionInfos=0x555555892340 <UHC_Base_functionInfos>, 
    functionInfosSz=1642, bytePool=0x555555836d40 <UHC_Base_bytePool> "UHC.Base.primAsinDouble", linkChainOffset=2, 
    impModules=0x555555a52040 <Unsafe_Coerce_expNode_size>, impModulesSz=0, expNode=0x555555a52200 <UHC_Base_expNode>, 
    expNodeSz=529, expNodeOffs=0x5555558912c0 <UHC_Base_expNode_offs>, modTbl=0x555555836260 <a_moduleEntries>, modTblInx=0)
    at build/101/rts/bc/bc/interpreter.c:1817
1817			FunctionInfo_Inx off = callinfos[i].functionInfoModOff ;

So, yeah... UHC doesn't properly align its pointers.
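
The set $ps |= (1<<18) step sets the AC (alignment check) flag in EFLAGS. The same condition can also be tested in C without touching CPU flags. A minimal sketch, where is_aligned is a hypothetical helper and not something in the UHC sources:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero when address p is a multiple of `alignment` bytes.
 * `alignment` must be a nonzero power of two. */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

Calling something like is_aligned(ptr, sizeof(uint64_t)) on a pointer before it is dereferenced as a 64-bit word would report whether the alignment the type promises actually holds at run time.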

(To note: /tmp/a was compiled from a source file containing main :: IO (); main = return () by ehc-1.1.9.6, revision master@6eb59da933, timestamp 20170720 +0000 215924.)

@atzedijkstra
Member

atzedijkstra commented Jul 21, 2017 via email

@ghost

ghost commented Jul 22, 2017

128-bit alignment is enforced by the compiler based on the types the programmer promises.

The PADDQ loop is only the main course of the whole copy buffet. The aperitif makes sure the Word-aligned pointer (64-bit on my machine) is advanced to a 128-bit aligned pointer. The dessert is Duff's device.

The problem is in the assumptions that underlie this line:

WPtr loc = (WPtr)(&byteCodes[ linkChainOffset ]) ;

Note that byteCodes has type GP_BytePtr. It is 8-bit aligned, with no further promises. Unless the bytecode language has syntactic restrictions on alignment (e.g. JVM bytecode has this for jump tables) and you adjust the pointer to have the right alignment, any intended Word-sized memory access has to be implemented explicitly as Byte-sized accesses. Upcasting a byte pointer to a word pointer is asking for problems, because the compiler will likely serve you the wrong aperitif.
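
One way to honour that constraint is to go through memcpy, which carries no alignment requirement. The sketch below assumes Word is a 64-bit unsigned integer; load_word, store_word and relocate_offsets are hypothetical names that do not exist in UHC, and this is not a tested patch against interpreter.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t Word;

/* Read a Word from a byte address that carries no alignment promise. */
static Word load_word(const uint8_t *p)
{
    Word w;
    memcpy(&w, p, sizeof w);
    return w;
}

/* Write a Word to a byte address that carries no alignment promise. */
static void store_word(uint8_t *p, Word w)
{
    memcpy(p, &w, sizeof w);
}

/* Same effect as the quoted loop, but addressed in bytes: store the count
 * at loc, then add to each of the following `info` words the address of
 * the word right after it. */
static void relocate_offsets(uint8_t *loc, Word info)
{
    assert(loc != NULL);
    store_word(loc, info);
    uint8_t *p = loc + sizeof(Word);
    for (Word j = 0; j < info; j++) {
        uint8_t *slot = p + j * sizeof(Word);
        uint8_t *next = slot + sizeof(Word);
        store_word(slot, load_word(slot) + (Word)(uintptr_t)next);
    }
}
```

Because the pointers stay byte pointers, the compiler is no longer told that loc is Word-aligned, so any vectorised code it emits cannot rely on that assumption. Fixed-size memcpy calls like these are commonly compiled to plain loads and stores on x86-64, so the cost is expected to be small, though that has not been measured here.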
