compiled code is segfaulting #32
Hi, on what OS/GHC combination are you running? Can you provide the output of ./configure followed by a clean build? |
Hello, I'm using 64-bit Arch Linux with GHC 7.8.3, Shuffle 0.1.3.1 and uuagc 0.9.51.
I can change to any other version if it helps!
Here are the two files: https://gist.github.com/orchid-hybrid/5df8e23a5d13d6f87db3 If I can give any more useful info just ask! |
Hi, I cannot find anything suspicious in the dumps... A next step is to turn on tracing, which is available only in variant 99 of uhc, so please: make 99/ehc where hw is your hello world program. This should give you an execution trace I'd like to see. Atze
Atze Dijkstra, Department of Information and Computing Sciences. /| |
Sure Atze. Here is the trace from the same program demo/hello.hs which again segfaults: https://gist.github.com/orchid-hybrid/b2b1c07b7b58fae0ae9b |
Hi, internal setup seems ok but it never gets to run. Further digging needs to be done on your platform (as I cannot reproduce it in my environment and I don't have a virtual image for your platform readily available)... If you have time for it, could you please figure out where in main_GB_Run in file src/rts/rts.cc (and/or perhaps in main_GB_Init1) it crashes? Atze
Atze Dijkstra, Department of Information and Computing Sciences. /| |
Hello, I didn't see main_GB_Run or main_GB_Init1 in the backtraces and I'm not sure how to find out the information about the crash that would be useful. If there is anything you'd like me to try I would be happy to. I did attempt to investigate this more, I tried adding
With
I tried the build and hello world demo on my friend's computer, who also uses Arch Linux, and the crash occurred at a different location +1112:
Finally I tried it on an Ubuntu machine and there was no problem. So I am a little worried this is a problem with Arch itself rather than the compiler. If so I am very sorry to take up your time on it! I will report back if I find anything about that. |
Hi, I am afraid that in a similar situation I'd have to fall back on adding tracing printf statements etc. Indeed installing/running uhc on Ubuntu 32/64 bit is ok in both cases, as it is on unix/linux in general. If you can find the cause, please let me know, but again, I am afraid that it is probably something very silly like a small glitch in the C compiler (this has happened before, where it turned out that an expression had to be rewritten so that gcc would generate different (correct) code). regards,
Atze Dijkstra, Department of Information and Computing Sciences. /| |
I see! I tried adding debug prints to each line of
|
Perhaps comparing these trace outputs as generated from your OS and (say) Ubuntu will give a clue as to where things break... Your output suggests that setting up tables for modules etc. is going wrong. 64-bit Linux machines (I think) have roughly the same memory layout, so differences should not be that difficult to spot, apart from obvious offset differences. |
I ran into a very similar problem today... I upgraded the OS (Linux, Fedora) a few days ago, and since then all produced executables crash. The trace ( https://gist.github.com/phile314/826024d9bd5f72134fa0 ) looks similar to the one already in this bug report. I am using an amd64 machine. UHC worked on Fedora 20 with gcc 4.8.3; now with Fedora 21 and gcc 4.9.2 the produced executables crash. @atzedijkstra we can discuss it at the meeting tomorrow. |
I fear that the only way to find out where the problem originates is to have traces from both a working and a non-working OS environment available for comparison. Otherwise it is too much of a needle-in-a-haystack (time-consuming) search. Now that we have at least two known environments distinguishing the working case from the crash, would it be doable to set both up as a VM and obtain traces for the same program run? My initial guess is that the problem is not the instruction before the crash but a difference in the setup of the runtime environment, tables etc. It might be that gcc is making something different of the generated code... A
Atze Dijkstra, Department of Information and Computing Sciences. /| |
I tried to debug the generated executables, but that hasn't really helped so far.... Interestingly, inside the debugger (gdb) the C executable crashes everywhere. I can also see in the debugger that the two versions are not optimized in exactly the same way. The crash seems to happen at interpreter.c:1887, backtrace:
#0 gb_InitTables (byteCodes=, byteCodesSz=, cafGlEntryIndices=0xd80000 , cafGlEntryIndicesSz=, globalEntries=, globalEntriesSz=, consts=0xd7fbc0 , gcStackInfos=0xc1a9c0 , linkChainInds=0x12b4680, callinfos=0xc20a40 , callinfosSz=2350, functionInfos=0xc1b200 , functionInfosSz=2053, bytePool=0xbf8380 "UHC.Base._'Dict_Constructor", linkChainOffset=122553, impModules=0x12b4680, impModulesSz=0, expNode=0x12b4880 , expNodeSz=604, expNodeOffs=0xd84c40 , modTbl=0xbf7380 , modTblInx=0) at build/99/rts/bc/bc/interpreter.c:1887
#1 0x000000000040793e in UHC_Base_initModule (modTbl=0xbf7380 , modTblInx=0) at install/99/lib/pkg//uhcbase-1.1.8.4/99/bc/plain/UHC/Base.c:103203
#2 0x000000000040592c in main (argc=1, argv=0x7fffffffe3f8) at Test.c:890
(To get debug information for the RTS, one can use the --with-gcc-ehc-options=-g configure argument.)
The traces are quite big (~30MB), so opening them in the browser may not be the best idea... I also compiled UHC using gcc 3.4 on my notebook, and then the generated executables work. @Atze Out of curiosity, what GCC version are you using? |
Ok, on the interpreter.c:1887 location I have (long time ago) inserted code to work around a gcc/ubuntu bug which I did not understand at the time (still don't). Might well be that fixing that piece of code could solve the problem as maybe now gcc is fixed. An #ifdef on the compiler version around it then... This is the code:
Might be that changing 'int j' into 'Word j' fixes it... Just 'guessing'... A
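The "#ifdef on the compiler version" idea mentioned above could look roughly like the following. This is only a sketch under stated assumptions, not the code from interpreter.c: Word, GCC_VERSION, LoopIndex and sum_table are hypothetical names, and the 4.9 threshold is a guess based on the Fedora 21 report in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the RTS word type (assumed pointer-sized). */
typedef uintptr_t Word;

/* Encode the GCC version as one number, e.g. 4.9.2 becomes 40902.
   clang also defines __GNUC__, so it is excluded explicitly. */
#if defined(__GNUC__) && !defined(__clang__)
#define GCC_VERSION (__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
#else
#define GCC_VERSION 0
#endif

/* Assumption: newer gcc gets the word-sized index, everything else keeps int. */
#if GCC_VERSION >= 40900
typedef Word LoopIndex;
#else
typedef int LoopIndex;
#endif

/* Hypothetical loop using the version-dependent index type. */
Word sum_table(const Word *table, size_t n) {
    assert(table != NULL || n == 0);
    Word total = 0;
    for (LoopIndex j = 0; (size_t)j < n; j++) {
        total += table[j];
    }
    return total;
}
```

Such a guard only changes which code the compiler sees; if the underlying issue is undefined behaviour, it would hide the symptom rather than fix it.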
Atze Dijkstra, Department of Information and Computing Sciences. /| |
I will give your suggestion a try next week. If it doesn't fix the problem, I am probably just going to use an older GCC version. I am not that keen To summarize what we know right now: |
I tried changing |
Ok, was just a guess... I am installing ubuntu as a VM, will try to see what happens there now... A
Atze Dijkstra, Department of Information and Computing Sciences. /| |
Not really the solution, but might be helpful for other people. Using LLVM/clang the generated executables work and don't segfault. clang version 3.5.0 (tags/RELEASE_350/final) configure: ./configure --with-gcc=/usr/bin/clang |
I confirm that compiling with clang allows me to build and run hello world successfully with Thank you! |
So, about that one switch branch in This is a more complete sketch of the situation:
After eliminating the typedefs,
I get the impression that the extra |
Well, I think I found the problem after some debugging:
The segfault occurs in a
So, yeah... UHC doesn't properly align its pointers. (To note: |
Alignment may well be the problem. For the 64-bit backend all size-dependent codegen and/or C macros assume 64-bit alignment, not 128-bit. It will be in internal tables (for initializing interpreter info) holding 64-bit values (likely pointers) where 64-bit alignment might have to be enforced. It may also be that you have to look at the codegen part which generates these tables; internal alignment might also have to be 128-bit (currently alignment is enforced at the same size as the word size in use).
I am afraid I can (at this moment) not help you (am on vacation, and have no replica of your environment).
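Enforcing alignment inside a generated table usually comes down to rounding each offset up to a multiple of the required alignment. A minimal sketch of that arithmetic, assuming power-of-two alignments; align_up and entries_offset are hypothetical helpers and not part of the UHC code generator:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Round offset up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t offset, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical layout: a header of header_bytes bytes followed by 64-bit
   entries. Returns the offset at which the first entry may start. */
static size_t entries_offset(size_t header_bytes) {
    return align_up(header_bytes, alignof(uint64_t));
}
```

Passing 16 instead of alignof(uint64_t) would give the 128-bit variant discussed above.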
…--
Atze Dijkstra
On 21 Jul 2017, at 00:30, Stijn van Drongelen ***@***.***> wrote:
Well, I think I found the problem after some debugging: gb_InitTables exhibits undefined behaviour. In particular, this function performs unaligned memory access.
The segfault occurs in a PADDQ instruction, which requires 128-bit alignment for memory operands. When I disable loop vectorisation for gb_InitTables (by adding __attribute__((optimize("no-tree-vectorize"))) to gb_InitTables, with no other source changes), the problem disappears. However, similar problems reappear when I enable alignment checking: the program then crashes with a bus error:
(gdb) b gb_InitTables
Breakpoint 1 at 0xc5c70: file build/101/rts/bc/bc/interpreter.c, line 1799.
(gdb) run
Starting program: /tmp/a
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, gb_InitTables (byteCodes=0x555555856800 <UHC_Base_bytecode> "\376\377\341\002", byteCodesSz=238997,
cafGlEntryIndices=0x555555890da0 <UHC_Base_cafGlEntryIndices>, cafGlEntryIndicesSz=324,
globalEntries=0x5555558a4060 <UHC_Base_globalEntries>, globalEntriesSz=1642, consts=0x5555558a73c0 <UHC_Base_constants>,
gcStackInfos=0x555555891b20 <UHC_Base_gcStackInfos>, linkChainInds=0x555555a52040 <Unsafe_Coerce_expNode_size>,
callinfos=0x5555558969e0 <UHC_Base_callinfos>, callinfosSz=2193, functionInfos=0x555555892340 <UHC_Base_functionInfos>,
functionInfosSz=1642, bytePool=0x555555836d40 <UHC_Base_bytePool> "UHC.Base.primAsinDouble", linkChainOffset=2,
impModules=0x555555a52040 <Unsafe_Coerce_expNode_size>, impModulesSz=0, expNode=0x555555a52200 <UHC_Base_expNode>,
expNodeSz=529, expNodeOffs=0x5555558912c0 <UHC_Base_expNode_offs>, modTbl=0x555555836260 <a_moduleEntries>, modTblInx=0)
at build/101/rts/bc/bc/interpreter.c:1799
1799 {
(gdb) set $ps |= (1<<18)
(gdb) continue
Continuing.
Program received signal SIGBUS, Bus error.
gb_InitTables (byteCodes=0x555555856800 <UHC_Base_bytecode> "\376\377\341\002", byteCodesSz=<optimized out>,
cafGlEntryIndices=0x555555890da0 <UHC_Base_cafGlEntryIndices>, cafGlEntryIndicesSz=324, globalEntries=<optimized out>,
globalEntriesSz=<optimized out>, consts=0x5555558a73c0 <UHC_Base_constants>,
gcStackInfos=0x555555891b20 <UHC_Base_gcStackInfos>, linkChainInds=0x555555a52040 <Unsafe_Coerce_expNode_size>,
callinfos=0x5555558969e0 <UHC_Base_callinfos>, callinfosSz=2193, functionInfos=0x555555892340 <UHC_Base_functionInfos>,
functionInfosSz=1642, bytePool=0x555555836d40 <UHC_Base_bytePool> "UHC.Base.primAsinDouble", linkChainOffset=2,
impModules=0x555555a52040 <Unsafe_Coerce_expNode_size>, impModulesSz=0, expNode=0x555555a52200 <UHC_Base_expNode>,
expNodeSz=529, expNodeOffs=0x5555558912c0 <UHC_Base_expNode_offs>, modTbl=0x555555836260 <a_moduleEntries>, modTblInx=0)
at build/101/rts/bc/bc/interpreter.c:1817
1817 FunctionInfo_Inx off = callinfos[i].functionInfoModOff ;
|
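For readers following the unaligned-access diagnosis, here is a small sketch of how a misaligned pointer can be detected and how a value can be read without relying on alignment. The helper names is_aligned_u64 and load_u64 are hypothetical and do not appear in the UHC runtime.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if p is suitably aligned for a uint64_t access, 0 otherwise. */
static int is_aligned_u64(const void *p) {
    return ((uintptr_t)p % alignof(uint64_t)) == 0;
}

/* Reads a 64-bit value from any address without assuming alignment.
   Copying through memcpy avoids dereferencing a misaligned uint64_t pointer,
   which is undefined behaviour in C. */
static uint64_t load_u64(const void *p) {
    assert(p != NULL);
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

A check like is_aligned_u64 could be placed in an assert at the top of a table-initialisation routine to catch bad inputs before a vectorised loop touches them.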
128-bit alignment is enforced by the compiler based on the types the programmer promises. The The problem is in the assumptions that underlie this line:
Note that |
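One way to make such a type-based promise hold is to declare the storage so that its alignment follows from its type. A minimal sketch, assuming a pointer-sized word; Word, Table, TABLE_WORDS, table_get and is_word_aligned are hypothetical names and not taken from the UHC sources:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the RTS word type. */
typedef uintptr_t Word;
#define TABLE_WORDS 4

/* A union is aligned for its strictest member, so the byte view below
   automatically starts on a Word boundary. */
typedef union {
    unsigned char bytes[TABLE_WORDS * sizeof(Word)];
    Word words[TABLE_WORDS];
} Table;

/* Reads entry i through the Word view of the table. */
static Word table_get(const Table *t, size_t i) {
    assert(i < TABLE_WORDS);
    return t->words[i];
}

/* Returns 1 if p is suitably aligned for a Word access, 0 otherwise. */
static int is_word_aligned(const void *p) {
    return ((uintptr_t)p % alignof(Word)) == 0;
}
```

A plain unsigned char array, by contrast, only guarantees an alignment of 1, so casting its address to a Word pointer promises more than the declaration provides.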
Hello
I have built UHC from source code using GHC on a 64-bit computer.
I tried to compile the demo with
uhc hello.hs
This seems to work and creates output files, but when I run the result it crashes. I looked around and found that the crash is in the bytecode interpreter.c file, but I don't know what might be going wrong.