seg fault after bullet sample completes #2

Open
hdiedrich opened this issue Apr 21, 2014 · 26 comments

@hdiedrich

erlc -o examples/bullet_engine examples/bullet_engine/*.erl
cd examples/bullet_engine && ./start.sh
Erlang/OTP 17 [erts-6.0] [source-07b8f44] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

Eshell V6.0  (abort with ^G)
1> ./start.sh: line 2:  9736 Segmentation fault: 11  erl +stbt db -pa ../../ebin -eval "bullet_engine:run()."
make[1]: *** [bullet] Error 139
make: *** [all] Error 2
@essen
Member

essen commented Apr 21, 2014

Yeah then it's definitely to do with GC order. I guess Linux doesn't care about it but OSX does.

@essen
Member

essen commented Apr 27, 2014

I've been looking into this. On Linux I get the renderer destroyed before the window, so no problem. Can you add a printf in both dtor functions of c_src/sdl_renderer.c and c_src/sdl_window.c and tell me in what order they are called? If this is the actual issue then I have a fix in mind. Otherwise, heh, I'll need an OSX to test.

@essen
Member

essen commented May 1, 2014

I believe this one should be fixed in the newest commit.

@brainstormi

Unfortunately, the issue persists on OSX 10.9.

@essen
Member

essen commented Jul 23, 2015

I have an OSX around in a VM now, I will try when I can access it.

@essen
Member

essen commented Jul 23, 2015

Well I fail to even compile it. I will get back to you.

@essen
Member

essen commented Jul 23, 2015

I can reproduce. I will get you something during the day.

@essen
Member

essen commented Jul 23, 2015

My problem might be because SMP was not enabled (trying to add a second core to the VM, but OSX doesn't seem to like that...). Do you have SMP enabled?

@essen
Member

essen commented Jul 23, 2015

OK the SMP issue I could fix by adding -smp enable. And now I finally could observe the issue. I'll work on a fix when possible.

@brainstormi

Yes, SMP is enabled when the issue occurs... Thanks for your support. I
would like to ask you some questions about your thinking behind the
project, but I'm not sure this is the best place to discuss it...

@essen
Member

essen commented Jul 23, 2015

Feel free to open a new ticket, tickets are fine for discussions. :-)

@essen essen added the Bug label Aug 15, 2015
@brainstormi

The issue persists with the esdl2 master branch on OSX... Segfault when the demo ends:
./start.sh: line 2: 11970 Segmentation fault: 11 erl -smp enable +stbt db -pa ../../ebin -eval "bullet_engine:run()."
make: *** [bullet_engine] Error 139

@essen
Member

essen commented Dec 13, 2015

Yep I confirm this both on Windows and OSX. Fine on Linux.

@brainstormi

I'm trying to get some insight into this issue, but setting up even a basic debugging environment for Erlang + NIFs on OSX/Windows is giving me more than a headache.
Any tips on how to debug this?

I made some progress on the OSX side; setting up a debug-enabled emulator seems a bit easier there than on Windows, following the instructions at:
http://www.erlang.org/doc/installation_guide/INSTALL.html
The problem is that launching, for example, the "hello_sdl" demo with it crashes right at the start while trying to render the texture, producing the following stack:

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_platform.dylib 0x00007fff93f2fd06 _platform_bzero$VARIANT$Merom + 22
1 libsystem_c.dylib 0x00007fff89f91a6b __memset_chk + 22
2 beam.debug.smp 0x0000000015e71fb3 debug_free + 147 (erl_alloc.c:4032)
3 beam.debug.smp 0x0000000015e66001 erts_free + 97 (erl_alloc.h:254)
4 beam.debug.smp 0x0000000016058a7c enif_free + 28 (erl_nif.c:245)
5 esdl2.so 0x0000000016627812 thread_render_copy + 82
6 esdl2.so 0x0000000016623723 nif_thread_handle + 99
7 esdl2.so 0x0000000016623816 nif_main_thread + 70
8 beam.debug.smp 0x000000001608d807 erts_sys_main_thread + 359 (sys.c:3311)
9 beam.debug.smp 0x0000000015e9e7fa erl_start + 14538 (erl_init.c:2166)
10 beam.debug.smp 0x0000000015e06402 main + 34 (erl_main.c:30)
11 libdyld.dylib 0x00007fffa02625ad start + 1

Launching it with the standard emulator, it crashes at the end, as always... but the crash seems to be associated with a missing texture? That is only my assumption from looking at the stack, because I can't trace anything :(

Thread 15 Crashed:: 2_scheduler
0 libGL.dylib 0x00007fff9d8134bd glDeleteTextures + 18
1 libSDL2-2.0.0.dylib 0x0000000015d4298a GL_DestroyTexture + 54
2 libSDL2-2.0.0.dylib 0x0000000015d39578 SDL_DestroyTexture_REAL + 164
3 esdl2.so 0x00000000141aa55c dtor_Texture + 28
4 beam.smp 0x0000000013cdb392 nif_resource_dtor + 98

Your last attempt to fix the GC order between Renderer and Window, using enif_keep_resource, enif_release_resource and the resource dependency macros you integrated, looks good... Could this be the same problem, but this time in the GC order between Renderer and Texture? I'm only thinking out loud, shooting at anything that moves...
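
To make the dependency idea concrete, here is a minimal standalone sketch in plain C. It does not use erl_nif.h and is not esdl2's actual code: the `resource` struct, `res_new`, `res_release` and `dtor_log` are hypothetical names that only model the keep/release counting that enif_keep_resource and enif_release_resource provide. The point it illustrates is that if a child (Texture) holds a reference on its parent (Renderer), the parent's destructor cannot run first, no matter which handle is dropped first.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a NIF resource (not the erl_nif API). */
typedef struct resource {
    const char *name;
    int refs;                  /* reference count */
    struct resource *parent;   /* must outlive this resource, or NULL */
} resource;

/* Records the order in which destructors ran, e.g. "Texture;Renderer;". */
char dtor_log[128];

/* Create a resource with one reference. A child "keeps" its parent,
 * which models calling enif_keep_resource on the parent. */
resource *res_new(const char *name, resource *parent) {
    resource *r = malloc(sizeof *r);
    if (r == NULL) abort();
    r->name = name;
    r->refs = 1;
    r->parent = parent;
    if (parent != NULL) parent->refs++;
    return r;
}

/* Drop one reference. The destructor runs only when the count reaches
 * zero, and only then is the reference on the parent released. */
void res_release(resource *r) {
    resource *parent;
    size_t used;

    assert(r->refs > 0);
    if (--r->refs > 0) return;

    fprintf(stderr, "dtor_%s\n", r->name);   /* printf-style tracing */
    used = strlen(dtor_log);
    if (used + strlen(r->name) + 2 <= sizeof dtor_log) {
        strcat(dtor_log, r->name);
        strcat(dtor_log, ";");
    }

    parent = r->parent;
    free(r);
    if (parent != NULL) res_release(parent);
}
```

Creating a "Renderer", then a "Texture" with the renderer as parent, and releasing the renderer first leaves the renderer alive; it is destroyed only after the texture's destructor has run.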

@essen
Member

essen commented Dec 17, 2015

This would be my guess, yes. Especially considering the second stack trace you just gave.

I don't really know how to get proper debug info other than going through the docs, looks like you are already more equipped than me. Last time I believe it was half guess half printf debugging.

@brainstormi

After some more tests, I'm more confused than before... I'm no longer sure it's a GC order issue; it is probably some kind of timing/synchronization issue instead.
I was able to "debug" it on OSX using fprintf(stderr,XXX)... but the GC order seems to be OK, because dtor_Renderer isn't called before dtor_Texture and/or the crash. On top of that, with a debug beam emulator the crash happens at the start of the demo, inside the rendering loop...
On Windows, I can't get printf output redirection to work properly, but I was able to compile and debug the DLL in Visual Studio 2015 by attaching the debugger to the running process... To make things more complicated, there the debug session hangs forever when dtor_Window is invoked, so I can't reproduce the segmentation fault in debug. The funny thing is that GC order and the dtor_## functions seem to work properly on Windows: dtor_surface, dtor_texture and dtor_renderer are invoked without issues when closing.
Obviously my limited C skills don't help with this.

@essen
Member

essen commented Dec 21, 2015

A few tests you can do:

Try creating a window and do nothing else (no loop, no call, just the process exiting).

Do the same with different window options.

Do the same with a renderer.

Do the same with texture.

Add a loop.

Etc.

@brainstormi

On Windows, just creating a simple window makes it hang indefinitely when dtor_Window is invoked on exit... It doesn't matter whether the DLL is cross-compiled with Msys2 or built with VS2015.

@essen
Member

essen commented Dec 21, 2015

Do you have a stacktrace of the crash on Windows? Is it any different from OSX?

@brainstormi

Unfortunately I'm not able to reproduce the segfault anymore... In the beginning, the usual behaviour was that launching it (the hello_sdl demo) after a clean compilation produced this same freeze when trying to close the window; subsequent runs produced the segfault.

@brainstormi

Excuse me, my fault, I know what is happening with the freeze... I had replaced SDL2.dll with the dev one in order to debug the issue, and that was causing the freeze. With the release version of SDL2.dll it segfaults again on Windows.

@brainstormi

I spoke too soon... the freeze persists when exiting... It seems that when the segfault does occur it could be in dtor_texture... but now, most of the time that doesn't happen and it only freezes calling dtor_window... Sorry, no stacktrace: the Erlang VM and the window just sit there forever until you kill the window. As mentioned, the behaviour is the same if you only create a window and poll SDL events in a loop.

@brainstormi

OK, I have one last theory about what could be happening here... SDL_Destroy## functions are not thread-safe calls and SDL's multi-threading support varies between OS implementations, so as a rule of thumb: SDL is not thread safe, full stop. It seems they are executed outside the main thread in the esdl2 NIF, which would explain why different dtor_## functions show different behaviour on different OSes (dtor_Texture segfaults on OSX, while dtor_window freezes on Windows). It would also explain the random behaviour on Windows, alternating between freezes and segfaults.
How does that sound? I'll need to learn a bit about C macros and NIFs before I can test it. I'd better get some sleep.
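
If that theory holds, one possible shape for a fix is to have the destructors hand their work to the thread that owns the SDL state instead of calling SDL_Destroy## directly. The following is only a rough sketch of that pattern in plain C with pthreads. It is not how esdl2 implements its NIF thread, and `job`, `post_destroy`, `drain_destroy_queue` and `mark_destroyed` are made-up names for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_MAX 16

/* Hypothetical destroy job: a function plus the object it should free. */
typedef struct {
    void (*destroy)(void *);
    void *obj;
} job;

static job queue[QUEUE_MAX];
static int queue_len = 0;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called from any thread (for example a resource destructor running on a
 * scheduler thread): record the work instead of touching SDL directly.
 * Returns 0 on success, -1 if the queue is full. */
int post_destroy(void (*destroy)(void *), void *obj) {
    int rc = -1;
    assert(destroy != NULL);
    pthread_mutex_lock(&queue_lock);
    if (queue_len < QUEUE_MAX) {
        queue[queue_len].destroy = destroy;
        queue[queue_len].obj = obj;
        queue_len++;
        rc = 0;
    }
    pthread_mutex_unlock(&queue_lock);
    return rc;
}

/* Called only from the thread that owns the windowing state: run the
 * pending jobs in the order they were posted. Returns how many ran. */
int drain_destroy_queue(void) {
    job pending[QUEUE_MAX];
    int n, i;
    pthread_mutex_lock(&queue_lock);
    n = queue_len;
    for (i = 0; i < n; i++) pending[i] = queue[i];
    queue_len = 0;
    pthread_mutex_unlock(&queue_lock);
    /* Run the jobs outside the lock so they may post further jobs. */
    for (i = 0; i < n; i++) pending[i].destroy(pending[i].obj);
    return n;
}

/* Demo destructor: flags the object as destroyed. A real one would call
 * something like SDL_DestroyTexture here. */
void mark_destroyed(void *obj) { *(int *)obj = 1; }
```

In a real NIF the job would have to carry the underlying SDL pointer rather than the resource itself, since the resource memory is released once its destructor returns.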

@essen
Member

essen commented Dec 22, 2015

Makes sense.

@essen
Member

essen commented Dec 22, 2015

That's why we have a thread in the first place.

@quantumproducer

I see the same crash, but also it will crash if I click away and the window loses focus.


4 participants