Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable screen clearing on loadstate, and use Link Time Optimization #176

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Dwedit
Copy link

@Dwedit Dwedit commented Aug 15, 2018

Changes:

  • Do not clear the screen when loading state, now made into a setting.
  • Fix GCC compiler warning in port.h (NOMINMAX now defined as 1)
  • Re-added sanity check to prevent an infinite loop if you run without loading a file first
  • Makefile changes: Use Link-time optimization on GCC, and define __WIN32__ in all MSVC targets

Dwedit added 2 commits August 15, 2018 09:49
Fix GCC compiler warning in port.h (NOMINMAX now defined as 1)
Re-added sanity check to prevent an infinite loop if you run without loading a file first
@orbea
Copy link

orbea commented Aug 15, 2018

I think the sanity check is indicating a deeper problem, maybe in retroarch even. Cores shouldn't need to check if there is content loaded, I believe that is a frontend feature.

Also retroarch -L /usr/lib64/libretro/snes9x_libretro.so results in a crash here.

Edit: Also do we really want to force LTO?

Lastly, the upstream snes9x repo supports libretro so you know.

@Dwedit
Copy link
Author

Dwedit commented Aug 15, 2018

In my testing, I got around a 5% performance boost from LTO. If it's not appropriate to use on non-windows platforms, I can change the makeflie back.

@orbea
Copy link

orbea commented Aug 15, 2018

It was an open question, I'm not what the potential issues could be honestly.

@inactive123
Copy link

The idea is that we merge it first here, then send a PR to upstream.

@Dwedit
Copy link
Author

Dwedit commented Aug 15, 2018

I'm still a little unsure about the __WIN32__ change in the makefile, since I have not tested making MSVC builds with the makefile for the various platforms there. I know that this change may fix MSVC 2010 Windows builds, but I have no idea about compatiblity with Xbox, Xbox 360, or UWP builds.

@inactive123
Copy link

I can try out Xbox and Xbox 360 at least. There were previously issues with it building on MSVC2005 anyway, and original Xbox will require MSVC 2003, so there is still work left to be done anyway.

@orbea
Copy link

orbea commented Aug 15, 2018

The infinite loop is a regression in RetroArch, in RetroArch-1.7.3 it crashes entirely in RetroArch instead. I'll bisect this and get a backtrace later. I suggest fixing this in RetroArch instead of snes9x though.

Edit: I mean sometime tonight or tomorrow morning when my cpu is not busy building libreoffice...

@Dwedit
Copy link
Author

Dwedit commented Aug 16, 2018

Should I resubmit the pull request and include only the screen buffer clear change?

@andres-asm
Copy link

we;re trying to remake upstream friendly here...
tagging @bearoso

@bearoso
Copy link

bearoso commented Aug 16, 2018

Can’t we just reuse the fastsavestates flag to not clear the screen? Otherwise looks fine. LTO should work. In my experience, clang/llvm does a slightly better job optimizing, too—about a 3% advantage.

@Dwedit
Copy link
Author

Dwedit commented Aug 16, 2018

I think it's better to separate them, since in the libretro version, you pretty much never want to clear the screen after a loadstate.

@inactive123
Copy link

@bearoso Can you push these changes upstream? I would then just be able to do a git pull instead to pull in these changes.

@orbea
Copy link

orbea commented Aug 16, 2018

The crash without content is a regression between RetroArch 1.7.1 and 1.7.2 where it then turned into an infinite loop instead of a crash after 1.7.3. I'll bisect it now, but it probably occurs with other cores so it really should be fixed in RetroArch, not snes9x.

Edit: Yes, it happens with older snes9x cores too as well as bsnes-mercury where this was originally fixed for.

@orbea
Copy link

orbea commented Aug 16, 2018

First bad commit for the crash.

commit ad0a36b8259ae75103317852e58faf096f1df0cc
Author: Tatsuya79 <[email protected]>
Date:   Thu Apr 5 00:52:46 2018 +0200

    XMB thumbnails vertical disposition.

:100644 100644 6223deddcf7dffdbeac2f136c92d126f2b13bf6a 48041da022f88154798c003e97426893aba34185 M	config.def.h
:100644 100644 c37036be3d07b46448fc1cd0f6509e581a9f9557 9f0eed8a04bfa355e9865b3611bcff97aa761b71 M	configuration.c
:100644 100644 61909d8a95342772042365b3cc4b39efd8e507c4 b43facf5f6440a394bcb51400c6e57d5ca75dc6d M	configuration.h
:040000 040000 c60a0aee984c848eb06fc9aad22d7a90480b7fdc 85a451311a2a59ebe74b900ac0de5fc919895aab M	intl
:040000 040000 264177f08e23f2a45472d15b26767dad001f347a 91b1bbd5d92d79b45cfc3b36ed2e3cbfec838b0c M	menu
:100644 100644 781fb6d75724b1d5a6b8db04d970ec66a1894d57 0f9c21b9ce42ac7fe9b15b4949f8446a423d0a07 M	msg_hash.h

libretro/RetroArch@ad0a36b

@Tatsuya79 Can you take a look at this?

@orbea
Copy link

orbea commented Aug 16, 2018

And a backtrace using 1.7.3 and the snes9x upstream git master.

Thread 1 "retroarch" received signal SIGSEGV, Segmentation fault.
0x00007ffff364a036 in __strlen_sse2 () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ffff364a036 in __strlen_sse2 () from /lib64/libc.so.6
#1  0x00000000004a1243 in strcpy_alloc ()
#2  0x00000000004a0d9a in set_load_content_info ()
#3  0x00000000004274e8 in core_load_game ()
#4  0x000000000043eee3 in content_init ()
#5  0x0000000000430412 in command_event.localalias ()
#6  0x000000000042be61 in retroarch_main_init ()
#7  0x000000000043cd04 in content_load ()
#8  0x000000000043d0b9 in task_load_content ()
#9  0x000000000043dc1d in task_load_content_callback.constprop ()
#10 0x00000000004263c7 in rarch_main ()
#11 0x00007ffff35bcba7 in __libc_start_main () from /lib64/libc.so.6
#12 0x000000000042335a in _start () at ../sysdeps/x86_64/start.S:120
(gdb) bt full
#0  0x00007ffff364a036 in __strlen_sse2 () from /lib64/libc.so.6
No symbol table info available.
#1  0x00000000004a1243 in strcpy_alloc ()
No symbol table info available.
#2  0x00000000004a0d9a in set_load_content_info ()
No symbol table info available.
#3  0x00000000004274e8 in core_load_game ()
No symbol table info available.
#4  0x000000000043eee3 in content_init ()
No symbol table info available.
#5  0x0000000000430412 in command_event.localalias ()
No symbol table info available.
#6  0x000000000042be61 in retroarch_main_init ()
No symbol table info available.
#7  0x000000000043cd04 in content_load ()
No symbol table info available.
#8  0x000000000043d0b9 in task_load_content ()
No symbol table info available.
#9  0x000000000043dc1d in task_load_content_callback.constprop ()
No symbol table info available.
#10 0x00000000004263c7 in rarch_main ()
No symbol table info available.
#11 0x00007ffff35bcba7 in __libc_start_main () from /lib64/libc.so.6
No symbol table info available.
#12 0x000000000042335a in _start () at ../sysdeps/x86_64/start.S:120
No locals.
(gdb) t a a f

Thread 2 (Thread 0x7fffee5fe700 (LWP 28917)):
#0  0x00007ffff5e10783 in pthread_cond_wait@@GLIBC_2.3.2 ()
   from /lib64/libpthread.so.0

Thread 1 (Thread 0x7ffff7e09ec0 (LWP 28913)):
#0  0x00007ffff364a036 in __strlen_sse2 () from /lib64/libc.so.6

@orbea
Copy link

orbea commented Aug 16, 2018

@Dwedit The infinite loops started in one of your commits.

8cd8e7d3ae07aa772f0d8c09df7ee10290259c65 is the first bad commit
commit 8cd8e7d3ae07aa772f0d8c09df7ee10290259c65
Author: Dwedit <[email protected]>
Date:   Mon May 28 10:54:25 2018 -0500

    Fix a calloc(0) which led to uninitialized data being used later on.

:040000 040000 d7f00f01dd69703ddcd7ef93724558a21f3dda92 710bc1f715e5b4ddff69dee65db0c08aca847dae M	tasks

libretro/RetroArch@8cd8e7d

@orbea
Copy link

orbea commented Aug 16, 2018

I'm going to link this here too since it has a lot of backstory on these issues.

libretro/RetroArch#4493

@Tatsuya79
Copy link

I looked at libretro/RetroArch@ad0a36b but I can't see anything special.

It adds a menu option that maybe triggered some problem?

@orbea
Copy link

orbea commented Aug 16, 2018

I don't see the problem either, but as the parent commit doesn't have the issue it has to be related somehow...

@orbea
Copy link

orbea commented Aug 16, 2018

Maybe this helps, asan with 1.7.3.

$ ./retroarch -L /usr/lib64/libretro/snes9x_libretro.so 
Sound buffer size: 4100 (1025 samples)
=================================================================
==12244==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200004db10 at pc 0x000000559a59 bp 0x7ffdddc09d00 sp 0x7ffdddc09cf8
READ of size 8 at 0x60200004db10 thread T0
    #0 0x559a58 in clone_retro_game_info runahead/copy_load_info.c:38
    #1 0x559a58 in clone_retro_ctx_load_content_info runahead/copy_load_info.c:241
    #2 0x559a58 in set_load_content_info runahead/copy_load_info.c:257
    #3 0x438ca9 in core_load_game /home/orbea/gittings/installed/RetroArch/core_impl.c:294
    #4 0x46310e in content_file_load tasks/task_content.c:626
    #5 0x46310e in content_file_init tasks/task_content.c:807
    #6 0x46310e in content_init tasks/task_content.c:1926
    #7 0x448be1 in event_init_content /home/orbea/gittings/installed/RetroArch/command.c:1214
    #8 0x448be1 in command_event_init_core /home/orbea/gittings/installed/RetroArch/command.c:1272
    #9 0x448be1 in command_event /home/orbea/gittings/installed/RetroArch/command.c:2239
    #10 0x440b70 in retroarch_main_init /home/orbea/gittings/installed/RetroArch/retroarch.c:1336
    #11 0x45d6f7 in content_load tasks/task_content.c:279
    #12 0x45e1bc in task_load_content tasks/task_content.c:874
    #13 0x4605fd in task_load_content_callback tasks/task_content.c:1560
    #14 0x436d42 in rarch_main frontend/frontend.c:125
    #15 0x7f88bf4f1ba6 in __libc_start_main (/lib64/libc.so.6+0x21ba6)
    #16 0x42da49 in _start (/media/gittings/installed/RetroArch/retroarch+0x42da49)

0x60200004db11 is located 0 bytes to the right of 1-byte region [0x60200004db10,0x60200004db11)
allocated by thread T0 here:
    #0 0x7f88c3c29118 in calloc (/usr/lib64/libasan.so.5+0xe9118)
    #1 0x462bfd in content_file_init tasks/task_content.c:800
    #2 0x462bfd in content_init tasks/task_content.c:1926

SUMMARY: AddressSanitizer: heap-buffer-overflow runahead/copy_load_info.c:38 in clone_retro_game_info
Shadow bytes around the buggy address:
  0x0c0480001b10: fa fa fd fa fa fa 00 fa fa fa 00 01 fa fa fd fa
  0x0c0480001b20: fa fa 00 00 fa fa 00 06 fa fa 00 fa fa fa 00 01
  0x0c0480001b30: fa fa 05 fa fa fa fd fa fa fa 00 06 fa fa 04 fa
  0x0c0480001b40: fa fa 04 fa fa fa 05 fa fa fa 05 fa fa fa 04 fa
  0x0c0480001b50: fa fa fd fa fa fa fd fa fa fa fd fd fa fa 05 fa
=>0x0c0480001b60: fa fa[01]fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480001b70: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480001b80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480001b90: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480001ba0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c0480001bb0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==12244==ABORTING

@bearoso
Copy link

bearoso commented Aug 16, 2018

I made most of the changes. We're just going to skip blacking the screen universally, since I don't think it'll affect anything except loading a state while paused (in which case is it really better to have a black screen than an old screen?). It's the port's memory, so it should be the port's decision on what to do with it.

I left out the rom_loaded check since you guys are still discussing it.

@ghost
Copy link

ghost commented Aug 16, 2018

This PR appears to make the core extremely slow. With fastforward on, and both RA+core under ASAN+debug, I only get 75fps. Without the patch I get ~500fps even with ASAN+debug on. Runahead was off in both cases.

@Dwedit
Copy link
Author

Dwedit commented Aug 16, 2018

Which platform and compiler is LTO failing horribly on?

@bearoso
Copy link

bearoso commented Aug 16, 2018

Which platform and compiler is LTO failing horribly on?

Older GCC used to require the optimization flags be on the link stage for LTO, perhaps that's the problem.

@ghost
Copy link

ghost commented Aug 16, 2018

Me? I'm on gcc 8.2.0 on archlinux

@ghost
Copy link

ghost commented Aug 16, 2018

-flto is also on the link stage for me

@bearoso
Copy link

bearoso commented Aug 16, 2018

I meant it needed -O3 et al on the link stage. gcc 8.2 it shouldn't matter, though.

@Dwedit
Copy link
Author

Dwedit commented Aug 16, 2018

It's using -O2 right now, is -O3 any better?

edit: added in the same optimization flag (currently -O2) to the linker flags.

@orbea
Copy link

orbea commented Aug 16, 2018

I get around 180fps with gcc 8.2.0 on Slackware64-current here with or without -flto. The main difference I noticed is that my compressed package is 3 mb larger and the link time is a lot longer.

@bearoso
Copy link

bearoso commented Aug 16, 2018

If anything, LTO should make it smaller. It's normal for the link time to be longer, since all the optimization is moved there instead.

@orbea
Copy link

orbea commented Aug 16, 2018

I checked again, the current master.

PACKAGE NAME:     snes9x-2018.08.15_8064f7f_master-x86_64-1_git
COMPRESSED PACKAGE SIZE:     2.5M
UNCOMPRESSED PACKAGE SIZE:     8.5M

And the parent commit.

PACKAGE NAME:     snes9x-2018.08.16_8db667f_master-x86_64-1_git
COMPRESSED PACKAGE SIZE:     2.3M
UNCOMPRESSED PACKAGE SIZE:     8.4M

Note that both the standalone snes9x and libretro core are contained in this package.

@orbea
Copy link

orbea commented Aug 16, 2018

@Tatsuya79 Apologies, seems I screwed up the first bisect somehow. It as PR libretro/RetroArch#6484.

@orbea
Copy link

orbea commented Aug 16, 2018

Yea, that PR. I found this diff stops the crashes and infinite loop, but it also breaks loading content...

diff --git a/dynamic.c b/dynamic.c
index 4a91c2693b..adb261e0d7 100644
--- a/dynamic.c
+++ b/dynamic.c
@@ -408,8 +408,6 @@ bool init_libretro_sym_custom(enum rarch_core_type type, struct retro_core_t *cu
          if (!lib_path || !lib_handle_p)
 #endif
          {
-            if (!load_dynamic_core())
-               return false;
             lib_handle_local = lib_handle;
          }
 #ifdef HAVE_RUNAHEAD

I made this issue. libretro/RetroArch#7082

@Dwedit
Copy link
Author

Dwedit commented Aug 17, 2018

I just did a quick test, and in the current Git Master version of RetroArch, it runs the dummy core instead of the Snes9x core with no content loaded. So it never reaches the sanity check code.

However, Retroarch is allowing access to the Quick Menu with no content loaded, and is also allowing leaving the menu with no content loaded. It does not do this if you use the Win32 Menu system to load a core rather than the main Load Core menu option.

@orbea
Copy link

orbea commented Aug 18, 2018

@Dwedit After reverting libretro/RetroArch@8cd8e7d I found snes9x does reach the sanity check code and exits without crashing or starting the dummy core, but asan still reveals problems. Other cores still just crash. I'm not sure where in your runahead PR this started, but I would really appreciate if you could help fix it. :)

@Dwedit
Copy link
Author

Dwedit commented Aug 18, 2018

When retroarch is started from the command line, it appears to not have the dummy core set without content.

edit:

  • Loading core from UI menu: Does not resolve symbols, stays with initially loaded dummy core, allows exiting the menu
  • Loading core from WIN32 GUI: Does not resolve symbols, stays with initially loaded dummy core, does not allow exiting the menu
  • Loading core from command line: Resolves symbols, calls retro_run without content loaded.

@orbea
Copy link

orbea commented Aug 21, 2018

I just want to point out that the issue with cores crashing and the infinite loop are mostly solved, there are some core bugs remaining, but as far as snes9x is concerned its fixed.

@inactive123
Copy link

Can this be merged then?

@inactive123
Copy link

Does this need to be implemented in upstream still or what is the deal there?

Does the upstream libretro core have issues remaining still compared to the one in here?

We need to start making the decision soon to completely migrate over to snes9xgit, and if;/when we do, I have to be certain we don't lose anything in the process.

So just tell me what to do here - merge this, then update to snes9xgit latest sources again? Can this be individually pushed to upstream or did they veto this?

@Dwedit
Copy link
Author

Dwedit commented Aug 28, 2018

From what I gather, Snes9x upstream reviewed this PR, and accepted the makefile changes, then completely removed screen clearing (no config flag), and accepted some minor changes.

I believe you can outright replace libretro/Snes9x with the upstream version completely.

@inactive123
Copy link

OK - just a heads up -

f09a33a

88db1ab

Can you push these two changes to upstream?

Also, I just pulled all changes from upstream. Can you let me know if it already incorporates these runahead changes and if so, if we can either close this PR or it still needs to be merged? If it still needs to be merged, there is a merge conflict that has to be resolved first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants