Sim crashing on iPad 2 #435

Closed
jessegreenberg opened this issue Jul 31, 2018 · 29 comments

Comments

@jessegreenberg
Contributor

From #431 and phetsims/qa#158, the sim is crashing frequently on iPad2. @KatieWoe found that the one-off version (https://bayes.colorado.edu/dev/html/energy-skate-park-basics/1.4.0-trackCanvasInChrome.1/phet/energy-skate-park-basics_en_phet.html) was crashing, which could have been expected after rendering the tracks with canvas in Chrome for #431.

But then @KatieWoe noticed that the sim was crashing frequently in Safari as well, which is not expected because the change was Chrome-specific. We should see whether another recent change to this sim, or changes in scenery or other common code, could have caused this.

@jessegreenberg jessegreenberg self-assigned this Jul 31, 2018
@jessegreenberg
Contributor Author

jessegreenberg commented Jul 31, 2018

Also, @lmulhall-phet and @KatieWoe confirmed that this is happening in 1.4.0-dev.1, so it would have been introduced prior to the date of that build.

@samreid
Member

samreid commented Aug 2, 2018

Mac Chrome reports the following MB usage after starting the sim and putting the skater on the track on the 1st screen:

Published 1.1.7: 30MB
Local built: 34.3MB
Local requirejs: 38.4MB

@samreid samreid self-assigned this Aug 2, 2018
@ariel-phet

Unassigning @samreid as he has plenty of issues. @jessegreenberg will figure this out 🐱

@jessegreenberg
Contributor Author

jessegreenberg commented Aug 29, 2018

Playing on an iPad2, I noticed that the sim runs very well in the Intro and Friction screens, but crashes quickly after starting to use the "Playground" screen.

EDIT: Maybe disregard this? Just hit a crash that suggests otherwise, maybe I haven't done enough testing.

@jessegreenberg
Contributor Author

jessegreenberg commented Aug 30, 2018

Here are some notes from what I tried over the last couple days:

I tried to use a tool called Weinre to debug without tethering the iPad to a Mac. The tool worked great and I will use it a lot in the future. But I wasn't able to use it to get any information about this particular crash.

A while ago, I made a tool that prints sim sizes for all commits to the project (https://github.com/phetsims/perennial/blob/master/bin/print-sim-sizes.sh). I modified this script to check out the project at the dates of commits in ESP:B, then build the sim and save the build, to hopefully find where this problem was introduced. If the sim crashes, it crashes within about 15 seconds of navigating between the screens; otherwise it doesn't crash at all. So it is easy to compare versions in this way. Doing this, I found that at 2017-08-18 10:30:09 there is no crashing, and at 2017-08-25 21:30:18 the sim crashes frequently. So something changed in that window, either in this sim or in the project, to cause this.
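
For illustration only, the date-based checkout could be sketched roughly like this. This is not the modified print-sim-sizes.sh script: the helper name, the repo list, and the `master` branch name are assumptions. It only builds the git commands that would check out each repo at its last commit before a given date.

```javascript
// Hypothetical helper (not part of perennial): build shell commands that check
// out each repo at the newest commit on a branch that is older than a date.
function checkoutCommandsForDate( repos, date, branch = 'master' ) {
  return repos.map( repo =>
    // "git rev-list -n 1 --before=<date> <branch>" prints a single commit SHA;
    // "git -C <repo>" runs the command inside that repo directory.
    `git -C ${repo} checkout $(git -C ${repo} rev-list -n 1 --before="${date}" ${branch})`
  );
}

// The two dates compared in this comment; the repo list is an assumption.
const repos = [ 'energy-skate-park-basics', 'scenery' ];
const goodBuild = checkoutCommandsForDate( repos, '2017-08-18 10:30:09' );
const badBuild = checkoutCommandsForDate( repos, '2017-08-25 21:30:18' );
```

Each returned string could be run in a shell before building, so that one build is saved per date and the builds can be compared on the device.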

@jessegreenberg
Contributor Author

jessegreenberg commented Sep 10, 2018

Here are more notes from this:

A sim built with SHAs from 2017-08-17 at 22:41:19 does not crash. A sim built with SHAs at 2017-08-18 10:25:37 crashes almost as soon as the third screen button is pressed (sometimes before).

The commit at 2017-08-17 at 22:41:19 is dba01386dd884865c2111c0b5c63f31a4cdfbefb to Sherpa, with commit message

Adding jshashes-1.0.7 for phetsims/aqua#22 (string hashing needs)

The commit at 2017-08-18 10:25:37 is f611b69e80756e4e10653ec59113d0f7f74eea83 with message

Removed stale comments, see phetsims/joist#436

The chipper commit indeed only removed HTML comments, and would not have caused the break. I am going to check and see if my build at 2017-08-17 at 22:41:19 includes dba01386dd884865c2111c0b5c63f31a4cdfbefb. If not, that may indicate that jshashes-1.0.7 has something to do with this? I have no idea how, seems unlikely. I will also see if I missed any commits in this window.

@jessegreenberg
Contributor Author

I also noticed that this issue is closely related to #343, which was also caused by (and fixed by addressing) memory-related issues. The fix there didn't seem to reduce the JS heap.

I noted that its JS Heap is unchanged at 35.5MB...

@jessegreenberg
Contributor Author

Most times before the sim crashes I notice that the skater image disappears and the sim stalls for a second or two before we see the "Something went wrong..." message.

@jessegreenberg
Contributor Author

jessegreenberg commented Sep 10, 2018

My build at 2017-08-17 at 22:41:19 did not include the right sherpa SHA, it has SHA 7be57a56484a08b829ad60a853431667cb163973, just before the one I was after.

@jessegreenberg
Contributor Author

Oops, looks like the git rev-list --before option can only get as specific as days, so I can't look into builds at exact timestamps. But no matter, this allowed me to find the day that some breaking change may have been introduced, and from the above comments it looks like 2017-08-18 is our day. I looked through commits on that day and noticed this one:

commit d84702b1549b497366034cbf270f28147feef3fb
Author: Jonathan Olson <[email protected]>
Date:   Fri Aug 18 13:56:15 2017 -0600

    Double WebGL backing scale if there is no built-in antialiasing, see https://github.com/phetsims/circuit-construction-kit-dc/issues/139

The change doubled the "backing scale" for WebGL on some devices (including iPad2), and there are comments in phetsims/circuit-construction-kit-dc#139 like

This makes the rectangles look beautiful, but it doubles the WebGL memory usage. Will this crash the sim earlier, or exacerbate #141? Possibly, or perhaps the graphics memory is separate from the JS heap memory and hence OK to go a bit higher.

We may wish to investigate the 2x backing scale and try to solve the memory issues.

@jessegreenberg
Contributor Author

If I remove this.backingScale *= 2; from WebGLBlock it definitely helps. With that line, the sim crashes 10/10 times when switching between screens. When I removed that line, the sim crashed 1/10 times on my iPad2.
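
As a rough back-of-the-envelope check on why that line matters, here is a sketch of the drawing buffer size at different backing scales. The assumptions are mine: a single RGBA buffer at 4 bytes per pixel, a canvas the size of the iPad 2 screen (1024x768), and a backing scale that multiplies both width and height. Real WebGL memory use involves more than one buffer, so treat these numbers as a lower bound, not a measurement.

```javascript
// Rough estimate of the bytes needed for one RGBA drawing buffer.
// Assumption: the backing scale multiplies both the width and the height.
function estimateBufferBytes( width, height, backingScale, bytesPerPixel = 4 ) {
  return Math.ceil( width * backingScale ) * Math.ceil( height * backingScale ) * bytesPerPixel;
}

const MIB = 1024 * 1024;
const normalMiB = estimateBufferBytes( 1024, 768, 1 ) / MIB;  // 3 MiB
const doubledMiB = estimateBufferBytes( 1024, 768, 2 ) / MIB; // 12 MiB
```

Under these assumptions, doubling the backing scale quadruples the pixel count of each buffer, which would be a noticeable cost on a device with as little memory as the iPad 2.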

@jessegreenberg
Contributor Author

I just tested again to see if the crash was a fluke after removing this.backingScale *= 2; from WebGLBlock, and I had exactly the same result, sim crashed 1/10 times. So that doesn't totally fix it.

@samreid
Member

samreid commented Sep 10, 2018

Good discovery @jessegreenberg, the extra memory used for the WebGL canvas is a sensible culprit.

@jessegreenberg
Contributor Author

So that doesn't totally fix it.

I just remembered I was running a non-mangled version with Weinre; maybe that is the cause of the remaining crashing.

Regarding the comment about heap sizes (#435 (comment)), I am not sure if that growth is a cause of this. For instance, when I run other sims like circuit-construction-kit-dc, the heap size grows beyond these values for me in Chrome.

@jessegreenberg
Contributor Author

In the above commit with phetsims/scenery#859, I disabled backingScale as an antialiasing method while using mobile Safari. This helps quite a lot, but the sim still crashes infrequently on iPad2. I have only seen it crash now when switching scenes.
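
The idea of the change can be sketched like this. This is a hypothetical illustration, not the actual scenery code: the function name and parameters are made up, and how scenery detects antialiasing support or mobile Safari is not shown.

```javascript
// Hypothetical sketch (not the scenery implementation): choose the backing
// scale for a WebGL canvas. The scale is doubled as a stand-in for
// antialiasing only when the context has no built-in antialiasing and the
// platform is not mobile Safari.
function chooseBackingScale( baseScale, hasBuiltInAntialiasing, isMobileSafari ) {
  let backingScale = baseScale;
  if ( !hasBuiltInAntialiasing && !isMobileSafari ) {
    backingScale *= 2;
  }
  return backingScale;
}

const desktopWithoutAA = chooseBackingScale( 1, false, false ); // doubled
const mobileSafariWithoutAA = chooseBackingScale( 1, false, true ); // left alone
```

The trade-off is that mobile Safari loses the smoother edges in exchange for a smaller WebGL drawing buffer.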

@jessegreenberg
Contributor Author

I was searching around for other changes related to WebGL, I found this issue phetsims/scenery#637. There is a comment about it making images more expensive, but reverting the changes had no impact on the crashing rate.

@jessegreenberg
Contributor Author

When I remove all of the nodes in the "WebGL Layer" in EnergySkateParkBasicsScreenView, I am unable to get the sim to crash, so it still seems related to WebGL.

@jonathanolson do you have any thoughts about why the sim might crash occasionally while changing screens? I noticed in the document that the "webgl-container" div doesn't exist until one of the screens is launched, and that each screen has a different canvas element for WebGL that gets added/removed from the document when screens are changed. I'm sure there are very good reasons for this, but could it be related?

@jessegreenberg
Contributor Author

@jonathanolson and I met to investigate this today. We started by assuming that the crashing was due to hitting the memory limit, so we began to profile. But we got wildly differing memory results depending on things like browser, operating system, and whether or not we were in a private window. So we were not able to pinpoint the problem with the profiling tools available to us.

I will just list out everything we tried and the memory usage values we observed from the profiling results. All comparisons were done against PhET branded versions of the sim.

MacOS Chrome task manager reported that the sim is using 160-180 MB while fuzzing. In a similar test of the deployed version, the task manager reported 138MB of use, then 104MB of use (presumably after a garbage collection). This indicated a substantial increase of ~60MB!

So we continued to investigate heap size, and observed 27.5MB for the published version and 45.9MB for master just after startup. After fuzzing, these never got above 100MB, so the Chrome process itself was taking up a lot of memory. But we also noticed in the memory tools that Chrome was reporting things from FractionsCommon and EqualityLabScreenView, so it was counting objects from other tabs and this test loses its value.

We took a look at phet.joist.display.getDebugHTML(), and used that information to verify that the number of Nodes and Blocks looked OK. The deployed version has 897 in this report, while the new version has 919.

We tried using the Safari memory profiling tools, and Safari reported that the sim was using about 600 MB for "Page" content (https://webkit.org/blog/6425/memory-debugging-with-web-inspector/). Crazy huge amount! Then in re-tests, we observed values much lower, but still around 300MB. In similar tests we found that the deployed version also has ~300MB of "Page" content, so we think that the 600MB report was a red herring.

Finally we tried incognito mode and observed that the JS heap as reported by Chrome was ~20MB lower for both published and master versions. Looking through a comparison of heap snapshots between two versions didn't provide much information.
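
For repeatable heap readings, one option is to log the heap size from inside the page instead of reading it off the dev tools. The sketch below is an assumption about how that could be done, not something we used: it relies on `performance.memory`, which is a non-standard API available in Chrome only, so the helper returns null elsewhere (including Safari on the iPad).

```javascript
// Read the used JS heap size in MB (rounded to one decimal place) from a
// performance-like object. performance.memory is non-standard and Chrome-only,
// so return null when it is missing.
function usedHeapMB( perf ) {
  if ( !perf || !perf.memory || typeof perf.memory.usedJSHeapSize !== 'number' ) {
    return null;
  }
  return Math.round( perf.memory.usedJSHeapSize / ( 1024 * 1024 ) * 10 ) / 10;
}

// Stand-in object for illustration; in Chrome this would be window.performance.
const fakePerformance = { memory: { usedJSHeapSize: 27.5 * 1024 * 1024 } };
const heapMB = usedHeapMB( fakePerformance ); // 27.5
```

This only covers the JS heap, so it would not show memory used by WebGL or by the browser process itself.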

@jessegreenberg
Contributor Author

Also, @jonathanolson mentioned that @phet-steele was able to produce a crash report by tethering the iPad to a Mac and inspecting with XCode. @phet-steele would you be able to do this? Or if you don't have the platforms available could you please list the steps for how to do this? This could help verify whether the crashing is actually memory related.

I also may be able to see Safari crash reports after syncing my iPad with iTunes.
https://help.getpocket.com/article/1098-how-to-find-the-iphone-ipad-app-crash-logs

@jessegreenberg
Contributor Author

I was able to get the crash log after connecting my iPad2 to iTunes. At the time of the crash, there is a new JetsamEvent .ips file. It looks something like this:

{"os_version":"iPhone OS 9.3.5 (13G36)","bug_type":"298","timestamp":"2018-09-17 18:53:47.47 -0400"}
{
  "kernel" : "Darwin Kernel Version 15.6.0: Fri Aug 19 10:37:54 PDT 2016; root:xnu-3248.61.1~1\/RELEASE_ARM_S5L8942X",
  "date" : "2018-09-17 18:53:47.47 -0400",
  "crashReporterKey" : "a5ac2c1416e6bcdc2e9c5f8c26d9f1e37389c3c8",
  "product" : "iPad2,4",
  "build" : "iPhone OS 9.3.5 (13G36)",
  "incident" : "CF4FA097-C462-4769-BB0E-DD674F8A20AC",
  "memoryStatus" : {
  "pageSize" : 4096,
  "memoryPages" : {
    "fileBacked" : 11730,
    "anonymous" : 561,
    "inactive" : 3959,
    "active" : 8002,
    "wired" : 38438,
    "speculative" : 330,
    "throttled" : 75287,
    "purgeable" : 0,
    "free" : 1394
  },
  "compressions" : 28984,
  "decompressions" : 16690,
  "compressorSize" : 1502,
  "uncompressed" : 5062
},
  "largestProcess" : "com.apple.WebKit",
  "timeDelta" : 1696,
  "processes" : [
  {
    "pid" : 472,
    "reason" : "vm-pageshortage",
    "name" : "callservicesd",
    "fds" : 50,
    "lifetimeMax" : 1439,
    "rpages" : 444,
    "cpuTime" : 0.542582,
    "states" : [
      "daemon",
      "idle"
    ],
    "purgeable" : 0,
    "uuid" : "249eb6ce-20aa-30db-ab50-51a46d3a08b4"
  },
...

All processes have "reason" : "vm-pageshortage", which according to https://developer.apple.com/library/archive/technotes/tn2151/_index.html means that the process was killed due to "memory pressure".
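
To put the page counts in the log into more familiar units, here is a small sketch that converts pages to MiB using the "pageSize" from the log and tallies the "reason" field over a process list. The helper names are made up, and the process list in the example is a stand-in because the log above is truncated.

```javascript
// Convert a page count from the log into MiB.
function pagesToMiB( pages, pageSize ) {
  return pages * pageSize / ( 1024 * 1024 );
}

// Count how many process entries carry each "reason" value.
function countKillReasons( processes ) {
  const counts = {};
  for ( const entry of processes ) {
    const reason = entry.reason || 'none';
    counts[ reason ] = ( counts[ reason ] || 0 ) + 1;
  }
  return counts;
}

const freeMiB = pagesToMiB( 1394, 4096 );   // "free" pages from the log above
const wiredMiB = pagesToMiB( 38438, 4096 ); // "wired" pages from the log above

// Stand-in entries for illustration only, not the real process list.
const reasons = countKillReasons( [
  { name: 'callservicesd', reason: 'vm-pageshortage' },
  { name: 'example', reason: 'vm-pageshortage' }
] );
```

With a 4096 byte page size, the 1394 free pages come to about 5.4 MiB, which fits the "memory pressure" explanation.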

@samreid
Member

samreid commented Sep 18, 2018

I’m running a built version from today's master of Energy Skate Park: Basics on iPad2 on iOS 9.3.5. Over a minute with no crash.

UPDATE: running ?fuzzMouse on the same built version on the same iPad2, crashes at 98 seconds.
UPDATE: 2nd run with ?fuzzMouse crashes at 70 seconds.

UPDATE: I built a new version with 5 screens: intro | friction | playground | playground | playground and fuzzing crashed at 4 seconds, 38 seconds, 24 seconds, 31 seconds.

UPDATE: built 5 screens fuzzing with no line dash (testing SVG memory concern). Crash at 40 seconds, 26 seconds.

UPDATE: Reduced the number of points in the track view, still crashed at 40 seconds.

UPDATE: With a max number of control points as 1000 and fuzzing disabled, the sim launches the homescreen, then crashes when trying to show the playground screen. 2nd run: it is showing the playground screen OK but very sluggish.

UPDATE: Changing root renderer to canvas sometimes crashes within first few seconds of launching. Current session going 30+ seconds though.

UPDATE: Built version with 3 screens (no changes from master), but running with ?fuzzMouse&webgl=false crashes at 3:48.

@jessegreenberg
Contributor Author

I can confirm the results of #435 (comment) on my iPad2 as well.

@jessegreenberg
Contributor Author

jessegreenberg commented Sep 18, 2018

I found that the sim crashed most consistently when changing screens. I built a version with 3 Playground screens, but added pickable: false to NavigationBar.js. The sim did crash with fuzzMouse, but it took 4 minutes.

UPDATE: 3 Playground screens with pickable: false removed from navigation bar crashed in 90 seconds.

@samreid
Member

samreid commented Sep 18, 2018

Does the sim crash in the PhET iOS App the same way it crashes when running in Safari?

@samreid
Member

samreid commented Sep 20, 2018

One strategy that may yield a useful breakdown of memory usage could be to embed the sim in a WKWebView and launch with XCode and Activity Monitor, like described here: https://stackoverflow.com/questions/36561063/get-current-memory-usage-of-wkwebview

Maybe that will open up the sim so we can see the memory usage? Not 100% sure whether this or something like this would be useful, but I thought I'd mention it just in case.

@jessegreenberg
Contributor Author

Thanks @samreid! That could definitely be a way to get more information.

@jessegreenberg
Contributor Author

I ran fuzzTest as a comparison with the deployed version of the sim and found that the sim is crashing on that version with fuzz testing as well. Three trials, it crashed at 105 seconds, 144 seconds, 120 seconds.
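
A quick way to compare these trials with the earlier ?fuzzMouse runs on the master build (98 and 70 seconds) is to average the time to crash. This is only a rough sketch: with two or three trials per version the averages are not statistically meaningful.

```javascript
// Average time to crash, in seconds, over a list of trials.
function mean( values ) {
  return values.reduce( ( sum, value ) => sum + value, 0 ) / values.length;
}

const deployedSeconds = [ 105, 144, 120 ]; // deployed version, from this comment
const masterSeconds = [ 98, 70 ];          // master build, from the earlier ?fuzzMouse runs

const deployedMean = mean( deployedSeconds ); // 123
const masterMean = mean( masterSeconds );     // 84
```

Both versions crash under fuzzing within a couple of minutes, so the difference between them looks like one of degree.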

@jessegreenberg
Contributor Author

Discussed with @ariel-phet, we may not do any more investigation here as the sim is crashing very infrequently on iPad2 under normal usage. We will see if QA team can verify the low rate of crashing now, and we will move forward depending on what we learn in the next dev test.

@jessegreenberg jessegreenberg removed their assignment Sep 24, 2018
@jessegreenberg
Contributor Author

From review comment asking about this in energy-skate-park, this flag can now be removed since we are not using WebGL and we are no longer targeting iPad2. I tested the sim for several minutes on an iPad 3 and saw no crashing or memory related failures.
