ClojureScript tests regularly hang/time out #578
Comments
Relatedly: is there an opportunity to simplify the ClojureScript test setup as @arichiardi hinted at in #555?
Isn't he asking for ClojureScript support in
It would be interesting to see whether those race conditions still happen with nREPL 0.6-snapshot, as @cgrand recently simplified the code evaluation logic a lot (see nrepl/nrepl#98).
We're working on something like this here: nrepl/nrepl#87
That's a good idea and probably shouldn't be hard to do.
Same here.
Is there an easy way to do so?
As a first step I added the jstack timeout thing in #583
@plexus Btw, when are we doing the switchover to Circle CI? I do hope that it will at least reduce the timeout issues.
Nothing stopping us, I got the configuration sitting here, I can make a PR. Probably not today though.
👍
Here is a build log that includes stack traces taken after the tests hang. This could be a good starting point for diagnosing the issue: https://travis-ci.org/clojure-emacs/cider-nrepl/jobs/473545815
Looks pretty good! Excellent work!
Looking at the stack traces, this is where it blocks (https://github.com/clojure/clojurescript/blob/r1.8.51/src/main/clojure/cljs/repl/node.clj#L128). I'm guessing the connection to Node continuously fails, the exception gets swallowed, and the loop gets executed again, over and over. The only logic in there that is likely to fail is creating the
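As a rough, language-neutral illustration of the failure mode guessed at in this comment (this is not the actual `cljs.repl.node` code): a retry loop that swallows every connection error can spin forever, while one with a deadline fails loudly and keeps the last error for diagnosis. The sketch below is Python; `connect_with_deadline`, `ConnectError`, and the `connect` callable are hypothetical names chosen for the example, not part of ClojureScript or cider-nrepl.

```python
import time


class ConnectError(Exception):
    """Raised when the target never became reachable before the deadline."""


def connect_with_deadline(connect, timeout_s=10.0, interval_s=0.05):
    """Call `connect` repeatedly until it succeeds or `timeout_s` elapses.

    `connect` is a hypothetical zero-argument callable that returns a
    connection object or raises OSError (for example, connection refused).
    """
    deadline = time.monotonic() + timeout_s
    attempts = 0
    last_error = None
    while time.monotonic() < deadline:
        attempts += 1
        try:
            return connect()
        except OSError as exc:
            # Remember the error instead of silently swallowing it.
            last_error = exc
            time.sleep(interval_s)
    # Surface the failure so the caller (or the CI log) can see why.
    raise ConnectError(
        f"gave up after {attempts} attempts: {last_error!r}"
    ) from last_error
```

With a bound like this, a build that cannot reach its JavaScript runtime would presumably fail within seconds with a visible cause, rather than hanging until the CI timeout.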
Interesting insight! I can't imagine why something so basic would be failing, and I wonder whether it's somehow related to the container's environment, given the randomness of the failures. I also wonder if one way to fix this would be to replace the Node REPL with a Nashorn REPL. Thinking a bit about the current usage of Node in the tests, I recalled that back in the day we couldn't use Rhino, as its implementation was weird in many ways, and we couldn't use Nashorn, because we still supported Java 6 and 7. On the other hand, Oracle surprisingly announced plans to pull the plug on Nashorn, so I'm not sure how good an idea it would be to rely on it.
This is a spin-off of #569. I've managed to find and fix all the "real" test failures, which leaves one more issue: regularly (about one time out of two) a ClojureScript build stops in the middle of a test and hangs there until the build times out after ten minutes.
Examples: (I'll add more links as I come across them)
This has been observed on different Java versions (8, 11) and ClojureScript versions (1.9, 1.10), so it seems to be a general problem, some kind of race condition. My hunch is that a message is being sent to the ClojureScript environment before it's ready to receive it, so a reply never arrives.
This is a pain, especially since I'm not sure whether it will be reproducible locally (or consistently). Still, here are some ideas that could help pinpoint the issue.
- *nrepl-messages*, so you can see which messages go back and forth during testing
- jstack, so we can see where it's blocking
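The jstack idea can be sketched as a small watchdog: run the test command with a timeout and, if it fires, capture a thread dump before killing the process. This is a minimal Python sketch under stated assumptions, not the script used in this repository. `run_with_thread_dump` and its parameters are hypothetical names; the default dump command assumes `jstack <pid>` is on the PATH and that the launched process is itself the JVM (a wrapper such as a build tool may spawn the JVM as a child, in which case the PID would have to be looked up differently).

```python
import subprocess


def run_with_thread_dump(cmd, timeout_s,
                         dump_cmd=lambda pid: ["jstack", str(pid)]):
    """Run `cmd`; if it exceeds `timeout_s`, capture a dump, then kill it.

    Returns (exit_code, dump). `dump` is None unless the timeout fired,
    in which case `exit_code` is None and `dump` holds the dump output.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s), None
    except subprocess.TimeoutExpired:
        try:
            # Take the dump while the process is still hung.
            result = subprocess.run(
                dump_cmd(proc.pid), capture_output=True, text=True
            )
            dump = result.stdout
        finally:
            # Always clean up, even if the dump command itself failed.
            proc.kill()
            proc.wait()
        return None, dump
```

Printing the returned dump into the CI log gives the same kind of stack traces after a hang that the build log linked in the comments contains.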