Program received signal SIGSEGV: Segmentation fault - invalid memory reference. #9
Comments
I cannot reproduce the reported segmentation fault. But I have fixed some bugs and got rid of the warnings. You can find my work under the issue "donationware". I am using Xubuntu 17.10 on a Celeron 1.8 GHz motherboard. |
I'm using Debian 8 Jessie on a 1st Gen Core i5 (2.4 GHz) in my personal laptop, and I also tried on an Intel Xeon CPU E3-1225 v3 @ 3.20GHz. It throws the same error on both machines. I will try your code, thanks! |
I discovered character-handling anomalies (not in the Science code). I had different symptoms with NMAX=20,000. Cannot claim that I found them all. |
I got the same error: ./mercury6 Program received signal SIGSEGV: Segmentation fault - invalid memory reference. Backtrace for this error: |
Could you try adding a -g parameter to FFLAGS in the Makefile to see if we could get line numbers in the traceback? FFLAGS=-g -O2 -Wline-truncation -Wsurprising -Werror |
Are you using the exec_all.sh script that I created after the |
My last results (should still be in the log folder): ===== Thu May 10 13:04:57 CDT 2018 ============================================================== Integrating massive bodies and particles up to the same epoch. |
Set the NMAX back to 20000 then: elkins@biostar: ===== Mon May 14 10:02:18 CDT 2018 ============================================================== Integrating massive bodies and particles up to the same epoch. |
Nope, I spotted the issue. I also had to edit the CMAX parameter and recompile. This is the info.out. Now it is running fine:
tail -f info.out
Initial energy: -3.32264E-08 solar masses AU^2 day^-2
Integrating massive bodies and particles up to the same epoch.
Beginning the main integration.
WARNING: Total number of current close encounters exceeds CMAX.
WARNING: Total number of current close encounters exceeds CMAX.
WARNING: Total number of current close encounters exceeds CMAX.
WARNING: Total number of current close encounters exceeds CMAX.
WARNING: Total number of current close encounters exceeds CMAX.
WARNING: Total number of current close encounters exceeds CMAX.
WARNING: Total number of current close encounters exceeds CMAX. |
When you had the crash, what were the parameters set to? Cut and paste or just tell me. I'll try to see if there is a relatively simple fix and put in some diagnostics. Crashing is dumb. |
Actually, that "warning" should be a termination message. When it appears, various vector and array initializations are bypassed, leaving random values - see the if-statement at line 1663 in mercury6_2.for: |
param.in |
small.in |
So, you modified param.in and small.in in the exec subfolder. |
Steps to reproduce the initial problem.
|
Compiled with -g option? Did you see a traceback with line numbers this time? |
I executed the ./compile from the git repo, and got plenty of warnings. |
Okay, with your small.in and param.in, I have mercury6 in a loop. WARNING: Total number of current close encounters exceeds CMAX. Again, this is not a "WARNING" situation, in my opinion. This should be a fatal error. |
Yes, with your version I am still waiting for it to start... The full-of-warnings git version seems to do the trick. |
I suspect that this old Fortran IV/77 code has never really been fully diagnosed. Ideally, it will someday be converted to Python and use the numpy libraries for vector and matrix calls without hand-coding. But, in the meantime, Fortran diagnostic code can be added. |
Finally, a traceback with line numbers! Begin mercury6 (basic integration) ..... Integrating massive bodies and particles up to the same epoch. Program received signal SIGSEGV: Segmentation fault - invalid memory reference. Backtrace for this error: |
Memory is trashed. Doesn't matter which version is used. The crash is caused by calculations in subroutine mco_x2ov, line 2878: Variables u, v, and w are parameters passed in. E.g. ending on line 5155 in the middle of a do-loop from 1 to nclo in subroutine mio_ce: but k exceeds CMAX! See subroutine mce_stat starting at line 1662. See local data arrays of dimension CMAX starting at line 296. nclo, as used all over the place, is passed in as an array-element counter. nclo must never exceed CMAX. Failure to control nclo causes the array bounds to be exceeded upon reference. |
Suggested solution: Once CMAX is breached in line 1663, put out an ERROR message and exit to the O/S. Make sense to you? |
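The suggestion above (stop with an ERROR once CMAX is breached instead of continuing with out-of-bounds indices) can be sketched as follows. This is only an illustration of the guard logic in Python, not the Fortran in mercury6_2.for: the value of `CMAX`, the names `record_encounters`, `CloseEncounterOverflow`, and `main`, and the shape of the encounter data are all assumptions made for the example. Only the error text is taken from this thread.

```python
import sys

# Hypothetical capacity for the example; in mercury6 CMAX is a compile-time parameter.
CMAX = 50


class CloseEncounterOverflow(RuntimeError):
    """Raised when there are more close encounters than the fixed arrays can hold."""


def record_encounters(pairs, cmax=CMAX):
    """Store encounter pairs in a fixed-size buffer, refusing to write past its end."""
    buffer = [None] * cmax
    nclo = 0  # number of slots in use; must never exceed cmax
    for pair in pairs:
        if nclo >= cmax:
            # Fail loudly instead of leaving later code to loop over bad indices.
            raise CloseEncounterOverflow(
                "ERROR: Total number of current close encounters exceeds CMAX."
            )
        buffer[nclo] = pair
        nclo += 1
    return nclo, buffer[:nclo]


def main(pairs):
    """Return a process exit status: 0 on success, 1 if CMAX was exceeded."""
    try:
        nclo, _ = record_encounters(pairs)
    except CloseEncounterOverflow as err:
        print(err, file=sys.stderr)
        return 1
    print(f"{nclo} close encounters recorded")
    return 0
```

Calling `sys.exit(main(pairs))` would hand the non-zero status back to the O/S, which is the behaviour proposed here: the run terminates at the point where the counter would exceed the array size, rather than crashing later in an unrelated subroutine.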
I'm afraid that is too technical for my knowledge, but if you say so... go ahead. |
Sorry, I wasn't trying to be obtuse. You are too modest! |
Ok, the new code, with your small.in and param.in displays this and exits peacefully: Begin mercury6 (basic integration) ..... Integrating massive bodies and particles up to the same epoch. ERROR: Total number of current close encounters exceeds CMAX. That error message also appears in info.out. I just copied it to the console to wake up the sleeping scientist. |
Is that valid advice? If I keep increasing the CMAX value, will it eventually run? |
That depends on how many close encounters your small.in population causes. Your case was a lot of asteroids (4667?) compared to the default, which was 2. On the other hand, CMAX cannot be arbitrarily large or it won't fit into RAM. This code needs to give some better advice. I have not read all of John Chambers' code. |
As far as I know, mercury6 was initially designed to work with no more than 1000 objects. If you increase the number of objects, it might cause segfault errors. Please also take into account that if you increase the number of objects, you usually decrease the accuracy of the calculations. So, I would like to advise:
|
Fix was applied in #11. |
Thanks, I will take that into account. |
@asterismo, did you solve your issue by a combination of decreasing the number of objects and/or increasing CMAX? |
properly specified in dipoleconfig.f Program received signal SIGSEGV: Segmentation fault - invalid memory reference. Backtrace for this error: Program received signal SIGABRT: Process abort signal. Backtrace for this error: |
I set NMAX to 20000 objects and recompiled. When I ran the integrator, I got a segfault.
I pasted the output in the Los Molinos Observatory pastebin:
https://pastebin.oalm.gub.uy/view/94773e01
The error is at the bottom of the pastebin.