Compatibility with MPI #193

Open
YvanFournier opened this issue May 13, 2018 · 5 comments

@YvanFournier

Describe the issue

I would be interested in using gdbgui to debug multiple processes started with MPI.
This is similar to closed issue #183, to which I added the same comment, though it might not be visible there.

Steps to replicate

In a terminal:
mpiexec -n <# processes> gdbgui <executable_name> -e gdb -x <gdb_commands_file>
or:
mpiexec -n <# processes> <wrapper> <arguments>
Here the wrapper can optionally run gdbgui on a different (random) free port.

Describe your environment
  • Operating system and version: Linux (any distribution)
  • gdbgui version (gdbgui -v): 0.11.3.1
  • gdb version (gdb -v): 8.1
  • browser: Firefox (60)
  • python packages (pip freeze): Python 3.6 + virtualenv
Description of the desired feature

MPI usually launches multiple processes, on a single host or on several, and usually gives each process slightly different environment variable values so that it can determine its own rank. Also, when restarting, the whole set of processes must be relaunched together (usually with "mpiexec" or a similar command), so processes cannot be restarted independently within a debugger.
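
As an illustration of the rank detection mentioned above, here is a minimal Python sketch of how a wrapper script might read its own rank from the environment. The variable names are assumptions based on what common launchers are known to export (Open MPI sets OMPI_COMM_WORLD_RANK, PMI-based launchers such as MPICH's set PMI_RANK, PMIx-based ones set PMIX_RANK); the function name detect_mpi_rank is hypothetical, and the list should be checked against the launcher actually in use.

```python
import os

# Environment variables that common MPI launchers export per process.
# This list is an assumption for illustration; verify it for your launcher.
RANK_VARIABLES = ("OMPI_COMM_WORLD_RANK", "PMIX_RANK", "PMI_RANK")


def detect_mpi_rank(environ=None):
    """Return the MPI rank found in the environment, or None if none is set."""
    env = os.environ if environ is None else environ
    for name in RANK_VARIABLES:
        value = env.get(name)
        if value is not None and value.isdigit():
            return int(value)
    return None
```

A wrapper could use the returned rank to decide which processes get a debugger and which simply run the executable directly.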

I tried launching different gdbgui instances with different port numbers (before reading this issue), but it seems when 2 gdbgui instances are launched, the second is killed even when not using the same port number.

So using multiple tabs is interesting, but in this case how can I connect to the correct process? The issue is not simply the possibility of doing so (using gdb attach), but the possibility of automating it (so as to run and attach multiple processes from a wrapper script).
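
One possible shape for the automation described above is sketched here in Python: each process asks the operating system for an unused TCP port and builds its own gdbgui command line with the -p option used elsewhere in this thread. The helper names find_free_port and gdbgui_command are hypothetical. Note that the port is only known to be free at the moment of the check, so another process could still grab it before gdbgui binds to it.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def gdbgui_command(executable, port):
    """Build the argument list for one gdbgui instance on the given port."""
    return ["gdbgui", "-p", str(port), executable]
```

The resulting list can be handed to subprocess.run or os.execvp without going through a shell.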

For additional information, the case I am interested in is mainly debugging on localhost (for HPC environments with massive parallelism, having many windows or tabs becomes unworkable, so more specialized debuggers such as DDT or TotalView become necessary, though attaching to a few select processes remains very useful).

@YvanFournier
Author

YvanFournier commented May 15, 2018

Hello,

I attached the simple case described here: simple_debug.tar.gz. Basically, it is a minimal "hello world", compiled on Linux using gcc -g hello.c, of which I run three instances (two are enough to reproduce the issue) using the included script, which simply contains:

gdbgui -p 5011 ./a.out &
gdbgui -p 5021 ./a.out &
gdbgui -p 5031 ./a.out

On the first two tabs that appear under Firefox, I have the following message:

Server message: Session expired. Please refresh this webpage.

Error occurred on server when running gdb command: gdb is not running
No gdb response received after 10 seconds. This usually indicates an error.

Possible reasons include:

1) an inferior program was not loaded; 2) gdb or the inferior process need to be interrupted with the SIGNINT signal; 3) gdb has exited unexpectedly; 4) an unexpected error occurred

If an operation that takes longer than 10 seconds was run, this message can be ignored.

The last tab works normally.

As a reminder, my system is: Arch Linux, Python 3.6, Firefox 60.0. gdbgui was installed using pip in a Python virtualenv.

I'll be happy to run any additional tests you need.

@cs01
Owner

cs01 commented May 28, 2018

I was able to run this without issue. I am wondering if the browser was open before you ran the commands to launch gdbgui. If that is the case, refreshing the respective tabs should fix the issue.

@YvanFournier
Author

I tested both with browser opened and closed, and have the same issue. I also tested with Chromium instead of Firefox, with the same behavior.

Thanks for the tip, refreshing the tabs does work (I was previously trying to restart the program, not refreshing the whole tab).

I also tried using the "-r" option for gdbserver mode. In this case, I needed to load the tabs manually, but had no other issue.

So this seems like a race condition on the browser/tab side, but can be worked around, although the behavior is surprising.
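
If the cause really is a tab loading before its server is ready (an untested hypothesis here), one way to avoid manual refreshes would be to open each tab only once the corresponding port accepts connections. The Python sketch below shows such a wait; wait_for_port is a hypothetical helper, and whether this removes the "Session expired" message has not been verified.

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=10.0, interval=0.1):
    """Return True once something accepts TCP connections on host:port.

    Returns False if nothing is listening before the timeout elapses.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; retry after a short pause.
            time.sleep(interval)
    return False
```

After wait_for_port(5011) returns True, the standard webbrowser.open("http://127.0.0.1:5011") call can load the tab. This assumes gdbgui was started without opening the browser itself; check gdbgui --help for the relevant option in the installed version.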

Looks like I'm almost ready to have a working alternative to ddd, allowing me to mix command line and GUI, without the obsolete display library issues of ddd! Thanks!

PS. I upgraded to 0.12.0 for my latest tests.

@dimitargslavchev

@YvanFournier Hello.

Were you able to debug MPI processes, and if so, how exactly?

I assume that it is possible to add an infinite loop or very long sleep at the beginning of my program so that I could attach to all processes manually, but this seems rather clumsy.

Best Regards,
Dimitar Slavchev

@YvanFournier
Author

Hello,
It seems I am able to do this when launching manually, but not when using a Python debugger launcher script based on subprocess.

For example,

mpiexec -n 1 gdbgui -p 44795 --gdb-args="-x ./commands.gdb" ./a.out \
      : -n 1 gdbgui -p 44796 --gdb-args="-x ./commands.gdb" ./a.out

seems to work, but:

mpiexec -n 2 ./cs_debug_wrapper.py gdbgui ./a.out

with the attached Python-based wrapper fails.
cs_debug_wrapper.py.gz
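
For reference, the manual colon-separated form shown above can be generated for any number of processes. The Python sketch below builds that argument list, one "-n 1 gdbgui ..." segment per port; the function name mpmd_gdbgui_command is hypothetical, and the flags are only those already used in the manual command.

```python
import shlex


def mpmd_gdbgui_command(executable, ports, gdb_commands_file):
    """Build an mpiexec argument list with one gdbgui segment per port."""
    argv = ["mpiexec"]
    for index, port in enumerate(ports):
        if index > 0:
            argv.append(":")  # separator between mpiexec segments
        argv += [
            "-n", "1",
            "gdbgui", "-p", str(port),
            "--gdb-args=-x " + gdb_commands_file,  # a single argument
            executable,
        ]
    return argv


argv = mpmd_gdbgui_command("./a.out", [44795, 44796], "./commands.gdb")
print(shlex.join(argv))  # shell-quoted form, suitable for copy and paste
```

shlex.join requires Python 3.8 or later. The list itself can be passed directly to subprocess.run, which avoids any shell quoting questions.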

The wrapper function I use is built to automatically handle more complex cases (such as finding a free port for gdbgui, attaching gdb to valgrind gdb servers, ...) and to pass command-line arguments in a consistent manner.

In this case I obtain:

ERROR - Traceback (most recent call last):
  File "/home/yvan/.local/pipx/venvs/gdbgui/lib/python3.8/site-packages/gdbgui/backend.py", line 357, in run_gdb_command
    controller.write(cmd, read_response=False)
  File "/home/yvan/.local/pipx/venvs/gdbgui/lib/python3.8/site-packages/pygdbmi/gdbcontroller.py", line 201, in write
    self.verify_valid_gdb_subprocess()
  File "/home/yvan/.local/pipx/venvs/gdbgui/lib/python3.8/site-packages/pygdbmi/gdbcontroller.py", line 175, in verify_valid_gdb_subprocess
    raise NoGdbProcessError(
pygdbmi.gdbcontroller.NoGdbProcessError: gdb process has already finished with return code: 1

The issue might be in my wrapper or its use of "subprocess" or child process behavior (or possibly shell/no shell option), as I have already observed different behavior of the Valgrind gdb server using MPICH when launched from this wrapper (fails) or run manually (works).
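
One thing that may be worth ruling out when comparing the shell and no-shell cases is how the quoted --gdb-args option is split into arguments. This is only a diagnostic sketch, not a confirmed cause of the failure above. It compares a naive whitespace split with shell-style splitting, and uses a small child Python process (the hypothetical child_argv helper) to show exactly what a program launched through subprocess receives.

```python
import shlex
import subprocess
import sys

command_line = 'gdbgui -p 44795 --gdb-args="-x ./commands.gdb" ./a.out'

naive = command_line.split()          # breaks the quoted argument in two
correct = shlex.split(command_line)   # honors shell-style quoting


def child_argv(argv):
    """Run a child Python process that reports the arguments it received."""
    probe = [sys.executable, "-c", "import sys; print(sys.argv[1:])"] + argv
    done = subprocess.run(probe, capture_output=True, text=True, check=True)
    return done.stdout.strip()
```

Calling child_argv(naive) and child_argv(correct) makes the difference visible: with the naive split, gdb would see a stray quote character and an extra argument.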

3 participants