PEP 768: Add some clarifications and minor edits
pablogsal committed Feb 26, 2025
1 parent 6e1a745 commit bf0a97c
Showing 1 changed file with 28 additions and 22 deletions.
50 changes: 28 additions & 22 deletions peps/pep-0768.rst
@@ -136,19 +136,20 @@ A new structure is added to PyThreadState to support remote debugging:
 typedef struct _remote_debugger_support {
     int debugger_pending_call;
-    char debugger_script[MAX_SCRIPT_SIZE];
+    char debugger_script_path[MAX_SCRIPT_SIZE];
 } _PyRemoteDebuggerSupport;
 This structure is appended to ``PyThreadState``, adding only a few fields that
 are **never accessed during normal execution**. The ``debugger_pending_call`` field
-indicates when a debugger has requested execution, while ``debugger_script``
-provides Python code to be executed when the interpreter reaches a safe point.
+indicates when a debugger has requested execution, while ``debugger_script_path``
+provides a filesystem path to a Python source file (.py) that will be executed when
+the interpreter reaches a safe point. The path must point to a Python source file,
+not compiled Python code (.pyc) or any other format.

 The value for ``MAX_SCRIPT_SIZE`` will be a trade-off between binary size and
-how big debugging scripts can be. As most of the logic should be in libraries
-and arbitrary code can be executed with very short amount of Python we are
-proposing to start with 4kb initially. This value can be extended in the future
+how long the path can be. As most paths are relatively short on modern systems,
+we are proposing to start with 4kb initially. This value can be extended in the future
 if we ever need to.
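
Reviewer sketch (illustrative, not part of the PEP text): the renamed field and the size trade-off can be pictured with a ``ctypes`` mirror of the structure. The class name ``RemoteDebuggerSupport``, the helper ``encode_script_path``, and the concrete value 4096 (taken from the "4kb" proposal) are assumptions made for this example only.

```python
import ctypes
import os

# Assumption: 4 KiB, the initial value the text proposes for MAX_SCRIPT_SIZE.
MAX_SCRIPT_SIZE = 4096


class RemoteDebuggerSupport(ctypes.Structure):
    """Hypothetical ctypes mirror of _PyRemoteDebuggerSupport after this commit."""

    _fields_ = [
        ("debugger_pending_call", ctypes.c_int),
        ("debugger_script_path", ctypes.c_char * MAX_SCRIPT_SIZE),
    ]


def encode_script_path(path: str) -> bytes:
    """Return the NUL-terminated bytes a tool would copy into debugger_script_path.

    Raises ValueError when the path plus its terminator would not fit in
    the fixed-size buffer.
    """
    raw = os.fsencode(path)
    if len(raw) + 1 > MAX_SCRIPT_SIZE:
        raise ValueError("script path does not fit in debugger_script_path")
    return raw + b"\0"
```

Under these assumptions the per-thread cost is the buffer plus one ``int``, which is why the buffer size is the quantity being traded against binary size.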


@@ -171,13 +172,13 @@ debugger support:
     uint64_t eval_breaker; // Location of the eval breaker flag
     uint64_t remote_debugger_support; // Offset to our support structure
     uint64_t debugger_pending_call; // Where to write the pending flag
-    uint64_t debugger_script; // Where to write the script path
+    uint64_t debugger_script_path; // Where to write the script path
 } debugger_support;
 These offsets allow debuggers to locate critical debugging control structures in
 the target process's memory space. The ``eval_breaker`` and ``remote_debugger_support``
 offsets are relative to each ``PyThreadState``, while the ``debugger_pending_call``
-and ``debugger_script`` offsets are relative to each ``_PyRemoteDebuggerSupport``
+and ``debugger_script_path`` offsets are relative to each ``_PyRemoteDebuggerSupport``
 structure, allowing the new structure and its fields to be found regardless of
 where they are in memory.
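
Reviewer sketch (illustrative, not part of the PEP text): the two levels of relative offsets described above amount to simple address arithmetic. The class and function names below are hypothetical, and the numeric offsets in the example are made up rather than taken from any real CPython build.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebuggerSupportOffsets:
    """Mirror of the debugger_support offsets described in the text."""

    eval_breaker: int             # relative to PyThreadState
    remote_debugger_support: int  # relative to PyThreadState
    debugger_pending_call: int    # relative to _PyRemoteDebuggerSupport
    debugger_script_path: int     # relative to _PyRemoteDebuggerSupport


@dataclass(frozen=True)
class ResolvedAddresses:
    """Absolute addresses in the target process for one thread state."""

    eval_breaker: int
    debugger_pending_call: int
    debugger_script_path: int


def resolve_addresses(tstate_addr: int, offsets: DebuggerSupportOffsets) -> ResolvedAddresses:
    """Turn the relative offsets into absolute addresses for one PyThreadState."""
    support_addr = tstate_addr + offsets.remote_debugger_support
    return ResolvedAddresses(
        eval_breaker=tstate_addr + offsets.eval_breaker,
        debugger_pending_call=support_addr + offsets.debugger_pending_call,
        debugger_script_path=support_addr + offsets.debugger_script_path,
    )


# Made-up values, for illustration only.
example = resolve_addresses(
    0x7F0000001000,
    DebuggerSupportOffsets(
        eval_breaker=0x18,
        remote_debugger_support=0x200,
        debugger_pending_call=0x0,
        debugger_script_path=0x4,
    ),
)
```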

@@ -199,13 +200,19 @@ When a debugger wants to attach to a Python process, it follows these steps:

 5. Write control information:

-- Write a filename containing Python code to be executed into the
-``debugger_script`` field in ``_PyRemoteDebuggerSupport``.
+- Most debuggers will pause the process before writing to its memory. This is
+standard practice for tools like GDB, which use SIGSTOP or ptrace to pause the process.
+This approach prevents races when writing to process memory. Profilers and other tools
+that don't wish to stop the process can still use this interface, but they need to
+handle possible races, which is a normal consideration for profilers in general.
+
+- Write a file path to a Python source file (.py) into the
+``debugger_script_path`` field in ``_PyRemoteDebuggerSupport``.
 - Set ``debugger_pending_call`` flag in ``_PyRemoteDebuggerSupport``
 - Set ``_PY_EVAL_PLEASE_STOP_BIT`` in the ``eval_breaker`` field

-Once the interpreter reaches the next safe point, it will execute the script
-provided by the debugger.
+Once the interpreter reaches the next safe point, it will execute the Python code
+contained in the file specified by the debugger.
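
Reviewer sketch (illustrative, not part of the PEP text): the order of the three writes in step 5 can be simulated against a local ``bytearray`` standing in for the paused target's memory. Everything here is an assumption for demonstration: the function name, the little-endian 64-bit ``eval_breaker`` word, the 32-bit pending flag, and the placeholder value used for ``_PY_EVAL_PLEASE_STOP_BIT``. A real tool would write through a platform mechanism such as ptrace instead.

```python
import struct

# Placeholder values for this sketch only; a real tool must take the bit
# value and all offsets from the target interpreter, not hard-code them.
EVAL_PLEASE_STOP_BIT = 1 << 5
MAX_SCRIPT_SIZE = 4096


def request_script_execution(
    memory: bytearray,
    eval_breaker_off: int,
    pending_off: int,
    path_off: int,
    script_path: bytes,
) -> None:
    """Apply the three control writes to a simulated memory image, in order."""
    if len(script_path) + 1 > MAX_SCRIPT_SIZE:
        raise ValueError("script path does not fit in debugger_script_path")

    # 1. Write the NUL-terminated path into debugger_script_path.
    end = path_off + len(script_path) + 1
    memory[path_off:end] = script_path + b"\0"

    # 2. Set the debugger_pending_call flag (modelled as a 32-bit int).
    struct.pack_into("<i", memory, pending_off, 1)

    # 3. Set the stop bit in eval_breaker without clobbering other bits.
    (current,) = struct.unpack_from("<Q", memory, eval_breaker_off)
    struct.pack_into("<Q", memory, eval_breaker_off, current | EVAL_PLEASE_STOP_BIT)
```

Writing the path first and the ``eval_breaker`` bit last means the interpreter is only prompted to look once the other two fields are already in place, which matters most for tools that do not pause the process.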

Interpreter Integration
-----------------------
@@ -233,7 +240,7 @@ to be audited or disabled if desired by a system's administrator.
 if (tstate->eval_breaker) {
     if (tstate->remote_debugger_support.debugger_pending_call) {
         tstate->remote_debugger_support.debugger_pending_call = 0;
-        const char *path = tstate->remote_debugger_support.debugger_script;
+        const char *path = tstate->remote_debugger_support.debugger_script_path;
         if (*path) {
             if (0 != PySys_Audit("debugger_script", "%s", path)) {
                 PyErr_Clear();
@@ -269,16 +276,17 @@ arbitrary Python code within the context of a specified Python process:

 .. code-block:: python
-    def remote_exec(pid: int, code: str, timeout: int = 0) -> None:
+    def remote_exec(pid: int, code: str) -> None:
         """
         Executes a block of Python code in a given remote Python process.
+        This function returns immediately, and the code will be executed at the next
+        available opportunity in the target process, similar to how signals are handled.
+        There is no way to determine when or if the code has been executed.
         Args:
             pid (int): The process ID of the target Python process.
             code (str): A string containing the Python code to be executed.
-            timeout (int): An optional timeout for waiting for the remote
-                process to execute the code. If the timeout is exceeded a
-                ``TimeoutError`` will be raised.
         """
 An example usage of the API would look like:
@@ -288,9 +296,7 @@ An example usage of the API would look like:
     import sys
     # Execute a print statement in a remote Python process with PID 12345
     try:
-        sys.remote_exec(12345, "print('Hello from remote execution!')", timeout=3)
-    except TimeoutError:
-        print(f"The remote process took too long to execute the code")
+        sys.remote_exec(12345, "print('Hello from remote execution!')")
     except Exception as e:
         print(f"Failed to execute code: {e}")
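
Reviewer sketch (illustrative, not part of the PEP text): since the control field now carries a path while ``remote_exec`` still accepts a code string, some layer presumably has to place that string in a ``.py`` file before the path is written into the target. The helper ``stage_script`` below is hypothetical and only shows one way that staging step might look.

```python
import os
import tempfile


def stage_script(code: str) -> str:
    """Write code to a new temporary .py source file and return its absolute path.

    The file is not deleted here: the caller cannot tell when, or whether,
    the target process has executed it, so cleanup policy is left to the tool.
    """
    fd, path = tempfile.mkstemp(suffix=".py", prefix="remote_exec_")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(code)
    return path
```

The returned path is what would then be copied into ``debugger_script_path``, provided it fits within ``MAX_SCRIPT_SIZE``.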
@@ -393,8 +399,8 @@ Rejected Ideas
 Writing Python code into the buffer
 -----------------------------------

-We have chosen to have debuggers write the code to be executed into a file
-whose path is written into a buffer in the remote process. This has been deemed
+We have chosen to have debuggers write the path to a file containing Python code
+into a buffer in the remote process. This has been deemed
 more secure than writing the Python code to be executed itself into a buffer in
 the remote process, because it means that an attacker who has gained arbitrary
 writes in a process but not arbitrary code execution or file system