Commit ac4613e

Improve documentation about when we free memory, resolves #311 (#1807)
* Improve API docs regarding when we free memory, resolves #311
* Add chapter to guide about when we free memory, resolves #311
* Fix typos in documentation
  Co-authored-by: David Hewitt <[email protected]>
* Add links from guide to docs.rs
* Update guide/src/memory.md
  Co-authored-by: David Hewitt <[email protected]>
1 parent 79c7e28 commit ac4613e

5 files changed, +237 -7 lines changed

guide/src/SUMMARY.md

Lines changed: 1 addition & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -17,6 +17,7 @@
1717
- [Parallelism](parallelism.md)
1818
- [Debugging](debugging.md)
1919
- [Features Reference](features.md)
20+
- [Memory Management](memory.md)
2021
- [Advanced Topics](advanced.md)
2122
- [Building and Distribution](building_and_distribution.md)
2223
- [Supporting multiple Python versions](building_and_distribution/multiple_python_versions.md)

guide/src/advanced.md

Lines changed: 4 additions & 7 deletions
@@ -8,10 +8,7 @@ The C API is naturally unsafe and requires you to manage reference counts, error
88

99
## Memory Management
1010

11-
PyO3's "owned references" (`&PyAny` etc.) make PyO3 more ergonomic to use by ensuring that their lifetime can never be longer than the duration the Python GIL is held. This means that most of PyO3's API can assume the GIL is held. (If PyO3 could not assume this, every PyO3 API would need to take a `Python` GIL token to prove that the GIL is held.)
12-
13-
The caveat to these "owned references" is that Rust references do not normally convey ownership (they are always `Copy`, and cannot implement `Drop`). Whenever a PyO3 API returns an owned reference, PyO3 stores it internally, so that PyO3 can decrease the reference count just before PyO3 releases the GIL.
14-
15-
For most use cases this behaviour is invisible. Occasionally, however, users may need to clear memory usage sooner than PyO3 usually does. PyO3 exposes this functionality with the the `GILPool` struct. When a `GILPool` is dropped, ***all*** owned references created after the `GILPool` was created will be cleared.
16-
17-
The unsafe function `Python::new_pool` allows you to create a new `GILPool`. When doing this, you must be very careful to ensure that once the `GILPool` is dropped you do not retain access any owned references created after the `GILPool` was created.
11+
PyO3's `&PyAny` "owned references" and `Py<PyAny>` smart pointers are used to
12+
access memory stored in Python's heap. This memory sometimes lives for longer
13+
than expected because of differences in Rust and Python's memory models. See
14+
the chapter on [memory management](./memory.md) for more information.

guide/src/memory.md

Lines changed: 186 additions & 0 deletions
@@ -0,0 +1,186 @@
1+
# Memory Management
2+
3+
Rust and Python have very different notions of memory management. Rust has
4+
a strict memory model with concepts of ownership, borrowing, and lifetimes,
5+
where memory is freed at predictable points in program execution. Python has
6+
a looser memory model in which variables are reference-counted with shared,
7+
mutable state by default. A global interpreter lock (GIL) is needed to prevent
8+
race conditions, and a garbage collector is needed to break reference cycles.
9+
Memory in Python is freed eventually by the garbage collector, but not usually
10+
in a predictable way.
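
As a small illustration of the "predictable points" mentioned above, the following sketch uses only the Rust standard library. The `Noisy` type and `drop_order` function are hypothetical names introduced for this example; they record the order in which values are dropped.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical helper type: records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Returns the order in which the values were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _outer = Noisy { name: "outer", log: Rc::clone(&log) };
        for _ in 0..2 {
            // `_inner` is dropped at the end of every iteration.
            let _inner = Noisy { name: "inner", log: Rc::clone(&log) };
        }
        // `_outer` is dropped here, at the end of its scope.
    }
    let order = log.borrow().clone();
    order
}
```

Calling `drop_order()` yields `["inner", "inner", "outer"]`: each value is freed exactly where its scope ends.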
11+
12+
PyO3 bridges the Rust and Python memory models with two different strategies for
13+
accessing memory allocated on Python's heap from inside Rust. These are
14+
GIL-bound ("owned") references and GIL-independent `Py<PyAny>` smart pointers.
15+
16+
## GIL-bound Memory
17+
18+
PyO3's GIL-bound "owned references" (`&PyAny` etc.) make PyO3 more ergonomic to
19+
use by ensuring that their lifetime can never be longer than the duration the
20+
Python GIL is held. This means that most of PyO3's API can assume the GIL is
21+
held. (If PyO3 could not assume this, every PyO3 API would need to take a
22+
`Python` GIL token to prove that the GIL is held.) This allows us to write
23+
very simple and easy-to-understand programs like this:
24+
25+
```rust
26+
Python::with_gil(|py| -> PyResult<()> {
27+
let hello: &PyString = py.eval("\"Hello World!\"", None, None)?.extract()?;
28+
println!("Python says: {}", hello);
29+
Ok(())
30+
})?;
31+
```
32+
33+
Internally, calling `Python::with_gil()` or `Python::acquire_gil()` creates a
34+
`GILPool` which owns the memory pointed to by the reference. In the example
35+
above, the lifetime of the reference `hello` is bound to the `GILPool`. When
36+
the `with_gil()` closure ends or the `GILGuard` from `acquire_gil()` is dropped,
37+
the `GILPool` is also dropped and the Python reference counts of the variables
38+
it owns are decreased, releasing them to the Python garbage collector. Most
39+
of the time we don't have to think about this, but consider the following:
40+
41+
```rust
42+
Python::with_gil(|py| -> PyResult<()> {
43+
for _ in 0..10 {
44+
let hello: &PyString = py.eval("\"Hello World!\"", None, None)?.extract()?;
45+
println!("Python says: {}", hello);
46+
}
47+
// There are 10 copies of `hello` on Python's heap here.
48+
Ok(())
49+
})?;
50+
```
51+
52+
We might assume that the `hello` variable's memory is freed at the end of each
53+
loop iteration, but in fact we create 10 copies of `hello` on Python's heap.
54+
This may seem surprising at first, but it is completely consistent with Rust's
55+
memory model. The `hello` variable is dropped at the end of each loop, but it
56+
is only a reference to the memory owned by the `GILPool`, and its lifetime is
57+
bound to the `GILPool`, not the for loop. The `GILPool` isn't dropped until
58+
the end of the `with_gil()` closure, at which point the 10 copies of `hello`
59+
are finally released to the Python garbage collector.
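
The behaviour described above can be imitated in plain Rust without PyO3. The sketch below is a toy model only, not PyO3's actual `GILPool`: `ToyObject`, `ToyPool`, and `live_counts` are hypothetical names, and the pool is assumed to be a simple `Vec` that owns every object registered with it.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Toy object that tracks how many instances are currently alive.
struct ToyObject {
    text: String,
    live: Rc<Cell<usize>>,
}

impl ToyObject {
    fn new(text: &str, live: &Rc<Cell<usize>>) -> Self {
        live.set(live.get() + 1);
        ToyObject { text: text.to_string(), live: Rc::clone(live) }
    }
}

impl Drop for ToyObject {
    fn drop(&mut self) {
        self.live.set(self.live.get() - 1);
    }
}

/// Toy pool: keeps every registered object alive until the pool is dropped.
#[derive(Default)]
struct ToyPool {
    owned: Vec<ToyObject>,
}

impl ToyPool {
    /// Takes ownership of `obj` and hands back a borrow of its contents.
    fn register(&mut self, obj: ToyObject) -> &str {
        self.owned.push(obj);
        &self.owned.last().expect("just pushed").text
    }
}

/// Returns (live objects after the loop, live objects after the pool is dropped).
fn live_counts() -> (usize, usize) {
    let live = Rc::new(Cell::new(0));
    let mut pool = ToyPool::default();
    for _ in 0..10 {
        let hello = pool.register(ToyObject::new("Hello World!", &live));
        assert_eq!(hello, "Hello World!");
        // `hello` goes out of scope here, but the pool still owns the object.
    }
    let after_loop = live.get();
    drop(pool);
    (after_loop, live.get())
}
```

In this model `live_counts()` returns `(10, 0)`: the borrow ends with each iteration, but the objects are only freed when the pool that owns them is dropped.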
60+
61+
In general we don't want unbounded memory growth during loops! One workaround
62+
is to acquire and release the GIL with each iteration of the loop.
63+
64+
```rust
65+
for _ in 0..10 {
66+
Python::with_gil(|py| -> PyResult<()> {
67+
let hello: &PyString = py.eval("\"Hello World!\"", None, None)?.extract()?;
68+
println!("Python says: {}", hello);
69+
Ok(())
70+
})?; // only one copy of `hello` at a time
71+
}
72+
```
73+
74+
It might not be practical or performant to acquire and release the GIL so many
75+
times. Another workaround is to work with the `GILPool` object directly, but
76+
this is unsafe.
77+
78+
```rust
79+
Python::with_gil(|py| -> PyResult<()> {
80+
for _ in 0..10 {
81+
let pool = unsafe { py.new_pool() };
82+
let py = pool.python();
83+
let hello: &PyString = py.eval("\"Hello World!\"", None, None)?.extract()?;
84+
println!("Python says: {}", hello);
85+
}
86+
Ok(())
87+
})?;
88+
```
89+
90+
The unsafe method `Python::new_pool` allows you to create a nested `GILPool`
91+
from which you can retrieve a new `py: Python` GIL token. Variables created
92+
with this new GIL token are bound to the nested `GILPool` and will be released
93+
when the nested `GILPool` is dropped. Here, the nested `GILPool` is dropped
94+
at the end of each loop iteration, before the `with_gil()` closure ends.
95+
96+
When doing this, you must be very careful to ensure that once the `GILPool` is
97+
dropped you do not retain access to any owned references created after the
98+
`GILPool` was created. Read the
99+
[documentation for `Python::new_pool()`]({{#PYO3_DOCS_URL}}/pyo3/prelude/struct.Python.html#method.new_pool)
100+
for more information on safety.
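
To make the effect of a nested pool concrete, here is a toy model in plain Rust. It assumes a per-thread stack of owned values and a pool that remembers the stack length at its creation; `OWNED`, `ToyPool`, `owned_len`, and `peak_with_nested_pools` are hypothetical names for this sketch and should not be read as PyO3's implementation.

```rust
use std::cell::RefCell;

thread_local! {
    // One stack of owned values per thread, shared by all toy pools.
    static OWNED: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

/// Toy nested pool: remembers how long the stack was when it was created.
struct ToyPool {
    start: usize,
}

impl ToyPool {
    fn new() -> Self {
        ToyPool { start: OWNED.with(|o| o.borrow().len()) }
    }

    /// Registers a value on top of the shared stack.
    fn register(&self, value: &str) {
        OWNED.with(|o| o.borrow_mut().push(value.to_string()));
    }
}

impl Drop for ToyPool {
    fn drop(&mut self) {
        // Release everything registered since this pool was created.
        OWNED.with(|o| o.borrow_mut().truncate(self.start));
    }
}

/// Number of values currently owned on this thread.
fn owned_len() -> usize {
    OWNED.with(|o| o.borrow().len())
}

/// Returns the largest number of values owned at once during the loop.
fn peak_with_nested_pools() -> usize {
    let _outer = ToyPool::new();
    let mut peak = 0;
    for _ in 0..10 {
        let inner = ToyPool::new();
        inner.register("Hello World!");
        peak = peak.max(owned_len());
        // `inner` is dropped here, releasing this iteration's value.
    }
    peak
}
```

With one nested pool per iteration the peak stays at 1 instead of growing to 10. The model also hints at why the real API is unsafe: dropping a pool releases everything created after it, whether or not something still refers to it.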
101+
102+
## GIL-independent Memory
103+
104+
Sometimes we need a reference to memory on Python's heap that can outlive the
105+
GIL. PyO3's `Py<PyAny>` is analogous to `Rc<T>`, but for variables whose
106+
memory is allocated on Python's heap. Cloning a `Py<PyAny>` increases its
107+
internal reference count just like cloning `Rc<T>`. The smart pointer can
108+
outlive the GIL from which it was created. It isn't magic, though. We need to
109+
reacquire the GIL to access the memory pointed to by the `Py<PyAny>`.
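
For readers less familiar with `Rc<T>`, the analogy can be checked with the standard library alone. The function name `rc_counts` is made up for this sketch; it shows the reference count rising on clone and falling on drop.

```rust
use std::rc::Rc;

/// Returns the strong count before cloning, after cloning,
/// and after dropping the clone.
fn rc_counts() -> (usize, usize, usize) {
    let first = Rc::new(String::from("Hello World!"));
    let before = Rc::strong_count(&first);
    // Cloning an `Rc` copies the pointer and bumps the count; the string is shared.
    let second = Rc::clone(&first);
    let during = Rc::strong_count(&first);
    drop(second);
    let after = Rc::strong_count(&first);
    (before, during, after)
}
```

`rc_counts()` returns `(1, 2, 1)`. The difference with `Py<T>` is that the count lives inside the Python object, so it can only be changed while the GIL is held.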
110+
111+
What happens to the memory when the last `Py<PyAny>` is dropped and its
112+
reference count reaches zero? It depends on whether we are holding the GIL.
113+
114+
```rust
115+
Python::with_gil(|py| -> PyResult<()> {
116+
let hello: Py<PyString> = py.eval("\"Hello World!\"", None, None)?.extract()?;
117+
println!("Python says: {}", hello.as_ref(py));
118+
Ok(())
119+
})?;
120+
```
121+
122+
At the end of the `Python::with_gil()` closure `hello` is dropped, and then the
123+
GIL is dropped. Since `hello` is dropped while the GIL is still held by the
124+
current thread, its memory is released to the Python garbage collector
125+
immediately.
126+
127+
This example wasn't very interesting. We could have just used a GIL-bound
128+
`&PyString` reference. What happens when the last `Py<PyAny>` is dropped while
129+
we are *not* holding the GIL?
130+
131+
```rust
132+
let hello: Py<PyString> = Python::with_gil(|py| -> PyResult<Py<PyString>> {
133+
py.eval("\"Hello World!\"", None, None)?.extract()
134+
})?;
135+
// Do some stuff...
136+
// Now sometime later in the program we want to access `hello`.
137+
Python::with_gil(|py| {
138+
println!("Python says: {}", hello.as_ref(py));
139+
});
140+
// Now we're done with `hello`.
141+
drop(hello); // Memory *not* released here.
142+
// Sometime later we need the GIL again for something...
143+
Python::with_gil(|_py| {
144+
// Memory for `hello` is released here.
145+
});
146+
```
147+
148+
When `hello` is dropped *nothing* happens to the pointed-to memory on Python's
149+
heap because nothing _can_ happen if we're not holding the GIL. Fortunately,
150+
the memory isn't leaked. PyO3 keeps track of the memory internally and will
151+
release it the next time we acquire the GIL.
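
One way to picture this deferred release is a queue of pending decrements that is flushed on the next acquire. The sketch below is a simplified model under that assumption, written with the standard library only; `ToyRuntime`, `ToyPy`, `acquire`, and `deferred_release` are hypothetical names, and the model always defers (it does not cover the case where the lock is already held).

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Toy interpreter state: a count of live objects and a count of releases
/// that were requested while the lock was not held.
#[derive(Default)]
struct ToyRuntime {
    live: AtomicUsize,
    pending: Mutex<usize>,
}

/// Toy smart pointer, loosely modelled on the behaviour described for `Py<T>`.
struct ToyPy {
    runtime: Arc<ToyRuntime>,
}

impl ToyPy {
    fn new(runtime: &Arc<ToyRuntime>) -> Self {
        runtime.live.fetch_add(1, Ordering::SeqCst);
        ToyPy { runtime: Arc::clone(runtime) }
    }
}

impl Drop for ToyPy {
    fn drop(&mut self) {
        // Without the lock we may not touch the object, so only record the request.
        *self.runtime.pending.lock().unwrap() += 1;
    }
}

/// Stands in for acquiring the GIL: flushes every release recorded so far.
fn acquire(runtime: &ToyRuntime) {
    let mut pending = runtime.pending.lock().unwrap();
    runtime.live.fetch_sub(*pending, Ordering::SeqCst);
    *pending = 0;
}

/// Returns the live-object count right after `drop` and after the next acquire.
fn deferred_release() -> (usize, usize) {
    let runtime = Arc::new(ToyRuntime::default());
    let hello = ToyPy::new(&runtime);
    drop(hello); // Nothing is released yet.
    let after_drop = runtime.live.load(Ordering::SeqCst);
    acquire(&runtime);
    (after_drop, runtime.live.load(Ordering::SeqCst))
}
```

In this model `deferred_release()` returns `(1, 0)`: the object is still counted as live after `drop`, and only disappears once the lock is taken again.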
152+
153+
We can avoid the delay in releasing memory if we are careful to drop the
154+
`Py<PyAny>` while the GIL is held.
155+
156+
```rust
157+
let hello: Py<PyString> = Python::with_gil(|py| -> PyResult<Py<PyString>> {
158+
py.eval("\"Hello World!\"", None, None)?.extract()
159+
})?;
160+
// Do some stuff...
161+
// Now sometime later in the program:
162+
Python::with_gil(|py| {
163+
println!("Python says: {}", hello.as_ref(py));
164+
drop(hello); // Memory released here.
165+
});
166+
```
167+
168+
We could also have used `Py::into_ref()`, which consumes `self`, instead of
169+
`Py::as_ref()`. But note that in addition to being slower than `as_ref()`,
170+
`into_ref()` binds the memory to the lifetime of the `GILPool`, which means
171+
that rather than being released immediately, the memory will not be released
172+
until the GIL is dropped.
173+
174+
```rust
175+
let hello: Py<PyString> = Python::with_gil(|py| -> PyResult<Py<PyString>> {
176+
py.eval("\"Hello World!\"", None, None)?.extract()
177+
})?;
178+
// Do some stuff...
179+
// Now sometime later in the program:
180+
Python::with_gil(|py| {
181+
println!("Python says: {}", hello.into_ref(py));
182+
// Memory not released yet.
183+
// Do more stuff...
184+
// Memory released here at end of `with_gil()` closure.
185+
});
186+
```

src/instance.rs

Lines changed: 13 additions & 0 deletions
@@ -166,6 +166,19 @@ pub unsafe trait PyNativeType: Sized {
166166
/// implement the [`PyGCProtocol`](crate::class::gc::PyGCProtocol). If your pyclass
167167
/// contains other Python objects you should implement this protocol to avoid leaking memory.
168168
///
169+
/// # A note on Python reference counts
170+
///
171+
/// Dropping a [`Py`]`<T>` will eventually decrease Python's reference count
172+
/// of the pointed-to variable, allowing Python's garbage collector to free
173+
/// the associated memory, but this may not happen immediately. This is
174+
/// because a [`Py`]`<T>` can be dropped at any time, but the Python reference
175+
/// count can only be modified when the GIL is held.
176+
///
177+
/// If a [`Py`]`<T>` is dropped while its thread happens to be holding the
178+
/// GIL then the Python reference count will be decreased immediately.
179+
/// Otherwise, the reference count will be decreased the next time the GIL is
180+
/// reacquired.
181+
///
169182
/// # A note on `Send` and `Sync`
170183
///
171184
/// Accessing this object is threadsafe, since any access to its API requires a [`Python<'py>`](crate::Python) token.

src/python.rs

Lines changed: 33 additions & 0 deletions
@@ -129,6 +129,39 @@ impl PartialOrd<(u8, u8, u8)> for PythonVersionInfo<'_> {
129129
///
130130
/// To avoid deadlocking, you should release the GIL before trying to lock a mutex, e.g. with
131131
/// [Python::allow_threads].
132+
///
133+
/// # A note on Python reference counts
134+
///
135+
/// The [`Python`] type can be used to generate references to variables in
136+
/// Python's memory e.g. using [`Python::eval()`] and indirectly e.g.
137+
/// using [`PyModule::import()`], which takes a [`Python`] token
138+
/// as one of its arguments to prove the GIL is held. The lifetime of these
139+
/// references is bound to the GIL (more precisely the [`GILPool`], see
140+
/// [`Python::new_pool()`]), which can cause surprising results with respect to
141+
/// when a variable's reference count is decreased so that it can be released to
142+
/// the Python garbage collector. For example:
143+
///
144+
/// ```rust
145+
/// # use pyo3::prelude::*;
146+
/// # use pyo3::types::PyString;
147+
/// # fn main () -> PyResult<()> {
148+
/// Python::with_gil(|py| -> PyResult<()> {
149+
/// for _ in 0..10 {
150+
/// let hello: &PyString = py.eval("\"Hello World!\"", None, None)?.extract()?;
151+
/// println!("Python says: {}", hello.to_str()?);
152+
/// }
153+
/// Ok(())
154+
/// })
155+
/// # }
156+
/// ```
157+
///
158+
/// The variable `hello` is dropped at the end of each loop iteration, but the
159+
/// lifetime of the pointed-to memory is bound to the [`GILPool`] and will not
160+
/// be dropped until the [`GILPool`] is dropped at the end of
161+
/// [`Python::with_gil()`]. Only then is each `hello` variable's Python
162+
/// reference count decreased. This means at the last line of the example there
163+
/// are 10 copies of `hello` in Python's memory, not just one as we might expect
164+
/// from typical Rust lifetimes.
132165
#[derive(Copy, Clone)]
133166
pub struct Python<'p>(PhantomData<&'p GILGuard>);
134167
