| 1 | +# Memory Management |
| 2 | + |
| 3 | +Rust and Python have very different notions of memory management. Rust has |
| 4 | +a strict memory model with concepts of ownership, borrowing, and lifetimes, |
| 5 | +where memory is freed at predictable points in program execution. Python has |
| 6 | +a looser memory model in which variables are reference-counted with shared, |
| 7 | +mutable state by default. A global interpreter lock (GIL) is needed to prevent |
| 8 | +race conditions, and a garbage collector is needed to break reference cycles. |
| 9 | +Memory in Python is freed eventually by the garbage collector, but not usually |
| 10 | +in a predictable way. |
| 11 | + |
| 12 | +PyO3 bridges the Rust and Python memory models with two different strategies for |
| 13 | +accessing memory allocated on Python's heap from inside Rust. These are |
| 14 | +GIL-bound ("owned") references and GIL-independent `Py<PyAny>` smart pointers. |
| 15 | + |
| 16 | +## GIL-bound Memory |
| 17 | + |
| 18 | +PyO3's GIL-bound "owned references" (`&PyAny` etc.) make PyO3 more ergonomic to |
| 19 | +use by ensuring that their lifetime can never be longer than the duration the |
| 20 | +Python GIL is held. This means that most of PyO3's API can assume the GIL is |
| 21 | +held. (If PyO3 could not assume this, every PyO3 API would need to take a |
| 22 | +`Python` GIL token to prove that the GIL is held.) This allows us to write |
| 23 | +very simple and easy-to-understand programs like this: |
| 24 | + |
| 25 | +```rust |
| 26 | +Python::with_gil(|py| -> PyResult<()> { |
| 27 | +    let hello: &PyString = py.eval("\"Hello World!\"", None, None)?.extract()?; |
| 28 | +    println!("Python says: {}", hello); |
| 29 | +    Ok(()) |
| 30 | +})?; |
| 31 | +``` |
| 32 | + |
| 33 | +Internally, calling `Python::with_gil()` or `Python::acquire_gil()` creates a |
| 34 | +`GILPool` which owns the memory pointed to by the reference. In the example |
| 35 | +above, the lifetime of the reference `hello` is bound to the `GILPool`. When |
| 36 | +the `with_gil()` closure ends or the `GILGuard` from `acquire_gil()` is dropped, |
| 37 | +the `GILPool` is also dropped and the Python reference counts of the variables |
| 38 | +it owns are decreased, releasing them to the Python garbage collector. Most |
| 39 | +of the time we don't have to think about this, but consider the following: |
| 40 | + |
| 41 | +```rust |
| 42 | +Python::with_gil(|py| -> PyResult<()> { |
| 43 | +    for _ in 0..10 { |
| 44 | +        let hello: &PyString = py.eval("\"Hello World!\"", None, None)?.extract()?; |
| 45 | +        println!("Python says: {}", hello); |
| 46 | +    } |
| 47 | +    // There are 10 copies of `hello` on Python's heap here. |
| 48 | +    Ok(()) |
| 49 | +})?; |
| 50 | +``` |
| 51 | + |
| 52 | +We might assume that the `hello` variable's memory is freed at the end of each |
| 53 | +loop iteration, but in fact we create 10 copies of `hello` on Python's heap. |
| 54 | +This may seem surprising at first, but it is completely consistent with Rust's |
| 55 | +memory model. The `hello` variable is dropped at the end of each loop, but it |
| 56 | +is only a reference to the memory owned by the `GILPool`, and its lifetime is |
| 57 | +bound to the `GILPool`, not the for loop. The `GILPool` isn't dropped until |
| 58 | +the end of the `with_gil()` closure, at which point the 10 copies of `hello` |
| 59 | +are finally released to the Python garbage collector. |
| 60 | + |
| 61 | +In general we don't want unbounded memory growth during loops! One workaround |
| 62 | +is to acquire and release the GIL with each iteration of the loop. |
| 63 | + |
| 64 | +```rust |
| 65 | +for _ in 0..10 { |
| 66 | +    Python::with_gil(|py| -> PyResult<()> { |
| 67 | +        let hello: &PyString = py.eval("\"Hello World!\"", None, None)?.extract()?; |
| 68 | +        println!("Python says: {}", hello); |
| 69 | +        Ok(()) |
| 70 | +    })?; // only one copy of `hello` at a time |
| 71 | +} |
| 72 | +``` |
| 73 | + |
| 74 | +It might not be practical or performant to acquire and release the GIL so many |
| 75 | +times. Another workaround is to work with the `GILPool` object directly, but |
| 76 | +this is unsafe. |
| 77 | + |
| 78 | +```rust |
| 79 | +Python::with_gil(|py| -> PyResult<()> { |
| 80 | +    for _ in 0..10 { |
| 81 | +        let pool = unsafe { py.new_pool() }; |
| 82 | +        let py = pool.python(); |
| 83 | +        let hello: &PyString = py.eval("\"Hello World!\"", None, None)?.extract()?; |
| 84 | +        println!("Python says: {}", hello); |
| 85 | +    } |
| 86 | +    Ok(()) |
| 87 | +})?; |
| 88 | +``` |
| 89 | + |
| 90 | +The unsafe method `Python::new_pool` allows you to create a nested `GILPool` |
| 91 | +from which you can retrieve a new `py: Python` GIL token. Variables created |
| 92 | +with this new GIL token are bound to the nested `GILPool` and will be released |
| 93 | +when the nested `GILPool` is dropped. Here, the nested `GILPool` is dropped |
| 94 | +at the end of each loop iteration, before the `with_gil()` closure ends. |
| 95 | + |
| 96 | +When doing this, you must be very careful to ensure that once the `GILPool` is |
| 97 | +dropped you do not retain access to any owned references created after the |
| 98 | +`GILPool` was created. Read the |
| 99 | +[documentation for `Python::new_pool()`]({{#PYO3_DOCS_URL}}/pyo3/prelude/struct.Python.html#method.new_pool) |
| 100 | +for more information on safety. |
| 101 | + |
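The difference between one pool for the whole loop and a fresh pool per iteration can be sketched in plain Rust, without PyO3. This is only an analogy under stated assumptions: `HeapObject`, `Pool`, and both functions are hypothetical names invented for illustration, and the live-object counter stands in for objects on Python's heap. It is not how `GILPool` is implemented.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical stand-in for an object on Python's heap; it updates a shared
/// live-object counter when it is created and when it is freed.
struct HeapObject {
    live: Rc<Cell<usize>>,
}

impl HeapObject {
    fn new(live: &Rc<Cell<usize>>) -> Self {
        live.set(live.get() + 1);
        HeapObject { live: Rc::clone(live) }
    }
}

impl Drop for HeapObject {
    fn drop(&mut self) {
        self.live.set(self.live.get() - 1);
    }
}

/// Hypothetical stand-in for a pool: it owns every object registered with it
/// and frees them all at once when it is dropped.
struct Pool {
    owned: Vec<HeapObject>,
}

/// One pool for the whole loop: objects accumulate until the pool is dropped.
/// Returns the peak number of live objects.
fn peak_with_single_pool(iterations: usize) -> usize {
    let live = Rc::new(Cell::new(0usize));
    let mut pool = Pool { owned: Vec::new() };
    let mut peak = 0usize;
    for _ in 0..iterations {
        pool.owned.push(HeapObject::new(&live));
        peak = peak.max(live.get());
    }
    peak
}

/// A fresh pool per iteration: each object is freed before the next is made.
/// Returns the peak number of live objects.
fn peak_with_pool_per_iteration(iterations: usize) -> usize {
    let live = Rc::new(Cell::new(0usize));
    let mut peak = 0usize;
    for _ in 0..iterations {
        let mut pool = Pool { owned: Vec::new() };
        pool.owned.push(HeapObject::new(&live));
        peak = peak.max(live.get());
    } // `pool` is dropped here, freeing the object it owns
    peak
}
```

With ten iterations, the single-pool version peaks at ten live objects while the per-iteration version never holds more than one, which mirrors the behaviour described for the loop examples in this section.
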
| 102 | +## GIL-independent Memory |
| 103 | + |
| 104 | +Sometimes we need a reference to memory on Python's heap that can outlive the |
| 105 | +GIL. PyO3's `Py<PyAny>` is analogous to `Rc<T>`, but for variables whose |
| 106 | +memory is allocated on Python's heap. Cloning a `Py<PyAny>` increases its |
| 107 | +internal reference count just like cloning `Rc<T>`. The smart pointer can |
| 108 | +outlive the GIL from which it was created. It isn't magic, though. We need to |
| 109 | +reacquire the GIL to access the memory pointed to by the `Py<PyAny>`. |
| 110 | + |
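For readers less familiar with `Rc<T>`, the reference-counting behaviour that the comparison relies on can be seen with the standard library alone. This is a minimal sketch of `Rc<T>` itself, not of `Py<PyAny>`; the function name `clone_and_drop_counts` is made up for illustration.

```rust
use std::rc::Rc;

/// Returns the strong reference count observed after cloning an `Rc`,
/// and again after dropping that clone.
fn clone_and_drop_counts() -> (usize, usize) {
    let first = Rc::new(String::from("Hello World!"));
    let second = Rc::clone(&first); // bumps the shared reference count
    let after_clone = Rc::strong_count(&first);
    drop(second); // decrements it again; the string stays alive via `first`
    let after_drop = Rc::strong_count(&first);
    (after_clone, after_drop)
}
```

The value is freed only when the last owner goes away. The key difference for `Py<PyAny>` is that the count lives on the Python object, so adjusting it is tied to holding the GIL.
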
| 111 | +What happens to the memory when the last `Py<PyAny>` is dropped and its |
| 112 | +reference count reaches zero? It depends whether or not we are holding the GIL. |
| 113 | + |
| 114 | +```rust |
| 115 | +Python::with_gil(|py| -> PyResult<()> { |
| 116 | +    let hello: Py<PyString> = py.eval("\"Hello World!\"", None, None)?.extract()?; |
| 117 | +    println!("Python says: {}", hello.as_ref(py)); |
| 118 | +    Ok(()) |
| 119 | +})?; |
| 120 | +``` |
| 121 | + |
| 122 | +At the end of the `Python::with_gil()` closure `hello` is dropped, and then the |
| 123 | +GIL is released. Since `hello` is dropped while the GIL is still held by the |
| 124 | +current thread, its memory is released to the Python garbage collector |
| 125 | +immediately. |
| 126 | + |
| 127 | +This example wasn't very interesting. We could have just used a GIL-bound |
| 128 | +`&PyString` reference. What happens when the last `Py<PyAny>` is dropped while |
| 129 | +we are *not* holding the GIL? |
| 130 | + |
| 131 | +```rust |
| 132 | +let hello: Py<PyString> = Python::with_gil(|py| -> PyResult<Py<PyString>> { |
| 133 | +    py.eval("\"Hello World!\"", None, None)?.extract() |
| 134 | +})?; |
| 135 | +// Do some stuff... |
| 136 | +// Now sometime later in the program we want to access `hello`. |
| 137 | +Python::with_gil(|py| { |
| 138 | +    println!("Python says: {}", hello.as_ref(py)); |
| 139 | +}); |
| 140 | +// Now we're done with `hello`. |
| 141 | +drop(hello); // Memory *not* released here. |
| 142 | +// Sometime later we need the GIL again for something... |
| 143 | +Python::with_gil(|_py| { |
| 144 | +    // Memory for `hello` is released here. |
| 145 | +}); |
| 146 | +``` |
| 147 | + |
| 148 | +When `hello` is dropped *nothing* happens to the pointed-to memory on Python's |
| 149 | +heap because nothing _can_ happen if we're not holding the GIL. Fortunately, |
| 150 | +the memory isn't leaked. PyO3 keeps track of the memory internally and will |
| 151 | +release it the next time we acquire the GIL. |
| 152 | + |
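The deferred-release idea can be illustrated with a small plain-Rust model. This is a sketch under stated assumptions, not PyO3's actual bookkeeping: `Interpreter`, `Handle`, `with_lock`, and `deferred_release_demo` are hypothetical names, and a boolean flag stands in for the GIL.

```rust
use std::cell::{Cell, RefCell};

/// Hypothetical stand-in for interpreter state: whether the lock is held,
/// which objects are waiting to be released, and which have been released.
#[derive(Default)]
struct Interpreter {
    lock_held: Cell<bool>,
    pending: RefCell<Vec<&'static str>>,
    released: RefCell<Vec<&'static str>>,
}

/// Hypothetical stand-in for a smart pointer to an interpreter-owned object.
struct Handle<'a> {
    name: &'static str,
    interp: &'a Interpreter,
}

impl Drop for Handle<'_> {
    fn drop(&mut self) {
        if self.interp.lock_held.get() {
            // Lock held: the object can be released right away.
            self.interp.released.borrow_mut().push(self.name);
        } else {
            // Lock not held: remember the object and release it later.
            self.interp.pending.borrow_mut().push(self.name);
        }
    }
}

impl Interpreter {
    /// Acquire the lock, flush deferred releases, run `f`, release the lock.
    fn with_lock<R>(&self, f: impl FnOnce() -> R) -> R {
        self.lock_held.set(true);
        self.released
            .borrow_mut()
            .extend(self.pending.borrow_mut().drain(..));
        let result = f();
        self.lock_held.set(false);
        result
    }
}

/// Drops a handle without the lock, then acquires the lock once.
/// Returns how many objects were released before and after that acquisition.
fn deferred_release_demo() -> (usize, usize) {
    let interp = Interpreter::default();
    let handle = Handle { name: "hello", interp: &interp };
    drop(handle); // lock not held: the release is only queued
    let released_before = interp.released.borrow().len();
    interp.with_lock(|| {}); // the next acquisition flushes the queue
    let released_after = interp.released.borrow().len();
    (released_before, released_after)
}
```

In this model nothing is released when the handle is dropped without the lock, and the queued release happens on the next call to `with_lock`, matching the sequence of events described above.
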
| 153 | +We can avoid the delay in releasing memory if we are careful to drop the |
| 154 | +`Py<PyAny>` while the GIL is held. |
| 155 | + |
| 156 | +```rust |
| 157 | +let hello: Py<PyString> = Python::with_gil(|py| -> PyResult<Py<PyString>> { |
| 158 | +    py.eval("\"Hello World!\"", None, None)?.extract() |
| 159 | +})?; |
| 160 | +// Do some stuff... |
| 161 | +// Now sometime later in the program: |
| 162 | +Python::with_gil(|py| { |
| 163 | +    println!("Python says: {}", hello.as_ref(py)); |
| 164 | +    drop(hello); // Memory released here. |
| 165 | +}); |
| 166 | +``` |
| 167 | + |
| 168 | +We could also have used `Py::into_ref()`, which consumes `self`, instead of |
| 169 | +`Py::as_ref()`. But note that in addition to being slower than `as_ref()`, |
| 170 | +`into_ref()` binds the memory to the lifetime of the `GILPool`, which means |
| 171 | +that rather than being released immediately, the memory will not be released |
| 172 | +until the GIL is released. |
| 173 | + |
| 174 | +```rust |
| 175 | +let hello: Py<PyString> = Python::with_gil(|py| -> PyResult<Py<PyString>> { |
| 176 | +    py.eval("\"Hello World!\"", None, None)?.extract() |
| 177 | +})?; |
| 178 | +// Do some stuff... |
| 179 | +// Now sometime later in the program: |
| 180 | +Python::with_gil(|py| { |
| 181 | +    println!("Python says: {}", hello.into_ref(py)); |
| 182 | +    // Memory not released yet. |
| 183 | +    // Do more stuff... |
| 184 | +    // Memory released here at end of `with_gil()` closure. |
| 185 | +}); |
| 186 | +``` |