Skip to content

Add support for mappingproxy objects #2435

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 22 commits into from

Conversation

dignissimus
Copy link

This pull request adds support for mappingproxy objects.

Resolves #2108

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for implementing this!

It's overall very good. To make sure we get this API right when we merge, I've placed quite a few comments around the place with some things that need changing and then also some places where we can discuss the design a little to ensure we make the right choices.

Comment on lines 456 to 457
let mappingproxy = <PyMappingProxy as PyTryFrom>::try_from(ob)?;
mappingproxy.copy()?.extract()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This .copy() call is sub-optimal as it will create a whole new copy as a Python dict first. Instead, we should just create the HashMap from iterating the mapping proxy.

Maybe to make the code reusable we could refactor into a helper something like this:

fn mappingproxy_to_items_sequence<'py, K: FromPyObject<'py>, V: FromPyObject<'py>>(mappingproxy: &'py PyMappingProxy) -> impl IntoIterator<Item = PyResult<(K, V)> {
    mappingproxy.iter()
        .map(|(key, value)| Ok((key.extract()?, value.extract()?)))
}

and then creating the HashMap should just become

mappingproxy_to_items_sequence(mappingproxy).collect()?

}
Ok(ret)
}
Err(_) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the object is not a mappingproxy it would be nice to return this Err(_) from the dict conversion, to preserve existing UX.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully resolved in c5e434d

Comment on lines +41 to +53
/// Creates a new mappingproxy from the sequence given.
///
/// The sequence must consist of `(PyObject, PyObject)`.
///
/// Returns an error on invalid input. In the case of key collisions,
/// this keeps the last entry seen.
#[cfg(not(PyPy))]
pub fn from_sequence(py: Python<'_>, seq: PyObject) -> PyResult<&PyMappingProxy> {
unsafe {
let dict = py.from_owned_ptr::<PyDict>(PyDict::from_sequence(py, seq)?.as_ptr());
py.from_owned_ptr_or_err(ffi::PyDictProxy_New(dict.as_ptr()))
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is just a direct copy of the PyDict equivalent? Depending on what we decide for PyMappingProxy::new, this may or may not be necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking that this isn't necessary, users will just be able to call PyMappingProxy::new(PyDict::from_sequence(...).as_mapping())

Comment on lines +55 to +58
/// Returns a new mappingproxy that contains the same key-value pairs as self.
///
/// This is equivalent to the Python expression `self.copy()`.
pub fn copy(&self) -> PyResult<&PyDict> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think &PyDict is correct, looks like it calls copy on the inner mapping:

https://github.com/python/cpython/blob/66f5add1f0ac801cba5ece36e6cfd68985d82029/Objects/descrobject.c#L1135

So the return type should either be &PyAny or (maybe) &PyMapping.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump for this.

}

#[cfg(test)]
mod tests {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice set of tests!

@@ -0,0 +1,86 @@
use std::collections::HashMap;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious why these tests weren't just written as unit tests too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question still stands.

@davidhewitt
Copy link
Member

Hi @dignissimus, just wondering what the status of this is and whether you're waiting for any input from me? No rush at all, just browsing our backlog of PRs.

@dignissimus
Copy link
Author

Hi @dignissimus, just wondering what the status of this is and whether you're waiting for any input from me? No rush at all, just browsing our backlog of PRs.

Hey! For the elements parameter, I'm looking for input on the type? When writing it, I saw the benefit of impl IntoPyDict as the fact that it covered types of data, so they don't have to be converted. to other datatypes. But if something else works better, I'd be willing to change it.

@dignissimus
Copy link
Author

Also, some of the builds fail (emscripten) for reasons outside of testing, if I remember correctly it might be to do with imports? I'm not able to fix it on my own without help. I'd need guidance to amend it or have somebody else add a commit to the PR.

Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool - added a bunch of suggestions which should help you continue moving!

@@ -0,0 +1,86 @@
use std::collections::HashMap;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question still stands.

use pyo3::prelude::Python;
use pyo3::types::{IntoPyMappingProxy, PyInt, PyString};

const LEN: usize = 10_000_000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think reducing this size may fix emscripten; I think it's simply asking for an allocation larger than emscripten can support.

Comment on lines +55 to +58
/// Returns a new mappingproxy that contains the same key-value pairs as self.
///
/// This is equivalent to the Python expression `self.copy()`.
pub fn copy(&self) -> PyResult<&PyDict> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump for this.


impl PyMappingProxy {
/// Creates a mappingproxy from an object.
pub fn new(py: Python<'_>, elements: impl IntoPyDict) -> PyResult<&'_ PyMappingProxy> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should go with &PyMapping, we can backwards-compatibly widen to &PyAny if that turns out to make sense.

}
Err(msg) => {
if let Ok(mappingproxy) = <PyMappingProxy as PyTryFrom>::try_from(ob) {
mappingproxy.copy()?.extract()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still like to avoid this copy by iterating the mappingproxy.

Ok(ret)
}
Err(_) => {
let mappingproxy = <PyMappingProxy as PyTryFrom>::try_from(ob)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment r.e. copy here, and that hashbrown / indexmap implementations still needed

Comment on lines +41 to +53
/// Creates a new mappingproxy from the sequence given.
///
/// The sequence must consist of `(PyObject, PyObject)`.
///
/// Returns an error on invalid input. In the case of key collisions,
/// this keeps the last entry seen.
#[cfg(not(PyPy))]
pub fn from_sequence(py: Python<'_>, seq: PyObject) -> PyResult<&PyMappingProxy> {
unsafe {
let dict = py.from_owned_ptr::<PyDict>(PyDict::from_sequence(py, seq)?.as_ptr());
py.from_owned_ptr_or_err(ffi::PyDictProxy_New(dict.as_ptr()))
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking that this isn't necessary, users will just be able to call PyMappingProxy::new(PyDict::from_sequence(...).as_mapping())

@dignissimus
Copy link
Author

Cool - added a bunch of suggestions which should help you continue moving!

Thanks! I'll make the changes this Saturday

@DataTriny
Copy link
Contributor

Hello @dignissimus

Thank you for contributing to PyO3!

The project is undergoing changes on its licensing model to better align with the rest of the Rust crates ecosystem.

Before your work gets merged, please express your consent by leaving a comment on this pull request.

Thank you!

@ofek
Copy link

ofek commented Jul 26, 2023

Has there been any update on this?

@davidhewitt
Copy link
Member

No progress since what is seen here, it's also not a priority feature for me so unlikely to move any time soon without further external contributions.

@davidhewitt
Copy link
Member

Superseded by #3521

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for mappingproxy? (standard library read-only dict)
4 participants