-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix segmentation fault on invalid SMARTS #19
base: main
Are you sure you want to change the base?
Conversation
src/graphmol/rw_mol.rs
Outdated
@@ -48,11 +48,17 @@ impl RWMol { | |||
ROMol { ptr } | |||
} | |||
|
|||
pub fn from_smarts(smarts: &str) -> Result<Self, Box<dyn std::error::Error>> { | |||
pub fn from_smarts(smarts: &str) -> Result<Self, &str> { | |||
if smarts.is_empty(){ return Err("empty SMARTS"); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I do not study chemistry myself, but informatics, I am not sure, if this is the wanted behavior or if you'd prefer a wildcard here or if maybe an empty string is fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we probably don't want to allow an empty smarts, so this works for me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for contributing this PR! It's close but I think a little guidance would help. First off, it's good to reiterate the structure of this git repository. It contains two rust crates: rdkit and rdkit-sys. The sys crate is where pointers are built and where rust meets the C++ code of rdkit. You want all exception handling to happen in the sys crate. The rdkit
crate is for high-level convenience. Every pointer type from rdkit-sys
is wrapped in a struct -- with hidden pointers, as you've seen, we wan't to create a "nanny state" where our users can't do unsafe things. I believe you want to take a look at the sys crate as part of your bug fix.
Now for my feedback on this code snippet:
Using the Result
type is absolutely the right way to go -- we need to discriminate between good values and error values. My only objection is the type on the error path, &str
is not very Rust-y. We want to avoid "stringly-typed" data and instead use more enums.
See
Lines 21 to 34 in f6da32a
pub fn from_smiles(smiles: &str) -> Result<Self, ROMolError> { | |
let_cxx_string!(smiles_cxx_string = smiles); | |
let ptr = ro_mol_ffi::smiles_to_mol(&smiles_cxx_string); | |
match ptr { | |
Ok(ptr) => { | |
if ptr.is_null() { | |
Err(ROMolError::UnknownConversionError) | |
} else { | |
Ok(ROMol { ptr }) | |
} | |
} | |
Err(e) => Err(ROMolError::ConversionException(e.to_string())), | |
} | |
} |
RWMolError
enum and instantiate it in your error path.
My next concern is performing condition checking in rw_mol.rs
itself -- in general I want rdkit (c++) to drive what happens. It's very possible in the future our friends at the RDKit project could change how empty strings behave (although unlikely) and I don't want the Rust logic to be at odds with whatever they decide. Instead, check if the returned pointer is null and if so, build the appropriate enum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for the detailed feedback! You are totally right about the enums and I am going to adapt it to the handling of errors in ro_mol.rs.
Concerning the empty SMARTS, I checked what happens in Python and they return a valid Mol object that doesn't have any contents (so not a nullpointer, but empty). So maybe we should go for that, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to model what the C++ library does -- it does mean that python folks are going to have to rethink how they use rdkit. The rdkit-rs implementation puts errors upfront and in your face. Clear error propagation is important for our users of the the cheminee search engine, for example. We don't want too much "magic".
So think of this project as C++ flavored RDKit 😄. Hopefully you are familiar with the autogenerated rdkit docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a deeper look at the rdkit-Documentation, though I must say I have never dealt with C++. I still found an interesting comment in the actual C++ code stating: // empty strings produce empty molecules
. So maybe having an error in Rust at this part would not really fit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, please address my last comment and I'm happy to merge this and push it up to crates.io. Thanks for your perseverance and patience!
Thank you @aleebberg for finding this! We have been fixing pointer issues lately and looks like we missed this one. @xrl I'm going to merge this as it seems to be a good stop-gap measure, but we probably want to look into what's happening to the pointer in |
src/graphmol/rw_mol.rs
Outdated
@@ -48,11 +48,17 @@ impl RWMol { | |||
ROMol { ptr } | |||
} | |||
|
|||
pub fn from_smarts(smarts: &str) -> Result<Self, Box<dyn std::error::Error>> { | |||
pub fn from_smarts(smarts: &str) -> Result<Self, &str> { | |||
if smarts.is_empty(){ return Err("empty SMARTS"); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we probably don't want to allow an empty smarts, so this works for me!
@aleebberg Clippy is mainly complaining about a space before an "=" sign. Would you mind adding that? Also, the failed test looks to be related to an issue we introduced on our side (i.e. making a mutable romol). Feel free to add that "mut" if you like. |
src/graphmol/rw_mol.rs
Outdated
@@ -48,11 +48,17 @@ impl RWMol { | |||
ROMol { ptr } | |||
} | |||
|
|||
pub fn from_smarts(smarts: &str) -> Result<Self, Box<dyn std::error::Error>> { | |||
pub fn from_smarts(smarts: &str) -> Result<Self, &str> { | |||
if smarts.is_empty(){ return Err("empty SMARTS"); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for contributing this PR! It's close but I think a little guidance would help. First off, it's good to reiterate the structure of this git repository. It contains two rust crates: rdkit and rdkit-sys. The sys crate is where pointers are built and where rust meets the C++ code of rdkit. You want all exception handling to happen in the sys crate. The rdkit
crate is for high-level convenience. Every pointer type from rdkit-sys
is wrapped in a struct -- with hidden pointers, as you've seen, we wan't to create a "nanny state" where our users can't do unsafe things. I believe you want to take a look at the sys crate as part of your bug fix.
Now for my feedback on this code snippet:
Using the Result
type is absolutely the right way to go -- we need to discriminate between good values and error values. My only objection is the type on the error path, &str
is not very Rust-y. We want to avoid "stringly-typed" data and instead use more enums.
See
Lines 21 to 34 in f6da32a
pub fn from_smiles(smiles: &str) -> Result<Self, ROMolError> { | |
let_cxx_string!(smiles_cxx_string = smiles); | |
let ptr = ro_mol_ffi::smiles_to_mol(&smiles_cxx_string); | |
match ptr { | |
Ok(ptr) => { | |
if ptr.is_null() { | |
Err(ROMolError::UnknownConversionError) | |
} else { | |
Ok(ROMol { ptr }) | |
} | |
} | |
Err(e) => Err(ROMolError::ConversionException(e.to_string())), | |
} | |
} |
RWMolError
enum and instantiate it in your error path.
My next concern is performing condition checking in rw_mol.rs
itself -- in general I want rdkit (c++) to drive what happens. It's very possible in the future our friends at the RDKit project could change how empty strings behave (although unlikely) and I don't want the Rust logic to be at odds with whatever they decide. Instead, check if the returned pointer is null and if so, build the appropriate enum.
I enabled github actions for this PR, now you can see the formatting errors here: https://github.com/rdkit-rs/rdkit/actions/runs/7128018658/job/19420335878?pr=19#step:7:1211 . |
Hi, |
Fixes #18