-
Notifications
You must be signed in to change notification settings - Fork 136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: python imports #438
fix: python imports #438
Conversation
I think in general this is the right idea. I'll look into the details later today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for opening a PR! The idea is correct, but I think it's better to move setting FILE_PATH
and ROOT_PATH
up instead of pushing them down. I've left detailed comments on what I have in mind. I think this might make the change a little less invasive as well. Let me know if you have questions!
source_path, | ||
source_root, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of pushing source path and root path all the way down, I think it's better to set FILE_PATH_VAR
and ROOT_PATH_VAR
in the globals
object here. I expect that at some point this will need to be more flexible still, to allow language specific logic for computing source roots and relative paths (but that's a long term thing). But moving the place where we set those variables is in line with that idea. I think it will also make the change less invasive.
tree-sitter-stack-graphs/src/lib.rs
Outdated
source_path: &Path, | ||
source_root: &Path, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These parameters won't be necessary anymore, and other places where they are propagated.
tree-sitter-stack-graphs/src/lib.rs
Outdated
if globals.get(&FILE_PATH_VAR.into()).is_none() { | ||
let file_name = self.stack_graph[self.file].to_string(); | ||
let file_name = self.source_path.to_str().unwrap().to_string(); | ||
globals | ||
.add(FILE_PATH_VAR.into(), file_name.into()) | ||
.expect("Failed to set FILE_PATH"); | ||
} | ||
|
||
if globals.get(&ROOT_PATH_VAR.into()).is_none() { | ||
let root_path = self.source_root.to_str().unwrap().to_string(); | ||
globals | ||
.add(ROOT_PATH_VAR.into(), root_path.into()) | ||
.expect("Failed to set ROOT_PATH"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole block would be removed here, and move into the indexer. The conditionals won't be needed there anymore.
let mut globals = Variables::new(); | ||
Self::load_globals_from_config_str(&config, &mut globals)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could add to globals
here as well, but I think it is nicer to just require those variables to be set in builtins.cfg
. See for example:
@@ -13,6 +13,7 @@ | |||
;; ^^^^^^^^^^^^^^^^ | |||
|
|||
global FILE_PATH | |||
global ROOT_PATH |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you set a default value here, it makes life a bit easier because you don't have to set ROOT_PATH
unless you need to.
global ROOT_PATH | |
global ROOT_PATH = "" |
@@ -428,10 +428,19 @@ impl<'a> Indexer<'a> { | |||
cancellation_flag: &dyn CancellationFlag, | |||
) -> std::result::Result<(), BuildErrorWithSource<'b>> { | |||
let relative_source_path = source_path.strip_prefix(source_root).unwrap(); | |||
// here the file should also have stripped the source_root from its path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want to use the relative path for the file
. This would very easily lead to conflicts, for example if you index multiple source directories with the CLI. But if we set the global variables explicitly, it won't be an issue that the scope graph records the full path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep 100% agreed, this is an old comment I forgot to remove
Thanks for the review, will look into it and make those changes by the end of the weekend I believe. I agree moving it in the globals is overall a good approach for the reasons you mentioned, however don't you think that this could lead to some runtime errors as it will not be checked anymore ? I think at least file path (but probably root path too) are standard and should be required somewhere (/ checked at compile time) ? |
I implemented the changes, indeed it's a far better cascade. I also thought that we might want to create variables directly from the file path, as it is always needed |
1412bf5
to
b326053
Compare
It is true that callers can now forget this, which would lead to a runtime error. I think the impact of this problem will be low, since any attempt to run the TSG without the right variables set will fail. (It's not silently set to a default unless one is explicitly set in the TSG.) So any basic test triggering this code path should immediately reveal the problem. The idea we have for long term is that callers wouldn't have to set these variables explicitly. Instead, languages would have some kind of source folder analyzer as part of their configuration. That would determine which files in a directory can be analyzed as part of the languages and which variables are set for them. That being said, with these changes we are in an awkward middle. We might keep the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing my comments. I think this is going in the right direction! I left a few more remarks, but these are relatively small. The overall structure looks good now.
tree-sitter-stack-graphs/src/lib.rs
Outdated
|
||
// FILE_PATH is mandatory | ||
globals | ||
.get(&FILE_PATH_VAR.into()) | ||
.expect("FILE_PATH not set"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this is the right approach. While FILE_PATH
is required in all TSG's we actually use, there's nothing inherently here that requires it.
I'm thinking we might just keep the old code here. It will act as a fallback and we don't unnecessarily introduce a breaking change here.
let builtins_path_var = Path::new("<builtins>"); | ||
builtins_globals | ||
.add( | ||
FILE_PATH_VAR.into(), | ||
builtins_path_var.to_str().unwrap().into(), | ||
) | ||
.expect("failed to add file path variable"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're not using this anywhere else, I'd just inline `builtins_path_var here.
let builtins_path_var = Path::new("<builtins>"); | |
builtins_globals | |
.add( | |
FILE_PATH_VAR.into(), | |
builtins_path_var.to_str().unwrap().into(), | |
) | |
.expect("failed to add file path variable"); | |
builtins_globals | |
.add( | |
FILE_PATH_VAR.into(), | |
"<builtins>".into(), | |
) | |
.expect("failed to add file path variable"); |
let file_name = path.to_string_lossy(); | ||
let file: stack_graphs::arena::Handle<stack_graphs::graph::File> = | ||
graph.add_file(&file_name).unwrap(); | ||
let mut globals = Variables::new(); | ||
|
||
globals | ||
.add(FILE_PATH_VAR.into(), path.to_str().unwrap().into()) | ||
.expect("failed to add file path variable"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The builtins are a bit special and the actual path is not relevant. They are usually definitions that are implicitly defined, like a runtime library or something.
I'm not sure why we are using path
here in add_file
. I think we should be using "<builtins>"
here as well, both in add_file
and as the value in globals
. As a little code cleanup we could define a top-level constant BUILTINS_FILENAME
and replace the literals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if having chevrons could cause issues (as it's not a standard path character I believe). But it was working as is so let's keep it
static FILE_PATH_VAR: &'static str = "FILE_PATH"; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make the constant in the crate public instead of redefining it here.
tree-sitter-stack-graphs/src/lib.rs
Outdated
static ROOT_NODE_VAR: &'static str = "ROOT_NODE"; | ||
static JUMP_TO_SCOPE_NODE_VAR: &'static str = "JUMP_TO_SCOPE_NODE"; | ||
static FILE_PATH_VAR: &'static str = "FILE_PATH"; | ||
static ROOT_PATH_VAR: &'static str = "ROOT_PATH"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static ROOT_NODE_VAR: &'static str = "ROOT_NODE"; | |
static JUMP_TO_SCOPE_NODE_VAR: &'static str = "JUMP_TO_SCOPE_NODE"; | |
static FILE_PATH_VAR: &'static str = "FILE_PATH"; | |
static ROOT_PATH_VAR: &'static str = "ROOT_PATH"; | |
/// Name of the variable used to pass the root node. | |
pub static ROOT_NODE_VAR: &'static str = "ROOT_NODE"; | |
/// Name of the variable used to pass the jump-to-scope node. | |
pub static JUMP_TO_SCOPE_NODE_VAR: &'static str = "JUMP_TO_SCOPE_NODE"; | |
/// Name of the variable used to pass the file path. | |
/// If a root path is given, it should be a descendant the root path. | |
pub static FILE_PATH_VAR: &'static str = "FILE_PATH"; | |
/// Name of the variable used to pass the root path. | |
/// If given, should be an ancestor of the file path. | |
pub static ROOT_PATH_VAR: &'static str = "ROOT_PATH"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made those consts
f3548fb
to
ad4d517
Compare
Thanks for the review, I added the requested changes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Two places where I would change setting the file path variable to a fallback, so config can override it.
Some small style nits, but this looks good!
let file: stack_graphs::arena::Handle<stack_graphs::graph::File> = | ||
graph.add_file(&file_name).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the type annotation is necessary here, that should be possible to infer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes my bad, probably miss-clicked on the annotation in my IDE
globals | ||
.add(FILE_PATH_VAR.into(), BUILTINS_FILENAME.into()) | ||
.expect("failed to add file path variable"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move after the load globals below, and make conditional, so the config can override the value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting it before was intended, load_globals_from_config_str
will override the default value if any, no need for a conditional I believe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh okay, overriding with add will return an error if it already exists. I believe just overriding could be nice but will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. Yes, if the load functions would add or set, that would also work.
builtins_globals | ||
.add(FILE_PATH_VAR.into(), BUILTINS_FILENAME.into()) | ||
.expect("failed to add file path variable"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: move below the load globals and make optional so the config can override it.
@@ -4,11 +4,11 @@ | |||
// Licensed under either of Apache License, Version 2.0, or MIT license, at your option. | |||
// Please see the LICENSE-APACHE or LICENSE-MIT files in this distribution for license details. | |||
// ------------------------------------------------------------------------------------------------ | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, except the conditions are inverted. I'll commit those suggestions myself and start a CI run to see how we are doing.
globals | ||
.add( | ||
FILE_PATH_VAR.into(), | ||
fragments.path.to_str().unwrap().into(), | ||
) | ||
.expect("failed to add file path variable"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CI failure is probably because this is still before loading the globals from the test, which can set file path too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe adding a add_if_not_set
(not sure about the naming) convenience function to Variables would be a nice addition. Let me fix this
Oh sorry my bad on this one, did this late and did not pay attention, sorry |
🎉 Thanks for the contribution @nohehf! |
Hello @hendrikvanantwerpen, this fixes #430.
This issue was that a symbol was poped for each part of the file path, but it was pushed (on import) only for each part of the import. (eg. it pushed Users, nohehf, module, foo and popped module, foo).
Languages like python needs knowledge about the "root path", as modules are resolved relative to the "input script" as stated here: https://docs.python.org/3/tutorial/modules.html#the-module-search-path
Here is my fix proposal, and I believe it could be useful for several other languages:
ROOT_PATH
global variable, that allows to compute the relative file path. This is particularly useful for interpreted languages like python that creates the modules paths relative to a base path.stack_graph[self.file]
as it's stated instack-graphs/stack-graphs/src/graph.rs
Lines 335 to 337 in 558d1da
Let me know what you think.