-
Notifications
You must be signed in to change notification settings - Fork 653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX-#4478: Support level param in query and eval. #4529
base: master
Are you sure you want to change the base?
FIX-#4478: Support level param in query and eval. #4529
Conversation
Signed-off-by: mvashishtha <[email protected]>
Signed-off-by: mvashishtha <[email protected]>
Codecov Report
@@ Coverage Diff @@
## master #4529 +/- ##
==========================================
+ Coverage 86.36% 88.17% +1.81%
==========================================
Files 228 229 +1
Lines 18413 18726 +313
==========================================
+ Hits 15902 16512 +610
+ Misses 2511 2214 -297
📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work on this @mvashishtha. Just have a few questions, but overall looks good to me.
# either. | ||
level += 1 | ||
|
||
frame = sys._getframe(level + 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we bumping level by one again here?
level += 1 | ||
|
||
frame = sys._getframe(level + 1) | ||
f_locals = frame.f_locals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason you removed the try
finally
statements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please retain the try..finally
, it's here to protect against creating a reference cycle if any exception fires during getting the variables. Maybe we should extend this finally
until for name in local_names
loop is over and remove f_locals
and f_globals
as well for the same refcycle reason...
level += 1 | ||
|
||
frame = sys._getframe(level + 1) | ||
f_locals = frame.f_locals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please retain the try..finally
, it's here to protect against creating a reference cycle if any exception fires during getting the variables. Maybe we should extend this finally
until for name in local_names
loop is over and remove f_locals
and f_globals
as well for the same refcycle reason...
# Bump level up one because this function is wrapped in the Modin | ||
# logging function wrapper. The alternative is for the wrapper to | ||
# give the Modin functions that deal with `level` the special | ||
# treatment of bumping `level` up by one, but that's not so nice | ||
# either. | ||
level += 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should just discard any modin
-owned frames (or at least logging
-owned) instead of calculating this offset manually
self._validate_eval_query(expr, **kwargs) | ||
inplace = validate_bool_kwarg(inplace, "inplace") | ||
self._update_var_dicts_in_kwargs(expr, kwargs) | ||
self._update_var_dicts_in_kwargs(expr, level + 1, kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you had already increased level
above, why again +1
?
self._validate_eval_query(expr, **kwargs) | ||
inplace = validate_bool_kwarg(inplace, "inplace") | ||
self._update_var_dicts_in_kwargs(expr, kwargs) | ||
self._update_var_dicts_in_kwargs(expr, level + 1, kwargs) | ||
# Make sure to not pass level to query compiler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this comment is needed
new_query_compiler = self._query_compiler.eval(expr, **kwargs) | ||
return_type = type( | ||
pandas.DataFrame(columns=self.columns) | ||
.astype(self.dtypes) | ||
.eval(expr, **kwargs) | ||
.eval(expr, level=level + 1, **kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, why +1
?
# Remove `level` from the kwargs if it's already there. "level" doesn't | ||
# make sense within the remote execution context, where the remote | ||
# functions have a stack which is different from the stack on the | ||
# driver node that ends in the modin eval call. | ||
level = kwargs.pop("level", 0) | ||
|
||
# Bump level up one because this function is wrapped in the Modin | ||
# logging function wrapper. The alternative is for the wrapper to | ||
# give the Modin functions that deal with `level` the special | ||
# treatment of bumping `level` up by one, but that's not so nice | ||
# eitiher. | ||
level += 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
given I see exact same wall of comments three times already, maybe we should turn this into a helper function like
def _extract_eval_level(kwargs):
"""
Extract `level` from eval or query `kwargs`.
Remove `level` from the kwargs if it's already there. "level" doesn't
make sense within the remote execution context, where the remote
functions have a stack which is different from the stack on the
driver node that ends in the modin eval call.
Bump level up one because eval/query are wrapped in the Modin
logging function wrapper.
"""
level = kwargs.pop("level", 0) + 1
return level, kwargs
and then use it here like
def query(self, expr, inplace=False, **kwargs):
# docstring
level, kwargs = _extract_eval_level(kwargs)
# do the rest of the stuff
df_method_with_local_var(modin_df, method, expr, level=level), | ||
df_method_with_local_var(pandas_df, method, expr, level=level), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should also test the case of omitting the level
argument (to ensure we and pandas stay on the same page about default value of level
).
@@ -795,25 +795,40 @@ def equals(self, other): # noqa: PR01, RT01, D200 | |||
and self.eq(other).all().all() | |||
) | |||
|
|||
def _update_var_dicts_in_kwargs(self, expr, kwargs): | |||
def _update_var_dicts_in_kwargs(self, expr, level: int, kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: might as well add type annotations for expr
and kwargs
here.
""" | ||
Copy variables with "@" prefix in `local_dict` and `global_dict` keys of kwargs. | ||
|
||
This function exists because we neeed to infer local and variables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this say local and *global* variables
here? it feels like it's missing a word.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a quick comment.
infer them on their own. The remote functions take place in a different | ||
thread with their own stacks that normally do not match the stack that | ||
leads to the Modin eval or query call. | ||
|
||
Parameters | ||
---------- | ||
expr : str | ||
The expression string to search variables with "@" prefix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The expression string to search variables with "@" prefix. | |
The expression string to search for variables with "@" prefix. |
Signed-off-by: mvashishtha [email protected]
What do these changes do?
Support level param in query and eval.
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
git commit -s
docs/development/architecture.rst
is up-to-date