Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Experiment to use pytypes to add support for python type hints #69

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

mariusvniekerk
Copy link
Collaborator

Most of the actual hard bits are delegated to pytypes for determining if types match expected instances.

@mariusvniekerk
Copy link
Collaborator Author

This is probably not going to work on py26 and pypy for now

@mrocklin
Copy link
Owner

This seems cool to me. It would be good to check how this will affect performance.

cc @llllllllll

@mariusvniekerk
Copy link
Collaborator Author

Do we have a more extensive suite we can benchmark for performance?

@mrocklin
Copy link
Owner

Nope. There is a very small script in bench/

@mariusvniekerk
Copy link
Collaborator Author

might be worth using something like pytest-benchmark if we want to be able to track performance over time

@mrocklin
Copy link
Owner

mrocklin commented Dec 27, 2017 via email

@mariusvniekerk
Copy link
Collaborator Author

One area that might cause some issues is

In [1]: import pytypes

In [2]: issubclass(int, float)
Out[2]: False

In [3]: pytypes.is_subtype(int, float)
Out[3]: True

@mrocklin
Copy link
Owner

To be clear, pytypes doesn't consider that a bug?

@mariusvniekerk
Copy link
Collaborator Author

Opened Stewori/pytypes#26

@mariusvniekerk
Copy link
Collaborator Author

mariusvniekerk commented Dec 27, 2017

Failures are from

  • py26, py32 lacks the typing module

@mariusvniekerk
Copy link
Collaborator Author

@llllllllll @mrocklin So i think this is ready for review again.

Test failures on py37 due to some pytypes incompatibility

if pytypes.is_Union(L[0]):
rest = expand_tuples(L[1:])
return [(item,) + t for t in rest for item in pytypes.get_Union_params(L[0])]
elif not pytypes.is_of_type(L[0], tuple):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

somehow this doesn't actually get hit when L[0] is not a tuple:

(Pdb) p L[0]
<type 'numpy.dtype'>
(Pdb) p pytypes.is_of_type(L[0], tuple)
True

This breaks importing datashape.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The root cause of this is actually more concerning than this one bug. Why doesn't pytypes.is_of_type work correctly for type objects?

@llllllllll
Copy link
Collaborator

I was trying to profile a real world application using this branch; however, the
bug I commented on blocked me. I added a small patch to continue:

diff --git a/multipledispatch/utils.py b/multipledispatch/utils.py
index 816c7a4..8701756 100644
--- a/multipledispatch/utils.py
+++ b/multipledispatch/utils.py
@@ -32,6 +32,9 @@ def expand_tuples(L):
         elif not pytypes.is_of_type(L[0], tuple):
             rest = expand_tuples(L[1:])
             return [(L[0],) + t for t in rest]
+        elif not isinstance(L[0], tuple):
+            rest = expand_tuples(L[1:])
+            return [(L[0],) + t for t in rest]
         else:
             rest = expand_tuples(L[1:])
             return [(item,) + t for t in rest for item in L[0]]

With this patch, I ran out of call stack space. I noticed that there is
recursion cycle in trhe trace:

  File "/home/joe/projects/python/multipledispatch/multipledispatch/dispatcher.py", line 191, in __call__
    types = tuple([pytypes.deep_type(arg) for arg in args])
  File "/home/joe/.virtualenvs/qexec/lib/python2.7/site-packages/pytypes/type_util.py", line 281, in deep_type
    return _deep_type(obj, [], depth, max_sample)
  File "/home/joe/.virtualenvs/qexec/lib/python2.7/site-packages/pytypes/type_util.py", line 383, in _deep_type
    elif _issubclass_2(res, Container, None, None) and len(obj) == 0:
  File "/home/joe/.virtualenvs/qexec/src/blaze/blaze/interactive.py", line 428, in table_length
    return int(expr.count())
  File "/home/joe/.virtualenvs/qexec/src/blaze/blaze/expr/reductions.py", line 52, in __new__
    axis = _normalize_axis(axis, _child)
  File "/home/joe/.virtualenvs/qexec/src/blaze/blaze/expr/reductions.py", line 18, in _normalize_axis
    axis = tuple(range(child.ndim))
  File "/home/joe/.virtualenvs/qexec/src/blaze/blaze/expr/expressions.py", line 241, in __getattr__
    result = func(self)
  File "/home/joe/.virtualenvs/qexec/src/blaze/blaze/expr/expressions.py", line 955, in ndim
    return len(shape(expr))
  File "/home/joe/projects/python/multipledispatch/multipledispatch/dispatcher.py", line 191, in __call__
    types = tuple([pytypes.deep_type(arg) for arg in args])
  File "/home/joe/.virtualenvs/qexec/lib/python2.7/site-packages/pytypes/type_util.py", line 281, in deep_type
    return _deep_type(obj, [], depth, max_sample)
  File "/home/joe/.virtualenvs/qexec/lib/python2.7/site-packages/pytypes/type_util.py", line 383, in _deep_type
    elif _issubclass_2(res, Container, None, None) and len(obj) == 0:
  File "/home/joe/.virtualenvs/qexec/src/blaze/blaze/interactive.py", line 428, in table_length
    return int(expr.count())
  File "/home/joe/.virtualenvs/qexec/src/blaze/blaze/expr/reductions.py", line 52, in __new__
    axis = _normalize_axis(axis, _child)
  File "/home/joe/.virtualenvs/qexec/src/blaze/blaze/expr/reductions.py", line 18, in _normalize_axis
    axis = tuple(range(child.ndim))
  File "/home/joe/.virtualenvs/qexec/src/blaze/blaze/expr/expressions.py", line 241, in __getattr__
    result = func(self)
  File "/home/joe/.virtualenvs/qexec/src/blaze/blaze/expr/expressions.py", line 955, in ndim
    return len(shape(expr))
  File "/home/joe/projects/python/multipledispatch/multipledispatch/dispatcher.py", line 191, in __call__
    types = tuple([pytypes.deep_type(arg) for arg in args])
  ...

This cycle occurs because we go through a dispatch to compute len, and
now dispatch calls len in _deep_type. I don't really think that it is a
bug in blaze to use a multiply dispatched function in the __len__, so I am not
really sure that this change will work.

@mariusvniekerk
Copy link
Collaborator Author

So I haven't configured pytypes to do a subset for deep type. That might help. The recursion might still be a problem though.

@mariusvniekerk
Copy link
Collaborator Author

@llllllllll Can you point me to a test that i can run using this branch to see if i can correct the behavior?

@llllllllll
Copy link
Collaborator

I don't have time to manually distil a minimal failure case; but the blaze test suite should be a very comprehensive use of multipledispatch.

@mariusvniekerk
Copy link
Collaborator Author

I've been working a bit on a branch a bit further along that adds in diagonal dispatch etc, but that requires pretty aggressive use of the typing system.

Testing this with blaze (some constrained deps required to make blaze work).

conda create -n blz python=3.6 'dask<0.13' pandas=0.19
source activate blz
pytest --pyargs blaze

This passes all but 1 of the tests, which might just be a dep mismatch somewhere.

Looking at this again though, and after trying to dig through the blaze test suite, it might be better to have a different @dispatch function / Dispatcher that supports ONLY pep484 style type annotations and works only in py3.

Alternatively we could add a parameter to dispatcher for pep484 mode vs original. blaze / odo could continue to use original annotations easily without any of the complex changes.

cc @mrocklin. @llllllllll

@mariusvniekerk
Copy link
Collaborator Author

test failures are py37 (some weirdness) and py34 a strange flake

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants