Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bind dispatched methods, and don't rely on "self" argument to infer methods #26

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

eriknw
Copy link
Contributor

@eriknw eriknw commented Jul 11, 2014

These two changes are not dependent on each other (sorry for having them in the same commit/PR). I don't know how solid either of the changes are, but I think they are worth exploring.

Previously, MethodDispatcher's __get__ would be called each time a method was called. Moreover, the method dispatcher does not look like a method. So, we now bind the method dispatcher as a method when __get__ is called, which should only happen once. This is similar to how regular functions are bound as methods in Python, although I don't know exactly when the magic normally occurs for regular methods.

The second tweak changes how methods are inferred. Previously, the signature was inspected, and it was determined to be a method if the first argument was named "self". This fails for functions declared outside of a class, or--goodness forbid--somebody doesn't use "self" as the first argument of a method. There is another way. In the standard Python data model, classes are the only objects that have the __module__ attribute. Interestingly, this appears to be already defined in the local scope of the class definition. Hence, we can use inspect to check in f_locals.

In practice, "self" should always be used as the first item in method signatures. In other words, I'm not concerned about false-negatives (i.e., when "self" is not used for methods). This means that we now have two criteria, each of which may give false-positives, but not false-negatives. Therefore, we may want to check for both "self" in the signature and __module__ in the f_locals.

Here's an example in Python 2.7 that illustrates how MethodDispatcher behaves like a method:

In [1]: %paste
from multipledispatch import dispatch

class Test(object):
    def __init__(self, x):
        self.x = x

    @dispatch(int)
    def f(self, y):
        return self.x + y

    @dispatch(float)
    def f(self, y):
        return self.x - y
## -- End pasted text --

In [2]: test = Test(1)

In [3]: test.f(1)
Out[3]: 2

In [4]: test.f(1.0)
Out[4]: 0.0

In [5]: test.f
Out[5]: <bound method Test.<dispatched f> of <__main__.Test object at 0x2501550>>

In [6]: test.__getattribute__('f')
Out[6]: <bound method Test.<dispatched f> of <__main__.Test object at 0x2501550>>

In [7]: Test.f
Out[7]: <unbound method Test.<dispatched f>>

In [8]: Test.__getattribute__(Test, 'f')
Out[8]: <dispatched f>

In [9]: test.f is test.f
Out[9]: True

Finally, I found the original test_methods_multiple_dispatch test very confusing. It didn't use "self", and the functions were essentially static functions. Was this intended?

eriknw added 2 commits July 11, 2014 15:04
…thods.

These two changes are not dependent on each other (sorry for having them in
the same commit/PR).  I don't know how solid either of the changes are, but
I think they are worth exploring.

Previously, MethodDispatcher's `__get__` would be called each time a method
was called.  Moreover, the method dispatcher does not *look* like a method.
So, we now bind the method dispatcher as a method when `__get__` is called,
which should only happen once.  This is similar to how regular functions are
bound as methods in Python, although I don't know exactly *when* the magic
normally occurs for regular methods.

The second tweak changes how methods are inferred.  Previously, the signature
was inspected, and it was determined to be a method if the first argument
was named "self".  This fails for functions declared outside of a class,
or--goodness forbid--somebody doesn't use "self" as the first argument of
a method.  There is another way.  In the standard Python data model, classes
are the only objects that have the `__module__` attribute.  Interestingly,
this appears to be already defined in the local scope of the class
definition.  Hence, we can use `inspect` to check in `f_locals`.

In practice, "self" should *always* be used as the first item in method
signatures.  In other words, I'm not concerned about false-negatives (i.e.,
when "self" is not used for methods).  This means that we now have two
criteria, each of which may give false-positives, but not false-negatives.
Therefore, we may want to check for both "self" in the signature *and*
"__module__" in the `f_locals`.
@eriknw
Copy link
Contributor Author

eriknw commented Jul 11, 2014

tl;dnr

It may not be a bad idea to use:

def ismethod(func):
    spec = inspect.getargspec(func)
    return spec and spec.args and spec.args[0] == 'self'

def inclass(n=1):
    frame = inspect.currentframe().f_back  # escape from current function
    for _ in range(n):
        frame = getattr(frame, 'f_back')
    return '__module__' in frame.f_locals

to determine if a function may be a method and is being defined in a class.

eriknw added 3 commits July 14, 2014 13:41
The previous method of setting `self.obj` in `__get__` is not reentrant, and
should not be used in multi-threaded applications.  Moreover, this approach
can fail in a single thread too.  This is what the new test tests.
@eriknw
Copy link
Contributor Author

eriknw commented Jul 14, 2014

I don't think I properly motivated why the method dispatcher should return a bound method. The primary issue is that the previous version is not reentrant and should not be used in multi-threaded applications. Moreover, it can fail in single threaded applications as well. The test added in 21f8b05 shows how.

@eriknw
Copy link
Contributor Author

eriknw commented Jul 14, 2014

I also convinced myself that checking if the first argument is "self" to infer methods is a good idea. This convention really ought to be used, and it distinguishes between classmethods and staticmethods.

We still use the new heuristic that was introduced in the PR. Hence, a function with "self" for a first argument can be defined outside a class as one would expect.

@mrocklin
Copy link
Owner

Hey, sorry I've left this lingering. Life is packed these days. To be honest I still haven't dived into these issues, but the added tests look like things that should work and the added code doesn't ring any "too much complexity" bells, so I'm happy to trust you on these things.

@eriknw
Copy link
Contributor Author

eriknw commented Jul 15, 2014

I figured as much. It's pretty easy to infer when you're really busy by how quickly and thoroughly you respond to github issues ;)

Thanks for the reply though. It doesn't bother me to have PRs sit for a while. They deserve a proper review (because I make mistakes just like anybody), and I fully expect you will give attention to multipledispatch as you are able.

I think there is a way to sanely handle classmethods and staticmethods--and to warn or raise exceptions on common errors--but this adds complexity, and should be a separate PR.

I think this PR is ready to merge.

@encukou
Copy link
Contributor

encukou commented Aug 24, 2014

Wouldn't it be easier to use a different decorator, say @dispatched_method, instead?

With all the magic, I would be afraid to build something more complicated on top of this.

@aldanor
Copy link

aldanor commented Sep 4, 2014

Would probably make sense to have a @dispatch_classmethod decorator as well then?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants