
Cythonize Dispatcher methods #24

Status: Open. Wants to merge 6 commits into main.
Conversation

@eriknw (Contributor) commented Jul 8, 2014

This is currently a quickly implemented proof-of-concept to cythonize Dispatcher.__call__/resolve as discussed in #13.

We will make Cython and a C compiler optional. We also still need to use Cython for MethodDispatcher.

@mrocklin, I'm curious how much faster this is for you. I have not performed any benchmarks. I believe we can make __call__ 50-100% faster than what is currently done in this PR, but I want to make sure I'm on the right track.

@eriknw (Contributor, Author) commented Jul 9, 2014

I benchmarked Cythonized versions on the following dispatched functions:

from multipledispatch import dispatch

@dispatch(int)
def f(x):
    return x + 1

@dispatch(float)
def f(x):
    return x - 1

Let's express times relative to the run time of f when it is not dispatched via multipledispatch. The pure Python version of Dispatcher takes 8x as long as the undecorated function. The current Cython version takes 3x as long. A faster Cython version takes 2x as long. If we don't construct a tuple of types for unary functions, then the fast Cython version takes about 1.6x as long. This is probably about the best we can hope to do.
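For illustration, a comparison along these lines could be timed with timeit (a minimal sketch under assumed names such as plain_f; the ratios above come from the measurements described in this comment, not from this snippet):

import timeit

from multipledispatch import dispatch

def plain_f(x):            # undecorated baseline
    return x + 1

@dispatch(int)
def f(x):
    return x + 1

@dispatch(float)
def f(x):
    return x - 1

setup = "from __main__ import f, plain_f"
baseline = timeit.timeit("plain_f(1)", setup=setup, number=1000000)
dispatched = timeit.timeit("f(1)", setup=setup, number=1000000)
print("dispatched call takes %.1fx as long as the plain call" % (dispatched / baseline))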

eriknw added 2 commits July 10, 2014 12:34
Dispatcher is now derived from a C-extension base class.  This improves
performance in two ways:

1. Being called via `__call__` is much faster
2. Attribute lookup (i.e., `_cache`, `funcs`, etc.) is much faster

I also added an optimization to the Cython version of Dispatcher when
dispatching on a single positional argument: it does not construct a tuple
as the `_cache` key.  I do, however, pass a tuple to `resolve`.

Cythonized versions should not be used in PyPy.  Hopefully the check works.

The TravisCI script was modified to test both with and without Cython, and it
verifies whether the C core was used when expected.  Hopefully this works too.
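To make the single-argument optimization from the first commit concrete, here is a rough sketch of the fast path as a standalone helper (not the PR's Cython code; it only relies on the pure-Python Dispatcher's `_cache` dict and `resolve` method):

def call_unary(dispatcher, arg):
    # Probe the cache with the bare type instead of building a 1-tuple on
    # every call; resolve still receives a tuple of types on a cache miss.
    key = type(arg)
    func = dispatcher._cache.get(key)
    if func is None:
        func = dispatcher.resolve((key,))
        dispatcher._cache[key] = func
    return func(arg)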
@eriknw (Contributor, Author) commented Jul 10, 2014

I think this PR is almost ready to merge. It achieves the two primary objectives:

  1. Fast Cythonized versions of __init__ and resolve methods of dispatchers.
  2. Fails gracefully (with a warning message) if there is an error building the extension modules.

@mrocklin, I know you are busy right now, so I'm just letting you know this PR is awaiting review and feedback.

New dispatching policies (as mentioned in #13) have not been explored. The resolve method could be optimized even more. I have not performed many benchmarks, and I have not benchmarked method dispatching at all. Why doesn't method dispatching use a cache, and might there be a better approach for method dispatching?

@mrocklin (Owner) commented
Yeah, sorry for not responding to this. Don't mistake my delay for a lack of enthusiasm!

@eriknw (Contributor, Author) commented Jul 10, 2014

Yeah, sorry for not responding to this. Don't mistake my delay for a lack of enthusiasm!

No worries! I didn't expect you to be engaged in this PR just yet--I know how busy and social the conference can be--hence my slow pace :)

I don't know how well C extensions are supported in other implementations
of Python.  Hence, let's choose to do the conservative thing and only
use Cythonized versions of dispatchers for CPython.

Also added a Makefile for convenient building and testing.
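The CPython-only guard could look roughly like this (a sketch of the idea only; the compiled module name below is hypothetical and not necessarily the PR's actual layout):

import platform

# Prefer the compiled dispatcher only on CPython; fall back to the pure
# Python implementation elsewhere (e.g. PyPy) or if the extension failed
# to build.
if platform.python_implementation() == 'CPython':
    try:
        from multipledispatch._compiled_dispatcher import Dispatcher  # hypothetical module name
    except ImportError:
        from multipledispatch.dispatcher import Dispatcher
else:
    from multipledispatch.dispatcher import Dispatcher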
@eriknw (Contributor, Author) commented Jul 17, 2014

Two burning questions:

  1. Why doesn't method dispatching use _cache?
  2. If method dispatching is changed to use _cache, why cythonize resolve?

@mrocklin (Owner) commented
Method dispatch hasn't seen nearly as much usage (by me) as function dispatch. It's likely that there are oversights like this.

I don't actually know exactly what the value of cythonizing resolve is. My recollection is that, using the cache, we get down to 1-3 microseconds per call. Maybe this is sufficient, or maybe we can't improve much beyond this, in which case using Cython is just needless complexity. My guess is that the thing to do here is to benchmark what we think of as common cases. This is on my TODO list, but it's being pushed down by other things :(

@eriknw (Contributor, Author) commented Jul 17, 2014

Alright. I'll change method dispatching to use _cache, because I don't see a reason why it shouldn't be used. All the tests pass too. If I'm wrong and it fails for somebody, hopefully they will be kind enough to let us know!

resolve is only called once per signature, because the result is put in _cache, so I don't consider it a top priority for implementing in Cython. This can change if somebody determines it is a significant factor during actual use (which can occur if they are frequently registering new functions). I think it would be cleaner to have resolve in pure Python. I suppose we can do a benchmark first to see how much faster it is in Cython.

I still think it's amazing that cythonized dispatch only adds 60% overhead when calling a unary function, and 100% for other arities!
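To make the `_cache` change for method dispatch concrete, here is a toy sketch of the cache-then-resolve flow (this is not the PR's code; the real resolve also walks superclasses, whereas this one only does exact matches):

class CachedMethodDispatcher(object):
    """Toy method dispatcher: check the cache first, resolve on a miss."""

    def __init__(self):
        self.funcs = {}    # tuple of argument types -> implementation
        self._cache = {}   # memoized results of resolve

    def register(self, *types):
        def decorator(func):
            self.funcs[types] = func
            return func
        return decorator

    def resolve(self, types):
        # Exact match only; the real resolve also considers superclasses.
        return self.funcs[types]

    def __call__(self, obj, *args, **kwargs):
        types = tuple(type(arg) for arg in args)
        func = self._cache.get(types)
        if func is None:              # miss: resolve once, then memoize
            func = self.resolve(types)
            self._cache[types] = func
        return func(obj, *args, **kwargs)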

@auvipy left a comment


Would you mind rebasing this?
