-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add static-dispatch versions #13
base: master
Are you sure you want to change the base?
Conversation
num-utils.asd
Outdated
(defsystem "num-utils/tests" | ||
:version "1.0.0" | ||
:description "Unit tests for NUM-UTILS." | ||
:author "Steven Nunez <[email protected]>" | ||
:license "Same as NUM-UTILS -- this is part of the NUM-UTILS library." | ||
#+asdf-unicode :encoding #+asdf-unicode :utf-8 | ||
:depends-on (#:num-utils | ||
:depends-on (#:num-utils/static-dispatch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this temporary? It looks like this replaces all the num-util tests with the ones from static-dispatch. Ideally we could keep two test systems until static-dispatch is ready to be folded into the main system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, yes, my bad! I will fix it.
I think I can relate to tpapp now. Optimizing Common Lisp - even with SBCL - is not trivial. There are many gotchas. For instance, take (declaim (inline sequence-minimum))
(defun sequence-minimum (x)
"Return the maximum value in the sequence X"
(check-type x alexandria:proper-sequence)
(cond ((listp x) (apply 'min x))
((vectorp x) (reduce #'min x)))) First of all, as of SBCL 2.3.4, (defun sequence-minimum (x)
"Return the minimum value in the sequence X"
(check-type x alexandria:proper-sequence)
(cond ((listp x)
(cond ((null x)
(error "Empty sequence"))
((null (cdr x))
(car x))
(t
(let ((min (car x)))
(dolist (elt x)
(setf min (min min elt)))
min))))
((vectorp x)
(case (length x)
(0 (error "Empty sequence"))
(1 (row-major-aref x 0))
(t (let ((min (row-major-aref x 0)))
(dotimes (index (1- (array-total-size x)))
(let ((elt (row-major-aref x (1+ index))))
(setf min (if (< elt min) elt min))))
min)))))) However, even this is not enough, a (disassemble
(lambda (x)
(declare (optimize speed)
(type (simple-array single-float 1) x))
(num-utils:sequence-minimum x))) To actually avoid a generic-<, completely - and we get a 10-15x speed boost if we do this - we need something like the following: (defun sequence-minimum (x)
"Return the minimum value in the sequence X"
(check-type x alexandria:proper-sequence)
(etypecase x
(list
(cond ((null x)
(error "Empty sequence"))
((null (cdr x))
(car x))
(t
(let ((min (car x)))
(dolist (elt x)
(setf min (min min elt)))
min))))
(vector
(case (length x)
(0 (error "Empty sequence"))
(1 (row-major-aref x 0))
(t (let ((min (row-major-aref x 0)))
(declare (type (simple-array single-float 1) x)
(type single-float min)) ; <== THIS TYPE DECLARATION
(dotimes (index (1- (array-total-size x)))
(let ((elt (row-major-aref x (1+ index))))
(setf min (if (< elt min) elt min))))
min)))))) But that makes the code non-generic to vector element types :/. In this particular case, polymorphic-functions too are only helpful if we write a polymorph-compiler-macro, which actually should not be necessary at all. To actually the generic versions we need better type inference. So, I'm unsure how much benefit static-dispatch alone could get us. May be a 1.5-2x, but so long as generic-+, generic-< etc remain, we are losing out on a 10x performance boost. |
This is still of limited use though.
In the upcoming commits, I will attempt the second task. For example, the following shows a generic-+ in its disassembly.
To get rid of it requires one to inline the
ref
insidemapping-array
. But after doing both of these, we get a 2.5x performance boost on SBCL 2.3.4:This can be optimized further if we allow for an optional
out
argument to thee2+
function.