-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: neon intrinsics packet backend #70
Conversation
Adding overloads for a neon intrinsics backend. The code is a ported version of the backend from the enoki repo.
8b3cc94
to
94b1d6c
Compare
95ebbd7
to
9fd09ef
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR looks good to me (especially since this was ported directly from the Enoki codebase).
Just a few minor comments to be addressed before we merge this.
@wjakob could you remind me the reason for not porting this to Dr.Jit from Enoki during the rewrite?
#include <ostream> | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This header was needed since std::ostream
is used at:
@@ -442,7 +442,7 @@ void validate_horizontal(const std::vector<value_t<T>> &args, | |||
value_t<T> (*refFunc)(const T &), | |||
const value_t<T> &eps = 0) { | |||
std::mt19937 gen; | |||
std::uniform_int_distribution<> dis(0, (int) args.size()-1); | |||
std::uniform_int_distribution<> dis(1, (int) args.size()-1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary?
//! @{ \name Reinterpreting constructors, mask converters | ||
// ----------------------------------------------------------------------- | ||
|
||
#define ENOKI_REINTERPRET_BOOL(type, target) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#define ENOKI_REINTERPRET_BOOL(type, target) \ | |
#define DRJIT_REINTERPRET_BOOL(type, target) \ |
DRJIT_INLINE Derived mul_(Ref a) const { return vmulq_f64(m, a.m); } | ||
DRJIT_INLINE Derived div_(Ref a) const { return vdivq_f64(m, a.m); } | ||
|
||
#if defined(ENOKI_ARM_FMA) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#if defined(ENOKI_ARM_FMA) | |
#if defined(DRJIT_ARM_FMA) |
Hi all -- sorry for the inactivity. The upcoming rewrite ( |
#include <drjit/half.h> | ||
|
||
NAMESPACE_BEGIN(drjit) | ||
DRJIT_PACKET_DECLARE(16) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was needed to build for android32:
DRJIT_PACKET_DECLARE(16) | |
DRJIT_PACKET_DECLARE_COND(16, enable_if_t<(std::is_same_v<Type, float>)>) | |
DRJIT_PACKET_DECLARE_COND(16, enable_if_int32_t<Type>) | |
#if defined(DRJIT_ARM_64) | |
DRJIT_PACKET_DECLARE_COND(16, enable_if_t<(std::is_same_v<Type, double>)>) | |
DRJIT_PACKET_DECLARE_COND(16, enable_if_int64_t<Type>) | |
#endif |
7ed15ce
to
754a541
Compare
The changes on |
Adding overloads for a neon intrinsics backend. The code is a ported version of the backend from the enoki repo. Tested via unit tests on an M1 mac. See also #63.
I had to add some "hacky" changes to make sure all tests compile and run. I'm happy to incorporate better ways to fix those issues:
changing the seed for the random generator in
validate_horizontal
intest.h
This is because
assert_close
doesn't really work for floats ifresult_ref
is0
andresult
is something like1e-7
. I was hitting exactly that case on my system. Changing the seed doesn't fix it it just avoids it.Adding
<ostream>
tohalf.h
, and including half inpacket_neon.h
This is just so
half
is defined inpacket_neon
when the conversion is instantiated.half.h
also includes an operator overload for ostreams so this needs to be defined as well.