Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: neon intrinsics packet backend #70

Closed
wants to merge 1 commit into from

Conversation

flanggut
Copy link

@flanggut flanggut commented Sep 3, 2022

Adding overloads for a neon intrinsics backend. The code is a ported version of the backend from the enoki repo. Tested via unit tests on an M1 mac. See also #63.

I had to add some "hacky" changes to make sure all tests compile and run. I'm happy to incorporate better ways to fix those issues:

  1. changing the seed for the random generator in validate_horizontal in test.h
    This is because assert_close doesn't really work for floats if result_ref is 0 and result is something like 1e-7. I was hitting exactly that case on my system. Changing the seed doesn't fix it it just avoids it.

  2. Adding <ostream> to half.h, and including half in packet_neon.h
    This is just so half is defined in packet_neon when the conversion is instantiated.
    half.h also includes an operator overload for ostreams so this needs to be defined as well.

Adding overloads for a neon intrinsics backend. The code is a ported version of the backend from the enoki repo.
Copy link
Member

@Speierers Speierers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks good to me (especially since this was ported directly from the Enoki codebase).

Just a few minor comments to be addressed before we merge this.

@wjakob could you remind me the reason for not porting this to Dr.Jit from Enoki during the rewrite?

Comment on lines +16 to +17
#include <ostream>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -442,7 +442,7 @@ void validate_horizontal(const std::vector<value_t<T>> &args,
value_t<T> (*refFunc)(const T &),
const value_t<T> &eps = 0) {
std::mt19937 gen;
std::uniform_int_distribution<> dis(0, (int) args.size()-1);
std::uniform_int_distribution<> dis(1, (int) args.size()-1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary?

//! @{ \name Reinterpreting constructors, mask converters
// -----------------------------------------------------------------------

#define ENOKI_REINTERPRET_BOOL(type, target) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#define ENOKI_REINTERPRET_BOOL(type, target) \
#define DRJIT_REINTERPRET_BOOL(type, target) \

DRJIT_INLINE Derived mul_(Ref a) const { return vmulq_f64(m, a.m); }
DRJIT_INLINE Derived div_(Ref a) const { return vdivq_f64(m, a.m); }

#if defined(ENOKI_ARM_FMA)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#if defined(ENOKI_ARM_FMA)
#if defined(DRJIT_ARM_FMA)

@wjakob
Copy link
Member

wjakob commented Oct 8, 2023

Hi all -- sorry for the inactivity. The upcoming rewrite (nanobind_v2 branch) will also have an ARM packet backend. Let's leave this PR open until this branch is merged in case others are looking for this.

#include <drjit/half.h>

NAMESPACE_BEGIN(drjit)
DRJIT_PACKET_DECLARE(16)
Copy link
Contributor

@jeongseok-meta jeongseok-meta Oct 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change was needed to build for android32:

Suggested change
DRJIT_PACKET_DECLARE(16)
DRJIT_PACKET_DECLARE_COND(16, enable_if_t<(std::is_same_v<Type, float>)>)
DRJIT_PACKET_DECLARE_COND(16, enable_if_int32_t<Type>)
#if defined(DRJIT_ARM_64)
DRJIT_PACKET_DECLARE_COND(16, enable_if_t<(std::is_same_v<Type, double>)>)
DRJIT_PACKET_DECLARE_COND(16, enable_if_int64_t<Type>)
#endif

@wjakob wjakob force-pushed the master branch 2 times, most recently from 7ed15ce to 754a541 Compare October 20, 2023 21:48
@njroussel
Copy link
Member

The changes on nanobind_v2 have made their way onto master, I think it entirely covers this PR.

@njroussel njroussel closed this Jul 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants