Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Custom geometry? #83

Open
wjakob opened this issue Jan 9, 2025 · 24 comments
Open

Custom geometry? #83

wjakob opened this issue Jan 9, 2025 · 24 comments

Comments

@wjakob
Copy link

wjakob commented Jan 9, 2025

Dear Jacco,

apologies for opening a ticket, this is more of a question :). This project is starting to look very tempting as an Embree replacement (hackability, ability to use double precision for scientific applications). However, one major feature that is missing compared to Embree is the ability to declare custom shapes (e.g. quadrics) with intersection callbacks. Is something that out of scope for TinyBVH?

@jbikker
Copy link
Owner

jbikker commented Jan 9, 2025

Hi Wenzel, great to see you here!
I had a quick look at how Embree handles custom geometry:

  • build a BVH over primitive AABBs
  • register a callback function which takes a primID and the ray hit record
  • the callback updates the hit record.
    Adding this for the base case (no TLAS, single-ray traversal, default + double precision BVH layout) should not be too much work. It also sounds like an interesting and pretty generic feature. Can you confirm the above is what you're looking for?

@wjakob
Copy link
Author

wjakob commented Jan 9, 2025

Hi Jacco,

that's great to hear and how I understand the feature as well! I think that Embree can also build data structures containing mixtures of triangle meshes and user defined primitives. To be honest, I am not quite sure of how that data is organized internally.

What does "no TLAS" mean in this context? I think that the ability to instance geometry consisting of non-triangle primitives is important.

@jbikker
Copy link
Owner

jbikker commented Jan 9, 2025

If Embree is mixing different primitive types I suspect they limit this to a single type per BVH, and then build a 'top level BVH' (TLAS) over these BVHs. Sounds like you also need this functionality. Hierarchies of BVHs (TLAS/BLAS), which includes instancing, is a planned feature, but not yet available in tinybvh. The functionality is not hard to add but designing a good interface for it is non-trivial.

@jbikker
Copy link
Owner

jbikker commented Jan 9, 2025

The custom geometry is now available in 1.2.2 in the dev branch. For a bvh you can now set two callbacks: customIntersect and customIsOccluded. Both take a ray and a primitive id. Have a look at the new tiny_bvh_custom.cpp example to see how this can be used to intersect a BVH with spheres.
The TLAS/BLAS functionality was planned already; I'll return to that now, hopefully something is up and running soon.

@TheSFReader
Copy link
Contributor

Should'nt there also be a custom Bounding Volume limits callback at build time ? I see you've "abused the array" in the code, but I imagine a dedicated callback woud be better ?

@jbikker
Copy link
Owner

jbikker commented Jan 9, 2025

Well there will soon be an api to build a BVH over a set of aabbs, which will be used for TLAS construction. This might work well here too. I don't really want to let people send arrays of custom primitives to the builder; it seems to make sense to either send the most common type (triangles) and have your aabbs generated for you, or send aabbs directly.

@brendan-duncan
Copy link
Contributor

brendan-duncan commented Jan 9, 2025

The solution of "abusing the triangle array to store bounding boxes" doesn't feel very clean.

With the TLAS aabb list, it seems like that's a reasonable solution for mixing triangle BLAS nodes and procedural object nodes. The TLAS builder doesn't need to know about either, it's just working with aabb's, and the BVHNode primitive index points to the index of the AABB.

It's then the TLAS walker that can know whether the primitive index in the TLAS points to a triangle BVH, or a procedural object.

The TLAS walker could have a list with the same count as the AABB list, where each element either points to another BVH (BLAS) for triangle meshes, or to a procedural object.

The TLAS walker would hit a TLAS root node, and if the index points to a triangle BLAS, walk into that structure for ray interception testing, or if it points to a procedural object, call its interception test.

Doing this, I wouldn't think a BLAS of procedural objects would be necessary. It puts the effort of mixing object types, and even the concept of procedural objects, onto the walker. The TLAS walker would also handle instancing by allowing the TLAS root note visitor to transform the ray before it enters the referenced BVH or procedural.

@brendan-duncan
Copy link
Contributor

I did a quick prototype for TLAS building, traversal, and mixing of triangle meshes and procedural objects, 0f657e6

There are some things that could still be cleaned up, and some things aren't quite right (need to transform the ray before checking instance intersections, etc), and it's hardly tested. But it's the general idea. I had to make some fixes to BVH to get BuildTLAS to work.

struct Sphere {
  tinybvh::bvhvec3 position;
  float radius;
  Sphere(const tinybvh::bvhvec3& pos, float rad) : position(pos), radius(rad) {}
};

void sphereIntersect( tinybvh::Ray& ray, unsigned primID, void* userdata ) {
  Sphere* sphere = (Sphere*)userdata;
  tinybvh::bvhvec3 oc = ray.O - sphere->position;
  // ...
}

bool sphereIsOccluded( const tinybvh::Ray& ray, unsigned primID, void* userdata ) {
  Sphere* sphere = (Sphere*)userdata;
  tinybvh::bvhvec3 oc = ray.O - sphere->position;
  // ...
}

int main() {
  // build a BVH for a mesh
  tinybvh::BVH bvh;
  bvh.Build(triangles, TRIANGLE_COUNT);
  
  tinybvh::BLASInstance instances[] = {
    // BVH instance
    tinybvh::BLASInstance(&bvh),
    // procedural instance
    tinybvh::BLASInstance(sphereIntersect, sphereIsOccluded, new Sphere{{0,0,0}, 1.0f})
  };
  
  // Update the instance bounds. This could be done less explicitly.
  instances[0].Update();
  instances[1].worldBounds = { tinybvh::bvhvec3(-0.5, -0.5, -0.5), 0, tinybvh::bvhvec3(0.5, 0.5, 0.5), 0 };
  
  tinybvh::BVH tlas;
  tlas.BuildTLAS(instances, 2);
  
  // tlas will intersect against both mesh BVH nodes, and procedural nodes.
  int steps = tlas.Intersect(ray);
}

@jbikker
Copy link
Owner

jbikker commented Jan 10, 2025

Ah yes, thanks for that, I understand what you mean now.
This would yield a lof of freedom for mixing triangle meshes and other types of geometry.
It does come at a cost however:
To represent a scene of e.g. 1M spheres, one would have a TLAS with 1M leafs, each containing a single sphere, which in turn all contain pointers to two callbacks. The sphere objects are potentially scattered over the heap. Traversal involves essentially a virtual function call per leaf. Combining multiple primitives in a leaf is not possible.
A BVH over spheres on the other hand would be identical to a BVH over triangles, after the init phase (::PrepareBuild). A single set of function pointers would be shared for all primitives in the BVH. And, given a proper BLAS/TLAS system, one could still fall-back to essentially your scheme by building a BVH over a single primitive.

@wjakob
Copy link
Author

wjakob commented Jan 10, 2025

Hi @jbikker,

that you, this looks great! I think that having a single set of callbacks for an entire data structure is the right one tradeoff to make. The computation of the AABB itself could also be a callback to save memory, but that seems comparably minor.

I have one naïve question: to what extent do the features of TinyBVH compose with each other?

There are a few different features mentioned on the readme page:

  • Custom objects
  • Instancing
  • Double precision
  • Packet ray traversal

What if I wanted to do all of these things at the same time? The context is Mitsuba/Dr.Jit, where some users do insane things with scales (think: instances of trees on earth's surfaces, lit by a sun, all to scale within a solar system) requiring high precision. The system also uses SIMD packets for everything, so that's the most natural interface to the ray tracing library.

@jbikker
Copy link
Owner

jbikker commented Jan 10, 2025

Not a naïve question at all.
Right now tinybvh has solid bvh construction, with a solid API for that functionality. Traversal is, to some extent, a 'related problem', for which code is provided, but not at the same level of completeness. Many features are also developed on a 'need to have' basis, after which typically an API update phase is needed.
Custom objects are a nice example; adding the feature happened because someone (well, you ;) ) asked for it, but now it needs a proper interface. It is implemented for the double precision BVH already, but it is not implemented for the GPU traversal kernels (that would be tricky..), nor for packet traversal.
Instancing and multi-level BVH traversal is under construction. This will 'automatically' include custom geometry, at least on the CPU. I think an initial version will be made before we stabilize the API; this appears to be a good order for development. It's not a hard feature so we should be able to move quite quickly (a few weeks perhaps).
SIMD / packet traversal currently is poorly developed. It exists mostly to compare primary ray CPU performance against 'GRays/s' claims of the GPU manufacturers, but frankly it is of little use for anything beyond primary rays. I want to implement the 'WiVe' algorithm for CPU packet traversal, which should deliver state-of-the-art performance for single rays and packets, but that will definitely be at least a month or two down the road. That one is hard. :)
So in the end this is about maturity, I suppose; tinybvh is 2 or 3 months old and probably needs to grow (and get battle tested) before it can realistically serve a big fish like Mitsuba.

@wjakob
Copy link
Author

wjakob commented Jan 10, 2025

Got it, thank you for the detailed response. Once there is an example of a trusted/efficient traversal kernel, adapting to double precision or custom geometry is probably quite mechanical, I can also help with that. The code looks much simpler than that of Embree. Mitsuba is using OptiX for GPU traversal, so my main interest is in having a better alternative to address some limitations with Embree on the CPU.

@brendan-duncan
Copy link
Contributor

brendan-duncan commented Jan 10, 2025

That's definitely a valid point.

To give it another shot and see if we can address some of those concerns, keeping the flexibility, here's a cleanup pass on the previous CL, brendan-duncan@161fc3e.

To be specific, I'm not personally needing procedural geometry for what I want from the library, but I am interested in TLAS building and traversal. If we could get a flexible procedural geometry API out of it, I think that would be a nice bonus.

struct Sphere
{
	tinybvh::bvhvec3 position;
	float radius;
	Sphere() {}
	Sphere(const tinybvh::bvhvec3& pos, float rad) : position(pos), radius(rad) {}
};

void sphereIntersect( tinybvh::Ray& ray, unsigned primID, void* userdata )
{
	Sphere* sphere = (Sphere*)userdata;
	tinybvh::bvhvec3 oc = ray.O - sphere->position;
	//...
}

bool sphereIsOccluded( const tinybvh::Ray& ray, unsigned primID, void* userdata )
{
	Sphere* sphere = (Sphere*)userdata;
	tinybvh::bvhvec3 oc = ray.O - sphere->position;
        // ...
}

int main()
{
    // ...
    tinybvh::BVH bvh;
    bvh.Build( triangles, TRIANGLE_COUNT );
    
    // Only 1 "procedural definition" set of function pointers
    tinybvh::Procedural sphereProcedural = { sphereIntersect, sphereIsOccluded };
    
    // The spheres are considered "userdata", passed to the Procedural Intersect functions
    const int sphereCount = 1000000;
    Sphere* spheres = new Sphere[sphereCount];
    for (int i = 0; i < sphereCount; ++i)
    {
	spheres[i].position = { uniform_rand() * 200 - 100, uniform_rand() * 200 - 100, uniform_rand() * 200 - 100 };
	spheres[i].radius = uniform_rand();
    }
    
    // TLAS instances, include both BVH BLAS nodes and procedural nodes.
    int instanceCount = sphereCount + 1;
    tinybvh::BLASInstance* instances = new tinybvh::BLASInstance[instanceCount];
    instances[0].blas = &bvh;
    instances[0].Update();
    
    for (int i = 0, j = 1; i < sphereCount; ++i, ++j)
    {
	float halfRad = spheres[i].radius * 0.5;
	tinybvh::bvhvec3 halfRad3 = { halfRad, halfRad, halfRad };
	// Procedural nodes are all pointing to the single Sphere custom functions
	instances[j].procedural = &sphereProcedural;
        // custom data passed to the intersect function defines the individual sphere
	instances[j].data= &spheres[i];
	// procedurals need to manually provide their world bounds
	instances[j].worldBounds = { spheres[i].position - halfRad3, 0, spheres[i].position + halfRad3, 0 };
    }

    tinybvh::BVH tlas;
    tlas.BuildTLAS(instances, instanceCount);
    
    tinybvh::Ray ray( O, D );
    steps = tlas.Intersect(ray);
}

@jbikker
Copy link
Owner

jbikker commented Jan 11, 2025

I coded something for TLAS/BLAS support, have a look at tiny_bvh_anim.cpp. The current api is quite lean I think.

	// build a TLAS
	inst[0] = BLASInstance( &bvh ); // static geometry
	inst[1] = BLASInstance( &blas );
	inst[1].transform[3 /* i.e., x translation */] = 4;
	inst[2] = BLASInstance( &blas );
	inst[2].transform[3 /* i.e., x translation */] = -4;
	tlas.Build( inst, 3 );

Here, constructing a TLAS is a matter of building a BVH over an array of BLASInstances. After this, the tlas can be intersected as any other regular bvh. See BVH::Intersect for how this is handled.

The TLAS/BLAS functionality does not yet properly report instance index in the hit record; I'll fix that later.

A TLAS with custom geometry BVHs would not be any different. The two features are now thus decoupled; we simply need an api for building a BVH over custom geometry.

@jbikker
Copy link
Owner

jbikker commented Jan 14, 2025

I changed the TLAS/BLAS interface slightly. A BLASInstance no longer keeps a pointer to a BVH (BLAS), for two reasons:

  • A BLAS pointer is a host side pointer and thus not great for GPU rendering
  • A pointer has a platform-dependent size; indices are better
  • The BVH type should be free.

So instead, BLASInstance now has a blasIdx, which redirects to a separate array in the TLAS: blasList (with size blasCount).

As a consequence, building a TLAS now requires this list:

tlas.Build( instList, instCount, blasList, blasCount );

Additionally, a BLASInstance now stores its own world-space bounds. These used to be derived from the BLAS; now they are updated with a call to BLASInstance::Update(..). This is still done during TLAS construction, but moving it from there and doing it manually is now simpler.

All this is mostly to facilitate the upcoming GPU version of TLAS/BLAS traversal; especially the BLAS pointer in BLASInstance was proving to be a bad idea for this.

tiny_bvh_anim.cpp has been updated with these changes. The same changes have been made for BVH_Double.

@brendan-duncan
Copy link
Contributor

brendan-duncan commented Jan 14, 2025

@jbikker The TLAS update looks good, but I see a couple potential issues.

  1. I like that BLASInstance is now more self contained, having a BVH index instead of a pointer and storing its world aabb, so the blasList isn't necessary for the construction of the TLAS. I'm not intending to use the CPU intersect functions, so the BVH list should be optional. In Build, you currently update the instance aabb using the BVH list. I would like to pass a null BVH list, and if the BVH list is null, then you can assume the instance list already has an updated aabb for instances.

  2. You already have a TODO for this, and it doesn't affect me since I'm not using the CPU intersect methods, but I don't know if it will be a simple problem to solve: blasList is an array of BVHBase pointers, but for the IntersectTLAS method you are static casting it to BVH, which will likely not be true. You're not using virtual methods, so BVHBase can't have an overwritable Intersect method, so you'll need to find the right way to cast the BVHBase and call its Intersect method.

@jbikker
Copy link
Owner

jbikker commented Jan 14, 2025

Good points. Having a zero pointer for the blasList is indeed a good idea. This takes good care of the situation you described: GPU-only TLAS/BLAS traversal, potentially with a huge TLAS which should not have the BLASInstance bounds unnecessarily updated.

For the CPU intersect methods, I plan to pick the right BVH layout with a switch/case or set of if-statements. Not pretty but I suspect the conditional code is highly predictable and thus fast. Using the 'feature flags' the performance-conscious user can disable certain options to make everything effectively unconditional.

@wjakob
Copy link
Author

wjakob commented Jan 14, 2025

The API looks great, and thank you for already supporting the BLAS/TLAS build in both precisions. What I don't fully understand yet is how analytic primitives would fit into this interface. They don't have a BLAS -- would one still specify them using a BLASInstance?

For the CPU intersect methods, I plan to pick the right BVH layout with a switch/case or set of if-statements. Not pretty but I suspect the conditional code is highly predictable and thus fast.

Would it make sense to use a template for this? You could either template the traversal on the specific type of tree that is encountered. And the logic to dispatch to different tree types using a switch statement could itself be factored out into a class that can be a template parameter.

@jbikker
Copy link
Owner

jbikker commented Jan 15, 2025

Custom geometry in the current implementation consists of a set of callbacks in a regular BVH, which turns all the primitives in that BVH into custom geometry. So if a scene contains some spheres, these will be gathered in one or more BVHs. After that they will be added as BLAS to the scene (encapsulated in a BLASInstance). This way they benefit from instancing functionality and free rigid animation via the transform specified in the BLASInstance.

Using a template to avoid the if statement: That would work if we assume that a scene will use only one type of BVH. Especially for real-time scenarios I don't expect this to be the case; one may want to use a 4 or 8-wide optimized BVH for static geometry, but a 2-wide BVH for geometry that requires rebuilding per frame.

I suspect that all of us here have a healthy aversion of a switch-case in such a low-level code section. Do consider the following however:

  • In a scene with a single BVH type the conditional code becomes fully predictable and effectively disappears.
  • The conditionals happen in a code snippet that will also transform a ray into the model space of the BLASInstance.
  • Rays are expected to encounter less than 2 BLASInstances on average.
  • Jacco.

@jbikker
Copy link
Owner

jbikker commented Jan 16, 2025

I like that BLASInstance is now more self contained, having a BVH index instead of a pointer and storing its world aabb, so the blasList isn't necessary for the construction of the TLAS. I'm not intending to use the CPU intersect functions, so the BVH list should be optional. In Build, you currently update the instance aabb using the BVH list. I would like to pass a null BVH list, and if the BVH list is null, then you can assume the instance list already has an updated aabb for instances.

@jbikker
Copy link
Owner

jbikker commented Jan 16, 2025

...the BVH list should be optional. In Build, you currently update the instance aabb using the BVH list. I would like to pass a null BVH list, and if the BVH list is null, then you can assume the instance list already has an updated aabb for instances.

Forgot it for the 1.2.5 release but this functionality is now in.

@jbikker
Copy link
Owner

jbikker commented Jan 23, 2025

Custom geometry update: There is now a third callback required for custom geometry, which calculates the bounding box of a single primitive. With this change, it is no longer necessary to create and abuse a triangle array; you can now directly build the BVH:

bvh.Build( &sphereAABB, primCount ); 

Here, sphereAABB is a function pointer, and the function is simply:

void sphereAABB( const unsigned primID, bvhvec3& boundsMin, bvhvec3& boundsMax )
{
	boundsMin = spheres[primID].pos - bvhvec3( spheres[primID].r );
	boundsMax = spheres[primID].pos + bvhvec3( spheres[primID].r );
}

This was originally proposed by @TheSFReader.

This behavior is demonstrated in tiny_bvh_custom.cpp for single-precision floating point numbers and in tiny_bvh_custom_double.cpp for double-precision floating point.

Similarly, tlas functionality has been split out to two example programs: tiny_bvh_anim.cpp and tiny_bvh_anim_double.cpp. Note that neither example actually has any animating geometry, both examples just build a scene with a tlas.

On the topic of TLAS functionality: tinybvh used to encode the instance index in the top bits of the primitive index of a hit, in order to keep the size of the hit record limited to 32 bytes. This turns out to be insufficient; therefore tinybvh now defaults to using a full uint32_t (for single precision BVHs) or uint64_t (for double precision BVHs) to encode the instance index. The old functionality still exists for performance-critical scenarios (see define INST_IDX_BITS) but this is no longer the default.

I will assume the custom geometry functionality to be complete at this point, and will now move on to combining the custom geometry with the tlas functionality.

@wjakob
Copy link
Author

wjakob commented Jan 23, 2025

Awesome, thank you so much 🍾

@jbikker
Copy link
Owner

jbikker commented Jan 23, 2025

Initial implementation is up. I would love some feedback on the interface and perhaps also on the implementation.

It's all demonstrated in tiny_bvh_anim.cpp and tiny_bvh_anim_double.cpp:

  • A BVH is constructed over some triangle geometry (line 77)
  • A 'custom geometry' BVH is constructed over a collection of spheres (line 97)
  • The second BVH is assigned two callbacks for intersection and occlusion (line 100, 101)
  • The instance array is set up (line 104 - 112)
  • The TLAS is constructed over the instances (line 113).

After this, the TLAS can be intersected like any BVH (line 152). It yields 5 fields: intersection distance, instance index, primitive index in the instance, and finally the barycentrics of a triangle hit. From here you're on your own. So in the example, based on blas index (retrieved from the instance), we process a sphere hit or a triangle hit. In both cases the vertices are in object space and must be transformed to world space.

Internally, the process is handled in BVH::IntersectTLAS (line 2156, 6056 for double). There is some overhead here for the new functionality: If the TLAS contains a blend of different BVH types or BVHs with and without custom geometry, the conditionals may cause mispredictions. However, in the absence of anything fancy, I believe this construct is acceptable: The conditionals will follow a fixed path and should be negligible compared to the overall cost of ray traversal.

Note that BLAS traversal for custom geometry requires special care. The commonly used sphere intersection algorithm assumes a normalized ray direction, but TLAS traversal doesn't guarantee this. So only for custom geometry, we now normalize the ray direction and scale ray.hit.t to compensate. This sadly involves a square root and a division per BLAS visit, so perhaps we should introduce a setting that disables this (if we're sure our custom geometry can handle not normalized rays).

As a by-product TLAS traversal may now include other BVH types, so that a scene can be populated with high-quality BVH4_CPU meshes as well as quick and dirty basic BVH instances. This should maximize render performance in dynamic scenarios.

So, a large operation all in all; probably needs some tweaking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants