Skip to content

Refactor udt intrinsic arg copy to before SROA, flatten RayDesc #7440

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
May 16, 2025

Conversation

tex3d
Copy link
Contributor

@tex3d tex3d commented May 8, 2025

Intrinsics that take UDT arguments need copy-in/copy-out. Other aggregate args are flattened for intrinsic calls. Previously, these operations were intermingled, driven by SROA on alloca/GV values.

There were RayDesc arguments that weren't treated consistently, and weren't copied in when necessary, leading to problems. They should be flattened into the intrinsic arguments, but TraceRay calls didn't do this.

This change:

  • flattens RayDesc args for all intrinsics that use them.
  • separates the copy-in/copy-out generation into a separate operation before SROA. Ideally, this copy-in/copy-out would have been generated by CodeGen based on by-value passing, but that's a deeper intrinsic AST issue potentially.
  • Updated and added tests.

Fixes #7434.

Intrinsics that take UDT arguments need copy-in/copy-out.
Other aggregate args are flattened for intrinsic calls.
Previously, these operations were intermingled, driven by SROA on alloca/GV values.

There were RayDesc arguments that weren't treated consistently, leading to problems.
They should be flattened into the intrinsic arguments, but TraceRay calls didn't do this.
Doing this for TraceRay would cause an issue where the subsequent Payload UDT arg could be in a different position depending on whether it was processed before or after flattening the RayDesc.

This change is in preparation for flattening RayDesc args.
It separates the UDT copy-in/copy-out into a separate operation before SROA.

When doing the copy-in/copy-out separately, we don't expand the memcpy operations until they are handled separately during SROA.
This causes a change in some IR shape, which had to be adjusted for in some tests.
There are a couple remaining PIX pass tests that still need adjustment.
These will likely need further changes when flattening RayDesc, so maybe they should be revisited after that.

TODO:
- Fix the remaining tests
- Flatten the RayDesc parameters
- Adjust tests for accordingly
Copy link
Contributor

github-actions bot commented May 8, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

tex3d added 3 commits May 8, 2025 13:42
This renames:
- copyIntrinsicUDTArgs -> copyIntrinsicAggArgs
- RewriteCallArg -> memcpyAggCallArg.

Updated to copy the RayDesc aggregate arguments as well.

This will fix the issue with a RayDesc argument provided directly from a cbuffer,
since the incoming argument pointer will no longer be skipped by SROA.

Expected IR for tests will need to be updated after flattening RayDesc, so holding off on updating tests until then.
- Enable RayDesc flattening in SROA
- Update lowering code accordingly
- Separated MakeNop and MakeMiss lowering into two functions
- Updated tests

TODO: Update PIX pass tests
Copy link
Member

@pow2clk pow2clk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks sensible. Sorry if I was too detailed for a draft in some parts.

@tex3d tex3d marked this pull request as ready for review May 14, 2025 06:04
@tex3d tex3d self-assigned this May 14, 2025
@tex3d tex3d moved this to Active in HLSL Support May 14, 2025
@tex3d tex3d changed the title Refactor udt intrinsic arg copy to before SROA Refactor udt intrinsic arg copy to before SROA, flatten RayDesc May 14, 2025
Copy link
Member

@pow2clk pow2clk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have anything against this change, just some confirmations and very mild suggestions. Since you need a second review anyway, perhaps getting Jeff to comment on the pix changes would be expedient. I don't expect him to review the other parts though, so you might choose to get someone else to give those a glance as well.

Copy link
Member

@pow2clk pow2clk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving again again again.

Copy link
Contributor

@alsepkow alsepkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@simoll simoll mentioned this pull request May 16, 2025
41 tasks
@tex3d tex3d merged commit 053e7ac into microsoft:main May 16, 2025
13 checks passed
@github-project-automation github-project-automation bot moved this from New to Done in HLSL Roadmap May 16, 2025
@github-project-automation github-project-automation bot moved this from Active to Closed in HLSL Support May 16, 2025
@tex3d tex3d deleted the sroa-udt-arg-refactor branch May 16, 2025 15:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Unflattened RayDesc breaking HL->DXIL lowering
3 participants