Skip to content

Fix the return types of dot4add_i8packed and dot4add_u8packed. #7401

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 12, 2025

Conversation

jimblandy
Copy link
Contributor

Change the definition of the HLSL dot4add_i8packed and dot4add_u8packed intrinsics in utils/hct/gen_intrin_main.txt to simply spell out the return types, rather than saying that their return type is determined by their third argument.

This prevents DXC from trying to give those functions declarations like

declare i64 @"\01?dot4add_u8packed@hlsl@@YA_JII_J@Z"(i32, i32, i64 signext) #1

which seems to expect a 64-bit third argument and return value.

These functions are not generic, and they have only one overload, so there is no need to use interesting uComponentTypeId values to get the right effects, and HLSLExternalSource::MatchArguments seems to get confused about how to treat argument types that affect the return types.

Fixes #7400.

@jimblandy
Copy link
Contributor Author

I have absolutely no idea what I'm doing. Any advice would be welcome.

@jimblandy
Copy link
Contributor Author

@amaiorano Hi! This is another PR that seems to fix a bug that Firefox runs into.

@jimblandy jimblandy force-pushed the dot4add-packed-return-type branch from c8bd8bc to 974bdaa Compare May 2, 2025 02:54
@jimblandy
Copy link
Contributor Author

Added a test.

@amaiorano amaiorano requested review from llvm-beanz and pow2clk May 2, 2025 13:51
@amaiorano
Copy link
Collaborator

@amaiorano Hi! This is another PR that seems to fix a bug that Firefox runs into.

I've added DXC devs from Microsoft who would know this code better than me.

Copy link
Member

@pow2clk pow2clk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix looks good. I'd like to see it expanded to another place it is needed and testing to cover it all, but otherwise I'm happy!

By way of a little more explanation since I took the time to figure out why this did what it did, we have a special optimization path for intrinsics whose return types depend on their arguments where all are immediates. This is where it will now correctly resolve the immediate because that dependency is removed:

// For literal arg which don't affect return type, find concrete type.
// For literal arg affect return type,
// TryEvalIntrinsic in CGHLSLMS.cpp will take care of cases
// where all argumentss are literal.
// CombineBasicTypes will cover the rest cases.
if (!affectRetType) {
TypeInfoEltKind =
ConcreteLiteralType(pCallArg, TypeInfoEltKind,
pIntrinsicArg->uLegalComponentTypes, this);
}

This is where in other functions that match this signature, a potential optimization reduces them to a constant expression:

Value *TryEvalIntrinsic(CallInst *CI, IntrinsicOp intriOp,

As you can see there, the dot products are not included as they can't be a constant expression, so it never gets resolved properly.

That's why I've concluded that this is an acceptable change and it doesn't reveal any deeper issue apart from a brittle optimization opportunity detection that I don't think is worth addressing.

@jimblandy
Copy link
Contributor Author

we have a special optimization path for intrinsics whose return types depend on their arguments where all are immediates.

Okay, great. I noticed this path getting taken, and that gave me the idea of changing the types in gen_intrin_main.txt. I'm glad to have someone who knows what's going on double-check this.

Change the definition of the HLSL `dot4add_i8packed`,
`dot4add_u8packed`, and `dot2add` intrinsics in
`utils/hct/gen_intrin_main.txt` to simply spell out the return types,
rather than saying that their return type is determined by their third
argument.

This prevents DXC from trying to give those functions declarations like

    declare i64 @"\01?dot4add_u8packed@hlsl@@YA_JII_J@Z"(i32, i32, i64 signext) microsoft#1

which seems to expect a 64-bit third argument and return value.

`HLSLExternalSource::MatchArguments` assumes that functions whose
return type depends on their arguments' types will get cleaned up by
`TryEvalInstrinsic`. Unfortunately, the `dot4add` variants cannot be
constant expressions, so this cleanup does not happen for them. But
these functions are not generic, and they have only one overload, so
there is no need to use interesting `uComponentTypeId` values to get
the right effects in the first place.

Fixes microsoft#7400.
@jimblandy jimblandy force-pushed the dot4add-packed-return-type branch from 974bdaa to 553be8c Compare May 8, 2025 15:37
@jimblandy
Copy link
Contributor Author

@pow2clk Thanks very much for the review. I think the PR now addresses your comments.

@pow2clk
Copy link
Member

pow2clk commented May 12, 2025

@jimblandy the follow up review would have been a lot easier if you'd not force pushed the changes. I understand you wanted to rebase, but you can do that with a merge which will keep those changes separate and would have allowed me to just review the new commit.

Copy link
Member

@pow2clk pow2clk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes!

@pow2clk pow2clk merged commit 377c4ca into microsoft:main May 12, 2025
12 checks passed
@github-project-automation github-project-automation bot moved this from New to Done in HLSL Roadmap May 12, 2025
@jimblandy
Copy link
Contributor Author

@jimblandy the follow up review would have been a lot easier if you'd not force pushed the changes. I understand you wanted to rebase, but you can do that with a merge which will keep those changes separate and would have allowed me to just review the new commit.

I'm sorry - I'll be sure to do this next time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

DXC crashes compiling calls to dot4add_u8packed and dot4add_i8packed
3 participants