-
Notifications
You must be signed in to change notification settings - Fork 760
Fix the return types of dot4add_i8packed
and dot4add_u8packed
.
#7401
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I have absolutely no idea what I'm doing. Any advice would be welcome. |
@amaiorano Hi! This is another PR that seems to fix a bug that Firefox runs into. |
c8bd8bc
to
974bdaa
Compare
Added a test. |
I've added DXC devs from Microsoft who would know this code better than me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix looks good. I'd like to see it expanded to another place it is needed and testing to cover it all, but otherwise I'm happy!
By way of a little more explanation since I took the time to figure out why this did what it did, we have a special optimization path for intrinsics whose return types depend on their arguments where all are immediates. This is where it will now correctly resolve the immediate because that dependency is removed:
DirectXShaderCompiler/tools/clang/lib/Sema/SemaHLSL.cpp
Lines 6738 to 6747 in 1198c30
// For literal arg which don't affect return type, find concrete type. | |
// For literal arg affect return type, | |
// TryEvalIntrinsic in CGHLSLMS.cpp will take care of cases | |
// where all argumentss are literal. | |
// CombineBasicTypes will cover the rest cases. | |
if (!affectRetType) { | |
TypeInfoEltKind = | |
ConcreteLiteralType(pCallArg, TypeInfoEltKind, | |
pIntrinsicArg->uLegalComponentTypes, this); | |
} |
This is where in other functions that match this signature, a potential optimization reduces them to a constant expression:
Value *TryEvalIntrinsic(CallInst *CI, IntrinsicOp intriOp, |
As you can see there, the dot products are not included as they can't be a constant expression, so it never gets resolved properly.
That's why I've concluded that this is an acceptable change and it doesn't reveal any deeper issue apart from a brittle optimization opportunity detection that I don't think is worth addressing.
Okay, great. I noticed this path getting taken, and that gave me the idea of changing the types in |
Change the definition of the HLSL `dot4add_i8packed`, `dot4add_u8packed`, and `dot2add` intrinsics in `utils/hct/gen_intrin_main.txt` to simply spell out the return types, rather than saying that their return type is determined by their third argument. This prevents DXC from trying to give those functions declarations like declare i64 @"\01?dot4add_u8packed@hlsl@@YA_JII_J@Z"(i32, i32, i64 signext) microsoft#1 which seems to expect a 64-bit third argument and return value. `HLSLExternalSource::MatchArguments` assumes that functions whose return type depends on their arguments' types will get cleaned up by `TryEvalInstrinsic`. Unfortunately, the `dot4add` variants cannot be constant expressions, so this cleanup does not happen for them. But these functions are not generic, and they have only one overload, so there is no need to use interesting `uComponentTypeId` values to get the right effects in the first place. Fixes microsoft#7400.
974bdaa
to
553be8c
Compare
@pow2clk Thanks very much for the review. I think the PR now addresses your comments. |
@jimblandy the follow up review would have been a lot easier if you'd not force pushed the changes. I understand you wanted to rebase, but you can do that with a merge which will keep those changes separate and would have allowed me to just review the new commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the changes!
I'm sorry - I'll be sure to do this next time. |
Change the definition of the HLSL
dot4add_i8packed
anddot4add_u8packed
intrinsics inutils/hct/gen_intrin_main.txt
to simply spell out the return types, rather than saying that their return type is determined by their third argument.This prevents DXC from trying to give those functions declarations like
which seems to expect a 64-bit third argument and return value.
These functions are not generic, and they have only one overload, so there is no need to use interesting
uComponentTypeId
values to get the right effects, andHLSLExternalSource::MatchArguments
seems to get confused about how to treat argument types that affect the return types.Fixes #7400.