-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CIR][CIRGen] Add support for global named registers #741
base: main
Are you sure you want to change the base?
Conversation
…lvm#501) This PR adds the LLVMIR lowering support for CIR bit operations. For `cir.bit.clz`, `cir.bit.ctz`, and `cir.bit.popcount`, they can be lowered directly to LLVM intrinsic calls to `@llvm.ctlz`, `@llvm.cttz`, and `@llvm.ctpop`, respectively. For the other three bit operations, namely `cir.bit.clrsb`, `cir.bit.ffs`, and `cir.bit.parity`, they are lowered to a sequence of LLVM IR instructions that implements their functionalities. This lowering scheme is also used by the original clang CodeGen.
CIRGenFunction::buildFromMemory can handle the `cir.bool` values. So we no longer need to emit the `NIY` error here.
This is the next step in inline assembly support and it's more like a service PR and mostly dedicated to the in/out argument types. Also, operand attributes are added and it's the last change in the `cir.asm` operation afaik. But I would wait untill the next PR, which will contain more examples and maybe will help us to get more readable format for the operation. Note, that we have to add an attribute for each operand - because the lowering of the llvm dialect to LLVM IR iterates over them in the same order. The next PR will be last one (so far) in the series of PRs dedicated to the inline assembly support. It will add storing of the results.
Three small changes, all cleanup or refactoring in nature. 1. Fix the assemblyFormat for all the vector operations in the ClangIR dialect so that vector types appear in ClangIR as `!cir.vector<type x n>` instead of as just `<type x n>`. When I first created the vector ops, I forgot to use `qualified` as necessary when writing out types. This change fixes that. There is no change in behavior, but there is a change to the text version of ClangIR, which required changing the ClangIR expected results and ClangIR inputs in the tests. 2. Create a new `cir.vec.splat` operation and use that for "vector splat", i.e. a conversion from a scalar to a vector. A "vector splat" conversion had been implemented with `cir.vec.create` before. This change results in different ClangIR and different LLVM IR, which again required updating the tests, but no noticeable change in compiler behavior. 3. Create an `IntegerVector` type constraint, which requires that the given type be a vector whose element type is an integer. It can be any integral type, and the vector can be of any size. Use the new type constraint in the definition of `cir.vec.ternary`, whose condition operand must be an `IntegerVector`. Remove the integral type check from `VecTernaryOp::verify`, since doing the check there is now redundant. The only possibly visible change is to the text of an error message when validation of `cir.vec.ternary` fails. The expected output of a validation test was updated with the new message.
This PR intends to fix some problems with packed structures support, so the llvm#473 will work. Basically, the main problem for the packed structures support is an absence of arbitrary sized integers in CIR. Well, one workaround is to use `mlir::IntegerType` for that, but it's kind of wrong way (please correct me if I'm wrong). Another way is to introduce this type in CIR. So far I suggest this way: instead of arbitrary sized integers we will create an array of bytes for bitfield storages whenever they doesn't fit into the CIR `IntType`. Well, the original codegen creates storages with alignment 8 - so it can be `i24` storage type for instance. Previously, we just created storages that could be represented as CIR `IntType`: 8, 16, 32, 64. And it was working before I came up with a necessity to support packed structures. At first glance it's not a problem - just add `determinePacked` method from the original codegen and that's it. But it turned out that this method _infers_ the fact if a structure is packed or not. It doesn't use the AST attribute for that as one could think - it works with offsets and alignments of fields. Thus, we either need to invent our own way to determine packed structures (which is error prone and maybe not doable at all) or try to use the existing one. Also, we go closer to the original lllvm's data layout in this case. 1) I had to move the lowering details from the `LoweringPrepare` to the `LowerToLLVM`, because it's not possible to do a `load` from the array of bytes to the integer type - and it's ok in llvm dialect. Thus, all the math operations can be expressed without any problems. Basically the most of the diff you see is because of the changes in the lowering. The remaining part is more or less easy to read. 2) There are minor changes in `CIRRecordLayoutBuilder` - as described above, we use may generate an array of bytes as a storage. 3) Some cosmetic changes in `CIRGenExpr` - since we don't want to infer the storage type again and just use the one stored in the `CIRGenBitFieldInfo`. 4) Helpers are introduced in the lowering - but nothing hard - just shifts and logical ops. 5) I removed `bitfield-ops` test - because now the test cases covered there are all in `bitfields.c` and `bitfields.cpp` . So ... This is still a suggestion, though I believe it's a good one. So you are welcome to discuss, suggest another ways to solve the problem and etc.
Still missing lowering support, which will come next.
This doesn't change existing functionality, for existing crashes related to atomics we just hit asserts a bit further now, but no support added just yet. Next set of commits will introduce functionlity with testcases.
Just like previous commit, add more infra pieces, still NFC since all relevant testcases hit asserts, just a bit deeper.
This PR adds a support for packed structures. Basically, now both `pragma pack(...)` and `__attribute__((aligned(...)))` should work. The only problem is that `getAlignment` is not a total one - I fix only a couple of issues I faced with - for struct types and arrays.
PR adds support for initialization of unions. The change is copy-pasted from the original CodeGen.
This PR adds support for `__builtin_prefetch`. CIRGen of this builtin emits the new 'cir.prefetch' opcode. Then `cir.prefetch` lowers to `llvm.prefetch` intrinsic. Co-authored-by: Bruno Cardoso Lopes <[email protected]>
This PR adds MemRead/MemWrite markers to the `GetBitfieldOp` and `SetBitfieldOp` (as discussed in llvm#487) Also, minor renaming in the `SetBitfieldOp` --------- Co-authored-by: Bruno Cardoso Lopes <[email protected]>
This PR adds support for `__builtin_constant_p`. Implementation introduces the new `cr.is_constant` opcode to it during the codegeneration of builtin. Codegeneration is taken from the original llvm codegen.
This patch adds CIRGen support for the C++20 three-way comparison operator `<=>`. The binary operator is directly lowered to existing CIR operations. Most of the changes are tests.
This change is taken from the original codegen.
…m#517) This PR adds handling of AttributedStmt to support fallthrough attribute.
This change is taken from the original codegen
We start our journey towards `goto` support and this is a first step on this way. There are some discussion in llvm#508 and according to the plan we do the following here: - a new pass called `cir-flatten-cfg` that is a stub now but later will be responsible for the regions inlining. The pass works only if `-emit-flat-cir` is passed in cmd line. Thus, the clang behavior is not changed here from the user's point of view. - The pass will be accomplished with `goto` solver later, so we talk about several passes that are mandatory for the lowering into `llvm` dialect. There are at least two clients of this pass that will be affected: `clang` itself and `cir-opt`, so we need a common point for them: and `populateCIRFlatteningPasses` and `populateCIRToLLVMPasses` guarantee that `CIR` will be in the correct state for all the clients, whatever new passes we will add later.
Implement `__builtin_shufflevector` and `__builtin_convertvector` in ClangIR. This change contributes to the implemention of issue llvm#284. `__builtin_convertvector` is implemented as a cast. LLVM IR uses the same instructions for arithmetic conversions of both individual scalars and entire vectors. So ClangIR does the same. The code for handling conversions, in both CodeGen and Lowering, is cleaned up to correctly handle vector types. To simplify the lowering code and avoid `if (type.isa<VectorType>())` statements everywhere, the utility function `elementTypeIfVector` was added to `LowerToLLVM.cpp`. `__builtin_shufflevector` has two forms, only one of which appears to be documented. The documented form, which takes a variable-sized list of integer constants for the indices, is implemented with the new ClangIR operation `cir.vec.shuffle.ints`. This operation is lowered to the `llvm.shufflevector` op. The undocumented form, which gets the indices from a vector operand, is implemented with the new ClangIR operation `cir.vec.shuffle.vec`. LLVM IR does not have an instruction for this, so it gets lowered to a long series of `llvm.extractelement` and `llvm.insertelement` operations.
This introduces CIRGen and LLVM lowering for the first of a bunch of these atomic operations, incremental work should generelize the current constructs.
This PR adds support for the following intrinsic functions: - `__builtin_bswap{16, 32, 64}` - `_byteswap_{ushort, ulong, uint64}` This PR adds a new `cir.bswap` operation to represent such an intrinsic call. CIRGen and LLVMIR lowering for the new operation is included in this PR.
This is the final commit for issue llvm#284. Vector types other than GNU vector types will be covered by other yet-to-be-created issues. Now that GNU vector types (the ones defined via the vector_size attribute) are implemented, do a final cleanup of the assertions and other checks related to vector types. Remove `UnimplementedFeature::cirVectorType()`. Deal with the remaining calls to that function. When the that is not yet implemented has to do with Arm SVE vectors, the assert was changed to `UnimplementedFeature::scalableVectors()` instead. The assertion was removed in cases where the code correctly handle GNU vector types. While cleaning up the assertion checks, I noticed that BinOp handling of vector types wasn't quite complete. Any special handling for integer or floating-point types wasn't happening when the operands were vector types. To fix this, split `BinOpInfo::Ty` into two fields, `FullType` and `CompType`. `FullType` is the type of the operands. `CompType` is normally the same as `FullType`, but is the element type when `FullType` is a vector type.
LLVM support coming next.
… value (llvm#719) This commit fixes GlobalOp lowering for floating without initial value. It implies to be initialized with zeros. E.g. float f[100]; double d;
ClangIR was failing on ``` __atomic_compare_exchange_n(&a, &old, 42, true, 5, 5); ``` The `true` was the problem. It would work with a literal `0` or `1`, but not with a literal `true` or `false`. The bug was in `isCstWeak` in CIRGenAtomic.cpp, which was only looking for an integral constant. It didn't recognize a boolean constant and was falling back on the non-constant path, which isn't implemented yet. Rewrite `isCstWeak` to check for both intergral and boolean constants.
This PR adds LLVM lowering for the following operations related to complex numbers: - `cir.complex.create`, - `cir.complex.real_ptr`, and - `cir.complex.imag_ptr`. The LLVM IR generated for `cir.complex.create` is a bit ugly since it includes the `insertvalue` instruction, which typically is not generated in upstream CodeGen. Later we may need further CIR canonicalization passes to try folding `cir.complex.create`.
This PR fixes the bug described as in llvm#727 (comment). It should resolve the crash reported in llvm#727 .
This PR introduces a new attribute `OpenCLKernelMetadataAttr` to model the OpenCL kernel metadata structurally in CIR, with its corresponding implementations of CodeGen, Lowering and Translation. The `"TypeAttr":$vec_type_hint` part is tricky because of the absence of the signless feature of LLVM IR, while SPIR-V requires it. According to the spec, the final LLVM IR should encode signedness with an extra `i32` boolean value. In this PR, the droping logic from CIR's `TypeConverter` is still used to avoid code duplication when lowering to LLVM dialect. However, the signedness is then restored (still capsuled by a CIR attribute) and dropped again in the translation into LLVM IR.
… BinOp (llvm#720) This commit extends the pass to support loop invariant BinOp hoisting as SCF forOp boundary. E.g. // (100 - 1) should be hoisted out of loop. // So the boundary could be input operand to generate SCF forOp. for (int i = 0; i < 100 - 1; ++i) {}
…#729) Previously, when lowering induction variable in forOp, we removed the IV load and replaced the users with SCF.IV. The CIR IV users might still CIR operations during lowering forOp. It caused the issue that CIR operation contained SCF.IV as operand which is MLIR integer type instead CIR type. This comment lower CIR load IV_ADDR with ARITH addi SCF.IV, 0 So SCF.IV can be propagated by OpAdaptor when lowering individual IV users. This simplifies the lowering and fixes the issue. The redundant arith.addi can be removed by later MLIR passes.
LLVM lowering for the following operations is introduced in llvm#616 and llvm#651: `cos`, `exp`, `exp2`, `log`, `log10`, `log2`, `sin`, `sqrt`, `fmod`, and `pow`. However, they are not lowered to their corresponding LLVM intrinsics; instead they are transformed to libc calls during lowering prepare. This does not match the upstream behavior. This PR tries to correct this mistake. It makes all CIR FP intrinsic ops lower to their corresponding LLVM intrinsics (`fmod` is a special case and it is lowered to the `frem` LLVM instruction).
For now only handle the cir.try part, cir.catch is coming next. Using flat cir for tests make this easy to incrementally build.
as title. document will be in another PR as it seems to be a different upstream branch
…lvm#733) In [this commit](llvm@e5d840b), minimal support for Darwin aarch64 triples was added. But TargetLoweringInfo was not updated correspondingly. This could lead to a failure of the test `driver.c` with CallConvLowering pass enabled (or `LowerModule` used in some other ways). This PR fixes the inconsistency and adds an extra missing feature flag for it.
Although currently LowerModule is not ready for formal usage, we need it for target-specific lowering to LLVM. This PR temporarily public the symbol `createLowerModule` to reuse the logic of preparing a `LowerModule`, making it easier for future refactor (making `TargetLoweringInfo` available for most stages in CIR Lowering).
In this PR, we 1. implement defaultVisibility as far as dsolocal is concerned, currently is either MLIR::Visibility isPublic() or isPrivate(). Now, we don't handle hiddenVisibility and protectedVisibility from AST. I put missFeature assert so that If in anyway we translate hiddenVisibility or protectedVisibility into mlir::SymbolTable::Visibility::Private (hopefully not for it'd be confusing), then we need to revise this defaultVisibility setting. 2. call setNonAliasAttributes on global op upon discovery of its initialization, thus we have globals dso_local correctly set. Still missing is lots of function should have dso_local set, but the all depend on comDat implementation, which will come from next PR within the next few days.
OG codegen does not generate any exception related content when there are not calls happening inside the try block. For now we mimic OG and do the same, until we see a concrete use case that would have used emitting this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this, it's a nice feature addition!
We can achieve the same by re-using existing ClangIR components. The suggested approach:
- Create/use a regular global.
- Use regular
cir.get_global
to retrieve the address. - Add a new attribute #cir.asm_reg that holds a StringAttr. This should be used to initialize the global.
- Add a
isGlobalRegister
method tocir.global
'sextraClassDeclaration
, which checks if the global has an initializer and whether that is a#cir.asm_reg
. - Add support in
LowerToLLVM.cpp
to materializecir.get_global
s intometadata !<whatever>
.cir.{load,store}
referring to these globals should be converted to the proper intrinsic calls. - Use a more complete usecase that exercise both intrinsics: https://godbolt.org/z/deGs17rox
- Use
clang/test/CIR/CodeGen/abstract-cond.c
as an example for testing both CIR and LLVM output in the same test.
// cir.func no_proto @main() extra(#fn_attr) { | ||
// %0 = cir.const #cir.int<1> : !s32i | ||
// %1 = cir.cast(integral, %0 : !s32i), !s64i | ||
// %2 = cir.get_RG @llvm.named.register.x20 : !cir.ptr<!u64i> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for trying to improve this! We don't eagerly lower things to LLVM right away though, this is done in LowerToLLVM.cpp
.
Realize RG and GET_RG op which are copied from cir.Global
op. The concept is that the same asm label means a unique abstract memory
address with u64i type, and get_rg will return ptr. We could use
cir.cast(bitcast) to transform ptr to ptr.