-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CIR] initial support for pointer-to-data-member type #401
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
8d5442a
to
4513317
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Lancern awsome! I left a few suggestions to improve testing and the codegen side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Lancern thanks for all the changes! Just one more round of reviews.
Thanks for adding support for pointers to data members! This is going to allows to build even more C++, way to go. I believe we can use currently existing types and ops to support these, and no need to introduce new ops/attr/types. If you have a specific use case you care about, can you please provide examples of how using this extra layer information helps?
All ops that use/return these might as well return a
The AST allows you to query the type of the struct element given you already have the index, so it can just be used directly.
Without a |
@bcardosolopes Thanks for your comment! But TBH I'm sorry I don't get your point exactly.
I don't clearly see how this is possible. A member pointer is totally different from an ordinary pointer. If we insist the reuse of the existing types instead of the new types, one possible way is to use the types compatible with the underlying ABI layout to represent a ptr-to-data-member. On most platforms, a ptr-to-data-member is simply represented by an integer which represents the offset of the target field from the beginning of the containing object. So we might use a Another possible way to represent the ptr-to-member type in CIR is to invent an imagined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems that I missed the null pointer value support for ptr-to-data-member. I'll check this and update the PR.
@bcardosolopes take a look at this godbolt example. I believe it corroborates @Lancern's arguments.
This might work, but we would be losing source code information and diverging from the original codegen. These two types are inherently different, as seen in the example above. Having separate types should facilitate both analysis and lowering. It might also come in handy when we have a proper solution for dealing with target/ABI-specific stuff.
I didn't quite understand this point. @bcardosolopes could you elaborate?
I believe this is necessary regardless of the |
Just added support for null ptr-to-data-member values. |
2c32fcc
to
4795e86
Compare
Rebased onto the latest main. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Lancern a few more suggestions and some minor nitpicks.
let description = [{ | ||
A data member pointer attribute is a literal attribute that represents a | ||
constant pointer-to-data-member value. | ||
}]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expand this explaining that the literal represents the index of the data member.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
Addressing @Lancern comments:
No problem, let me clarify!
I think my point can be easier to understand if I ask you a different question: how do you plan to lower this to LLVM? CIR already has most of the things you need to have a lower level representation to handle this just like LLVM does.
Can you give me an example of such analysis and a specific use case you care about? Note that I'm not against adding nice abstractions, I'm all for it, but I rather not introduce complexity for things we don't have concrete use cases in mind. Note that if we add the abstractions like you are proposing, you'd also need to lower it down to more lower level CIR as part of the LoweringPrepare pass before we go down to LLVM. Without some roadmap on how you'd like to use this abstractions, I rather they only get added upon compromise on some upcoming feature that's going to use it.
I'm looking at https://godbolt.org/z/hjaMcqKov and CIR has all the operations necessary to the same approach, can you elaborate on the specific limitations you've found? |
Addressing @sitio-couto
I'm willing to bite that compromise unless there's a concrete use case in mind. So far, lowering to LLVM sounds like the only one.
Lowering these ABI details early sounds fine to me, how much these featues get used in practice to justify the abstraction? I don't have numbers nor am I asking for them, but I don't stumble into data-member pointers that often, I might be biased tho.
It's a good way to fix it while keeping nice abstraction, but I'm not yet convinced there are users to take advantage of the benefits. |
IMO, this and being semantically closer to C/C++ is enough to at least justify the type and the attribute. After all, the complexity of these two components is minimal.
I do not know, but this may also be a reason for us to be conservative by adding said abstraction.
I don't think there are. GEP's with runtime indexes are not uncommon in the original codegen. In CIR, however, most are replaced with pointer strides (example). We could go the same route with data-member pointers. |
Unfortunately I cannot give a concrete example at this moment that shows the immediate usefulness of the newly added stuff in this PR. But I believe that it brings benefits similar to the
This makes sense to me. So if it finally comes to a consenus that this PR is too early to be landed at this stage of this project, I'll close this PR and draft a new PR that directly implements ptr-to-data-member upon existing CIR types and operations. |
4e069c6
to
79d4dc7
Compare
I'm still open to discuss if you feel very strong about this. For instance, you initially suggest to add |
4795e86
to
653b989
Compare
Rebased onto the latest |
Apologies for the delay here. I finally had some time to take a closer look at this. Before you make changes to this patch, let's perhaps wrap this discussion and converge on a resolution first, so you don't waste resources. My comments/suggestions follow: case 1
If the type of My suggestion:
case 2
My suggestion:
case 3
Ok, I think we can still use this one, with some changes: My suggestion:
Seems like the same applies to the incomplete types as well. On the In a gist, the changes would be:
Does this makes to you? Anything I'm missing? |
@bcardosolopes Thanks for your comment! Overall the suggestions make sense, but one problem in case 1 and case 2:
This decay does not make sense because member pointers are not pointers as represented by For case 1 a more reasonable lowering: int Point::*pt_member = &Point::x;
// cir.global external @pt_member = #cir.data_member_ptr<!ty_22Point22[0]> : !u64i For case 2: auto test1() -> int Point::* {
return &Point::y;
}
// cir.func @_Z5test1v() -> !u64i
// %{{.+}} = cir.const(#cir.data_member_ptr<!ty_22Point22[1]> : !u64i) : !u64i
// } Case 3 makes sense to me. |
Good catch, sounds great to me! |
This representation will complicate printing/parsing and type-checking. I suggested
Is the intent to drop the data-member-pointers type info right after the declaration? If this is the case, the new type will be useless, I think, since it will only be used to declare such data and be dropped right after. Other codegen sections for data-member-pointer will likely also require change, such as assignments. Both comments above apply to Case 2 as well.
An interface seems excessive for a temporary solution. Are there any current places where said |
Works for me.
One IPO opportunity I can see in future is tracking a perhaps offset on a callsite param and specialize the callee to transform a
I'm in favor of generalizing |
My point is not whether we need the new attribute and type. Since we are introducing them, it seems contradictory that the first action we take is to discard this information by decaying it to an integer, especially if it will complicate other areas such as codegen. Considering this, what is the benefit of this decay? The IPO mentioned seems viable regardless.
I would prefer to leave them separate. BTW, I mentioned before that runtime GEP operations are represented as low-level pointer strides and loads in CIR. If this is by design, perhaps we should do the same thing with data-member-pointers. |
Ok, almost converging :)
It seems to me that we can easily retrieve information (versus tree matching a bunch of conditions) by having a data attribute (there's no other way to initialze those otherwise) and a dynamic
Can you elaborate on how it will complicate codegen? @sitio-couto @Lancern: if we had such a type, how do you plan to later lower it to an actual offset in LowerPrepare? Will we need AST information or does CIR already has everything? I'm asking because the current proposed type seems to be: "there's a
I have the reverse question, what's the benefit of keeping it? The only ones I see are mentioned above: defer ABI decision when it's a function argument. @sitio-couto: if you feel strong about this, I'm willing to take it, I probably just need some comments on the questions above, so I know what's the plan of record.
I'm fine with having them separate, I don't have a strong take here, tradeoffs seem fair. My idea with the interface was considering that the thing was actually a pointer type, and if that was the case, it'd be silly to have more than one type that behaves like a pointer without an interface to unify queries. However, since the thing represents an offset, it doesn't make sense to use an interface - I forgot to relay that info in previous comments.
I don't like pointer strides here because it loses the "this is dynamically walking a struct" semantic, and makes it semantically easy to transform into static in the future. Also, LowerPrepare could lower On another note, we haven't been using strides for struct related code, at least not from what I can currently see in tests. If we want to unify that somehow, should probably be its own PR proposal. |
@bcardosolopes and I talked a bit offline and agreed on the following:
This is mostly what you're already doing, however, there are a couple of changes to be made:
@Lancern let us know when you've addressed the remaining reviews, and if you have any other questions! |
653b989
to
7bfd595
Compare
As discussed in pull #401 , The present `UnimplementedFeature` class is made for the CIRGen submodule and we need similar facilities for code under `clang/lib/CIR/Dialect/IR`. This NFC patch adds a new `CIRDialectUnimplementedFeature` class that provides unimplemented feature guards for CIR dialect code.
This patch adds initial support for the pointer-to-data-member type. Specifically, this commit includes: - New ops, types, and attributes: - CodeGen for pointer-to-data-member types and values - Lower C++ pointer-to-member type - Lower C++ expression `&C::D` - Lower C++ expression `c.*p` and `c->*p` This patch only includes an initial support. The following stuff related to pointer-to-member types are not supported yet: - Pointer to member function; - Conversion from `T Base::*` to `T Derived::*`; - LLVMIR lowering.
As discussed in pull #401 , The present `UnimplementedFeature` class is made for the CIRGen submodule and we need similar facilities for code under `clang/lib/CIR/Dialect/IR`. This NFC patch adds a new `CIRDialectUnimplementedFeature` class that provides unimplemented feature guards for CIR dialect code.
This patch adds initial support for the pointer-to-data-member type. Specifically, this commit includes: - New ops, types, and attributes: - CodeGen for pointer-to-data-member types and values - Lower C++ pointer-to-member type - Lower C++ expression `&C::D` - Lower C++ expression `c.*p` and `c->*p` This patch only includes an initial support. The following stuff related to pointer-to-member types are not supported yet: - Pointer to member function; - Conversion from `T Base::*` to `T Derived::*`; - LLVMIR lowering.
As discussed in pull llvm#401 , The present `UnimplementedFeature` class is made for the CIRGen submodule and we need similar facilities for code under `clang/lib/CIR/Dialect/IR`. This NFC patch adds a new `CIRDialectUnimplementedFeature` class that provides unimplemented feature guards for CIR dialect code.
This patch adds initial support for the pointer-to-data-member type. Specifically, this commit includes: - New ops, types, and attributes: - CodeGen for pointer-to-data-member types and values - Lower C++ pointer-to-member type - Lower C++ expression `&C::D` - Lower C++ expression `c.*p` and `c->*p` This patch only includes an initial support. The following stuff related to pointer-to-member types are not supported yet: - Pointer to member function; - Conversion from `T Base::*` to `T Derived::*`; - LLVMIR lowering.
As discussed in pull #401 , The present `UnimplementedFeature` class is made for the CIRGen submodule and we need similar facilities for code under `clang/lib/CIR/Dialect/IR`. This NFC patch adds a new `CIRDialectUnimplementedFeature` class that provides unimplemented feature guards for CIR dialect code.
This patch adds initial support for the pointer-to-data-member type. Specifically, this commit includes: - New ops, types, and attributes: - CodeGen for pointer-to-data-member types and values - Lower C++ pointer-to-member type - Lower C++ expression `&C::D` - Lower C++ expression `c.*p` and `c->*p` This patch only includes an initial support. The following stuff related to pointer-to-member types are not supported yet: - Pointer to member function; - Conversion from `T Base::*` to `T Derived::*`; - LLVMIR lowering.
As discussed in pull #401 , The present `UnimplementedFeature` class is made for the CIRGen submodule and we need similar facilities for code under `clang/lib/CIR/Dialect/IR`. This NFC patch adds a new `CIRDialectUnimplementedFeature` class that provides unimplemented feature guards for CIR dialect code.
This patch adds initial support for the pointer-to-data-member type. Specifically, this commit includes: - New ops, types, and attributes: - CodeGen for pointer-to-data-member types and values - Lower C++ pointer-to-member type - Lower C++ expression `&C::D` - Lower C++ expression `c.*p` and `c->*p` This patch only includes an initial support. The following stuff related to pointer-to-member types are not supported yet: - Pointer to member function; - Conversion from `T Base::*` to `T Derived::*`; - LLVMIR lowering.
As discussed in pull llvm#401 , The present `UnimplementedFeature` class is made for the CIRGen submodule and we need similar facilities for code under `clang/lib/CIR/Dialect/IR`. This NFC patch adds a new `CIRDialectUnimplementedFeature` class that provides unimplemented feature guards for CIR dialect code.
This patch adds initial support for the pointer-to-data-member type. Specifically, this commit includes: - New ops, types, and attributes: - CodeGen for pointer-to-data-member types and values - Lower C++ pointer-to-member type - Lower C++ expression `&C::D` - Lower C++ expression `c.*p` and `c->*p` This patch only includes an initial support. The following stuff related to pointer-to-member types are not supported yet: - Pointer to member function; - Conversion from `T Base::*` to `T Derived::*`; - LLVMIR lowering.
As discussed in pull #401 , The present `UnimplementedFeature` class is made for the CIRGen submodule and we need similar facilities for code under `clang/lib/CIR/Dialect/IR`. This NFC patch adds a new `CIRDialectUnimplementedFeature` class that provides unimplemented feature guards for CIR dialect code.
!s32i = !cir.int<s, 32> | ||
!ty_22Foo22 = !cir.struct<struct "Foo" {!s32i}> | ||
|
||
#global_ptr = #cir.data_member<0> : !cir.data_member<!s32i in !ty_22Foo22> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, I've stumbled upon this test when refactoring the cir.const
assembly.
Can you explain what you were trying to test here? The reason I'm asking is that what you have here is an attribute alias, not a global value. That's the reason that the output MLIR is "inlining" it into:
// CHECK-NEXT: cir.func @get_global_member(%arg0: !cir.ptr<!ty_22Foo22>) {
// CHECK-NEXT: %0 = cir.const(#cir.data_member<0> : !cir.data_member<!s32i in !ty_22Foo22>) : !cir.data_member<!s32i in !ty_22Foo22>
// CHECK-NEXT: %1 = cir.get_runtime_member %arg0[%0 : !cir.data_member<!s32i in !ty_22Foo22>] : !cir.ptr<!ty_22Foo22> -> !cir.ptr<!s32i>
// CHECK-NEXT: cir.return
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Lancern ^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orbiri Well this should just be an effort to save some typings when writing this test. But clearly it does little help, so you can change it if it's convenient for you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@orbiri Well this should just be an effort to save some typings when writing this test. But clearly it does little help, so you can change it if it's convenient for you.
From the test name it seems like you tried to test the incorporation of some global property here, but if not - then get_global_member
is 1:1 equivalent to get_runtime_member
, and I'd suggest to remove it.
If indeed you intended to test some global property, I'd suggest to revisit this test to make sure that it won't break over time 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the test name it seems like you tried to test the incorporation of some global property here, but if not - then get_global_member is 1:1 equivalent to get_runtime_member, and I'd suggest to remove it.
I checked this again and I'm sure they're 1:1 equivalent as you said. I'll remove it in an NFC PR if you request so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove it in an NFC PR if you request so.
See #582 .
As suggested in llvm#401, this patch removes the `get_global_member` test in `CIR/IR/data-member-ptr.cir` as it is redundant.
As suggested in #401, this PR removes the `get_global_member` test in `CIR/IR/data-member-ptr.cir` as it is redundant. The original comment: #401 (comment)
) As suggested in llvm#401, this PR removes the `get_global_member` test in `CIR/IR/data-member-ptr.cir` as it is redundant. The original comment: llvm/clangir#401 (comment)
This patch adds initial support for the pointer-to-data-member type. Specifically, this commit includes: - New ops, types, and attributes: - CodeGen for pointer-to-data-member types and values - Lower C++ pointer-to-member type - Lower C++ expression `&C::D` - Lower C++ expression `c.*p` and `c->*p` This patch only includes an initial support. The following stuff related to pointer-to-member types are not supported yet: - Pointer to member function; - Conversion from `T Base::*` to `T Derived::*`; - LLVMIR lowering.
As discussed in pull llvm#401 , The present `UnimplementedFeature` class is made for the CIRGen submodule and we need similar facilities for code under `clang/lib/CIR/Dialect/IR`. This NFC patch adds a new `CIRDialectUnimplementedFeature` class that provides unimplemented feature guards for CIR dialect code.
This patch adds initial support for the pointer-to-data-member type. Specifically, this commit includes: - New ops, types, and attributes: - CodeGen for pointer-to-data-member types and values - Lower C++ pointer-to-member type - Lower C++ expression `&C::D` - Lower C++ expression `c.*p` and `c->*p` This patch only includes an initial support. The following stuff related to pointer-to-member types are not supported yet: - Pointer to member function; - Conversion from `T Base::*` to `T Derived::*`; - LLVMIR lowering.
As discussed in pull llvm#401 , The present `UnimplementedFeature` class is made for the CIRGen submodule and we need similar facilities for code under `clang/lib/CIR/Dialect/IR`. This NFC patch adds a new `CIRDialectUnimplementedFeature` class that provides unimplemented feature guards for CIR dialect code.
) As suggested in llvm#401, this PR removes the `get_global_member` test in `CIR/IR/data-member-ptr.cir` as it is redundant. The original comment: llvm#401 (comment)
This patch adds initial support for the pointer-to-data-member type. Specifically, this commit includes: - New ops, types, and attributes: - CodeGen for pointer-to-data-member types and values - Lower C++ pointer-to-member type - Lower C++ expression `&C::D` - Lower C++ expression `c.*p` and `c->*p` This patch only includes an initial support. The following stuff related to pointer-to-member types are not supported yet: - Pointer to member function; - Conversion from `T Base::*` to `T Derived::*`; - LLVMIR lowering.
As discussed in pull llvm#401 , The present `UnimplementedFeature` class is made for the CIRGen submodule and we need similar facilities for code under `clang/lib/CIR/Dialect/IR`. This NFC patch adds a new `CIRDialectUnimplementedFeature` class that provides unimplemented feature guards for CIR dialect code.
) As suggested in llvm#401, this PR removes the `get_global_member` test in `CIR/IR/data-member-ptr.cir` as it is redundant. The original comment: llvm#401 (comment)
This patch adds initial support for the pointer-to-data-member type. Specifically, this commit includes: - New ops, types, and attributes: - CodeGen for pointer-to-data-member types and values - Lower C++ pointer-to-member type - Lower C++ expression `&C::D` - Lower C++ expression `c.*p` and `c->*p` This patch only includes an initial support. The following stuff related to pointer-to-member types are not supported yet: - Pointer to member function; - Conversion from `T Base::*` to `T Derived::*`; - LLVMIR lowering.
As discussed in pull #401 , The present `UnimplementedFeature` class is made for the CIRGen submodule and we need similar facilities for code under `clang/lib/CIR/Dialect/IR`. This NFC patch adds a new `CIRDialectUnimplementedFeature` class that provides unimplemented feature guards for CIR dialect code.
As suggested in #401, this PR removes the `get_global_member` test in `CIR/IR/data-member-ptr.cir` as it is redundant. The original comment: #401 (comment)
This patch adds initial support for the pointer-to-data-member type. Specifically, this commit includes:
!cir.member_ptr
#cir.data_member_ptr
cir.get_indirect_member
!cir.member_ptr
&C::D
tocir.const(#cir.data_member_ptr)
c.*p
andc->*p
tocir.get_indirect_member
This patch only includes an initial support. The following stuff related to pointer-to-member types are not supported yet:
T Base::*
toT Derived::*
;Hopefully they will eventually be supported in later PRs :)