Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CIR][CIRGen][Lowering] Add support for attribute annotate #804

Merged
merged 15 commits into from
Sep 13, 2024

Conversation

ghehg
Copy link
Collaborator

@ghehg ghehg commented Aug 23, 2024

The main purpose of this PR is to add support for C/C++ attribute annotate. The PR involves both CIR generation and Lowering Prepare. In the rest of this description, we first introduce the concept of attribute annotate, then talk about expectations of LLVM regarding annotation, after it, we describe how ClangIR handles it in this PR. Finally, we list trivial differences between LLVM code generated by clang codegen and ClangIR codegen.
The concept of attribute annotate. and expected LLVM IR
the following is C code example of annotation.
say in example.c
int *b __attribute__((annotate("withargs", "21", 12 ))); int *a __attribute__((annotate("oneargs", "21", ))); int *c __attribute__((annotate("noargs")));
here "withargs" is the annotation string, "21" and 12 are arguments for this annotation named "withargs". LLVM-based compiler is expected keep these information and build a global variable capturing all annotations used in the translation unit when emitting into LLVM IR. This global variable itself is not constant, but will be initialized with constants that are related to annotation representation, e.g. "withargs" should be literal string variable in IR.

This global variable has a fixed name "llvm.global.annotations", and its of array of struct type, and should be initialized with a const array of const structs, each const struct is a representation of an annotation site, which has 5-field.
[ptr to global var/func annotated, ptr to translation unit string const, line_no, annotation_name, ptr to arguments const]
annotation name string and args constants, as well as this global var should be in section "llvm.metadata".
e.g. In the above example,
We shall have following in the generated LLVM IR like the following

@b = global ptr null, align 8
@.str = private unnamed_addr constant [9 x i8] c"withargs\00", section "llvm.metadata"
@.str.1 = private unnamed_addr constant [10 x i8] c"example.c\00", section "llvm.metadata"
@.str.2 = private unnamed_addr constant [3 x i8] c"21\00", align 1
@.args = private unnamed_addr constant { ptr, i32 } { ptr @.str.2, i32 12 }, section "llvm.metadata"
@a = global ptr null, align 8
@.str.3 = private unnamed_addr constant [8 x i8] c"oneargs\00", section "llvm.metadata"
@.args.4 = private unnamed_addr constant { ptr } { ptr @.str.2 }, section "llvm.metadata"
@c = global ptr null, align 8
@.str.5 = private unnamed_addr constant [7 x i8] c"noargs\00", section "llvm.metadata"
@llvm.global.annotations = appending global [3 x { ptr, ptr, ptr, i32, ptr }] [{ ptr, ptr, ptr, i32, ptr } { ptr @b, ptr @.str, ptr @.str.1, i32 1, ptr @.args }, { ptr, ptr, ptr, i32, ptr } { ptr @a, ptr @.str.3, ptr @.str.1, i32 2, ptr @.args.4 }, { ptr, ptr, ptr, i32, ptr } { ptr @c, ptr @.str.5, ptr @.str.1, i32 3, ptr null }], section "llvm.metadata"

notice that since variable c's annotation has no arg, the last field of its corresponding annotation entry is a nullptr.

ClangIR's handling of annotations
In CIR, we introduce AnnotationAttr to GlobalOp and FuncOp to record its annotations. That way, we are able to make fast query about annotation if in future a CIR pass is interested in them.
We leave the work of generating const variables as well as global annotations' var to LLVM lowering. But at LoweringPrepare we collect all annotations and create a module attribute "cir.global_annotations" so to facilitate LLVM lowering.

Some implementation details and trivial differences between clangir generated LLVM code and vanilla LLVM code

  1. I suffix names of constants generated for annotation purpose with ".annotation" to avoid redefinition, but clang codegen doesn't do it.
  2. clang codegen seems to visit FuncDecls in slightly different orders than CIR, thus, sometimes the order of elements of the initial value const array for llvm.global.annotations var is different from clang generated LLVMIR, it should be trivial, as I don't expect consumer of this var is assuming a fixed order of collecting annotations.

Otherwise, clang codegen and clangir pretty much generate same LLVM IR for annotations!

Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most comments towards the need to prevent lowering too early, a lot of it is LLVM boilerplate we don't need in CIR because we are not looking deeper into the information. Therefore, I believe it's possible to simplify some of the logic.

I added comments in light of the design we discussed before.

clang/include/clang/CIR/Dialect/IR/CIRAttrs.td Outdated Show resolved Hide resolved
clang/include/clang/CIR/Dialect/IR/CIRAttrs.td Outdated Show resolved Hide resolved
clang/include/clang/CIR/Dialect/IR/CIRAttrs.td Outdated Show resolved Hide resolved
clang/include/clang/CIR/Dialect/IR/CIROps.td Outdated Show resolved Hide resolved
clang/include/clang/CIR/Dialect/IR/CIROps.td Outdated Show resolved Hide resolved
clang/lib/CIR/CodeGen/CIRGenModule.cpp Outdated Show resolved Hide resolved
clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp Outdated Show resolved Hide resolved
clang/test/CIR/CodeGen/attribute-annotate-multiple.cpp Outdated Show resolved Hide resolved
clang/test/CIR/CodeGen/attribute-annotate-multiple.cpp Outdated Show resolved Hide resolved
Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more round of review.

clang/include/clang/CIR/Dialect/IR/CIRAttrs.td Outdated Show resolved Hide resolved
clang/include/clang/CIR/Dialect/IR/CIRAttrs.td Outdated Show resolved Hide resolved
clang/include/clang/CIR/Dialect/IR/CIRAttrs.td Outdated Show resolved Hide resolved
clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp Outdated Show resolved Hide resolved
clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp Outdated Show resolved Hide resolved
clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp Outdated Show resolved Hide resolved
clang/lib/CIR/Dialect/Transforms/LoweringPrepare.cpp Outdated Show resolved Hide resolved
clang/include/clang/CIR/Dialect/IR/CIRAttrs.td Outdated Show resolved Hide resolved
@ghehg ghehg force-pushed the ftw3 branch 2 times, most recently from 5ea324a to 27fdf40 Compare September 6, 2024 01:20
Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for breaking LowerToLLVM code a bit more and updating comments, looks much nicer and tide. Still a lot of changes to get done (which were not addressed from previous review rounds). Thanks!

clang/lib/CIR/CodeGen/CIRGenModule.cpp Outdated Show resolved Hide resolved
clang/lib/CIR/CodeGen/CIRGenModule.cpp Outdated Show resolved Hide resolved
clang/lib/CIR/CodeGen/CIRGenModule.cpp Outdated Show resolved Hide resolved
clang/lib/CIR/CodeGen/CIRGenModule.h Outdated Show resolved Hide resolved
clang/lib/CIR/CodeGen/CIRGenModule.h Outdated Show resolved Hide resolved
clang/include/clang/CIR/Dialect/IR/CIRAttrs.td Outdated Show resolved Hide resolved
clang/test/CIR/CodeGen/attribute-annotate-multiple.cpp Outdated Show resolved Hide resolved
clang/test/CIR/IR/annotations.cir Show resolved Hide resolved
clang/test/CIR/IR/annotations.cir Show resolved Hide resolved
clang/test/CIR/CodeGen/attribute-annotate-multiple.cpp Outdated Show resolved Hide resolved
Copy link
Member

@bcardosolopes bcardosolopes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one minor change for the attribute name + 80-col violation (discussed in pvt)

@bcardosolopes bcardosolopes merged commit d2d8b45 into llvm:main Sep 13, 2024
6 checks passed
Copy link
Collaborator

@keryell keryell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need some feedback.

}
llvm::FoldingSetNodeID id;
for (Expr *e : exprs) {
id.Add(cast<clang::ConstantExpr>(e)->getAPValueResult());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While debugging some core dump in this function, I wonder what happens here when e is not a clang::ConstantExpr?
@ghehg an idea about the missing comments here? 😄
No need for some IgnoreParenCasts() here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still diving into it. Please ignore my comments since I got the inheritance wrong.

intE->getValue().getBitWidth()),
intE->getValue()));
} else {
llvm_unreachable("NYI");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It crashes on my use-case when e is a ConstantExpr which is not handled. I can handle it with a PR but I need to understand the neighborhood first.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Actually everything is a ConstantExpr here. Just that I have an enum which is not lowered to its underlying type.
Thinking about it...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for looking into this! As you might notice that the logic is actually from clang codegen.

llvm::Constant *CodeGenModule::EmitAnnotationArgs(const AnnotateAttr *Attr) {

I think the assumption is that as argument to annotation, everything should be ConstantExpr, so I guess we just trust clang front end wouldn't shoot its own foot by generating other types of AST expr for annotation arg values.

But you could find an example that breaks the assumption, what I'd do is to use clang code gen (removing -fclangir from opt list), and if you can reproduce it. That's an issue we should open at upstream, and of course, we could fix it there to be ahead of upstream :-)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "normal" clang is able to evaluate the ConstantExp beyond just literals.
I have made #874 for integers.

Hugobros3 pushed a commit to shady-gang/clangir that referenced this pull request Oct 2, 2024
The main purpose of this PR is to add support for C/C++ attribute
annotate. The PR involves both CIR generation and Lowering Prepare. In
the rest of this description, we first introduce the concept of
attribute annotate, then talk about expectations of LLVM regarding
annotation, after it, we describe how ClangIR handles it in this PR.
Finally, we list trivial differences between LLVM code generated by
clang codegen and ClangIR codegen.
**The concept of attribute annotate. and expected LLVM IR**
the following is C code example of annotation.
say in example.c
`int *b __attribute__((annotate("withargs", "21", 12 )));
int *a __attribute__((annotate("oneargs", "21", )));
int *c __attribute__((annotate("noargs")));
`
here "withargs" is the annotation string, "21" and 12 are arguments for
this annotation named "withargs". LLVM-based compiler is expected keep
these information and build a global variable capturing all annotations
used in the translation unit when emitting into LLVM IR. This global
variable itself is **not** constant, but will be initialized with
constants that are related to annotation representation, e.g. "withargs"
should be literal string variable in IR.

This global variable has a fixed name "llvm.global.annotations", and its
of array of struct type, and should be initialized with a const array of
const structs, each const struct is a representation of an annotation
site, which has 5-field.
[ptr to global var/func annotated, ptr to translation unit string const,
line_no, annotation_name, ptr to arguments const]
annotation name string and args constants, as well as this global var
should be in section "llvm.metadata".
e.g. In the above example, 
We shall have following in the generated LLVM IR like the following 
```
@b = global ptr null, align 8
@.str = private unnamed_addr constant [9 x i8] c"withargs\00", section "llvm.metadata"
@.str.1 = private unnamed_addr constant [10 x i8] c"example.c\00", section "llvm.metadata"
@.str.2 = private unnamed_addr constant [3 x i8] c"21\00", align 1
@.args = private unnamed_addr constant { ptr, i32 } { ptr @.str.2, i32 12 }, section "llvm.metadata"
@A = global ptr null, align 8
@.str.3 = private unnamed_addr constant [8 x i8] c"oneargs\00", section "llvm.metadata"
@.args.4 = private unnamed_addr constant { ptr } { ptr @.str.2 }, section "llvm.metadata"
@c = global ptr null, align 8
@.str.5 = private unnamed_addr constant [7 x i8] c"noargs\00", section "llvm.metadata"
@llvm.global.annotations = appending global [3 x { ptr, ptr, ptr, i32, ptr }] [{ ptr, ptr, ptr, i32, ptr } { ptr @b, ptr @.str, ptr @.str.1, i32 1, ptr @.args }, { ptr, ptr, ptr, i32, ptr } { ptr @A, ptr @.str.3, ptr @.str.1, i32 2, ptr @.args.4 }, { ptr, ptr, ptr, i32, ptr } { ptr @c, ptr @.str.5, ptr @.str.1, i32 3, ptr null }], section "llvm.metadata"
```
notice that since variable c's annotation has no arg, the last field of
its corresponding annotation entry is a nullptr.

**ClangIR's handling of annotations**
In CIR, we introduce AnnotationAttr to GlobalOp and FuncOp to record its
annotations. That way, we are able to make fast query about annotation
if in future a CIR pass is interested in them.
We leave the work of generating const variables as well as global
annotations' var to LLVM lowering. But at LoweringPrepare we collect all
annotations and create a module attribute "cir.global_annotations" so to
facilitate LLVM lowering.

**Some implementation details and trivial differences between clangir
generated LLVM code and vanilla LLVM code**
1. I suffix names of constants generated for annotation purpose with
".annotation" to avoid redefinition, but clang codegen doesn't do it.
3. clang codegen seems to visit FuncDecls in slightly different orders
than CIR, thus, sometimes the order of elements of the initial value
const array for llvm.global.annotations var is different from clang
generated LLVMIR, it should be trivial, as I don't expect consumer of
this var is assuming a fixed order of collecting annotations.

Otherwise, clang codegen and clangir pretty much generate same LLVM IR
for annotations!
smeenai pushed a commit to smeenai/clangir that referenced this pull request Oct 9, 2024
The main purpose of this PR is to add support for C/C++ attribute
annotate. The PR involves both CIR generation and Lowering Prepare. In
the rest of this description, we first introduce the concept of
attribute annotate, then talk about expectations of LLVM regarding
annotation, after it, we describe how ClangIR handles it in this PR.
Finally, we list trivial differences between LLVM code generated by
clang codegen and ClangIR codegen.
**The concept of attribute annotate. and expected LLVM IR**
the following is C code example of annotation.
say in example.c
`int *b __attribute__((annotate("withargs", "21", 12 )));
int *a __attribute__((annotate("oneargs", "21", )));
int *c __attribute__((annotate("noargs")));
`
here "withargs" is the annotation string, "21" and 12 are arguments for
this annotation named "withargs". LLVM-based compiler is expected keep
these information and build a global variable capturing all annotations
used in the translation unit when emitting into LLVM IR. This global
variable itself is **not** constant, but will be initialized with
constants that are related to annotation representation, e.g. "withargs"
should be literal string variable in IR.

This global variable has a fixed name "llvm.global.annotations", and its
of array of struct type, and should be initialized with a const array of
const structs, each const struct is a representation of an annotation
site, which has 5-field.
[ptr to global var/func annotated, ptr to translation unit string const,
line_no, annotation_name, ptr to arguments const]
annotation name string and args constants, as well as this global var
should be in section "llvm.metadata".
e.g. In the above example, 
We shall have following in the generated LLVM IR like the following 
```
@b = global ptr null, align 8
@.str = private unnamed_addr constant [9 x i8] c"withargs\00", section "llvm.metadata"
@.str.1 = private unnamed_addr constant [10 x i8] c"example.c\00", section "llvm.metadata"
@.str.2 = private unnamed_addr constant [3 x i8] c"21\00", align 1
@.args = private unnamed_addr constant { ptr, i32 } { ptr @.str.2, i32 12 }, section "llvm.metadata"
@A = global ptr null, align 8
@.str.3 = private unnamed_addr constant [8 x i8] c"oneargs\00", section "llvm.metadata"
@.args.4 = private unnamed_addr constant { ptr } { ptr @.str.2 }, section "llvm.metadata"
@c = global ptr null, align 8
@.str.5 = private unnamed_addr constant [7 x i8] c"noargs\00", section "llvm.metadata"
@llvm.global.annotations = appending global [3 x { ptr, ptr, ptr, i32, ptr }] [{ ptr, ptr, ptr, i32, ptr } { ptr @b, ptr @.str, ptr @.str.1, i32 1, ptr @.args }, { ptr, ptr, ptr, i32, ptr } { ptr @A, ptr @.str.3, ptr @.str.1, i32 2, ptr @.args.4 }, { ptr, ptr, ptr, i32, ptr } { ptr @c, ptr @.str.5, ptr @.str.1, i32 3, ptr null }], section "llvm.metadata"
```
notice that since variable c's annotation has no arg, the last field of
its corresponding annotation entry is a nullptr.

**ClangIR's handling of annotations**
In CIR, we introduce AnnotationAttr to GlobalOp and FuncOp to record its
annotations. That way, we are able to make fast query about annotation
if in future a CIR pass is interested in them.
We leave the work of generating const variables as well as global
annotations' var to LLVM lowering. But at LoweringPrepare we collect all
annotations and create a module attribute "cir.global_annotations" so to
facilitate LLVM lowering.

**Some implementation details and trivial differences between clangir
generated LLVM code and vanilla LLVM code**
1. I suffix names of constants generated for annotation purpose with
".annotation" to avoid redefinition, but clang codegen doesn't do it.
3. clang codegen seems to visit FuncDecls in slightly different orders
than CIR, thus, sometimes the order of elements of the initial value
const array for llvm.global.annotations var is different from clang
generated LLVMIR, it should be trivial, as I don't expect consumer of
this var is assuming a fixed order of collecting annotations.

Otherwise, clang codegen and clangir pretty much generate same LLVM IR
for annotations!
smeenai pushed a commit to smeenai/clangir that referenced this pull request Oct 9, 2024
The main purpose of this PR is to add support for C/C++ attribute
annotate. The PR involves both CIR generation and Lowering Prepare. In
the rest of this description, we first introduce the concept of
attribute annotate, then talk about expectations of LLVM regarding
annotation, after it, we describe how ClangIR handles it in this PR.
Finally, we list trivial differences between LLVM code generated by
clang codegen and ClangIR codegen.
**The concept of attribute annotate. and expected LLVM IR**
the following is C code example of annotation.
say in example.c
`int *b __attribute__((annotate("withargs", "21", 12 )));
int *a __attribute__((annotate("oneargs", "21", )));
int *c __attribute__((annotate("noargs")));
`
here "withargs" is the annotation string, "21" and 12 are arguments for
this annotation named "withargs". LLVM-based compiler is expected keep
these information and build a global variable capturing all annotations
used in the translation unit when emitting into LLVM IR. This global
variable itself is **not** constant, but will be initialized with
constants that are related to annotation representation, e.g. "withargs"
should be literal string variable in IR.

This global variable has a fixed name "llvm.global.annotations", and its
of array of struct type, and should be initialized with a const array of
const structs, each const struct is a representation of an annotation
site, which has 5-field.
[ptr to global var/func annotated, ptr to translation unit string const,
line_no, annotation_name, ptr to arguments const]
annotation name string and args constants, as well as this global var
should be in section "llvm.metadata".
e.g. In the above example, 
We shall have following in the generated LLVM IR like the following 
```
@b = global ptr null, align 8
@.str = private unnamed_addr constant [9 x i8] c"withargs\00", section "llvm.metadata"
@.str.1 = private unnamed_addr constant [10 x i8] c"example.c\00", section "llvm.metadata"
@.str.2 = private unnamed_addr constant [3 x i8] c"21\00", align 1
@.args = private unnamed_addr constant { ptr, i32 } { ptr @.str.2, i32 12 }, section "llvm.metadata"
@A = global ptr null, align 8
@.str.3 = private unnamed_addr constant [8 x i8] c"oneargs\00", section "llvm.metadata"
@.args.4 = private unnamed_addr constant { ptr } { ptr @.str.2 }, section "llvm.metadata"
@c = global ptr null, align 8
@.str.5 = private unnamed_addr constant [7 x i8] c"noargs\00", section "llvm.metadata"
@llvm.global.annotations = appending global [3 x { ptr, ptr, ptr, i32, ptr }] [{ ptr, ptr, ptr, i32, ptr } { ptr @b, ptr @.str, ptr @.str.1, i32 1, ptr @.args }, { ptr, ptr, ptr, i32, ptr } { ptr @A, ptr @.str.3, ptr @.str.1, i32 2, ptr @.args.4 }, { ptr, ptr, ptr, i32, ptr } { ptr @c, ptr @.str.5, ptr @.str.1, i32 3, ptr null }], section "llvm.metadata"
```
notice that since variable c's annotation has no arg, the last field of
its corresponding annotation entry is a nullptr.

**ClangIR's handling of annotations**
In CIR, we introduce AnnotationAttr to GlobalOp and FuncOp to record its
annotations. That way, we are able to make fast query about annotation
if in future a CIR pass is interested in them.
We leave the work of generating const variables as well as global
annotations' var to LLVM lowering. But at LoweringPrepare we collect all
annotations and create a module attribute "cir.global_annotations" so to
facilitate LLVM lowering.

**Some implementation details and trivial differences between clangir
generated LLVM code and vanilla LLVM code**
1. I suffix names of constants generated for annotation purpose with
".annotation" to avoid redefinition, but clang codegen doesn't do it.
3. clang codegen seems to visit FuncDecls in slightly different orders
than CIR, thus, sometimes the order of elements of the initial value
const array for llvm.global.annotations var is different from clang
generated LLVMIR, it should be trivial, as I don't expect consumer of
this var is assuming a fixed order of collecting annotations.

Otherwise, clang codegen and clangir pretty much generate same LLVM IR
for annotations!
keryell pushed a commit to keryell/clangir that referenced this pull request Oct 19, 2024
The main purpose of this PR is to add support for C/C++ attribute
annotate. The PR involves both CIR generation and Lowering Prepare. In
the rest of this description, we first introduce the concept of
attribute annotate, then talk about expectations of LLVM regarding
annotation, after it, we describe how ClangIR handles it in this PR.
Finally, we list trivial differences between LLVM code generated by
clang codegen and ClangIR codegen.
**The concept of attribute annotate. and expected LLVM IR**
the following is C code example of annotation.
say in example.c
`int *b __attribute__((annotate("withargs", "21", 12 )));
int *a __attribute__((annotate("oneargs", "21", )));
int *c __attribute__((annotate("noargs")));
`
here "withargs" is the annotation string, "21" and 12 are arguments for
this annotation named "withargs". LLVM-based compiler is expected keep
these information and build a global variable capturing all annotations
used in the translation unit when emitting into LLVM IR. This global
variable itself is **not** constant, but will be initialized with
constants that are related to annotation representation, e.g. "withargs"
should be literal string variable in IR.

This global variable has a fixed name "llvm.global.annotations", and its
of array of struct type, and should be initialized with a const array of
const structs, each const struct is a representation of an annotation
site, which has 5-field.
[ptr to global var/func annotated, ptr to translation unit string const,
line_no, annotation_name, ptr to arguments const]
annotation name string and args constants, as well as this global var
should be in section "llvm.metadata".
e.g. In the above example, 
We shall have following in the generated LLVM IR like the following 
```
@b = global ptr null, align 8
@.str = private unnamed_addr constant [9 x i8] c"withargs\00", section "llvm.metadata"
@.str.1 = private unnamed_addr constant [10 x i8] c"example.c\00", section "llvm.metadata"
@.str.2 = private unnamed_addr constant [3 x i8] c"21\00", align 1
@.args = private unnamed_addr constant { ptr, i32 } { ptr @.str.2, i32 12 }, section "llvm.metadata"
@A = global ptr null, align 8
@.str.3 = private unnamed_addr constant [8 x i8] c"oneargs\00", section "llvm.metadata"
@.args.4 = private unnamed_addr constant { ptr } { ptr @.str.2 }, section "llvm.metadata"
@c = global ptr null, align 8
@.str.5 = private unnamed_addr constant [7 x i8] c"noargs\00", section "llvm.metadata"
@llvm.global.annotations = appending global [3 x { ptr, ptr, ptr, i32, ptr }] [{ ptr, ptr, ptr, i32, ptr } { ptr @b, ptr @.str, ptr @.str.1, i32 1, ptr @.args }, { ptr, ptr, ptr, i32, ptr } { ptr @A, ptr @.str.3, ptr @.str.1, i32 2, ptr @.args.4 }, { ptr, ptr, ptr, i32, ptr } { ptr @c, ptr @.str.5, ptr @.str.1, i32 3, ptr null }], section "llvm.metadata"
```
notice that since variable c's annotation has no arg, the last field of
its corresponding annotation entry is a nullptr.

**ClangIR's handling of annotations**
In CIR, we introduce AnnotationAttr to GlobalOp and FuncOp to record its
annotations. That way, we are able to make fast query about annotation
if in future a CIR pass is interested in them.
We leave the work of generating const variables as well as global
annotations' var to LLVM lowering. But at LoweringPrepare we collect all
annotations and create a module attribute "cir.global_annotations" so to
facilitate LLVM lowering.

**Some implementation details and trivial differences between clangir
generated LLVM code and vanilla LLVM code**
1. I suffix names of constants generated for annotation purpose with
".annotation" to avoid redefinition, but clang codegen doesn't do it.
3. clang codegen seems to visit FuncDecls in slightly different orders
than CIR, thus, sometimes the order of elements of the initial value
const array for llvm.global.annotations var is different from clang
generated LLVMIR, it should be trivial, as I don't expect consumer of
this var is assuming a fixed order of collecting annotations.

Otherwise, clang codegen and clangir pretty much generate same LLVM IR
for annotations!
lanza pushed a commit that referenced this pull request Nov 5, 2024
The main purpose of this PR is to add support for C/C++ attribute
annotate. The PR involves both CIR generation and Lowering Prepare. In
the rest of this description, we first introduce the concept of
attribute annotate, then talk about expectations of LLVM regarding
annotation, after it, we describe how ClangIR handles it in this PR.
Finally, we list trivial differences between LLVM code generated by
clang codegen and ClangIR codegen.
**The concept of attribute annotate. and expected LLVM IR**
the following is C code example of annotation.
say in example.c
`int *b __attribute__((annotate("withargs", "21", 12 )));
int *a __attribute__((annotate("oneargs", "21", )));
int *c __attribute__((annotate("noargs")));
`
here "withargs" is the annotation string, "21" and 12 are arguments for
this annotation named "withargs". LLVM-based compiler is expected keep
these information and build a global variable capturing all annotations
used in the translation unit when emitting into LLVM IR. This global
variable itself is **not** constant, but will be initialized with
constants that are related to annotation representation, e.g. "withargs"
should be literal string variable in IR.

This global variable has a fixed name "llvm.global.annotations", and its
of array of struct type, and should be initialized with a const array of
const structs, each const struct is a representation of an annotation
site, which has 5-field.
[ptr to global var/func annotated, ptr to translation unit string const,
line_no, annotation_name, ptr to arguments const]
annotation name string and args constants, as well as this global var
should be in section "llvm.metadata".
e.g. In the above example, 
We shall have following in the generated LLVM IR like the following 
```
@b = global ptr null, align 8
@.str = private unnamed_addr constant [9 x i8] c"withargs\00", section "llvm.metadata"
@.str.1 = private unnamed_addr constant [10 x i8] c"example.c\00", section "llvm.metadata"
@.str.2 = private unnamed_addr constant [3 x i8] c"21\00", align 1
@.args = private unnamed_addr constant { ptr, i32 } { ptr @.str.2, i32 12 }, section "llvm.metadata"
@A = global ptr null, align 8
@.str.3 = private unnamed_addr constant [8 x i8] c"oneargs\00", section "llvm.metadata"
@.args.4 = private unnamed_addr constant { ptr } { ptr @.str.2 }, section "llvm.metadata"
@c = global ptr null, align 8
@.str.5 = private unnamed_addr constant [7 x i8] c"noargs\00", section "llvm.metadata"
@llvm.global.annotations = appending global [3 x { ptr, ptr, ptr, i32, ptr }] [{ ptr, ptr, ptr, i32, ptr } { ptr @b, ptr @.str, ptr @.str.1, i32 1, ptr @.args }, { ptr, ptr, ptr, i32, ptr } { ptr @A, ptr @.str.3, ptr @.str.1, i32 2, ptr @.args.4 }, { ptr, ptr, ptr, i32, ptr } { ptr @c, ptr @.str.5, ptr @.str.1, i32 3, ptr null }], section "llvm.metadata"
```
notice that since variable c's annotation has no arg, the last field of
its corresponding annotation entry is a nullptr.

**ClangIR's handling of annotations**
In CIR, we introduce AnnotationAttr to GlobalOp and FuncOp to record its
annotations. That way, we are able to make fast query about annotation
if in future a CIR pass is interested in them.
We leave the work of generating const variables as well as global
annotations' var to LLVM lowering. But at LoweringPrepare we collect all
annotations and create a module attribute "cir.global_annotations" so to
facilitate LLVM lowering.

**Some implementation details and trivial differences between clangir
generated LLVM code and vanilla LLVM code**
1. I suffix names of constants generated for annotation purpose with
".annotation" to avoid redefinition, but clang codegen doesn't do it.
3. clang codegen seems to visit FuncDecls in slightly different orders
than CIR, thus, sometimes the order of elements of the initial value
const array for llvm.global.annotations var is different from clang
generated LLVMIR, it should be trivial, as I don't expect consumer of
this var is assuming a fixed order of collecting annotations.

Otherwise, clang codegen and clangir pretty much generate same LLVM IR
for annotations!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants