Commit c394ef0

ChuanqiXu9 authored and lanza committed
[ClangIR][CIRGen] Introduce CaseOp and refactor SwitchOp (#1006)
Close #522

This fixes two problems: we could not handle a `case` in a nested scope, and we could not handle a switch whose body is not a compound statement.

The core idea of the patch is to introduce the `cir.case` operation to the language. We can then get the cases by traversing the body of the `cir.switch` operation instead of counting the regions and the attributes. Every `cir.case` operation has a region, and `cir.switch` now has only one region too.

To make analysis and optimizations easier, I add a new concept, the `simple form`. A `cir.switch` operation is in simple form when all the `cir.case` operations owned by the `cir.switch` live in the top-level blocks of the `cir.switch` region and there are no other operations except the ending `cir.yield`.

This resolves the previous `simplified for common-case` vs `general solution` discussion in #522. After implementing this, I feel the correct answer is that we want a general solution for constructing and lowering the operations, but we prefer the simple and common case for analysis and optimizations. We had just mixed up the different phases.

For other semantics, see `CIROps.td`.

For lowering, we can handle the general case by lowering the cases one by one and finally lowering the switch itself.

Although this patch has 1000+ lines of changes, I feel it is relatively neat, especially since it removes some odd earlier behaviors.

Tested with Spec2017's C benchmarks except 500.perlbench_r.
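The simple-form rule described in the commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `ToyOp` is a hypothetical stand-in for an operation (a name plus nested operations), not the MLIR `Operation` class, and the two free functions below only mirror the names of the `collectCases` and `isSimpleForm` members declared in the patch; the real implementations are not shown in this diff and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an operation: a name plus the operations nested
// in its single region. This is NOT the MLIR Operation class.
struct ToyOp {
  std::string name;         // e.g. "cir.case", "cir.scope", "cir.yield"
  std::vector<ToyOp> body;  // nested operations, in order
};

// Collect every "cir.case" owned by a switch whose top-level ops are `ops`.
// Assumption: cases under a nested "cir.switch" belong to that inner switch,
// so the walk does not descend into it.
void collectCases(const std::vector<ToyOp> &ops,
                  std::vector<const ToyOp *> &cases) {
  for (const ToyOp &op : ops) {
    if (op.name == "cir.switch")
      continue;
    if (op.name == "cir.case")
      cases.push_back(&op);
    collectCases(op.body, cases);
  }
}

// A toy switch body is in simple form when its top-level ops are all
// "cir.case" followed by one final "cir.yield", and no owned case is nested
// deeper than the top level.
bool isSimpleForm(const std::vector<ToyOp> &switchBody) {
  if (switchBody.empty() || switchBody.back().name != "cir.yield")
    return false;
  std::size_t topLevelCases = 0;
  for (std::size_t i = 0; i + 1 < switchBody.size(); ++i) {
    if (switchBody[i].name != "cir.case")
      return false;
    ++topLevelCases;
  }
  std::vector<const ToyOp *> cases;
  collectCases(switchBody, cases);
  return cases.size() == topLevelCases;
}
```

In this model, a body that starts with a label (like the `l:` example in the `SwitchOp` description) or that hides a case inside a nested scope is rejected, which is the "boundary to give up" an analysis would rely on.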
1 parent 22cb90f commit c394ef0

19 files changed, +1124 -908 lines changed

clang/include/clang/CIR/Dialect/IR/CIROps.td

+163 -47
@@ -665,7 +665,7 @@ def StoreOp : CIR_Op<"store", [

 def ReturnOp : CIR_Op<"return", [ParentOneOf<["FuncOp", "ScopeOp", "IfOp",
                                               "SwitchOp", "DoWhileOp",
-                                              "WhileOp", "ForOp"]>,
+                                              "WhileOp", "ForOp", "CaseOp"]>,
                                  Terminator]> {
   let summary = "Return from function";
   let description = [{
@@ -900,7 +900,7 @@ def ConditionOp : CIR_Op<"condition", [
 def YieldOp : CIR_Op<"yield", [ReturnLike, Terminator,
     ParentOneOf<["IfOp", "ScopeOp", "SwitchOp", "WhileOp", "ForOp", "AwaitOp",
                  "TernaryOp", "GlobalOp", "DoWhileOp", "TryOp", "ArrayCtor",
-                 "ArrayDtor", "CallOp"]>]> {
+                 "ArrayDtor", "CallOp", "CaseOp"]>]> {
   let summary = "Represents the default branching behaviour of a region";
   let description = [{
     The `cir.yield` operation terminates regions on different CIR operations,
@@ -1819,22 +1819,38 @@ def CaseOpKind : I32EnumAttr<
   let cppNamespace = "::mlir::cir";
 }

-def CaseEltValueListAttr :
-  TypedArrayAttrBase<AnyAttr, "cir.switch case value condition"> {
-  let constBuilderCall = ?;
-}
+def CaseOp : CIR_Op<"case", [
+       DeclareOpInterfaceMethods<RegionBranchOpInterface>,
+       RecursivelySpeculatable, AutomaticAllocationScope]> {
+  let summary = "Case operation";
+  let description = [{
+    The `cir.case` operation represents a case within a C/C++ switch.
+    The `cir.case` operation must be in a `cir.switch` operation directly or indirectly.

-def CaseAttr : AttrDef<CIR_Dialect, "Case"> {
-  // FIXME: value should probably be optional for more clear "default"
-  // representation.
-  let parameters = (ins "ArrayAttr":$value, "CaseOpKindAttr":$kind);
-  let mnemonic = "case";
-  let assemblyFormat = "`<` struct(params) `>`";
-}
+    The `cir.case` have 4 kinds:
+    - `equal, <constant>`: equality of the second case operand against the
+    condition.
+    - `anyof, [constant-list]`: equals to any of the values in a subsequent
+    following list.
+    - `range, [lower-bound, upper-bound]`: the condition is within the closed interval.
+    - `default`: any other value.
+
+    Each case region must be explicitly terminated.
+  }];
+
+  let arguments = (ins ArrayAttr:$value, CaseOpKind:$kind);
+  let regions = (region AnyRegion:$caseRegion);
+
+  let assemblyFormat = "`(` $kind `,` $value `)` $caseRegion attr-dict";
+
+  let hasVerifier = 1;

-def CaseArrayAttr :
-  TypedArrayAttrBase<CaseAttr, "cir.switch case array attribute"> {
-  let constBuilderCall = ?;
+  let skipDefaultBuilders = 1;
+  let builders = [
+    OpBuilder<(ins "ArrayAttr":$value,
+               "CaseOpKind":$kind,
+               "OpBuilder::InsertPoint &":$insertPoint)>
+  ];
 }

 def SwitchOp : CIR_Op<"switch",
@@ -1847,45 +1863,136 @@ def SwitchOp : CIR_Op<"switch",
     conditionally executing multiple regions of code. The operand to an switch
     is an integral condition value.

-    A variadic list of "case" attribute operands and regions track the possible
-    control flow within `cir.switch`. A `case` must be in one of the following forms:
-    - `equal, <constant>`: equality of the second case operand against the
-    condition.
-    - `anyof, [constant-list]`: equals to any of the values in a subsequent
-    following list.
-    - `range, [lower-bound, upper-bound]`: the condition is within the closed interval.
-    - `default`: any other value.
+    The set of `cir.case` operations and their enclosing `cir.switch`
+    represents the semantics of a C/C++ switch statement. Users can use
+    `collectCases(llvm::SmallVector<CaseOp> &cases)` to collect the `cir.case`
+    operation in the `cir.switch` operation easily.

-    Each case region must be explicitly terminated.
+    The `cir.case` operations doesn't have to be in the region of `cir.switch`
+    directly. However, when all the `cir.case` operations lives in the region
+    of `cir.switch` directly and there is no other operations except the ending
+    `cir.yield` operation in the region of `cir.switch` directly, we call the
+    `cir.switch` operation is in a simple form. Users can use
+    `bool isSimpleForm(llvm::SmallVector<CaseOp> &cases)` member function to
+    detect if the `cir.switch` operation is in a simple form. The simple form
+    makes analysis easier to handle the `cir.switch` operation
+    and makes the boundary to give up pretty clear.

-    Examples:
+    To make the simple form as common as possible, CIR code generation attaches
+    operations corresponding to the statements that lives between top level
+    cases into the closest `cir.case` operation.

-    ```mlir
-    cir.switch (%b : i32) [
-      case (equal, 20) {
-        ...
-        cir.yield break
-      },
-      case (anyof, [1, 2, 3] : i32) {
-        ...
-        cir.return ...
+    For example,
+
+    ```
+    switch(int cond) {
+      case 4:
+        a++;
+
+      b++;
+      case 5;
+        c++;
+
+      ...
+    }
+    ```
+
+    The statement `b++` is not a sub-statement of the case statement `case 4`.
+    But to make the generated `cir.switch` a simple form, we will attach the
+    statement `b++` into the closest `cir.case` operation. So that the generated
+    code will be like:
+
+    ```
+    cir.switch(int cond) {
+      cir.case(equal, 4) {
+        a++;
+        b++;
+        cir.yield
       }
-      case (range, [10, 15]) {
-        ...
-        cir.yield break
-      },
-      case (default) {
-        ...
-        cir.yield fallthrough
+      cir.case(equal, 5) {
+        c++;
+        cir.yield
       }
-    ]
+      ...
+    }
+    ```
+
+    For the same reason, we will hoist the case statement as the substatement
+    of another case statement so that they will be in the same level. For
+    example,
+
+    ```
+    switch(int cond) {
+      case 4:
+      default;
+      case 5;
+        a++;
+      ...
+    }
+    ```
+
+    will be generated as
+
+    ```
+    cir.switch(int cond) {
+      cir.case(equal, 4) {
+        cir.yield
+      }
+      cir.case(default) {
+        cir.yield
+      }
+      cir.case(equal, 5) {
+        a++;
+        cir.yield
+      }
+      ...
+    }
+    ```
+
+    The cir.switch might not be considered "simple" if any of the following is
+    true:
+    - There are case statements of the switch statement lives in other scopes
+      other than the top level compound statement scope. Note that a case
+      statement itself doesn't form a scope.
+    - The sub-statement of the switch statement is not a compound statement.
+    - There are codes before the first case statement. For example,
+
+    ```
+    switch(int cond) {
+      l:
+        b++;
+
+      case 4:
+        a++;
+        break;
+
+      case 5:
+        goto l;
+      ...
+    }
+    ```
+
+    the generated CIR for this non-simple switch would be:
+
+    ```
+    cir.switch(int cond) {
+      cir.label "l"
+      b++;
+      cir.case(4) {
+        a++;
+        cir.break
+      }
+      cir.case(5) {
+        goto "l"
+      }
+      cir.yield
+    }
     ```
   }];

-  let arguments = (ins CIR_IntType:$condition,
-                   OptionalAttr<CaseArrayAttr>:$cases);
+  let arguments = (ins CIR_IntType:$condition);

-  let regions = (region VariadicRegion<AnyRegion>:$regions);
+  let regions = (region AnyRegion:$body);

   let hasVerifier = 1;

@@ -1897,10 +2004,19 @@ def SwitchOp : CIR_Op<"switch",

   let assemblyFormat = [{
     custom<SwitchOp>(
-      $regions, $cases, $condition, type($condition)
+      $body, $condition, type($condition)
     )
     attr-dict
   }];
+
+  let extraClassDeclaration = [{
+    // Collect cases in the switch.
+    void collectCases(llvm::SmallVector<CaseOp> &cases);
+
+    // Check if the switch is in a simple form. If yes, collect the cases to \param cases.
+    // This is an expensive and need to be used with caution.
+    bool isSimpleForm(llvm::SmallVector<CaseOp> &cases);
+  }];
 }

 //===----------------------------------------------------------------------===//
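The four `cir.case` kinds listed in the `CaseOp` description (`equal`, `anyof`, `range`, `default`) can be read as a matching rule against the switch condition. The sketch below is one possible reading in plain C++, not code from the patch: `ToyCaseKind`, `ToyCase`, `matches`, and `selectCase` are hypothetical names, and treating `default` as "chosen only when no other case matches" is an assumption based on ordinary C/C++ switch semantics.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mirror of the four case kinds named in the description.
enum class ToyCaseKind { Equal, Anyof, Range, Default };

// Hypothetical case record: a kind plus its constant values.
// Equal uses one value, Anyof a list, Range {lower, upper}, Default none.
struct ToyCase {
  ToyCaseKind kind;
  std::vector<std::int64_t> values;
};

// Does a single non-default case accept the condition value?
bool matches(const ToyCase &c, std::int64_t cond) {
  switch (c.kind) {
  case ToyCaseKind::Equal:
    return c.values.size() == 1 && c.values[0] == cond;
  case ToyCaseKind::Anyof:
    for (std::int64_t v : c.values)
      if (v == cond)
        return true;
    return false;
  case ToyCaseKind::Range:
    // Closed interval: both bounds are included.
    return c.values.size() == 2 && c.values[0] <= cond && cond <= c.values[1];
  case ToyCaseKind::Default:
    return false;  // handled separately in selectCase
  }
  return false;
}

// Return the index of the case that handles `cond`: the first matching
// non-default case, otherwise the default case, otherwise -1.
int selectCase(const std::vector<ToyCase> &cases, std::int64_t cond) {
  int defaultIndex = -1;
  for (std::size_t i = 0; i < cases.size(); ++i) {
    if (cases[i].kind == ToyCaseKind::Default) {
      defaultIndex = static_cast<int>(i);
      continue;
    }
    if (matches(cases[i], cond))
      return static_cast<int>(i);
  }
  return defaultIndex;
}
```

Because the default case is remembered rather than taken immediately, its position among the cases does not change which value it handles, which fits the hoisting example where `default` sits between `case 4` and `case 5`.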
