BIR #2702
Merged
+3,779
−3
16 commits
0b3d408 borrowck: Add initial structure for borrowchecking (jdupak)
193cb2e borrowck: Add CLI option for borrowck (jdupak)
9f20320 borrowck: Execute only with CLI flag (jdupak)
d7bcc5e borrowck: Create Borrow-checker IR (BIR) (jdupak)
4656475 borrowck: Create BIR builders (visitors) (jdupak)
13e67dd borrowck: BIR dump (jdupak)
42000a1 borrowck: Dump: proper comma separation (jdupak)
c304940 borrowck: Dump: simplify cfg (jdupak)
d70becd borrowck: Dump improve jumps (jdupak)
f13ab05 borrowck: BIR: handle break (jdupak)
917b6bd borrowck: Dump: handle infinite loops (jdupak)
fa9b72f borrowck: BIR continue (jdupak)
4d6af23 borrowck: Make goto explicit. (jdupak)
a46aec1 borrowck: Docs (jdupak)
c29c900 borrowck: Dev notes (jdupak)
0798bc5 borrowck: Refactor and BIR improvements (jdupak)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,188 @@ | ||
# Borrow-checker IR (BIR) design notes | ||
|
||
## Design goals | ||
|
||
The Rust GCC project aims to use the [Polonius project](https://github.com/rust-lang/polonius) as its borrow-checker. | ||
Polonius operates on a set of [facts](https://github.com/rust-lang/polonius/blob/master/polonius-engine/src/facts.rs) about the program to determine | ||
the actual lifetimes of borrows. | ||
As Polonius's primary analysis is location sensitive, the facts are tied to the program's control flow graph (CFG). | ||
Unlike rustc, which has its own three-address, language-specific representation called [MIR](https://rustc-dev-guide.rust-lang.org/mir/index.html), gccrs uses GCC's AST-based representation GENERIC. | ||
GIMPLE (the three address IR) is generated from GENERIC inside GCC and can carry no language-specific information. | ||
Therefore, we need to generate our own representation of the program's CFG. | ||
Since gccrs has no intention of having a MIR-like IR, the BIR is not to be used for code generation. | ||
Therefore, BIR carries only minimal information that is necessary for the borrow-checker and for BIR debugging. | ||
|
||
Since BIR is in fact a dead branch of the compilation pipeline, the only way to verify its generation is through manual inspection. | ||
To have some frame of reference for testing, BIR build and dump are carefully designed to resemble the textual dump of rustc's MIR as much as | ||
possible. | ||
This includes the style of the output, numbering of locals and order of basic blocks (when possible). | ||
|
||
## BIR Dump Example | ||
|
||
An example program calculating the i-th Fibonacci number: | ||
|
||
```rust | ||
|
||
fn fib(i: usize) -> i32 { | ||
if i == 0 || i == 1 { | ||
1 | ||
} else { | ||
fib(i - 1) + fib(i - 2) | ||
} | ||
} | ||
``` | ||
|
||
Here is an example of a BIR dump (note: this needs to be updated regularly): | ||
|
||
``` | ||
fn fib(_1: usize) -> i32 { | ||
let _0: i32; | ||
let _2: i32; | ||
let _3: bool; | ||
let _4: bool; | ||
let _5: bool; | ||
let _6: usize; | ||
let _7: i32; | ||
let _8: usize; | ||
let _9: i32; | ||
let _10: i32; | ||
|
||
bb0: { | ||
_4 = Operator(_1, const usize); | ||
switchInt(_4) -> [bb1, bb2]; | ||
} | ||
|
||
bb1: { | ||
_3 = const bool; | ||
goto -> bb3; | ||
} | ||
|
||
bb2: { | ||
_5 = Operator(_1, const usize); | ||
_3 = _5; | ||
goto -> bb3; | ||
} | ||
|
||
bb3: { | ||
switchInt(_3) -> [bb4, bb7]; | ||
} | ||
|
||
bb4: { | ||
_2 = const i32; | ||
goto -> bb8; | ||
} | ||
|
||
bb5: { | ||
_6 = Operator(_1, const usize); | ||
_7 = Call(fib)(_6, ) -> [bb6]; | ||
} | ||
|
||
bb6: { | ||
_8 = Operator(_1, const usize); | ||
_9 = Call(fib)(_8, ) -> [bb7]; | ||
} | ||
|
||
bb7: { | ||
_10 = Operator(_7, _9); | ||
_2 = _10; | ||
goto -> bb8; | ||
} | ||
|
||
bb8: { | ||
_0 = _2; | ||
return; | ||
} | ||
} | ||
|
||
|
||
``` | ||
|
||
The dump consists of: | ||
|
||
- A function header with arguments: `fn fib(_1: usize) -> i32 { ... }`. | ||
- Declaration of locals: `let _0: i32;`, where `_0` is the return value (even if it is of the unit type). Arguments are not listed here; they are listed in the function header. | ||
- A list of basic blocks: `bb0: { ... }`. The basic block name is the `bb` prefix followed by a number. | ||
- Each basic block consists of a list of BIR nodes (instructions). An instruction can either be assigned to a local (place) or be a statement. | ||
Instructions take locals (places) as arguments. | ||
- Each basic block is terminated with a control flow instruction followed by a list of destinations: | ||
- `goto -> bb3;` - a goto instruction with a single destination. | ||
- `switchInt(_3) -> [bb4, bb7];` - a switch instruction with multiple destinations. | ||
- `return;` - a return instruction with no destinations. | ||
- `Call(fib)(_6, ) -> [bb6];` - a call instruction with a single destination. The destination list is there in preparation for panic handling. | ||
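
The terminator syntax listed above can be illustrated with a small Rust sketch. This is not the gccrs implementation (which is written in C++); the `Terminator` enum and the `render` and `block_list` functions are hypothetical names invented for this example, and the call terminator is left out for brevity.

```rust
/// Hypothetical model of the terminators shown in the dump above.
/// Basic blocks and locals are referred to by index, as in `bb3` and `_3`.
enum Terminator {
    Goto(usize),
    SwitchInt { discriminant: usize, targets: Vec<usize> },
    Return,
}

/// Render a list of block indices as `bb4, bb7`.
fn block_list(targets: &[usize]) -> String {
    targets
        .iter()
        .map(|t| format!("bb{t}"))
        .collect::<Vec<_>>()
        .join(", ")
}

/// Render a terminator in the textual style used by the dump.
fn render(term: &Terminator) -> String {
    match term {
        Terminator::Goto(target) => format!("goto -> bb{target};"),
        Terminator::SwitchInt { discriminant, targets } => {
            format!("switchInt(_{discriminant}) -> [{}];", block_list(targets))
        }
        Terminator::Return => "return;".to_string(),
    }
}
```

For instance, `render(&Terminator::Goto(3))` produces the `goto -> bb3;` form from the list above.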
|
||
## BIR Structure | ||
|
||
BIR structure is defined in `gcc/rust/checks/errors/borrowck/rust-bir.h`. It is heavily inspired by rustc's MIR. The main difference is that BIR | ||
drastically reduces the amount of information carried to only borrow-checking relevant information. | ||
|
||
As borrow-checking is performed on each function independently, BIR represents a single function (`struct Function`). A `Function` consists of a list of basic blocks, a list of arguments (for dump only), and a place database, which keeps track of locals. | ||
|
||
### Basic Blocks | ||
|
||
A basic block is identified by its index in the function's basic block list. It contains a list of BIR nodes (instructions) and a list of successor basic block indices in the CFG. | ||
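
As a rough illustration of this layout, here is a Rust sketch of a function holding index-addressed basic blocks with successor lists. The real definitions live in the C++ header `rust-bir.h`; every name below (`Node`, `BasicBlock`, `add_block`, `reachable_from_entry`) is an assumption made for this example, and the place database is omitted.

```rust
/// Index of a basic block in `Function::basic_blocks` (printed as `bb<N>`).
type BasicBlockId = usize;
/// Index of a place (printed as `_<N>` for locals).
type PlaceId = usize;

/// Hypothetical stand-in for a BIR node; a real node carries more detail.
struct Node {
    text: String,
}

struct BasicBlock {
    nodes: Vec<Node>,
    successors: Vec<BasicBlockId>,
}

struct Function {
    name: String,
    arguments: Vec<PlaceId>, // kept for the dump only
    basic_blocks: Vec<BasicBlock>,
}

impl Function {
    /// Append a block and return its index, which serves as its identifier.
    fn add_block(&mut self, nodes: Vec<Node>, successors: Vec<BasicBlockId>) -> BasicBlockId {
        self.basic_blocks.push(BasicBlock { nodes, successors });
        self.basic_blocks.len() - 1
    }

    /// Mark every block reachable from the entry block `bb0` by following
    /// the successor lists (a plain worklist traversal of the CFG).
    fn reachable_from_entry(&self) -> Vec<bool> {
        let mut seen = vec![false; self.basic_blocks.len()];
        if seen.is_empty() {
            return seen;
        }
        let mut worklist: Vec<BasicBlockId> = vec![0];
        seen[0] = true;
        while let Some(id) = worklist.pop() {
            for &succ in &self.basic_blocks[id].successors {
                if !seen[succ] {
                    seen[succ] = true;
                    worklist.push(succ);
                }
            }
        }
        seen
    }
}
```

Because blocks are identified purely by position, a traversal like `reachable_from_entry` needs nothing beyond the successor indices.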
|
||
### BIR Nodes (Instructions) | ||
|
||
BIR nodes fall into three categories: | ||
|
||
- An assignment of an expression to a local (place). | ||
- A control flow operation (switch, return). | ||
- A special (non-executable) node, which carries additional information for borrow-checking (`StorageDead`, `StorageLive`). | ||
|
||
#### Expressions | ||
|
||
Expressions represent the executable parts of the Rust code. Many different Rust constructs are represented by a single expression, as only data (and lifetime) flow needs to be tracked. | ||
|
||
- `InitializerExpr` represents any kind of struct initialization. It can be either explicit (struct expression) or implicit (range expression, | ||
e.g. `0..=5`). | ||
- `Operator<ARITY>` represents any kind of operation, except the following, where special information is needed either for borrow-checking or for | ||
better debugging. | ||
- `BorrowExpr` represents a borrow operation. | ||
- `AssignmentExpr` holds a place for an assignment node (i.e., no operation is performed on the place; it is simply assigned). | ||
- `CallExpr` represents a function call. | ||
- For functions, the callable is represented by a constant place (see below). (I.e., all calls use the same constant place.) | ||
- For closures and function pointers, the callable is represented by a (non-constant) place. | ||
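
The node categories and expression kinds described above could be modelled roughly as follows. This is a hedged Rust sketch, not the C++ class hierarchy from `rust-bir.h`: the enum layout, the field names, and the helpers `operands` and `is_executable` are assumptions made for illustration.

```rust
type PlaceId = usize;
type BasicBlockId = usize;

/// Expression kinds, each referring to its operands by place.
enum Expr {
    Initializer(Vec<PlaceId>), // struct or range initialization
    Operator(Vec<PlaceId>),    // any other operation; arity is the operand count
    Borrow(PlaceId),
    Assignment(PlaceId),
    Call { callable: PlaceId, arguments: Vec<PlaceId> },
}

/// The three node categories: assignment, control flow, special.
enum Node {
    Assign { target: PlaceId, expr: Expr },
    Switch { discriminant: PlaceId, targets: Vec<BasicBlockId> },
    Return,
    StorageLive(PlaceId),
    StorageDead(PlaceId),
}

/// Places read by an expression; for calls the callable comes first.
fn operands(expr: &Expr) -> Vec<PlaceId> {
    match expr {
        Expr::Initializer(places) | Expr::Operator(places) => places.clone(),
        Expr::Borrow(place) | Expr::Assignment(place) => vec![*place],
        Expr::Call { callable, arguments } => {
            let mut all = vec![*callable];
            all.extend(arguments);
            all
        }
    }
}

/// Special nodes only carry information for borrow-checking.
fn is_executable(node: &Node) -> bool {
    !matches!(node, Node::StorageLive(_) | Node::StorageDead(_))
}
```

Collapsing many source constructs into a few variants keeps consumers such as `operands` short, which matches the goal of tracking only data and lifetime flow.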
|
||
### Places | ||
|
||
Places are defined in `gcc/rust/checks/errors/borrowck/rust-bir-place.h`. | ||
|
||
Places represent locals (variables), their fields, and constants. They are identified by their index (`PlaceId`) in the function's place database. For better dump correspondence to MIR, constants use a different index range. | ||
|
||
Non-constant places are created according to the Polonius path [documentation](https://rust-lang.github.io/polonius/rules/atoms.html). The following grammar describes possible path elements: | ||
|
||
``` | ||
Path = Variable | ||
| Path "." Field // field access | ||
| Path "[" "]" // index | ||
| "*" Path | ||
``` | ||
|
||
It is important to highlight that different fields are assigned to different places; however, all indices are assigned to the same place. | ||
Also, to match the output of rustc, paths in the dump contain at most one dereference and are split otherwise. | ||
Same paths always result in the same place. | ||
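
The path rules above (distinct fields get distinct places, all indices share one place, equal paths map to the same place) can be sketched in Rust as follows. Note the swap: this sketch uses a hash map keyed by parent place and path element, whereas the description of the actual place database mentions links between relatives in a path tree. `PlaceDb`, `PathElem`, and the method names are hypothetical.

```rust
use std::collections::HashMap;

type PlaceId = usize;

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum PathElem {
    Variable(u32), // hypothetical variable identifier
    Field(usize),  // field index
    Index,         // carries no index value, so all indices coincide
    Deref,
}

#[derive(Default)]
struct PlaceDb {
    // (parent place, path element) -> place; variables have no parent.
    lookup: HashMap<(Option<PlaceId>, PathElem), PlaceId>,
}

impl PlaceDb {
    /// Return the existing place for this path step, or allocate a new one.
    fn intern(&mut self, parent: Option<PlaceId>, elem: PathElem) -> PlaceId {
        let next = self.lookup.len();
        *self.lookup.entry((parent, elem)).or_insert(next)
    }

    fn variable(&mut self, id: u32) -> PlaceId {
        self.intern(None, PathElem::Variable(id))
    }

    fn field(&mut self, base: PlaceId, index: usize) -> PlaceId {
        self.intern(Some(base), PathElem::Field(index))
    }

    fn index(&mut self, base: PlaceId) -> PlaceId {
        self.intern(Some(base), PathElem::Index)
    }

    fn deref(&mut self, base: PlaceId) -> PlaceId {
        self.intern(Some(base), PathElem::Deref)
    }
}
```

Since `PathElem::Index` stores no index value, `x[0]` and `x[1]` necessarily resolve to the same place, while `x.0` and `x.1` do not.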
|
||
Variables are identified by `AST` `NodeId`. Field indices are taken from `TyTy` types. | ||
|
||
Each place holds indices to its next relatives (in the path tree), its `TyTy` type, its lifetime, and information on whether the type can be copied or needs to be moved. Note that unlike rustc, we copy any time we can (for simplicity), while rustc prefers to move if possible (only a single copy is held). | ||
|
||
## BIR Builders | ||
|
||
There are multiple builders (visitor classes) for BIR, split according to the context each of them needs. | ||
A top-level builder provides the entry point that handles function parameters and return values, and it creates the main BIR unit, `Function`. | ||
`rust-bir-internal.h` provides abstract builder classes with common helper methods for all builders and for expression builders. | ||
Specific builders are then defined for expressions+statements, lazy boolean expressions, patterns, and struct initialization. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
# Borrow-checker Development Notes | ||
|
||
## Testing BIR building | ||
|
||
There is no way to test BIR building directly, since it is a dead branch of the compilation pipeline. | ||
The only way to verify its generation is through manual inspection. | ||
The best way to inspect the BIR is to compare it with rustc's MIR. | ||
|
||
The following command will compile a Rust file into a library and dump its MIR: | ||
|
||
```shell | ||
rustc --crate-type=lib -A dead_code -A unused -Z dump-mir="" <file> | ||
``` | ||
|
||
The MIR dump directory `mir_dump` contains a dump before and after each MIR pass. | ||
We are interested in the one used for borrow-checking, which is called `<crate>.<function>.002-000.analysis.after.mir`. | ||
|
||
The BIR dump is emitted to a `bir_dump` directory with the following naming scheme: `<crate>.<function>.bir.dump`. | ||
|
||
At this point, the MIR dump contains helper constructs that BIR does not contain yet (like storage live/dead annotations). To remove them from the MIR dump, run the following command: | ||
|
||
```shell | ||
awk -i inplace '!/^\s*(\/\/|StorageLive|StorageDead|FakeRead)/' mir_dump/* | ||
``` | ||
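
For readers less familiar with awk, the filter above can be restated as a short Rust sketch. It is not part of gccrs; `strip_mir_helpers` and `HELPER_PREFIXES` are hypothetical names, and the function works on an in-memory string instead of editing files in place.

```rust
/// Line prefixes that the awk pattern above filters out.
const HELPER_PREFIXES: [&str; 4] = ["//", "StorageLive", "StorageDead", "FakeRead"];

/// Drop every line whose first non-whitespace text is one of the prefixes.
/// Lines that merely end with a `//` comment are kept, as with the awk filter.
fn strip_mir_helpers(mir: &str) -> String {
    mir.lines()
        .filter(|line| {
            let trimmed = line.trim_start();
            !HELPER_PREFIXES.iter().any(|prefix| trimmed.starts_with(*prefix))
        })
        .map(|line| format!("{line}\n"))
        .collect()
}
```

Only the start of each line is checked, so statements with trailing comments survive the filter unchanged.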
|
||
To get the BIR dump into a similar format, run the following command: | ||
|
||
```shell | ||
./crab1 <file> -frust-incomplete-and-experimental-compiler-do-not-use -frust-borrowcheck -frust-dump-bir -frust-compile-until=compilation | ||
``` | ||
|
||
|
||
## TODO | ||
|
||
- scope handling, cleanup | ||
- switch coercions to adjustments from typechecking | ||
- operator overloading | ||
- match selection | ||
- let without an initializer | ||
- lifetime parameters |
Should we have all these doc files in a `doc/` subdir?

I am fine with either variant. If `docs/` is more consistent, we can go with that.