Implement prism -> sorbet conversion for multi-statement programs #28
Sorbet constructs slightly different ASTs depending on whether a program contains one statement or more than one. Correctly parsing programs with more than one statement will make it easier to benchmark this project.
pm_program_node *programNode = reinterpret_cast<pm_program_node *>(node);
pm_statements_node *stmts = programNode->statements;

auto size = stmts->body.size;

// For a single statement, do not create a Begin node and just return the statement
if (size == 1) {
    return convertPrismToSorbet((pm_node *)stmts->body.nodes[0], parser, gs);
}

// For multiple statements, convert each statement and add them to the body of a Begin node
parser::NodeVec sorbetStmts;

for (int i = 0; i < stmts->body.size; i++) {
    pm_node_t *node = stmts->body.nodes[i];
    unique_ptr<parser::Node> convertedStmt = convertPrismToSorbet(node, parser, gs);
    sorbetStmts.emplace_back(std::move(convertedStmt));
}

auto *loc = &programNode->base.location;

return make_unique<parser::Begin>(locOffset(loc, parser), std::move(sorbetStmts));
I'd love some advice on the implementation here -- I decided to handle all the statement logic in the program node case because Sorbet doesn't have a representation of statement nodes; it just stores them as a `NodeVec` (vector of nodes) in the body of a `Begin` node, which is the Sorbet equivalent of program.
Probably not a huge deal because this is still a prototype, but I'm trying to learn how to do things in C++ 😅
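For readers who want to see the shape difference in isolation, here is a minimal sketch using toy types. `Node`, `Leaf`, `Begin`, and `wrapStatements` are hypothetical stand-ins invented for illustration; they are not Sorbet's or Prism's real classes, and the sketch only mirrors the one-statement versus many-statements branching described in this comment.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-ins for the real AST types; all names here are illustrative only.
struct Node {
    virtual ~Node() = default;
};

struct Leaf : Node {
    int id;
    explicit Leaf(int leafId) : id(leafId) {}
};

struct Begin : Node {
    std::vector<std::unique_ptr<Node>> stmts;
    explicit Begin(std::vector<std::unique_ptr<Node>> body) : stmts(std::move(body)) {}
};

// Exactly one statement: hand it back unwrapped.
// Any other count: move the whole vector into a Begin node.
inline std::unique_ptr<Node> wrapStatements(std::vector<std::unique_ptr<Node>> stmts) {
    if (stmts.size() == 1) {
        return std::move(stmts[0]);
    }
    return std::make_unique<Begin>(std::move(stmts));
}
```

Keeping the branching in one place, as the PR does in the program node case, means callers never need a separate "statements" node type.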
I think this is fine if you think adding a node will require too many changes.
My C++ comment would be to iterate using a range instead: `for (auto node : stmts->body.nodes) {}`. It's cleaner and prevents range bugs. Note that this version copies each element into `node`, which may be inefficient; you may want to type the loop variable differently depending on what you need it for.
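To make the copy remark concrete, here is a small self-contained sketch. `Tracked` is a hypothetical type written only to count copy constructions; it has nothing to do with Prism or Sorbet. When the elements are raw pointers (such as `pm_node_t *`), the by-value form copies only the pointer, which is cheap, so the distinction matters mainly for heavier element types.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts how often it is copy-constructed.
struct Tracked {
    static inline int copies = 0;
    int value = 0;
    Tracked() = default;
    explicit Tracked(int v) : value(v) {}
    Tracked(const Tracked &other) : value(other.value) { ++copies; }
    Tracked &operator=(const Tracked &) = default;
};

// Iterates by value: each element is copy-constructed into `item`.
inline int sumByValue(const std::vector<Tracked> &items) {
    int total = 0;
    for (auto item : items) {
        total += item.value;
    }
    return total;
}

// Iterates by const reference: no element copies are made.
inline int sumByReference(const std::vector<Tracked> &items) {
    int total = 0;
    for (const auto &item : items) {
        total += item.value;
    }
    return total;
}
```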
It's really good to know you can do that in C++! I actually can't iterate over a `pm_node_list` this way because it doesn't implement the `begin` function (I get the error `Invalid range expression of type 'struct pm_node **'; no viable 'begin' function available`). I can look into adding that to the Prism API, but for now I think this is the only way to iterate.
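The error arises because a range-based `for` needs `begin()` and `end()`, and a bare pointer carries no length. One possible workaround, sketched here under assumptions, is a tiny non-owning wrapper that pairs the pointer with its size. `FakeNode`, `FakeNodeList`, `PtrRange`, and `nodesOf` are all hypothetical names made up for this example; they only imitate the pointer-plus-size layout used in the loop above and are not part of the Prism API.

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for a C-style node list: a raw buffer of pointers plus a count.
struct FakeNode {
    int id;
};

struct FakeNodeList {
    FakeNode **nodes;
    std::size_t size;
};

// Minimal non-owning view. Range-based for only requires begin() and end().
template <typename T> struct PtrRange {
    T *first;
    std::size_t count;
    T *begin() const { return first; }
    T *end() const { return first + count; }
};

// Hypothetical helper: wraps the list without copying or owning the buffer.
inline PtrRange<FakeNode *> nodesOf(const FakeNodeList &list) {
    return PtrRange<FakeNode *>{list.nodes, list.size};
}

// Example use: the index-based loop becomes a range-based one.
inline int sumIds(const FakeNodeList &list) {
    int total = 0;
    for (FakeNode *node : nodesOf(list)) {
        total += node->id;
    }
    return total;
}
```

This compiles as C++17 and leaves the underlying C struct untouched, so it could live entirely on the C++ side of the conversion code.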
pm_program_node *programNode = reinterpret_cast<pm_program_node *>(node);
pm_statements_node *stmts = programNode->statements;

auto size = stmts->body.size;
You can simplify this code a bit by wrapping this raw C pointer and size into a C++ `std::span`. It's like a vector in that it'll let you use C++-style foreach loops, but it doesn't copy/own/free the buffer.
Suggested change:
  pm_program_node *programNode = reinterpret_cast<pm_program_node *>(node);
  pm_statements_node *stmts = programNode->statements;
- auto size = stmts->body.size;
+ std::span<pm_node_t *> nodes(stmts->body.nodes, stmts->body.size);
Then you can:
if (nodes.size() == 1) {
    return convertPrismToSorbet(nodes[0], parser, gs);
}
for (auto node : nodes) {
    unique_ptr<parser::Node> convertedStmt = convertPrismToSorbet(node, parser, gs);
    sorbetStmts.emplace_back(std::move(convertedStmt));
}
Aw man, we're actually using C++17, which doesn't implement span 😭
That would be a real bummer, but luckily, we have `absl::Span`, which is already used a fair bit throughout the codebase! 🥳
Motivation
Sorbet constructs slightly different ASTs depending on whether a program contains one statement or more than one. Correctly parsing programs with more than one statement will make it easier to benchmark this project.
Test plan
Added automated tests for parsing a multi-statement program.