Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement prism -> sorbet conversion for multi-statement programs #28

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
30 changes: 23 additions & 7 deletions main/pipeline/pipeline.cc
Original file line number Diff line number Diff line change
Expand Up @@ -173,8 +173,28 @@ const unique_ptr<parser::Node> convertPrismToSorbet(pm_node_t *node, pm_parser_t
return make_unique<parser::Integer>(locOffset(loc, parser), std::to_string(intNode->value.value));
}
case PM_PROGRAM_NODE: {
pm_statements_node *stmts = (reinterpret_cast<pm_program_node *>(node))->statements;
return convertPrismToSorbet((pm_node *)stmts, parser, gs);
pm_program_node *programNode = reinterpret_cast<pm_program_node *>(node);
pm_statements_node *stmts = programNode->statements;

auto size = stmts->body.size;
Comment on lines +176 to +179

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can simplify this code a bit by wraping this raw C pointer and size into a C++ std::span. It's like a vector in that it'll let you use C++-style foreach loops, but it doesn't copy/own/free the buffer.

Suggested change
pm_program_node *programNode = reinterpret_cast<pm_program_node *>(node);
pm_statements_node *stmts = programNode->statements;
auto size = stmts->body.size;
pm_program_node *programNode = reinterpret_cast<pm_program_node *>(node);
pm_statements_node *stmts = programNode->statements;
std::span<pm_node_t *> nodes(stmts->body.nodes, stmts->body.size);

Then you can:

if (nodes.size() == 1) {
    return convertPrismToSorbet(nodes[0], parser, gs);
}
for (auto node : nodes) {
    unique_ptr<parser::Node> convertedStmt = convertPrismToSorbet(node, parser, gs);
    sorbetStmts.emplace_back(std::move(convertedStmt));
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aw man, we're actually using C++17, which doesn't implement span 😭

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be a real bummer, but luckily, we have absl::span, which is already used a fair bit throughout the codebase! 🥳

#34


// For a single statement, do not create a Begin node and just return the statement
if (size == 1) {
return convertPrismToSorbet((pm_node *)stmts->body.nodes[0], parser, gs);
}

// For multiple statements, convert each statement and add them to the body of a Begin node
parser::NodeVec sorbetStmts;

for (int i = 0; i < stmts->body.size; i++) {
pm_node_t *node = stmts->body.nodes[i];
unique_ptr<parser::Node> convertedStmt = convertPrismToSorbet(node, parser, gs);
sorbetStmts.emplace_back(std::move(convertedStmt));
}

auto *loc = &programNode->base.location;

return make_unique<parser::Begin>(locOffset(loc, parser), std::move(sorbetStmts));
Comment on lines +176 to +197
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love some advice on the implementation here -- I decided to handle all the statement logic in the program node case because Sorbet doesn't have a representation of statement nodes, it just stores them as a NodeVec (vector of nodes) in the body of a Begin node, which is the sorbet equivalent of program.

Probably not a huge deal because this is still a prototype but I'm trying to learn how to do things in C++ 😅

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine if you think adding a node will require too many changes.

My C++ comment would be to iterate using a range instead:
for (auto node : stmts->body.nodes) {}. It's cleaner and prevents range bugs. This version calls the copy constructor for node creation which may be inefficient and typed differently depending on what you need it for.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's really good to know you can do that in C++! I actually can't iterate over a pm_node_list this way because it doesn't implement the begin function (I get the error Invalid range expression of type 'struct pm_node **'; no viable 'begin' function available). I can look into adding that to the Prism API, but for now I think this is the only way to iterate.

}
case PM_RATIONAL_NODE: {
auto *rationalNode = reinterpret_cast<pm_rational_node *>(node);
Expand All @@ -190,11 +210,7 @@ const unique_ptr<parser::Node> convertPrismToSorbet(pm_node_t *node, pm_parser_t
break;
}
case PM_STATEMENTS_NODE: {
pm_node_list *body = &(reinterpret_cast<pm_statements_node *>(node))->body;
// TODO: Handle multiple statements
pm_node *first = body->nodes[0];

return convertPrismToSorbet(first, parser, gs);
Exception::raise("Handling for statements node should be performed in program node");
}
case PM_STRING_NODE: {
auto strNode = reinterpret_cast<pm_string_node *>(node);
Expand Down
13 changes: 13 additions & 0 deletions test/prism_regression/multi_statements.parse-tree.exp
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
Begin {
stmts = [
Integer {
val = "42"
}
String {
val = <U hello>
}
Rational {
val = "1.0"
}
]
}
5 changes: 5 additions & 0 deletions test/prism_regression/multi_statements.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
# typed: true

42
"hello"
1.0r