Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix recursive type segfault in hiltic #1881

Merged
merged 1 commit into from
Oct 5, 2024

Conversation

evantypanski
Copy link
Contributor

Closes #1867

tl;dr: I went back and forth but went with the almost-too-simple solution for this one, but I'm open to suggestions :)

There are two different segfaults that can happen, one from something like Data = Data, the other from like Data = Data&. The first is solved by adding a recursion_depth to the resolved type (which is just a magic 1000 limit but having a way to configure that would be nice, I just didn't find any). The second is solved with a cycle detection in the unifier (approximating an "occurs check")

Note that this just disallows both cases (so type Data = Data& will error, not be accepted and properly generate code). Adding support to allow containers with recursive types would be a fair bit more work that should probably come after this, in my opinion. The crux there is that type definitions just end up being "find and replace" aliases in codegen, which will just loop infinitely. I think the solution would be to create a new struct or something in the C++ generated code and use its incomplete type, but that seemed a bit outside the scope here. Was that behavior ever allowed? Because it seems pretty deeply engrained that a type would never reference itself.

FWIW, this solution is a bit mediocre. A lot of weird problems arose with other solutions. It would be nice to make Name::resolvedType return a result (since now there is a "fail state") but the number of changes for that is pretty substantial and probably not really worth it. So instead we get an iffy error message in resolver.cc because I don't want to assume the only fail state is hitting the recursion limit.

Then putting the cycle detection in the unifier is weird since there were two separate places that would segfault (initially the unifier, but if that's "fixed" codegen also segfaults). A separate validation would make sense, but since the fix slots in neatly to act like an occurs check, it seems fine there. Then we don't need to traverse the tree in the somewhat unique way required to get the infinite recursion - which the type unifier will do when following names.

but.. open to any other suggestion, this is mainly what worked in the easiest way with the fewest drawbacks, in my opinion.

Copy link
Member

@bbannier bbannier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I do not think that this looks particularly bad. Re: Result for resolveType, this function already had an error state before if _resolved_type_index was unset.

My main question would be whether we'd want the arbitrary recursion limit in the first place.

hilti/toolchain/include/ast/types/name.h Show resolved Hide resolved
hilti/toolchain/src/compiler/type-unifier.cc Outdated Show resolved Hide resolved
hilti/toolchain/src/compiler/resolver.cc Outdated Show resolved Hide resolved
@evantypanski evantypanski force-pushed the topic/etyp/recursive-type-segfault-fix branch from cdd66ee to 3ce4127 Compare October 4, 2024 12:55
@evantypanski evantypanski force-pushed the topic/etyp/recursive-type-segfault-fix branch from 3ce4127 to 0b1f321 Compare October 4, 2024 18:07
Closes #1867

There are two different cases where infinite loops happen with recursive
types. First, a type may reference itself (`type Data = Data`). Second,
a type may reference itself inside some other type (`type Data =
vector<Data>`).

The first is fixed with a recursion limit. Since the type simply cannot
resolve, it doesn't get anywhere near codegen. You could detect cycles,
but that introduces some extra overhead and complexity that shouldn't
be needed in a "simple" function.

The second is fixed with an ad-hoc "occurs" check in type unification.
That just detects cycles and aborts if one is present. This could be
placed at some other place in the "resolve until convergence" loop, but
it seems best put closest to the source of the issue.
@bbannier bbannier force-pushed the topic/etyp/recursive-type-segfault-fix branch from 2a7fa58 to b3a3aef Compare October 5, 2024 11:43
@bbannier bbannier merged commit 8c175ef into main Oct 5, 2024
5 of 14 checks passed
@bbannier bbannier deleted the topic/etyp/recursive-type-segfault-fix branch October 5, 2024 11:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Recursive type alias segfaults
2 participants