Rework incorrect indentation recovery #1273

auduchinok · 2023-05-23T11:23:52Z

F# has indentation-based syntax powered by offside rules. They are implemented in a component called LexFilter that works between lexing and parsing stages. LexFilter keeps track of the indentation blocks and inserts additional begin- and end-like tokens for the parser.

To help newcomers write F# code, LexFilter has a recovery rule that is used when a token is insufficiently indented. This rule produces a warning and starts a new indentation block anyway:

While the warning is good, the insertion of the block causes many problems for the parsing and analysis when writing code. Consider a case where a user is writing a new type member:

There are subsequent definitions and expressions, and all of them are parsed and analyzed correctly.
As soon as the user types = in, the LexFilter begins to expect an indented member body below, doesn't find it, and applies the recovery rule:

That makes all the code below parsed incorrectly as the body of the property. Very often it makes the parser fail to parse the file at all. In the example above, the lenses below the edited place are gone. They will reappear when you type an expression after =, making the code move up and down and blink. There's an unexpected error about attributes inside an expression. The P type is inferred incorrectly.

The bigger a file gets, the more unexpected parser/analysis errors appear, because neither the parser nor the type checker expect arbitrary code below to be parsed as the indented block. This leads to blinking highlightings, moving code in editor (because of appearing/disappearing lenses/hints), and features like Code Completion or Parameter Info (Signature Help) become significantly less precise, which is crucial when editing code.

There're many cases where it breaks deep inside expressions, like in the binary expression here:

let a =
    1 +

let b = 1

I propose we rework how wrong indentation is handled. It's possible to change LexFilter, so it doesn't start a new indentation block with the wrong indentation, and to introduce new parser recovery rules, so it would show errors about missing expressions/definitions and continue parsing.

Pros and Cons

The advantages of making this adjustment to F# are

Doing that will improve parser recovery, which in turn make the analysis significantly more stable and precise. It can improve the editing experience a lot, since there will be much less blinking, code jumping, and features that rely on inferred types will be much more stable.

The disadvantages of making this adjustment to F# are

This is a breaking change, and some code that can be compiled today will no longer compile. Such code produces warnings today and should be adjusted to have proper indentation. I expect such code to be extremely rare and mostly produced by generators (so hopefully only generators will need updates).

Extra information

Estimated cost (XS, S, M, L, XL, XXL): M-L

I've made a small prototype with an additional parser rule for let bindings, and it shows that this recovery works as expected, with an additional error produced, see this gist. The error texts can be improved separately. The main thing it shows is that the subsequent definitions are parsed correctly.

Since it's a breaking change, we should only introduce it in a new language version. It's possible to make it into F# 8 now, so the tooling will be able to use the improvements in the fall, when the new language version is released. If we don't ship it in F# 8, the closest date when improvements could be used would probably be ~November 2024, i.e. when .NET 9 is released.

Affidavit (please submit!)

Please tick this by placing a cross in the box:

This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
I have searched both open and closed suggestions on this site and believe this is not a duplicate
This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

This is not a breaking change to the F# language design
I or my company would be willing to help implement and/or test this

For Readers

If you would like to see this issue implemented, please click the 👍 emoji on this issue. These counts are used to generally order the suggestions by engagement.

The text was updated successfully, but these errors were encountered:

vzarytovskii · 2023-05-23T15:25:58Z

I think tradeoff of introducing this kind of breaking change in new language versions is well worth it.

dsyme · 2023-05-23T18:06:45Z

The problem is, I've seen beginner code with literally many of these warnings, and the warning turned off.

I started searching for stuff in github that turns of warning 58, here are two examples:

e.g. https://github.com/search?q=FS0058&type=code

I'd expect those projects don't specify language version. An alternative might be to have a flag --strict:SOME-NUMBER that is turned on by default in new project templates.

In short I'm happy for this warning to become an error, and the grammar improved to make use of that, subject to this kind of concern about being a breaking change.

If we need to duplciate the parser to implement that, we should also ditch all the non-indentation-aware syntax rules from the grammar, and all long-deprecated features (if giving errors), which will make it much simpler.

baronfel · 2023-05-23T20:58:51Z

Slightly tweaked query that only includes fsproj, props, and targets file: https://github.com/search?q=FS0058+%28path%3A*.fsproj+OR+path%3A*.props+OR+path%3A*.targets%29&type=code

There are only 20 occurrences here, and the majority of those are in two projects - Pulumi F# extensions and the CoFlows family of projects already linked. That doesn't seem like an insurmountable hurdle to overcome.

dsyme · 2023-05-24T16:46:12Z

I think we should really follow this path for this one:

....have a flag --strict:SOME-FEATURE-NAME that is turned on by default in new project templates....

There will 100% be significant enterprise code that emits and suppresses this warning. We can't have installing new versions of the F# compiler or .NET SDK just randomly start giving a hard error for suppressed instances of this warning, or indeed even for unsuppressed instances.

auduchinok · 2023-05-24T16:54:32Z

We can't have installing new versions of the F# compiler or .NET SDK just randomly start giving a hard error for suppressed instances of this warning, or indeed even for unsuppressed instances.

If we don't enable it by default in existing projects, then it seems there would be much less point in improving it at all. From the tooling standpoint we want a better parser to have the broadest reach possible. We can, for sure, provide a way to opt-out and to revert to the current warnings, if for some reason it's not easy to correct wrong indentation when the SDK is released.

dsyme · 2023-05-24T17:14:04Z

I mean I guess an alternative would be to parse all sources twice, and see if the warning occurs anywhere. If not then go ahead with the better parser.

Anyway, it's ultimately up to @vzarytovskii, @baronfel and others to decide policy on this sort of thing. I'm OK with the change but it's a breaking change for sure, and it will hit companies where the developers who wrote the code are no longer even at the company.

vzarytovskii · 2023-05-24T17:17:14Z

I mean I guess an alternative would be to parse all sources twice, and see if the warning occurs anywhere. If not then go ahead with the better parser.

Anyway, it's ultimately up to @vzarytovskii, @baronfel and others to decide policy on this sort of thing. I'm OK with the change but it's a breaking change for sure, and it will hit companies where the developers who wrote the code are no longer even at the company.

Yeah, my fear is that such code can be in a bunch of closed codebases, which "just work".

baronfel · 2023-05-24T17:20:21Z

Super valid points. Templates are a great way to kickstart adoption of features like this, and we also have knobs like defaulting the new parser to off for .NET 8, and immediately in 9 preview 1 builds turning the parser on, that way we have the entire 9 preview cycle to find and react to bugs or impacts in the tech. I pinged Kathleen so we could chat about what ideas for what our overall strategy for these kinds of changes should be (which fits nicely into a "rules that folks who contribute to the .NET SDK should follow for breaking changes and diagnostics" thing I'm working on).

vzarytovskii · 2023-05-24T17:33:33Z

@auduchinok I wonder how hard it will be to hide new recovery path under a flag?

auduchinok · 2023-05-24T19:20:40Z

@auduchinok I wonder how hard it will be to hide new recovery path under a flag?

Should be easy, since it's going to be a language version dependent change anyway. LexFilter changes will look at the flag or language version and control this, just like existing version-dependent changes there, while the new parser rules just won't be triggered in the most (if not all) cases. The parser rules are going to be just new recovery rules, like the ones I've been adding recently.

auduchinok · 2023-05-25T14:36:08Z

I'm OK with the change but it's a breaking change for sure, and it will hit companies where the developers who wrote the code are no longer even at the company.

Templates are a great way to kickstart adoption of features like this, and we also have knobs like defaulting the new parser to off for .NET 8, and immediately in 9 preview 1 builds turning the parser on, that way we have the entire 9 preview cycle to find and react to bugs or impacts in the tech.

I should say that I'm biased, but I'll try to reason about my point of view.

I understand that breaking some code in closed source codebases is not good. However, the code affected should be rare. On top of that, the fix for most cases should be trivial and will only require adding spaces in, hopefully, a few places. Alternatively, developers could set a language version in a project during the transition period.

On the other hand, with this change in the language, we can significantly improve the quality of the editor features for the vast majority of F# users, starting with the new parser recovery rules and then incrementally improving features like Code Completion.

My only concern is that doing this will require an update in some (possibly unmaintained) codebases, and that code will need a fix, regardless of whether that happens in F# 8 or F# 9. Postponing it to F# 9 doesn't help enterprises overcome this issue; it only postpones the inevitable. Not enabling it by default in existing projects at some point isn't the way to go, either.

I want to work on this change, and if we consider the quality to be suitable by the feature freeze time, I think it could make sense to consider enabling it in .NET 8, as it seems the benefits of improving the tooling outweigh the possible downsides, which are easy enough to overcome. However, if we see that there're major issues need resolving by then, I'll be happy to proceed with it for the .NET 9 release instead.

In the best case, we'll be able to ship the change this year, and there will be an opportunity to improve features further in subsequent releases of the tools, enabling us to offer better tooling support by the .NET 9 release time. On the other hand, if we postpone it to .NET 9, most users won't get these improvements for another 1.5 years, which seems like a long time for the language tooling.

cartermp · 2023-05-26T17:41:53Z

FWIW I would be in favor of:

Enable a bunch of these fixes by default for everyone
Have a straightforward setting in project files to go back to old behavior (and VS tools checkbox in project settings for that), then document this and include some message about it in the error message

I think we should also weigh the disruption to some against other kinds of disruptions. I have no doubt that we've done many "acceptable" things, especially in tools, that are far more disruptive than something like this.

That doesn't excuse this change from needing to be done responsibly, but if the net benefit is that newcomers to the language can just type code like they expect and it usually "just works" then I think that's big also for these larger, closed-source enterprise codebases where people need to maintain code they didn't write.

Tarmil · 2023-05-27T12:43:52Z

I think we should really follow this path for this one:

....have a flag --strict:SOME-FEATURE-NAME that is turned on by default in new project templates....

There will 100% be significant enterprise code that emits and suppresses this warning. We can't have installing new versions of the F# compiler or .NET SDK just randomly start giving a hard error for suppressed instances of this warning, or indeed even for unsuppressed instances.

Alternatively, we could consider revising the rule that if a language version is not specified, then the latest is used. And instead, use v7 -- which would be the last version that had this policy -- and show a warning or lint that you should specify a language version.

vzarytovskii · 2023-05-28T08:47:42Z

I think we should really follow this path for this one:

....have a flag --strict:SOME-FEATURE-NAME that is turned on by default in new project templates....

There will 100% be significant enterprise code that emits and suppresses this warning. We can't have installing new versions of the F# compiler or .NET SDK just randomly start giving a hard error for suppressed instances of this warning, or indeed even for unsuppressed instances.

Alternatively, we could consider revising the rule that if a language version is not specified, then the latest is used. And instead, use v7 -- which would be the last version that had this policy -- and show a warning or lint that you should specify a language version.

We want people to use latest version if its not specified, since it's usually contains a whole bunch of diagnostics, better codegen, new features.

It's better to have a separate flag for this feature.

auduchinok · 2023-05-30T09:17:38Z

We want people to use latest version if its not specified, since it's usually contains a whole bunch of diagnostics, better codegen, new features.
It's better to have a separate flag for this feature.

It seems the best way to do this would be to have a flag and to enable/disable it by default based on the language version. This way people using the new SDK will get the changes automatically, while it will still be possible to disable it, or to enable it if an older language version is used explicitly in the project file for some reason.

auduchinok · 2023-05-30T13:57:27Z

@dsyme @vzarytovskii If you agree with this proposal, could it get an approval, please?

If there're no objections to trying to get it into .NET 8, then I'll start working on it.

dsyme · 2023-05-30T14:58:54Z

I'll mark it as approved.

By approving I mean if the feature applies then this strawman (please let's capture this very precisely):

make the "offside warning" into a hard error
make the proposed adjustment to the grammar to improve indentation recovery
implement this by duplicating the entire parser
take advantage of this duplication to also completely flush all the non-indentation-aware rules from the copy of the parser
also flush any other rules that give hard errors on long deprecated features
put the whole thing in preview so we can perhaps pile in some other changes, since we don't get to do this often
if langversion < PREVIEW the old parser is used
I'd favour using the "immediately on-by-default-at-next-major-upgrade-of-.NET-SDK" mechanism @baronfel mentioned to get this to be the default behaviour.

Now I think of it, duplicating the parser is actually problematic because the token definitions all get duplicated. That would require a change in FsLexYacc. Ugh. I really feel this needs it though.... @auduchinok Do you know a way that would avoid this duplication while retaining certainty that we haven't mucked with parsing of old code?

auduchinok · 2023-05-30T15:12:06Z

implement this by duplicating the entire parser

Now I think of it, duplicating the parser is actually problematic because the token definitions all get duplicated. That would require a change in FsLexYacc. Ugh. I really feel this needs it though.... @auduchinok Do you know a way that would avoid this duplication while retaining certainty that we haven't mucked with parsing of old code?

On the parser side, this change should only require adding/changing recovery rules, like the ones I've been adding recently, not changing normal case ones. It should have no more effect on normal rules than lots of recently added recovery ones.

I think splitting the parser into the old and the new ones can be good in the long term but should probably be done separately, as it's really orthogonal to the indentation recovery I'm proposing. This task may be big enough by itself even without this additional refactoring, as we'll need to make sure all the IDE features work with the new recovery. Let's maybe make it a separate change? 🙂

kerams · 2023-05-30T15:13:41Z

Love the idea of completing the sunsetting of non-light syntax and ML compatibility.

dsyme · 2023-05-30T16:07:01Z

On the parser side, this change should only require adding/changing recovery rules, like the ones I've been adding recently, not changing normal case ones. It should have no more effect on normal rules than lots of recently added recovery ones.

But don't those rules apply unconditionally? Making it impossible for that parser to parse like it did previously?

auduchinok · 2023-05-30T16:32:55Z

But don't those rules apply unconditionally? Making it impossible for that parser to parse like it did previously?

Yes, they should, but we have control over what's inserted by LexFilter, and we can make the new recovery rules specific to the LexFilter changes where needed.

Tarmil · 2023-05-30T17:30:38Z

Among the obsolete rules that could be flushed by a parser clean slate, #806 is also a candidate. Infix operator de-indentation was considered a misfeature by everyone in the conversation and it bites people regularly. Don approved adding a warning on it; but while we're making breaking changes to the parser, we could even remove it altogether.

auduchinok · 2023-06-08T10:27:44Z

I've implemented the new rules for some of the most common cases in dotnet/fsharp#15334, and it improved parsing recovery significantly, fixing many issues with highlighting/folding/lenses during typing, like reported in dotnet/fsharp#7078.

With this change places where code gets deindented act as syncronization points, and it allows producing begin/end-like tokens for the parser, instead of making it parse everything below the unfinished code incorrectly. Since F# syntax is indentation based, using indentation blocks as recovery mechanism seems to be natural and probably the best choice we have. In other words, every place in the language that required additional indentation now allows us to significantly improve things, and every place that didn't may make it more challenging (or impossible) to improve the tooling support.

Here're some examples:

let _ =
  if true then
  () // warning

let _ =
  if true then () else

  () // no warning, so it's possible to write early returns without additional indentation

let _ =
  match () with
  | _ ->

  () // no warning, for a similar reason

These cases deliberately allow code to not have additional indentation, and it makes the usage of the language much better in some scenarios. Fortunately, the most of such cases normally appear inside other expressions/declarations, so the outer indentation block still help us have better recovery with the new rules.

Unfortunately, there're cases where the language design doesn't seem to be consistent, and some cases don't require additional indentation, despite code below often becomes impossible to be parsed (in some expected way or at all):

type T =
member this.Prop = 1 // a warning is produced (an error with the new rules)

type T with
member this.Prop = 1 // no warning

Not having indentation here makes it impossible to properly parse things below:

type T with
member this.Prop = 1

let x = 1 // error, this is considered a part of the type augmentation

Subsequently, it doesn't parse things as expected while typing too, even if you're going to have the proper indentation:

type T with

let x = 1 // errors here and below, until a member with proper indentation is typed above

There're other similar cases:

type T() =
    do
    () // no warning

    member this.Prop = 1 // error, this is parsed as a part of the `do` expression

And it breaks for unfinished code too, making the user experience much worse:

type T() =
    do

    // errors here and below
    // this is unsuccessfully parsed as a part of the `do` expression
    member this.Prop = 1

With all this considered, I propose that we normalize the indentation requirements where it makes sense:

for top level declarations indentation should be required, if the code below can't normally be parsed without it in the most of the cases (like in the type/with and do examples above)
the changed places should start producing warnings if the new indentation rules are disabled
deliberate undentation is going to still be allowed, because it's an important part of the language that people use
other cases can be discussed if found

We can discuss the particular cases where it makes sense to change the behaviour, and I'll try to start normalizing the requirements in dotnet/fsharp#15102 and other PRs.

auduchinok · 2023-06-08T11:17:11Z

There're also less critical inconsistent cases:

let _ =
  if true then
  () // warning

let _ =
  while true do
  () // no warning

Expressions like while are normally placed into other declarations/expressions, so there're already less cases where they break code below with the new rules enabled. An important exception is top-level expressions in modules, often used in scripts (and Program.fs in console apps). To improve things here we may want to normalize some of these cases too. It's possible and could be good to do, but makes the changes more breaking. They also don't fall into 'breaks too many things below' or 'deliberately designed and used' categories, so there should be more discussion about such cases.

dsyme added the approved-in-principle label May 30, 2023

auduchinok mentioned this issue Jun 7, 2023

LexFilter: rework wrong indentation recovery dotnet/fsharp#15334

Merged

auduchinok mentioned this issue Jun 26, 2023

LexFilter/strict: add compiler option dotnet/fsharp#15491

Merged

nojaf mentioned this issue Jun 29, 2023

Update FCS to 'Correct range of SynInterfaceImpl' fsprojects/fantomas#2919

Merged

auduchinok mentioned this issue Jul 19, 2023

Enable the new indentation rules dotnet/fsharp#15631

Closed

edgarfgp mentioned this issue Apr 6, 2024

Potential relaxation of indentation requirements for type extensions dotnet/fsharp#16988

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rework incorrect indentation recovery #1273

Rework incorrect indentation recovery #1273

auduchinok commented May 23, 2023 •

edited

Loading

vzarytovskii commented May 23, 2023

dsyme commented May 23, 2023 •

edited

Loading

baronfel commented May 23, 2023

dsyme commented May 24, 2023

auduchinok commented May 24, 2023 •

edited

Loading

dsyme commented May 24, 2023 •

edited

Loading

vzarytovskii commented May 24, 2023 •

edited

Loading

baronfel commented May 24, 2023 •

edited

Loading

vzarytovskii commented May 24, 2023

auduchinok commented May 24, 2023 •

edited

Loading

auduchinok commented May 25, 2023 •

edited

Loading

cartermp commented May 26, 2023

Tarmil commented May 27, 2023

vzarytovskii commented May 28, 2023

auduchinok commented May 30, 2023

auduchinok commented May 30, 2023

dsyme commented May 30, 2023

auduchinok commented May 30, 2023 •

edited

Loading

kerams commented May 30, 2023 •

edited

Loading

dsyme commented May 30, 2023

auduchinok commented May 30, 2023

Tarmil commented May 30, 2023

auduchinok commented Jun 8, 2023 •

edited

Loading

auduchinok commented Jun 8, 2023 •

edited

Loading

Rework incorrect indentation recovery #1273

Rework incorrect indentation recovery #1273

Comments

auduchinok commented May 23, 2023 • edited Loading

Pros and Cons

Extra information

Affidavit (please submit!)

For Readers

vzarytovskii commented May 23, 2023

dsyme commented May 23, 2023 • edited Loading

baronfel commented May 23, 2023

dsyme commented May 24, 2023

auduchinok commented May 24, 2023 • edited Loading

dsyme commented May 24, 2023 • edited Loading

vzarytovskii commented May 24, 2023 • edited Loading

baronfel commented May 24, 2023 • edited Loading

vzarytovskii commented May 24, 2023

auduchinok commented May 24, 2023 • edited Loading

auduchinok commented May 25, 2023 • edited Loading

cartermp commented May 26, 2023

Tarmil commented May 27, 2023

vzarytovskii commented May 28, 2023

auduchinok commented May 30, 2023

auduchinok commented May 30, 2023

dsyme commented May 30, 2023

auduchinok commented May 30, 2023 • edited Loading

kerams commented May 30, 2023 • edited Loading

dsyme commented May 30, 2023

auduchinok commented May 30, 2023

Tarmil commented May 30, 2023

auduchinok commented Jun 8, 2023 • edited Loading

auduchinok commented Jun 8, 2023 • edited Loading

auduchinok commented May 23, 2023 •

edited

Loading

dsyme commented May 23, 2023 •

edited

Loading

auduchinok commented May 24, 2023 •

edited

Loading

dsyme commented May 24, 2023 •

edited

Loading

vzarytovskii commented May 24, 2023 •

edited

Loading

baronfel commented May 24, 2023 •

edited

Loading

auduchinok commented May 24, 2023 •

edited

Loading

auduchinok commented May 25, 2023 •

edited

Loading

auduchinok commented May 30, 2023 •

edited

Loading

kerams commented May 30, 2023 •

edited

Loading

auduchinok commented Jun 8, 2023 •

edited

Loading

auduchinok commented Jun 8, 2023 •

edited

Loading