Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic scope handler interfaces #1031

Merged
merged 72 commits into from
Oct 23, 2022

Conversation

AndreasArvidsson
Copy link
Member

@AndreasArvidsson AndreasArvidsson commented Oct 12, 2022

Checklist

  • Add "identifier" scope type
  • Switch "word" and "char" to use this new setup, with "identifier" as their parent scope type
  • Add test for "take next word" when cursor is in the whitespace between two tokens, eg "foo | barBaz"
  • Add test for "one token backward"
  • Link issues that this PR closes and clean up related issues
  • Remove old modifier stages
  • Figure our grand scopes. Just pass ancestorIndex to getScopesTouchingPosition
  • I have added tests
  • I have updated the docs and cheatsheet

@pokey
Copy link
Member

pokey commented Oct 12, 2022

Ok super helpful to see the code written out; great call to just bang it out.

Here's what I'm thinking. Scope type exposes two functions:

  • A function that takes a position and returns containing iteration range, just as a range
  • A function that takes an iteration range and returns a list of scopes

I think it would look as follows

interface ScopeDefinition {
   getIterationRange(editor: TextEditor, position: Position): Range;
   getScopesInIterationRange(editor: TextEditor, range: Range): Scope[];
}

interface Scope {
   domain: Range;
   getTarget(isBackward: boolean): Target;
   getPriority?(): number;
}

Then the base class / intermediary has an interface as follows:

/** For use as return type of {@link ScopeHandler.getTargetsForRange} */
interface IterationScope {
   targets: Target[]
   containingStartIndex: number;
   containingEndIndex: number;
}

class ScopeHandler {
   getTargetsForRange(editor: TextEditor, range: Range, isBackward: boolean): IterationScope {
      const startIterationRange = self.scopeDefinition.getIterationRange(editor, range.start)
      const iterationRange = contains(startIterationRange, range.end) ? startIterationRange : startIterationRange.union(self.scopeDefinition.getIterationRange(editor, range.end)

      const scopes = self.scopeDefinition.getScopesInRange(editor, iterationRange)

      // Then compute containing scope by intersection with domain for non-empty range
      // For empty range, pick weakly containing domain, breaking ties using `scope.getPriority()` if it exists, else prefer right
      // Then call `getTarget()` on the ones we want to return for `targets`
   }
}

Then

  • containing scope modifier stage is implemented by grabbing the iterationScope.containingIndices and constructing range target
  • every scope modifier stage is implemented by grabbing the iterationScope.targets and just doing an intersection internally if hasExplicitRange
  • ordinal scope modifier stage is implemented by calling the every stage, like we do today. I think it is actually good that there is a hard dependency here, because we want every and ordinal to behave identically
  • relative scope modifier stage is implemented by looking at containing scope and list of targets and doing its magic from there

Note that the only place that we ever refer to hasExplicitRange is in the every scope modifier stage

wdyt?

@pokey
Copy link
Member

pokey commented Oct 12, 2022

One other question is how we handle "every round". I guess maybe we just have it return an empty range for getIterationRange? I'm not sure we want to do any expansion

Then for getIterationScope, if any pairs start or end inside the iteration range, it returns all top-level pairs that it it intersects with. Top-level here means not contained by any other pair intersecting with iteration range.

If no pairs start or end inside the iteration range, we return the smallest pair containing the iteration range, or maybe just error

So I think this works within the api I defined above; just wanted to write it out to see

I guess one question is how "next round" works, because I believe this algorithm won't return enough for it to see the next one. I wonder if pairs want special treatment for next 🤔. Might be better for them to be able to walk forward and figure it out themselves. Ie go until they hit an opening paren and then take that round.

Maybe "every" isn't really the right abstraction to use for implementing "next" 🤔

Eg it's slightly unfortunate that "next token" breaks when it hits a line boundary

@pokey
Copy link
Member

pokey commented Oct 12, 2022

For functions, we expand to class in getIterationRange

In getScopesInRange, I think we just check that the range is just one class, and return its functions if so? Otherwise error? Idk

@pokey
Copy link
Member

pokey commented Oct 18, 2022

Ok what should the following do?

aaa bbb|. ccc

Cursor is the |; saying "two tokens"? Should it refer to bbb. or . ccc? I think probably the consistent thing is bbb., because it's two tokens starting from "token". One could also argue it should always just look to the right, though that breaks down when the cursor is in the middle of a token, which should obviously be that token and the following.

Unfortunately, I don't think it's as simple as containing and then adding as many scopes as necessary. For example:

{
    aaa {bbb} ccc {ddd} eee
    {fff}
}

In this case, if the user says "three curlies line air", I think it should be {bbb} past {fff}, but if we treat "<number> <scope>s" as first expanding to containing, it will expand to the big curly containing everything and then have nothing to add

So my thinking is as follows:

If the input target is empty, then index 0 should refer to containing scope. If the input target is not empty, then index 0 should refer to all intersecting scopes. Wdyt?

@pokey
Copy link
Member

pokey commented Oct 20, 2022

fwiw this is what it looks like to add a scope type now: 0818816 cc/ @AndreasArvidsson

Copy link
Member

@pokey pokey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I think it's good to go!

Comment on lines +59 to +60
@mod.capture(rule="<user.cursorless_scope_type> {user.cursorless_backward_modifier}")
def cursorless_relative_scope_one_backward(m) -> dict[str, Any]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I introduced this one because otherwise you need to say "one tokens backward", which is kinda awkward 😅. Now you just say "token backward". Cheatsheet updated


export default class CharacterScopeHandler extends NestedScopeHandler {
public readonly scopeType = { type: "character" } as const;
public readonly iterationScopeType = { type: "token" } as const;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that I made this token rather than identifier. Having it be identifier broke "every char block" because it would skip over things like .. I think token is a sensible iteration scope anyway: the user can always say "first char identifier"

Comment on lines +38 to +41
constructor(
public readonly scopeType: ScopeType,
protected languageId: string
) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to keep these args for consistency, but note that many nested scope types will just reuse this constructor, so won't need to define anything. See eg IdentifierScopeHandler, CharacterScopeHandler, etc

Comment on lines +18 to +19
// FIXME: Switch to using getMatchesInRange once we are able to properly
// mock away vscode for the unit tests in subtoken.test.ts
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a bit annoying, but I wasn't prepared to shove the graph through all the functions necessary to get it to this scope handler for this PR and then add proper VSCode mocking for it. But I think we need to do that sooner rather than later so that we can have unit tests on these handlers rather than having to choose between full end-to-end or having to factor out functions from our handlers so that they can be tested

@@ -0,0 +1,8 @@
export * from "./NestedScopeHandler";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: I forgot to add something to this file and did a direct import instead, and sure enough hit a circular dependency problem. Looks like we do in fact need this file

We should look into a simple plop setup to automate adding these things

@@ -1,204 +0,0 @@
import { Range, TextEditor } from "vscode";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🪦🎉

@@ -0,0 +1,30 @@
languageId: plaintext
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice that word can cross both token and line boundaries now

@@ -0,0 +1,35 @@
languageId: plaintext
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This used to be one of your word tokeniser unit tests. I made it into an end-to-end test because it now relies on interactions between word tokenizer and identifier tokenizer, so was easier to just test this way. All your other word tokenizer tests were within an identifier boundary so easier to run as unit tests

{
input: "apiV1 api_v_1",
expectedOutput: ["api", "V", "1", "api", "v", "1"],
},
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above; this one is now an end-to-end test

@pokey pokey marked this pull request as ready for review October 21, 2022 17:02
@pokey
Copy link
Member

pokey commented Oct 21, 2022

Filed #1052 to track migrating remaining scope types

@AndreasArvidsson AndreasArvidsson merged commit a4cee96 into main Oct 23, 2022
@AndreasArvidsson AndreasArvidsson deleted the generic-scope-handler-interfaces branch October 23, 2022 09:56
cursorless-bot pushed a commit that referenced this pull request Oct 23, 2022
* Original scope handlers

* More stuff

* Use proper token stage in sub token stage

* Work around for identifier matcher

* Use new scope handler in relative scope stage

* Use new scope handlers in ordinal scope stage

* Update usage of containing indices

* Rename

* Refactored create target

* Clean up

* clean up

* Add some tests that should pass

* Add a bunch more tests

* Attempt at different approach to scope handlers

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Initial implementation of new idea

* More work on this stuff

* Rename and add some todos

* More stuff

* Tweaks

* Renames and docstrings

* Restructuring

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* More jsdocs

* Tweaks

* Test fixes

* Fix error messages

* Revert `OrdinalScopeStage`

* Tweak

* jsdocs; fix import

* Don't export legacy types

* Fix import

* Preparation for surrounding pairs

* Naming cleanup

* Unify `getLegacyScopeStage` functions

* Lots of cleanup

* Fix regex `lastIndex` issue

* More cleanup

* More cleanup

* More cleanup

* Add `ancestorIndex` in prepartion for #124

* Add more jsdocs

* More docs

* More docs

* docstrings

* Improve hierarchical error type

* Docs

* More minor dog tweaks

* More docs

* More docs

* Doc strings and a couple tests

* Remove `isPreferredOver`

* Support `ancestorIndex` on api surface

* Improved jsdocs

* Split and cleanup relative stages

* Make scope handler constructor args optional

* More legacy type fixes

* Add `"identifier"` scope

* Working new code

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Docs + cheatsheet

* Update docs

* Add jsdoc

* jsdoc

* JSDocs

* doc tweaks

* reflow

* Tweaks

Co-authored-by: Pokey Rule <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@pokey pokey restored the generic-scope-handler-interfaces branch October 27, 2022 18:03
@pokey pokey deleted the generic-scope-handler-interfaces branch October 27, 2022 18:06
thetomcraig-aya pushed a commit to thetomcraig/cursorless that referenced this pull request Mar 27, 2024
* Original scope handlers

* More stuff

* Use proper token stage in sub token stage

* Work around for identifier matcher

* Use new scope handler in relative scope stage

* Use new scope handlers in ordinal scope stage

* Update usage of containing indices

* Rename

* Refactored create target

* Clean up

* clean up

* Add some tests that should pass

* Add a bunch more tests

* Attempt at different approach to scope handlers

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Initial implementation of new idea

* More work on this stuff

* Rename and add some todos

* More stuff

* Tweaks

* Renames and docstrings

* Restructuring

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* More jsdocs

* Tweaks

* Test fixes

* Fix error messages

* Revert `OrdinalScopeStage`

* Tweak

* jsdocs; fix import

* Don't export legacy types

* Fix import

* Preparation for surrounding pairs

* Naming cleanup

* Unify `getLegacyScopeStage` functions

* Lots of cleanup

* Fix regex `lastIndex` issue

* More cleanup

* More cleanup

* More cleanup

* Add `ancestorIndex` in prepartion for cursorless-dev#124

* Add more jsdocs

* More docs

* More docs

* docstrings

* Improve hierarchical error type

* Docs

* More minor dog tweaks

* More docs

* More docs

* Doc strings and a couple tests

* Remove `isPreferredOver`

* Support `ancestorIndex` on api surface

* Improved jsdocs

* Split and cleanup relative stages

* Make scope handler constructor args optional

* More legacy type fixes

* Add `"identifier"` scope

* Working new code

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Docs + cheatsheet

* Update docs

* Add jsdoc

* jsdoc

* JSDocs

* doc tweaks

* reflow

* Tweaks

Co-authored-by: Pokey Rule <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement generic "every" logic Update scope type modifier to expand on both ends
2 participants