-
-
Notifications
You must be signed in to change notification settings - Fork 2.8k
Replace tree-sitter
bindings with tree-house
#12972
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
e5a1f9a
to
69b68df
Compare
This is fantastic!! It's great to have parameters retain their color even when the function definition goes out of the viewport. I just rebased #12768 on top of your PR and it works flawlessly. We don't have to inject into every argument anymore, positional injections work!! |
While writing the code for query precedence in the injection query I was assuming I was fixing niche bugs (and sometimes questioned neither it was worth writing that quite subtle code). Seeing an actual usecase for it so quickly is great! |
Thanks a lot for doing this! I’m going to have a look at whether I can use it in |
I've been testing this along with #12768 and found a reproducible crash This panic happens if:
To reproduce, add this at the end of ((macro_invocation
macro:
[
(scoped_identifier
name: (_) @_macro_name)
(identifier) @_macro_name
]
(token_tree . (_) . (_) . (string_literal) @injection.content))
(#any-of? @_macro_name
; std
"some_macro")
(#set! injection.language "rust")
(#set! injection.include-children)) Open a Rust project with this structure with cursor at /// ```
/// some_macro!((), (), "`rust` injection happens here");
/// ```
macro_rules! some_macro {
($a:expr, $b:expr, $c:literal) => {};
}
fn main() {
some_ma|
} When Rust analyzer auto-completes the
If you remove the But in that case, my PR is no longer able to inject the |
69b68df
to
ece473b
Compare
Looks like it was a bug in some of the precedence handling for injections and it could cause injection layers to get out of order. Fixed in helix-editor/tree-house@9e2063c |
ece473b
to
691c9bd
Compare
Found a segfault Open this Rust file, with cursor at the end of the comment, after the //! L
//! - `a`
struct A {}
impl A {
//
pub fn a() {}
} When you try to insert a //! L
//! - `a`
struct A {}
impl A {
///
pub fn a() {}
} Note: Removing the Tested with the latest version of this PR (does not happen on master) What is bizarre is that if you copy paste the final version of the file then there is no segfault, and you can write doc comments just fine I tried to run this with with |
Also, the It is fairly common not to have an At the moment, what I'm thinking is that there can be a Then we can use this Edit: I was able to implement this in #13116 |
W.r.t. the segfault: I pushed a workaround in helix-editor/tree-house@8b94086. What's happening is that adding a new doc comment to the root layer adds a new range to the markdown combined injection, so it starts out covering ranges This doesn't happen on master because incremental injections are smarter in tree-house. On master we don't pass the old parse tree because we don't recognize that the new injection layer is the same as the old one. Backtrace...
|
d04f940
to
2e79bb5
Compare
I think I saw this as well it seems like there is something up with the timeout, at least that's what the logs where saying for me I believe |
What's stopping us from moving to a newer tree-sitter version now? I'm dependent on a newer ABI (15) but unable to use since we're stuck with 0.22 treesitter version. However seeing this newer tree-house library only deepens the problem as bunch of PRs might still use the old APIs of helix. It's already painful to keep rebasing the non-merged PRs and seeing such 2k+ changes makes me question my ability on rebasing further... |
Newer versions of the |
Will directives |
So my issue seems a little more interesting(?) than a timeout. When I have |
Actually, just changing |
Here are a few extra injections to add for (fenced_code_block
(info_string
(language) @__language)
(code_fence_content) @injection.content
; list of attributes for Rust syntax highlighting:
; https://doc.rust-lang.org/rustdoc/write-documentation/documentation-tests.html?highlight=no_run#attributes
(#match? @__language
"(ignore|should_panic|no_run|compile_fail|standalone_crate|custom|edition*)")
(#set! injection.language "rust")
(#set! injection.include-unnamed-children)) Since at the moment there is no syntax highlighting for |
813f771
to
974caf0
Compare
I believe this PR will have an impact on Tree-sitter queries, so here my question: I am asking, because i'd be happy to help with some languages. |
This should actually only affect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's get this change merged, I've been running it for a month now with no issues. Helix facing changes look good to me and I've done a preliminary pass through the tree-house code.
for matched_node in m.matched_nodes() { | ||
let node_id = matched_node.node.id(); | ||
let capture = Some(matched_node.capture); | ||
let capture_type = if capture == query.indent_capture { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can't be a match clause?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly not - I wish you could do Elixir-like pinning in Rust matches like ^query.indent_capture => { .. }
This type also exists on `Editor`. This change brings it to the `Document` as well because the replacement for `Syntax` in the child commits will eliminate `Syntax`'s copy of `syn_loader`. `Syntax` will also be responsible for returning the highlighter and query iterators (which will borrow the loader), so the loader must be separated from that type. In the long run, when we make a larger refactor to have `Document::apply` be a function of the `Editor` instead of the `Document`, we will be able to drop this field on `Document` - it is currently only necessary for `Document::apply`. Once we make that refactor, we will be able to eliminate the surrounding `Arc` in `Arc<ArcSwap<syntax::Loader>>` and use the `ArcSwap` directly instead.
Co-authored-by: Nik Revenco <[email protected]>
Since this was merged shebangs for Bash no longer work. If the files begins with |
Ah yep I accidentally dropped some shebang-parsing code thinking it was redundant. Should be fixed now in 09bc67a |
I've noticed that if no highlight is defined for a scope, the color is |
Ah good catch, that was a regression in the syntax highlighter. Instead of using the theme's |
Ah yeah the Slint and HTML injections need to come after the generic |
This PR drops our dependency on the upstream
tree-sitter
Rust bindings and moves us to a new highlighter crate we've been working on:tree-house
.The main benefit of the new highlighter is that it fixes correctness issues, especially since the reverse query precedence PR. It solves longstanding issues #1151 and #6491, and updates us to the latest tree-sitter C source (v0.22 -> v0.25), which will become important as more grammars adopt the new ABI. It's also a slight performance boost in my benchmarking (5-10%). We've hoped to separate out our highlighting machinery for a while so that other applications could take advantage and the hope is that
tree-house
is the right way to expose it (\cc @hadronized).The new highlighter crate makes some abstractions clearer. The types to care about are already established in the
syntax
module:Loader
, an ECS-style holder of all languages (pub struct Language(pub u32)
). This stores the language configuration as well as any grammar and query information and allows looking up a language by file-type, language name, shebang, etc.. Instead of being stored onSyntax
this is now stored separately onDocument
and passed toSyntax
for updates. In the long run we can eventually store a singleArcSwap<syntax::Loader>
onEditor
instead of many clones ofArc<ArcSwap<syntax::Loader>>
.Syntax
, a wrapper of theSyntax
type fromtree-house
. It holds a tree of trees of the injection layers of a document and provides logarithmic lookups to determine an injection layer (including the layer's language, config, tree and queries) for a given byte range. It also has a query iterator that works across injections which can support indents, textobjects, etc. in the future.There are some differences compared to
master
that are less significant:helix_core::syntax::config
. This is not strictly necessary for this change but it seems like a good opportunity to move these largely unrelated types away from the syntax module.u32
s instead ofusize
. This causes some changes inhelix-core
and the commands modules but is otherwise an implementation detail.tree_sitter::Point
in the indents module has been replaced with byte indices. Generally we can avoid the point types and functions.syntax::generate_edits
for example now produces zero points.TreeCursor
type that acts over injection layers is built intotree-house
and is basically free to create - it doesn't require any extra sorting of layers. That type intree-house
uses theTreeCursor
type from the C library and so it should be more efficient.Highlight
type comes fromtree-house
now and is represented as a non-max u32 (enables the null pointer optimization forOption<Highlight>
).... and some other changes that are worth reviewing:
syntax::merge
in favor of a cursor-like API that matches the highlighter. See theOverlayHighlights
andOverlayHighlighter
types in the rewrittensyntax
module.cargo xtask query-check
can now point out unknown properties like(#set! priority ..)
and unknown predicates like(#has-ancestor? ..)
and has a nice visual output. This especially benefits the indent query which previously used runtime panics for the same effect, seeIndentQuery
.Example query compilation errors...
Fixes #1151
Fixes #3391
Fixes #6491