Skip to content

Remove tree-sitter dependency from tree-sitter-generate #4117

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

bbb651
Copy link

@bbb651 bbb651 commented Jan 12, 2025

It was only used to share two constants, and balloons it's dependencies.

This also makes generate_parser_for_grammar work in wasm. (Tested in wasm32-wasip2 in wasmtime with the json grammar, wasm32-unknown-unknown running in the same setup exited successfully so I'm pretty confident it works as well).

I wasn't sure what to do with parser.h, I could've include_str!ed it outside the crate, or symlinked it, but I didn't find a conscious if this is ok to do, and apparently cargo publish warns against the first one, so I ended up copying it like alloc.h and array.h (I'm aware these aren't exact copies like parser.h). The other constant was LANGUAGE_VERSION, ABI_VERSION_MIN and ABI_VERSION_WITH_RESERVED_WORDS need to be updated anyway so it's not really a problem to manually update it too.

Copy link
Member

@amaanq amaanq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not have parser.h live in two spots, this makes it very easy for them to be out of sync. The main reasons being this file is a lot more likely to be updated compared to alloc.h/array.h, and because those other files are slightly different due to differences in parser code vs the core library.

I also think include_str wouldn't work when publishing.

@bbb651 bbb651 force-pushed the generate-without-tree-sitter-dep branch from dad7b1e to a15a868 Compare January 13, 2025 00:20
@bbb651
Copy link
Author

bbb651 commented Jan 13, 2025

I changed it to a symlink, cargo publish --dry-run works so I think it's fine

@amaanq
Copy link
Member

amaanq commented Jan 13, 2025

Symlinks break windows though..

@ObserverOfTime
Copy link
Member

Could it be copied on build time through build.rs?

@amaanq
Copy link
Member

amaanq commented Jan 13, 2025

Could it be copied on build time through build.rs?

Pretty sure you can't access files outside the crate root when publishing, but it works for local builds/installs

@maxbrunsfeld
Copy link
Contributor

Seems like this is adding maintenance complexity. Why is this change needed?

@bbb651
Copy link
Author

bbb651 commented Jan 13, 2025

For me the main motivation is experimenting with trading off size for performance in a non-web wasm environment which at least currently cannot compile tree-sitter (the approach I'm currently trying is templating a wasm binary directly with wasm-encoder, the main downside being no support for external scanners), but I believe most usecases for tree-sitter-generate explicitly don't require tree-sitter itself, e.g. a CI action to build parsers or a build tool for websites that use the js bindings.
Edit: I found https://github.com/hydro-project/rust-sitter, that another cool use case

If this is too much of a hassle I can just use a fork, but it seems like 90% of the work to separate tree-sitter-generate was already done, so it's a bit unfortunate to leave it at this state.

@amaanq
Copy link
Member

amaanq commented Jan 18, 2025

I think it makes sense, and the current solution could be to use a build script for local development/installation, and in CI, before we publish our crates, ensure we run this build script beforehand (I would think cargo check is enough to copy the file over). Are you up for doing that @bbb651?

@amaanq
Copy link
Member

amaanq commented Jan 25, 2025

ping @bbb651?

It was only used to share two constants, and balloons it's dependencies.

This also makes `generate_parser_for_grammar` work in wasm.
(Tested in `wasm32-wasip2` in wasmtime with the json grammar,
`wasm32-unknown-unknown` running in the same setup exited successfully
so I'm pretty confident it works as well)
@bbb651 bbb651 force-pushed the generate-without-tree-sitter-dep branch from a15a868 to ce351b3 Compare January 27, 2025 10:29
@bbb651
Copy link
Author

bbb651 commented Jan 27, 2025

After a bunch of trail and error I landed on a solution that passes cargo publish --dry-run, this needs testing but I'm pretty sure no changes to CI are necessary because that already triggers the build script.
At first I copied it to src/templates and added a .gitignore but turns out you're not supposed to do that, so I changed it to output to OUT_DIR and include_str!ed that instead.

Also I used a small hack to get the workspace root in a consistent way, that also simplifies read_git_sha.

@bbb651 bbb651 force-pushed the generate-without-tree-sitter-dep branch from c3db6d6 to a6f8e0f Compare January 27, 2025 12:26
@bbb651
Copy link
Author

bbb651 commented Jan 27, 2025

Sorry about that, I resolved the conflicts in the web editor and didn't get the formatting right

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants