Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposed exports #16

Open
willcrichton opened this issue May 18, 2022 · 4 comments
Open

Proposed exports #16

willcrichton opened this issue May 18, 2022 · 4 comments

Comments

@willcrichton
Copy link
Contributor

willcrichton commented May 18, 2022

For Nota, I have implemented a new syntax as a Markdown extension using the interface in @lezer-parser/markdown: https://github.com/nota-lang/nota/blob/markdown/packages/nota-syntax/lib/parse.ts

Everything worked quite well -- great job on the flexible API design! But I ran into a few cases where I needed functionality that's currently private to markdown.ts. I'd like to show what those cases are, and in turn propose they be exported.

  1. Type should be exported from index.ts: I need Type in order to post-process the Markdown AST. However, although Type is exported from markdown.ts, it's not accessible through the module root. My bundler can't seem to resolve import "@lezer/markdown/dist/markdown" so it's not easy for me to import it from the file directly.
  2. BlockResult should be exported: this type appears in a public interface (BlockParser.parse), so it would be nice to name it directly.
  3. InlineContext constructor should be exported: similar to the problems described in Line objects should maybe have a char() function equivalent to InlineContext.char() #6, I had issues dealing with relative vs. global indexes inside block parsers. My solution was to just construct an InlineContext for a given line, use absolute indexes everywhere, and then let the context methods handle the translation to relative indexes for me. But this strategy requires access to the InlineContext constructor, which isn't exported.

If you're ok with any of these, let me know and I will put up a pull request.

@marijnh
Copy link
Contributor

marijnh commented May 18, 2022

Since Type only covers a part of the node types in an extended parser, and the library provides separate vocabulary for defining node types, I intentionally kept that enum out of the public interface. You can construct something like it, for a specific parser, by iterating through parser.nodeSet.types and building up an object. Or, if you want to refer to node types statically, use their names. Does that help?

BlockResult should be exported:

It's boolean | null, which is only a few characters more than the name. What is gained by exporting this?

My solution was to just construct an InlineContext for a given line

There's a bunch of other internal stuff that would probably have to be exposed for this (such as resolveMarkers) and I'm not sure I want to complicate the public interface for this rather specific use. What kind of block are you defining here, and where do the problems with relative/absolute indices come from?

@willcrichton
Copy link
Contributor Author

willcrichton commented May 18, 2022

You can construct something like it, for a specific parser, by iterating through parser.nodeSet.types and building up an object.

I can just do this if you want to keep Type private. I already do this to extract the numeric type ids for my Markdown extension.

It's boolean | null, which is only a few characters more than the name. What is gained by exporting this?

IMO it's just a bit of a surprising type, since I would normally expect this to be an enum with names for each case. The type name conveys that it's very specific to the block parsing context. Not a huge deal though.

There's a bunch of other internal stuff that would probably have to be exposed for this (such as resolveMarkers)

In my use case at least, I was able to stick to the current public interface with the exception of the constructor.

What kind of block are you defining here, and where do the problems with relative/absolute indices come from?

For example, I have a composite block that has a "header" on the first line followed by indented blocks. Such as:

@div[class: "foo"]:
  **content!**

Then I have code like this that needs to parse through the header line, which involves many lookups to line.text. I try to only use absolute positions everywhere to be consistent and avoid confusion. However the Line structure does not expose methods to lookup substrings or characters based on absolute positions (exactly the issue in #6).

An alternative here would be a method like Line.toInlineContext(cx: BlockContext): InlineContext that abstracts the details of the constructor.

@marijnh
Copy link
Contributor

marijnh commented May 20, 2022

The code you linked seems to have already moved in the meantime.

In any case, I can see the use for an abstraction to help here, but simply exporting the InlineContext constructor feels dodgy—that does a bunch of things that don't make sense for a task like this. Maybe we should split off a superclass of InlineContext that just does the position-offsetting and element-building, and export that instead? Not too sure about the name, though. Maybe InputWindow or something.

@acnebs
Copy link
Contributor

acnebs commented Nov 23, 2023

I actually would also like to see the Type enum exported. When dealing with the syntax tree, I find that I frequently am checking if a certain node is a certain type. Although yes I can clearly do node.type.name === 'StrongEmphasis', I think it would be much nicer from a dev perspective to be able to do node.type.name === MarkdownType.StrongEmphasis.

In general, I find enums much better to use in this case, as they are a better guarantee of forwards compatibility and they improve discoverability. Although of course I can just make my own version of the enum, this enum would become outdated if a new node type was ever added to lezer-parser/markdown. Although this is highly unlikely in this particular case (as the Markdown spec is fairly stable at this point), recreating an enum from a library is quite code-smelly. I'm also not sure I understand the reticence – what is lost by making this an export from the lib? Sure, the enum is not "complete" in an extended parser, but I can make it complete with:

import { Type as LezerMarkdownType } from '@lezer/markdown'

type ExtendedMarkdownType = MyCustomMarkdownNodeType | LezerMarkdownType;

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants