TOP #45

Open
ojwb opened this issue Dec 2, 2020 · 8 comments
Labels
bug Something isn't working syntax highlighting

Comments

@ojwb
Collaborator

ojwb commented Dec 2, 2020

TOP is a bit of an odd-ball keyword in that it doesn't actually have its own token but is actually tokenised as a TO token followed by a letter P.

The syntax highlighting doesn't seem to know about this quirk, e.g. this actually colours the P brown instead of blue:

PRINT TOP

You can't just colour any P after TO blue though, consider:

F.P=TOP TOP:P.P:N.

There the first TOP is TOP, but the second TOP is actually TO P.
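To make the quirk concrete, here is a minimal sketch of what the tokenised bytes look like. The only token value used is TO = 0xB8 (the value quoted later in this thread); `tokeniseTop` is a hypothetical helper that knows no other keyword, so it's an illustration rather than a real tokeniser:

```javascript
// Illustration only: the one token this sketch knows is TO (0xB8).
const TO_TOKEN = 0xb8;

// Hypothetical helper: replaces each "TO" with the TO token byte and
// copies every other character through as its character code.
function tokeniseTop(text) {
    const bytes = [];
    let i = 0;
    while (i < text.length) {
        if (text.startsWith("TO", i)) {
            bytes.push(TO_TOKEN);
            i += 2;
        } else {
            bytes.push(text.charCodeAt(i));
            i += 1;
        }
    }
    return bytes;
}

// "TOP" and "TO P" differ only by the space byte between TO and P.
console.log(tokeniseTop("TOP"));  // [0xB8, 0x50]
console.log(tokeniseTop("TO P")); // [0xB8, 0x20, 0x50]
```

So at the byte level nothing marks TOP as a keyword; whether TO followed by P means TOP or TO P depends on where it appears.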

@mattgodbolt
Owner

Oh good grief! I can't believe I'm today years old and I learn TOP isn't real!

@mattgodbolt mattgodbolt added bug Something isn't working syntax highlighting labels Dec 2, 2020
@ojwb
Collaborator Author

ojwb commented Dec 2, 2020

The TOP is a lie!

I don't think there are any other pseudo-tokens to worry about at least.

I've long wondered (not very actively admittedly - it doesn't often keep me awake at night) why there isn't a token for TOP - several were spare in BASIC I and &CE was still spare in BASIC II (that became EDIT in later versions, which then had to get more inventive to have tokens beyond that). Maybe it was just easier to have context-sensitive handling at runtime rather than tokenisation time.

Golf tip: If you want to use TOP and you're pre-tokenising, you can save a byte by using LOMEM instead as that is a real token and the two have the same value unless you set them.

@mattgodbolt
Owner

So good :) thanks @ojwb

@mattgodbolt
Owner

Well, this one's a tricky one and no mistaking. The tokeniser is much too simple to know when to coalesce the tokens: it's necessarily not a full parser (and it's based off what the underlying editor provides us).

Looks like this would need us to "know" we're in a FOR and we're expecting TO, but there must be a simpler way!

@mattgodbolt
Owner

Note for self: this test needs to pass:

    it("should handle TOP properly", () => {
        // See #45
        checkTokens(
            ["F.P=TOP TOP"],
            [
                {offset: 0, type: "keyword"}, // FOR
                {offset: 2, type: "variable"}, // P
                {offset: 3, type: "operator"}, // =
                {offset: 4, type: "keyword"}, // = TOP
                {offset: 5, type: "white"}, // space
                {offset: 6, type: "keyword"}, // TO
                {offset: 8, type: "variable"}, // P
            ]
        );
    });

@ojwb
Collaborator Author

ojwb commented Dec 14, 2020

I've wondered if the editor should use the tokenised form as its internal representation, though if the syntax highlighting can only change the styling of the text we couldn't just make the text be the tokenised BASIC as we'd need to show something different to the text (e.g. display "TO" for character \xB8). So the whole idea might be a non-starter.

That wouldn't directly solve the TOP vs TO+P issue, but I think would make it simpler to determine if we're in a place where TOP is TO+P.

Perhaps there's a neat trick to get this right though - I'll see if I can come up with anything.

@mattgodbolt
Owner

mattgodbolt commented Dec 14, 2020 via email

@ojwb
Collaborator Author

ojwb commented Jan 21, 2025

> Note for self: this test needs to pass:

Some of the offsets there are ... offset - they should be:

    it("should handle TOP properly", () => {
        // See #45
        checkTokens(
            ["F.P=TOP TOP"],
            [
                {offset: 0, type: "keyword"}, // FOR
                {offset: 2, type: "variable"}, // P
                {offset: 3, type: "operator"}, // =
                {offset: 4, type: "keyword"}, // TOP
                {offset: 7, type: "white"}, // space
                {offset: 8, type: "keyword"}, // TO
                {offset: 10, type: "variable"}, // P
            ]
        );
    });

I had a little look. It seems this would need more states in the highlighting, but doing it properly seems to require the expression parsing which BASIC presumably does at runtime: after the = we need to skip over a complete expression before we treat TO followed by P as TO P rather than TOP. Consider e.g.:

FORP=TOP-TOP TOP+TOP/TOP

After = we need to parse an expression (in which TO followed by P is TOP), i.e. TOP-TOP, and then we're in the "in a FOR loop expecting TO" state. After we parse a TO we need to be back in a state where TO followed by P is TOP for the rest of the line.

Perhaps we can cheat and in a FOR statement handle TO right after a variable or pseudo-variable token or constant or closing parenthesis, or something like that. We might not even have to take nesting of parentheses into account. I haven't spotted a counterexample that this trick fails with yet at least.
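A rough sketch of that cheat, to check it against the examples in this thread. Everything here is hypothetical: `classifyTo`, the state names and the very crude operand regex are assumptions for illustration and are not the editor's actual highlighting code. It treats TO as the keyword only while a FOR statement is still waiting for its TO and the previous token was a variable, pseudo-variable, constant or closing parenthesis:

```javascript
// Sketch only: reports each "TO" in a line as either the pseudo-variable
// TOP or the keyword TO, using the "TO right after an operand" heuristic.
function classifyTo(line) {
    const results = [];        // e.g. [{offset: 4, kind: "TOP"}, ...]
    let state = "none";        // "none" | "forVar" | "expectTo" | "afterTo"
    let prevIsOperand = false; // previous token was an operand or ")"
    let i = 0;
    while (i < line.length) {
        const rest = line.slice(i);
        let m;
        if (rest[0] === " ") {
            i += 1; // spaces leave the state untouched
        } else if ((m = /^(FOR|F\.)/.exec(rest))) {
            state = "forVar"; // now reading the loop variable
            prevIsOperand = false;
            i += m[0].length;
        } else if (rest.startsWith("TO")) {
            const wantTo = state === "expectTo" && prevIsOperand;
            if (rest.startsWith("TOP") && !wantTo) {
                results.push({offset: i, kind: "TOP"});
                prevIsOperand = true; // TOP is itself an operand
                i += 3;
            } else {
                results.push({offset: i, kind: "TO"});
                if (state === "expectTo") state = "afterTo";
                prevIsOperand = false;
                i += 2;
            }
        } else if ((m = /^([A-Za-z]\w*[%$]?|\d+)/.exec(rest))) {
            prevIsOperand = true; // crude match for a variable or constant
            i += m[0].length;
        } else {
            const ch = rest[0];
            if (ch === ":") state = "none"; // end of statement
            else if (ch === "=" && state === "forVar") state = "expectTo";
            prevIsOperand = ch === ")";
            i += 1;
        }
    }
    return results;
}

// Kinds found, in order, for the two examples from this issue:
console.log(classifyTo("F.P=TOP TOP").map((r) => r.kind));
// TOP, TO
console.log(classifyTo("FORP=TOP-TOP TOP+TOP/TOP").map((r) => r.kind));
// TOP, TOP, TO, TOP, TOP
```

This ignores parenthesis nesting, strings and most keywords, so it only shows that the heuristic holds for the cases above, not that it is free of counterexamples.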


2 participants