should "a" be represented by dj.substring(0, 1) - or (0,0) ? #26

Open
FrankFischer opened this issue Jan 7, 2023 · 3 comments

@FrankFischer

If you parse the Djot string "a"
the following events will be produced:

[{ startpos: 0, endpos: 0, annot: "+para" }
,{ startpos: 0, endpos: 1, annot: "str" }
,{ startpos: 2, endpos: 2, annot: "-para" }]
  • "+para" 'is' the first char of "ab"
    (same startpos and endpos as 'a')

  • "-para" 'is' the char after "ab"
    (the char at offset 2) - but this
    char does not exist!


Even if this 'works' in an implementation - a more
concise and clearer concept should be considered:

[{ startpos: 0, endpos: 0, annot: "+para" }
,{ startpos: 0, endpos: 2, annot: "str" }
,{ startpos: 2, endpos: 2, annot: "-para" }]
  • startpos would be 'at' the start of the
    first char (the point before "ab")
  • and endpos at the start of the char following
    the last char that should be included
    (the point after "ab")
  • and "str" would be the chars between these two
    points

As far as I know, Java, JavaScript, Scala and
many other programming languages use this
concept for their substring functions.
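
For illustration, here is a minimal sketch of the proposed convention in JavaScript. The event list is copied from the proposal above; the helper name `textOfHalfOpen` is hypothetical and not part of any Djot implementation.

```javascript
// Proposed convention: startpos is included, endpos is excluded.
const src = "ab";

// Events as written in the proposal (not actual parser output).
const events = [
  { startpos: 0, endpos: 0, annot: "+para" },
  { startpos: 0, endpos: 2, annot: "str" },
  { startpos: 2, endpos: 2, annot: "-para" },
];

// Hypothetical helper: with an exclusive end offset, the event text
// is a plain substring call with no adjustment.
function textOfHalfOpen(source, ev) {
  return source.substring(ev.startpos, ev.endpos);
}

const texts = events.map((ev) => textOfHalfOpen(src, ev));
// texts is ["", "ab", ""]: the "+para" and "-para" markers cover no
// characters, and "str" covers exactly "ab".
console.log(texts);
```

Under this reading the length of a range is simply `endpos - startpos`, and an empty range is one where both offsets are equal.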


In my opinion this might be the better
way in the long run.

Frank

@FrankFischer
Author

If you parse the Djot string "ab" ...
was what I wanted to say.

@jgm
Owner

jgm commented Jan 7, 2023

> (the char at offset 2) - but this
> char does not exist!

Well it does: it's a \n (newline) character.
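
A small sketch of how the current events read under that interpretation, assuming the parser input was "ab" followed by a newline (the original report does not show the trailing newline, so this is an assumption). The helper name `textOfInclusive` is hypothetical.

```javascript
// Assumption: the parsed input is "ab" plus a trailing newline.
const input = "ab\n";

// Events as reported at the top of this issue (inclusive end offsets).
const currentEvents = [
  { startpos: 0, endpos: 0, annot: "+para" },
  { startpos: 0, endpos: 1, annot: "str" },
  { startpos: 2, endpos: 2, annot: "-para" },
];

// Hypothetical helper: endpos is the offset of the last character,
// so add 1 to get the exclusive end that substring expects.
function textOfInclusive(source, ev) {
  return source.substring(ev.startpos, ev.endpos + 1);
}

for (const ev of currentEvents) {
  // JSON.stringify makes the newline visible as \n in the output.
  console.log(ev.annot, JSON.stringify(textOfInclusive(input, ev)));
}
```

With this input, "+para" covers "a", "str" covers "ab", and "-para" covers the newline at offset 2.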

@jgm
Owner

jgm commented Jan 7, 2023

I'm not really sure what is best. In fact, there are three ways we could go:

  1. Current system with the offset of the first character and the offset of the last character
  2. Your way, with the offset of the first character and the offset of the character after the last character
  3. Offset of first character plus length

Since all the code currently implements 1, we'd need strong reasons to change from that.
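
To make the comparison concrete, the three options can be converted into one another mechanically. The following is only a sketch; the function names and the `length` field are hypothetical and do not come from the Djot code base.

```javascript
// Option 1 (inclusive):  { startpos, endpos }  endpos = offset of last char
// Option 2 (half-open):  { startpos, endpos }  endpos = offset after last char
// Option 3 (length):     { startpos, length }

function inclusiveToHalfOpen(r) {
  return { startpos: r.startpos, endpos: r.endpos + 1 };
}

function halfOpenToInclusive(r) {
  // An empty half-open range would yield endpos < startpos here,
  // because option 1 cannot describe a range of zero characters.
  return { startpos: r.startpos, endpos: r.endpos - 1 };
}

function halfOpenToLength(r) {
  return { startpos: r.startpos, length: r.endpos - r.startpos };
}

function lengthToHalfOpen(r) {
  return { startpos: r.startpos, endpos: r.startpos + r.length };
}

// The "str" event for "ab" in each representation.
const inclusive = { startpos: 0, endpos: 1 };
const halfOpen = inclusiveToHalfOpen(inclusive);
const withLength = halfOpenToLength(halfOpen);
console.log(inclusive, halfOpen, withLength);
```

One practical difference: options 2 and 3 can express a range covering zero characters, while option 1 always covers at least one.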
