Chokes on special characters #1

Open
Zamiell opened this issue Aug 28, 2021 · 9 comments

Comments

@Zamiell

Zamiell commented Aug 28, 2021

Hello, and thanks for the excellent library.

vdf-parser fails to parse one of my user's localconfig.vdf.
It complains "invalid syntax on line 208".
Line 208 is equal to this:

"3"		"DMR ALİ"

So my guess is that vdf-parser is simply choking on the special characters.
Any chance for a fix?

@Zamiell
Author

Zamiell commented Aug 28, 2021

I glanced at the code, and I guess it's just because this regex doesn't account for non-English characters:
https://github.com/p0358/vdf-parser/blob/master/main.js#L35-L45

I think the solution is to use ES2018 unicode property escapes.
This StackOverflow answer shows how: https://stackoverflow.com/a/48902765/1062714
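
For illustration, here is a minimal sketch of what Unicode property escapes change. Both patterns below are hypothetical and are not the regexes used in vdf-parser's main.js; they only contrast an ASCII-only character class with one built on `\p{L}` and `\p{N}` (which requires the `u` flag).

```javascript
// Hypothetical patterns for illustration only; NOT taken from vdf-parser.
// ASCII-only token: letters, digits and underscore.
const asciiOnlyToken = /^[A-Za-z0-9_]+$/;
// Same idea with ES2018 Unicode property escapes (note the `u` flag).
const unicodeToken = /^[\p{L}\p{N}_]+$/u;

const samples = ['ALI', 'ALİ', '무슨', 'gęślą'];
for (const s of samples) {
  // Prints the sample, then whether each pattern accepts it.
  console.log(s, asciiOnlyToken.test(s), unicodeToken.test(s));
}
```

Only 'ALI' passes the ASCII-only pattern, while all four samples pass the Unicode-aware one. Whether this is relevant depends on which rule in the parser actually handles the failing line.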

@p0358
Owner

p0358 commented Aug 28, 2021

I'm looking into it, but so far I think the issue is different. In quoted values there can theoretically be anything inside (qval); only unquoted ones have limitations applied to their contents (val), though I'll also check whether that needs to be adjusted.

I did try to reproduce the issue, but couldn't with either inline or from-file reads:
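
To make the quoted/unquoted distinction concrete, here is a rough sketch. The two patterns and the `readToken` helper are assumptions made up for this example, not the actual qval/val rules from main.js: the quoted pattern accepts anything up to the closing quote, while the unquoted one is limited to a small ASCII set.

```javascript
// Hypothetical token patterns; assumptions for illustration, not copied from vdf-parser.
// Quoted value: any characters (including escapes) up to the closing double quote.
const quoted = /^"((?:\\.|[^\\"])*)"/;
// Unquoted value: a restricted ASCII character set (assumed for this sketch).
const unquoted = /^([A-Za-z0-9_.-]+)/;

// Hypothetical helper: read one token from the start of the input.
function readToken(input) {
  const q = quoted.exec(input);
  if (q) return { kind: 'quoted', text: q[1] };
  const u = unquoted.exec(input);
  if (u) return { kind: 'unquoted', text: u[1] };
  return null; // nothing matched: a parser would report a syntax error here
}

console.log(readToken('"DMR ALİ"')); // quoted: the whole string is accepted
console.log(readToken('ALİ'));       // unquoted: stops right before the İ
console.log(readToken('İ'));         // null
```

Under these assumptions a quoted "DMR ALİ" is read in full, while an unquoted ALİ is cut off at the first character outside the ASCII set.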

Inline

let a = VDF.parse('"3"		"DMR ALİ"');
console.log(a);
console.log(VDF.stringify(a));

Result:

{ '3': 'DMR ALİ' }
"3" "DMR ALİ"

From file

a.vdf (UTF-8):

test
{
    "3"		"DMR ALİ"
    "4"		"무슨 일이"
    "5"		"發生了什麼"
    "6"		"Zażółć gęślą jaźń"
    "7"		"ماذا يحدث"
}

import * as fs from 'fs';
let a = VDF.parse(fs.readFileSync("./a.vdf").toString());
console.log(a);
console.log(VDF.stringify(a));

Result:

{
  test: {
    '3': 'DMR ALİ',
    '4': '무슨 일이',
    '5': '發生了什麼',
    '6': 'Zażółć gęślą jaźń',
    '7': 'ماذا يحدث'
  }
}
"test"
{
"3" "DMR ALİ"
"4" "무슨 일이"
"5" "發生了什麼"
"6" "Zażółć gęślą jaźń"
"7" "ماذا يحدث"
}

I'd ask for the raw file that was parsed (not pastebinned, because the exact bytes/encoding used are important here) and the code that was used to read/parse it, or a minimal reproducible example if you could get one.
It might be just a matter of encoding rather than a bug, but even in that scenario I'd still want to see how the incorrectly read bytes are seen by JS and whether there's something that can be improved in the parser around it.
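
As a rough sketch of why the raw bytes matter, the snippet below encodes the sample line as UTF-8 and then decodes the same bytes two ways using Node's `Buffer`. The `codePoints` helper is a hypothetical name introduced for this example; nothing here is part of vdf-parser.

```javascript
// The sample line from this issue, encoded as UTF-8 bytes.
const utf8Bytes = Buffer.from('"3"\t\t"DMR ALİ"', 'utf8');

// Hex view of the raw bytes: İ (U+0130) is the two-byte sequence c4 b0 in UTF-8.
console.log(utf8Bytes.toString('hex'));

// Decode the same bytes with the right and with the wrong encoding.
const asUtf8 = utf8Bytes.toString('utf8');
const asLatin1 = utf8Bytes.toString('latin1');

// Hypothetical helper: list the code points JS actually sees in a string.
function codePoints(str) {
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

console.log(codePoints(asUtf8.slice(-2)));   // [ 'U+0130', 'U+0022' ]
console.log(codePoints(asLatin1.slice(-3))); // [ 'U+00C4', 'U+00B0', 'U+0022' ]
```

Read as Latin-1, the single character İ turns into two characters (Ä and °), so a pastebinned copy can look fine while the real file decodes differently.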

@p0358
Owner

p0358 commented Aug 28, 2021

Btw, I assume the line that you pasted is what was shown in the exact error, right?
Because, for example, if the line #41 regex were used (the one for values without quotes), then the resulting error would include the exact point of failure:

this_is_gonna_fail ALİ
SyntaxError: VDF.parse: invalid syntax on line 10:
İ

@Zamiell
Author

Zamiell commented Aug 28, 2021

I do have the raw file, should I email it to you?

@Zamiell
Author

Zamiell commented Aug 28, 2021

Or, I use Discord if you want to add me, Zamiel#8743.

@p0358
Owner

p0358 commented Aug 28, 2021

You should just be able to drag-and-drop a file to attach it here. I just tested, and it seems to preserve the raw bytes/encoding.

@Zamiell
Author

Zamiell commented Aug 28, 2021

It contains potentially sensitive information, such as all of their Steam friends, so maybe I should send it over a side channel.

@p0358
Owner

p0358 commented Aug 28, 2021

I see. I sent a Discord invite then; maybe that will be the easiest.

@Zamiell
Author

Zamiell commented Oct 2, 2021

Hello, any update on this?


2 participants