Detection fails on particular, simple ANSI file #7

Open
GoogleCodeExporter opened this issue Mar 17, 2015 · 1 comment

@GoogleCodeExporter

What steps will reproduce the problem?
1. Save an ANSI file containing the text "CONFIG: main 30000000"
2. Run the library and/or exe on it

What is the expected output? What do you see instead?

I expect ANSI to be detected.

What version of the product are you using? On what operating system?

The library shows null for charset, and the exe shows "detection failed".

Please provide any additional information below.

I don't know if this is how the library is intended to work, but I think it
would be more useful to report ANSI whenever all the characters fit into ANSI,
or at least to support this behavior as an option.

Original issue reported on code.google.com by [email protected] on 14 Sep 2014 at 4:59

@GoogleCodeExporter

There is a bug in UniversalDetector.cs (around line 152):
       } else { 
           if (inputState == InputState.PureASCII &&
               (buf[i] == 0x33 || (buf[i] == 0x7B && lastChar == 0x7E))) {
                ^^^^^^^^^^^^^^^^
                ESC  = 27 (decimal) = 033 (octal) = 0x1B (hex)
                0x33 = 51 (decimal) = "3" (ASCII)

                // found escape character or HZ "~{"
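
To make the effect of the constant concrete, here is a minimal, self-contained sketch. It is not the library's code: the class `EscapeCheck` and its methods are hypothetical names introduced only for illustration, and the "fixed" variant assumes the intended comparison was against ESC (0x1B), as the annotation above suggests.

```csharp
using System;

// Hypothetical stand-in for the check quoted above; these names are
// illustrative and are not taken from UniversalDetector.cs.
public static class EscapeCheck
{
    private const byte Esc = 0x1B;       // ESC: 27 decimal, 033 octal
    private const byte Tilde = 0x7E;     // '~'
    private const byte OpenBrace = 0x7B; // '{'

    // Condition as quoted in the report: 0x33 is the digit '3', not ESC.
    public static bool BuggyTrigger(byte current, byte last) =>
        current == 0x33 || (current == OpenBrace && last == Tilde);

    // Assumed fix: compare against ESC (0x1B) instead of 0x33.
    public static bool FixedTrigger(byte current, byte last) =>
        current == Esc || (current == OpenBrace && last == Tilde);

    // Returns true if any byte in the buffer satisfies the given check,
    // tracking the previous byte the same way the quoted loop does.
    public static bool AnyTrigger(byte[] buf, Func<byte, byte, bool> check)
    {
        byte last = 0;
        foreach (byte b in buf)
        {
            if (check(b, last)) return true;
            last = b;
        }
        return false;
    }
}
```

With the bytes of the sample text from the report ("CONFIG: main 30000000"), `EscapeCheck.AnyTrigger(sample, EscapeCheck.BuggyTrigger)` is true because of the "3" in "30000000", while the `FixedTrigger` variant is false. That would be consistent with the file leaving the pure-ASCII state even though it contains no escape sequence.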

Original comment by [email protected] on 5 Dec 2014 at 2:46
