UTF-8 encoding of Degree Symbol #50

glennkitchellcaci · 2019-09-24T17:22:45Z

The issue I'm having is caused by the degree symbol (U+00B0), which is encoded in UTF-8 as:
UTF-8 \xc2\xb0
http://www.fileformat.info/info/unicode/char/b0/index.htm

Below are the boiled-down calls. My real test data is properly formatted XML, but through testing I found that adding more text does not change the confidence or the output of the "jschardet.detect()" call.

With 1, 2, or 3 degree symbols, it detects windows-1252. Decoding with that encoding leaves an extra \xc2 character in front of each degree symbol, because the input is actually UTF-8:
jschardet.detect('\xc2\xb0');

With 4 degree symbols, it detects EUC-KR:
jschardet.detect('\xc2\xb0\xc2\xb0\xc2\xb0\xc2\xb0');
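
To make the symptom concrete, here is a minimal Node.js sketch of what happens to those two bytes under each interpretation. It does not call jschardet at all. It only uses Node's built-in Buffer and TextDecoder. Two assumptions are mine and not from the report: 'latin1' stands in for windows-1252 (the two encodings agree on bytes 0xC2 and 0xB0), and isValidUtf8 is a hypothetical helper showing one possible pre-check, not part of jschardet's API.

```javascript
// The report's input: a "binary" string in which each character holds one byte.
const binary = '\xc2\xb0';

// Recover the raw bytes (0xC2, 0xB0) from the binary string.
const bytes = Buffer.from(binary, 'latin1');

// Decoded as UTF-8, the two bytes form a single degree symbol.
const asUtf8 = bytes.toString('utf8');

// Decoded as a single-byte encoding, each byte becomes its own character,
// so an extra character (0xC2) appears before the degree symbol.
// Assumption: 'latin1' stands in for windows-1252; they agree on these bytes.
const asSingleByte = bytes.toString('latin1');

// Hypothetical helper (not part of jschardet): strict UTF-8 validation.
// A fatal TextDecoder throws on any byte sequence that is not valid UTF-8.
function isValidUtf8(buf) {
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(buf);
    return true;
  } catch {
    return false;
  }
}

console.log(asUtf8.length, asSingleByte.length, isValidUtf8(bytes));
```

A caller that expects mostly UTF-8 input could try a strict check like this before falling back to statistical detection, since very short inputs give a detector little evidence to work with.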

lingsamuel · 2020-07-02T02:43:19Z

Fixed in #57 and #59, @aadsm.

glennkitchellcaci mentioned this issue Sep 24, 2019

KML import has corrupted degree symbol rendering for info ngageoint/opensphere#674

Closed
