Problems with unicode to utf8 characters #53

Open
dknx01 opened this issue Apr 2, 2023 · 2 comments

@dknx01

dknx01 commented Apr 2, 2023

I tried to convert a shp file containing Unicode characters into GeoJSON. Some of them seem to be converted to UTF-8 if I use the Shapefile::OPTION_DBF_CONVERT_TO_UTF8 option, but not all. German umlauts in particular do not seem to be converted.

ö is still \u00f6

Example file

@gasparesganga
Owner

This is not a matter of reading and converting UTF-8 from the shapefile DBF, which is happening, but rather a deliberate choice for the GeoJSON output, where multibyte unicode characters get escaped.

I could add an option to Geometry::toGeoJSON() in order to output unescaped multibyte UTF-8 characters. Basically it's just a matter of passing the JSON_UNESCAPED_UNICODE flag to PHP's json_encode function. I will think about possible issues with that and eventually add it.

By the way, the currently escaped JSON is 100% valid and should be the most interoperable form. Why is it causing you issues?
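
For readers following along, here is a minimal sketch of the difference being discussed, using plain json_encode rather than the library itself. The sample array and its value are made up for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Illustrative data only; not taken from the reporter's shapefile.
$properties = ['name' => 'Köln'];

// Default behaviour: non-ASCII characters are written as \uXXXX escapes.
$escaped = json_encode($properties);

// With the flag: multibyte UTF-8 characters are written as-is.
$unescaped = json_encode($properties, JSON_UNESCAPED_UNICODE);

echo $escaped, PHP_EOL;   // {"name":"K\u00f6ln"}
echo $unescaped, PHP_EOL; // {"name":"Köln"}

// Both strings are valid JSON and decode to the same data.
$sameData = json_decode($escaped, true) === json_decode($unescaped, true);
var_dump($sameData);      // bool(true)
```

Any conforming JSON parser should turn `\u00f6` back into `ö`, so the escaped form only becomes a problem when the output is read as raw text instead of being parsed.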

@dknx01
Author

dknx01 commented Apr 11, 2023

The issue is that I cannot use the current output directly, so I need an extra step to handle the escaped characters.

An option would be a good choice. Maybe do the same for the other json_encode options, so that everyone can pass them through.
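
A rough sketch of what such a pass-through could look like, assuming a standalone helper. The function name `encodeGeoJson` and its signature are hypothetical and are not part of the library's API; it only shows the idea of forwarding caller-supplied json_encode flags.

```php
<?php
// Hypothetical helper, NOT the library's API: forwards any json_encode
// flags chosen by the caller when serializing a GeoJSON-like array.
function encodeGeoJson(array $geojson, int $flags = 0): string
{
    $json = json_encode($geojson, $flags);
    if ($json === false) {
        // json_encode returns false on failure (e.g. invalid UTF-8 input).
        throw new RuntimeException(json_last_error_msg());
    }
    return $json;
}

// Illustrative data only.
$data = ['name' => 'Grüße', 'url' => 'a/b'];

// No flags: default escaping of non-ASCII characters and slashes.
$default = encodeGeoJson($data);

// Caller combines whichever flags they need.
$custom = encodeGeoJson($data, JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES);

echo $default, PHP_EOL;
echo $custom, PHP_EOL;
```

Taking the flags as a plain integer bitmask keeps the helper aligned with json_encode itself, so new flags would work without further changes.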
