handle special characters in mysql tables with latin charsets #23

Open
kralan opened this issue Oct 23, 2019 · 5 comments

Comments

@kralan

kralan commented Oct 23, 2019

Ansible uses UTF-8. When using ansible-mysql-query to update MySQL tables with Latin encodings, special characters (German umlauts in my use case) get garbled.

Is it possible to add functionality along the lines of ansible issue #121 and ansible pull request #42171?

Is there a workaround that can be used in the meantime?

@elmarx
Owner

elmarx commented Oct 23, 2019

Hm, good question. That's a tough one. My idea would be to tell MySQL on insert/update that the given string is UTF-8 encoded, which seems to be possible with SELECT _utf8'some text';. But that would require ansible-mysql-query to detect the type of each value, since strings and numbers need different handling.

And for reading (to compare the current db state) the translation would be necessary again…
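A minimal sketch of the write side of that idea, assuming a hypothetical helper named sql_literal (it is not part of ansible-mysql-query). It shows the type detection mentioned above: strings get a character set introducer, numbers are emitted as-is. The string is written as a hex literal after the introducer, which MySQL accepts in place of a quoted string and which sidesteps quoting problems. The read side is not covered here.

```python
def sql_literal(value, introducer="_utf8"):
    """Render a Python value as a MySQL literal (hypothetical helper)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        # bool is a subclass of int, so it has to be checked first
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        # numbers need no introducer and no quoting
        return str(value)
    if isinstance(value, str):
        if value == "":
            return "''"
        # tell MySQL explicitly that these bytes are UTF-8
        hex_bytes = value.encode("utf-8").hex().upper()
        return "{} X'{}'".format(introducer, hex_bytes)
    raise TypeError("unsupported value type: {}".format(type(value).__name__))


print(sql_literal("ü"))
print(sql_literal(42))
```

In a real module, bound query parameters would normally be preferable to building literals by hand; the sketch only illustrates why strings and numbers have to be told apart.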

Or another approach would be to inspect the current DB schema and do the necessary conversions in Python code.
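A rough sketch of what that could look like, under several assumptions: the query string reads the per-column character set from information_schema.COLUMNS, and the mapping from MySQL charset names to Python codecs is a hand-picked, incomplete guess (MySQL's latin1 is close to, but not identical with, Windows-1252). The names COLUMN_CHARSET_QUERY, MYSQL_TO_PYTHON_CODEC and encode_for_column are hypothetical.

```python
# Query a module could run to learn each column's character set.
COLUMN_CHARSET_QUERY = (
    "SELECT COLUMN_NAME, CHARACTER_SET_NAME "
    "FROM information_schema.COLUMNS "
    "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s"
)

# Assumed mapping from MySQL charset names to Python codec names.
MYSQL_TO_PYTHON_CODEC = {
    "latin1": "cp1252",
    "ascii": "ascii",
    "utf8": "utf-8",
    "utf8mb3": "utf-8",
    "utf8mb4": "utf-8",
}


def encode_for_column(value, mysql_charset):
    """Encode a (UTF-8 sourced) Python string for a column's charset."""
    try:
        codec = MYSQL_TO_PYTHON_CODEC[mysql_charset]
    except KeyError:
        raise ValueError("no codec known for charset {!r}".format(mysql_charset))
    return value.encode(codec)


print(encode_for_column("Größe", "latin1"))
```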

Hm. I have to think about it and try different approaches.

Just using UTF-8 in your database is not an option?

I don't know of any workarounds. If the data is meant to be shown on a webpage, maybe encoding the umlauts as HTML entities could be a solution (e.g. ü becomes &uuml;)?

@kralan
Author

kralan commented Oct 23, 2019

Thank you for looking into this so quickly!
Of course it would be nice to live in an all-UTF world, but on rare occasions, it is not that way and cannot be changed easily. Big databases tend to be resistant to changes.

I think the conversion is most easily done in Python. Detecting the table's schema would be nice to have, but it is not necessary, as you normally know the encoding of the tables you deal with. A parameter to the ansible-mysql-query module, e.g. output_encoding, will do.
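To make the suggestion concrete, here is a minimal sketch of how such a parameter might be applied, assuming the module receives UTF-8 text from Ansible and that the database hands back raw bytes in the table's encoding. The helper names prepare_value and values_match are hypothetical and do not exist in ansible-mysql-query; only the parameter name output_encoding comes from this thread.

```python
def prepare_value(value, output_encoding="utf-8"):
    """Encode a desired value for writing; fail loudly if not representable."""
    try:
        return value.encode(output_encoding)
    except UnicodeEncodeError as exc:
        raise ValueError(
            "{!r} cannot be represented in {}".format(value, output_encoding)
        ) from exc


def values_match(desired, stored, output_encoding="utf-8"):
    """Compare the desired text with what the table currently holds."""
    if isinstance(stored, bytes):
        # assumed: the driver returned undecoded bytes from the table
        stored = stored.decode(output_encoding)
    return desired == stored


print(prepare_value("Müller", "latin-1"))
print(values_match("Müller", b"M\xfcller", "latin-1"))
```

The comparison step matters for idempotency: without decoding the stored value first, the module would report a change on every run.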

@elmarx
Owner

elmarx commented Oct 24, 2019

Ah, now I get your idea (concerning output_encoding, related to the linked issues). I'll look into it in the next few days; feel free to ping/remind me if there's no progress :P

@kralan
Author

kralan commented Oct 29, 2019

The name output_encoding served a good purpose in linking to the related issues, but in the context of the ansible-mysql-query module, table_encoding might be a more intuitive name.
I don't know enough about Ansible's policies to say whether it is desirable to use the name output_encoding for consistency with similar parameters in other modules.

@kralan
Author

kralan commented Jan 2, 2020

May I take you up on your offer to ping you about this issue, and wish you a happy new year?
Can I do anything to help?
