-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MariaDBclient + Mysql8 support #30
base: 0.2.x_force_latin1_to_utf8
Are you sure you want to change the base?
Conversation
Unix systems using libtool do not need to do a version check against the client version string as the libraries themselves are versioned.
libmariadb-client-lgpl-dev in newly released Debian stable (jessie) ships `/usr/bin/mariadb_config`.
See also makandra@56b46c8
Triggered by out of bound array access when retreiving the mysql encoding.
How does the mapping differ? How does mainline mysql2 handle the incompatibility? |
I pulled collations from MySQL 5.7 and 8 locally, diffed them, and they appear to be fully compatible EXCEPT for utf8/utf8mb3 naming changes. The collation IDs are stable across MySQL versions and the mapping scripts are sorted by ID, so collations don't collide. The utf8 naming was used in MySQL <8 and MariaDB <10.6, but it was always an alias for the underlying utf8mb3. Both dbs switched to utf8mb3 naming in anticipation of switching the utf8 alias to utf8mb4 in the future. So we'll need mysql2 support for utf8mb3 charset and collation names. Open issue here: brianmario#1315 We can do that by updating the charset map generation scripts to account for the alias and map both utf8 and utf8mb3 names to Ruby utf-8. Here's a rough diff: --- mysql57_coll 2023-08-02 11:41:50
+++ mysql8_coll 2023-08-02 11:42:01
@@ -29,7 +29,7 @@
[30, "latin5_turkish_ci"],
[31, "latin1_german2_ci"],
[32, "armscii8_general_ci"],
- [33, "utf8_general_ci"],
+ [33, "utf8mb3_general_ci"],
[34, "cp1250_czech_cs"],
[35, "ucs2_general_ci"],
[36, "cp866_general_ci"],
@@ -72,13 +72,14 @@
[73, "keybcs2_bin"],
[74, "koi8r_bin"],
[75, "koi8u_bin"],
+ [76, "utf8mb3_tolower_ci"],
[77, "latin2_bin"],
[78, "latin5_bin"],
[79, "latin7_bin"],
[80, "cp850_bin"],
[81, "cp852_bin"],
[82, "swe7_bin"],
- [83, "utf8_bin"],
+ [83, "utf8mb3_bin"],
[84, "big5_bin"],
[85, "euckr_bin"],
[86, "gb2312_bin"],
@@ -168,31 +169,31 @@
[181, "utf32_croatian_ci"],
[182, "utf32_unicode_520_ci"],
[183, "utf32_vietnamese_ci"],
- [192, "utf8_unicode_ci"],
- [193, "utf8_icelandic_ci"],
- [194, "utf8_latvian_ci"],
- [195, "utf8_romanian_ci"],
- [196, "utf8_slovenian_ci"],
- [197, "utf8_polish_ci"],
- [198, "utf8_estonian_ci"],
- [199, "utf8_spanish_ci"],
- [200, "utf8_swedish_ci"],
- [201, "utf8_turkish_ci"],
- [202, "utf8_czech_ci"],
- [203, "utf8_danish_ci"],
- [204, "utf8_lithuanian_ci"],
- [205, "utf8_slovak_ci"],
- [206, "utf8_spanish2_ci"],
- [207, "utf8_roman_ci"],
- [208, "utf8_persian_ci"],
- [209, "utf8_esperanto_ci"],
- [210, "utf8_hungarian_ci"],
- [211, "utf8_sinhala_ci"],
- [212, "utf8_german2_ci"],
- [213, "utf8_croatian_ci"],
- [214, "utf8_unicode_520_ci"],
- [215, "utf8_vietnamese_ci"],
- [223, "utf8_general_mysql500_ci"],
+ [192, "utf8mb3_unicode_ci"],
+ [193, "utf8mb3_icelandic_ci"],
+ [194, "utf8mb3_latvian_ci"],
+ [195, "utf8mb3_romanian_ci"],
+ [196, "utf8mb3_slovenian_ci"],
+ [197, "utf8mb3_polish_ci"],
+ [198, "utf8mb3_estonian_ci"],
+ [199, "utf8mb3_spanish_ci"],
+ [200, "utf8mb3_swedish_ci"],
+ [201, "utf8mb3_turkish_ci"],
+ [202, "utf8mb3_czech_ci"],
+ [203, "utf8mb3_danish_ci"],
+ [204, "utf8mb3_lithuanian_ci"],
+ [205, "utf8mb3_slovak_ci"],
+ [206, "utf8mb3_spanish2_ci"],
+ [207, "utf8mb3_roman_ci"],
+ [208, "utf8mb3_persian_ci"],
+ [209, "utf8mb3_esperanto_ci"],
+ [210, "utf8mb3_hungarian_ci"],
+ [211, "utf8mb3_sinhala_ci"],
+ [212, "utf8mb3_german2_ci"],
+ [213, "utf8mb3_croatian_ci"],
+ [214, "utf8mb3_unicode_520_ci"],
+ [215, "utf8mb3_vietnamese_ci"],
+ [223, "utf8mb3_general_mysql500_ci"],
[224, "utf8mb4_unicode_ci"],
[225, "utf8mb4_icelandic_ci"],
[226, "utf8mb4_latvian_ci"],
@@ -219,4 +220,67 @@
[247, "utf8mb4_vietnamese_ci"],
[248, "gb18030_chinese_ci"],
[249, "gb18030_bin"],
- [250, "gb18030_unicode_520_ci"]]
+ [250, "gb18030_unicode_520_ci"],
+ [255, "utf8mb4_0900_ai_ci"],
+ [256, "utf8mb4_de_pb_0900_ai_ci"],
+ [257, "utf8mb4_is_0900_ai_ci"],
+ [258, "utf8mb4_lv_0900_ai_ci"],
+ [259, "utf8mb4_ro_0900_ai_ci"],
+ [260, "utf8mb4_sl_0900_ai_ci"],
+ [261, "utf8mb4_pl_0900_ai_ci"],
+ [262, "utf8mb4_et_0900_ai_ci"],
+ [263, "utf8mb4_es_0900_ai_ci"],
+ [264, "utf8mb4_sv_0900_ai_ci"],
+ [265, "utf8mb4_tr_0900_ai_ci"],
+ [266, "utf8mb4_cs_0900_ai_ci"],
+ [267, "utf8mb4_da_0900_ai_ci"],
+ [268, "utf8mb4_lt_0900_ai_ci"],
+ [269, "utf8mb4_sk_0900_ai_ci"],
+ [270, "utf8mb4_es_trad_0900_ai_ci"],
+ [271, "utf8mb4_la_0900_ai_ci"],
+ [273, "utf8mb4_eo_0900_ai_ci"],
+ [274, "utf8mb4_hu_0900_ai_ci"],
+ [275, "utf8mb4_hr_0900_ai_ci"],
+ [277, "utf8mb4_vi_0900_ai_ci"],
+ [278, "utf8mb4_0900_as_cs"],
+ [279, "utf8mb4_de_pb_0900_as_cs"],
+ [280, "utf8mb4_is_0900_as_cs"],
+ [281, "utf8mb4_lv_0900_as_cs"],
+ [282, "utf8mb4_ro_0900_as_cs"],
+ [283, "utf8mb4_sl_0900_as_cs"],
+ [284, "utf8mb4_pl_0900_as_cs"],
+ [285, "utf8mb4_et_0900_as_cs"],
+ [286, "utf8mb4_es_0900_as_cs"],
+ [287, "utf8mb4_sv_0900_as_cs"],
+ [288, "utf8mb4_tr_0900_as_cs"],
+ [289, "utf8mb4_cs_0900_as_cs"],
+ [290, "utf8mb4_da_0900_as_cs"],
+ [291, "utf8mb4_lt_0900_as_cs"],
+ [292, "utf8mb4_sk_0900_as_cs"],
+ [293, "utf8mb4_es_trad_0900_as_cs"],
+ [294, "utf8mb4_la_0900_as_cs"],
+ [296, "utf8mb4_eo_0900_as_cs"],
+ [297, "utf8mb4_hu_0900_as_cs"],
+ [298, "utf8mb4_hr_0900_as_cs"],
+ [300, "utf8mb4_vi_0900_as_cs"],
+ [303, "utf8mb4_ja_0900_as_cs"],
+ [304, "utf8mb4_ja_0900_as_cs_ks"],
+ [305, "utf8mb4_0900_as_ci"],
+ [306, "utf8mb4_ru_0900_ai_ci"],
+ [307, "utf8mb4_ru_0900_as_cs"],
+ [308, "utf8mb4_zh_0900_as_cs"],
+ [309, "utf8mb4_0900_bin"],
+ [310, "utf8mb4_nb_0900_ai_ci"],
+ [311, "utf8mb4_nb_0900_as_cs"],
+ [312, "utf8mb4_nn_0900_ai_ci"],
+ [313, "utf8mb4_nn_0900_as_cs"],
+ [314, "utf8mb4_sr_latn_0900_ai_ci"],
+ [315, "utf8mb4_sr_latn_0900_as_cs"],
+ [316, "utf8mb4_bs_0900_ai_ci"],
+ [317, "utf8mb4_bs_0900_as_cs"],
+ [318, "utf8mb4_bg_0900_ai_ci"],
+ [319, "utf8mb4_bg_0900_as_cs"],
+ [320, "utf8mb4_gl_0900_ai_ci"],
+ [321, "utf8mb4_gl_0900_as_cs"],
+ [322, "utf8mb4_mn_cyrl_0900_ai_ci"],
+ [323, "utf8mb4_mn_cyrl_0900_as_cs"]] |
Ah: mysql2 looks up charsets by ID, not name, so the alias naming shouldn't even matter! Adding a utf8mb3 → utf-8 mapping may be sufficient. |
Done upstream: brianmario#1323 |
290c9f4
to
12cea12
Compare
6a8a4b1
to
fffaa42
Compare
- Port new mappings from latest master - Add utf8mb3 -> utf_8 missing mapping - rebuild ext/mysql2/mysql_enc_to_ruby.h and ext/mysql2/mysql_enc_name_to_ruby.h from Mysql 8.0.32
fffaa42
to
b8ab80d
Compare
Thank you @jeremy for the in-depth review and upstream fix 🙇♂️. I've tested setting the encoding of
So something goes wrong initializing the connection, as the next step I'd enable verbose logging in MYSQL to see what's going on but I wonder if I'm doing something dumb and you already know it 😅. |
Looks good @intrip! No need to switch to |
Would you backport to the 0.3.x_force_latin1_to_utf8 branch as well? |
Ahh, right! 🤦♂️ |
Based on brianmario#976
utf8mb3
->utf_8
missing mapping (thanks @jeremy 🙏)ext/mysql2/mysql_enc_to_ruby.h
andext/mysql2/mysql_enc_name_to_ruby.h
from Mysql 8.0.32This PR encompasses also @lewispb work done in #28