-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
XALANJ-2617 Fixed serializer for high-surrogate UTF-16 characters #4
base: trunk
Are you sure you want to change the base?
Conversation
…urrogate UTF-16 characters
Thanks for your pull request. I build serializer.jar with it and it works well with normal text. But I still have problems with text in attributes. XML (content of element): XSL: Output (method xml; encoding utf-8): Output (method html, encoding utf-8): With encoding utf-16 xml and html: |
I'll gladly investigate further and fix, but I'm afraid the description of your reproducible scenario is a bit too concise for me to understand it properly. I understand attributes in XML, but I don't understand what the XSL has to do with it. It would help if you could give me more detailed steps to reproduce. |
Discard my previous comment. In the meantime I was able to reproduce the bug and I fixed it for attribute values as well. The fix for attribute values was a bit more involved than that for element text, because in general the method |
…o for attribute values
Thanks for your response and fix. It now work well with attributes when the output is xml. When I change the OutputKeys.METHOD to 'html' text in attributes is still broken. The character is not a html entity and there is extra code '�' (�). Output of file xalan_emoji.txt (changed from .java to .txt) with your fixed xalan version:
|
I tried to investigate the problem and found
This seems to work, the output of my test (see comment above) is now:
The character is not changed to an entity, but for me it is better than an extra and wrong character. |
Hi, Can we hope a release with this bug fixed or their is a non-official maven repo available with this issue fixed ? |
@peterdemaeyer thanks for this fix, do you know when this can be merged and released ? |
My pull request has been ready for several months. I didn't closely examine @nicolas-albert's additional fixes but assuming they are good, this can be merged as far as I'm concerned. I don't think I can merge it myself, so I was just waiting for someone with sufficient privileges to merge it. I'm also quite eager to get this fix in an official release, so if I can/need to do something additional myself, let me know. |
@peterdemaeyer I didn't customize the code, I only applied this patch to my project (and seems ok). |
Hi @peterdemaeyer with reference to https://issues.apache.org/jira/browse/XALANJ-2617 I have converted http://svn.apache.org/repos/asf/xalan/test/ to https://github.com/plutext/xalan-test I see you ran minitest and smoketest; those work for me as well :-) Do you recall whether you tried the conf and perf tests? With these I get some test failures, even with the published xalan and serializer jars from maven central. The same tests fail for me under Java 8 and 11, on both master and branch xalan-j_2_7_x. FYI, I have converted Xalan to a maven multi module project; minitest and smoketest pass using Java 8 and Java 11 . I have made module-info.java files for it. Once satisfied about the tests, I'll apply selected patches. |
I don't think It tried the conf and perf tests. |
@nicolas-albert convertigo/convertigo@bd37cff is 7,493 additions. Would you consider re-doing your patch without the noise? thanks! |
@plutext I just apply the PR of this conversation, there is your patch : https://github.com/apache/xalan-j/pull/4/commits |
Guys, you're losing me here. I only touched 1 file. I don't understand where all the rest of the 7493 additions come from, but I can't imagine that this patch is hard to merge. From the discussion above, it's not entirely clear to me what is happening, but I hope this fix can be merged soon. |
I think the confusion comes from your reference to "@nicolas-albert's additional fixes"; I didn't notice he later clarified that there are no additional fixes :-) so sorry for the noise. Regarding likelihood of merging, please see comments at https://issues.apache.org/jira/browse/XALANJ-2419 |
…omment) fix) and for text
Fixed serializer such that it correctly deals with high-surrogate UTF-16 characters. This pull request replaces an earlier one from Daniel Kec, see comments on https://issues.apache.org/jira/browse/XALANJ-2617.