[BUG] UTF7 conversions need the original bytes - it is too late to convert once the bytes are converted to String #3017

Open
1 task done
pjfanning opened this issue Jan 23, 2025 · 4 comments

Comments

@pjfanning
Contributor

pjfanning commented Jan 23, 2025

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

/**
 * convert original string to UTF-7 String
 *
 * @param original original text
 * @param charset encode charset
 * @return String
 */
public static String stringEncodeUtf7String(String original, String charset) {
    return new String(original.getBytes(new CharsetProvider().charsetForName(charset)), StandardCharsets.US_ASCII);
}

/**
 * convert UTF-7 string to original String
 *
 * @param encoded encoded String
 * @param charset encode charset
 * @return String
 */
public static String utf7StringDecodeString(String encoded, String charset) {
    return new String(encoded.getBytes(StandardCharsets.US_ASCII), new CharsetProvider().charsetForName(charset));
}

UTF-7 is also obsolete - https://en.wikipedia.org/wiki/UTF-7

Expected Behavior

Are you just trying to remove non-ASCII chars from a String? If so, this might be an approach:
https://stackoverflow.com/questions/8519669/how-can-non-ascii-characters-be-removed-from-a-string
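
For reference, a minimal sketch of the "remove non-ASCII characters" idea, assuming a plain regex replacement is acceptable. The class and method names (AsciiOnlySketch, stripNonAscii) are hypothetical and are not part of HertzBeat.

```java
public class AsciiOnlySketch {

    // Drops every UTF-16 code unit outside the 7-bit ASCII range (0x00-0x7F).
    // Surrogate pairs are removed as well, since both halves are non-ASCII.
    public static String stripNonAscii(String input) {
        return input.replaceAll("[^\\x00-\\x7F]", "");
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an accented e; "\u53F0\u5317" is two CJK characters.
        System.out.println(stripNonAscii("caf\u00e9 \u53F0\u5317 inbox")); // prints "caf  inbox"
    }
}
```

Note that this is lossy: the removed characters cannot be recovered, so it is not a substitute for an encoding that a server has to decode again.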

Steps To Reproduce

No response

Environment

HertzBeat version(s):

Debug logs

No response

Anything else?

No response

@a-little-fool
Contributor

a-little-fool commented Jan 24, 2025

The UTF-7 encoding is indeed outdated. If the goal is only to remove non-ASCII characters, then combined with your information, this may help solve the problem:

Image

@zuobiao-zhou
Member

Hi, the purpose of this function is to first encode the original string in UTF-7, and then convert it into an ASCII-compatible string. This is necessary because the IMAP protocol requires UTF-7 encoding.
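
To make the "encode, then treat the result as ASCII" step concrete, here is a rough sketch using only the JDK. It assumes the variant in question is the modified UTF-7 that RFC 3501 (section 5.1.3) defines for IMAP mailbox names; the class name ImapUtf7Sketch is hypothetical, and this is neither HertzBeat's code nor the CharsetProvider implementation quoted above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ImapUtf7Sketch {

    // Encodes a string as IMAP modified UTF-7 (assumed variant, RFC 3501 section 5.1.3).
    public static String encode(String original) {
        StringBuilder out = new StringBuilder();
        StringBuilder pending = new StringBuilder(); // chars waiting to be base64-encoded
        for (int i = 0; i < original.length(); i++) {
            char c = original.charAt(i);
            if (c >= 0x20 && c <= 0x7e) {
                // Printable ASCII represents itself, except '&' which becomes "&-".
                flush(pending, out);
                if (c == '&') {
                    out.append("&-");
                } else {
                    out.append(c);
                }
            } else {
                pending.append(c);
            }
        }
        flush(pending, out);
        return out.toString();
    }

    // Writes the pending run as '&' + modified base64 of its UTF-16BE bytes + '-'.
    private static void flush(StringBuilder pending, StringBuilder out) {
        if (pending.length() == 0) {
            return;
        }
        byte[] utf16 = pending.toString().getBytes(StandardCharsets.UTF_16BE);
        String b64 = Base64.getEncoder().withoutPadding().encodeToString(utf16);
        // Modified base64 uses ',' instead of '/' and no '=' padding.
        out.append('&').append(b64.replace('/', ',')).append('-');
        pending.setLength(0);
    }

    public static void main(String[] args) {
        // Mailbox name example taken from RFC 3501.
        System.out.println(encode("~peter/mail/\u53F0\u5317/\u65E5\u672C\u8A9E"));
        // prints "~peter/mail/&U,BTFw-/&ZeVnLIqe-"
    }
}
```

The output contains only ASCII characters, so holding it in a String or converting it to US_ASCII bytes loses nothing. The RFC example string also gives a fixed expected value that a unit test could be pinned to.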

@pjfanning
Contributor Author

@zuobiao-zhou I think this code needs tests to show that it behaves the way that you want it to.

@zuobiao-zhou
Member

@pjfanning This method is only used for the IMAP protocol, which can monitor QQ and NetEase email accounts. If you encounter any issues while using it, feel free to point them out.

@zuobiao-zhou zuobiao-zhou added discussion and removed bug Something isn't working labels Jan 26, 2025

3 participants