Improve `int.toUnicode()` documentation

The documentation for the newly added `int.toUnicode()` predicate [says](https://codeql.github.com/codeql-standard-libraries/java/predicate.int$toUnicode.0.html):
> Returns the unicode character for the receiver seen as a unicode code point

This is slightly misleading because CodeQL strings consist of UTF-16 code units. Supplementary code points (> U+FFFF) therefore result in two CodeQL string characters (demonstrated by [this query](https://lgtm.com/query/8363238946666626198/)). It would also be good to describe the behavior for invalid code point values. For surrogate code points the predicate does not seem to have a result either, e.g. `55296.toUnicode()`.
Also, "Unicode" should be capitalized.

I would recommend the following description (or similar):
> Returns the Unicode character for the receiver seen as a Unicode code point. Because CodeQL strings consist of UTF-16 code units, supplementary code points (that is > U+FFFF) result in a CodeQL string of length 2. This predicate has no result if the int receiver does not represent a valid Unicode code point, or represents the code point of a surrogate character.
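
To make the proposed semantics concrete, here is a minimal Java sketch. Java strings are also sequences of UTF-16 code units, so the length behavior can be reproduced there. The class `ToUnicodeSketch` and its `toUnicode` helper are hypothetical names for illustration only; this models the description above and is not how CodeQL implements the predicate.

```java
import java.util.Optional;

public class ToUnicodeSketch {
    // Hypothetical helper mirroring the proposed description.
    // An empty Optional stands in for "the predicate has no result".
    public static Optional<String> toUnicode(int codePoint) {
        // No result for values outside U+0000..U+10FFFF.
        if (!Character.isValidCodePoint(codePoint)) {
            return Optional.empty();
        }
        // No result for surrogate code points (U+D800..U+DFFF).
        if (codePoint >= Character.MIN_SURROGATE && codePoint <= Character.MAX_SURROGATE) {
            return Optional.empty();
        }
        // Character.toChars returns one UTF-16 code unit for code points
        // up to U+FFFF and a surrogate pair (two units) above that.
        return Optional.of(new String(Character.toChars(codePoint)));
    }

    public static void main(String[] args) {
        // 'A' (U+0041): one code unit.
        System.out.println(toUnicode(0x41).map(String::length));
        // U+1F600 (supplementary): two code units.
        System.out.println(toUnicode(0x1F600).map(String::length));
        // 55296 is U+D800, a surrogate: no result.
        System.out.println(toUnicode(55296).isPresent());
        // 0x110000 is beyond the Unicode range: no result.
        System.out.println(toUnicode(0x110000).isPresent());
    }
}
```

Under these assumptions, `toUnicode(0x1F600)` gives a string of length 2, while `toUnicode(55296)` and `toUnicode(0x110000)` give no result, which matches the wording suggested above.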

This requires changes to the built-in documentation (which is why I created the issue here) as well as the language specification.
