-
Notifications
You must be signed in to change notification settings - Fork 2
Read-only release history for Unicode
gitpan/Unicode
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
These are experimental modules to handle various Unicode issues. This is software under construction. Not even alpha state right now. More information on Unicode can be found at http://www.unicode.org Current modules are: Unicode::String - represent strings of Unicode chars. Unicode::CharName - look up character names Some of ideas to investigate for the Unicode modules are: o Fast encoding/decoding to various 8-bit char sets. Mapping table objects perhaps? o Fast convertion to other large char sets (east-asien). I don't know anything about this. o Composition/decomposition support: $u->decomp; # will decomposite as much as possible: "å" --> "a°" $u->comp; # will composite as much as possible: "a°" --> "å" Need separate routines or a special argument to distinguish between compatibility decomposition and canonical decomposition. The last one is a subset of the first one. o General Unicode string to number convertion (based on unidata number attributes) o Case convertions (lc, uc, ucfirst) last one should use title-case o Fast lookup of Unicode attributes (unidata lookup using XS) $u->isletter, $u->isupper, $u->islower,.... why do we need them when perl does not need them for normal text?? o There might be some support for the private area (i.e. adding case convertion and char properties to chars within the area). o Unicode tr-function, sprintf-function o Unicode string comparison functions: cmp(), le, eq,... o Unicode regular expressions: m// s/// split(//,..) o Unicode filehandles (automatic convertion from UTF-7/UTF-8/8-bit char set when reading,writing to filehandles) The following are examples of use of the current modules: use Unicode::String qw(latin1 utf8); $u = utf8("this is a string\n"); print $u->ucs4; print $u->utf16; print $u->utf8; print $u->utf7; print $u->latin1; print $u->hex; print latin1("naïve\n")->utf8; use Unicode::CharName qw(uname); print uname(ord('$')), "\n"; COPYRIGHT © 1997 Gisle Aas. All rights reserved. This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
About
Read-only release history for Unicode
Resources
Stars
Watchers
Forks
Packages 0
No packages published