README

These are experimental modules to handle various Unicode issues.  The
software is under construction and has not even reached alpha state.

More information on Unicode can be found at http://www.unicode.org

Current modules are:

  Unicode::String   - represent strings of Unicode chars.
  Unicode::CharName - look up character names


Some ideas to investigate for the Unicode modules are:

   o Fast encoding/decoding to various 8-bit char sets.  Mapping
     table objects perhaps?

   o Fast conversion to other large char sets (East Asian).  I don't
     know anything about this.

   o Composition/decomposition support:
     $u->decomp;  # will decompose as much as possible:  "å"  --> "a°"
     $u->comp;    # will compose as much as possible:    "a°" --> "å"

     Need separate routines or a special argument to distinguish
     between compatibility decomposition and canonical decomposition.
     The canonical mappings are a subset of the compatibility mappings.

   o General Unicode string to number conversion (based on unidata
     number attributes)

   o Case conversions (lc, uc, ucfirst).  The last one should use title case.

   o Fast lookup of Unicode attributes (unidata lookup using XS)
     $u->isletter, $u->isupper, $u->islower, ...  Why do we need them
     when Perl does not need them for normal text?

   o There might be some support for the private use area (i.e. adding
     case conversion and char properties to chars within the area).

   o Unicode tr-function, sprintf-function

   o Unicode string comparison functions: cmp(), le, eq,...

   o Unicode regular expressions: m// s/// split(//,..)

   o Unicode filehandles (automatic conversion from UTF-7/UTF-8/8-bit
     char set when reading from or writing to filehandles)
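
The difference between canonical and compatibility decomposition
mentioned in the composition/decomposition item can be sketched with
the separate Unicode::Normalize module.  This is only an illustration:
it does not use Unicode::String, the decomp/comp methods above are
still just ideas, and the codepoints() helper is a hypothetical name
introduced for this example.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD NFKD NFC);

# Hypothetical helper: render a string as code points, e.g. "U+0061 U+030A".
sub codepoints {
    my ($str) = @_;
    return join " ", map { sprintf "U+%04X", ord($_) } split //, $str;
}

my $a_ring = "\x{E5}";      # LATIN SMALL LETTER A WITH RING ABOVE
my $fi_lig = "\x{FB01}";    # LATIN SMALL LIGATURE FI

# Canonical decomposition: base letter followed by COMBINING RING ABOVE.
print codepoints(NFD($a_ring)), "\n";        # U+0061 U+030A

# Canonical composition puts the pair back together.
print codepoints(NFC("a\x{30A}")), "\n";     # U+00E5

# The ligature has only a compatibility mapping, so NFD leaves it alone
# while NFKD splits it into "f" and "i".
print codepoints(NFD($fi_lig)), "\n";        # U+FB01
print codepoints(NFKD($fi_lig)), "\n";       # U+0066 U+0069
```

Everything NFD decomposes is also decomposed by NFKD, which is why a
single routine with a "compatibility" flag would be one way to cover
both cases.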


The following are examples of use of the current modules:

   use Unicode::String qw(latin1 utf8);

   $u = utf8("this is a string\n");
   print $u->ucs4;
   print $u->utf16;
   print $u->utf8;
   print $u->utf7;
   print $u->latin1;
   print $u->hex;

   print latin1("na\xEFve\n")->utf8;   # \xEF is "ï" in Latin-1

   use Unicode::CharName qw(uname);
   print uname(ord('$')), "\n";


COPYRIGHT

  © 1997 Gisle Aas. All rights reserved.

This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.