NAME
    HTML::Entities::ImodePictogram - encode / decode i-mode pictogram

SYNOPSIS
      use HTML::Entities::ImodePictogram;

      $html      = encode_pictogram($rawtext);
      $rawtext   = decode_pictogram($html);
      $cleantext = remove_pictogram($rawtext);

      use HTML::Entities::ImodePictogram qw(find_pictogram);

      $num_found = find_pictogram($rawtext, \&callback);

DESCRIPTION
    HTML::Entities::ImodePictogram handles HTML entities for i-mode
    pictograms (emoji), which are assigned to the Shift_JIS private area.

    See http://www.nttdocomo.co.jp/i/tag/emoji/index.html for details about
    i-mode pictograms.

FUNCTIONS
    All functions in this module assume that input and output strings are
    encoded in Shift_JIS. See the Jcode manpage for conversion between
    Shift_JIS and other encodings such as EUC-JP or UTF-8.
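
    As a rough illustration of preparing Shift_JIS input, the sketch below
    converts text with the core Encode module rather than Jcode (a
    substitution made here for illustration). The helper names "to_sjis" and
    "utf8_bytes_to_sjis" and the choice of the "cp932" encoding name are
    assumptions, not part of this module. Whether pictogram bytes survive
    such a conversion is not covered by this sketch.

```perl
use strict;
use warnings;
use Encode qw(encode from_to);

# Hypothetical helper: encode a Perl character string as Shift_JIS bytes.
# 'cp932' is an assumed choice of encoding name for this sketch.
sub to_sjis {
    my ($string) = @_;
    return encode('cp932', $string);
}

# Hypothetical helper: convert UTF-8 encoded bytes to Shift_JIS bytes.
sub utf8_bytes_to_sjis {
    my ($bytes) = @_;
    from_to($bytes, 'UTF-8', 'cp932');    # converts the copy in place
    return $bytes;
}

# U+65E5 U+672C are two ordinary kanji, used here as sample input.
my $sjis = to_sjis("\x{65E5}\x{672C}");
print join(' ', map { sprintf '%02X', ord } split //, $sjis), "\n";
# prints "93 FA 96 7B"
```

    The resulting byte string can then be passed to the functions below.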

    This module exports the following functions by default.

    encode_pictogram
          $html = encode_pictogram($rawtext);
          $html = encode_pictogram($rawtext, unicode => 1);

        Encodes pictogram characters in raw text into HTML entities. If
        $rawtext contains extended pictograms, they are encoded in Unicode
        format. If you pass the "unicode" option explicitly, all pictogram
        characters are encoded in Unicode format ("&#xXXXX;"). Otherwise,
        encoding is done in decimal format ("&#NNNNN;").

    decode_pictogram
          $rawtext = decode_pictogram($html);

        Decodes HTML entities for pictograms (both "&#xXXXX;" and
        "&#NNNNN;") into raw text in Shift_JIS.

    remove_pictogram
          $cleantext = remove_pictogram($rawtext);

        Removes pictogram characters from raw text.

    This module also exports the following function on demand.

    find_pictogram
          $num_found = find_pictogram($rawtext, \&callback);

        Finds pictogram characters in raw text and executes the callback for
        each one found. It returns the total number of pictogram characters
        found in the text.

        The callback is given three arguments. The first is the pictogram
        character that was found, the second is a decimal number which
        represents the Shift_JIS codepoint of the character, and the third
        is its Unicode codepoint. Whatever the callback returns replaces the
        matched character in the original text.
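
        To make the second callback argument concrete, the sketch below
        derives a decimal number from a two-byte character without using
        this module. It assumes the number is the big-endian 16-bit value of
        the two bytes, and it uses the byte pair 0xF8 0x9F as a sample
        pictogram. The helper names "sjis_number" and "decimal_entity" are
        hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: read a two-byte character as one big-endian
# 16-bit unsigned integer (assumed to match the callback's $number).
sub sjis_number {
    my ($char) = @_;
    return unpack 'n', $char;
}

# Hypothetical helper: build a decimal entity from that number.
sub decimal_entity {
    my ($char) = @_;
    return '&#' . sjis_number($char) . ';';
}

my $sample = "\xF8\x9F";                 # assumed sample pictogram bytes
print sjis_number($sample), "\n";        # prints 63647 (0xF89F)
print decimal_entity($sample), "\n";     # prints &#63647;
```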

        Here is a stub implementation of encode_pictogram(), which is a
        good example of how to use find_pictogram(). Note that this example
        version does not support extended pictograms.

          sub encode_pictogram {
              my $text = shift;
              find_pictogram($text, sub {
                                 my($char, $number, $cp) = @_;
                                 return '&#' . $number . ';';
                             });
              return $text;
          }
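
        A callback can also inspect pictograms without changing the text by
        returning the character it was given. The sketch below collects the
        decimal numbers of the pictograms it finds. It assumes, as the stub
        above implies, that find_pictogram() rewrites its first argument in
        place, so that argument must be a variable. The sample bytes 0xF8
        0x9F and the names "collect_pictograms" and @seen are assumptions
        made for illustration.

```perl
use strict;
use warnings;
use HTML::Entities::ImodePictogram qw(find_pictogram);

# Hypothetical helper: return the count reported by find_pictogram()
# and the decimal numbers of the pictograms it passed to the callback.
sub collect_pictograms {
    my ($text) = @_;       # work on a copy; it may be rewritten in place
    my @seen;
    my $count = find_pictogram($text, sub {
        my ($char, $number, $cp) = @_;
        push @seen, $number;
        return $char;      # put the same character back
    });
    return ($count, @seen);
}

my ($count, @seen) = collect_pictograms("Hi \xF8\x9F!");
print "found $count: @seen\n";
```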

CAVEAT
    *   This module is slow, because the regex it uses has to match every
        character in the text. This is due to the difficulty of determining
        character boundaries in Shift_JIS-encoded text.

    *   Extended pictogram support in this module is incomplete. If you
        handle pictogram characters in Unicode, try the Encode module with
        perl 5.8.0, or Unicode::Japanese.
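
    The character boundary problem in the first caveat can be illustrated
    as follows. This is a minimal sketch, not the module's actual code: the
    pattern $sjis_char and both function names are assumptions. It shows
    that a plain byte search for the pair 0xF8 0x9F can match across two
    ordinary two-byte characters, while a scan that consumes one Shift_JIS
    character at a time does not.

```perl
use strict;
use warnings;

# Assumed pattern for one Shift_JIS character: a single byte, or a lead
# byte followed by a trail byte.
my $sjis_char = qr/[\x00-\x7F\xA1-\xDF]|[\x81-\x9F\xE0-\xFC][\x40-\x7E\x80-\xFC]/;

# Naive approach: search for the byte pair anywhere in the string.
sub count_naive {
    my ($bytes) = @_;
    my $n = () = $bytes =~ /\xF8\x9F/g;
    return $n;
}

# Boundary-aware approach: walk the string one character at a time and
# compare each whole character. Scanning stops at the first invalid byte.
sub count_by_char {
    my ($bytes) = @_;
    my $n = 0;
    while ($bytes =~ /\G($sjis_char)/gc) {
        $n++ if $1 eq "\xF8\x9F";
    }
    return $n;
}

# Two ordinary two-byte characters: 0x95 0xF8 followed by 0x9F 0x40.
# The middle two bytes happen to form the pair 0xF8 0x9F.
my $tricky = "\x95\xF8\x9F\x40";
print count_naive($tricky), "\n";      # prints 1 (false match)
print count_by_char($tricky), "\n";    # prints 0
```

    Matching every character this way is what makes the scan slow on long
    input.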

AUTHOR
    Tatsuhiko Miyagawa <miyagawa@bulknews.net>

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

SEE ALSO
    the HTML::Entities manpage, the Unicode::Japanese manpage,
    http://www.nttdocomo.co.jp/p_s/imode/tag/emoji/