Qkc is a kanji code converter that handles Shift-JIS, EUC, and JIS.
Unlike nkf, qkc can handle multiple files at a time. Qkc can also
convert end-of-line characters, e.g. from CR+LF to LF or CR, and
vice versa.
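The two conversions described above (character encoding and end-of-line
characters) can be sketched as follows. This is only an illustration in
Python, not qkc's own code: the function name convert, the EOL table, and
the codec names shift_jis, euc_jp and iso2022_jp (Python's standard codec
names) are assumptions made for this example.

```python
# Minimal sketch, NOT qkc's implementation: re-encode Japanese text and
# normalise end-of-line characters in one pass.
# `convert` and `EOL` are hypothetical names used only for illustration.

EOL = {"lf": "\n", "crlf": "\r\n", "cr": "\r"}


def convert(data: bytes, src: str, dst: str, eol: str = "lf") -> bytes:
    """Decode `data` from `src`, rewrite line endings, and encode as `dst`."""
    text = data.decode(src)
    # Collapse every existing line ending to LF first (CR+LF before bare CR),
    # so that the final replacement cannot double-convert anything.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.replace("\n", EOL[eol]).encode(dst)


if __name__ == "__main__":
    sjis = "漢字\r\nテスト\r\n".encode("shift_jis")
    euc = convert(sjis, "shift_jis", "euc_jp", eol="lf")
    print(euc.decode("euc_jp"), end="")
```

Handling several files at once, as qkc does, would amount to calling such
a function in a loop over the file names given on the command line.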
Roget's Thesaurus was produced by Project Gutenberg. This file was
converted from the original dictionary into JIS X 4081 format (a
subset of EPWING V1) by FreePWING, so it can be used with EPWING
viewers on Unix and other operating systems (e.g. Windows or MacOS).
o URL for the original dictionary:
http://promo.net/pg/
o URL for this converted dictionary:
This is the MeCab library module for Perl5.
This is the ChaSen library module for Perl5.
This is the Kakasi library module for Perl5.
plain2 r2.54 1994/04 by A.Uchida NEC Corporation
usage: plain2 [options] [files ...]
---- parser options ----                 ---- output options ----(default)
-table=dd: table factor [0-100](def=50)  -roff:        troff output
-exam=dd:  example factor[0-100](def=50) -ms/-mm:      troff macro (mm)
-indsec:   sections can be indented      -tex:         tex output
-ktable:   enable JIS keisen table       -tstyle=ss:   tex style
-ref:      figure/picture reference      -renum:       renumbering only
                                         -[no]listd:   list decoration (on)
---- Others ----                         -[no]space:   spacing (on)
-v:        verbose output                -[no]pre:     preamble block (on)
-dLevel:   debug level                   -[no]acursec: section numbers (off)
----- experimental ----                  -raw:         quote special chars(off)
-pt=Size:  font size                     -jis:         JIS code output
                                         -sjis:        Shift-JIS code input/output
                                         -f file:      output customization
libmecab (http://mecab.sourceforge.ne.jp) already has a Perl interface
built with it, so why a new module? The difference is subtle, but I
feel that exposing the Perl interface through a tied hash is just...
weird.
So Text::MeCab gives you a more natural, Perl-ish way to access
libmecab!
It is suited to scraping a popular Japanese BBS.
Other BBSes, news sites, and other sites can also be supported by
adding scraping plugins.
Please throttle your requests so that you do not flood a site with
excessive access.
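
One simple way to follow the advice about excessive access is to enforce
a minimum interval between requests. The sketch below is a generic
illustration in Python and is not part of this module: the class name
Throttle, its parameters, and the 2-second interval are all assumptions
chosen for the example.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive requests.

    Hypothetical helper for illustration only; not part of the scraper.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests
        self._clock = clock               # injectable for testing
        self._sleep = sleep               # injectable for testing
        self._last = None                 # time of the previous request

    def wait(self):
        """Block until `min_interval` seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


if __name__ == "__main__":
    throttle = Throttle(2.0)  # assumed interval; pick one the site tolerates
    for page in ("thread-1", "thread-2", "thread-3"):  # placeholder names
        throttle.wait()
        print("would fetch", page)
```

Calling wait() before every fetch keeps the request rate bounded no
matter how quickly the surrounding loop runs.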