This chapter will get you started using the tools provided by the xml-keyval package. We will be assuming that you are somewhat familiar with the following concepts (hopefully at least two of the following will ring some bells):
The keyval2sh command can be a handy way to start a new bash script. The command treats a <keyval> XML document as a representation of command line arguments.
The following sections will take you through the basic steps of creating a new script whose purpose is to find all of the unique words in a text file that may be misspelled and report them to the user.
We're going to be lazy and let the keyval2sh command stub in the necessary template files for us. We'll do this in a new directory with the following set of commands:
[pkb@salsa pkb]$
mkdir $HOME/bogus
[pkb@salsa pkb]$
cd $HOME/bogus
[pkb@salsa bogus]$
keyval2sh --name spellcheck
creating template file: spellcheck_config.xml
Generating: spellcheck_config.sh
Generating: spellcheck_config.docbook_synopsis
Generating: spellcheck_config.docbook_options
Creating Template: spellcheck.sh
Creating Template: spellcheck.mk
Creating Template: spellcheck.docbook
Finished - finally!
STEP 1: Try invoking:
make -f spellcheck.mk; man ./spellcheck.l; ./spellcheck --help
STEP 2: Then edit files:
spellcheck.sh, spellcheck_config.xml and spellcheck.docbook
STEP 3: Repeat steps 1 and 2 until life is good.
[pkb@salsa bogus]$
ls
spellcheck_config.docbook_options spellcheck_config.xml spellcheck.sh
spellcheck_config.docbook_synopsis spellcheck.docbook
spellcheck_config.sh spellcheck.mk
[pkb@salsa bogus]$
If you looked at the above very long, you're probably just about ready to hunt me down and shoot me. All you wanted to do was create a simple spell checker, and all of a sudden, you're faced with SEVEN new files for crying out loud.
Before you get too upset, let me explain that THREE of the files never need to be edited (they will often be replaced during the build process). Here's a rough breakdown of the purpose each file serves:
spellcheck_config.xml
This file contains the definition of the command line arguments. We will edit this file to add, modify and/or remove command line arguments. After modifying this file, we will use the keyval2sh command to generate new versions of spellcheck_config.sh, spellcheck_config.docbook_synopsis, and spellcheck_config.docbook_options. As we may edit this file, the keyval2sh script will leave it alone on future invocations.
spellcheck.sh
We will be completely replacing this
file. This is where the guts of our script will be
placed. It should be noted that the
spellcheck_config.sh
file will be
"glued" above this portion of code. Hence, by the time our
code is reached, the command line arguments will already
have been validated and placed in environment variables
for us. As we may edit this file, the
keyval2sh script will leave it alone on
future invocations.
spellcheck.docbook
This file provides a template man page in DocBook.org form. If we had specified the --doc-type m4 option, the keyval2sh script would have provided us with m4 style man page templates. As we may edit this file, the keyval2sh script will leave it alone on future invocations.
spellcheck.mk
This file provides a set of rough make
rules such that we should be able to build our script and
its corresponding man page via a make -k -f
spellcheck.mk
invocation. As we may edit this file, the
keyval2sh script will leave it alone on
future invocations.
spellcheck_config.sh
This file provides the script code to parse the command line arguments specified by the user (it performs a very simple form of validation). The makefile combines it with the spellcheck.sh file to produce the final script. This file is automatically generated each time the spellcheck_config.xml is modified. DON'T EDIT - unless you have some time to waste.
spellcheck_config.docbook_synopsis
This file provides the synopsis portion
of the man page and is included by the
spellcheck.docbook
file. This file is
automatically generated each time the
spellcheck_config.xml
is
modified. DON'T EDIT - unless you
have some time to waste.
spellcheck_config.docbook_options
This file provides the description portion
of the man page and is included by the
spellcheck.docbook
file. This file is
automatically generated each time the
spellcheck_config.xml
is
modified. DON'T EDIT - unless you
have some time to waste.
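To make the "glued above" idea a little more concrete, here is a hand-written sketch of the kind of preamble that could sit in front of the script body. It is NOT the code that keyval2sh generates. The function name parse_args and the behaviour of error() are assumptions made up for illustration; only the option names, the IN and OUT variable names, and the "-" defaults mirror the options that the template stubs in.

```shell
#!/bin/bash
# Illustrative stand-in for a generated preamble; NOT the real spellcheck_config.sh.

# error() is assumed to print a message to stderr (the real helper may differ).
error() {
    echo "ERROR: $*" >&2
}

# parse_args is a hypothetical name. It fills IN and OUT from the command
# line, falling back to "-" (stdin/stdout) when an option is omitted.
parse_args() {
    IN="-"
    OUT="-"
    while [ $# -gt 0 ]; do
        case "$1" in
            -i|--in)
                [ $# -ge 2 ] || { error "missing value for $1"; return 1; }
                IN="$2"; shift 2 ;;
            -o|--out)
                [ $# -ge 2 ] || { error "missing value for $1"; return 1; }
                OUT="$2"; shift 2 ;;
            *)
                error "unknown option: $1"; return 1 ;;
        esac
    done
}
```

Code placed below a preamble like this can simply read "$IN" and "$OUT" without doing any parsing of its own, which is the whole point of the split between the two files.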
Before we do any work, let's try out what we started with. We will attempt to build the utility (as stubbed in), review the man page, invoke the command with the --help option, and then run the command without any command line arguments. Please note, if you don't have the docbook-xsl package installed locally (it typically ends up under /usr/share/sgml/docbook) on your system, the build will try to access the public docbook-xsl files on the Internet. Since the build may access the Internet, you will want to verify that your setting of XML_COMMON_JAVA includes the proper proxy server information for your LAN. This is described in the Environment Variables section. On a side note, the Solaris version of man can't be used to format a specific file. Because of this, the example session below uses a groff invocation that works on both Solaris and Linux.
[pkb@salsa bogus]$
make -k -f spellcheck.mk
[pkb@salsa bogus]$
groff -Tascii -man spellcheck.l | less
SPELLCHECK(l) SPELLCHECK(l)
NAME
spellcheck - Used to wash dishes.
SYNOPSIS
spellcheck [-i FILENAME | --in FILENAME] [-o FILENAME | --out FILENAME]
[-v true|false | --verbose true|false] [-l FILENAME |
--log-file FILENAME]
DESCRIPTION
The spellcheck is pretty darn handy when it comes to washing dishes. It
can:
o Scrub them clean.
o Dry them dry.
:q
[pkb@salsa bogus]$
./spellcheck --help
Usage:
spellcheck [-h|--help] [-i|--in FILENAME] [-o|--out FILENAME]
[-v|--verbose true|false] [-l|--log-file FILENAME]
Where:
-i|--in FILENAME Currently:[-]
The name of the input file to process.
-o|--out FILENAME Currently:[-]
The name of the output file to produce.
-v|--verbose true|false Currently:[false]
Produce verbose output.
-l|--log-file FILENAME Currently:[/dev/null]
Used to log diagnositic output to a file.
-h|--help
Displays this usage information.
[pkb@salsa bogus]$
./spellcheck
PLEASE INSERT YOUR OWN CODE!
Available settings for the spellcheck script:
IN="-"
OUT="-"
VERBOSE="false"
LOG_FILE="/dev/null"
[pkb@salsa bogus]$
For as little work as we've done so far, we're off to a
pretty good start. The next step is to replace
spellcheck.sh
with the code for our
script.
We will want to replace the block of code in spellcheck.sh that the keyval2sh command stubbed in for us with the code that actually does the spell checking. We will plan on making use of the --in IN and --out OUT command line options that have already been stubbed in. The following block of code should do what we want:
Figure 3.1. Meat of spellcheck.sh
#
# Look for spell checker on system and initialize command to run
#
# Store only the path here, so that the -x test below checks a real file;
# the "pipe" argument is appended once aspell is known to exist.
SPELL="$(which aspell 2>/dev/null)"
if [ -x "${SPELL:-bogusXXX}" ]; then
    SPELL="$SPELL pipe"
else
    SPELL="$(which ispell 2>/dev/null)"
    if [ -x "${SPELL:-bogusXXX}" ]; then
        SPELL="$SPELL -a"
    else
        error "unable to locate aspell or ispell on system"
        exit 2
    fi
fi
# Split into one word per line, sort, drop duplicates, spell check, and
# filter out the "*" (word is OK) and empty lines from the checker output.
# Note that tee leaves a copy of the sorted words in sort.txt.
CMD="tr '\t\ ' '\n\n' | sort | tee sort.txt | uniq | $SPELL |\
 grep -v '^*$' | grep -v '^$'"

#
# Check input. If '-' use stdin, otherwise verify file exists
# makes use of error() function provided in spellcheck_config.sh
#
if [ "$IN" = "-" ]; then
    CMD="$CMD"
elif [ ! -r "$IN" ]; then
    error "unable to read from file: $IN"
    exit 1
else
    CMD="cat '$IN' | $CMD"
fi

#
# Check output file. If '-' use stdout,
# otherwise verify we can write to it
#
if [ "$OUT" = "-" ]; then
    CMD="$CMD"
elif touch "$OUT" && rm -f "$OUT"; then
    CMD="$CMD >'$OUT' 2>&1"
else
    error "unable to create output file: $OUT"
    exit 1
fi

#
# Invoke the command to perform the spell check
#
eval "$CMD"
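If you'd like to poke at the word-splitting half of that pipeline without having aspell or ispell around, the following sketch isolates it. The function name unique_words is made up for this example, and it uses sort -u in place of the sort | uniq pair, so treat it as a rough equivalent rather than the exact code from the figure.

```shell
#!/bin/bash
# unique_words (hypothetical name): read text on stdin and print each
# distinct whitespace-separated word once, in sorted order.
unique_words() {
    # Turn spaces and tabs into newlines (squeezing runs of them), drop any
    # empty lines that remain, then sort and remove duplicates in one step.
    tr -s ' \t' '\n\n' | sed '/^$/d' | LC_ALL=C sort -u
}

# Example: feed it a line of text on stdin.
echo "the cat and the hat" | unique_words
```

Whatever this prints is exactly what the spell checker would be fed, one candidate word per line.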
Let's build and run our new spell checking utility:
[pkb@salsa bogus]$
make -k -f spellcheck.mk
cat "spellcheck_config.sh" "spellcheck.sh" > "spellcheck"
chmod +x "spellcheck"
[pkb@salsa bogus]$
echo "Don't mispell misspell!" | ./spellcheck
@(#) International Ispell Version 3.1.20 (but really Aspell .33.7.1 alpha)
& mispell 8 0: misspell, mi spell, mi-spell, Ispell, ispell, misspells, misplay, spell
[pkb@salsa bogus]$
echo "Is it convienent for you?" > file.txt
[pkb@salsa bogus]$
./spellcheck --in file.txt
@(#) International Ispell Version 3.1.20 (but really Aspell .33.7.1 alpha)
& convienent 9 0: convenient, Continent, continent, confinement, convent, contingent, convergent, convened, covenant
[pkb@salsa bogus]$
./spellcheck --in file.txt -o err.txt
[pkb@salsa bogus]$
cat err.txt
@(#) International Ispell Version 3.1.20 (but really Aspell .33.7.1 alpha)
& convienent 9 0: convenient, Continent, continent, confinement, convent, contingent, convergent, convened, covenant
[pkb@salsa bogus]$
./spellcheck --help
Usage:
spellcheck [-h|--help] [-i|--in FILENAME] [-o|--out FILENAME]
[-v|--verbose true|false] [-l|--log-file FILENAME]
Where:
-i|--in FILENAME Currently:[-]
The name of the input file to process.
-o|--out FILENAME Currently:[-]
The name of the output file to produce.
-v|--verbose true|false Currently:[false]
Produce verbose output.
-l|--log-file FILENAME Currently:[/dev/null]
Used to log diagnositic output to a file.
-h|--help
Displays this usage information.
[pkb@salsa bogus]$
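The cat and chmod lines that make printed during the build are the entire "glue" step. The sketch below repeats that step on two tiny stand-in files so you can see the effect in isolation. The function name glue_script and the demo file names are made up for this example; the real rule lives in spellcheck.mk and may well do more than this.

```shell
#!/bin/bash
# glue_script (hypothetical name): concatenate a preamble and a body into one
# executable file, the same way the cat/chmod lines in the make output do.
glue_script() {
    preamble="$1"; body="$2"; target="$3"
    cat "$preamble" "$body" > "$target" && chmod +x "$target"
}

# Stand-in files, created in a scratch directory for the demonstration.
demo_dir="$(mktemp -d)"
printf '#!/bin/sh\nGREETING="hello"\n' > "$demo_dir/demo_config.sh"
printf 'echo "$GREETING from the body"\n' > "$demo_dir/demo.sh"

glue_script "$demo_dir/demo_config.sh" "$demo_dir/demo.sh" "$demo_dir/demo"

# Run the glued result; the body sees the variable set by the preamble.
sh "$demo_dir/demo"
```

Because the preamble comes first in the combined file, anything it sets (here GREETING, in our script IN and OUT) is already in place when the body starts running.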
Well, it looks like our script is working pretty much as
expected. The only thing left to do is clean up the man page a
bit and remove the --verbose
and
--log-file
options from
spellcheck_config.xml
as we don't plan on
supporting them.
We let the keyval2sh script generate a stub DocBook.org file to represent the man page for this script. Let's adjust the following:
Change this:
<!ENTITY author.name.first "Joe">
<!ENTITY author.name.last "Blow">
<!ENTITY author.email "joe.blow@vaisala.com">
To something like this (use your own name and email please - I really only want people contacting me about the crud that I create):
<!ENTITY author.name.first "Paul">
<!ENTITY author.name.last "Blankenbaker">
<!ENTITY author.email "paul@mekwin.com">
Next, change this:
<para>The <command>&name;</command> is pretty darn handy when it comes to washing dishes. It can:</para>
<itemizedlist>
  <listitem><para>Scrub them clean.</para></listitem>
  <listitem><para>Dry them dry.</para></listitem>
</itemizedlist>
To this:
<para>The <command>&name;</command> is prints a sorted list of misspelled words in the file you feed it.</para>
OK, it isn't the greatest man page ever written, but at least it's a start. Before we recompile, let's remove the definition of the --verbose and --log-file arguments from our master definition file. This is done simply by removing the following lines from spellcheck_config.xml:
Figure 3.2. Modified spellcheck_config.xml
<boolean default="false" varname="VERBOSE">
  <key short="v">verbose</key>
  <summary>Produce verbose output.</summary>
  <description>When you set this option to true, <command>spellcheck</command> will produce additional output. This is typically used for diagnostic purposes to help track down when things go wrong.</description>
</boolean>
<file default="/dev/null" varname="LOG_FILE">
  <key short="l">log-file</key>
  <summary>Used to log diagnositic output to a file.</summary>
  <description>If you are encountering errors when running <command>spellcheck</command>, try using this option and examine the log file after running <command>spellcheck</command>. It may provide useful clues as to what went wrong.</description>
</file>
Now, let's rebuild the script and verify that both the man page and short help output match the updates we just made:
[pkb@salsa bogus]$
make -k -f spellcheck.mk
java org.apache.xalan.xslt.Process -IN "spellcheck_config.xml" -out "spellcheck_config.sh" -xsl "/home/pkb/usr/share/vaisala/xml/keyval/xsl/keyval2sh.xsl"
cat "spellcheck_config.sh" "spellcheck.sh" > "spellcheck"
chmod +x "spellcheck"
java org.apache.xalan.xslt.Process -IN "spellcheck_config.xml" -out "spellcheck_config.docbook_options" -xsl "/home/pkb/usr/share/vaisala/xml/keyval/xsl/keyval2docbook_options.xsl"
java org.apache.xalan.xslt.Process -IN "spellcheck_config.xml" -out "spellcheck_config.docbook_synopsis" -xsl "/home/pkb/usr/share/vaisala/xml/keyval/xsl/keyval2docbook_synopsis.xsl"
java org.apache.xalan.xslt.Process -IN "spellcheck.docbook" -XSL "/usr/share/sgml/docbook/xsl-stylesheets/manpages/docbook.xsl" -OUT "spellcheck.l"
Using original entity definition for "ı".
Using original entity definition for "<".
Using original entity definition for ">".
Using original entity definition for "&".
Using original entity definition for """.
Using original entity definition for "'".
file:///usr/share/sgml/docbook/xsl-stylesheets/html/chunker.xsl; Line #94; Column #-1; Writing spellcheck.l for refentry(spellcheck)
java org.apache.xalan.xslt.Process -IN "spellcheck.docbook" -XSL "/usr/share/sgml/docbook/xsl-stylesheets/html/docbook.xsl" -OUT "spellcheck.man.html"
Using original entity definition for "ı".
Using original entity definition for "<".
Using original entity definition for ">".
Using original entity definition for "&".
Using original entity definition for """.
Using original entity definition for "'".
[pkb@salsa bogus]$
./spellcheck --help
Usage:
spellcheck [-h|--help] [-i|--in FILENAME] [-o|--out FILENAME]
Where:
-i|--in FILENAME Currently:[-]
The name of the input file to process.
-o|--out FILENAME Currently:[-]
The name of the output file to produce.
-h|--help
Displays this usage information.
[pkb@salsa bogus]$
man ./spellcheck.l
SPELLCHECK(l) SPELLCHECK(l)
NAME
spellcheck - Used to wash dishes.
SYNOPSIS
spellcheck [-i FILENAME | --in FILENAME] [-o FILENAME | --out FILENAME]
DESCRIPTION
The spellcheck is prints a sorted list of misspelled words in the file
you feed it.
OPTIONS
The following command line options are available:
This argument is used to specify the name of the input file to be pro-
cessed by spellcheck. If omitted, it defaults to - indicating that
spellcheck should read its input from the console.
This argument is used to specify the name of the output file to be pro-
duced by . If omitted, it defaults to - indicating that spellcheck
should write its input to the console.
:q
[pkb@salsa bogus]$
OK, we didn't clean up everything in the man page, but
we've made some progress. The important thing to note is that
our changes to spellcheck_config.xml
found their way into both the man page and the script.
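The reason the rebuild picked up our edit is that make compares file modification times and reruns a rule when a source is newer than the file built from it. The sketch below mimics that check in plain shell. The function name needs_regen and the demo file names are made up for this example, and the actual dependencies are whatever spellcheck.mk declares, so read this as an illustration of the idea only.

```shell
#!/bin/bash
# needs_regen (hypothetical name): succeed when TARGET is missing or older
# than SOURCE, which is roughly the question make asks before running a rule.
needs_regen() {
    target="$1"; source_file="$2"
    [ ! -e "$target" ] || [ "$source_file" -nt "$target" ]
}

# Demonstration with scratch files and explicit timestamps.
scratch="$(mktemp -d)"
touch -t 202001010000 "$scratch/demo_config.sh"    # generated earlier
touch -t 202101010000 "$scratch/demo_config.xml"   # edited later

if needs_regen "$scratch/demo_config.sh" "$scratch/demo_config.xml"; then
    echo "demo_config.sh is stale; it would be regenerated"
fi
```

This is also why hand edits to the generated files are a waste of time: the next change to the XML file makes them stale, and the build overwrites them.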
This concludes the tutorial on using the keyval2sh command. Hopefully it was enough to get you started and on your way.