Building Unix Tools with Ruby
Read Command-Line Options and Arguments
The specification presented in an earlier section lists several
options that csvt should understand. Your script can
access the list of options and arguments in two ways: by reading them
directly from the ARGV array (filled in automatically when
your script starts) or by using the
GetoptLong class to parse ARGV for you. The
latter method is preferred: it's easier and saves time.
GetoptLong is not loaded by default, so it must be explicitly
required before you can use it:
require 'getoptlong'
After your script imports getoptlong, you will also need
to create a new instance of GetoptLong:
opts = GetoptLong.new(
  [ "--extract", "-e", GetoptLong::REQUIRED_ARGUMENT ],
  [ "--remove",  "-r", GetoptLong::REQUIRED_ARGUMENT ],
  [ "--help",    "-h", GetoptLong::NO_ARGUMENT ],
  [ "--usage",   "-u", GetoptLong::NO_ARGUMENT ],
  [ "--version", "-v", GetoptLong::NO_ARGUMENT ]
)
Each array passed to GetoptLong.new holds the long and the
short name of one option, followed by the argument flag that fine-tunes
how the option parser implemented in GetoptLong treats that
option. The example above shows how the csvt option
specification is turned into code. It is a good habit to define both long
and short options, but if for some reason that isn't possible or desired,
you can leave one of them out and list only the name you want to define.
The argument flag can be set to REQUIRED_ARGUMENT,
NO_ARGUMENT, or OPTIONAL_ARGUMENT. The
GetoptLong option and argument parser uses these settings to
decide how it should interpret the contents of ARGV.
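To see how the three argument flags differ, here is a minimal sketch. The
option names --name, --verbose, and --color, as well as the helper
parse_flags, are made up for this illustration and are not part of csvt.
The helper overwrites ARGV only so that the example can run without a
real command line.

```ruby
require 'getoptlong'

# Hypothetical helper for illustration: parses the given list of words
# as if they had been typed on the command line.
def parse_flags(argv)
  ARGV.replace(argv)   # GetoptLong always reads from ARGV
  opts = GetoptLong.new(
    [ "--name",    "-n", GetoptLong::REQUIRED_ARGUMENT ],
    [ "--verbose", "-V", GetoptLong::NO_ARGUMENT ],
    [ "--color",         GetoptLong::OPTIONAL_ARGUMENT ]  # long name only
  )
  seen = {}
  opts.each { |opt, arg| seen[opt] = arg }   # remember each option's argument
  seen
end

p parse_flags(["-n", "data", "--verbose", "--color"])
p parse_flags(["--color=red"])
```

Options that take no argument, and optional arguments that were left out,
reach the loop with an empty string in arg.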
Once you have a properly initialized instance of the option parser, you
can add code that checks which options have been selected and what mistakes
have been made. GetoptLong provides a lot of help here; your
job is limited to defining a few variables and handling any errors
that may occur at this stage.
First, let's define a few top-level variables:
version        = "0.0.1" # used by the --version or -v option handler
extract_f      = false   # set to true when --extract or -e is used
extract_args   = []      # stores the list of arguments of --extract or -e
remove_f       = false   # set to true when --remove or -r is used
remove_args    = []      # stores the list of arguments of --remove or -r
ex_options_n   = 0       # counts the mutually exclusive options used;
                         # when > 1, the script will terminate
have_options_f = false   # set to true when at least one option is used
Next, you need to check which options have been used. The block of code
that tests this and sets the parameters controlling the behavior of
csvt follows the pattern shown below:
begin
  opts.each do |opt, arg|
    case opt
    when option
      ... option handler ...
    when option
      ... option handler ...
    end
  end
rescue
  ... handle exceptions ...
end
The begin-rescue-end construct that wraps the
opts.each do loop adds an exception handler, the
rescue-end part, which provides a way to gracefully handle
unexpected situations. We need that handler because we do not want the
user to see the trace messages printed by the Ruby interpreter when
GetoptLong raises an exception. A short error message and a
help screen are much more user-friendly.
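A minimal sketch of such a handler follows. The helper try_parse and its
return values are invented for this illustration. It rescues
GetoptLong::Error, the parent class of the exceptions the parser raises,
and sets quiet so that GetoptLong does not print its own message; the
code developed in this article uses a bare rescue instead.

```ruby
require 'getoptlong'

# Hypothetical helper: returns :ok when argv parses cleanly, or a short
# error string when GetoptLong rejects it.
def try_parse(argv)
  ARGV.replace(argv)            # GetoptLong always reads from ARGV
  opts = GetoptLong.new(
    [ "--extract", "-e", GetoptLong::REQUIRED_ARGUMENT ],
    [ "--help",    "-h", GetoptLong::NO_ARGUMENT ]
  )
  opts.quiet = true             # suppress GetoptLong's own error message
  begin
    opts.each { |opt, arg| }    # real option handlers would go here
    :ok
  rescue GetoptLong::Error => e
    "csvt: #{e.message}"        # one short line instead of a backtrace
  end
end

puts try_parse(["-e", "1,2"])   # valid option with its argument
puts try_parse(["-w"])          # undefined option
puts try_parse(["--extract"])   # required argument is missing
```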
Let's get down to the details. The opts.each do |opt,
arg| loop reads options and their arguments, if any are
expected:
begin
  opts.each do |opt, arg|
Should the user supply an undefined option (e.g.,
-w), GetoptLong will display an error message
about the unrecognized option and raise an exception, which stops the
execution of the script unless it is handled. This sounds a bit drastic,
but as you will see in a moment, you can handle that situation easily.
If the value of opt is one of the known options (e.g.,
--extract), it will be examined by the following
case control structure, which sets the extract_f
flag and checks which columns from the source file the user wants to
print.
Notice that it does not matter if the user uses the long or the short
version of the --extract option. GetoptLong
treats them both as the same option, which means that you only need to
write one handler.
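A quick way to confirm this is to collect the names that the loop
receives. The helper names_seen below is made up for this illustration,
and it overwrites ARGV so that the example can run without a real command
line.

```ruby
require 'getoptlong'

# Hypothetical helper: returns the option names that the loop receives.
def names_seen(argv)
  ARGV.replace(argv)   # GetoptLong always reads from ARGV
  opts = GetoptLong.new(
    [ "--extract", "-e", GetoptLong::REQUIRED_ARGUMENT ]
  )
  names = []
  opts.each { |opt, arg| names << opt }
  names
end

p names_seen(["-e", "1,2"])         # prints ["--extract"]
p names_seen(["--extract", "1,2"])  # prints ["--extract"]
```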
    case opt
    when "--extract"
      extract_f = true
      extract_args = arg.split(",")
      tmp = 0
      extract_args.each do |column|
        begin
          extract_args[tmp] = Integer(column)
          tmp += 1
        rescue
          $stderr.print "csvt: non-integer column index\n"
          printusage(1)
        end
      end
      ex_options_n += 1
      have_options_f = true
The --extract option handler sets the
extract_f flag, splits the argument that follows the option
(remember, it is a list of numbers separated with commas), and checks that
every item in the list is an integer index. When
all goes well, the ex_options_n exclusive options counter is
incremented and the have_options_f flag is set to indicate
that at least one option was selected by the user. The counter is used
later to detect mutually exclusive options, and the flag to detect that no
option was given at all.
Because the --extract and --remove options
are quite similar in the way they work, their handlers are also almost
identical (see below).
    when "--remove"
      remove_f = true
      remove_args = arg.split(",")
      tmp = 0
      remove_args.each do |column|
        begin
          # store the converted index back into remove_args
          remove_args[tmp] = Integer(column)
          tmp += 1
        rescue
          $stderr.print "csvt: non-integer column index\n"
          printusage(1)
        end
      end
      ex_options_n += 1
      have_options_f = true
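Since the two handlers differ only in the variables they set, the shared
validation could be moved into a helper. The sketch below is one possible
refactoring, not the code used in the rest of this article. The name
parse_columns is made up here, and the helper raises ArgumentError
instead of calling printusage so that it can be tried on its own.

```ruby
# Hypothetical helper: turns "1,5,6" into [1, 5, 6].
# Integer() raises ArgumentError for a string that is not an integer,
# so a bad index such as "x" is reported rather than silently ignored.
def parse_columns(list)
  list.split(",").map do |column|
    begin
      Integer(column)
    rescue ArgumentError
      raise ArgumentError, "non-integer column index: #{column.inspect}"
    end
  end
end

p parse_columns("4,1")   # prints [4, 1]

begin
  parse_columns("1,x,3")
rescue ArgumentError => e
  puts "csvt: #{e.message}"
end
```

With such a helper, each handler would shrink to setting its flag and
assigning the returned array to extract_args or remove_args.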
Requests for csvt version information are handled by the
code shown below. Notice that it doesn't matter if other options were
used. Once --version or -v is found,
csvt prints version information and exits with 0 (no
errors).
    when "--version"
      print $0, ", version ", version, "\n"
      exit(0)
Should the user need some help on csvt usage, our script
displays the help screen and exits with 0.
    when "--help"
      printusage(0)
    when "--usage"
      printusage(0)
    end
  end
Once the loop ends, it's time to check for possible errors like mutually exclusive and missing options. Both are considered errors and result in displaying an error message followed by the help screen.
  #################################################################
  # test for mutually exclusive options: --extract and --remove
  if ex_options_n > 1
    $stderr.print $0, ": cannot use --extract (-e) and --remove (-r) together\n"
    printusage(1)
  end
  #################################################################
  # test for missing options
  if have_options_f == false
    printusage(1)
  end
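Because ex_options_n is incremented every time either handler runs,
repeating the same option (for example, -e 1 -e 2) also trips the first
test. A possible alternative, sketched here under the assumption that the
two flags alone should decide, is to test extract_f and remove_f
directly. The helper option_conflict and the wording of its second
message are made up for this illustration.

```ruby
# Hypothetical helper: returns an error message, or nil when the
# combination of options is acceptable.
def option_conflict(extract_f, remove_f)
  if extract_f && remove_f
    "cannot use --extract (-e) and --remove (-r) together"
  elsif !extract_f && !remove_f
    "one of --extract (-e) or --remove (-r) is required"
  end
end

p option_conflict(true, false)   # prints nil: only --extract was used
puts option_conflict(true, true)
puts option_conflict(false, false)
```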
The last piece of the option-processing block of code is the exception
handler, which prints the help screen, exits csvt, and
returns error code 1.
rescue
  # all other errors
  printusage(1)
end
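One detail worth checking is that printusage, which ends by calling exit,
is itself called inside the begin block. exit works by raising
SystemExit, and a bare rescue only catches StandardError and its
subclasses, so the handler does not intercept the script's own exits. The
sketch below demonstrates this; the helper name exit_status_of is made up
for the illustration.

```ruby
# Hypothetical helper: calls exit inside a bare rescue and returns the
# exit status that escaped that handler.
def exit_status_of(code)
  begin
    begin
      exit(code)
    rescue
      return :swallowed   # not reached: a bare rescue ignores SystemExit
    end
  rescue SystemExit => e
    e.status              # the status that would be returned to the shell
  end
end

p exit_status_of(0)   # prints 0
p exit_status_of(1)   # prints 1
```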
Your code should look like this now:
require 'getoptlong'
version        = "0.0.1" # used by the --version or -v option handler
extract_f      = false   # set to true when --extract or -e is used
extract_args   = []      # stores the list of arguments of --extract or -e
remove_f       = false   # set to true when --remove or -r is used
remove_args    = []      # stores the list of arguments of --remove or -r
ex_options_n   = 0       # counts the mutually exclusive options used;
                         # when > 1, the script will terminate
have_options_f = false   # set to true when at least one option is used
def printusage(error_code)
  print "csvt -- extract columns of data from a CSV (Comma-Separated Values) file\n"
  print "Usage: csvt [POSIX or GNU style options] file ...\n\n"
  print "POSIX options GNU long options\n"
  print " -e col[,col][,col]... --extract col[,col][,col]...\n"
  print " -r col[,col][,col]... --remove col[,col][,col]...\n"
  print " -h --help\n"
  print " -u --usage\n"
  print " -v --version\n\n"
  print "Examples: \n"
  print "csvt -e 1,5,6 file print columns 1, 5 and 6 from file\n"
  print "csvt --extract 4,1 file print columns 4 and 1 from file\n"
  print "csvt -r 2,7,1 file print all columns except 2, 7 and 1 from file\n"
  print "csvt --remove 6,0 file print all columns except 6 and 0 from file\n"
  print "cat file | csvt --remove 6,0 print all columns except 6 and 0 from file\n\n"
  print "Send bug reports to bugs@foo.bar\n"
  print "For licensing terms, see source code\n"
  exit(error_code)
end
opts = GetoptLong.new(
  [ "--extract", "-e", GetoptLong::REQUIRED_ARGUMENT ],
  [ "--remove",  "-r", GetoptLong::REQUIRED_ARGUMENT ],
  [ "--help",    "-h", GetoptLong::NO_ARGUMENT ],
  [ "--usage",   "-u", GetoptLong::NO_ARGUMENT ],
  [ "--version", "-v", GetoptLong::NO_ARGUMENT ]
)
begin
  opts.each do |opt, arg|
    case opt
    when "--extract"
      extract_f = true
      extract_args = arg.split(",")
      tmp = 0
      extract_args.each do |column|
        begin
          extract_args[tmp] = Integer(column)
          tmp += 1
        rescue
          $stderr.print "csvt: non-integer column index\n"
          printusage(1)
        end
      end
      ex_options_n += 1
      have_options_f = true
    when "--remove"
      remove_f = true
      remove_args = arg.split(",")
      tmp = 0
      remove_args.each do |column|
        begin
          remove_args[tmp] = Integer(column)
          tmp += 1
        rescue
          $stderr.print "csvt: non-integer column index\n"
          printusage(1)
        end
      end
      ex_options_n += 1
      have_options_f = true
    when "--help"
      printusage(0)
    when "--usage"
      printusage(0)
    when "--version"
      print "csvt, version ", version, "\n"
      exit(0)
    end
  end
  #################################################################
  # test for mutually exclusive options: --extract and --remove
  if ex_options_n > 1
    $stderr.print "csvt: cannot use --extract (-e) and --remove (-r) together\n"
    printusage(1)
  end
  #################################################################
  # test for missing options
  if have_options_f == false
    printusage(1)
  end
rescue
  printusage(1)
end