CSc 250: Lecture Notes: Argparse

Introduction

We previously learned how to access command-line arguments using the sys.argv list. Arguments that are expected to show up at an exact location on the command line (a particular index of sys.argv) are known as positional arguments. How such an argument is treated is determined by its exact position on the command line. A non-positional argument is one that isn’t expected to show up at a specific position. Rather, it is expected to show up after a particular identifier that begins with a dash (-). Alternatively, a non-positional argument can be the identifier itself.

Earlier in the semester, we used both positional and non-positional arguments. For example, let’s look at a cut bash command that we’ve seen before:

cut -d " " -f 1 grades.txt

-d " " is a non-positional argument. -d is an identifier, which indicates that the cut delimiter is the argument following this one (" "). -f 1 is also a non-positional argument. -f is an identifier, which indicates that the column number to cut is the argument following this one (1). grades.txt is a positional argument. cut expects the last argument to be the file to cut from.
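To see where each piece of that command lands in an argument list, here is a small sketch. It uses Python’s shlex module to split the command roughly the way bash would; the variable names are only for illustration.

```python
import shlex

# Split the command roughly the way the shell would before starting cut.
command = 'cut -d " " -f 1 grades.txt'
argv = shlex.split(command)

# Show each argument next to its index in the list.
for index, argument in enumerate(argv):
    print(index, repr(argument))
```

The quoted space arrives as its own argument, directly after -d, and grades.txt is the last item in the list.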

We can rearrange the non-positional arguments, because cut does not depend on their exact position in the argument list. Both of the following work the same:

cut -d " " -f 1 grades.txt
cut -f 1 -d " " grades.txt

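But we cannot count on being able to move the positional argument. Neither of the following is guaranteed to work, because some versions of cut require the file to come after all of the identifiers: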

cut grades.txt -f 1 -d " "
cut -f 1 grades.txt -d " "

It’s fairly clear how we can support positional arguments in Python, given what we know about the sys.argv list. But how can we support non-positional arguments?

One option is to write our own logic to handle this. To accomplish this, we would have to search through sys.argv for the identifiers we are interested in, and then check the following argument to make sure it is what we expect. Doing this manually is a pain in the butt, and there are a lot of edge cases we would need to worry about.
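To get a feel for what the manual approach involves, here is a minimal sketch. The function find_option and the list example_argv are made up for this illustration (they are not part of any library), and the sketch handles only one of the many edge cases.

```python
def find_option(argv, identifier):
    """Return the value that follows identifier in argv, or None if absent."""
    for i, argument in enumerate(argv):
        if argument == identifier:
            # Edge case: the identifier is the last argument, so no value follows.
            if i + 1 >= len(argv):
                raise ValueError(identifier + ' expects a value')
            return argv[i + 1]
    return None


# A stand-in for sys.argv so the sketch runs without a real command line.
example_argv = ['prog.py', '-f', '1', '-d', ' ', 'grades.txt']

print(find_option(example_argv, '-f'))          # the value after -f
print(find_option(example_argv, '-d') == ' ')   # the value after -d is a space
print(find_option(example_argv, '-q'))          # -q was never given
```

Even this much code still ignores repeated identifiers, unknown identifiers, values that themselves begin with a dash, and the question of which leftover arguments are positional.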

Fortunately, Python’s standard library includes a module called argparse that handles command-line arguments, including non-positional ones, for us.

argparse

The first step in using the argparse library is to import it, like we have done before with sys.

import argparse

Next, we must initialize an argument parser object by running:

parser = argparse.ArgumentParser()

Now parser is a variable that we can use to describe the kinds of arguments we expect to see on the command line.

To tell parser that we expect a particular argument, we can call the add_argument function. For example, we can tell it that we want to handle -x, -y, and -z by doing:

parser.add_argument('-x')
parser.add_argument('-y')
parser.add_argument('-z')

Once we’ve specified all of the arguments we want the program to handle, we must tell the parser to process the command-line arguments. To do this:

args = parser.parse_args()

The args variable stores the processed arguments. To access one of the processed arguments, type args, followed by a dot, followed by the name of the identifier (without the dash). To print out what the user entered for -x, write:

print( args.x )

A complete (but very simple) example, named ap-args.py:

import argparse

parser = argparse.ArgumentParser()

parser.add_argument('-x')
parser.add_argument('-y')
parser.add_argument('-z')

args = parser.parse_args()

print('-x is: ' + str(args.x))
print('-y is: ' + str(args.y))
print('-z is: ' + str(args.z))

If we run this and do not specify any arguments on the command line, we will get:

$ python3 ap-args.py
-x is: None
-y is: None
-z is: None

If we run it and specify arguments other than the ones argparse expects to see (the ones that we told it about), then it will print a usage message and an error message, and then exit the program.

$ python3 ap-args.py testing one two three
usage: ap-args.py [-h] [-x X] [-y Y] [-z Z]
ap-args.py: error: unrecognized arguments: testing one two three
$ python3 ap-args.py -f FEE -g GEE
usage: ap-args.py [-h] [-x X] [-y Y] [-z Z]
ap-args.py: error: unrecognized arguments: -f FEE -g GEE

This is good, because argparse automatically makes sure that the user of the program is following the expected argument requirements. But, if we specify only expected arguments, argparse will process them for us and allow us to access them as expected.

$ python3 ap-args.py -x Superman
-x is: Superman
-y is: None
-z is: None
$ python3 ap-args.py -z Batman -x Superman
-x is: Superman
-y is: None
-z is: Batman
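When experimenting, it can be convenient to try these cases from a single script instead of retyping commands. The sketch below relies on the fact that parse_args accepts a list of strings to use in place of the real command line. Passing prog='ap-args.py' is only there so the usage message matches the file name used above, and the name exit_code is just for illustration.

```python
import argparse

parser = argparse.ArgumentParser(prog='ap-args.py')
parser.add_argument('-x')
parser.add_argument('-y')
parser.add_argument('-z')

# Passing a list makes parse_args use it instead of the real command line.
args = parser.parse_args(['-z', 'Batman', '-x', 'Superman'])
print(args.x, args.y, args.z)

# Unrecognized arguments make argparse print the usage and error messages
# and then exit, which Python signals by raising SystemExit.
try:
    parser.parse_args(['-f', 'FEE'])
    exit_code = None
except SystemExit as stop:
    exit_code = stop.code
print('exit code:', exit_code)
```

The first call behaves like the last run shown above. The second shows that the "exit the program" step is an ordinary SystemExit carrying a nonzero exit code.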

Help!

argparse has several nifty, built-in features. One such feature is the ability to automatically generate a help message.

Imagine a scenario where we have a python program, and we know it is supposed to take some command line arguments, but we don’t know which arguments it expects. By default, argparse understands that -h means “print out a help message”.

$ python3 ap-args.py -h
usage: ap-args.py [-h] [-x X] [-y Y] [-z Z]

optional arguments:
  -h, --help  show this help message and exit
  -x X
  -y Y
  -z Z

This tells us everything argparse knows about the arguments it supports. (In Python 3.10 and later, the heading reads options: instead of optional arguments:.) Notice that the information provided about -x, -y and -z is not all that useful. It tells us that the program supports these, but it doesn’t tell us what these things represent! Fortunately, argparse provides a way for the programmer to give extra information and specify extra requirements for each of these argument identifiers. Before talking about this, we need to detour into named arguments.
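As a brief aside before that detour: the same help text is also available from inside a program. The sketch below assumes we only want to look at the text rather than print it and exit. format_help is a method of the parser object, and prog='ap-args.py' is passed only so that the usage line matches the file name used in these notes.

```python
import argparse

# prog is set explicitly so the usage line says ap-args.py no matter
# what file this sketch is saved in.
parser = argparse.ArgumentParser(prog='ap-args.py')
parser.add_argument('-x')
parser.add_argument('-y')
parser.add_argument('-z')

# format_help returns the help message as a string instead of printing it.
help_text = parser.format_help()
print(help_text)
```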

Optional and Named Arguments

(If you already know about optional and named arguments, feel free to skip this section)

We have already learned about how we can pass arguments to functions in python.

For example, take this simple function:

def print_info(name, year, email):
    print('Person: ' + name)
    print('  * born: ' + str(year))
    print('  * email-address: ' + str(email))

If we import the module containing this function (functions.py) into a Python shell, we can call it like so:

>>> functions.print_info('Ben', 1920, 'bdd@gmail.com')
Person: Ben
  * born: 1920
  * email-address: bdd@gmail.com

As we know, we must specify all of the arguments, otherwise python will complain:

>>> functions.print_info()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print_info() missing 3 required positional arguments: 'name', 'year', and 'email'
>>> functions.print_info('Ben', 1920)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: print_info() missing 1 required positional argument: 'email'

We also need to give them to the function in the exact order that they were specified in the function definition. Otherwise we will get output that we don’t want/expect.

>>> functions.print_info('Ben', 'bdd@gmail.com', 1920)
Person: Ben
  * born: bdd@gmail.com
  * email-address: 1920

Python actually gives us the flexibility to make some arguments optional. In reality, the arguments are not truly optional, but we can give an argument a default value so that if we don’t specify the argument, Python knows what its value is. This is accomplished by assigning a default value to the argument in the function definition. In the example below, we give each argument of print_info a default value.

def print_info(name='John', year=2000, email='john@gmail.com'):
    print('Person: ' + name)
    print('  * born: ' + str(year))
    print('  * email-address:' + str(email))

Now, each of the arguments is optional. If the function is called with no arguments, we will no longer get an error. Instead, the function will run with each argument set to its default value:

>>> functions.print_info()
Person: John
  * born: 2000
  * email-address:john@gmail.com

With default values, we can specify only some of the arguments, and it will still work:

>>> functions.print_info('Ben', 1920)
Person: Ben
  * born: 1920
  * email-address:john@gmail.com

The default value was only used for the argument that we did not give the function.

What do we do if we only want to specify the email argument? Since it’s the last argument, just doing the following will not work, because Python interprets the value as the name.

>>> functions.print_info('bdd@gmail.com')
Person: bdd@gmail.com
  * born: 2000
  * email-address:john@gmail.com

This is where named arguments (which the Python documentation calls keyword arguments) come into play. Any argument that a function takes can be specified out of order by explicitly assigning it in the call to the function. This happens like so:

>>> functions.print_info( email = 'bdd@gmail.com' )
Person: John
  * born: 2000
  * email-address:bdd@gmail.com

We were able to specify the last argument (email) by assigning it a value directly within the call to this function. Named arguments allow us to pass values to a function in any order and combination that we want.

>>> functions.print_info( email='bdd@gmail.com', name='Benito', year=1777 )
Person: Benito
  * born: 1777
  * email-address:bdd@gmail.com
>>> functions.print_info( email='bdd@gmail.com', year=2013 )
Person: John
  * born: 2013
  * email-address:bdd@gmail.com
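Positional and named arguments can also be mixed in a single call, as long as the positional ones come first. The sketch below repeats the print_info definition from above so that it runs on its own; the variable duplicate_error is only there for illustration.

```python
def print_info(name='John', year=2000, email='john@gmail.com'):
    print('Person: ' + name)
    print('  * born: ' + str(year))
    print('  * email-address:' + str(email))


# 'Ben' is matched to name by position, email is given by name,
# and year falls back to its default value.
print_info('Ben', email='bdd@gmail.com')

# Giving the same argument twice (once by position, once by name)
# is an error, which Python reports as a TypeError.
duplicate_error = None
try:
    print_info('Ben', name='Benito')
except TypeError as error:
    duplicate_error = error
print('TypeError raised:', duplicate_error is not None)
```

Writing a positional value after a named one, as in print_info(email='bdd@gmail.com', 'Ben'), is rejected before the program even runs, because it is a syntax error.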

(More) Help!

Now that we know (a little bit) about optional and named arguments, let’s get back to the -h conversation. I mentioned before that we can give argparse more information about each argument identifier. This is accomplished by giving values to some of the optional arguments of add_argument (the ones that have default values).

One of the more useful named/optional arguments that can be specified is help, which should be assigned a string that describes the purpose of the argument. Below is a modified version of the ap-args.py program from earlier, named personinfo.py. This prints out the same information that the print_info function did, but it takes the name, email, and year born as command-line arguments.

import argparse

parser = argparse.ArgumentParser()

parser.add_argument('-n', help='Your name')
parser.add_argument('-y', help='The year you were born in')
parser.add_argument('-e', help='Your email address')

args = parser.parse_args()

print('Person: ' + str(args.n))
print('  * born: ' + str(args.y))
print('  * email-address:' + str(args.e))

Notice that each call to the function add_argument now has a help=... specified. help is an optional argument to add_argument. We are not required to specify it, but if we do, then the string will show up in the help message when we run the program with -h.

$ python3 personinfo.py -h
usage: personinfo.py [-h] [-n N] [-y Y] [-e E]

optional arguments:
  -h, --help  show this help message and exit
  -n N        Your name
  -y Y        The year you were born in
  -e E        Your email address

Now it is clear what each argument is meant to be used for.

$ python3 personinfo.py -n Ben -e ben@email.com -y 2000
Person: Ben
  * born: 2000
  * email-address:ben@email.com
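help is not the only named argument that add_argument accepts. As a sketch (a variation for illustration, not the personinfo.py shown above), suppose we wanted the year as a number and wanted fallback values like the ones print_info had. The type and default arguments of add_argument can do that. The list passed to parse_args stands in for the command line.

```python
import argparse

parser = argparse.ArgumentParser(prog='personinfo.py')

# default supplies a value when the identifier is left off the command line.
parser.add_argument('-n', default='John', help='Your name')
# type=int converts the text from the command line into an integer.
parser.add_argument('-y', type=int, default=2000, help='The year you were born in')
parser.add_argument('-e', help='Your email address')

# Stand-in for running: python3 personinfo.py -n Ben -y 1990
args = parser.parse_args(['-n', 'Ben', '-y', '1990'])

print('Person: ' + args.n)
print('  * born: ' + str(args.y))
print('  * turned ten in: ' + str(args.y + 10))  # arithmetic works because y is an int
print('  * email-address:' + str(args.e))
```

Without type=int, args.y would be the string '1990', because argparse hands back text unless it is told otherwise.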