Python Documentation Tips and Tricks
by Cameron Laird05/17/2001
Traditional thinking judges the merit of a computing language on objective, rather mathematical dimensions: how object-oriented is it? Are its constructs orthogonal? Does the syntax admit ambiguities? While that's appropriate for abstract analysis, when we work or play with a computing language in our daily life, we live an intimacy best judged along a more human dimension.
What's important and best about Python is not its use of white space, or capacity for metaprogramming, or any other single feature in isolation. Python's use is exploding because it fits well. It's a tool engineered with good balance, the right combination of rigor and flexibility to match our human needs. Yet one of the best things about Python is its documentation -- especially the kinds of documentation reference books and tutorials rarely explain.
How to look at the documentation
Python programmers like its libraries. The answer to most of the common, "How do you ...?" questions is, "Use the ... module." Part of the reason for this rich tradition of reuse is that Python modules are generally well-documented, and it's easy to make new modules with good documentation.
Consider os as an example. The os module
provides "a more portable way of using operating system (OS) dependent
functionality than importing an OS dependent built-in module." That's
what the online version of the Global Module
Index tells us. Hyperlinks on the Index page lead to a variety of
formats suitable for online browsing and printing. Nearby on the same
site is an explanation by Fred L. Drake of how to document Python,
informing document authors of the LaTeX markup practices typically
used with Python modules.
The culture of the Python community reinforces high standards in documentation. Since the existing documentation is good, code authors expect that they will need to do as well for new contributions. Programmers' efforts to document their code carefully boosts Python's strength in documentation even further and attracts more practitioners for whom documentation is important. The cycle continues.
You can download standard documentation and use it from local storage. A standard installation of Python on Windows includes includes on the desktop a "Python Documentation" shortcut to local forms of the same browsable documents just mentioned.
__doc__ magic
More distinctive is the help available from within the Python
interpreter. Many Python programmers work within a live Python
processing session, composing code "on the fly". In this context, you
can interrogate Python objects (including modules, methods, and
functions) by asking them about their __doc__ value. In
the middle of such a session, you might, for example, wonder about
file permissions:
>>> import os
>>> os.access.__doc__
'access(path, mode) -> 1 if granted, 0 otherwise
Test for access to a file.'
>>>
When you first meet __doc__, you may wonder what the
point is. It apparently only abbreviates the standard manual entry in
a slightly clumsier form. Experienced Pythoneers use
__doc__ all the time, though. A simple
__doc__ printout often gives just the right information
-- the order of parameters, for example -- that otherwise would take
many annoying seconds to find in the full documentation.
__doc__ operates in its own virtuous circle. Because
the best Python programmers use it so often, library authors know
high-quality __doc__ descriptions will be
appreciated. There's also a simple technical trick behind
__doc__ maintenance encouraging its proliferation. Most
attributes are assigned or, more properly, bound with
assignment statements:
thing.attribute = 'This is a description.'
For the special case of __doc__, though, the docstring
convention allows abbreviation: when the first statement of a
definition is a string, that string is taken as the
__doc__ (or docstring) of the defined object. An example
might look like
def phase_of_the_moon():
"This function returns a slightly randomized
integer that shuffles data around in a way
convenient for the XYZ class."
# Working code here.
return value
The "This function ..." explanation is
phase_of_the_moon.__doc__, the function's docstring.
Building on the __doc__ base
Because most Python programmers write at least minimal docstrings, several useful tools leverage this information. Pytext transforms docstrings into DOM trees -- an XML-related data structure, handy for preparation of indexes, among other uses. Docutils collects more general-purpose machinery to transform a variety of documentation formats. Last month, this site focused attention on one particular module, PyDoc, the purpose of which is to render documentation, and especially docstrings, in various convenient forms.
Among all these tools, the most beautiful in its simplicity and power is doctest. Doctest is a convention and process for embedding simple self-testing examples in docstrings. As doctest's own documentation explains, "The doctest module searches a module's docstrings for text that looks like an interactive Python session, then executes all such sessions to verify they still work exactly as shown." The documentation goes on to present an example definition of a factorial function, whose docstring embeds such interactive exercises as
>>> [factorial(n) for n in range(6)]
[1, 1, 2, 6, 24, 120]
>>> factorial(30)
265252859812191058636308480000000L
The definition of factorial is designed for re-use by other Python code, of course. At the same time, by conforming to the doctest format, the definition can be executed in isolation, as a validation of its content. This kind of introspective play makes good language interpreters enormously productive.
Doctest builds on another social convention of the Python community. One of the continuing challenges with languages such as C is how to design and implement unit tests; this is generally regarded as an thorny problem, best left to more senior programmers. Python's proud tradition is that each definition should embed its own unit tests. Python source files often end in the idiom
if __name__ == '__main__':
# Self-testing code goes here.
else:
# This will be executed in case the
# source has been imported as a
# module.
This idiom means that the source is available either for re-use as a module by other applications or for immediate self-testing execution. In the case of the doctest example, the self-testing code imports doctest and applies it to the source file itself. This makes for the ultimate in succinctness:
def _test():
import doctest, example
return doctest.testmod(example)
if __name__ == "__main__":
_test()
By building on strong traditions of Python use, these few lines achieve a powerful result in self-validation.
More jewels in documentation management
These themes of automation and strong social expectations about documentation pervade Python programming. Other languages have comparable and even superior technical answers for questions about documentation. Python is essentially unique, though, in the extent to which its programmers fulfill the promise of the tools. Pythondoc is a tool like Javadoc: it generates documentation from source-code comments. This, in turn, depends on observance of the Python styleguide.
Python documentation keeps getting better. All the tools and references mentioned here remain in active development. One area now blossoming is "structured text" (ST), a cover term for conventions of formatting docstrings and other textual material. Advances in processing ST aim to make it even easier for novice programmers or nonprogrammers to supply content that benefits from Python's power. Why ST? Briefly, "civilians" can read and write ST. It's less rigorous than XML, for example, and it's far more readable for most humans. For example, ST recognizes paragraphs implicitly:
def something():
"This is a first paragraph.
This is a second paragraph. The intervening
blank line means this must be a new paragraph."
# ...
Contrast this with HTML, which has an explicit tag, <P>, for paragraphs.
Summary
There's more to Python than its syntax and reference manuals. To get the most from Python, learn to document "the Python way". Look for the docstrings that other programmers have written; they're likely to hold a lot of meat. Write your own source comments in good style and, especially, include good docstrings in your work. However well you code "raw" source, good supplementary documentation, in traditional Python style, helps others to quickly understand your work.
Cameron Laird is the vice president of Phaseit, Inc. and frequently writes for the O'Reilly Network and other publications.
Return to the Python DevCenter.