Python Packages and You
Jean-Paul Calderone wrote an excellent blog post on the right way to structure a python project. This post will build on that post by covering concrete examples of how to write imports, how to distribute your package, and what not to do.
As an example, we’ll be looking at my own project, passacre. There is a
python package called
passacre immediately in the root of the project
(i.e. not in a
src directory). Here’s what’s inside it:
1 2 3 4 5 6 7 8 9 10 11 12 13
There is exactly one way code in your package should be importing other code inside your package, and that’s with absolute imports.
So, in this code,
passacre.config needs to import some things from
passacre.multibase. The import looks like this:
Note that the import uses the fully-qualified name of the module,
passacre.multibase. This is important for two reasons:
- It makes your code very explicit in what you’re importing.
- Your code will continue to work in the future. There’s a very, very old
method of doing imports (“implicit relative imports”) wherein one would
from multibase import Multibaseinstead, but this is gone in python 3. It should’ve been removed from python 2.7, but unfortunately, they forgot to do it.
There is another style of imports also covered in PEP 328, wherein the code would look instead look like this:
I’ll only mention explicit relative imports once more in this post; I very much prefer the look of absolute imports, but this is primarily an issue of style.
Test module imports
passacre also comes with a test suite in the
passacre.test submodule. These
tests necessarily have to import the passacre code to test it. Here’s what the
passacre.test.test_multibase module does to import
“But wait! That’s the same as before!” you cry. Since the fully-qualified module name is always the same, the imports are also always the same. This is another advantage of absolute imports—you always know exactly what you’re getting.
Sibling subpackage imports
For demonstration purposes, we’ll pretend that there’s another subpackage
passacre.test2 that contains even more tests. Here’s the imaginary addition
to the package structure:
test_multibase module needs to use some utility function from
passacre.test.util, so it imports the function:
Using absolute imports means that it doesn’t matter where in the project a module is when you try to import another module. Python will always look up the module to import from the root of your package.
Shadowed standard library module imports
Something I’m sure everyone has done at some point in their python career is
name a module the same as a module in the standard library. If I wanted a
passacre.socket module that used the standard library
import socket should give me the standard library module instead of
passacre.socket. PEP 328 covers this too by adding a
absolute_import, which is on by default in python 3.0 and higher.
absolute_import has exactly one effect: disabling implicit
relative imports. Since
__future__ features only affect the module enabling
them, I can simply add
from __future__ import absolute_import to
passacre/socket.py and any
import socket calls in it will import the
Once your package has some code in it that can run, you’ll probably want to let other people use your package as well.
setup.py is your package’s entry point into distutils or
setuptools. Every package should have one, as that’s how your code gets
put into the right place for your users. I won’t go too much into detail in how
to use them; the links above are for tutorials which will explain how to get
My own rule of thumb is to use distutils unless I absolutely require a feature from setuptools. Some people will say to always use setuptools, and that’s okay too.
Jean-Paul’s post recommends putting python scripts in the
of your project root. This is fine advice, but there are newer methods which
mean you don’t need a
bin directory at all.
__main__.py. This is not quite the same as an executable
script, but it allows a package to be executed via
python -m. For example,
passacre/__main__.py means that one can execute
python -m passacre and the
__main__.py file will be executed. PEP 338 has more details on exactly
how this works.
passacre/__main__.py is fairly small:
__main__.py is executed directly, the actual main function should live
elsewhere if you actually want to be able to test it. This way, test code can
from passacre.application import main and be able to call that
python -m is not limited to executing packages. For times when you want to
quickly run some module that has an
if __name__ == '__main__': block in it,
you can use
python -m passacre.thatmodule. Importantly, this will ensure that
python recognizes your whole package, which means that your imports won’t fail.
Sometimes a project wants explicit binaries instead of requiring their users to
python -m. For this case, setuptools has automatic script creation.
Here’s an excerpt from passacre’s
1 2 3 4 5 6 7 8 9 10
setup.py uses extras, but that’s not important for this blog
console_scripts definition makes a
passacre executable that does the
same thing as the
__main__.py script did: imports the
main function from
passacre.application and calls it.
Using distutils or setuptools to generate and/or install your python scripts is
important—as a part of the install process, the shebang line (e.g.
#!/usr/bin/env python) will be rewritten to use the python that ran
setup.py. So, if someone does
/usr/local/bin/python2.6 setup.py install,
passacre will start with
python -m and automatic script creation can be used beautifully
together—a project can have a
scripts subpackage containing one module for
every executable script. For example,
passacre_generate could be
passacre_generate = passacre.scripts.generate:main and also contain:
For projects with a lot of scripts, this is a good organization method to keep the scripts separate from the more library-ish code.
Please have some, especially if you’re distributing your package. Docstrings
are a good start, but a tutorial written with sphinx and/or a
with usage examples goes a long way.
Don’t directly run modules inside packages
Seriously. Never ever ever do this. What I mean specifically is doing
python passacre/test/test_application.py, or
anything else that starts with
python passacre/. This will prevent python
from knowing where the root of your package is, and so absolute imports and
explicit relative imports will fail, or not import the code you think it should
If you think you have to do this, you’re wrong. Instead, you probably want
python -m, or generate an executable script to run, or
use a test runner. If you don’t think any of those cover your use case,
leave a comment and I’ll show you a better method.
PYTHONPATH to try to make it go
If you think you have to set
PYTHONPATH, you’ve probably fallen victim to the
first pitfall and are trying to execute a module directly. With proper package
layout and proper imports, you won’t need to set
PYTHONPATH to run your code.
sys.path from code in your package
PYTHONPATH, but worse because it’s easier to affect other
people using your code. Never do this, as it will break and make people
trying to use your code very annoyed.
Don’t make your project root a package
Your project root should contain a python package, not be a package itself.
If you do this, your
setup.py will be very confusing (or not work at all) and
it will be very difficult to run your code.
Really, writing a python package isn’t hard. The rules above are very simple and will lead to simple, easy-to-read-and-understand python code.