Modules and Packages – Overview

Concepts: File, Directory, Namespace, Import

Modules and Files

As programs get large, it is important to be able to “modularize” them into independent pieces that can be worked on separately by different programmers.  A module in Python is a file that contains Python definitions and statements.  When a module is first imported into a program, the statements in the file are executed, its symbols are defined, and the resulting definitions are made available to the importing program through the module's namespace, which Python implements as a dictionary.  Later imports of the same module reuse the already-loaded module rather than executing the file again.

What definitions are created in a Python module?  Any symbols defined at the “top level” of the file become part of the dictionary for that module.  These symbols include function and class names, and the names of global variables (those assigned outside any function or class definition in the module, or assigned inside a function that declares them with the “global” keyword).  See the examples later in this unit.
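As a minimal sketch of the idea above, the following builds a module object from source text and lists the names that ended up in its dictionary.  The module name “shapes” and the symbols PI, area, and Circle are hypothetical, and the source string stands in for a real file so that the example is self-contained.

```python
import types

# Source text standing in for a hypothetical file named shapes.py.
SOURCE = '''
PI = 3.14159                  # global variable (top level)

def area(r):                  # function (top level)
    return PI * r * r

class Circle:                 # class (top level)
    def __init__(self, r):
        self.r = r
        local_note = "local"  # local variable, NOT a module symbol
'''

# Create an empty module object and execute the source in its namespace,
# which is roughly what an import does with the contents of a file.
shapes = types.ModuleType("shapes")
exec(SOURCE, shapes.__dict__)

# The module's dictionary holds every top-level symbol (names that start
# with a double underscore are bookkeeping entries, so they are skipped).
top_level = sorted(name for name in vars(shapes) if not name.startswith("__"))
print(top_level)  # ['Circle', 'PI', 'area']
```

Note that “local_note” does not appear in the list, because it is assigned inside a function body rather than at the top level.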

How are symbols accessed in a Python module?  Once imported, a symbol is accessed using the dot notation, e.g., module.symbol.  The module name is the name of the file without the “.py” extension.  The import statement in Python provides a few different ways to import symbols from a module into a program; these are illustrated in the examples section.
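The most common forms of the import statement can be sketched as follows.  The standard library module “math” is used here only as a convenient stand-in for a module you might write yourself.

```python
# Form 1: import the module, then qualify each symbol with the module name.
import math
root_a = math.sqrt(16.0)

# Form 2: import selected symbols directly into the current namespace,
# so they can be used without a qualifier.
from math import sqrt
root_b = sqrt(16.0)

# Form 3: import the module under a shorter alias.
import math as m
root_c = m.sqrt(16.0)

print(root_a, root_b, root_c)  # 4.0 4.0 4.0
```

The first form keeps every symbol qualified by its module name, which makes name clashes least likely; the second form is shorter to type but places the imported name directly in the importing program's namespace.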

An important feature of modules is that they allow programmers to work independently on a project without having to worry about clashes between the names of global variables, functions, or classes.  Each symbol is “qualified by” the name of the module in which it appears.  So, if two programmers each define a function named “doit()”, as long as the functions appear in different modules (e.g., alice.py and bob.py), those functions are accessed using different qualifiers (e.g., alice.doit() and bob.doit()).
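The alice.py and bob.py scenario can be tried out with the sketch below.  It writes the two files into a temporary directory so that it can run anywhere; the function bodies and return strings are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Two programmers each define a function with the same name.
    Path(tmp, "alice.py").write_text("def doit():\n    return 'alice did it'\n")
    Path(tmp, "bob.py").write_text("def doit():\n    return 'bob did it'\n")

    # Make the temporary directory importable for the duration of the imports.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import alice
        import bob
    finally:
        sys.path.remove(tmp)

# The two functions do not clash, because each is qualified by its module.
results = (alice.doit(), bob.doit())
print(results)  # ('alice did it', 'bob did it')
```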

Module Search Path

When you import a module into a program, how does the Python interpreter find it?  Does the file need to be in the “current directory”?

Python uses a “search path” to locate the file to be imported.  By default, the Python interpreter first searches for a built-in module with the given name.  If none is found, Python then searches, in order, the directories listed in the variable “sys.path”.  This list generally starts with the name of the directory containing the program being run, or with an empty string, indicating the current directory.  Thus, it is straightforward for a program to import other modules that are located in the same directory.
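The search path can be inspected from within a program.  The short sketch below checks whether a name belongs to a built-in module and prints the first few entries of “sys.path”; the entries themselves depend on how and where Python is installed and run, so no particular output is shown for them.

```python
import sys

# Names of the modules compiled into the interpreter.  These are looked
# for first, before any directory on sys.path is consulted.
sys_is_builtin = "sys" in sys.builtin_module_names
print(sys_is_builtin)  # True

# sys.path is an ordinary list, searched from front to back.
for position, entry in enumerate(sys.path[:3]):
    print(position, repr(entry))
```

Because “sys.path” is an ordinary list, a program can also add a directory to it at run time, although placing modules in the program's own directory is usually simpler.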

Packages

As we’ve seen, modules can be used to avoid name clashes for global variables, functions, and classes.  But how can name clashes between modules themselves be prevented?  What if two programmers working independently want to use the same module name?

Packages are another level of namespace for Python programs.  A package is typically represented by a file system directory containing the special file “__init__.py”.  The presence of this file in a directory located in one of the search path directories indicates to the Python interpreter that the directory is a regular package whose name is given by the name of the directory.  (Since Python 3.3, a directory without “__init__.py” can also be imported as a “namespace package”, but regular packages still use the file.)  Any modules (files with “.py” extension) in that directory are part of the package.  Modules in a package can be accessed using the standard dot notation (e.g., matplotlib.pyplot).  See the examples later in this unit.

So, two programmers can create modules with the same name, as long as the modules are located in packages (directories) with different names.
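The following sketch shows two packages that each contain a module with the same name.  The package names “alice_pkg” and “bob_pkg”, the module name “utils”, and the function “whoami()” are all hypothetical; the directories are created in a temporary location so that the example is self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    for pkg_name in ("alice_pkg", "bob_pkg"):
        pkg_dir = Path(tmp, pkg_name)
        pkg_dir.mkdir()
        # An (empty) __init__.py marks the directory as a regular package.
        (pkg_dir / "__init__.py").write_text("")
        # Both packages contain a module with the same file name, utils.py.
        (pkg_dir / "utils.py").write_text(
            f"def whoami():\n    return {pkg_name!r}\n"
        )

    # Put the parent directory of the packages on the search path.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import alice_pkg.utils
        import bob_pkg.utils
    finally:
        sys.path.remove(tmp)

# The two utils modules are distinct, because each is qualified by its package.
owners = (alice_pkg.utils.whoami(), bob_pkg.utils.whoami())
print(owners)  # ('alice_pkg', 'bob_pkg')
```

Note that it is the directory containing the packages, not the package directories themselves, that is placed on the search path.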