Using the pathlib module

One of our principal forms of interaction with the OS is working with files. The pathlib module makes this particularly flexible. While the OS can represent a path to a file as a string, there is considerable syntactic subtlety to the strings that are used. Rather than try to parse the strings directly, it's much more pleasant to create Path objects. A Path can be composed from its constituent parts and decomposed back into them.

Path composition uses the / operator to assemble a new Path from a starting Path and one or more str objects. This operator works for Windows as well as POSIX-compliant operating systems, such as Linux and macOS. Because a single operator will build appropriate paths on every platform, it's best to use Path objects for all filesystem access.

Here are some examples of building a Path object:

  • Path.home() / "some_file.dat": This names a given file in the user's home directory.
  • Path.cwd() / "data" / "simulation.csv": This names a file relative to the current working directory.
  • Path("/etc") / "profile": This names a file starting from the root of the filesystem.
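To see how the / operator adapts to each platform, here is a minimal sketch that uses the pure path classes from pathlib. These never touch the filesystem, so both flavors can be built on any OS. The Windows drive and folder names are assumptions chosen for illustration only.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Pure paths only manipulate names, so either flavor works on any OS.
posix_path = PurePosixPath("/etc") / "profile"

# Hypothetical Windows location, used only to show the separator change.
windows_path = PureWindowsPath("C:/Users") / "slott" / "some_file.dat"

print(posix_path)    # /etc/profile
print(windows_path)  # C:\Users\slott\some_file.dat
```

The same expression shape produces the separator that each platform expects. The concrete Path class picks the right flavor for the OS the program is running on.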

There are a number of interesting inquiries that we can make to find details about a given Path object. In some cases, we might want to know a path's parent directory or the extension on a filename. Here are some examples:

>>> from pathlib import Path
>>> p = Path.cwd() / "data" / "simulation.csv"
>>> p.parent
PosixPath('/Users/slott/mastering-oo-python-2e/data')
>>> p.name
'simulation.csv'
>>> p.suffix
'.csv'
>>> p.exists()
False

Note that these queries about a Path object do not depend on the path representing an actual object in the filesystem. In this example, the various properties of parent, name, and suffix are all reported correctly for a file that does not actually exist. This is very useful for creating output filenames from input files.
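A few more decomposition properties are worth knowing. The following sketch uses PurePosixPath so that it behaves the same on every platform. The path itself is hypothetical and does not need to exist.

```python
from pathlib import PurePosixPath

# A hypothetical path; pure paths never consult the filesystem.
archive = PurePosixPath("/home/user/data/simulation.tar.gz")

print(archive.parts)          # ('/', 'home', 'user', 'data', 'simulation.tar.gz')
print(archive.stem)           # simulation.tar (only the final suffix is dropped)
print(archive.suffixes)       # ['.tar', '.gz']
print(archive.parents[1])     # /home/user
print(archive.is_absolute())  # True
```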

For example, we might do the following:

>>> results = p.with_suffix('.json')
>>> results
PosixPath('/Users/slott/mastering-oo-python-2e/data/simulation.json')

We've taken an input Path object, p, and created an output Path object, results. The resulting object has the same parent directory and the same stem, simulation, but a different suffix. The new name was built by the with_suffix() method of a Path. This lets us create related files without having to parse the (relatively) complex path names.
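Other related names can be derived in the same way. This is a small sketch, assuming we want a backup copy alongside the input and a copy in a sibling summary directory. Both of those naming conventions are invented here for illustration.

```python
from pathlib import PurePosixPath

source = PurePosixPath("data") / "simulation.csv"

# Replace only the suffix.
json_path = source.with_suffix(".json")

# Replace the whole final component, reusing the stem and suffix.
# The "_backup" naming convention is a hypothetical choice.
backup_path = source.with_name(source.stem + "_backup" + source.suffix)

# Keep the filename but place it in a (hypothetical) subdirectory.
summary_path = source.parent / "summary" / source.name

print(json_path)     # data/simulation.json
print(backup_path)   # data/simulation_backup.csv
print(summary_path)  # data/summary/simulation.csv
```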

As we noted in Chapter 6, Using Callables and Contexts, a file should be used as a context manager to ensure that it's closed properly. A Path object can open a file directly, leading to programs that work like the following example:

output = Path("directory") / "file.dat"
with output.open('w') as output_file:
    output_file.write("sample data ")

This example creates a Path object. Because the path has no leading /, the OS interprets it relative to the current working directory. The open() method of a Path will create a file object that can then be used for reading or writing. In this case, we've opened the file in 'w' mode and are writing a constant string to it.
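To try this without leaving files behind, we can work inside a temporary directory. The following is a minimal sketch: the use of tempfile and the newline-terminated sample text are assumptions made for the demonstration, not part of the example above.

```python
import tempfile
from pathlib import Path

# A scratch directory that is removed when the with block ends.
with tempfile.TemporaryDirectory() as scratch:
    output = Path(scratch) / "file.dat"

    # Write through a context manager so the file is closed properly.
    with output.open("w") as output_file:
        output_file.write("sample data\n")

    # Read the text back using the same Path object.
    with output.open() as input_file:
        content = input_file.read()

    existed = output.exists()

print(repr(content))    # 'sample data\n'
print(existed)          # True
print(output.exists())  # False, because the scratch directory is gone
```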

We can use Path objects to manage directories as well as files. We often want to create working directories with code like the following:

>>> target = Path("data")/"ch18_directory"
>>> target.mkdir(exist_ok=True, parents=True)

We've assembled the Path object from a relative reference to the data directory and a specific subdirectory, ch18_directory. The mkdir() method of this Path object will ensure that the required directory structures are present in the filesystem. We've provided two common options. The exist_ok option will suppress the FileExistsError exception that would be raised if the directory already exists. The parents option will create all of the required parent directories. This can be handy when creating complex, nested directory trees.
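The effect of the two options can be checked with a short experiment. This sketch runs inside a temporary directory, which is an assumption made to keep the demonstration self-contained.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as scratch:
    target = Path(scratch) / "data" / "ch18_directory"

    # Creates both data and ch18_directory; repeating the call is harmless.
    target.mkdir(parents=True, exist_ok=True)
    target.mkdir(parents=True, exist_ok=True)

    # Without exist_ok, a second attempt raises FileExistsError.
    try:
        target.mkdir()
    except FileExistsError:
        second_attempt = "FileExistsError"
    else:
        second_attempt = "no error"

    created = target.is_dir()

print(created)         # True
print(second_attempt)  # FileExistsError
```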

A common use case when working with web logs is to segregate the logs by date. We can create date-specific directories with code similar to the following example:

>>> import datetime
>>> today = datetime.datetime.today()
>>> target = Path("data")/today.strftime("%Y%m%d")
>>> target.mkdir(exist_ok=True)

In this example, we've computed the current date. From this, we can create a directory path using the data subdirectory and the year, month, and day of the current date. We want to tolerate a directory that already exists, so we use exist_ok=True to suppress the exception. This will not create parent directories: if the data directory does not exist, a FileNotFoundError exception will be raised.
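The missing-parent behavior can be seen in a small sketch. The fixed date and the temporary directory are assumptions used to make the result repeatable. A real log processor would use today's date and a permanent location.

```python
import datetime
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as scratch:
    # A fixed, hypothetical date so the directory name is predictable.
    log_date = datetime.date(2019, 1, 31)
    target = Path(scratch) / "data" / log_date.strftime("%Y%m%d")

    # The data directory does not exist yet, so this attempt fails.
    try:
        target.mkdir(exist_ok=True)
    except FileNotFoundError:
        outcome = "parent missing"
    else:
        outcome = "created"

    # Adding parents=True builds the data directory as well.
    target.mkdir(exist_ok=True, parents=True)
    name = target.name
    is_dir = target.is_dir()

print(outcome)  # parent missing
print(name)     # 20190131
print(is_dir)   # True
```

Whether to pass parents=True is a design choice. Leaving it out turns a misconfigured base directory into an immediate error instead of silently creating a new tree.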
