Naming styles

The different naming styles used in Python are:

  • CamelCase
  • mixedCase
  • UPPERCASE, and UPPER_CASE_WITH_UNDERSCORES
  • lowercase and lower_case_with_underscores
  • _leading and trailing_ underscores, and sometimes __doubled__ underscores

Lowercase and uppercase elements are often a single word, and sometimes a few words concatenated. With underscores, they are usually abbreviated phrases. Using a single word is better. Leading and trailing underscores are used to mark private and special elements.

These styles are applied to:

  • Variables
  • Functions and methods
  • Properties
  • Classes
  • Modules
  • Packages

Variables

There are two kinds of variables in Python:

  • Constants
  • Public and private variables

Constants

For constant global variables, uppercase with underscores is used. It informs the developer that the given variable represents a constant value.

Note

There are no real constants in Python like those in C++, where const can be used. You can change the value of any variable. That's why Python uses a naming convention to mark a variable as a constant.

For example, the doctest module provides a list of option flags and directives (https://docs.python.org/2/library/doctest.html) that are small sentences, clearly defining what each option is intended for:

from doctest import IGNORE_EXCEPTION_DETAIL
from doctest import REPORT_ONLY_FIRST_FAILURE

These variable names seem rather long, but it is important to clearly describe them. Their usage is mostly located in initialization code rather than in the body of the code itself, so this verbosity is not annoying.

Note

Abbreviated names obfuscate the code most of the time. Don't be afraid of using complete words when an abbreviation seems unclear.

Some constants' names are also driven by the underlying technology. For instance, the os module uses some constants that are defined on the C side, such as the EX_XXX series, which defines Unix exit code numbers. The same names can be found, for example, in the system's sysexits.h C header file:

import os
import sys

sys.exit(os.EX_SOFTWARE)
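
The os.EX_XXX constants are documented as available on Unix only. A minimal sketch of a portable fallback follows, assuming the conventional sysexits.h value of 70 for EX_SOFTWARE; the EXIT_SOFTWARE name is hypothetical and not part of the os module:

```python
import os

# EXIT_SOFTWARE is a hypothetical module-level constant. It falls back to
# the sysexits.h value (70) on platforms where os.EX_SOFTWARE is missing.
EXIT_SOFTWARE = getattr(os, 'EX_SOFTWARE', 70)

print(EXIT_SOFTWARE)
```

The constant keeps its descriptive uppercase name, so calling code such as sys.exit(EXIT_SOFTWARE) reads the same on every platform.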

Another good practice when using constants is to gather them at the top of a module that uses them and combine them under new variables when they are intended for such operations:

import doctest
TEST_OPTIONS = (doctest.ELLIPSIS |
                doctest.NORMALIZE_WHITESPACE |
                doctest.REPORT_ONLY_FIRST_FAILURE)
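
As a rough sketch of how such a combined constant might be used, the flags can be passed to a doctest runner through its optionflags argument. The SAMPLE text and the test name "sample" are made up for illustration:

```python
import doctest

TEST_OPTIONS = (doctest.ELLIPSIS |
                doctest.NORMALIZE_WHITESPACE |
                doctest.REPORT_ONLY_FIRST_FAILURE)

# Hypothetical doctest text; the "..." in the expected output only
# matches because the ELLIPSIS flag is part of TEST_OPTIONS.
SAMPLE = """
>>> list(range(20))
[0, 1, ..., 19]
"""

parser = doctest.DocTestParser()
test = parser.get_doctest(SAMPLE, {}, 'sample', None, 0)

runner = doctest.DocTestRunner(verbose=False, optionflags=TEST_OPTIONS)
results = runner.run(test)

print(results.failed, results.attempted)
```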

Naming and usage

Constants are used to define a set of values the program relies on, such as the default configuration filename.

A good practice is to gather all the constants in a single file in the package. That is how Django works, for instance, where a module named settings.py provides all the constants. A comparable module in your own package could look like this:

# config.py
SQL_USER = 'tarek'
SQL_PASSWORD = 'secret'
SQL_URI = 'postgres://%s:%s@localhost/db' % (
    SQL_USER, SQL_PASSWORD
)
MAX_THREADS = 4

Another approach is to use a configuration file that can be parsed with the configparser module (named ConfigParser in Python 2), or an advanced tool such as ZConfig, which is the parser used in Zope to describe its configuration files. But some people argue that it is overkill to use another file format in a language such as Python, where a source file can be edited and changed as easily as a text file.
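
A minimal sketch of the configuration-file approach, assuming Python 3's configparser module. The section name, option names, and inline CONFIG_TEXT are invented for illustration; a real application would normally read the settings from a file on disk:

```python
import configparser

# Hypothetical INI content, kept inline so the example is self-contained.
CONFIG_TEXT = """
[database]
user = tarek
password = secret
max_threads = 4
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# Values are exposed under the same uppercase names a constants module would use.
SQL_USER = parser.get('database', 'user')
MAX_THREADS = parser.getint('database', 'max_threads')

print(SQL_USER, MAX_THREADS)
```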

For options that act like flags, a common practice is to combine them with bitwise operations, as the doctest and re modules do. The pattern taken from doctest is quite simple:

OPTIONS = {}


def register_option(name):
    return OPTIONS.setdefault(name, 1 << len(OPTIONS))


def has_option(options, name):
    return bool(options & name)

# now defining options
BLUE = register_option('BLUE')
RED = register_option('RED')
WHITE = register_option('WHITE')

You will get:

>>> # let's try them
>>> SET = BLUE | RED
>>> has_option(SET, BLUE)
True
>>> has_option(SET, WHITE)
False

When such a new set of constants is created, avoid using a common prefix for them, unless the module has several sets. The module name itself is a common prefix. Another solution would be to use the Enum class from the standard library's enum module and simply rely on the set collection instead of the binary operators. Unfortunately, the Enum class has limited applications in code that targets old Python releases because the enum module was only added in Python 3.4.
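
The Enum-based alternative could look roughly like the following sketch. The Color class and SELECTED name are illustrative, and has_option is reworked here to use set membership rather than a bitwise test:

```python
from enum import Enum


class Color(Enum):
    BLUE = 1
    RED = 2
    WHITE = 3


def has_option(options, name):
    # A membership test on a set replaces the bitwise AND.
    return name in options


# A set of enum members plays the role of the combined integer.
SELECTED = {Color.BLUE, Color.RED}

print(has_option(SELECTED, Color.BLUE))
print(has_option(SELECTED, Color.WHITE))
```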

Note

Using bitwise operations to combine options is common in Python. The inclusive OR (|) operator will let you combine several options in a single integer, and the AND (&) operator will let you check that the option is present in the integer (refer to the has_option function).
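
To make the arithmetic concrete, here is a small sketch that assigns each option one bit by hand. The values and the combined and without_red names are illustrative only:

```python
# Each option occupies its own bit: 0b001, 0b010, 0b100.
BLUE, RED, WHITE = 1 << 0, 1 << 1, 1 << 2

# OR sets the bits of both options in one integer.
combined = BLUE | RED

# AND keeps only the bits the two operands share, so a non-zero
# result means the option is present.
blue_present = bool(combined & BLUE)
white_present = bool(combined & WHITE)

# AND with the inverted mask clears a single option.
without_red = combined & ~RED

print(bin(combined), blue_present, white_present, bin(without_red))
```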

Public and private variables

For global variables that are mutable and freely available through imports, lowercase with underscores should be used. These kinds of variables are not used frequently, though, since the module usually provides getters and setters to work with them when they need to be protected. A leading underscore, in that case, can mark the variable as a private element of the package:

_observers = []

def add_observer(observer):
    _observers.append(observer)


def get_observers():
    """Makes sure _observers cannot be modified."""
    return tuple(_observers)

Variables that are located in functions and methods follow the same rules, and are never marked as private, since they are local to the context.

For class or instance variables, the private marker (the leading underscore) should be used only if making the variable part of the public signature does not bring any useful information, or is redundant.

In other words, if the variable is used internally by the class to provide a public feature, and is dedicated to this role, it is better to make it private.

For instance, the attributes that are powering a property are good private citizens:

class Citizen(object):
    def __init__(self):
        self._message = 'Rosebud...'

    def _get_message(self):
        return self._message

    kane = property(_get_message)

Another example would be a variable that keeps an internal state. This value is not useful for the rest of the code, but participates in the behavior of the class:

class UnforgivingElephant(object):
    def __init__(self, name):
        self.name = name
        self._people_to_stomp_on = []

    def get_slapped_by(self, name):
        self._people_to_stomp_on.append(name)
        print('Ouch!')

    def revenge(self):
        print('10 years later...')
        for person in self._people_to_stomp_on:
            print('%s stomps on %s' % (self.name, person))

Here is what you will see in an interactive session:

>>> joe = UnforgivingElephant('Joe')
>>> joe.get_slapped_by('Tarek')
Ouch!
>>> joe.get_slapped_by('Bill')
Ouch!
>>> joe.revenge()
10 years later...
Joe stomps on Tarek
Joe stomps on Bill

Functions and methods

Functions and methods should be in lowercase with underscores. This rule was not always followed in the old standard library modules. Python 3 reorganized much of the standard library, so most of its functions and methods now have a consistent case. Still, for some modules like threading, you can access the old function names that used mixedCase (for example, currentThread). They were left in place for easier backwards compatibility, but if you don't need to run your code on older versions of Python, you should avoid using these old names.
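
As a short illustration, the lowercase-with-underscores counterpart of currentThread in the threading module is current_thread. The thread variable name is arbitrary:

```python
import threading

# PEP 8 style spelling; currentThread is the older mixedCase name
# kept for backwards compatibility.
thread = threading.current_thread()

print(thread.name)
```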

This way of writing methods was common before the lowercase norm became the standard, and some frameworks, such as Zope and Twisted, also use mixedCase for methods. The community of developers working with them is still quite large. So the choice between mixedCase and lowercase with underscores is definitely driven by the library you are using.

As a Zope developer, it is not easy to stay consistent, because building an application that mixes pure Python modules and modules that import Zope code is difficult. In Zope, some classes mix both conventions because the code base is still evolving and Zope developers are trying to adopt the conventions accepted by the wider community.

A decent practice in this kind of library environment is to use mixedCase only for elements that are exposed in the framework, and to keep the rest of the code in PEP 8 style.
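
A minimal sketch of that practice, using an entirely hypothetical framework: FrameworkView and its handleRequest hook stand in for whatever mixedCase API a real framework would impose, and they are not taken from Zope or Twisted:

```python
class FrameworkView:
    """Stand-in for a base class from a hypothetical mixedCase framework."""

    def handleRequest(self, request):
        raise NotImplementedError


class GreetingView(FrameworkView):
    # The name is dictated by the (hypothetical) framework API: mixedCase.
    def handleRequest(self, request):
        return self._build_greeting(request.get('name', 'world'))

    # Internal helper that the framework never calls: PEP 8 style.
    def _build_greeting(self, name):
        return 'Hello, %s!' % name


print(GreetingView().handleRequest({'name': 'Tarek'}))
```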

It is also worth noting that developers of the Twisted project took a completely different approach to this problem. The Twisted project, like Zope, predates the PEP 8 document. It was started when there were no official guidelines for code style, so it had its own. Stylistic rules about indentation, docstrings, line lengths, and so on could be easily adopted. On the other hand, updating all the code to match the naming conventions from PEP 8 would completely break backwards compatibility, and doing that for a project as large as Twisted is infeasible. So Twisted adopted as much of PEP 8 as possible and left things like mixedCase for variables, functions, and methods as part of its own coding standard. This is completely compatible with PEP 8, because it specifically says that consistency within a project is more important than consistency with the PEP 8 style guide.

The private controversy

For private methods and functions, a leading underscore is conventionally added. This rule was quite controversial because of the name-mangling feature in Python. When a method has two leading underscores, it is renamed on the fly by the interpreter to prevent a name collision with a method from any subclass.

So some people tend to use a double leading underscore for their private attributes to avoid name collision in the subclasses:

class Base(object):
    def __secret(self):
        print("don't tell")

    def public(self):
        self.__secret()


class Derived(Base):
    def __secret(self):
        print("never ever")

You will see:

>>> Base.__secret
Traceback (most recent call last):
  File "<input>", line 1, in <module>
AttributeError: type object 'Base' has no attribute '__secret'
>>> dir(Base)
['_Base__secret', ..., 'public']
>>> Derived().public()
don't tell

The original motivation for name mangling in Python was not to provide a private gimmick, like in C++, but to make sure that some base classes implicitly avoid collisions in subclasses, especially in multiple inheritance contexts. But using it for every private attribute obfuscates the code, which is not Pythonic at all.
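
The collision-avoidance motivation can be sketched with two unrelated base classes that happen to pick the same attribute name. The Logger, Cache, and Service classes are invented for illustration:

```python
class Logger:
    def __init__(self):
        self.__state = 'logger'   # stored on the instance as _Logger__state

    def logger_state(self):
        return self.__state


class Cache:
    def __init__(self):
        self.__state = 'cache'    # stored on the instance as _Cache__state

    def cache_state(self):
        return self.__state


class Service(Logger, Cache):
    def __init__(self):
        Logger.__init__(self)
        Cache.__init__(self)


service = Service()
print(service.logger_state(), service.cache_state())
print(sorted(vars(service)))
```

Without the double underscore, both base classes would write to the same attribute and the second initializer would silently overwrite the first.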

For that reason, some people argued that explicit name mangling should always be used:

class Base:
    def _Base_secret(self):  # don't do this !!!
        print("you told it ?")

This duplicates the class name all over the code, so the double underscore (__) should be preferred.

But the best practice, as the BDFL (Guido, the Benevolent Dictator For Life, see http://en.wikipedia.org/wiki/BDFL) said, is to avoid using name mangling by looking at the __mro__ (method resolution order) value of a class before writing a method in a subclass. Changing the private methods of a base class has to be done carefully.
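
One way to carry out that check is sketched below. The classes_defining helper is hypothetical and simply walks __mro__ to report which classes already define a given name:

```python
class Base:
    def _secret(self):
        return "don't tell"


class Derived(Base):
    pass


def classes_defining(cls, name):
    """Return the names of the classes in cls.__mro__ that define name."""
    return [klass.__name__ for klass in cls.__mro__ if name in vars(klass)]


print([klass.__name__ for klass in Derived.__mro__])
print(classes_defining(Derived, '_secret'))
```

If the helper returns a non-empty list, a new method with that name in the subclass would override an existing one, so it has to be written deliberately.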

For more information on this topic, an interesting thread took place on the Python-Dev mailing list many years ago, where people argued about the utility of name mangling and its fate in the language. It can be found at http://mail.python.org/pipermail/python-dev/2005-December/058555.html.

Special methods

Special methods (https://docs.python.org/3/reference/datamodel.html#special-method-names) start and end with a double underscore, and no normal method should use this convention. Some developers used to call them dunder methods as a portmanteau of double-underscore. They are used for operator overloading, container definitions, and so on. For the sake of readability, they should be gathered at the beginning of class definitions:

class WeirdInt(int):
    def __add__(self, other):
        return int.__add__(self, other) + 1

    def __repr__(self):
        return '<weirdo %d>' % self

    # public API
    def do_this(self):
        print('this')

    def do_that(self):
        print('that')

For a normal method, you should never use these kinds of names. So don't invent a name for a method such as this:

class BadHabits:
    def __my_method__(self):
        print('ok')

Arguments

Arguments are in lowercase, with underscores if needed. They follow the same naming rules as variables.

Properties

The names of properties are in lowercase, or in lowercase with underscores. Most of the time, they represent an object's state, which can be a noun or an adjective, or a small phrase when needed:

class Connection:
    def __init__(self):
        # one list per instance; a class-level list would be shared
        self._connected = []

    def connect(self, user):
        self._connected.append(user)

    @property
    def connected_people(self):
        return ', '.join(self._connected)

When run in an interactive session:

>>> connection = Connection()
>>> connection.connect('Tarek')
>>> connection.connect('Shannon')
>>> print(connection.connected_people)
Tarek, Shannon

Classes

The names of classes are always in CamelCase, and may have a leading underscore when they are private to a module.

Class names and instance variables are often noun phrases, which read naturally alongside method names that are verb phrases:

class Database:
    def open(self):
        pass


class User:
    pass

Here is an example usage in an interactive session:

>>> user = User()
>>> db = Database()
>>> db.open()

Modules and packages

Apart from the special module __init__, module names are in lowercase with no underscores.

The following are some examples from the standard library:

  • os
  • sys
  • shutil

When the module is private to the package, a leading underscore is added. Compiled C or C++ modules are usually named with a leading underscore and imported in pure Python modules.
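
A common shape for this arrangement is a pure Python module that tries to import an underscore-prefixed compiled module and falls back to Python code when it is missing. In this sketch, _hypothetical_speedups and total are invented names, and the fallback is an assumption about how a package might choose to behave:

```python
try:
    # _hypothetical_speedups stands in for a compiled extension module.
    from _hypothetical_speedups import total
except ImportError:
    def total(values):
        """Pure Python fallback used when the extension is unavailable."""
        return sum(values)


print(total([1, 2, 3]))
```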

Package names follow the same rules, since they act like more structured modules.
