Chapter 21

Tantrum


21.1 Constraints

  • Every single procedure and function checks the sanity of its arguments and refuses to continue when the arguments are unreasonable.
  • All code blocks check for all possible errors, possibly log context-specific messages when errors occur, and pass the errors up the function call chain.

21.2 A Program in this Style

  1 #!/usr/bin/env python
  2
  3 import sys, re, operator, string, traceback
  4
  5 #
  6 # The functions
  7 #
  8 def extract_words(path_to_file):
  9     assert(type(path_to_file) is str), "I need a string!"
 10     assert(path_to_file), "I need a non-empty string!"
 11
 12     try:
 13         with open(path_to_file) as f:
 14             str_data = f.read()
 15     except IOError as e:
 16         print("I/O error({0}) when opening {1}: {2}! I quit!".format(
                e.errno, path_to_file, e.strerror))
 17         raise e
 18
 19     pattern = re.compile(r'[\W_]+')
 20     word_list = pattern.sub(' ', str_data).lower().split()
 21     return word_list
 22
 23 def remove_stop_words(word_list):
 24     assert(type(word_list) is list), "I need a list!"
 25
 26     try:
 27         with open('../stop_words.txt') as f:
 28             stop_words = f.read().split(',')
 29     except IOError as e:
 30         print("I/O error({0}) when opening ../stop_words.txt: {1}! I quit!"
                .format(e.errno, e.strerror))
 31         raise e
 32
 33     stop_words.extend(list(string.ascii_lowercase))
 34     return [w for w in word_list if w not in stop_words]
 35
 36 def frequencies(word_list):
 37     assert(type(word_list) is list), "I need a list!"
 38     assert(word_list != []), "I need a non-empty list!"
 39
 40     word_freqs = {}
 41     for w in word_list:
 42         if w in word_freqs:
 43             word_freqs[w] += 1
 44         else:
 45             word_freqs[w] = 1
 46     return word_freqs
 47
 48 def sort(word_freq):
 49     assert(type(word_freq) is dict), "I need a dictionary!"
 50     assert(word_freq != {}), "I need a non-empty dictionary!"
 51
 52     try:
 53         return sorted(word_freq.items(), key=operator.itemgetter(1),
                      reverse=True)
 54     except Exception as e:
 55         print("Sorted threw {0}".format(e))
 56         raise e
 57
 58 #
 59 # The main function
 60 #
 61 try:
 62     assert(len(sys.argv) > 1), "You idiot! I need an input file!"
 63     word_freqs = sort(frequencies(remove_stop_words(extract_words(
            sys.argv[1]))))
 64
 65     assert(type(word_freqs) is list), "OMG! This is not a list!"
 66     assert(len(word_freqs) > 25), "SRSLY? Less than 25 words!"
 67     for (w, c) in word_freqs[0:25]:
 68         print(w, ' - ', c)
 69 except Exception as e:
 70     print("Something wrong: {0}".format(e))
 71     traceback.print_exc()

21.3 Commentary

This style is as defensive as the previous one: the same possible errors are checked. But the way it reacts when abnormalities are detected is quite different: the functions simply refuse to continue.

Let's look at the example program, again starting at the bottom. In line #62, we are not just checking that a file name was given on the command line; we are asserting that it must be there, or else an exception is raised: the assert statement raises an AssertionError exception when the stated condition is not met.

A similar approach can be seen in other parts of the program. In the function extract_words, lines #9 and #10, we are asserting that the argument meets certain conditions, or else the function throws an exception. In lines #12–17, if the opening or reading of the file throws an exception, we are catching it right there, printing a message about it, and passing the exception up the stack for further catching. Similar code – i.e. assertions, and local exception handling – can be seen in all the other functions.
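The pattern described above (assert on the arguments, then catch, report, and re-raise) can be sketched in isolation as follows. This is a minimal illustration written for Python 3; the names read_text and main are hypothetical and are not part of the example program.

```python
import sys

def read_text(path):
    # Refuse to continue when the argument is unreasonable.
    assert isinstance(path, str), "I need a string!"
    assert path, "I need a non-empty string!"

    try:
        with open(path) as f:
            return f.read()
    except OSError as e:
        # Report the problem locally, then pass it up the call chain.
        print("I/O error({0}) when opening {1}: {2}! I quit!".format(
            e.errno, path, e.strerror), file=sys.stderr)
        raise

def main(path):
    # The top level is the only place where the exception finally stops.
    try:
        return len(read_text(path))
    except Exception as e:
        print("Something wrong: {0}".format(e), file=sys.stderr)
        return None
```

Calling main(42) trips the first assertion, while calling main with the name of a missing file prints both the local message and the top-level one. Note that Python removes assert statements when run with the -O option, so checks that must survive optimized runs would need an explicit if test followed by raise.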

Stopping the program's execution flow when abnormalities happen is one way to ensure that those abnormalities don't cause damage. In many cases, it may be the only option, as fallback strategies may not always be good or desirable.

This style has one thing in common with the Constructivist style of the previous chapter: it is checking for errors, and handling them, in the local context in which the errors may occur. The difference here is that the fallback strategies of the Constructivist style are interesting parts of the program in themselves, whereas the cleanup and exit code of the Tantrum style is not.

This kind of local error checking is particularly visible in programs written in languages that don't have exceptions. C is one of those languages. When guarding against problems, C programs check locally whether errors have occurred, and, if so, either use reasonable fallback values (Constructivist) or escape the function in the style explained here. In languages without exception handling, like C, the abnormal return from functions is usually flagged using error codes in the form of negative integers, null pointers, or global variables (e.g. errno), which are then checked at the call sites.

Dealing with abnormalities in this way can result in verbose boilerplate code that distracts the reader from the actual goals of the functions. It is quite common to encounter portions of the programs written in this style with one line of functional code followed by a long sequence of conditional blocks that check for the occurrence of various errors, each one returning an error at the end of the block.
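The error-code convention, and the boilerplate it produces, can be imitated in Python to show its shape. The sketch below is only an imitation of the C habit, not idiomatic Python; the error constants and the functions count_words and report are hypothetical names chosen for illustration.

```python
# Hypothetical error codes, in the spirit of C's negative return values.
OK = 0
ERR_BAD_ARGUMENT = -1
ERR_CANNOT_READ = -2
ERR_EMPTY = -3

def count_words(path, result):
    """Store the word count in result[0]; return OK or a negative error code."""
    if not isinstance(path, str) or not path:
        return ERR_BAD_ARGUMENT
    try:
        with open(path) as f:
            text = f.read()
    except OSError:
        return ERR_CANNOT_READ
    words = text.split()          # the one line of functional code
    if not words:
        return ERR_EMPTY
    result[0] = len(words)
    return OK

def report(path):
    """Call site: the call is followed by a check of every possible code."""
    result = [0]
    status = count_words(path, result)
    if status == ERR_BAD_ARGUMENT:
        return "bad argument"
    if status == ERR_CANNOT_READ:
        return "cannot read " + path
    if status == ERR_EMPTY:
        return "no words in " + path
    return "{0} words".format(result[0])
```

Only one line of count_words does the actual work; everything around it, and most of report, exists to produce or inspect error codes.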

In order to avoid some of the verbosity of this style, advanced C programmers sometimes resort to using C's GOTO statement. One of the main advantages of GOTOs is that they allow a function to escape from deep inside nested blocks to a single error-handling point, avoiding distracting boilerplate when dealing with errors while supporting a single exit point out of the function. GOTOs allow us to express our displeasure with errors in a more contained, succinct form. But GOTOs have long been discouraged, or outright banned, in mainstream programming languages, for all sorts of good reasons.
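Python has no GOTO, so the C idiom cannot be shown directly. The closest structured counterpart is a try/finally block, in which every early escape passes through one cleanup point. The sketch below uses that substitute technique, not the C idiom itself; the function copy_first_line and its log argument are hypothetical.

```python
def copy_first_line(src_path, dst_path, log):
    """Copy the first line of src_path to dst_path; log records the cleanup."""
    src = dst = None
    try:
        src = open(src_path)
        line = src.readline()
        if not line:
            raise ValueError("I need a non-empty file!")
        dst = open(dst_path, "w")
        dst.write(line)
        return len(line)
    finally:
        # Single cleanup point, reached on success and on every error,
        # much like the label a C function would jump to on failure.
        for f in (dst, src):
            if f is not None:
                f.close()
        log.append("cleaned up")
```

Whether the function returns normally, raises its own ValueError, or fails inside open, the files are closed in exactly one place.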

21.4 This Style in Systems Design

Computers are dumb machines that need to be told exactly and unambiguously what to do. Computer software inherited that trait. Many software systems don't make much effort to guess the intentions behind wrong inputs (from users or other components); it is much easier, and less risky, to simply refuse to continue. Therefore this style is pervasive in software. Worse, the errors are often flagged with incomprehensible messages that don't inform the offending party in any actionable way.

When being pessimistic about adversity, it is important to at least let the other party know what was expected and why the function/component is refusing to continue.
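One way to follow this advice is to make every refusal state what was expected, what was actually received, and why it matters. The sketch below assumes a hypothetical helper, require_positive_int, and a hypothetical caller, top_words; neither appears in the example program.

```python
def require_positive_int(name, value, reason):
    """Refuse to continue, but say what was expected, what arrived, and why."""
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError(
            "{0}: expected a positive integer, got {1!r} ({2}); {3}".format(
                name, value, type(value).__name__, reason))
    return value

def top_words(word_freqs, limit):
    # The caller learns which argument was wrong and what it is used for.
    require_positive_int("limit", limit,
                         "it is the number of words to print")
    return word_freqs[:limit]
```

Compared with a bare "I need a number!", a message of this kind names the offending argument and its purpose, which gives the other party something to act on.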

21.5 Further Reading

IBM (1957). The FORTRAN Automatic coding system for the IBM 704 EDPM. Available at: http://www.softwarepreservation.org/projects/FORTRAN/manual/Prelim_Oper_Man-1957_04_07.pdf
Synopsis: The original FORTRAN manual, showing a long list of possible error codes and what to do with them. The list mixes machine (hardware) errors with human (software) errors. Some of the human errors are syntactic while others are a bit more interesting. For example, error 430 is described as "Program too complex. Simplify or do in 2 parts (too many basic blocks)."

21.6 Glossary

Error code: An enumerated message that denotes a fault in a specific component.

21.7 Exercises

21.1 Another language. Implement the example program in another language, but preserve the style.

21.2 A different task. Write one of the tasks proposed in the Prologue using this style.
