Chapter 23. The Interpreter Pattern

Some programs benefit from having a language to describe operations they can perform. The Interpreter pattern generally describes defining a grammar for that language and using that grammar to interpret statements in that language.

Motivation

When a program must handle a number of different but somewhat similar cases, it can be advantageous to describe those cases in a simple language and have the program interpret that language. Such a language can be as simple as the macro recording facility that many office suites provide or as complex as Visual Basic for Applications (VBA). VBA is not only included in Microsoft Office products; it can also be embedded in third-party products.

One of the problems we must deal with is how to recognize when a language can be helpful. The macro recorder simply records menu and keystroke operations for later playback and just barely qualifies as a language; it may not actually have a written form or grammar. Languages such as VBA, on the other hand, are quite complex, and building one is far beyond the capabilities of the individual application developer. Further, embedding a commercial language usually requires substantial licensing fees, which makes it less attractive to all but the largest developers.

Applicability

As the Smalltalk Companion notes, recognizing cases where an Interpreter can be helpful is much of the problem, and programmers without formal language/compiler training frequently overlook this approach. There are not many such cases, but there are three general places where languages are applicable.

  1. When you need a command interpreter to parse user commands. The user can type queries of various kinds and obtain a variety of answers.
  2. When the program must parse an algebraic string. This case is fairly obvious. The program is asked to carry out its operations based on a computation where the user enters an equation of some sort. This frequently occurs in mathematical-graphics programs where the program renders a curve or surface based on any equation it can evaluate. Programs like Mathematica and graph drawing packages such as Origin work in this way.
  3. When the program must produce varying kinds of output. This case is a little less obvious but far more useful. Consider a program that can display columns of data in any order and sort them in various ways. These programs are frequently referred to as Report Generators, and while the underlying data may be stored in a relational database, the user interface to the report program is usually much simpler than the SQL language that the database uses. In fact, in some cases, the simple report language may be interpreted by the report program and translated into SQL.

A Simple Report Example

Let’s consider a simplified report generator that can operate on five columns of data in a table and return various reports on these data. Suppose we have the following results from a swimming competition.

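[Table: swimming competition results with the columns frname, lname, age, club, and time; not reproduced here]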

The five columns are frname, lname, age, club and time. If we consider the complete race results of 51 swimmers, we realize that it might be convenient to sort these results by club, by last name, or by age. Since there are a number of useful reports we could produce from these data in which the order of the columns changes as well as the sorting, a language is one useful way to handle these reports.

We’ll define a very simple nonrecursive grammar of this sort.


Print lname frname club time Sortby club Thenby time

For the purposes of this example, we define these three verbs.


Print
Sortby
Thenby

And we’ll define the five column names we listed earlier.


Frname
Lname
Age
Club
Time

For convenience, we’ll assume that the language is case insensitive. We’ll also note that the simple grammar of this language is punctuation free and amounts in brief to the following.

Print var[var] [sortby var [thenby var]]

Finally, there is only one main verb, and while each statement is a declaration, there is no assignment statement or computational ability in this grammar.
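The grammar above is small enough to check with a hand-written recognizer. The following is a minimal sketch, not the chapter's parser: the class name ReportGrammar and the method IsValid are hypothetical, and the sketch assumes that words are separated by spaces or tabs and that at least one column name must follow Print.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the chapter's program
public static class ReportGrammar {
    // The five column names defined in the text, in lower case
    private static readonly HashSet<string> Columns =
        new HashSet<string> { "frname", "lname", "age", "club", "time" };

    // Returns true if the statement has the form
    //   Print var[var] [sortby var [thenby var]]
    public static bool IsValid(string statement) {
        string[] words = statement.ToLowerInvariant().Split(
            new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
        int i = 0;
        if (words.Length == 0 || words[i++] != "print") return false;

        // one or more column names must follow Print
        int firstVar = i;
        while (i < words.Length && Columns.Contains(words[i])) i++;
        if (i == firstVar) return false;

        // optional: sortby var
        if (i < words.Length && words[i] == "sortby") {
            i++;
            if (i >= words.Length || !Columns.Contains(words[i])) return false;
            i++;
            // optional: thenby var (only allowed after sortby)
            if (i < words.Length && words[i] == "thenby") {
                i++;
                if (i >= words.Length || !Columns.Contains(words[i])) return false;
                i++;
            }
        }
        // nothing may be left over
        return i == words.Length;
    }
}
```

A recognizer like this only answers yes or no; the stack-based approach described next is what turns a statement into actions.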

Interpreting the Language

Interpreting the language is a three-step process.

  1. Parsing the language symbols into tokens.
  2. Reducing the tokens into actions.
  3. Executing the actions.

We parse the language into tokens by scanning each statement with a StringTokenizer and substituting a number for each word. Parsers usually push each parsed token onto a stack, and we use that technique here. We implement the Stack class using an ArrayList, with push, pop, top, and nextTop methods to examine and manipulate the stack contents. After parsing, our stack could look like this.

[Figure: the token stack before the Thenby word is dropped; not reproduced here]

However, we quickly realize that the “verb” Thenby has no real meaning other than clarification, and it is more likely that we’d parse the tokens and skip the Thenby word altogether. Our initial stack, then, looks like this.


Time
Club
Sortby
Time
Club
Frname
Lname
Print
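A stack with the four operations named above can be sketched in a few lines. This is an illustration under stated assumptions rather than the chapter's Stack class: WordStack and fromStatement are hypothetical names, the stack holds plain words instead of ParseObjects, a generic List replaces the ArrayList, and returning null for a missing item is an assumption.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the chapter's Stack class: it holds words
// rather than ParseObjects so that the example stays self-contained.
public class WordStack {
    private readonly List<string> items = new List<string>();

    public void push(string s) { items.Add(s); }

    // Removes and returns the top item, or null if the stack is empty
    public string pop() {
        if (items.Count == 0) return null;
        string s = items[items.Count - 1];
        items.RemoveAt(items.Count - 1);
        return s;
    }

    // Top item without removing it, or null if there is none
    public string top() {
        return items.Count >= 1 ? items[items.Count - 1] : null;
    }

    // Item just below the top, or null if there is none
    public string nextTop() {
        return items.Count >= 2 ? items[items.Count - 2] : null;
    }

    public bool hasMoreElements() { return items.Count > 0; }

    // Splits a statement on whitespace and pushes every word except Thenby
    public static WordStack fromStatement(string line) {
        WordStack stk = new WordStack();
        string[] words = line.Split(
            new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string w in words) {
            if (w.ToLower() != "thenby") stk.push(w);
        }
        return stk;
    }
}
```

Because the last word pushed ends up on top, the statement reads from the bottom of the stack upward, which is why Print sits at the bottom of the listing.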

Objects Used in Parsing

In this parsing procedure, we do not push just a numeric token onto the stack but a ParseObject that has both a type and a value property.


public class ParseObject  {
      public const int VERB=1000;
      public const int VAR=1010;
      public const int MULTVAR=1020;
      protected int value, type;
      //-----
      public ParseObject(int val, int typ)   {
             value = val;
             type = typ;
      }
      //-----
      public int getValue() {
             return value;
      }
      //-----
      public int getType() {
             return type;
      }
}

These objects can take on the type VERB or VAR. Then we extend this object into ParseVerb and ParseVar objects, whose value fields can take on PRINT or SORT for ParseVerb and FRNAME, LNAME, and so on for ParseVar. For later use in reducing the parse list, we then derive Print and Sort objects from ParseVerb.

This gives us the simple hierarchy shown in Figure 23-1.

Figure 23-1. A simple parsing hierarchy for the Interpreter pattern

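The ParseVar class is described but not listed. A minimal sketch of what it might look like follows; the constructor signature and the isLegal method match the way the Parser calls them, but the numeric codes and the lookup array are assumptions made for illustration. ParseObject is repeated in condensed form so that the sketch compiles on its own.

```csharp
using System;

// Condensed from the ParseObject listing above
public class ParseObject {
    public const int VERB = 1000;
    public const int VAR = 1010;
    public const int MULTVAR = 1020;
    protected int value, type;
    public ParseObject(int val, int typ) { value = val; type = typ; }
    public int getValue() { return value; }
    public int getType() { return type; }
}

public class ParseVar : ParseObject {
    // Hypothetical codes: the text names the columns but not their values
    public const int FRNAME = 0;
    public const int LNAME = 1;
    public const int AGE = 2;
    public const int CLUB = 3;
    public const int TIME = 4;

    // Position in this array equals the code above (an assumption)
    private static readonly string[] names =
        { "frname", "lname", "age", "club", "time" };

    public ParseVar(string s) : base(-1, VAR) {
        // Array.IndexOf returns -1 when the word is not a column name
        value = Array.IndexOf(names, s.ToLower());
    }

    // True only if the word was recognized as a column name
    public bool isLegal() {
        return value >= 0;
    }
}
```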

The parsing process is just the following simple code, using the StringTokenizer and the parse objects. Part of the main Parser class is shown here.


public class Parser       {
      private Stack stk;
      private ArrayList actionList;
      private Data dat;
      private ListBox ptable;
      private Chain chn;
      //-----
      public Parser(string line, KidData kd, ListBox pt){
             stk = new Stack ();
             //list of verbs accumulates here
             actionList = new ArrayList ();
             setData(kd, pt);
             buildStack(line);   //create token stack
             buildChain();       //create chain of responsibility
      }
      //-----
      private void buildChain() {
             chn = new VarVarParse(); //start of chain
             VarMultvarParse vmvp = new VarMultvarParse();
             chn.addToChain(vmvp);
             MultVarVarParse mvvp = new MultVarVarParse();
             vmvp.addToChain(mvvp);
             VerbMultvarParse vrvp = new VerbMultvarParse();
             mvvp.addToChain(vrvp);
             VerbVarParse vvp = new VerbVarParse();
             vrvp.addToChain(vvp);
             VerbAction va = new VerbAction(actionList);
             vvp.addToChain(va);
             Nomatch nom = new Nomatch ();     //error handler
             va.addToChain (nom);
      }
      //-----
      public void setData(KidData kd, ListBox pt) {
             dat = new Data(kd.getData ());
             ptable = pt;
      }
      //-----
      private void buildStack(string s) {
             StringTokenizer tok = new StringTokenizer (s);
             while(tok.hasMoreElements () ) {
                    ParseObject token = tokenize(tok.nextToken ());
                    //unrecognized words come back as null and are skipped
                    if(token != null)
                           stk.push (token);
             }
      }
      //-----
      protected ParseObject tokenize(string s) {
             //try the word as a verb first, then as a variable
             ParseObject obj = getVerb(s);
             if(obj == null) {
                    obj = getVar(s);
             }
             return obj;
      }
      //-----
      protected ParseVerb getVerb(string s) {
             ParseVerb v = new ParseVerb (s, dat, ptable);
             if(v.isLegal () )
                    return v.getVerb (s);
             else
                    return null;
      }
      //-----
      protected ParseVar getVar(string s) {
             ParseVar v = new ParseVar (s);
             if( v.isLegal())
                    return v;
             else
                    return null;
      }
}

The ParseVerb and ParseVar classes return objects with isLegal set to true if they recognize the word.


public class ParseVerb:ParseObject      {
      protected const int PRINT = 100;
      protected const int SORT = 110;
      protected const int THENBY = 120;
      protected ArrayList args;
      protected Data kid;
      protected ListBox pt;
      protected ParseVerb pv;
      //-----
      public ParseVerb(string s, Data kd, ListBox ls):
                    base(-1, VERB) {
             args = new ArrayList ();
             kid = kd;
             pt = ls;
             if(s.ToLower().Equals ("print")) {
                    value = PRINT;
             }
             if(s.ToLower().Equals ("sortby")) {
                    value = SORT;
             }
      }
      //------
      public ParseVerb getVerb(string s) {
             pv = null;
             if(s.ToLower ().Equals ("print"))
                    pv = new Print(s, kid, pt);
             if(s.ToLower ().Equals ("sortby"))
                    pv = new Sort (s, kid, pt);
             return pv;
      }
      //-----
      public void addArgs(MultVar mv) {
             args = mv.getVector ();
      }
      //isLegal and the remaining members are not shown
}

Reducing the Parsed Stack

The tokens on the stack have this form.


Var
Var
Verb
Var
Var
Var
Var
Verb

We reduce the stack a token at a time, folding successive Vars into a MultVar class until the arguments are folded into the verb objects, as shown in Figure 23-2.

Figure 23-2. How the stack is reduced during parsing


When the stack reduces to a verb, this verb and its arguments are placed in an action list; when the stack is empty, the actions are executed.
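The reduction can be traced with a small stand-alone simulation. This is a sketch of the idea only, not the chapter's implementation: ReductionDemo and its members are hypothetical, tokens are plain strings rather than ParseObjects, every word other than a verb is treated as a variable, and a single loop stands in for the individual parsing classes introduced later.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-alone simulation of the stack reduction
public static class ReductionDemo {
    private enum Kind { Verb, Var, MultVar }

    private class Token {
        public Kind Tag;
        public string Verb;                               // set for verbs only
        public List<string> Names = new List<string>();   // variables or arguments
    }

    // Returns the action list as strings such as "sortby(club, time)"
    public static List<string> Reduce(string line) {
        var stack = new List<Token>();   // end of list = top of stack
        string[] words = line.ToLowerInvariant().Split(
            new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string w in words) {
            if (w == "thenby") continue;                  // carries no meaning
            if (w == "print" || w == "sortby") {
                stack.Add(new Token { Tag = Kind.Verb, Verb = w });
            } else {
                var t = new Token { Tag = Kind.Var };
                t.Names.Add(w);
                stack.Add(t);
            }
        }

        var actions = new List<string>();
        while (stack.Count > 0) {
            Token top = stack[stack.Count - 1];
            Token next = stack.Count > 1 ? stack[stack.Count - 2] : null;
            if (top.Tag == Kind.Verb) {
                // a verb on top is complete: move it to the action list
                actions.Add(top.Verb + "(" + string.Join(", ", top.Names) + ")");
                stack.RemoveAt(stack.Count - 1);
            } else if (next == null) {
                throw new FormatException("variables without a verb");
            } else {
                // fold the top item into the one below it, keeping word order
                stack.RemoveAt(stack.Count - 1);
                next.Names.AddRange(top.Names);
                if (next.Tag == Kind.Var) next.Tag = Kind.MultVar;
            }
        }
        return actions;
    }
}
```

For the statement used in this chapter, the sort action comes off the stack before the print action, so executing the list in order sorts the data first and prints it afterward.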

The entire process is carried out by creating a Parser object, which is itself a Command object, and executing it when the Go button is pressed on the user interface.


private void btCompute_Click(object sender, EventArgs e) {
      parse();
}
private void parse() {
      Parser par = new Parser (txCommand.Text, kdata, lsResults);
      par.Execute ();
}

The parser itself just reduces the tokens, as Figure 23-2 shows. It checks for various pairs of tokens on the stack and reduces each pair to a single token, handling five different cases.

Implementing the Interpreter Pattern

It would certainly be possible to write a parser for this simple grammar as just a series of if statements: for each of the six possible stack configurations, we would reduce the stack until only a verb remains. Then, since we have made the Print and Sort verb classes Command objects, we can just Execute them one by one as the action list is enumerated.

However, the real advantage of the Interpreter pattern is its flexibility. By making each parsing case an individual object, we can represent the parse tree as a series of connected objects that reduce the stack successively. Using this arrangement, we can easily change the parsing rules without much in the way of program changes: We just create new objects and insert them into the parse tree.

According to the Gang of Four, these are the names for the participating objects in the Interpreter pattern.

AbstractExpression—declares the abstract Interpret operation.

TerminalExpression—interprets expressions containing any of the terminal tokens in the grammar.

NonTerminalExpression—interprets all of the nonterminal expressions in the grammar.

Context—contains the global information that is part of the parser—in this case, the token stack.

Client—builds the syntax tree from the preceding expression types and invokes the Interpret operation.

The Syntax Tree

The syntax tree we construct to carry out the parsing of the stack we just showed can be quite simple. We just need to look for each of the stack configurations we defined and reduce them to an executable form. In fact, the best way to implement this tree is using a Chain of Responsibility, which passes the stack configuration along between classes until one of them recognizes that configuration and acts on it. You can decide whether a successful stack reduction should end that pass or not. It is perfectly possible to have several successive chain members work on the stack in a single pass. The processing ends when the stack is empty. We see a diagram of the individual parse chain elements in Figure 23-3.

Figure 23-3. How the classes that perform the parsing interact


In this class structure, we start with the AbstractExpression interpreter class InterpChain.


public abstract class InterpChain:Chain {
      private Chain nextChain;
      protected Stack stk;
      private bool hasChain;
      //-----
      public InterpChain()       {
             stk = new Stack ();
             hasChain = false;
      }
      //-----
      public void addToChain(Chain c) {
             nextChain = c;
             hasChain = true;
      }
      //-----
      public abstract bool interpret();
      //-----
      public void sendToChain(Stack stack) {
             stk = stack;
             //interpret the stack; pass it along only if a next link exists
             if(! interpret() && hasChain) {
                    nextChain.sendToChain (stk);
             }
      }
      //-----
      public bool topStack(int c1, int c2) {
             ParseObject p1, p2;
             p1 = stk.top ();
             p2 = stk.nextTop ();
             try {
                    return (p1.getType() == c1 && p2.getType() == c2);
             }
             catch(NullReferenceException) {
                    return false;
             }
      }
      //-----
      public void addArgsToVerb() {
             ParseObject p = (ParseObject) stk.pop();
             ParseVerb v =  (ParseVerb) stk.pop();
             v.addArgs (p);
             stk.push (v);
      }
      //-----
      public Chain getChain() {
             return nextChain;
      }
}

This class also contains the methods for manipulating objects on the stack. Each of the subclasses implements the interpret operation differently and reduces the stack accordingly. For example, the complete VarVarParse class reduces two variables on the stack in succession to a single MultVar object.


public class VarVarParse : InterpChain  {
      public override bool interpret() {
             if(topStack(ParseVar.VAR , ParseVar.VAR )) {
                    //reduces VAR VAR to MULTVAR
                    ParseVar v1 = (ParseVar) stk.pop();
                    ParseVar v2 = (ParseVar) stk.pop();
                    MultVar mv = new MultVar (v2, v1);
                    stk.push (mv);
                    return true;
             }
             else
                    return false;
      }
}

Thus, in this implementation of the pattern, the stack constitutes the Context participant. Each of the first five subclasses of InterpChain is a NonTerminalExpression participant, and the VerbAction class, which moves the completed verb and action objects to the actionList, constitutes the TerminalExpression participant.

The Client object is the Parser class, which builds the stack object list from the typed-in command text and constructs the Chain of Responsibility from the various interpreter classes. We have already shown most of the Parser class. It also implements the Command pattern: when its Execute method is called, it sends the stack through the chain until the stack is empty and then executes the verbs that have accumulated in the action list.


//executes parse and interpretation of command line
public void Execute() {
      while(stk.hasMoreElements () ) {
             chn.sendToChain (stk);
      }
      //now execute the verbs
      for(int i=0; i< actionList.Count ; i++ ) {
             Verb v = (Verb)actionList[i];
             v.setData (dat, ptable);
             v.Execute ();
      }
}
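The while loop in Execute ends only if every pass through the chain removes or reduces something on the stack. The chapter does not show what the Nomatch handler does, so the following sketch simply assumes that the last link discards the top item when no earlier link recognizes it. All names here (Link, VarPair, DiscardTop, ChainDemo) are hypothetical, and strings stand in for ParseObjects.

```csharp
using System.Collections.Generic;

// Hypothetical, simplified chain link; the stack is a list whose end is the top
public abstract class Link {
    private Link next;   // null at the end of the chain

    public void addToChain(Link c) { next = c; }

    // Returns true if this link recognized and handled the stack top
    protected abstract bool interpret(List<string> stack);

    public void sendToChain(List<string> stack) {
        if (!interpret(stack) && next != null) {
            next.sendToChain(stack);   // pass along
        }
    }
}

// Reduces VAR VAR on top of the stack to a single MULTVAR
public class VarPair : Link {
    protected override bool interpret(List<string> stack) {
        int n = stack.Count;
        if (n >= 2 && stack[n - 1] == "VAR" && stack[n - 2] == "VAR") {
            stack.RemoveRange(n - 2, 2);
            stack.Add("MULTVAR");
            return true;
        }
        return false;
    }
}

// Assumed behavior for a final link: discard the top item that no
// earlier link recognized, so that each pass makes progress
public class DiscardTop : Link {
    public List<string> discarded = new List<string>();

    protected override bool interpret(List<string> stack) {
        if (stack.Count > 0) {
            discarded.Add(stack[stack.Count - 1]);
            stack.RemoveAt(stack.Count - 1);
        }
        return true;
    }
}

public static class ChainDemo {
    // Sends the stack through the chain until it is empty and
    // returns the items that only the final link could handle
    public static List<string> Run(List<string> stack) {
        Link chain = new VarPair();
        DiscardTop last = new DiscardTop();
        chain.addToChain(last);
        while (stack.Count > 0) {
            chain.sendToChain(stack);
        }
        return last.discarded;
    }
}
```

Without a catch-all link of some kind, a stack that no handler recognizes would keep the loop running forever, which is one reason to end the chain with an error handler.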

The final visual program is shown in Figure 23-4.

Figure 23-4. The Interpreter pattern operating on the simple command in the text field


Consequences of the Interpreter Pattern

Whenever you introduce an interpreter into a program, you need to provide a simple way for the program user to enter commands in that language. It can be as simple as the Macro record button we noted earlier, or it can be an editable text field like the one in the preceding program.

However, introducing a language and its accompanying grammar also requires fairly extensive error checking for misspelled terms or misplaced grammatical elements. This can easily consume a great deal of programming effort unless some template code is available for implementing this checking. Further, effective methods for notifying the users of these errors are not easy to design and implement.

In the preceding Interpreter example, the only error handling is that unrecognized keywords are neither converted to ParseObjects nor pushed onto the stack. As a result, either nothing happens, because the resulting stack sequence cannot be parsed successfully, or the command runs without the item that the misspelled keyword represented.
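One inexpensive improvement is to collect the words the tokenizer does not recognize so that they can be reported to the user instead of being dropped silently. The sketch below is an illustration, not part of the chapter's program: CommandChecker and UnknownWords are hypothetical names, and the word list is taken from the verbs and column names defined earlier.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for reporting misspelled keywords
public static class CommandChecker {
    // The verbs and column names of the report language, in lower case
    private static readonly HashSet<string> Known = new HashSet<string> {
        "print", "sortby", "thenby",
        "frname", "lname", "age", "club", "time"
    };

    // Returns the words that are neither verbs nor column names,
    // in the order in which the user typed them
    public static List<string> UnknownWords(string line) {
        var unknown = new List<string>();
        string[] words = line.Split(
            new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string w in words) {
            if (!Known.Contains(w.ToLowerInvariant())) {
                unknown.Add(w);
            }
        }
        return unknown;
    }
}
```

A caller could run this check before building the stack and show the returned words in a message, leaving the parser itself unchanged.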

You can also consider generating a language automatically from a user interface of radio and command buttons and list boxes. While it may seem that having such an interface obviates the necessity for a language at all, the same requirements of sequence and computation still apply. When you have to have a way to specify the order of sequential operations, a language is a good way to do so, even if the language is generated from the user interface.

The Interpreter pattern has the advantage that you can extend or revise the grammar fairly easily once you have built the general parsing and reduction tools. You can also add new verbs or variables easily once the foundation is constructed. However, as the syntax of the grammar becomes more complex, you run the risk of creating a hard-to-maintain program.

While interpreters are not all that common in solving general programming problems, the Iterator pattern we take up next is one of the most common ones you’ll be using.

Thought Question

Design a system to compute the results of simple quadratic expressions such as

[Equation: an example quadratic expression; not reproduced here]

where the user can enter x or a range of x’s and can type in the equation.

Program on the CD-ROM

[Table: location of the example program on the CD-ROM; not reproduced here]
