Some programs benefit from having a language to describe operations they can perform. The Interpreter pattern generally describes defining a grammar for that language and using that grammar to interpret statements in that language.
When a program presents a number of different but somewhat similar cases it can deal with, it can be advantageous to use a simple language to describe these cases and then have the program interpret that language. Such cases can be as simple as the sort of Macro language recording facilities a number of office suite programs provide or as complex as Visual Basic for Applications (VBA). VBA is not only included in Microsoft Office products, but it can be embedded in any number of third-party products quite simply.
One of the problems we must deal with is how to recognize when a language can be helpful. The Macro language recorder simply records menu and keystroke operations for later playback and just barely qualifies as a language; it may not actually have a written form or grammar. Languages such as VBA, on the other hand, are quite complex, but they are far beyond the capabilities of the individual application developer. Further, embedding commercial languages usually require substantial licensing fees, which makes them less attractive to all but the largest developers.
As the SmallTalk Companion notes, recognizing cases where an Interpreter can be helpful is much of the problem, and programmers without formal language/compiler training frequently overlook this approach. There are not large numbers of such cases, but there are three general places where languages are applicable.
When you need a command interpreter to parse user commands. The user can type queries of various kinds and obtain a variety of answers.
When the program must parse an algebraic string. This case is fairly obvious. The program is asked to carry out its operations based on a computation where the user enters an equation of some sort. This frequently occurs in mathematical-graphics programs where the program renders a curve or surface based on any equation it can evaluate. Programs like Mathematica and graph drawing packages such as Origin work in this way.
When the program must produce varying kinds of output. This case is a little less obvious but far more useful. Consider a program that can display columns of data in any order and sort them in various ways. These programs are frequently referred to as Report Generators, and while the underlying data may be stored in a relational database, the user interface to the report program is usually much simpler than the SQL language that the database uses. In fact, in some cases, the simple report language may be interpreted by the report program and translated into SQL.
Let's consider a simplified report generator that can operate on five columns of data in a table and return various reports on these data. Suppose we have the following results from a swimming competition.
The five columns are frname, lname, age, club and time. If we consider the complete race results of 51 swimmers, we realize that it might be convenient to sort these results by club, by last name, or by age. Since there are a number of useful reports we could produce from these data in which the order of the columns changes as well as the sorting, a language is one useful way to handle these reports.
We'll define a very simple nonrecursive grammar of this sort.
Print lname frname club time Sortby club Thenby time
Amanda McCarthy
| 12
| WCA
| 29 .28
|
Jamie Falco
| 12
| HNHS
| 29 .80
|
Meaghan O'Donnell
| 12
| EDST
| 30 .00
|
Greer Gibbs
| 12
| CDEV
| 30 .04
|
Rhiannon Jeffrey
| 11
| WYW
| 30 .04
|
Sophie Connolly
| 12
| WAC
| 30 .05
|
Dana Helyer
| 12
| ARAC
| 30 .18
|
For the purposes of this example, we define these three verbs.
Print Sortby Thenby
And we'll define the five column names we listed earlier.
Frname Lname Age Club Time
For convenience, we'll assume that the language is case insensitive. We'll also note that the simple grammar of this language is punctuation free and amounts in brief to the following.
Print var[var] [sortby var [thenby var]]
Finally, there is only one main verb, and while each statement is a declaration, there is no assignment statement or computational ability in this grammar.
Interpreting the language takes place in three steps.
We parse the language into tokens by simply scanning each statement with a StringTokenizer and then substituting a number for each word. Usually parsers push each parsed token onto a stack we will use that technique here. We implement the Stack class using an Arraylist—where we have push, pop, top, and nextTop methods to examine and manipulate the stack contents.
After parsing, our stack could look like this.
type | token | <-top of stack
|
---|---|---|
Var
| Time
| |
Verb
| Thenby
| |
Var
| Club
| |
Verb
| Sortby
| |
Var
| Time
| |
Var
| Club
| |
Var
| Frname
| |
verb
| Lname
|
However, we quickly realize that the “verb” Thenby has no real meaning other than clarification, and it is more likely that we'd parse the tokens and skip the Thenby word altogether. Our initial stack then, looks like this.
Time Club Sortby Time Club Frname Lname Print
Because the Interpreter pattern relies so heavily on having parsing objects that all derived from the same base class, we will illustrate this pattern in VB7 initially. While you can write this pattern using VB6 interfaces, as we will see, it is more difficult to present and explain.
In this parsing procedure, we do not push just a numeric token onto the stack but a ParseObject that has the both a type and a value property.
Public Class ParseObject Public Const VERB As Integer = 1000 Public Const VAR As Integer = 1010 Public Const MULTVAR As Integer = 1020 Protected value As Integer Protected type As Integer Public Sub New(ByVal val As Integer, _ ByVal typ As Integer) MyBase.New() value = val type = typ End Sub '---- Public Function getValue() As Integer Return value End Function '---- Public Function get_Type() As Integer Return type End Function End Class
These objects can take on the type VERB or VAR. Then we extend this object into ParseVerb and ParseVar objects, whose value fields can take on PRINT or SORT for ParseVerb and FRNAME, LNAME, and so on for ParseVar. For later use in reducing the parse list, we then derive Print and Sort objects from ParseVerb.
This gives us a simple hierarchy shown in Figure 24-1.
The parsing process is just the following simple code, using the String Tokenizer and the parse objects. Part of the main Parser class is shown here.
Public Class Parser Implements Command Private stk As Stack Private actionList As ArrayList Private kdata As KidData Private dat As Data Private ptable As Listbox Private chn As Chain Public Sub New(ByVal line As String, _ ByVal k As KidData, ByVal pt As ListBox) stk = New Stack() setData(k, pt) 'actions accumulate here actionList = New ArrayList() buildStack(line) buildChain() 'construct interpreter chain End Sub '---------------------------------------- Private Sub buildStack(ByVal line As String) 'parse input tokens and build stack Dim tok As StringTokenizer = _ New StringTokenizer(line) While (tok.hasMoreElements()) Dim token As ParseObject = _ tokenize(tok.nextToken()) stk.push(token) End While End Sub '-------------------------------------------- Public Sub setData(ByVal k As KidData, _ ByVal pt As listbox) dat = New Data(k.getData()) ptable = pt End Sub '-------------------------------------- Protected Function tokenize(ByVal s As String) As _ ParseObject Dim obj As ParseObject Dim typ As Integer Try obj = getVerb(s) typ = obj.get_Type 'this will throw null exception Catch e As NullReferenceException obj = getVar(s) End Try Return obj End Function '---------------------------------------- Protected Function getVerb(ByVal s As String) _ As ParseVerb Dim v As ParseVerb v = New ParseVerb(s, dat, ptable) If (v.isLegal()) Then Return v.getVerb(s) Else Return Nothing End If End Function '---------------------------------------- Protected Function getVar(ByVal s As String) _ As ParseVar Dim v As ParseVar v = New ParseVar(s) If (v.isLegal()) Then Return v End If End Function End Class
The ParseVerb and ParseVar classes return objects with isLegal set to true if they recognize the word.
Public Class ParseVerb Inherits ParseObject Protected Const PRINT As Integer = 100 Protected Const SORTBY As Integer = 110 Protected Const THENBY As Integer = 120 Protected args As ArrayList Protected kd As Data Protected pt As listbox '------- Public Sub New(ByVal s As String, _ ByVal kd_ As Data, ByVal pt_ As ListBox) args = New Arraylist() s = s.ToLower() value = -1 type = VERB kd = kd_ pt = pt_ Select Case s Case "print" value = PRINT Case "sortby" value = SORTBY End Select End Sub '----- Public Function getVerb(ByVal s As String) _ As ParseVerb Select Case value Case PRINT Return New Print(s, kd, pt) Case SORTBY Return New Sort(s) Case Else Return Nothing End Select End Function
The tokens on the stack have this form.
Var Var Verb Var Var Var Var Verb
We reduce the stack a token at a time, folding successive Vars into a MultVar class until the arguments are folded into the verb objects, as we show in Figure 24-2.
When the stack reduces to a verb, this verb and its arguments are placed in an action list; when the stack is empty, the actions are executed.
Creating a Parser class that is a Command object and executing it when the Go button is pressed on the user interface carries out this entire process.
Private Sub Compute_Click(ByVal sender As System.Object, _ ByVal e As System.EventArgs) Handles Compute.Click parse() 'call the parser End Sub '----- Private Sub parse() 'parse the command in the text box Dim par As New Parser( _ txCommand.Text, kdata, lsResults) par.Execute() End Sub
The parser itself just reduces the tokens, as the preceding shows. It checks for various pairs of tokens on the stack and reduces each pair to a single one for each of five different cases.
It would certainly be possible to write a parser for this simple grammar as just a series of if statements. For each of the six possible stack configurations, reduce the stack until only a verb remains. Then, since we have made the Print and Sort verb classes Command objects, we can just Execute them one by one as the action list is enumerated.
However, the real advantage of the Interpreter pattern is its flexibility. By making each parsing case an individual object, we can represent the parse tree as a series of connected objects that reduce the stack successively. Using this arrangement, we can easily change the parsing rules without much in the way of program changes: We just create new objects and insert them into the parse tree.
According to the Gang of Four, these are the names for the participating objects in the Interpreter pattern:
AbstractExpression—. declares the abstract Interpret operation.
TerminalExpression—. interprets expressions containing any of the terminal tokens in the grammar.
NonTerminalExpression—. interprets all of the nonterminal expressions in the grammar.
Context—. contains the global information that is part of the parser—in this case, the token stack.
Client—. builds the syntax tree from the preceding expression types and invokes the Interpret operation.
The syntax tree we construct to carry out the parsing of the stack we just showed can be quite simple. We just need to look for each of the stack configurations we defined and reduce them to an executable form. In fact, the best way to implement this tree is using a Chain of Responsibility, which passes the stack configuration along between classes until one of them recognizes that configuration and acts on it. You can decide whether a successful stack reduction should end that pass or not. It is perfectly possible to have several successive chain members work on the stack in a single pass. The processing ends when the stack is empty. We see a diagram of the individual parse chain elements in Figure 24-3.
In this class structure, we start with the AbstractExpression interpreter class InterpChain.
Public MustInherit Class InterpChain Implements Chain Private nextChain As chain Protected stk As Stack '------------------------------------------ Public Sub addtoChain(ByVal c As Chain) _ Implements Chain.addToChain nextChain = c 'next in chain of resp End Sub '----------------------------------------- Public MustOverride Function interpret() As Boolean '------------------------------------------ Public Function getChain() As Interpreter.Chain _ Implements Interpreter.Chain.getChain Return nextChain End Function '----------------------------------------- Public Sub sendToChain(ByVal stk_ As Stack) _ Implements Chain.sendToChain stk = stk_ If (Not interpret) Then 'interpret stack 'Otherwise, pass request along chain nextChain.sendToChain(stk) End If End Sub '----------------------------------------- Protected Sub addArgsToVerb() Dim v As ParseObject = CType(stk.pop(), parseobject) Dim verb As ParseVerb = CType(stk.pop(), parseverb) verb.addArgs(v) stk.push(verb) End Sub '---------------------------------------- Protected Function topStack(ByVal c1 As Integer, _ ByVal c2 As Integer) As Boolean Dim pobj1, pobj2 As ParseObject pobj1 = stk.top pobj2 = stk.nextTop Return (pobj1.get_Type() = c1) And (pobj2.get_Type() = c2) End Function End Class
This class also contains the methods for manipulating objects on the stack. Each of the subclasses implements the interpret operation differently and reduces the stack accordingly. For example, the complete VarVarParse class reduces two variables on the stack in succession to a single MultVar object.
Public Class VarVarParse Inherits InterpChain Public Overrides Function interpret() As Boolean If (topStack(ParseObject.VAR, ParseObject.VAR)) Then 'reduce (Var Var) to Multvar Dim v As ParseVar = CType(stk.pop(), ParseVar) Dim v1 As ParseVar = CType(stk.pop(), ParseVar) Dim mv As MultVar = New MultVar(v1, v) stk.push(mv) Return True Else Return False End If End Function End Class
Thus, in this implementation of the pattern, the stack constitutes the Context participant. Each of the first five subclasses of InterpChain are NonTerminal Expression participants, and the ActionVerb class that moves the completed verb and action objects to the actionList constitutes the TerminalExpression participant.
The client object is the Parser class that builds the stack object list from the typed-in command text and constructs the Chain of Responsibility from the various interpreter classes. We showed most of the Parser class above already. However, it also implements the Command pattern and sends the stack through the chain until it is empty and then executes the verbs that have accumulated in the action list when its Execute method is called.
'executes parse and interpretation of command line Public Sub Execute() Implements Command.Execute Dim i As Integer While (stk.hasMoreElements()) chn.sendToChain(stk) End While 'now execute the verbs For i = 0 To actionList.Count - 1 Dim v As Verb = CType(actionList(i), Verb) v.setData(dat, ptable) v.Execute() Next i End Sub
The final visual program is shown in Figure 24-4.
Writing an interpreter in VB6 is a bit more challenging because we don't have the convenience of inheriting many methods that the parser classes have in common. We will still organize the parser using the same classes and using the Chain of Responsibility pattern to send the stack from one to the next until the stack pattern is matched. However, we have to flatten out our class hierarchy to a much simpler single level one, since VB6 does not conveniently allow classes that implement one interface to then be used as interfaces themselves.
Thus, we might have started with the Chain interface and build from it.
'Interface class Chain '-------- Public Sub addChain(c As Chain) End Sub '-------- Public Sub sendToChain(stk As Stack) End Sub '-------- Public Function getChain() As Chain End Function '-------- Public Function hasChain() As Boolean End Function
Since we derive our Interpreter chain class from Chain by having it implement the Chain interface, we flatten the hierarchy and have the InterpChain class be the interface that all the remaining classes implement. Further, since a number of classes call the addArgsToVerb method, we put it in this class and let the derived classes create an instance of the base class and call that method.
'Class InterpChain 'Implements Chain but does it directly Private stk As Stack Private nextChain As InterpChain Private has_chain As Boolean '----- Public Sub init(stk_ As Stack) Set stk = stk_ End Sub '----- Public Sub addToChain(c As InterpChain) Set nextChain = c has_chain = True End Sub '----- Public Sub sendToChain(stk As Stack) End Sub '----- Public Function interpret() As Boolean End Function '----- Public Sub addArgsToVerb() Dim v As ParseObject Set v = stk.pop() 'get one off stack Dim vo As ParseObject Set vo = stk.pop() vo.addArg v 'add to next once stk.push vo 'and put it back End Sub
Using this interface, we can write each of the six stack reduction parsers. Here is the VarVarParse class that reduces two variables on the stack to a single MultVar variable.
'Class VarVarParse Implements InterpChain Private nextChain As InterpChain Private stk As Stack Private ichain As New InterpChain Private has_chain As Boolean '----- Private Sub InterpChain_addArgsToVerb() ichain.addArgsToVerb End Sub '----- Private Sub InterpChain_addToChain(c As InterpChain) Set nextChain = c 'next in chain has_chain = True End Sub '----- Public Sub init(stk_ As Stack) Set stk = stk_ ichain.init stk 'used to call addArgsToVerb End Sub '----- Private Function InterpChain_getChain() As InterpChain Set InterpChain_getChain = nextChain End Function '----- Private Function InterpChain_hasChain() As Boolean InterpChain_hasChain = has_chain End Function '----- Private Sub InterpChain_init(stk_ As Stack) Set stk = stk_ init stk_ End Sub '----- Private Function InterpChain_interpret() As Boolean Dim v As ParseObject, v1 As ParseObject, mv As Multvar If (stk.topStack(VAR_TYPE, VAR_TYPE)) Then 'reduce (Var Var) to Multvar Set v = stk.pop() Set v1 = stk.pop() Set mv = New Multvar mv.init v1, v stk.push mv InterpChain_interpret = True Else InterpChain_interpret = False End If End Function '----- Private Sub InterpChain_sendToChain(stk_ As Stack) 'try interpreting the top of the stack If Not InterpChain_interpret Then 'if no match, pass it on nextChain.sendToChain stk_ End If End Sub '----- Private Function InterpChain_topStack(ByVal c1 As Integer, ByVal c2 As Integer) As Boolean InterpChain_topStack = stk.topStack(c1, c2) End Function
In our VB7, we created objects representing each of the stack reduction states: ParseObject, ParseVar, ParseVerb, Verb, Sort, and Print. We derived the Parsevar and ParseVerb classes from ParseObject and the Command object Verb from ParseVerb. Then we derived Print and Sort from Verb. In the VB6 version, we have to flatten this hierarchy as well and have all these classes implement the ParseObject interface so we can use them interchangeably.
'Interface ParseObject 'All objects passed around interpreter implement this interface '----- Public Function getValue() As Integer End Function '----- Public Function getType() As Integer End Function '----- Public Sub init() End Sub '----- Public Sub addArg(p As ParseObject) End Sub '----- Public Sub setData(dt As data, ByVal pt As ListBox) End Sub
In VB6, we must rename the Print class to Printit, since VB6 has a single name space and Print is a reserved word. Both our Printit and Sort classes also implement the Command pattern, so they can be executed from the action list stack.
This interpreter works much as the previous one once we have gotten all the classes reduced to being ParseObjects. The user string is tokenized and converted into ParseObjects of type ParseVar or ParseVerb. Then the stack is reduced by combining adjacent ParseVars into MultVars and Var-Verb or MultVar-Verb combined into Verbs that contain the list of variables to operate on. When the top of the stack is a verb, it is removed and placed on the action list. When the stack is empty, the action list is executed, treating each verb as a Command with an Execute method.
Whenever you introduce an interpreter into a program, you need to provide a simple way for the program user to enter commands in that language. It can be as simple as the Macro record button we noted earlier, or it can be an editable text field like the one in the preceding program.
However, introducing a language and its accompanying grammar also requires fairly extensive error checking for misspelled terms or misplaced grammatical elements. This can easily consume a great deal of programming effort unless some template code is available for implementing this checking. Further, effective methods for notifying the users of these errors are not easy to design and implement.
In the preceding Interpreter example, the only error handling is that keywords that are not recognized are not converted to ParseObjects and pushed onto the stack. Thus, nothing will happen because the resulting stack sequence probably cannot be parsed successfully, or if it can, the item represented by the misspelled keyword will not be included.
You can also consider generating a language automatically from a user interface of radio and command buttons and list boxes. While it may seem that having such an interface obviates the necessity for a language at all, the same requirements of sequence and computation still apply. When you have to have a way to specify the order of sequential operations, a language is a good way to do so, even if the language is generated from the user interface.
The Interpreter pattern has the advantage that you can extend or revise the grammar fairly easily once you have built the general parsing and reduction tools. You can also add new verbs or variables easily once the foundation is constructed. However, as the syntax of the grammar becomes more complex, you run the risk of creating a hard-to-maintain program.
While interpreters are not all that common in solving general programming problems, the Iterator pattern we take up next is one of the most common ones you'll be using.
Design a system to compute the results of simple quadratic expressions such as
4x^2 + 3x - 4
where the user can enter x or a range of x's and can type in the equation.