Special

Clearance Sale!

We've been publishing for over five years now and it's time to clear out our inventory of back issues, so we're slashing prices!

RBD Magazines

Check out this amazing clearance sale of all our past issues. Missing some issues? This is a great time to complete your RBD collection. Save up to 40% off the regular price of our printed back issue packages. These prices are only good until the end of the year May 2008 and supplies are limited, so place your order today.

Article Preview


Buy Now

Print:
PDF:

Algorithms

An RPN Interpreter

Installment Two

Issue: 6.1 (November/December 2007)
Author: Charles Yeomans
Article Description: No description available.
Article Length (in bytes): 14,438
Starting Page Number: 40
RBD Number: 6117
Resource File(s): None
Related Web Link(s):

http://www.declareSub.com/

Known Limitations: None

Excerpt of article text...

In the last installment of this column, I presented a naive implementation of an RPN interpreter. Now, we rewrite that code. The goal is to decouple parsing and evaluation of the arithmetic expression, with an eye toward greater things. We begin with the evaluation of an RPN expression. As discussed previously, reverse Polish notation (RPN), also called postfix notation, is a notation for writing mathematical expressions in which the operands for a function or operator precede the function. We can translate this more or less directly into REALbasic. We introduce classes RPNOperator and RPNOperand to correspond to operators and operands, respectively. Since they are both pieces of an RPN expression, we define a superclass RPNToken from which they inherit. To RPNToken, we add the following constructor. Sub Constructor(s as String, startPosition as Integer) me.SourceStart = startPosition me.SourceLength = Len(s) raiseEvent Open s End Sub The variable s is the string data extracted from the input to be converted into an RPNToken. We defer conversion to subclasses via the Open event. We are separating parsing of the input string from interpretation of the RPN expression. It is possible that parsing will succeed in generating an RPN expression that cannot be interpreted; for example, consider the expression 3 5 + + In order to provide some useful feedback as to the location of the error, we want to keep track of the location of an RPNToken in the input. So we add properties SourceStart as integer, SourceLength as integer. The implementation of the Open event in RPNOperand is Sub Open(s as String) me.Data = s End Sub where Data as String is a private property of RPNOperand. We also add a read-only computed property IntegerValue as integer, implemented in the obvious way. The implementation of the Open event in RPNOperator is Sub Open(s as String) me.Operator = s End Sub where Operator as String is a private property of RPNOperator. Now we can think of an RPN expression as an array of RPNTokens. To evaluate such an expression, we start at the beginning and work through the list. Recall the algorithm is this. for i as Integer = 0 to UBound(tokenList) if tokenList(i) isA RPNOperand then // add it to operand stack elseIf tokenList(i) isA RPNOperator then // pop operands off the stack // do the operation // push the result back on the stack else // something is really wrong end if next The result should now be the single element in the stack. We can replace the if-else block with some polymorphic behavior. We add a method Interpret(theStack() as RPNOperand) to RPNToken, and implement it in subclasses. So now the algorithm is as follows. for i as Integer = 0 to UBound(tokenList) tokenList(i).Interpret theStack next the result should now be the single element in the stack. Now it is a small step to a method Interpret(tokenList() as RPNToken) as RPNOperand to our RPNInterpreter class, introduced in the previous column. Function Interpret(tokenList() as RPNToken) As RPNOperand dim theStack(-1) as RPNOperand for i as Integer = 0 to UBound(tokenList) tokenList(i).Interpret theStack next if UBound(theStack) = 0 then return theStack.Pop else raise new RPNInterpreterException end if End Function Next we implement RPNToken.Interpret by calling an event, to be implemented in subclasses. Sub Interpret(theStack() as RPNOperand) raiseEvent Interpret theStack End Sub Here is the implementation of the event handler for RPNOperand. Sub Interpret(theStack() as RPNOperand) theStack.Append me End Sub Implementing the event in RPNOperator is a little more work, since we have more than one operator to support. Sub Interpret(context() as RPNOperand) select case me.Operator case "+" me.InterpretPlus context case "-" me.InterpretMinus context case "*" me.InterpretTimes context else raise new RPNInterpreterException end select End Sub Private Sub InterpretMinus(context() as RPNOperand) dim op2 as RPNOperand = context.Pop dim op1 as RPNOperand = context.Pop context.Append new RPNOperand(op1.IntegerValue - op2.IntegerValue) Exception oob as OutOfBoundsException raise new RPNInterpreterException End Sub Subroutines InterpretPlus, InterpretTimes are implemented similarly. The following describes what we have done so far. Given a language, represent its grammar as a collection of classes. Then we can write a program in the language as a bunch of objects, and execute the code to run the program. This is the Interpreter pattern. Here we are tackling the much simpler case of evaluating an expression. In subsequent columns, though, we will use the technique we are developing to implement a real language. Now we turn to the task of parsing an input string into an array of RPNTokens. As you may recall, here is the RPN interpreter as we left it. Function Interpret(rpnInput as String) As Integer dim character() as String = Split(rpnInput, "") const whitespaceSentinel = " " character.Append whitespaceSentinel dim operandStack() as Integer dim buffer() as String const stateWhitespace = 0 const stateOperand = 1 const stateOperator = 2 dim interpreterState as Integer = stateWhitespace for i as Integer = 0 to UBound(character) dim theCharacter as String = character(i) select case interpreterState case stateWhitespace if me.IsWhitespaceCharacter(theCharacter) then continue elseIf me.IsOperandCharacter(theCharacter) then redim buffer(-1) buffer.Append theCharacter interpreterState = stateOperand elseIf me.IsOperatorCharacter(theCharacter) then redim buffer(-1) buffer.Append theCharacter interpreterState = stateOperator else raise new RPNInterpreterException(i, "Illegal character") end if case stateOperand if me.IsOperandCharacter(theCharacter) then buffer.Append theCharacter elseIf me.IsWhitespaceCharacter(theCharacter) then operandStack.Append Val(Join(buffer, "")) redim buffer(-1) interpreterState = stateWhitespace elseIf me.IsOperatorCharacter(theCharacter) then operandStack.Append Val(Join(buffer, "")) redim buffer(-1) buffer.Append theCharacter interpreterState = stateOperator else raise new RPNInterpreterException(i, "Illegal character") end if case stateOperator if me.IsOperatorCharacter(theCharacter) then buffer.Append theCharacter elseif me.IsOperandCharacter(theCharacter) then try dim op2 as Integer = operandStack.Pop dim op1 as Integer = operandStack.Pop dim theOperator as String = Join(buffer, "") operandStack.Append Evaluate(op1, op2, theOperator) redim buffer(-1) buffer.Append character(i) interpreterState = stateOperand catch rpnError as RPNInterpreterException rpnError.Offset = i raise rpnError catch oob as OutOfBoundsException raise new RPNInterpreterException(i, "Missing operand") end try elseIf me.IsWhitespaceCharacter(theCharacter) then try dim op2 as Integer = operandStack.Pop dim op1 as Integer = operandStack.Pop dim theOperator as String = Join(buffer, "") operandStack.Append Evaluate(op1, op2, theOperator) interpreterState = stateWhitespace catch rpnError as RPNInterpreterException rpnError.Offset = i raise rpnError catch oob as OutOfBoundsException raise new RPNInterpreterException(i, "Missing operand") end try else raise new RPNInterpreterException(i, "Illegal character") end if else raise new RPNInterpreterException(0, "Interpreter is in invalid state.") end select next if UBound(operandStack) = 0 then return operandStack(0) elseIf UBound(operandStack) > 0 then raise new RPNInterpreterException(UBound(character) + 1, "Missing operator") else return 0 end if End Function We will do more than just refactor this code, because I want to fill a gap; this code does not handle negative numbers. As it turns out, adding support for negative numbers adds an interesting complication to the task of parsing the input string. We begin with a function prototype. Function Tokenize(rpnInput as String) as RPNToken() End Function As before, we split the input into an array of characters, and append a space to remove the need to handle the end of the loop as a special case. The value of each character puts the function into one of the states stateWhitespace, stateOperand, or stateOperator. While in a state, we accumulate characters in a buffer. When we switch state, we create an RPNOperand or RPNOperator, or discard whitespace. Function Tokenize(rpnInput as String) as RPNToken() dim characters() as String = Split(rpnInput, "") characters.Append " " const stateWhitespace = 0 const stateOperand = 1 const stateOperator = 2 dim buffer as String dim tokenList() as RPNToken dim state as Integer = stateWhitespace dim stateStartPosition as Integer for i as Integer = 0 to UBound(characters) dim theChar as String = characters(i) select case state case stateWhitespace case stateOperator case stateOperand else raise new RPNInterpreterException end select next End Function In the code above, there is one new local variable, stateStartPosition. As noted above, we are splitting tokenizing and interpretation into separate methods. We use this variable to keep track of the start of a new state. The case stateOperator is the easiest to implement, so we tackle it first. If the next character is whitespace, we add a new RPNOperator to tokenList and switch to stateWhitespace. If the next character is a digit, we add a new RPNOperator to tokenList and switch to stateOperand. Otherwise, we raise an exception. Implicit is the assumption that all operators are a single character. We can easily remove this assumption, but there is no need to now. The next state to consider is stateWhitespace. If the next character is again a whitespace character, we continue. If the next character is "+" or "*", we switch to stateOperator. If the next character is a digit, we switch to stateOperand. Now, what if the next character is "-"? The answer is that it depends on the following character. If that character is a digit, then we presume that "-" the leading minus sign of a negative number. Otherwise, we say that it is the minus operator. In other words, the decision requires a lookahead. Finally, we have stateOperand. If the next character is a digit, we continue. If the next character is whitespace, we add a new RPNOperand to tokenList and switch to stateWhitespace. If the next character is "+" or "*", we switch to stateOperator. If the next character is a "-", add a new RPNOperand to tokenList. Then we look ahead as above and switch to either stateOperator or stateOperand. Here's the code. Function Tokenize(rpnInput as String) As RPNToken() dim characters() as String = Split(rpnInput, "") characters.Append " " const stateWhitespace = 0 const stateOperand = 1 const stateOperator = 2 dim buffer as String dim tokenList() as RPNToken dim state as Integer = stateWhitespace dim startPosition as Integer for i as Integer = 0 to UBound(characters) dim theChar as String = characters(i) select case state case stateWhitespace if me.IsWhitespace(theChar) then continue elseif me.IsDigit(theChar) then parseState = stateOperand buffer = theChar startPosition = i elseIf me.IsPlus(theChar) or me.IsTimes(theChar) then parseState = stateOperator buffer = theChar startPosition = i elseIf me.IsMinus(theChar) then //we need to look at the next character dim nextChar as String = characters(i + 1) if me.IsDigit(nextChar) then parseState = stateOperand buffer = theChar startPosition = i else parseState = stateOperator buffer = theChar startPosition = i end if else raise new RPNInterpreterException end if case stateOperator if me.IsWhitespace(theChar) then tokenList.Append new RPNOperator(buffer, startPosition) parseState = stateWhitespace buffer = "" startPosition = i elseIf me.IsDigit(theChar) then tokenList.Append new RPNOperator(buffer, startPosition) parseState = stateOperand buffer = theChar startPosition = i else raise new RPNInterpreterException end if case stateOperand If me.IsDigit(theChar) then buffer = buffer + theChar elseIf me.IsMinus(theChar) then tokenList.Append new RPNOperand(buffer, startPosition) //we need to look at the next character dim nextChar as String = characters(i + 1) if me.IsDigit(nextChar) then parseState = stateOperand buffer = theChar startPosition = i else parseState = stateOperator buffer = theChar startPosition = i end if elseIf me.IsPlus(theChar) or me.IsTimes(theChar) then tokenList.Append new RPNOperand(buffer, startPosition) parseState = stateOperator buffer = theChar startPosition = i elseIf me.IsWhitespace(theChar) then tokenList.Append new RPNOperand(buffer, startPosition) parseState = stateWhitespace buffer = "" startPosition = i else raise new RPNInterpreterException end if else raise new RPNInterpreterException end select next return tokenList End Function Believe it not, we now have a start on a working interpreter.

...End of Excerpt. Please purchase the magazine to read the full article.

Article copyrighted by REALbasic Developer magazine. All rights reserved.


 


|

 


Weblog Commenting and Trackback by HaloScan.com