Article Preview


Feature

PostMortem: Reality Check

Code Analysis Software

Issue: 6.1 (November/December 2007)
Author: Dr. Scott Steinman
Article Description: No description available.
Article Length (in bytes): 21,283
Starting Page Number: 11
RBD Number: 6108
Resource File(s): None
Known Limitations: None

Excerpt of article text...

I came to REALbasic from the world of C++ and Java. While I came to love the ease and speed of programming in REALbasic, I missed some of the advanced programming utilities that I could use with Java, such as static analysis, documentation, and refactoring tools. My goal was to bring some of these tools to REALbasic, with the encouragement of Mars Saxman and my wife, Barbara.

Reality Check is a quality assurance (QA) tool for REALbasic that I introduced at my presentation at the 2006 Real World conference. It analyzes REALbasic program source code for potential errors in object-oriented and procedural code design and adherence to coding standards. It is modeled after similar tools that are available for other programming languages.

The program reads a REALbasic project in either binary or XML format, then breaks the project down into its constituent elements. Classes, modules, windows, and interfaces are further broken down into code, constants, and properties. Code is divided into methods, event handlers, and menu handlers. Data is broken down into constants, variables, properties, and method parameters. List boxes of each construct, data item and code item, along with properties such as scope and data type, can be viewed in the main program window. Method code can be viewed in a fourth list box. In essence, the main window acts as an OOP code browser (see Figure 1).

Each program element is then analyzed for adherence to OOP and procedural code standards as documented in many books on code design. For example, is an if-then statement too complex due to too many branches? Is a property used only in a single method, making it possible to convert it to a local variable? Is a class performing too many actions? If any such potential error is found for a given element, an icon indicates it in the main window's list boxes.
The user may select that item, and all of its associated errors will be displayed in an error list box, along with suggestions for correcting the error. If desired, the user may choose to view all of the program's errors in a view with a single list box (see Figure 2).

An Example of Reality Check's Analyses

Now that you've heard about Reality Check, let's see it in action. This is just a code snippet, but even this short code highlights how Reality Check can help REALbasic programmers. The original code is shown below:

    Class TestClass
      Const kUnused = 1
      Property mPreviousError as integer

      Function test( var1 as integer, var2 as integer, var3 as integer, _
          var4 as integer, var5 as integer, var6 as integer, var7 as integer ) as Boolean
        dim d as integer = getErrorCode()
        if d = 1 or d = 2 then
          return true
        else
          return false
        end if
      Exception
      End Function

      Function getErrorCode() as Integer
        // Dummy method for test purposes only
        return 2
      End Function
    End Class

Can you find all of the potential mistakes in the code (I mean, besides the fact that it doesn't do anything useful)? Now look at all of the error messages that Reality Check provides:

For class TestClass:

- It is possible to convert class to an interface. Class has no superclass, events or menu handlers, and all methods are public.
- Class has no constructors. Check that class' data is left in a known state after New operator is called.
- Lazy class: Class performs few functions. Consider merging it with a related class.
- Constant is unused. Consider removing constant from class.
- Property is unused. Consider removing property from class.

For method test (in Reality Check, the errors are shown in the actual code listing):

- Boolean method name is not descriptive. Boolean methods should begin with a question such as 'is', 'will', 'did', 'can' or 'has.'
- Parameter list is too long. Long parameter lists are prone to typing errors. Consider removing parameters.
- Parameters are unused. Consider removing unused parameters.
- Variable name is either too long or too short. Variable name should be long enough to be descriptive, but not so long as to be unwieldy.
- Standard variable name does not match type, e.g., an integer may be named x, y, z, i, j or k. A floating-point may be named d or f.
- 'Magic number' used. Numeric literal should be replaced with constant whose name describes its purpose.
- Method has multiple return points. Consider reducing the number of return statements.
- Exception handler is empty. This can lead to unhandled errors or to program crashes.

First, the class design errors. The first message tells us that the code could be constructed as either a class or an interface; if we wanted to use it as an interface, nothing would prevent us from doing so. The second message states that the class has no constructor. While this is not always a problem, if the class has properties, we'd want them to be set in a constructor so instances of this class are always in a known state before they are used. The third message tells us that the class does not do much (because it has only two methods). A "lazy" class may not need to exist on its own; its task may be folded into a similar existing class. The last two messages tell us that the class' property and constant are not referenced anywhere in the class. They can therefore be safely removed.

Now we examine the errors from the method named "test." First, its method name is misleading. It is a verb that connotes an action; that is, it may be changing the state of the class. This is inappropriate for a function, but even more so for a Boolean function. Boolean variables and methods that return them typically hold a yes/no answer. A Boolean method should therefore be named like a question -- "is this true?", "can this be done?", etc. Second, the method has a ton of parameters that aren't even accessed within it. We can remove these parameters. Third, local variable "d" is poorly named. What does "d" mean?
In addition, a name like "d", when used, is typically reserved for a temporary floating-point value. We've used it for an integer, which is misleading. Fourth, literal integer values are used in the method. These should be replaced with named constants or enumerations that describe what the number means. Fifth, the method has several points where return values are defined and from which the method can exit. While this can be beneficial at times, it is preferable to have a single return point in a method; multiple return points should be used with caution. Finally, while there is an exception handler defined in the method, it is empty. Even if the exception is caught, nothing occurs!

Below is a quick rewrite of the code to correct these errors. Now imagine how useful Reality Check could be on a large-scale project that has been maintained for months or years.

    [...]
        NoError = 0, DiskNotReadyError, NoFileError
      End Enum
    End Module

    Class TestClass
      Property mPreviousError as integer

      Sub Constructor()
        mPreviousError = Error.NoError
      End Sub

      Function isFileIntact() as Boolean
        dim theCode as integer = getFileErrorCode()
        if theCode = Error.NoError then
          // Check file contents
        end if
        // Single return point; yields a Boolean rather than the raw error code
        return ( theCode = Error.NoError )
      Exception
        MsgBox "Exception encountered in TestClass.isFileIntact"
        // Do something to handle exception, if possible
      End Function

      Private Function getFileErrorCode() as Integer
        // Dummy method for test purposes only
        return 2
      End Function
    End Class

Reality Check Development History

Version 0.5

This was the first version that I released to a small private group of testers in 2005. It already contained the core functions of the program -- reading project files in XML format via my own XML file parsing code, breaking the project file down into its component elements, then running an analysis for a few core types of errors.

Each program element -- projects, classes, interfaces, modules, windows, methods, events, menu handlers, properties, constants, local variables and constants, method parameters and return values -- is represented by its own Info class. This class contains the name of the element, its data and code contents, its parent (if a subclass), its interfaces (if a class), its implementer (if an interface), its code (if a method, event or menu handler), the errors present in the element, and the XMLNode of the project file that contains this element's definition.

Every project element, as stored in an instance of that element's Info class, is held in a dictionary that is specific to that element. For example, a ProjectInfo object contains dictionary properties that store Info class instances for the project's classes, interfaces, windows and modules. A ClassInfo object contains dictionary properties to store a class' constants, properties, methods, menu handlers, and event handlers, as well as individual properties that store the superclass name and any implemented interface names. A MethodInfo object contains dictionary properties to store a method's parameters, local constants, local variables, and local references, as well as a String array to store the method's source code line by line and individual properties to store the method scope and its return value.

The use of REALbasic dictionaries facilitated the analysis of the project file. As each program element was parsed in the project file, it was added to its appropriate dictionary. Separate dictionaries for each type of program element and the contents of each element also made it easy to display information about the elements in the main window's list boxes. For example, a list of project classes is displayed by iterating through the ProjectInfo object's classes dictionary. Data elements of that class are found by iterating through that ClassInfo object's dictionaries of properties, constants, and so on.
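
The dictionary-of-Info-objects layout described above can be sketched roughly as follows. This is only an illustration, written in Python rather than REALbasic so that it can be run outside the IDE; it is not Reality Check's actual code. The field names, the helper functions class_rows and property_rows, and the sample data are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyInfo:
    """Hypothetical stand-in for a property's Info class."""
    name: str
    data_type: str
    errors: list = field(default_factory=list)


@dataclass
class ClassInfo:
    """Hypothetical stand-in for ClassInfo: one dictionary per kind of member."""
    name: str
    superclass: str = ""
    properties: dict = field(default_factory=dict)  # property name -> PropertyInfo
    errors: list = field(default_factory=list)


@dataclass
class ProjectInfo:
    """Hypothetical stand-in for ProjectInfo."""
    classes: dict = field(default_factory=dict)  # class name -> ClassInfo


def class_rows(project):
    """Rows for a class list box: (class name, whether to show an error icon)."""
    return [(info.name, bool(info.errors)) for info in project.classes.values()]


def property_rows(class_info):
    """Rows for a data list box: (property name, data type)."""
    return [(p.name, p.data_type) for p in class_info.properties.values()]


# "Parsing" step: each element is added to its dictionary as it is found.
project = ProjectInfo()
test_class = ClassInfo(name="TestClass")
test_class.properties["mPreviousError"] = PropertyInfo("mPreviousError", "Integer")
test_class.errors.append("Class has no constructors.")
project.classes[test_class.name] = test_class
project.classes["Helper"] = ClassInfo(name="Helper")

# "Display" step: list boxes are filled by walking the dictionaries.
for name, has_errors in class_rows(project):
    print(("! " if has_errors else "  ") + name)
```

Keeping one dictionary per kind of element means the display code and the analysis code can both be written as a simple loop over a dictionary's values.
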
Code elements are found in a similar manner. Source code is displayed by reading a MethodInfo object's sourceCode property.

Most importantly, the use of dictionaries populated with Info objects allowed me to make the error analysis code simple and modular. Here is an example of code that iterates through the classes dictionary and analyzes each class contained within it. The number of elements in the dictionary is retrieved with the dictionary's Count property. A loop is entered that iterates through each element in the dictionary, casts it to a ClassInfo instance, then calls an analysis subroutine for each part of the class -- the class construct itself, its data members, and its code members.

    [...]
      dim count as integer = project.classes.Count - 1
      dim hasSubclass as boolean
      for i as integer = 0 to count
        theClass = ClassInfo( project.classes.Value( project.classes.Key( i ) ).ObjectValue )
        currAnalyzedElement = theClass.name
        hasSubclass = findSubclasses( theClass )
        analyzeClassConstructs( theClass )
        analyzeConstants( theClass.constants )
        analyzeProperties( theClass.properties )
        analyzeMethods( theClass.methods, hasSubclass )
        analyzeEvents( theClass.events )
        analyzeMenuHandlers( theClass.menuHandlers )
      next
    End Sub

Each analysis subroutine accepts an element's Info class instance as its parameter, then runs its error checks on that element. This can be seen in this snippet from the class construct analysis method. Each error check itself is only a few lines of code. As an error is encountered, information about the error is appended to the errors array of the element's Info class, and the hasErrors flag is set to true, allowing the display of the error icon in the Reality Check GUI.

    Protected Sub analyzeClassConstructs( theClass as ClassInfo )
      // Class name should not be too short or too long to be both descriptive and convenient
      if theClass.name.Len < cSettings.minNameLength or theClass.name.Len > cSettings.maxNameLength then
        logError( kClassNameLengthError, theClass.errors, "", kNone, kNone )
        theClass.hasErrors = true
      end if

      // Lazy class: few methods, and no events or menu handlers
      if theClass.methods.Count <= cSettings.minMethods and theClass.events.Count = 0 and theClass.menuHandlers.Count = 0 then
        logError( kLazyClassError, theClass.errors, "", kNone, kNone )
        theClass.hasErrors = true
      end if
    End Sub

Versions 0.6 - 0.6.3

These versions added several new types of error checks, faster analysis, and a preferences system for selecting which analyses to run and for suppressing unwanted error warnings. Better graphics were added, courtesy of my wife, including a flattering 3D picture of me analyzing a REALbasic cube with a CAT scan-like device for the splash screen. I wish I were as young and thin as that picture.

The preferences dialog box was not as easy to create as I anticipated, but surprisingly the coding wasn't the problem. The main problem was aligning controls on different tabs of a TabPanel. Unlike with windows, there are no guidelines for aligning the contents of a TabPanel or GroupBox relative to its container according to each platform's GUI standards. It's also difficult to select several controls within a TabPanel or GroupBox at once. For totally unselfish reasons (that's my story and I'm sticking to it!), a feature request for improved editing of components in TabPanels and GroupBoxes was submitted to the REALbasic feedback site.

Versions 0.6.4 - 0.6.8

At this point, I realized that the default REALbasic list boxes would not be capable of rapidly displaying my lists of classes, interfaces, modules, data, and code. It was time to replace them with a more versatile control. I chose the Einhugur DataGrid.
It provided a fast, professional-looking list box that could be sorted, adorned with icons indicating the presence of errors or the scope of data and code, enhanced with colored styled text in individual cells, and printed. The Einhugur WindowSplitter plugin allowed users to customize the size of the list boxes displayed in Reality Check.

One complaint that my testers had was the inconvenient extra step of saving projects in XML format (of course, some XML file parsing errors in my own code didn't help). Luckily, Thomas Tempelmann had been working on RBProjectTools, a library that could read and write both XML-based project files and binary project files. With his permission, I replaced my XML code with his. I also added Jonathan Johnson's assertion library to allow more detailed exception reporting in my code.

Versions 0.6.9 - 0.7.2

Due to illness, my progress on Reality Check was slowing down. I decided to open source the Reality Check project, hoping that a group effort could complete version 1.0 and make Reality Check available to the REALbasic community, then add new features. I replaced the Einhugur plugins with open source controls such as Alex Restrepo's list box and Harry Hooie's window splitter. This version, the last one released, is available at my web site.

What's still needed for Versions 1.0 and 2.0

While working on Reality Check, I realized that certain errors could not be reliably reported with the current analysis engine. For example, to determine if a method or data element had the proper scope, all references to that element within the program had to be found. Likewise, to detect unused code or data, all of the references to that element had to be found, whether the element is passed to a method or "silently" returned by a method and instantly used without being stored in a property or local variable.
To detect a faulty Boolean statement that always results in true or always results in false, the element's value had to be traced through a method's code.

To detect the types of errors related to element scope and use, I needed to be able to parse and analyze the syntax of REALbasic code, then create a symbol table to store all elements, their scope, and references to them. In addition, if I wanted in version 2.0 to do more than report errors -- to be able to automatically correct these errors by refactoring the project code -- I'd need to build a complete decorated abstract syntax tree, then rearrange the syntax tree to do the refactoring. Sound familiar to any of you? This is essentially the front end of a compiler. I'd have to write a full REALbasic compiler in REALbasic!

This was getting scary. I am not a computer scientist by training. I started taking computer science courses at a nearby university, but here too the illness got in the way before I could take a compiler course (there's nothing worse than having a nearly perfect grade in a course, then getting a migraine at the start of the final exam). Despite my having read most of the available compiler texts, jumping in and writing one was a daunting task. I explored available compiler writing tools -- parser generators -- to see if one could be adapted to REALbasic. I'd have to reproduce the REALbasic grammar, then wrap the output of a parser generator into a plugin, not an easy task.

In addition, I'd need a way to detect symbols used in a project that are not reflected in the project file -- framework symbols and symbols from installed plugins. I wrote a second program, Language Reference Parser (also on my web site), to parse the LR documentation for symbol information and transform it to an XML format that could be read and added to a symbol table.
However, inconsistencies in LR page HTML formats required custom handling of several pages. With each change to REALbasic, the LR information must be reparsed, which requires changes to the LR parser program, and the symbol information must be built again. In addition, plugin symbol information is not yet obtainable.

Even though I didn't want to give up on my labor of love, it looked like Reality Check would die a slow, painful death.

Lessons Learned

So what did I learn from this project? First the minuses.

At various points in the project, I used code written by others to perform some of the tasks needed for Reality Check. In one case, the other programmer had less time to work on that code due to increasing demands at his job. Of course, I can't fault him for this, but this left me unable to handle later changes to REALbasic that required changes in that code. Using open source code, like using proprietary code, is not completely risk-free.

Some of the libraries that I used, such as the project file read/write code, would not be required if REALbasic allowed tools to be plugged into the IDE via a Tools API, as do IntelliJ IDEA, Eclipse, and Visual Studio. A REALbasic compiler front end for Reality Check version 1.0 would not be needed if abstract syntax tree and symbol table information were made available through an API. I believe that the lack of these APIs is holding back REALbasic programming utility development relative to the rich tools market seen with other programming languages. Addition of third-party tools to the REALbasic IDE would also reduce the programming demands on Real Software since they would not need to build these tools themselves.

Reality Check was made open source to promote contributions from others to the parser/syntax analyzer/symbol table. However, the open source community for RB is not as well developed as for Java, C/C++, Ruby, or Python.
A past Postmortem in RBDJ noted that many REALbasic developers don't seem to want to pay for high-end controls, plugins, and utilities. I was surprised to see that they also did not contribute to the development of free tools. I'm hoping this changes as the REALbasic community grows.

Okay, now the plusses.

The initial development of Reality Check was very rapid, partly due to my obsession with writing it (I'm not a Type A personality. With extra effort, I can be A+!), but also due to the ease of writing code in REALbasic.

REALbasic dictionaries were critical for holding program information and searching this information quickly to analyze or display it. They were also very easy to use. Dictionaries allowed each error check to be reduced to a small code block run on each dictionary member, and adding new error checks is simple. These same dictionaries are rapidly traversed to populate the list boxes in the GUI.

There is a wealth of available REALbasic libraries that can be used in projects, both proprietary and open source. Using these libraries certainly increased the quality and power of Reality Check, but also presented some drawbacks, as noted above. My advice is to use code from projects that are still actively being developed with rapid releases of updates.

Although Reality Check was developed on Mac OS X, I was confident that the final version of the program would readily work on Windows and Linux too. REALbasic has been the best cross-platform tool I've ever used. Just about all of my software has immediately worked on both the Mac and Windows with just small tweaks to the GUI.

Last, but not least, I remembered reading about a computer science student who wrote a code analyzer as a class project that spit out a grade for the code quality. His professor, as sneaky professors do, ran the student project's source code through the code analyzer. The student got a D for his own code quality.
Well, as a sneaky professor myself, I couldn't avoid running Reality Check's source code through Reality Check. Reality Check found design errors in its own code that allowed me to improve its design and speed.

...End of Excerpt. Please purchase the magazine to read the full article.

Article copyrighted by REALbasic Developer magazine. All rights reserved.


 

