1. INTRODUCTION

1.1 Purpose

The purpose of the Maintenance Guide is to provide the information needed to maintain the MISRA C Compliance Checker, including the processes of making changes, testing changes, and finally migrating changes to the production environment. This document is intended to minimize the time a new maintainer spends picking up the project, so that development can continue as soon as possible. Read this document before reading any other documentation regarding the project.

1.2 Document Organization

The remainder of the Maintenance Guide for the MISRA C Compliance Checker is organized as follows.

System Overview - General information on the MISRA C Compliance Checker, designed to provide maintainers with the background necessary to accurately maintain the system and continue development.

Maintenance Procedures - Detailed instructions on the maintenance of the system.

Automated Tools and Utilities - A list and description of the tools and utilities used to maintain the system.

2. SYSTEM OVERVIEW

To accurately maintain a system, it is important to have a general understanding of the system and the context in which changes are made. This section provides knowledge of the system as a whole, and should be understood before any changes are undertaken.

To avoid duplicating information, the project’s requirements are documented in the SRS, and design information, including the system's architecture, is in the Design Document.

In short, the system takes C code as input and runs a series of lexers, parsers, and tree parsers on it. Each recognizer implements a MISRA-C rule, finds violations of that rule, and outputs them to an XML file. The majority of what needs to be maintained lies in the grammars from which these recognizers are generated.
Potential defect fixes will require modifying existing grammars; adding new rules will require adding new grammar code. As delivered at the completion of the original project, the system includes a database that stores the rules and their corresponding information, including which rules to check in case the user only wants to check a subset of rules, as well as scripts for turning grammars into generated Java files. This manual explains some techniques for modifying different types of grammars, generating Java code, and including rules in the main application.

The tools needed to run and make changes to the system are ANTLR version 2 (release 2.7.7 or higher), hsqldb.jar, build_rule.bat, and Java version 5 or higher. Eclipse is required for future development of the system. A tool to develop and compile C code may be useful for testing sample code before running it through the checker.

3. MAINTENANCE PROCEDURES

This section details the procedures used to make modifications to the system.

3.1 How to Modify the System

3.1.1 Creating / Modifying rules

Most of the logic specific to a rule is located within a grammar (or grammars) written specifically for that rule. ANTLR is used to auto-generate parser Java classes from grammar files, and the compliance checker then loads and invokes the corresponding parser (or parsers) for every rule it checks. Therefore, the first step in implementing a new rule is to write a new ANTLR grammar (or multiple stages of grammars if required) for that rule. Similarly, to modify an existing rule, examine the grammar or grammars for that rule.

The source code files for rule grammars are in the Resources/Grammars/Rule subdirectory of the project. Grammars that correspond to a rule are named using the following convention:

    Rule_X_Y_Parser_Pass_N

X_Y is the rule number (e.g. rule 20.11 would be 20_11). The _Pass_N suffix, where N is the number of the stage, is only used when the grammar is the second or later part of a multiple-stage rule (the words “pass” and “stage” mean the same thing). Thus a single-stage rule or the first stage of a multiple-stage rule would be named Rule_X_Y_Parser.

Each rule grammar is derived from one of the project’s C grammars and is modified with additional actions specific to that rule which detect and report rule violations. The C grammars from which rule grammars derive each perform a different stage of language recognition. Which stage has the information needed to check a rule depends on the specific rule; thus different rule grammars derive from different C grammars, and some rule grammars require multiple stages. The three C grammars from which all rule grammars currently derive are the lexer, parser, and tree parser grammars. These grammars are stored in Resources/Grammars/V2 (V3 holds grammars that would have been compatible with ANTLR v3, but work on these was not completed).

Usually, the changes made to a rule grammar that distinguish it from the grammar it inherits from are limited to additional actions, class methods and variables, and any required import statements. It is a bad idea to change the syntax of the C being recognized by the rule grammar. In general, any modifications made to a rule grammar should be limited to new Java action code; none of the existing functionality of the grammar should be changed or removed. If such a change is required (for example, making a change in the lexer that allows C++-style comments), it should be made to the original grammar file, not the rule grammar, and all the other rules would have to be updated to reflect the syntax change.
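The naming convention described earlier in this section can be captured in a small helper. The following Java sketch is illustrative only: the class RuleGrammarNames and its method are hypothetical and not part of the project. It assumes the rule number is given in dotted form (e.g. 20.11) and the stage as a 1-based integer.

```java
/** Hypothetical helper, not part of the project: builds rule grammar names. */
class RuleGrammarNames {

    /**
     * Returns the grammar name for a rule number such as "20.11".
     * Stage 1 has no suffix; later stages get the _Pass_N suffix.
     */
    static String grammarName(String ruleNumber, int stage) {
        if (stage < 1) {
            throw new IllegalArgumentException("stage must be 1 or greater");
        }
        // Rule 20.11 becomes 20_11 in the grammar name.
        String base = "Rule_" + ruleNumber.replace('.', '_') + "_Parser";
        return stage == 1 ? base : base + "_Pass_" + stage;
    }

    public static void main(String[] args) {
        System.out.println(grammarName("20.11", 1)); // Rule_20_11_Parser
        System.out.println(grammarName("20.11", 2)); // Rule_20_11_Parser_Pass_2
    }
}
```

Appending ".g" to the returned name gives the grammar file name expected in Resources/Grammars/Rule.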
Using the comments example, if only one rule allowed C++-style comments and the system tried to check a file with such comments, every rule except that one would throw exceptions. Inheritance is not always possible in ANTLR through the simple use of the extends keyword, so copying and pasting is sometimes required. The following sections describe how each type of rule is implemented and in what situations each type should be used.

3.1.1.1 Tree Parser Rules

Abstract syntax trees (ASTs) are meant to be simpler to process than the token stream produced by the lexer. The parser stage builds the abstract syntax tree and sometimes rearranges and/or combines tokens so that the resulting tree makes more sense and is easier to work with. Therefore, you should write Compliance Checker rules as tree parsers unless you need information from the lexer/parser.

To write a new tree parser rule, open GnuCTreeParser.g (in the Grammars directory) and copy everything up to and including the options section (this includes the curly brackets that follow the options keyword). Change the following line

    class GnuCTreeParser extends TreeParser;

to

    class Rule_{X_Y}_Parser extends GnuCTreeParser;

where X_Y is your rule number (omit the brackets). Following the options section, you may declare any class variables and methods you need for your rule by placing them inside a curly brackets section (no keyword precedes the curly brackets; see GnuCTreeParser.g for an example of declaring class members).

Next, copy into your new grammar any grammar rules from GnuCTreeParser.g that will be modified with actions for your rule. You do not need to copy everything from the tree parser, only what you need (this is in contrast to lexer/parser rules; see section 3.1.1.2).
For example, if you only need to check something after the entire AST has been parsed, you only need to copy the translationUnit rule and add an action before the semicolon which ends the rule.

When building a rule of this type (see section 3.1.2), use the tree rule option in build_rule.bat.

3.1.1.2 Lexer / Parser Rules

The lexer stage takes in a character input stream and outputs a stream of tokens to the next stage, the parser. The lexer reads in characters one at a time and creates tokens from them. The parser then takes in the tokens created by the lexer and builds an abstract syntax tree as output. These two stages are combined into one grammar file, and thus rules which use either of these stages derive from the C lexer/parser, GnuCParser.g, located in the V2 directory.

ANTLR V2 does not allow extensive inheritance of grammars (GnuCParser.g already inherits from StdCParser.g; you cannot extend a lexer/parser grammar that already extends another), so you must copy and paste the contents of GnuCParser.g into the rule grammar to simulate inheritance. Some of the names inside the grammar need to be changed. The line that declares the parser class

    class GnuCParser extends StdCParser;

must be changed to

    class Rule_{X_Y}_Parser extends StdCParser;

where {X_Y} (excluding the brackets) is the rule number of your rule grammar (as per the naming scheme described earlier). A similar change must be made for the lexer. The line

    class GnuCLexer extends StdCLexer;

is changed to

    class Rule_{X_Y}_Lexer extends StdCLexer;

Notice that the name of this class ends with Lexer, not Parser. After making these changes, the only other changes to the grammar should be the addition of actions that are required to implement your rule. No grammar rules should be deleted from the new grammar.

Symbol table information is easily accessible from within the parser, but the symbol table is still being built during this stage.
The completed symbol table is only available at the end of the translationUnit rule. The lexer contains information about the characters that were in the original input (useful, for example, for examining the contents of comments, which are ignored in later stages). The lexer is currently the only place where #include and other preprocessing directives can be examined.

When building a rule of this type (see section 3.1.2), use the lexer rule option in build_rule.bat. See rule 5.2 for an example of a lexer/parser rule.

3.1.1.3 Multiple Stage Rules

Some rules need information from, or do processing in, both the lexer/parser and the tree parser. For example, you may need to check whether a library was included by using the lexer and then check the AST in the tree parser. This is accomplished through a multiple-stage rule. A multiple-stage rule consists of two or more grammar files where each stage calls the next stage. Currently, the only multiple-stage rules in the system consist of a lexer/parser for the first stage and a tree parser for the second stage, but multiple tree parsers might also be a desirable setup for future rules (there should only ever be one lexer/parser stage).

To implement a multiple-stage rule where the first stage is a lexer/parser and the second is a tree parser, start by writing a lexer/parser as described in section 3.1.1.2. Name it as any other lexer/parser rule would be named (do not use the _Pass_N suffix) and do any processing required for the lexer/parser stage of the rule by adding actions to the grammar. Then add an action to translationUnit in the parser; the next stage is initiated here. The code that initiates the next stage can be placed directly into this action or in a method which is called from here:

    translationUnit
        : ( externalList )? /* Empty source files are allowed. */
          {
              // initiate the next stage here
          }
        ;

Use the following code to initialize and begin the tree parser stage:

    // Instantiate and initialize tree parser
    Rule_X_Y_Parser_Pass_2 pass2parser = new Rule_X_Y_Parser_Pass_2();
    pass2parser.setASTNodeType(TNode.class.getName());
    TNode.setTokenVocabulary("GNUCTokenTypes");
    AST pass1tree = this.getAST();

    // Optional: Get the lexer object from this stage and pass
    // it to the tree parser.
    Rule_X_Y_Lexer lexer =
        (Rule_X_Y_Lexer)this.getInputState().getInput().getInput();
    pass2parser.setPreviousPass(lexer, this);

    try {
        pass2parser.translationUnit(pass1tree);
    }
    catch(RecognitionException e) {
        // Report the error; no further recovery is attempted
        System.err.println(e);
    }

The first four lines of this code are required to declare and initialize the tree parser. The tree parser is then executed in the try-catch block at the bottom. The two optional lines of code in the middle get the current lexer object and pass it to the tree parser. The setPreviousPass method is used here as an example of how to pass lexer information to the next stage; it is a method that was added to the tree parser specifically for this rule.

The second stage, the tree parser, is implemented just as any other tree parser, as described in section 3.1.1.1. The only differences are that you should name this grammar with the _Pass_N suffix (see 3.1.1), where N is a stage number greater than 1 (e.g. _Pass_2 for the second stage), and that you need some way of accessing information from previous stages, such as implementing a setPreviousPass method as described earlier (using this method, the first stage passes information to the second stage).

The rules are built separately. The lexer/parser is built with the lexer rule option in build_rule.bat and the tree parser is built with the tree rule option. The Compliance Checker only needs to load the first stage since each stage loads the next stage, so when adding the rule to the system database, only the first stage needs to be added.
However, all stages must be compiled and be in the src directory.

3.1.2 Compiling rules

Once a grammar is written, run build_rule.bat and follow the instructions. This batch file generates Java classes from grammar files using ANTLR and copies these classes to the src directory. It first prompts you for the name of the grammar file to be built (include the “.g” extension); then it asks whether it is a lexer or tree rule. Alternatively, to avoid the prompts, you can supply these two answers as parameters to build_rule.bat as follows:

    build_rule.bat Rule_{X_Y}_Parser{_Pass_N}.g {tree|lexer}

Replace the bracketed values with the actual values for your rule (do not include the brackets).

The script will inform you whether building the rule succeeded or failed. On success, ANTLR generates Java classes from the specified grammar and the batch file copies the classes to the src directory. If rule building succeeded, the batch file will display a success message when it finishes, as in the following screenshot:

[Screenshot: build_rule.bat success message]

If rule building failed, ANTLR will not create Java classes (rule building usually fails because there are syntax errors in the grammar) and the batch file will not copy any files to src. The batch file will display a failure message if this happens, as shown below:

[Screenshot: build_rule.bat failure message]

Once the classes have been generated and are in the src directory, they need to be edited so that they compile and the system does not throw errors. For lexer/parser rules, both the generated lexer class and the generated parser class need to be edited.
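The edits this section calls for (appending a Rule interface to the implements list of a generated class, and commenting out certain import statements) are mechanical, so they could in principle be scripted. The following Java sketch is illustrative only: GeneratedClassPatcher and its methods are hypothetical and not part of the project. It assumes the generated source is available as a string, that the first occurrence of the word implements belongs to the class declaration, and that each import statement sits on its own line.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of the project: patches generated sources. */
class GeneratedClassPatcher {

    // "implements A, B" followed by the opening brace of the class body.
    private static final Pattern IMPLEMENTS =
        Pattern.compile("(implements\\s+)([^{]+?)(\\s*\\{)");

    /** Appends interfaceName to the implements list unless already present. */
    static String addInterface(String source, String interfaceName) {
        Matcher m = IMPLEMENTS.matcher(source);
        if (!m.find()) {
            return source; // no implements clause found; leave untouched
        }
        String list = m.group(2);
        for (String name : list.split("\\s*,\\s*")) {
            if (name.trim().equals(interfaceName)) {
                return source; // already implemented
            }
        }
        String patched = m.group(1) + list + ", " + interfaceName + m.group(3);
        return source.substring(0, m.start()) + patched
            + source.substring(m.end());
    }

    /** Comments out every line of the form "import <name>;" for the given names. */
    static String commentOutImports(String source, List<String> names) {
        StringBuilder out = new StringBuilder();
        for (String line : source.split("\n", -1)) {
            boolean hit = false;
            for (String name : names) {
                if (line.trim().equals("import " + name + ";")) {
                    hit = true;
                    break;
                }
            }
            out.append(hit ? "// " + line : line).append('\n');
        }
        out.setLength(out.length() - 1); // drop the extra trailing newline
        return out.toString();
    }

    public static void main(String[] args) {
        String src = "import CToken;\n"
            + "public class Rule_20_8_Lexer extends antlr.CharScanner "
            + "implements Rule_20_8_LexerTokenTypes, TokenStream {\n}";
        String patched = addInterface(src, "RuleLexer");
        System.out.println(commentOutImports(patched, Arrays.asList("CToken")));
    }
}
```

Both methods are idempotent, so running them again over an already edited file leaves it unchanged. Which imports to pass in depends on the generated class; the list is a parameter rather than hard-coded.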
The generated lexer class, named Rule_{X_Y}_Lexer.java, needs to implement the RuleLexer interface; this interface must be appended to the implements list in the class declaration, as in the example below:

    public class Rule_20_8_Lexer extends antlr.CharScanner
        implements Rule_20_8_LexerTokenTypes, TokenStream, RuleLexer

Additionally, in this Java file, the following import statements must be commented out or deleted (to avoid a known issue with packages):

    import CToken;
    import java.io.*;
    import LineObject;
    import antlr.*;

For the generated parser class of the lexer rule, named Rule_{X_Y}_Parser.java, the RuleParser interface must be implemented by the parser, so it must be appended to the implements list in the class declaration as shown below:

    public class Rule_20_8_Parser extends antlr.LLkParser
        implements GNUCTokenTypes, RuleParser

For tree parser rules, the parser must implement RuleTreeParser. Add this interface to the implements list as shown in the example tree parser rule below:

    public class Rule_6_4_Parser extends antlr.TreeParser
        implements Rule_6_4_ParserTokenTypes, RuleTreeParser

Note that for multi-stage rules, the second and later stages do not need to implement the Rule interfaces (RuleLexer, RuleParser, RuleTreeParser) because they are not loaded by the main program; they are loaded by the preceding stage of the rule. However, the problematic import statements for lexer classes must always be commented out.

Once the rule is included in the main application as a rule to be run, it can be tested accordingly.

3.1.3 Including Rules

3.1.3.1 Using the Tool

In order to include rules in the system, they must first be added to the rule database and then added to the list of rules to be checked. The rule configuration tool is the easiest way to perform these tasks.

To add rules to the database, start the rule configuration tool with the command line argument admin.
The rule configuration tool will open with two tabs, one for editing the list of rules that are being checked, and one for editing the rules in the system's database.

Go to the tab labeled Edit Rules. Select the option NEW from the dropdown list at the top. Enter the rule number (X.Y) in the rule name, select whether the rule is advisory or required, and enter a description of the rule. Then select either lexer/parser or tree-parser from the Rule Type dropdown box. Rules using a tree parser should use the tree-parser option. Rules using either a lexer and parser combination or multiple stages (lexer, parser, and tree parser) should use the lexer/parser option. Enter the class name for the appropriate files. Lastly, select all associated consequences. Press Add New Rule to add the rule to the database.

To edit an existing rule, select the rule to be edited from the dropdown. Edit the fields you wish to change and press the Update Rule button. Press the Save button once all rule additions have been made.

To change the configuration of rules to be checked, go to the Checked Rules tab. To add rules to the checked list, select them from the list of unchecked rules (the list on the left) and press the >> button to move them to the checked list. To remove rules from the checked list, select them from the list of checked rules (the list on the right) and press the << button to move them to the list of unchecked rules. You can also select one of the preset options from the list at the bottom: select an option and click the Apply button to change the list of checked rules to the chosen preset. Press the Save button when all changes have been made.

3.1.3.2 Adding New Rules to the Database

A lightweight database powered by HSQLDB is used for the persistent storage of rule, consequence, and file extension information and configuration settings.
The system determines which recognizers to instantiate and run against the input files by querying this database. Therefore, a rule entry must be inserted into the database before that particular rule may be checked by the system. Below is a step-by-step guide for the addition of one rule:

1. Add an insert statement to the Resources\SQL Scripts\rules.sql file.
2. Start a command prompt.
3. Navigate to the root directory of the project.
4. Enter this command: "java -cp Resources\hsqldb.jar org.hsqldb.util.DatabaseManagerSwing".
5. Make the prompt look like this (change Type and URL):
   [Screenshot: database connection prompt]
6. Copy your insert statement into the text field:
   [Screenshot: insert statement in the text field]
7. Execute SQL. If it tells you there was an update, you were successful.
8. Clear the text field and enter "SHUTDOWN".
9. Execute SQL one more time. This ensures proper shutdown of the DB.
10. Make sure you commit any changes to the database (miserdb.script) and SQL scripts to revision control.

3.1.4 Testing Rules / Tips

It is easier for a project like this to use test-driven development. Write tests before writing the rules, so that each part of a rule can be tested as it is implemented. Carefully consider all equivalence classes and all of the quirky things that can be done in C so that every case can be addressed.

APPENDIX A - DELIVERABLES INDEX

Maintenance Guide Deliverables Index Template: This template is used to list the formal deliverables relevant to the maintenance of a system.

DELIVERABLES INDEX

Name             Used for Task                 Description                      Location
SRS              Requirements Management       System Requirements              Docs/SRS.doc
Design Document  View System Design            System Design                    Docs/Miser-C Design.doc
User Manual      Using the System              Instructions for System Use      Docs/MISRA C User Manual.doc
Use Cases        Understand System Flow        List of Use Cases of the System  Docs/Use Cases.doc
Risk Management  Understand Development Risks  Risks and Mitigation Strategies  Docs/RiskManagement.xls