
1. INTRODUCTION
1.1 Purpose
The purpose of the Maintenance Guide is to provide the information needed to maintain
the MISRA C Compliance Checker, including the processes of making changes, testing
changes, and finally migrating changes to the production environment.
This document is intended to minimize the time needed to pick up this project so that
development can continue as soon as possible. Read this document before reading any
other documentation regarding the project.
1.2 Document Organization
The remainder of the Maintenance Guide for the MISRA C Compliance Checker is
organized as follows.

System Overview - General information on the MISRA C Compliance Checker,
designed to provide maintainers with the background necessary to accurately
maintain the system and continue development.

Maintenance Procedures - Detailed instructions on the maintenance of the system.

Automated Tools and Utilities - A list and description of the tools and utilities used
to maintain the system.
2. SYSTEM OVERVIEW
To accurately maintain a system, it is important to have a general understanding of the
system and the context in which changes are made. This section provides knowledge of
the system as a whole, and should be understood before any changes are undertaken.
To reduce redundancy, this guide does not repeat material covered elsewhere: information
on the project’s requirements can be found in the SRS, and design information, including
architectural information about the system, can be found in the Design Document.
In short, the system takes C code as input and runs a series of lexers, parsers, and tree
parsers on it. Each recognizer is an instance of a MISRA-C rule; it finds violations of that
rule and outputs them to an XML file. The majority of what needs to be maintained lies
in the grammars from which these recognizers are generated. Potential defect fixes will
require making modifications to existing grammars; adding new rules will require adding
new grammar code.
As delivered at the completion of the original project, the system includes a database that
stores the rules and corresponding information, including which rules to check in case the
user only wants to search a subset of rules, as well as scripts for turning grammars into
generated Java files. This manual explains some techniques for modifying different types
of grammars, generating Java code, and including rules in the main application.
The tools needed to run and make changes to the system are ANTLR version 2 (release
2.7.7 or higher), hsqldb.jar, build_rule.bat, and Java version 5 or higher. Eclipse is
required for future development of the system. A tool to develop and compile C code
may be useful for testing sample code before running it through the checker.
3. MAINTENANCE PROCEDURES
This section details the procedures used to make modifications to the system.
3.1 How to Modify the System
3.1.1 Creating / Modifying rules
Most of the logic specific to a rule is located within a grammar (or grammars) written
specifically for that rule. ANTLR is used to auto-generate parser Java classes from
grammar files and the compliance checker then loads and invokes the corresponding
parser (or parsers) for every rule it checks.
Therefore the first step in implementing a new rule is to write a new ANTLR grammar
(or multiple stages of grammars if required) for that rule. Similarly for modification of an
existing rule, the grammar or grammars for that rule should be examined. The source
code files for rule grammars are in the Resources/Grammars/Rule subdirectory of the
project. Grammars that correspond to a rule are named using the following convention:
Rule_X_Y_Parser_Pass_N
X_Y is the rule number (e.g. rule 20.11 would be 20_11) and the _Pass_N suffix is only
used when the grammar is the second or later part of a multiple-stage rule (the words
“pass” and “stage” mean the same thing), where N is the number of the stage. Thus a
single-stage rule or the first stage of a multiple-stage rule would be named
Rule_X_Y_Parser.
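The naming convention above can be sketched as a small helper. This is only an
illustration: the class RuleGrammarNames and its method grammarName are hypothetical and
are not part of the project. The sketch assumes rule numbers are written as X.Y and that
stages are numbered from 1.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not part of the project) that applies the naming convention. */
final class RuleGrammarNames {
    // Assumption: a rule number is two integers separated by a dot, e.g. "20.11".
    private static final Pattern RULE_NUMBER = Pattern.compile("\\d+\\.\\d+");

    private RuleGrammarNames() {}

    /**
     * Returns the grammar name for a rule number and a 1-based stage.
     * Stage 1 has no suffix; later stages get the _Pass_N suffix.
     */
    static String grammarName(String ruleNumber, int stage) {
        if (!RULE_NUMBER.matcher(ruleNumber).matches()) {
            throw new IllegalArgumentException("Expected X.Y, got: " + ruleNumber);
        }
        if (stage < 1) {
            throw new IllegalArgumentException("Stage must be 1 or greater: " + stage);
        }
        String base = "Rule_" + ruleNumber.replace('.', '_') + "_Parser";
        return stage == 1 ? base : base + "_Pass_" + stage;
    }
}
```

For example, grammarName("20.11", 1) yields Rule_20_11_Parser and grammarName("20.11", 2)
yields Rule_20_11_Parser_Pass_2.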
Each rule grammar is derived from one of the project’s C grammars. Every rule grammar
is modified with additional actions specific to that rule which detect and report rule
violations. The C grammars from which rule grammars derive each perform different
stages of language recognition. It depends on the specific rule which stage has the
information needed to check that rule; thus different rule grammars derive from different
C grammars and some rule grammars require multiple stages. The three C grammars
from which all rule grammars derive currently consist of the lexer, parser and tree parser
grammars. These grammars are stored in Resources/Grammars/V2 (V3 holds grammars
that would have been compatible with ANTLR v3, but work on these was not
completed).
Usually, the changes made to a rule grammar that distinguish it from the grammar it
inherits from are limited to additional actions, class methods and variables and any
required import statements. It is a bad idea to change the syntax of the C being
recognized by the rule grammar. In general, any modifications made to a rule grammar
should be limited to new Java action code; none of the existing functionality of the
grammar should be changed or removed. If such a change is required (for example,
making a change in the lexer that allows C++-style comments), it should be made to the
original grammar file, not the rule grammar, and all the other rules would have to be
updated to reflect the syntax change. Using the comments example, if only one rule
allows C++-style comments and the system tries to check a file with such comments, all
the other rules would throw exceptions except that one rule that recognizes that syntax.
Inheritance is not always possible in ANTLR through the simple use of an extends
keyword, so sometimes copy/pasting is required. The following sections describe how
each type of rule is implemented and in what situations each type should be used.
3.1.1.1 Tree Parser Rules
Abstract syntax trees (ASTs) are meant to be simpler to process than the token stream
produced by the lexer. The parser stage builds the AST and sometimes rearranges and/or
combines tokens so that the resulting tree makes more sense and is easier to work with.
For this reason, write Compliance Checker rules as tree parsers unless you need
information that is only available in the lexer/parser.
To write a new tree parser rule, open up GnuCTreeParser.g (in the Grammars directory)
and copy everything up to and including the options section (this includes the curly
brackets that follow the options keyword). Change the following line
class GnuCTreeParser extends TreeParser;
to
class Rule_{X_Y}_Parser extends GnuCTreeParser;
where X_Y is your rule number (omit the brackets). Following the options section you
may declare any class variables and methods you need for your rule by placing them
inside a curly brackets section (there is no keyword which precedes the curly brackets,
see GnuCTreeParser.g for an example of declaring class members). Next, copy into your
new grammar any grammar rules from GnuCTreeParser.g that will be modified with
actions for your rule. You do not need to copy everything from the tree parser, only what
you need (this is in contrast to lexer/parser rules, see section 3.1.1.2). For example, if you
only need to check something after the entire AST has been parsed, you only need to
copy the translationUnit rule and add an action before the semicolon which ends the
rule.
When building a rule of this type (see section 3.1.2), use the tree rule option in
build_rule.bat.
3.1.1.2 Lexer / Parser Rules
The lexer stage takes in a character input stream and outputs a stream of tokens to the
next stage, the parser. The lexer stage reads in characters one at a time and creates tokens
from them. The parser then takes in the tokens created by the lexer and builds an abstract
syntax tree as output. These two stages are combined into one grammar file, and thus
rules which utilize either of these stages derive from the C lexer/parser, GnuCParser.g,
located in the V2 directory. ANTLR V2 does not allow extensive inheritance of
grammars (GnuCParser.g already inherits from StdCParser.g; you cannot extend a
lexer/parser grammar that already extends another), so you must copy and paste the
contents of GnuCParser.g into the rule grammar to simulate inheritance. Some of the
names inside the grammar need to be changed. The line that declares the Parser class
class GnuCParser extends StdCParser;
must be changed to:
class Rule_{X_Y}_Parser extends StdCParser;
where {X_Y} (excluding the brackets) is the rule number of your rule grammar (as per the
naming scheme described earlier). A similar change must be done for the lexer. The
following
class GnuCLexer extends StdCLexer;
is changed to
class Rule_{X_Y}_Lexer extends StdCLexer;
Notice that the name of this class ends with Lexer, not Parser. After making these
changes, the only other changes to the grammar should be the addition of actions that are
required to implement your rule. No grammar rules should be deleted from the new
grammar.
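The two class declaration changes described above could be applied to a copy of
GnuCParser.g with a short text substitution. The following is a minimal sketch, not
project code: the class GrammarRenamer and its method are hypothetical, and the sketch
only rewrites the two declarations quoted in this section. Any other names inside the
grammar that need changing still have to be edited by hand.

```java
/** Hypothetical helper (not part of the project) for the copy-and-rename step. */
final class GrammarRenamer {
    private GrammarRenamer() {}

    /**
     * Rewrites the parser and lexer class declarations in the copied grammar text.
     * ruleId is the rule number in X_Y form, e.g. "5_2".
     */
    static String renameClasses(String grammarText, String ruleId) {
        String parserDecl = "class Rule_" + ruleId + "_Parser extends StdCParser;";
        String lexerDecl = "class Rule_" + ruleId + "_Lexer extends StdCLexer;";
        // String.replace performs literal (non-regex) substitution.
        return grammarText
            .replace("class GnuCParser extends StdCParser;", parserDecl)
            .replace("class GnuCLexer extends StdCLexer;", lexerDecl);
    }
}
```

The rest of the grammar text is returned unchanged, which matches the requirement that
no grammar rules be deleted from the new grammar.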
Symbol table information is easily accessible from within the parser, but the symbol table
is still being built during this stage. The completed symbol table is only available at the
end of the translationUnit rule. The lexer contains information concerning the
characters that were in the original input (for example, useful for examining the contents
of comments, which are ignored in later stages). The lexer is the only place currently
where #include and other preprocessing directives can be examined.
When building a rule of this type (see section 3.1.2), use the lexer rule option in
build_rule.bat. See rule 5.2 for an example of a lexer/parser rule.
3.1.1.3 Multiple Stage Rules
Some rules need information from or do processing in both the lexer/parser and the tree
parser. For example, you may need to check if a library was included by using the lexer
and then check the AST in the tree parser. This is accomplished through a multiple-stage
rule. A multiple stage rule consists of two or more grammar files where each stage calls
the next stage. Currently, the only multiple-stage rules in the system consist of a
lexer/parser for the first stage and a tree parser for the second stage, but multiple tree
parsers might also be a desirable setup for future rules (there should always only be one
lexer/parser stage).
To implement a multiple-stage rule where the first stage is a lexer/parser and the second
is a tree parser, start by writing a lexer/parser as described in section 3.1.1.2. Name it as
any other lexer/parser rule would be named (do not use the _Pass_N suffix) and do any
processing required for the lexer/parser stage of the rule by adding actions to the
grammar. Then add an action to translationUnit in the parser; in here the next stage
will be initiated. The code that initiates the next stage can be placed directly into this
action or can be placed in a method which is called from here:
translationUnit
    : ( externalList )? /* Empty source files are allowed. */
        {
            // initiate the next stage here
        }
    ;
Use the following code to initialize and begin the tree parser stage:
    // Instantiate and initialize tree parser
    Rule_X_Y_Parser_Pass_2 pass2parser = new Rule_X_Y_Parser_Pass_2();
    pass2parser.setASTNodeType(TNode.class.getName());
    TNode.setTokenVocabulary("GNUCTokenTypes");
    AST pass1tree = this.getAST();
    // Optional: Get the lexer object from this stage and pass
    // it to the tree parser.
    Rule_X_Y_Lexer lexer =
        (Rule_X_Y_Lexer) this.getInputState().getInput().getInput();
    pass2parser.setPreviousPass(lexer, this);
    try
    {
        pass2parser.translationUnit(pass1tree);
    }
    catch (RecognitionException e)
    {
        System.err.println(e); // report the error and continue
    }
The first four statements of this code are required to declare and initialize the tree
parser. The tree parser is then executed in the try-catch block at the bottom. The two
optional statements in the middle get the current lexer object and pass it to the tree
parser. The setPreviousPass method is used here as an example of how to pass lexer
information to the next stage; it is a method that was added to the tree parser
specifically for this rule.
The second stage, the tree parser, is implemented just like any other tree parser, as
described in section 3.1.1.1. The only differences are that you should name this grammar
with the _Pass_N suffix (see 3.1.1), where N is a stage number greater than 1 (e.g.
_Pass_2 for the second stage), and you need to have some way of accessing information
from previous stages, such as implementing a setPreviousPass method as described
earlier (using this method, the first stage passes information to the second stage).
The rules are built separately. The lexer/parser is built with the lexer rule option in
build_rule.bat and the tree parser is built with the tree rule option. The Compliance
Checker only needs to load the first stage since each stage loads the next stage, so when
adding the rule to the system database, only the first stage needs to be added. However,
all stages must be compiled and be in the src directory.
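The hand-off between stages can be illustrated without ANTLR. The sketch below is a
simplified model only: FirstStageSketch and SecondStageSketch are hypothetical stand-ins
for the generated lexer/parser and tree parser classes, and the check they perform
(reporting calls to signal only when signal.h was included) is an invented rule chosen
for illustration, not one of the project's rules.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the first stage; records facts that are only visible to the lexer. */
final class FirstStageSketch {
    private final List<String> includedHeaders = new ArrayList<String>();

    void recordInclude(String header) {
        includedHeaders.add(header);
    }

    boolean includes(String header) {
        return includedHeaders.contains(header);
    }

    /** Mirrors the action at the end of translationUnit: hand over, then run stage two. */
    List<String> finish(List<String> calledFunctions) {
        SecondStageSketch next = new SecondStageSketch();
        next.setPreviousPass(this);
        return next.check(calledFunctions);
    }
}

/** Stand-in for the second stage; relies on information from the previous stage. */
final class SecondStageSketch {
    private FirstStageSketch previous;

    void setPreviousPass(FirstStageSketch previous) {
        this.previous = previous;
    }

    /** Invented rule: report calls to signal when signal.h was included. */
    List<String> check(List<String> calledFunctions) {
        List<String> violations = new ArrayList<String>();
        if (previous != null && previous.includes("signal.h")) {
            for (String name : calledFunctions) {
                if (name.equals("signal")) {
                    violations.add(name);
                }
            }
        }
        return violations;
    }
}
```

Only the first stage is created by the caller; the second stage is created and started
by the first, which is why only the first stage needs a database entry.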
3.1.2 Compiling rules
Once a grammar is written, run build_rule.bat and follow the instructions. This batch file
will generate Java classes from grammar files using ANTLR and copy these classes to the
src directory. It will first prompt you for the name of the grammar file which is to be built
(include the “.g” extension); then it will ask whether it is a lexer or tree rule.
Alternatively, to avoid the prompts, you can supply these two answers as parameters to
build_rule.bat as follows:
build_rule.bat Rule_{X_Y}_Parser{_Pass_N}.g {tree|lexer}
Replace the bracketed values with the actual values for your rule (do not include the
brackets).
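When several rules are rebuilt at once, the argument list for the non-interactive form
can be assembled in code. This is a sketch under assumptions: BuildRuleCommand is a
hypothetical helper, and it only builds the argument list shown above without running
anything.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of the project) that assembles build_rule.bat arguments. */
final class BuildRuleCommand {
    private BuildRuleCommand() {}

    /**
     * Returns the command for one grammar file.
     * treeRule selects the "tree" option; otherwise the "lexer" option is used.
     */
    static List<String> forGrammar(String grammarFile, boolean treeRule) {
        // The batch file expects the file name to include the ".g" extension.
        if (!grammarFile.endsWith(".g")) {
            throw new IllegalArgumentException("Include the .g extension: " + grammarFile);
        }
        return Arrays.asList("build_rule.bat", grammarFile, treeRule ? "tree" : "lexer");
    }
}
```

The resulting list could be passed to java.lang.ProcessBuilder, started from the
directory that contains build_rule.bat; on Windows, batch files are often launched
through cmd /c.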
The script will inform you whether building the rule succeeded or failed. On success,
ANTLR generates Java classes from the specified grammar, the batch file copies the
classes to the src directory, and the batch file displays a success message when it
finishes, as in the following screenshot:
[Screenshot: build_rule.bat success message]
If rule building failed (usually because there are syntax errors in the grammar), ANTLR
will not create Java classes and the batch file will not copy any files to src. The
batch file will display a failure message if this happens, as shown below:
[Screenshot: build_rule.bat failure message]
Once the classes have been generated and are in the src directory, they need to be edited
in order to compile and prevent the system from throwing errors. For lexer/parser rules,
both the lexer and the parser generated classes need to be edited. The generated lexer
class, with the name of Rule_{X_Y}_Lexer.java, needs to implement the RuleLexer
interface; this interface must be appended to the implements list in the class declaration,
as in the example below:
public class Rule_20_8_Lexer extends antlr.CharScanner implements
Rule_20_8_LexerTokenTypes, TokenStream, RuleLexer
Additionally, in this Java file the following import statements must be commented out or
deleted (to avoid a known issue with packages):
import CToken;
import java.io.*;
import LineObject;
import antlr.*;
For the generated parser class of the lexer rule, named Rule_{X_Y}_Parser.java, the
interface RuleParser must be implemented by the parser, so it must be appended to
the implements list in the class declaration as shown below:
public class Rule_20_8_Parser extends antlr.LLkParser
implements GNUCTokenTypes, RuleParser
For tree parser rules, the parser must implement RuleTreeParser. Add this interface to
the implements list as shown in the example tree parser rule below:
public class Rule_6_4_Parser extends antlr.TreeParser
implements Rule_6_4_ParserTokenTypes, RuleTreeParser
Note that for multi-stage rules, the second and later stages do not need to implement the
Rule interfaces (RuleLexer, RuleParser, RuleTreeParser) because they are not loaded by
the main program; they are loaded by the preceding stage of the rule. However, the
problematic import statements for lexer classes must always be commented out.
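The manual edits to a generated lexer class could be scripted. The sketch below is not
project code: GeneratedLexerPatcher is a hypothetical helper, and it assumes that the
implements list of the generated lexer ends with TokenStream, as in the Rule_20_8_Lexer
example above. It comments out the four import statements listed earlier and appends
RuleLexer to the implements list.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of the project) that patches generated lexer source. */
final class GeneratedLexerPatcher {
    private static final List<String> PROBLEM_IMPORTS = Arrays.asList(
        "import CToken;", "import java.io.*;", "import LineObject;", "import antlr.*;");

    private GeneratedLexerPatcher() {}

    static String patch(String source) {
        String[] lines = source.split("\n", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            // Comment out a line only when it is exactly one of the listed imports.
            if (PROBLEM_IMPORTS.contains(line.trim())) {
                line = "// " + line;
            }
            out.append(line);
            if (i < lines.length - 1) {
                out.append('\n');
            }
        }
        String patched = out.toString();
        // Assumption: TokenStream is the last interface before the class body.
        if (!patched.contains("RuleLexer")) {
            patched = patched.replaceFirst("(implements[^{]*?TokenStream)", "$1, RuleLexer");
        }
        return patched;
    }
}
```

Running the patch a second time leaves the text unchanged, so it is safe to apply after
every rebuild. The parser and tree parser classes would need the analogous change with
RuleParser or RuleTreeParser.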
Once the rule is included into the main application as a rule to be run, it can be tested
accordingly.
3.1.3 Including Rules
3.1.3.1 Using the Tool
In order to include rules in the system, they must first be added to the rule database and
then added to the list of rules to be checked. The rule configuration tool is the easiest
way to perform these tasks. To add rules to the database, start the rule configuration tool
with the command line argument admin. The rule configuration tool will open with two
tabs, one for editing the list of rules that are being checked, and one for editing the rules
in the system's database. Go to the tab labeled Edit Rules. Select the option NEW from
the dropdown list at the top. Enter the rule number (X.Y) in the rule name, select
whether the rule is advisory or required, and enter a description of the rule. Then,
select either lexer/parser or tree-parser from the Rule Type dropdown box. Rules using a
tree-parser should use the tree-parser option. Rules using a lexer and parser
combination, as well as multi-stage rules (using a lexer, parser, and tree-parser), should
use the lexer/parser option. Enter the class name for the appropriate files. Lastly, select all
associated consequences. Press Add New Rule to add the rule to the database. To edit an
existing rule, select the rule to be edited from the dropdown. Edit the fields you wish to
change and press the Update Rule button. Press the Save button once all rule additions
have been made.
To change the configuration of rules to be checked, go to the Checked Rules tab. To add
rules to the checked list from the list of unchecked rules, select the rules from the list of
unchecked rules (the list on the left), and press the >> button to move them to the
checked list. To remove rules from the list of those to be checked, select them from the
list of checked rules (the rules on the right), and press the << button to move them to the
list of unchecked rules. Also, you can select one of the preset options from the list at the
bottom. Select an option and click the Apply button to change the list of checked rules to
the chosen preset. Press the Save button when all changes have been made.
3.1.3.2 Adding New Rules to the Database
A lightweight database powered by HSQLDB is used for the persistent storage of rule,
consequence, and file extension information and configuration settings. The system
figures out which recognizers to instantiate and run against the input files by querying
this database. Therefore, a rule entry must be inserted into the database before that
particular rule may be checked by the system. Below is a step-by-step guide for the
addition of one rule:
1. Add an insert statement to the Resources\SQL Scripts\rules.sql file.
2. Start a command prompt.
3. Navigate to the root directory of the project.
4. Enter this command: "java -cp Resources\hsqldb.jar
org.hsqldb.util.DatabaseManagerSwing".
5. Make the prompt look like this: (Change Type and URL)
6. Copy your insert statement into the text field:
7. Execute SQL. If it tells you there was an update, you were successful.
8. Clear the text field and enter "SHUTDOWN".
9. Execute SQL one more time. This ensures proper shutdown of the DB.
10. Make sure you commit any changes to the database (miserdb.script) and SQL
scripts to revision control.
3.1.4 Testing Rules / Tips
A project like this is easier to develop using test-driven development. Write tests before
writing the rules, so that each part of a rule can be tested as it is implemented. Carefully
consider all equivalence classes and all of the quirky things that can be done in C so that
every case can be addressed.
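One way to organize such tests is a table with one representative input per equivalence
class. The sketch below is an illustration under assumptions: CommentRuleTestSketch is
hypothetical, and countCppComments is a hand-written stand-in that detects C++-style
comments so that the example runs on its own. In the real system the check would be
performed by the ANTLR-generated recognizer for the rule under test.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical table-driven test sketch (not part of the project). */
final class CommentRuleTestSketch {
    private CommentRuleTestSketch() {}

    /** Stand-in checker: counts "//" comments outside literals and block comments. */
    static int countCppComments(String c) {
        int count = 0;
        int i = 0;
        int n = c.length();
        while (i < n) {
            char ch = c.charAt(i);
            if (ch == '"' || ch == '\'') {
                // Skip a string or character literal, honouring backslash escapes.
                char quote = ch;
                i++;
                while (i < n && c.charAt(i) != quote) {
                    i += (c.charAt(i) == '\\') ? 2 : 1;
                }
                i++; // step past the closing quote
            } else if (c.startsWith("/*", i)) {
                int end = c.indexOf("*/", i + 2);
                i = (end < 0) ? n : end + 2;
            } else if (c.startsWith("//", i)) {
                count++;
                int eol = c.indexOf('\n', i);
                i = (eol < 0) ? n : eol + 1;
            } else {
                i++;
            }
        }
        return count;
    }

    /** One input per equivalence class, mapped to the expected violation count. */
    static Map<String, Integer> cases() {
        Map<String, Integer> cases = new LinkedHashMap<String, Integer>();
        cases.put("int x;", 0);                          // no comments
        cases.put("int x; /* ok */", 0);                 // C-style comment only
        cases.put("int x; // bad\n", 1);                 // one C++-style comment
        cases.put("int a; // one\nint b; // two", 2);    // several violations
        cases.put("char *s = \"http://x\";", 0);         // quirk: // inside a string
        cases.put("/* // inside */ int y;", 0);          // quirk: // inside a block comment
        return cases;
    }

    /** Returns the inputs whose actual count differs from the expected count. */
    static List<String> failures() {
        List<String> failed = new ArrayList<String>();
        for (Map.Entry<String, Integer> e : cases().entrySet()) {
            if (countCppComments(e.getKey()) != e.getValue()) {
                failed.add(e.getKey());
            }
        }
        return failed;
    }
}
```

Writing the table first makes the quirky cases explicit before any grammar actions are
written; a new case is added by inserting one more entry.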
APPENDIX A - DELIVERABLES INDEX
Maintenance Guide Deliverables Index Template:
This template is used to list the formal deliverables relevant to the maintenance of a
system.
DELIVERABLES INDEX
Name             Used for Task                 Description                      Location
SRS              Requirements Management       System Requirements              Docs/SRS.doc
Design Document  View System Design            System Design                    Docs/Miser-C Design.doc
User Manual      Using the System              Instructions for System Use      Docs/MISRA C User Manual.doc
Use Cases        Understand System Flow        List of Use Cases of the System  Docs/Use Cases.doc
Risk Management  Understand Development Risks  Risks and Mitigation Strategies  Docs/RiskManagement.xls