Automatic Conversion of SQL Statement into Clamshell Diagram

Takehiko MURAKAWA, Atsushi TSUJIMOTO,
Kazunori MATSUO and Masaru NAKAGAWA
Wakayama University, Japan
JCKBSE'10 - August 25, 2010.
Purpose of My Talk
SELECT title FROM (SELECT MAX(salary) AS max_salary
FROM joblist) AS j, joblist WHERE salary = max_salary
[Figure: clamshell diagram of this statement]
SQL
• SQL is a practical programming language for querying relational databases.
• We want to know whether an SQL statement is OK.
  - Is the statement grammatically correct? ... Execute it!
  - How long does it take to run? ... Execute it!
  - Are the programmer and his/her supervisor convinced that the statement is appropriate? ... ???
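The first two checks can be illustrated with a short script. This is a minimal sketch in Python using the standard `sqlite3` module; the table contents and the helper name `check` are hypothetical and serve only to run the example query from this talk. It cannot answer the third question, which is the one the diagram addresses.

```python
import sqlite3
import time

# Hypothetical sample data; only the column names come from the example query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE joblist (title TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO joblist VALUES (?, ?)",
    [("clerk", 30000), ("engineer", 50000), ("manager", 70000)],
)

QUERY = (
    "SELECT title FROM (SELECT MAX(salary) AS max_salary "
    "FROM joblist) AS j, joblist WHERE salary = max_salary"
)


def check(connection, sql):
    """Execute sql; return (rows, error, elapsed seconds)."""
    start = time.perf_counter()
    try:
        rows = connection.execute(sql).fetchall()
        error = None
    except sqlite3.Error as exc:  # a grammatical mistake ends up here
        rows, error = None, exc
    return rows, error, time.perf_counter() - start


rows, error, elapsed = check(conn, QUERY)
print(rows, error, f"{elapsed:.6f} s")
```

Executing the statement reports syntax errors and run time, but it says nothing about whether a human reader is convinced that the statement is appropriate.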
Clamshell Diagram
• A symmetrical double tree that represents two hierarchies and a problem-solution relationship.
• Each node holds a piece of information.
• I first presented the diagram at the previous JCKBSE and have since applied it to program understanding for C and SQL.
Clamshell Diagram for SQL Statement
• Left-hand tree: the SQL statement. Right-hand tree: its description in English.
• Traversing the left-hand tree in preorder restores the SQL statement, while traversing the right-hand tree produces the description.
• A statement with a subquery is transformed into a clamshell diagram that contains another clamshell diagram.
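The preorder property can be sketched in a few lines of Python. The `Node` class, the choice to store both labels in one node, and the shape of the example tree are assumptions made for illustration; they are not taken from the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    sql: str       # label in the left-hand tree
    english: str   # label of the mirrored node in the right-hand tree
    children: list = field(default_factory=list)


def preorder(node, side):
    """Yield the labels of one side ("sql" or "english") in preorder."""
    yield getattr(node, side)
    for child in node.children:
        yield from preorder(child, side)


# Hypothetical tree shape for a query without a subquery.
diagram = Node("SELECT", "get", [
    Node("title", "title"),
    Node("FROM", "from", [Node("joblist", "joblist")]),
    Node("WHERE", "where", [Node("salary=x", "salary is equal to x")]),
])

sql_text = " ".join(preorder(diagram, "sql"))
description = " ".join(preorder(diagram, "english"))
print(sql_text)
print(description)
```

Keeping the two labels in one node is only a convenience of this sketch: it guarantees that both sides are visited in the same order.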
Diagram drawn in 2009
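[Figure: clamshell diagram with a nested diagram for the subquery. Paired node labels (left-hand tree / right-hand tree): SELECT / get, title / title, FROM / from, joblist / joblist, WHERE / where, salary=x / salary is equal to x, MAX(x) / maximum of x, salary / salary. The nested diagram repeats SELECT / get, FROM / from, and joblist / joblist.]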
Next step
• Until now, the clamshell diagrams were drawn by hand. That is error-prone and inefficient when diagrams are needed for two or more lookalike statements.
• We attempt to produce the diagram by means of a program.
From SQL Statement to Clamshell Diagram
SQL statement
  ↓ 1. Parse through lexical and syntactic analyses
Syntax tree
  ↓ 2. Reconfigure the tree
Left-hand tree
  ↓ 3. Construct the right half
Right-hand tree
  ↓
Clamshell diagram
1. Parsing
• We adopted Racc, a parser generator library for Ruby, for lexical and syntactic analyses.
• The syntax rules come from the SQLite specification, limited to the SELECT statement.
  - They include recursive rules and the rules for expressions.
• We introduced 53 nonterminal symbols and defined 186 grammar rules.
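To make the lexical step concrete, here is a minimal tokenizer sketch. It is written in Python with the standard `re` module rather than Ruby and Racc, and the token kinds, the keyword set, and the function name `tokenize` are assumptions of this sketch, not the grammar used in the talk.

```python
import re

# Hypothetical, deliberately tiny keyword set.
KEYWORDS = {"SELECT", "FROM", "WHERE", "AS"}

TOKEN_RE = re.compile(
    r"(?P<NUMBER>\d+(?:\.\d+)?)"
    r"|(?P<NAME>[A-Za-z_]\w*)"
    r"|(?P<OP><=|>=|<>|!=|[-+*/=<>(),.])"
    r"|(?P<SPACE>\s+)"
    r"|(?P<BAD>.)"
)


def tokenize(sql):
    """Split an SQL string into (kind, text) pairs."""
    tokens = []
    for match in TOKEN_RE.finditer(sql):
        kind, text = match.lastgroup, match.group()
        if kind == "SPACE":
            continue  # whitespace only separates tokens
        if kind == "BAD":
            raise ValueError(f"unexpected character {text!r}")
        if kind == "NAME" and text.upper() in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens


print(tokenize("SELECT title FROM joblist WHERE salary > 50000"))
```

A parser generator such as Racc would consume a token stream of this kind and apply the grammar rules to build the syntax tree.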
Syntax Tree Is Not Exactly Left-hand Tree
• While the pieces of the SQL statement appear on the leaf nodes, all the internal nodes are bound to nonterminal symbols.
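[Figure: Left, the syntax tree, with the internal node [expr] above the leaves salary, >, and 50000. Right, the description, with salary, "is larger than", and 50000; the counterpart of [expr] is marked "?".]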
2. Tree reconfiguration
• The nodes around binary operators, function calls, and parentheses without function calls are changed by specific rules so that
  - the node associated with a nonterminal symbol can be removed, and
  - traversing the tree can still restore the SQL statement.
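One way to remove nonterminal nodes while keeping the traversal order is to promote the first child into the place of its nonterminal parent. The Python sketch below uses that promotion rule as a stand-in; it is an assumption for illustration, not the specific rules the talk applies to binary operators, function calls, and parentheses. The bracket convention for nonterminal labels is also assumed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def is_nonterminal(node):
    # Assumption: nonterminal labels are written in brackets, e.g. "[expr]".
    return node.label.startswith("[") and node.label.endswith("]")


def reconfigure(node):
    """Return a copy of the tree without nonterminal nodes (hypothetical rule)."""
    kids = [reconfigure(child) for child in node.children]
    if is_nonterminal(node) and kids:
        # Promote the first child; its former siblings follow its own children,
        # so the preorder sequence of the remaining labels is unchanged.
        head, rest = kids[0], kids[1:]
        return Node(head.label, head.children + rest)
    return Node(node.label, kids)


def preorder(node):
    yield node.label
    for child in node.children:
        yield from preorder(child)


syntax_tree = Node("[expr]", [
    Node("[expr]", [Node("salary")]),
    Node(">"),
    Node("[expr]", [Node("50000")]),
])

left_tree = reconfigure(syntax_tree)
print(" ".join(preorder(left_tree)))
```

After the transformation every node carries a piece of the statement, and a preorder traversal still reads the expression from left to right.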
Examples (1)
[Figure: reconfigured trees for three expressions]
x+1: nodes x, +, 1
x+1-1: nodes x, +, 1, -, 1
x+1*0: nodes x, +, 1, *, 0
Examples (2)
• f(x, y)=z
[Figure: reconfigured tree with the nodes f (), x, the comma, y, =, and z]
3. Constructing Right-hand Tree
• The right-hand tree is isomorphic to the left-hand tree, and the label on each node is mostly the same.
• Special words are replaced.
SQL token       Substitute word
SELECT          get
* (asterisk)    records
, (comma)       and
. (period)      's
f ()            f of
Note: "*" is replaced unless it is used as an operator.
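The substitution table can be expressed as a small lookup. This Python sketch covers only the five rows listed above; the function name `substitute`, the `is_operator` flag, and the choice to leave every other token (including the operator `*`) unchanged are assumptions of the sketch.

```python
# The four fixed rows of the table; "f ()" is handled separately below.
SUBSTITUTES = {
    "SELECT": "get",
    "*": "records",
    ",": "and",
    ".": "'s",
}


def substitute(token, is_operator=False):
    """Return the right-hand label for one left-hand label (sketch)."""
    if token == "*" and is_operator:
        # Assumption: the multiplication sign keeps its label.
        return token
    if token.endswith("()"):
        # Function node, e.g. "MAX ()" becomes "MAX of".
        return token[:-2].strip() + " of"
    # Assumption: labels without a table entry are copied unchanged.
    return SUBSTITUTES.get(token, token)


for label in ["SELECT", "*", ",", ".", "MAX ()", "salary"]:
    print(label, "->", substitute(label))
```

Because the right-hand tree has the same shape as the left-hand tree, applying such a function to every node label is enough to build it.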
Generated Diagram
• Note on reading: a pair of parentheses encloses the subtree of its first child, and the remaining child nodes follow immediately after it.
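The reading rule for parentheses can be sketched as a modified traversal. In this Python sketch the `Node` class, the function name `read`, and the shape of the example tree for f(x, y)=z are assumptions; only the rule itself comes from the note above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


def read(node):
    """Return the tokens of the subtree rooted at node, in reading order."""
    if node.label.endswith("()") and node.children:
        # Parenthesis node: the first child's subtree goes inside the pair,
        # and the remaining children are read just behind it.
        name = node.label[:-2].strip()
        first, rest = node.children[0], node.children[1:]
        tokens = ([name] if name else []) + ["("] + read(first) + [")"]
    else:
        tokens, rest = [node.label], node.children
    for child in rest:
        tokens += read(child)
    return tokens


# Hypothetical tree shape for f(x, y)=z.
tree = Node("f ()", [
    Node("x", [Node(","), Node("y")]),
    Node("="),
    Node("z"),
])

print("".join(read(tree)))
```

A label of "()" without a name is treated here as a plain pair of parentheses, matching the case of parentheses without function calls.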
Applying to other SQL Statements
• Three statements that use subqueries to retrieve the titles with the maximum salary all yield valid diagrams.
• The query for calculating the median leads to a diagram with 212 nodes in total.
  - The handwritten diagram consists of 88 nodes. The syntactic analysis in today's talk atomizes the components of the SQL statement.
Proposed: salary, is larger than, 50000
Conventional: salary is larger than 50000
Future work
• Evaluation of readability.
• Comparison with indented, colored SQL statements.
• Expressing the description in Japanese (though some might find that unpleasant when we submit the results to an international conference).