
“An automated tool designed to ease the
pain of test creation and maintenance.”
Nil Weerasinghe
Bryan Robbins
Mohamed Ibrahim
About FINRA
■ Financial Industry Regulatory
Authority
• Largest independent regulator for all
securities firms doing business in the U.S.
• ~4,500 brokerage firms
• ~163,500 branch offices
• ~634,400 registered securities
representatives
Our Mission:
Investor Protection.
Market Integrity.
• Providing independent, vigorous regulation
• Educating & informing investors
• Computerized certification and continued education (Series 7, 63 …etc.)
• Inviting active industry involvement & input
• Actively supporting firms’ compliance efforts
American University Presentation Copyright 2011 FINRA
FINRA Open Source Projects
■ Increase Community Involvement
■ FINRA Open Source Projects
• http://finraos.github.io/
■ DataGenerator
• http://finraos.github.io/DataGenerator/
■ JTAF-ExtWebDriver
• http://finraos.github.io/JTAFExtWebDriver/
How to get involved.
■ Use it
■ Extend it
• Fork it
• Discuss ideas
– Open ticket
– Google group discussion
– [email protected]
• Commit
– DCO and Apache License 2.0
■ Report bugs
■ Help document
http://finraos.github.io/DataGenerator/
https://github.com/FINRAOS/DataGenerator
Agenda
• What is the DataGenerator?
• Demo.
– Dependency Modeling
– Pairwise Data Generation.
• Current Limitations.
• Re-architecture plan.
• Questions
Video
http://finraos.github.io/DataGenerator/
http://www.youtube.com/watch?v=Wxa1T0gp56k
Current Approach
[Diagram: DataSpec, Model, Datasets, Outputs]
■ Two ways to describe and generate datasets
• Equivalence Classes + Combinations
• Dependency Model + Graph Coverage
■ Both use Apache Velocity to generate output from templates
Demo
■ Pairwise Combinations
• Uses equivalence classes from the DataSpec to populate datasets
■ All Paths
• Uses annotations from the graphical model to populate datasets
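To illustrate the pairwise idea, here is a minimal sketch in Python. It is not DataGenerator's implementation: the greedy heuristic, the function name pairwise, and the values "Equity" and "Bond" are assumptions made for the example. Each generated row is a full combination, and rows are added until every pair of values across any two variables appears at least once.

```python
from itertools import combinations, product

def pairwise(classes):
    """Greedy pairwise cover: every value pair of every two variables lands in some row."""
    names = list(classes)
    # Every (var_a, value_a, var_b, value_b) pair that must appear at least once.
    uncovered = {
        (a, va, b, vb)
        for a, b in combinations(names, 2)
        for va in classes[a]
        for vb in classes[b]
    }
    # Candidate rows are all full combinations (fine for small specs only).
    candidates = [dict(zip(names, values)) for values in product(*classes.values())]

    def pairs_of(row):
        return {(a, row[a], b, row[b]) for a, b in combinations(names, 2)}

    rows = []
    while uncovered:
        # Pick the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda row: len(pairs_of(row) & uncovered))
        uncovered -= pairs_of(best)
        rows.append(best)
    return rows

# Hypothetical equivalence classes, three values per variable.
spec = {
    "RECORD_TYPE": ["a", "b", "c"],
    "REQUEST_IDENTIFIER": [1, 2, 3],
    "PRODUCT_TYPE_CODE": ["Equity", "Derivatives-Options", "Bond"],
}
rows = pairwise(spec)
print(len(rows), "rows instead of", 3 ** 3)
```

The sketch enumerates all full combinations as candidates, so it only suits small specs; it is meant to show the coverage goal, not a scalable algorithm.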
Limitations of Current Approach
■ Limited set of graph annotations
• Can only set variable values within the model
• No support for logic or pos/neg equivalence classes in the current version
• We need more powerful annotations
■ Logic often split across spec, model, and templates
• Anything dynamic must be injected into the Velocity template, as the model and spec are both static
• We need more dynamic evaluation
■ Performance considerations
• Breadth-first enumeration doesn’t scale well as the domain becomes more complex
• We need a more performant implementation
Re-architecting Data Generator
■ Replacing Visio with SCXML, an open standard to represent the state machine.
<scxml xmlns="http://www.w3.org/2005/07/scxml"
       xmlns:cs="http://commons.apache.org/scxml"
       version="1.0"
       initial="start">
  <state id="start">
    <transition event="RECORD_TYPE" target="RECORD_TYPE"/>
  </state>
  <state id="RECORD_TYPE">
    <!-- Mandatory -->
    <onentry>
      <assign name="var_out_RECORD_TYPE" expr="set:{a,b,c}"/>
    </onentry>
    <transition event="REQUEST_IDENTIFIER"
                target="REQUEST_IDENTIFIER"/>
  </state>
.
.
.
Re-architecting Data Generator
■ SCXML allows complex modelling using embedded EL (expression language)
<state id="PRODUCT_TYPE_CODE">
  <!-- Mandatory -->
  <onentry>
    <assign name="var_out_PRODUCT_TYPE_CODE" expr="#ProductTypeCode_Cycle"/>
  </onentry>
  <transition event="OPTIONS_SYMBOLOGY_IDENTIFIER" target="OPTIONS_SYMBOLOGY_IDENTIFIER"
              cond="${var_out_PRODUCT_TYPE_CODE=='Derivatives-Options'}"
  />
  <transition event="OPTIONAL_SECURITY_SYMBOL" target="OPTIONAL_SECURITY_SYMBOL"
              cond="${var_out_PRODUCT_TYPE_CODE!='Derivatives-Options'}"
  />
</state>
.
.
.
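The effect of the two cond attributes can be sketched outside SCXML. The Python below is an illustration only, not how Apache Commons SCXML evaluates EL: the TRANSITIONS table and the enabled_targets helper are hypothetical, and "Equity" is an assumed example value. Each transition carries a guard, and only transitions whose guard holds for the current variables are followed.

```python
# Each transition is (target_state, guard); the guard sees the variables assigned so far.
TRANSITIONS = {
    "PRODUCT_TYPE_CODE": [
        ("OPTIONS_SYMBOLOGY_IDENTIFIER",
         lambda v: v["var_out_PRODUCT_TYPE_CODE"] == "Derivatives-Options"),
        ("OPTIONAL_SECURITY_SYMBOL",
         lambda v: v["var_out_PRODUCT_TYPE_CODE"] != "Derivatives-Options"),
    ],
}

def enabled_targets(state, variables):
    """Return the targets whose guard condition holds for the current variables."""
    return [target for target, guard in TRANSITIONS.get(state, []) if guard(variables)]

print(enabled_targets("PRODUCT_TYPE_CODE",
                      {"var_out_PRODUCT_TYPE_CODE": "Derivatives-Options"}))
# prints ['OPTIONS_SYMBOLOGY_IDENTIFIER']
print(enabled_targets("PRODUCT_TYPE_CODE",
                      {"var_out_PRODUCT_TYPE_CODE": "Equity"}))
# prints ['OPTIONAL_SECURITY_SYMBOL']
```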
Re-architecting Data Generator
■ SCXML allows complex modelling: a state can itself be written as a state machine
■ We’re using Apache Commons SCXML in our POC
Re-architecting Data Generator
■ Overcoming memory issues by enhancing the all-paths algorithm: use depth-first search (DFS), which has minimal memory overhead
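A minimal Python sketch of why DFS keeps memory low when enumerating all paths. This is an illustration, not the project's code: the function all_paths and the edges in graph are assumptions (the state names are borrowed from the slides). Paths are yielded one at a time, so only the current path and one iterator per level are held, whereas a breadth-first enumeration keeps every partial path in its queue.

```python
def all_paths(graph, start, end):
    """Yield every path from start to end using an explicit DFS stack.

    Memory grows with path depth, not with the number of paths.
    Assumes the graph is acyclic.
    """
    path = [start]
    stack = [iter(graph.get(start, []))]  # one child iterator per node on the path
    if start == end:
        yield list(path)
    while stack:
        nxt = next(stack[-1], None)
        if nxt is None:
            # All children of the deepest node are done: backtrack.
            stack.pop()
            path.pop()
            continue
        path.append(nxt)
        if nxt == end:
            yield list(path)
            path.pop()
        else:
            stack.append(iter(graph.get(nxt, [])))

# Hypothetical model graph; the edges are assumed for the example.
graph = {
    "start": ["RECORD_TYPE"],
    "RECORD_TYPE": ["PRODUCT_TYPE_CODE"],
    "PRODUCT_TYPE_CODE": ["OPTIONS_SYMBOLOGY_IDENTIFIER", "OPTIONAL_SECURITY_SYMBOL"],
    "OPTIONS_SYMBOLOGY_IDENTIFIER": ["end"],
    "OPTIONAL_SECURITY_SYMBOL": ["end"],
}
for p in all_paths(graph, "start", "end"):
    print(" -> ".join(p))
```

Because all_paths is a generator, each path can be turned into a dataset and written out before the next one is computed.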
Re-architecting Data Generator
■ Short demo:
<scxml xmlns="http://www.w3.org/2005/07/scxml" xmlns:cs="http://commons.apache.org/scxml" version="1.0"
       initial="start">
  <state id="start">
    <transition event="RECORD_TYPE" target="RECORD_TYPE"/>
  </state>
  <state id="RECORD_TYPE">
    <onentry>
      <assign name="var_out_RECORD_TYPE" expr="set:{a,b,c}"/>
    </onentry>
    <transition event="REQUEST_IDENTIFIER" target="REQUEST_IDENTIFIER"/>
  </state>
  <state id="REQUEST_IDENTIFIER">
    <onentry>
      <assign name="var_out_REQUEST_IDENTIFIER" expr="set:{1,2,3}"/>
    </onentry>
    <transition event="MANIFEST_GENERATION_DATETIME" target="MANIFEST_GENERATION_DATETIME"/>
  </state>
  <state id="MANIFEST_GENERATION_DATETIME">
    <onentry>
      <assign name="var_out_MANIFEST_GENERATION_DATETIME" expr="#{nextint}"/>
    </onentry>
    <transition target="end"/>
  </state>
  <state id="end">
  </state>
</scxml>
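A rough Python sketch of how such a model might be expanded into rows. Everything here is an assumption made for illustration: the expand function is hypothetical, the model is a shortened copy of the demo (the MANIFEST_GENERATION_DATETIME state and its #{nextint} expression are left out), and "set:{...}" is assumed to list the allowed values of a variable. The sketch follows the first transition of each state and takes the cross product of the value sets it finds; it is not the behaviour of Apache Commons SCXML or of DataGenerator.

```python
import itertools
import xml.etree.ElementTree as ET

NS = {"sc": "http://www.w3.org/2005/07/scxml"}

# Shortened copy of the demo model (assumption: a linear chain of states).
MODEL = """
<scxml xmlns="http://www.w3.org/2005/07/scxml" version="1.0" initial="start">
  <state id="start">
    <transition event="RECORD_TYPE" target="RECORD_TYPE"/>
  </state>
  <state id="RECORD_TYPE">
    <onentry><assign name="var_out_RECORD_TYPE" expr="set:{a,b,c}"/></onentry>
    <transition event="REQUEST_IDENTIFIER" target="REQUEST_IDENTIFIER"/>
  </state>
  <state id="REQUEST_IDENTIFIER">
    <onentry><assign name="var_out_REQUEST_IDENTIFIER" expr="set:{1,2,3}"/></onentry>
    <transition target="end"/>
  </state>
  <state id="end"/>
</scxml>
"""

def expand(model_xml):
    """Walk the chain of states and return one dict per combination of values."""
    root = ET.fromstring(model_xml)
    states = {s.get("id"): s for s in root.findall("sc:state", NS)}
    names, domains = [], []
    current = root.get("initial")
    while current is not None:
        state = states[current]
        for assign in state.findall("sc:onentry/sc:assign", NS):
            expr = assign.get("expr")
            # Assumed semantics: "set:{x,y}" enumerates values; anything else is a literal.
            if expr.startswith("set:{") and expr.endswith("}"):
                values = expr[len("set:{"):-1].split(",")
            else:
                values = [expr]
            names.append(assign.get("name"))
            domains.append(values)
        # Follow the first transition only; a state without one ends the walk.
        transition = state.find("sc:transition", NS)
        current = transition.get("target") if transition is not None else None
    return [dict(zip(names, combo)) for combo in itertools.product(*domains)]

rows = expand(MODEL)
print(len(rows))  # prints 9
print(rows[0])    # prints {'var_out_RECORD_TYPE': 'a', 'var_out_REQUEST_IDENTIFIER': '1'}
```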
Re-architecting Data Generator
■ Restructure the code so that Hadoop MapReduce and Giraph can operate on it.
■ Data Generator itself won’t depend directly on Hadoop or Giraph, but will abstract the following:
• Input: allow input from files
• Execution: allow execution to start from a middle state, given input variables
• Output: allow output in different formats: a text file, several files, gz. Users will be able to extend the output to support sequence files, Redshift, HBase
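One way to picture the output abstraction, as a hedged Python sketch. The names RowWriter, DelimitedTextWriter and write_rows are hypothetical and are not part of DataGenerator. The generator only calls write_row, so plain text, gzip, or a user-supplied backend can be swapped in without touching generation code.

```python
import gzip
import io
from abc import ABC, abstractmethod

class RowWriter(ABC):
    """Hypothetical output abstraction: the generator only ever calls write_row()."""

    @abstractmethod
    def write_row(self, row):
        ...

class DelimitedTextWriter(RowWriter):
    """Writes each row as one delimited line to any text stream."""

    def __init__(self, stream, delimiter="|"):
        self.stream = stream
        self.delimiter = delimiter

    def write_row(self, row):
        line = self.delimiter.join(str(value) for value in row.values())
        self.stream.write(line + "\n")

def write_rows(rows, writer):
    """Stand-in for the generator's output step; it knows nothing about the format."""
    for row in rows:
        writer.write_row(row)

rows = [
    {"RECORD_TYPE": "a", "REQUEST_IDENTIFIER": 1},
    {"RECORD_TYPE": "b", "REQUEST_IDENTIFIER": 2},
]

# Plain text output.
plain = io.StringIO()
write_rows(rows, DelimitedTextWriter(plain))

# Same writer, gzip-compressed output: only the stream changes.
compressed = io.BytesIO()
with gzip.open(compressed, "wt", encoding="utf-8", newline="\n") as gz_text:
    write_rows(rows, DelimitedTextWriter(gz_text))

print(plain.getvalue())
```

A user extension for another target would be one more RowWriter subclass; none is shown here because the slides do not describe those interfaces.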