LAE Java Node Getting Started Guide

Date: November 12, 2010
Issue: 1.0
LAE Java Node Getting Started Guide
Martin Dawes Analytics© 2011 | www.mda-data.com
Page 1
Copyright
© THE CONTENTS OF THIS DOCUMENT ARE THE COPYRIGHT OF LAVASTORM TECHNOLOGIES,
INC., dba MARTIN DAWES ANALYTICS (MDA). ALL RIGHTS RESERVED. THIS DOCUMENT OR
PARTS THEREOF MAY NOT BE REPRODUCED IN ANY FORM WITHOUT THE WRITTEN PERMISSION
OF MDA.
Confidentiality
This document contains confidential information that is proprietary to Martin Dawes Analytics. The original
recipient of this document may duplicate this document in whole or in part for internal business purposes
only, provided that this entire notice appears in all copies. The recipient agrees to make every effort to
prevent the unauthorized use, distribution or disclosure of the proprietary information contained in this
document.
Disclaimer
No representation, warranty or understanding is made or given by this document or the information
contained within it and no representation is made that the information contained in this document is
complete, up to date or accurate. In no event shall Martin Dawes Analytics be liable for incidental or
consequential damages in connection with, or arising from its use, whether MDA was made aware of the
probability of such loss arising or not.
Trademarks
Microsoft and Windows are registered trademarks of Microsoft Corporation. Oracle and Teradata are
registered trademarks of Oracle Corporation and Teradata Corporation, respectively. All other trademarks
or registered trademarks are the sole property of their respective owners.
Contact Details
For product demonstrations, enhancement requests or technical questions regarding the use of any Martin
Dawes Analytics product, contact us as follows:
HQ Address: Martin Dawes Analytics, 321 Summer Street, 5th Floor, Boston, MA 02210 USA
Telephone: +1 617 345 5422 ext. 244
Fax: +1 617 345 5475
Email: [email protected]
Internet: www.mda-data.com
Comments
We welcome your feedback on this documentation or any other Martin Dawes Analytics product or
document. We are always interested in your suggestions for additional topics. Please contact us at:
[email protected].
Table of Contents
1. Overview
  1.1. Purpose
  1.2. Where to find it
  1.3. Who should use it
  1.4. When to use it
2. Simple Example Node
3. Getting Node Parameters
  3.1. Using the Run Time Property Names
  3.2. Textual Substitution of Parameters
4. Handling Node Input
  4.1. Opening Inputs
  4.2. Finding Input Fields
  4.3. Reading Records
  4.4. Closing Inputs
5. Handling Node Output
  5.1. Setting Output Metadata
    5.1.1. Constructing Metadata from Scratch
    5.1.2. Reusing Metadata
  5.2. Opening Outputs
  5.3. Writing Records
  5.4. Closing Outputs
6. Node Process Flow
  6.1. Create
  6.2. Setup
  6.3. Process All
  6.4. Cleanup
  6.5. Destroy
7. Logging & Error Handling Guidelines
8. Recommendations
  8.1. Parameter Visibility
  8.2. Parameter Validation
  8.3. Property Base & Run Time Property Names
  8.4. Package, Class & Node Names
  8.5. Code Documentation & Maintenance
9. Advanced Topics
  9.1. Classpath Modifications
  9.2. Controlling Downstream Processing
  9.3. Logger Usage
    9.3.1. Simple String-Based Logging
    9.3.2. Using the Built-In ErrorCodes
    9.3.3. In-Node ErrorCodes & Error Messages
      9.3.3.1. Example
  9.4. Adding Extra Code Blocks
    9.4.1. Example
1. Overview
The java node is introduced with the release of LAE 4.5. As it is one of the most complicated and
advanced nodes to use (along with the python node), this document introduces the node. The
example in this document should be used as a guide by anyone writing their first java node.
1.1. Purpose
The java node is introduced in order to solve the same problems that the python node currently
solves. The java node has better performance than the python node for large data sets. In future
LAE releases, the intention is to implement more nodes in java and achieve further
performance improvements by reducing the communication overhead between the nodes.
1.2. Where to find it
The java node is found in the Lavastorm library, in the Interfaces and Adapters category.
1.3. Who should use it
The java node can be used to construct new nodes by any LAE user. However, the user will
need to have some java knowledge in order to use the node. The amount of java knowledge
required is comparable to the amount of python knowledge required to write a python node.
Similar to the python node, more complicated business logic will require more knowledge of
the language.
1.4. When to use it
The java node can be used in any case where a python node was previously being used. While
the python node is still supported, it is recommended that the java node be used in future in
order to obtain the performance benefits - from both the current implementation, and the
expected benefits in future releases.
While there are no restrictions from doing so, as with the python node, it is still considered best
practice to only use a java node when the same functionality cannot be achieved with existing
nodes. This helps ensure that, wherever possible, the business logic in LAE graphs is still
easily understandable to the casual LAE user, or the LAE user with no programming
background.
2. Simple Example Node
Accompanying this getting started guide is the LAE graph “ExampleJavaNodes”. Within this
graph, there are two composite nodes. For now, consider only the “SumExample” java node
contained within the “Simple” composite.
This example node is trivial in its operation. It simply takes any number of input pins and
sums together all of the fields with the name specified in the parameter InputFieldName. In the
example, this is populated with the field name “id”. If any of the inputs do not contain this input
field, then the node will fail.
Otherwise, the node will continue reading records & adding the values for each record until it has
consumed all records from all inputs.
The node is required to have one and only one output. For each summed row, the node will output
the sum, either as an int or float, depending on the value of the parameter OutputAsFloat. These
values will be written to the output field specified by the parameter OutputFieldName.
The node is able to sum over any input type that can be converted to a numeric value. If the input
type cannot be converted to a numeric value, then an error is thrown.
Bear in mind that since this node example is particularly simple, it would not be a good candidate
to implement as a java node, as standard BRAINScript alternatives could be used.
The following sections will describe how the code within the JavaCode parameter should be
structured, and how to write JavaCode sections, using this node as an example.
3. Getting Node Parameters

The java node would be fairly useless if there were no way to access the parameters defined on the
node within the node code. It is only through defining node parameters and using them in the node
code that the java nodes are able to be at all generic and reusable. As with the python node, there
are two mechanisms for accessing parameters within a java node – using BRE’s textual
substitution, and using the run time property names.
Note: In general, the run time property name approach should be used where possible.
3.1. Using the Run Time Property Names
Note: When properties are retrieved using their Run Time Property Name, all of the required
parameters should be obtained & verified within the setup method (described in section 6.2).
Note: Whenever obtaining or setting properties using this method, PropertyExceptions can be
thrown. Therefore it is important to handle this exception as described in section 7.
LAE users who are used to writing python nodes will be familiar with declaring run time
property names for parameters and accessing the parameters using the run time property name
within the python code. The same approach is used to obtain parameters within the java code.
Examine the SumExample node within the Simple composite in the example graph provided.
Notice that the JavaCode within the node defines a method propertyBase, as shown below.
public String propertyBase() {
    return "ls.brain.node.sumExample";
}
Then examine the Parameter Declarations on the node, shown below.
Figure 1 Parameters defined on the example node.
Each of these parameters has a Run Time Property Name format of:
ls.brain.node.sumExample.<paramName>
where <paramName> is simply some specific name for the parameter (e.g. inputFieldName).
With this in mind, variables are defined within the JavaCode class in order to store
these properties, as shown below:
/**
 * Specifies whether the output should be written as a floating point, or
 * integer
 */
private boolean m_outputAsFloat;
/**
 * The name of the field which is to be summed
 */
private String m_inputFieldName;
/**
 * The name of the field to be output.
 */
private String m_outputFieldName;
Then, within the setup method of the node these properties are loaded into the variables as
shown below:
try {
    //get the required properties
    m_outputAsFloat = properties().getBoolean(propertyBase() + ".outputFloat");
    m_inputFieldName = properties().getString(propertyBase() + ".inputFieldName");
    m_outputFieldName = properties().getString(propertyBase() + ".outputFieldName");
}
catch (PropertyException ex) {
    logger().error(ex, Logger.CHAIN_END, "Error reading node properties.");
    throw fail();
}
Consider the first property that is read in the above code block. This code simply obtains the
Properties object via the properties() accessor defined on the Node interface – as can be
seen in the javadoc API provided. The code then attempts to access the boolean property
called:
propertyBase() + ".outputFloat"
From the propertyBase() method, this corresponds to the run time property name:
ls.brain.node.sumExample.outputFloat
Furthermore, from Figure 1 this run time property name is used by the parameter
OutputAsFloat.
So, with all of this put together, the line:
m_outputAsFloat = properties().getBoolean( propertyBase() + ".outputFloat");
simply states:
“Obtain the Boolean parameter OutputAsFloat and store it in the variable m_outputAsFloat”
There are a variety of different methods for accessing the different property types that can be
defined on a node. Therefore the Properties javadoc API should be consulted when
determining how to read the particular property you are interested in.
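The way a run time property name is assembled and looked up can be pictured with a small
stand-alone sketch. The NodeProperties class below is a hypothetical stand-in backed by a map;
it is not the LAE Properties API, and it throws a plain IllegalArgumentException where the real
API would raise a PropertyException. Only the joining of the property base and the parameter
suffix is taken from the text above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the node's property store; the real LAE
// Properties class and its PropertyException are not reproduced here.
final class NodeProperties {
    private final Map<String, String> values = new HashMap<>();

    void set(String runTimeName, String value) {
        values.put(runTimeName, value);
    }

    // Fails loudly when the property is missing, so the caller can log and fail the node.
    String getString(String runTimeName) {
        String value = values.get(runTimeName);
        if (value == null) {
            throw new IllegalArgumentException("Missing property: " + runTimeName);
        }
        return value;
    }

    boolean getBoolean(String runTimeName) {
        return Boolean.parseBoolean(getString(runTimeName));
    }
}

final class PropertyNameSketch {
    // Same value as the propertyBase() method of the SumExample node.
    static String propertyBase() {
        return "ls.brain.node.sumExample";
    }

    // Joins the property base and the parameter-specific suffix.
    static String runTimeName(String paramName) {
        return propertyBase() + "." + paramName;
    }
}
```

Keeping the base in one method means that renaming the node's property namespace only requires
a change in a single place.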
3.2. Textual Substitution of Parameters
Textual substitution is probably the easiest method of accessing parameters within the java
code. Existing LAE users should be familiar with how textual substitution can be used, via the
{{^parameterName^}} syntax. As this is a general LAE concept it will not be discussed here.
Within the JavaCode itself, textual substitution should be used in places where it is not
possible to use the runtime property name to obtain the property value. Wherever the value
needs to be directly substituted into the java code, then the textual substitution should be used.
If however, the code can be written to obtain the runtime property value as shown in the
previous section, and this can be then stored in some variable, then the runtime property name
approach is preferable as it allows greater control of errors that can occur during property
evaluation.
An example of the use of textual substitution of parameters occurs when substituting the name
of the class, and the name of the package within which it lies, as shown below:
package {{^JavaPackage^}};
…
public class {{^JavaClass^}} extends AbstractJavaNode
In these cases, it would not be possible to read in the value of the JavaPackage or JavaClass
parameter into a variable for use in the java code, therefore the textual substitution approach is
used.
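The effect of textual substitution on the code can be illustrated with the following sketch.
It is not how BRE performs the substitution; it is a minimal stand-alone model, assuming that
each {{^Name^}} placeholder is replaced by the parameter value before the code is compiled. The
class name TextualSubstitutionSketch and any parameter values passed to it are hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TextualSubstitutionSketch {
    // Matches {{^ParameterName^}} and captures the parameter name.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{\\^(\\w+)\\^\\}\\}");

    // Replaces every placeholder in the template with its parameter value.
    static String substitute(String template, Map<String, String> parameters) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String value = parameters.get(matcher.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for parameter: " + matcher.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value from being treated specially.
            matcher.appendReplacement(result, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}
```

Because the substitution happens on the source text, an invalid parameter value only shows up
as a compile error, which is one reason the run time property name approach is preferred where
it can be used.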
4. Handling Node Input
This section details how the java node code can be written to handle the node inputs.
4.1. Opening Inputs
The default JavaCode stub in the Java Node is an implementation which extends
SimplifiedNode. When an implementing class that extends SimplifiedNode is used [1], then the
inputs are already opened and no work needs to be done in the Java Node code.
4.2. Finding Input Fields
Often it is necessary to locate a specific field within an input. In these cases it is necessary to
find the index of the field within the input metadata, using the name of the field to search. In
the case of the SumExample node, we need to find the field with the name specified in the
InputFieldName parameter.
As seen in section 3.1, this was read into the variable m_inputFieldName. Therefore, the
following code within the SumExample node is used to locate this field within the ith input, and
error if the field does not exist on this input:
int idx = input(i).metadata().find(m_inputFieldName);
if (idx == -1) {
    logger().error(Logger.CHAIN_END, "Unable to find required field ("
            + m_inputFieldName + ") on input (" + i + "): \"" + input(i).name() + "\"");
    throw fail();
}
Ignoring the logging & exception handling for the moment (this is described later in sections 7
and 9.3), this code simply obtains the metadata for the ith
input, then searches this metadata for the field with the name m_inputFieldName.
If the field cannot be found in the metadata, then a result of -1 is returned. In this case, the node
defines that it will error.
Similar to the process of finding input fields using a field name, the index of an input or output
can be found using the findInput and findOutput methods respectively as defined on the
Node interface.
[1] It is expected that all customer-written java nodes will extend SimplifiedNode. Currently all
nodes that extend the java node provided by MDA also extend the SimplifiedNode. The only reason
this would not be used is if there were specific requirements requiring non-blocking I/O.
In general, if the field will need to be read from multiple records from the same input, then it is
good practice to store the field index (idx in the above code), such that it can be used without
re-searching the record metadata each time. If this is required, then it should be performed in
the setup method of the node.
In the SumExample node, the indices for the InputFieldName field in each of the inputs are
stored into an array, which is subsequently used when reading the records.
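The idea of resolving each index once and reusing it can be sketched as follows. The input
metadata is modelled here as a plain list of field names, which is an assumption made purely so
that the example is self-contained; the find helper mimics the documented behaviour of returning
-1 when the field is absent, and the class name FieldIndexCacheSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class FieldIndexCacheSketch {
    // Stand-in for metadata().find(name): returns -1 when the field is absent.
    static int find(List<String> fieldNames, String name) {
        return fieldNames.indexOf(name);
    }

    // Resolves the field index once per input, as a setup method might do.
    static List<Integer> resolveIndices(List<List<String>> inputs, String fieldName) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < inputs.size(); i++) {
            int idx = find(inputs.get(i), fieldName);
            if (idx == -1) {
                // A real node would log the problem and throw fail() at this point.
                throw new IllegalStateException(
                        "Unable to find required field (" + fieldName + ") on input (" + i + ")");
            }
            indices.add(idx);
        }
        return indices;
    }
}
```

The returned list can then be consulted for every record without searching the metadata again.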
4.3. Reading Records
Note: Record processing should always be performed in the processAll method (described in
section 6.3).
Processing records is relatively straightforward, as shown from the code within the
processAll method in the SumExample node. In this example, we want to read records from
each input that still has records remaining, until all inputs have been completely read.
Therefore, the following code is used:
Record record = read(i);
if (record != RecordInput.EOF) {
…
}
This simply reads the next available record from the ith input. If no more records are available
from this input, then RecordInput.EOF will be returned.
Therefore, if we only had one input and simply wanted to continue processing records until this
input had no more records to read, the following could be used instead:
Record record = null;
while ((record = read(0)) != RecordInput.EOF) {
//process record
}
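The multi-input case, where every input is visited in turn until all of them are exhausted, can
be sketched as below. The inputs are modelled as iterators and EOF as a sentinel object; both are
hypothetical stand-ins for RecordInput and RecordInput.EOF, and the sketch assumes that reading an
exhausted input simply returns EOF again. Adding every value into one total is done purely for
illustration.

```java
import java.util.Iterator;
import java.util.List;

final class MultiInputReadSketch {
    // Stand-in for RecordInput.EOF: a sentinel compared by identity.
    static final Object EOF = new Object();

    // Stand-in for read(i): next record from input i, or EOF when exhausted.
    static Object read(List<Iterator<Integer>> inputs, int i) {
        Iterator<Integer> input = inputs.get(i);
        return input.hasNext() ? input.next() : EOF;
    }

    // Visits each input in turn and stops once every input has returned EOF.
    static long sumAll(List<Iterator<Integer>> inputs) {
        long sum = 0;
        boolean anyOpen = true;
        while (anyOpen) {
            anyOpen = false;
            for (int i = 0; i < inputs.size(); i++) {
                Object record = read(inputs, i);
                if (record != EOF) {
                    sum += (Integer) record;
                    anyOpen = true;
                }
            }
        }
        return sum;
    }
}
```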
Once the Record has been obtained, and we have verified that this Record isn’t simply an end
of input indicator, then it is straightforward to get a field from this record.
In order to obtain the first field defined on a record, then the following can be used:
Object field = record.field(0);
However, since field ordering is not guaranteed, it is better to use the name of the field and
obtain the index of that field in the metadata. To get the field named “Foo” from the first input,
the following can be used:
int index = input(0).metadata().find("Foo");
Object field = record.field(index);
As mentioned in the previous section, if this is to be done repeatedly, the index should be
stored in a variable so that the metadata does not need to be searched every time. This is
exactly what has been done in the SumExample node. There, in the setup method, the index
for the {{^InputFieldName^}} field for the ith input has been stored in the m_indices array,
at position i.
The code snippet shown below is used within the SumExample node to obtain the
{{^InputFieldName^}} field from the ith input.
record.field(m_indices.get(i))
Note that the record.field methods will always return a java.lang.Object. Therefore, in
order to perform useful processing with the returned field, it will most likely be necessary to
cast the field to a different type.
When performing these operations, it is advisable to ensure that the metadata on the input is
correct for the type you are casting to. For instance, if we want to ensure that the field “Foo” in
input 0 contains integer data, then the following code could be used to ensure that this is the
case:
int index = input(0).metadata().find("Foo");
FieldMetadata fieldMd = input(0).metadata().field(index);
Class<?> clazz = fieldMd.type();
if (!java.lang.Integer.class.isAssignableFrom(clazz)) {
    //error
}
Also note that the above code will only handle Integer types, and will still error for other types
such as Byte, Long, etc.
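Where any numeric type should be accepted, one possible approach is to test against
java.lang.Number rather than a single wrapper class. The sketch below uses only standard Java; it
assumes that the field types reported by the metadata are java.lang.Number subclasses, which
should be confirmed against the javadoc API. The class name NumericTypeCheckSketch is
hypothetical.

```java
final class NumericTypeCheckSketch {
    // True for Integer, Long, Byte, Double and any other Number subtype.
    static boolean isNumericType(Class<?> fieldType) {
        return Number.class.isAssignableFrom(fieldType);
    }

    // Converts a field value to double, rejecting non-numeric values.
    static double toDouble(Object field) {
        if (field instanceof Number) {
            return ((Number) field).doubleValue();
        }
        // A real node would log the problem and throw fail() at this point.
        throw new IllegalArgumentException("Field value is not numeric: " + field);
    }
}
```

Checking the type once against the metadata, rather than on every record, keeps the per-record
work down to the cast itself.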
4.4. Closing Inputs
Note: In general, inputs should be closed within the cleanup method (described in section 6.4).
Similarly, the setup method should ensure that if it fails, all inputs have subsequently been
closed, as the cleanup method will not be called.
Note: Closing an input can cause an IOException to be thrown. Therefore it is
important to handle this exception as described in section 7.
Note: A helper method cleanupIo can be used to close all of the open inputs & outputs and
handle any IOExceptions that might be thrown, logging them correctly.
Any inputs that have been opened during the setup method should in general be closed in the
cleanup method. Closing an input is a very simple operation, and the following code (as seen
in the SumExample node) will close the ith input (0-indexed):
input(i).close(false);
The boolean parameter provided to the close method specifies whether or not the input is
being closed due to an error.
The close method shown above can throw IOExceptions. In order to simplify the cleanup of
all I/O, it is recommended that the cleanup method simply calls cleanupIo. This is the default
implementation of cleanup provided in the JavaCode stub in the Java Node.
This will close any open inputs & outputs, and correctly log any IOExceptions that get thrown
& error the node.
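The general pattern behind a helper such as cleanupIo, closing everything and not letting one
failure prevent the remaining resources from being closed, can be sketched with standard Java
only. This is not the implementation of cleanupIo; the class CleanupIoSketch is hypothetical and
uses java.io.Closeable in place of the node inputs and outputs.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

final class CleanupIoSketch {
    // Closes every resource, continuing past failures, and returns how many failed.
    static int closeAll(List<? extends Closeable> resources) {
        int failures = 0;
        for (Closeable resource : resources) {
            try {
                resource.close();
            } catch (IOException ex) {
                // A real node would log the exception and mark the node as failed here.
                failures++;
            }
        }
        return failures;
    }
}
```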
5. Handling Node Output
This section details how the java node can be written to handle the node outputs.
5.1. Setting Output Metadata
Note: Wherever possible, outputs should have their metadata set in the setup method (described
in section 6.2). Where the metadata is dependent on data read from input records, then the
metadata should be set in the processAll method (described in section 6.3).
While the first operation performed on node inputs is normally to open them, the metadata first
needs to be defined on an output before the output is opened.
5.1.1. Constructing Metadata from Scratch
In the “SumExample” node, the output record metadata is constructed & set as part of the
setup method. Here, the OutputAsFloat and OutputFieldName parameters are used to
determine the output metadata.
When OutputAsFloat is set to true, the OutputFieldName field is set to be of a floating
point type. Otherwise, the output metadata is setup with an integer type, as shown in the
following code:
//Setup the output metadata according to the properties.
Class<?> outputType = null;
if (m_outputAsFloat)
    outputType = java.lang.Float.class;
else
    outputType = java.lang.Integer.class;
RecordMetadata metadata = output(0).newMetadata();
metadata.add(new SimpleFieldMetadata(m_outputFieldName, outputType));
output(0).metadata(metadata);
The newMetadata call on the RecordOutput constructs a new RecordMetadata object.
This is then populated with new FieldMetadata objects (in this case,
SimpleFieldMetadata is used). Once all of the required field metadata has been added to
the RecordMetadata, the metadata can be set on the RecordOutput.
Following this, no additional field metadata can be added to the RecordMetadata object.
5.1.2. Reusing Metadata
While the previous approach allows for full control of all of the fields in the output
metadata, it may be the case that the output metadata should simply be the same as the
metadata on an input. In this case, the RecordMetadata.copyFrom method can be used.
The code below shows how this can be done for setting the metadata for the first output to
the same as the metadata for the first input:
//Setup the output metadata
RecordMetadata metadata = output(0).newMetadata();
metadata.copyFrom(input(0).metadata());
output(0).metadata(metadata);
Similarly, if the metadata is to be used on multiple outputs, this can be done using the
following:
RecordMetadata metadata0 = output(0).newMetadata();
//construct the metadata for output 0 here
…
output(0).metadata(metadata0);
RecordMetadata metadata1 = output(1).newMetadata();
metadata1.copyFrom(metadata0);
output(1).metadata(metadata1);
It is important to note that the same RecordMetadata object cannot be used on multiple
outputs. This means that the following code is incorrect:
//Setup the output metadata
RecordMetadata metadata = output(0).newMetadata();
//Assigning this once is fine
output(0).metadata(metadata);
// The following code is incorrect.
// RecordMetadata instances should not be shared.
output(1).metadata(metadata);
Rather, a different RecordMetadata object needs to be constructed for each output, as
shown in the previous examples.
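The two rules stated in this section (no fields can be added once the metadata has been set on an
output, and one metadata object per output) can be modelled with a small stand-in class. The
SketchMetadata class below is hypothetical and is not the LAE RecordMetadata class; in particular,
the exceptions it throws are an assumption made to make the rules visible, and the real API may
report misuse differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for RecordMetadata; it only models the two rules
// stated in the text and is not the LAE class.
final class SketchMetadata {
    private final List<String> fieldNames = new ArrayList<>();
    private boolean attached = false;

    // Adding fields is only allowed before the metadata is set on an output.
    void add(String fieldName) {
        if (attached) {
            throw new IllegalStateException("Metadata already set on an output");
        }
        fieldNames.add(fieldName);
    }

    // Copies the field definitions, leaving the source object untouched.
    void copyFrom(SketchMetadata other) {
        for (String name : other.fieldNames) {
            add(name);
        }
    }

    List<String> fieldNames() {
        return new ArrayList<>(fieldNames);
    }

    // Called when the metadata is set on an output; a second call is rejected.
    void attach() {
        if (attached) {
            throw new IllegalStateException("Metadata instances must not be shared between outputs");
        }
        attached = true;
    }
}
```

Copying into a fresh object for each output, as in the earlier examples, satisfies both rules.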
5.2. Opening Outputs
Note: Wherever possible, outputs should be opened in the setup method (described in section 6.2).
Where the metadata is dependent on data read from input records, then the outputs will
need to be opened in the processAll method (described in section 6.3).
Once the metadata has been set on an output, the output can be opened. The output must be
opened prior to attempting to write to it. Opening an output is a very simple operation, and the
following code (as seen in the SumExample node) will open the first output:
openOutput(0);
If multiple outputs are being used, and all need to be opened at the same time (after the
metadata has been set on each output), then the following code can be used to simply open all
of the outputs:
openOutputs();
5.3. Writing Records
Note: Record writing should generally be performed in the processAll method (described in
section 6.3).
Records can only be written to an output after the output has been opened. It is a relatively
straightforward process to write records within a java node. First, a new record can be obtained
from the output metadata. Then, on the returned record, each of the fields can be populated
prior to writing the record to the output.
The following code shows how this is performed within the SumExample node to write a
simple record to the first output with one field set (where the variable sum is defined to be a
double).
Record record = output(0).metadata().newRecord();
if (m_outputAsFloat)
    record.field(0, sum);
else
    record.field(0, (int)sum);
writeRecord(0, record);
Each field which is not set on a record prior to the record being written will appear as NULL in
the brdViewer. For instance, if in the above example, the first output was defined with a
metadata containing 2 fields, the second field would be left as NULL.
5.4. Closing Outputs
Note: In general, outputs should be closed within the cleanup method (described in section 6.4).
Similarly, the setup method should ensure that if it fails, any outputs it has opened have
subsequently been closed, as the cleanup method will not be called.
Note: Closing an output can cause an IOException to be thrown. Therefore it is
important to handle this exception as described in section 7.
Any outputs that have been opened during the setup method should in general be closed in the
cleanup method.
Closing an output is a very simple operation, and the following code (as seen in the
SumExample node) will close the first output:
output(0).close(false);
The boolean parameter provided to the close method specifies whether or not the output is
being closed due to an error.
The close method shown above can throw IOExceptions. In order to simplify the cleanup of
all I/O, it is recommended that the cleanup method simply calls cleanupIo. This is the default
implementation of cleanup provided in the JavaCode stub in the Java Node.
This will close any open inputs & outputs, and correctly log any IOExceptions that get thrown
& error the node.
6. Node Process Flow
While there are some exceptions (see section 0), the code within the JavaCode parameter should
generally be a class which extends the com.lavastorm.brain.node.SimplifiedNode class.
This class should have no constructor. If there is some pressing reason to have a constructor
(rather than simply performing initialisation in setup as recommended), then there must be a
no-argument constructor.
The code skeleton provided with the base java node is already defined to extend this
SimplifiedNode class.
The LAE server then knows that it has a Node to execute, and will call the methods defined on the
Node interface (implemented by the SimplifiedNode) in order to execute the node.
Looking at the JavaCode in the simple example node provided, the following line ensures that this
class inherits from SimplifiedNode:
public class {{^JavaClass^}} extends SimplifiedNode
The Node interface is provided in the java node API. The process flow of the node is outlined in
the following section, and follows the path
create -> setup -> processAll -> cleanup -> destroy
This path is always followed, assuming that no exceptions are thrown from the code within these
methods, and assuming that the node status is never set to failed in any of these methods.
Additional node states for controlling downstream processing (outside of failure & success) are an
advanced topic and described in section 9.2.
6.1. Create
The create method defined on the Node interface is implemented in the SimplifiedNode and
does not need to be implemented in the user-defined class in the JavaCode parameter.
The create method in SimplifiedNode guarantees that by the time setup is called, all inputs
have been opened.
6.2. Setup
The setup method is where all of the required setup for the node should be performed. This
generally involves:
• Reading any required properties
• Setting up output metadata (if that can be performed without needing to read input
  records)
• Opening outputs (if the output metadata can be set up)
• Performing any other initialisation required.
As stated in the API, if any errors occur during setup, the user defined code should catch these
errors, log them appropriately, indicate that the node has failed via calling the fail() method,
then throw a NodeFailedException, or return from the method normally.
• If this method does not throw an exception:
  o If the node fail() method was not called, then:
    - processAll, cleanup & destroy will all be called.
    - The cleanup method should be valid to be called to clean up any resources
      allocated during the setup method, regardless of whether or not processAll
      fails.
  o Else:
    - Execution of the node will be terminated; processAll and cleanup will not be
      called, destroy will be called.
• Otherwise:
  o If the method throws a RuntimeException:
    - The node controller will mark the node as failed, regardless of whether
      or not fail() has been called within the node code.
    - Node execution will be terminated; processAll and cleanup will not
      be called, destroy will be called.
  o Else (if the method throws a NodeFailedException):
    - The node status will not be changed by the node controller. If this has
      not been set to fail, then the node will not be marked as failed.
    - Execution of the node will be terminated; processAll and cleanup
      will not be called, destroy will be called.
Importantly, this means that the NodeFailedException does not signal to the node controller
that the node status should be set to failed. Rather, the node controller assumes that the node
has correctly set its own status. The NodeFailedException does signal to the node controller
that further processing should be aborted.
In general, when an error occurs within the node, and is caught by the node code, the easiest
mechanism for signalling an error and aborting processing is to call:
throw fail();
6.3. Process All
The processAll method is to contain the main node execution operation. This will involve the
reading and writing of records, and any business logic that needs to be applied to these records.
As stated in the API, if any errors occur during processAll, the user defined code should
catch these errors, log them appropriately, indicate that the node has failed via calling the
fail() method, then throw a NodeFailedException, or return from the method normally.
• If this method returns normally, and the node fail() method was called, then:
  o The node will be marked as failed.
• Otherwise, if the method throws a RuntimeException:
  o The node will be marked as failed.
• In all cases, if processAll is called, cleanup & destroy will be called.
This means that if a NodeFailedException is thrown, but fail() is not called, then the node
will not be marked as failed.
6.4. Cleanup
The cleanup method finishes execution of the node and cleans up internal resources allocated
during the setup and processAll methods.
This method will always be called if setup completed successfully.
After this method is called, calls to the input, output, properties, and log accessors should still be
valid.
The destroy method will always be called after this method.
While there may be some additional code required in the cleanup method to handle the closing
of external files, database connections, sockets etc, the following code should almost always be
placed into the cleanup method to ensure that the node’s inputs and outputs are cleaned up
correctly:
cleanupIo();
6.5. Destroy
The destroy method defined on the Node interface is implemented in the AbstractJavaNode
(parent class of the SimplifiedNode) and does not need to be implemented in the user-defined
class in the JavaCode parameter.
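The rules in sections 6.2 to 6.5 can be summarised as a small simulation. The sketch below is not the LAE node controller: Node, NodeFailedException and run are hypothetical stand-ins written only to mirror the rules stated above. How an exception thrown from cleanup is treated is not stated in this guide, so the sketch assumes it is handled like one thrown from processAll.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the LAE classes.
class LifecycleSketch {
    static class NodeFailedException extends Exception { }

    static class Node {
        boolean failed = false;

        // Sets the failed status and returns the exception so callers can "throw fail();".
        NodeFailedException fail() {
            failed = true;
            return new NodeFailedException();
        }
        void setup() throws NodeFailedException { }
        void processAll() throws NodeFailedException { }
        void cleanup() throws NodeFailedException { }
    }

    // Drives one node through its life cycle and returns the sequence of calls made,
    // followed by the final status ("OK" or "FAILED").
    static List<String> run(Node node) {
        List<String> calls = new ArrayList<>();
        calls.add("create");
        calls.add("setup");
        boolean continueAfterSetup = false;
        try {
            node.setup();
            continueAfterSetup = !node.failed;   // returned normally: continue only if not failed
        } catch (NodeFailedException e) {
            // Status is left exactly as the node set it; processing stops.
        } catch (RuntimeException e) {
            node.failed = true;                  // controller marks the node as failed
        }
        if (continueAfterSetup) {
            calls.add("processAll");
            try {
                node.processAll();
            } catch (NodeFailedException e) {
                // Status unchanged by the controller.
            } catch (RuntimeException e) {
                node.failed = true;
            }
            calls.add("cleanup");                // always called once processAll was called
            try {
                node.cleanup();
            } catch (NodeFailedException e) {
                // Assumption: treated like processAll.
            } catch (RuntimeException e) {
                node.failed = true;
            }
        }
        calls.add("destroy");                    // always called
        calls.add(node.failed ? "FAILED" : "OK");
        return calls;
    }

    public static void main(String[] args) {
        Node throwsWithoutFail = new Node() {
            @Override void setup() throws NodeFailedException { throw new NodeFailedException(); }
        };
        Node runtimeInProcessAll = new Node() {
            @Override void processAll() { throw new IllegalStateException("boom"); }
        };
        System.out.println(run(new Node()));
        System.out.println(run(throwsWithoutFail));
        System.out.println(run(runtimeInProcessAll));
    }
}
```

The second case in main illustrates the point made in section 6.2: throwing a NodeFailedException without calling fail() aborts processing but leaves the node status unchanged.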
7. Logging & Error Handling Guidelines
The Node interface also defines an accessor to a Logger object. The Logger provides utility
methods for writing to the log with varying LogLevels, which are, in increasing order of severity:
• DEBUG
• INFO
• WARN
• ERROR
The different log levels can be used by calling the logger().log(LogLevel level, …) method.
Alternatively, the logger().debug, logger().info, logger().warn & logger().error
methods can be used.
These log levels are outlined in the following table.
LogLevel.DEBUG
When to Use: Used when logging information unrelated to any errors, just debugging
information which may be useful to identify what is going on in the processing logic. The
contents of these debug statements may only make sense to the node developer for
troubleshooting purposes. In general, these should not be internationalized.
Usage Example:
    logger().debug("Beginning to process records");
    Record record = null;
    while ((record = read(0)) != RecordInput.EOF) {
        // process record
    }
    logger().debug("Finished processing records");

LogLevel.INFO
When to Use: Generally useful if an error occurs, but is handled, or an exception is being
thrown that is part of the method contract (except for NodeFailedExceptions, which should
generally only be thrown after a LogLevel.ERROR message has been logged).
Usage Example:
    public void myMethod() throws IOException {
        …
        try {
            readFile();
        }
        catch (IOException ioe) {
            logger().info(ioe, "IOException occurred reading file");
            throw ioe;
        }
    }

LogLevel.WARN
When to Use: Generally used when an error occurs, but the code is able to make some
assumptions & continue. Is possible/likely to lead to an error later.
Usage Example:
    File dir = new File(dirName);
    boolean madePath = dir.mkdirs();
    if (!madePath) {
        String msg =
            "First choice path (" + dir.getName() + ") failed. ";
        msg += "Proceeding to backup location. ";
        msg += "This may affect downstream processing.";
        logger().warn(msg);
        dir = new File(backupDirPath);
        …
    }

LogLevel.ERROR
When to Use: Used when the node is about to fail.
Usage Example:
    int inputIdx = findInput("Data");
    if (inputIdx == -1) {
        logger().error("Expected node input \"Data\" does not exist");
        throw fail();
    }
The methods inherited by the Node that the node developer is expected to write (setup,
processAll and cleanup) only declare to throw NodeFailedExceptions. When a
NodeFailedException is thrown from the body of one of these methods, then the LAE will
assume that the node has written all of the required information to the log, so will not
automatically log any error information.
Therefore, it is up to the node developer to ensure that when any exception [2] is thrown from
within one of these methods, one of the following occurs:
• The exception is caught, handled, and processing can continue.
• The exception is caught, logged, the node status is set to fail (via calling the fail()
  method) and
  o A NodeFailedException is thrown or
  o The node returns normally.
• The exception is caught, logged, the node status is set to an appropriate non-failure
  state based on the exception (see section 9.2 for information on different node signalling
  states) and
  o A NodeFailedException is thrown or
  o The node returns normally.
The SumExample node contains a number of examples of using the Logger object provided on
the Node interface.
In order to make life easier for the node developer, the fail() method returns a
NodeFailedException. An example of where this is used in the SumExample is shown
below. This code, within the setup method, attempts to obtain a series of property values.
try
{
//get the required properties
m_outputAsFloat = properties().getBoolean( propertyBase() +
".outputFloat");
m_inputColumnName = properties().getString( propertyBase() +
".inputColumnName");
m_outputColumnName = properties().getString( propertyBase() +
".outputColumnName");
}
catch (PropertyException ex)
{
logger().error(ex, Logger.CHAIN_END, "Error reading node properties.");
throw fail();
}
[2] Technically, only checked exceptions (i.e. Exceptions that aren’t RuntimeExceptions) need to be handled.
However, it is generally good practice to also handle any RuntimeException where you are able to provide more
information as to the reason behind the error than would be available if the error simply bubbled up for the LAE to
handle without context.
If a PropertyException occurs during this attempt, the code first obtains the Logger using the
logger() accessor and writes an error to the log. The Logger.CHAIN_END argument simply tells
the log that it shouldn’t expect any more error messages related to this error, and to end the error
chain. This can be left off the call to error without ill-effects.
Following this, the code will set the state of the node to failed, using the fail() method defined
on the Node interface, and throw the returned NodeFailedException.
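One practical consequence of fail() returning the exception, rather than throwing it internally, is that the explicit throw statement lets the Java compiler see that the catch block never falls through. The sketch below illustrates this with hypothetical stand-in classes (it is not the LAE Node or Logger, and parseCount is an invented helper); whether this was the original motivation for the design is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; not the LAE Node, Logger or exception classes.
class ThrowFailSketch {
    static class NodeFailedException extends Exception { }

    boolean failed = false;
    final List<String> errorLog = new ArrayList<>();

    // Marks the node as failed and hands back the exception for the caller to throw.
    NodeFailedException fail() {
        failed = true;
        return new NodeFailedException();
    }

    // Invented helper: parses a property value, failing the node on bad input.
    int parseCount(String text) throws NodeFailedException {
        int count;
        try {
            count = Integer.parseInt(text);
        } catch (NumberFormatException ex) {
            errorLog.add("Error parsing count: " + text);   // log first, then fail
            // The explicit throw tells the compiler this branch never completes normally,
            // so 'count' is definitely assigned at the return statement below.
            throw fail();
        }
        return count;
    }

    public static void main(String[] args) {
        ThrowFailSketch node = new ThrowFailSketch();
        try {
            System.out.println(node.parseCount("42"));
            node.parseCount("forty-two");
        } catch (NodeFailedException ex) {
            System.out.println("failed=" + node.failed + ", log=" + node.errorLog);
        }
    }
}
```

If fail() were a void method that threw the exception itself, the return statement would not compile, because the compiler could not tell that the catch block always exits.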
8. Recommendations
8.1. Parameter Visibility
There are several reasons for writing java nodes. Sometimes this is to simply implement
some complex functionality, and allow this functionality to be configured by some node
parameters. If this is the case, then generally the java node should be made into a library
node, and all of the java-specific parameters should be hidden from the node user. So, after
appropriately documenting the node (not just code comments), the following parameters
should be hidden:
• NodeClass
• Classpaths
• JavaPackage
• JavaClass
• CompileJavaClass
• JavaCode
These can be hidden by opening the node, then navigating to Declare Parameters ->
Inherited Parameter Group Overrides, then setting Group to Hide.
There are times however – as discussed in section 0 – where the node user will be expected
to provide their own java code. Such cases are relatively advanced topics and in general a
little bit more thought needs to go into exactly which parameters should be exposed or
hidden from the node user.
8.2. Parameter Validation
If a parameter is required, then set it to be Not Blank in the parameter declarations. Even
though you can check the runtime property within the java code, and then log appropriate
errors, a violation of the Not Blank setting is caught a lot earlier, rather than only at runtime.
Also, there is always the possibility that future BRE versions may use the Not Blank setting to
provide better visual clues to the user as to required fields.
8.3. Property Base & Run Time Property Names
It is a good idea to set the property base to reflect the node itself. For example, the property
base on the SumExample node is “ls.brain.node.sumExample”.
Experienced LAE users will be aware that the structure of the node parameter hierarchy is
important. This is because the property resolution rules use the dot-separated property name
as a search hierarchy.
• If a search is performed for the property “ls.brain.node.sumExample.foo”, then
  o If this exists, it is returned,
  o Else, a search is performed for “ls.brain.node.foo”, then
    - If this property exists, it is returned,
    - Else, a search is performed for “ls.brain.foo”, then
      · If this property exists, it is returned,
      · Else, a search is performed for “ls.foo”, then
        ‣ If this property exists, it is returned,
        ‣ Else, a search is performed for “foo”; if this property exists, it is
          returned, otherwise a PropertyException is thrown.
Therefore, it is good practice to ensure that the node hierarchy is mirrored in the run time
property name hierarchy & the String returned from the propertyBase method, such that a
parameter “propertyName” that is declared on a parent node, with the correct run time
property name hierarchy, can be obtained using the line:
properties().getProperty( propertyBase() + ".propertyName");
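The search order described in this section can be sketched in a few lines. This is only an illustration of the rule as stated here, not the LAE implementation: the class and method names are hypothetical, a plain Map stands in for the property store, and IllegalArgumentException stands in for PropertyException.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the dot-separated search hierarchy; not the LAE code.
class PropertySearchSketch {
    // Returns the property names that would be tried, most specific first.
    static List<String> searchOrder(String fullName) {
        String[] parts = fullName.split("\\.");
        String leaf = parts[parts.length - 1];
        List<String> order = new ArrayList<>();
        // Keep the leaf fixed and drop one trailing prefix segment per step.
        for (int prefixLen = parts.length - 1; prefixLen >= 0; prefixLen--) {
            StringBuilder name = new StringBuilder();
            for (int i = 0; i < prefixLen; i++) {
                name.append(parts[i]).append('.');
            }
            order.add(name.append(leaf).toString());
        }
        return order;
    }

    // Returns the first match in search order, or throws if nothing matches.
    static String resolve(Map<String, String> props, String fullName) {
        for (String candidate : searchOrder(fullName)) {
            if (props.containsKey(candidate)) {
                return props.get(candidate);
            }
        }
        throw new IllegalArgumentException("Property not found: " + fullName);
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("ls.brain.foo", "declared on a parent");
        System.out.println(searchOrder("ls.brain.node.sumExample.foo"));
        System.out.println(resolve(props, "ls.brain.node.sumExample.foo"));
    }
}
```

Here the lookup for “ls.brain.node.sumExample.foo” falls back twice before finding the value declared as “ls.brain.foo”.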
8.4. Package, Class & Node Names
It is common sense to give your node a sensible name – this adds to the self-documenting
nature of the graphical BRE display. Similarly, the class name should be changed to reflect
the name or purpose of the node. While this won’t necessarily help with documenting the
graph, it will help for understanding error logs.
For MDA developed nodes, the JavaPackage parameter should also generally be under the
com.lavastorm.brain.node.<libraryName> package.
For external customers, if the node is simply a one-off node to be used within a graph, then
the default value for the JavaPackage parameter should be left (pgk{{^handle^}}).
However, if the node is to be re-used, and turned into a library node, then a general rule of
thumb would be that it is good practice to change the node package name to something
similar to the com.<companyName>.node.<libraryName> pattern.
The name of the class should be changed in the JavaClass parameter to reflect the node
name. In order for these parameter changes to be effective, they still need to be referenced
via the textual substitution ({{^^}}) mechanism in the JavaCode parameter, and also in the
NodeClass parameter. Clearly, as these are to be inserted as the java package name and java
class name respectively, they need to be valid package and class names.
8.5. Code Documentation & Maintenance
Comment any complex logic in the code such that the next person who has to read it stands a
chance (and just remember, this could be you in a couple of years’ time). Provide sufficient
information to the logs such that errors have sensible & understandable error messages that
allow for node errors to be easily resolved.
9. Advanced Topics
9.1. Classpath Modifications
In certain cases it may be desirable to modify the classpath on the java node. This can be
done for inheritance & interface purposes, or to simply reference external classes and jars
that aren’t part of the LAE installation.
If external classes are required for the java node to run, these can simply be added to the
Classpaths parameter in Parameters 2 from the Node Editor window.
Say for example that you had written some utility java code which you wanted to reference in
your java node, and let’s say that this was packaged into a jar myUtilityClasses.jar.
In order for the java node to be able to reference these classes within the JavaNode code, this
jar needs to be placed in a location where the LAE server can access it. Assume this is in the
location:
/usr/home/myUtilityClasses.jar
Then this can be set up in the java node classpath by simply adding this location to the
Classpaths parameter as shown below.
Figure 2 Adding an external jar to the java node classpath
9.2. Controlling Downstream Processing
Examine the node “ErrorAndSignallingExample” in the “Advanced” composite in the
ExampleJavaNodes.brg file provided with this getting started guide.
This node has defined 3 parameters (in addition to those inherited from the base Java node).
These parameters are:
Parameter Name | Allowable Options | Description
ErrorPosition | NONE, DURING_SETUP, DURING_PROCESS_ALL, DURING_CLEANUP | Tells the node code where to fail the node. NONE implies that the node will not fail.
OutputNodeSignal | NONE, DIRECT_DEPENDENTS | Specifies the value to set on AbstractJavaNode.outputSignalMode
ClockedNodeSignal | NONE, DIRECT_DEPENDENTS, ALL_DEPENDENTS | Specifies the value to set on AbstractJavaNode.clockSignalMode
Essentially, this is an example node to show what occurs when the node fails at different
stages of processing. Alternatively, if the node succeeds, this shows how the different return
states can be used to control downstream processing.
It is obvious that if the node status is set to failed (i.e. ErrorPosition != NONE), that none
of the downstream nodes will be able to be correctly processed.
However, more interesting is the case where the ErrorPosition is set to NONE, and the
OutputNodeSignal parameter and ClockedNodeSignal parameter are modified.
The piece of code within the node which uses these parameters to modify the downstream
processing is the following two simple lines:
outputSignalMode(m_outputNodeSignal);
clockSignalMode(m_clockNodeSignal);
The OutputNodeSignal and ClockedNodeSignal parameters have already been read into
the variables m_outputNodeSignal and m_clockNodeSignal respectively by this stage of
processing.
Modify these node parameters in the example graph and investigate the effects on
downstream processing.
The following table details what will occur with each combination of OutputNodeSignal &
ClockedNodeSignal. For those familiar with the BRAINScript setSuccessReturnCode
function, the corresponding returnCode is shown in the left column.
returnCode | clockSignalMode | outputSignalMode | Behaviour
200 | NONE | NONE | Nothing connected (clock or output) to this node runs
202 | NONE | DIRECT_DEPENDENTS | Only things connected to the output of this node run, clocked items don’t
0 | DIRECT_DEPENDENTS | DIRECT_DEPENDENTS | Normal
201 | DIRECT_DEPENDENTS | NONE | Only things connected to the outclock of this node run, things connected to outputs don’t
203 | ALL_DEPENDENTS | NONE | Only things connected to the outclock of this node, and connected to the outclock of nodes connected to the output of this node run. [3]
N/A | ALL_DEPENDENTS | DIRECT_DEPENDENTS | Error, not allowed.
N/A | * | ALL_DEPENDENTS | Error, not allowed.
Consider the second last row in this table. It may not be immediately obvious why this
combination would not be allowed.
However, consider that you have three nodes, A, B & C, connected as shown below:
[Figure: nodes A, B and C joined by arrows; the legend distinguishes an output relationship from a clock relationship.]
In the case where clockSignalMode ==ALL_DEPENDENTS && outputSignalMode ==
DIRECT_DEPENDENTS then if A passes, B will be allowed to execute. Once B is allowed to
execute, then whether or not C runs is entirely dependent on whether B allows it to run, and
not dependent on what A specifies. Therefore this combination does not make sense.
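The combinations discussed in this section can be captured as a small lookup. The sketch below is illustrative only: SignalMode and returnCode are hypothetical names (this guide does not show the type of the values passed to outputSignalMode and clockSignalMode), and the mapping simply restates the rows documented above, rejecting the two combinations marked as not allowed.

```java
// Hypothetical illustration of the documented signal mode combinations; not LAE code.
class SignalModeSketch {
    enum SignalMode { NONE, DIRECT_DEPENDENTS, ALL_DEPENDENTS }

    // Returns the BRAINScript-style return code for a clock/output combination,
    // or throws for the combinations this guide lists as errors.
    static int returnCode(SignalMode clock, SignalMode output) {
        if (output == SignalMode.ALL_DEPENDENTS) {
            throw new IllegalArgumentException("outputSignalMode ALL_DEPENDENTS is not allowed");
        }
        if (clock == SignalMode.ALL_DEPENDENTS && output == SignalMode.DIRECT_DEPENDENTS) {
            throw new IllegalArgumentException(
                "clock ALL_DEPENDENTS with output DIRECT_DEPENDENTS is not allowed");
        }
        switch (clock) {
            case NONE:
                return output == SignalMode.NONE ? 200 : 202;
            case DIRECT_DEPENDENTS:
                return output == SignalMode.NONE ? 201 : 0;
            default:
                return 203;   // ALL_DEPENDENTS with output NONE
        }
    }

    public static void main(String[] args) {
        for (SignalMode clock : SignalMode.values()) {
            for (SignalMode output : SignalMode.values()) {
                try {
                    System.out.println(clock + " / " + output + " -> " + returnCode(clock, output));
                } catch (IllegalArgumentException e) {
                    System.out.println(clock + " / " + output + " -> error: " + e.getMessage());
                }
            }
        }
    }
}
```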
9.3. Logger Usage
The Logger object returned from the logger accessor on the Node interface provides for more
complex logging than simply writing some String message to an error log with an associated
error level.
The Logger object also caters for node developers writing java nodes that require localization
features, including internationalised error messages.
There are three main logging usage patterns within the java node. These are outlined in the
following sections.
9.3.1. Simple String-Based Logging
In general, it is expected that most users of the java node will not care about
internationalising error messages. Therefore, it is expected that the use of hard or soft coded
Strings in log messages will be the most common use case.
Examples of the String use case were shown in the table in section 7.
[3] Note that due to an existing Gnats issue (2664), this will not work when using the BRE controller.
9.3.2. Using the Built-In ErrorCodes
Provided with the LAE is a set of internationalized (although not yet localized) error
messages and associated ErrorCodes that can be used with the logger. These relate to errors
that will need to be commonly handled on Java Nodes within the LAE.
In future releases this will be further populated as other common errors are identified.
The ErrorCodes are in com.lavastorm.brain.node.ErrorCodes, and are documented in
the API provided. In order to use these ErrorCodes within your node, you simply need to
reference them. These are imported by default into the Java Node via the line:
import static com.lavastorm.brain.node.ErrorCodes.*;
In order to use an ErrorCode, code similar to the following (used in the “Internationalized
SumExample” node) can be used:
try {
//get the required properties
m_outputAsFloat = properties().getBoolean( propertyBase() +
".outputFloat");
m_inputFieldName = properties().getString( propertyBase() +
".inputFieldName");
m_outputFieldName = properties().getString( propertyBase() +
".outputFieldName");
}
catch (PropertyException ex)
{
logger().error(ex, Logger.CHAIN_END, ERROR_RETRIEVING_PROPERTIES);
throw fail();
}
Note that many of the error messages require arguments. An example of an ErrorCode that
takes arguments is also found in the “Internationalized SumExample” node, as shown below:
private void setupOutputs() throws NodeFailedException {
//Check that the output is correct.
if (numOutputs() != 1) {
logger().error(UNEXPECTED_NUMBER_OF_OUTPUTS, 1, numOutputs());
throw fail();
}
}
Examining the ErrorCodes API shows that the UNEXPECTED_NUMBER_OF_OUTPUTS
ErrorCode takes two arguments, the number of outputs that were expected (1 in this case),
and the number of outputs on the node (numOutputs()).
9.3.3. In-Node ErrorCodes & Error Messages
There may be cases where a Java Node is going to be used in different locales in which case
internationalized error messages may be desirable. Where possible, the built-in error codes
described in the previous section should be used.
However, these only cover generic errors that could occur on any java node. Node specific
error messages can be internationalised within the java node itself.
In order to achieve this, the node developer will need to use message keys to lookup error
messages in resource bundles.
These resource bundles can be constructed as node parameters with a specific run time
property name format.
For each locale, the error messages need to go into a property which has the format of a java
ResourceBundle – simply containing key, value pairs separated by an “=” sign. In any place
where arguments are to be supplied to augment the message, placeholders can be written into
the resource bundle using {0}, {1}, … etc markers.
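As a rough illustration of the bundle content format, the sketch below parses a block of “key=value” text with the standard java.util.PropertyResourceBundle class. The message keys and texts are invented for this example and are not taken from any LAE node; how the LAE itself loads the parameter contents is not shown here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Illustrative only: the keys and messages below are invented for this example.
class BundleContentSketch {
    // Parses text in the java ResourceBundle properties format (key=value per line).
    static ResourceBundle parse(String content) {
        try {
            return new PropertyResourceBundle(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String content =
            "myNode.missingInput=Expected input {0} does not exist\n"
          + "myNode.badCount=Expected {0} outputs but found {1}\n";
        ResourceBundle bundle = parse(content);
        // The placeholders are stored verbatim; they are only filled in at logging time.
        System.out.println(bundle.getString("myNode.badCount"));
    }
}
```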
For each locale & error message bundle combination, there needs to be one of these resource
bundle parameters, and a corresponding bundle name parameter within the node. These
parameters need to have the respective run time property name formats:
[resource bundle property name format missing] &
ls.brain.node.java.resources.<something>.name
The format of the resource bundle parameter has already been briefly described in the
previous paragraph.
The value of the “ls.brain.node.java.resources.<something>.name” parameter
specifies the name of the resource bundle. This needs to adhere to the localization naming
guidelines as described within the ResourceBundle documentation in the Java API:
http://download.oracle.com/javase/6/docs/api/java/util/ResourceBundle.html
This essentially states that the name of the resource bundle needs to consist of alphanumeric
sections separated by periods. If the set of error messages (or “content” parameter) to which
the name applies is the default bundle, then this is all that should be supplied in the resource
name. Otherwise, it should have a “_<language>[_<region>]” suffix. For example, if the set
of error messages to which the name applies is to be used as the default bundle in a French
speaking locale, then this would end with “_fr”. If it is the bundle to use in a French
speaking part of Canada, then this should end with the “_fr_CA” suffix.
It is a good idea therefore to separate out the base part of the resource bundle from the
localized part. If this was placed into a parameter “MessageBundleBase” this would mean
that the ls.brain.node.java.resources.<something>.name parameter for the resource
bundle for the default French locale would be: {{^MessageBundleBase^}}_fr
The bundle names are still a little more complicated. The value of the
ls.brain.node.java.resources.<something>.name parameter is not actually the full
bundle name. Rather, if this runtime property is bound to the parameter “BundleName”, then
the full bundle name (including locale) is actually: {{^JavaPackage^}}.{{^BundleName^}}
The actual resource bundle contents are taken from the contents parameter, and placed into
the correct location on the node’s classpath (based on the package-like hierarchy), and placed
into a resource bundle file with a “.properties” extension. This makes the resource bundles
available for use within the Node java code.
In order to use these resources, the easiest method is to construct a SimpleErrorCode that
references the messageKey within the bundle. Then, this ErrorCode can be provided to the
Logger when logging messages. The following example shows how all of this can be done in
practice.
9.3.3.1. Example
Consider the case of the simple SumExample node introduced previously. Now, consider that
this node is to be distributed to and used by multiple users across multiple different locales
(although I’m not sure who would want to use it). It would be nice if the error messages
could also be localized.
An example of how this can be achieved is shown in the node “Internationalized
SumExample” – which has been added to the local library. First look at the definition of the
library node.
Examine the Parameter Declarations window in the node editor (shown below):
Figure 3 Parameter declarations for internationalised logging
Here it can be seen that two resource bundles are being defined, as there are two sets of run
time property names matching the resource bundle and bundle name formats described
earlier in this section.
These are the “germanErrors” and “nodeErrors” parameter sets. Note that within the
parameter declarations there is nothing to say that these two parameter sets are in any way
related.
Notice also that MessageBundleBase and MessagePrefix parameters have been declared.
Now, look at the Parameters 2 tab of the node editor as shown below:
Figure 4 Parameters for internationalised resource bundles
These two message bundles are actually two different localizations of the same message
bundle base. The MessageBundleBaseName parameter is declared as
“test.ErrorMessageBundle”. Then, the two message bundles are declared as:
GermanMessageBundleName: {{^MessageBundleBase^}}_de
DefaultMessageBundleName: {{^MessageBundleBase^}}
As mentioned previously, these bundle names are only a part of the full message bundle
name, which also uses the JavaPackage parameter. When combining the JavaPackage into
the bundle name, these become:
German Bundle Name: {{^JavaPackage^}}.{{^MessageBundleBase^}}_de
Default Bundle Name: {{^JavaPackage^}}.{{^MessageBundleBase^}}
This in turn evaluates to:
German Bundle Name: com.lavastorm.brain.node.test.ErrorMessageBundle_de
Default Bundle Name: com.lavastorm.brain.node.test.ErrorMessageBundle
Therefore the resource bundle properties will be constructed as the resource bundle files
“ErrorMessageBundle_de.properties” & “ErrorMessageBundle.properties” in the
com/lavastorm/brain/node/test/ directory within the node’s classpath.
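The mapping from bundle name to file location follows the standard java ResourceBundle naming rules, and can be checked with java.util.ResourceBundle.Control. The sketch below is an illustration under that assumption; the helper method resourcePath is hypothetical, and it is not claimed that the LAE uses this class internally.

```java
import java.util.Locale;
import java.util.ResourceBundle;

// Illustrative helper (hypothetical name): derives the properties file path for a bundle.
class BundlePathSketch {
    static String resourcePath(String javaPackage, String bundleBase, Locale locale) {
        ResourceBundle.Control control =
            ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        // Appends the locale suffix (nothing for the root locale) to the qualified base name.
        String bundleName = control.toBundleName(javaPackage + "." + bundleBase, locale);
        // Replaces the dots with slashes and adds the ".properties" extension.
        return control.toResourceName(bundleName, "properties");
    }

    public static void main(String[] args) {
        String pkg = "com.lavastorm.brain.node";
        String base = "test.ErrorMessageBundle";
        System.out.println(resourcePath(pkg, base, Locale.GERMAN));
        System.out.println(resourcePath(pkg, base, Locale.ROOT));
    }
}
```

For the values used in this example, the two printed paths are com/lavastorm/brain/node/test/ErrorMessageBundle_de.properties and com/lavastorm/brain/node/test/ErrorMessageBundle.properties, matching the files named above.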
Note also that the MessagePrefix parameter has been used to qualify the messageKeys on
each of the error messages described in each of the resource bundles.
So, now that all of the required error message bundles have been defined, examine the
JavaCode parameter to see how these are actually used.
Within the JavaCode, the ErrorCodes for each of these error messages are defined as
constants on the node, as shown below:
private final static String MESSAGE_BUNDLE =
"{{^JavaPackage^}}.{{^MessageBundleBaseName^}}";
private final static String MESSAGE_PREFIX = "{{^MessagePrefix^}}";
private final static ErrorCode ERROR_PARSING_NUMERIC_VALUE =
new SimpleErrorCode(MESSAGE_PREFIX + "errorParsingNumericValue",
MESSAGE_BUNDLE);
It may be worthwhile to examine the ErrorCode and SimpleErrorCode API provided to
ensure that you understand what these definitions are doing. Once these ErrorCodes have
been declared, they can be supplied to the various methods on the Logger that take
ErrorCodes.
In the simplest case, this is simply providing the ErrorCode itself.
In more complicated cases, arguments need to be provided in addition to the ErrorCode
which are then substituted into the error message.
Consider the case below, from the “Internationalized SumExample” node:
logger().error(nfe, Logger.CHAIN_END, ERROR_PARSING_NUMERIC_VALUE,
obj, inputIdx, input(inputIdx).name(), recNum);
In this case, the ERROR_PARSING_NUMERIC_VALUE ErrorCode points to the
“ls.brain.node.sumExample.errorParsingNumericValue” message key. Within both the
German and default message bundles, this message key corresponds to an error message that
takes 4 arguments. The default version is shown below:
{{^MessagePrefix^}}errorParsingNumericValue=Error attempting to parse {0}
as a numeric value. Error occurred on input ({1}): "{2}" on record {3}.
Therefore, the logger().error call ends up with the following substitutions performed:
Argument | Value
{0} | obj
{1} | inputIdx
{2} | input(inputIdx).name()
{3} | recNum
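The {0} to {3} markers use the same placeholder syntax as the standard java.text.MessageFormat class, so the substitution can be tried out in isolation. The sketch below assumes MessageFormat-style formatting; this guide does not state which class the Logger uses internally, and the argument values are made up.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Illustrative only: shows how the four arguments fill the message placeholders.
class MessageArgsSketch {
    static final String PATTERN =
        "Error attempting to parse {0} as a numeric value. "
      + "Error occurred on input ({1}): \"{2}\" on record {3}.";

    static String format(Object obj, int inputIdx, String inputName, long recNum) {
        // Arguments are matched to placeholders by position in the array.
        return new MessageFormat(PATTERN, Locale.ROOT)
            .format(new Object[] { obj, inputIdx, inputName, recNum });
    }

    public static void main(String[] args) {
        System.out.println(format("abc", 0, "in1", 7));
    }
}
```

With these sample values the message reads: Error attempting to parse abc as a numeric value. Error occurred on input (0): "in1" on record 7. Note that MessageFormat applies number formatting to numeric arguments, so large record numbers may be printed with grouping separators.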
The node “Internationalized SumExample – Failure” under the path “Advanced ->
Internationalized Logging” shows a case where this error message gets used.
9.4. Adding Extra Code Blocks
In some circumstances, it may be impractical to write all of the java code required into the
one code block provided in the Java Node. This can occur because a large amount of code is
required, and it makes more sense to have some form of sensible object model, and not
simply one node class.
Alternatively, it can occur when a Java Node is being developed that is to be placed into a
library and extended. For instance, there may be some common base code that needs to be
developed & maintained within a library, however each instance of this node may need to
add or modify a small section of code. If this code was all simply placed into the java code
block, then changes in the base node would not get propagated down to the implementing
nodes, as the implementing nodes would have changed the Java Code parameter which is
being modified in the base.
For these reasons, multiple java code blocks are sometimes required. This section describes
how this can be achieved using the Java Node, and talks through the example library nodes
“AbstractMetadataPassThrough” and “MultiFilter”, and the instance nodes “Pass Through”
and “StringsToUpperCase IntsTimes2”.
We have already talked through the JavaCode and JavaClass parameters in previous
sections and described how these are required to compile the class. However it is important
to note that these parameters are no different than other node parameters. JavaCode is simply
a “text” parameter, and JavaClass is simply a “string” parameter. The LAE only knows
which parameter defines the class name and which parameter defines the java code to
compile based on the run time property names.
The same run time property name format used by the Java Node is also required for extra
user-defined code blocks. Each additional code block that you implement requires two new
parameters.
These parameters must have corresponding run time property names of the form:
&
Here, the parameter whose run time property name ends in “.class” must hold the name of
the class to compile, and the parameter whose run time property name ends in “.code”
must contain the Java code to compile. The class defining the Node to run must be referenced
in the NodeClass parameter.
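The pairing rule can be illustrated with a minimal sketch. This is not the LAE's actual implementation: the class `CodeBlockPairs` and the property names beginning with `example.block.` are hypothetical, chosen only to show how a “.class” name and a “.code” body that share a common stem could be matched into compilable units.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: pairs "<stem>.class" and "<stem>.code" properties.
// The property names used here are hypothetical, not real LAE names.
final class CodeBlockPairs {

    /** Returns class name -> source code for every stem that has both properties. */
    static Map<String, String> pair(Map<String, String> props) {
        Map<String, String> units = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            String key = e.getKey();
            if (!key.endsWith(".class")) {
                continue;
            }
            // Strip the ".class" suffix to find the shared stem.
            String stem = key.substring(0, key.length() - ".class".length());
            String code = props.get(stem + ".code");
            if (code != null) {
                units.put(e.getValue(), code);
            }
        }
        return units;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("example.block.impl.class", "MyImpl");
        props.put("example.block.impl.code", "public class MyImpl {}");
        props.put("example.block.orphan.class", "NoCode"); // no matching .code, so skipped
        System.out.println(pair(props));
    }
}
```

A stem that has only one of the two properties is ignored in this sketch; how the LAE itself reports such a mismatch is not covered here.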
9.4.1. Example
Consider the “AbstractMetadataPassThrough” node example. This is an abstract base node
that simply provides the code required to open all of the input streams, set up the output
metadata based on the corresponding input metadata, and clean up afterwards. In order for
this to work, the node must have the same number of inputs as outputs.
This node itself, however, cannot be run. The code is incomplete, and the node only
serves as an abstract base to be inherited by other nodes.
The JavaCode defines the abstract base class, which defines the required setup and cleanup
methods. The node introduces an additional code block, with the parameter name ImplCode.
This block contains little detail, but provides the skeleton in which implementing nodes
define the required processAll and propertyBase methods.
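The division of labour between the two code blocks follows the familiar abstract base class pattern. The sketch below is plain Java without any LAE types; `BaseNode`, `ImplNode`, the `run` method, and the `log` list are hypothetical stand-ins used only to show how a base class can own setup and cleanup while a subclass supplies `processAll`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the abstract class held in the JavaCode block.
abstract class BaseNode {
    protected final List<String> log = new ArrayList<>();

    // The base class fixes the overall sequence; subclasses cannot change it.
    public final void run() {
        setup();
        try {
            processAll();
        } finally {
            cleanup(); // runs even if processAll throws
        }
    }

    protected void setup()   { log.add("setup"); }
    protected void cleanup() { log.add("cleanup"); }

    // Left for the implementing code block to define.
    protected abstract void processAll();
}

// Hypothetical stand-in for the class held in the ImplCode block.
class ImplNode extends BaseNode {
    @Override
    protected void processAll() { log.add("processAll"); }

    public static void main(String[] args) {
        ImplNode node = new ImplNode();
        node.run();
        System.out.println(node.log);
    }
}
```

Because the sequence lives in the base class, a change to `setup` or `cleanup` reaches every subclass without touching the subclass code, which is the maintenance benefit this section describes.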
In the ImplCode parameter, examine the line:
public class {{^ImplClass^}} extends {{^JavaClass^}}
This states that the value of the ImplClass parameter is to be taken as the name of the class
defined within the code block. Further, this class is to extend the JavaClass class defined in
the JavaCode code block. The ImplClass parameter has deliberately been left blank for
implementing nodes to fill in.
Now examine the NodeClass parameter, as shown below:
{{^JavaPackage^}}.{{^ImplClass^}}
As the JavaClass parameter refers to an abstract class, this cannot be run as the node.
Therefore, the node class which is to be run is declared in the ImplCode block, as referenced
by the ImplClass parameter.
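The effect of the `{{^Name^}}` references can be approximated with a small textual substitution. This is a sketch under stated assumptions, not the LAE's own expansion logic: the class `ParamExpander`, the sample values `com.example.nodes` and `MultiFilterImpl`, and the choice to leave unknown names untouched are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: replaces {{^Name^}} references with parameter values.
final class ParamExpander {
    // Matches {{^Name^}} and captures Name.
    private static final Pattern REF = Pattern.compile("\\{\\{\\^(\\w+)\\^\\}\\}");

    static String expand(String text, Map<String, String> params) {
        Matcher m = REF.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            // Assumption: a reference to an unknown parameter is left as written.
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("JavaPackage", "com.example.nodes"); // hypothetical value
        params.put("ImplClass", "MultiFilterImpl");     // hypothetical value
        System.out.println(expand("{{^JavaPackage^}}.{{^ImplClass^}}", params));
    }
}
```

With these sample values, the NodeClass text resolves to a fully qualified class name, which is what allows the node to run the class declared in the ImplCode block rather than the abstract base.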
By clicking on Declare Parameters, it is easy to see that the code block that has been
added has the required parameters, with run time property names matching the format
described previously in this section.
Figure 5 Defining the Run Time Property Names for extra code blocks.
As good development practice, the JavaCode and JavaClass parameters have been hidden from
the implementing node.
Now look at the other library node, “MultiFilter”. This node fills in the ImplCode parameter
provided by the “AbstractMetadataPassThrough” node. The code within the processAll
method, shown below, continues reading records from each input until none of the inputs has
any more records to read. For each record, the method processRecord is called on the class
{{^ProcessRecordClass^}}.
@Override
public void processAll() throws NodeFailedException
{
    boolean allDone = false;
    int recNum = 1;
    // Keep making passes over the inputs until every input has reached EOF.
    while (!allDone) {
        allDone = true;
        for (int i = 0; i < numInputs(); i++) {
            Record record = read(i);
            if (record != RecordInput.EOF) {
                allDone = false;
                try {
                    {{^ProcessRecordClass^}}.processRecord(i,
                        record, recNum, input(i), output(i));
                }
                catch (IOException ioe) {
                    logger().error(ioe, "Error Processing record on input: " + i);
                    throw fail();
                }
            }
        }
        // Advance the record number after each pass over the inputs.
        recNum++;
    }
}
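The control flow of processAll can be tried outside the LAE with ordinary collections. In the sketch below, `RoundRobin`, the `Iterator`-based inputs, and the `Handler` callback are hypothetical substitutes for the node's `read(i)`, `RecordInput.EOF`, and `processRecord`; only the loop structure mirrors the listing. Advancing the record number once per pass over the inputs is an assumption about the intended behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustrative only: round-robin reading from several inputs of unequal length.
final class RoundRobin {
    // Hypothetical callback standing in for processRecord.
    interface Handler {
        void handle(int channel, String record, int recNum);
    }

    static void processAll(List<Iterator<String>> inputs, Handler handler) {
        boolean allDone = false;
        int recNum = 1;
        while (!allDone) {
            allDone = true;
            for (int i = 0; i < inputs.size(); i++) {
                Iterator<String> in = inputs.get(i);
                if (in.hasNext()) {          // stands in for record != RecordInput.EOF
                    allDone = false;
                    handler.handle(i, in.next(), recNum);
                }
            }
            recNum++;                        // assumption: one increment per pass
        }
    }

    public static void main(String[] args) {
        List<Iterator<String>> inputs = new ArrayList<>();
        inputs.add(Arrays.asList("a1", "a2", "a3").iterator());
        inputs.add(Arrays.asList("b1").iterator());
        processAll(inputs, (ch, rec, n) -> System.out.println(ch + ":" + rec + ":" + n));
    }
}
```

An input that runs out early is simply skipped on later passes, so the loop ends only after a pass in which no input produced a record.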
This node also introduces an additional Java code block to be compiled,
“ProcessRecordCode”. This has the class name “ProcessRecord”, as defined in the parameter
“ProcessRecordClass”. This class is simply a utility class and does not implement the Node
interface. Therefore, this time the NodeClass parameter does not need to be modified, as the
actual Node to be run is still defined in the ImplCode.
The entire contents of the ProcessRecordCode parameter are shown below:
package {{^JavaPackage^}};
{{^RequiredImports^}}
{{^UserDefinedImports^}}
public class {{^ProcessRecordClass^}} {
    {{^ExtraClassCode^}}
    {{^ProcessRecordMethod^}}
}
The code defines the class structure, then simply references other parameters that are exposed
to the node user.
Examining the Parameter Declarations window again shows that the run time property names
for ProcessRecordCode and ProcessRecordClass indicate that ProcessRecordCode needs to be
compiled, and inform the node of the class name.
Figure 6 Defining the Run Time Property Names for extra code blocks in MultiFilter node.
In addition, note that this code is not exposed to the node user. Rather, only the code snippet
parameters which are used within the Java class are exposed. This leaves the node
developer free to modify the node structure while ensuring that all implementing nodes
receive the update.
Now go to the graph view, and examine the nodes “Pass Through” and “StringsToUpperCase
Ints2Times”. Both of these nodes simply extend the “MultiFilter” node and provide extra
code snippets that are used within the ProcessRecordCode block defined in the library node.
“Pass Through” is the simpler node and implements a straight pass-through of all of
the input records. The code in this node is extremely simple, and is shown below:
public static void processRecord(int ioChannel, Record record, int recNum,
        RecordInput input, RecordOutput output) throws IOException {
    output.write(record);
}
Since all of the record I/O has been taken care of in the base nodes, the amount of effort
required from the node user is significantly reduced.
Next examine the “StringsToUpperCase Ints2Times” node. This node actually performs
some operations – basically doing what its name implies. Again the code is shown in its
entirety below:
public static void processRecord(int ioChannel, Record record, int recNum,
        RecordInput input, RecordOutput output) throws IOException {
    for (int i = 0; i < record.numFields(); i++) {
        if (record.metadata().field(i).type() == String.class) {
            if (record.field(i) != com.lavastorm.lang.Null.NULL)
                record.field(i, ((String) record.field(i)).toUpperCase());
        }
        else if (record.metadata().field(i).type() == Integer.class) {
            if (record.field(i) != com.lavastorm.lang.Null.NULL)
                record.field(i, ((Integer) record.field(i)).intValue() * 2);
        }
    }
    output.write(record);
}
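The per-field logic can be exercised without the LAE libraries. The following sketch is an approximation under stated assumptions: `FieldTransform`, its `NULL` marker (standing in for `com.lavastorm.lang.Null.NULL`), and the use of a plain `Object[]` in place of a `Record` are hypothetical. It also inspects each value's runtime type, whereas the listing above checks the declared field type from the record metadata.

```java
import java.util.Arrays;

// Illustrative only: upper-cases strings and doubles integers in an array of fields.
final class FieldTransform {
    // Hypothetical null marker standing in for com.lavastorm.lang.Null.NULL.
    static final Object NULL = new Object();

    // Modifies the array in place; null markers and other types are left untouched.
    static void transform(Object[] fields) {
        for (int i = 0; i < fields.length; i++) {
            Object value = fields[i];
            if (value == NULL) {
                continue;
            }
            if (value instanceof String) {
                fields[i] = ((String) value).toUpperCase();
            } else if (value instanceof Integer) {
                fields[i] = ((Integer) value) * 2;
            }
        }
    }

    public static void main(String[] args) {
        Object[] fields = {"abc", 21, NULL, 1.5};
        transform(fields);
        // The null marker prints as an object reference, so only show the first two fields.
        System.out.println(Arrays.asList(fields[0], fields[1]));
    }
}
```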
While these two examples have little practical use, they show that scenarios requiring this
type of node and code extension can arise, and that by using multiple code blocks,
common base code can be maintained while node implementations define only
minimal extra code.