LAE Java Node Getting Started Guide

Date: November 12, 2010
Issue: 1.0

Martin Dawes Analytics © 2011 | www.mda-data.com

Copyright ©

THE CONTENTS OF THIS DOCUMENT ARE THE COPYRIGHT OF LAVASTORM TECHNOLOGIES, INC., dba MARTIN DAWES ANALYTICS (MDA). ALL RIGHTS RESERVED. THIS DOCUMENT OR PARTS THEREOF MAY NOT BE REPRODUCED IN ANY FORM WITHOUT THE WRITTEN PERMISSION OF MDA.

Confidentiality

This document contains confidential information that is proprietary to Martin Dawes Analytics. The original recipient of this document may duplicate this document in whole or in part for internal business purposes only, provided that this entire notice appears in all copies. The recipient agrees to make every effort to prevent the unauthorized use, distribution or disclosure of the proprietary information contained in this document.

Disclaimer

No representation, warranty or understanding is made or given by this document or the information contained within it and no representation is made that the information contained in this document is complete, up to date or accurate. In no event shall Martin Dawes Analytics be liable for incidental or consequential damages in connection with, or arising from its use, whether MDA was made aware of the probability of such loss arising or not.

Trademarks

Microsoft and Windows are registered trademarks of Microsoft Corporation. Oracle and Teradata are registered trademarks of Oracle Corporation and Teradata Corporation, respectively. All other trademarks or registered trademarks are the sole property of their respective owners.

Contact Details

For product demonstrations, enhancement requests or technical questions regarding the use of any Martin Dawes Analytics product, contact us as follows:

HQ Address: Martin Dawes Analytics, 321 Summer Street, 5th Floor, Boston, MA 02210 USA
Telephone: +1 617 345 5422 ext. 244
Fax: +1 617 345 5475
Email: [email protected]
Internet: www.mda-data.com

Comments

We welcome your feedback on this documentation or any other Martin Dawes Analytics product or document. We are always interested in your suggestions for additional topics. Please contact us at: [email protected].

Table of Contents

1. Overview
1.1. Purpose
1.2. Where to find it
1.3. Who should use it
1.4. When to use it
2. Simple Example Node
3. Getting Node Parameters
3.1. Using the Run Time Property Names
3.2. Textual Substitution of Parameters
4. Handling Node Input
4.1. Opening Inputs
4.2. Finding Input Fields
4.3. Reading Records
4.4. Closing Inputs
5. Handling Node Output
5.1. Setting Output Metadata
5.1.1. Constructing Metadata from Scratch
5.1.2. Reusing Metadata
5.2. Opening Outputs
5.3. Writing Records
5.4. Closing Outputs
6. Node Process Flow
6.1. Create
6.2. Setup
6.3. Process All
6.4. Cleanup
6.5. Destroy
7. Logging & Error Handling Guidelines
8. Recommendations
8.1. Parameter Visibility
8.2. Parameter Validation
8.3. Property Base & Run Time Property Names
8.4. Package, Class & Node Names
8.5. Code Documentation & Maintenance
9. Advanced Topics
9.1. Classpath Modifications
9.2. Controlling Downstream Processing
9.3. Logger Usage
9.3.1. Simple String-Based Logging
9.3.2. Using the Built-In ErrorCodes
9.3.3. In-Node ErrorCodes & Error Messages
9.3.3.1. Example
9.4. Adding Extra Code Blocks
9.4.1. Example

1. Overview

The java node was introduced with the release of LAE 4.5. As the node is one of the most complicated and advanced nodes to use (along with the python node), this document introduces the node. The example in this document should be used as a guide for anyone writing their first java node.

1.1. Purpose

The java node is introduced in order to solve the same problems that the python node currently solves. The java node has better performance than the python node for large data sets. In future LAE releases, the intention is to implement more nodes in java and achieve further performance improvements by reducing the communication overhead between the nodes.

1.2. Where to find it

The java node is found in the Lavastorm library, in the Interfaces and Adapters category.

1.3. Who should use it

The java node can be used to construct new nodes by any LAE user. However, the user will need to have some java knowledge in order to use the node. The amount of java knowledge required is comparable to the amount of python knowledge required to write a python node. Similar to the python node, more complicated business logic will require more knowledge of the language.

1.4. When to use it

The java node can be used in any case where a python node was previously being used. While the python node is still supported, it is recommended that the java node be used in future in order to obtain the performance benefits, both from the current implementation and from the improvements expected in future releases.

While there are no restrictions from doing so, as with the python node, it is still considered best practice to only use a java node when the same functionality cannot be achieved with existing nodes. This helps ensure that, wherever possible, the business logic in LAE graphs is still easily understandable to the casual LAE user, or the LAE user with no programming background.

2. Simple Example Node

Accompanying this getting started guide is the LAE graph “ExampleJavaNodes”. Within this graph, there are two composite nodes. For now, consider only the “SumExample” java node contained within the “Simple” composite.

This example node is trivial in its operation. It simply takes any number of input pins and sums together all of the fields with the name specified in the parameter InputFieldName. In the example, this is populated with the field name “id”. If any of the inputs do not contain this input field, then the node will fail. Otherwise, the node will continue reading records & adding the values for each record until it has consumed all records from all inputs.

The node is required to have one and only one output. For each summed row, the node will output the sum, either as an int or float, depending on the value of the parameter OutputAsFloat. These values will be written to the output field specified by the parameter OutputFieldName.

The node is able to sum over any input type that can be converted to a numeric value. If the input type cannot be converted to a numeric value, then an error is thrown.
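The summing rule described above can be sketched in plain Java, independent of the LAE API. This is only an illustration under stated assumptions, not the SumExample source: the class SumSketch and its methods toNumber, sum and asOutput are hypothetical names, and a List of objects stands in for the field values read from each input.

```java
import java.util.Arrays;
import java.util.List;

/** Stand-alone sketch of the summing rule; not the SumExample source. */
final class SumSketch {

    /** Converts a field value to a double, or fails if it is not numeric. */
    static double toNumber(Object value) {
        if (value instanceof Number) {
            return ((Number) value).doubleValue();
        }
        if (value instanceof String) {
            try {
                return Double.parseDouble(((String) value).trim());
            } catch (NumberFormatException ex) {
                throw new IllegalArgumentException("Not numeric: " + value, ex);
            }
        }
        throw new IllegalArgumentException("Not numeric: " + value);
    }

    /** Sums one value per input, as for one row read from every input. */
    static double sum(List<?> values) {
        double total = 0.0;
        for (Object value : values) {
            total += toNumber(value);
        }
        return total;
    }

    /** Chooses the output representation, in the way OutputAsFloat does. */
    static Object asOutput(double sum, boolean outputAsFloat) {
        if (outputAsFloat) {
            return Double.valueOf(sum);
        }
        return Integer.valueOf((int) sum);
    }

    public static void main(String[] args) {
        double total = sum(Arrays.asList(1, "2", 3.5));
        System.out.println(asOutput(total, true));
        System.out.println(asOutput(total, false));
    }
}
```

An explicit if/else is used in asOutput because a conditional expression mixing Double and Integer operands would promote both branches to a double.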
Bear in mind that since this node example is particularly simple, it would not be a good candidate to implement as a java node, as standard BRAINScript alternatives could be used.

The following sections will describe how the code within the JavaCode parameter should be structured, and how to write JavaCode sections, using this node as an example.

3. Getting Node Parameters

In general, the run time property name approach should be used where possible.

The java node would be fairly useless if there was no way to access the parameters defined on the node within the node code. It is only through defining node parameters and using them in the node code that java nodes can be generic and reusable. As with the python node, there are two mechanisms for accessing parameters within a java node: using BRE’s textual substitution, and using the run time property names.

3.1. Using the Run Time Property Names

When properties are retrieved using their Run Time Property Name, all of the required parameters should be obtained & verified within the setup method (described in section 6.2). Whenever obtaining or setting properties using this method, PropertyExceptions can be thrown. Therefore it is important to handle this exception as described in section 7.

LAE users who are used to writing python nodes will be familiar with declaring run time property names for parameters and accessing the parameters using the run time property name within the python code. The same approach is used to obtain parameters within the java code.

Examine the SumExample node within the Simple composite in the example graph provided. Notice that the JavaCode within the node defines a method propertyBase, as shown below.

    public String propertyBase() {
        return "ls.brain.node.sumExample";
    }

Then examine the Parameter Declarations on the node, shown below.

Figure 1 Parameters defined on the example node.
Each of these parameters has a Run Time Property Name format of:

    ls.brain.node.sumExample.<paramName>

Where <paramName> is simply some specific name for the parameter (e.g. inputFieldName). With this in mind, variables are defined within the JavaCode class in order to store these properties, as shown below:

    /**
     * Specifies whether the output should be written as a floating point, or
     * integer
     */
    private boolean m_outputAsFloat;

    /**
     * The name of the field which is to be summed
     */
    private String m_inputFieldName;

    /**
     * The name of the field to be output.
     */
    private String m_outputFieldName;

Then, within the setup method of the node, these properties are loaded into the variables as shown below:

    try {
        // get the required properties
        m_outputAsFloat = properties().getBoolean(propertyBase() + ".outputFloat");
        m_inputFieldName = properties().getString(propertyBase() + ".inputFieldName");
        m_outputFieldName = properties().getString(propertyBase() + ".outputFieldName");
    } catch (PropertyException ex) {
        logger().error(ex, Logger.CHAIN_END, "Error reading node properties.");
        throw fail();
    }

Consider the first property that is read in the above code block. This code simply obtains the Properties object via the properties() accessor defined on the Node interface, as can be seen in the javadoc API provided. The code then attempts to access the boolean property called:

    propertyBase() + ".outputFloat"

From the propertyBase() method, this corresponds to the run time property name:

    ls.brain.node.sumExample.outputFloat

Furthermore, from Figure 1 this run time property name is used by the parameter OutputAsFloat.
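The lookup described above can be sketched without the LAE API. In the sketch below a Map stands in for the node's Properties object, and PropertyLookupSketch, its set method and MissingPropertyException are hypothetical names introduced only for illustration (the real API raises PropertyException). It shows how a property base plus a suffix forms the run time property name, and how a missing value surfaces as a checked exception that setup would need to handle.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: a Map stands in for the node's Properties object. */
final class PropertyLookupSketch {

    /** Hypothetical checked exception, standing in for PropertyException. */
    static final class MissingPropertyException extends Exception {
        MissingPropertyException(String name) {
            super("No value for run time property: " + name);
        }
    }

    private final Map<String, String> values = new HashMap<>();

    /** Plays the same role as propertyBase() in the SumExample node. */
    static String propertyBase() {
        return "ls.brain.node.sumExample";
    }

    /** Hypothetical helper that registers a value under its run time name. */
    void set(String runTimeName, String value) {
        values.put(runTimeName, value);
    }

    String getString(String runTimeName) throws MissingPropertyException {
        String value = values.get(runTimeName);
        if (value == null) {
            throw new MissingPropertyException(runTimeName);
        }
        return value;
    }

    boolean getBoolean(String runTimeName) throws MissingPropertyException {
        return Boolean.parseBoolean(getString(runTimeName));
    }

    public static void main(String[] args) throws MissingPropertyException {
        PropertyLookupSketch props = new PropertyLookupSketch();
        props.set(propertyBase() + ".outputFloat", "true");
        System.out.println(props.getBoolean(propertyBase() + ".outputFloat"));
    }
}
```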
So, with all of this put together, the line:

    m_outputAsFloat = properties().getBoolean(propertyBase() + ".outputFloat");

simply states: “Obtain the Boolean parameter OutputAsFloat and store it in the variable m_outputAsFloat”.

There are a variety of different methods for accessing the different property types that can be defined on a node. Therefore the Properties javadoc API should be consulted when determining how to read the particular property you are interested in.

3.2. Textual Substitution of Parameters

Textual substitution is probably the easiest method of accessing parameters within the java code. Existing LAE users should be familiar with how textual substitution can be used, via the {{^parameterName^}} syntax. As this is a general LAE concept it will not be discussed here.

Within the JavaCode itself, textual substitution should be used in places where it is not possible to use the runtime property name to obtain the property value. Wherever the value needs to be directly substituted into the java code, textual substitution should be used. If, however, the code can be written to obtain the runtime property value as shown in the previous section, and this can then be stored in some variable, then the runtime property name approach is preferable as it allows greater control of errors that can occur during property evaluation.

An example of the use of textual substitution of parameters occurs when substituting the name of the class, and the name of the package within which it lies, as shown below:

    package {{^JavaPackage^}};
    …
    public class {{^JavaClass^}} extends SimplifiedNode

In these cases, it would not be possible to read the value of the JavaPackage or JavaClass parameter into a variable for use in the java code, therefore the textual substitution approach is used.

4. Handling Node Input

This section details how the java node code can be written to handle the node inputs.

4.1. Opening Inputs

The default JavaCode stub in the Java Node is an implementation which extends SimplifiedNode [1]. When an implementing class that extends SimplifiedNode is used, the inputs are already opened and no work needs to be done in the Java Node code.

4.2. Finding Input Fields

Often it is necessary to locate a specific field within an input. In these cases it is necessary to find the index of the field within the input metadata, using the name of the field to search.

In the case of the SumExample node, we need to find the field with the name specified in the InputFieldName parameter. As seen in section 3.1, this was read into the variable m_inputFieldName. Therefore, the following code within the SumExample node is used to locate this field within the ith input, and error if the field does not exist on this input:

    int idx = input(i).metadata().find(m_inputFieldName);
    if (idx == -1) {
        logger().error(Logger.CHAIN_END,
            "Unable to find required field (" + m_inputFieldName + ") on input (" + i + "): \"" + input(i).name() + "\"");
        throw fail();
    }

Ignoring the logging & exception handling for the moment (this is described later in section 7), this code simply obtains the metadata for the ith input, then searches this metadata for the field with the name m_inputFieldName. If the field cannot be found in the metadata, then a result of -1 is returned. In this case, the node defines that it will error.

Similar to the process of finding input fields using a field name, the index of an input or output can be found using the findInput and findOutput methods respectively, as defined on the Node interface.
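As a rough illustration of the lookup described above, the following plain-Java sketch resolves a field name to an index once per input and fails when the field is missing. It does not use the LAE API: lists of field names stand in for the input metadata, and FieldIndexSketch, find and resolveIndices are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch only: lists of field names stand in for input metadata. */
final class FieldIndexSketch {

    /** Returns the position of the named field, or -1 when it is absent. */
    static int find(List<String> fieldNames, String name) {
        return fieldNames.indexOf(name);
    }

    /** Resolves the field once per input so records need no further searching. */
    static int[] resolveIndices(List<List<String>> inputs, String name) {
        int[] indices = new int[inputs.size()];
        for (int i = 0; i < inputs.size(); i++) {
            int idx = find(inputs.get(i), name);
            if (idx == -1) {
                throw new IllegalStateException(
                        "Unable to find required field (" + name + ") on input (" + i + ")");
            }
            indices[i] = idx;
        }
        return indices;
    }

    public static void main(String[] args) {
        List<List<String>> inputs = Arrays.asList(
                Arrays.asList("id", "name"),
                Arrays.asList("name", "city", "id"));
        // The same field sits at a different position in each input.
        System.out.println(Arrays.toString(resolveIndices(inputs, "id")));
    }
}
```

The per-input array matters because nothing guarantees that a field occupies the same position in every input.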
In general, if the field will need to be read from multiple records from the same input, then it is good practice to store the field index (idx in the above code), such that it can be used without re-searching the record metadata each time. If this is required, then it should be performed in the setup method of the node. In the SumExample node, the indices for the InputFieldName field in each of the inputs are stored into an array, which is subsequently used when reading the records.

[1] It is expected that all customer-written java nodes will extend SimplifiedNode. Currently all nodes that extend the java node provided by MDA also extend the SimplifiedNode. The only reason this would not be used is if there were specific requirements for non-blocking I/O.

4.3. Reading Records

Record processing should always be performed in the processAll method (described in section 6.3).

Processing records is relatively straightforward, as shown by the code within the processAll method in the SumExample node. In this example, we want to read records from each input that still has records remaining, until all inputs have been completely read. Therefore, the following code is used:

    Record record = read(i);
    if (record != RecordInput.EOF) {
        …
    }

This simply reads the next available record from the ith input. If no more records are available from this input, then RecordInput.EOF will be returned. Therefore, if we only had one input and simply wanted to continue processing records until this input had no more records to read, the following could be used instead:

    Record record = null;
    while ((record = read(0)) != RecordInput.EOF) {
        // process record
    }

Once the Record has been obtained, and we have verified that this Record isn’t simply an end of input indicator, then it is straightforward to get a field from this record.
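The multi-input case, reading from every input that still has records until all of them are exhausted, can be sketched as follows. This is a stand-alone illustration rather than LAE code: queues stand in for the node inputs, a private sentinel object stands in for RecordInput.EOF, and ReadLoopSketch, read and readAll are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Sketch only: queues stand in for node inputs, EOF for RecordInput.EOF. */
final class ReadLoopSketch {

    /** Sentinel returned when an input has no more records. */
    static final Object EOF = new Object();

    /** Returns the next record from an input, or EOF once it is exhausted. */
    static Object read(Deque<Object> input) {
        Object next = input.poll();
        return next == null ? EOF : next;
    }

    /** Visits the inputs in turn until every one of them has returned EOF. */
    static List<Object> readAll(List<Deque<Object>> inputs) {
        List<Object> seen = new ArrayList<>();
        boolean anyRemaining = true;
        while (anyRemaining) {
            anyRemaining = false;
            for (Deque<Object> input : inputs) {
                Object record = read(input);
                if (record != EOF) {
                    seen.add(record);
                    anyRemaining = true;
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Deque<Object>> inputs = new ArrayList<>();
        inputs.add(new ArrayDeque<Object>(Arrays.asList("a1", "a2", "a3")));
        inputs.add(new ArrayDeque<Object>(Arrays.asList("b1")));
        System.out.println(readAll(inputs)); // prints [a1, b1, a2, a3]
    }
}
```

The sentinel is compared by identity (!=), mirroring the comparison against RecordInput.EOF in the snippets above.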
In order to obtain the first field defined on a record, the following can be used:

    Object field = record.field(0);

However, since field ordering is not guaranteed, it is better to use the name of the field and obtain the index of that field in the metadata. To get the field named “Foo” from the first input, the following can be used:

    int index = input(0).metadata().find("Foo");
    Object field = record.field(index);

As mentioned in the previous section, if this is to be done repeatedly, the index should be stored in a variable so that the metadata does not need to be searched every time. This is exactly what has been done in the SumExample node. There, in the setup method, the index for the {{^InputFieldName^}} field for the ith input has been stored in the m_indices array, at position i. The code snippet shown below is used within the SumExample node to obtain the {{^InputFieldName^}} field from the ith input.

    record.field(m_indices.get(i))

Note that the record.field methods will always return a java.lang.Object. Therefore, in order to perform useful processing with the returned field, it will most likely be necessary to cast the field to a different type. When performing these operations, it is advisable to ensure that the metadata on the input is correct for the type you are casting to. For instance, if we want to ensure that the field “Foo” in input 0 contains integer data, then the following code could be used to ensure that this is the case:

    int index = input(0).metadata().find("Foo");
    FieldMetadata fieldMd = input(0).metadata().field(index);
    Class<?> clazz = fieldMd.type();
    if (!java.lang.Integer.class.isAssignableFrom(clazz)) {
        // error
    }

Also note that the above code will only handle Integer types, and will still error for other types such as Byte, Long, etc.

4.4. Closing Inputs

In general, inputs should be closed within the cleanup method (described in section 6.4).
Similarly, the setup method should ensure that if it fails, all inputs have subsequently been closed, as the cleanup method will not be called. Note that closing an input can cause an IOException to be thrown. Therefore it is important to handle this exception as described in section 7. A helper method cleanupIo can be used to close all of the open inputs & outputs and handle any IOExceptions that might be thrown, logging them correctly.

Any inputs that have been opened during the setup method should in general be closed in the cleanup method. Closing an input is a very simple operation, and the following code (as seen in the SumExample node) will close the ith input (0-indexed):

    input(i).close(false);

The boolean parameter provided to the close method specifies whether or not the input is being closed due to an error.

The close method shown above can throw IOExceptions. In order to simplify the cleanup of all I/O, it is recommended that the cleanup method simply calls cleanupIo. This is the default implementation of cleanup provided in the JavaCode stub in the Java Node. This will close any open inputs & outputs, correctly log any IOExceptions that get thrown, and error the node.

5. Handling Node Output

This section details how the java node can be written to handle the node outputs.

5.1. Setting Output Metadata

Wherever possible, outputs should have their metadata set in the setup method (described in section 6.2). Where the metadata is dependent on data read from input records, then the metadata should be set in the processAll method.

While the first operation performed on node inputs is normally to open them, the metadata first needs to be defined on an output before the output is opened.

5.1.1. Constructing Metadata from Scratch

In the “SumExample” node, the output record metadata is constructed & set as part of the setup method. Here, the OutputAsFloat and OutputFieldName parameters are used to determine the output metadata. When OutputAsFloat is set to true, the OutputFieldName field is set to be of a floating point type. Otherwise, the output metadata is set up with an integer type, as shown in the following code:

    // Setup the output metadata according to the properties.
    Class<?> outputType = null;
    if (m_outputAsFloat)
        outputType = java.lang.Float.class;
    else
        outputType = java.lang.Integer.class;
    RecordMetadata metadata = output(0).newMetadata();
    metadata.add(new SimpleFieldMetadata(m_outputFieldName, outputType));
    output(0).metadata(metadata);

The newMetadata call on the RecordOutput constructs a new RecordMetadata object. This is then populated with new FieldMetadata objects (in this case, SimpleFieldMetadata is used). Once all of the required field metadata has been added to the RecordMetadata, the metadata can be set on the RecordOutput. Following this, no additional field metadata can be added to the RecordMetadata object.

5.1.2. Reusing Metadata

While the previous approach allows for full control of all of the fields in the output metadata, it may be the case that the output metadata should simply be the same as the metadata on an input. In this case, the RecordMetadata.copyFrom method can be used.
The code below shows how this can be done for setting the metadata for the first output to the same as the metadata for the first input:

    // Setup the output metadata
    RecordMetadata metadata = output(0).newMetadata();
    metadata.copyFrom(input(0).metadata());
    output(0).metadata(metadata);

Similarly, if the metadata is to be used on multiple outputs, this can be done using the following:

    RecordMetadata metadata0 = output(0).newMetadata();
    // construct the metadata for output 0 here
    …
    output(0).metadata(metadata0);
    RecordMetadata metadata1 = output(1).newMetadata();
    metadata1.copyFrom(metadata0);
    output(1).metadata(metadata1);

It is important to note that the same RecordMetadata object cannot be used on multiple outputs. This means that the following code is incorrect:

    // Setup the output metadata
    RecordMetadata metadata = output(0).newMetadata();
    // Assigning this once is fine
    output(0).metadata(metadata);
    // The following code is incorrect.
    // RecordMetadata instances should not be shared.
    output(1).metadata(metadata);

Rather, a different RecordMetadata object needs to be constructed for each output, as shown in the previous examples.

5.2. Opening Outputs

Wherever possible, outputs should be opened in the setup method (described in section 6.2). Where the metadata is dependent on data read from input records, then the outputs will need to be opened in the processAll method (described in section 6.3).

Once the metadata has been set on an output, the output can be opened. The output must be opened prior to attempting to write to it.
Opening an output is a very simple operation, and the following code (as seen in the SumExample node) will open the first output:

    openOutput(0);

If multiple outputs are being used, and all need to be opened at the same time (after the metadata has been set on each output), then the following code can be used to simply open all of the outputs:

    openOutputs();

5.3. Writing Records

Record writing should generally be performed in the processAll method (described in section 6.3). Records can only be written to an output after the output has been opened.

It is a relatively straightforward process to write records within a java node. First, a new record can be obtained from the output metadata. Then, on the returned record, each of the fields can be populated prior to writing the record to the output. The following code shows how this is performed within the SumExample node to write a simple record to the first output with one field set (where the variable sum is defined to be a double).

    Record record = output(0).metadata().newRecord();
    if (m_outputAsFloat)
        record.field(0, sum);
    else
        record.field(0, (int) sum);
    writeRecord(0, record);

Each field which is not set on a record prior to the record being written will appear as NULL in the brdViewer. For instance, if in the above example the first output was defined with metadata containing 2 fields, the second field would be left as NULL.

5.4. Closing Outputs

In general, outputs should be closed within the cleanup method (described in section 6.4). Similarly, the setup method should ensure that if it fails, any outputs it has opened have subsequently been closed, as the cleanup method will not be called. Note that closing an output can cause an IOException to be thrown. Therefore it is important to handle this exception as described in section 7.
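The general pattern of closing several resources while handling an IOException from each can be sketched in plain Java. This is not the implementation of cleanupIo, which is not shown in this guide: java.io.Closeable stands in for node inputs and outputs, and CloseAllSketch and closeAll are hypothetical names. The point illustrated is that a failure to close one resource should not prevent the remaining resources from being closed.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch only: Closeable stands in for node inputs and outputs. */
final class CloseAllSketch {

    /**
     * Attempts to close every resource, even if an earlier close fails.
     * Returns the messages of the failures so the caller can log them.
     */
    static List<String> closeAll(List<? extends Closeable> resources) {
        List<String> failures = new ArrayList<>();
        for (Closeable resource : resources) {
            try {
                resource.close();
            } catch (IOException ex) {
                failures.add(ex.getMessage());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Closeable ok = () -> { };
        Closeable broken = () -> { throw new IOException("disk full"); };
        // The third resource is still closed although the second one fails.
        System.out.println(closeAll(Arrays.asList(ok, broken, ok)));
    }
}
```

In a real node the collected failures would be written to the log and the node marked as failed, as described in section 7.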
Any outputs that have been opened during the setup method should in general be closed in the cleanup method. Closing an output is a very simple operation, and the following code (as seen in the SumExample node) will close the first output:

    output(0).close(false);

The boolean parameter provided to the close method specifies whether or not the output is being closed due to an error.

The close method shown above can throw IOExceptions. In order to simplify the cleanup of all I/O, it is recommended that the cleanup method simply calls cleanupIo. This is the default implementation of cleanup provided in the JavaCode stub in the Java Node. This will close any open inputs & outputs, correctly log any IOExceptions that get thrown, and error the node.

6. Node Process Flow

While there are some exceptions (see the footnote in section 4.1), the code within the JavaCode parameter should generally be a class which extends the com.lavastorm.brain.node.SimplifiedNode class. This class should have no constructor. If there is some pressing reason to have a constructor (rather than simply performing initialisation in setup as recommended), then there must be a no-argument constructor.

The code skeleton provided with the base java node is already defined to extend this SimplifiedNode class. The LAE server then knows that it has a Node to execute, and will call the methods defined on the Node interface (implemented by the SimplifiedNode) in order to execute the node.

Looking at the JavaCode in the simple example node provided, the following line ensures that this class inherits from SimplifiedNode (and, through it, from AbstractJavaNode):

    public class {{^JavaClass^}} extends SimplifiedNode

The Node interface is provided in the java node API.
The process flow of the node is outlined in the following sections, and follows the path:

    create -> setup -> processAll -> cleanup -> destroy

This path is always followed, assuming that no exceptions are thrown from the code within these methods, and assuming that the node status is never set to failed in any of these methods. Additional node states for controlling downstream processing (outside of failure & success) are an advanced topic and are described in section 9.2.

6.1. Create

The create method defined on the Node interface is implemented in the SimplifiedNode and does not need to be implemented in the user-defined class in the JavaCode parameter. The create method in SimplifiedNode guarantees that by the time setup is called, all inputs have been opened.

6.2. Setup

The setup method is where all of the required setup for the node should be performed. This generally involves:

• Reading any required properties
• Setting up output metadata (if that can be performed without needing to read input records)
• Opening outputs (if the output metadata can be set up)
• Performing any other initialisation required

As stated in the API, if any errors occur during setup, the user defined code should catch these errors, log them appropriately, indicate that the node has failed via calling the fail() method, then throw a NodeFailedException, or return from the method normally.

• If this method does not throw an exception:
    o If the node fail() method was not called, then:
        - processAll, cleanup & destroy will all be called.
        - The cleanup method should be valid to be called to clean up any resources allocated during the setup method, regardless of whether or not processAll fails.
    o Else:
        - Execution of the node will be terminated; processAll and cleanup will not be called, destroy will be called.
o Otherwise, If the method throws a RuntimeException The node controller will mark the node as failed - regardless of whether or not fail() has been called within the node code. Node execution will be terminated – processAll and cleanup will not be called, destroy will be called. Else (if the method throws a NodeFailedException) The node status will not be changed by the node controller. If this has not been set to fail, then the node will not be marked as failed. Execution of the node will be terminated, proceessAll and cleanup will not be called, destroy will be called. Importantly, this means that the NodeFailedException does not signal to the node controller that the node status should be set to failed. Rather, the node controller assumes that the node has correctly set its own status. The NodeFailedException does signal to the node controller that further processing should be aborted. In general, when an error occurs within the node, and is caught by the node code, the easiest mechanism for signalling an error and aborting processing is to call: throw fail(); LAE Java Node Getting Started Guide Martin Dawes Analytics© 2011 | www.mda-data.com Page 19 6.3. Process All The processAll method is to contain the main node execution operation. This will involve the reading and writing of records, and any business logic that needs to be applied to these records. As stated in the API, if any errors occur during processAll, the user defined code should catch these errors, log them appropriately, indicate that the node has failed via calling the fail() method, then throw a NodeFailedException, or return from the method normally. o If this method returns normally, and the node fail() method was called, then: The node will be marked as failed o Otherwise, If the method throws a RuntimeException The node will be marked as failed o In all cases, if processAll is called, cleanup & destroy will be called. 
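
The setup and processAll rules above can be modelled in a few lines of plain Java. The following is a minimal sketch and not the LAE implementation: LifecycleSketch, SketchNode and the stand-in NodeFailedException class are hypothetical names written for this illustration, and the controller logic is an assumption reconstructed only from the rules listed in sections 6.2 and 6.3. The guide does not state how exceptions thrown from cleanup are treated, so the sketch assumes they follow the processAll rules.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the LAE exception of the same name (hypothetical, illustration only). */
class NodeFailedException extends Exception {
    private static final long serialVersionUID = 1L;
}

/** Hypothetical miniature node: it records its own failed status, as fail() does in the guide. */
abstract class SketchNode {
    boolean failed = false;

    /** Marks the node as failed and returns an exception that the caller may throw. */
    NodeFailedException fail() {
        failed = true;
        return new NodeFailedException();
    }

    abstract void setup() throws NodeFailedException;

    abstract void processAll() throws NodeFailedException;

    void cleanup() throws NodeFailedException { }
}

/** Hypothetical controller applying the rules from sections 6.2 and 6.3. */
class LifecycleSketch {

    /** Runs one node; returns the methods that were called, followed by the final status. */
    static List<String> run(SketchNode node) {
        List<String> trace = new ArrayList<>();
        trace.add("create");
        trace.add("setup");
        boolean proceed;
        try {
            node.setup();
            proceed = !node.failed;   // normal return: continue only if fail() was not called
        } catch (NodeFailedException e) {
            proceed = false;          // abort, but leave the status exactly as the node set it
        } catch (RuntimeException e) {
            node.failed = true;       // the controller marks the node as failed
            proceed = false;
        }
        if (proceed) {
            trace.add("processAll");
            try {
                node.processAll();
            } catch (NodeFailedException e) {
                // status is left unchanged
            } catch (RuntimeException e) {
                node.failed = true;
            }
            trace.add("cleanup");     // cleanup always follows processAll
            try {
                node.cleanup();
            } catch (NodeFailedException e) {
                // assumption: treated like processAll, status left unchanged
            } catch (RuntimeException e) {
                node.failed = true;   // assumption: treated like processAll
            }
        }
        trace.add("destroy");         // destroy is called on every path
        trace.add(node.failed ? "FAILED" : "OK");
        return trace;
    }
}
```

Passing in a node whose setup throws a NodeFailedException without calling fail() produces the trace create, setup, destroy, OK, whereas a setup that calls throw fail() produces create, setup, destroy, FAILED. The sketch makes the division of responsibility visible: the exception only stops processing, and the status comes from fail().
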
Note that if processAll throws a NodeFailedException but fail() has not been called, the node will not be marked as failed.

6.4. Cleanup

The cleanup method finishes execution of the node and cleans up internal resources allocated during the setup and processAll methods. This method will always be called if setup completed successfully. After this method is called, calls to the input, output, properties and log accessors should still be valid. The destroy method will always be called after this method.

While there may be some additional code required in the cleanup method to handle the closing of external files, database connections, sockets etc., the following code should almost always be placed into the cleanup method to ensure that the node's inputs and outputs are cleaned up correctly:

    cleanupIo();

6.5. Destroy

The destroy method defined on the Node interface is implemented in the AbstractJavaNode (parent class of the SimplifiedNode) and does not need to be implemented in the user-defined class in the JavaCode parameter.

7. Logging & Error Handling Guidelines

The Node interface also defines an accessor to a Logger object. The Logger provides utility methods for writing to the log with varying LogLevels, which are, in order of increasing severity:

• DEBUG
• INFO
• WARN
• ERROR

The different log levels can be used by calling the logger().log(LogLevel level, …) method. Alternatively, the logger().debug, logger().info, logger().warn and logger().error methods can be used. These log levels are outlined in the table below.

LogLevel.DEBUG
  When to use: Used when logging information unrelated to any errors, just debugging information which may be useful to identify what is going on in the processing logic. The contents of these debug statements may only make sense to the node developer for troubleshooting purposes. In general, these should not be internationalized.
  Usage example:

    logger().debug("Beginning to process records");
    Record record = null;
    while ((record = read(0)) != RecordInput.EOF) {
        // process record
    }
    logger().debug("Finished processing records");

LogLevel.INFO
  When to use: Generally useful if an error occurs but is handled, or if an exception is being thrown that is part of the method contract (except for NodeFailedExceptions, which should generally only be thrown after a LogLevel.ERROR message has been logged).
  Usage example:

    public void myMethod() throws IOException {
        // …
        try {
            readFile();
        } catch (IOException ioe) {
            logger().info(ioe, "IOException occurred reading file");
            throw ioe;
        }
    }

LogLevel.WARN
  When to use: Generally used when an error occurs, but the code is able to make some assumptions and continue. This is possible or likely to lead to an error later.
  Usage example:

    File dir = new File(dirName);
    boolean madePath = dir.mkdirs();
    if (!madePath) {
        String msg = "First choice path (" + dir.getName() + ") failed. ";
        msg += "proceeding to backup location. ";
        msg += "This may affect downstream processing.";
        logger().warn(msg);
        dir = new File(backupDirPath);
        // …
    }

LogLevel.ERROR
  When to use: Used when the node is about to fail.
  Usage example:

    int inputIdx = findInput("Data");
    if (inputIdx == -1) {
        logger().error("Expected node input \"Data\" does not exist");
        throw fail();
    }

The methods inherited by the Node that the node developer is expected to write (setup, processAll and cleanup) only declare that they throw NodeFailedExceptions. When a NodeFailedException is thrown from the body of one of these methods, the LAE will assume that the node has written all of the required information to the log, so it will not automatically log any error information.
Therefore, it is up to the node developer to ensure that when any exception [2] is thrown from within one of these methods, one of the following occurs:

• The exception is caught and handled, and processing can continue.
• The exception is caught and logged, the node status is set to failed (by calling the fail() method), and either:
  o A NodeFailedException is thrown, or
  o The node returns normally.
• The exception is caught and logged, the node status is set to an appropriate non-failure state based on the exception (see section 9.2 for information on different node signalling states), and either:
  o A NodeFailedException is thrown, or
  o The node returns normally.

[2] Technically, only checked exceptions (i.e. Exceptions that aren't RuntimeExceptions) need to be handled. However, it is generally good practice to also handle any RuntimeException where you are able to provide more information about the reason behind the error than would be available if the error simply bubbled up for the LAE to handle without context.

The SumExample node contains a number of examples of using the Logger object provided on the Node interface. In order to make life easier for the node developer, the fail() method returns a NodeFailedException. An example of where this is used in the SumExample is shown below. This code, within the setup method, attempts to obtain a series of property values.

    try {
        // get the required properties
        m_outputAsFloat = properties().getBoolean(
            propertyBase() + ".outputFloat");
        m_inputColumnName = properties().getString(
            propertyBase() + ".inputColumnName");
        m_outputColumnName = properties().getString(
            propertyBase() + ".outputColumnName");
    } catch (PropertyException ex) {
        logger().error(ex, Logger.CHAIN_END, "Error reading node properties.");
        throw fail();
    }
If a PropertyException occurs during this attempt, the code first obtains the Logger using the logger() accessor and writes an error to the log. The Logger.CHAIN_END argument simply tells the log that it shouldn't expect any more error messages related to this error, and to end the error chain. This can be left off the call to error without ill effects. Following this, the code sets the state of the node to failed, using the fail() method defined on the Node interface, and throws the returned NodeFailedException.

8. Recommendations

8.1. Parameter Visibility

There are several reasons for writing java nodes. Sometimes this is to simply implement some complex functionality, and allow this functionality to be configured by some node parameters. If this is the case, then generally the java node should be made into a library node, and all of the java-specific parameters should be hidden from the node user. So, after appropriately documenting the node (not just with code comments), the following parameters should be hidden:

• NodeClass
• Classpaths
• JavaPackage
• JavaClass
• CompileJavaClass
• JavaCode

These can be hidden by opening the node, navigating to Declare Parameters -> Inherited Parameter Group Overrides, and then setting Group to Hide.

There are times however, as discussed in section 9.4, where the node user will be expected to provide their own java code. Such cases are relatively advanced topics, and in general a little more thought needs to go into exactly which parameters should be exposed to or hidden from the node user.

8.2. Parameter Validation

If a parameter is required, then set it to be Not Blank in the parameter declarations.
Even though you can check the runtime property within the java code and then log appropriate exceptions, a violation of the Not Blank setting is caught a lot earlier, rather than only at runtime. Also, there is always the possibility that future BRE versions may use the Not Blank setting to provide better visual clues to the user as to required fields.

8.3. Property Base & Run Time Property Names

It is a good idea to set the property base to reflect the node itself. For example, the property base on the SumExample node is "ls.brain.node.sumExample".

Experienced LAE users will be aware that the structure of the node parameter hierarchy is important. This is because the property resolution rules use the dot-separated property name as a search hierarchy. If a search is performed for the property "ls.brain.node.sumExample.foo", then the following names are searched in order, and the first property that exists is returned:

1. "ls.brain.node.sumExample.foo"
2. "ls.brain.node.foo"
3. "ls.brain.foo"
4. "ls.foo"
5. "foo"

If none of these properties exists, a PropertyException is thrown.

Therefore, it is good practice to ensure that the node hierarchy is mirrored in the run time property name hierarchy and in the String returned from the propertyBase method, such that a parameter "propertyName" that is declared on a parent node with the correct run time property name hierarchy can be obtained using the line:

    properties().getProperty(propertyBase() + ".propertyName");

8.4. Package, Class & Node Names

It is common sense to give your node a sensible name, as this adds to the self-documenting nature of the graphical BRE display. Similarly, the class name should be changed to reflect the name or purpose of the node.
While this won't necessarily help with documenting the graph, it will help with understanding error logs.

For MDA developed nodes, the JavaPackage parameter should also generally be under the com.lavastorm.brain.node.<libraryName> package. For external customers, if the node is simply a one-off node to be used within a graph, then the default value for the JavaPackage parameter should be left as it is (pgk{{^handle^}}). However, if the node is to be re-used and turned into a library node, then as a general rule of thumb it is good practice to change the node package name to something similar to the com.<companyName>.node.<libraryName> pattern.

The name of the class should be changed in the JavaClass parameter to reflect the node name. In order for these parameter changes to be effective, they still need to be referenced via the textual substitution ({{^^}}) mechanism in the JavaCode parameter, and also in the NodeClass parameter. Clearly, as these are to be inserted as the java package name and java class name respectively, they need to be valid package and class names.

8.5. Code Documentation & Maintenance

Comment any complex logic in the code such that the next person who has to read it stands a chance (and just remember, this could be you in a couple of years' time).

Provide sufficient information to the logs such that errors have sensible and understandable error messages that allow node errors to be easily resolved.

9. Advanced Topics

9.1. Classpath Modifications

In certain cases it may be desirable to modify the classpath on the java node. This can be done for inheritance and interface purposes, or simply to reference external classes and jars that aren't part of the LAE installation. If external classes are required for the java node to run, these can simply be added to the Classpaths parameter in the Parameters 2 tab of the Node Editor window.
Say for example that you had written some utility java code which you wanted to reference in your java node, and that this was packaged into a jar, myUtilityClasses.jar. In order for the java node to be able to reference these classes within the JavaNode code, this jar needs to be placed in a location where the LAE server can access it. Assume this is in the location:

    /usr/home/myUtilityClasses.jar

Then this can be set up in the java node classpath by simply adding this location to the Classpaths parameter as shown below.

Figure 2 Adding an external jar to the java node classpath

9.2. Controlling Downstream Processing

Examine the node "ErrorAndSignallingExample" in the "Advanced" composite in the ExampleJavaNodes.brg file provided with this getting started guide. This node has defined 3 parameters (in addition to those inherited from the base Java node). These parameters are:

    Parameter Name      Allowable Options     Description
    ErrorPosition       NONE                  Tells the node code where to fail the node.
                        DURING_SETUP          NONE implies that the node will not fail.
                        DURING_PROCESS_ALL
                        DURING_CLEANUP
    OutputNodeSignal    NONE                  Specifies the value to set on
                        DIRECT_DEPENDENTS     AbstractJavaNode.outputSignalMode
    ClockedNodeSignal   NONE                  Specifies the value to set on
                        DIRECT_DEPENDENTS     AbstractJavaNode.clockSignalMode
                        ALL_DEPENDENTS

Essentially, this is an example node to show what occurs when the node fails at different stages of processing. Alternatively, if the node succeeds, it shows how the different return states can be used to control downstream processing.

If the node status is set to failed (i.e. ErrorPosition != NONE), none of the downstream nodes can be processed correctly. The more interesting case is where ErrorPosition is set to NONE, and the OutputNodeSignal and ClockedNodeSignal parameters are modified.
The piece of code within the node which uses these parameters to modify the downstream processing is the following two simple lines:

    outputSignalMode(m_outputNodeSignal);
    clockSignalMode(m_clockNodeSignal);

The OutputNodeSignal and ClockedNodeSignal parameters have already been read into the variables m_outputNodeSignal and m_clockNodeSignal respectively by this stage of processing.

Modify these node parameters in the example graph and investigate the effects on downstream processing. The following table details what will occur with each combination of OutputNodeSignal and ClockedNodeSignal. For those familiar with the BRAINScript setSuccessReturnCode function, the corresponding returnCode is shown in the left column.

    returnCode  clockSignalMode     outputSignalMode    Behaviour
    200         NONE                NONE                Nothing connected (clock or output) to this node runs.
    202         NONE                DIRECT_DEPENDENTS   Only things connected to the output of this node run; clocked items don't.
    0           DIRECT_DEPENDENTS   DIRECT_DEPENDENTS   Normal.
    201         DIRECT_DEPENDENTS   NONE                Only things connected to the outclock of this node run; things connected to outputs don't.
    203         ALL_DEPENDENTS      NONE                Only things connected to the outclock of this node, and connected to the outclock of nodes connected to the output of this node, run. [3]
    N/A         ALL_DEPENDENTS      DIRECT_DEPENDENTS   Error, not allowed.
    N/A         *                   ALL_DEPENDENTS      Error, not allowed.

Consider the second last row in this table. It may not be immediately obvious why this combination would not be allowed. However, consider that you have three nodes, A, B and C, connected as shown below:

[Figure: nodes A, B and C, with a legend distinguishing output relationships from clock relationships]

In the case where clockSignalMode == ALL_DEPENDENTS && outputSignalMode == DIRECT_DEPENDENTS, if A passes, B will be allowed to execute.
Once B is allowed to execute, then whether or not C runs is entirely dependent on whether B allows it to run, and not dependent on what A specifies. Therefore this combination does not make sense.

[3] Note that due to an existing Gnats issue (2664), this will not work when using the BRE controller.

9.3. Logger Usage

The Logger object returned from the logger accessor on the Node interface provides for more complex logging than simply writing some String message to an error log with an associated error level. The Logger object also caters for node developers writing java nodes that require localization features, including internationalised error messages.

There are three main logging usage patterns within the java node. These are outlined in the following sections.

9.3.1. Simple String-Based Logging

In general, it is expected that most users of the java node will not care about internationalising error messages. Therefore, it is expected that the use of hard or soft coded Strings in log messages will be the most common use case. Examples of the String use case were shown in the table in section 7.

9.3.2. Using the Built-In ErrorCodes

Provided with the LAE is a set of internationalized (although not yet localized) error messages and associated ErrorCodes that can be used with the logger. These relate to errors that will commonly need to be handled on Java Nodes within the LAE. In future releases this set will be further populated as other common errors are identified.

The ErrorCodes are in com.lavastorm.brain.node.ErrorCodes, and are documented in the API provided. In order to use these ErrorCodes within your node, you simply need to reference them.
These are imported by default into the Java Node via the line:

    import static com.lavastorm.brain.node.ErrorCodes.*;

In order to use an ErrorCode, code similar to the following (used in the "Internationalized SumExample" node) can be used:

    try {
        // get the required properties
        m_outputAsFloat = properties().getBoolean(
            propertyBase() + ".outputFloat");
        m_inputFieldName = properties().getString(
            propertyBase() + ".inputFieldName");
        m_outputFieldName = properties().getString(
            propertyBase() + ".outputFieldName");
    } catch (PropertyException ex) {
        logger().error(ex, Logger.CHAIN_END, ERROR_RETRIEVING_PROPERTIES);
        throw fail();
    }

Note that many of the error messages require arguments. An example of an ErrorCode that takes arguments is also found in the "Internationalized SumExample" node, as shown below:

    private void setupOutputs() throws NodeFailedException {
        // Check that the output is correct.
        if (numOutputs() != 1) {
            logger().error(UNEXPECTED_NUMBER_OF_OUTPUTS, 1, numOutputs());
            throw fail();
        }
    }

Examining the ErrorCodes API shows that the UNEXPECTED_NUMBER_OF_OUTPUTS ErrorCode takes two arguments: the number of outputs that were expected (1 in this case), and the number of outputs on the node (numOutputs()).

9.3.3. In-Node ErrorCodes & Error Messages

There may be cases where a Java Node is going to be used in different locales, in which case internationalized error messages may be desirable. Where possible, the built-in error codes described in the previous section should be used. However, these only cover generic errors that could occur on any java node.

Node specific error messages can be internationalised within the java node itself. In order to achieve this, the node developer will need to use message keys to look up error messages in resource bundles. These resource bundles can be constructed as node parameters with a specific run time property name format.
For each locale, the error messages need to go into a property which has the format of a java ResourceBundle, simply containing key, value pairs separated by an "=" sign. In any place where arguments are to be supplied to augment the message, placeholders can be written into the resource bundle using {0}, {1}, … etc. markers.

For each locale and error message bundle combination, there needs to be one of these resource bundle parameters, and a corresponding bundle name parameter within the node. Each of these two parameters needs to have its own run time property name format: one for the bundle contents, and "ls.brain.node.java.resources.<something>.name" for the bundle name.

The format of the contents parameter has already been briefly described in the previous paragraph.

The value of the "ls.brain.node.java.resources.<something>.name" parameter specifies the name of the resource bundle. This needs to adhere to the localization naming guidelines as described within the ResourceBundle documentation in the Java API:

http://download.oracle.com/javase/6/docs/api/java/util/ResourceBundle.html

This essentially states that the name of the resource bundle needs to consist of alphanumeric sections separated by periods. If the set of error messages (or "content" parameter) to which the name applies is the default bundle, then this is all that should be supplied in the resource name. Otherwise, it should have a "_<language>[_<region>]" suffix. For example, if the set of error messages to which the name applies is to be used as the default bundle in a French speaking locale, then the name would end with "_fr". If it is the bundle to use in a French speaking part of Canada, then the name should end with the "_fr_ca" suffix.

It is a good idea therefore to separate out the base part of the resource bundle from the localized part.
If this was placed into a parameter "MessageBundleBase", this would mean that the ls.brain.node.java.resources.<something>.name parameter for the resource bundle for the default French locale would be:

    {{^MessageBundleBase^}}_fr

The bundle names are still a little more complicated. The value of the ls.brain.node.java.resources.<something>.name parameter is not actually the full bundle name. Rather, if this run time property is bound to the parameter "BundleName", then the full bundle name (including locale) is actually:

    {{^JavaPackage^}}.{{^BundleName^}}

The actual resource bundle contents are taken from the contents parameter and placed into a resource bundle file with a ".properties" extension, in the correct location on the node's classpath (based on the package-like hierarchy). This makes the resource bundles available for use within the Node java code.

In order to use these resources, the easiest method is to construct a SimpleErrorCode that references the messageKey within the bundle. This ErrorCode can then be provided to the Logger when logging messages. The following example shows how all of this can be done in practice.

9.3.3.1. Example

Consider the case of the simple SumExample node introduced previously. Now, consider that this node is to be distributed to and used by multiple users across multiple different locales (although I'm not sure who would want to use it). It would be nice if the error messages could also be localized.

An example of how this can be achieved is shown in the node "Internationalized SumExample", which has been added to the local library. First look at the definition of the library node.
Examine the Parameter Declarations window in the node editor (shown below):

Figure 3 Parameter declarations for internationalised logging

Here it can be seen that two resource bundles are being defined, as there are two sets of run time property names matching the resource bundle formats described above. These are the "germanErrors" and "nodeErrors" parameter sets. Note that within the parameter declarations there is nothing to say that these two parameter sets are in any way related. Notice also that MessageBundleBase and MessagePrefix parameters have been declared.

Now, look at the Parameters 2 tab of the node editor as shown below:

Figure 4 Parameters for internationalised resource bundles

These two message bundles are actually two different localizations of the same message bundle base. The MessageBundleBase parameter is declared as "test.ErrorMessageBundle". Then, the two message bundles are declared as:

    GermanMessageBundleName:  {{^MessageBundleBase^}}_de
    DefaultMessageBundleName: {{^MessageBundleBase^}}

As mentioned previously, these bundle names are only a part of the full message bundle name, which also uses the JavaPackage parameter. When combining the JavaPackage into the bundle name, these become:

    German Bundle Name:  {{^JavaPackage^}}.{{^MessageBundleBase^}}_de
    Default Bundle Name: {{^JavaPackage^}}.{{^MessageBundleBase^}}

This in turn evaluates to:

    German Bundle Name:  com.lavastorm.brain.node.test.ErrorMessageBundle_de
    Default Bundle Name: com.lavastorm.brain.node.test.ErrorMessageBundle

Therefore the resource bundle properties will be constructed as the resource bundle files "ErrorMessageBundle_de.properties" and "ErrorMessageBundle.properties" in the com/lavastorm/brain/node/test/ directory within the node's classpath.

Note also that the MessagePrefix parameter has been used to qualify the messageKeys on each of the error messages described in each of the resource bundles.
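
The naming scheme above follows the standard java.util.ResourceBundle conventions, and can be tried out in isolation. The following is a minimal sketch under stated assumptions, not code from the example graph: BundleNameSketch and all of its methods are hypothetical helpers, the bundles are held in a Map rather than on the classpath so that the example is self-contained, and only the language part of the locale suffix is handled. In a real JVM the lookup is normally done by ResourceBundle.getBundle against ".properties" files on the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.text.MessageFormat;
import java.util.Locale;
import java.util.Map;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

/** Hypothetical helper illustrating bundle naming and lookup; not part of the LAE API. */
class BundleNameSketch {

    /** Full bundle name: package, then the bundle base, then an optional "_<language>" suffix. */
    static String bundleName(String javaPackage, String bundleBase, Locale locale) {
        String language = locale.getLanguage();
        String suffix = language.isEmpty() ? "" : "_" + language;
        return javaPackage + "." + bundleBase + suffix;
    }

    /** Classpath location of the ".properties" file that backs a full bundle name. */
    static String resourcePath(String fullBundleName) {
        return fullBundleName.replace('.', '/') + ".properties";
    }

    /** Parses "key=value" lines, the same format as a resource bundle contents parameter. */
    static ResourceBundle parse(String contents) {
        try {
            return new PropertyResourceBundle(new StringReader(contents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Looks the key up in the locale specific bundle first and falls back to the
     * default bundle. Assumes the default bundle is present and contains the key.
     */
    static String message(Map<String, ResourceBundle> bundles, String javaPackage,
                          String bundleBase, Locale locale, String key, Object... args) {
        ResourceBundle bundle = bundles.get(bundleName(javaPackage, bundleBase, locale));
        if (bundle == null || !bundle.containsKey(key)) {
            bundle = bundles.get(bundleName(javaPackage, bundleBase, Locale.ROOT));
        }
        // Substitute the {0}, {1}, ... placeholders with the supplied arguments.
        return new MessageFormat(bundle.getString(key), Locale.ROOT).format(args);
    }
}
```

With these helpers, bundleName("com.lavastorm.brain.node", "test.ErrorMessageBundle", Locale.GERMAN) yields com.lavastorm.brain.node.test.ErrorMessageBundle_de, and resourcePath of that name yields com/lavastorm/brain/node/test/ErrorMessageBundle_de.properties, matching the names derived above.
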
So, now that all of the required error message bundles have been defined, examine the JavaCode parameter to see how these are actually used.

Within the JavaCode, the ErrorCodes for each of these error messages are defined as constants on the node, as shown below:

    private final static String MESSAGE_BUNDLE =
        "{{^JavaPackage^}}.{{^MessageBundleBaseName^}}";
    private final static String MESSAGE_PREFIX = "{{^MessagePrefix^}}";
    private final static ErrorCode ERROR_PARSING_NUMERIC_VALUE =
        new SimpleErrorCode(MESSAGE_PREFIX + "errorParsingNumericValue",
                            MESSAGE_BUNDLE);

It may be worthwhile to examine the ErrorCode and SimpleErrorCode API provided to ensure that you understand what these definitions are doing.

Once these ErrorCodes have been declared, they can be supplied to the various methods on the Logger that take ErrorCodes. In the simplest case, this is simply providing the ErrorCode itself. In more complicated cases, arguments need to be provided in addition to the ErrorCode, which are then substituted into the error message.

Consider the case below, from the "Internationalized SumExample" node:

    logger().error(nfe, Logger.CHAIN_END, ERROR_PARSING_NUMERIC_VALUE,
                   obj, inputIdx, input(inputIdx).name(), recNum);

In this case, the ERROR_PARSING_NUMERIC_VALUE ErrorCode points to the "ls.brain.node.sumExample.errorParsingNumericValue" message key. Within both the German and default message bundles, this message key corresponds to an error message that takes 4 arguments. The default version is shown below:

    {{^MessagePrefix^}}errorParsingNumericValue=Error attempting to parse {0} as a numeric value. Error occurred on input ({1}): "{2}" on record {3}.
Therefore, the logger().error call ends up with the following substitutions performed:

    Argument  Value
    {0}       obj
    {1}       inputIdx
    {2}       input(inputIdx).name()
    {3}       recNum

The node "Internationalized SumExample - Failure" under the path "Advanced -> Internationalized Logging" shows a case where this error message gets used.

9.4. Adding Extra Code Blocks

In some circumstances, it may be impractical to write all of the java code required into the one code block provided in the Java Node. This can occur because a large amount of code is required, and it makes more sense to have some form of sensible object model rather than simply one node class.

Alternatively, it can occur when a Java Node is being developed that is to be placed into a library and extended. For instance, there may be some common base code that needs to be developed and maintained within a library, while each instance of this node may need to add or modify a small section of code. If this code was all simply placed into the java code block, then changes in the base node would not get propagated down to the implementing nodes, as the implementing nodes would have changed the JavaCode parameter which is being modified in the base.

For these reasons, multiple java code blocks are sometimes required. This section describes how this can be achieved using the Java Node, and talks through the example library nodes "AbstractMetadataPassThrough" and "MultiFilter", and the instance nodes "Pass Through" and "StringsToUpperCase Ints2Times".

We have already talked through the JavaCode and JavaClass parameters in previous sections and described how these are required to compile the class. However, it is important to note that these parameters are no different from other node parameters. JavaCode is simply a "text" parameter, and JavaClass is simply a "string" parameter.
The LAE only knows which parameter defines the class name and which parameter defines the java code to compile based on the run time property names.

The same run time property name format used by the Java Node is also required for extra user-defined code blocks. For each additional code block that you are to implement, two new parameters are required. These parameters must have corresponding run time property names ending in the ".class" and ".code" extensions respectively, where the parameter with the ".class" run time property name extension must be the name of the class to compile, and the parameter with the ".code" run time property name extension must contain the java code to compile.

The class defining the Node to run must be referenced in the NodeClass parameter.

9.4.1. Example

Consider the "AbstractMetadataPassThrough" node example. This is an abstract base node that simply provides the code required to open all of the input streams, set up the output metadata based on the corresponding input metadata, and clean up after itself. In order for this to work, the node must have the same number of inputs as outputs.
The ImplClass parameter has deliberately been left blank for implementing nodes to fill in.

Now examine the NodeClass parameter, as shown below:

    {{^JavaPackage^}}.{{^ImplClass^}}

As the JavaClass parameter refers to an abstract class, this cannot be run as the node. Therefore, the node class which is to be run is declared in the ImplCode block, as referenced by the ImplClass parameter.

By clicking on Declare Parameters, it is easy to see that the code block that has been added has the required parameters, with run time property names matching the format described previously in this section.

Figure 5 Defining the Run Time Property Names for extra code blocks.

As good development practice, the JavaCode and JavaClass parameters have been hidden from the implementing node.

Now look at the other library node, "MultiFilter". This node fills in the ImplCode parameters provided by the "AbstractMetadataPassThrough" node. The code within the processAll method, shown below, shows that the node simply continues reading records from each input until the inputs have no more records to read. On each record, a method processRecord is called on a class {{^ProcessRecordClass^}}.

    @Override
    public void processAll() throws NodeFailedException {
        boolean allDone = false;
        int recNum = 1;
        while (!allDone) {
            allDone = true;
            for (int i = 0; i < numInputs(); i++) {
                Record record = read(i);
                if (record != RecordInput.EOF) {
                    allDone = false;
                    try {
                        {{^ProcessRecordClass^}}.processRecord(i, record, recNum,
                                                               input(i), output(i));
                    } catch (IOException ioe) {
                        logger().error(ioe, "Error Processing record on input: " + i);
                        throw fail();
                    }
                }
            }
            // advance the record number once per pass over the inputs
            recNum++;
        }
    }

This node also introduces an additional java code block to be compiled, "ProcessRecordCode". This has the class name "ProcessRecord", as defined in the parameter "ProcessRecordClass". The class is simply a utility class and does not implement the Node interface.
Therefore, the NodeClass parameter does not need to be modified this time, because the actual node to be run is still defined in ImplCode. The entire contents of the ProcessRecordCode parameter are shown below:

    package {{^JavaPackage^}};

    {{^RequiredImports^}}
    {{^UserDefinedImports^}}

    public class {{^ProcessRecordClass^}} {
        {{^ExtraClassCode^}}
        {{^ProcessRecordMethod^}}
    }

The code defines the class structure, then simply references other parameters that are exposed to the node user. Examining the Parameter Declarations window again shows that the run time property names for ProcessRecordCode and ProcessRecordClass indicate that ProcessRecordCode needs to be compiled, and inform the node of the class name.

Figure 6 Defining the Run Time Property Names for extra code blocks in MultiFilter node.

In addition, note that this code is not exposed to the node user. Only the code snippet parameters used within the Java class are exposed. This leaves the node developer free to modify the node structure, with all implementing nodes picking up the change.

Now go to the graph view, and examine the nodes “Pass Through” and “StringsToUpperCase Ints2Times”. Both of these nodes extend the “MultiFilter” node and provide extra code snippets that are used within the ProcessRecordCode block defined in the library node.

“Pass Through” is the simpler node: it implements a straight pass through of all of the input records. Its code is shown below:

    public static void processRecord(int ioChannel, Record record, int recNum,
            RecordInput input, RecordOutput output) throws IOException {
        // Write each record to the output unchanged.
        output.write(record);
    }

Since all of the record I/O has been taken care of in the base nodes, the amount of effort required from the node user is significantly reduced.

Next examine the “StringsToUpperCase Ints2Times” node.
This node performs some actual work, as its name implies: it converts string fields to upper case and doubles integer fields. Again the code is shown in its entirety below:

    public static void processRecord(int ioChannel, Record record, int recNum,
            RecordInput input, RecordOutput output) throws IOException {
        for (int i = 0; i < record.numFields(); i++) {
            if (record.metadata().field(i).type() == String.class) {
                // Upper-case non-null string fields.
                if (record.field(i) != com.lavastorm.lang.Null.NULL)
                    record.field(i, ((String) record.field(i)).toUpperCase());
            } else if (record.metadata().field(i).type() == Integer.class) {
                // Double non-null integer fields.
                if (record.field(i) != com.lavastorm.lang.Null.NULL)
                    record.field(i, ((Integer) record.field(i)).intValue() * 2);
            }
        }
        output.write(record);
    }

While these two examples are of little practical use in themselves, scenarios requiring this type of node and code extension do arise. The examples show that, by using multiple code blocks, common base code can be maintained in one place while node implementations define only minimal extra code.
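The layering described in this section can also be tried out in plain Java, without the LAE. The following is a minimal sketch, assuming records are modelled as List<Object> and inputs as in-memory lists; the class MultiFilterSketch and the RecordProcessor interface are hypothetical stand-ins. In particular, an interface is used here in place of the textual {{^ProcessRecordClass^}} substitution, and Java null stands in for com.lavastorm.lang.Null.NULL. The loop mirrors the round-robin reading performed by processAll, with the record number advancing once per pass over the inputs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class MultiFilterSketch {

    /** Hypothetical stand-in for the ProcessRecord utility class. */
    public interface RecordProcessor {
        List<Object> processRecord(int ioChannel, List<Object> record, int recNum);
    }

    /** Equivalent of the "Pass Through" node: records are returned unchanged. */
    public static final RecordProcessor PASS_THROUGH = (channel, record, recNum) -> record;

    /** Equivalent of "StringsToUpperCase Ints2Times". */
    public static final RecordProcessor UPPER_AND_DOUBLE = (channel, record, recNum) -> {
        List<Object> out = new ArrayList<>();
        for (Object field : record) {
            if (field instanceof String) {
                out.add(((String) field).toUpperCase());
            } else if (field instanceof Integer) {
                out.add(((Integer) field) * 2);
            } else {
                out.add(field); // nulls and other types pass through unchanged
            }
        }
        return out;
    };

    /** Reads one record from each input per pass until every input is exhausted. */
    public static List<List<List<Object>>> processAll(
            List<List<List<Object>>> inputs, RecordProcessor processor) {
        List<Iterator<List<Object>>> readers = new ArrayList<>();
        List<List<List<Object>>> outputs = new ArrayList<>();
        for (List<List<Object>> input : inputs) {
            readers.add(input.iterator());
            outputs.add(new ArrayList<>()); // one output per input
        }
        boolean allDone = false;
        int recNum = 1;
        while (!allDone) {
            allDone = true;
            for (int i = 0; i < readers.size(); i++) {
                if (readers.get(i).hasNext()) {
                    allDone = false;
                    outputs.get(i).add(
                            processor.processRecord(i, readers.get(i).next(), recNum));
                }
            }
            recNum++;
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<List<List<Object>>> inputs = Arrays.asList(
                Arrays.asList(Arrays.<Object>asList("abc", 2), Arrays.<Object>asList("de", null)),
                Arrays.asList(Arrays.<Object>asList(10, "x")));
        // prints [[[ABC, 4], [DE, null]], [[20, X]]]
        System.out.println(processAll(inputs, UPPER_AND_DOUBLE));
    }
}
```

Swapping UPPER_AND_DOUBLE for PASS_THROUGH changes the behaviour without touching the read loop, which is the same separation that the extra code blocks provide between the library node and its implementing nodes.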