Filtering Data

Lavastorm Desktop Professional Tech Note | Lavastorm Analytics © 2012 | www.lavastorm.com

Often when performing an analysis, data must be manipulated and filtered. The Filter node, found in the Aggregation and Transformation library, lets you change and customize the data output using BRAINscript, Lavastorm’s scripting language. The Split node in the same library is a specialized Filter node that splits the data according to a specified criterion. This Tech Note covers both of these nodes.

Filter node

A Filter node is essentially a blank node in which any level of data manipulation and output customization can be performed. Its functionality is customized by entering BRAINscript in the Script section of the node. By default, the Script section contains the following line of script:

    emit *

which outputs all input fields and records. The “emit” keyword specifies the fields to output and the “*” (wildcard) indicates all data fields. The default script can be added to or replaced depending on the desired functionality.

The most common BRAINscript functions used in the Filter node fall into two broad categories:

1. Functions used to transform the data. These are data-related functions and fall into three groups depending on the type of data the function is applied to: numeric, string, and date and time.

2. Functions used to control the output of the node. These functions allow for specifying which data fields should be output, renaming fields, filtering records according to a field value, and other output-related tasks.

Using BRAINscript Functions

BRAINscript functions can be added to the Script section by right-clicking within the script area (Figure 1). BRAINscript keywords and functions appear in a blue font. The functions are grouped together based on their functionality.

Figure 1 – Accessing BRAINscript functions.
The syntax used for functions is generally:

    FieldName.Function()

where FieldName is the data field the function will be applied to and Function is the function being used. Arguments to the function are entered within the parentheses and are comma-separated. If multiple functions are to be applied to a data field, they can be strung together within the same line as follows:

    FieldName.Function1().Function2().Function3()

An alternative syntax is to use the field name as the first argument of the function:

    Function(FieldName)

The result of a function call can be assigned to a variable name:

    VariableName = FieldName.Function()

Variables do not need to be defined before assignment: if the assignment is the first time the variable is used, it is automatically defined and then assigned. Variable names and function names are case-sensitive, whereas input field names are not.

Help for a function can be accessed by placing the cursor in the function name and pressing F1.

Comments can be added to the script by placing # in front of the comment. Comments appear in a green font.

Data Related BRAINscript Functions

Data-related functions are used to transform data before output. They are broken up into three categories depending on the field’s data type: numeric, string, and date and time.
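As a rough illustration of the two call styles described earlier in this section, the following Python sketch mimics chained calls (FieldName.Function1().Function2()) and the function-first style (Function(FieldName)). This is Python, not BRAINscript. The Field class, the city variable, and the sample value are hypothetical stand-ins invented for the illustration. Only the function names trim, toUpper, and left come from this Tech Note.

```python
class Field:
    """Hypothetical stand-in for a string field value (not a BRAINscript type)."""

    def __init__(self, value):
        self.value = value

    def trim(self):
        # Mirrors trim(): removes leading and trailing whitespace.
        return Field(self.value.strip())

    def toUpper(self):
        # Mirrors toUpper(): converts lowercase letters to uppercase.
        return Field(self.value.upper())

    def left(self, num):
        # Mirrors left(num): returns the first num characters.
        return Field(self.value[:num])


def trim(field):
    # Function-first style: the field is passed as the first argument.
    return field.trim()


city = Field("  boston  ")

# Chained style: each call returns a new value, so the calls can be strung together.
chained = city.trim().toUpper().left(3)

# Function-first style applied to the same field.
nested = trim(city)

print(chained.value)
print(nested.value)
```

Each method returns a new object instead of modifying the original, which is what allows the calls to be chained on one line.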
A list of the most commonly used functions for each type, with their definitions, follows:

Numeric BRAINscript Functions

abs(): returns the absolute value
ceil(): returns the smallest integer greater than or equal to the value
double(): returns the value converted to a double
floor(): returns the largest integer less than or equal to the value
int(): returns the value converted to an integer
isNumber(): returns true if the value is a number or can be cast to a number
long(): returns the value converted to a long integer
round(): rounds the value
pow(exponent): returns the value raised to the power of the specified exponent
sqrt(): returns the (positive) square root of the value
square(): returns the value squared

String BRAINscript Functions

left(num): returns the first num characters
ltrim(): removes leading spaces
isSpace(): returns true if every character is whitespace (horizontal tab, linefeed, carriage return, space)
pad(length, [character], [direction]): pads the string in the specified direction (left or right) with the specified length number of characters
replace(find,replace): replaces all occurrences (case-sensitive) of the find string with the replace string
right(num): returns the last num characters
rtrim(): removes trailing whitespace
split(separator): splits the string by the specified separator, creating a list of individual strings
strcat(): concatenates the value and the arguments to the function
strFind(substring): returns the index of the start of the specified substring (case-sensitive).
    A -1 is returned if the substring does not exist within the string.
strlen(): returns the string length
substr(offset,[num]): returns the substring starting at the specified offset, including num characters
toLower(): converts all uppercase letters to lowercase
toUpper(): converts all lowercase letters to uppercase
trim(): removes leading and trailing whitespace

Date/Time BRAINscript Functions

Date/time functions fall into three types depending on the data type of the variable calling the function. The three date/time data types are date, time and datetime.

Date Functions:

date(): constructs a date object
dateAdjust(delta,[units]): adds the specified delta units to the date
dateSubtract(date2): subtracts date2 from the date
day(): returns the day value of the date
month(): returns the month value of the date
year(): returns the year value of the date

Time Functions:

time(): constructs a time object
hours(): returns the hour value of the time
minutes(): returns the minute value of the time
seconds(): returns the second value of the time
timeSubtract(time2): subtracts time2 from the time and returns the result in seconds

Datetime Functions:

timestamp(): constructs a datetime object
dateTime(time): returns the epoch-time (number of seconds since midnight 1/1/70) of the specified time
dateTimeAdjust(delta, [units]): adds the specified delta units to the datetime

Output Related BRAINscript Functions

Output-related functions control the output of the Filter node. The main output-related keyword is emit, which specifies what to output.
Multiple fields can be output by placing the field names, separated by commas, after emit:

    emit Field1, Field2, Field3

or by using multiple emits:

    emit Field1
    emit Field2
    emit Field3

The second syntax is useful when customizing the output using the emit qualifiers, which are as follows:

override: overrides old data with new data
exclude: suppresses the output of a field
rename: changes the output field name
where: controls the output based on a specified criterion

Examples

1. In this example, a Filter node is used to output the month an item was purchased and the price of the item. The output would then be used in an aggregate node to find the total monthly revenue. Both data-related and output-related BRAINscript functions are used. A sample of the input data for this example is shown in Figure 2 and the Filter node scripting is shown in Figure 3.

Figure 2 – Example 1 sample input.

The first two lines of the script use string functions, first to find the length of the price string and then to use that length with the right function to remove the $ from the price. The third script line converts the result to a double. As stated in the comments in lines 5 and 6, the three script lines could have been written as a single line by stringing the functions together. Line 9 uses the month function to extract the month portion of the date of purchase and assigns the result to a variable called MonthofPurchase. The final line of scripting contains output-related BRAINscript to emit all input fields, emit the MonthofPurchase field, and replace the Price field with the newly created PriceValue field. A sample of the output is shown in Figure 4.

Figure 3 – Example 1 node configuration.
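Because the script in Figure 3 is not reproduced in this text, the steps described for Example 1 can only be approximated. The following Python sketch (not BRAINscript) follows the same sequence: measure the price string, keep everything to the right of the $, convert to a double, extract the month, and emit all fields with Price replaced. The sample rows, the Item and DateOfPurchase field names, and the right helper are assumptions made for the illustration. Whether the replaced field keeps the name Price in the real output is also an assumption.

```python
from datetime import date

# Hypothetical sample records; the real input appears in Figure 2.
rows = [
    {"Item": "Lamp", "Price": "$19.99", "DateOfPurchase": date(2012, 1, 15)},
    {"Item": "Desk", "Price": "$120.50", "DateOfPurchase": date(2012, 2, 3)},
]


def right(text, num):
    # Analogue of right(num): returns the last num characters.
    return text[len(text) - num:] if num > 0 else ""


def transform(row):
    price_len = len(row["Price"])                  # analogue of strlen()
    price_text = right(row["Price"], price_len - 1)  # drops the leading "$"
    price_value = float(price_text)                # analogue of double()
    month_of_purchase = row["DateOfPurchase"].month  # analogue of month()

    out = dict(row)                                # analogue of emit *
    out["MonthofPurchase"] = month_of_purchase     # emit the new field
    out["Price"] = price_value                     # assumed: numeric value replaces Price
    return out


output = [transform(r) for r in rows]
for record in output:
    print(record)
```

An aggregate step could then group the output by MonthofPurchase and sum Price to obtain the monthly revenue mentioned above.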
Figure 4 – Example 1 sample output.

2. The script in Figure 5 is an example of output-related BRAINscript used to control the node’s output. Line 1 specifies that all input rows in which field7 is value should be output. Line 2 adds two fields, newField1 and newField2, to the output. Line 3 removes all dashes in the data in field3 and replaces the original field3 with the result. Line 4 excludes field4 and field5 from the output, and line 5 outputs field6 with the new name of myField6Name.

Figure 5 – Example 2 Filter node configuration.

Split Node

The Split node splits the node’s input based on a specified condition. The Split node configuration is shown in Figure 6. The criterion used to split the data is entered in the PredicateExpr section. The expression must evaluate to a Boolean result. Input rows that match the condition are output to the first output pin and those that do not are output to the second output pin. In the example in Figure 6, rows in which TotalPurchase is greater than 300 are output to the first pin and rows in which it is less than or equal to 300 are output to the second pin. By default, the Script section contains emit *, which outputs all fields. BRAINscript can be used in the script to customize the output, as in the Filter node.

Figure 6 – Split node configuration.
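The routing behavior of the Split node can be sketched in Python (not BRAINscript) as a partition driven by a Boolean predicate. The split function, the sample rows, and the Customer field are hypothetical. Only the TotalPurchase field and the threshold of 300 come from the example above.

```python
# Hypothetical input rows; only TotalPurchase is taken from the Tech Note's example.
rows = [
    {"Customer": "A", "TotalPurchase": 450},
    {"Customer": "B", "TotalPurchase": 300},
    {"Customer": "C", "TotalPurchase": 120},
]


def split(records, predicate):
    """Route each record to the first list if the predicate is true, else the second."""
    first_pin, second_pin = [], []
    for record in records:
        if predicate(record):
            first_pin.append(record)
        else:
            second_pin.append(record)
    return first_pin, second_pin


# Analogue of a PredicateExpr that tests TotalPurchase > 300.
matched, unmatched = split(rows, lambda r: r["TotalPurchase"] > 300)

print([r["Customer"] for r in matched])
print([r["Customer"] for r in unmatched])
```

A row with TotalPurchase exactly equal to 300 fails the strict comparison, so it goes to the second list, matching the "less than or equal to 300" wording above.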