Set up and use federation in InfoSphere BigInsights Big SQL

Set up and use federation in InfoSphere BigInsights
Big SQL V3.0
Mara Elisa de Paiva Fernandes Matias
February 04, 2015
(First published July 08, 2014)
Big SQL V3.0 supports federation to many data sources, including IBM® DB2 for Linux, UNIX,
and Windows™, IBM PureData™ System for Analytics, IBM PureData System for Operational
Analytics, Teradata, and Oracle. Federation enables users to send distributed requests
to multiple data sources within a single SQL statement. Learn how to use the federation
capabilities in Big SQL V3.0.
04 February 2015 - Author-requested changes appear in: What you need for this article;
Supported data sources (replaced Table 1 with link); Table 1 (new Informix information);
Environment setup; Set up the connection to Oracle; Set up the connection to Netezza; Set
up the connection to Informix (new section); Listing 3, Listing 4, Listing 5, and Listing 8 under
Complete system setup; Examples of wrappers for different data sources (new Informix
information); Netezza and Informix (new section) under Server definition and use; Usage
examples (new Informix example); Mapping; User mapping; and Function mapping (new
section).
This article introduces Big SQL V3.0 federation capabilities by using the data sources Teradata,
Oracle, Netezza, and IBM® DB2 for Linux, UNIX, and Windows. Some knowledge of database
systems is assumed. The focus is on the basic configuration that is required to use Big SQL V3.0
federation.
Big SQL V3.0 federation, a feature of Big SQL V3.0, is highly configurable. The command syntax
that is used here is a simplified version of the essential settings for a working system.
What you need for this article
• Knowledge of Big SQL in InfoSphere BigInsights, particularly about the characteristics of the
BIGSQL database.
• Familiarity with the DBMS that is used as the data source.
• Knowledge of how to set up the remote server's connection and its respective client.
• Big SQL V3.0 installed on a system that runs Linux AMD64 or Linux PPC (not all data sources
are supported on PPC).
• Clients for each data source, which are installed and configured on the Big SQL V3.0
management node.
© Copyright IBM Corporation 2014, 2015
developerWorks®
ibm.com/developerWorks/
• Access to Netezza® through a DataDirect ODBC driver that is compatible with the Netezza client.
• Netezza-branded DataDirect Driver and the Netezza client, which can be downloaded from
IBM Fix Central.
Supported data sources
The Big SQL V3.0 federation server supports several data sources. You can consult the full list of
supported data sources at www.ibm.com/support/docview.wss?uid=swg27044495.
Because data analytics is crucial to the success of a business, data warehousing that supports
consolidation and federation of data is a requirement.
Big SQL, part of IBM InfoSphere® BigInsights™, is the SQL interface to Hadoop file systems.
It enables the storage and handling of massive amounts of data. Big SQL federation makes it
possible to create a mixed environment that consolidates big data and federates live data from
different data sources.
Overview of Big SQL V3.0 federation
The architecture of a federated server is fairly simple. It includes the Big SQL V3.0 engine, the
wrapper, and the remote database management systems (DBMS) client.
One of the core elements of the engine is the optimizer (shown in Figure 1), which is responsible
for choosing the execution plan for a federated query, based on a cost analysis. The
optimizer can rewrite the query, if necessary, to make it more efficient. The optimizer distributes
query work between the federated server and data sources to make the process as cost efficient
as possible. The cost of a query is automatically calculated per nickname by the runstats utility,
which is turned on by default.
Figure 1. Architecture of Big SQL V3.0
To use Big SQL V3.0 federation, you need a wrapper, a server, a nickname, and a user mapping.
• Wrapper: A library that handles the communication between the Big SQL V3.0 federation
server and the DBMS client. Each type of data source must have its own wrapper. The
wrapper translates the queries that come to and from the Big SQL V3.0 federation server to
calls on the exposed API of the client. The client communicates with the data source.
• Server: A remote database. Although a server object is a remote database, you need to know
the system that this database resides in because that information determines what client is
used to connect to it.
• Nickname: A single remote object in the data source (for example, a table, a view, or a
procedure) in a server. As with any federated server, the use of data sources is managed
transparently. After the nicknames for the remote objects are defined, they behave, from a
user perspective, as if they were local objects. If the data comes from the data source unchanged,
the result of a query is determined by the local rules. This transparency is important in this
case because the Big SQL database uses only binary collation by default. This type of
collation affects the options that need to be set when you create the server object.
• User mapping: An association between a local authorization ID and a remote authorization
ID. Any operation that is run on the federated server by the local ID is run on the data source
by using the ID that it was mapped to. For a user to query a nickname, the user must be
authorized to perform a SELECT operation on the original table. For security reasons, on most
systems the same authorization ID with the same password does not exist on all machines;
therefore, user mapping is required.
After the setup is complete, Big SQL V3.0 is used through JSqsh, similar to how other Big SQL
versions are accessed. To create federated objects, an authorization ID with DBADM authority is required.
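The four objects build on each other, so the order in which they are created matters. The following shell sketch writes one statement per object to a script file in dependency order: wrapper, server, user mapping, nickname. The wrapper and server names follow the listings later in this article. The file name federation_setup.sql, the nickname DB2_SALES, and the remote table DB2USER.SALES are hypothetical, and running the file with the DB2 command line processor is an assumption about your environment.

```shell
#!/bin/sh
# Sketch only: write the DDL for one DB2 data source in dependency order.
# federation_setup.sql and the remote table DB2USER.SALES are hypothetical.
SCRIPT="${TMPDIR:-/tmp}/federation_setup.sql"

cat > "$SCRIPT" <<'EOF'
-- 1. Wrapper: the library that talks to the data source client
CREATE WRAPPER DRDA LIBRARY 'libdb2drda.so';
-- 2. Server: the remote database, reached through the wrapper
CREATE SERVER LUWSERV TYPE DB2/UDB VERSION 10.5 WRAPPER DRDA
  AUTHORIZATION "db2user" PASSWORD "db2user"
  OPTIONS (DBNAME 'DB2DATA', PUSHDOWN 'Y', COLLATING_SEQUENCE 'N');
-- 3. User mapping: local ID to remote ID (needed before any nickname)
CREATE USER MAPPING FOR USER SERVER LUWSERV
  OPTIONS (REMOTE_AUTHID 'db2user', REMOTE_PASSWORD 'db2user');
-- 4. Nickname: local name for one remote object
CREATE NICKNAME DB2_SALES FOR LUWSERV.DB2USER.SALES;
EOF

echo "wrote $(grep -c '^CREATE' "$SCRIPT") statements to $SCRIPT"
# Assumption: run the file on the federated server while connected with an
# authorization ID that has DBADM authority, for example:
#   db2 -tvf "$SCRIPT"
```

Keeping the statements in one file makes the dependency order explicit and lets you rerun the setup on another management node.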
Installation
The installer for InfoSphere BigInsights automatically installs the wrappers and applies the
federation license. The wrappers that are shown in Table 1 are included by default in the directory
$HOME/sqllib/lib/.
Table 1. Wrappers
Data source type   Wrapper library      Auxiliary files
DB2                libdb2drda.so        libdb2drdaF.so, libdb2drdaU.so
Teradata           libdb2teradata.so    libdb2STteradataF.a, libdb2teradataU.so
Oracle             libdb2net8.so        libdb2net8F.so, libdb2net8U.so, libdb2STnet8F.a
Netezza            libdb2rcodbc.so      libdb2odbct.so, libdb2rcodbcF.so, libdb2rcodbcU.so
Informix           libdb2informix.so    libdb2informixF.so, libdb2informixU.so
Environment setup
Log in as user bigsql. By default, Big SQL V3.0 federation is not enabled for immediate use. Enable
it by using the following command.
db2 UPDATE DBM CFG USING FEDERATED YES
You must restart the database for the change to take effect.
For many data sources, you must set certain variables in the db2dj.ini file. Variables that are set
to file names or directories must use the fully qualified name, and values cannot use metacharacters
or environment variables such as ~ or $HOME. The db2dj.ini file is in $HOME/sqllib/cfg, if it exists.
Otherwise, you can use any editor to create it. To change the default path of db2dj.ini from the
command line, run:
export DB2DJ=<CUSTOM_PATH>/db2dj.ini
<CUSTOM_PATH> is any path in the system (for example, $HOME/sqllib/cfg).
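Because values in db2dj.ini are not expanded, a leftover ~ or $HOME is easy to miss. The following sketch lists such entries before you restart the instance. The function name check_db2dj is hypothetical and is not part of Big SQL; it only searches the file and does not change it.

```shell
#!/bin/sh
# Hypothetical helper, not part of Big SQL: list db2dj.ini entries whose
# value contains "~" or "$", because such values are not expanded.
check_db2dj() {
    # $1: path to a db2dj.ini file
    # Prints "line-number:entry" for each offending entry.
    if grep -n '=.*[~$]' "$1"; then
        return 1    # at least one value must be rewritten as a full path
    fi
    return 0
}

# Usage (the path is an example):
#   check_db2dj "$HOME/sqllib/cfg/db2dj.ini" && echo "db2dj.ini looks fine"
```

The function returns a nonzero status when it finds a problem, so it can guard a restart command in a setup script.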
Some data sources need their client's libraries to be part of the PATH. To do this, alter the profile
file $HOME/sqllib/userprofile. After you make any changes, reload the profile file and restart the
instance.
To reload the profile, issue the following command:
. $HOME/sqllib/userprofile
To restart the instance of the Big SQL V3.0 federation server, run the following code:
$BIGSQL_HOME/bin/bigsql stop
$BIGSQL_HOME/bin/bigsql start
Set up the connection to DB2 for Linux, UNIX, and Windows
To identify which database the federated server connects to, the remote DB2 database must be
cataloged in the federated server system database directory.
Listing 1. Catalog the remote server
db2 CATALOG TCPIP NODE RMNODE REMOTE MYHOST SERVER 12345
RMNODE is the node name.
MYHOST is the host name for the remote machine.
12345 is the service that the remote DB2 server is running on, defined by setting the svcename
variable at the remote data source (DB2 often runs on port 50000).
Catalog the database as shown:
Listing 2. Catalog the database
db2 CATALOG DATABASE RDB2DATA AS DB2DATA AT NODE RMNODE
RDB2DATA is the name of the database at the remote node.
DB2DATA is the name for the database in the local catalog.
RMNODE is the node name.
Refresh the local catalog by running the command db2 terminate.
Set up the connection to Teradata
1. Download and install the Teradata client.
2. Add the following command to the startup file of the shell that is used (for example, .kshrc if
you use Korn shell), to export the environment variable TERADATA_LIB_DIR.
export TERADATA_LIB_DIR=/opt/teradata/client/lib64
/opt/teradata/client/lib64 is the absolute path where the Teradata client libraries are
located.
3. Run the djxlinkTeradata command in $HOME/sqllib/bin/ as root, as shown in the following
commands. If the operation is successful, the file libdb2STteradataF.a that is required for the
Teradata wrapper is created in the $HOME/sqllib/lib64 directory.
su root
<HOME>/sqllib/bin/djxlinkTeradata
<HOME> is the home directory for Big SQL.
4. To avoid errors when you use nicknames, set the variable TERADATA_CHARSET in the db2dj.ini
file to the character set used at the data source, as shown:
TERADATA_CHARSET=ASCII
ASCII is the character set used by the Teradata database.
Set up the connection to Oracle
1. Set the ORACLE_HOME and TNS_ADMIN variables in the db2dj.ini file. Set ORACLE_HOME to the
absolute path where the Oracle client is installed, and set TNS_ADMIN as the absolute path to
your tnsnames.ora file.
ORACLE_HOME=/opt/oracleclient
TNS_ADMIN=/home/bigsql/resources
/opt/oracleclient is the absolute path where the Oracle client is installed. /home/bigsql/
resources is the absolute path to the location of the tnsnames.ora file.
2. Add client libraries to the path. Do this by adding the following code extract to the user
profile above the line "export LD_LIBRARY_PATH DB2LIBPATH":
P="/opt/oracleclient/lib"
LD_LIBRARY_PATH=$(add_to_path "${P}" "${LD_LIBRARY_PATH}")
DB2LIBPATH=$(add_to_path "${P}" "${DB2LIBPATH}")
export LD_LIBRARY_PATH DB2LIBPATH
/opt/oracleclient/lib is the absolute path to the Oracle client libraries.
3. Run the profile and restart the instance.
Set up the connection to Netezza
General steps are given here. For more details, see the complete guide for installing and
configuring Netezza access through the ODBC driver.
1. Download and install the Netezza client and the compatible ODBC driver. Consider these two
compatible drivers:
• DataDirect Technologies Connect for ODBC driver
• IBM DataDirect ODBC driver
You can download a complete package that includes the DataDirect ODBC Driver and the
Netezza client. The package includes a sample odbc.ini file, which describes the Netezza
servers that are used on the create server command.
2. Create a symbolic link named .odbc.ini in the home directory to the location where you
created your odbc.ini file:
ln -sf $HOME/resources/odbc.ini $HOME/.odbc.ini
$HOME/resources/odbc.ini is the odbc.ini file being used (a sample is in the Netezza
package).
3. Add the required variables to the db2dj.ini file, as shown:
ODBCINI=/home/bigsql/resources/odbc.ini
where /home/bigsql/resources is the absolute path to the location of the odbc.ini file.
4. Add ODBC driver libraries to the path. Do this by adding the following code extract to the user
profile above the line "export LD_LIBRARY_PATH DB2LIBPATH":
P="/opt/odbc64v51/lib"
LD_LIBRARY_PATH=$(add_to_path "${P}" "${LD_LIBRARY_PATH}")
DB2LIBPATH=$(add_to_path "${P}" "${DB2LIBPATH}")
export LD_LIBRARY_PATH DB2LIBPATH
/opt/odbc64v51/lib is the absolute path to the ODBC driver libraries.
Set up the connection to Informix
1. Add the following command to the startup file of the shell that is used (for example, .kshrc if
you use Korn shell), to export the environmental variable INFORMIXDIR:
export INFORMIXDIR=/opt/IBM/informix
2. Run the djxlinkInformix command in $HOME/sqllib/bin/ as root. If the operation is
successful, the file libdb2STinformixF.a that is required for the Informix wrapper is created in
the $HOME/sqllib/lib64 directory:
su root
<HOME>/sqllib/bin/djxlinkInformix
<HOME> is the home directory for Big SQL.
3. Add the required variables to the db2dj.ini file:
INFORMIXDIR=/opt/IBM/informix
INFORMIXSQLHOSTS=/home/bigsql/resources/sqlhosts
CLIENT_LOCALE=en_US.819
INFORMIXSERVER=infdefault
/opt/IBM/informix is the absolute path to the Informix client. /home/bigsql/resources/sqlhosts
is the path to the file containing the server definitions. en_US.819 indicates which locale is
used for the client. infdefault is the default Informix server to use.
Complete system setup
The following code listings show a sample of the configuration that a user would have when using
all of the data sources at the same time.
Listing 3. Content of .kshrc
export TERADATA_LIB_DIR=/opt/teradata/client/lib64
export INFORMIXDIR=/opt/IBM/informix
Listing 4. Content of db2dj.ini
ORACLE_HOME=/opt/oracleclient
TNS_ADMIN=/home/bigsql/resources
INFORMIXDIR=/opt/IBM/informix
INFORMIXSQLHOSTS=/home/bigsql/resources/sqlhosts
CLIENT_LOCALE=en_US.819
INFORMIXSERVER=infdefault
TERADATA_LIB_DIR=/opt/teradata/client/13.10/lib64/
ODBCINI=/home/bigsql/resources/odbc.ini
Listing 5. Section added to userprofile
# for Federation support
P="/opt/oracleclient/lib"
LD_LIBRARY_PATH=$(add_to_path "${P}" "${LD_LIBRARY_PATH}")
DB2LIBPATH=$(add_to_path "${P}" "${DB2LIBPATH}")
P="/opt/odbc64v51/lib"
LD_LIBRARY_PATH=$(add_to_path "${P}" "${LD_LIBRARY_PATH}")
DB2LIBPATH=$(add_to_path "${P}" "${DB2LIBPATH}")
export LD_LIBRARY_PATH DB2LIBPATH
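After you reload the profile, it can help to confirm that both client library directories really are on the library path before you restart the instance. The following sketch does that with plain shell pattern matching. The helper name in_path_list is hypothetical; the two directories are the example paths from Listing 5.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 is an entry of the
# colon-separated list $2 (for example "$LD_LIBRARY_PATH").
in_path_list() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Report on the two client library directories used in Listing 5.
for dir in /opt/oracleclient/lib /opt/odbc64v51/lib; do
    if in_path_list "$dir" "${LD_LIBRARY_PATH:-}"; then
        echo "ok:      $dir"
    else
        echo "missing: $dir (reload userprofile and restart the instance)"
    fi
done
```

Wrapping the list in colons before matching avoids false positives on directories that only share a prefix, such as /opt/oracleclient and /opt/oracleclient/lib.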
Listing 6. Content of tnsnames.ora
ora11gr2 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = ora11qa.svl.ibm.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ora11qa)
    )
  )
Listing 7. Content of odbc.ini
NZSQL = NetezzaSQL
[NZSQL]
Driver = /opt/netezza/lib64/libnzodbc.so
Description = NetezzaSQL ODBC
Servername = netz.ibm.co.uk
Port = 5480
Database = DWDB
Username = nzuser
Password = nzuser
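The NODE option of the Netezza server definition must name a data source that is defined in odbc.ini, and the value is case-sensitive. The following sketch checks for the section header before you create the server. The helper name has_dsn is hypothetical, and it assumes that the header line contains nothing but the bracketed name.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the odbc.ini file $1 contains a
# section header that is exactly [$2]. The match is case-sensitive.
has_dsn() {
    grep -q -x -F "[$2]" "$1"
}

# Usage (path and data source name taken from the listings in this article):
#   has_dsn /home/bigsql/resources/odbc.ini NZSQL || echo "DSN NZSQL not found"
```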
Listing 8. Content of sqlhosts
inffake olsoctcp 127.0.0.1 1652
Wrapper creation and use
A wrapper is required for each different data source type because each wrapper is associated with
a single library file. Wrappers can be written as C++ or Java™ applications. They can be seen as a
composition of two subcomponents: the query compiler and the execution engine.
For wrappers written in C++, it is possible to increase performance by running them as trusted
applications in the database engine, a practice that enables the compiler and the execution engine
to run concurrently.
Java wrappers are always run in fenced mode. In this mode, the compiler must finish before the
query is handed over to an external process to be run. However, this mode has the advantage
of allowing queries with only reads across different data sources to be run in parallel. By default,
the wrapper is run in trusted mode. To set it to fenced mode, set the option DB2_FENCED when the
wrapper is created.
Create the wrapper syntax
CREATE WRAPPER <WRAPPER_NAME> LIBRARY <WRAPPER_FILE> OPTIONS ( <OPTIONS> )
<WRAPPER_NAME>
is a unique wrapper name.
<WRAPPER_FILE>
is the wrapper library file.
For most data sources, the options (<OPTIONS>) portion of the command is optional.
Note: Because the Netezza wrapper is a generic ODBC wrapper, the MODULE option must be
defined. The two most relevant options are:
• MODULE: Required option when you create a wrapper for Netezza. Set it to the absolute path of
the ODBC driver library.
• DB2_FENCED: Controls what mode the wrapper is to be run in: fenced or trusted.
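As a small illustration of how an option is passed, the following sketch prints a CREATE WRAPPER statement with an explicit DB2_FENCED value. The helper name wrapper_ddl is hypothetical; the statement it prints follows the syntax shown above and still has to be run on the federated server.

```shell
#!/bin/sh
# Sketch: print a CREATE WRAPPER statement with an explicit DB2_FENCED value.
wrapper_ddl() {
    # $1: wrapper name, $2: wrapper library, $3: Y (fenced) or N (trusted)
    printf "CREATE WRAPPER %s LIBRARY '%s' OPTIONS (DB2_FENCED '%s')\n" "$1" "$2" "$3"
}

# Example: a fenced DB2 wrapper
wrapper_ddl DRDA libdb2drda.so Y
```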
Examples of wrappers for different data sources
The following examples show how to create wrappers for each data source type.
Listing 9. DB2 data source
CREATE WRAPPER DRDA LIBRARY 'libdb2drda.so'
Listing 10. Teradata data source
CREATE WRAPPER TERA LIBRARY 'libdb2teradata.so'
Listing 11. Oracle data source
CREATE WRAPPER ORA LIBRARY 'libdb2net8.so'
Listing 12. Netezza data source
CREATE WRAPPER NETZ LIBRARY 'libdb2rcodbc.so' OPTIONS (MODULE '/opt/odbc64v51/lib/libodbc.so')
Note: In the previous code listing, /opt/odbc64v51/lib/libodbc.so is the absolute path to the
ODBC driver.
Listing 13. Informix data source
CREATE WRAPPER INFORMIX LIBRARY 'libdb2informix.so'
Server definition and use
A server refers to a remote database. The server definition must include the type of server and the
database version or release level that it is in, the wrapper that should be used to communicate with
it, and a user and password to be used for authentication.
The following section describes the minimum options that are required for each data source and
the value that the collating_sequence option must be set to. The Big SQL V3.0 federation server
uses only binary collation. Binary collation is blank sensitive and differentiates empty from null
strings. Determine whether the data source's collating sequence is compatible with binary collation
by checking whether the data source's collation behavior is the same for blank sensitivity and
treatment of null strings.
Create server syntax
CREATE SERVER <SERVER_NAME> TYPE <SERVER_TYPE>
VERSION <VERSION> WRAPPER <WRAPPER_NAME>
AUTHORIZATION <REMOTE_USER>
PASSWORD <USER_PASSWORD> OPTIONS (<SERVER OPTIONS>)
<SERVER_NAME>
is a unique server name.
<SERVER_TYPE>
is the data source type.
<VERSION>
is the data source version.
<WRAPPER_NAME>
is a unique wrapper name defined by the create wrapper command.
<REMOTE_USER>
is a valid remote user.
<USER_PASSWORD>
is the password for <REMOTE_USER>.
<SERVER OPTIONS>
is the definition of options for the server being created.
The values for <SERVER_TYPE> and <VERSION> can be found under the respective data source
type in the following section.
Options for DB2 data source
The <TYPE> of a DB2 server is specified as db2/<SUBTYPE> where SUBTYPE is your platform. DB2 for
Linux, UNIX, and Windows is the only supported platform; therefore, the type is db2/udb.
The fully supported <VERSION> is 10.5.
The DB2 identity collation has the same behavior as binary collation in terms of empty and null
strings. However, it is blank insensitive; therefore, it is incompatible with binary collation.
DB2 in Oracle compatibility mode is compatible in terms of blank padded comparisons. But in
this mode, empty and null strings are treated the same; therefore, it is incompatible with binary
collation.
To enable pushdowns in either of these cases, the COLLATING_SEQUENCE must be set to N.
Table 2. DB2 for Linux, UNIX, and Windows options reference
Option              Description
DBNAME              (Always required.) Specifies the database to use for the initial
                    remote DB2 database connection. This database is the alias for the
                    remote DB2 database that is cataloged on the federated server, as
                    described in the environment setup instructions.
PUSHDOWN            Specifies whether the federated server allows the data source to
                    evaluate operations. Valid values are Y and N. The default is Y; the
                    data source evaluates operations. N specifies that the federated
                    server send SQL statements that include only SELECT with column names.
COLLATING_SEQUENCE  (Required if PUSHDOWN is set to Y.) Specifies whether the data source
                    uses the same default collating sequence as the federated database.
                    Must be set to N because there are currently no collations in DB2
                    that are perfectly compatible with the Big SQL V3.0 binary collation.
Teradata
The <TYPE> of a Teradata server is teradata.
The fully supported <VERSION> is 12.
Teradata collations are ANSI-compliant. This means that empty strings are considered different from
null strings, which matches binary collation. However, Teradata is blank insensitive.
To enable pushdowns, COLLATING_SEQUENCE must be set to N.
Table 3. Teradata options reference
Option              Description
NODE                (Always required.) Specifies the Teradata server. Can be set to the
                    server alias, the IP address, or the fully qualified domain name.
PUSHDOWN            Specifies whether the federated server allows the data source to
                    evaluate operations. Valid values are Y and N. The default is Y; the
                    data source evaluates operations. N specifies that the federated
                    server send SQL statements that include only SELECT with column names.
COLLATING_SEQUENCE  (Required if PUSHDOWN is set to Y.) Specifies whether the data source
                    uses the same default collating sequence as the federated database.
                    Must be set to N.
Oracle
The <TYPE> of an Oracle server is oracle.
The fully supported <VERSION> is 11.
Oracle has blank-sensitive comparisons. It even has a binary collation of its own that is compatible
with the Big SQL binary collation. However, regardless of collation, Oracle treats empty strings the
same as null strings; therefore, the two cannot be considered compatible.
To enable pushdowns, COLLATING_SEQUENCE must be set to N.
Table 4. Oracle options reference
Option              Description
NODE                (Always required.) Specifies an entry in the tnsnames.ora file.
PUSHDOWN            Specifies whether the federated server allows the data source to
                    evaluate operations. Valid values are Y and N. The default is Y; the
                    data source evaluates operations. N specifies that the federated
                    server send SQL statements that include only SELECT with column names.
COLLATING_SEQUENCE  (Required if PUSHDOWN is set to Y.) Specifies whether the data source
                    uses the same default collating sequence as the federated database.
                    Must be set to N because there are currently no collations in Oracle
                    that are perfectly compatible with Big SQL V3.0 binary collation.
Netezza
The <TYPE> is odbc. Although the connection is to a Netezza server, it is made through the ODBC
driver.
The fully supported <VERSION> is 6 for the examples given. The ODBC driver version must be 3.0 or
later.
Netezza has blank-sensitive comparisons and because null strings are not supported, it is
compatible with binary collation.
Table 5. Netezza options reference
Option              Description
NODE                (Always required.) Specifies the name of the node or the system DSN
                    name that is assigned to the ODBC data source defined in the odbc.ini
                    file. The value is case-sensitive.
PUSHDOWN            Specifies whether the federated server allows the data source to
                    evaluate operations. Valid values are Y and N. The default is Y; the
                    data source evaluates operations. N specifies that the federated
                    server send SQL statements that include only SELECT with column names.
Informix
The <TYPE> is informix.
The fully supported <VERSION> is 11.
Informix collations are not compatible with binary collation. Empty strings are considered different
from null strings, but Informix is blank insensitive.
To enable pushdowns, COLLATING_SEQUENCE must be set to N.
Table 6. Informix options reference
Option              Description
NODE                (Always required.) Specifies the Informix server. Must be a server
                    alias defined in the sqlhosts file.
PUSHDOWN            Specifies whether the federated server allows the data source to
                    evaluate operations. Valid values are Y and N. The default is Y; the
                    data source evaluates operations. N specifies that the federated
                    server send SQL statements that include only SELECT with column names.
COLLATING_SEQUENCE  (Required if PUSHDOWN is set to Y.) Specifies whether the data source
                    uses the same default collating sequence as the federated database.
                    Must be set to N.
DBNAME              (Always required.) The name of the Informix database that you want to
                    access.
Usage examples
The following code listings show how to create a server for various data sources.
Listing 14. DB2
CREATE SERVER LUWSERV TYPE DB2/UDB VERSION 10.5 WRAPPER DRDA AUTHORIZATION
\"db2user\" PASSWORD \"db2user\" OPTIONS (DBNAME 'DB2DATA', PUSHDOWN 'Y',
COLLATING_SEQUENCE 'N')
Listing 15. Teradata
CREATE SERVER TERASERV TYPE TERADATA VERSION 12 WRAPPER TERA AUTHORIZATION
\"terauser\" PASSWORD \"terauser\" OPTIONS (NODE 'TERANODE', PUSHDOWN 'Y',
COLLATING_SEQUENCE 'N')
Listing 16. Oracle
CREATE SERVER ORASERV TYPE ORACLE VERSION 11 WRAPPER ORA AUTHORIZATION
\"orauser\" PASSWORD \"orauser\" OPTIONS (NODE 'TNSNODENAME', PUSHDOWN 'Y',
COLLATING_SEQUENCE 'N')
Listing 17. Netezza
CREATE SERVER NZSERV TYPE ODBC VERSION 7 WRAPPER NETZ AUTHORIZATION \"nzuser\"
PASSWORD \"nzuser\" OPTIONS ( NODE 'NZSQL')
Listing 18. Informix
CREATE SERVER INFSERV TYPE INFORMIX VERSION 11 WRAPPER INFORMIX
AUTHORIZATION \"infuser\" PASSWORD \"infuser\" OPTIONS (NODE 'infdefault',
DBNAME 'db1', COLLATING_SEQUENCE 'N', PUSHDOWN 'Y')
Mapping
It is possible and sometimes even required to create mappings for users, data types, and
functions.
The creation of user mappings is required for most data sources. It is, in particular, required for all
the supported data sources listed.
User mapping
A user mapping defines an association between a user ID and password at the federated server
and the corresponding user ID and password at the data source, as shown in the following code
listing.
Listing 19. Define an association between a user ID and password at the
federated server and data source
CREATE USER MAPPING FOR <LOCAL_ID> SERVER <SERVER> OPTIONS
(REMOTE_AUTHID <REMOTE_ID>, REMOTE_PASSWORD <REMOTE_PASSWORD>)
<LOCAL_ID>
is a local authorization ID. It can also be one of the special registers USER or
PUBLIC, where USER maps the ID currently connected to the database and PUBLIC maps all valid
authorization IDs.
<SERVER>
is a unique server name defined by the create server statement.
<REMOTE_ID>
is a valid authorization ID in the remote data source.
<REMOTE_PASSWORD>
is the password for <REMOTE_ID>.
The following examples show how to create user mappings.
Listing 20. Map the local user ID foo to the DB2 server user ID db2user with
password db2user
CREATE USER MAPPING FOR foo SERVER LUWSERV OPTIONS (REMOTE_AUTHID 'db2user',
REMOTE_PASSWORD 'db2user')
Listing 21. Map the local user ID orauser to the Oracle user ID orauser with
password orauser
CREATE USER MAPPING FOR orauser SERVER ORASERV OPTIONS ( REMOTE_PASSWORD
'orauser')
Listing 22. Map the current authorization ID to the Teradata user terauser
with password terauser
CREATE USER MAPPING FOR USER SERVER TERASERV OPTIONS (REMOTE_AUTHID 'terauser',
REMOTE_PASSWORD 'terauser')
Listing 23. Map all the valid authorization IDs to the Netezza user nzuser with
password nzuser
CREATE USER MAPPING FOR PUBLIC SERVER NZSERV OPTIONS (REMOTE_AUTHID 'nzuser',
REMOTE_PASSWORD 'nzuser')
Function mapping
Function mappings create an association between functions in the federated server and the data
source. As with data types, functions have default mappings, but a user-defined function, for
example a function that does not exist at the data source, is not mapped by default.
Listing 24. Syntax for mapping a function on a particular server
CREATE FUNCTION MAPPING <MAPPING_NAME> FOR <LOCAL_FUNCTION> SERVER
<SERVER> OPTIONS (REMOTE_NAME '<REMOTE_FUNCTION>')
<MAPPING_NAME> is the unique name given to this mapping. <LOCAL_FUNCTION> is the function that
is mapped on the federation server. <SERVER> is the unique server name as defined in the create
server statement. <REMOTE_FUNCTION> is the function that is mapped on the data source.
Listing 25. Syntax for mapping a function on any server of a given type
CREATE FUNCTION MAPPING <MAPPING_NAME> FOR <LOCAL_FUNCTION> SERVER TYPE
<SERVER_TYPE> OPTIONS (REMOTE_NAME '<REMOTE_FUNCTION>')
<MAPPING_NAME> is the unique name given to this mapping. <LOCAL_FUNCTION> is the function that
is mapped on the federation server. <SERVER_TYPE> is the type of the server that should do the
mapping, same as <TYPE> in the create server statements. <REMOTE_FUNCTION> is the function
that is mapped on the data source.
Examples of function mappings
The following examples show how to create function mappings:
Listing 26. Map local function count_big to count function in nzserv (a
Netezza server)
CREATE FUNCTION MAPPING count_big_mapping FOR SYSIBM.COUNT_BIG() SERVER
NZSERV OPTIONS (REMOTE_NAME 'COUNT(*)')
Listing 27. Map local function count_big to count function in all Oracle
servers
CREATE FUNCTION MAPPING count_big_mapping FOR SYSIBM.COUNT_BIG() SERVER
TYPE ORACLE OPTIONS (REMOTE_NAME 'COUNT(*)')
Listing 28. Map user-defined function bonus on federated server to user-defined
function bonus on teraserv (Teradata data source)
CREATE FUNCTION MAPPING my_udf_mapping FOR BONUS() SERVER TERASERV
OPTIONS (REMOTE_NAME 'BONUS')
Data type mapping
Data type mappings associate the federated database data types with the data types at the
data source. These mappings can be forward or reverse depending on whether a data type is
transformed when it is sent to or fetched from the data source.
This association might be necessary if the data type does not exist locally or if the data type is
incompatible.
For example, when you use an Oracle data source:
• Data type NUMBER(8,0) is not a valid data type in Big SQL V3.0. You can map it to its
equivalent INT data type.
• The date data type that is represented as a time stamp in Oracle is mapped to the local date
data type.
Data type mapping can be created on the server or nickname level.
Server level mapping
Server level data type mappings are applied to all transactions with the data source that use a
certain data type. When a server of a determined <TYPE> is created, some data type mappings are
created by default. See the full list of default mappings.
Listing 29. Syntax for mapping on a particular server
CREATE TYPE MAPPING <TYPE-MAPPING-NAME> TO/FROM LOCAL TYPE <LOCAL-DATA-TYPE>
FROM/TO SERVER <SERVERNAME> REMOTE TYPE <REMOTE-DATA-TYPE>
Listing 30. Syntax for mapping on a server type
CREATE TYPE MAPPING <TYPE-MAPPING-NAME> TO/FROM LOCAL TYPE <LOCAL-DATA-TYPE>
FROM/TO SERVER TYPE <SERVERTYPE> REMOTE TYPE <REMOTE-DATA-TYPE>
<TYPE-MAPPING-NAME>
is a unique name for the mapping.
<LOCAL-DATA-TYPE>
is a valid data type in Big SQL V3.0.
<SERVERNAME>
is a server name created by using the create server command.
<SERVERTYPE>
is a server type.
<REMOTE-DATA-TYPE>
is a valid data type at the data source.
Nickname level mapping
Nickname level mapping affects only transactions that involve the column being mapped.
Nickname level mappings always apply in both directions (forward and reverse) and are defined on a specific column.
Listing 31. Mapping by altering an existing nickname
ALTER NICKNAME <NICKNAME> ALTER COLUMN <COLUMN> LOCAL TYPE
<TYPE>
<NICKNAME>
is the name of an existing nickname.
<COLUMN>
is a column name of <NICKNAME>.
<TYPE>
is a valid local data type.
Examples of data type mappings
The following code shows how to map different entities.
Listing 32. Map Oracle date to local date for all Oracle servers
CREATE TYPE MAPPING MY_ORACLE_DATE FROM LOCAL TYPE SYSIBM.DATE TO SERVER TYPE
ORACLE REMOTE TYPE DATE
Listing 33. Map date to local date for server LUWOMSERV (a DB2 for Linux, UNIX,
and Windows server that uses Oracle compatibility mode)
CREATE TYPE MAPPING MY_ORACLE_DATE FROM LOCAL TYPE SYSIBM.DATE TO SERVER
LUWOMSERV REMOTE TYPE DATE
Listing 34. Map decimal(7,2) column dec72 of nickname nick1 to double data
type
ALTER NICKNAME NICK1 ALTER COLUMN dec72 LOCAL TYPE DOUBLE
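After you create mappings such as the ones in Listings 32 through 34, you can check the result in the system catalog. The queries below are a sketch that assumes the standard DB2 catalog views SYSCAT.TYPEMAPPINGS and SYSCAT.COLUMNS are exposed in the BIGSQL database with their usual column names; verify the column list against your installation before you rely on it.

```sql
-- List user-defined type mappings, for example MY_ORACLE_DATE.
-- Default mappings that ship with a wrapper might not appear here.
SELECT TYPE_MAPPING, MAPPINGDIRECTION, TYPENAME,
       SERVERNAME, SERVERTYPE, REMOTE_TYPENAME
FROM SYSCAT.TYPEMAPPINGS;

-- Confirm the local type of column DEC72 on nickname NICK1
-- after the nickname level mapping in Listing 34.
SELECT COLNAME, TYPENAME, LENGTH, SCALE
FROM SYSCAT.COLUMNS
WHERE TABNAME = 'NICK1'
  AND COLNAME = 'DEC72';
```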
Nicknames
In Big SQL V3.0, a nickname is a local name for a remote object such as a table or a view. To
create a nickname, the user that issues the command must be mapped to a valid user on the data
source. User mapping is explained in the preceding section.
Create nickname syntax
The following examples show how to create nicknames for various entities.
Listing 35. Explicitly create a nickname for an existing table
CREATE NICKNAME <LOCAL_NAME> FOR <SERVER>.<SCHEMA>.<REMOTE_NAME>
<LOCAL_NAME>
is the name that the object has locally.
<SERVER>
is a unique server name.
<SCHEMA>
is the schema that the object was created under in the remote database.
<REMOTE_NAME>
is the name that the object has on the remote server.
Listing 36. Implicitly create a nickname by creating a table at the data source
CREATE TABLE <TABLE_NAME> (<COLUMN_DEFINITION>) OPTIONS (REMOTE_SERVER <SERVER>, REMOTE_SCHEMA <SCHEMA>)
<TABLE_NAME>
is the name of both the nickname and the remote table.
<SERVER>
is a unique server name.
<SCHEMA>
is the remote schema under which the table will be created.
Listing 37. Create a nickname for an existing table nicktbl on a Teradata data
source
CREATE NICKNAME NICK1 FOR
TERASERV.TERAUSER.NICKTBL
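Once the nickname exists, it can be queried like a local table. The statements below are a small sketch that uses only the nickname NICK1 from Listing 37; because the columns of NICKTBL are not described in this article, the queries avoid naming any.

```sql
-- Count the rows that the remote Teradata table currently holds.
SELECT COUNT(*) AS REMOTE_ROWS
FROM NICK1;

-- Look at a few rows through the nickname.
SELECT *
FROM NICK1
FETCH FIRST 5 ROWS ONLY;
```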
Listing 38. Create a table at the Netezza server and implicitly create the
nickname
CREATE TABLE "NZTBL1" ("C1" INTEGER , "C2" CHAR(20) ) OPTIONS(REMOTE_SERVER
'NZSERV', REMOTE_SCHEMA 'NZUSER')
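To confirm that the statement in Listing 38 also created the nickname, you can look up NZTBL1 in the catalog and then query it. This is a sketch under the assumption that SYSCAT.TABLES is available in the BIGSQL database and reports nicknames with TYPE = 'N', as it does in DB2 for Linux, UNIX, and Windows.

```sql
-- A nickname is expected to show up with TYPE = 'N'.
SELECT TABSCHEMA, TABNAME, TYPE
FROM SYSCAT.TABLES
WHERE TABNAME = 'NZTBL1';

-- Read the (initially empty) remote table through its nickname.
SELECT "C1", "C2"
FROM "NZTBL1";
```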
Common failures
Error
SQL20076N The instance for the database is not enabled for the specified action or operation.
Reason code = "1". SQLSTATE=0A502
Resolution
Enable federation and restart the server by using the following commands:
db2 update dbm cfg using federated yes
$BIGSQL_HOME/bin/bigsql stop
$BIGSQL_HOME/bin/bigsql start
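After the restart, you can check from an SQL session that the setting took effect. The query is a sketch that assumes the DB2 administrative view SYSIBMADM.DBMCFG is available to your user in the BIGSQL database.

```sql
-- VALUE is expected to be YES once federation is enabled.
SELECT NAME, VALUE
FROM SYSIBMADM.DBMCFG
WHERE NAME = 'federated';
```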
Error
• SQL5182N A required environment variable, "ORACLE_HOME", has not been set.
• SQL30090N Operation invalid for application execution environment. Reason code =
"31". SQLSTATE=25000
Resolution
Follow the setup steps under "Set up the connection to Oracle".
Error
When you use an Oracle data source, you get the following message:
SQL1097N The node name was not found in the node directory. SQLSTATE=42720
Resolution
If tnsnames.ora was created in a location other than the default, set the TNS_ADMIN variable in
db2dj.ini to that location.
TNS_ADMIN=/home/bigsql/cfgfiles/
/home/bigsql/cfgfiles/
is the absolute path to the directory that contains the tnsnames.ora file.
Related topics
• Download InfoSphere BigInsights Quick Start Edition, available as a native software
installation or as a VMware image.
• Refer to the InfoSphere BigInsights Knowledge Center for product documentation.
• Read "Understanding InfoSphere BigInsights" to learn more about the product's architecture
and underlying technologies.
• Watch BigInsights experts discuss the technology, give demos, and answer common
questions on the IBM big data channel on YouTube.
• Enroll in free online courses at Big Data University.
• Get a technical introduction to Big SQL on Slideshare.
© Copyright IBM Corporation 2014, 2015
(www.ibm.com/legal/copytrade.shtml)
Trademarks
(www.ibm.com/developerworks/ibm/trademarks/)