Best practices

IBM® Information Management
Enterprise Identifier
Nick Kanellos
Architect, MDM Server Development
IBM Canada Lab
Omar Chugtai
Managing Consultant - Software Services
MDM
Issued: <month> <year>
Enterprise Identifier
Executive Summary
Glossary of Terms
Introduction
Why is an EID important?
Defining an EID
Business key versus random key
Where can an EID come from?
Defining EID in InfoSphere MDM Server
Generating a New EID from scratch
Assign an EID to a Party when the Party is being created
Assign an EID to an existing Party
Maintaining EID
Can EID change for a Party?
Coordinating your EID with your database
EID syndication
Syndication via push mechanism
Syndication via pull mechanism
Using the EID in MDM Server
Locating a Party using their EID
Getting a Party’s EIDs
Getting a Party’s Primary EID
Best practices
Conclusion
Further reading
Contributors
Notices
Enterprise Identifier
Page 2 of 29
Trademarks
Executive Summary
Master Data Management Server is all about having a single, consolidated view of the
data that’s important to you. That data primarily consists of the people and
organizations that your organization interacts with. In MDM the people and
organizations that your organization interacts with are referred to as Parties. It’s not
often easy to uniquely identify a party based on their name and address. You may need
to know other attributes, and it is only when all the relevant attributes describing a party
are taken together can you be certain that the individual or organization you are dealing
with is the individual or organization you think you are dealing with.
However, using such a process each time you need to uniquely identify a party can be
very unwieldy. Once you’ve uniquely identified a party, in many cases it is useful to
assign a unique identifier to the party. That unique identifier then can act as a shortcut to
the process of examining all the attributes of a party each time you want to ensure that
you are dealing with the correct party. Each of us has been assigned such identifiers.
Examples include driver’s license numbers, health card numbers, social security
numbers, etc.
This document describes the practices that can be employed within MDM to employ
unique Enterprise wide Identifiers (EIDs) for the parties that are managed within MDM.
It discusses the mechanisms for generating EIDs, the data entities in MDM where they
can be stored, and how these EIDs can be used within MDM and among the applications
with which MDM operates.
Furthermore, it discusses the mechanisms you can use to share EIDs with other
applications, and how to keep those applications apprised of the changes regarding
parties and the EIDs.
Glossary of Terms
Term
Description
EID
Enterprise Identifier. A sequence of numeric
or alphanumeric characters that uniquely
identifies a Party in MDM. A party can have
more than one EID, but an EID can never be
shared amongst parties.
Suspect Duplicate Processing (SDP)
The process in MDM whereby several entries
of parties are identified as being duplicates
of a single actual individual or organization.
Collapse
The process of consolidation of two or more
duplicates of a party into a single new party
representing an actual person or organization.
Survivorship
The rules that govern what aspects of
duplicate party entries are carried over into
the new consolidated party representing an
actual person or organization.
Notification Framework
The technical framework within MDM
whereby MDM can notify other applications
of changes made to data within MDM. The
framework relies on the Java JMS
technology.
Behavioral Extension Framework
The technical framework within MDM
whereby you can plug into MDM services
additional logic embodied in Java programs.
In other words you can change or extend the
default behavior of MDM’s services.
Party
A general term referring to an individual
person or an organization that is stored in
MDM. It’s the entity with which your
organization has a direct or indirect business
relationship.
Introduction
There is a growing interest in understanding how IBM InfoSphere MDM addresses the
use case pertaining to lifecycle management of an Enterprise Identifier (EID) for Parties.
This document discusses the following topics in this regard:
• The importance of EID for an organization
• How to choose an EID
• EID Maintenance
• Proliferation of EIDs
The document concludes with a summary of best practices on the subject.
Why is an EID important?
Party information is the most crucial piece of data for organizations. For dotcoms, for
example, the number of customers is directly related to the value of the company; for
retailers, supplier information is crucial for streamlined supply chain management
processes, and customer information is important for targeted marketing campaigns and
business growth; for banking and insurance companies, party information is essential for
determining a client’s eligibility for a particular offering, and so forth. It is, therefore,
essential for organizations to maintain pristine party information and to apply data
governance principles to this data as it proliferates throughout the organization.
In reality, however, the more the information proliferates, the less control an
organization has over its data. The diagram below depicts the scenario:
[Figure: Systems A, B, and C each hold records for Jill Doe and Bruce Buck under different identifiers (Jill Doe: ID=1234, ID=765-238, ID=AJ765; Bruce Buck: ID=6197, ID=486-307, ID=HL732). A Cross Reference table links each pair of matching records with "Same as" entries.]
As illustrated above, party information may reside in multiple systems with no
indication that the records in fact describe the same person. One possible way to keep
track of all the places a party is maintained, and all the ways that the party is identified
in those places, is to maintain a complex set of cross-reference tables to tie the
information together, as is shown above. The problem is that the more ways a party is
maintained by an organization, and the more places the party is maintained, the more
complicated the cross-reference relationships become.
The concept of EID greatly simplifies the situation and is an integral part of sound master
data management practices. In addition, in the consolidated style of master data
management, where a single, master view of a party is maintained, EIDs play a vital role
in ensuring that you can get a true understanding of what constitutes a party.
An EID is simply a globally unique identifier that is assigned to a party when the party is
introduced in an organization. That identifier remains with the party throughout its life
cycle. All the systems in which a party is kept agree on the EID and that way the
organization can know with confidence that Bruce Buck in System A is the same Bruce
Buck in System B without having to carry out a great deal of auxiliary processing such as
ensuring that addresses are also the same, that the date of birth is the same, etc.
Defining an EID
Choosing an appropriate EID is an important decision. Keep the following
considerations in mind as you decide what to use for your EID:
Business key versus random key
Some of the characteristics of a good EID are:
• Must be globally unique
• Must be populated for all parties
• Must be stable and must not be subjected to changes
Business attributes such as numbers assigned by external agencies, or combinations of the
party’s name, address, date of birth, etc., might be considered good candidates for an
EID. After all, identifiers assigned by official agencies are guaranteed to be unique. Or
are they? Many people do not have a relationship with any given agency and
consequently may not have an identifier from that agency; examples include newborns,
people from other countries, etc. And you can’t mix identifiers from different agencies,
because then you can’t ensure that they are unique.
Also, these external identifiers can change. For example, a person who has been the
victim of identity theft may be assigned a new identifier. The EID that is
maintained in MDM Server would then also have to change to reflect the new identifier.
Besides, many organizations are notorious for populating such attributes with dummy
data, such as all 9’s or all 0’s, because of a lack of data governance
procedures in the various systems where a party’s data can be entered by clerks.
A better approach for an EID is to have an unintelligent, random and automatically
generated value, which is impervious to user intervention and business change. In other
words, the value needs to serve only one purpose, and that is to ensure, unequivocally,
that no other individual shares the same value – period. It need mean nothing more than
that. It is a stand-in for the process you would normally go through to ensure that you
are dealing with the correct individual.
Where can an EID come from?
When it comes to sources for EIDs you can divide the world into two halves: EIDs that
are generated within MDM, and EIDs that are generated outside MDM and supplied to
MDM. Earlier we said that you shouldn’t use identifiers that are used by external
agencies as an EID, and that advice holds. But you can have another application within
your organization whose job it is to generate EIDs and to share them with MDM. That’s
because, unlike for external agency identifiers, you are in charge of the identifier. For
example you may have a (legacy) application that’s been carrying out Master Data
Management duties for a long while and that has been generating EIDs as part of its job.
You may be transitioning from that older application to MDM, and for a period of
time the two applications may be working in tandem, sharing their data. The EIDs
generated by that legacy application have likely found their way throughout applications
in your organization and you don’t want to have to re-do them just because you’ve
adopted MDM. However, as you ramp up MDM Server and wind down the legacy
application, you’ll want the EID generation duties to be taken over by MDM. So, there
may also be a period of time when you may need to generate EIDs in MDM for some
occasions, and accept existing EIDs for other occasions. We’ll weave this concern into the
discussion that follows.
Defining EID in InfoSphere MDM Server
Why can’t the CONT_ID be the EID?
InfoSphere MDM Server creates a golden record for a party by consolidating information
from multiple sources, and it is certainly the best point of origination of the EID. This
information is stored in many database tables. Chief among them is the CONTACT
table, and if the party is a person, the PERSON table; or if the party is an organization,
the ORG table.
MDM Server creates a unique key for each party, CONT_ID. The CONT_ID is a number
that uniquely identifies a party within MDM. If you look at MDM’s database, you’ll see
that the CONT_ID is used virtually everywhere where there is data stored about a party.
And, as a bonus, it’s guaranteed to be unique. So you might think that the CONT_ID is
the EID.
However, the CONT_ID attribute in the MDM Server schema is very tightly coupled
with MDM’s own internal workings. The CONT_ID is nothing more than the primary
key for a bunch of tables in MDM Server, and the foreign key for a bunch more. Most
importantly, the CONT_ID is an internal number. There is nothing stopping MDM
Server from changing it for any given party without giving notice that it’s doing so. As a
matter of fact MDM does precisely that when it consolidates several contacts into one
actual party in a procedure known as Suspect Duplicate Processing (SDP).
So, it’s best not to use the CONT_ID also as the EID.
Where do I store the EID?
Introducing the Party Identifier Table
MDM has a special purpose database table called the PARTYIDENTIFIER table. It allows
you to associate one or more identifiers and identifier types with a party.
It is used to store means of identifying parties beyond simply using MDM’s own internal
CONT_ID. The purpose of this table is precisely to store things like the EID. In addition,
the values (and their meanings) stored in this table are entirely within your control. They
are not internal to MDM; they are yours to control and manipulate as you require.
But the identifier table alone doesn’t tell the entire story. That’s because there are
different ways of possibly identifying a party, and so different identifiers can be used.
Each entry in the IDENTIFIER table is associated with an identifier type in the Identifier
Type (CDIDTP) table. That way, if there is a number K0411-01957-19765 in the identifier
table, you can look up its type and see that it’s a driver’s license number as opposed to
some other number. Below is an illustration of the different ways an individual can be
identified using the identifier table.
[Figure: Each Contact (i.e. Party), such as Clark Kent (Superman), Peter Parker (Spiderman), Bruce Wayne (Batman), or Dr. Bruce Banner (Hulk), is linked to one or more Party Identifier entries. Each entry has a Value and a Type, where the Type is EID, SIN/SSN, or Drivers Lic.]
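The relationship between an identifier value and its type can be sketched in a few lines of Java. This is only an illustration of the lookup idea: the type codes 1 and 2, the class name IdentifierTypeSketch, and the in-memory maps are hypothetical stand-ins for rows in the identifier table and the CDIDTP table, not MDM Server code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only. The type codes (1, 2) and the in-memory maps are
// hypothetical stand-ins for rows in the CDIDTP and party identifier tables.
final class IdentifierTypeSketch {

    // Stand-in for CDIDTP: type code -> type name.
    static final Map<Integer, String> ID_TYPES = new HashMap<>();

    // Stand-in for the party identifier table: identifier value -> type code.
    static final Map<String, Integer> IDENTIFIERS = new HashMap<>();

    static {
        ID_TYPES.put(1, "Driver's Licence Number");
        ID_TYPES.put(2, "EID");
        IDENTIFIERS.put("K0411-01957-19765", 1);
        IDENTIFIERS.put("28950195", 2);
    }

    // Looks up what kind of identifier a stored value is, or null if unknown.
    static String typeOf(String identifierValue) {
        Integer typeCode = IDENTIFIERS.get(identifierValue);
        return typeCode == null ? null : ID_TYPES.get(typeCode);
    }

    public static void main(String[] args) {
        System.out.println(typeOf("K0411-01957-19765"));
    }
}
```

In a real deployment the lookup would go through MDM's services or database rather than in-memory maps; the point is only that a value is meaningless until its type is resolved.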
Specifying the EID as a type of identifier stored in the Party Identifier Table
To set up the Party Identifier table to store an EID, you must define that there is such a
thing as an EID and that it is stored in the party identifier table. The types of identifiers
that are stored in the Party Identifier table are listed in the Party Identifier Type
(CDIDTP) table. The following identifier types are provided to you by MDM out of the
box:
Party Identifier Types (CDIDTP)
Social Security Number
Corporate Tax Identification
Driver's Licence Number
Birth Certificate
Mother's Maiden Name
Tax Identification Number
Tax Registration Number
Passport Number
Health Card
Social Insurance Number
ABILITECLINK
DUNS number
<EID>  <-- Add a new Identifier Type
You can add new types of identifiers using MDM Server’s Business Administration User
Interface. Having specified that you have EIDs as types of identifiers, you can now start
using them.
How do I generate the EID? What should its value be?
Recall what the salient features of the EID are:
• Must be globally unique
• Must be populated for all parties
• Must be stable and must not be subjected to changes
It can be any value you like it to be and take on any format you like it to take. It can be a
combination of numerals and alphabetic characters.
Use the CONT_ID number
Earlier we told you not to use the CONT_ID as the EID, and we meant it. That simply
means you mustn’t use the CONT_ID field to mean two things: the MDM internal primary
key for a party, and the globally unique identifier known as the EID. But you can cheat –
a little.
You can borrow the number that MDM generates as the CONT_ID. Because we can’t
emphasize this enough, we’ll say it again: we don’t mean that you can call the CONT_ID
the EID and be done. We mean you can take the number that is generated by MDM
when a new party is created and store it separately as the EID in the party identifier
table. This approach has several advantages:
• The number is already generated for you.
• It is 18 numerals long.
• It’s guaranteed to be unique within MDM. That’s because you cannot create a
party in MDM with a duplicate CONT_ID.
This approach has a significant disadvantage in that the EID is the same number as the
CONT_ID. Users who are examining a party’s data and happen to notice that the EID is
the same number as the CONT_ID, may confuse the two. This doesn’t matter to many
users of MDM and it might not matter to you. If it doesn’t, then go ahead and use it.
Generating a New EID from scratch
Options
If you don’t like the idea of re-using the number generated as the CONT_ID, and you
would prefer to use a different number or identifier, you have several options at your
disposal:
• you can use MDM’s built-in key generating functions to generate your own, or
• you can use a unique ID generator from outside MDM, such as java.util.UUID, that gives
you identifiers that look like this: dc1314ee-07e9-4f08-9be7-2f8ba1c86342, or
• you can write your own from scratch.
We’ll briefly discuss option 1. If you want to use options two or three, you can follow the
discussion as we go on and simply plug in a different generator.
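As a minimal sketch of the second option, the snippet below uses java.util.UUID from the standard Java library to produce a random identifier. The class name EidUuidSketch and the method newEid are illustrative names only; they are not part of MDM Server.

```java
import java.util.UUID;

// Illustrative sketch only: EidUuidSketch and newEid are hypothetical names,
// not part of MDM Server.
final class EidUuidSketch {

    // Returns a random (version 4) UUID in its canonical 36-character form:
    // 32 hexadecimal digits separated by 4 hyphens.
    static String newEid() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        System.out.println("Generated EID: " + newEid());
    }
}
```

A random UUID makes collisions extremely unlikely rather than impossible, so if you take this route you may still want a uniqueness check wherever the EID is stored.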
Using MDM’s built-in ID generators
MDM comes with two built-in ID factories you can use. These are:
• com.dwl.tcrm.utilities.MDMPartyIdentifierFactory, and
• com.dwl.base.util.DWLIDFactory.
Generating a simple fixed length numeric ID
If all you need is a straightforward number, either factory will do. The DWLIDFactory
generates an 18 character numeric string (a string composed only of numerals 0 to 9).
The MDMPartyIdentifierFactory does the same except that the identifier’s length is 20
characters. Here is a code snippet that illustrates:
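// DWLIDFactory lives in the com.dwl.base.util package named above
import com.dwl.base.util.DWLIDFactory;

DWLIDFactory idFactory = new DWLIDFactory();
// generateID returns an Object, so cast the result to String
String aUniqueId = (String) idFactory.generateID(new Object());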
Ignore the new Object() argument in the generateID(new Object()) call. It doesn’t do
anything; you just need to follow this convention. That’s it. You can now use your
unique ID (i.e. aUniqueId).
Generating different types of numeric, alphabetic or alphanumeric IDs
Generally, you won’t need anything more sophisticated than a simple numeric identifier.
However, if you do, these two factories can also generate identifiers that can be
alphanumeric, numeric, or alphabetic of various lengths – lengths that you can specify.
To do this there are two more classes in MDM Server that you need to know about:
IDGenerationConstants is a class that enumerates the types of IDs you can generate.
Their names (i.e. ALPHA, NUMERIC, ALPHA_NUMERIC, NUMERIC_STRING) are
self-explanatory. The second class, IDParamObj, allows you to pass on the instructions you
want the ID generators to follow. For example if you want to create an ID that’s 12
numeric characters long, you set the IDParamObj.type field to
IDGenerationConstants.NUMERIC, and the IDParamObj.idlength field to 12.
The IDParamObj.size field allows you to specify how many IDs you want the factory
to generate. Here’s a snippet that shows you how to get 10 alphanumeric IDs, each 16
characters long, generated for you:
IDParamObj paramObj = new IDParamObj();
paramObj.setType(IDGenerationConstants.ALPHA_NUMERIC);
paramObj.setSize(10); //Generate 10 Ids
paramObj.setIdLength(16); //Sets ID length to 16
DWLIDFactory idFactory = new DWLIDFactory();
java.lang.Object[] uniqueIDs = idFactory
.generateMultipleIDs(paramObj);
That’s all there is to it, regarding the generation of the EIDs.
Assign an EID to a Party when the Party is being created
The correct time to assign an EID to a party is when the party is being created. This can
occur as a result of many transactions, but typically occurs during an AddParty,
AddPerson, or AddOrganization transaction. It might also occur during more
coarse-grained transactions, such as AddContract, AddContractPartyRole, AddClaim,
AddClaimPartyRole, CollapseMultipleParties, CollapseParties, etc. However, due to
MDM’s modularity, these coarse-grained transactions will call the finer-grained ones, and
so we’ll focus on the finer-grained ones when it comes to generating a new EID.
If a new party is being created, and you wish to preserve an existing EID for the party,
you can do so by including the EID as a component of the data comprising that party.
Conversely, if the EID is not part of the new party’s inbound data, you can generate one
and incorporate it into the data before MDM proceeds to save the party’s data to the
database.
Use a behavior extension to assign an EID to a Party
The best way to assign an EID to a party is to use a behavior extension. It’s beyond the
scope of this text to go into the details of behavior extensions. There is plenty of
documentation that describes how to do that. Here, we’ll simply describe what the
behavior extension should do and where it should be invoked.
When a new party is being created, all the aspects of the party that are known at the time
of creation are included in the PartyBObj/PersonBObj/OrganizationBObj. This includes
the party’s address, the party’s name, whatever contact methods are in effect for the
party, etc. These are all additional business objects that are part of the party business
object that is the basis of the transaction. You might picture it something like this:
One of the business objects that the party business object contains is the
TCRMPartyIdentificationBObj. That’s the business object that represents the Party
Identifier. It’s in one of those objects that the EID belongs as well. Let’s take a closer look
at the TCRMPartyIdentificationBObj:
Note that this is only a partial list of the getter and setter methods belonging to this
business object. The methods that are of primary interest to us are the ones related to
Identification Value and Identification Type. The first is the value of the EID. The
second denotes that this particular party identification object is related to the EID. Recall
that in order to have an EID as a valid type of party identifier, you must define it in the
CDIDTP table (CDIDTP is database-ese for Code, Identifier Type).
Recall, also, our discussion around where EIDs can come from. If MDM Server is
working in tandem with other (possibly legacy) applications that may be carrying out
some (subset) of MDM functions, there may already be an EID for a party that is being
created in MDM. If there is, then it is in one of these TCRMPartyIdentificationBObj
objects that it will be found. So before you go about generating one, you can check to
make sure there isn’t already one supplied in the inbound data comprising the party.
Here is a brief algorithm for what a behavior extension that adds an EID must do:
1. Execute this behavior extension in the pre-execute phase of the controller-level
transaction.
2. In the behavior extension, extract the TCRMPartyBObj.
3. From the TCRMPartyBObj, extract the list of TCRMPartyIdentificationBObjs
using the TCRMPartyBObj.getItemsTCRMPartyIdentificationBObj
method.
4. Examine each TCRMPartyIdentificationBObj to ensure it does not represent an
EID. If you find one, exit.
5. Create a new TCRMPartyIdentificationBObj. Generate a new EID number using
one of the generation methods we discussed earlier. Specify the Type as being an
EID using the type code for EIDs from the CDIDTP table.
6. Add the TCRMPartyIdentificationBObj you just created back to the list of
TCRMPartyIdentificationBObjs in the TCRMPartyBObj.
7. Return the TCRMPartyBObj to MDM and exit the behavior extension.
8. MDM will incorporate the TCRMPartyIdentificationBObj you created into its
processing and will preserve it to the MDM database.
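The core of the steps above (check for an inbound EID, and generate one only if none is present) can be sketched as follows. This is a sketch under stated assumptions, not a behavior extension: Party and PartyIdentification are simplified, hypothetical stand-ins for TCRMPartyBObj and TCRMPartyIdentificationBObj, EID_TYPE_CODE is an assumed value for the EID entry you add to CDIDTP, and java.util.UUID stands in for whichever generator you choose.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Illustrative sketch only. Party and PartyIdentification are hypothetical
// stand-ins for TCRMPartyBObj and TCRMPartyIdentificationBObj; they are NOT
// the MDM Server classes.
final class AssignEidSketch {

    // Assumption: the type code you defined for EIDs in CDIDTP.
    static final String EID_TYPE_CODE = "EID";

    static final class PartyIdentification {
        final String type;
        final String value;

        PartyIdentification(String type, String value) {
            this.type = type;
            this.value = value;
        }
    }

    static final class Party {
        final List<PartyIdentification> identifications = new ArrayList<>();
    }

    // Keeps an inbound EID if one exists; otherwise generates and adds one.
    static void ensureEid(Party party) {
        for (PartyIdentification id : party.identifications) {
            if (EID_TYPE_CODE.equals(id.type)) {
                return; // step 4: an EID was supplied in the inbound data
            }
        }
        // step 5: generate a value (any generator discussed earlier would do)
        String eid = UUID.randomUUID().toString();
        // step 6: add the new identification back to the party
        party.identifications.add(new PartyIdentification(EID_TYPE_CODE, eid));
    }

    public static void main(String[] args) {
        Party party = new Party();
        party.identifications.add(new PartyIdentification("SSN", "765 876 432"));
        ensureEid(party);
        System.out.println("Identifications on party: " + party.identifications.size());
    }
}
```

Calling ensureEid a second time on the same party leaves it unchanged, which mirrors the "if you find one, exit" rule in step 4.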
Note that this approach only works if you are generating your own EID number and not
copying the CONT_ID that MDM generates automatically. That’s because in the
pre-execute method of the controller-level transactions for adding parties, the CONT_ID
hasn’t been created yet. If you want to use the CONT_ID, you must write your behavior
extension to run in the post-execute phase of the component-level transaction for
AddPerson or AddOrganization. At that point, the CONT_ID will be available to you.
However, you’ll have to explicitly add the EID using the
TCRMPartyIdentificationComponent.addPartyIdentification() method.
Assign an EID to an existing Party
For reasons that are difficult to predict, you may end up with parties in MDM that have
no EID. You can quite easily create an EID for an existing party. There’s an MDM
service specifically for that. It’s called the AddPartyIdentification service and it takes as
input the TCRMPartyIdentificationBObj business object we described above.
If you need to add an EID to an existing party, it’s a better idea to do so explicitly by
calling MDM and invoking the AddPartyIdentification service. If you attempt to embed
this kind of capability into a behavior extension, you may be needlessly invoking it for the
majority of parties that do have an EID. For example, you could write a behavior
extension that, for a given party, checks to see if there is an EID, and if there isn’t,
generates one and persists it to the database. You could embed such a behavior
extension in the pre-execute phase of all updateParty transactions, but you would be
causing its invocation, likely needlessly in most cases, for all updateParty transactions.
Furthermore, you’d have no guarantee of actually identifying those parties that have no
EID and supplying one for them.
A sounder approach is to explicitly identify those parties not having an EID, and to
generate one for them.
Maintaining EID
Can EID change for a Party?
An EID is defined once for the party and is never changed. Certainly, that is true from a
manual intervention perspective.
In consolidation-style implementations, where the data within MDM Server is being
aggregated from multiple systems, it’s likely that many systems that supply MDM with
their data all keep track of the same individual. As they feed MDM, it’s highly likely that
they will try to feed MDM with the same party. It’s beyond the scope of this text to
discuss MDM’s methods for handling duplicated parties. There’s another best practices
document entitled InfoSphere Master Data Management Server Suspect Duplicate Processing
that thoroughly describes it. Here, we’ll limit our discussion to what happens when
multiple instances of the same party end up in MDM – each with their own EID.
When MDM discovers that it has two instances of the same party, its first job is to
consolidate the multiple instances into one new consolidated party. But what happens to
the EID? Like we’ve said before, a party can – must – only have one EID. There are two
rules you follow:
If the EID is never shared outside MDM, then the rule is simple: keep one EID from the
parties being consolidated and get rid of the rest. When a new consolidated party is
created from the two or more duplicate parties, pick one of the EIDs from the existing
parties being consolidated and use it for the new party.
However it’s far more likely that an EID is shared amongst many systems throughout an
organization (why else would you go through the trouble of keeping one?). The EID may
be the link between other systems' views of a party and MDM’s. The EID may be the key
through which multiple systems synchronize their views of a Party amongst themselves
and with MDM. In this case you can’t just delete the EIDs you no longer need when you
consolidate parties in MDM. Those EIDs may be the only means some applications have
to identify a party when they interact with MDM.
A final possibility is that, in addition to sharing an EID within an organization, a party’s
EID may be shared with entities outside the organization, including the organization or
the individual herself. It’s far less likely the IT infrastructures of these other types of
entities will be integrated with MDM as tightly as is possible when the EID is shared
exclusively within your own organization, and so during suspect duplicate
processing, when duplicate parties are consolidated, there may be no way to inform all
the users of the EID that it may change.
In these cases, you must preserve ALL the EIDs from the parties you’ve consolidated.
But doesn’t that break the rule about any party having only one EID? It would if all the EIDs
were kept active. Instead, all the EIDs should still be carried over to the new
consolidated party; however they all should be “end dated” (i.e. inactivated) as of the
moment of consolidation, except one EID. The one that remains active can be called the
Primary EID or simply the Active EID. If, on the other hand, all the applications that
interact with MDM are tightly integrated, and maintain flexible, two-way
communication, it may be feasible to inform each of them of the consolidation of
numerous parties into one, and the preservation of a single EID. If you can do that, then
you need only preserve the one EID that is assigned to a party.
Given the many ways in which an EID can be shared outside of MDM, and the
varying degrees of sophistication with which MDM is built into an
organization’s information technology infrastructure, or with which MDM interacts with
entities outside your organization, there is no single “correct” way of maintaining an EID.
We merely present the possible scenarios in which an EID can be used and
shared.
After consolidation, the EID that remains active, and only that EID, should be shared
with the systems that MDM itself feeds.
There are several transactions that support collapsing party information, such as:
collapseMultipleParties, collapseParties, collapsePartiesWithRules,
comparativeCollapseMultipleParties, comparativeCollapseParties,
previewCollapseMultipleParties, previewCollapseParties. Detailed descriptions of these
transactions are provided in the IBM® InfoSphere Master Data Management Server
Transaction Reference Guide.
One of two rules plays a role in the Collapse process, depending on whether two or more
than two parties are involved in the collapse action:
• Rule 38, com.dwl.tcrm.externalrule.CollapsePartiesWithRules
• Rule 119, com.dwl.tcrm.externalrule.CollapseMultiplePartiesWithRules
The basic function of these rules is to inactivate the duplicate parties and to create a new
consolidated party containing the characteristics you want to preserve for the party
henceforth. Part of such a rule’s behavior must be to implement your desired EID
strategy. You can replace these rules with your own, or you can modify them to
implement your own EID survivorship strategy.
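One way to keep the survivorship decision separate from the rest of a custom collapse rule is sketched below. Everything here is an assumption made for illustration: SourceParty, EidSurvivorshipStrategy and OldestPartyWins are hypothetical names, not part of the external rule interfaces, and "keep the EID of the longest-established party" is only one possible policy.

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.List;

// Hypothetical view of a source party taking part in a collapse.
final class SourceParty {
    final String eid;
    final LocalDate createdOn;

    SourceParty(String eid, LocalDate createdOn) {
        this.eid = eid;
        this.createdOn = createdOn;
    }
}

// Hypothetical strategy interface; a custom collapse rule could delegate to it.
interface EidSurvivorshipStrategy {
    String chooseSurvivor(List<SourceParty> sources);
}

// One possible policy: keep the EID of the longest-established party,
// breaking ties on the EID value so the choice is deterministic.
final class OldestPartyWins implements EidSurvivorshipStrategy {
    @Override
    public String chooseSurvivor(List<SourceParty> sources) {
        return sources.stream()
                .min(Comparator.comparing((SourceParty p) -> p.createdOn)
                        .thenComparing((SourceParty p) -> p.eid))
                .orElseThrow(() -> new IllegalArgumentException("no source parties"))
                .eid;
    }
}
```

Because the policy sits behind an interface, it can be swapped without touching the logic that inactivates the duplicates and builds the consolidated party.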
Coordinating your EID with your database
Beyond being stored and shared, an EID is useful because it enables applications to
retrieve the data for a party using the EID, instead of the internal CONT_ID. If you are
employing a partitioning strategy for your database, it would be wise to coordinate your
partitioning strategy with your EID.
Normally, partitioning is done using the CONT_ID as the partitioning key. By using the
CONT_ID as the partitioning key, you ensure that all (or at least, most) of the data
associated with an individual party can be segregated into a single partition. This
markedly reduces the resources necessary to gather up the data for a party each time it is
retrieved by MDM.
If you are employing database partitioning, and if you are using the CONT_ID as the
key, a good strategy is to also use the CONT_ID number as the EID. That way the EID
and the remaining party data will most likely be on the same partition, and your
retrievals will be faster if you are retrieving party data using the EID.
Note that this strategy is not guaranteed to work all the time. If you’ve had to
consolidate duplicate parties and you’ve had to preserve their EIDs, you may be
preserving EIDs from different partitions and the benefits of this strategy are reduced.
However, it’s still a sound policy to follow, because the number of consolidated parties
will generally be a small fraction of the overall number of parties in the MDM Database.
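The reasoning behind this strategy can be illustrated with a small sketch. It assumes numeric EIDs and uses plain modulo arithmetic as a stand-in for the database's real partitioning function, which it does not attempt to reproduce; PartitionSketch and its methods are hypothetical names.

```java
final class PartitionSketch {
    // Assumption: plain modulo stands in for the database's real
    // partitioning function, which this sketch does not reproduce.
    static int partitionOf(long key, int partitionCount) {
        return (int) Math.floorMod(key, (long) partitionCount);
    }

    // True when a lookup keyed on the EID lands on the partition that
    // holds the party's CONT_ID-keyed data.
    static boolean coLocated(long contId, long eid, int partitionCount) {
        return partitionOf(contId, partitionCount) == partitionOf(eid, partitionCount);
    }

    public static void main(String[] args) {
        int partitions = 8;
        // EID reuses the CONT_ID number: always co-located.
        System.out.println(coLocated(1007L, 1007L, partitions));
        // EID preserved from a consolidated duplicate: may live elsewhere.
        System.out.println(coLocated(1007L, 2004L, partitions));
    }
}
```

An EID that reuses the CONT_ID number always maps to the party's own partition, whereas an EID inherited from a consolidated duplicate carries no such guarantee.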
EID syndication
So far we have talked about the maintenance of EIDs within MDM Server: the
modeling choices and lifecycle management. A fair question is how an EID can
be propagated to the downstream channels. MDM Server offers a variety of options
to address this:
Syndication via push mechanism
An MDM Server transaction goes through various layers of execution, and there are
customization points (behavior extensions) available where you can add your own logic
to extend default functionality. The purpose of a behavior extension might be to kick
off some implementation-specific business rule, to create an EID, or to generate a
notification message and push it onto a queue.
MDM Server, out of the box, comes with a Notification Framework that allows for a
custom message to be defined by implementing a Java interface, and then posting it to a
queue of choice via a centralized notification manager.
As illustrated above, the notification framework can be invoked through a behavior
extension of a transaction by first instantiating the custom notification class and then
calling the framework’s sendNotification (custom class instance) method to send the
message off to a queue.
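The shape of that interaction can be sketched in plain Java. The types below are hypothetical stand-ins written for this illustration, not the Notification Framework's actual classes or signatures; only the sequence (instantiate the custom notification, then pass it to sendNotification) comes from the description above. An in-memory queue replaces the real message queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical stand-in for the interface a custom notification implements.
interface NotificationSketch {
    String toMessage();
}

// Custom notification announcing the EID assigned to a party.
final class EidAssignedNotification implements NotificationSketch {
    private final String eid;

    EidAssignedNotification(String eid) {
        this.eid = eid;
    }

    @Override
    public String toMessage() {
        return "EID_ASSIGNED:" + eid;
    }
}

// Hypothetical stand-in for the centralized notification manager; an
// in-memory queue replaces the real message queue.
final class NotificationManagerSketch {
    final Queue<String> queue = new ArrayDeque<>();

    void sendNotification(NotificationSketch notification) {
        queue.add(notification.toMessage());
    }
}

// Hypothetical behavior extension: instantiate the custom notification,
// then hand it to the manager.
final class EidNotificationExtension {
    private final NotificationManagerSketch manager;

    EidNotificationExtension(NotificationManagerSketch manager) {
        this.manager = manager;
    }

    void execute(String eid) {
        manager.sendNotification(new EidAssignedNotification(eid));
    }
}
```

Consult the product documentation for the real interface and manager class names before adapting this outline.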
The question now arises of where to plug in the behavior extension that sends out the
notification. The answer is easy once we understand how MDM Server
transactions work. Transactions in MDM Server are atomic: either all
the operations of the transaction succeed or none of them does. MDM
Server does not partially commit data. With this in mind, the best place to
define the behavior extension is after the transaction has successfully completed, that is,
“post” transaction. If the behavior extension is defined at an earlier stage of
transaction execution, the notification may already have been sent out
when the transaction subsequently fails, leaving downstream systems out of sync with MDM. The diagram below
depicts the correct usage of a behavior extension as it pertains to notifications:
As shown above, the notification is generated when the “post” behavior
extension of the transaction is executed, which ensures that the data within MDM Server
and the data sent out via notification are in sync.
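The difference between notifying before and after the transaction can be demonstrated with a toy transaction runner. This is a sketch under simplifying assumptions: PostHookSketch is a hypothetical class, a thrown exception stands in for a failed (rolled back) transaction, and a list stands in for the outbound queue.

```java
import java.util.ArrayList;
import java.util.List;

final class PostHookSketch {
    // Messages that have left MDM; once sent they cannot be rolled back.
    final List<String> sent = new ArrayList<>();

    // Runs the body between the two hooks. The post hook fires only when
    // the body completes without throwing; a failure counts as a rollback.
    boolean run(Runnable preHook, Runnable body, Runnable postHook) {
        preHook.run();
        try {
            body.run();
        } catch (RuntimeException failure) {
            return false; // rolled back; post hook skipped
        }
        postHook.run();
        return true;
    }
}
```

If the notification is sent from the pre hook and the body then fails, the message has already gone out for a change that never happened. Sent from the post hook, the same failure produces no message at all.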
“Thin” versus “fat” notifications
This is an architectural consideration that defines what the notification out of MDM
Server ought to look like. The discussion stems from the fact that, more often than
not, each destination system has its own interface requirements for assimilating
changes. One way around this is to send everything about the changed party in the
notification (a “fat” notification) so that the data may be consumed in whichever way
the destination systems require. This is a good approach provided the Service Level
Agreements (SLAs) are being met and there is adequate queue infrastructure available
to handle big messages and large volumes.
Another architectural pattern that may be used is that MDM Server generates a “thin”
notification - more like a pointer to the party that has changed. The systems monitoring
the queue would get the notification and would reach out to MDM at their own
convenience to get the rest of the information.
The above diagram depicts a more decentralized approach where various clients can
invoke a web service to get further information about the party. A more centralized
architecture (not shown) may be achieved by having a Message Broker or an Enterprise
Service Bus (ESB) that fetches the notifications from the queue, calls the MDM Server for
additional information, and then formats and routes the message to various channels.
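The two message shapes can be contrasted in a short sketch. All names are hypothetical, party data is reduced to a simple map of attribute names to values, and the lookup function stands in for whatever call back to MDM a consumer would really make (for example, a web service request).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// "Fat" message: carries the changed party's data along with the EID.
final class FatNotification {
    final String eid;
    final Map<String, String> partyData;

    FatNotification(String eid, Map<String, String> partyData) {
        this.eid = eid;
        this.partyData = Collections.unmodifiableMap(new HashMap<>(partyData));
    }
}

// "Thin" message: only a pointer to the party that changed.
final class ThinNotification {
    final String eid;

    ThinNotification(String eid) {
        this.eid = eid;
    }
}

// Consumer of thin messages. The lookup function is a hypothetical
// stand-in for a call back to MDM.
final class ThinNotificationConsumer {
    private final Function<String, Map<String, String>> lookupByEid;

    ThinNotificationConsumer(Function<String, Map<String, String>> lookupByEid) {
        this.lookupByEid = lookupByEid;
    }

    // Fetches the party data at the consumer's own convenience.
    Map<String, String> handle(ThinNotification notification) {
        return lookupByEid.apply(notification.eid);
    }
}
```

The thin variant keeps queue messages small at the cost of one extra round trip per consumer; the fat variant does the reverse.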
Syndication via pull mechanism
MDM Server stores party data in highly normalized tables. Each row in the database is
time stamped with a last update date. If the EID propagation requirements are not near
real time, then a periodic external batch process may be scheduled to pull changed party
information based on the last update date and use that to synchronize the downstream
systems.
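The core of such a batch process is a watermark comparison, sketched below against an in-memory list rather than the database. PartyChangeRow and ChangePuller are hypothetical names, and the sketch assumes the batch job remembers the newest timestamp it has already processed.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical projection of a party row: its EID and last update timestamp.
final class PartyChangeRow {
    final String eid;
    final Instant lastUpdate;

    PartyChangeRow(String eid, Instant lastUpdate) {
        this.eid = eid;
        this.lastUpdate = lastUpdate;
    }
}

final class ChangePuller {
    // Returns the rows updated strictly after the previous run's watermark.
    static List<PartyChangeRow> changedSince(List<PartyChangeRow> rows, Instant watermark) {
        List<PartyChangeRow> changed = new ArrayList<>();
        for (PartyChangeRow row : rows) {
            if (row.lastUpdate.isAfter(watermark)) {
                changed.add(row);
            }
        }
        return changed;
    }

    // The next run starts from the newest timestamp seen so far.
    static Instant nextWatermark(List<PartyChangeRow> changed, Instant current) {
        Instant next = current;
        for (PartyChangeRow row : changed) {
            if (row.lastUpdate.isAfter(next)) {
                next = row.lastUpdate;
            }
        }
        return next;
    }
}
```

Each run passes the stored watermark to changedSince, synchronizes the returned parties downstream, and then saves the result of nextWatermark for the following run.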
Using the EID in MDM Server
In this section we’ll introduce the MDM transactions that you can use to work with EIDs.
We’ll not go into detail since they are adequately described in the MDM Server
Transaction Reference Guide.
Locating a Party using their EID
To get a party using its EID, use the SearchParty transaction in MDM Server. Among
the inputs to the SearchParty transaction in the TCRMPartySearchBObj are the party
identification number (i.e. the EID) and the party identification type (i.e. the type of party
identifier that denotes an EID).
Getting a Party’s EIDs
When you issue a GetParty transaction, the party’s identifications are included in the
result. If you only want a party’s identification, you can issue a
GetAllPartyIdentifications request. If there is a specific identification you want,
you can issue a GetPartyIdentification request.
Note that these transactions are not specific to EIDs. They apply to all types of party
identifications including driver’s licenses, health cards, social security numbers, etc. You
must sift through the responses you’ve got to get the EIDs. You’ll know when you’ve got
an EID by examining the party identification type code.
Getting a Party’s Primary EID
As discussed earlier, a party can have only one active EID at a time. That’s the
primary EID.
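Combining the type-code check with the one-active-EID rule, a client could pick out the primary EID from a list of returned identifications roughly as follows. This is a sketch only: PartyIdentification is a hypothetical simplification of the response data, the type code value "EID" is a placeholder for whatever code your implementation uses, and a null end date is assumed to mean active.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical view of one party identification taken from a response.
final class PartyIdentification {
    final String typeCode;
    final String number;
    final Instant endDate; // null while the identification is active

    PartyIdentification(String typeCode, String number, Instant endDate) {
        this.typeCode = typeCode;
        this.number = number;
        this.endDate = endDate;
    }
}

final class PrimaryEidFinder {
    // Assumption: placeholder value; use the type code your
    // implementation has configured for EIDs.
    static final String EID_TYPE_CODE = "EID";

    // Keeps only the identifications whose type code marks them as EIDs.
    static List<PartyIdentification> eidsOnly(List<PartyIdentification> all) {
        List<PartyIdentification> eids = new ArrayList<>();
        for (PartyIdentification id : all) {
            if (EID_TYPE_CODE.equals(id.typeCode)) {
                eids.add(id);
            }
        }
        return eids;
    }

    // Returns the single active EID, or null when none is active.
    static String primaryEid(List<PartyIdentification> all) {
        String primary = null;
        for (PartyIdentification id : eidsOnly(all)) {
            if (id.endDate == null) {
                if (primary != null) {
                    throw new IllegalStateException("more than one active EID");
                }
                primary = id.number;
            }
        }
        return primary;
    }
}
```

Treating a second active EID as an error surfaces data that breaks the one-active-EID rule instead of silently picking one.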
Best practices
• Use the Party Identifier Table to store Enterprise IDs.
• You can use the CONT_ID number as the EID, but don’t use the
CONT_ID field as the EID.
• You can generate your own EID using one of several mechanisms
including MDM Server's own built-in ID generators.
• When several parties are consolidated into one, decide whether
it’s best to keep all the EIDs or whether you only need to keep
one. The decision depends on how the systems that interact with
MDM employ the EID.
• It’s best to use behavior extensions coupled with the MDM
Notification framework to share or ‘syndicate’ EIDs amongst the
applications that work with MDM.
Conclusion
Enterprise IDs (EIDs) are an important way to ensure that each person or organization
(i.e. Party) you think you are dealing with in MDM is actually the one you want to be
dealing with. EIDs can be used in lieu of the complex, expensive and unwieldy process
of ensuring that you are dealing with the correct party in MDM by examining all their
relevant attributes (such as their address, their date of birth, their gender, etc). It is a best
practice to maintain an EID especially if MDM shares data with other applications. The
EID may be the only way these other applications have to correctly locate the party in
MDM.
There are several ways to create an EID. EIDs can be generated within MDM using
several available mechanisms; the number used for the CONT_ID can also be used as the
EID (but not the CONT_ID itself); or EIDs can be pre-existing and can be added to MDM
when the parties themselves are added to MDM.
When duplicate parties, each having their own EID, are inevitably consolidated in MDM,
the strategy you take to preserve the EIDs will depend on
how the EID is shared amongst other applications. If the other applications can be
apprised readily of the changes, it may be possible to consolidate the EIDs along with the
consolidation of the parties’ data. If you can’t be sure of that, then you will likely need to
keep all the EIDs alongside the consolidated party, end-dating all but one. This will enable other
applications to continue to locate the party using previous EIDs.
You should use the MDM behavior extension and notification frameworks to keep other
applications that rely on MDM data synchronized with MDM, including their EIDs.
Further reading
• Information Management best practices:
http://www.ibm.com/developerworks/data/bestpractices/
Contributors
Lena Woolf – Senior Product Architect - InfoSphere MDM
Stephanie Hazlewood - Senior Product Architect - InfoSphere MDM
Karen Chouinard - Senior Manager, MDM Portfolio - Customer Focus
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other
countries. Consult your local IBM representative for information on the products and services
currently available in your area. Any reference to an IBM product, program, or service is not
intended to state or imply that only that IBM product, program, or service may be used. Any
functionally equivalent product, program, or service that does not infringe any IBM
intellectual property right may be used instead. However, it is the user's responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in
this document. The furnishing of this document does not grant you any license to these
patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where
such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES
CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do
not allow disclaimer of express or implied warranties in certain transactions, therefore, this
statement may not apply to you.
Without limiting the above disclaimers, IBM provides no representations or warranties
regarding the accuracy, reliability or serviceability of any information or recommendations
provided in this publication, or with respect to any results that may be obtained by the use of
the information or observance of any recommendations provided herein. The information
contained in this document has not been submitted to any formal IBM test and is distributed
AS IS. The use of this information or the implementation of any recommendations or
techniques herein is a customer responsibility and depends on the customer’s ability to
evaluate and integrate them into the customer’s operational environment. While each item
may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee
that the same or similar results will be obtained elsewhere. Anyone attempting to adapt
these techniques to their own environment does so at their own risk.
This document and the information contained herein may be used solely in connection with
the IBM products discussed in this document.
This information could include technical inaccuracies or typographical errors. Changes are
periodically made to the information herein; these changes will be incorporated in new
editions of the publication. IBM may make improvements and/or changes in the product(s)
and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM websites are provided for convenience only
and do not in any manner serve as an endorsement of those websites. The materials at those
websites are not part of the materials for this IBM product and use of those websites is at your
own risk.
IBM may use or distribute any of the information you supply in any way it believes
appropriate without incurring any obligation to you.
Any performance data contained herein was determined in a controlled environment.
Therefore, the results obtained in other operating environments may vary significantly. Some
measurements may have been made on development-level systems and there is no
guarantee that these measurements will be the same on generally available systems.
Furthermore, some measurements may have been estimated through extrapolation. Actual
results may vary. Users of this document should verify the applicable data for their specific
environment.
Information concerning non-IBM products was obtained from the suppliers of those products,
their published announcements or other publicly available sources. IBM has not tested those
products and cannot confirm the accuracy of performance, compatibility or any other
claims related to non-IBM products. Questions on the capabilities of non-IBM products should
be addressed to the suppliers of those products.
All statements regarding IBM's future direction or intent are subject to change or withdrawal
without notice, and represent goals and objectives only.
This information contains examples of data and reports used in daily business operations. To
illustrate them as completely as possible, the examples include the names of individuals,
companies, brands, and products. All of these names are fictitious and any similarity to the
names and addresses used by an actual business enterprise is entirely coincidental.
COPYRIGHT LICENSE: © Copyright IBM Corporation 2012. All Rights Reserved.
This information contains sample application programs in source language, which illustrate
programming techniques on various operating platforms. You may copy, modify, and
distribute these sample programs in any form without payment to IBM, for the purposes of
developing, using, marketing or distributing application programs conforming to the
application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions.
IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these
programs.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International
Business Machines Corporation in the United States, other countries, or both. If these and
other IBM trademarked terms are marked on their first occurrence in this information with a
trademark symbol (® or ™), these symbols indicate U.S. registered or common law
trademarks owned by IBM at the time this information was published. Such trademarks may
also be registered or common law trademarks in other countries. A current list of IBM
trademarks is available on the Web at “Copyright and trademark information” at
www.ibm.com/legal/copytrade.shtml
Windows is a trademark of Microsoft Corporation in the United States, other countries, or
both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.