ORACLE PRODUCT DATA QUALITY

ORACLE DATA SHEET
ORACLE PRODUCT DATA QUALITY
Even the best Product Information Management (PIM), Master Data
Management (MDM) or data integration initiative can be derailed by
poor product data quality. Oracle Product Data Quality is built on the
latest semantic technology to handle the extreme variability of product
data and overcome the limitations of traditional customer data quality
solutions. Oracle Product Data Quality can be integrated with any
application or process, and is pre-integrated at a semantic level with the
SPECIALIZED SEMANTIC
Oracle Product Hub to reduce the time and cost to deploy and operate
TECHNOLOGY FOR PRODUCT
your MDM system for product data, or any other system of record –
DATA QUALITY
while also extending its capability and benefits.
FEATURES
• Semantic recognition of
structured or unstructured
product information
• Auto-learning improves
recognition as a natural part of
ongoing operation
• Integrated governance
capabilities for data stewards
and product specialists
• Designed for rules
maintenance by business users
• Advanced transformation
capabilities including
classification and translation
• Advanced matching capabilities
including semantic matching
• Best-in-class patented
DataLens™ technology
• Semantic-level integration to
Oracle Product Hub
Product Data Quality – Specific Problems, Specialized Solution
Data quality has been recognized for some time as a critical component of any Master
Data Management or data integration initiative, but it is all too often under-planned and
under-funded and ends up as the greatest single risk factor in the project. When dealing
with product data this problem is compounded as traditional data quality tools that were
designed for customer (name & address) data struggle with the extreme variability of
typical product data. Consequently enterprises are forced to use manual effort or custom
code to ‘clean-up’ their data – neither of which is likely to be a good solution.
Oracle Product Data Quality was built from the ground up to tackle these problems and
to integrate easily with any existing application or process – from PLM to procurement
and from online retail to asset management and many more. It can be called in real-time
or batch mode and can provide PDQ services from simple standardization to match,
merge, transformation, translation and more.
Oracle Product Data Quality is pre-integrated with Oracle Product Hub making it the
first and only MDM for product data solution to offer a dedicated product data quality
capability that is both highly capable of dealing with the specific nuances of product data
BENEFITS
• Improve revenue – greater
agility, searchability, customer
satisfaction
• Increase PIM ROI, reduce
implementations cost and risk
• Reduce manual data cleansing
effort
• Platform for governance,
compliance and continuous
improvement
and is fully integrated to the core PIM system.
Oracle Product Data Quality – the Right Tool for the Job
Product data covers tens of thousands of product categories, each of which has a different
semantic definition or context (vocabulary, inferences, validations etc.). To make
matters worse, within a single category the format of the data can be infinitely variable
(word order, punctuation, synonyms, national language, abbreviation etc.). Such
variation is beyond the scope of traditional data quality solutions and a different
technology approach is required. Oracle Product Data Quality incorporates patented
DataLens™ technology specifically to deal with these issues:
•
Semantic-based – Semantic recognition technology handles pattern and category
1
ORACLE DATA SHEET
• THE
Translation
costs reduced
to a
ADVANTAGE
OF
variations without additional programming.
DATALENS
TECHNOLOGY
fraction of manual
effort
•
Auto-learning – Semantic inference is applied to understand unrecognized data. If
confirmed, inferences create new rules and the system continues to learn as a natural
consequence of ongoing operation.
•
Scalability – Category-based semantic models and in-memory processing allow
easy scaling to handle of many millions of items across thousands of categories.
•
Integrated governance – Dedicated interface for data stewards and product
WHAT IS A DATA LENS?
specialists to review process effectiveness (quality metrics and KPIs) as well as
Like an optical lens, a ‘data
lens’ does not store the data
is ‘sees, but simply provides a
real-time mechanism to view
your data in a different way –
transformed or ‘re-focused’ in
whatever way you specify.
interact directly with exception data for rapid remediation.
A data lens contains a lot of
contextual knowledge about a
product category – the
vocabulary, abbreviations,
rules, attributes, valid values
and so on. This allows the
data lens to recognize
important information in
context – which is important to
avoid the ambiguities and
false positives that plague
non-semantic approaches.
•
Built for business users – Allowing the business users who really understand the
data to create and maintain rules allows for faster and more precise operation than if
IT is forced to be the middle-man in rules creation.
Oracle Product Data Quality – Capabilities
The purpose of Oracle Product Data Quality is to provide an integrated capability to
recognize, cleanse, match, govern, validate, correct and repurpose product data from any
source to any target requirement.
In addition, data lenses
contain category-specific
information on matching,
standards for transformation
and publication, translation
and more.
Anyone who has used a
traditional DQ toolset knows
that dealing with ambiguous
data and false positives is a
never-ending problem as their
pattern-based systems do not
take into account context –
forcing the programmer to
compensate with neverending ‘exceptions’ usually
created in procedural if-thenelse code. Not only is this
expensive to create and
maintain, but it does not even
produce a high quality result.
Anyone who has tried to do
this with other approaches
can easily appreciate the
value of semantic-based
DataLens technology.
Figure - Product Data Quality Server ‘footprint’
The broad functional capabilities of the system are as follows:
•
Semantic Recognition – Each product record is compared to a semantic model that
can recognize and extract required information through reference to the semantic
context. This is done irrespective of variations in input format, punctuation,
terminology etc. and can be applied to both structured and unstructured data. This is
the core requirement for handling product data and without it the variability of
typical product data will cause subsequent processes to fail. In the case where the
Oracle Product Hub is used, semantic metadata from the PIM system will be used –
ensuring that all data loaded to the PIM already conforms to its defined standards.
•
Standardization – Enforce any required PIM standards on product data
2
ORACLE DATA SHEET
o
PRODUCT DATA VS.
CUSTOMER DATA
number of alternate catalogs which can be based on industry or custom
schemas. Supports multiple classifications per item.
It is sometimes claimed that
traditional data quality tools
designed for customer
(name & address) data can
also handle product data,
but this is true only in the
most simplistic of cases. In
reality, product data has a
number of characteristics
that put it beyond the scope
of traditional toolsets.
o
terminology and abbreviations.
o
target character length without truncation an AutoAbbreviate function can
progressively abbreviate the line until it meets the length requirement while
maintaining maximum readability.
o
right and right-to-left.
•
information – product data
reference sources do
exist, but they are highly
fragmented, change
frequently and rarely
represent the same
standards that a particular
company needs to enforce
Matching – Support for many individual and hybrid matching methods from simple
key matching (with or without cleanup) to semantic matching with full control over
mandatory, optional and weighted attributes. A semantic key is created to ensure
high performance when de-duplicating an input file, or when matching against PIM
production tables. Matched records can optionally be merged and cross references
and associations maintained.
• Ambiguity – the meaning
• Lack of reference
Translation – Attributes and descriptions can be translated into multiple
languages with full support for double-byte character sets and parsing left-to-
are tens of thousands of
product categories, each
with its own terminology,
standards, inferences,
brand names, rules and
validations
of some data can only be
determined in context –
either the context of the
category or of the
individual record
Descriptions – Enforce PIM standards on descriptions including word order,
punctuation, terminology etc. For descriptive information that has to fit a
• Variability – Infinite
• Many categories – there
Attributes – Enforce PIM standards on attribute data including unit of
measure conversion, range standardization, including color ranges,
Product data characteristics:
variability even within a
category due to word
order, punctuation,
abbreviations, units of
measure, language etc.
Classification – Classify an item to the PIM master item class and any
•
•
Enrichment – Extract or append information to enrich the item record
o
Internal – extract information embedded in descriptive fields
o
External – append information from legacy systems, subscription sources etc.
Repurposing – Transform product data into any form for storage in the PIM or for
external publication. Capabilities include all those listed under standardization
above and can be called as required to avoid redundant storage in the PIM
production tables.
•
Governance – Data Stewards can access a graphical dashboard of quality metrics
and KPIs to give insight on where and how to drive process improvement. The
dashboard is fully configurable to monitor transactional data within the system as
For these reasons, effective
automation of product data
quality requires a
specialized, semantic-based
approach that is quite
different than that available
in traditional customer data
quality tools.
well as any other source that can be accessed through a SQL query.
3
ORACLE DATA SHEET
RELATED PRODUCTS AND
SERVICES
The Oracle Product Data
Quality Server uses nextgeneration semantic
technology to manage
product data quality from
recognition of data in any
format to cleansing,
standardization, match,
validation, governance and
repurposing. It is available
as an optional module to the
Oracle Product Hub.
ORACLE PRODUCT HUB
The Oracle Product Hub is a
best in class MDM solution
for product data with out of
the box support for verticalspecific business processes
across a variety of industries
Figure - Product Data Quality Dashboard shows KPIs to drive process improvement
•
Exception management – Single console for exception management by data
stewards or product specialists. Exceptions can be routed to different work queues
for manual or automated remediation and validation.
Figure - Product Data Quality Console allows product specialists to verify exception data
Integrated to Oracle Product Hub
Oracle Product Data Quality is designed to integrate easily with any other system or
process in real-time or batch mode and is not dependent on any specific system of record,
however, since many companies are adopting PIM, or MDM for product data systems as
a core component of their ability to manage product data, Oracle Product Data Quality is
pre-integrated into the Oracle Product Hub at a semantic level, that is, semantic metadata
from the Product Hub is used to seamlessly create the required semantic recognition
models. For operational processes such as loading and matching, the Product Data
Quality Server has access to both interface and production data tables for maximum
operational simplicity and ease of use. A semantic key is created to enable high
performance semantic matching against PIM production tables. This level of integration
represents unprecedented co-ordination between a PIM system and a dedicated product
data quality system.
4
ORACLE DATA SHEET
Contact Us
For more information about Oracle Product Data Quality Server please visit oracle.com
or call +1.800.ORACLE1 to speak to an Oracle representative.
Copyright © 2010, Oracle and/or its affiliates. All rights reserved.
This document is provided for information purposes only and the contents hereof are subject to change without notice. This
document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in
law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any
liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This
document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our
prior written permission.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective
owners. 0109
5