
White Paper:
Establishing a Robust Data Migration
Methodology
Prepared by James Chi
© GROM Associates, Inc.
Data Migration Components
Summary
The planning, execution, verification, and documentation of the migration of application data from legacy or source systems to SAP are critical components of any successful SAP project implementation. SAP requires and expects master and transactional data of high quality if the intended process integration benefits are to be realized.
Data Migration is, however, one of the most overlooked aspects of an implementation project. This is partly because so much emphasis is placed on re-engineering the business processes that the quality and accuracy of data often take a lower priority. Based on our implementation experience, however, we would suggest that many SAP implementation projects simply lack the tools and methodologies to systematically identify and perform data migration activities and resolve data quality issues.
Our Recommended Solution
The data migration strategy and methodology described below is the result of an evolutionary process developed over many SAP implementations with multiple clients in various industry verticals. This methodology is intended not only to deliver repeatable, predictable, and demonstrable results, but also to bring visibility to data quality issues early enough in the project to mitigate them.
Let us first introduce the distinct components that make up
a data migration landscape. As illustrated in Figure 1, our
recommended methodology follows the traditional Extract,
Transform, and Load (ETL) data migration component
model.
Data Input Sources
Data for the project implementation come from the sources identified in the functional specifications. The data for loading into SAP either already exist in an electronic format or are manually captured in an approved electronic format. Import data can come from the following sources:
• Source Application Data – Data from source systems are either exported into a comma-delimited text file or copied directly from tables when ODBC database connections are available. Data are extracted out of source applications following the “all rows, all columns” principle, without any data filtering, translation, or formatting (a minimal extraction sketch follows this list).
• Manual Data Collection – Data may be manually collected in situations where source data does not exist. Based on the complexity and referential dependency of the collected data, a “Data Construction Server” database application can be developed to help facilitate the manual data collection and validation process. These data are subsequently provided to the central Data Staging & Transformation tool.
• Manual Data Collection in MS-Excel – In some cases, data that does not exist in the source system(s) is collected manually in MS-Excel spreadsheets. Based on the complexity of the data that is needed, the project team develops and distributes an MS-Excel spreadsheet application to help facilitate the manual data collection process. The data is subsequently uploaded to the central Data Staging & Transformation tool.
• Manual Data Collection in SAP – In certain functional areas where data do not exist in source systems, the project can collect data for SAP manually, directly in the SAP system. It is sometimes advantageous to build SAP data directly in the SAP environment and take advantage of existing predefined data value tables and validation logic. The data is subsequently extracted from SAP and provided to the central Data Staging & Transformation tool.
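To make the “all rows, all columns” principle concrete, the sketch below shows one way a source extract might be produced over an ODBC connection. The DSN, table name, and output file are illustrative assumptions, not prescriptions of the methodology.

    # Minimal sketch of an "all rows, all columns" source extract over ODBC.
    # The DSN, table name, and output path below are illustrative assumptions.
    import csv
    import pyodbc  # third-party ODBC wrapper (pip install pyodbc)

    def extract_table(dsn: str, table: str, out_path: str) -> int:
        """Dump every row and column of a source table to a comma-delimited
        file, with no filtering, translation, or formatting applied."""
        conn = pyodbc.connect(dsn)
        cursor = conn.cursor()
        cursor.execute(f"SELECT * FROM {table}")          # all columns, all rows
        columns = [col[0] for col in cursor.description]  # preserve source names

        row_count = 0
        with open(out_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(columns)
            for row in cursor:
                writer.writerow(row)                      # values written as supplied
                row_count += 1

        conn.close()
        return row_count  # feeds the Source Data Reconciliation Report later

    if __name__ == "__main__":
        n = extract_table("DSN=LegacyERP;UID=readonly;PWD=secret",
                          "CUSTOMER_MASTER", "customer_master_extract.csv")
        print(f"Extracted {n} records")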
Data Staging
All master and transactional data loaded into the SAP
system should be staged in a central Data Staging &
Transformation tool. This repository receives source data
and outputs transformed target data. It contains source
data in its originally supplied form, all the rules to convert,
translate and format this data into the destination format,
and intermediate tables required for data migration
processing. The output from the central Data Staging &
Transformation tool is used as the source of data loads
into SAP.
Applications such as BackOffice’s Data Staging
Warehouse, IBM’s WebSphere Information Integration
(formerly Ascential Software), Informatica’s PowerCenter,
and other commercial ETL tools are designed for the
purpose of extracting, transforming, and loading data.
These tools should be leveraged on projects where
available. On projects where a commercial ETL tool is not
available, native database tools such as Microsoft’s DTS
or Oracle’s Warehouse Builder can be used as well.
Once staged in their original or approved collection format,
all data is filtered, translated, and formatted in a traceable
and reportable fashion via execution of individual data
rules in the central Data Staging & Transformation Tool.
Exceptions to this rule should only be permitted for
manually entered data objects.
Data Export Destination Programs
Data is exported from the central Data Staging & Transformation tool into SAP via standard SAP data conversion methods and tools. These conversion methods and tools are:
• LSMW – Legacy System Migration Workbench
• BDC Programs – Batch Data Communication (batch input) programs
• CATT – Computer Aided Test Tool
• Manual Input
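As a hedged illustration of how transformed data might be handed to one of these mechanisms, the sketch below writes records to a simple tab-delimited flat file of the kind an LSMW object can read. The field list, delimiter, and file name are assumptions for illustration; the actual layout is defined per object within LSMW.

    # Illustrative only: write staged, transformed records to a tab-delimited
    # file for upload via LSMW. The field list and file name are assumptions;
    # the real layout is dictated by the LSMW object definition for each object.
    import csv

    MATERIAL_FIELDS = ["MATNR", "MAKTX", "MEINS", "MTART"]  # assumed target fields

    def write_lsmw_file(records: list[dict], out_path: str) -> None:
        with open(out_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f, delimiter="\t")
            writer.writerow(MATERIAL_FIELDS)
            for rec in records:
                writer.writerow([rec.get(field, "") for field in MATERIAL_FIELDS])

    # Example usage with one illustrative record:
    write_lsmw_file(
        [{"MATNR": "MAT-000100", "MAKTX": "Hex bolt M8", "MEINS": "EA", "MTART": "ROH"}],
        "material_master_load.txt",
    )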
Comprehensive Data Migration Process
Let us now describe the steps involved in a robust and comprehensive data migration process. The overall process is illustrated in Figure 2 (preceding page).
In order to ensure ongoing execution, troubleshooting, and problem resolution throughout the data conversion test cycles described in the next section, “Data Migration Approach and Methodology”, the Systematic Data Migration Process is followed for each data test run. Following is a high-level overview of the process.
Step 1: Extraction of Source Data
The conversion starts with the extraction of source data.
This extraction, depending upon its source, may be a direct
ODBC connection, a spreadsheet or file created
programmatically, or a manually loaded spreadsheet. In
all cases, the extract of source data must be accompanied
by a report that details the contents. A Source Data
Reconciliation Report should be produced for each extract
and must indicate the total number of records contained in
the source table. Other metrics should be supplied for key
data fields such as sums, totals, or hash totals of data
columns contained in the source table. This information
will be very important in demonstrating that the source
data has been completely and accurately imported into the
central Data Staging & Transformation tool.
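A minimal sketch of the metrics such a Source Data Reconciliation Report might carry is shown below: a record count, a control sum over a numeric column, and an order-independent hash total over a key column. The column names (QUANTITY, MATERIAL_ID) are illustrative assumptions.

    # Sketch of Source Data Reconciliation Report metrics: total record count,
    # a control sum over a numeric column, and an order-independent hash total
    # over a key column. The column names used here are assumptions.
    import csv
    import hashlib

    def reconciliation_metrics(extract_path: str) -> dict:
        record_count, quantity_sum, hash_total = 0, 0.0, 0
        with open(extract_path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                record_count += 1
                quantity_sum += float(row.get("QUANTITY") or 0)
                digest = hashlib.sha256(row["MATERIAL_ID"].encode("utf-8")).hexdigest()
                hash_total = (hash_total + int(digest, 16)) % (1 << 64)
        return {
            "record_count": record_count,   # total rows in the source extract
            "quantity_sum": quantity_sum,   # control total for a key numeric field
            "key_hash_total": hash_total,   # order-independent hash total
        }

    # The same metrics are recomputed after the upload into the staging tool and
    # compared against this report to show the import was complete and accurate.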
Step 2 – 3: Upload of Extracted Data into the
Data Staging and Transformation Tool
The next step in the migration process begins the upload
of data from source applications and manual collection
repositories in their native format into the central Data
Staging & Transformation tool. It is critical for all data to
be imported into the staging tool in an “as-is” format. All
source application tables and/or spreadsheet rows and
columns are imported into the staging tool. This ensures
that all data record filtering, translation, and formatting
operations are performed in the staging tool in an
approved, auditable, traceable, and reportable fashion via
execution of individual conversion rules.
Data reconciliation activities are performed. All results are gathered and compared to the Source Data Reconciliation Report. Results are provided to data owners for review and approval.
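The sketch below illustrates an “as-is” upload, with every source column landed as text and no business logic applied. SQLite stands in here for the central Data Staging & Transformation tool, and all names are illustrative assumptions.

    # Sketch of an "as-is" upload: every source column lands in the staging
    # table as text, with no filtering, translation, or formatting. SQLite
    # stands in for the staging tool; table and file names are illustrative.
    import csv
    import sqlite3

    def stage_extract(extract_path: str, db_path: str, staging_table: str) -> int:
        with open(extract_path, newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            columns = next(reader)
            rows = list(reader)

        conn = sqlite3.connect(db_path)
        col_defs = ", ".join(f'"{c}" TEXT' for c in columns)  # keep source names, all text
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{staging_table}" ({col_defs})')
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(f'INSERT INTO "{staging_table}" VALUES ({placeholders})', rows)
        conn.commit()

        (count,) = conn.execute(f'SELECT COUNT(*) FROM "{staging_table}"').fetchone()
        conn.close()
        return count  # compared against the Source Data Reconciliation Report

    # Example: stage_extract("customer_master_extract.csv", "staging.db", "SRC_CUSTOMER_MASTER")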
Step 4: Data Quality Checkpoint One
Once the data has been successfully migrated to the
central Data Staging & Transformation tool, it can now be
subject to a variety of quality and integrity checks to
identify source data issues that can either be resolved in
the staging tool as a transformation rule or be resolved
back in the source system. Data records that do not pass
key quality or integrity checks should be flagged as such
and omitted from subsequent transformation and loading
steps, and directed to Data Owners for correction.
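As an illustration of such a checkpoint, the sketch below flags staged records that fail two simple integrity rules (a missing key and a duplicate key) so they can be omitted downstream and reported to Data Owners. The table, column, and rule choices are assumptions.

    # Sketch of a quality checkpoint: flag staged records with a missing or
    # duplicate key so they are excluded from transformation and loading and
    # reported to the Data Owners. Table and column names are assumptions.
    import sqlite3

    def quality_checkpoint_one(db_path: str) -> list[tuple]:
        conn = sqlite3.connect(db_path)
        try:
            conn.execute('ALTER TABLE "SRC_CUSTOMER_MASTER" ADD COLUMN "QC_FLAG" TEXT')
        except sqlite3.OperationalError:
            pass  # flag column already exists on a re-run
        # Rule 1: key field must be populated.
        conn.execute(
            'UPDATE "SRC_CUSTOMER_MASTER" SET "QC_FLAG" = \'MISSING_KEY\' '
            'WHERE "CUSTOMER_ID" IS NULL OR "CUSTOMER_ID" = \'\''
        )
        # Rule 2: key field must be unique.
        conn.execute(
            'UPDATE "SRC_CUSTOMER_MASTER" SET "QC_FLAG" = \'DUPLICATE_KEY\' '
            'WHERE "CUSTOMER_ID" IN (SELECT "CUSTOMER_ID" FROM "SRC_CUSTOMER_MASTER" '
            'GROUP BY "CUSTOMER_ID" HAVING COUNT(*) > 1)'
        )
        conn.commit()
        # Exception list for the Data Owners; flagged rows are omitted downstream.
        failures = conn.execute(
            'SELECT "CUSTOMER_ID", "QC_FLAG" FROM "SRC_CUSTOMER_MASTER" '
            'WHERE "QC_FLAG" IS NOT NULL'
        ).fetchall()
        conn.close()
        return failures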
Step 5 – 6: Transformation of Staged Data
Once staged in the central Data Staging & Transformation
tool, the source data is modified according to data filtering
rules. Data filtering refers to reducing the dataset based
upon rules documented in the functional specifications.
This filtering is performed in order to ensure that only
active and relevant data are loaded into SAP. Once the
source data has been filtered, data translation and
formatting rules are performed. Data translation refers to replacing source system coding, groupings, and other source system application data characteristics with the corresponding SAP coding, groupings, and data characteristics as established in the Design Specifications. Data formatting refers to converting the source data from its original record format to a format that can be read by the SAP data upload programs for loading into SAP. These data staging rules define the main transformation of the filtered source data into data that is coded and formatted for SAP upload purposes. All data filtering, translation, and formatting rules are based on criteria documented in the functional specifications. Data reconciliation activities are performed to verify that all required business rules defined in the functional specifications have been completely and accurately applied.
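The sketch below illustrates the three rule types in miniature: a filter rule that drops inactive records, a translation rule that maps a legacy code to its SAP equivalent, and a formatting rule that renders a key in the length the SAP load program expects. The status values, code mapping, and field lengths are illustrative assumptions, not actual Design Specifications.

    # Sketch of the three rule types applied in the staging tool:
    #   filter      -> keep only active, relevant records
    #   translation -> map legacy codes to SAP equivalents per the design spec
    #   formatting  -> render values in the layout the SAP load program expects
    # The status codes, mapping table, and 10-character key length are assumptions.

    LEGACY_TO_SAP_UOM = {"EACH": "EA", "CASE": "CS", "KILO": "KG"}  # assumed mapping

    def filter_rule(record: dict) -> bool:
        # Only active records are relevant for loading into SAP.
        return record.get("STATUS") == "ACTIVE"

    def translation_rule(record: dict) -> dict:
        # Replace the legacy unit-of-measure code with its SAP counterpart.
        record["MEINS"] = LEGACY_TO_SAP_UOM.get(record.pop("UOM", ""), "")
        return record

    def formatting_rule(record: dict) -> dict:
        # Example SAP format: material numbers zero-padded to 10 characters.
        record["MATNR"] = record.pop("MATERIAL_ID", "").zfill(10)
        return record

    def transform(staged_records: list[dict]) -> list[dict]:
        return [
            formatting_rule(translation_rule(dict(rec)))
            for rec in staged_records
            if filter_rule(rec)
        ]

    print(transform([{"MATERIAL_ID": "100", "UOM": "EACH", "STATUS": "ACTIVE"},
                     {"MATERIAL_ID": "200", "UOM": "CASE", "STATUS": "OBSOLETE"}]))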
Step 7: Data Quality Checkpoint Two
Once the data has been successfully filtered, translated, and formatted, the resulting dataset can be subjected to another set of quality and integrity checks aimed at identifying target data integrity and completeness issues. These issues can be resolved in the staging tool as a transformation rule, resolved in SAP in the case of configuration issues, or resolved back in the source system. Data records that do not pass key quality or integrity checks should be flagged as such, omitted from subsequent loading steps, and directed to Data Owners for correction.
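As an illustration of a target-side check at this checkpoint, the sketch below verifies that translated values fall within a configured SAP value set (represented here by an extracted check list) and that mandatory target fields are populated. The field names and value set are assumptions.

    # Sketch of a second quality checkpoint, run against the transformed data:
    # translated values must exist in the SAP configuration (a check-table
    # extract stands in for it here) and mandatory target fields must be
    # populated. Field names and the configured value set are assumptions.

    SAP_CONFIGURED_UOMS = {"EA", "CS", "KG"}        # e.g. extracted from SAP config
    MANDATORY_FIELDS = ("MATNR", "MAKTX", "MEINS")

    def quality_checkpoint_two(records: list[dict]) -> tuple[list[dict], list[dict]]:
        passed, failed = [], []
        for rec in records:
            issues = []
            if rec.get("MEINS") not in SAP_CONFIGURED_UOMS:
                issues.append("UOM not configured in SAP")
            for field in MANDATORY_FIELDS:
                if not rec.get(field):
                    issues.append(f"missing mandatory field {field}")
            if issues:
                failed.append({**rec, "QC_ISSUES": issues})  # routed to Data Owners
            else:
                passed.append(rec)                           # eligible for loading
        return passed, failed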
Step 8 – 10: Loading of Target Data into SAP
& Final Verification
Subsequent to the successful completion of data quality checks, translated and formatted data will be loaded into SAP via any of the mechanisms described under the “Data Export Destination Programs” section of this document and verified for accuracy and completeness. This verification will involve a combination of visual inspection and technical checks, including record counts, sums, and/or hash totals of data columns contained in the export files and SAP tables.
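A minimal sketch of the technical side of this verification is shown below, comparing record counts and an order-independent hash total between the export file and an extract of the loaded SAP table. The file names and key column are illustrative assumptions; in practice the SAP-side figures may come directly from table counts or report output.

    # Sketch of the post-load technical verification: compare record counts and
    # an order-independent hash total between the staging export file and an
    # extract of the loaded SAP table. File names and the key column are assumptions.
    import csv
    import hashlib

    def control_totals(path: str, key_column: str) -> tuple[int, int]:
        count, hash_total = 0, 0
        with open(path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                count += 1
                digest = hashlib.sha256(row[key_column].encode("utf-8")).hexdigest()
                hash_total = (hash_total + int(digest, 16)) % (1 << 64)
        return count, hash_total

    # Example comparison (file names are illustrative):
    # export_totals = control_totals("material_export.csv", "MATNR")
    # sap_totals = control_totals("sap_mara_extract.csv", "MATNR")
    # print("Load verified" if export_totals == sap_totals else "Discrepancy - investigate")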
Data Migration Approach and Methodology
Now that we have introduced both the data migration landscape components and the overall process, we can position how they fit into the lifecycle of an SAP implementation project.
What follows is a description of the various data migration
activities as they are executed throughout the ASAP
methodology.
Project Preparation – The purpose of this phase is to provide initial preparation and planning for the SAP implementation project. The important data migration issues addressed during the project preparation phase are:
• Finalization of data migration approach and methodology
• Selection / specification of ETL tool
• Validation requirements
Business Blueprint – Define the business processes to be supported by the SAP system and the functional requirements. In this phase, data migration activities begin with the identification of Business Objects that require conversion from the source application to the SAP system.
Depending on the complexity of the legacy data, it may
also be advantageous to assess the quality and integrity
of the legacy source data during this period.
Realization – Design, build, and test the system based upon the requirements described in the functional specifications. Included in this phase are several data migration process development and testing cycles.
During the early part of realization, functional
specifications are developed for the data conversion
objects identified during requirements gathering. These specifications serve as the basis for determining
which conversion mechanisms are used and provide
additional functional conversion program development and
testing details for a given data object.
The project team develops all required data conversion
rules and programs.
These conversion rules and
programs are tested in the Q/A Test client as illustrated
below:
Final Preparation – Complete end user training, system management, and cutover activities. Data migration activities performed during the cutover include:
• SAP system configuration and development object migration via the SAP Correction and Transport System
• Reconciliation and Quality Checkpoint Reporting
• Manual SAP system configuration steps
• Automated data conversion activities in SAP
• Manual data conversion activities in SAP
• Cutover-relevant business activities
• Cutover-relevant source system activities
Go Live and Support – The purpose of this phase is the transition from the pre-production environment to live production operation; it is used to closely monitor system transactions and to optimize system performance. From a data migration perspective, any post-go-live issues related to data should be investigated, resolved, and closed.
About the Author
James Chi is the Director of GROM’s Business Consulting Group Industry Solutions Practice and has overall delivery responsibility for all GROM-led projects. James joined GROM after spending seventeen years delivering SAP solutions in the pharmaceutical, medical device, and consumer products industries, including at Johnson & Johnson. His strong functional background in Supply Chain Planning and Manufacturing Execution, combined with more than fifteen years of Project Management experience, makes him a well-rounded business expert. James has a BE in Electrical Engineering from Stevens Institute of Technology.