
IDC ANALYST CONNECTION
Carl Olofson
Research Vice President
Driving Digital Transformation with
Analytic-Transactional Processing in
In-Memory Databases
February 2016
With the ever-increasing volume, variety, and velocity of data coming at enterprises today, there is
increasing pressure to exploit that data not only for offline strategic decisions or tactical adjustments
but also for decisions "in the moment." The ability to drive or, in some cases, to automate operational
decisions represents an advanced stage of digital transformation for the enterprise. Achieving
in-the-moment decision making requires that both analytics and transactions be performed on a single
data platform. That platform needs to handle a variety of data types and execute complex queries and
transactions very quickly. IDC calls this technology an "analytic-transactional data platform."
The following questions were posed by SAP to Carl Olofson, research vice president for IDC's
Database Management Software service, on behalf of SAP's customers.
Q. How has memory-optimized database management system (DBMS) technology transformed IT?
A. Until relatively recently, database systems were based on disk optimization, which means that the data resided on disk and most of the technology was meant to optimize access to that data on disk. You had to reduce, as far as possible, the number of input/output operations (I/Os) you
needed to perform in order to make the database run as fast as possible while preserving
data integrity. This approach was necessary because memory was tight and expensive, so
the economics demanded a disk-based approach. All that started to change when memory addressability went from 32 bits to 64 bits. We also
have much faster multicore processors, which have gotten cheaper of course, delivering
more power for less money.
Memory has gotten cheaper, too, and this has made it possible to use large amounts of it in
an affordable way. Data can now be stored in memory, or a combination of disk and memory,
so that its organization is optimized for being managed in memory as opposed to being
managed on a disk. The impact of this is huge. It means that the DBMS itself is operationally
much simpler because you don't have to fret over disk volume allocation and defining
secondary indexes. Greater operational simplicity results in significant cost savings. There's
much less third-shift work, for example. Administrators can spend far less time tuning the database and more time adding value to it.
Memory-optimized databases deliver much greater performance than disk-optimized databases because they are far less sensitive to I/O. Because of the speed of access to
in-memory data, you get much greater analytical power to execute more complex analytic
queries and have them completed in a very short period of time.
Thus, organizations can now build applications to execute transactions with analytics built in,
which we could never do before, because the analytics would take too long. You can't have a
transactional app taking 20 minutes per transaction because 19 minutes and 59 seconds of
that is an analytical query. But when the analytical query can execute in one-tenth of a
second, then it becomes reasonable to embed that query in transaction processing and make
our transactional apps much smarter as a result.
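To make the point concrete, here is a minimal sketch of embedding an analytic query inside a transaction. It uses Python with the standard library's in-memory SQLite purely as a stand-in for a memory-optimized platform; the table names and reorder threshold are invented for illustration.

import sqlite3

# Illustration only: in-memory SQLite stands in for a memory-optimized
# analytic-transactional platform; schema and threshold are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (sku TEXT, qty INTEGER);
    CREATE TABLE replenishment (sku TEXT, qty INTEGER);
""")

REORDER_THRESHOLD = 100  # hypothetical business rule

def place_order(sku, qty):
    # One transaction: the analytic aggregate runs inline and its result
    # changes what the same commit does.
    with conn:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (sku, qty))
        demand = conn.execute(
            "SELECT SUM(qty) FROM orders WHERE sku = ?", (sku,)
        ).fetchone()[0]
        if demand >= REORDER_THRESHOLD:
            conn.execute("INSERT INTO replenishment VALUES (?, ?)", (sku, demand))

place_order("A1", 60)
place_order("A1", 45)  # cumulative demand crosses the threshold
print(conn.execute("SELECT * FROM replenishment").fetchall())

The shape is what matters: the aggregate runs in the same sub-second code path that writes the order, so no separate analytic system has to be consulted.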
Q. What are some different kinds of memory-optimized DBMS technology?
A. There have been memory-optimized databases for a long time, but they were typically very small and special-purpose systems. On Wall Street, for instance, there would be memory-optimized databases that held only certain critical information for trading systems that needed
to retrieve data very quickly. But we're talking about kilobyte- or maybe megabyte-sized
databases, nothing huge. The earlier fast, general-purpose databases that were memory-optimized for transactions were linked directly to an application, so they weren't multiuser.
These memory-optimized databases delivered speed, and that's about all.
Now we're seeing more memory-optimized databases that are multiuser and can leverage
secondary storage to manage more data than will fit in memory. That's becoming pretty
common, but memory-optimized databases in the transactional category usually still keep
data in row format. They can also scale across clusters, something that wasn't possible
10 years ago, thanks to high-speed networking through Fibre Channel, InfiniBand, and
10 Gigabit Ethernet technology.
Where analytics are involved, however, we tend to use a separate database specifically
designed for high-speed complex query processing. Analytic memory-optimized databases
usually manage the data column-wise, rather than row-wise, using special compression
technology and a processor optimization called SIMD (single instruction, multiple data) to
deliver very fast query results. Typically, such optimizations are not useful in a transactional
context. So, organizations end up with two databases: one that does the transactions, and
one that does the analytics. You must move the data back and forth, which creates a certain
amount of operational complexity in the data movement as well as in duplicating data.
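As a rough illustration of the row-wise versus column-wise distinction, here is a simplified sketch in plain Python; real columnar engines add dictionary compression and SIMD execution, which this does not attempt, and the data is invented.

# Row store: each record kept together -- good for transactional writes
# and point lookups on whole records.
rows = [
    {"order_id": 1, "region": "EMEA", "amount": 120.0},
    {"order_id": 2, "region": "APJ", "amount": 75.5},
    {"order_id": 3, "region": "EMEA", "amount": 210.0},
]

# Column store: each attribute kept as its own contiguous array -- good
# for scans and aggregates that touch only a few columns.
columns = {
    "order_id": [1, 2, 3],
    "region": ["EMEA", "APJ", "EMEA"],
    "amount": [120.0, 75.5, 210.0],
}

# The analytic query "total amount per region" touches only two columns,
# so the columnar layout scans far less data than reading whole rows.
totals = {}
for region, amount in zip(columns["region"], columns["amount"]):
    totals[region] = totals.get(region, 0.0) + amount
print(totals)  # {'EMEA': 330.0, 'APJ': 75.5}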
The ideal — especially if you want to blend transactions and analytics together — is to
have a single database capable of doing both transactions and analytics without duplicating
data. That's what we see emerging now.
Q. Why is it important to blend analytics and transactions in the same database?
A. For practically the entire history of relational databases, we've had databases dedicated to transaction processing and databases dedicated to analytical query processing. The problem with that, of course, is that you have to maintain duplicate data. There's also latency with the amount
of time it takes to move data from the live transaction database to the analytic database.
That used to mean you could do analytic queries only on data that was old, maybe by a day or a week. There is value in that if all you want to do is future planning, but
not if you want or need to make decisions in the moment, on the spot, while business is being
transacted. Then you really need current data. Today, for instance, there are technologies
that allow you to move data from the transactional environment to the analytical environment
very quickly so that maybe the data is only a few minutes old.
With that capability, you can run analytical queries in parallel with transactional systems,
which certainly delivers some value. But you can't actually use the analytical queries as part
of the transaction processing — and that's the issue. The only way to do that effectively is to perform transactions and analytics on a single copy of the data in the same database. Blending
analytic and transactional data in a single data platform, anchored by a memory-optimized
database, makes a new kind of application possible.
Q. What is this new kind of application?
A. This new application is what you might call a smart or agile application. It's an application that
can adjust its behavior to current business conditions. It can make choices — for instance, a
sales application that can dynamically adjust pricing based on demand or consumption
patterns even as it's making sales offers to customers.
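A minimal sketch of that kind of pricing rule, in Python; the thresholds and adjustment factors are invented, and in a real system the demand and inventory figures would come from analytic queries over live transactional data rather than plain function arguments.

def quote_price(list_price, recent_demand, on_hand):
    # Hypothetical dynamic-pricing rule for a smart sales application.
    # recent_demand and on_hand would in practice be analytic query results
    # (e.g., units sold in the last hour and current inventory).
    price = list_price
    if recent_demand > on_hand:            # demand outstripping supply
        price *= 1.15
    elif recent_demand < on_hand // 10:    # slow-moving stock
        price *= 0.90
    return round(price, 2)

print(quote_price(100.0, recent_demand=250, on_hand=200))  # 115.0
print(quote_price(100.0, recent_demand=10, on_hand=200))   # 90.0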
It could be a supply chain application that manages volume and price quotes to suppliers based on the current flow of business through the system. In financial
services, it might be an application that can respond to trading going on right now and remain
current as of the last committed transaction. These agile applications can make smart
decisions faster than any human could because when the human has to make the decision,
that means the application must stop and wait for the human to analyze the information.
Now, there always will be times when a human judgment call is required. But you want those
to be the exception in most cases, not the rule.
Q. What does this smart application require?
A. For one thing, it needs to be structured in a way that is fundamentally different from that of a
classic transactional application. This is because classic transactional apps assume a particular
fixed order of doing things; there's no flexibility based on how particular conditions might vary. So
a smart app needs to be designed to be able to respond to the data that it gets back from an
analytic query. It also needs to see both the current transactional data and the current contextual
data, which is data that might be brought in from a data warehouse or some other business
intelligence data collection. You may also include data gathered in Hadoop or another large data
store, which could include machine-generated data and/or social media data.
Because the application will likely need to consult data from a variety of sources, you need the app to work with a data platform that has those various data sources integrated into
it. The data platform optimizes the retrieval and delivery of the data to the application, instead
of the application having to do it. This ensures that the application itself can be much simpler.
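As a sketch of what that looks like from the application's side, assume live transactional rows and contextual data sit on the same platform, so the app issues one blended query instead of stitching sources together itself. Python and in-memory SQLite again serve only as stand-ins, and the tables are invented.

import sqlite3

conn = sqlite3.connect(":memory:")
# Live transactional data and contextual data (e.g., loaded from a data
# warehouse or another store) sit side by side on the same platform.
conn.executescript("""
    CREATE TABLE sales (customer_id INTEGER, amount REAL);
    CREATE TABLE context (customer_id INTEGER, segment TEXT, churn_risk REAL);
    INSERT INTO sales VALUES (1, 50.0), (1, 75.0), (2, 20.0);
    INSERT INTO context VALUES (1, 'enterprise', 0.1), (2, 'smb', 0.6);
""")

# One query blends current transactions with contextual attributes; the
# application never moves or duplicates the data itself.
for row in conn.execute("""
    SELECT c.segment, SUM(s.amount) AS revenue, AVG(c.churn_risk) AS risk
    FROM sales s JOIN context c USING (customer_id)
    GROUP BY c.segment
"""):
    print(row)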
Smart applications must be designed to be flexible, making decisions about how transactions
are to be conducted as they operate. Now, what that requires is an integrated data platform
that uses memory optimization to ensure very high performance for complex queries that
may include not only live transactional data but also contextual data from data warehouses
and other data sources that may come in through, for instance, a document-oriented
database. In addition, if the data platform can process various types of data such as text,
spatial, streaming, and time series and offer advanced processing capabilities such as
search, prediction, and machine learning, then you can build applications that were
previously unimaginable.
The foundation of these new agile, smart applications in any case is an analytic-transactional
data platform.
ABOUT THIS ANALYST
Carl Olofson has performed research and analysis for IDC since 1997 and manages IDC's Database Management
Software service, as well as advising and guiding the Data Integration Software service. Mr. Olofson's research involves
following sales and technical developments in the structured data management (SDM) markets, including database
management systems (DBMSs), database development and management software, and data integration and access
software, including the vendors of related tools and software systems.
ABOUT THIS PUBLICATION
This publication was produced by IDC Custom Solutions. The opinion, analysis, and research results presented herein
are drawn from more detailed research and analysis independently conducted and published by IDC, unless specific vendor
sponsorship is noted. IDC Custom Solutions makes IDC content available in a wide range of formats for distribution by
various companies. A license to distribute IDC content does not imply endorsement of or opinion about the licensee.
COPYRIGHT AND RESTRICTIONS
Any IDC information or reference to IDC that is to be used in advertising, press releases, or promotional materials requires
prior written approval from IDC. For permission requests, contact the IDC Custom Solutions information line at 508-988-7610
or [email protected]. Translation and/or localization of this document require an additional license from IDC.
For more information on IDC, visit www.idc.com. For more information on IDC Custom Solutions, visit
http://www.idc.com/prodserv/custom_solutions/index.jsp.
Global Headquarters: 5 Speen Street Framingham, MA 01701 USA P.508.872.8200 F.508.935.4015 www.idc.com