Infinispan, transactional key-value DataGrid and NoSQL

• Sr. Consultant at Inmeta Consulting
• Current project: Skatteetaten Grid POC
• Previous projects involving grid technologies:
  • Mattilsynet food authority system
  • FrameSolution BPM framework, used in Lovisa National Court Authority (Norway) and Mattilsynet Food Authority
• Other noteworthy projects:
  • Coca Cola Basis ERP system – Coca Cola Bottler factories
  • mPower Mobilitec: 300 million subscribers worldwide, delivering over 500,000 pieces of content every day
• Big data. Databases are slow. Memory is FAST!
• Provides huge computing power.
• Tax calculation
• Financial organizations
• Government organizations use it for communication and data sharing between the different departments.
• Scientific computations
• MMORPG games
• General terminology relevant to distributed caching
• Challenges related to introducing distributed caching to an existing system
• Metrics and tuning
• Cache: JSR-107
• Java Data Grid: JSR-347
• In-memory data grid
• Cluster
• Distribution
• Node – a member of a cluster
• Transaction awareness
• Colocation
• Map / Reduce
• Consistency
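The terms cluster, node, distribution and colocation can be tied together with a small sketch. This is only an illustration under stated assumptions: the class `KeyDistributionSketch`, the node names and the modulo-based placement are hypothetical and much simpler than the consistent hashing a real data grid uses. Each key is owned by `numOwners` nodes, and keys that hash to the same primary node are colocated.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: maps a key to the nodes that own it in a distributed cache. */
class KeyDistributionSketch {
    private final List<String> nodes; // cluster members, in a fixed order
    private final int numOwners;      // how many copies of each entry are kept

    KeyDistributionSketch(List<String> nodes, int numOwners) {
        if (numOwners < 1 || numOwners > nodes.size()) {
            throw new IllegalArgumentException("numOwners must be between 1 and the cluster size");
        }
        this.nodes = new ArrayList<>(nodes);
        this.numOwners = numOwners;
    }

    /** Returns the primary owner followed by the backup owners for the key. */
    List<String> ownersOf(Object key) {
        // Simplification: plain modulo hashing instead of a consistent hash.
        int primary = Math.floorMod(key.hashCode(), nodes.size());
        List<String> owners = new ArrayList<>();
        for (int i = 0; i < numOwners; i++) {
            owners.add(nodes.get((primary + i) % nodes.size()));
        }
        return owners;
    }

    public static void main(String[] args) {
        KeyDistributionSketch grid =
                new KeyDistributionSketch(List.of("node-A", "node-B", "node-C"), 2);
        System.out.println("owners of key 4: " + grid.ownersOf(4));
        System.out.println("owners of key 5: " + grid.ownersOf(5));
    }
}
```

With modulo placement, adding a node moves most keys to new owners, which is one reason real grids prefer consistent hashing.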
• Transaction scope
• Locking/deadlocking
• Flushing policies
• Mixing the technology stack
• Performance
• Wow, we did it!
• Our custom cache is super fast, but its cache hit ratio is rather low.
• Our custom cache tends to get dirty because updates to the shared data cannot be propagated. At the same time, the data regions are not fully separated.
• Marshalling is a rather slow and heavy process.
• We are facing a technological cocktail and we need to keep integrity.
• Write-through
• Write-behind
• Replication queue
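The difference between write-through and write-behind can be sketched in a few lines. This is an illustration only: `FlushPolicySketch` and its methods are hypothetical names, the `store` map stands in for the database, and the explicit `flush()` call stands in for the background thread a real grid would use.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch contrasting write-through and write-behind flushing. */
class FlushPolicySketch {
    final Map<String, String> cache = new HashMap<>();
    final Map<String, String> store = new HashMap<>();           // stands in for the database
    private final Queue<String> dirtyKeys = new ArrayDeque<>();  // keys waiting to be flushed

    /** Write-through: the store is updated before the call returns. */
    void putWriteThrough(String key, String value) {
        cache.put(key, value);
        store.put(key, value);
    }

    /** Write-behind: only the cache is updated; the store catches up on flush(). */
    void putWriteBehind(String key, String value) {
        cache.put(key, value);
        dirtyKeys.add(key);
    }

    /** Writes queued modifications to the store and returns how many were written. */
    int flush() {
        int flushed = 0;
        String key;
        while ((key = dirtyKeys.poll()) != null) {
            store.put(key, cache.get(key));
            flushed++;
        }
        return flushed;
    }

    public static void main(String[] args) {
        FlushPolicySketch grid = new FlushPolicySketch();
        grid.putWriteThrough("a", "1");
        grid.putWriteBehind("b", "2");
        System.out.println("store before flush: " + grid.store);
        grid.flush();
        System.out.println("store after flush:  " + grid.store);
    }
}
```

Write-behind makes writes fast for the caller, at the price of a window in which the store lags behind the cache.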
• Eviction
  • Least Recently Used (LRU)
  • First In First Out (FIFO)
  • LIRS
  • Custom
• Expiration
• Invalidation
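Eviction and expiration are easy to confuse, so here is a minimal sketch of both. It assumes a hypothetical class `LruExpiringCacheSketch` built on `java.util.LinkedHashMap`; the current time is passed in explicitly so the behaviour is deterministic. Eviction removes the least recently used entry when capacity is exceeded, while expiration removes an entry whose lifespan has passed, however recently it was used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: LRU eviction via LinkedHashMap plus lifespan-based expiration. */
class LruExpiringCacheSketch<K, V> {
    /** A cached value together with the time at which it expires. */
    private static final class TimedValue<T> {
        final T value;
        final long expiresAtMillis;

        TimedValue(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final long lifespanMillis;
    private final Map<K, TimedValue<V>> entries;

    LruExpiringCacheSketch(final int maxEntries, long lifespanMillis) {
        this.lifespanMillis = lifespanMillis;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<K, TimedValue<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, TimedValue<V>> eldest) {
                return size() > maxEntries; // eviction: capacity exceeded
            }
        };
    }

    void put(K key, V value, long nowMillis) {
        entries.put(key, new TimedValue<>(value, nowMillis + lifespanMillis));
    }

    V get(K key, long nowMillis) {
        TimedValue<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (nowMillis >= entry.expiresAtMillis) { // expiration: entry is too old
            entries.remove(key);
            return null;
        }
        return entry.value;
    }

    int size() {
        return entries.size();
    }
}
```

A FIFO policy would be the same structure with `accessOrder` set to `false`, so that reads no longer refresh an entry's position.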
Reference data vs. transactional data
• Reference data: Good. Max 30,000 reads/sec at 1 KB size.
• Transactional data: Good. Max 25,000 writes/sec at 1 KB size.

• Reference data: Good. 30,000 reads/sec per server. Grows linearly by adding servers.
• Transactional data: Not so good. Max 20,000 writes/sec. Drops to 2,500 if you add a 3rd server.
• Reference data (1 KB): Good. 30,000 reads/sec per server. Grows linearly by adding servers.
• Transactional data (1 KB): Good. 20,000 writes/sec per server. Grows linearly by adding servers.
• What is the size of our cluster? Reads vs. writes
• Communication inside our grid
  • UDP, TCP
• Synchronous vs. asynchronous
• What about the transaction isolation?
  • Repeatable Read vs. Read Committed
• What is the nature of our application?
• Read-intensive data
  • CMS systems
• Write-intensive data
  • Document management system
• Level 1 (L1) cache is supported only in distribution mode.
• Level 1 cache might have a performance impact in certain systems.
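Why an L1 cache can help reads and still cost performance can be sketched as follows. The class `L1CacheSketch` is hypothetical, and a plain map stands in for the remote node that owns the key. Repeated reads are served locally, but every change on the owner has to invalidate the local copies, which is extra work in write-heavy systems.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of an L1 (near) cache in front of a remote owner node. */
class L1CacheSketch {
    private final Map<String, String> remoteOwner; // stands in for the node owning the keys
    private final Map<String, String> l1 = new HashMap<>();
    int remoteReads = 0; // counts simulated network round trips

    L1CacheSketch(Map<String, String> remoteOwner) {
        this.remoteOwner = remoteOwner;
    }

    String get(String key) {
        String local = l1.get(key);
        if (local != null) {
            return local; // served from L1, no network hop
        }
        remoteReads++;
        String value = remoteOwner.get(key);
        if (value != null) {
            l1.put(key, value);
        }
        return value;
    }

    /** Called when the owner changes the key; this is the extra cost paid on writes. */
    void invalidate(String key) {
        l1.remove(key);
    }

    public static void main(String[] args) {
        Map<String, String> owner = new HashMap<>();
        owner.put("k", "v1");
        L1CacheSketch node = new L1CacheSketch(owner);
        node.get("k");
        node.get("k");
        System.out.println("remote reads after two gets: " + node.remoteReads);
    }
}
```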
• Passivation
• Activation
• Hibernate
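Passivation and activation can be pictured as the two directions of traffic between memory and a cache store. The sketch below is an assumption-laden illustration: `PassivationSketch` is a hypothetical class, a map stands in for the disk-based store, and entries are chosen for passivation in LRU order. An entry is written to the store only when it is evicted from memory, and it is removed from the store again when it is activated.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: evicted entries are passivated to a store and activated on access. */
class PassivationSketch {
    final Map<String, String> store = new HashMap<>(); // stands in for a disk-based cache store
    private final LinkedHashMap<String, String> memory;

    PassivationSketch(final int maxInMemory) {
        memory = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() > maxInMemory) {
                    store.put(eldest.getKey(), eldest.getValue()); // passivation
                    return true;
                }
                return false;
            }
        };
    }

    void put(String key, String value) {
        store.remove(key); // the in-memory copy is now the only current one
        memory.put(key, value);
    }

    String get(String key) {
        String value = memory.get(key);
        if (value == null && store.containsKey(key)) {
            value = store.remove(key); // activation
            memory.put(key, value);    // may passivate another entry
        }
        return value;
    }

    boolean inMemory(String key) {
        return memory.containsKey(key);
    }
}
```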
• Long-running transactions need to be avoided.
• What is a long-running transaction? How long is actually long?
• Read Committed vs. Repeatable Read
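[Diagram: two transactions updating the same keys in opposite order]
TX1 (wants to update A, B, C): begin, Update(A), Update(B), Update(C)
• TX1 is blocked: C is locked by TX2
TX2 (wants to update C, B, A): begin, Update(C), Update(B), Update(A)
• TX2 is blocked: A is locked by TX1
Diagram labels: Lock(A), Release(A)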
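One common way to avoid the deadlock pictured above is to acquire locks in a fixed global order, whatever order the transaction asked for. The following sketch uses only `java.util.concurrent`. The class `OrderedLockingSketch` and its methods are hypothetical, and this is a general technique rather than a description of how any particular grid resolves deadlocks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: avoid the A/C deadlock by always locking keys in sorted order. */
class OrderedLockingSketch {
    private final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    final Map<String, Integer> data = new ConcurrentHashMap<>();

    /** Returns the order in which locks are taken, regardless of the requested order. */
    static List<String> lockOrder(List<String> keys) {
        return new ArrayList<>(new TreeSet<>(keys));
    }

    /** Increments every key while holding all of the locks it needs. */
    void updateAll(List<String> keys) {
        List<String> ordered = lockOrder(keys);
        List<ReentrantLock> held = new ArrayList<>();
        try {
            for (String key : ordered) {
                ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
                lock.lock();
                held.add(lock);
            }
            for (String key : ordered) {
                data.merge(key, 1, Integer::sum);
            }
        } finally {
            // Release in reverse acquisition order.
            for (int i = held.size() - 1; i >= 0; i--) {
                held.get(i).unlock();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        OrderedLockingSketch grid = new OrderedLockingSketch();
        Thread tx1 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) grid.updateAll(List.of("A", "B", "C"));
        });
        Thread tx2 = new Thread(() -> {
            for (int i = 0; i < 1000; i++) grid.updateAll(List.of("C", "B", "A"));
        });
        tx1.start();
        tx2.start();
        tx1.join();
        tx2.join();
        System.out.println(grid.data);
    }
}
```

Because both threads lock A before B before C, neither can hold C while waiting for A, so the circular wait from the diagram cannot form.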
What is returned?
TX1        TX2
begin      begin
get(k)     get(k)
-          put(k, v2)
-          commit
get(k)
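The answer to the question above depends on the isolation level, and the interleaving can be replayed in a small single-threaded simulation. Everything here is a sketch under stated assumptions: `IsolationSketch` is a hypothetical class, the initial value `v1` is assumed (the slide only names `v2`), and Repeatable Read is modelled by remembering the first value a transaction read for each key.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of what a second get(k) returns under each isolation level. */
class IsolationSketch {
    enum Isolation { READ_COMMITTED, REPEATABLE_READ }

    /** Committed state shared by all transactions. */
    private final Map<String, String> committed = new HashMap<>();

    class Tx {
        private final Isolation isolation;
        private final Map<String, String> firstReads = new HashMap<>();
        private final Map<String, String> pendingWrites = new HashMap<>();

        Tx(Isolation isolation) {
            this.isolation = isolation;
        }

        String get(String key) {
            if (pendingWrites.containsKey(key)) {
                return pendingWrites.get(key); // a transaction sees its own writes
            }
            if (isolation == Isolation.REPEATABLE_READ) {
                if (!firstReads.containsKey(key)) {
                    firstReads.put(key, committed.get(key));
                }
                return firstReads.get(key); // same answer as the first read
            }
            return committed.get(key); // latest committed value
        }

        void put(String key, String value) {
            pendingWrites.put(key, value);
        }

        void commit() {
            committed.putAll(pendingWrites);
            pendingWrites.clear();
        }
    }

    Tx begin(Isolation isolation) {
        return new Tx(isolation);
    }

    /** Replays the interleaving from the slide and returns TX1's second read. */
    static String secondRead(Isolation isolation) {
        IsolationSketch grid = new IsolationSketch();
        grid.committed.put("k", "v1"); // assumed initial value
        Tx tx1 = grid.begin(isolation);
        Tx tx2 = grid.begin(Isolation.READ_COMMITTED);
        tx1.get("k");
        tx2.get("k");
        tx2.put("k", "v2");
        tx2.commit();
        return tx1.get("k");
    }

    public static void main(String[] args) {
        for (Isolation level : Isolation.values()) {
            System.out.println(level + " -> " + secondRead(level));
        }
    }
}
```

Under Read Committed the second read sees the value TX2 committed; under Repeatable Read it repeats the value TX1 saw first.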
• Java serialization
• Java externalization
• Impact on performance
• Generic domain.
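The cost difference between Java serialization and externalization can be made visible by comparing payload sizes for the same data. The classes `SerItem`, `ExtItem` and `MarshallingSketch` below are hypothetical examples. Sizes are only a rough proxy for marshalling cost, and real measurements depend on the object graph.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical value object using default Java serialization. */
class SerItem implements Serializable {
    private static final long serialVersionUID = 1L;
    int id;
    String name;

    SerItem(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** The same fields, but the wire format is written by hand. */
class ExtItem implements Externalizable {
    int id;
    String name;

    public ExtItem() { } // public no-arg constructor required by Externalizable

    ExtItem(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(id);
        out.writeUTF(name);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        id = in.readInt();
        name = in.readUTF();
    }
}

class MarshallingSketch {
    static byte[] marshal(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object unmarshal(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Serializable:   " + marshal(new SerItem(7, "abc")).length + " bytes");
        System.out.println("Externalizable: " + marshal(new ExtItem(7, "abc")).length + " bytes");
    }
}
```

The externalizable form is smaller because the stream no longer carries a description of each field; the price is hand-written read and write code that must be kept in sync with the class.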
Thank you for your attention
http://www.alachisoft.com/ncache/caching-topology.html
http://www.infoq.com/news/2011/10/java-data-grid
https://github.com/datagrids/spec/wiki
http://www.jboss.org/infinispan/documentation
http://code.google.com/p/thrift-protobufcompare/wiki/Benchmarking