Charging Models for Data Centers

Bhuvan Urgaonkar
The Penn State University
Data Centers
• Clusters of compute and storage servers
connected by high-speed networks
• Resources made available to applications
• Charge the applications for these
resources
• Applications might have clients that they
charge
Charging in a Data Center
• Between data center & application provider
• Lease out a fixed number of servers (over-provision)
• Fixed monthly rate (e.g., Yahoo! Web hosting)
• Performance-based charging (mostly research prototypes)
• Usage-based (e.g., Sun Grid)
• Between application provider & clients
• Fixed monthly rate (possibly with multiple classes)
• Transaction granularity (roughly same as usage-based)
Classification of Charging
Models
• Flat-rate
• Usage-based
• Flat-rate + Usage-based
• Performance-based
• Bidding-based
Flat-rate Charging
• Local phone service, cable connection
• (+) Billing cannot get any easier
• (-) Consumer: Why should I pay even
when I was on vacation?
• (-) Provider: Could I have improved my
revenue by charging based on usage?
Usage-based Charging
• Electricity
• Actually, the rate fluctuates!
• Service interrupted deliberately sometimes
• Long-distance phone
• Sun Grid: $1/CPU-hour
• There should be a way for the consumer to
verify its usage
• E.g., Electricity meters at our homes
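The usage-based arithmetic and the consumer-side check can be sketched as follows. This is a minimal illustration, not Sun Grid's billing code: the function names, the 1% tolerance, and the idea of comparing the bill against the consumer's own meter reading are assumptions; only the $1/CPU-hour rate comes from the slide.

```python
RATE_PER_CPU_HOUR = 1.00  # the Sun Grid price quoted on the slide


def usage_charge(cpu_seconds, rate_per_cpu_hour=RATE_PER_CPU_HOUR):
    """Charge for metered CPU time, billed per CPU-hour."""
    if cpu_seconds < 0:
        raise ValueError("cpu_seconds must be non-negative")
    return cpu_seconds / 3600.0 * rate_per_cpu_hour


def bill_is_consistent(billed_amount, metered_cpu_seconds, tolerance=0.01):
    """Consumer-side check: does the bill match the consumer's own meter?

    `tolerance` is a hypothetical relative slack (1% by default) that
    absorbs small metering differences between the two parties.
    """
    expected = usage_charge(metered_cpu_seconds)
    return abs(billed_amount - expected) <= tolerance * max(expected, 1e-9)
```

A consumer whose own meter shows 7200 CPU-seconds would expect a bill of about $2; `bill_is_consistent(3.0, 7200)` flags a $3 bill as disputable.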
Flat-rate + Usage-based
• Cell phones
• 400 daytime minutes included in the flat rate
• Usage-based beyond that
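As a rough sketch of the combined model, the bill is the flat fee plus a per-unit charge for usage beyond the included allowance. The 400-minute allowance is the slide's example; the flat fee and the per-minute rate used below are made-up values for illustration.

```python
def hybrid_bill(minutes_used, flat_fee, included_minutes=400,
                per_minute_rate=0.25):
    """Flat fee covers the included minutes; extra minutes are usage-billed.

    `flat_fee` and `per_minute_rate` are hypothetical plan parameters.
    """
    if minutes_used < 0:
        raise ValueError("minutes_used must be non-negative")
    overage_minutes = max(0, minutes_used - included_minutes)
    return flat_fee + overage_minutes * per_minute_rate
```

With a hypothetical $40 flat fee, 300 minutes of use costs $40, while 500 minutes costs $40 plus 100 overage minutes at $0.25 each.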
Performance-based Charging
• Service providers like AT&T, Sprint
guarantee average delays in the backbone
• For data centers
• Difficult for the data center to translate a given
performance target into resource requirements
• Workloads vary, applications are complex
• Desirable to the application provider
• Caveat: How do I know what response time my
clients are experiencing?
Bidding-based Charging
• eBay
• Clients bid till a pre-decided time
• Highest bidder gets to buy
• Winning bidder cannot back down
• Open-bid: You see what others are doing
• Closed-bid: You don’t see what others are
doing
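A closed-bid round following the rules above might be resolved as sketched below. Two details are assumptions rather than anything the slide states: the winner pays its own bid, and ties go to the bidder who submitted first. This is not a description of how eBay resolves auctions.

```python
def closed_bid_winner(bids):
    """Pick the winner of a closed-bid round.

    `bids` maps each bidder to the single amount it submitted before the
    deadline. Returns (winner, price). Assumes the winner pays its own
    bid; ties go to the earliest entry in the dict.
    """
    if not bids:
        raise ValueError("no bids were submitted")
    winner = max(bids, key=bids.get)
    return winner, bids[winner]
```

For example, with bids of 10, 15, and 12 from three application providers, the provider that bid 15 wins and is committed to paying 15.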
Bidding-based Charging
• (+) Provider: This seems to maximize
revenue
• (-) Provider: Has to provide bidding
mechanism
• Scalability may be a problem
• (-) Consumer: I have no guarantees; some
rich guy can always shoot me down!
• (-) Consumer: Outcome known only at the
end of the bid
• Have to wait till then to make any decisions
Possible Factors Governing the
Choice of Charging Model
• Ease of monitoring and accounting
• Abundance of resource
• Competition for the resource
• Ease of verifying/proving resource usage
• Dependencies between various resources
being bought (bidding)
• Different levels of desirability of the
resource among the consumers
Interlude: Differentiated Service
• When does it make sense to have
multiple classes?
• What decides the priority
scheme/scheduling discipline?
Two Aspects of the
Charging Problem
• Charging Model
• Economics Problem
• Accounting and Verification
Mechanism
• Systems Problem
Charging in a Data Center
• Which model is suitable?
• Apps are interested in performance metrics
• Data center would prefer usage-based charging
• What about bidding for resources?
• What does the choice of model depend on?
• Abundance, competition, peace of mind?
• Got to be revenue maximization, right?
• How to charge for the usage of multiple
resources?
• CPU, disk, network, …
Two Aspects of the
Charging Problem
• Charging Model
• Economics Problem
• Accounting and Verification
Mechanism
• Systems Problem
Two Systems Requirements for
Enabling Charging
• Accounting
• Resource provider should be able to
monitor and account for resource usage
• Verification
• Resource consumer should be able to
verify its own resource usage
• Ability to dispute provider’s claims
Accounting
• Well studied by the OS and networking
communities
• Resource containers from Rice University
• Mostly an engineering exercise
• Does the problem become any harder in a
virtualized hosting environment?
Verification
• Remember: App doesn’t trust the data center
• Auditing: Instead of verifying resource usage
at all times, the consumer does it sometimes
• The provider should not be able to predict or
detect an audit
• Audit at random
• Provider and consumer should agree to the
auditing process
• Involve a third party that both trust
• The data center also doesn’t trust the application!
• Failing an audit is a violation of SLA
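One way to make audits hard to predict is to draw the gaps between them from an exponential distribution, which is memoryless: knowing when the last audit happened says nothing about when the next one will. The sketch below assumes this choice; the slides only say "audit at random", and the function name and parameters are hypothetical.

```python
import random


def audit_times(horizon_s, mean_gap_s, rng=None):
    """Return random audit instants (seconds) within [0, horizon_s).

    Gaps between audits are exponentially distributed with mean
    `mean_gap_s`, so past audits do not help predict the next one.
    """
    if mean_gap_s <= 0:
        raise ValueError("mean_gap_s must be positive")
    rng = rng or random.Random()
    times = []
    t = rng.expovariate(1.0 / mean_gap_s)
    while t < horizon_s:
        times.append(t)
        t += rng.expovariate(1.0 / mean_gap_s)
    return times
```

Calling `audit_times(3600, 300)` schedules, on average, about twelve audits in an hour, at instants the provider cannot anticipate.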
Auditing in a Data Center:
Exhaustive Profiling
• The auditor uses extensive profiling to
identify the mapping from resource usage to
performance for all possible workloads
• (+) The data center cannot tell that it
is being audited
• (-) Such profiling might be prohibitively
expensive
Auditing in a Data Center:
Selective Profiling
• The auditor sends well-profiled probes and
observes their performance
• (+) No need for extensive/exhaustive profiling
• (-) Data center might identify probes
• Camouflage needed
• (-) Not trivial to construct probes whose
performance is independent of the rest of the
workload
Auditing in a Data Center:
Self-Monitoring Applications
• Assume it is possible to modify the
application
• Can the application monitor its own
resource usage?
• Cannot trust the underlying OS/VMM
Self-Monitoring Application
• Idea: We add special auditing code (AC) to
the application
• … for (i=0; i < 1000000; i++); …
• At a randomly chosen time t1, the application
sends a message to the auditor
• The application jumps to AC and starts executing
it
• The auditor ACKs the message
• The application receives the ACK at time t2 and
determines t2-t1, compares it with expected
time to reach the current value of i
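The steps above can be sketched as follows. This is only an illustration under stated assumptions: `ack_received` is a hypothetical flag set by whatever code receives the auditor's ACK, `iters_per_cpu_second` is a calibration constant measured beforehand on a dedicated CPU, and `slack` is an arbitrary tolerance. How t2-t1 is measured is left to the caller, since the local clock may not be trustworthy.

```python
import threading


def run_auditing_code(ack_received, max_iterations=1_000_000):
    """The AC: count in a busy loop until the ACK arrives; return i."""
    i = 0
    while i < max_iterations and not ack_received.is_set():
        i += 1
    return i


def estimated_cpu_share(iterations_done, elapsed_s, iters_per_cpu_second):
    """Fraction of a CPU the AC apparently received between t1 and t2."""
    if elapsed_s <= 0 or iters_per_cpu_second <= 0:
        raise ValueError("elapsed_s and iters_per_cpu_second must be positive")
    expected_s = iterations_done / iters_per_cpu_second  # on a dedicated CPU
    return expected_s / elapsed_s


def audit_passes(iterations_done, elapsed_s, iters_per_cpu_second,
                 guaranteed_share, slack=0.05):
    """True if the observed share is within `slack` of the guaranteed one."""
    share = estimated_cpu_share(iterations_done, elapsed_s,
                                iters_per_cpu_second)
    return share >= guaranteed_share * (1.0 - slack)
```

If the AC reached i = 500,000 in one second on a machine calibrated at 1,000,000 iterations per CPU-second, the application received roughly half a CPU, which passes an audit against a guaranteed share of 0.5.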
Problems with Self-Monitoring Applications
• The execution time of AC depends on
what other apps are doing
• Not a problem: data center expected to guarantee
lower bounds
• Unpredictable delays in the Internet
• Send multiple probes and take the average
• The auditor could record probe reception times and
try to adjust for network delays
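Averaging over several probes might look like the sketch below. It also reports the standard error of the mean as a rough guide to whether enough probes were sent; that statistic is an added assumption, not something the slides prescribe, and the adjustment based on the auditor's reception times is not modeled here.

```python
from statistics import mean, stdev


def summarize_probes(elapsed_samples_s):
    """Return (mean, standard error) of the per-probe t2-t1 samples.

    A large standard error suggests that network jitter still dominates
    and more probes are needed before drawing a conclusion.
    """
    n = len(elapsed_samples_s)
    if n < 2:
        raise ValueError("need at least two probes")
    sample_mean = mean(elapsed_samples_s)
    standard_error = stdev(elapsed_samples_s) / n ** 0.5
    return sample_mean, standard_error
```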
Problems with Self-Monitoring Applications
• How to ensure the data center cannot identify a
message to the auditor or the execution of the AC?
• The message to the auditor and the ACK should look like
normal requests and responses
• Giveaway: Data center observes that the application has
become CPU-intensive suddenly
• Not a problem if the app becomes CPU-intensive when serving
its normal workload
• Need to ensure that the CPU usage during the execution of AC
is indistinguishable from that when serving normal workload
• E.g., Running a while loop that lasts 30 min would be a bad idea
Design Issues: Self-Monitoring Applications
• What is the right observation period? How
many observations should be made?
• What about other resources?
• Network bandwidth perhaps similar to CPU
• Memory and disk bandwidth much harder!
Summary
• Charging in data centers seems like an
important problem to address
• We can break down the charging
problem into
• Charging model: Economics problem
• Accounting and verification: Systems
problem
• Many interesting open issues!