Grid Computing

World’s largest virtual computer
INTRODUCTION
 Grid computing is a term referring to the
combination of computer resources from multiple
administrative domains to reach a common goal.
INTRODUCTION
Definition
 A large-scale, geographically distributed hardware and
software infrastructure composed of heterogeneous
networked resources owned and shared by multiple
administrative organizations, which are coordinated to
provide transparent, dependable, pervasive and consistent
computing support to a wide range of applications.
 These applications can perform distributed computing,
high-throughput computing, on-demand computing,
data-intensive computing, collaborative computing or
multimedia computing.
TYPES OF GRID COMPUTING
 Computational Grid
 Data Grid
Computational Grid
 Why do we need it?
 Computational approaches to problem solving have
proven their worth in almost every field of human
endeavour.
 Computers are used for modelling and simulating
complex scientific and engineering problems,
diagnosing medical conditions, controlling industrial
equipment, forecasting the weather, and so on.
Computational Grid
 Example of Traffic
 Computational Grid Application
Computational Grid
 There are a variety of reasons for the lack of use of
computational problem-solving methods, including
a lack of proper education and tools.
 But one important factor is that the average computing
environment remains inadequate for sophisticated
computational tasks.
Computational Grid
 A computational grid environment provides
demand-driven, reliable, powerful, and yet inexpensive
computational power to its customers.
Data Grid
 In an increasing number of scientific disciplines, large
data collections are emerging as important community
resources.
 For example
 Weather forecasting data, which is already in the terabyte
range, must work with Global Information Systems at
different locations. Working with this large volume
of data requires the data to be distributed and accessed by
several users.
Data Grid
 This combination of large data sets, geographic
distribution of users and resources, and
computationally intensive analysis results in complex,
stringent performance demands that are not satisfied
by any existing data management system.
 This led to the introduction of the Data Grid.
Grid Applications
 Application partitioning that involves breaking the
problem into discrete pieces
 Discovery and scheduling of tasks and workflow
 Data Communications distributing the problem data
where and when it is required
Grid Applications
 Provisioning and distributing application codes to
specific system nodes
 Result management assisting in the decision processes
of the environment
 Autonomic features such as self-configuration,
self-optimization, self-recovery and self-management.
Grid Benefits
 No need to buy large six-figure SMP servers for
applications that can be split up and farmed out to
smaller commodity-type servers.
 Results can then be concatenated and analyzed upon
job(s) completion.
Grid Benefits
 Jobs can be executed in parallel, speeding up performance.
Grid environments are extremely well suited to run
jobs that can be split into smaller chunks and run
concurrently on many nodes.
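The split, farm-out and concatenate pattern described on these slides can be sketched in a few lines. This is only an illustration under stated assumptions: a local thread pool stands in for the grid nodes, and `partition`, `process_chunk` and `run_on_grid` are hypothetical names, not part of any grid toolkit.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, chunk_size):
    """Break the problem into discrete pieces of at most chunk_size items."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def process_chunk(chunk):
    """Hypothetical unit of work: square every value in the chunk."""
    return [x * x for x in chunk]


def run_on_grid(data, chunk_size=4, workers=3):
    """Farm chunks out to workers, then concatenate the partial results."""
    chunks = partition(data, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        partial_results = list(pool.map(process_chunk, chunks))
    return [value for part in partial_results for value in part]


print(run_on_grid(list(range(10))))
```

On a real grid the workers would be separate machines and the chunks would travel over the network, but the shape of the computation stays the same.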
Grid Benefits
 Grid environments are much more modular and don't
have single points of failure.
 If one of the servers/desktops within the grid fails,
there are plenty of other resources able to pick up the
load.
 Jobs can automatically restart if a failure occurs.
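Automatic restart after a failure might look roughly like the sketch below. It assumes a failed node raises `RuntimeError` and that the job can simply be rerun on another node; `run_with_restart` and `flaky_job` are hypothetical names used only for illustration.

```python
def run_with_restart(job, nodes, max_attempts=3):
    """Try the job on successive nodes, restarting it after each failure."""
    attempts = 0
    last_error = None
    for node in nodes:
        if attempts == max_attempts:
            break
        attempts += 1
        try:
            # On success, report the result, the node used and the attempt count.
            return job(node), node, attempts
        except RuntimeError as err:
            last_error = err  # remember the failure and move on to the next node
    raise RuntimeError(f"job failed after {attempts} attempts") from last_error


def flaky_job(node):
    """Hypothetical job that fails on one particular node."""
    if node == "node-a":
        raise RuntimeError("node-a is down")
    return f"done on {node}"


print(run_with_restart(flaky_job, ["node-a", "node-b", "node-c"]))
```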
Grid Benefits
 Upgrading can be done on the fly without scheduling
downtime.
 Since there are so many resources some can be taken
offline while leaving enough for work to continue.
 This way upgrades can be cascaded so as not to affect
ongoing projects.
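One way to picture a cascaded upgrade is to take nodes offline in batches while a minimum number stay online. The sketch below assumes a simple fixed batch size derived from that minimum; `upgrade_batches` and `min_online` are hypothetical names.

```python
def upgrade_batches(nodes, min_online):
    """Group nodes into upgrade batches that leave at least min_online running."""
    batch_size = len(nodes) - min_online
    if batch_size < 1:
        raise ValueError("not enough nodes to upgrade without downtime")
    return [nodes[i:i + batch_size] for i in range(0, len(nodes), batch_size)]


for batch in upgrade_batches(["n1", "n2", "n3", "n4", "n5"], min_online=3):
    print("upgrading", batch)
```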
Grid Benefits
 This model scales very well.
 Need more compute resources?
 Just plug them in by installing the grid client on additional desktops
or servers.
 They can be removed just as easily on the fly.
 This modular environment really scales well.
Grid Benefits
 Policies can be managed by the grid software.
 The software is really the brains behind the grid.
 A client resides on each server and sends
information back to the master, telling it what
availability or resources it has to complete incoming
jobs.
Grid Benefits
 Much more efficient use of idle resources.
 Jobs can be farmed out to idle servers or even idle
desktops.
 Many of these resources sit idle, especially outside
business hours.
 Policies can be put in place that allow jobs to go only to
servers that are lightly loaded or have the appropriate
memory/CPU characteristics for the
particular application.
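A placement policy of this kind can be sketched as a filter over the reports that clients send to the master. The field names, the 25% load threshold, and the choice to require both light load and sufficient memory/CPU are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeReport:
    """What a hypothetical grid client reports back to the master."""
    name: str
    load: float            # fraction of CPU in use, 0.0 to 1.0
    free_memory_gb: float
    cpus: int


@dataclass
class JobRequirements:
    min_memory_gb: float
    min_cpus: int


def eligible_nodes(reports, job, max_load=0.25):
    """Return lightly loaded nodes that meet the job's needs, least loaded first."""
    matches = [
        r for r in reports
        if r.load <= max_load
        and r.free_memory_gb >= job.min_memory_gb
        and r.cpus >= job.min_cpus
    ]
    return sorted(matches, key=lambda r: r.load)


reports = [
    NodeReport("server-1", load=0.90, free_memory_gb=64, cpus=16),
    NodeReport("desktop-7", load=0.05, free_memory_gb=8, cpus=4),
    NodeReport("server-2", load=0.10, free_memory_gb=32, cpus=8),
]
job = JobRequirements(min_memory_gb=16, min_cpus=4)
print([r.name for r in eligible_nodes(reports, job)])
```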
Drawbacks of Grid computing
 For memory-hungry applications that can't take
advantage of MPI, you may be forced to run on a large
SMP.
Drawbacks of Grid computing
 Some applications may need to be tweaked to take full
advantage of the new model.
 Licensing across many servers may make it prohibitive
for some apps.
 Vendors are starting to be more flexible with
environments like this.
Drawbacks of Grid computing
 Political challenges associated with sharing resources
(especially across different admin domains).
 Many groups are reluctant to share resources even
if it benefits everyone involved.
 The benefits for all groups need to be clearly
articulated and policies developed that keep everyone
happy.
(easier said than done...)
Drawbacks of Grid computing
 Grid environments include many smaller servers
across various administrative domains.
 Managing change and keeping configurations in sync
with each other can be challenging in large
environments, so good tools are needed.
 Tools that exist to manage such challenges include
SystemImager, Opsware, Bladelogic, pdsh and cssh,
among others.
Drawbacks of Grid computing
 You may need to have a fast interconnect between
compute resources (gigabit Ethernet at a minimum,
InfiniBand for MPI-intensive applications).
Grid Components
 Grid Portal
 Security
 Broker
 Scheduler
 Data Management
 Job and Resource Management
 Resources
Grid Portal
Security
Broker
Scheduler
Data Management
Job Management
Grid Architecture
Fabric Layer
 The Fabric Layer defines the resources that can be
shared.
 Examples: computational resources, data
storage, networks, catalogs and other system resources.
 These resources can be physical or logical
Fabric Layer
 Examples of logical resources: file systems, software
applications.
 These logical resources are implemented by their own
internal protocols (e.g. NFS for distributed file systems).
 These resources then comprise their own network of
physical resources.
Fabric Layer
 There are no specific requirements that a particular
physical resource must meet in order to be integrated as
part of a grid system.
 However, it is recommended that each resource have basic
capabilities to support its integration:
 Provide an “inquiry” mechanism that allows the resource's
own capabilities to be discovered.
 Provide appropriate “resource management” capabilities
to control the QoS the grid solution promises.
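The two recommended capabilities can be pictured as a small interface on each resource. This is a sketch only: the class name `FabricResource`, its fields, and the use of a job-slot limit as the QoS control are all assumptions, not a defined grid API.

```python
class FabricResource:
    """Hypothetical wrapper around one physical or logical resource."""

    def __init__(self, name, kind, capabilities, max_jobs):
        self.name = name
        self.kind = kind                      # e.g. "compute" or "storage"
        self.capabilities = dict(capabilities)
        self.max_jobs = max_jobs              # admission limit used to protect QoS
        self.running_jobs = 0

    def inquire(self):
        """Inquiry mechanism: report this resource's own capabilities and state."""
        return {
            "name": self.name,
            "kind": self.kind,
            "capabilities": dict(self.capabilities),
            "free_slots": self.max_jobs - self.running_jobs,
        }

    def admit(self):
        """Resource management: accept a job only while a slot is free."""
        if self.running_jobs >= self.max_jobs:
            return False
        self.running_jobs += 1
        return True

    def release(self):
        """Free a slot when a job finishes."""
        if self.running_jobs > 0:
            self.running_jobs -= 1


node = FabricResource("cluster-1", "compute", {"cpus": 16, "memory_gb": 64}, max_jobs=2)
print(node.inquire())
```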
Connectivity Layer
 It manages connections.
 It defines the core communication protocols and
authentication protocols required for grid-specific
networking services transactions.
 The communication protocols can work with any of
the networking layer protocols that provide the
transport, routing and naming capabilities in
networking services solutions.
Connectivity Layer
 The Authentication protocol builds on top of the
networking communication services in order to
provide secure authentication and data exchange
between users and respective resources.
Resource Layer
 The Resource Layer utilizes the communication and
security protocols defined by the networking
communications layer to control the following aspects
of sharing operations:
 Secure negotiation
 Initiation
 Monitoring
 Metering
 Accounting
 Payment
Resource Layer
 Information Protocols: These protocols are used to
get information about the structure and the
operational state of a single resource
 Including Configuration
 Usage Policies
 Service-Level agreements
 State of the resource
Resource Layer
 Management Protocols: These provide the following
functionalities
 Negotiating access to a shared resource
 Performing operations on the resource, such as process
creation or data access
 Acting as the service/resource policy enforcement point
for policy validation between a user and a resource
 Providing accounting and payment management
functions
 Monitoring the status of an operation, and controlling or
terminating it
Collective Layer
 The Resource Layer manages individual resources, while the
Collective Layer is responsible for all global resource
management and interaction with a collection of
resources.
 Examples
 Discovery Services
 Co-allocation, Scheduling and Brokering Services
 Monitoring and Diagnostic Services
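Discovery and co-allocation across a collection of resources could be sketched as follows. The dictionary layout, the greedy largest-first strategy and the all-or-nothing rule are assumptions chosen for illustration; real brokers use far richer policies.

```python
def discover(resources, kind):
    """Discovery service: find all resources of the requested kind."""
    return [r for r in resources if r["kind"] == kind]


def co_allocate(resources, cpus_needed):
    """Greedy, all-or-nothing plan that spreads a request over several resources.

    Returns a {resource name: cpus} plan, or None if the collection as a
    whole cannot satisfy the request.
    """
    plan = {}
    remaining = cpus_needed
    for r in sorted(resources, key=lambda r: r["free_cpus"], reverse=True):
        if remaining == 0:
            break
        take = min(r["free_cpus"], remaining)
        if take > 0:
            plan[r["name"]] = take
            remaining -= take
    return plan if remaining == 0 else None


grid = [
    {"name": "site-a", "kind": "compute", "free_cpus": 8},
    {"name": "site-b", "kind": "storage", "free_cpus": 0},
    {"name": "site-c", "kind": "compute", "free_cpus": 4},
]
print(co_allocate(discover(grid, "compute"), 10))
```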
Application Layer
 These are user applications, which are constructed by
utilizing the services defined at each lower layer.
 Such an application can directly access the resource or
can access the resource through the Collective Service
Interface APIs
Grid relation to Distributed Technologies
 World wide web
 Distributed Computing Systems
 Application and storage service providers
 Peer-to-peer computing systems
World Wide Web
 A number of open and ubiquitous technologies are
defined for the WWW that make the web a suitable
candidate for the construction of virtual
organizations.
 However, the web is defined as a browser-server message
exchange model, and lacks the more complex
interaction models required for a realistic virtual
organization.
World Wide Web
 Examples of such interaction models:
 Single-sign-on
 Delegation of Authority
 Complex Authentication Mechanisms
 Once browser to server interaction matures, the web
will be suitable for the construction of grid portals to
support multiple virtual organizations.
Distributed Computing Systems
 Major distributed technologies, including CORBA, J2EE
and DCOM, are well suited for distributed applications.
 However, these do not provide a suitable platform for
sharing resources among the members of a virtual
organization.
Distributed Computing Systems
 Another major drawback in distributed computing
systems involves the lack of interoperability among the
protocols.
 Some distributed technologies have attracted the
attention of Grid computing researchers for the
construction of grid systems, the most notable of which is
Java JINI.
Application and Storage Service Providers
 Application and storage service providers normally
outsource their business and scientific applications
and services, as well as very high-speed storage
solutions, to customers outside their organizations.
 Customers negotiate with these highly effective service
providers on QoS requirements.
Application and Storage Service Providers
 These types of advanced service arrangements are
executed over some type of virtual private network or
dedicated line, narrowing the domain of security
and event interactions.
 This in turn reduces the visibility of the service
provider to a lower and fixed scale, without
complex resource sharing among heterogeneous
systems or interdomain networking service
interactions.
Application and Storage Service Providers
 This being said, the introduction of Grid
Computing principles related to resource sharing
across virtual organizations, along with the
construction of virtual organizations yielding interdomain
participation, will alter this situation.
Peer-to-Peer Computing Systems
 Similar to Grid Computing, peer-to-peer computing is
a relatively new computing discipline in the realm of
distributed computing.
 P2P and distributed computing are focussed on
resource sharing and are now widely utilized
throughout the world by home and commercial
markets.
Peer-to-Peer Computing Systems
 The major differences are:
 1. They differ in their target communities. Grid
communities can be small with regard to number of
users, yet will yield a greater applications focus with a
higher level of security requirements and application
integrity.
 On the other hand p2p systems define collaboration
among a larger number of individuals and/or
organizations, with a limited set of security
requirements and a less complex resource sharing
topology.
Peer-to-Peer Computing Systems
 2. Grid systems deal with a more complex, more
powerful, more diverse and more highly interconnected set
of resources than P2P systems.