The Potential of Cloud Technology for Spatial Data Processing

Parallel Computing for Scientific Environment
Cluster, Grid & Cloud approach
Mardhani Riasetiawan, MT, Candidate Ph.D
[email protected]
http://mardhani.blog.ugm.ac.id
6283869942863
Department of Computer Science & Electronics
Faculty of Mathematics and Natural Sciences
Universitas Gadjah Mada
www.dcse.fmipa.ugm.ac.id
A Research and Working Group on
Grid & Cloud Technology
Universitas Gadjah Mada
www.cloud.wg.ugm.ac.id
AGENDA
• Concepts
• Why Parallel, Cloud Computing?
• Cloud Computing
• Cluster, Grid, & Cloud
• Spatial Cloud Computing
• Best Practices
• GEOSS Clearinghouse
• Dala Project
• GamaBox
• Implementation
• Ideas & issues
• Technology Architecture
• Case study
Technology
Maximizing resources and minimizing risk
Technology Issues
The Issues
[Figure: Petabytes Worldwide, 2005 to 2010, vertical axis from 0 to 1,000,000. Series: Information and Available Storage, with the annotation "Transient information or unfilled demand for storage".]
• The digital universe will grow 10-fold in five years, from ~160-170 exabytes in 2006 to >1,600 exabytes in 2011
• Information created surpassed available storage in 2007 and will be 2x available storage in five years
• Unstructured information accounts for >90% of the digital universe
• Consumers/individuals account for ~70% of information created, yet enterprises have “responsibility/liability” for ~85%
• Preservation-“intense” information will grow 9-fold in 5 years
Source: John Gantz, Chief Research Officer, IDC
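The growth figures above imply a compound annual rate, which a few lines of arithmetic make explicit. This is only an illustrative sketch: the function name `annual_growth_rate` is an assumption introduced here and is not part of the IDC study.

```python
def annual_growth_rate(fold: float, years: int) -> float:
    """Constant yearly growth rate that yields `fold`-times growth over `years` years."""
    return fold ** (1.0 / years) - 1.0

# Figures quoted on the slide: 10-fold growth of the digital universe and
# 9-fold growth of preservation-intensive information, both over five years.
digital_universe = annual_growth_rate(10, 5)
preservation = annual_growth_rate(9, 5)

print(f"digital universe: {digital_universe:.1%} per year")
print(f"preservation-intensive information: {preservation:.1%} per year")
```

Under the assumption of constant compounding, both figures correspond to growth of roughly 55% to 58% per year.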
“Enabler”
Facts about Data
Cluster – Grid – Cloud
Cloud Technology
Grid Computer
Cluster Computer
Unused & second-hand hardware
Parallel Computing
• What is offered
• Integration of all geospatial data and knowledge, processed within a measurable time.
• Producing and delivering correct information in real time to decision makers, primary users, and victims.
• Computing platform and infrastructure
• Ready within minutes
• Can accommodate user needs
• Pay according to the computing “cost” actually used
• Avoids emergency costs arising from the failure of existing systems
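The pay-for-what-you-use point can be illustrated with a small cost comparison. This is a minimal sketch under stated assumptions: the demand profile, the price per instance-hour, and the function names `on_demand_cost` and `fixed_cost` are all hypothetical and do not reflect any provider's actual pricing.

```python
def on_demand_cost(hourly_demand, rate_per_instance_hour):
    """Pay-per-use: billed only for the instances actually running each hour."""
    return sum(hourly_demand) * rate_per_instance_hour

def fixed_cost(hourly_demand, rate_per_instance_hour):
    """Fixed provisioning: capacity sized for the peak hour and paid for every hour."""
    return max(hourly_demand) * len(hourly_demand) * rate_per_instance_hour

# Hypothetical 24-hour demand (instances needed per hour) with a short emergency spike.
demand = [2] * 20 + [40] * 4
rate = 0.10  # hypothetical price per instance-hour

print("on-demand:", on_demand_cost(demand, rate))
print("fixed:", fixed_cost(demand, rate))
```

With this hypothetical profile the on-demand bill is about 20 cost units against about 96 for peak-sized fixed capacity. The gap narrows as demand becomes flatter.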
By definition
• “Cloud computing is a model for enabling convenient,
on-demand network access to a shared pool of
configurable computing resources (e.g., networks,
servers, storage, applications, and services) that can
be rapidly provisioned and released with minimal
management effort or service provider interaction. This
cloud model promotes availability and is composed of
five essential characteristics, three service models,
and four deployment models.” (NIST 2010)
Spatial Cloud Computing
• Data Intensity
• Computing Intensity
• Concurrent Access Intensity, and
• Spatiotemporal Intensity
• Enables geospatial science discoveries, emergency responses, education, and other societal benefits
• Is optimized by spatiotemporal principles.
Spatial Databases: Representative Projects
• Evacuation Route Planning
• Parallelize Range Queries
• Shortest Paths
• Storing graphs in disk blocks
[Figure legend: only in old plan; only in new plan; in both plans]
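The "Parallelize Range Queries" item can be sketched as a partition, query, and merge pattern. This is an illustrative sketch only and is not the method used in the projects named above. The function names, the round-robin partitioning, and the sample data are assumptions. Threads are used to keep the example self-contained; in CPython a CPU-bound workload would need processes or separate cluster nodes to gain real speedup, but the structure stays the same.

```python
from concurrent.futures import ThreadPoolExecutor

def in_box(point, box):
    """True if (x, y) lies inside the axis-aligned box (xmin, ymin, xmax, ymax)."""
    x, y = point
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax

def range_query(points, box):
    """Sequential range query over one partition."""
    return [p for p in points if in_box(p, box)]

def parallel_range_query(points, box, workers=4):
    """Split the points into `workers` partitions, query each, and merge the results."""
    chunks = [points[i::workers] for i in range(workers)]  # round-robin partitioning
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda chunk: range_query(chunk, box), chunks))
    return sorted(p for part in parts for p in part)

# Hypothetical sample data: a 10 x 10 grid of integer points.
points = [(x, y) for x in range(10) for y in range(10)]
hits = parallel_range_query(points, box=(2, 2, 4, 3))
print(hits)
```

Round-robin partitioning spreads nearby points across workers, so every worker gets a share of any query window. A production system would more likely partition by space and consult an index first.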
Why cloud computing for spatial data?
• Geospatial Intelligence [Dr. M. Pagels, DARPA, 2006]
• Estimated at 140 terabytes per day, 150 petabytes annually
• Annual volume is 150x historical content of the entire internet
• Analyze daily data as well as historical data
Best Practices
GEOSS Clearinghouse
• Objectives
• Share Global Earth Observation Data Among 140+ Countries to Address Global Challenges of Natural Hazards and Emergency Responses
• Support Global End Users to Discover, Access, and Utilize EO Data
• Provide Responses to End Users in Seconds
• Advanced Computing Technologies
• Cloud Computing (EC2 & Azure) Responds to Spikes of Massive Concurrent End Users
• Cloud DB (SQLAzure) Manages Millions to Billions of Metadata Records
• WebGIS & 5D Vis Tools to Visualize EO Data
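Responding to spikes of concurrent users comes down to an elastic sizing rule. The sketch below is a hypothetical illustration of that idea, not the GEOSS Clearinghouse implementation and not an EC2 or Azure API. The function name `instances_needed`, the capacity of 500 users per instance, and the load trace are all assumptions.

```python
import math

def instances_needed(concurrent_users, users_per_instance, min_instances=1, max_instances=100):
    """Instances to run so that each serves at most `users_per_instance` users, within bounds."""
    wanted = math.ceil(concurrent_users / users_per_instance)
    return max(min_instances, min(max_instances, wanted))

# Hypothetical load trace: normal traffic, then a spike after a natural-hazard event.
load = [120, 150, 9000, 4000, 200]
plan = [instances_needed(users, users_per_instance=500) for users in load]
print(plan)
```

The lower bound keeps the service reachable during quiet periods, and the upper bound caps spending if the load estimate is wrong.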
Concurrent Intensity
CERN
Implementation
Architecture
A Conceptual Framework for CloudGIS
Yang C., Bambacus M., Benedict K., Nebert D., Mochuney D., Hazlett S., Houser P., Raskin R., Xu Y., Fay D., Rezgui A., Huang Q., and Xu C., 2011. Using Metadata, Data/Service Quality and Knowledge to Facilitate Better Data Discovery, Access, and Utilization for Supporting EarthCube,
http://semanticommunity.info/@api/deki/files/13812/=024_Yang.pdf.
References
1. Yang, C., Goodchild, M., Huang, Q., Nebert, D., Raskin, R., Xu, Y., Bambacus, M., Fay, D., 2011a. Spatial Cloud Computing: How could geospatial sciences use and help to shape cloud computing, International Journal of Digital Earth.
2. Foster, I., Zhao, Y., Raicu, I., Lu, S., 2008. Cloud Computing and Grid Computing 360-Degree Compared, In: Grid Computing Environments Workshop, GCE 2008. IEEE, Los Alamitos.
3. Yang, C., Raskin, R., Goodchild, M.F., and Gahegan, M., 2010. Geospatial Cyberinfrastructure: Past, Present and Future, Computers, Environment, and Urban Systems, 34(4): 264-277.
4. Goodchild, M.F., Yuan, M., and Cova, T.J., 2007. Towards a general theory of geographic representation in GIS, International Journal of Geographical Information Science, 21(3): 239-260. (Open Access)
5. Rey, S.J., and Janikas, M.V., 2006. STARS: Space-Time Analysis of Regional Systems, Geographical Analysis, 38(1): 67-86.
6. Armbrust, M., Fox, A., Griffith, R., Joseph, A., Katz, R., et al., 2009. Above the Clouds: A Berkeley View of Cloud Computing, Technical Report No. UCB/EECS-2009-28. (Open Access)
7. Wang, S., and Armstrong, M., 2009. A theoretical approach to the use of cyberinfrastructure in geographical analysis, International Journal of Geographical Information Science, 23(2): 169-193. (Open Access)
8. Yang, C., Wu, H., Li, Z., Huang, Q., Li, J., 2011. Spatial Computing: Utilizing Spatial Principles to Optimize Distributed Computing for Enabling Physical Science Discoveries, Proceedings of the National Academy of Sciences, doi: 10.1073/pnas.0909315108. (Open Access) http://www.pnas.org/content/early/2011/03/21/0909315108.full.pdf
9. Wang, S., and Liu, Y., 2009. TeraGrid GIScience Gateway: Bridging Cyberinfrastructure and GIScience, International Journal of Geographical Information Science, 23(5): 631-656.
10. Evangelinos, C., Hill, C., 2008. Cloud Computing for parallel Scientific HPC Applications: Feasibility of running Coupled Atmosphere-Ocean Climate Models on Amazon’s EC2, CCA-08, October 22-23, 2008.
11. Image taken from: http://www.bluecloudspatial.com/
Thank you