Unit 4 - Programming Model
Prof. B. Chandramouli

Syllabus
• Open source grid middleware packages
• GT4 architecture and configuration
• Usage of Globus
• Main components and programming model
• Introduction to the Hadoop framework
• MapReduce: input splitting, map and reduce functions
• Specifying I/O parameters, configuring and running a job
• Design of the Hadoop file system, HDFS concepts
• Command line and Java interface, data flow of file read and file write

Open source grid middleware packages
• A grid system includes
  • Computational resources
  • Storage resources
  • Network resources
  • Scientific instruments
• Grid middleware is software that provides users with
  • Access to resources
  • Computing ability

Popular grid middleware software
• UNICORE (open source)
  • Focus: high-level programming models (Java on Unix)
• Globus (open source)
  • Focus: low-level services (C and Java on Unix)
• Gridbus (open source)
  • Focus: abstraction and market models (Java on Unix)
• Legion (not open source)
  • Focus: high-level programming models (C++ on Unix)

Building a grid service with GT4
• OGSI is the basic building block of OGSA. GT3 implemented OGSI directly; GT4 replaces it with the WS-Resource Framework (WSRF).
• A grid service is an extension of a web service, so it can be built with GT4.

Globus Toolkit 4
• GT4 has become a de facto standard
• Using GT4, we can build a computational grid and run grid-based applications
• The Globus Toolkit has four major parts (services):
  • Security: components that provide a security envelope and secure access (GSI, the Grid Security Infrastructure)
  • Information: monitoring and discovery of resources and services (MDS)
  • Execution: Grid Resource Allocation and Management (GRAM)
  • Data management: access and transfer of data (GridFTP)

User employing Globus services in a grid
• Security shield: Grid Security Infrastructure (GSI)

GT4 Architecture

GT4 Architecture aspects 1 of 3
• Service implementations (i.e., infrastructure services):
  • Resource allocation and management (GRAM)
  • Data access and data movement (GridFTP, Reliable File Transfer - RFT)
  • Replica management (RLS - Replica Location Service)
  • Credential management and security (MyProxy, Delegation, CAS - Community Authorization Service)
  • Discovery and monitoring of resources (Index, Trigger)

GT4 Architecture aspects 2 of 3
• Containers:
  • Java
  • Python
  • C
• These containers are open source environments that host web services, including
  • WS-Resource Framework (WSRF)
  • WS-Notification
  • WS-Security

GT4 Architecture aspects 3 of 3
• Class libraries
  • Used by client programs to invoke services

Client-server communication
• WS-Interoperability (WS-I) compliant transport, communicating through SOAP messages
• X.509 end-entity and proxy certificates for single sign-on
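The SOAP messaging mentioned under client-server communication can be pictured with a small sketch. The code below only shows the general shape of a SOAP 1.1 request envelope, built with Python's standard library. The service namespace `http://example.org/grid/counter`, the operation name `add` and its `value` parameter are hypothetical placeholders, not part of GT4, and a real GT4 exchange would also carry WS-Addressing and WS-Security headers that are left out here.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (standard).
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace, used for illustration only.
EXAMPLE_NS = "http://example.org/grid/counter"


def build_soap_request(operation: str, params: dict) -> bytes:
    """Wrap one operation call and its parameters in a SOAP 1.1 envelope."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    # Empty header; security and addressing headers would go here.
    ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{EXAMPLE_NS}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(call, f"{{{EXAMPLE_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


# Hypothetical call: ask a counter-style service to add 5.
request = build_soap_request("add", {"value": 5})
print(request.decode("utf-8"))
```

The client library's job is essentially to build such a message, send it over the transport, and parse the reply, so that application code never handles the XML directly.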
GT4 service components - GRAM
• Grid Resource Allocation and Management
• After resources are discovered, GRAM initiates, monitors and manages the execution of computations on remote computers
• GRAM is also responsible for restarting the process in the event of resource failure or service failure

GT4 service components - GSI
• Grid Security Infrastructure
  • Provides authentication for grid users
  • Ensures secure communication
  • Single sign-on through certificates
  • Data encryption
  • Technologies used are
    • SSL: Secure Sockets Layer
    • PKI: Public Key Infrastructure
    • X.509: certificate standard for security

Single sign-on using a trust authority

GT4 service components - GridFTP, RFT, RLS
• Data management package
• Transmits, stores and manages massive data sets
• Components of this service are
  • GridFTP (normal FTP + enhanced security)
  • RFT (Reliable File Transfer)
  • RLS (Replica Location Service)

GT4 job workflow

GT4 configuration *
• GridFTP configuration
  • Installed by default when GT4 is installed. No specific configuration is required, but the GridFTP server must be started with commands to bring up the FTP services.
• RFT configuration
  • RFT performs third-party transfers between GridFTP servers and records the transfer status in a database (PostgreSQL v8.1.4). This database must be installed.
• GRAM configuration
  • Installed by default when GT4 is installed. No specific configuration is required. GRAM executes and manages jobs through the local scheduler.
• * Refer to the book for commands

Usage of Globus GT4
• Defining a job
  • A job is a single process, or multiple processes, created as the outcome of a job request
• Staging files
  • Transferring executable (.exe) and data files to the required destination without user intervention.
  • To transfer files, both the source and the destination URLs must be provided
• Submitting a job (2 steps)
  • Data transfer (GridFTP or GASS protocol)
  • Job submission (GRAM has tools to submit a job)
• Monitoring a job (3 tasks)
  • Track the status of the submitted job
  • Collect output
  • Clean up files

Main components and programming model of GT4
• Main components
  • Security component
    • GSI (Grid Security Infrastructure)
  • Data management component
    • GridFTP
    • RFT (Reliable File Transfer)
  • Data replica component
    • RLS (Replica Location Service)
    • DRS (Data Replication Service)
  • Execution management
    • GRAM (Grid Resource Allocation and Management)
  • Monitoring and Discovery Services (MDS)
    • Aggregator services (general framework to build services and aggregate data)
    • Index
    • Trigger

Possible 2 mark questions
1. Define grid middleware.
2. List 4 popular grid middleware software packages.
3. What are the 4 major parts (services) of GT4?
4. Describe the purpose of GSI, GRAM, MDS and GridFTP.
5. List 2 safe file transfer protocols in Globus.
6. What 3 components need to be installed during GT4 configuration?
7. What does staging files mean?

Possible big questions
8 marks
1. Write a short note on GT4 configuration.
2. Write a short note on GT4 usage.
3. Write a short note on GT4 main components.
16 marks
1. Explain the GT4 architecture in detail, with a neat diagram.
2. Explain the GT4 job workflow in detail, with a neat diagram.
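As a study aid for the usage and job workflow questions, the usage steps (define a job, stage files, submit, then track status, collect output and clean up) can be sketched as a small state machine. This is a toy in-memory model, not a Globus API: the `GridJob` class, its method names, and the host names in the `gsiftp://` URLs are all hypothetical, and remote execution is simulated by a local method call.

```python
from dataclasses import dataclass, field
from enum import Enum


class JobState(Enum):
    DEFINED = "defined"
    STAGED = "staged"
    SUBMITTED = "submitted"
    DONE = "done"
    CLEANED = "cleaned"


@dataclass
class GridJob:
    """Hypothetical stand-in for a job request; not part of GT4."""
    executable: str
    staged_files: dict = field(default_factory=dict)  # destination URL -> source URL
    state: JobState = JobState.DEFINED
    output: str = ""

    def stage(self, source_url: str, destination_url: str) -> None:
        # Staging needs both a source and a destination URL.
        self.staged_files[destination_url] = source_url
        self.state = JobState.STAGED

    def submit(self) -> None:
        # Step 1 (data transfer) must come before step 2 (job submission).
        if self.state is not JobState.STAGED:
            raise RuntimeError("stage files before submitting")
        self.state = JobState.SUBMITTED

    def run_to_completion(self) -> None:
        # Simulates remote execution; a real grid runs this asynchronously.
        if self.state is not JobState.SUBMITTED:
            raise RuntimeError("job has not been submitted")
        self.output = f"ran {self.executable} with {len(self.staged_files)} staged file(s)"
        self.state = JobState.DONE

    def collect_output(self) -> str:
        if self.state is not JobState.DONE:
            raise RuntimeError("job has not finished")
        return self.output

    def clean(self) -> None:
        # Remove staged files once the output has been collected.
        self.staged_files.clear()
        self.state = JobState.CLEANED


job = GridJob(executable="simulate.exe")
job.stage("gsiftp://source.example.org/data/input.dat",
          "gsiftp://compute.example.org/scratch/input.dat")
job.submit()
job.run_to_completion()       # monitoring task 1: track status until done
result = job.collect_output() # monitoring task 2: collect output
job.clean()                   # monitoring task 3: clean files
print(job.state.value, "|", result)
```

The state checks encode the ordering from the slides: a job cannot be submitted before its files are staged, and output cannot be collected before the job finishes.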