Somu Jayabalan
CSS 534: Parallel Programming Grid and Cloud - Programming Tasks
Assignment #4: Visualization of Sentinel Agent execution

Problem

To schedule MPI applications, Condor must be configured so that the machines running MPI jobs are dedicated. Once Condor begins an MPI execution, it runs the program until the program ends; the program is not preempted or suspended midway. If the program has a long computation cycle, system resources may run low while it executes. Continuing on the same machine under that condition degrades the performance of the resource and results in poor execution time. Moreover, the resources currently executing the program may stop satisfying the user-specified resource criteria during the run.

Recommendation

Resource capacity therefore needs to be checked during execution of the user program, to decide whether the program should continue on the same machines or be transferred to different nodes. If better computing resources are found, execution can be stopped on the current nodes and resumed on different nodes.

Implementation

AgentTeamwork is a job management system similar to Condor. It is a Java-based system developed by Prof. Fukuda. It consists of a daemon process (UWPlace) and a collection of mobile agents, which run inside the daemon process.

PFAgent: PFAgent is a mobile agent that runs on all participating nodes and broadcasts resource information (CPU, memory, network bandwidth, etc.).

Commander Agent: The Commander Agent is injected by a user to execute the user program. It spawns the Sentinel Agent and exits upon completion of the user program.

Sentinel Agent: The Sentinel Agent contacts the PFAgent to get the best computing nodes matching the user-specified criteria and decides where to execute the user program.
It also monitors the execution of the user program; if it finds a better computing node, it stops the user program and resumes it from that node. The Sentinel Agent moves to the new node along with the user program. Once the user program completes, the Sentinel Agent notifies the Commander Agent.

[Figure: the user submits a job, the Commander Agent spawns a Sentinel Agent, and the Sentinel Agent migrates from node to node.]

In this final project, I visualized where the Sentinel Agent moves during execution of the user program. Because the user program is an MPI program, the visualization also shows the nodes listed in the mpd.hosts file (smaller circles). Whenever the Sentinel Agent moves, it sends information to the Commander Agent, which writes that information to a file (nodes.txt). The graphics application keeps reading nodes.txt and displays its contents visually. At the end of the execution, the Commander Agent writes "end" to the file; when the graphics application reads "end", it stops reading the file.

Execution output

Green marks the nodes on which execution has completed, and red marks the currently executing node. In the screenshot below, the Sentinel Agent started on Uw1-320-00 and then migrated to Uw1-320-06, Uw1-320-05, Uw1-320-01, Uw1-320-04 and finally Uw1-320-07. The smaller circles represent the nodes listed in the mpd.hosts file.

Analysis

The original version of AgentTeamwork was implemented with a static list of nodes (defined in XML). During my independent study, I enhanced the framework to select nodes dynamically, based on their resource capacity.
With the static list, the program may end up running on nodes with low capacity. I conducted a performance evaluation with the best computing nodes as well as the worst computing nodes. The table below summarizes the results.

Iteration   Best computing node     Worst computing node    Improvement
            execution time (s)      execution time (s)
1           158.345                 167.682                 5.5%
2           157.766                 166.039                 4.9%
3           158.921                 163.266                 2.7%

In all three iterations, the best computing nodes gave a shorter execution time than the worst computing nodes.

Discussions

The best computing node is determined by the following formulas. I calculate a rank for each computing node and then sort the nodes by rank. A higher rank indicates a better node and a lower rank a worse one. The formulas are written for node 0 out of nodes 0 through n, and each rank is expressed as a percentage.

\[
\text{Cpu\_capacity} = (\#\text{ofCPUs} \times \#\text{ofCores} \times \text{CPUSpeed}) \times (1 - \text{cpu\_Load})
\]

\[
\text{Cpu\_rank}_0 = \left( \frac{\text{cpu\_capacity}_0}{\text{cpu\_capacity}_0 + \text{cpu\_capacity}_1 + \cdots + \text{cpu\_capacity}_n} \right)\%
\]

\[
\text{Memory\_free\_rank}_0 = \left( \frac{\text{memory\_free}_0}{\text{memory\_free}_0 + \text{memory\_free}_1 + \cdots + \text{memory\_free}_n} \right)\%
\]

\[
\text{Memory\_load}_0 = \left( \frac{\text{TotalMemoryAvailable} - \text{memory\_free}_0}{\text{memory\_available}_0 + \text{memory\_available}_1 + \cdots + \text{memory\_available}_n} \right)\%
\]

\[
\text{Memory\_pressure\_rank}_0 = \left( \frac{100 - \text{memory\_load}_0}{\text{memory\_load}_0 + \text{memory\_load}_1 + \cdots + \text{memory\_load}_n} \right)\%
\]

\[
\text{Bandwidth\_rank}_0 = \left( \frac{\text{bandwidth}_0}{\text{bandwidth}_0 + \text{bandwidth}_1 + \cdots + \text{bandwidth}_n} \right)\%
\]

\[
\text{overall\_rank}_0 = (\text{Cpu\_rank}_0 \times 0.5) + (\text{Memory\_free\_rank}_0 \times 0.2) + (\text{Memory\_pressure\_rank}_0 \times 0.1) + (\text{Bandwidth\_rank}_0 \times 0.2)
\]

Further research

1) The weights allocated to the CPU, memory, and bandwidth ranks were selected arbitrarily. Further research is needed to determine appropriate weights and to find the correlations between them.
2) I migrate the Sentinel Agent only if the overall_rank of the best node exceeds the current node's rank by more than 2 (overall_rank > (current_rank + 2)).
The migration cost (the time taken to save the current program and to resume it on the destination node) still needs to be measured.
3) The performance evaluation needs to be conducted under simulated conditions, which means I need to develop stress scripts for CPU and memory.
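
The ranking formulas and the migration rule from the Discussions section can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the AgentTeamwork implementation (which is Java-based): the `Node` class, its field names, and the function names are hypothetical, and the memory load percentage is taken as an input rather than derived from the Memory_load formula. Only the weights (0.5, 0.2, 0.1, 0.2) and the margin of 2 come from the text.

```python
from dataclasses import dataclass

# Weights as given in the text; the report notes they were chosen arbitrarily.
WEIGHTS = {"cpu": 0.5, "mem_free": 0.2, "mem_pressure": 0.1, "bandwidth": 0.2}
# From the rule overall_rank > (current_rank + 2).
MIGRATION_MARGIN = 2.0


@dataclass
class Node:
    """Hypothetical container for the resource data a PFAgent broadcasts."""
    name: str
    cpus: int
    cores: int
    cpu_speed: float   # any consistent unit, e.g. GHz
    cpu_load: float    # fraction in [0, 1]
    mem_free: float    # any consistent unit, e.g. MB
    mem_load: float    # percent in [0, 100]; assumed to be given
    bandwidth: float   # any consistent unit, e.g. Mbps


def cpu_capacity(node):
    """Cpu_capacity = (#CPUs * #cores * speed) * (1 - load)."""
    return node.cpus * node.cores * node.cpu_speed * (1.0 - node.cpu_load)


def share(value, total):
    """Value as a percentage of total; 0 when the total is 0."""
    return 100.0 * value / total if total else 0.0


def overall_ranks(nodes):
    """Return {node name: overall_rank} using the weighted sum from the text."""
    cpu_total = sum(cpu_capacity(n) for n in nodes)
    free_total = sum(n.mem_free for n in nodes)
    load_total = sum(n.mem_load for n in nodes)
    bw_total = sum(n.bandwidth for n in nodes)
    ranks = {}
    for n in nodes:
        ranks[n.name] = (
            WEIGHTS["cpu"] * share(cpu_capacity(n), cpu_total)
            + WEIGHTS["mem_free"] * share(n.mem_free, free_total)
            + WEIGHTS["mem_pressure"] * share(100.0 - n.mem_load, load_total)
            + WEIGHTS["bandwidth"] * share(n.bandwidth, bw_total)
        )
    return ranks


def migration_target(ranks, current):
    """Name of the best node if it beats the current node by the margin, else None."""
    best = max(ranks, key=ranks.get)
    if best != current and ranks[best] > ranks[current] + MIGRATION_MARGIN:
        return best
    return None


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    sample = [
        Node("node-a", 1, 2, 2.0, 0.5, 1000.0, 60.0, 100.0),
        Node("node-b", 1, 2, 2.0, 0.0, 3000.0, 40.0, 100.0),
    ]
    ranks = overall_ranks(sample)
    print(ranks, migration_target(ranks, "node-a"))
```

Keeping the weights and the margin as named constants makes it easy to experiment with other values, which is the open question raised in items 1 and 2 above.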
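
The nodes.txt hand-off between the Commander Agent and the graphics application, described under Implementation, can be sketched as below. The report does not specify the file layout, so this sketch assumes one host name per line followed by a final "end" line; the function names, the polling interval, and the colour helper are hypothetical illustrations, written in Python rather than in the language of the actual tools.

```python
import time


def append_node(path, host):
    """Writer side: record the host the Sentinel Agent has just moved to."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(host + "\n")


def follow_nodes(path, poll_interval=0.5):
    """Reader side: yield each recorded host until the "end" marker is read."""
    with open(path, "r", encoding="utf-8") as f:
        pending = ""
        while True:
            chunk = f.readline()
            if not chunk:
                # Nothing new yet; wait and try again.
                time.sleep(poll_interval)
                continue
            pending += chunk
            if not pending.endswith("\n"):
                # Partially written line; wait for the rest of it.
                continue
            entry = pending.strip()
            pending = ""
            if entry == "end":
                return
            if entry:
                yield entry


def node_colors(visited):
    """Green for hosts already left, red for the host currently executing."""
    colors = {host: "green" for host in visited[:-1]}
    if visited:
        colors[visited[-1]] = "red"
    return colors
```

A viewer would iterate over `follow_nodes("nodes.txt")`, append each host to a list, and redraw using `node_colors` after every new entry. Polling a plain file keeps the writer and the viewer decoupled, at the cost of a redraw delay of up to one polling interval.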