Resource Management in Volunteer Computing Grids
An analysis of the different approaches to maximizing throughput on a BOINC grid
Presented by Geoffrey Oxholm and Beata Chrulkiewicz
CS-575 Position Paper Presentation
Fall 2007

Volunteer Grids
• A Type of Grid Computer
  – Decentralized, volunteer nodes
• Supercomputing for free
  – 1.1 PetaFLOPS vs. 360 TeraFLOPS
• Unreliable Nodes
  – Users can disconnect their computers anytime
  – Amount of donated resources is subject to change
  – Evil jerks can upload malicious data
Image: http://www.di.unipi.it/groups/architetture/images/grid.gif
Image: http://holistic.com.mt/h/?Page=Article&Ref=107

Berkeley Open Infrastructure for Network Computing
• Duplicate work to ensure validity
  – R – The “Redundancy Factor”
• Validate computation results. If the validation fails, repeat computation.
  – Validation Methods:
    • Majority Voting – More than R/2 nodes must agree
    • M-First Voting – First M nodes must agree
Image: http://en.wikipedia.org/wiki/Image:BOINC_logo_July_2007.png

Success and Limitations of BOINC
• With proper configuration high throughput can be achieved
• Still quite difficult to get volunteers
• Proper configuration is difficult
• Fixed configurations cannot account for constantly changing grid characteristics
Image: http://www.baseacid.com/imagesRR/workBand.jpg

Fix: User Encouragement
Feedback and Reward
• Each node generates statistics
• Teams can be formed
• Sense of pride in commitment
• Encourages users to donate more time, resources
Team OCUK Predictor@home total credit. Go team!
Image: http://teamocuk.com/cprojectcred1.php?p=PAH

Fix: Maximizing Configuration Through Usage Simulation
• Enumerate a set of possible configurations
• Test configurations in a fraction of the time
• Avoid disturbing volunteers by simulating
• Zero in on an effective configuration
Image: http://www.cyberroach.com/tron/tron3_circuit.jpg

Fix: Dynamic Redundancy Through Reliability Prediction
• Wait for a minimum number of nodes before assigning work
• Choose nodes which have higher reliability
• Higher reliability means less need for redundancy
• Successful completion yields higher reliability rating for the node
Image: http://image.compusa.com/prodimages/44/8537c95c-8027-4840-b976-67deb0690e13.gif

Evaluation
• User Encouragement
  – Encourages cheating
  – Does nothing to maximize efficient use of resources
• Usage Simulation
  – Still requires researchers to configure system
  – Static configuration fails to match dynamic grid
• Reliability Rating
  – Subject to further exploitation
  – Further minimizes the value of slow nodes, working against incentives
Image: GPL Licensed

Conclusion
• Build on existing methods
  – Continue to encourage users
  – Create a starting point by using simulation
  – Update reliability system to avoid conflict with system of incentives
• Develop new technologies
  – Blacklist malicious nodes
  – Develop a more comprehensive reliability system which uses past schedules to predict future availability
Image: http://pixels.dessgeega.com/wp-content/uploads/2006/10/organize_big.gif

Questions?
Image: http://www.grid.phys.uvic.ca/
Geoff Oxholm
Beata Churkiewicz
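
Supplementary sketch: the two validation methods listed on the BOINC slide (Majority Voting and M-First Voting) might be expressed roughly as follows. This is a minimal illustration in Python, not BOINC's actual validator code. The function names `majority_vote` and `m_first_vote` are hypothetical, results are assumed to be directly comparable values, and M-First Voting is read here as accepting the first value that M returned results agree on.

```python
from collections import Counter
from typing import Hashable, Iterable, Optional, Sequence


def majority_vote(results: Sequence[Hashable], r: int) -> Optional[Hashable]:
    """Return the value reported by more than r/2 nodes, or None if there is none.

    `r` is the redundancy factor R: the number of nodes given the same work unit.
    """
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > r / 2 else None


def m_first_vote(results: Iterable[Hashable], m: int) -> Optional[Hashable]:
    """Return the first value to collect m matching reports, in arrival order.

    Returns None if no value reaches m matches (assumed reading of M-First Voting).
    """
    counts: Counter = Counter()
    for value in results:
        counts[value] += 1
        if counts[value] >= m:
            return value
    return None
```

With R = 3 and returned results `[42, 42, 7]`, `majority_vote` accepts 42 because two agreeing nodes exceed R/2 = 1.5. A return value of `None` corresponds to a failed validation, in which case the slide's rule is to repeat the computation.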