Managing Scientific Computing Projects
Erik Deumens
QTP and HPC Center
Sep 13, 2006

Overview
- What is a scientific computing project?
- Procedures to manage scientific computing projects

Commodity computing
- E-mail
- Web access
- Writing: papers, letters, thesis, presentations, web content
- Drawing: graphs, figures, plots
- Calculating: spreadsheets, Mathematica, Maple, SAS, Matlab

Science and Engineering
- Computing with software
  - Physics: VASP, WIEN
  - Chemistry: Gaussian, Q-Chem
  - Engineering: ANSYS
- Developing software
  - Programming
  - Prototyping
  - Debugging
  - Performance analysis

Scientific Computing Project
- Significant human effort
- Many steps with dependencies
- Takes a long time on one computer or many computers to complete
- Involves a lot of data
  - Input given to be processed
  - Intermediate data for the computation
  - Output produced to be analyzed

Example SCP
- Test a set of model parameters
  - Given basic parameters B_n
  - Compute dependent values D_j
  - Compare to test values T_j
- If the number of dependent and test value sets is large, say 1,000
- And each computation takes time, say 1 h
- Then this is a project

Recognizing SCP
- Act from the early stages as if it is an SCP
- Then procedures are tested and reliable by the time the science of the project becomes harder and requires all attention

Reliability of modern computers
- Computers, networks and software are
  - Very stable
  - Very powerful
- Leads to the widespread belief that they are
  - Infinitely stable
  - Infinitely powerful
- Probability of failure
  - Small chance times lots of work = big chance

Overview
- What is a scientific computing project?
- Procedures to manage scientific computing projects

Manage an SCP
- Project analysis
  - Data
  - Computation
- Develop strategy
  - Organize the computation
  - Manage the data
- Automation
  - Avoid human errors
  - Protect against disasters

Project analysis
- Often a project starts small
- Once you decide the project is worthwhile, perform a project analysis
  - Data: before, during, after
  - Computation: how many, how long
  - Precautions: minimize effect of disasters

Develop strategy
- Organize the computation
  - Choose computer system
  - Study scheduling system
  - Match the project computation flow onto the scheduling policies
- Manage the data
  - Input files generated by hand? By machine?
  - Space for large intermediate files
  - Space for output files

Automation
- Extra tools needed to manage the project?
  - Generate input files from a database?
    - Write scripts?
    - Use a tool already developed?
  - Generate scheduler command files?
    - Does a tool exist?
    - Some tools are very complex. Is it easier to write scripts than to learn the tool?
  - Collect data from output files into a database?
    - Write scripts?
    - Write a compiled program?

Automation
- Computation and data monitoring
  - Check status of each run
  - Submit the job again if it failed
  - Check correctness and integrity of output data
  - Even if the job finished
    - it may have generated an error message
    - there may be no result
    - or the result may be invalid or incorrect

Precautions
- Prepare for some disasters
  - Some or all computed data is lost or corrupted?
    - Make sure all files created manually are on disks that are backed up; at least you can run the computations again
  - Some output has been processed
    - Make sure partial results are on disks that are backed up

Growing projects
- Often projects start small
  - Procedures are developed and used
  - They work well for 1,000 cases
- Then the scope is increased
  - After partial success
  - Procedures are used unchanged
  - They do not work for 1,000,000 cases!
- Must perform a new analysis when the scope changes

Tool choices
- Small operations
  - Scripts are easy to write and change
  - Run fast for small numbers
- Large operations
  - Running a script 10,000,000 times may be very slow and cause unexpected side effects
  - Investigate better tools
    - Program in a compiled language
    - Use a database instead of simple files

Conclusion
- A little bit of thought can save you from a lot of trouble and extra work
- Every scientific computation project that is worth doing is worth a little bit of thought about how to do it.
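
Illustrative sketch
Two points from the slides lend themselves to a short Python sketch: "small chance times lots of work = big chance", and the monitoring steps (check each run, flag it for resubmission if it failed, validate the output even when the job finished). Everything specific here is an assumption made for illustration and does not come from the slides: the file naming `run_NNNN.out`, the `ERROR` marker, the convention that the last token of an output file is the numeric result, the 0.1% per-run failure rate, and the assumption that runs fail independently. Under those assumptions, 1,000 runs have roughly a 63% chance of at least one failure.

```python
import math
from pathlib import Path


def chance_of_any_failure(p_single: float, n_runs: int) -> float:
    """Probability that at least one of n_runs fails.

    Assumes every run fails independently with probability p_single.
    """
    return 1.0 - (1.0 - p_single) ** n_runs


def classify_run(output: Path) -> str:
    """Return 'missing', 'error', 'invalid', or 'ok' for one output file.

    Hypothetical conventions: an 'ERROR' string marks a failed run, and
    the last whitespace-separated token of the file is the result.
    """
    if not output.exists() or output.stat().st_size == 0:
        return "missing"  # the job finished (or died) without a result
    text = output.read_text()
    if "ERROR" in text:
        return "error"  # the job finished but reported an error
    try:
        value = float(text.split()[-1])
    except (ValueError, IndexError):
        return "invalid"  # no parsable result in the file
    if not math.isfinite(value):
        return "invalid"  # NaN or infinity is not a usable result
    return "ok"


def runs_to_resubmit(out_dir: Path, n_runs: int) -> list[int]:
    """Indices of runs whose output is not 'ok' and should be run again."""
    return [
        i
        for i in range(n_runs)
        if classify_run(out_dir / f"run_{i:04d}.out") != "ok"
    ]


if __name__ == "__main__":
    # Illustrative numbers only: 1,000 runs, 0.1% failure chance per run.
    print(f"{chance_of_any_failure(0.001, 1000):.3f}")
```

The list returned by `runs_to_resubmit` is meant to be fed to whatever scheduler the project uses; the actual submission command is left out because it depends on the chosen system. Listing a directory once and checking every expected file, rather than launching one script per run, is also a small step toward the "large operations" advice above.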