Introduction to Parallel Processing Transcript

Introduction to Parallel Processing Transcript was developed by Michelle Buchecker. Additional contributions were made by Cheryl Doninger, Glenn Horton, Merry Rabb, and Christine Riddiough. Editing and production support was provided by the Curriculum Development and Support Department.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.

Introduction to Parallel Processing Transcript
Copyright © 2009 SAS Institute Inc. Cary, NC, USA. All rights reserved. Printed in the United States of America. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, or otherwise, without the prior written permission of the publisher, SAS Institute Inc.

Book code E1411, course code RLSPCN06, prepared date 31Mar2009. RLSPCN06_001
ISBN 978-1-59994-969-7

Table of Contents
Lecture Description
Prerequisites
Introduction to Parallel Processing
1. Basics of Parallel Processing
2. Threading

Lecture Description
This is the first e-lecture of a five-lecture series on parallel processing with SAS/CONNECT software.
This lecture teaches you the basics of using SAS/CONNECT software to perform parallel processing on a single machine or across multiple machines.

To learn more…
For information on other courses in the curriculum, contact the SAS Education Division at 1-800-333-7660, or send e-mail to [email protected]. You can also find this information on the Web at support.sas.com/training/ as well as in the Training Course Catalog.

For a list of other SAS books that relate to the topics covered in this Course Notes, USA customers can contact our SAS Publishing Department at 1-800-727-3228 or send e-mail to [email protected]. Customers outside the USA, please contact your local SAS office. Also, see the Publications Catalog on the Web at support.sas.com/pubs for a complete list of books and a convenient order form.

Prerequisites
Before listening to this lecture, you should be able to
• write DATA and PROC steps
• understand error messages in the SAS log and debug your programs.

Introduction to Parallel Processing

Welcome to the Introduction to Parallel Processing e-lecture. My name is Michelle and I will be your instructor for this lecture. I have been an instructor for SAS for over 15 years and my specialties include SAS/CONNECT software.

Lectures Available
• Introduction to Parallel Processing
• Using Parallel Processing on a Single Machine (Scaling Up)
• Using Parallel Processing Across Multiple Machines (Scaling Out)
• Pipeline Parallelism
• Managing Asynchronous Execution

This lecture series consists of five separate lectures. The first lecture is Introduction to Parallel Processing. The second lecture is Using Parallel Processing on a Single Machine (Scaling Up). The third lecture is Using Parallel Processing Across Multiple Machines (Scaling Out). The fourth lecture is Pipeline Parallelism. And the fifth lecture is Managing Asynchronous Execution. This is the first lecture in the series. We encourage you to listen to all five lectures to get a full understanding of how to perform parallel processing.

Introduction to Parallel Processing
1. Basics of Parallel Processing
2. Threading

In this lecture you will learn the basics of parallel processing, which is part of SAS/CONNECT software, as well as threading, which is part of Base SAS software.

So let’s start off with the basics.

1. Basics of Parallel Processing

Objectives
• Define parallel processing concepts.
• Identify the benefits of parallel processing.
• Define independent parallelism.

In this section you will learn about the basics of parallel processing, what the benefits are, and what is meant by independent parallelism.

Technology Yesterday

In years past, life was simple. Each computer had only one CPU and was not connected to another computer. Programs ran faster on the machine that had the fastest processor, assuming no one else was using the processor at that time. When you submitted your program, it ran from top to bottom, one step at a time. We didn’t have a word for it at the time, but now we call this technique synchronous processing. That is what happened in SAS 6.
Synchronous Processing
[Figure: timeline of DATA Step A, PROC SORT A, DATA Step B, and PROC SORT B against elapsed time.]

Synchronous processing is when tasks execute one after the other. In this example, I need to run a DATA step to read in data to create data set A, and then sort data set A. Once that completes, the DATA step that reads different data to create data set B starts, and then that data set is sorted. So the total elapsed time is the sum of the elapsed times of all of the steps.

Parallel Processing Using MP CONNECT
Multiprocess (MP) CONNECT enables SAS tasks or sessions to execute in parallel, and coordinates and synchronizes all results into a single parent (or client) session. This simultaneous processing can occur on:
• a single machine that has multiple processors
• multiple machines that are networked together
• a combination of both
MP CONNECT is part of SAS/CONNECT software.

The purpose of MP CONNECT is to reduce overall elapsed time by having multiple SAS sessions execute code at once and then synchronizing the results in the parent, also known as the client, SAS session. This parallel processing allows multiple tasks to be executed at the same time. This simultaneous processing can occur on a single machine that has multiple processors, multiple machines that are networked together, or a combination of both. If you are doing simultaneous processing of SAS programs and coordinating the results on just one machine, then SAS/CONNECT must be installed on that machine.

Benefits of Parallel Processing
The purposes of parallel processing (also known as multiprocessing or asynchronous processing) include the following:
• execute independent tasks in parallel (SAS® 8)
• execute select dependent tasks in parallel (SAS®9)
• take advantage of multiple processors on a single symmetric multiprocessing (SMP) machine
continued...

With parallel processing you can execute independent tasks in parallel starting in SAS 8, or dependent tasks in parallel using pipeline parallelism in SAS®9.
Both cases require SAS/CONNECT software. This allows you to take advantage of a machine with multiple processors on it, also known as an SMP or symmetric multiprocessing machine.

Benefits of Parallel Processing
• take advantage of each processor on a network of machines
• complete a job in less total elapsed time than it would take to execute the same job serially
• increase usage of underutilized CPUs
  – exploit current investment
  – prevent further monetary outlay for hardware

If you don’t have an SMP machine, that’s OK. Just farm out the processing to multiple machines on your network to use the benefits of parallel processing. Either way, the whole goal of parallel processing is to reduce the overall elapsed time that a job takes to run. This way you can take advantage of underutilized CPUs and don’t have to buy additional hardware.

Networked Machines
[Figure: a central parent machine connected to Machine 1 through Machine 5.]

Most of us don’t work 24 hours a day, so your computer is idle at your desk while you are getting some much deserved rest, and the same is true of your coworkers. Parallel processing helps you take advantage of that computer downtime. This is an example of grid computing, which is simply a network of computers that are used to do parallel processing to solve one application problem.

Grid Enabling a SAS Application
[Figure: a SAS program and its input data are split into sub-tasks, each sub-task runs on a grid node and produces a partial result, and the partial results are joined into the final result.]

This diagram shows the idea behind executing a long-running SAS job on a grid. The first step is to split the job into multiple sub-tasks. These sub-tasks are distributed to the nodes of the grid. Executing these sub-tasks results in many partial result sets that are then aggregated or joined together at the end to form the final result.
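As a rough illustration only, the split, sub-task, and join flow described above can be sketched in Python. This is not SAS code and it is not how SAS/CONNECT or SAS Grid Manager works internally. The function names (split, sub_task, join, grid_mean) and the choice of a mean as the job are hypothetical, and threads simply stand in for grid nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split(rows, n_parts):
    """Cut the input into n_parts sub-tasks of roughly equal size."""
    return [rows[i::n_parts] for i in range(n_parts)]


def sub_task(part):
    """Hypothetical unit of work: a partial sum and count for one slice."""
    return sum(part), len(part)


def join(partials):
    """Combine the partial results into the final result (an overall mean)."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


def grid_mean(rows, n_nodes=4):
    # Each worker thread plays the role of one grid node (an assumption
    # made for this sketch; a real grid would use separate machines).
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(sub_task, split(rows, n_nodes)))
    return join(partials)


print(grid_mean(list(range(1, 101))))  # prints 50.5
```

The point of the sketch is the shape of the job: no sub-task needs anything from another sub-task, so the only coordination happens in the final join.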
SAS implements an advanced form of this grid environment by using SAS Grid Manager and the Platform Suite for SAS, which are separate components of the SAS Enterprise Business Intelligence Server. This course does not go into more detail on grid computing using the SAS Grid Manager, but if you are interested there is a recorded lecture series devoted to that topic in detail.

Note: Some type of software or grid middleware is needed in order to accomplish this parallel distributed execution and aggregation of results. The most basic grid environment consists of a SAS application that utilizes the parallel distributed processing capabilities of SAS/CONNECT to run sub-components of an application in parallel across nodes in a grid. This can be accomplished by utilizing the %DISTRIBUTE macro, which serves as a wrapper for the SAS/CONNECT functionality and provides simple, yet very powerful, load balancing of the workload across the nodes in the grid. This can result in performance gains in the range of 95-99% for those applications that consist of many replicate runs of the same fundamental task (such as applications doing BY-group processing) or tasks made up of multiple independent units of work (such as large-scale parallel scoring). A more advanced grid environment consists of using SAS Grid Manager and the Platform Suite for SAS, which are separate components of the SAS Enterprise Business Intelligence Server.

Independent Parallelism
Today a computer can have multiple processors, or be part of a network. SAS 8 software has the ability through MP CONNECT to take advantage of additional processors for different steps.
[Figure: Data Source A followed by PROC SORT A, alongside Data Source B followed by PROC SORT B, against elapsed time.]

Starting in SAS Version 8, SAS introduced parallel processing. Parallel processing was available in SAS 8 as long as the steps you wanted to execute in parallel were independent of one another.
Independent parallelism is when separate tasks execute without any interdependencies. For instance, I have a DATA step to read in data to create data set A and then sort data set A. While that is happening, I can also run a DATA step to read in different data to create data set B and sort that. Because the processing for data set A has nothing to do with the processing for data set B, this is called independent parallelism. This reduces the overall elapsed time it takes the code to run. Once the longest task has finished, in our case processing the B data, I can merge the two data sets together.

To take advantage of this type of processing, which is known as MP CONNECT, you need to have multiple CPUs, whether they are multiple CPUs on the same physical machine or single CPUs on a number of different machines. Either way, SAS/CONNECT software is still required on those machines. So even if you have a single machine with two CPUs and you want to run SAS steps in parallel, you must have SAS/CONNECT on that one machine.

Sample Business Scenarios
Uses for multiprocessing for independent parallelism tasks include:
• multiple analyses of a single SAS data set
• concurrent creation of multiple summary data sets
• concurrent extraction of data from multiple data sources to be loaded into a data warehouse

So what are some uses of parallel processing? Well, one use is to perform multiple analyses of the same SAS data set. You can also use it to create multiple data sets at the same time, as well as to grab data from different data sources to populate a data warehouse.

Multiple Analyses of a Single SAS Data Set
[Figure: FREQ Procedure, TABULATE Procedure, and UNIVARIATE Procedure running side by side against elapsed time.]

So let’s say I have a single data set and wish to do a PROC FREQ, a PROC TABULATE, and a PROC UNIVARIATE on that data.
Using parallel processing, I can have these steps run simultaneously and the overall elapsed time is only as long as it takes the longest PROC to run.

Concurrent Creation of Multiple Summaries
[Figure: Create Summary1, Create Summary2, through Create Summaryn running side by side against elapsed time.]

Similarly, if I had to read the same data set and create multiple summaries of the data, this would be a good use of parallel processing.

Concurrent Data Extraction
[Figure: Extract Oracle Data, Read/Summarize SAS Data Set, and Read/Summarize Raw Data File running side by side, followed by Merge Data, against elapsed time.]

And lastly, if I had an Oracle table, a SAS data set, and a raw data file that all needed to be read, manipulated, and loaded into a data warehouse or data mart, I can use the parallel processing techniques to do the reading and manipulating parts at the same time, and when the last one completes I will add an instruction to merge the data together.

Divide and Conquer
Distribute processing for jobs that consist of many replicate runs of a fundamental task:
• Monte Carlo methods (such as simulating statistical tests)
• global integer optimization (such as searching for optimal designs)
• mining massive data sets

So with asynchronous execution we can take a “divide and conquer” approach to performing tasks against our data that may consist of replicate runs, for example, Monte Carlo simulation methods, global integer optimization, and mining massive data sets.

Divide and Conquer
You might want to divide a program into many smaller sub-units to take advantage of faster processors and repeatedly send these smaller sub-units to idle processors until the entire program is complete.

One approach is to divide the program into equal pieces and submit each piece separately to whatever processor is available.
But if I have a fast processor, a medium processor, and a slow processor, the fast processor finishes quickly and then sits idle while your program waits on the slowest processor.

An alternative approach is to break the program into even smaller chunks and send the first three chunks to the three idle processors. Then when one processor is finished, send the next piece of code to that processor. And when the next processor finishes, send the next piece of code in line to that one. And continue repeating this process until all of the subunits of the program have been processed.
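To see why the smaller chunks help, here is a small Python simulation of the two approaches. It is only a sketch and has nothing to do with how SAS dispatches work. The processor speeds, the amount of work, and the function names are invented for the example, and time is computed arithmetically rather than measured.

```python
import heapq


def static_split_time(total_work, speeds):
    """Elapsed time when the work is cut into one equal piece per processor."""
    piece = total_work / len(speeds)
    # The job is finished only when the slowest processor finishes its piece.
    return max(piece / speed for speed in speeds)


def dynamic_dispatch_time(total_work, speeds, n_chunks):
    """Elapsed time when each small chunk goes to the next processor to become idle."""
    chunk = total_work / n_chunks
    # Heap of (time at which the processor becomes idle, processor index).
    idle_at = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(idle_at)
    finish = 0.0
    for _ in range(n_chunks):
        start, i = heapq.heappop(idle_at)   # first processor to become idle
        done = start + chunk / speeds[i]
        finish = max(finish, done)
        heapq.heappush(idle_at, (done, i))
    return finish


# Hypothetical speeds in work units per second: fast, medium, slow.
speeds = [4.0, 2.0, 1.0]

print(static_split_time(120, speeds))          # prints 40.0
print(dynamic_dispatch_time(120, speeds, 12))  # prints 20.0
```

With three equal pieces, the job takes as long as the slow processor needs for its third of the work. With twelve small chunks, the fast processor ends up doing most of them, and in this made-up case the elapsed time is cut in half.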
Considerations
When working with multiprocessing applications, consider the following:
• overhead
• load factor
• requirements on the I/O subsystem
• interdependencies of data between steps
• dividing the data

Parallel processing is not a one-size-fits-all solution. There are many factors to take into consideration. Specifically, when doing parallel processing SAS is actually starting up additional SAS sessions on your computer, so there is some amount of overhead as those sessions start up. Also, if your machine is already heavily utilized, it won’t have the extra capacity to help with the parallelism. Parallel processing often uses more I/O, and you have to carefully think through the dependencies of your data as it goes from step to step. You could also choose to divide a large data set into smaller ones, do the processing on those, and then bring the data back together.

Dependent Parallelism
[Figure: timeline of DATA Step A, PROC SORT A, and PROC PRINT A against elapsed time.]

Regardless of whether you use sequential processing or parallel processing of independent steps, the data between steps still needs to be written out to disk. In SAS®9, SAS introduced pipeline parallelism, which removes the need to write the data to disk between dependent steps. This process allows you to pipe the data directly from one step to the next through a TCP/IP pipe and allows some steps to work in parallel. Since you are no longer writing the data to disk, you are saving both disk space and I/O time. Pipeline parallelism is part of SAS/CONNECT software.

Dependent parallelism is when different steps that execute at the same time work on the same data. In this example, the sort for data set A starts shortly after the DATA step has started, but before the DATA step has finished. The two steps work on the same data, so this is dependent parallelism. To achieve this, you will use pipeline parallelism techniques.
Because the PROC PRINT step has to have all of the data before it can start, it cannot overlap with the PROC SORT. This topic is covered in more detail in the Pipeline Parallelism e-lecture.

2. Threading

Introduction to Parallel Processing
1. Basics of Parallel Processing
2. Threading

Now that you have an overview of parallel processing, let’s take a look at threading.

Objectives
• Define how threading works.
• Specify tuning options.
• Compare benchmarks.

In this section you will learn what threading is, what tuning options are available, and see some benchmark statistics comparing threading versus non-threading.

Threading
Starting with SAS®9, some individual steps have the ability to take advantage of additional processors.
[Figure: step timelines compared for SAS Version 6, SAS Version 8 (independent parallelism), and SAS®9 (threading parallelism, Base SAS software).]
http://support.sas.com/rnd/scalability/procs/index.html

Starting with SAS®9, a single step can take advantage of multiple CPUs and run in a threading environment. This is a type of parallel processing. Threading takes advantage of multiple CPUs by dividing processing among the available CPUs. Even if your site does not use an SMP machine, some types of threading can be performed using a single CPU. So if my PROC SORT on data set A is running on a multiple processor machine, SAS can run part of the sort on one processor and part on another processor and coordinate the results. This is part of Base SAS software. Only a select number of PROC steps support this technology. Those steps can be found at the Web address shown on this slide.
Note: Multi-threading takes advantage of SMPs and provides performance gains for two types of SAS processes:
• threaded I/O
• threaded application processing

Threaded I/O aids applications that can process data faster than the data can be delivered to the application. When an application cannot keep the available CPUs busy, the application is said to be I/O-bound. Threaded application processing aids applications that receive data faster than they can perform the necessary processing on that data. These applications are referred to as CPU-bound. Threaded technology and multiple CPUs alleviate these problems.

Threading is part of Base SAS and is implemented for the following SAS procedures:
• MEANS/SUMMARY
• REPORT
• SORT
• SQL
• TABULATE
• REG
• ROBUSTREG
• GLM
• LOESS
• DMREG
• DMINE

Multi-Threaded Processing
[Figure: PROC SORT DATA=A executing simultaneously on the CPUs of an SMP machine.]

So let’s say that I have an SMP machine that has three processors on it, and I submit a PROC SORT, which is one of the threaded procedures. The operating system can then use one CPU to start sorting some of the data, a second CPU to sort more of the data, and the third CPU to sort the rest of the data.

Eventually all of that data must be combined back together, so one CPU will handle that part. Now, you as a programmer do not get to control whether this happens. The operating system may have other tasks that it needs to work on, so one CPU may be dedicated to working on something unrelated to your program. The threaded procedures just give the operating system the option to split the one PROC step across multiple CPUs.
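The sort-the-parts-then-combine idea can be pictured with a short Python sketch. This is only an analogy and makes no claim about how PROC SORT is actually implemented. The function name threaded_sort and the three-worker default are assumptions for the example, and the threads here show the structure of the work rather than a real speedup.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def threaded_sort(values, n_workers=3):
    """Sort slices of the data in separate workers, then merge the sorted slices."""
    # Ceiling division gives the slice size; use 1 when the input is empty.
    size = -(-len(values) // n_workers) or 1
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    # Each worker sorts one slice, like each CPU sorting part of the data.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))
    # A single final pass combines the sorted slices back together.
    return list(heapq.merge(*sorted_chunks))


print(threaded_sort([9, 4, 7, 1, 8, 2, 6, 3, 5]))  # prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Notice that the merge at the end runs in one place, which mirrors the point above that one CPU handles putting the data back together.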
Tuning Options for Threaded Procedures
Options to control tuning include:
• CPUCOUNT option – controls number of CPUs to utilize; does not control number of threads
• NOTHREADS option in the PROC statement – prevents a thread-enabled procedure from using multiple threads

Now if you are sharing a machine like a UNIX box with others in your organization, they may not be too thrilled with you if you start taking over all of the processors. You may need to tune your session to prevent it from monopolizing the machine. You can restrict the number of CPUs by using the CPUCOUNT= option. If you prefer your PROC step to run single-threaded, you can use the NOTHREADS option in that PROC statement.

Tuning Options for Threaded Procedures
The number of CPUs to use for processing in each SAS session can be controlled with the CPUCOUNT system option.

OPTIONS CPUCOUNT=1-1024 | ACTUAL ;

1-1024 is the number of CPUs that SAS assumes are available for use by thread-enabled applications.
ACTUAL is the number of CPUs that SAS detects are available for a specific session (default).

The syntax for the CPUCOUNT= option is OPTIONS CPUCOUNT= and then a number that ranges from 1 to 1024. So if you specify CPUCOUNT=2, then a maximum of two CPUs are used by that SAS session. You could also specify CPUCOUNT=ACTUAL, and that means however many CPUs are on that machine, that’s how many I want. This is the default value. If you use a number that is greater than the actual number of CPUs on that machine, then SAS does not magically create extra processors, but changes the count back to ACTUAL.

How Much Better Is Threading?
PROC SORT

So we ran some benchmarks just to see how much better threading was than not threading.
We varied our data using anywhere from 1 million to 10 million observations in a data set, and then ran those data sets through a PROC SORT with four variables on our BY statement using a four-way UNIX machine. There really wasn’t much difference with fewer than two million observations. But after that, notice a remarkable improvement in overall elapsed time when we used threading in SAS 9.1.3 (the bottom line). When we re-ran the code and used the NOTHREADS option (the top line), the result was pretty similar to what happened when we ran it in SAS 8.2.

How Much Better Is Threading?
PROC SUMMARY

Now we ran that same basic scenario but this time using PROC SUMMARY and four class variables. Notice how even at only one million observations we see an improvement in overall elapsed time. And the elapsed time stays remarkably low even with ten million observations, whereas the non-threaded runs in 9.1.3 and 8.2 are very similar to each other, and their elapsed time climbs more steeply as the number of observations grows.

Threading and Independent Parallelism
[Figure: SAS®9 threading and independent parallelism. DATA Step A and DATA Step B run in parallel through SAS/CONNECT software, and PROC SORT A and PROC SORT B are each threaded through Base SAS software.]

Even with threading, of course you can still perform independent parallelism in SAS®9. So I can process my A data and B data at the same time, including any PROC steps that are thread-enabled. However, because we are performing independent parallelism, this still requires SAS/CONNECT software. So to recap, threading means that a single PROC step is taking advantage of multiple CPUs on a machine and is part of Base SAS, while asynchronous execution of different steps is taking advantage of multiple CPUs on a single machine, or single or multiple CPUs on more than one machine, and is part of SAS/CONNECT software.

Summary

Traditionally each computer operates as a stand-alone unit without sharing processing power.
By using the capabilities in parallel processing, the processors can work together simultaneously to reduce the overall elapsed time of your job.

Credits
Introduction to Parallel Processing was developed by M. Michelle Buchecker. Additional contributions were made by Cheryl Doninger, Glenn Horton, Merry Rabb, and Chris Riddiough.

That concludes the Introduction to Parallel Processing e-lecture.

Comments?
We would like to hear what you think.
• Do you have any comments about this lecture?
• Did you find the information in this lecture useful?
• What other e-lectures would you like to see SAS develop in the future?
Please e-mail your comments to [email protected]

If you have any comments on this lecture or other lectures you would like to see, please e-mail [email protected].

Copyright
SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.
Copyright © 2009 by SAS Institute Inc., Cary, NC 27513, USA. All rights reserved.

Thank you for your time.