Marchand, Bajic, Kaushik KAUST Oct 2011

Master Thread
SendAll(MOTIF_NEWLGH)
SendAll(MOTIF_SCAN)
GetAll(MOTIF_REDNORM)
SendAll(MOTIF_SCAN)
GetAll(MOTIF_REDNORM)
GetAll(MOTIF_REDPWM)
SendAll(MOTIF_SAVEVARS)
GetAll(MOTIF_GETVARS)
SendAll(MOTIF_PATELIM)
SendAll(MOTIF_STOP)

[Diagram: the master thread issuing the calls above to the slave threads over time]

Instead, we keep the sequences permanently resident at the processing element and send it processing “triggers” (i.e. master-slave). Since sequences can be searched and modified independently, they are spread across multiple processing elements (i.e. parallel master-slaves). Note that the processing elements maintain state information and may perform asynchronous operations in between receiving triggers.

Now the maximum speedup is 239.6x over 256 cores (MPI-OpenMP).

[Figure: Run Time (msec) vs. # CPUs, series Pure MPI and MPI-OpenMP]

[Figure: Speedup vs. # CPUs (256 to 65536), series Pure-MPI and MPI-OpenMP]

MPI Overheads

#CPUs: 256 512 1024 2048 4096 8192 16384 32768 65536
Pure MPI, Initial: 18112 9072 4799 2375 11655 56369 279519 1156811
MPI-OpenMP, Initial: 541 109 80 78 247 2289 10124 49814 255710
Pure MPI, Grouped: 9353 18444 9278 4734 2892 3493 12614 43995 44782
MPI-OpenMP Grouped, Pure MPI MultiLevel, MPI-OpenMP MultiLevel (values as listed in the source): 328 36 643 148 43 133 82 5425 3316 68 3466 1497 60 1765 1918 154 856 404 722 451 424 2807 224 148 13692 29 28
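The trigger-driven master-slave scheme described above can be illustrated with a small sketch. It is not the authors' implementation: it uses Python threads and queues in place of MPI ranks, and only the trigger names (MOTIF_SCAN, MOTIF_REDNORM, MOTIF_STOP) come from the slides. What each trigger does here, the Worker class, the send_all/get_all helpers, and the toy sequences are all assumptions made for illustration. The point it shows is that each worker keeps its sequences and its partial result between triggers, so the master only sends small commands and collects small replies.

```python
import queue
import threading

# Trigger names are taken from the slides; their behaviour below is assumed.
MOTIF_SCAN = "MOTIF_SCAN"
MOTIF_REDNORM = "MOTIF_REDNORM"
MOTIF_STOP = "MOTIF_STOP"


class Worker(threading.Thread):
    """Hypothetical slave: owns a shard of sequences and keeps state between triggers."""

    def __init__(self, sequences):
        super().__init__()
        self.sequences = sequences    # data stays resident at the worker
        self.inbox = queue.Queue()    # triggers from the master
        self.outbox = queue.Queue()   # replies to the master
        self.score = 0                # state retained across triggers

    def run(self):
        while True:
            trigger, payload = self.inbox.get()
            if trigger == MOTIF_STOP:
                break
            if trigger == MOTIF_SCAN:
                # Assumed meaning: count occurrences of a pattern in the local shard.
                self.score = sum(seq.count(payload) for seq in self.sequences)
            elif trigger == MOTIF_REDNORM:
                # Assumed meaning: report the locally stored partial result.
                self.outbox.put(self.score)


def send_all(workers, trigger, payload=None):
    """Send one trigger to every worker without waiting for a reply."""
    for w in workers:
        w.inbox.put((trigger, payload))


def get_all(workers, trigger):
    """Send a trigger to every worker and collect one reply from each."""
    send_all(workers, trigger)
    return [w.outbox.get() for w in workers]


def run_demo():
    shards = [["ACGTACGT", "TTACGA"], ["ACGACG", "GGGG"]]  # toy data
    workers = [Worker(shard) for shard in shards]
    for w in workers:
        w.start()
    send_all(workers, MOTIF_SCAN, "ACG")
    partial = get_all(workers, MOTIF_REDNORM)
    send_all(workers, MOTIF_STOP)
    for w in workers:
        w.join()
    return partial, sum(partial)


if __name__ == "__main__":
    print(run_demo())
```

Because each worker's inbox is a FIFO queue, the scan trigger is always processed before the reduction trigger that follows it, which mirrors the SendAll/GetAll ordering in the master thread's call sequence. In an MPI setting the two helpers would presumably map onto collective or point-to-point calls, but that mapping is not specified in the slides.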