Executing Nested Parallel Loops On Shared-Memory Multiprocessors
Sadun Anik and Wen-mei Hwu
Proceedings of the 21st Annual Int'l Conference on Parallel Processing, St. Charles, IL, Aug. 1992, pp.(III) 241-244

Cache-coherent, bus-based shared-memory multiprocessors are a cost-effective platform for parallel processing. In scientific parallel applications, most of the computation involves processing large multidimensional data structures, which results in a high degree of data parallelism. This parallelism can be exploited in the form of nested parallel loops. Most existing shared-memory multiprocessors exploit this multi-level parallelism at only one level. In this paper, we explore efficient algorithms and models for executing nested parallel loops and present a simulation-based performance comparison of different techniques using real application traces. We show that the parallelism in nested parallel loops can be exploited with good scheduling and synchronization algorithms.

