Department of Computer Science
1998-1999 Seminar Series (Department Ph.D. Thesis)
Future Effectiveness of Centralized and Distributed Memory Parallel Computing Systems
Meenakshisundaram R. Chinthamani
PhD Candidate
Department of Computer Science
University of Saskatchewan

THESIS DEFENCE
DATE: Friday, November 20, 1998
TIME: 8:30am
PLACE: Dean's Conference Room
COMMITTEE: Dr. Derek Eager (Supervisor)
Dr. David Nicol (External)
Dr. Winfried Grassman
Dr. Carl McCrosky
Dr. Carey Williamson
Dr. Dvoralai Wulfsohn (Cognate)
*** Limited Seating ***
DEPARTMENT SEMINAR
DATE: Monday, November 23, 1998
TIME: 2:30pm (Note!)
PLACE: 2E25 Agriculture
*** Everyone is welcome ***

Abstract:

This work examines how current parallel systems would need to evolve, in terms of hardware capabilities, architectures, and software control policies, if they are to remain useful computing platforms in the future. This question is motivated by the past and continuing increases in processor speeds that substantially exceed the increases in performance of other hardware resources, including, for example, the networks needed for communication between processors in parallel systems. The hardware capabilities considered include processor speed, inter-processor communication network latency, inter-processor communication network bandwidth, and cache sizes.

The candidate parallel system architectures considered include centralized memory architectures, in which all main memory accesses must traverse an interconnection network to reach a shared, centralized memory, and distributed memory architectures, in which the total system main memory is distributed among the processors, so that only accesses to the part of main memory allocated to another processor need traverse a global interconnection network. The software control policies considered include affinity scheduling policies, which assign computations to processors based on the likely contents of their caches and local memories, and latency-tolerant scheduling policies, which permit aggregation of computation and communication into larger units.

Results of scalability analyses are presented in two forms: asymptotic results and transient results. Asymptotic results are developed by studying the execution characteristics of a number of data parallel numeric and scientific applications as the system parameters (processor speed, cache sizes, and inter-processor communication network bandwidth and latency) and the application parameters are scaled under a time-constrained scaling model, with the number of processors held fixed. Asymptotic results address the question of whether an application will eventually become communication bound, making the machine unsuitable as a parallel computing platform for that application. The transient results, on the other hand, are concerned with the rate at which the asymptotic results take hold.

For the class of data parallel near-neighbour computations, a new latency-tolerant scheduling policy is proposed. It is shown to substantially alleviate the potential "latency bottleneck" in high-latency parallel computing environments. Yet, it is shown that for the class of one- and two-dimensional near-neighbour computations, the proposed scheduling policy increases the total communication volume by at most a constant factor compared to the conventional scheduling policy. It is also shown that, for arbitrary d-dimensional near-neighbour computations (d ≥ 1), the asymptotic bandwidth scaling requirements are at the same level as for conventional scheduling. The benefits of the proposed latency-tolerant scheduling policy are also demonstrated through an experimental study.

All students registered in the CMPT990 Seminar Series should attend.

