| ![]() | |||||||||
University of Leeds
SCHOOL OF COMPUTER STUDIES
RESEARCH REPORT SERIES
Report 96.29
Parallel CSG, Skeletons and Performance Modelling 1
by
H Deldari, J R Davy & P M Dew
Division of Computer Science
October 1996
1To be presented at CSICC'96, Tehran, Iran
Abstract
We describe an efficient implementation of a parallel algorithmic skeleton
which supports set membership classification problems in Constructive
Solid Geometry. A performance modelling methodology is developed,
which realistically predicts the asymptotic performance of specific CSG
applications.
1 Introduction
In recent years several researchers (eg [1, 2]) have proposed the use of algorithmic skeletons as a way to resolve the complexity and machine-dependence of programming parallel computers. A skeleton is a template providing a parallel implementation of a class of parallel algorithms which share a common structure. Examples include pipelines, farms, and divide-and-conquer algorithms. A programmer need only specify a number of problem-specific sequential procedures to complete the parallel solution; following the terminology of [3] we call these the skeleton's customising functions. Skeletons provide an elegant machine-independent approach to parallel-programming, greatly aiding programmer productivity by hiding all the details of parallel implementation within a compiler.
If skeletons are to be used methodically to build parallel programs, it is important to have a performance model enabling alternative designs to be evaluated prior to implementation. A systematic approach to describing the performance of individual skeletons was proposed in [3], illustrated by skeletons from the field of image processing. However, since these skeletons had routine data parallel implementations they did not provide a rigorous testing of the method.
This paper describes an implementation of a skeleton in the field of solid modelling. Since this uses unbalanced and unpredictable tree structures it offers a more severe challenge both to the implementation of the parallel template and to the performance methodology. Practical results show good speedups even for highly unbalanced trees and match the predicted theoretical results closely.