
A Scalable MIMD Volume Rendering Algorithm
Craig M. Wittenbrink, Dept. of Electrical Engineering, University of Washington, Seattle, WA 98195, craig@shasta.ee.washington.edu
Michael Harrington, Applied Physics Laboratory, University of Washington, Seattle, WA 98105, mikeh@apl.washington.edu
Abstract
Volume rendering is a compute-intensive graphics algorithm with wide application. Researchers have sought to speed it up using parallel computers. Our new algorithm distributes the data for storage efficiency, avoids bottlenecks, and scales to more processors than rays. The main contribution is explicit partitioning of the input volume for higher memory utilization, while retaining viewpoint freedom and speedup. The largest volumes processed on our MIMD (multiple instruction multiple data) machine (Proteus) are 512x512x128 voxels (32 Mbytes). Performance measurements show a speedup of 22 over our sequential code on 32 Intel i860 processors. We used no preprocessing or data-dependent optimization. The efficiency results from nonconflicting communication, a permutation warp, which remains efficient with larger data sets, larger parallel machines, and higher-order filters, showing that scalability can be achieved through object-space partitioning.
Keywords: Parallel Algorithms, Object Space Partitioning,
Permutation Warping.
1 Introduction
Volume rendering [2] is transparency visualization of sampled 3D data. The inputs are point samples, or volume elements, called voxels. A variety of applications create voxels, such as magnetic resonance imaging and finite element analysis. Most applications are calculation intensive because of the size of the data sets (up to one billion voxels), and even the simplest calculations on workstations take too much time for interactive visualization. Volume rendering is described fully elsewhere [2].
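The transparency visualization above amounts to accumulating color and opacity along each viewing ray. The following minimal sketch is our own illustration of front-to-back compositing, not code from the paper; the `(color, alpha)` sample representation and the early-termination threshold are assumptions.

```python
def composite_ray(samples):
    """Front-to-back compositing along one viewing ray.

    samples: iterable of (color, alpha) pairs, ordered nearest first.
    Returns the accumulated (color, alpha) for the pixel.
    """
    color, alpha = 0.0, 0.0
    for c, a in samples:
        # Each new sample is attenuated by the transparency (1 - alpha)
        # accumulated in front of it.
        color += (1.0 - alpha) * a * c
        alpha += (1.0 - alpha) * a
        if alpha >= 0.999:  # nearly opaque: further samples are invisible
            break
    return color, alpha
```

For example, a half-opaque white sample in front of an opaque black one yields a mid-gray, fully opaque pixel.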
Parallelism is difficult to use because partitioning may result in lower quality (see [12] for a filter comparison) or large memory requirements [13] (and [8], where shared memory requires large caches and incurs high cache overhead). We have developed a practical solution that sacrifices neither quality nor memory. In this paper we discuss the performance attained on the Proteus Supercomputer [9] and how to maintain full parallelism on scalable MIMD (multiple instruction multiple data) machines. We also give a brief overview of Proteus, the parallel software development environment, and existing methods for parallel volume rendering.
2 MIMD Volume Rendering Algorithm
On data-parallel machines such as the CM2 and MasPar MP1, our preferred solution is a one-step warping phase to align to the view direction [11][12], but on high-granularity machines such as Proteus [9], small messages create network congestion. Instead, we use an algorithm that takes advantage of the high virtualization ratio and sends large messages. The approach of sending large messages, using object-space partitioning, and evaluating compositing by parallel product is new for a MIMD volume rendering algorithm. Paper reviewers have also made us aware of recent results by others using large messages [5], object-space partitioning [5][7], and parallel compositing [5][6]. Our approach uses an innovative resampling step, without communication congestion, that scales in input and machine size.
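Compositing by parallel product works because the "over" operator on partial composites is associative: each processor composites its own object-space slab into a `(color, alpha)` segment, and segments are then combined in a log-depth tree rather than a sequential chain. The sketch below is our own illustration under that assumed segment representation, not the paper's data layout.

```python
def over(front, back):
    """Associative 'over' operator on (color, alpha) segment composites."""
    cf, af = front
    cb, ab = back
    return (cf + (1.0 - af) * cb, af + (1.0 - af) * ab)

def parallel_product(parts):
    """Combine per-processor partial composites by pairwise tree reduction.

    parts: list of (color, alpha) segments, ordered front to back.
    Takes ceil(log2(P)) combining rounds instead of P - 1 sequential steps.
    """
    while len(parts) > 1:
        nxt = []
        for i in range(0, len(parts), 2):
            if i + 1 < len(parts):
                nxt.append(over(parts[i], parts[i + 1]))
            else:
                nxt.append(parts[i])  # odd segment passes through unchanged
        parts = nxt
    return parts[0]
```

Associativity is what licenses the tree: combining `(a over b) over c` or `a over (b over c)` gives the same pixel, so the reduction order is free to match the machine topology.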
The resampling is performed by permutation warping. We have developed permutation warping for data-parallel machines [11][12], and this paper extends that work to MIMD (multiple instruction multiple data) machines. Permutation warping improves on previous parallel resampling algorithms because it is simultaneously memory efficient, processor efficient, general, and accurate. The algorithm requires a strong interconnection network, which many parallel computers, including Proteus, provide. Permutation, or one-to-one, communication is used to communicate resampled data from object-space to screen-space aligned frames. Resampling is required for volume rendering because the input voxels are a discrete sampling of a continuous function; the resampling is an interpolation to find the values in between the provided voxel values. Imagine looking through a lattice of points. Each pixel for the image resulting from that view is a ray
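As a concrete illustration of the interpolation step, the simplest reconstruction filter is trilinear interpolation over the eight voxels surrounding a sample point (the paper also considers higher-order filters). This is a hypothetical sketch, not the paper's implementation; the `vol[z][y][x]` indexing convention is an assumption.

```python
def trilinear(vol, x, y, z):
    """Trilinearly interpolate a value at (x, y, z) from a 3D grid.

    vol is indexed as vol[z][y][x]; the point must lie inside the grid
    so that all eight neighboring voxels exist.
    """
    i, j, k = int(x), int(y), int(z)
    fx, fy, fz = x - i, y - j, z - k

    def lerp(a, b, t):
        return a + t * (b - a)

    # Interpolate along x on the four grid edges surrounding the point...
    c00 = lerp(vol[k][j][i],         vol[k][j][i + 1],         fx)
    c10 = lerp(vol[k][j + 1][i],     vol[k][j + 1][i + 1],     fx)
    c01 = lerp(vol[k + 1][j][i],     vol[k + 1][j][i + 1],     fx)
    c11 = lerp(vol[k + 1][j + 1][i], vol[k + 1][j + 1][i + 1], fx)
    # ...then along y, then along z.
    return lerp(lerp(c00, c10, fy), lerp(c01, c11, fy), fz)
```

A quick sanity check on the filter: trilinear interpolation reproduces any linear function of the coordinates exactly, e.g. a grid holding `x + y + z` interpolates to `x + y + z` at any interior point.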