To appear in IEEE Transactions on Parallel and Distributed Systems
Comparative Modeling and Evaluation of CC-NUMA and COMA on
Hierarchical Ring Architectures*
Xiaodong Zhang and Yong Yan
High Performance Computing and Software Laboratory
The University of Texas at San Antonio
San Antonio, Texas 78249
Parallel computing performance on scalable shared-memory architectures is affected by the structure
of the interconnection networks linking processors to memory modules and by the efficiency
of the memory/cache management systems. Cache Coherence Non-Uniform Memory Access (CC-
NUMA) and Cache Only Memory Access (COMA) are two effective memory systems, and the hierarchical
ring structure is an efficient interconnection network in hardware. This paper focuses on
comparative performance modeling and evaluation of CC-NUMA and COMA on a hierarchical ring
shared-memory architecture. Analytical models for the two memory systems for comparative evaluation
are presented. Intensive performance measurements on data migrations have been conducted
on the KSR-1, a COMA hierarchical ring shared-memory machine. Experimental results support the
analytical models, and we present practical observations and comparisons of the two cache coherence
memory systems. Our analytical and experimental results show that a COMA system balances the
workload well. However, the overhead of frequent data movement may offset the gains obtained
from improving load balance. We believe our performance results could be further generalized to
the two memory systems on a hierarchical network architecture. Although a CC-NUMA system may
not automatically balance the load at the system level, it provides an option for a user to explicitly
handle data locality for a possible performance improvement.
*This work is supported in part by the National Science Foundation under grants CCR-9102854 and CCR-9400719, by the Air Force Office of Scientific Research under grant AFOSR-95-1-0215, by a grant from Cray Research, and by a Fellowship from the Southwestern Bell Foundation. Part of the experiments were conducted on the KSR-1 machines at Cornell University and at the University of Washington.