
Towards Practical Permutation Routing on Meshes
Michael Kaufmann? Uli Meyery Jop F. Sibeynz
Abstract
We consider the permutation routing problem on twodimensional n ? n meshes. To be practical, a routing algorithm is required to ensure very small queue sizes Q, and very low running time T , not only asymptotically but particularly also for the practically important n up to 1000. With a technique inspired by a scheme of Kaklamanis/Krizanc/Rao, we obtain a nearoptimal result: T = 2 ? n +O(1) with Q = 2. Although Q is very attractive now, the lower order terms in T make this algorithm highly impractical. Therefore we present simple schemes which are asymptotically slower, but have T around 3 ? n for all n and Q between 2 and 8.
1 Introduction
Communication between processing units (PUs) in a network is performed by exchanging packets of information. Since the network is sparse, due to physical constraints on the number and length of links, the packets have to travel through intermediate nodes. Packet routing is concerned with the organization of the movement of the packets in a network. The efficiency of a packetrouting protocol is measured by (1) the time that passes until all the routing requests are completed; and (2) the size of auxiliary memory in each PU. The time is measured by the maximum number T of routing steps, and the memory by the maximum number Q of packets that may be queued simultaneously in a PU. As we think of PUs as small nodes with limited storage capacity, it is very important to have algorithms that work with small Q. Furthermore, the larger the queues are, the longer it takes to insert and extract packets from them: the assumption that managing the queues takes no time is realistic only if the queues are very small. Another criterion for a practical algorithm is simplicity: in applications an algorithm with a few instructions always outperforms a complicated scheme which appears better in theory.
?Fakult?at Informatik, Universit?at T?ubingen, Sand 13, 72076 T?ubingen, Germany. Email: mk@informatik.unituebingen.de yMaxPlanckInstitut f?ur Informatik, Im Stadtwald, 66123 Saarbr?ucken, Germany. Email: umeyer@mpisb.mpg.de zMaxPlanckInstitut f?ur Informatik, Im Stadtwald, 66123 Saarbr?ucken, Germany. Email: jopsi@mpisb.mpg.de Partially supported by EC Cooperative Action IC1000 (Project ALTEC: Algorithms for Future Technologies).
We consider routing permutations on a twodimensional n ? n MIMD mesh. In a permutation routing problem, every PU is source and destination of precisely one packet. A routing algorithm is called optimal if T = 2?n?2, the diameter of the mesh, and nearoptimal if T = 2 ? n + O(1). Recently, routing on meshes has attracted a considerable amount of attention. The first routing algorithms which required close to 2 ? n steps were given by Kunde [5] and Rajasekaran and Tsantilas [10]. Leighton, Makedon and Tollis [6] presented the first deterministic algorithm with optimal routing time and constant size queues. This paper is of a great theoretical importance but the maximum queue size is impractically large (Q = 1008 according to [9]). Rajasekaran and Overholt [9] reduced Q to about 150 (for comments see [1]). Further improvements are given in [1] and [12]: we presented routing algorithms with T = 2 ? n?2 and Q = 33, and with T = 2 ? n + O(1) and Q = 12.
In this paper we take a different approach. It appears that the line of algorithms from [6] over [9] to [1] and [12] has come to an end. We believe that no further substantial improvements can be achieved by developing the routing scheme, the scattering technique, or the spreading technique. Therefore, we step back somewhat and consider a combination of the algorithms of Rajasekaran/Tsantilas [10], Kaklamanis/Krizanc/Rao [2] and Kaufmann/Sibeyn/Suel [4]. This gives an algorithm with really short queues, Q = 2 and T = 2 ? n + O(1). However, the additional constant is impracticably large. Therefore we consider schemes which are simple and behave much better than any known algorithm for practically important sizes of the mesh: n between 16 and 1024. Variants give a tradeoff between Q and T . These algorithms have been simulated to test their implementability, and to determine the actual number of routing steps.
In the remainder of the paper we first give an overview of basic ideas underlying the algorithms. Section 3 offers a basic algorithm with T = 3 ? n + 41 ? n1=2 and Q = 2. Then this algorithm is refined to one with T = 2 ? n + O(n3=4) and Q = 2. In Section 4.5 we introduce critical and noncritical packets to obtain T = 2 ? n + O(1) and Q = 2. Hereafter we give the more practical schemes and present results of simulations thereof.