Towards Practical Permutation Routing on Meshes
Michael Kaufmann? Uli Meyery Jop F. Sibeynz
We consider the permutation routing problem on twodimensional n ? n meshes. To be practical, a routing algorithm is required to ensure very small queue sizes Q, and very low running time T , not only asymptotically but particularly also for the practically important n up to 1000. With a technique inspired by a scheme of Kaklamanis/Krizanc/Rao, we obtain a near-optimal result: T = 2 ? n +O(1) with Q = 2. Although Q is very attractive now, the lower order terms in T make this algorithm highly impractical. Therefore we present simple schemes which are asymptotically slower, but have T around 3 ? n for all n and Q between 2 and 8.
Communication between processing units (PUs) in a network is performed by exchanging packets of information. Since the network is sparse, due to physical constraints on the number and length of links, the packets have to travel through intermediate nodes. Packet routing is concerned with the organization of the movement of the packets in a network. The efficiency of a packet-routing protocol is measured by (1) the time that passes until all the routing requests are completed; and (2) the size of auxiliary memory in each PU. The time is measured by the maximum number T of routing steps, and the memory by the maximum number Q of packets that may be queued simultaneously in a PU. As we think of PUs as small nodes with limited storage capacity, it is very important to have algorithms that work with small Q. Furthermore, the larger the queues are, the longer it takes to insert and extract packets from them: the assumption that managing the queues takes no time is realistic only if the queues are very small. Another criterion for a practical algorithm is simplicity: in applications an algorithm with a few instructions always outperforms a complicated scheme which appears better in theory.
?Fakult?at Informatik, Universit?at T?ubingen, Sand 13, 72076 T?ubingen, Germany. E-mail: email@example.com yMax-Planck-Institut f?ur Informatik, Im Stadtwald, 66123 Saarbr?ucken, Germany. E-mail: firstname.lastname@example.org zMax-Planck-Institut f?ur Informatik, Im Stadtwald, 66123 Saarbr?ucken, Germany. E-mail: email@example.com Partially supported by EC Cooperative Action IC-1000 (Project ALTEC: Algorithms for Future Technologies).
We consider routing permutations on a twodimensional n ? n MIMD mesh. In a permutation routing problem, every PU is source and destination of precisely one packet. A routing algorithm is called optimal if T = 2?n?2, the diameter of the mesh, and near-optimal if T = 2 ? n + O(1). Recently, routing on meshes has attracted a considerable amount of attention. The first routing algorithms which required close to 2 ? n steps were given by Kunde  and Rajasekaran and Tsantilas . Leighton, Makedon and Tollis  presented the first deterministic algorithm with optimal routing time and constant size queues. This paper is of a great theoretical importance but the maximum queue size is impractically large (Q = 1008 according to ). Rajasekaran and Overholt  reduced Q to about 150 (for comments see ). Further improvements are given in  and : we presented routing algorithms with T = 2 ? n?2 and Q = 33, and with T = 2 ? n + O(1) and Q = 12.
In this paper we take a different approach. It appears that the line of algorithms from  over  to  and  has come to an end. We believe that no further substantial improvements can be achieved by developing the routing scheme, the scattering technique, or the spreading technique. Therefore, we step back somewhat and consider a combination of the algorithms of Rajasekaran/Tsantilas , Kaklamanis/Krizanc/Rao  and Kaufmann/Sibeyn/Suel . This gives an algorithm with really short queues, Q = 2 and T = 2 ? n + O(1). However, the additional constant is impracticably large. Therefore we consider schemes which are simple and behave much better than any known algorithm for practically important sizes of the mesh: n between 16 and 1024. Variants give a trade-off between Q and T . These algorithms have been simulated to test their implementability, and to determine the actual number of routing steps.
In the remainder of the paper we first give an overview of basic ideas underlying the algorithms. Section 3 offers a basic algorithm with T = 3 ? n + 41 ? n1=2 and Q = 2. Then this algorithm is refined to one with T = 2 ? n + O(n3=4) and Q = 2. In Section 4.5 we introduce critical and non-critical packets to obtain T = 2 ? n + O(1) and Q = 2. Hereafter we give the more practical schemes and present results of simulations thereof.