This paper presents research results that suggest ways in which the speed-performance of array-based FPGAs, like those from Xilinx, can be improved by enhancing their interconnect. Using an experimental approach, we study this issue from two perspectives: improving the routing architectures of the chips, and improving the CAD tools used to route circuits. The basic conclusions reached are: the lengths of wire segments in the interconnect dramatically affect speed-performance; it is crucial to limit the number of programmable switches that signals pass through in series; the decisions made by the CAD routing tools have a very significant impact; and the CAD tools should consider both speed-performance and area utilization rather than focusing on only one goal.
Over the last decade, Field-Programmable Gate Arrays (FPGAs) have emerged as a key technology for implementing circuits in VLSI. A number of different types of FPGAs are currently available from various manufacturers. One of the key features distinguishing different classes of FPGAs is the technology used to implement the user-programmable switches, examples of which are SRAM, EPROM, EEPROM, and antifuse. Because SRAM technology is widely used, being offered in FPGAs manufactured by Xilinx, Altera, AT&T, Atmel and Algotronix, this paper assumes that programmable switches are based on SRAM.
Since they are user-programmable, FPGAs considerably reduce manufacturing and prototyping time, as well as design costs, and are thus more attractive than alternative technologies like Mask Programmable Gate Arrays (MPGAs). On the other hand, the inherent drawbacks of user-programmability are reduced logic capacity and speed-performance, and this engenders limitations on the range of applications for which FPGAs are suitable. In this paper, we focus on improving the speed-performance of FPGAs.
Speed-performance is limited by two main factors: combinational delays in logic blocks, and propagation delays through the interconnect's programmable switches. Unlike mask-programmed technologies, where combinational delay dominates, in FPGAs interconnect accounts for a very significant 40 to 60 percent of total delay. Two key factors affect delays due to interconnect in an FPGA: 1) the routing architecture, which comprises the wires and switches used to interconnect the logic blocks, and 2) the CAD tools used to implement circuits.
In previous research, the speed-performance of FPGA routing architectures has been studied, but only for row-based devices such as those from Actel. In this paper, we investigate ways of decreasing interconnect delays for FPGA architectures that are array-based, like those from Xilinx. One motivation for this research is that recent benchmarks have shown that array-based FPGAs currently offer lower speed-performance than row-based FPGAs or Complex PLDs (CPLDs). As one of the key differences between these devices, the interconnect structure is a prime candidate for enhancing speed-performance.
To experiment with a range of routing architectures, a general model for FPGAs is needed. Fig. 1 shows the model used for this study. It consists of a two-dimensional array of logic blocks, vertical and horizontal routing channels, and I/O cells around the chip periphery. The logic blocks contain combinational and sequential circuit elements, while the routing channels comprise the wire segments and switches used to interconnect the logic blocks. Wire segments exist in both vertical and horizontal tracks. For the small example in Fig. 1, there are four tracks per channel and each logic block has two pins that appear on each of its sides. Although not shown in the figure, the C blocks contain switches for connecting the pins of the logic blocks to the wire segments, and the S blocks house switches that join one wire segment to another. The routing switches are pass-transistors controlled by SRAM cells. Wire segments can be of any length, where the length of a wire segment is defined as the number of logic blocks it spans. As an illustration, Fig. 2 shows a section of a horizontal channel with wire segments of length 1, 2 and 3. Our FPGA model allows virtually any channel segmentation scheme.
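As a rough illustration of the channel segmentation idea, the sketch below represents one track as a list of wire segments and counts how many segment-to-segment (S block) switches a connection would pass through if routed along that single track. The names `Segment`, `segment_track`, and `switches_in_series` are hypothetical, and the repeating-pattern tiling is an assumption made for the example. It is not the model or tooling used in this study, and it ignores the C block switches that attach logic block pins to the wires.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """A wire segment on one track (hypothetical representation)."""
    start: int   # index of the first logic block the segment spans
    length: int  # number of logic blocks spanned, as defined in the text

    @property
    def end(self) -> int:
        # Exclusive end index: the segment spans blocks start .. end - 1.
        return self.start + self.length


def segment_track(channel_width: int, pattern: List[int]) -> List[Segment]:
    """Tile a track spanning `channel_width` logic blocks with a repeating
    pattern of segment lengths (assumed scheme); the last segment is
    truncated at the channel edge."""
    if not pattern or any(length < 1 for length in pattern):
        raise ValueError("pattern must contain positive segment lengths")
    segments = []
    pos = 0
    i = 0
    while pos < channel_width:
        length = min(pattern[i % len(pattern)], channel_width - pos)
        segments.append(Segment(pos, length))
        pos += length
        i += 1
    return segments


def switches_in_series(track: List[Segment], src: int, dst: int) -> int:
    """Count the segment-to-segment switches crossed when a connection
    between logic block positions `src` and `dst` stays on this track."""
    lo, hi = sorted((src, dst))
    # A segment is used if it overlaps the block range lo .. hi.
    used = [s for s in track if s.start <= hi and s.end > lo]
    return len(used) - 1


short_track = segment_track(6, [1])        # only length-1 segments
mixed_track = segment_track(6, [1, 2, 3])  # lengths 1, 2 and 3, as in Fig. 2
long_track = segment_track(6, [3])         # only length-3 segments
```

Under these assumptions, a connection from block 0 to block 5 crosses five switches on `short_track`, two on `mixed_track`, and one on `long_track`, which shows how segment length trades off against the number of switches in series.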
In early FPGAs, tracks consisted mostly of short wire segments of length one, and longer segments could be formed by joining together two or more of these short segments via routing switches. While this provides for good utilization of the wires in the sense that there are no long segments that might be wasted on short connections, requiring that long connections pass through several switches in series severely impairs speed-performance. This follows because a pass-transistor
Figure 1 - General Model of an Array-based FPGA.
Minimizing Interconnection Delays in Array-based FPGAs
Muhammad Khellah, Stephen Brown, and Zvonko Vranesic
Department of Electrical and Computer Engineering
University of Toronto