The Real-Time Publisher/Subscriber Inter-Process Communication
Model for Distributed Real-Time Systems: Design and Implementation
Ragunathan Rajkumar, Mike Gagliardi and Lui Sha
Software Engineering Institute
Carnegie Mellon University
Pittsburgh, PA 15213
Distributed real-time systems are becoming more pervasive in many domains, including process control, discrete manufacturing, defense systems, air traffic control, and online monitoring systems in medicine. The construction of such systems, however, is impeded by the lack of simple yet powerful programming models and the lack of efficient, scalable, dependable and analyzable interfaces and their implementations. We argue that these issues need to be resolved with powerful application-level toolkits similar to those provided by ISIS. In this paper, we consider the inter-process communication requirements which form a fundamental building block in the construction of distributed real-time systems. We propose the real-time publisher/subscriber model, a variation of group-based programming and anonymous communication techniques, as a model for distributed real-time inter-process communication which can address issues of programming ease, portability, scalability and analyzability. The model has been used successfully in building a software architecture for upgradable real-time systems. We provide the programming interface, a detailed design and implementation details of this model, along with some preliminary performance benchmarks. The results are encouraging in that the goals we seek look achievable.
With the advent of high-performance networks such as ATM and upward-compatible 100 Mbps network technologies, the cost of network bandwidth continues to drop steeply. From a hardware perspective, it is also often significantly cheaper to network multiple relatively cheap PCs and low-cost workstations to obtain an abundance of processing power. Unfortunately, these cost benefits have been slower in accruing to distributed real-time systems because of the still formidable challenges posed by integration issues, frequent lack of real-time support, lack of standardized interfaces, lack of good programming models, dependencies on specific communication protocols and networks, portability requirements and lack of trained personnel who perhaps need to be re-educated on the benefits and potentially major pitfalls of building working distributed real-time systems. We believe that the difficulty of programming distributed real-time systems with predictable timing behavior is in itself a significant bottleneck. Unfortunately, the problem is actually much harder because many other issues need to be addressed concurrently. These issues include the need to maintain portability, efficiency, scalability and dependability as the systems evolve or just become larger. Put together, these issues pose daunting challenges to the construction of predictable distributed real-time systems.
This work is supported in part by the Office of Naval Research under contract N00014-92-J-1524 and in part by the Software Engineering Institute.
Three factors in particular seem to dominate the various phases of the life-cycle of developing and maintaining distributed real-time systems. First, the development, debugging and testing of distributed real-time systems is hindered by complexity; there are many more degrees of freedom than in uniprocessor systems, which are conceptually easier to program. It would be extremely desirable to have a programming model which does not depend upon the underlying hardware architecture, whether it is a uniprocessor or a network of multiple processors. Second, systems change or evolve over time. Hardwiring any assumptions into processes, communication protocols and programming models can prove to be extremely costly as needs change. Finally, as distributed real-time systems grow larger, there is often a need to introduce new functionality by extending the services and functions provided. Such extensions can become impossible if currently operational code needs to be rewritten to accommodate these changes, or sometimes even if the entire system just has to be taken down for installation. In other words, it would be desirable if a subsystem could be added online without any downtime for the rest of the system.
1.1. Objectives for a Distributed Real-Time Systems Framework
Due to the many issues that need to be addressed simultaneously in building distributed real-time systems, we believe that a well-understood framework for building distributed real-time systems is essential. Specifically, the following objectives should be addressed by a framework for building such systems:
? Ease of programming: The framework must provide simple programming models and interfaces to the programmer. The implementation of this programming model is hidden from the application programmer and completely bears the responsibility of hiding the complexity of the underlying hardware (processor and net-