Vouk & Singh Quality of Service & Scientific Workflows (DRAFT, V0.7) 2

1. Introduction

Modern problem-solving environments (PSEs) are envisioned as collections of cooperating programs, tools, clients, and intelligent agents [Gal94]. These components are integrated into an environment that facilitates user interaction (such as problem statement and solution engineering) and cooperative execution of the components charged with the solution tasks. An example is a system that would help an environmental scientist or a regulator pose environmental engineering questions (problems); develop, execute, and validate solutions; analyze results; and arrive at a decision (e.g., a cost-effective emission control strategy). Such a PSE would consist of a management, analysis, and computational framework populated with a variety of models and data that describe the science behind the phenomena, the solutions of interest, and the decision rules [e.g., Den96]. It is usually assumed that a modern PSE is distributed across a number of central processing units that may or may not reside in one physical computer. In fact, the advent of high-performance computing engines and networks, the potential of new technologies (such as the Asynchronous Transfer Mode) to "guarantee quality of service", and the ready access to network-based information through the World-Wide Web (WWW) are opening a fantastic opportunity for bringing serious numerical and problem-solving applications closer to a broad base of potential users.

An important part of a modern PSE framework is its ability to facilitate effective and efficient communication among the PSE components (or objects). This is recognized by both researchers and software manufacturers, and, in recent years, it has resulted in a proliferation of communication building blocks for distributed scientific computing. Two examples are PVM and MPI [Gei94, Sni95], communication libraries and message and process brokers that allow a parallelized problem to be distributed over a number of processing units in order to increase the system's computational performance. Although not originally intended for this purpose, both PVM and MPI can be used to distribute not only fine-granularity solution elements (such as code segments) but also whole programs and PSE parts. Another example is the variety of commercial object brokers, usually CORBA-compliant, that can be used to construct PSEs.

From the perspective of a PSE user, one of the key issues will be the quality of service (QoS). In this context we broaden the classical definition of QoS to include not only network-based parameters (such as response delay and throughput), but also measurable end-user quality characteristics such as system availability, performance, algorithmic scalability, effectiveness, quality of system content, quality of user-system interactions, and so on. Furthermore, in order to facilitate integration of the QoS and PSE concepts, and to bring existing formal specification and quality analysis approaches into the discussion naturally, we view PSEs as computer and network-based systems that support scientific workflows, i.e., series of structured activities and computations that arise in scientific problem-solving. Scientific workflows are expected to coexist, cooperate, and even meld with other user workflows (e.g., business workflows, educational workflows, legislative workflows). As such, they must support compatible QoS. We can use data from existing network-based systems and workflows to quantitatively bound some of the PSE QoS parameters.
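As a rough illustration of how a workflow view lets QoS parameters be composed and bounded, the sketch below models a workflow as a strictly serial list of activities and derives an end-to-end availability and response delay. Everything in it is an assumption made for illustration and is not taken from the paper: the `Activity` class, the function names, the independence of step failures, the purely serial structure, and all numeric values.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Activity:
    """One step of a workflow (hypothetical model, not from the paper)."""
    name: str
    availability: float  # probability the step is usable when needed, 0..1
    delay_s: float       # response delay contributed by the step, in seconds


def serial_availability(steps):
    """Availability of a strictly serial workflow.

    Assumes step failures are independent, so the workflow is available
    only when every step is available (product of the step values).
    """
    return prod(s.availability for s in steps)


def serial_delay(steps):
    """End-to-end response delay of a strictly serial workflow (sum of steps)."""
    return sum(s.delay_s for s in steps)


# Illustrative values only; they are not measurements from any real system.
workflow = [
    Activity("pose problem", 0.999, 0.5),
    Activity("run model", 0.99, 120.0),
    Activity("analyze results", 0.995, 2.0),
]

if __name__ == "__main__":
    print("availability:", serial_availability(workflow))
    print("delay (s):", serial_delay(workflow))
```

Under these assumptions the end-to-end availability can never exceed that of the least available step, which is one simple sense in which data on individual components can bound a workflow-level QoS parameter.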

Section 2 defines the workflow view of problem solving. Section 3 discusses quality-of-service issues and provides quantitative bounds for some of the more prominent QoS parameters. A summary and conclusions are given in Section 4.