page 2  (5 pages)
to previous section1
3to next section

Applying Scalable Read Consistency to Data Replication

C. Liebig, H.-H. Pagnia-Koch, F. Schwappacher O. Theel

Dept. of Computer Science Dept. of Computer Science

Technical University of Darmstadt University of California, Riverside

Germany U.S.A.

Abstract

By allowing read operations to succeed even when only an insufficient number of replicas is accessible, the read availability can be significantly increased. This paper presents such a weak consistency approach for replicated data in a networking environment. A method for estimating the degree of correctness" of those modified read operations is proposed. For demonstration purposes, we illustrate the advantages of our approach by an example scenario and apply the method to the Majority Consensus Protocol. However, the approach is applicable to a large number of even more sophisticated protocols.

Keywords
Distributed Systems, Software Fault Tolerance, Data Replication, Highly Available Operations, Weak Consistency

Introduction

People who have suffered severe burns must immediately undergo special medical treatment. Since the required medical equipment is very expensive, only a few hospitals are equipped with proper burn units. When an accident has happened, the paramedics on site must instantly be able to derive which hospital can accommodate the burn victims. A straightforward method for the parademics to obtain such information is to sequentially contact the appropriately equipped hospitals. Unfortunately by doing this, they waste precious time with devastating impacts on the victims' state of health. By storing the required information about available capacities in a computer at each hospital, it is possible to let the paramedics retrieve the information for all hospitals directly and instantaneously.

Consider a hospital computer network with five locations in Aachen, Berlin, Chemnitz, Darmstadt, and Erlangen as shown in Figure 1. Each hospital computer manages one replica of an object containing information about all available beds for burn victims. Read operations provided by the hospital computer system therefore retrieve the number of patients that can be taken care of at each hospital. Write operations are used to pre-allocate medical

Berlin

NETWORK

Erlangen
Darmstadt

Aachen
Chemnitz

Figure 1: Five hospitals connected by a computer network

treatment for a certain number of victims at given hospitals. Those operations can be remotely triggered and executed at all sites of the network. The applicability of this approach requires a replica control algorithm in order to keep the replicated object (consisting of the five replicas) in a consistent state. Furthermore, the hospital computer system and the read operation must be highly available: as already pointed out, the paramedics do not have much time to decide where to transport the patients to.

Voting strategies, like Majority Consensus [1] or Hierarchical Quorum Consensus [2], fulfill both requirements identified above: consistency and availability. However, there is still the possibility that read operations cannot be executed since computers or network links may fail. In such a critical situation, the rescue crew must wait until the system has recovered from failure, or make decisions without having further information. The basic idea of our approach is to provide the paramedics in this critical situation with as much information as currently available. Although not all replicas which are required to execute the read operation with 100% consistency might be available, there is still a chance that at least one replica with the most recent data is among those currently accessible. Since there is othing to loose", tolerating such an incomplete read operation makes sense in the scenario given above. Furthermore, at-