AF:Small:Complexity of Distributed Storage
Distributed computing systems are all around us, ranging from the multiple cores in a cell phone to the Internet itself. Distributed storage, or shared data, is a vital mechanism for communication among the computing entities ("processors") of a distributed system and facilitates the development of correct and efficient applications. Although shared data is a desirable abstraction, it is generally not provided in large-scale distributed systems because of physical limitations. Instead, processors keep individual copies of the data and communicate by sending messages to keep the copies consistent. It is known that providing shared data with strong consistency guarantees can be expensive; in particular, operations on the data can take a long time to complete. This fact has fueled interest in more "relaxed" versions of shared data, in the hope that operations can be implemented faster. As an example of relaxed data, imagine data that supports read and write operations in which a read may return a value that is not the one most recently written.

Developing software for distributed storage systems (and indeed for most distributed systems) that is correct and efficient is challenging because of the complications caused by concurrency, component failures, and variable communication delays. Yet the ability to do so will benefit society because such software is ubiquitous. This project takes a principled approach, based on rigorous mathematical reasoning, to finding distributed algorithms for some fundamental problems that underlie distributed storage systems, with a special focus on relaxed data, and to characterizing the applications that can exploit the relaxations and their improved performance. Project activities will also include creating undergraduate curricular materials on distributed computing to fill existing gaps and providing research experiences for undergraduates, especially women, who are woefully under-represented, to encourage more of them to obtain graduate degrees in computing-related fields.

The technical problems to be solved include the following. Find optimal implementations of various data structures that satisfy the "linearizability" consistency condition, where the performance metrics considered include amortized as well as worst-case time; amortized bounds are often more useful than isolated worst-case results, yet they have not been the focus of much analysis. Discover the relationship between relaxing the specification of an object and relaxing the consistency condition. Characterize general classes of applications that can exploit relaxed data structures or relaxed consistency conditions. Determine the level of fault-tolerance of data types in a generic way. Characterize the patterns of churn (processors entering and leaving the system) that allow linearizable objects to be implemented in an asynchronous system subject to crash failures. As a general tool, the PIs plan to apply classifications of data-type operations by their algebraic properties.
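
To make the notion of relaxed data concrete, the sketch below shows a register whose read may return any of the last k written values rather than only the most recent one, the kind of weakened guarantee described above. This is a minimal illustration, not code from the project: the type name KStaleRegister, the staleness bound k, and the use of a single mutex-protected in-memory structure (rather than message-passing replicas) are assumptions made only for this example.

```go
// Illustrative sketch of a "k-stale" read/write register: a read may return
// any of the last k written values, not necessarily the most recent one.
// The type and parameter names here are hypothetical choices for this example.
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// KStaleRegister keeps the k most recent writes and lets a read return any
// one of them. With k = 1 it behaves like an ordinary register.
type KStaleRegister struct {
	mu     sync.Mutex
	recent []int // up to k most recently written values, oldest first
	k      int
}

func NewKStaleRegister(k int) *KStaleRegister {
	return &KStaleRegister{recent: make([]int, 0, k), k: k}
}

// Write records v, discarding the oldest retained value once more than k
// writes have been seen.
func (r *KStaleRegister) Write(v int) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.recent = append(r.recent, v)
	if len(r.recent) > r.k {
		r.recent = r.recent[1:]
	}
}

// Read returns one of the last k written values, chosen arbitrarily here
// (uniformly at random) to model the weaker guarantee.
func (r *KStaleRegister) Read() (int, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if len(r.recent) == 0 {
		return 0, false // nothing has been written yet
	}
	return r.recent[rand.Intn(len(r.recent))], true
}

func main() {
	reg := NewKStaleRegister(3)
	for v := 1; v <= 5; v++ {
		reg.Write(v)
	}
	// After writing 1..5 with k = 3, a read may legitimately return 3, 4, or 5.
	if v, ok := reg.Read(); ok {
		fmt.Println("read returned", v)
	}
}
```

With k = 1 the object behaves like an ordinary register; larger values of k trade freshness for the possibility of cheaper implementations, which is the trade-off the project aims to characterize.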
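
The problem list above also distinguishes relaxing an object's specification from relaxing the consistency condition under which it is implemented. As one illustrative sketch of the former, assumed for this example only (the name KRelaxedQueue and the window parameter k are hypothetical), the queue below allows Dequeue to remove any of the up-to-k oldest elements instead of strictly the head.

```go
// Illustrative sketch of a queue with a relaxed sequential specification:
// Dequeue may remove any of the up-to-k oldest elements rather than the head.
// Names and parameters are hypothetical choices for this example.
package main

import (
	"fmt"
	"math/rand"
)

type KRelaxedQueue struct {
	items []int
	k     int
}

func NewKRelaxedQueue(k int) *KRelaxedQueue {
	return &KRelaxedQueue{k: k}
}

func (q *KRelaxedQueue) Enqueue(v int) {
	q.items = append(q.items, v)
}

// Dequeue removes and returns one of the up-to-k oldest elements; with k = 1
// it is an ordinary FIFO queue.
func (q *KRelaxedQueue) Dequeue() (int, bool) {
	if len(q.items) == 0 {
		return 0, false
	}
	window := q.k
	if window > len(q.items) {
		window = len(q.items)
	}
	i := rand.Intn(window)
	v := q.items[i]
	q.items = append(q.items[:i], q.items[i+1:]...)
	return v, true
}

func main() {
	q := NewKRelaxedQueue(2)
	for v := 1; v <= 4; v++ {
		q.Enqueue(v)
	}
	for {
		v, ok := q.Dequeue()
		if !ok {
			break
		}
		fmt.Println("dequeued", v) // the order may deviate from strict FIFO
	}
}
```

Here the sequential specification itself is weakened, independently of any consistency condition; relating such specification relaxations to consistency relaxations is one of the questions posed above.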