My research interests are centered around the theory and practice of distributed computing. I primarily work on algorithms and lower bounds for fault-tolerant distributed systems with a focus on building building provably safe and secure concurrent/distributed systems and applications.
Short CV
Research overview
Understanding the foundations of distributed computing is important for the design of efficient computational techniques across all scientific fields. As a consequence of failures and the asynchrony pervasive in distributed systems, many problems that are trivial to solve sequentially are impossible or infeasible to solve in a distributed fashion, thus presenting us with problems of deep intellectual yet practical interest.
When reasoning about the design space of a distributed computation, I have typically asked myself the following sequence of questions:
-
-
- What is the model of computation?
- Given a computational model, is it (im)possible to build a distributed application with specific safety and progress properties?
- What are the complexity metrics that characterize the cost of the computation and can we derive bounds that identify the implementation’s theoretical cost?
- What are the hardware and programming abstractions that help programmers realize a working implementation with largely sequential semantics in mind?
- How can we verify that the resulting implementation conforms to its intended semantics?
-
Generally speaking, when reasoning about distributed computations in any context; multicore machines or graph algorithms for routing in today’s Internet network architectures or information sharing in social peer-to-peer networks, all the above questions apply. But there are also other practical considerations: Is there a bound on the number of participating computational entities? Are we seeking deterministic or randomized algorithms? Is there an a priori agreed naming convention for identifying the computational entities? Lastly, how easy or hard is the distributed programming model? This latter question is especially important from a practical standpoint since it is the simplicity of the programming model that determines whether ordinary programmers may choose to adopt it. Reasoning about the correctness of a distributed computation is a science unto itself, and the programming model must ideally enable programmers to build distributed applications with largely their sequential semantics in mind and without worrying about synchronization problems that may arise. I have attempted to answer some of these questions by presenting algorithms, formal semantics, lower bounds and programming models for multicore CPUs as well as cloud infrastructures; implementations of concurrent data structures; efficient protocols for cryptocurrencies and distributed update protocols for software-defined networks.