Smart Redundancy for Big-Data Systems: Theory and Practice

Rashmi Vinayak

Speaker:  Rashmi Vinayak, Postdoctoral Researcher, UC Berkeley

Date:  Wednesday, March 1
Time: 11am
Location:  Room 1202, CSE Building

Abstract:  Large-scale distributed storage and caching systems form the foundation of big-data systems. A key scalability challenge in distributed storage systems is achieving fault tolerance in a resource-efficient manner. To address this challenge, erasure codes provide a storage-efficient alternative to the traditional approach of data replication. However, classical erasure codes come with critical drawbacks: while optimal in their use of storage space, they significantly increase the usage of other important cluster resources such as network and I/O. In the first part of the talk, I present new erasure codes and theoretical optimality guarantees. The proposed codes reduce network and I/O usage by 35-70% for typical parameters while retaining the storage efficiency of classical codes. I then present an erasure-coded storage system that employs the proposed codes, and demonstrate significant benefits over the state of the art in evaluations in a production setting at Facebook. Our codes have been integrated into Apache Hadoop 3.0. The second part of the talk focuses on achieving high performance in distributed caching systems. These systems routinely face skew in data popularity, background traffic imbalance, and server failures, which result in load imbalance across servers and degraded read latencies. I present EC-Cache, a cluster cache that employs erasure coding to achieve a 3-5x improvement over the state of the art.

Bio:  Rashmi K. Vinayak received her PhD from the EECS department at UC Berkeley in 2016, where she is now a postdoctoral researcher at AMPLab/RISELab and BLISS. Her dissertation received the Eli Jury Award 2016 from the EECS department at UC Berkeley for outstanding achievement in the area of systems, communications, control, or signal processing. Rashmi is also a recipient of the Facebook Fellowship 2012-13, the Microsoft Research PhD Fellowship 2013-15, and the Google Anita Borg Memorial Scholarship 2015-16, as well as the IEEE Data Storage Best Paper and Best Student Paper Awards for the years 2011/2012. Her research interests lie in the theoretical and systems challenges that arise in the storage and analysis of big data, with a current focus on erasure coding for big-data systems.

Related Research Publications
 A "hitchhiker's" guide to fast and efficient data reconstruction in erasure-coded data centers
 EC-Cache: Load-balanced, low-latency cluster caching with online erasure coding

Faculty host: CSE Prof. Alex Snoeren