Data Science Without Data Collection
Abstract
Although cloud computing has successfully accommodated the three "V"s of Big Data, collecting everything into the cloud is becoming increasingly infeasible. Today, we face a new set of challenges. A growing awareness of privacy among individual users and governing bodies is forcing platform providers to restrict the variety of data we can collect. Often, we cannot transfer data to the cloud at the velocity of its generation. Many cloud users suffer from sticker shock, buyer's remorse, or both as they try to keep up with the volume of data they must process. Making sense of data closer to its home is more appealing than ever.
Although theoretical research on federated/distributed learning is growing exponentially to meet these challenges, we are far from putting those theories into practice. In this talk, I will introduce FedScale, a scalable and extensible open-source federated learning and analytics platform. It provides high-level APIs for implementing algorithms, a modular design for customizing implementations across diverse hardware and software backends, and the ability to deploy the same code at many scales. FedScale also includes a comprehensive benchmark that allows data science and machine learning users to evaluate their ideas in realistic, large-scale settings. FedScale is available at fedscale.ai.
Bio
Mosharaf Chowdhury is a Morris Wellman Associate Professor of CSE at the University of Michigan, Ann Arbor, where he leads the SymbioticLab. His research improves the application performance and system efficiency of machine learning and big data workloads, with a recent focus on optimizing energy consumption and data privacy. His group developed Infiniswap, the first scalable memory disaggregation solution; Salus, the first software-only GPU sharing system for deep learning; FedScale, a scalable federated learning and analytics platform; and Zeus, the first GPU energy-vs-training performance optimizer for DNN training. In the past, Mosharaf invented coflows and was one of the original creators of Apache Spark. He has received many individual awards, fellowships, and seven paper awards from top venues, including NSDI, OSDI, and ATC. Mosharaf received his Ph.D. from the AMPLab at UC Berkeley in 2015.
Additional Information:
Event Contact: Timothy Zhu