Resource Management and Scheduling in Today’s Cloud
Abstract
The cloud computing landscape is evolving quickly. New computing and deployment paradigms, like serverless computing and microservices architecture, are augmenting the classical VM-based cloud that we are all familiar with. In this new cloud, the basic question remains: "How do we efficiently utilize available resources to maintain low application latency?" In this talk, I will briefly discuss this evolving landscape and then describe some recent work from our lab that addresses resource management and scheduling problems in this context. I will start by discussing ENSURE, our serverless scheduling solution that aims to maintain serverless function latency while minimizing the amount of resources needed. The specific challenges here are the diversity in serverless workload characteristics and the cold start penalty that non-warm containers incur when launched. I will then discuss an opportunity that serverless workloads help to realize: improving cloud server utilization via colocation. In this setting, we consider colocating latency-sensitive serverful workloads with latency-sensitive serverless requests. I will present ServerMore, our dynamic, server-level resource manager that opportunistically colocates customer serverless jobs with serverful customer VMs while respecting their target latencies. I will end with a brief discussion of other projects in the cloud computing space we have looked at, including deploying Spark in an NDP environment, optimizing TensorFlow DNN training times in multi-GPU servers, and tuning the configuration of microservices architecture applications.
Biography
Anshul Gandhi is an Associate Professor in the Computer Science Department at Stony Brook University. He received his Ph.D. from Carnegie Mellon University in 2013 and then spent a year as a postdoc at the IBM T. J. Watson Research Center. His current research focuses on performance modeling in distributed systems and is funded by an NSF CAREER award, an IBM Faculty award, and a Google Research award. His contributions to performance modeling were recently recognized by an ACM SIGMETRICS Rising Star Award.
Event Contact: Timothy Zhu