CSE Colloquium: Challenges and Opportunities in Deploying Parallel Filesystems in the Modern Cloud

Abstract
High-Performance Computing (HPC) has never been a stagnant area to work in, with the preferred languages, libraries, operating systems or distributions, processors, interconnect technologies, and storage media (to name a few) changing constantly. However, while the “how” has constantly evolved, the “where” has remained almost entirely the same through these decades of HPC advancement: on-premises. With the advent and rampant growth of modern clouds, the fiscal and technological rationales for continuing to make huge investments in on-premises HPC systems become harder to defend every year. However, to enable that “where” to shift from on-prem to the cloud, modern cloud vendors must innovate to deliver the software and hardware ecosystems HPC users have become accustomed to while still enabling the most salient feature of life in the cloud: transiency.

One major component of any HPC solution is a parallel filesystem up to the task of keeping the applications and their expensive processors well-fed. The most popular incarnation of this is the Lustre filesystem, which is used by the majority of the supercomputers on the TOP500 list. However, numerous challenges, and simultaneously opportunities, arise when attempting to deploy Lustre into a modern cloud like Microsoft Azure. This talk will highlight the most interesting challenges the Azure Managed Lustre Filesystem team faced while determining how to deploy, monitor, and manage a parallel filesystem in Azure, and will detail the implemented and future opportunities for innovation that emerged when the cloud and HPC storage ecosystems collided.

Bio
Ellis Wilson is a Principal Software Engineer Manager on the Azure Managed Lustre Filesystem team at Microsoft in Pittsburgh, Pennsylvania. Prior to joining Microsoft in 2021, he spent a decade at Panasas as a Software Architect working on the PanFS parallel filesystem and associated storage appliance. He received his PhD in Computer Science from the Pennsylvania State University under Mahmut Kandemir, with a dissertation focused on NAND flash firmware technology, parallel filesystems, and multi-protocol filesystem interactions.

 


Event Contact: Timothy Zhu

 
 

About

The School of Electrical Engineering and Computer Science was created in the spring of 2015 to allow greater access to courses offered by both departments for undergraduate and graduate students in exciting collaborative research fields.

We offer B.S. degrees in electrical engineering, computer science, computer engineering, and data science, and graduate degrees (master's degrees and Ph.D.s) in electrical engineering and computer science and engineering. EECS focuses on the convergence of technologies and disciplines to meet today’s industrial demands.

School of Electrical Engineering and Computer Science

The Pennsylvania State University

207 Electrical Engineering West

University Park, PA 16802

814-863-6740

Department of Computer Science and Engineering

814-865-9505

Department of Electrical Engineering

814-865-7667