Ceph in HPC Environments at SC15

Overview

Individuals from MSI, UAB, Red Hat, Inc., Intel Corp., CADRE, and MIMOS came together at SC15 (The International Conference for High Performance Computing, Networking, Storage and Analysis) on Wednesday, November 18, 2015, in Austin, TX, to share their experiences with Ceph in HPC environments.

This "Birds of a Feather (BoF)" meeting brought together Ceph developers and system administrators to share their experience working with Ceph and to discuss how it can be used to address many of the challenges now associated with Big Data and data-intensive research. The format for this BoF was a series of seven lightning talks. Each talk was about seven minutes, followed by three minutes of questions. Researchers, developers, and administrators attended to share and learn about Ceph in the wild.

The Ceph storage platform is freely available software that can be used to deploy distributed, fault-tolerant storage clusters using inexpensive commodity hardware. Ceph can be used for a wide variety of storage applications because a single system can be made to support block, file, and object interfaces. This flexibility makes Ceph an excellent storage solution for academic research computing environments, which are increasingly being asked to support complicated workflows with limited hardware and personnel budgets. This BoF brought together experts who have deployed Ceph for a variety of applications and who illustrated how Ceph fits into research computing environments.

Speakers and Panel Members

Jim Wilgenbusch, Minnesota Supercomputing Institute, University of Minnesota

  • Talk Title: Introductions and Overview
  • Bio Sketch: James (Jim) Wilgenbusch is the Associate Director of the Minnesota Supercomputing Institute at the University of Minnesota, Twin Cities. Jim helps define MSI's High Performance Computing (HPC) and cyberinfrastructure research agendas and oversees the daily operations of the institute. Prior to MSI, Jim was a Senior Research Associate in the Department of Scientific Computing and the founding director of Florida State University’s Research Computing Center. While at FSU, Jim co-founded the Sunshine State Education and Research Computing Alliance (SSERCA) to bring together Florida’s geographically distributed academic organizations and high-end compute and data storage resources in order to better support statewide research and create regional synergies. In addition to his management responsibilities, Jim maintains funded research activities in the study and implementation of models and search algorithms used in phylogenetic inference, and for over 15 years he has been an invited faculty member at workshops on this topic. Jim did his Ph.D. training at George Mason University in Fairfax, VA.

Douglas Fuller, Red Hat, Inc.

  • Talk Title: Introductions and Overview
  • Bio Sketch: Douglas Fuller has been a Ceph engineer at Red Hat, Inc. since 2015. Prior to joining Red Hat, Doug worked at Oak Ridge National Laboratory's Oak Ridge Leadership Computing Facility on the design and implementation of some of the world’s largest and fastest scalable data storage systems. He holds bachelor’s and master’s degrees from Iowa State University, where he studied programming models and languages for high performance computing.

Dan Ferber, Intel Corporation

  • Talk Title: Ceph at Intel
  • Bio Sketch: Dan Ferber works on the Software Defined Storage (SDS) team in Intel’s Storage Division. He is involved with a number of open source, server-based storage projects, including Ceph reference architectures and Ceph performance testing, and supports several partners with their server-based storage solutions. Prior to his SDS work, Dan led the Solutions Engineering team for Lustre at Intel, supporting Intel’s Lustre community partners and customers in HPC and moving Lustre into both commercial and Exascale high performance I/O environments. He previously ran Business Development and Strategic Partnerships at Whamcloud, a Lustre start-up acquired by Intel in July 2012. Dan has more than 20 years of high tech experience developing and supporting communications software, configuration management tools, database software, I/O libraries, and commercial software at Cray, SGI, Sun, and Oracle. He has also managed engineering, support, and business development teams. Dan received an Excellence in Achievement award from Los Alamos National Laboratory. Dan holds a Master's degree in Computer Systems from the University of St. Thomas in St. Paul, Minnesota and a Bachelor's degree from Occidental College in Los Angeles.

Ben Lynch, Minnesota Supercomputing Institute, University of Minnesota

  • Talk Title: Ceph for Tier-2 Storage at MSI
  • Bio Sketch: Ben Lynch is a Senior Scientific Computing Consultant and Manager of Application Development at MSI. While working toward his Ph.D. at the University of Minnesota, Ben published papers that are among the all-time most cited articles in the Journal of Physical Chemistry A. He has spent several years working with researchers to automate and scale up computational and data-intensive workloads for scientific applications. Ben led the testing and deployment of a Ceph cluster at MSI and continues to be involved in the adoption of Ceph and cloud technologies in HPC environments.

BJ Lougee, Center for the Advancement of Data and Research in Economics (CADRE)

  • Talk Title: A Case for Using CephFS
  • Bio Sketch: BJ Lougee is a computer scientist and HPC Linux Systems Administrator in the Center for the Advancement of Data and Research in Economics (CADRE) at the Federal Reserve Bank of Kansas City. He does research and development on, administration of, and training for the Bank's high performance computing (HPC) environment. He has a particular interest in helping to drive the adoption of HPC techniques in the economics field. Prior to joining the Bank in 2014, he was the Lead HPC Systems Administrator at the High Performance Computing Center (HPCC) for the 76th Software Maintenance Group at Tinker Air Force Base. He holds a bachelor of science in computer science from the University of Central Oklahoma and will begin working toward his master’s degree in computer science through the OMSCS program at Georgia Tech in January.

Dr. Hong Ong, Advanced Computing Lab and Accelerative Technology Lab, MIMOS Berhad

  • Talk Title: Ceph@MIMOS
  • Bio Sketch: Hong Ong is a Senior Director at MIMOS Berhad, where he leads the Advanced Computing Lab and Accelerative Technology Lab. His current role is to provide technology leadership in defining and developing lines of applied system research and to transform research products into commercializable products. Hong received his Ph.D. in Computer Science from the University of Portsmouth, UK, in 2004, and his B.S. and M.S. in Computer Science from Kent State University, Kent, Ohio, in 1996 and 1999, respectively. He has published more than 70 research papers and 10 patents.

John-Paul Robinson, University of Alabama at Birmingham

  • Talk Title: Ceph@UAB: Empowering Research
  • Bio Sketch: John-Paul Robinson has been developing distributed systems and helping organizations adopt them for more than two decades. A long-term proponent of open solutions and their ability to empower users, he is working to enhance the functionality of campus high performance computing environments to supply researchers with the tools needed for collaborative science, using the cloud to power reliable pipelines that scale across systems and remain reproducible over time. John-Paul led the adoption of Ceph and OpenStack at the University of Alabama at Birmingham and has used them for over two years to develop services and empower researchers to think beyond the storage bottleneck.

Greg Farnum, Red Hat, Inc.

  • Talk Title: CephFS Today and Tomorrow
  • Bio Sketch: Greg Farnum is a long-standing member of the core Ceph development team, having joined the project as the third full-time engineer after graduating from Harvey Mudd College in 2009. He has served in many roles as Ceph has grown and acts as a roving contributor when not wearing the hat of filesystem team technical lead. Greg is passionate about solving problems in distributed computing.

Mark Nelson, Red Hat, Inc.

  • Talk Title: Ceph Performance: Recent Testing and Findings
  • Bio Sketch: Mark Nelson is the Ceph Community Performance Lead at Red Hat. Mark's interests range from building tools to aid in performance analysis and testing of storage systems to research and development of new storage technologies. When he's not digging into topics like the behavior of memory allocators and file systems, Mark takes on overly ambitious DIY home renovation projects and enjoys spending time with his wife and two children.