Mendes, Celso L.; Bode, Brett; Bauer, Gregory H.; Enos, Jeremy; Beldica, Cristina; Kramer, William T.
Deploying a Large Petascale System: the Blue Waters Experience
2014 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, 29:198-209, 2014

Deploying a large parallel system is typically a very complex process involving several steps: preparation, delivery, installation, testing, and acceptance. Although several petascale machines are now available, the steps and lessons from their deployment are rarely described in the literature. This paper presents the experience gained during the deployment of Blue Waters, the largest supercomputer ever built by Cray and one of the most powerful machines currently available for open science. The presentation focuses on the final deployment steps, in which the system was intensively tested and accepted by NCSA. After a brief introduction to the Blue Waters architecture, the set of acceptance tests employed is described in detail, including many of the results obtained. The major lessons learned during the process follow. These experiences and lessons should help guide similarly complex deployments in the future.

DOI:10.1016/j.procs.2014.05.018
