Dr Ahmed Shamsul Arefin1
1CSIRO, Canberra, Australia
CSIRO’s HPC cluster systems comprise 500+ compute nodes with varying capabilities and features. Some nodes are equipped with the world’s fastest GPUs, while others have a terabyte or more of DRAM, and so on. Nevertheless, the whole family of Linux computing facilities is managed through a single SLES (SUSE Linux Enterprise Server) software image, which is rolled out and configured according to each target node’s profile. A commercial HPC management tool, Bright Cluster Manager (BCM), is used to handle the HPC administration and monitoring workloads. In this work, we briefly report the basic management framework of CSIRO’s HPC systems and introduce an upcoming end-user monitoring feature called User Portal.
Dr Ahmed Arefin is a Computation Scientist working within the HPC Systems Team, Scientific Computing Platforms, CSIRO. He completed his PhD in Computer Science (Data-Parallel Computing & GPUs) at the University of Newcastle, Australia, and worked as a Postdoctoral Researcher (Parallel Data Mining) at the Centre for Bioinformatics, Biomarker Discovery & Information-Based Medicine (CIBM), The University of Newcastle, Australia. His research interests focus on the application of HPC to data mining, graphs and visualization.