Creating a Playground HPC Cluster
Dr Ahmed Shamsul Arefin1
1CSIRO, Acton, Australia
We present a simple HPC playground in which researchers, sysadmins and like-minded people can find and eliminate simple bugs in the HPC environment before any major production roll-out. We propose to use VMware, Bright Cluster Manager (with a free Easy8 license) and the SLES15 ISO from Bright Computing. We build the virtual environment as close as possible to the production HPC cluster, but deploy the cluster head and compute nodes as virtual machines (VMs). The process starts by installing a base OS and a virtualization tool on physical hardware. The head node installation is GUI-guided and also deploys a DHCP server, a scheduler and a file server. We then create the compute VMs, pre-allocate their storage and assign a MAC address to each node. Finally, from Bright View, the Bright Cluster Manager admin portal inside the head node VM, we create compute node profiles and assign the designated MAC addresses to them. When the compute VMs are powered on, the head node dynamically serves the OS image, IP addresses and hostnames. We install software in the NFS-mounted folder as our testing requires.
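The MAC assignment step can be planned ahead of time so that the addresses set on the compute VMs match those entered in the node profiles. The following is a minimal sketch, not part of the workflow described above: the function name plan_compute_nodes and the node001-style hostnames are illustrative assumptions, and the addresses are drawn from the range VMware reserves for manually assigned MACs (00:50:56:00:00:00 to 00:50:56:3F:FF:FF).

```python
from typing import Dict

# Prefix of the range VMware reserves for manually assigned MAC addresses
# (00:50:56:00:00:00 to 00:50:56:3F:FF:FF).
VMWARE_STATIC_PREFIX = "00:50:56"
MAX_SUFFIX = 0x3FFFFF  # largest value allowed in the last three bytes


def plan_compute_nodes(count: int, start: int = 1) -> Dict[str, str]:
    """Return a hostname -> MAC address plan for `count` compute VMs.

    Hostnames (node001, node002, ...) are a hypothetical naming scheme;
    adjust them to whatever the head node is configured to hand out.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    plan: Dict[str, str] = {}
    for index in range(start, start + count):
        if not 0 <= index <= MAX_SUFFIX:
            raise ValueError(f"node index {index} is outside the static MAC range")
        # Encode the node index in the last three bytes of the address.
        suffix = ":".join(f"{byte:02x}" for byte in index.to_bytes(3, "big"))
        plan[f"node{index:03d}"] = f"{VMWARE_STATIC_PREFIX}:{suffix}"
    return plan


if __name__ == "__main__":
    for hostname, mac in plan_compute_nodes(4).items():
        print(hostname, mac)
```

Each planned address would be set on the corresponding VM's network adapter and then entered against the matching node in the admin portal, so that the head node recognises the VM when it boots.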
The resulting virtualised HPC cluster software stack can schedule jobs and run applications. It enables us to test the latest compilers and commercial software against any updated OS kernel. It does not allow us to test network or InfiniBand performance, but accelerator compatibility can be tested when the host has GPUs or similar devices. In future we aim to automate the cluster creation process with Ansible playbooks.
Dr Arefin works within the HPC Team, Scientific Computing, CSIRO. He completed his PhD and postdoctoral research in parallel data mining at the University of Newcastle and a PGCert in Higher Education at Macquarie University. His primary research interest is the application of HPC in data analytics.