
Hokule'a User Guide

Table of Contents

1. Introduction

1.1. Document Scope and Assumptions

This document provides an overview and introduction to the use of the IBM POWER8 (Hokule'a) located at the MHPCC DSRC, along with a description of the specific computing environment on Hokule'a. The intent of this guide is to provide information that will enable the average user to perform computational tasks on the system. To receive the most benefit from the information provided here, you should be proficient in the following areas:

  • Use of the UNIX operating system
  • Use of an editor (e.g., vi or emacs)
  • Remote usage of computer systems via network or modem access
  • A selected programming language and its related tools and libraries

1.2. Policies to Review

Users are expected to be aware of the following policies for working on Hokule'a.

1.2.1. Login Node Abuse Policy

Memory or CPU intensive programs running on the login nodes can significantly affect all users of the system. Therefore, only small applications requiring less than 10 minutes of runtime and less than 2 GBytes of memory are allowed on the login nodes. Any job running on the login nodes that exceeds these limits may be unilaterally terminated.

1.2.2. Workspace Purge Policy

The /scratch/ directory is subject to a 60 day purge policy. A system "scrubber" monitors scratch space utilization, and if available space becomes low, files not accessed within 60 days are subject to removal, although files may remain longer if the space permits. There are no exceptions to this policy.

1.2.3. Scheduled Maintenance Policy

The Maui High Performance Computing Center may reserve the entire system on site for regularly scheduled maintenance the 4th Wednesday of every month from 8:00 am - 10:00 pm (HST). The reservation is scheduled the previous Friday. Every Monday afternoon, a committee convenes to determine if maintenance will be performed.

Additionally, the system may be down periodically for software and hardware upgrades at other times. Users are usually notified of such times in advance by "What's New" and by the login banner. Unscheduled downtimes are unusual but do occur. In such cases, notification to users may not be possible. If you cannot access the system during a non-scheduled downtime period, please contact the HPC Help Desk by email or phone.

1.2.4. Archive Policy

MHPCC has provided information on its web site for best use of the Archive. Users who read or write thousands of files, or very large files, to the Archive adversely impact the performance of the Archive for all users. A user who is negatively impacting the performance of the Archive will be notified and advised of how to best use the Archive. If, after being notified, the user continues to adversely impact the Archive, the user's access to the Archive will be suspended until the user has agreed to follow best-use practices. Data stored on the Archive must be for legitimate projects or task orders. Users will be asked to remove data from the Archive that is not for a sanctioned project or task order. If the user does not remove the unacceptable data from the Archive, it will be removed by the MHPCC storage administrator.

1.3. Obtaining an Account

The process of getting an account on the HPC systems at any of the DSRCs begins with getting an account on the HPCMP Portal to the Information Environment, commonly called a "pIE User Account." If you do not yet have a pIE User Account, please visit HPC Centers: Obtaining An Account and follow the instructions there. Once you have an active pIE User Account, visit the ARL accounts page for instructions on how to request accounts on the ARL DSRC HPC systems. If you need assistance with any part of this process, please contact the HPC Help Desk at accounts@helpdesk.hpc.mil.

1.4. Requesting Assistance

The HPC Help Desk is available to help users with unclassified problems, issues, or questions. Analysts are on duty 8:00 a.m. - 8:00 p.m. Eastern, Monday - Friday (excluding Federal holidays).

You can contact the MHPCC DSRC directly for support services not provided by the HPC Help Desk. For more detailed contact information, please see our Contact Page.

2. System Configuration

2.1. System Summary

Hokule'a is an IBM POWER8 cluster with 32 compute nodes. Each node contains 2 POWER8 processors and 4 NVIDIA Tesla P100 SXM2 GPUs, for a total of 128 GPUs in the cluster. Each GPU has 3,584 CUDA cores, a theoretical peak of 5.3 TFLOPS in double precision and 10.6 TFLOPS in single precision, and 16 GBytes of HBM2 stacked memory with 720 GB/s of memory bandwidth. Each GPU uses NVLink 1.0, which provides a total of 160 GB/s of interconnect bandwidth per GPU; NVLink connects each GPU both to a POWER8 processor and to another GPU. The bandwidth between a GPU and a POWER8 processor is 40 GB/s in each direction (80 GB/s total), and the bandwidth between two GPUs is also 40 GB/s in each direction (80 GB/s total). Two GPUs are connected to each POWER8 processor.

Node Configuration
                     Login Nodes      Compute Nodes
Total Nodes          16               32
Operating System     RedHat Linux     RedHat Linux
Cores/Node           20               20
Core Type            IBM POWER8       IBM POWER8
Core Speed           2.86 GHz         2.86 GHz
Memory/Node          256 GBytes       256 GBytes
Memory Model         Distributed      Distributed
Interconnect Type    Mellanox SB7700 36-port EDR 100 Gb/s InfiniBand

File Systems on Hokule'a
Path                  Capacity     Type
/gpfs/home/<uid>      50 TBytes    GPFS
/gpfs/scratch/<uid>   200 PBytes   GPFS
/mnt/archive/<uid>    32 TBytes    NFS

2.2. Processor

Hokule'a uses POWER8 processors on its login and compute nodes. There are 2 processors per node, each with 10 cores, for a total of 20 cores per node.

2.3. Memory

Hokule'a uses a distributed memory model across nodes. Within a node, memory is shared among all of the cores.

Each node contains 256 GBytes of main memory. All memory and cores on a node are shared among all users who are logged in to that node.

Each compute node provides 252 GBytes of user-accessible shared memory.

2.4. Operating System

The operating system on Hokule'a is RedHat Linux. The operating system supports 64-bit software.

2.5. File Systems

Hokule'a has the following file systems available for user storage:

2.5.1. $HOME /gpfs/home/<uid>

This file system is locally mounted from Hokule'a's GPFS file system. All users have a home directory located on this file system, which can be referenced by the environment variable $HOME. Quotas are enforced at 20 GBytes by default.

2.5.2. $WORKDIR /gpfs/scratch/<uid>

These directories share Hokule'a's locally mounted GPFS file system. All users have a work directory located on /gpfs/scratch/<uid> which can be referenced by the environment variable $WORKDIR.

2.5.3. $ARCHIVE_HOME /mnt/archive/<uid>

This NFS mounted file system is accessible from the login nodes on Hokule'a. Files in this file system are subject to migration to tape and access may be slower due to the overhead of retrieving files from tape. All users have a directory located on this file system which can be referenced by the environment variable $ARCHIVE_HOME.

2.6. Peak Performance

Hokule'a is rated at 440 peak TFLOPS.

3. Accessing the System

3.1. Kerberos

A Kerberos client kit must be installed on your desktop to enable you to get a Kerberos ticket. Kerberos is a network authentication tool that provides secure communication by using secret cryptographic keys. Only users with a valid HPCMP Kerberos authentication can gain access to Hokule'a. More information about installing Kerberos clients on your desktop can be found at HPC Centers: Kerberos & Authentication.

3.2. Logging In

Hokule'a may be accessed via Kerberized SSH:

% ssh hokulea.arl.hpc.mil

or via the following login node:

% ssh hokulea.mhpcc.hpc.mil

Kerberized rlogin is also allowed.

Login nodes are shared access points for Hokule'a; therefore, users should not run resource-intensive processes on these nodes. MHPCC reserves the right to terminate, without notice, any user process that affects the primary access functionality of the login nodes.

3.3. File Transfers

File transfers to DSRC systems (except transfers to the local archive system) must be performed using Kerberized versions of the following tools: ftp, scp, sftp, and mpscp. Before using any Kerberized tool, you must use a Kerberos client to obtain a Kerberos ticket. Information about installing and using a Kerberos client can be found at HPC Centers: Kerberos & Authentication.

The command below uses secure copy (scp) to copy a single local file into a destination directory on a Hokule'a login node. The mpscp command is similar to the scp command but has a different underlying means of data transfer and may enable greater transfer rates. The mpscp command has the same syntax as scp.

% scp local_file user@hokulea.mhpcc.hpc.mil:/target_dir
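
Since mpscp uses the same syntax as scp, an equivalent mpscp transfer would look like the following (a sketch assuming the same local file, user, and target directory):

% mpscp local_file user@hokulea.mhpcc.hpc.mil:/target_dir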

Both scp and mpscp can be used to send multiple files. This command transfers all files with the .txt extension to the same destination directory.

% scp *.txt user@hokulea.mhpcc.hpc.mil:/target_dir

The example below uses the secure file transfer protocol (sftp) to connect to Hokule'a, then uses the sftp cd and put commands to change to the destination directory and copy a local file there. The sftp quit command ends the sftp session. Use the sftp help command to see a list of all sftp commands.

% sftp user@hokulea.arl.hpc.mil

sftp> cd target_dir
sftp> put local_file
sftp> quit

The Kerberized file transfer protocol (kftp) command differs from sftp in that your username is not specified on the command line, but given later when prompted. The kftp command may not be available in all environments.

% kftp hokulea.mhpcc.hpc.mil

username> user
kftp> cd target_dir
kftp> put local_file
kftp> quit

Windows users may use a graphical file transfer protocol (ftp) client such as FileZilla.

4. User Environment

4.1. User Directories

The following user directories are provided for all users on Hokule'a.

4.1.1. Home Directory

When you log on to Hokule'a, you will be placed in your home directory, /gpfs/home/<username> . The environment variable $HOME is automatically set for you and refers to this directory. $HOME is visible to both the login and compute nodes and may be used to store small user files, but it has limited capacity and is not backed up; therefore, it should not be used for long-term storage.

4.1.2. Work Directory

Hokule'a has one large file system for the temporary storage of data files needed for executing programs. You may access your personal working directory under /gpfs/scratch by using the $WORKDIR environment variable, which is set for you upon login. Your $WORKDIR directory has no disk quotas. Because of high usage, the /gpfs/scratch file system tends to fill up frequently. Please review the Purge Policy and be mindful of your disk usage.

REMEMBER: /gpfs/scratch is a "scratch" file system and is not backed up. You are responsible for managing files in your $WORKDIR by backing up files to the archive server and deleting unneeded files when your jobs end. See the section below on Archive Usage for details.

All of your jobs should execute from your $WORKDIR directory, not $HOME. While not technically forbidden, jobs that are run from $HOME are subject to disk space quotas and have a much greater chance of failing if problems occur with that resource.

To avoid unusual errors that can arise from two jobs using the same scratch directory, a common technique is to create a unique subdirectory for each batch job by including the following lines in your batch script:

TMPD=${WORKDIR}/${PBS_JOBID}
mkdir -p ${TMPD}
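
Your job can then change into that directory before launching the executable, so each job's temporary and output files stay isolated (a minimal sketch, assuming the two lines above appear earlier in the same script and using the mpirun launch command described in Section 6.7):

cd ${TMPD}
mpirun ./mympijob.exe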

4.2. Shells

The following shells are available on Hokule'a: csh, bash, ksh, tcsh, zsh, and sh. To change your default shell, please email a request to require@hpc.mil.

4.3. Environment Variables

A number of environment variables are provided by default on all HPCMP HPC systems. We encourage you to use these variables in your scripts where possible. Doing so will help to simplify your scripts and reduce portability issues if you ever need to run those scripts on other systems.

4.3.1. Common Environment Variables

The following environment variables are common to both the login and batch environments:

Common Environment Variables
Variable Description
$ARCHIVE_HOME Your directory on the archive server.
$ARCHIVE_HOST The host name of the archive server.
$BCI_HOME The top-level directory for the common set of open-source utilities provided under the HPCMP Baseline Configuration (BC) program.
$CSE_HOME (TBD) Variable points to location of Computational Science Environment (CSE) Software
$CSI_HOME (TBD) The directory containing the following list of heavily used application packages: ABAQUS, Accelrys, ANSYS, CFD++, Cobalt, EnSight, Fluent, GASP, Gaussian, LS-DYNA, MATLAB, and TotalView, formerly known as the Consolidated Software Initiative (CSI) list. Other application software may also be installed here by our staff.
$HOME Your home directory on the system.
$JAVA_HOME (TBD) The directory containing the default installation of JAVA.
$PET_HOME (TBD) The directory containing the tools formerly installed and maintained by the PETTT staff. This variable is deprecated and will be removed from the system in the future. Certain tools will be migrated to $COST_HOME, as appropriate.
$SAMPLES_HOME (TBD) The Sample Code Repository. This is a collection of sample scripts and codes provided and maintained by our staff to help users learn to write their own scripts. There are a number of ready-to-use scripts for a variety of applications.
$WORKDIR Your work directory on the local temporary file system (i.e., local high-speed disk).

4.3.2. Batch-Only Environment Variables

In addition to the variables listed above, the following variables are automatically set only in your batch environment. That is, your batch scripts will be able to see them when they run. These variables are supplied for your convenience and are intended for use inside your batch scripts.

Batch-Only Environment Variables
Variable Description
$BC_CORES_PER_NODE The number of cores per node for the compute node on which a job is running.
$BC_MEM_PER_NODE The approximate maximum user-accessible memory per node (in integer MBytes) for the compute node on which a job is running.
$BC_MPI_TASKS_ALLOC The number of MPI tasks allocated for a job.
$BC_NODE_ALLOC The number of nodes allocated for a job.

4.4. Modules

Software modules are a convenient way to set needed environment variables and include necessary directories in your path so that commands for particular applications can be found. Hokule'a uses "modules" to initialize your environment with COTS application software, system commands and libraries, compiler suites, environment variables, and PBS batch system commands.

A number of modules are loaded automatically as soon as you log in. To see the modules which are currently loaded, use the "module list" command. To see the entire list of available modules, use "module avail". You can modify the configuration of your environment by loading and unloading modules. For complete information on how to do this, see the Modules User Guide.
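
For example, a typical sequence for inspecting and adjusting your module environment might look like the following (a sketch; the module name shown is hypothetical, so use "module avail" to find the exact names on Hokule'a):

% module avail               # list all available modules
% module load totalview      # hypothetical module name
% module list                # confirm it is now loaded
% module unload totalview    # remove it again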

4.5. Archive Usage (/archive)

Archive storage is provided through the /mnt/archive/<uid> NFS-mounted file system. All users are automatically provided a directory under this file system. However, it is only accessible from the login nodes. Since space in a user's login home area is limited, all large data files requiring permanent storage should be placed in /mnt/archive/<uid>. Also, it is recommended that all important smaller files for which a user requires long-term access be copied to /mnt/archive/<uid> as well. For more information on using the archive system, see the Archive System User Guide.
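
For example, a typical way to archive results from a login node is to bundle them into a tar file and copy it into your archive directory, which is also available via $ARCHIVE_HOME (a sketch; the file and directory names are hypothetical):

% tar -cf results.tar ${WORKDIR}/my_run_dir    # bundle a run directory from scratch
% cp results.tar ${ARCHIVE_HOME}/              # copy the bundle to the archive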

5. Program Development

5.1. Programming Models

Hokule'a supports Message Passing Interface (MPI). MPI is an example of a message- or data-passing model.

5.1.1. Message Passing Interface (MPI)

Hokule'a provides the IBM Spectrum MPI and OpenMPI library suites.

5.2. Available Compilers

Hokule'a has three programming environment suites:

  • IBM
  • GNU
  • PGI

The paths for the compilers are already set up for users through the use of "modules". The default modules that are loaded can be viewed by executing the command "module list".

To see what modules are available, execute the command "module avail".

To change your environment to a different compiler, first clear the currently loaded modules with "module purge", and then load the module for the desired compiler suite with "module load".
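
For example, switching compiler environments might look like the following (a sketch; the module names are hypothetical and should be checked with "module avail"):

% module purge           # clear the currently loaded modules
% module load pgi        # hypothetical PGI compiler module
% module load openmpi    # hypothetical matching MPI module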

5.2.1. IBM Compilers

IBM Compiler Options
Option   Purpose
xlc      IBM C compiler
xlC      IBM C++ compiler
xlf      IBM Fortran (F77 and F90) compiler
mpicc    Compiles and links MPI programs written in C
mpiCC    Compiles and links MPI programs written in C++
mpif77   Compiles and links MPI programs written in Fortran 77
mpif90   Compiles and links MPI programs written in Fortran 90
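
For example, compiling a serial C program and an MPI C program with the IBM compilers might look like the following (a sketch; the source file names are hypothetical):

% xlc -O2 -o myprog myprog.c
% mpicc -O2 -o mympijob.exe mympijob.c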

5.2.2. GNU Compiler Collection

GNU Compiler Options
Option   Purpose
gcc      C compiler, found in /usr/bin
g++      C++ compiler, found in /usr/bin
g77      Fortran 77 compiler, found in /usr/bin
mpicc    Compiles and links MPI programs written in C
mpiCC    Compiles and links MPI programs written in C++
mpif77   Compiles and links MPI programs written in Fortran 77
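
A corresponding build with the GNU suite might look like this (again a sketch with hypothetical source file names):

% g++ -O2 -o myprog myprog.cpp
% mpif77 -O2 -o mympijob.exe mympijob.f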

NOTE: All MPI compilers are built for InfiniBand interconnect communication. We do not support slower Ethernet drivers.


Library paths:

/usr/lib

/usr/lib64

6. Batch Scheduling

6.1. Scheduler

The Portable Batch System (PBS) is currently running on Hokule'a. It schedules jobs and manages resources and job queues, and can be accessed through the interactive batch environment or by submitting a batch request. PBS is able to manage both single-processor and multiprocessor jobs. The PBS module is automatically loaded by the Master module on Hokule'a at login.

6.2. Queue Information

The following table describes the PBS queues available on Hokule'a:

Queue Name   Description
urgent       Approved by HPCMP Director only
high         Must be approved by Service/Agency Principal
frontier     Must be a Frontier project approved by HPCMP
debug        Less than 30 minutes and less than or equal to 64 processors
standard     Must be an approved HPCMP project
background   Lowest priority, 8-hour limit, no allocation subtraction

6.3. Interactive Logins

When you log in to Hokule'a, you will be running in an interactive shell on a login node. The login nodes provide login access for Hokule'a and support such activities as compiling, editing, and general interactive use by all users. Please note the Login Node Abuse Policy. The preferred method to run resource-intensive executions is to use an interactive batch session.

6.4. Interactive Batch Sessions

An interactive session on a compute node is possible using a proper PBS command line syntax from a login node. Once PBS has scheduled your request on the compute pool, you will be directly logged into a compute node, and this session can last as long as your requested wall time.

To submit an interactive batch job, use the following submission format:

qsub -I -X -l walltime=HH:MM:SS -l select=#_of_nodes:ncpus=20:mpiprocs=20 -l place=scatter:excl -A proj_id -q your_queue -V

Your batch shell request will be placed in the interactive queue and scheduled for execution. This may take a few minutes or a long time depending on the system load. Once your shell starts, you will be logged into the first compute node of the compute nodes that were assigned to your interactive batch job. At this point, you can run or debug applications interactively, execute job scripts, or start executions on the compute nodes you were assigned. The "-X" option enables X-Windows access, so it may be omitted if that functionality is not required for the interactive job.
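
For example, a one-hour, two-node interactive session in the standard queue might be requested as follows (a sketch; substitute your own project ID and queue):

% qsub -I -X -l walltime=01:00:00 -l select=2:ncpus=20:mpiprocs=20 -l place=scatter:excl -A your_proj_id -q standard -V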

6.5. Batch Request Submission

PBS batch jobs are submitted via the qsub command. The format of this command is:

qsub [ options ] batch_script_file

qsub options may be specified on the command line or embedded in the batch script file by lines beginning with "#PBS".

For a more thorough discussion of PBS batch submission, see the Hokule'a PBS Guide.

6.6. Batch Resource Directives

A listing of the most common batch Resource Directives is available in the Hokule'a PBS Guide.

6.7. Launch Commands

There are different commands for launching MPI executables from within a batch job depending on which MPI implementation your script uses.

To launch an IBM Spectrum MPI executable, use the mpirun command as follows:

mpirun ./mympijob.exe

To launch an OpenMPI executable, use the mpirun command within the OpenMPI path as follows:

mpirun ./mympijob.exe

To launch an IBM PE MPI executable, use the mpiexec command as follows:

mpiexec ./mympijob.exe

Note: The pe module must be loaded to ensure that the MP environment variables are set correctly. In addition, the MP_HOSTFILE and MP_PROCS variables must be set manually before executing your IBM PE program.
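
Inside a batch script, that setup might look like the following (a sketch; the PBS-provided $PBS_NODEFILE is assumed as the host list, and the process count shown is illustrative):

module load pe                       # load the IBM PE environment
export MP_HOSTFILE=${PBS_NODEFILE}   # assumption: use the PBS node file as the MP host list
export MP_PROCS=20                   # one MPI task per core on a 20-core node
mpiexec ./mympijob.exe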


For OpenMP executables, no launch command is needed.

6.8. Sample Script

A minimal example script is sketched below. Additional examples are available in the Hokule'a PBS Guide.
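
The following is a minimal sketch of a batch script for an MPI job; the job name, project ID, queue, walltime, file names, and executable name are placeholders to replace with your own values:

#!/bin/bash
## Sample PBS batch script (sketch only)
#PBS -N my_job
#PBS -A your_proj_id
#PBS -q standard
#PBS -l select=2:ncpus=20:mpiprocs=20
#PBS -l place=scatter:excl
#PBS -l walltime=01:00:00
#PBS -j oe

# Run from a unique directory under $WORKDIR to avoid collisions between jobs.
TMPD=${WORKDIR}/${PBS_JOBID}
mkdir -p ${TMPD}
cd ${TMPD}

# Copy inputs and the executable from home, run the MPI job, and save results.
cp ${HOME}/my_project/input.dat .
cp ${HOME}/my_project/mympijob.exe .
mpirun ./mympijob.exe > output.log
cp output.log ${WORKDIR}/

Submit the script from a login node with the qsub command described in Section 6.9.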

6.9. PBS Commands

The following commands provide the basic functionality for using the PBS batch system:

qsub: Used to submit jobs for batch processing.
qsub [ options ] my_job_script

qstat: Used to check the status of submitted jobs.
qstat PBS_JOBID ## check one job
qstat -u my_user_name ## check all of user's jobs

qdel: Used to kill queued or running jobs.
qdel PBS_JOBID

A more complete list of PBS commands is available in the Hokule'a PBS Guide.

7. Software Resources

7.1. Application Software

A complete listing with installed versions can be found on our software page. The general rule for all COTS software packages is that the two latest versions will be maintained on our systems. For convenience, modules are also available for most COTS software packages.

7.2. Useful Utilities

The following utilities are available on Hokule'a:

Useful Utilities
Utility        Description
check_license  Checks the status of HPCMP shared applications.
node_use       Displays the amount of free and used memory for login nodes.
qpeek          Displays spooled stdout and stderr for an executing batch job.
qview          Displays information about batch jobs and queues.
showq          Provides a user-friendly, highly descriptive representation of the PBS queue specific to Hokule'a.
show_queues    Reports current batch queue status, usage, and limits.
showres        Displays information about current reservations.
show_storage   Displays MSAS allocation and usage by subproject.
show_usage     Displays CPU allocation and usage by subproject.

8. Links to Vendor Documentation

IBM Home: http://www.ibm.com
IBM POWER8: http://www-03.ibm.com/systems/power/hardware/
IBM POWER8 High-Performance Computing Guide:
https://www.redbooks.ibm.com/Redbooks.nsf/RedbookAbstracts/sg248371.html
Performance Optimization and Tuning Techniques for IBM Power Systems Processors Including IBM POWER8: http://www.redbooks.ibm.com/abstracts/sg248171.html

RedHat Home: http://www.redhat.com/

GNU Home: http://www.gnu.org
GNU Compiler: http://gcc.gnu.org/onlinedocs

Linux High Performance Technical Computing: http://www.linuxhpc.org/
UnixGuide: http://unixguide.net/