Builder User Guide
Table of Contents
- 1. Introduction
- 1.1. Document Scope and Assumptions
- 1.2. Policies to Review
- 1.2.1. Login Node Abuse Policy
- 1.2.2. Workspace Purge Policy
- 1.2.3. Scheduled Maintenance Policy
- 1.2.4. Archive Policy
- 1.3. Obtaining an Account
- 1.4. Requesting Assistance
- 2. System Configuration
- 2.1. System Summary
- 2.2. Processor
- 2.3. Memory
- 2.4. Operating System
- 3. Accessing the System
- 3.1. Kerberos
- 3.2. Logging In
- 3.3. File Transfers
- 4. User Environment
- 4.1. User Directories
- 4.1.1. Home Directory
- 4.1.2. Work Directory
- 4.1.3. Centers Directory
- 4.2. Shells
- 4.3. Environment Variables
- 5. Program Development
- 5.1. Programming Models
- 5.2. Available Compilers
- 5.2.1. GNU Compilers
- 6. Batch Scheduling
- 6.1. Scheduler
- 7. Software Resources
- 7.1. Application Software
- 7.2. Useful Utilities
- 7.3. Sample Code Repository
- 8. Links to Vendor Documentation
1. Introduction
1.1. Document Scope and Assumptions
This document provides an overview and introduction to the use of the system named Builder located at the MHPCC DSRC, along with a description of the specific computing environment on Builder. The intent of this guide is to provide information that will enable the average user to perform computational tasks on the system. To receive the most benefit from the information provided here, you should be proficient in the following areas:
- Use of the UNIX operating system
- Use of an editor (e.g., vi or emacs)
- Remote usage of computer systems via network or modem access
- A selected programming language and its related tools and libraries
1.2. Policies to Review
Users are expected to be aware of the following policies for working on Builder.
1.2.1. Login Node Abuse Policy
Memory- or CPU-intensive programs can significantly affect all users of the system. The Builder node is shared by all users and does not employ a batch scheduler. Long-running processes and container tests (over 12 hours) may be terminated to allow fair access for all users. Users may execute containers with small data sets for under 3 hours on Builder as long as node resources (CPU, GPU, RAM, etc.) are available. Users are expected to avoid excessive use of Builder resources when running or testing containers so that everyone has equal access to the system.
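To stay within the 3-hour guideline for container tests, one option is to wrap the test in the standard Linux timeout utility so it is stopped automatically. This is a minimal sketch; the image name and test script below are hypothetical placeholders.
% timeout 3h singularity exec my_image.sif ./run_test.sh   # terminates the test if it exceeds 3 hours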
1.2.2. Workspace Purge Policy
The /raid/workdir ($WORKDIR) file system is temporary work space and is subject to an automated purge cycle with a 14-day minimum file retention. If available disk space gets too low, files that have not been accessed in 30 days may be deleted. $WORKDIR is not backed up; transfer any files you wish to keep to a long-term storage area, such as your archival directory ($ARCHIVE_HOME) or, for smaller files, your home directory ($HOME).
1.2.3. Scheduled Maintenance Policy
The Maui High Performance Computing Center may reserve the entire system for regularly scheduled maintenance on the third Wednesday of every month from 8:00 a.m. to 10:00 p.m. (HST). The reservation is placed the previous Friday, and a committee convenes on Monday afternoon to determine whether maintenance will be performed.
Additionally, the system may be down periodically for software and hardware upgrades at other times. Users are usually notified of such times in advance via "What's New" and the login banner. Unscheduled downtimes are unusual but do occur, and in such cases advance notification to users may not be possible. If you cannot access the system outside of a scheduled downtime period, please email or call the HPC Help Desk.
1.2.4. Archive Policy
MHPCC provides information on its web site about best use of the Archive. Users who read or write thousands of files, or very large files, to the Archive adversely impact its performance for all users. A user who is negatively impacting the performance of the Archive will be notified and advised of how to best use the Archive. If the user continues to adversely impact the Archive after being notified, the user's access to the Archive will be suspended until the user agrees to follow best-use practices. Data stored on the Archive must be for legitimate projects or task orders. Users will be asked to remove data that is not for a sanctioned project or task order; if the user does not remove the unacceptable data, it will be removed by the MHPCC storage administrator.
1.3. Obtaining an Account
The process of getting an account on the HPC systems at any of the DSRCs begins with getting an account on the HPCMP Portal to the Information Environment, commonly called a "pIE User Account." If you do not yet have a pIE User Account, please visit HPC Centers: Obtaining An Account and follow the instructions there. Once you have an active pIE User Account, visit the MHPCC accounts page for instructions on how to request accounts on the MHPCC DSRC HPC systems. If you need assistance with any part of this process, please contact the HPC Help Desk at accounts@helpdesk.hpc.mil.
1.4. Requesting Assistance
The HPC Help Desk is available to help users with unclassified problems, issues, or questions. Analysts are on duty 8:00 a.m. - 8:00 p.m. Eastern, Monday - Friday (excluding Federal holidays).
- Web: https://helpdesk.hpc.mil
- E-mail: help@helpdesk.hpc.mil
- Phone: 1-877-222-2039 or (937) 255-0679
- Fax: (937) 656-9538
You can contact the MHPCC DSRC directly in any of the following ways for support services not provided by the HPC Help Desk:
- Web: https://mhpcc.hpc.mil/user/help_form.html
- E-mail: help@helpdesk.hpc.mil
- Phone: (808) 879-5077
- Fax: (808) 879-5018
- U.S. Mail:
Maui High Performance Computing Center
550 Lipoa Parkway
Kihei, Maui, HI 96753
For more detailed contact information, please see our Contact Page.
2. System Configuration
2.1. System Summary
Builder is an Aspen Systems Linux Gigabyte server populated with AMD EPYC 7742 processors. Builder uses Intel Gigabit Ethernet as its high-speed network for MPI messages and I/O traffic, and AVAGO MegaRAID to manage its local file system, which targets 130 TB of disk arrays. Builder consists of a single node, so memory is shared among all cores on that node. The node has two 64-core processors (128 cores) running the RHEL 8 operating system and sharing 1,024 GBytes of memory, with no user-accessible swap space. Builder has 130 TBytes (formatted) of disk storage.
As described in the Login Node Abuse Policy (Section 1.2.1), the Builder node is shared by all users and does not employ a batch scheduler. Long-running processes and container tests (over 12 hours) may be terminated to allow fair access for all users. Users may execute containers with small data sets for under 3 hours as long as node resources (CPU, GPU, RAM, etc.) are available, and are expected to avoid excessive use of Builder resources when running or testing containers.
| Login/Compute Node | |
|---|---|
| Total Nodes | 1 |
| Operating System | RedHat Linux 8 |
| Cores/Node | 128 |
| Core Type | AMD EPYC 7742 + 2x NVIDIA V100 |
| Core Speed | 2.25 GHz |
| Memory/Node | 1,024 GBytes + 64 GBytes HBM |
| User-Accessible Memory/Node | 1,000 GBytes |
| Memory Model | Shared on node |
| Interconnect Type | XL710 40GbE QSFP+ to CWFS; I350 Gigabit Ethernet to DREN |
| File System | File System Type |
|---|---|
| /home ($HOME) | 1.5 TB local SSD, xfs, 50 GB user quota |
| /raid/projdir ($PROJECTS_HOME) | 80 TB local RAID, xfs |
| /raid/workdir ($WORKDIR) | 51 TB local RAID, xfs, 14-day minimum file retention, 200 GB user quota |
| /ssd | 3 TB Singularity scratch disk (see note below) |
| /p/cwfs ($CENTER) | NFS, 120-day minimum file retention |
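The /ssd file system is provided as Singularity scratch space. One way to take advantage of it, assuming you have (or can create) a writable directory under /ssd, is to point Singularity's temporary and cache directories at it; the paths below are illustrative only.
% mkdir -p /ssd/$USER/tmp /ssd/$USER/cache
% export SINGULARITY_TMPDIR=/ssd/$USER/tmp      # scratch space used during container builds
% export SINGULARITY_CACHEDIR=/ssd/$USER/cache  # cache for pulled/converted images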
2.2. Processor
Builder uses two 2.25-GHz, 64-core AMD EPYC 7742 processors (128 cores total). Builder's GPU capability consists of dual NVIDIA V100 GPUs with 32 GB of memory.
2.3. Memory
Builder uses a shared memory model. Memory is shared among all the cores on the node, which contains 1024 GBytes of user accessible shared memory.
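To see how much of the node's memory is currently in use or available, the standard Linux free utility can be used:
% free -h   # report total, used, and available memory in human-readable units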
2.4. Operating System
The operating system on Builder is RedHat Enterprise Linux 8 (RHEL 8).
3. Accessing the System
3.1. Kerberos
A Kerberos client kit must be installed on your desktop to enable you to get a Kerberos ticket. Kerberos is a network authentication tool that provides secure communication by using secret cryptographic keys. Only users with a valid HPCMP Kerberos authentication can gain access to Builder. More information about installing Kerberos clients on your desktop can be found at HPC Centers: Kerberos & Authentication.
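Once a Kerberos client kit is installed, you obtain a ticket before connecting to Builder. A minimal sketch follows; the exact kinit invocation and realm depend on the HPCMP Kerberos kit you installed, so the realm shown here is illustrative only.
% kinit username@HPCMP.HPC.MIL   # obtain a Kerberos ticket (realm shown is illustrative)
% klist                          # verify that a valid ticket was granted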
3.2. Logging In
The system host name for Builder is builder.mhpcc.hpc.mil.
• Kerberized SSH
The preferred way to login to Builder is via ssh, as follows:
ssh -l username builder.mhpcc.hpc.mil
3.3. File Transfers
File transfers to DSRC systems (except for those to the local archive system) must be performed using Kerberized versions of the following tools: scp, mpscp, sftp, ftp, and kftp. Before using any Kerberized tool, you must use a Kerberos client to obtain a Kerberos ticket. Information about installing and using a Kerberos client can be found at HPC Centers: Kerberos & Authentication.
The command below uses secure copy (scp) to copy a single local file into a destination directory on the Builder login node. The mpscp command is similar to scp but uses a different underlying means of data transfer and may enable greater transfer rates; it has the same syntax as scp.
% scp local_file user@builder.mhpcc.hpc.mil:/target_dir
Both scp and mpscp can be used to send multiple files. This command transfers all files with the .txt extension to the same destination directory.
% scp *.txt user@builder.mhpcc.hpc.mil:/target_dir
The example below uses the secure file transfer protocol (sftp) to connect to Builder, then uses the sftp cd and put commands to change to the destination directory and copy a local file there. The sftp quit command ends the sftp session. Use the sftp help command to see a list of all sftp commands.
sftp user@builder.mhpcc.hpc.mil
sftp> cd target_dir
sftp> put local_file
sftp> quit
The Kerberized file transfer protocol (kftp) command differs from sftp in that your username is not specified on the command line, but given later when prompted. The kftp command may not be available in all environments.
% kftp builder.mhpcc.hpc.mil
username> user
kftp> cd target_dir
kftp> put local_file
kftp> quit
Windows users may use a graphical file transfer protocol (ftp) client such as FileZilla.
4. User Environment
4.1. User Directories
The following user directories are provided for all users on Builder.
4.1.1. Home Directory
When you log in, you are placed in your home directory, /home/username. It can be referenced by the environment variable $HOME. Your home directory is intended for storage of frequently-used files, scripts, and small utility programs. It has a 50GB quota, and files stored there are not subject to automatic deletion based on age. It is backed up weekly to enable file restoration in the event of catastrophic system failure.
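Because of the 50GB quota, it is worth checking your home directory usage occasionally. A simple way to do this with standard Linux tools is shown below; any center-specific quota-reporting command is not covered here.
% du -sh $HOME   # total space currently used under your home directory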
Important! The home file system is not tuned for parallel I/O and does not support application-level I/O. Jobs performing file I/O in your home directory will perform poorly and cause problems for everyone on the system. Running jobs should use the work file system (/raid/workdir) for file I/O.
4.1.2. Work Directory
The work file system is a high-performance RAID with a capacity of 51 TB. It is accessible from the login node and provides temporary file storage for queued and running jobs.
All users have a work directory, /raid/workdir/username, on this file system, which can be referenced by the environment variable, $WORKDIR. This directory should be used for all application file I/O. NEVER allow your jobs to perform file I/O in $HOME.
$WORKDIR has a 200GB quota. It is not backed up or exported to any other system and is subject to an automated deletion cycle. If available disk space gets too low, files that have not been accessed in 30 days may be deleted. If this happens, or if a catastrophic disk failure occurs, lost files are irretrievable. To prevent the loss of important files, transfer them to a long-term storage area, such as your archival directory ($ARCHIVE_HOME), which has no quota, or, for smaller files, your home directory ($HOME).
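A minimal sketch of moving data off $WORKDIR before it is purged is shown below. The directory and file names are placeholders, and the mechanism for ultimately pushing data to $ARCHIVE_HOME depends on the archive access method, which is not described in this guide.
% tar -cf results.tar -C $WORKDIR results   # bundle a results directory into a single file
% cp results.tar $CENTER/                   # stage the bundle on the Center-Wide File System
% cp small_notes.txt $HOME/                 # smaller files can go to your home directory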
The projects file system is a high-performance RAID with a capacity of 80 TB. Users may request shared $PROJECTS_HOME directories. The file system can be referenced by the environment variable $PROJECTS_HOME and resides at /raid/projdir/projectname. It has a 50GB quota, and files stored there are not subject to automatic deletion based on age. It is backed up weekly to enable file restoration in the event of catastrophic system failure.
4.1.3. Centers Directory
The Center-Wide File System (CWFS) is an NFS-mounted file system. It is accessible from the login nodes of all HPC systems at the center and from the HPC Portal. It provides centralized, shared storage that enables users to easily access data from multiple systems. The CWFS is not tuned for parallel I/O and does not support application-level I/O.
All users have a directory on the CWFS. The name of your directory may vary between machines and between centers, but the environment variable $CENTER will always refer to this directory.
$CENTER has a quota of 100 TBytes. It is not backed up or exported to any other system and is subject to an automated deletion cycle. If available disk space gets too low, files that have not been accessed in 120 days may be deleted. If this happens, or if a catastrophic disk failure occurs, lost files are irretrievable. To prevent the loss of important files, transfer them to a long-term storage area, such as your archival directory ($ARCHIVE_HOME), which has no quota, or, for smaller files, your home directory ($HOME).
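Because $CENTER is visible from the login nodes of all HPC systems at the center, it is a convenient hand-off point between machines. The file name below is a placeholder.
% cp $WORKDIR/shared_results.dat $CENTER/   # on Builder: stage a file on the CWFS
% ls -l $CENTER/shared_results.dat          # on another center system: the same file is visible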
4.2. Shells
The following shells are available on Builder: csh, bash, ksh, tcsh, zsh, and sh.
To change your default shell, log into the Portal to the Information Environment and go to "User Information Environment" > "View/Modify personal account information". Scroll down to "Preferred Shell" and select your desired default shell. Then scroll to the bottom and click "Save Changes". Your requested change should take effect within 24 hours.
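To check which shell you are currently using, and which shells are installed, standard Linux commands suffice:
% echo $SHELL      # your current login shell
% cat /etc/shells  # shells available on the system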
4.3. Environment Variables
A number of environment variables are provided by default on all HPCMP high performance computing (HPC) systems. We encourage you to use these variables in your scripts where possible. Doing so will help to simplify your scripts and reduce portability issues if you ever need to run those scripts on other systems. The following environment variables are automatically set in your login environment:
4.3.1. Common Environment Variables
The following environment variables are common to both the login and batch environments:
| Variable | Description |
|---|---|
| $ARCHIVE_HOME | Your directory on the archive server. |
| $ARCHIVE_HOST | The host name of the archive server. |
| $BC_HOST | The generic (not node-specific) name of the system. |
| $CC | The currently selected C compiler. This variable is automatically updated when a new compiler environment is loaded. |
| $CENTER | Your directory on the Center-Wide File System (CWFS). |
| $CXX | The currently selected C++ compiler. This variable is automatically updated when a new compiler environment is loaded. |
| $F77 | The currently selected Fortran 77 compiler. This variable is automatically updated when a new compiler environment is loaded. |
| $HOME | Your home directory on the system. |
| $KRB5_HOME | The directory containing the Kerberos utilities. |
| $PROJECTS_HOME | A common directory where group-owned and supported applications and codes may be maintained for use by members of a group. Any project may request a group directory under $PROJECTS_HOME. |
| $WORKDIR | Your work directory on the local temporary file system (i.e., local high-speed disk). |
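A minimal sketch of how these variables might be used in a shell script to keep it portable across HPCMP systems (the file names are placeholders):
#!/bin/bash
# Work from $WORKDIR so all file I/O stays on the high-speed work file system.
cd $WORKDIR
$CC -O2 -o my_app my_app.c    # compile with the currently selected C compiler
./my_app > results.out        # run the program, writing output to the work directory
cp results.out $CENTER/       # stage results on the Center-Wide File System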
5. Program Development
5.1. Programming Models
Builder does not use MPI.
5.2. Available Compilers
Builder provides one compiler suite:
- GNU
5.2.1. GNU Compiler Collection
| Command | Purpose |
|---|---|
| gcc | C compiler, located in /usr/bin |
| g++ | C++ compiler, located in /usr/bin |
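A simple compile-and-run example with the GNU compilers (the source file name is a placeholder):
% gcc -O2 -o hello hello.c   # compile a C program with optimization
% ./hello                    # run the resulting executable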
6. Batch Scheduling
6.1. Scheduler
Builder does not employ a batch scheduler. Processes are run interactively on the shared node, subject to the Login Node Abuse Policy (Section 1.2.1).
7. Software Resources
7.1. Application Software
The main application running on this system is Singularity. The general rule for all COTS software packages is that the two latest versions will be maintained on our systems. For convenience, modules are also available for most COTS software packages.
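A minimal Singularity workflow sketch, assuming outbound network access from Builder and using a public image name purely as an illustration:
% singularity pull ubuntu.sif docker://ubuntu:22.04   # build a local SIF image from a Docker Hub image
% singularity exec ubuntu.sif cat /etc/os-release     # run a command inside the container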
7.2. Useful Utilities
TBD
7.3. Sample Code Repository
TBD