MHPCC
HPC Portal Maintenance - Wednesday, September 2nd, 2015
27 August 2015
We will be applying a security patch to the MHPCC Portal on Wednesday, September 2nd, 2015. The patch is expected to take 1 hour.

During this time the Portal will be unavailable.
Please plan accordingly.

If you have any questions/comments, please direct them to the CCAC (Consolidated Customer Assistance Center) at:

1-877-222-2039 or by e-mail at help@ccac.hpc.mil.

Thank You,
Portal Development Team
Holiday Notice - Monday, September 7th, 2015
24 August 2015
On Monday, September 7th, 2015, MHPCC will be operating on a holiday schedule. The center will be open and all technology services will be available; however, MHPCC will be operating at reduced staffing levels.

If you require assistance, onsite staff will be available to respond to Severity 1 Help Desk calls at 808-879-5077, and MHPCC will contact other technical staff if needed. If you anticipate requiring any special assistance on September 7th, please contact our User Services department or the CCAC by email or by telephone.



Guide to Using the Archive System

Table of Contents

1. Archival Basics
2. Important Guidelines
3. Archival from the Command Line (Manual Staging)
4. Archival in Compute Jobs
5. Archival in Transfer Queue Jobs (Batch Staging)

1. Archival Basics

1.1. Why do I need to archive my data?

The short answer is to free up system resources and to protect your data.
Your work directory, $WORKDIR, resides on a large temporary file system that is shared with other users.  This file system is intended to temporarily hold data that is needed or generated by your jobs.  Since there is no quota on these directories and since user jobs often generate a lot of data, the file system would fill up very quickly if everyone was allowed to just leave their files there indefinitely.  This would negatively impact everyone and make the system unusable.  To protect the system, an automated purge cycle may run to free up disk space by deleting older or unused files.  And, if file space becomes critically low, ALL FILES, regardless of age, are subject to deletion.  To avoid this, we strongly encourage you to archive the data you want and keep your $WORKDIR clean by removing unnecessary files.  Remember that your $WORKDIR is not backed up; so if your files are purged and you didn't archive them, they are gone forever!

1.2. How does archival work?

The archive system ($ARCHIVE_HOST) provides a long-term storage area for your important data.  It is extremely large, and your personal archive directory ($ARCHIVE_HOME) has no quota. Even so, you probably don't want to archive everything you generate.
When you archive a file, it's copied to your $ARCHIVE_HOME directory on the archive server's disk cache, where it waits to be written to tape by the system.  The disk cache is a large temporary storage area for files moving to and from tape.  A file in the cache is said to be "online," while a file on tape is "offline."  Once your file is written to tape, it may remain "online" for a short time, but eventually it will be removed from the disk cache to make room for other files in transit.  Both online and offline files will show up in a directory listing, but offline files need to be retrieved from tape before you can use them.
Retrieval from tape can take a while, so be patient; there's a lot going on in the background.  First, the system must determine on which tape (or tapes) your data resides.  These are then robotically pulled from the tape library, mounted in one of the limited number of tape drives (assuming not all of them are busy), and wound into position before retrieval can begin.  After a delay, your file will be retrieved from tape and available for use.

1.2.1. Is there any way to estimate retrieval time?

Transfer and wait times depend on a multitude of parameters, ranging from the size and number of files, to the number of tapes involved, to network load (tape-to-cache and cache-to-HPC), and others too numerous to list. Since most of these parameters vary constantly throughout the day, it is almost impossible to provide realistic estimates for transfer times.
A best-case scenario for transfer times can be estimated, however, by assuming "zero network load" and ignoring all other parameters that adversely affect transfer time.  Of course, you must realize that this is an extremely optimistic estimate, and actual retrieval times WILL vary widely. To do this, you'll need three values:

size - the size of your file (in MBytes)
bw1 - the tape-to-archive-cache bandwidth at your center (see the table below)
bw2 - the archive-cache-to-HPC bandwidth at your center (see the table below)

With these values, the ideal retrieval time is estimated as follows:
(size / bw1) seconds + (size / bw2) seconds + (1 minute for tape loading)
Using this formula, the ideal retrieval time for a 1-GByte (1024-MByte) file on the ORS system is estimated to be:
(1024 / 120) seconds + (1024 / 295) seconds + (1 minute) => about 1 minute 12 seconds
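For instance, here is a minimal shell sketch of the same best-case calculation (the bandwidth values are the ORS figures from the table below; actual times WILL be longer):

# Best-case retrieval estimate for a 1-GByte (1024-MByte) file on ORS.
SIZE=1024   # file size in MBytes
BW1=120     # tape-to-archive-cache bandwidth in MBytes/sec
BW2=295     # archive-cache-to-HPC bandwidth in MBytes/sec
echo "$SIZE $BW1 $BW2" | awk '{printf "%.0f seconds\n", $1/$2 + $1/$3 + 60}'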
The following table lists the maximum bandwidths for each center:


Maximum Bandwidths at DSRCs

Center   Tape-to-archive cache   Archive cache-to-HPC
AFRL     240 MBytes/sec          165 MBytes/sec
ARL      240 MBytes/sec          140 MBytes/sec
ERDC     240 MBytes/sec          137 MBytes/sec
MHPCC    120 MBytes/sec          275 MBytes/sec
Navy     240 MBytes/sec          138 MBytes/sec
ORS      120 MBytes/sec          295 MBytes/sec

1.3. What are the archival configurations?

When using the standardized archival commands (see Section 3 below), the details of the archival configuration at a center are unimportant.  Some archival functions, however, can't be done with the archive command, so it's helpful to understand the archival setup wherever you're working.
There are two main archival processes currently in use across the High Performance Computing Modernization Program, but each DSRC has minor variations that affect how you access the archive server.  Some centers use NFS to mount their archive system on their HPC and Utility Server systems so that they appear as directories on the local machine.  This is very convenient but slightly slower.  Other centers provide access via remote commands, such as scp or rsh.  It's a little less convenient but slightly faster.  Other centers do both, and at those centers, you can choose which method to use.  In addition, some centers allow direct login to their archive server, allowing you to easily manage archived files.
The table below shows the access methods in use at each center.


Archival Processes at the DSRCs

Center   NFS Mount   Remote Commands   Direct Login
AFRL     x           x                 x
ARL      x           -                 -
ERDC     -           x                 x
MHPCC    x           -                 -
Navy     -           x                 x
ORS      x           x                 x

1.3.1. NFS Mount (AFRL, ARL, MHPCC, and ORS)

An NFS-mounted archive file system provides perhaps the most familiar environment for interacting with archived data.  Mounted file systems appear as local directories and are accessible via standard Linux commands, such as cd, mkdir, chmod, etc.  Files can be archived/retrieved simply by copying them to/from $ARCHIVE_HOME.  This approach is extremely convenient and has virtually no learning curve, but can result in slightly slower transfer speeds, which may be more evident with larger files. It's also not portable if used in job scripts.  For portability, we recommend that you use the archive command discussed in Section 3.2.  The $ARCHIVE_HOST environment variable is irrelevant for NFS-mounted file systems.
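
For example, archiving and retrieving on an NFS-mounted archive are just ordinary copies (a sketch; file names are hypothetical):

# Archive a file by copying it into the mounted archive directory.
cp my_data.tar.gz $ARCHIVE_HOME/
# Retrieve it later by copying it back (this may trigger a tape retrieval).
cp $ARCHIVE_HOME/my_data.tar.gz $WORKDIR/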

1.3.2. Remote Commands (AFRL, ERDC, Navy, and ORS)

Remote access to the archive servers, while not quite as convenient as an NFS mount, provides slightly faster transfer speed due to lower network overhead.  Files can be archived/retrieved using commands such as scp, rcp, or mpscp, and other functions (such as chmod, mkdir, etc.) can be performed via the remote shell commands, ssh and rsh.  Remote commands are available in Kerberized and non-Kerberized variants, and each center may support a slightly different set of commands.  In addition, Kerberized commands generally don't work in transfer queue jobs.  All of these commands can make use of the $ARCHIVE_HOST and $ARCHIVE_HOME environment variables.  Some remote commands are demonstrated in Section 3.3 (below).  Note that scripts using remote commands may not be portable.  For portability, we recommend that you use the archive command discussed in Section 3.2.
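
As a rough sketch (file and directory names are hypothetical, and the exact command set supported varies by center), remote-command archival might look like this:

# Copy a file to your archive directory on the archive server.
scp my_data.tar.gz ${ARCHIVE_HOST}:${ARCHIVE_HOME}/
# Perform housekeeping remotely, e.g., create a subdirectory.
ssh ${ARCHIVE_HOST} "mkdir ${ARCHIVE_HOME}/old_runs"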

1.3.3. Direct Login (AFRL, ERDC, Navy, and ORS)

Direct login to the archive server provides a standard Linux environment with access to all of the familiar commands, such as cp, mv, mkdir, rmdir, chmod, etc.   This allows you to easily and efficiently organize your archived content, set permissions, or delete content that is no longer needed.  Most commands can be run without causing the retrieval of anything from tape.   However, actions requiring access to the actual contents of a file will automatically retrieve the file and will take longer to complete.  For example: copying or editing a file that is already on tape, tar operations, compression operations, etc.
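
For example, after logging in to the archive server, the following housekeeping (directory names are hypothetical) touches only metadata and retrieves nothing from tape:

# Reorganize archived files; none of these commands reads file contents.
mkdir completed_2015
mv run_*.tar.gz completed_2015/
chmod -R 750 completed_2015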

1.4. What is data staging?

Data staging is the process of making sure that your data is in the right place at the right time.  Related terms are "staging in" or "pre-staging" and "staging out" or "post-job archival."  Before a job can run, the input data needs to be "staged in" or "pre-staged."  This simply means that the data is copied from the archive server (or some other source) into a directory that is accessible by the job script.  Archiving your output data after the job completes is called "post-job archival" or "staging out."  "Staging out" may also refer to moving your output data to another location, like the Center-Wide File System ($CENTER), for further processing.
Staging may be performed manually or via a batch script, but since retrieving a file (especially a large file) from tape may take a while, ensuring that your input data is in place before your job runs, and that it stays there until it runs, isn't always as simple as it sounds.   To help with this, every HPC system and Utility Server has a transfer queue just for handling file transfers.  For more about manual staging, see Section 3 (below).  For more about batch staging with the transfer queue, see Section 5 (below). 

2. Important Guidelines

These guidelines are important to help safeguard stability of the archive server and to minimize negative impact to all users.  Failure to observe these guidelines may result in loss of archival privileges.

2.1. Do use compressed tar files.

There are two factors that make archival using compressed tar files a good idea:  overhead and size.
First, let's look at overhead.   Every time you archive or retrieve a file, a complex set of time-consuming actions occurs.  Some of these actions are described in Section 1.2, but there are others as well.   So, if you archive 100 individual files, those time-consuming actions must be performed 100 times.  This can really add up.  But if you combine those 100 files into a single tar file, those time-consuming actions happen only once.  Also note that NOT using tar files can adversely impact the performance of the archive server for all users.
Now let's look at size.  By compressing a tar file, you not only save space on the archive server (which benefits everyone), but you also increase the likelihood that your file will fit entirely on a single tape, eliminating the need to pull and mount multiple tapes and decreasing the chance of file corruption.  It also reduces the transfer time when moving the file to or from the archive server.  Note: always remember to tar/gzip your files before transferring them.
There is, however, one gotcha that you need to watch for when using tar files.  Do not make them too big.  While the optimal tar file size may vary between DSRCs, a maximum tar file size of about 200 GBytes is a good rule of thumb.  At that size, the time required for file transfer and tape I/O is still reasonable.  Files larger than 1 TByte are far more likely to span tapes, greatly increasing archival and retrieval times, as well as the chance that a portion of the file could become unusable.  The following table shows the maximum recommended tar file sizes at each of the centers.


Center       Recommended Maximum Tar File Size
AFRL DSRC    500 GBytes
ARL DSRC     200 GBytes
ERDC DSRC    500 GBytes
Navy DSRC    500 GBytes
MHPCC DSRC   200 GBytes
ORS          200 GBytes

There is one final caveat to address.  If your files are mostly binary data, compressing them will do little good and could possibly cost more time than would be saved.  If this is true of your data, you should probably forego compression, though we still recommend combining multiple files into a single tar file.
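
Putting this into practice, a minimal sketch (directory and file names are hypothetical) looks like this:

# Bundle a results directory into a single compressed tar file.
tar czf my_results.tar.gz my_results/
# Confirm the result is under your center's recommended maximum size.
ls -lh my_results.tar.gz
# Send the single file to the archive server.
archive put my_results.tar.gz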

2.2. Do not overwhelm the archive system.

Although the archive system provides enormous capacity, it is, in fact, limited in two important ways.  The most significant limit is the number of tape drives, which determines the number of tapes that can be read from or written to at once.  The second limit is the size of the disk cache, which determines how much data can be online at once. 
Attempting to archive or retrieve too many files at once can fill up the disk cache on the archive server, halting archival and staging for all users.  Even if the cache does not reach capacity, it could still tie up all available tape drives, impacting other users.  To avoid this possibility, if you need to retrieve more than about 10 TBytes of data or more than about 30 files at once, please contact the HPC Help Desk for assistance.

2.3. Do not use files in the archive directly.

This is a common mistake for users who are logged into an archive server directly, or who use an NFS-mounted archive partition.   The important thing to realize is that although files "look" like they're on disk, they're actually on tape.  Any attempt to use those files (for instance with commands like tar, vi, more, less, or grep) will begin the time-consuming process of retrieving the file from tape.  Imagine the result of the following:
zcat *.tar.gz | tar -tvf - | grep search_term

The intent of this command would be to grep through the content listings of multiple compressed tar files for a search term.  On a normal file system, this would be no big deal.   But on an archive file system, this would require the retrieval of every one of the compressed tar files (possibly thousands of files), which could potentially overwhelm the disk cache on the archive server.  This would be undesirable.
If you find that you have inadvertently done something like this, cancel the command immediately, and contact the HPC Help Desk.
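One way to avoid this trap entirely is to save a small listing ("table of contents") file alongside each tar file when you create it; you can then retrieve and search the listing without ever touching the tar file on tape. A minimal sketch (file names are hypothetical):

# Create the compressed tar file and a separate listing of its contents.
tar czf my_data.tar.gz my_data/
tar tzf my_data.tar.gz > my_data.toc

# Archive both; later, retrieve and grep only the small listing file.
archive put my_data.tar.gz my_data.toc
archive get "my_data.toc"
grep search_term my_data.toc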

3. Archival from the Command Line (Manual Staging)

3.1. Why might I choose to manually stage my data?

Manual staging is simply staging from the command line without using the transfer queue.  For many users, this is the simplest way to stage data because small data sets can usually be transferred while you wait.  (Your mileage may vary based on system load.)  There are, however, a few things to consider before deciding to stage data manually, such as how long the transfer will take and whether your login session will stay open that long.
If a transfer is likely to take a while, you can run it in the background with nohup so that it continues even if your session ends:

nohup archive get myfile.tar.gz &

3.2. Standardized Archive Command

The archive command is available on all HPC systems and Utility Servers, allowing you to use the same commands to perform common archival tasks regardless of where you're running or how the local archive server is configured.   The archive command can use wild cards when listing, archiving, or retrieving files, and works the same way in transfer queue job scripts as in an interactive login shell.  The archive command uses $ARCHIVE_HOME as its default target directory on the archive server, unless an alternative path is specified with the "-C path" option.  For operations within $ARCHIVE_HOME, "-C path" may be omitted.  For complete information on the archive command see the archive man page on the systems. 
Functions covered by the archive command are demonstrated below. 

3.2.1. Listing files

To list files on the archive server, use the following command:
archive ls -al [-C path]

3.2.2. Archiving files

To send one or more files to the archive server, use the following command:
archive put [-C path] file1.tar.gz file2.tar.gz ...     

3.2.3. Retrieving files

To retrieve a single file from the archive server, use the following command:
archive get [-C path] file1.tar.gz
Multiple files can be retrieved by listing them in sequence or by using wildcards.  However, wildcard strings must be enclosed in double quotes, as shown below.
archive get [-C path] "file*"

3.2.4. Making directories

To create a directory on the archive server, use the following command:
archive mkdir [-C path] [-m mode] [-p] dir1 dir2 ...
The "-m mode" option sets permissions on the newly created directory.  It is equivalent to executing chmod on the directory using numeric mode specifiers, for instance, "-m 750".
The "-p" option creates necessary intermediate directories in a path if they don't already exist.

3.2.5. Checking server status

Before performing an archive operation, it's always a good idea to check that the archive server is actually up and available.  To check the server status, use the following command:
archive stat
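In a script, one common way to act on the result, used by the transfer-queue examples in Section 5 (below), is to count the "on-line" lines in the output:

STATUS=`archive stat -retry 1 | grep 'on-line' | wc -l`
if [ $STATUS -eq 0 ]; then
  echo "Archive system not on-line!!"
  exit 1
fi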

3.3. Non-standardized Archival Commands

There are, unfortunately, several functions not currently covered by the standardized archive command.  If you need to chmod, rm, or mv a file or directory on the archive server, there's currently no standardized way to do it, so you'll have to rely on methods that may differ from center to center.  For the Open Research system, the following commands are recommended:

3.3.1. Deleting a file

To delete a file on the archive server, use the following command:
rm $ARCHIVE_HOME/file

3.3.2. Deleting a directory

To delete a directory on the archive server, use the following command:
rmdir $ARCHIVE_HOME/directory

3.3.3. Moving or renaming a file or directory

To move or rename a file or directory on the archive server, use the following command:
mv $ARCHIVE_HOME/file $ARCHIVE_HOME/file-new

3.3.4. Changing the permissions of a file or directory

To change the permissions of a file or directory on the archive server, use the following command:
chmod [-R] permission $ARCHIVE_HOME/file
The "-R" option will recursively change the permissions of all matching directories and files beneath the specified directory.

4. Archival in Compute Jobs

Archival and retrieval operations within a batch script running in a compute queue are generally a really bad idea and are strongly discouraged.  While your data is being transferred, the cores reserved by your compute job sit idle and are unavailable to other jobs but continue to accrue time, wasting your allocation.    In addition, archival access (and possibly even the archive command) is not available from compute queues at all centers, and compute job scripts attempting to perform archival operations may fail.

5. Archival in Transfer Queue Jobs (Batch Staging)

5.1. When should I batch stage my data?

Use batch staging whenever manual staging is impractical: for example, when your data sets are too large to transfer while you wait, when a transfer must run unattended, or when staging needs to be chained to your compute jobs (see Section 5.6 below).

5.2. What is the transfer queue?

The transfer queue is a special-purpose queue for transferring or archiving files.  It has access to $HOME, $ARCHIVE_HOME, $WORKDIR, and $CENTER.  Jobs running in the transfer queue use non-computational cores and do not accrue time against your allocation. 
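
A transfer queue job is submitted with qsub just like any other batch job; the "#PBS -q transfer" directive inside the script selects the queue. For example (script name hypothetical):

qsub my_staging_job.pbs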

5.3. Archival Commands

The archival functions listed in Section 3 work the same way in transfer queue jobs as in interactive login shells, so the command examples in Sections 3.2 and 3.3 apply to transfer queue jobs as well.  For more information on specific commands, see the associated man pages on the systems.  Additional transfer queue examples are also found in the Sample Code Repositories ($SAMPLES_HOME) on the systems.

5.4. Staging in via the transfer queue (Pre-staging)

By pre-staging your data in a transfer queue job, you don't have to sit around and wait for your data to be staged before submitting your computational job.  The following standalone script demonstrates retrieval of archived data from the archive server, placing it in a newly created directory in your $WORKDIR, whose name is based on the JOBID.   Let's call this a "pre-staging job."
#!/bin/sh
#PBS -q transfer
#PBS -l select=1:mpiprocs=16:ncpus=16
#PBS -j oe
#PBS -A Project_ID

# Create a directory for this job in $WORKDIR and cd into it.
cd $WORKDIR
JOBID=`echo $PBS_JOBID | cut -d . -f 1`
mkdir my_job.$JOBID
cd my_job.$JOBID

# If the archive server is available, get the data. Otherwise, exit.
STATUS=`archive stat -retry 1 | grep 'on-line' | wc -l`
if [ $STATUS -eq 0 ]; then
  echo "Archive system not on-line!!"
  echo "Exiting: `date`"
  exit 1
fi
echo "Archive system is on-line; retrieving job files."
archive get my_input_data.tar.gz

echo "Input data files retrieved: `date`"
echo "Unpacking input tar file"
tar xvzf my_input_data.tar.gz

echo "Directory contents:"
ls
An additional example of this script is also found in the Sample Code Repositories ($SAMPLES_HOME) on the systems.

5.5. Staging out via the transfer queue

The term "staging out" refers to the process of dealing with the data that's left in your $WORKDIR after your computational job completes.  This generally entails deletion of unneeded files and archival or transfer of important data, which can be time-consuming.  Because of this, users can benefit from using the transfer queue for these activities.  (Remember that jobs in the transfer queue do not consume allocation.)  The following standalone script demonstrates archival of output data to the archive server via the transfer queue. Let's call this a "stage out job."
#!/bin/sh
#PBS -q transfer
#PBS -l select=1:mpiprocs=16:ncpus=16
#PBS -j oe
#PBS -A Project_ID

# cd to wherever your data is located
cd $WORKDIR
echo "Packing data for archiving:"
tar cvzf my_output_data.tar.gz my_output_data

echo "Storing data from computation job:`date`"

# Check to see if archive server is on-line.  If so, run archive task.
# If not, say so, and indicate where the output data is stored for later
# retrieval.
STATUS=`archive stat -retry 1 | grep 'on-line' | wc -l`
if [ $STATUS -eq 0 ]; then
  echo "Archive system not on-line!!"
  echo "Job data files cannot be stored."
  echo "Retrieve them in `pwd` in my_output_data.tar.gz"
  echo "Exiting"
  echo `date`
  exit 2
fi
JOBID=`echo $PBS_JOBID | cut -d. -f 1`
archive mkdir my_job.$JOBID
archive put -C my_job.$JOBID my_output_data.tar.gz
archive ls my_job.$JOBID

date
exit
An additional example of this script is also found in the Sample Code Repositories ($SAMPLES_HOME) on the systems.

5.6. Tying it all together

While the previous examples were standalone examples, the following technique creates a 3-step job chain that runs from stage-in to stage-out without any involvement from you.  This can be advantageous if your workflow is already well-defined and proven, and does not require you to personally analyze your output prior to staging out.
If, however, your workflow does require an eyes-on analysis of the output data or if it requires post processing prior to analysis, you may want to use the stage out job instead to transfer your data to $CENTER, as demonstrated in Section 5.6.4 (below). You may still submit a transfer queue job later on the Utility Server to archive data that you want to keep.
For the purposes of this demonstration, we'll assume that the following scripts are saved as "prestaging.pbs," "computation.pbs," and "outstaging.pbs," respectively.  Additional examples of these scripts are also found in the Sample Code Repositories ($SAMPLES_HOME) on the systems. 
Note the use of the $PBS_O_WORKDIR environment variable in scripts 2 and 3 (below).  This variable is automatically set to the directory in which qsub is executed in script 1.  Scripts 2 and 3 then cd to that directory before launching their jobs.
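Assuming all three scripts reside in $WORKDIR (which is where scripts 1 and 2 below expect to find them), the entire chain is started with a single submission:

cd $WORKDIR
qsub prestaging.pbs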

5.6.1. Script 1 of 3 (Pre-staging)

This script contains the pre-staging job and launches the computation job.
#!/bin/sh
#PBS -q transfer
#PBS -l select=1:mpiprocs=16:ncpus=16
#PBS -j oe
#PBS -A Project_ID

# Create a directory for this job in $WORKDIR and cd into it.
cd $WORKDIR
JOBID=`echo $PBS_JOBID | cut -d . -f 1`
mkdir my_job.$JOBID
cd my_job.$JOBID

# If the archive server is available, get the data. Otherwise, exit.
STATUS=`archive stat -retry 1 | grep 'on-line' | wc -l`
if [ $STATUS -eq 0 ] ; then
  echo "Archive system not on-line!!"
  echo "Exiting: `date`"
  exit 1
fi

echo "Archive system is on-line; retrieving job files."
archive get my_input_data.tar.gz

echo "Input data files retrieved: `date`"
echo "Unpacking input tar file"
tar xvzf my_input_data.tar.gz
rm my_input_data.tar.gz

echo "Directory contents:"
ls

echo "Submitting computational job"
qsub -W depend=afterok:${JOBID} ${WORKDIR}/computation.pbs
exit

5.6.2. Script 2 of 3 (Computation)

This script contains the computational job and launches the stage-out job.
#!/bin/sh
#PBS -l walltime=00:30:00
#PBS -j oe
#PBS -q debug
## IBM select statement
#PBS -l select=1:mpiprocs=16:ncpus=16
#PBS -A Project_ID
#PBS -r n

cd $PBS_O_WORKDIR

echo "Executing computation"
## IBM launch command
mpirun -n 16 ./my_executable | tee my_output_data

echo "Computation finished, submitting job to pack and archive data"
COMP_JOB=`echo $PBS_JOBID | cut -d. -f 1`

if [ -f ${WORKDIR}/outstaging.pbs ] ; then
  echo "Submitting archive job to transfer queue: `date`"
  qsub -W depend=afterok:${COMP_JOB} ${WORKDIR}/outstaging.pbs
else
  echo "Post archival script is missing!!!"
  echo "Archive step to store data cannot be performed."
  echo "Exiting."
  exit 1
fi
exit

5.6.3. Script 3 of 3 (Stage out to $ARCHIVE_HOME)

This script contains the out-staging script and is launched by the computation script.
#!/bin/sh
#PBS -q transfer
#PBS -l select=1:mpiprocs=16:ncpus=16
#PBS -j oe
#PBS -A Project_ID
#
cd $PBS_O_WORKDIR
echo "Packing data for archiving:"
tar cvzf my_output_data.tar.gz my_output_data

echo "Storing data from computation job:`date`"

# Check to see if archive server is on-line.  If so, run archive task.
# If not, say so, and indicate where the output data is stored for later
# retrieval.
STATUS=`archive stat -retry 1 | grep 'on-line' | wc -l`
if [ $STATUS -eq 0 ] ; then
  echo "Archive system not on-line!!"
  echo "Job data files cannot be stored."
  echo "Retrieve them in `pwd` in my_output_data.tar.gz"
  echo "Exiting"
  echo `date`
  exit 2
fi
JOBID=`echo $PBS_JOBID | cut -d. -f 1`
archive mkdir my_job.$JOBID
archive put -C my_job.$JOBID my_output_data.tar.gz
archive ls my_job.$JOBID

date
exit

5.6.4. Alternate Script 3 of 3 (Stage out to $CENTER)

This script contains the out-staging script and is launched by the computation script.
#!/bin/sh
#PBS -q transfer
#PBS -l select=1:mpiprocs=16:ncpus=16
#PBS -j oe
#PBS -A Project_ID
#
cd $PBS_O_WORKDIR
echo "Packing data for archiving:"
tar cvzf my_output_data.tar.gz my_output_data

echo "Storing data from computation job:`date`"

# Check to see if $CENTER is on-line.  If so, copy the files.
# If not, say so, and indicate where the output data is stored for later
# retrieval.
if [ ! -d $CENTER ] ; then
  echo "$CENTER is not available!!"
  echo "Job data files cannot be stored."
  echo "Retrieve them in `pwd` in my_output_data.tar.gz"
  echo "Exiting"
  echo `date`
  exit 2
fi
JOBID=`echo $PBS_JOBID | cut -d. -f 1`
mkdir $CENTER/my_job.$JOBID
cp my_output_data.tar.gz $CENTER/my_job.$JOBID
ls $CENTER/my_job.$JOBID
date
exit

 
