The following directories of the CSC environment are visible on the servers of the Hippu system: home $HOME, archive $ARCHIVE, metawork $METAWRK, and your project directory.
These directories are specific to the Hippu system: the work directory $WRKDIR, the server-specific work directories $TMPDIR (and $FCWRKDIR), and the user's application directory $USERAPPL. Please note that $WRKDIR is shared not only between the Hippu nodes but also with the Vuori nodes.
The older servers hippu1 and hippu2 are running the RHEL5 operating system distribution, while the newer hippu3 and hippu4 are running RHEL6. Your software compiled on the new servers might not run on the older servers, and vice versa, because of different libraries and architectural differences in the systems. For this reason, the directory $USERAPPL on hippu1 and hippu2 points to a different space than on hippu3 and hippu4. The former is supposed to be used for RHEL5 compatible applications, the latter for RHEL6 compatible installations.
The local scratch space in hippu3 and hippu4 is identified by $TMPDIR. On hippu1 and hippu2, instead, you should use $FCWRKDIR. For performance reasons you should install your own applications under $USERAPPL and launch your jobs from the local scratch space of the server you are running on.
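The choice between $TMPDIR and $FCWRKDIR can be made in a script rather than by hand. The following is a minimal sketch, assuming that the server host names begin with hippu1 to hippu4; the function name pick_scratch is hypothetical and not part of the CSC environment.

```shell
# Hypothetical helper: print the local scratch directory for a given host name.
# Assumption: host names start with "hippu1" ... "hippu4".
pick_scratch() {
    case "$1" in
        hippu1*|hippu2*) printf '%s\n' "${FCWRKDIR:-}" ;;      # older servers
        *)               printf '%s\n' "${TMPDIR:-/tmp}" ;;    # newer servers
    esac
}

# uname -n prints the name of the server you are logged in to.
SCRATCH=$(pick_scratch "$(uname -n)")
echo "Local scratch on this server: $SCRATCH"
```

A job script could then change to "$SCRATCH" before starting the actual computation.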
Figure 2.1 illustrates the disk system of Hippu.
Your home directory $HOME in the CSC computing environment is located on the disk server. The files in your home directory are shared between Hippu and other computing servers at CSC. The home directory is suitable only for small initialization files and frequently used small programs. It is not intended for extensive I/O operations or for large data sets. For performance reasons, it is therefore highly recommended to copy all needed files from your home directory to the work disk before running a job.
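Copying the input files to the work disk could look like the following sketch. The function name stage_inputs and the directory names in the usage comment are hypothetical placeholders; adapt them to your own job.

```shell
# Hypothetical helper: copy the contents of a source directory into a
# job directory on the work disk and print the job directory path.
stage_inputs() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # "src/." copies the contents of src, including hidden files.
    cp -R "$src"/. "$dest"/ || return 1
    printf '%s\n' "$dest"
}

# Example use (directory names are placeholders):
#   cd "$(stage_inputs "$HOME/myjob_inputs" "$WRKDIR/myjob")"
```

Remember to copy any results you want to keep back to a backed-up directory, because the work disk is not backed up.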
There are several file storage areas available for users. Usually you need not (and should not) refer to directories with full path names. Instead, use the defined environment variables. The most commonly used storage areas are listed in Table 2.1. The lifetime of files in each directory is limited as described in the table. $LOGNAME is an environment variable, whose value is your User Id (Login Name).
Table 2.1: File storage areas available for users on the newer servers hippu3 and hippu4.
| Symbol | Where | Lifetime | Backup | Quota |
|---|---|---|---|---|
| $HOME | Home directory (NFS-mounted) | Unlimited | Yes | 1 GB |
| $WRKDIR | /wrk/$LOGNAME (Lustre) | 90 days | No | |
| $TMPDIR | /tmp/$LOGNAME (Local disk) | 30 days | No | 100-200 GB |
| $USERAPPL | /v/users/$LOGNAME (Lustre) | Unlimited | Yes | |
| $METAWRK | /fs/metawrk/$LOGNAME (NFS-mounted) | 30 days | No | 200 GB |
| $ARCHIVE | /fs/archive... (NFS-mounted) | 22 months | Yes | |
Table 2.2: File storage areas available for users on the older servers hippu1 and hippu2.
| Symbol | Where | Lifetime | Backup | Quota |
|---|---|---|---|---|
| $HOME | Home directory (NFS-mounted) | Unlimited | Yes | 1 GB |
| $WRKDIR | /wrk/$LOGNAME (Lustre) | 90 days | No | |
| $FCWRKDIR | /fcwrk/$LOGNAME (Fibre Channel) | 30 days | No | 100-200 GB |
| $USERAPPL | /v/users/$LOGNAME (NFS-mounted) | Unlimited | Yes | |
| $METAWRK | /fs/metawrk/$LOGNAME (NFS-mounted) | 30 days | No | 200 GB |
| $ARCHIVE | /fs/archive... (NFS-mounted) | 22 months | Yes | 1 TB |
| $TMPDIR | /tmp/$LOGNAME (local disk) | 10 days | No | 20-25 GB |
The home directory $HOME is backed up regularly. This directory is meant for permanent files, with a maximum total size of a few megabytes. It is typically used for source code and small input files.
The working directory $WRKDIR is shared by the Hippu nodes and the Vuori cluster. The working directory $TMPDIR (and $FCWRKDIR on the older servers) is node-specific, but it provides faster I/O performance. No backups are taken of the working directories, so a disk crash may destroy their contents. In addition, unmodified files are deleted after 30 days or more, depending on the directory (see Table 2.1). Therefore, if you have installed your own applications and programs in the working directories, you should move or reinstall them under $USERAPPL if you are going to use them soon or frequently. Rarely used applications should be moved under your project directory or $ARCHIVE.
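To see which files are approaching the end of their lifetime, you can list files by modification time with the standard find command. This is only a sketch: the function name list_stale is hypothetical, and it assumes that the automatic cleaning is based on the last modification time, as the text above suggests.

```shell
# Hypothetical helper: list regular files under a directory that have not
# been modified for more than the given number of days.
list_stale() {
    dir=$1
    days=$2
    # -mtime +N matches files whose age in whole days is greater than N.
    find "$dir" -type f -mtime +"$days" -print
}

# Example use: files in the work directory untouched for more than 80 days.
#   list_stale "$WRKDIR" 80
```

Files reported this way can then be moved to $USERAPPL, your project directory or $ARCHIVE before they are removed.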
The user application directory $USERAPPL is shared by the hippu1 and hippu2 nodes. It is also shared by the newer hippu3 and hippu4 nodes, but there it points to a different space with different content than on the older nodes. The two pairs of servers do not share the same space because of the operating system and architecture differences between the newer and older servers. Backups are taken regularly. This is the place where you should install your own applications that you will use immediately after installation or frequently in the near future.
Directory $METAWRK is shared by all machines in the CSC computing environment. This directory is not local, and it is slower to use than the directories $WRKDIR, $TMPDIR and $FCWRKDIR.
Directory $ARCHIVE refers to the archive server. This is not a local filesystem, and it should be used only for long-term storage of compressed and infrequently used files. The access time to the archive server can be very slow because usually the data resides on tapes.
Directory $TMPDIR is node-specific on all Hippu nodes, and it should be used only for short-lived files or jobs requiring very fast I/O performance. $TMPDIR is also used automatically by some tools, such as some compilers.
For the best file retrieval performance, avoid storing a large number of small files. Instead, use the tar utility to archive a number of files and/or directories into one file. For example,
tar -cf example.tar mydirectory
creates a file example.tar containing the whole directory structure and all the files of the directory mydirectory.
To extract the tar file, issue the command tar -xf example.tar, and the "tarred" files with their directory structure will be extracted under the current working directory. The files are stored in the archive server for 22 months from the last modification date.
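Before copying a tar file to the archive server, it can be useful to check that the file is readable. The sketch below packs a directory and lists the result with tar -tf as a basic sanity check; the function name pack_dir is hypothetical, and the file names in the usage comment are the example names used above.

```shell
# Hypothetical helper: pack a directory into a tar file, verify that the
# tar file can be listed, and print the name of the tar file.
pack_dir() {
    src=$1
    out=$2
    # -C changes to the parent directory so that the archive stores
    # relative paths starting with the directory name itself.
    tar -cf "$out" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    tar -tf "$out" > /dev/null || return 1
    printf '%s\n' "$out"
}

# Example use:
#   cp "$(pack_dir mydirectory example.tar)" "$ARCHIVE"/
```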
You can use the command
quota
to check your disk quota usage. NB: not all quotas work yet!
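Where quota does not report anything useful, the standard du command gives a rough picture of how much space a directory uses. This is a minimal sketch with a hypothetical function name, usage_kb; it measures actual disk usage and is not a replacement for the quota system.

```shell
# Hypothetical helper: print the disk usage of a directory in kilobytes.
usage_kb() {
    # du -sk prints "<kilobytes><TAB><directory>"; keep only the number.
    du -sk "$1" | awk '{print $1}'
}

# Example use:
#   echo "Home directory uses $(usage_kb "$HOME") KB"
```

Running du on a large directory tree can take a while, especially on the shared file systems.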