Scratch
RC maintains a large, shared temporary scratch filesystem for general use by high input/output jobs at /n/netscratch.
Scratch Policy
Each lab is allotted 50TB of scratch space for use in its jobs. This is temporary high-performance space, and files older than 90 days will be deleted through a periodic purge process. This purge can run at any time, especially if scratch is getting full, and is also often run at the start of the month during our monthly maintenance period.
There is no charge to labs for netscratch, but please note that it is intended as volatile, temporary scratch space for transient data and is not backed up. If your lab has concerns or needs regarding scratch space or usage, please contact FASRC to discuss.
Modifying file times (via touch or another process) when initially placing data in scratch is allowed; however, doing so subsequently to avoid deletion is an abuse of the filesystem and will result in administrative action from FASRC. To reiterate: you may initially modify the file date(s) on new data so that they are not in the past, but you should not modify them further. If you have longer-term needs, please contact us to discuss options.
The cluster has storage built specifically for high-performance temporary use. You can create your own folder inside your lab group's folder. If that folder doesn't exist or you do not have write access, contact us.
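As a sketch, creating a personal working directory under your lab's netscratch folder might look like the following. Here "jharvard_lab" is an illustrative lab name (substitute your own group), and the fallback to a temporary directory simply lets the sketch run off-cluster:

```shell
# Base of the shared scratch filesystem; fall back to a local temp
# directory when /n/netscratch is not present or not writable.
BASE="/n/netscratch"
[ -d "$BASE" ] && [ -w "$BASE" ] || BASE="$(mktemp -d)"

# Create a personal working directory inside the lab's folder.
# "jharvard_lab" is a hypothetical lab name used throughout FASRC docs.
ME="$(id -un)"
mkdir -p "$BASE/jharvard_lab/Lab/$ME"
ls -ld "$BASE/jharvard_lab/Lab/$ME"
```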
IMPORTANT: netscratch is temporary scratch space and has a strict retention policy.
Size limit | 4PB total, 50TB max per group, 100M inodes
---|---
Availability | All cluster nodes. Cannot be mounted on desktops/laptops.
Backup | NOT backed up
Retention policy | 90-day retention policy. Deletions are run during the cluster maintenance window.
Performance | High: appropriate for I/O-intensive jobs
/n/netscratch is short-term, volatile, shared scratch space for large data analysis projects. The /n/netscratch filesystem is managed by the VAST parallel file system and provides excellent performance for HPC environments. This file system can be used for data-intensive computation, but must be considered a temporary store. Files are not backed up and will be removed after 90 days. There is a 50TB total usage limit per group.
Large data analysis jobs that would fill your 100GB of home space can be run from this volume. Once analysis has been completed, however, data you wish to retain must be moved elsewhere (lab storage, etc.). The retention policy will remove data from scratch storage after 90 days.
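A minimal sketch of moving results off scratch before the purge. The SRC and DEST paths here are stand-ins for real cluster paths (e.g. a results folder under /n/netscratch and your lab storage), so the example runs anywhere:

```shell
# Stand-ins for /n/netscratch/<lab>/... and durable lab storage.
SRC="$(mktemp -d)"
DEST="$(mktemp -d)"
echo "analysis output" > "$SRC/result.txt"

# cp -a preserves timestamps and permissions; rsync -av behaves
# similarly and can additionally resume interrupted transfers.
cp -a "$SRC/." "$DEST/"
cat "$DEST/result.txt"
```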
Each node contains a disk partition, /scratch, also known as local scratch, which is useful for storing large temporary files created while an application is running.
IMPORTANT: Local scratch is highly volatile and should not be expected to persist beyond job duration.
Size limit | Variable (200–300GB total typical). See actual limits per partition.
---|---
Availability | Node only. Cannot be mounted on desktops/laptops.
Backup | Not backed up
Retention policy | Not retained – highly volatile
Performance | High: suited for limited I/O-intensive jobs
The /scratch volumes are directly connected (and therefore fast) temporary storage local to each compute node. Many high-performance computing applications use temporary files that go to /tmp by default. On the cluster we have pointed /tmp to /scratch. Network-attached storage, like home directories, is slow compared to disks directly connected to the compute node. If you can direct your application to use /scratch for temporary files, you can gain significant performance improvements and ensure that large files can be supported.
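One common way to redirect an application's temporary files is the TMPDIR environment variable, which many tools honor. A sketch, assuming the node-local /scratch path described above (with a temp-dir fallback so it runs off-cluster):

```shell
# Use node-local scratch when available; otherwise fall back to a
# local temp directory so the sketch still runs.
LOCAL="/scratch"
[ -d "$LOCAL" ] && [ -w "$LOCAL" ] || LOCAL="$(mktemp -d)"

# Many applications honor TMPDIR for their temporary files.
export TMPDIR="$LOCAL/tmp_$$"
mkdir -p "$TMPDIR"

# sort, for example, spills its work files to the directory given
# by -T (or $TMPDIR by default).
printf 'b\na\nc\n' | sort -T "$TMPDIR" > "$TMPDIR/sorted.txt"
cat "$TMPDIR/sorted.txt"
```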
Though there are /scratch directories available on each compute node, they are not the same volume. The storage is specific to the host and is not shared. For details on the /scratch size available on the hosts belonging to a given partition, see the last column of the table on Slurm Partitions. Files written to /scratch from holy2a18206, for example, are only visible on that host. /scratch should only be used for temporary files written and removed during the running of a process. Although a 'scratch cleaner' does run hourly, we ask that at the end of your job you delete the files you've created.
$SCRATCH VARIABLE
A global variable called $SCRATCH exists on the FASRC Cannon and FASSE clusters which allows scripts and jobs to point to a specific directory in scratch regardless of any changes to the name or path of the top-level scratch filesystem. This variable currently points to /n/netscratch, so, for example, one could use the path $SCRATCH/jharvard_lab/Lab/jsmith in a job script. This has the added benefit of allowing us to change scratch systems at any time without your having to modify your jobs/scripts.