Scratch

FASRC maintains a large, shared temporary scratch filesystem at /n/netscratch for general use by high input/output jobs.

Scratch Policy

Each lab is allotted 50TB of scratch space for use in its jobs. This is temporary high-performance space, and files older than 90 days will be deleted through a periodic purge process. This purge can run at any time, especially if scratch is getting full, and is also often run at the start of the month during our monthly maintenance period.

There is no charge to labs for netscratch, but please note that it is intended as volatile, temporary scratch space for transient data and is not backed up. If your lab has concerns or needs regarding scratch space or usage, please contact FASRC to discuss.

Modifying file times (via touch or another process) when initially placing data in scratch is allowed; however, doing so subsequently to avoid deletion is an abuse of the filesystem and will result in administrative action from FASRC. To reiterate: you may initially modify the file date(s) on new data so that they are not in the past, but you should not modify them further. If you have longer-term needs, please contact us to discuss options.
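For example, one way to see which of your files are approaching the 90-day window is to check modification times with find (the lab group and user directory below are the same illustrative names used elsewhere on this page; substitute your own):

    # List files not modified in the last 80 days, i.e. nearing the purge window
    find /n/netscratch/jharvard_lab/Lab/jsmith -type f -mtime +80 -ls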


Networked, shared netscratch

The cluster has storage built specifically for high-performance temporary use. You can create your own folder inside the folder of your lab group. If that doesn’t exist or you do not have write access, contact us.
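For example, assuming your lab group is jharvard_lab and your username is jsmith (substitute your own), you could create a personal working directory like this:

    # Create a personal directory inside your lab group's netscratch folder
    mkdir -p /n/netscratch/jharvard_lab/Lab/jsmith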

IMPORTANT: netscratch is temporary scratch space and has a strict retention policy.

Size limit: 4 PB total, 50TB max. per group, 100M inodes
Availability: All cluster nodes. Cannot be mounted on desktops/laptops.
Backup: NOT backed up
Retention policy: 90-day retention policy. Deletions are run during the cluster maintenance window.
Performance: High. Appropriate for I/O-intensive jobs.

/n/netscratch is short-term, volatile, shared scratch space for large data analysis projects.

The /n/netscratch filesystem is managed by the VAST parallel file system and provides excellent performance for HPC environments. This file system can be used for data intensive computation, but must be considered a temporary store. Files are not backed up and will be removed after 90 days. There is a 50TB total usage limit per group.

Large data analysis jobs that would fill your 100GB of home space can be run from this volume. Once analysis has been completed, however, data you wish to retain must be moved elsewhere (lab storage, etc.). The retention policy will remove data from scratch storage after 90 days.
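As a sketch of that workflow (the partition, paths, lab/user names, and the my_analysis command are all placeholders, not FASRC-specific values), a batch job might stage inputs onto netscratch, run there, and copy results back to lab storage before they age out:

    #!/bin/bash
    #SBATCH -p shared
    #SBATCH -t 0-04:00
    #SBATCH --mem=8G

    # Job-specific working directory on netscratch (lab/user names are illustrative)
    WORKDIR=/n/netscratch/jharvard_lab/Lab/jsmith/run_${SLURM_JOB_ID}
    mkdir -p "$WORKDIR"

    # Stage input data onto the fast scratch filesystem
    rsync -a /path/to/lab_storage/input/ "$WORKDIR/input/"

    # Run the analysis against the scratch copy (my_analysis is a placeholder)
    cd "$WORKDIR"
    my_analysis input/ output/

    # Copy results you want to keep back to lab storage before the 90-day purge
    rsync -a "$WORKDIR/output/" /path/to/lab_storage/results/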


Local (per node), shared scratch storage

Each node contains a local disk partition, /scratch (also known as local scratch), that is useful for storing large temporary files created while an application is running.

IMPORTANT: Local scratch is highly volatile and should not be expected to persist beyond job duration.

Size limit: Variable (200-300GB total typical). See actual limits per partition.
Availability: Node only. Cannot be mounted on desktops/laptops.
Backup: Not backed up
Retention policy: Not retained – highly volatile
Performance: High. Suited for limited I/O-intensive jobs.

The /scratch volumes are directly connected (and therefore fast) temporary storage local to the compute node. Many high-performance computing applications use temporary files that go to /tmp by default. On the cluster we have pointed /tmp to /scratch. Network-attached storage, like home directories, is slow compared to disks directly connected to the compute node. If you can direct your application to use /scratch for temporary files, you can gain significant performance improvements and ensure that large files can be supported.
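For applications that honor the TMPDIR environment variable, one common pattern (a sketch, not an official FASRC recipe) is to point it at a job-specific directory under /scratch from within your job script:

    # Create a job-specific directory on node-local /scratch and point TMPDIR
    # at it so applications that honor TMPDIR write their temp files there.
    export TMPDIR=/scratch/${USER}/${SLURM_JOB_ID}
    mkdir -p "$TMPDIR"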

Though there are /scratch directories available on each compute node, they are not the same volume. The storage is specific to the host and is not shared. For details on the /scratch size available on the hosts belonging to a given partition, see the last column of the table on Slurm Partitions. Files written to /scratch from holy2a18206, for example, are only visible on that host. /scratch should only be used for temporary files written and removed during the running of a process. Although a ‘scratch cleaner’ does run hourly, we ask that at the end of your job you delete the files that you’ve created.
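One way to guarantee that cleanup happens even if your job fails partway through is a shell trap that runs when the script exits; this builds on the TMPDIR sketch above and is a generic bash pattern rather than an FASRC requirement:

    # Remove the job's local scratch directory when the script exits,
    # whether the job finishes normally or is interrupted.
    trap 'rm -rf "$TMPDIR"' EXIT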

$SCRATCH VARIABLE

A global variable called $SCRATCH exists on the FASRC Cannon and FASSE clusters which allows scripts and jobs to point to a specific directory in scratch regardless of any changes to the name or path of the top-level scratch filesystem. This variable currently points to /n/netscratch, so, for example, one could use the path $SCRATCH/jharvard_lab/Lab/jsmith in a job script. This has the added benefit of allowing us to change scratch systems at any time without your having to modify your jobs/scripts.
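For example, a line like the following in a job script (using the same illustrative lab and user names as above) resolves to /n/netscratch today and would keep working if the scratch filesystem ever moves:

    # $SCRATCH currently expands to /n/netscratch on Cannon and FASSE
    cd "$SCRATCH/jharvard_lab/Lab/jsmith"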
