Data Storage Workflow

Identifying an appropriate storage location for your research data is a critical step in the research data lifecycle, as it ensures research data remains usable. We recommend you review the available storage options at FAS Research Computing and select the offering that best fits your group’s intended workflow, keeping in mind how often the data will be used and accessed. The offerings below are designed to store research data rather than administrative data.

Each user is provided with a 100GB Home Directory for individual use. Each PI or Lab Account also receives a 4TB Lab Directory for use by all members of the PI’s lab group, and a 50TB allotment of NetScratch. See the matrix below for more details.
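A quick way to see how close a directory is to its allotment is to total the file sizes beneath it. The following is a minimal sketch, not an FASRC tool: the function names are hypothetical, it assumes decimal units (1GB = 1000^3 bytes), and it sums apparent file sizes, which may differ from how the file system itself accounts for quota.

```python
import os

# Assumption: the 100GB home quota is counted in decimal gigabytes.
HOME_QUOTA_BYTES = 100 * 1000**3


def directory_usage_bytes(root):
    """Sum the apparent sizes of regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; leave it out of the total.
                continue
    return total


def usage_fraction(used_bytes, quota_bytes=HOME_QUOTA_BYTES):
    """Return the share of the quota that used_bytes represents."""
    return used_bytes / quota_bytes
```

Calling `usage_fraction(directory_usage_bytes(os.path.expanduser("~")))` would give the approximate share of the home quota in use. Passing `quota_bytes=4 * 1000**4` adapts the same check to a 4TB Lab Directory.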

 

Storage Offerings

| Name | Home Directory (Active) | Lab Directory (Active) | Scratch (Active) | Tier 0: Cluster Storage (Active) | Tier 1: Lab Storage (Active) | Tier 2: Lab Storage (Active) | Tier 3 (Long-term storage) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Description | Personal user storage. Not recommended for computational purposes. | General lab storage intended for all data types. | Temporary storage location for high performance data analysis. | Active storage location for analysis data; readily utilized and accessed. | General purpose storage location for raw and project data. | Intended for less active research data and recently completed projects. | Long-term storage of inactive research data after project completion or for data retention purposes. |
| Performance | Moderate | Moderate | High | High/Moderate | Moderate | Low/Moderate | None |
| Size | 100GB | 4TB | 50TB | Available upon request | Available upon request | Available upon request | 20TB increments |
| Mount | /n/homeNN/username | /n/holylabs | /n/netscratch | /n/server/LABS/folder | /rc_labs/folder | /n/pi_lab | Transfer data to Tape using Globus |
| Retention | Daily snapshots for 7 days. Weekly snapshots for 4 weeks. Includes disaster recovery. | No snapshots. No disaster recovery. | No snapshots. No disaster recovery. 90-day retention policy. | No snapshots. No disaster recovery. | Daily snapshots for 7 days. Weekly snapshots for 4 weeks. Includes disaster recovery. | No snapshots. Includes disaster recovery. | No snapshots. Includes disaster recovery. |
| Cost | None | None | None | $50/yr per TB | $250/yr per TB | $100/yr per TB | $5/yr per TB |
| Security Level | Up to Level 2 | Up to Level 2 | Up to Level 2 | Up to Level 2 (Level 3 with FASSE) | Up to Level 2 (Level 3 with FASSE) | Up to Level 2 (Level 3 with FASSE) | Up to Level 2 |
| Storage | Folder generated for each user when granted cluster access. Limited to 100GB. | Folder generated for each approved PI and their group. Limited to 4TB. | Accessible to group members. | Request storage allocation | Request storage allocation | Request storage allocation | Request storage allocation |

*Snapshots are copies of a directory taken at a specific moment in time. They offer labs a self-service recovery option for overwritten or deleted files within the specific time period. Disaster recovery is a copy of an entire file system that can be used internally by FASRC in case of system-wide failure.
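The per-TB yearly prices in the matrix above lend themselves to a quick side-by-side comparison when sizing a request. The sketch below is illustrative only: the function names are hypothetical, and it assumes simple linear pricing without modeling billing granularity, minimum allocations, or the 20TB increments of Tier 3.

```python
# Yearly price per TB in USD, copied from the storage matrix.
PRICE_PER_TB_YEAR = {
    "Tier 0": 50,
    "Tier 1": 250,
    "Tier 2": 100,
    "Tier 3": 5,
}


def yearly_cost(tier, terabytes):
    """Return the yearly cost in USD, assuming price scales linearly with size."""
    return PRICE_PER_TB_YEAR[tier] * terabytes


def compare_tiers(terabytes):
    """Return a dict mapping each paid tier to its yearly cost for the given size."""
    return {tier: yearly_cost(tier, terabytes) for tier in PRICE_PER_TB_YEAR}


# Example: compare the yearly cost of a 20TB allocation on each paid tier.
for tier, cost in compare_tiers(20).items():
    print(f"{tier}: ${cost:,}/yr for 20TB")
```

Under these assumptions, 20TB comes to $1,000/yr on Tier 0, $5,000/yr on Tier 1, $2,000/yr on Tier 2, and $100/yr on Tier 3.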

  • Home directories
    • Description: Individual user folder intended for personal working files (code, scripts, documentation, analysis data).
    • Moderate performance
    • Size: 100GB 
    • Mount: /n/homeNN/username
    • Daily snapshots for 7 days and weekly snapshots for 4 weeks. Includes disaster recovery. 
    • Cost: Free
    • Security level: Up to Level 2 (Level 3 with FASSE)
    • Automatically generated when granted cluster access. 
  • Lab Directory
    • Description: General lab folder intended for data, version-controlled scripts, and documentation.
    • Moderate performance 
    • Size: 4TB
    • Mount: /n/holylabs
    • No snapshots. No disaster recovery. 
    • Cost: Free
    • Security level: Up to Level 2
    • Automatically generated for approved PIs with three subfolders:
      • Lab: Subfolders are visible to everyone in the lab. We recommend housing most of the data in this subfolder. 
      • Users: Subfolders are specific to individual users. Each user can create their own subfolder. 
      • Everyone: Subfolders visible to anyone on the cluster, great for collaboration between labs. 
  • Scratch
    • Description: Temporary storage location for high performance data analysis. 
    • High performance. 
    • Size: 50TB per group, 100 million inodes
    • Mount: /n/netscratch
    • No snapshots. No disaster recovery. 
    • Retention: 90-day retention policy. 
    • Cost: Free
    • Security level: Up to Level 2 (Level 3 with FASSE)
    • Automatically accessible to members of the lab group. Contains three subfolders:
      • Lab: Subfolders are visible to everyone in the lab. We recommend housing most of the data in this subfolder. 
      • Users: Subfolders are specific to individual users. Each user can create their own subfolder. 
      • Everyone: Subfolders visible to anyone on the cluster, great for collaboration between labs. 
  • Tier 0 (Cluster storage)
    • Description: Storage folder intended for active analysis research data connected to the high-performance compute cluster. 
    • High performance. 
    • Size: 1-1024TB
    • Mount: /n/server/LABS/folder
    • No snapshots. No disaster recovery. 
    • Cost: $50/yr per TB
    • Security level: Up to Level 2
    • Request storage allocation
  • Tier 1 (Lab storage with snapshots)
    • Description: General purpose storage location for data analysis and project data. Best for irreplaceable data such as raw datasets, since it includes snapshots and disaster recovery.
    • Moderate performance. 
    • Size: 1-1024TB
    • Mount: /rc_labs/folder
    • Daily snapshots for 7 days and weekly snapshots for 4 weeks. Includes disaster recovery. 
    • Cost: $250/yr per TB
    • Security level: Up to Level 2
    • Request storage allocation
  • Tier 2 (Lab Storage)
    • Description: Intended for intermediary storage of research data for ongoing and recent projects. 
    • Low/moderate performance. 
    • Size: 1-306TB
    • Mount: /n/pi_lab
    • No snapshots. Includes disaster recovery. 
    • Cost: $100/yr per TB
    • Security level: Up to Level 2
    • Request storage allocation
  • Tier 3 (Long-term Storage)
    • Description: Long-term storage of inactive research data after project completion or for data retention purposes.
    • Performance: None. Data is not directly accessible and must be transferred to and from tape.
    • Size: 20TB increments. Ten thousand files per folder. File sizes between 1GB and 100GB.
    • Access: Tape-based access with Globus or S3 
    • No snapshots. Includes disaster recovery. 
    • Cost: $5/yr per TB
    • Security level: Up to Level 2
    • Request storage allocation
  • FASSE
    • Description: Secure storage environment for analysis of sensitive data, such as data governed by Data Use Agreements (DUAs) or IRB protocols.
    • Can be applied to Cluster Storage, Lab Storage, or Tier 2 based on project need. 
    • Security level: Up to Level 3  
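
The Tier 3 limits listed above (ten thousand files per folder, file sizes between 1GB and 100GB) can be checked before a transfer to tape. The following is a minimal sketch, not an FASRC utility: the function names are hypothetical, gigabytes are assumed to be decimal, and "ten thousand files per folder" is read as an upper bound.

```python
import os

GB = 1000**3  # assumption: decimal gigabytes
MIN_FILE_BYTES = 1 * GB
MAX_FILE_BYTES = 100 * GB
MAX_FILES_PER_FOLDER = 10_000  # assumption: treated as a maximum


def check_file_size(size_bytes):
    """Return None if the size is within the Tier 3 range, else a short reason."""
    if size_bytes < MIN_FILE_BYTES:
        return "smaller than 1GB"
    if size_bytes > MAX_FILE_BYTES:
        return "larger than 100GB"
    return None


def check_tree(root):
    """Walk root and collect (path, reason) pairs for anything outside the limits."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if len(filenames) > MAX_FILES_PER_FOLDER:
            problems.append((dirpath, f"{len(filenames)} files in one folder"))
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                reason = check_file_size(os.path.getsize(path))
            except OSError:
                reason = "could not be read"
            if reason:
                problems.append((path, reason))
    return problems
```

An empty list from `check_tree` suggests the directory fits the stated limits. Files reported as too small could, for example, be bundled into a single tar archive before transfer.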

© The President and Fellows of Harvard College
Except where otherwise noted, this content is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.