Harvard T.H. Chan School of Public Health – FAS Research Computing Overview
The Harvard T.H. Chan School of Public Health (HCSPH) uses the FAS Research Computing (FASRC) environment to host data and run analyses. For additional information, see the HCSPH & FAS Research Computing Q&A page. A list of the software currently installed on the cluster is available online. To install additional software, please see our document on installing software yourself.
Requesting an account
Requesting a Research Computing account is a semi-automated process. You select a PI as your sponsor (your PI must have an existing FASRC account), and you will belong to that PI's lab. The sponsor PI then approves or denies the request. Once the PI has approved the request, turnaround is generally within the hour (Outline of Process). Requests for EXTERNAL users may take slightly longer, as external users must be vetted manually.
If you selected cluster access (i.e., the ability to run jobs), then after receiving your credentials you must attend one of our monthly New User Trainings or watch our Introduction to the Cluster videos. Training is essential for the proper use of our complex systems. The email stating that your account has been created will contain links to the getting started and running jobs documentation you need to begin working on the cluster.
NOTE: If you did not request cluster access, you will not have a home directory or be able to run jobs. As such, you are not required to take the cluster quiz, but you will still need to set your password and acquire your OpenAuth token as shown in the getting started docs.
Important: Regarding Account Sharing: The sharing of accounts is a violation of Harvard and RC information security policies. You cannot share your account with another individual, nor should anyone else know your login credentials. Additionally, you may have only one account. Please do not sign up for multiple accounts.
An account with cluster access gives users access to the resources hosted in the FAS RC environment: expert consulting help, over 40 PB of storage, over 82,000 processing cores, and numerous software modules and applications. For further details on accessing the cluster, please see the FAQ or our Access and Login guide.
Billing and Usage
The previous annual per-user charge for FAS Research Computing accounts was discontinued as of fiscal year 2019. Standard services, including accounts and access to the FAS RC high-performance computing cluster, are effectively free to all Harvard Chan School community members via the account of an eligible PI (See: HCSPH PI Eligibility Policy). These FASRC computing costs are now covered by the school’s overhead rate. Supplemental resources such as increased storage and VMs still involve additional charges; please see our Billing FAQ.
Authentication and Security
FAS Research Computing uses a two-factor authentication system. The OpenAuth client application is available for Windows, macOS, and Linux, or you can import your RC token into Google Authenticator or Duo and read the authentication code there. Each account and token is unique to the individual account holder. Please note that the sharing of accounts or credentials is a violation of Harvard and RC security policies. To access some of the web servers, storage, and other services at FAS Research Computing, you will likely need to use a VPN connection, whether you are on a wired or a wireless network. Again, consult our Access and Login guide for more details.
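For readers curious about what an authenticator app does with an imported token, here is a minimal sketch of time-based one-time password (TOTP) generation as defined in RFC 6238. It assumes the RC token is a standard Base32-encoded TOTP secret with SHA-1, a 30-second step, and 6 digits, which is what Google Authenticator uses by default; this page does not confirm those parameters, and the function name `totp` is hypothetical. It is an illustration only and does not replace the OpenAuth client.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Return the RFC 6238 TOTP code for a Base32 secret (HMAC-SHA1).

    Assumption: the token is a plain Base32 TOTP secret; the real
    OpenAuth parameters are not stated in this document.
    """
    # Normalize the secret and restore any stripped Base32 padding.
    cleaned = secret_b32.upper().replace(" ", "")
    cleaned += "=" * (-len(cleaned) % 8)
    key = base64.b32decode(cleaned)

    # The moving factor is the number of whole time steps since the epoch.
    now = time.time() if at_time is None else at_time
    counter = int(now // step)

    # HOTP (RFC 4226): HMAC the 8-byte big-endian counter, then truncate.
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    # Secret from the RFC 6238 test vectors, NOT a real credential.
    demo_secret = base64.b32encode(b"12345678901234567890").decode()
    print(totp(demo_secret))
```

Because the code depends only on the shared secret and the current time, anyone who holds the secret can produce valid codes, which is one reason tokens must never be shared.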
Using The Cluster
Taking advantage of the cluster’s massive compute capabilities is easy once you’ve read through our Quick Start Guide and Running Jobs document. You’ll find many more helpful documents in our online documentation. Additional support for using FASRC is available via training and weekly HCSPH office hours. Further computational training related to research computing is available from the Harvard Chan School Bioinformatics Core.
Note that it is relatively easy to overload the file storage system. Please use our high-performance scratch storage filesystems for high I/O jobs. Also, if you are submitting a large number of tasks, please see our Submitting Large Numbers of Jobs document. Try to keep your jobs efficient, and bundle short tasks into lots that each run for roughly 6 to 10 minutes rather than submitting every task as its own job. If in doubt, contact us.
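As an illustration of the bundling advice above, the following minimal sketch groups short tasks into lots whose estimated runtime stays within a target. The function name `bundle_tasks`, the `./analyze.sh` command, and the 45-second estimates are hypothetical; the 600-second default reflects the upper end of the 6 to 10 minute guideline. It shows only the grouping logic, not how the lots are submitted to the scheduler.

```python
from typing import List, Sequence, Tuple


def bundle_tasks(tasks: Sequence[Tuple[str, float]],
                 target_seconds: float = 600.0) -> List[List[str]]:
    """Group (command, estimated_seconds) pairs into sequential lots.

    A new lot is started whenever adding the next task would push the
    current lot's estimated runtime past target_seconds. A single task
    longer than the target still gets a lot of its own.
    """
    bundles: List[List[str]] = []
    current: List[str] = []
    current_time = 0.0
    for command, estimate in tasks:
        if current and current_time + estimate > target_seconds:
            bundles.append(current)
            current, current_time = [], 0.0
        current.append(command)
        current_time += estimate
    if current:
        bundles.append(current)
    return bundles


if __name__ == "__main__":
    # Hypothetical workload: 40 tasks of about 45 seconds each.
    tasks = [(f"./analyze.sh sample_{i}", 45.0) for i in range(40)]
    for index, bundle in enumerate(bundle_tasks(tasks)):
        print(f"lot {index}: {len(bundle)} tasks")
```

Each lot can then be run as one job that executes its commands in sequence, so the scheduler handles a few jobs of several minutes each instead of many very short ones.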
The cluster allows interactive use, which is great for exploring new tools or running shell-based sessions (SAS, MATLAB, R) without having to submit batch jobs. Please see our Interactive Sessions information and Virtual Desktop documentation. If you would like to use a graphical client, you need to enable X11 forwarding; a better solution is to use our VDI System.
Data storage and security
Data can be transferred to and from the cluster using multiple file transfer methods. Please see our Transferring Data Internally or Transferring Data Externally documentation. Certain HCSPH researchers have access to the FAS Research Computing file systems. In particular, each existing PI with an account and a lab group with RC is eligible for 4 TB of base lab storage. All data stored in this space is periodically backed up in case of critical failure. Additional storage can be purchased; please contact RCHelp to discuss needs and billing options.
- Backups are a second copy of a filesystem as it existed at the time of backup. Although all FAS storage hardware has built-in redundancy, so that a limited number of disk drive failures and other hardware faults can be tolerated, a backup is required for the data to survive a catastrophic failure of the entire system or facility. Recovery from backups is for disaster recovery and by request only. Backups should not be relied on as a solution for day-to-day deletions of single files or directories, as restoration is often very labor-intensive.
- Snapshots are like a freeze-frame picture of data at a point in time. Only home directories have snapshots. You can use home directory snapshots to undo recent changes to files, recover deleted files, etc. Though snapshots function much like backups, they are not backups, since the data still exists as only one copy in one place (snapshots are reconstructed algorithmically, not stored as separate copies). See our Snapshot FAQ for more information.
Special Security Requirements
The general FAS RC cluster storage environment is not suitable for storing data with special security requirements. If access to your data needs to be limited in any way, contact RCHelp before transferring the data. Important: Do not store data that is covered under a data use agreement or is otherwise considered high risk without consulting FAS RC first. Important: The sharing of accounts is a violation of Harvard and RC information security policies. Please contact us if you have a special need.
Contact and support
FAS RC has a number of methods for supporting researchers in need of help. Please see our FASRC Support page for details. You may also contact the HCSPH Bioinformatics Core for local help. Additionally, FAS RC hosts Office Hours regularly.