Glossary

Research Computing has its own body of terms and concepts, many of which are shared with High Performance Computing and with Information Technology more broadly.  Below is a glossary of common terms you will run into, with quick definitions.

Allocation

Used variously.

  1. A block of cores, memory, and possibly GPUs assigned by the scheduler.
  2. A block of storage given to a group for use.
  3. The Fairshare granted to a group.

Archival

FASRC storage, including tape, is not archival storage.  Tier 2 tape should be considered merely an offline, cold-storage version of disk-based storage.  The term ‘archival’ has a very specific meaning and set of criteria in data storage and management, and no FASRC offering meets those criteria.  If you require archival storage of data, please see Dataverse or contact the library system for advice and options.

Cluster

Synonymous with supercomputer.  A collection of computers (called nodes) that are tied together by a fast network and a uniform operating system.

Central Processing Unit (CPU)

This is a microprocessor that acts as the main director and calculator for a node.  A CPU is divided into individual units called cores.

Aside from the strict definition, CPU can also be used synonymously with Core.

Chipset

This is a basic architecture for a CPU.  Chipsets vary depending on manufacturer.

Cloud Computing

Leveraging a shared block of computers owned and managed by a separate entity that has resiliency and scalability.

Code

A method of giving a series of instructions to a computer.

Command Line Interface (CLI)

Also known as terminal or console. The fundamental method of interacting with computers via direct instructions.

Compiler

Used to convert code into an executable which can run on a computer.

Containers

A method of creating an encapsulated environment and software stack that runs on top of the current operating system but does not start up a full independent virtual machine.

Core

A fundamental unit of compute.  Each core runs a single stream of instructions, aka a process, at a time.

Aside from the strict definitions of Core and CPU, sometimes CPU is used interchangeably with Core.

Datacenter

A location, usually shared, where servers are housed.

Data Management

The art of organizing and handling large amounts of information on computers.

Disaster Recovery 

A copy of an entire file system that can be used internally by FASRC in case of system-wide failure.

Distributed Storage

Storage that uses multiple servers to host data.

Embarrassingly Parallel

The simplest form of parallelism.  This involves leveraging the scheduler to run many independent jobs at once, with no communication needed between them.
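As a rough illustration, here is a minimal Python sketch of how one script might pick out its own share of the work when many copies of it run as independent jobs. It assumes the jobs are launched as a Slurm job array numbered from 0, in which case Slurm sets the SLURM_ARRAY_TASK_ID and SLURM_ARRAY_TASK_COUNT environment variables. The helper name items_for_task and the input file names are hypothetical.

```python
import os


def items_for_task(items, task_id, n_tasks):
    """Return the share of `items` that task `task_id` (0-based) should process."""
    # Round-robin split: task 0 takes items 0, n, 2n, ...; task 1 takes 1, n+1, ...
    return items[task_id::n_tasks]


if __name__ == "__main__":
    # Slurm sets these variables for job arrays; fall back to a single task
    # when the script is run by hand outside the scheduler.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    n_tasks = int(os.environ.get("SLURM_ARRAY_TASK_COUNT", "1"))

    inputs = [f"sample_{i}.dat" for i in range(10)]  # hypothetical file names
    for name in items_for_task(inputs, task_id, n_tasks):
        print(f"task {task_id} would process {name}")
```

Because no task needs results from any other task, the scheduler is free to run them in any order and on any nodes.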

Executable

Compiled code which can be run on a computer. Also known as application or binary.

Fairshare

The method by which Slurm adjudicates which groups get priority.

Graphics Processing Unit (GPU)

Originally designed for fast rendering of images (especially for games).  Today GPUs are often utilized for machine learning due to their ability to process streams of data swiftly.

Group

A block of users who share something in common, typically a PI.

Graphical User Interface (GUI)

Also known as a desktop. This method of interaction with a computer is mediated through mouse-clickable images, menus, and icons.

High Performance Computing (HPC)

Synonymous with Supercomputing.  Numerical work which pushes the limits of what computers can do.  Typically involves datacenters, infiniband, water cooling, schedulers, distributed storage, etc.

Hypervisor

A server which hosts multiple Virtual Machines.

Infiniband (IB)

A network technology with ultra-low latency and high bandwidth.  Commonly used in supercomputing.

Information Technology (IT)

A catchall term for the broad category of things interacting with computers.  Other synonyms include anything with cyber in it.  Research Computing and High Performance Computing are subdisciplines of Information Technology.

Input/Output (I/O or IO)

A term referring to reading in data (input) or writing out data (output) to storage. It covers both how much data is being accessed and how many individual files are being used, and is used to gauge the performance of storage.

Job

An individual allocation for a user by the scheduler.

Job Efficiency

A measure of how well the job allocation parameters match what the job actually uses.
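One simple way to put a number on this is the ratio of what a job used to what it requested. The Python sketch below is an illustration only: the function names and the example job are hypothetical, and the source does not specify the exact metric used on the cluster.

```python
def cpu_efficiency(cpu_seconds_used, cores_requested, elapsed_seconds):
    """Fraction of the allocated core time that the job actually used."""
    allocated = cores_requested * elapsed_seconds
    return cpu_seconds_used / allocated if allocated > 0 else 0.0


def memory_efficiency(peak_memory_gb, memory_requested_gb):
    """Fraction of the requested memory that the job actually touched."""
    if memory_requested_gb <= 0:
        return 0.0
    return peak_memory_gb / memory_requested_gb


# Hypothetical job: 8 cores for 1 hour, but only 7200 core-seconds of CPU
# time used, and a peak of 4 GB out of 32 GB requested.
print(f"CPU efficiency: {cpu_efficiency(7200, 8, 3600):.0%}")   # 25%
print(f"Memory efficiency: {memory_efficiency(4, 32):.1%}")     # 12.5%
```

Low values like these suggest the job could request fewer cores and less memory without running any slower.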

Job Optimization

Work done on a code to ensure that a job runs at the maximum speed possible while using the fewest resources.

Library

A precompiled set of functions that can be used by external applications.

Login Node

A server or virtual machine that is set up for users as an access point to the cluster.

Machine Learning

A misnomer, often used synonymously with Artificial Intelligence (AI). This is a sophisticated method of deriving correlations from empirical data based on how parts of the brain work.  These correlations are then used to predict future behavior or find common patterns.

Maintenance

When some part of the cluster is taken offline so that work can be done to improve or restore service.

Memory

Also known as RAM (Random Access Memory).  These are volatile locations on the node that hold information while a process is running.

Message Passing Interface (MPI)

The industry standard for communication between processes that run in parallel, often across multiple nodes.

Network

A method of connecting various computers together.

Node

Synonymous with Server or Blade. An individual block of compute. Typically made up of one or more CPUs, memory, local storage, and sometimes a GPU card.

Operating System (OS)

The basic instructions and environment used by a computer.

Parallel

Executing multiple processes at once.  Typical methods include Embarrassingly Parallel, Threading, and MPI.

Partition

A block of compute in Slurm that can be used for scheduling jobs.

Principal Investigator (PI)

Typically professors but can include others who have been designated as such by Harvard University.

Priority

The method by which a scheduler determines which job to schedule next.

Process

A single execution of a code with a singular code path.

Proxy

A method of using a bridge system to access an external network location from a secure network.

Queue

Sometimes used synonymously with Partition.  This is the group of jobs which are waiting to execute on the cluster.

Requeue

A method used by the scheduler to reschedule jobs that are preempted by higher priority jobs.

Research Computing (RC)

Any application of numerical power to investigate how things work.  Generally this is found in academia, though it is used in industry under various names.

Scheduler

An automated process that adjudicates which jobs go where on a cluster.

Scratch

A location on storage that is meant only for temporary data.

Secure Network

A network that is restricted by various methods so that it can handle sensitive data.

Serial

Running a sequence of tasks in order.

Slurm

An open source scheduler.

Snapshots

Copies of a directory taken at a specific moment in time. They offer labs a self-service recovery option for overwritten or deleted files within the specific time period.

Storage

A location where you can permanently host data.

Threading

A method of breaking up a process over multiple cores that share memory for the sake of parallel execution.
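The shared-memory idea can be sketched in a few lines of Python. This is an illustration only, and the function name threaded_sum is hypothetical. Note also that in the standard CPython interpreter the global interpreter lock generally prevents pure-Python threads from using several cores at once, so the sketch shows the structure of threaded code rather than a real speedup.

```python
import threading


def threaded_sum(values, n_threads=4):
    """Sum `values` by giving each thread its own slice of the same list."""
    # Shared memory: all threads read `values` and write into `partial`.
    # Each thread writes only to its own slot, so no lock is needed here.
    partial = [0] * n_threads

    def worker(index):
        partial[index] = sum(values[index::n_threads])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread before combining the results
    return sum(partial)


print(threaded_sum(list(range(1000))))  # 499500
```

All threads live inside one process on one node, which is what distinguishes threading from MPI, where separate processes exchange messages.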

Topology

Can refer to the organization and layout of:

  1. Nodes on a network (i.e. Network Topology).
  2. Cores, CPUs, Memory, and GPUs on a Node (i.e. Node Topology).
  3. The processes that make up a Job with respect to Network and Node Topology.

User

You!  Also other people using the cluster.  Users can also be created for use by automated processes.

Virtual Desktop Interface (VDI)

A method of exporting graphics and applications to users outside of the normal command line interface.

Virtual Machine

A computer that exists purely in software and is hosted on a hypervisor.

Water Cooling

Uses a liquid medium (usually water) for removing heat from a computer instead of the standard air cooling.

X11

X11 is an older port-forwarding system for displaying graphics from one system on another. We do not recommend the use of X11 as it is slow and unreliable. Disconnects are frequent and window drawing/re-drawing will be very slow. We recommend Open OnDemand (aka OOD or VDI), which provides a more robust interface and is not tied to the quality and speed of your connection. See also: https://docs.rc.fas.harvard.edu/kb/ood-remote-desktop-how-to-open-software/


© The President and Fellows of Harvard College
Except where otherwise noted, this content is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.