Glossary

Research Computing has its own body of terms and concepts, many of which are common both to High Performance Computing and to Information Technology in general.  Below is a glossary of common nomenclature you will run into, with quick definitions of each term.

Allocation

Used variously.

  1. A block of cores, memory, and possibly GPUs assigned by the scheduler.
  2. A block of storage given to a group for use.
  3. The Fairshare granted to a group.

Cluster

Synonymous with supercomputer.  This is a collection of computers (called nodes) that are tied together by a fast network and run a uniform operating system.

Central Processing Unit (CPU)

This is a microprocessor that acts as the main director and calculator for a node.  A CPU is divided into individual processing units called cores.

Chipset

This is a basic architecture for a CPU.  Chipsets vary depending on manufacturer.

Cloud Computing

Using a shared pool of computers that is owned and managed by a separate entity and that offers resiliency and scalability.

Code

A method of giving a series of instructions to a computer.

Command Line Interface (CLI)

Also known as terminal or console. The fundamental method of interacting with computers via direct instructions.

Containers

A method of creating an encapsulated environment and software stack that runs on top of the current operating system but does not start up a full, independent virtual machine.

Core

A fundamental unit of compute.  Each core runs a single stream of instructions at a time, aka a process.

Datacenter

A location, usually shared, where servers are housed.

Data Management

The art of organizing and handling large amounts of information on computers.

Distributed Storage

Storage that uses multiple servers to host data.

Embarrassingly Parallel

The simplest form of parallelism.  This involves leveraging the scheduler to run many independent jobs at once.
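As a minimal sketch of this idea, assuming the work is submitted as a Slurm job array with indices 0 through N-1: every array task runs the same script and uses its own index to pick a share of the inputs, so the tasks never need to talk to each other.  The function name `inputs_for_task` and the list of inputs are hypothetical; `SLURM_ARRAY_TASK_ID` and `SLURM_ARRAY_TASK_COUNT` are environment variables that Slurm sets for job array tasks.

```python
import os


def inputs_for_task(inputs, task_id, task_count):
    """Return the share of `inputs` that one array task should process.

    Items are dealt out round-robin, so no two tasks share an item and
    no communication between tasks is needed.
    """
    return inputs[task_id::task_count]


if __name__ == "__main__":
    # Hypothetical workload: ten independent parameter values.
    all_inputs = list(range(10))

    # Assumes the array was submitted with indices 0..N-1.
    # The defaults let the script also run outside of Slurm.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    task_count = int(os.environ.get("SLURM_ARRAY_TASK_COUNT", "1"))

    for value in inputs_for_task(all_inputs, task_id, task_count):
        print(f"task {task_id} processing input {value}")
```

Because each task works only on its own slice, the scheduler is free to start the tasks in any order and on any nodes.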

Fairshare

The method by which Slurm adjudicates which groups get priority.

Graphics Processing Unit (GPU)

Originally designed for fast rendering of images (especially for games).  Today GPUs are often utilized for machine learning due to their ability to process streams of data swiftly.

Group

A block of users who share something in common, typically a PI.

High Performance Computing (HPC)

Synonymous with Supercomputing.  Numerical work that pushes the limits of what computers can do.  Typically involves datacenters, Infiniband, water cooling, schedulers, distributed storage, etc.

Hypervisor

A server which hosts multiple Virtual Machines.

Infiniband (IB)

A network technology with ultralow latency and high bandwidth.  Commonly used in supercomputing.

Information Technology (IT)

A catchall term for the broad category of things interacting with computers.  Other synonyms include anything with cyber in it.  Research Computing and High Performance Computing are subdisciplines of Information Technology.

Job

An individual allocation for a user by the scheduler.

Login Node

A server or virtual machine that is set up for users as an access point to the cluster.

Machine Learning

A misnomer, often used synonymously with Artificial Intelligence (AI).  This is a sophisticated method of deriving correlations from empirical data, modeled on how parts of the brain work.  These correlations are then used to predict future behavior or find common patterns.

Maintenance

When some part of the cluster is taken offline so that work can be done to improve or restore service.

Memory

Also known as RAM (Random Access Memory).  These are volatile locations on the node that hold information while a process is running.

Message Passing Interface (MPI)

The industry standard for processes that need to talk between nodes in a parallel fashion.

Network

A method of connecting various computers together.

Node

Synonymous with Server or Blade.  An individual block of compute.  Typically made up of a CPU, memory, local storage, and sometimes a GPU card.

Operating System (OS)

The basic instructions and environment used by a computer.

Parallel

Executing multiple processes at once.  Typical methods include Embarrassingly Parallel, Threading, and MPI.

Partition

A block of compute in Slurm that can be used for scheduling jobs.

Principal Investigator (PI)

Typically professors but can include others who have been designated as such by Harvard University.

Priority

The method by which a scheduler determines which job to schedule next.
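The idea can be illustrated with a toy sketch, assuming each job carries a single numeric priority where a larger number means the job should start sooner.  The function `schedule_order` and the job names are hypothetical; a production scheduler such as Slurm weighs many more factors than this.

```python
import heapq


def schedule_order(jobs):
    """Return job names in the order a toy scheduler would start them.

    `jobs` is a list of (name, priority) pairs; a larger number means a
    higher priority.  Ties are broken by submission order.
    """
    heap = []
    for submit_order, (name, priority) in enumerate(jobs):
        # heapq is a min-heap, so negate the priority to pop the largest first.
        heapq.heappush(heap, (-priority, submit_order, name))

    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order
```

For example, `schedule_order([("a", 10), ("b", 50), ("c", 10)])` starts job "b" first, then "a" and "c" in the order they were submitted.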

Process

A single execution of a code with a singular code path.

Proxy

A method of using a bridge system to access an external network location from a secure network.

Queue

Sometimes used synonymously with Partition.  This is the group of jobs which are waiting to execute on the cluster.

Requeue

A method used by the scheduler to reschedule jobs that are preempted by higher priority jobs.

Research Computing (RC)

Any application of numerical power to investigate how things work.  Generally this is found in academia, though it is also used in industry under various names.

Scheduler

An automated process that adjudicates which jobs go where on a cluster.

Scratch

A location on storage that is meant only for temporary data.

Secure Network

A network that is restricted by various methods so that it can handle sensitive data.

Serial

Running a sequence of tasks in order.

Slurm

An open source scheduler.

Storage

A location where you can permanently host data.

Threading

A method of breaking up a process over multiple cores that share memory for the sake of parallel execution.
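A minimal sketch of the shared-memory aspect, using Python's standard `threading` module: several threads inside one process each sum a slice of a list and write the result into a list that all of them can see.  The function name `threaded_sum` is hypothetical.  Note that in standard CPython builds the global interpreter lock generally keeps threads from executing Python code on several cores at the same time, so this shows the programming pattern rather than a real speedup.

```python
import threading


def threaded_sum(values, n_threads=4):
    """Sum `values` using several threads that write into shared memory."""
    partial = [0] * n_threads  # visible to every thread in this process

    def worker(index):
        # Each thread sums its own slice and writes to its own slot,
        # so no lock is needed for this particular pattern.
        partial[index] = sum(values[index::n_threads])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every thread to finish
    return sum(partial)
```

Unlike MPI, nothing is sent between the threads; they simply read and write the same memory on one node.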

User

You!  Also other people using the cluster.  Users can also be created for use by automated processes.

Virtual Desktop Infrastructure (VDI)

A method of exporting graphics and applications to users outside of the normal command line interface.

Virtual Machine

A computer that exists purely in software and is hosted on a hypervisor.

Water Cooling

A method of cooling that uses a liquid medium (usually water) to remove heat from a computer instead of standard air cooling.