Python Package Installation

Description

Python packages on the cluster are primarily managed with Mamba.  Use of pip on the FASRC clusters is discouraged.

Mamba is a package manager that is a drop-in replacement for Conda, and is generally faster and better at resolving dependencies:

  • Speed: Mamba is written in C++, which makes it faster than Conda. Mamba uses parallel processing and efficient code to install packages faster.
  • Compatibility: Mamba is fully compatible with Conda, so it can use the same commands, packages, and environment configurations.
  • Cross-platform support: Mamba works on Mac, Linux and Windows.
  • Dependency resolution: Mamba is better at resolving dependencies than Conda.
  • Environment creation: Mamba is faster at creating environments, especially large ones.
  • Package repository: Mamba, as installed by Mambaforge, uses the conda-forge channel by default, which provides up-to-date, community-maintained packages.

Important:
Anaconda’s Conda is no longer free for non-profit academic research use at institutions with more than 200 employees; use Mamba instead. Downloading packages through Anaconda’s Main channel may incur costs that pass through to your lab. We therefore recommend that users switch to the open-source conda-forge channel for package distribution when possible.

Because mamba uses the same commands and configuration options as conda, you can swap almost all commands between conda and mamba. By default, mamba uses the free conda-forge package channel. (In this doc, we will generally refer only to mamba.)

Usage

mamba is available on the FASRC cluster as a software module, either as Mambaforge or as python/3*, which is aliased to mamba:

$ module load python/{PYTHON_VERS}-fasrc01
$ python -V
Python {PYTHON_VERS}

You can create conda environments with mamba in the same way as with conda:

$ mamba create -n ENV_NAME PACKAGES

You can also list packages in the create command so that they are installed when the environment is created, which speeds up setup:

$ mamba create -n python_env1 python={PYTHON_VERS} pip wheel

To activate this environment, use

$ source activate python_env1

To deactivate an active environment, use

$ mamba deactivate

To use the environment again after deactivating it, reactivate it:

$ source activate python_env1

You can list the packages currently installed in the active environment with:

$ mamba list

You can install new packages in the conda environment with mamba. For example:

$ mamba install -y numpy

Note: Do not run pip install outside a mamba environment on any FASRC cluster. If you execute pip install outside of a mamba environment, all the packages that pip installs are placed in your $HOME/.local, which can create package conflicts that prevent some packages from being installed or loaded via mamba successfully.
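As a guard against the situation described in the note, a small shell function can refuse to run pip unless an environment is active. This is only a sketch: safe_pip_install is a hypothetical name, not a FASRC or mamba command, and it assumes that activating an environment sets the CONDA_PREFIX variable, as conda and mamba normally do.

```shell
# Hypothetical helper (not provided by FASRC or mamba).
# Assumption: an activated environment exports CONDA_PREFIX.
safe_pip_install() {
    if [ -z "${CONDA_PREFIX:-}" ]; then
        # No environment is active, so pip would fall back to $HOME/.local.
        echo "No active environment; refusing to run pip install." >&2
        return 1
    fi
    # An environment is active: install into it with that environment's pip.
    python -m pip install "$@"
}
```

With no environment active, safe_pip_install PACKAGE prints the warning and returns a non-zero status instead of installing anything.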

To uninstall packages, use:

$ mamba uninstall PACKAGE

When you finish using the conda environment, you can deactivate it with:

$ mamba deactivate

For additional features, please refer to the Mamba documentation.

Best Practices

Use a mamba environment in Jupyter Notebooks

If you would like to use a mamba environment as a kernel in a Jupyter Notebook on Open OnDemand (Cannon OOD or FASSE OOD), you have to install two packages, ipykernel and nb_conda_kernels. These packages allow Jupyter to detect mamba environments that you created from the command line.

For example, if your environment name is python_env1:

$ module load python
$ source activate python_env1
$ mamba install ipykernel nb_conda_kernels

After these packages are installed, launch a new Jupyter Notebook job (existing Jupyter Notebook jobs will fail to “see” this environment). Then:

  1. Open a Jupyter Notebook (a .ipynb file)
  2. On the top menu, click Kernel -> Change kernel -> select the conda environment
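
If the environment does not show up in the kernel list, it can help to confirm that both packages are actually present in it. The function below is a minimal sketch: check_kernel_packages is a hypothetical name, and it assumes that each package name appears at the start of a line in the output of mamba list, which may vary between mamba versions.

```shell
# Hypothetical helper (not part of mamba): report whether the active
# environment contains the two packages Jupyter needs to detect it.
# Assumption: each package name starts a line of `mamba list` output,
# optionally preceded by whitespace.
check_kernel_packages() {
    pkgs=$(mamba list) || return 1
    missing=0
    for name in ipykernel nb_conda_kernels; do
        if ! printf '%s\n' "$pkgs" | grep -Eq "^[[:space:]]*${name}[[:space:]]"; then
            echo "missing: $name"
            missing=1
        fi
    done
    # Returns 0 only when both packages were found.
    return "$missing"
}
```

Run it after activating the environment (for example, after source activate python_env1); any package it reports as missing can be added with mamba install.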

Mamba environments in holylabs space

With mamba, use the -p or --prefix option to write environment files to a holylabs share location. Don’t use your home directory, as it has very low performance due to filesystem latency. With a lab share location, you can also share your conda environment with other people on the cluster. Keep in mind that you will need to create the destination directory and specify the Python version to use. For example:

$ mamba create --prefix /n/holylabs/LABS/{YOUR_LAB}/Lab/envs python={PYTHON_VERS}

$ mamba activate /n/holylabs/LABS/{YOUR_LAB}/Lab/envs
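
The directory creation and the environment creation can be wrapped in one step. The function below is a sketch under stated assumptions: create_shared_env is a hypothetical name, and it creates only the parent directory, leaving mamba to create the environment directory itself under that parent.

```shell
# Hypothetical helper (not a FASRC or mamba command).
# $1: parent directory on the lab share
# $2: environment name (becomes a subdirectory of $1)
# $3: Python version to install
create_shared_env() {
    env_parent="$1"
    env_name="$2"
    py_version="$3"
    # Make sure the parent directory exists before calling mamba.
    mkdir -p "$env_parent" || return 1
    # --prefix writes the environment files to the given path
    # instead of the default environment location.
    mamba create -y --prefix "$env_parent/$env_name" "python=$py_version"
}
```

For example, create_shared_env /n/holylabs/LABS/{YOUR_LAB}/Lab/envs ENV_NAME {PYTHON_VERS} would place the environment in an ENV_NAME subdirectory of the lab share path, which you then pass to the activate command.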

Troubleshooting

Interactive vs. batch jobs

If your code works in an interactive job but fails in a Slurm batch job, check the following:

  1. Check whether you are submitting your jobs from within a mamba environment.
    Solution 1: Deactivate your environment with the command mamba deactivate and then submit the job, or
    Solution 2: Open another terminal and submit the job from outside the environment.

  2. Check whether your ~/.bashrc or ~/.bash_profile file contains a conda initialize section or a source activate command. The conda initialize section is known to create issues on the FASRC clusters.
    Solution: Delete the section between the two conda initialize statements. If you have source activate in those files, delete it or comment it out.
    For more information on ~/.bashrc files, see https://docs.rc.fas.harvard.edu/kb/editing-your-bashrc/
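
Both checks above can be run from the command line before submitting a job. The function below is a minimal sketch: preflight_check is a hypothetical name, it assumes an active environment sets CONDA_PREFIX, and its text search also matches commented-out source activate lines, so treat a match as a prompt to inspect the file rather than as proof of a problem.

```shell
# Hypothetical pre-submission check (not a FASRC or Slurm command).
# $1: startup file to inspect, for example "$HOME/.bashrc"
preflight_check() {
    rc_file="$1"
    rc=0
    # Assumption: conda and mamba set CONDA_PREFIX while an environment is active.
    if [ -n "${CONDA_PREFIX:-}" ]; then
        echo "active environment: $CONDA_PREFIX (deactivate before submitting)"
        rc=1
    fi
    # Look for the conda initialize markers or a source activate command.
    if [ -f "$rc_file" ] && grep -Eq 'conda initialize|source activate' "$rc_file"; then
        echo "found conda initialize or source activate in $rc_file"
        rc=1
    fi
    # Returns 0 only when neither issue was detected.
    return "$rc"
}
```

Run preflight_check "$HOME/.bashrc" and preflight_check "$HOME/.bash_profile" before submitting; a non-zero status means at least one of the two issues was detected.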

© The President and Fellows of Harvard College
Except where otherwise noted, this content is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.