PyTorch
Description
PyTorch, originally developed by Facebook’s AI Research lab (now part of Meta AI), is an open-source machine learning library for building deep learning models. It has a Python front end and integrates with Python libraries such as NumPy, SciPy, and Cython to extend its functionality. PyTorch builds its computational graph dynamically as the code runs, in contrast to the static graphs of TensorFlow 1.x, which allows greater flexibility in model design. This is particularly advantageous for research applications involving novel architectures.
The library supports GPU acceleration, which significantly improves performance on research workloads that involve large datasets and complex architectures, such as climate change modeling, DNA sequence analysis, and AI research. Automatic differentiation in PyTorch is handled through a tape-based system at both the functional and neural network layers, offering both speed and flexibility.
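As a minimal sketch of the dynamic graph and tape-based differentiation described above: the function below uses ordinary Python control flow, so the operations recorded for the backward pass depend on the input value. The function name piecewise and the input values are illustrative only. The example runs on the CPU and assumes PyTorch is already installed.

```python
import torch


def piecewise(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical function whose recorded operations depend on the input."""
    # Plain Python branching: autograd records only the branch that runs.
    if x.item() > 0:
        return x ** 2
    return -3.0 * x


# Positive input takes the x**2 branch.
x_pos = torch.tensor(2.0, requires_grad=True)
y_pos = piecewise(x_pos)
y_pos.backward()  # walk the recorded operations in reverse

# Negative input takes the -3x branch, recorded on a fresh graph.
x_neg = torch.tensor(-1.0, requires_grad=True)
y_neg = piecewise(x_neg)
y_neg.backward()

print(x_pos.grad)  # d(x**2)/dx evaluated at x = 2
print(x_neg.grad)  # d(-3x)/dx
```

Each call to backward() differentiates only the operations that were actually executed for that input, which is what makes data-dependent model structure straightforward to express.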
Best Practices
PyTorch and Jupyter Notebook on Open OnDemand
To use PyTorch in Jupyter Notebook on Open OnDemand/VDI, install ipykernel and ipywidgets:
mamba install ipykernel ipywidgets
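After installing, it can help to confirm from inside a notebook cell that the kernel is using the intended environment. The snippet below is a small sketch using only the Python standard library. The helper name check_packages is hypothetical, and the list of packages checked is an assumption based on the install command above plus torch itself.

```python
import importlib.util
import sys


def check_packages(names=("ipykernel", "ipywidgets", "torch")):
    """Report which packages the current interpreter can locate.

    find_spec only looks the package up; it does not import it.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


# The interpreter path shows which environment the kernel is running from.
print("Python executable:", sys.executable)

status = check_packages()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

If a package is reported as missing, the kernel is probably running from a different environment than the one where the mamba command was executed.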
Pull a PyTorch Singularity Container
Alternatively, you can pull and use a PyTorch Singularity container:
singularity pull docker://pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime
PyTorch on MIG Mode
Note: Currently only the gpu_test partition has MIG mode enabled.
# List the GPUs and MIG instances with their UUIDs
nvidia-smi -L
# Set CUDA_VISIBLE_DEVICES with the MIG instance
export CUDA_VISIBLE_DEVICES=MIG-5b36b802-0ab0-5f37-af2d-ac23f40ef62d
Or set it automatically from the nvidia-smi output (this assumes a single MIG instance is listed):
export CUDA_VISIBLE_DEVICES=$(nvidia-smi -L | awk '/MIG/ {gsub(/[()]/,"");print $NF}')
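The same selection can be sketched in Python, for example at the top of a job script. This version picks one instance explicitly when several are listed. The helper names parse_mig_uuids and select_mig_device are hypothetical. The sample listing is illustrative: its line layout is an assumption about what nvidia-smi -L prints, its GPU line is a placeholder, and its MIG UUID is the one used in the example above. The variable should be set before PyTorch initializes CUDA.

```python
import os
import re
import subprocess
from typing import List

# Matches MIG UUIDs as they appear in "(UUID: MIG-...)" fields.
MIG_UUID_PATTERN = re.compile(r"UUID:\s*(MIG-[^\s)]+)")


def parse_mig_uuids(listing: str) -> List[str]:
    """Return the MIG instance UUIDs found in `nvidia-smi -L` style text."""
    return MIG_UUID_PATTERN.findall(listing)


def select_mig_device(index: int = 0) -> str:
    """Set CUDA_VISIBLE_DEVICES to one MIG instance and return its UUID."""
    listing = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
    ).stdout
    uuids = parse_mig_uuids(listing)
    if not uuids:
        raise RuntimeError("nvidia-smi -L listed no MIG instances")
    os.environ["CUDA_VISIBLE_DEVICES"] = uuids[index]
    return uuids[index]


# Illustrative input only: placeholder GPU line, MIG UUID from the example above.
SAMPLE_LISTING = """\
GPU 0: Example GPU (UUID: GPU-00000000-0000-0000-0000-000000000000)
  MIG 3g.20gb     Device  0: (UUID: MIG-5b36b802-0ab0-5f37-af2d-ac23f40ef62d)
"""

sample_uuids = parse_mig_uuids(SAMPLE_LISTING)
print(sample_uuids)
```

On a MIG-enabled node, calling select_mig_device() before importing torch would restrict the process to the first listed instance.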
Examples
For example scripts covering installation and use cases, see our User Codes > AI > PyTorch repo.
External Resources:
- PyTorch/CUDA version compatibility chart.
- PyTorch Official Documentation – Comprehensive resource for all functionalities including tutorials and API reference.
- PyTorch Tutorials – Practical tutorials covering basic to advanced topics, specifically tailored for deep learning and high-performance computing tasks.
- PyTorch Discussion Forums – A community forum for discussing specific issues, sharing solutions, and collaborating on projects.
- Introduction to Distributed Deep Learning – A detailed guide on implementing distributed deep learning models in PyTorch.
- Efficient PyTorch – An article by PyTorch developers on best practices for optimizing deep learning models for production.