
Getting Started with DeltaAI

For IOWarp team members on project CIS250329 DeltaAI: NVIDIA GH200 Grace Hopper Supercomputer at NCSA


What is DeltaAI?

DeltaAI is a 152-node supercomputer at NCSA. Each node has four NVIDIA GH200 superchips (an H100 GPU paired with a Grace ARM CPU). Our allocation gives us ~1,000 GPU Hours on H100s with 120 GB of HBM3 each.

Important: DeltaAI runs on ARM (aarch64) CPUs, not x86. This affects everything you compile.


Step 1: Get Your Credentials

You need three things before you can log in:

1a. NCSA Username

Your PI or allocation manager has already added you to the project. Your NCSA username is typically your university NetID (e.g., jdoe3). Check with the PI if unsure.

1b. NCSA Kerberos Password

This is separate from your university password. Set it at:

https://identity.ncsa.illinois.edu/reset

Enter your NCSA username and follow the email verification flow.

1c. NCSA Duo MFA

You need a second factor for every login. The easiest method:

  1. Go to https://duo.security.ncsa.illinois.edu
  2. Generate emergency backup recovery codes
  3. Save these codes somewhere safe — you'll type one each time you SSH in

Alternatively, install the Duo Mobile app and enroll your phone.


Step 2: SSH In

ssh YOUR_USERNAME@dtai-login.delta.ncsa.illinois.edu

You'll be prompted for:

  1. Your NCSA Kerberos password
  2. A Duo passcode (type a recovery code or 1 for a push notification)

Pro tip: Use tmux for persistent sessions

# After logging in, immediately start tmux
tmux new -s work

# If you disconnect, reconnect to the SAME login node your tmux session
# lives on (e.g., gh-login04); dtai-login load-balances and may route you
# to a different node where your session won't be found:
ssh YOUR_USERNAME@gh-login04.delta.ncsa.illinois.edu
tmux attach -t work

SSH config shortcut

Add this to your ~/.ssh/config:

Host delta-ai
    HostName dtai-login.delta.ncsa.illinois.edu
    User YOUR_USERNAME
    PreferredAuthentications keyboard-interactive,password
    ServerAliveInterval 60
    ServerAliveCountMax 3

Then just: ssh delta-ai
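Because every new connection costs a password + Duo round trip, OpenSSH connection multiplexing can reuse one authenticated connection for later sessions. A sketch, assuming a reasonably recent OpenSSH client and that the site permits multiplexed connections; add these lines inside the same Host delta-ai block (and run mkdir -p ~/.ssh/sockets once first):

```
    # Reuse one authenticated connection for subsequent ssh/scp sessions
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h-%p
    ControlPersist 4h
```

With this in place, the first ssh delta-ai authenticates as usual; additional sessions within the persistence window attach over the same socket without re-prompting.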


Step 3: Understand Your Storage

Path                           Quota    Use For
/u/YOUR_USERNAME               ~100 GB  Dotfiles, scripts, small configs
/work/hdd/bekn/YOUR_USERNAME/  1 TB     Your primary workspace: code, builds, data
/work/nvme/bekn/               500 GB   Fast I/O scratch (shared across team)
/projects/bekn/                500 GB   Shared project files
/tmp                           3.9 TB   Compute-node-local scratch (deleted after your job ends)

Rule of thumb: Do everything in /work/hdd/bekn/YOUR_USERNAME/. Home is too small for builds.

Check your quota: quota
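Typing the full /work path gets old; one convenience (our own convention, assuming bash) is to export it from ~/.bashrc:

```shell
# Convenience variable for the team workspace ("WORK" is our own name,
# not a system-provided variable)
export WORK="/work/hdd/bekn/${USER}"

# Create it once; harmless no-op off-cluster
mkdir -p "$WORK" 2>/dev/null || true
echo "$WORK"
```

Then cd "$WORK" from any shell gets you to your workspace.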


Step 4: Run Your First Job

Interactive session (for exploration)

srun --account=bekn-dtai-gh --partition=ghx4-interactive \
     --nodes=1 --gpus-per-node=1 --cpus-per-task=16 \
     --mem=64G --time=00:30:00 --pty bash

This gives you a shell on a compute node with 1 GPU for 30 minutes.

Once on the compute node:

nvidia-smi    # See your GPU (GH200 120GB)
uname -m      # Should print "aarch64"
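You can run the same sanity check from Python. This sketch needs only the standard library; the torch import is optional and assumes you have installed PyTorch in the active environment (the script still runs without it):

```python
# Report CPU architecture; on DeltaAI nodes this should be "aarch64"
import platform

print("arch:", platform.machine())

# Optional: check GPU visibility if PyTorch happens to be installed
try:
    import torch
    print("cuda available:", torch.cuda.is_available())
except ImportError:
    print("torch not installed in this environment")
```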

Batch job

Create job.slurm:

#!/bin/bash
#SBATCH --account=bekn-dtai-gh
#SBATCH --partition=ghx4
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=01:00:00
#SBATCH --job-name=my-experiment
#SBATCH --output=logs/%j.out
#SBATCH --error=logs/%j.err

# Load your environment
source ~/miniconda3/etc/profile.d/conda.sh
conda activate myenv

# Run your code
srun python train.py

Submit:        sbatch job.slurm
Check status:  squeue -u $USER
Cancel:        scancel JOB_ID
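One gotcha with the script above: Slurm does not create the logs/ directory named in --output/--error, and if it is missing the job's output cannot be written. Create it before the first submission:

```shell
# Slurm won't create the output directory for you; make it once up front
mkdir -p logs
ls -d logs   # confirm it exists, then: sbatch job.slurm
```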

Cost awareness

Action                          Cost
1 GPU for 1 hour (batch)        1 GPU Hour
1 GPU for 1 hour (interactive)  2 GPU Hours
Full node (4 GPUs) for 1 hour   4 GPU Hours

We have ~1,000 GPU Hours. Use interactive sessions for debugging, batch for real work.
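The charging rules are simple enough to encode. A small helper (our own sketch, not an official NCSA accounting tool) for estimating a job's charge before submitting:

```python
# Estimate a job's GPU-hour charge using the rates from the table above:
# batch bills 1x per GPU-hour, interactive bills 2x.
def gpu_hour_cost(gpus: int, hours: float, interactive: bool = False) -> float:
    multiplier = 2 if interactive else 1
    return gpus * hours * multiplier

print(gpu_hour_cost(1, 1))                    # batch, 1 GPU, 1 h  -> 1
print(gpu_hour_cost(1, 1, interactive=True))  # interactive        -> 2
print(gpu_hour_cost(4, 1))                    # full node, batch   -> 4
```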


Step 5: Set Up Python / Conda

DeltaAI doesn't have Anaconda. Install Miniconda:

curl -L -o /tmp/mc.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-aarch64.sh
bash /tmp/mc.sh -b -p $HOME/miniconda3
source $HOME/miniconda3/etc/profile.d/conda.sh
conda init bash

Create an environment:

conda create -n myenv python=3.11 -y
conda activate myenv
conda install -c conda-forge pytorch numpy scipy matplotlib -y

For large environments, install to /work to avoid HOME quota:

conda create --prefix /work/hdd/bekn/$USER/envs/myenv python=3.11 -y
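If you always want new environments under /work, you can make that the default instead of passing --prefix each time. A sketch for ~/.condarc (conda expands environment variables such as ${USER} in these paths; verify the result with conda config --show envs_dirs):

```yaml
# ~/.condarc (sketch): default env and package-cache locations under /work
envs_dirs:
  - /work/hdd/bekn/${USER}/envs
pkgs_dirs:
  - /work/hdd/bekn/${USER}/conda-pkgs
```

With this in place, a plain conda create -n myenv ... lands under /work rather than HOME.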

Step 6: Build IOWarp Clio Core

ARM Architecture

DeltaAI uses aarch64 ARM CPUs. The default system GCC is 7.5 (too old). You must use gcc-13/g++-13 explicitly.

# Activate conda with all deps
source ~/miniconda3/etc/profile.d/conda.sh
conda activate iowarp

# Clone
cd /work/hdd/bekn/$USER
git clone --recurse-submodules https://github.com/iowarp/clio-core.git
cd clio-core

# Build (must use gcc-13 explicitly!)
cmake \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_C_COMPILER=/usr/bin/gcc-13 \
  -DCMAKE_CXX_COMPILER=/usr/bin/g++-13 \
  -DCMAKE_C_FLAGS="-I$CONDA_PREFIX/include" \
  -DCMAKE_CXX_FLAGS="-I$CONDA_PREFIX/include" \
  -DCMAKE_EXE_LINKER_FLAGS="-L$CONDA_PREFIX/lib" \
  -DCMAKE_SHARED_LINKER_FLAGS="-L$CONDA_PREFIX/lib" \
  -DCMAKE_PREFIX_PATH=$CONDA_PREFIX \
  -DCMAKE_INSTALL_PREFIX=$CONDA_PREFIX \
  -DWRP_CORE_ENABLE_RUNTIME=ON -DWRP_CORE_ENABLE_CTE=ON \
  -DWRP_CORE_ENABLE_CAE=ON -DWRP_CORE_ENABLE_CEE=ON \
  -DWRP_CORE_ENABLE_TESTS=OFF -DWRP_CORE_ENABLE_PYTHON=OFF \
  -DWRP_CORE_ENABLE_MPI=OFF -DWRP_CORE_ENABLE_IO_URING=OFF \
  -DWRP_CORE_ENABLE_ZMQ=ON -DWRP_CORE_ENABLE_CEREAL=ON \
  -DWRP_CORE_ENABLE_HDF5=ON \
  -Wno-dev -B build -G Ninja

cmake --build build -j16
cmake --install build

Known build issues

  • msgpack cmake naming — conda msgpack-cxx provides msgpack-cxx-config.cmake but CMake expects msgpackConfig.cmake. Create symlinks:
    mkdir -p $CONDA_PREFIX/lib/cmake/msgpack
    ln -sf $CONDA_PREFIX/lib/cmake/msgpack-cxx/msgpack-cxx-config.cmake \
           $CONDA_PREFIX/lib/cmake/msgpack/msgpackConfig.cmake
    ln -sf $CONDA_PREFIX/lib/cmake/msgpack-cxx/msgpack-cxx-config-version.cmake \
           $CONDA_PREFIX/lib/cmake/msgpack/msgpackConfigVersion.cmake
    ln -sf $CONDA_PREFIX/lib/cmake/msgpack-cxx/msgpack-cxx-targets.cmake \
           $CONDA_PREFIX/lib/cmake/msgpack/msgpack-cxx-targets.cmake
  • No io_uring — SLES 15.6 kernel may not support it. Disable with -DWRP_CORE_ENABLE_IO_URING=OFF.
  • Old cmake/ninja — the system cmake is 3.20, too old for some projects. Install both cmake and ninja from conda-forge for better compatibility (the build above assumes Ninja is available).

Step 7 (Optional): Install AI Coding Agents

DeltaAI does not ship with Node.js. If you want to use terminal-based coding agents, install Node.js first via nvm:

curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
source ~/.bashrc
nvm install --lts
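A common stumble right after installing nvm is that node is not on PATH in the current shell until ~/.bashrc is re-sourced. A quick check:

```shell
# Verify node/npm are reachable; if not, re-source your shell config
if command -v node >/dev/null 2>&1; then
    node --version
    npm --version
else
    echo "node not on PATH; run: source ~/.bashrc"
fi
```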

Then install whichever agents your workflow requires:

Claude Code (Anthropic)
npm install -g @anthropic-ai/claude-code

Requires an Anthropic API key or a Claude Pro/Max subscription. See claude.ai/code for details.

Gemini CLI (Google)
npm install -g @google/gemini-cli

Authenticate with your Google account on first run. See geminicli.com for details.

Codex CLI (OpenAI)
npm install -g @openai/codex

Requires an OpenAI API key. See openai.com/codex for details.

OpenCode
npm install -g opencode-ai@latest

Supports multiple LLM providers. See opencode.ai for details.

tip

All of these agents work well inside a tmux session, which is especially useful on DeltaAI where SSH sessions require re-authentication.


Key Things to Remember

  1. This is ARM, not x86. Binaries from your laptop won't run here. Compile everything on DeltaAI.
  2. No mpirun. Use srun for everything.
  3. Use gcc-13/g++-13 explicitly. The default system GCC is 7.5 (too old).
  4. No SSH keys. Password + Duo every time. Use tmux.
  5. Interactive = 2x cost. Use batch jobs for anything longer than quick debugging.
  6. No backups on /work. Only HOME has snapshots. Back up important work yourself.
  7. Keep builds off HOME. Use /work/hdd/bekn/YOUR_USERNAME/ for everything.

Useful Commands Cheat Sheet

accounts                 # Check GPU hour balance
quota                    # Check storage usage
sinfo -a                 # See partition status
squeue -u $USER          # Your running/queued jobs
scancel JOB_ID           # Cancel a job
nvidia-smi               # GPU status (compute nodes only)
module list              # Loaded software modules
module spider PACKAGE    # Search for available software

GPU Info

  • NVIDIA GH200 120GB per superchip
  • 4 superchips per node (4 GPUs)
  • CUDA 12.8, Driver 570.172
  • SM architecture: 9.0 (Hopper)
  • Use nvidia-smi on compute nodes (no GPUs on login nodes)
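Those specs set the memory budget for anything you run; a worked check of per-node capacity, using only the numbers listed above:

```python
# Per-node GPU memory from the specs above
gpus_per_node = 4
hbm_per_gpu_gb = 120
total_gb = gpus_per_node * hbm_per_gpu_gb
print(total_gb)  # -> 480
```

So a model plus activations that fits in ~480 GB can in principle span one full node's GPUs; anything larger needs multiple nodes.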

Getting Help

Required Acknowledgment

If you publish results using DeltaAI, include:

"This research used the DeltaAI system at the National Center for Supercomputing Applications through allocation CIS250329 from the ACCESS program, supported by NSF award OAC 2320345."