
HPC Cluster Deployment

This guide covers manual deployment of the IOWarp runtime on bare-metal HPC clusters using the unified utility scripts included with the installation.

Prerequisites

IOWarp must be installed on every node in the cluster. The recommended method is Jarvis:

# On each node: clone and install
git clone https://github.com/iowarp/runtime-deployment.git
cd runtime-deployment
pip install -e . -r requirements.txt
jarvis init
jarvis rg build

Every node must be able to run the same IOWarp binaries, provided by either:

  • A shared filesystem (NFS, Lustre, GPFS)
  • Identical per-node installations with the same paths

Environment Variables

The following environment variables control runtime behavior. Set them before starting any IOWarp process.

Configuration File

Variable          Priority  Description
CHI_SERVER_CONF   Primary   Path to the Chimaera YAML configuration file. Checked first.
WRP_RUNTIME_CONF  Fallback  Used when CHI_SERVER_CONF is not set.

export CHI_SERVER_CONF=/etc/iowarp/config.yaml
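
The lookup order in the table can be mirrored in a launch script, for example to log which file will be used before starting anything. The following is a minimal sketch: resolve_iowarp_conf is a hypothetical helper, not an IOWarp command, and it assumes that an empty variable counts as "not set".

```shell
# Hypothetical helper (not part of IOWarp): print the config path that
# should win under the documented order, CHI_SERVER_CONF first and
# WRP_RUNTIME_CONF as the fallback.
# Assumption: an empty value is treated the same as an unset variable.
resolve_iowarp_conf() {
    if [ -n "${CHI_SERVER_CONF:-}" ]; then
        printf '%s\n' "$CHI_SERVER_CONF"
    elif [ -n "${WRP_RUNTIME_CONF:-}" ]; then
        printf '%s\n' "$WRP_RUNTIME_CONF"
    else
        return 1   # neither variable is set
    fi
}
```

A script could then call `conf=$(resolve_iowarp_conf) || echo "no IOWarp config set" >&2` before launching the runtime.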

Networking Overrides

Variable         Default    Description
CHI_PORT         9413       Override the RPC port. Takes priority over the YAML networking.port setting.
CHI_SERVER_ADDR  127.0.0.1  Override the server address that clients connect to.
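
The port precedence can be sketched as a small shell function, which is handy when a script needs to know the port for a firewall rule or a readiness check. effective_port is a hypothetical helper. It takes the YAML networking.port value as an argument rather than parsing the file, and it assumes 9413 applies when neither source gives a port.

```shell
# effective_port [YAML_PORT]
# Hypothetical helper (not part of IOWarp): CHI_PORT takes priority over
# the YAML networking.port value passed as $1.
# Assumption: 9413 is used when neither is given.
effective_port() {
    yaml_port="${1:-}"
    printf '%s\n' "${CHI_PORT:-${yaml_port:-9413}}"
}
```

For example, `effective_port 7000` prints 7000 unless CHI_PORT is set in the environment.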

IPC Transport Mode

Variable      Default  Description
CHI_IPC_MODE  TCP      Transport used by clients to reach the runtime server.

Value  Mode                When to Use
SHM    Shared Memory       Client and server on the same node. Lowest latency.
TCP    ZeroMQ TCP          Cross-node communication. Default when unset.
IPC    Unix Domain Socket  Same-node only, avoids TCP overhead.

# Same-node, lowest latency
export CHI_IPC_MODE=SHM

# Cross-node (default)
export CHI_IPC_MODE=TCP
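
A job script that runs on several nodes may want to choose the mode per node. The sketch below picks SHM when the server address refers to the local node and TCP otherwise. pick_ipc_mode is a hypothetical helper, and the locality test (loopback names or the output of uname -n) is a simplifying assumption that will not recognize every local address.

```shell
# pick_ipc_mode SERVER_ADDR
# Hypothetical helper (not part of IOWarp): print SHM when SERVER_ADDR
# looks like this node, TCP otherwise.
# Assumption: "local" means loopback or the name reported by uname -n.
pick_ipc_mode() {
    case "$1" in
        127.0.0.1|localhost|"$(uname -n)") echo SHM ;;
        *) echo TCP ;;
    esac
}
```

Typical use would be `export CHI_IPC_MODE=$(pick_ipc_mode "$CHI_SERVER_ADDR")`.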

Runtime Mode

Variable          Default  Description
CHI_WITH_RUNTIME  (unset)  When set to 1, starts the runtime server in-process. When set to 0, runs in client-only mode.

This variable is read by CHIMAERA_INIT(). If unset, the value of the default_with_runtime argument passed to CHIMAERA_INIT() is used instead.
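
The decision described above can be modeled in a few lines of shell, which may help when reasoning about what a given launch environment will do. This is only an illustration of the documented rule, not the code inside CHIMAERA_INIT(). with_runtime is a hypothetical name, and treating any value other than 0 or 1 as unset is an assumption.

```shell
# with_runtime DEFAULT
# Illustrative model only: DEFAULT stands in for the default_with_runtime
# argument. Prints 1 (in-process runtime) or 0 (client-only).
# Assumption: values other than 0 or 1 are handled like an unset variable.
with_runtime() {
    case "${CHI_WITH_RUNTIME:-}" in
        1) echo 1 ;;
        0) echo 0 ;;
        *) echo "$1" ;;   # unset: fall back to the caller's default
    esac
}
```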


Single-Node Deployment

# 1. Set configuration
export CHI_SERVER_CONF=/etc/iowarp/config.yaml

# 2. Start the runtime in the background
chimaera runtime start &

# 3. (Optional) Create pools from the compose section
chimaera compose $CHI_SERVER_CONF

# 4. Run your application
my_iowarp_app
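
Before step 2, it can be worth checking that the configuration path actually points at a readable file, since a typo in CHI_SERVER_CONF is easy to miss once the runtime is in the background. A minimal sketch, where preflight_conf is a hypothetical helper and not an IOWarp command:

```shell
# preflight_conf PATH
# Hypothetical helper (not part of IOWarp): succeed only if PATH names a
# readable, non-empty file. It does not validate the YAML contents.
preflight_conf() {
    if [ -z "${1:-}" ]; then
        echo "no config path given" >&2
        return 1
    fi
    if [ ! -r "$1" ] || [ ! -s "$1" ]; then
        echo "config missing, unreadable, or empty: $1" >&2
        return 1
    fi
}
```

It could guard the start command, for example `preflight_conf "$CHI_SERVER_CONF" && chimaera runtime start &`.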

Multi-Node Deployment

Hostfile

Create a hostfile with one IP address (or hostname) per line, in the order in which the nodes should be addressed:

192.168.1.10
192.168.1.11
192.168.1.12
192.168.1.13

Reference it in your config:

networking:
  port: 9413
  hostfile: /etc/iowarp/hostfile
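
Because line order in the hostfile matters, a quick check for blank lines and duplicate entries can catch mistakes before the runtime starts. The sketch below prints each host with its position in the file. list_nodes is a hypothetical helper, and the zero-based index is only illustrative; it is not a statement about how IOWarp numbers nodes internally.

```shell
# list_nodes HOSTFILE
# Hypothetical helper (not part of IOWarp): print "index host" for every
# non-blank line, in file order. Duplicates are reported on stderr and
# make the function return non-zero.
# Assumption: the zero-based index is for illustration only.
list_nodes() {
    awk '
        NF == 0    { next }   # skip blank lines
        seen[$1]++ { printf "duplicate host: %s\n", $1 > "/dev/stderr"; bad = 1; next }
                   { printf "%d %s\n", n++, $1 }
        END        { exit bad }
    ' "$1"
}
```

Running `list_nodes /etc/iowarp/hostfile` on the example above would list the four addresses in the order shown.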

Starting the Runtime on All Nodes

Use parallel-ssh (pssh) to launch the runtime simultaneously across the cluster. Forwarding PATH and CHI_SERVER_CONF ensures each node picks up the right binary and config:

parallel-ssh -i -h hostfile \
  -x "-o SendEnv=PATH -o SendEnv=CHI_SERVER_CONF" \
  "chimaera runtime start &"

If your SSH environment does not forward variables reliably, inline them:

parallel-ssh -i -h hostfile \
  "export CHI_SERVER_CONF=/etc/iowarp/config.yaml && chimaera runtime start &"

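A process backgrounded over SSH can keep the session open, or be terminated when it closes, if it still holds the session's standard streams. One common workaround is to wrap the command in nohup and redirect all three streams. The sketch below only builds the remote command string. remote_start_cmd and the default log path are hypothetical, and paths containing spaces would need additional quoting.

```shell
# remote_start_cmd CONF [LOG]
# Hypothetical helper (not part of IOWarp): print a remote command that
# exports CHI_SERVER_CONF and starts the runtime detached from the SSH
# session. LOG defaults to an illustrative path.
remote_start_cmd() {
    conf="$1"
    log="${2:-/tmp/chimaera_runtime.log}"
    printf 'export CHI_SERVER_CONF=%s && nohup chimaera runtime start > %s 2>&1 < /dev/null &\n' \
        "$conf" "$log"
}
```

The result can be passed to the same launcher used above, for example `parallel-ssh -i -h hostfile "$(remote_start_cmd /etc/iowarp/config.yaml)"`.
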
Verifying the Cluster

After startup, run the following from any node to verify that all nodes have joined the cluster:

chimaera_pool_list

Stopping the Runtime

parallel-ssh -i -h hostfile "chimaera runtime stop"

Jarvis-Based Deployment

Jarvis automates multi-node orchestration and is the recommended method for scripted deployments:

# Configure and deploy across the cluster
jarvis pipeline create my_deploy
jarvis pipeline append iowarp_runtime

# Start
jarvis pipeline run

# Stop
jarvis pipeline clean

See the Jarvis runtime-deployment repository for pipeline configuration options.


Containers

Container-based deployment on HPC clusters is under active development. See Configuration for Docker Compose examples.