HPC Cluster Deployment

This guide covers manual deployment of the CLIO Runtime on bare-metal HPC clusters using the unified utility scripts included with the installation.

Prerequisites

IOWarp must be installed on every node in the cluster. The recommended method is Jarvis:

# On each node: clone and install
git clone https://github.com/iowarp/runtime-deployment.git
cd runtime-deployment
pip install -e . -r requirements.txt
jarvis init
jarvis rg build

All nodes must share access to the same IOWarp binaries, either via:

  • A shared filesystem (NFS, Lustre, GPFS)
  • Identical per-node installations with the same paths
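
When relying on identical per-node installations, it can help to confirm that every node resolves the binary to the same path. The sketch below is one possible check, not part of IOWarp: `all_same` is a hypothetical helper that succeeds only when every non-empty line it reads is identical. The usage comment assumes passwordless SSH and a local file named `hostfile`.

```shell
# all_same: succeed only if every non-empty line on stdin is identical.
all_same() {
    # Exactly one distinct non-empty line means all inputs agree.
    [ "$(grep -v '^$' | sort -u | wc -l | tr -d ' ')" = "1" ]
}

# Hypothetical usage (assumes passwordless SSH and one host per line in "hostfile"):
#   while read -r host; do
#       ssh -n "$host" 'command -v clio_run || echo "missing on $(hostname)"'
#   done < hostfile | all_same && echo "clio_run resolves to the same path everywhere"
```

This only compares paths. It does not verify that the binaries behind those paths are the same build.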

Environment Variables

The following environment variables control runtime behavior. Set them before starting any IOWarp process.

Configuration File

| Variable | Priority | Description |
| --- | --- | --- |
| CLIO_SERVER_CONF | Primary | Path to the Clio YAML configuration file. Checked first. |
| ~/.clio/clio.yaml | Fallback | Per-user default. Seeded at install time. |

Legacy paths and env vars are also accepted; see Deprecation Notes for the full list.

export CLIO_SERVER_CONF=/etc/iowarp/config.yaml
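
The lookup order above can be mirrored in a wrapper script to report which file the runtime is likely to read. This is a minimal sketch, not the runtime's own logic: `resolve_clio_conf` is a hypothetical helper, it treats an empty `CLIO_SERVER_CONF` as unset (an assumption), and it does not model the legacy paths.

```shell
# resolve_clio_conf: print the config path following the documented order.
resolve_clio_conf() {
    if [ -n "${CLIO_SERVER_CONF:-}" ]; then
        # Primary: the environment variable is checked first.
        printf '%s\n' "$CLIO_SERVER_CONF"
    elif [ -f "$HOME/.clio/clio.yaml" ]; then
        # Fallback: the per-user default seeded at install time.
        printf '%s\n' "$HOME/.clio/clio.yaml"
    else
        echo "no Clio configuration found" >&2
        return 1
    fi
}
```

Calling `resolve_clio_conf` before starting a process makes a missing or unexpected configuration visible early.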

Networking Overrides

| Variable | Default | Description |
| --- | --- | --- |
| CLIO_PORT | 9413 | Override the RPC port. Takes priority over the YAML networking.port setting. |
| CLIO_SERVER_ADDR | 127.0.0.1 | Override the server address that clients connect to. |
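
As an illustration of the precedence described in the table, the sketch below computes the address and port a client would end up using. `effective_endpoint` is a hypothetical helper. It takes the YAML `networking.port` value as an argument rather than parsing the file, and the `address:port` output format is chosen only for display.

```shell
# effective_endpoint [YAML_PORT]
# Print "address:port" under the documented precedence.
effective_endpoint() {
    yaml_port=${1:-}
    # CLIO_PORT wins over the YAML value; 9413 is the documented default.
    port=${CLIO_PORT:-${yaml_port:-9413}}
    # CLIO_SERVER_ADDR falls back to the documented default of 127.0.0.1.
    addr=${CLIO_SERVER_ADDR:-127.0.0.1}
    printf '%s:%s\n' "$addr" "$port"
}
```

For example, with `CLIO_PORT=9500` exported, `effective_endpoint 9413` reports port 9500 because the environment variable overrides the YAML setting.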

IPC Transport Mode

| Variable | Default | Description |
| --- | --- | --- |
| CLIO_IPC_MODE | TCP | Transport used by clients to reach the runtime server. |

| Value | Mode | When to Use |
| --- | --- | --- |
| SHM | Shared Memory | Client and server on the same node. Lowest latency. |
| TCP | ZeroMQ TCP | Cross-node communication. Default when unset. |
| IPC | Unix Domain Socket | Same-node only, avoids TCP overhead. |

# Same-node, lowest latency
export CLIO_IPC_MODE=SHM

# Cross-node (default)
export CLIO_IPC_MODE=TCP
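
A launch script can pick the mode from the server address instead of hard-coding it. The sketch below is one way to do that under a simplifying assumption: only loopback addresses count as "same node", so a node's own routable IP is treated as remote. Both function names are hypothetical, and the validation assumes the values are case-sensitive.

```shell
# pick_ipc_mode [SERVER_ADDR]
# Suggest a CLIO_IPC_MODE value based on the documented guidance.
pick_ipc_mode() {
    case "${1:-127.0.0.1}" in
        127.0.0.1|localhost|::1)
            # Same node: shared memory has the lowest latency.
            echo SHM ;;
        *)
            # Different node: TCP is the cross-node transport.
            echo TCP ;;
    esac
}

# validate_ipc_mode MODE: accept only the three documented values.
validate_ipc_mode() {
    case "$1" in
        SHM|TCP|IPC) return 0 ;;
        *) echo "unsupported CLIO_IPC_MODE: $1" >&2; return 1 ;;
    esac
}
```

A typical use would be `export CLIO_IPC_MODE="$(pick_ipc_mode "${CLIO_SERVER_ADDR:-127.0.0.1}")"`.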

Runtime Mode

| Variable | Default | Description |
| --- | --- | --- |
| CLIO_WITH_RUNTIME | (unset) | When set to 1, starts the runtime server in-process. When 0, client-only mode. |

This variable is read by CHIMAERA_INIT(). If unset, the value of the default_with_runtime argument passed to CHIMAERA_INIT() is used instead.


Single-Node Deployment

# 1. Set configuration
export CLIO_SERVER_CONF=/etc/iowarp/config.yaml

# 2. Start the runtime in the background
clio_run start &

# 3. (Optional) Create pools from the compose section
clio_run compose "$CLIO_SERVER_CONF"

# 4. Run your application
my_iowarp_app
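
Because the runtime is started in the background, the following commands may run before it is ready to accept connections. The sketch below shows a generic retry helper that a script could place between the steps. `wait_until` is a hypothetical function, and the usage comment assumes that `nc` is installed and that an open RPC port is a good enough readiness signal.

```shell
# wait_until ATTEMPTS COMMAND [ARGS...]
# Run COMMAND up to ATTEMPTS times, one second apart, until it succeeds.
wait_until() {
    remaining=$1
    shift
    while ! "$@" >/dev/null 2>&1; do
        remaining=$((remaining - 1))
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
    done
}

# Hypothetical usage between starting the runtime and composing pools:
#   clio_run start &
#   wait_until 30 nc -z 127.0.0.1 "${CLIO_PORT:-9413}" || exit 1
#   clio_run compose "$CLIO_SERVER_CONF"
```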

Multi-Node Deployment

Hostfile

Create a hostfile with one IP address or hostname per line, listed in the order the nodes should be addressed:

192.168.1.10
192.168.1.11
192.168.1.12
192.168.1.13

Reference it in your config:

networking:
  port: 9413
  hostfile: /etc/iowarp/hostfile
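
Since line order in the hostfile matters, a quick sanity check before launch can catch editing mistakes. The sketch below is a hypothetical helper, not an IOWarp tool. It assumes that blank lines and duplicate entries are mistakes, which is stricter than the runtime may require.

```shell
# check_hostfile PATH
# Print the node count, or fail on an unreadable file, blank line, or duplicate.
check_hostfile() {
    hostfile=$1
    [ -r "$hostfile" ] || { echo "cannot read $hostfile" >&2; return 1; }
    if grep -q '^[[:space:]]*$' "$hostfile"; then
        echo "blank line in $hostfile" >&2
        return 1
    fi
    dupes=$(sort "$hostfile" | uniq -d)
    if [ -n "$dupes" ]; then
        echo "duplicate hosts: $dupes" >&2
        return 1
    fi
    # The file is only read, so its line order is left untouched.
    grep -c '' "$hostfile"
}
```

Running `check_hostfile /etc/iowarp/hostfile` on the four-line example above prints `4`.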

Starting the Runtime on All Nodes

Use parallel-ssh (pssh) to launch the runtime simultaneously across the cluster. Forwarding PATH and CLIO_SERVER_CONF ensures each node picks up the right binary and config:

parallel-ssh -i -h hostfile \
    -x "-o SendEnv=PATH -o SendEnv=CLIO_SERVER_CONF" \
    "clio_run start &"

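SendEnv only takes effect when the remote sshd is configured to accept the variable (AcceptEnv in sshd_config). If your cluster does not allow that, or forwarding is otherwise unreliable, inline the variables instead: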

parallel-ssh -i -h hostfile \
    "export CLIO_SERVER_CONF=/etc/iowarp/config.yaml && clio_run start &"

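When the inlined command is generated by a script, quoting the config path and detaching the process from the SSH session avoids two common problems: paths with spaces, and sessions that stay open or take the runtime down when they close. The sketch below builds such a command string. `remote_start_cmd` is a hypothetical helper, the use of `nohup` with output discarded is an assumption about how you want the runtime detached, and paths containing single quotes are rejected rather than escaped.

```shell
# remote_start_cmd CONF_PATH
# Print a command string suitable for passing to parallel-ssh.
remote_start_cmd() {
    conf=$1
    case "$conf" in
        *\'*) echo "config path must not contain single quotes" >&2; return 1 ;;
    esac
    # nohup plus redirection detaches the runtime from the SSH session,
    # so the session can return while the runtime keeps running.
    printf "export CLIO_SERVER_CONF='%s' && nohup clio_run start >/dev/null 2>&1 &\n" "$conf"
}

# Hypothetical usage:
#   parallel-ssh -i -h hostfile "$(remote_start_cmd /etc/iowarp/config.yaml)"
```

Redirect to a per-node log file instead of /dev/null if you want to keep the runtime's output.
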
Verifying the Cluster

After startup, run the following from any node to verify that all nodes have joined the cluster:

clio_run monitor
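
If a node appears to be missing, a per-host connectivity check can help narrow down whether the problem is the network or the runtime. The sketch below is a hypothetical helper, independent of `clio_run`, that runs a probe command against every host in the hostfile and lists the ones that fail. The `probe_port` example in the comment assumes `nc` is available.

```shell
# for_each_host HOSTFILE COMMAND [ARGS...]
# Run COMMAND with each host appended as the last argument; list hosts that fail.
for_each_host() {
    hostfile=$1
    shift
    failed=0
    while IFS= read -r host || [ -n "$host" ]; do
        [ -n "$host" ] || continue
        # </dev/null keeps COMMAND from consuming the rest of the hostfile.
        if ! "$@" "$host" </dev/null >/dev/null 2>&1; then
            echo "unreachable: $host"
            failed=1
        fi
    done < "$hostfile"
    return "$failed"
}

# Hypothetical usage, probing the RPC port on every node:
#   probe_port() { nc -z "$1" "${CLIO_PORT:-9413}"; }
#   for_each_host /etc/iowarp/hostfile probe_port
```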

Stopping the Runtime

parallel-ssh -i -h hostfile "clio_run stop"

Jarvis-Based Deployment

Jarvis automates multi-node orchestration and is the recommended method for scripted deployments:

# Configure and deploy across the cluster
jarvis pipeline create my_deploy
jarvis pipeline append iowarp_runtime

# Start
jarvis pipeline run

# Stop
jarvis pipeline clean

See the Jarvis runtime-deployment repository for pipeline configuration options.


Containers

Container-based deployment on HPC clusters is under active development. See Configuration for Docker Compose examples.