# Runtime Dashboard
The context_visualizer package provides a lightweight Flask web application that lets you inspect and manage a live Chimaera runtime cluster from your browser. It connects to the runtime using the same client API used by application code and surfaces cluster topology, per-node worker statistics, system resource utilization, block device stats, pool configuration, and the active YAML config.
## Prerequisites

- IOWarp installed with Python support (`WRP_CORE_ENABLE_PYTHON=ON`)
- A running Chimaera runtime (`chimaera runtime start`)
- Python dependencies: `flask`, `pyyaml`, `msgpack`
Install the Python dependencies with any of:
```bash
pip install flask pyyaml msgpack
# or
pip install iowarp-core[visualizer]
# or (conda)
conda install flask pyyaml python-msgpack
```
## Starting the Dashboard

```bash
python -m context_visualizer
```

Then open http://127.0.0.1:5000 in your browser.
### CLI Options

| Flag | Default | Description |
|---|---|---|
| `--host` | `127.0.0.1` | Bind address. Use `0.0.0.0` to expose on all interfaces. |
| `--port` | `5000` | Listen port. |
| `--debug` | (off) | Enable Flask debug mode (auto-reload, verbose errors). |
```bash
# Expose on all interfaces, non-default port
python -m context_visualizer --host 0.0.0.0 --port 8080

# Debug mode (development only)
python -m context_visualizer --debug
```
## Pages

### Topology (/)
The landing page shows a live grid of all nodes in the cluster. Each node card displays:
- Hostname and IP address
- Status badge (alive)
- CPU, RAM, and GPU utilization bars (GPU shown only when GPUs are present)
- Restart and Shutdown action buttons
The search bar supports filtering by node ID (single `3`, range `1-20`, comma-separated `1,3,5`) or by hostname/IP substring.
Clicking a node card navigates to the per-node detail page.
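The three filter forms above (single ID, inclusive range, comma-separated list) compose naturally. A minimal sketch of how such a filter expression might be parsed — an illustration of the accepted syntax, not the dashboard's actual implementation:

```python
from __future__ import annotations

def parse_node_filter(expr: str) -> set[int]:
    """Parse a node-ID filter like "3", "1-20", or "1,3,5" into a set of IDs."""
    ids: set[int] = set()
    for part in expr.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-", 1)
            ids.update(range(int(lo), int(hi) + 1))  # range is inclusive
        elif part:
            ids.add(int(part))
    return ids

print(parse_node_filter("1-3,5"))  # {1, 2, 3, 5}
```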
### Node Detail (`/node/<id>`)
A per-node drilldown page showing:
- Worker statistics — per-worker queue depth, blocked tasks, processed count, and more
- System stats — time-series CPU, RAM, GPU, and HBM utilization
- Block device stats — per-bdev pool throughput and capacity
### Pools (/pools)

Lists all pools defined in the `compose` section of the active configuration file:
| Column | Description |
|---|---|
| Module | ChiMod shared-library name (mod_name) |
| Pool Name | User-defined pool name |
| Pool ID | Unique pool identifier |
| Query | Routing policy (local, dynamic, broadcast) |
### Config (/config)
Displays the full contents of the active YAML configuration file as formatted JSON, for quick inspection without opening a terminal.
## REST API
All pages are backed by a JSON API. You can query these endpoints directly for scripting or integration with other monitoring tools.
### Cluster-wide
| Endpoint | Method | Description |
|---|---|---|
| `/api/topology` | GET | List all nodes with hostname, IP, CPU/RAM/GPU utilization |
| `/api/system` | GET | High-level system overview (connected, worker/queue/blocked/processed counts) |
| `/api/workers` | GET | Per-worker stats plus a fleet summary (local node) |
| `/api/pools` | GET | Pool list from the `compose` section of the config |
| `/api/config` | GET | Full active configuration as JSON |
### Per-node
| Endpoint | Method | Description |
|---|---|---|
| `/api/node/<id>/workers` | GET | Worker stats for a specific node |
| `/api/node/<id>/system_stats` | GET | System resource utilization entries for a specific node |
| `/api/node/<id>/bdev_stats` | GET | Block device stats for a specific node |
### Node Management
| Endpoint | Method | Description |
|---|---|---|
| `/api/topology/node/<id>/shutdown` | POST | Gracefully shut down a node via SSH |
| `/api/topology/node/<id>/restart` | POST | Restart a node via SSH |
Shutdown and restart are performed by SSHing from the dashboard host to the target node and running `chimaera runtime stop` or `chimaera runtime restart`. This avoids the problem of a node killing itself mid-RPC. The SSH connection uses `StrictHostKeyChecking=no` and `ConnectTimeout=5`.
Shutdown response:

```json
{
  "success": true,
  "returncode": 0,
  "stdout": "",
  "stderr": ""
}
```
Exit codes 0 and 134 (SIGABRT from `std::abort()` in `InitiateShutdown`) are both treated as success.

Restart uses `nohup` so the SSH session returns immediately while the node restarts in the background.
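Putting the pieces above together, the management flow can be sketched as follows. This is an illustration only — the `run_remote` helper is hypothetical, and how the dashboard resolves the SSH user/host is not specified here; the SSH options, commands, and exit-code handling come from the text above:

```python
import subprocess

SSH_OPTS = ["-o", "StrictHostKeyChecking=no", "-o", "ConnectTimeout=5"]

def is_success(returncode: int) -> bool:
    # 0 = clean exit; 134 = SIGABRT raised by std::abort() in InitiateShutdown
    return returncode in (0, 134)

def run_remote(host: str, command: str) -> dict:
    """Run a runtime-management command on a node over SSH and report the
    result in the same shape as the API's shutdown response."""
    proc = subprocess.run(
        ["ssh", *SSH_OPTS, host, command],
        capture_output=True, text=True,
    )
    return {
        "success": is_success(proc.returncode),
        "returncode": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```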
All endpoints return `Content-Type: application/json`. On error they return an appropriate HTTP status code (e.g., 503 if the runtime is unreachable, 404 if a node is not found) with an `"error"` field in the response body.
### Examples

```bash
# Get cluster topology
curl http://127.0.0.1:5000/api/topology

# Get system overview
curl http://127.0.0.1:5000/api/system

# Get worker stats for node 2
curl http://127.0.0.1:5000/api/node/2/workers

# Shut down node 3
curl -X POST http://127.0.0.1:5000/api/topology/node/3/shutdown

# Restart node 3
curl -X POST http://127.0.0.1:5000/api/topology/node/3/restart
```
## Configuration File Discovery
The dashboard reads the same config file as the runtime, using the same search order:
| Source | Priority |
|---|---|
| `CHI_SERVER_CONF` environment variable | 1st |
| `WRP_RUNTIME_CONF` environment variable | 2nd |
| `~/.chimaera/chimaera.yaml` | 3rd |
See Configuration for details on the config file format.
## Connection Lifecycle
The dashboard connects to the runtime lazily — on the first request that needs live data. If the runtime is not yet running when the dashboard starts, it will show a disconnected state and retry on subsequent requests. Shutdown is handled automatically via `atexit` so the client is finalized cleanly when the server process exits.
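The lazy-connect-plus-`atexit` pattern described here can be sketched generically. The `connect` and `finalize` callables stand in for the real client API and are illustrative:

```python
import atexit

class LazyClient:
    """Connect on first use; finalize automatically at interpreter exit."""

    def __init__(self, connect, finalize):
        self._connect = connect
        self._finalize = finalize
        self._client = None

    def get(self):
        if self._client is None:  # first request that needs live data
            self._client = self._connect()
            atexit.register(self._finalize, self._client)
        return self._client
```

A failed `connect()` raises on that request (surfacing a disconnected state), and the next call simply retries since `self._client` is still `None`.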
## Docker / Remote Access
When running the runtime inside Docker or on a remote host, bind the dashboard to all interfaces and forward the port:
```bash
# On the host running the runtime
python -m context_visualizer --host 0.0.0.0 --port 5000
```

```yaml
# docker-compose.yml — expose the dashboard port alongside the runtime
services:
  iowarp:
    image: iowarp/deploy-cpu:latest
    ports:
      - "9413:9413"   # Chimaera RPC
      - "5000:5000"   # Dashboard
    command: >
      bash -c "chimaera runtime start &
               python -m context_visualizer --host 0.0.0.0"
```
The dashboard has no authentication. Do not expose it on a public network without a reverse proxy that enforces access control.
## Try It: Interactive Docker Cluster
An interactive test environment is provided that spins up a 4-node Chimaera cluster with the dashboard so you can explore all features from your browser.
### Location

```
context-runtime/test/integration/interactive/
├── docker-compose.yml   # 4-node runtime cluster
├── hostfile             # Node IP addresses (172.28.0.10-13)
├── wrp_conf.yaml        # Runtime configuration
└── run.sh               # Launcher script
```
### How It Works

- 4 Docker containers (`iowarp-interactive-node1` through `node4`) run the Chimaera runtime on a private `172.28.0.0/16` network, each with `sshd` for SSH-based shutdown/restart
- Node 1 also runs the dashboard alongside its runtime
- The script connects the devcontainer to the Docker network and starts a local port-forward so that `localhost:5000` reaches the dashboard inside Docker — VS Code then auto-forwards this to your host browser
- SSH keys are distributed via a shared Docker volume so the dashboard can authenticate to all nodes
### Running

```bash
cd context-runtime/test/integration/interactive

# Foreground (Ctrl-C to stop)
bash run.sh

# Or run in the background
bash run.sh start

# Follow runtime container logs
bash run.sh logs

# Stop everything (cluster + dashboard)
bash run.sh stop
```
Once the cluster is up (~15 seconds), open http://localhost:5000 to browse the topology, click into individual nodes, and use the Restart/Shutdown buttons.
If running from a devcontainer or a host where the workspace is at a different path, set `HOST_WORKSPACE`:

```bash
HOST_WORKSPACE=/host/path/to/workspace bash run.sh
```