Local Scheduler Guide
Overview
The Chimaera runtime uses a pluggable scheduler architecture to control how tasks are mapped to workers and how workers are organized. Scheduling happens at two levels: container-level scheduling resolves what to execute (via Container::ScheduleTask), and the Scheduler decides where to execute it (which worker thread). This document explains both levels and how to build custom schedulers.
Table of Contents
- Architecture Overview
- Two-Level Scheduling
- Scheduler Interface
- Worker Lifecycle
- Implementing a Custom Scheduler
- DefaultScheduler Example
- Best Practices
- Integration Points
Architecture Overview
Component Responsibilities
The Chimaera runtime separates concerns across five main components:
- ConfigManager: Manages configuration (number of threads, queue depth, etc.)
- WorkOrchestrator: Creates workers, spawns threads, assigns lanes to workers (1:1 mapping for all workers)
- Scheduler: Decides worker partitioning, task-to-worker mapping, and load balancing
- IpcManager: Manages shared memory, queues, and provides task routing infrastructure (RouteTask, RouteLocal, RouteGlobal)
- Container: Provides per-pool dynamic scheduling via ScheduleTask
Data Flow
┌─────────────────┐
│ ConfigManager │──→ num_threads, queue_depth
└─────────────────┘
│
↓
┌─────────────────┐
│ WorkOrchestrator│──→ Creates num_threads + 1 workers
└─────────────────┘
│
↓
┌─────────────────┐
│ Scheduler │──→ Tracks worker groups for routing decisions
└─────────────────┘ Updates IpcManager with scheduler queue count
│
↓
┌─────────────────┐
│ WorkOrchestrator│──→ Maps ALL workers to lanes (1:1 mapping)
└─────────────────┘ Spawns OS threads for each worker
│
↓
┌─────────────────┐
│ IpcManager │──→ num_sched_queues used for client task mapping
└─────────────────┘
Task Routing Flow
When a task is submitted, IpcManager::RouteTask orchestrates the full routing pipeline:
RouteTask(future, force_enqueue)
│
├─ 1. Container::ScheduleTask() → resolve Dynamic pool query
│ (e.g., Dynamic → DirectHash)
│
├─ 2. ResolvePoolQuery() → resolve to physical node(s)
│
├─ 3. IsTaskLocal() → local or remote?
│ │
│ ├─ local: RouteLocal() → resolve container, pick worker
│ │ │
│ │ ├─ RuntimeMapTask() → scheduler picks dest worker
│ │ │
│ │ ├─ dest == current && !force_enqueue → execute directly
│ │ └─ otherwise → enqueue to dest worker's lane
│ │
│ └─ remote: RouteGlobal() → enqueue to net_queue_ for SendIn
│
└─ return true if task can be executed by current worker
Two-Level Scheduling
Level 1: Container Scheduling (ScheduleTask)
Before the Scheduler decides which worker runs a task, the container decides which container instance handles it. This is the Container::ScheduleTask virtual method.
class Container {
public:
virtual PoolQuery ScheduleTask(const hipc::FullPtr<Task> &task) {
return task->pool_query_; // Default: no transformation
}
};
Purpose: Transform a high-level PoolQuery (often Dynamic) into a concrete routing mode (DirectHash, Local, Broadcast, etc.) based on the task's semantics.
Example: A distributed filesystem container might schedule read tasks to the node holding the target block:
PoolQuery MyFsContainer::ScheduleTask(const hipc::FullPtr<Task> &task) {
if (task->method_ == kRead || task->method_ == kWrite) {
// Route to the container holding the target block
u64 block_id = GetBlockId(task);
return PoolQuery::DirectHash(block_id);
}
// Metadata ops stay local
return PoolQuery::Local();
}
ScheduleTask is called by RouteTask before pool query resolution. The returned PoolQuery then goes through ResolvePoolQuery to determine concrete physical nodes and containers.
Level 2: Worker Scheduling (RuntimeMapTask)
After container and node resolution, RouteLocal calls Scheduler::RuntimeMapTask to pick the specific worker thread. This is where I/O-size routing, task group affinity, and network worker pinning happen.
Scheduler Interface
All schedulers must inherit from the Scheduler base class and implement the following methods:
Required Methods
class Scheduler {
public:
virtual ~Scheduler() = default;
// Partition workers into groups after WorkOrchestrator creates them
virtual void DivideWorkers(WorkOrchestrator *work_orch) = 0;
// Map tasks from clients to worker lanes
virtual u32 ClientMapTask(IpcManager *ipc_manager, const Future<Task> &task) = 0;
// Map tasks from runtime workers to other workers
virtual u32 RuntimeMapTask(Worker *worker, const Future<Task> &task,
Container *container) = 0;
// Rebalance load across workers (called periodically by workers)
virtual void RebalanceWorker(Worker *worker) = 0;
// Adjust polling intervals for periodic tasks
virtual void AdjustPolling(RunContext *run_ctx) = 0;
// Get designated GPU worker (optional)
virtual Worker *GetGpuWorker() const { return nullptr; }
// Get designated network worker (optional)
virtual Worker *GetNetWorker() const { return nullptr; }
};
Method Details
DivideWorkers(WorkOrchestrator *work_orch)
Purpose: Partition workers into functional groups after they've been created.
Called: Once during initialization, after WorkOrchestrator creates all workers but before threads are spawned.
Responsibilities:
- Access workers via work_orch->GetWorker(worker_id)
- Organize workers into scheduler-specific groups (e.g., scheduler worker, I/O workers, network worker, GPU worker)
- Update IpcManager with the scheduling queue count via IpcManager::SetNumSchedQueues()
- Set the network lane via IpcManager::SetNetLane() so the runtime knows where to enqueue network tasks
Important: All workers are assigned lanes by WorkOrchestrator::SpawnWorkerThreads() using 1:1 mapping. The scheduler does NOT control lane assignment — it only tracks worker groups for routing decisions.
Example:
void MyScheduler::DivideWorkers(WorkOrchestrator *work_orch) {
u32 total_workers = work_orch->GetTotalWorkerCount();
// Worker 0: scheduler worker (metadata + small I/O)
scheduler_worker_ = work_orch->GetWorker(0);
// Workers 1..N-2: I/O workers (large I/O round-robin)
for (u32 i = 1; i < total_workers - 1; ++i) {
io_workers_.push_back(work_orch->GetWorker(i));
}
// Worker N-1: network worker
net_worker_ = work_orch->GetWorker(total_workers - 1);
// IMPORTANT: Update IpcManager
IpcManager *ipc = CHI_IPC;
if (ipc) {
ipc->SetNumSchedQueues(1); // Client tasks go to scheduler worker
if (net_worker_) {
ipc->SetNetLane(net_worker_->GetLane());
}
}
}
ClientMapTask(IpcManager *ipc_manager, const Future<Task> &task)
Purpose: Determine which worker lane a task from a client should be assigned to.
Called: When clients submit tasks to the runtime via SendRuntimeClient.
Responsibilities:
- Return a lane ID in range
[0, num_sched_queues) - Use
ipc_manager->GetNumSchedQueues()to get valid lane count - Route special tasks (e.g., network Send/Recv) to the appropriate worker
- Common strategies: PID+TID hash, round-robin, locality-aware
Example:
u32 MyScheduler::ClientMapTask(IpcManager *ipc_manager, const Future<Task> &task) {
u32 num_lanes = ipc_manager->GetNumSchedQueues();
if (num_lanes == 0) return 0;
// Network tasks (Send/Recv from admin pool) → network worker
Task *task_ptr = task.get();
if (task_ptr != nullptr && task_ptr->pool_id_ == chi::kAdminPoolId) {
u32 method_id = task_ptr->method_;
if (method_id == 14 || method_id == 15 ||
method_id == 20 || method_id == 21) {
return num_lanes - 1;
}
}
// Default: scheduler worker (lane 0)
return 0;
}
RuntimeMapTask(Worker *worker, const Future<Task> &task, Container *container)
Purpose: Determine which worker should execute a task when routing from within the runtime.
Called: By IpcManager::RouteLocal() after the execution container has been resolved. If the returned worker ID differs from the current worker (or if force_enqueue is set), the task is enqueued to the destination worker's lane.
Parameters:
- worker: The current worker (may be nullptr if called from a non-worker thread)
- task: The task being routed
- container: The resolved execution container, used for task-group affinity lookups. May be nullptr when called without a resolved container.
Responsibilities:
- Return a worker ID for task execution
- Route periodic network tasks (Send/Recv) to the dedicated network worker
- Implement I/O-size-based routing (large I/O → dedicated workers)
- Implement task group affinity (pin related tasks to the same worker)
Example:
u32 MyScheduler::RuntimeMapTask(Worker *worker, const Future<Task> &task,
Container *container) {
Task *task_ptr = task.get();
// Task group affinity: if this task's group is already pinned, use that worker
if (container != nullptr && task_ptr != nullptr &&
!task_ptr->task_group_.IsNull()) {
int64_t group_id = task_ptr->task_group_.id_;
ScopedCoRwReadLock read_lock(container->task_group_lock_);
auto it = container->task_group_map_.find(group_id);
if (it != container->task_group_map_.end() && it->second != nullptr) {
return it->second->GetId();
}
}
// Periodic network tasks → network worker
if (task_ptr != nullptr && task_ptr->IsPeriodic()) {
if (task_ptr->pool_id_ == chi::kAdminPoolId) {
u32 method_id = task_ptr->method_;
if (method_id == 14 || method_id == 15) {
if (net_worker_ != nullptr) {
return net_worker_->GetId();
}
}
}
}
// Large I/O → round-robin across I/O workers
if (task_ptr != nullptr && !io_workers_.empty()) {
if (task_ptr->stat_.io_size_ >= kLargeIOThreshold) {
u32 idx = next_io_idx_.fetch_add(1, std::memory_order_relaxed)
% static_cast<u32>(io_workers_.size());
return io_workers_[idx]->GetId();
}
}
// Small I/O / metadata → scheduler worker
if (scheduler_worker_ != nullptr) {
return scheduler_worker_->GetId();
}
return worker ? worker->GetId() : 0;
}
RebalanceWorker(Worker *worker)
Purpose: Balance load across workers by stealing or delegating tasks.
Called: Periodically by workers after processing tasks.
Responsibilities:
- Implement work stealing algorithms
- Migrate tasks between workers
- Optional — can be a no-op for simple schedulers
Example:
void MyScheduler::RebalanceWorker(Worker *worker) {
// Simple schedulers can leave this empty
(void)worker;
}
AdjustPolling(RunContext *run_ctx)
Purpose: Adjust polling intervals for periodic tasks based on work done.
Called: After each execution of a periodic task.
Responsibilities:
- Modify run_ctx->yield_time_us_ based on run_ctx->did_work_
- Implement adaptive polling (exponential backoff when idle)
- Reduce CPU usage for idle periodic tasks
- Important: co_await on Futures sets yield_time_us_ to 0, so this method must restore it to prevent busy-looping
Example:
void MyScheduler::AdjustPolling(RunContext *run_ctx) {
if (!run_ctx) return;
const double kMaxPollingIntervalUs = 100000.0; // 100ms
if (run_ctx->did_work_) {
// Reset to original period when work is done
run_ctx->yield_time_us_ = run_ctx->true_period_ns_ / 1000.0;
} else {
// Exponential backoff when idle
double current = run_ctx->yield_time_us_;
if (current <= 0.0) {
current = run_ctx->true_period_ns_ / 1000.0;
}
run_ctx->yield_time_us_ = std::min(current * 2.0, kMaxPollingIntervalUs);
}
}
Worker Lifecycle
Understanding the worker lifecycle is crucial for scheduler implementation:
1. ConfigManager loads configuration (num_threads, queue_depth)
↓
2. WorkOrchestrator::Init()
- Creates num_threads + 1 workers
- Calls Scheduler::DivideWorkers()
↓
3. Scheduler::DivideWorkers()
- Tracks workers into functional groups (scheduler, I/O, network, GPU)
- Updates IpcManager::SetNumSchedQueues()
- Sets network lane via IpcManager::SetNetLane()
↓
4. WorkOrchestrator::StartWorkers()
- Calls SpawnWorkerThreads()
- Maps ALL workers to lanes (1:1 mapping: worker i → lane i)
- Spawns actual OS threads
↓
5. Workers run task processing loops
- ProcessNewTask: pop futures from lane, route via RouteTask()
- RouteTask calls Container::ScheduleTask() for dynamic resolution
- RouteLocal calls Scheduler::RuntimeMapTask() for worker selection
- If dest != current worker (or force_enqueue), re-enqueue to dest lane
- Call Scheduler::RebalanceWorker() periodically
- Call Scheduler::AdjustPolling() after periodic task execution
Implementing a Custom Scheduler
Step 1: Create Header File
Create context-runtime/include/chimaera/scheduler/my_scheduler.h:
#ifndef CHIMAERA_INCLUDE_CHIMAERA_SCHEDULER_MY_SCHEDULER_H_
#define CHIMAERA_INCLUDE_CHIMAERA_SCHEDULER_MY_SCHEDULER_H_
#include <atomic>
#include <vector>
#include "chimaera/scheduler/scheduler.h"
namespace chi {
class MyScheduler : public Scheduler {
public:
MyScheduler() : scheduler_worker_(nullptr), net_worker_(nullptr),
gpu_worker_(nullptr), next_io_idx_{0} {}
~MyScheduler() override = default;
void DivideWorkers(WorkOrchestrator *work_orch) override;
u32 ClientMapTask(IpcManager *ipc_manager, const Future<Task> &task) override;
u32 RuntimeMapTask(Worker *worker, const Future<Task> &task,
Container *container) override;
void RebalanceWorker(Worker *worker) override;
void AdjustPolling(RunContext *run_ctx) override;
Worker *GetGpuWorker() const override { return gpu_worker_; }
Worker *GetNetWorker() const override { return net_worker_; }
private:
Worker *scheduler_worker_;
std::vector<Worker*> io_workers_;
Worker *net_worker_;
Worker *gpu_worker_;
std::atomic<u32> next_io_idx_{0};
};
} // namespace chi
#endif // CHIMAERA_INCLUDE_CHIMAERA_SCHEDULER_MY_SCHEDULER_H_
Step 2: Implement Methods
Create context-runtime/src/scheduler/my_scheduler.cc:
#include "chimaera/scheduler/my_scheduler.h"
#include "chimaera/config_manager.h"
#include "chimaera/ipc_manager.h"
#include "chimaera/work_orchestrator.h"
#include "chimaera/worker.h"
namespace chi {
void MyScheduler::DivideWorkers(WorkOrchestrator *work_orch) {
if (!work_orch) return;
u32 total_workers = work_orch->GetTotalWorkerCount();
scheduler_worker_ = work_orch->GetWorker(0);
net_worker_ = work_orch->GetWorker(total_workers - 1);
if (total_workers > 2) {
gpu_worker_ = work_orch->GetWorker(total_workers - 2);
for (u32 i = 1; i < total_workers - 1; ++i) {
io_workers_.push_back(work_orch->GetWorker(i));
}
}
IpcManager *ipc = CHI_IPC;
if (ipc) {
ipc->SetNumSchedQueues(1);
if (net_worker_) {
ipc->SetNetLane(net_worker_->GetLane());
}
}
}
u32 MyScheduler::ClientMapTask(IpcManager *ipc_manager, const Future<Task> &task) {
u32 num_lanes = ipc_manager->GetNumSchedQueues();
if (num_lanes == 0) return 0;
return 0; // All client tasks → scheduler worker
}
u32 MyScheduler::RuntimeMapTask(Worker *worker, const Future<Task> &task,
Container *container) {
Task *task_ptr = task.get();
// Task group affinity
if (container != nullptr && task_ptr != nullptr &&
!task_ptr->task_group_.IsNull()) {
int64_t group_id = task_ptr->task_group_.id_;
ScopedCoRwReadLock read_lock(container->task_group_lock_);
auto it = container->task_group_map_.find(group_id);
if (it != container->task_group_map_.end() && it->second != nullptr) {
return it->second->GetId();
}
}
// Periodic network tasks → network worker
if (task_ptr != nullptr && task_ptr->IsPeriodic() &&
task_ptr->pool_id_ == chi::kAdminPoolId) {
u32 m = task_ptr->method_;
if ((m == 14 || m == 15 || m == 20 || m == 21) && net_worker_) {
return net_worker_->GetId();
}
}
// Large I/O → round-robin across I/O workers
if (task_ptr != nullptr && !io_workers_.empty() &&
task_ptr->stat_.io_size_ >= 4096) {
u32 idx = next_io_idx_.fetch_add(1, std::memory_order_relaxed)
% static_cast<u32>(io_workers_.size());
return io_workers_[idx]->GetId();
}
// Default → scheduler worker
if (scheduler_worker_) return scheduler_worker_->GetId();
return worker ? worker->GetId() : 0;
}
void MyScheduler::RebalanceWorker(Worker *worker) { (void)worker; }
void MyScheduler::AdjustPolling(RunContext *run_ctx) {
if (!run_ctx) return;
run_ctx->yield_time_us_ = run_ctx->true_period_ns_ / 1000.0;
}
} // namespace chi
Step 3: Register Scheduler
Update context-runtime/src/ipc_manager.cc to create your scheduler:
bool IpcManager::ServerInit() {
// ... existing initialization code ...
// Create scheduler based on configuration
ConfigManager *config = CHI_CONFIG_MANAGER;
std::string sched_name = config->GetLocalSched();
if (sched_name == "my_scheduler") {
scheduler_ = new MyScheduler();
} else if (sched_name == "default") {
scheduler_ = new DefaultScheduler();
} else {
HLOG(kError, "Unknown scheduler: {}", sched_name);
return false;
}
return true;
}
Step 4: Configure
Update your configuration file to use the new scheduler:
runtime:
local_sched: "my_scheduler"
num_threads: 4
queue_depth: 1024
DefaultScheduler Example
The DefaultScheduler provides a reference implementation with I/O-size-based routing and task group affinity.
Worker Partitioning
- Worker 0: Scheduler worker — handles metadata and small I/O (< 4KB)
- Workers 1..N-2: I/O workers — handle large I/O via round-robin
- Worker N-2: Also serves as the GPU worker (polls GPU queues)
- Worker N-1: Network worker — handles all Send/Recv/ClientSend/ClientRecv tasks
- SetNumSchedQueues(1) — client tasks all go to the scheduler worker initially
Dynamic Scheduling via ScheduleTask
The DefaultScheduler works hand-in-hand with Container::ScheduleTask. When a task arrives with a Dynamic pool query, the container's ScheduleTask resolves it before RuntimeMapTask picks the worker:
1. Task is submitted with PoolQuery::Dynamic()
2. RouteTask calls container->ScheduleTask(task) → returns e.g. DirectHash(block_id)
3. ResolvePoolQuery resolves DirectHash to a physical node + container
4. IsTaskLocal checks whether the target is this node
5. RouteLocal calls RuntimeMapTask to pick the worker
This two-level design means containers control what gets executed where, while the scheduler controls how workers are utilized.
Task Group Affinity
The DefaultScheduler implements task group affinity to pin related tasks to the same worker. Each Container maintains a task_group_map_ mapping group IDs to workers:
- When RuntimeMapTask sees a task with a non-null task_group_, it checks the container's task_group_map_
- If the group is already mapped to a worker, the task goes to that worker
- If not, normal routing selects a worker and records the mapping
This ensures tasks in the same group (e.g., operations on the same file handle) execute on the same worker, avoiding lock contention and improving cache locality.
I/O-Size Routing
- Tasks with stat_.io_size_ >= 4096 → round-robin across I/O workers
- Tasks with stat_.io_size_ < 4096 → scheduler worker (worker 0)
- Network tasks (admin pool methods 14, 15, 20, 21) → network worker
Force Enqueue
RouteTask accepts a force_enqueue parameter (default false). When true, RouteLocal always enqueues the task to the destination worker's lane, even if the destination is the current worker. This is used by SendRuntime (non-worker thread path) which cannot execute tasks directly and must always enqueue.
Code Reference
See implementation in:
- Header: context-runtime/include/chimaera/scheduler/default_sched.h
- Implementation: context-runtime/src/scheduler/default_sched.cc
Best Practices
1. Always Update IpcManager in DivideWorkers
void MyScheduler::DivideWorkers(WorkOrchestrator *work_orch) {
// ... partition workers ...
IpcManager *ipc = CHI_IPC;
if (ipc) {
ipc->SetNumSchedQueues(num_client_lanes);
if (net_worker_) {
ipc->SetNetLane(net_worker_->GetLane());
}
}
}
Why: Clients use GetNumSchedQueues() to map tasks to lanes. If this doesn't match the actual number of workers/lanes, tasks will be mapped to non-existent or wrong workers.
2. Route Network Tasks to the Network Worker
Both ClientMapTask and RuntimeMapTask should route Send/Recv tasks (methods 14/15/20/21 from admin pool) to the dedicated network worker (last worker). This prevents network I/O from blocking task processing workers.
3. Implement Task Group Affinity
Use container->task_group_map_ to pin related tasks to the same worker. Protect map access with container->task_group_lock_ (read lock for lookups, write lock for updates).
4. Validate Lane IDs
u32 MyScheduler::ClientMapTask(IpcManager *ipc_manager, const Future<Task> &task) {
u32 num_lanes = ipc_manager->GetNumSchedQueues();
if (num_lanes == 0) return 0;
u32 lane = ComputeLane(...);
return lane % num_lanes; // Ensure lane is in valid range
}
5. Handle Null Pointers
void MyScheduler::DivideWorkers(WorkOrchestrator *work_orch) {
if (!work_orch) return;
// ... proceed ...
}
u32 MyScheduler::RuntimeMapTask(Worker *worker, const Future<Task> &task,
Container *container) {
return worker ? worker->GetId() : 0;
}
6. Consider Thread Safety
If your scheduler maintains shared state accessed by multiple workers:
- Use atomic operations for counters (e.g., next_io_idx_)
- Use CoRwLock for complex data structures (e.g., task_group_map_)
- Prefer lock-free designs when possible
7. Test with Different Configurations
Test your scheduler with various num_threads values:
- Single thread (num_threads = 1): single worker serves dual role
- Small (num_threads = 2-4)
- Large (num_threads = 16+)
Integration Points
Singletons and Macros
Access runtime components via global macros:
// Configuration
ConfigManager *config = CHI_CONFIG_MANAGER;
u32 num_threads = config->GetNumThreads();
// IPC Manager
IpcManager *ipc = CHI_IPC;
u32 num_lanes = ipc->GetNumSchedQueues();
// System Info
auto *sys_info = HSHM_SYSTEM_INFO;
pid_t pid = sys_info->pid_;
// Thread Model
auto tid = HSHM_THREAD_MODEL->GetTid();
Worker Access
Access workers through WorkOrchestrator:
u32 total_workers = work_orch->GetTotalWorkerCount();
Worker *worker = work_orch->GetWorker(worker_id);
// Get worker properties
u32 id = worker->GetId();
TaskLane *lane = worker->GetLane();
Container Access
Containers expose scheduling-related state:
// Task group affinity map (per container)
container->task_group_map_ // std::unordered_map<int64_t, Worker*>
container->task_group_lock_ // CoRwLock protecting the map
// Override ScheduleTask for custom dynamic scheduling
PoolQuery MyContainer::ScheduleTask(const hipc::FullPtr<Task> &task) {
// Transform Dynamic queries into concrete routing modes
return PoolQuery::DirectHash(ComputeHash(task));
}
Logging
Use Hermes logging macros:
HLOG(kInfo, "Scheduler initialized with {} workers", num_workers);
HLOG(kDebug, "Mapping task to lane {}", lane_id);
HLOG(kWarning, "Worker {} has empty queue", worker_id);
HLOG(kError, "Invalid configuration: {}", error_msg);
Configuration Access
Read configuration values:
ConfigManager *config = CHI_CONFIG_MANAGER;
u32 num_threads = config->GetNumThreads();
u32 queue_depth = config->GetQueueDepth();
std::string sched_name = config->GetLocalSched();
Advanced Topics
Custom Container Scheduling
Override ScheduleTask to implement application-specific routing:
class DistributedKVStore : public Container {
public:
PoolQuery ScheduleTask(const hipc::FullPtr<Task> &task) override {
switch (task->method_) {
case kGet:
case kPut:
case kDelete: {
// Route key-value ops to the node owning the key's hash partition
u64 key_hash = ExtractKeyHash(task);
return PoolQuery::DirectHash(key_hash);
}
case kScan:
// Range scans may span multiple nodes
return PoolQuery::Range(ExtractStartKey(task), ExtractEndKey(task));
case kCreateIndex:
// Metadata ops broadcast to all nodes
return PoolQuery::Broadcast();
default:
return PoolQuery::Local();
}
}
};
Work Stealing
Implement work stealing in RebalanceWorker:
void MyScheduler::RebalanceWorker(Worker *worker) {
TaskLane *my_lane = worker->GetLane();
if (my_lane->Empty()) {
for (Worker *victim : io_workers_) {
if (victim == worker) continue;
TaskLane *victim_lane = victim->GetLane();
if (!victim_lane->Empty()) {
Future<Task> stolen_task;
if (victim_lane->Pop(stolen_task)) {
my_lane->Push(stolen_task);
break;
}
}
}
}
}
Priority-Based Scheduling
Use task priorities for scheduling:
void MyScheduler::DivideWorkers(WorkOrchestrator *work_orch) {
u32 total = work_orch->GetTotalWorkerCount();
u32 high_prio_count = total / 2;
for (u32 i = 0; i < high_prio_count; ++i) {
high_priority_workers_.push_back(work_orch->GetWorker(i));
}
for (u32 i = high_prio_count; i < total - 1; ++i) {
low_priority_workers_.push_back(work_orch->GetWorker(i));
}
// Network worker
net_worker_ = work_orch->GetWorker(total - 1);
}
Troubleshooting
Tasks Not Being Processed
Symptom: Tasks submitted but never execute
Check:
- Did you call IpcManager::SetNumSchedQueues() in DivideWorkers?
- Are all workers getting lanes via WorkOrchestrator's 1:1 mapping?
- Does ClientMapTask return lane IDs in the valid range?
- Is IpcManager::SetNetLane() called for the network worker?
Client Mapping Errors
Symptom: Assertion failures or crashes in ClientMapTask
Check:
- Is the returned lane ID in range [0, num_sched_queues)?
- Did you check for num_lanes == 0?
- Are you using modulo to wrap lane IDs?
Tasks Hang After Re-enqueue
Symptom: Tasks enqueued to a different worker never complete
Check:
- When RouteLocal re-enqueues to a different worker, ProcessNewTask on the destination worker updates the RunContext's worker_id_, lane_, and event_queue_ to match the new worker. If RunContext fields are stale, subtask completion events go to the wrong worker.
- Check that TASK_STARTED is respected — re-enqueued tasks with live coroutines must be resumed, not restarted.
Worker Crashes
Symptom: Workers crash during initialization
Check:
- Are you checking for null pointers?
- Does
DivideWorkershandletotal_workers < expected? - Is the single-worker case handled (when
total_workers == 1)?
References
- Scheduler Interface: context-runtime/include/chimaera/scheduler/scheduler.h
- DefaultScheduler: context-runtime/src/scheduler/default_sched.cc
- Container Base: context-runtime/include/chimaera/container.h
- IpcManager (RouteTask/RouteLocal): context-runtime/src/ipc_manager.cc
- WorkOrchestrator: context-runtime/src/work_orchestrator.cc
- Configuration: Configuration Reference