Command Queue

What happens when you send a new message while an agent is already running? The Command Queue handles it in one of three modes: steer, followup, or collect.

Why this matters

Without queuing, parallel messages cause race conditions: two agent runs overwrite each other's session state. Without steering, you can't interrupt a long-running task to redirect it. The Command Queue solves both.

The three modes

steer (default)

New messages are injected into the active run. They are delivered to the LLM after the current tool calls finish but before the next LLM call.

Use when: you want "wait, do X instead" semantics. Most chat-like interactions benefit from this.
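The delivery point described above can be sketched as a small agent loop. This is an illustration only, not the gateway's code: SteerInbox, runAgent, callLlm, and runTool are hypothetical names, and the loop is synchronous for brevity where a real one would be async.

```typescript
// Hypothetical types for illustration; not the gateway's actual API.
type Message = { role: "user" | "assistant" | "tool"; content: string };
type LlmReply = { text: string; toolCalls: string[] };

// Holds messages that arrive while a run is active (steer mode).
class SteerInbox {
  private pending: string[] = [];

  push(text: string): void {
    this.pending.push(text);
  }

  // Remove and return everything received so far, in arrival order.
  drain(): string[] {
    const out = this.pending;
    this.pending = [];
    return out;
  }
}

// Synchronous for brevity; callLlm and runTool are injected stand-ins.
function runAgent(
  history: Message[],
  inbox: SteerInbox,
  callLlm: (h: Message[]) => LlmReply,
  runTool: (name: string) => string,
): Message[] {
  for (;;) {
    // Steering point: the previous step's tool calls are finished and
    // the next LLM call has not started yet.
    for (const text of inbox.drain()) {
      history.push({ role: "user", content: text });
    }
    const reply = callLlm(history);
    history.push({ role: "assistant", content: reply.text });
    if (reply.toolCalls.length === 0) return history;
    for (const name of reply.toolCalls) {
      history.push({ role: "tool", content: runTool(name) });
    }
  }
}
```

In this sketch, a message pushed while a tool is running appears in the history just before the next LLM call. A message that arrives during the final LLM call stays in the inbox; how the gateway handles that case is not covered here.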

followup

Don't steer. Queue messages for a new turn after the current one ends. Default debounce: 1 second (multiple rapid messages coalesce).

Use when: the running task is critical and shouldn't be interrupted. Followup messages are handled once it finishes.
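One way to picture the debounce is as a grouping rule over message arrival times. The sketch below assumes that the window restarts with each message and that messages are sorted by arrival time; the source does not specify either, and the names Timed and coalesce are hypothetical.

```typescript
// Hypothetical shape: a queued message with its arrival time.
type Timed = { atMs: number; text: string };

// Group messages into turns. A message that arrives less than
// debounceMs after the previous one joins the same turn
// (assumption: the window restarts on every message).
function coalesce(messages: Timed[], debounceMs = 1000): string[][] {
  const turns: string[][] = [];
  let current: string[] | null = null;
  let lastAt = Number.NEGATIVE_INFINITY;
  for (const m of messages) {
    if (current !== null && m.atMs - lastAt < debounceMs) {
      current.push(m.text);
    } else {
      current = [m.text];
      turns.push(current);
    }
    lastAt = m.atMs;
  }
  return turns;
}
```

With the default of 1000 ms, messages at 0, 400, and 900 ms end up in one turn, and a message at 2500 ms starts a second turn.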

collect

Don't steer. Hold messages until the run ends, then format them as a single structured prompt, for example "you said A, then B, then C". Useful for batching related items.
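A minimal sketch of the collect formatting step, modeled on the "you said A, then B, then C" example above. The exact prompt format used by the gateway is not documented here, so the wording and the function name formatCollected are assumptions.

```typescript
// Turn the held messages into one prompt, preserving arrival order.
// The phrasing is an assumption based on the example in the text.
function formatCollected(messages: string[]): string {
  if (messages.length === 0) return "";
  // JSON.stringify quotes each message and escapes embedded quotes.
  const quoted = messages.map((m) => JSON.stringify(m));
  return "You said " + quoted.join(", then ") + ".";
}
```

For the input ["A", "B", "C"], this produces: You said "A", then "B", then "C".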

Setting mode per session

From inside a chat:

/queue steer       # default
/queue followup
/queue collect
/queue default     # reset

Or via API:

curl -X POST http://127.0.0.1:18789/sessions/user-123/queue/mode \
  -H 'Content-Type: application/json' \
  -d '{"mode": "followup", "debounceMs": 2000, "cap": 30}'

Overflow policy

Each session's queue has a cap (default 20). When the cap is exceeded, the drop policy decides what to discard:

  • summarize (default): drop oldest, prepend "[Summary of N dropped]" note when draining
  • oldest: drop oldest silently
  • newest: drop incoming
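The three policies can be sketched as a small bounded queue. This is an illustrative model, not the code in queue.ts: the class name BoundedQueue and its methods are hypothetical, and the summarize branch only counts dropped messages because the contents of the real summary note are not described here.

```typescript
type DropPolicy = "summarize" | "oldest" | "newest";

// Hypothetical per-session queue with a cap and a drop policy.
class BoundedQueue {
  private items: string[] = [];
  private dropped = 0; // only tracked for the summarize policy
  private readonly cap: number;
  private readonly policy: DropPolicy;

  constructor(cap = 20, policy: DropPolicy = "summarize") {
    this.cap = cap;
    this.policy = policy;
  }

  enqueue(text: string): void {
    if (this.items.length < this.cap) {
      this.items.push(text);
      return;
    }
    if (this.policy === "newest") return; // drop the incoming message
    this.items.shift(); // drop the oldest message
    this.items.push(text);
    if (this.policy === "summarize") this.dropped++;
  }

  // Return queued messages in order and reset the queue. Under
  // summarize, a note about dropped messages is prepended.
  drain(): string[] {
    const out =
      this.dropped > 0
        ? ["[Summary of " + this.dropped + " dropped]", ...this.items]
        : [...this.items];
    this.items = [];
    this.dropped = 0;
    return out;
  }
}
```

With a cap of 2 and messages a, b, c, draining under summarize returns the note followed by b and c; oldest returns b and c; newest returns a and b.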

Session Lane

Underneath the queue is the Session Lane, a per-session mutex. The lane manager also enforces a global concurrency cap (default: 4 parallel runs across all sessions). Override it with the OPENVESPER_MAX_CONCURRENT environment variable:

OPENVESPER_MAX_CONCURRENT=8 vesper gateway start
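The combination of a per-session mutex and a global cap can be modeled as a small scheduler. The sketch below is a simplified, synchronous model under assumed names (LaneManager, request, release); the actual design in session-lane.ts may differ, and reading OPENVESPER_MAX_CONCURRENT is left out.

```typescript
// Hypothetical record for a run waiting for its lane.
type Waiting = { sessionId: string; run: string };

class LaneManager {
  private active = new Set<string>(); // sessions with a run in progress
  private waiting: Waiting[] = [];
  private readonly maxConcurrent: number;

  constructor(maxConcurrent = 4) {
    this.maxConcurrent = maxConcurrent;
  }

  // A run may start only if its session is idle (per-session mutex)
  // and the global cap has not been reached.
  private canStart(sessionId: string): boolean {
    return !this.active.has(sessionId) && this.active.size < this.maxConcurrent;
  }

  // Returns true if the run starts now; otherwise it is parked.
  request(sessionId: string, run: string): boolean {
    if (this.canStart(sessionId)) {
      this.active.add(sessionId);
      return true;
    }
    this.waiting.push({ sessionId, run });
    return false;
  }

  // Mark the session's run as finished and return the parked runs
  // that can start now, in the order they were requested.
  release(sessionId: string): Waiting[] {
    this.active.delete(sessionId);
    const started: Waiting[] = [];
    const stillWaiting: Waiting[] = [];
    for (const w of this.waiting) {
      if (this.canStart(w.sessionId)) {
        this.active.add(w.sessionId);
        started.push(w);
      } else {
        stillWaiting.push(w);
      }
    }
    this.waiting = stillWaiting;
    return started;
  }
}
```

With a cap of 2, a second run for a busy session is parked by the mutex, and a run for a third session is parked by the global cap until another session releases its lane.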

Source

Implementation: apps/gateway/src/queue.ts and apps/gateway/src/session-lane.ts.
