Command Queue

What happens when you send a new message while an agent is already running? The Command Queue decides, using one of three modes: steer, followup, or collect.

Why this matters

Without queuing, parallel messages cause race conditions: two agent runs overwrite each other's session state. Without steering, you can't interrupt a long-running task to redirect it. The Command Queue solves both.

The three modes

`steer` (default)

New messages are injected into the active run. They are delivered to the LLM after the current tool calls finish but before the next LLM call.

Use when: you want "wait, do X instead" semantics. Most chat-like interactions benefit from this.
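The injection point described above can be sketched as a simplified, synchronous agent loop. This is an illustration only: `ChatMessage`, `RunDeps`, `runWithSteering`, and the `inbox` array are hypothetical names, not the gateway's actual interfaces.

```typescript
// Hypothetical types for illustration; the real gateway's interfaces may differ.
type ChatMessage = { role: "user" | "assistant" | "tool"; content: string };
type LlmReply = { text: string; toolCalls: string[] };

interface RunDeps {
  callLlm(transcript: ChatMessage[]): LlmReply; // stand-in for the model call
  runTool(call: string): string;                // stand-in for tool execution
}

// One agent run. `inbox` receives messages that arrive while the run is active.
function runWithSteering(prompt: string, inbox: string[], deps: RunDeps): ChatMessage[] {
  const transcript: ChatMessage[] = [{ role: "user", content: prompt }];
  for (;;) {
    // Steering point: the previous step's tool calls are done and
    // the next LLM call has not started yet.
    for (const text of inbox.splice(0)) {
      transcript.push({ role: "user", content: text });
    }
    const reply = deps.callLlm(transcript);
    transcript.push({ role: "assistant", content: reply.text });
    if (reply.toolCalls.length === 0) return transcript; // run is finished
    for (const call of reply.toolCalls) {
      transcript.push({ role: "tool", content: deps.runTool(call) });
    }
  }
}
```

A message pushed to `inbox` during a tool call shows up in the transcript right after that step's tool results, so the next LLM call sees it.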

`followup`

Don't steer. Queue messages for a new turn after the current one ends. Default debounce: 1 second (multiple rapid messages coalesce).

Use when: the running task is critical and shouldn't be interrupted. Followups run after it finishes.
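One way to picture debounce coalescing is the sketch below. It assumes that a message arriving less than `debounceMs` after the previous one joins the same followup turn, and that coalesced messages are joined with newlines. Both are assumptions, as are the names `Queued` and `coalesceFollowups`.

```typescript
// Hypothetical shape of a queued message: text plus arrival time in milliseconds.
type Queued = { text: string; atMs: number };

// Groups messages into followup turns. A message joins the current turn if it
// arrived less than `debounceMs` after the previous message; otherwise it
// starts a new turn. Expects `messages` in arrival order.
function coalesceFollowups(messages: Queued[], debounceMs = 1000): string[] {
  const turns: string[][] = [];
  let lastAt = -Infinity;
  for (const m of messages) {
    if (m.atMs - lastAt >= debounceMs) {
      turns.push([m.text]);                  // gap is long enough: new turn
    } else {
      turns[turns.length - 1].push(m.text);  // rapid message: coalesce
    }
    lastAt = m.atMs;
  }
  return turns.map((t) => t.join("\n"));
}
```

With the default of 1 second, messages arriving at 0 ms, 400 ms, and 900 ms become one turn, and a message at 2500 ms becomes a second turn.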

`collect`

Don't steer. Hold messages until the run ends, then format them as a single structured prompt: "you said A, then B, then C".

Use when: you are batching related items.
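A minimal sketch of the collect step follows. The exact prompt wording the gateway produces is not documented here, so the header line and numbered layout are assumptions, as is the function name `formatCollected`.

```typescript
// Formats held messages as one structured prompt, preserving arrival order.
// The header text and numbering are illustrative, not the gateway's format.
function formatCollected(messages: string[]): string {
  if (messages.length === 0) return "";
  if (messages.length === 1) return messages[0]; // nothing to batch
  const lines = messages.map((text, i) => `${i + 1}. ${text}`);
  return ["Messages received while the previous run was active, in order:", ...lines].join("\n");
}
```

For the inputs "A", "B", "C" this yields a header line followed by "1. A", "2. B", "3. C", each on its own line.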

Setting mode per session

From inside a chat:

/queue steer       # default
/queue followup
/queue collect
/queue default     # reset

Or via API:

curl -X POST http://127.0.0.1:18789/sessions/user-123/queue/mode \
  -H 'Content-Type: application/json' \
  -d '{"mode": "followup", "debounceMs": 2000, "cap": 30}'
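The same request can be built from TypeScript. The sketch below only constructs the URL and request options, so it has no network dependency. The endpoint path and body fields come from the curl example; the `Content-Type` header and the names `QueueModeOptions` and `buildQueueModeRequest` are assumptions.

```typescript
type QueueMode = "steer" | "followup" | "collect";

// Body fields taken from the curl example; debounceMs and cap are treated as optional here.
interface QueueModeOptions {
  mode: QueueMode;
  debounceMs?: number;
  cap?: number;
}

// Builds the URL and fetch-style options for setting a session's queue mode.
function buildQueueModeRequest(baseUrl: string, sessionId: string, options: QueueModeOptions) {
  return {
    url: `${baseUrl}/sessions/${encodeURIComponent(sessionId)}/queue/mode`,
    init: {
      method: "POST",
      // Assumption: the gateway expects a JSON content type.
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(options),
    },
  };
}
```

To send it, pass the result to `fetch(url, init)` and check the response status before relying on the new mode.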

Overflow policy

Each session's queue has a cap (default 20). When the cap is exceeded, the drop policy decides what happens:

summarize (default) — drop oldest, prepend "[Summary of N dropped]" note when draining
oldest — drop oldest silently
newest — drop incoming
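The three drop policies can be illustrated with a small bounded queue. This is a sketch, not the code in queue.ts: `BoundedQueue` and `DropPolicy` are hypothetical names, the summary note is reduced to a count as in the description above, and a cap of at least 1 is assumed.

```typescript
type DropPolicy = "summarize" | "oldest" | "newest";

class BoundedQueue {
  private items: string[] = [];
  private dropped = 0; // only counted under "summarize"
  private cap: number;
  private policy: DropPolicy;

  constructor(cap = 20, policy: DropPolicy = "summarize") {
    this.cap = cap;
    this.policy = policy;
  }

  push(text: string): void {
    if (this.items.length < this.cap) {
      this.items.push(text);
      return;
    }
    if (this.policy === "newest") return; // drop the incoming message
    this.items.shift();                   // drop the oldest message
    this.items.push(text);
    if (this.policy === "summarize") this.dropped++;
  }

  // Returns queued messages in order and resets the queue. Under "summarize",
  // a note about dropped messages is prepended when anything was dropped.
  drain(): string[] {
    const out =
      this.dropped > 0
        ? [`[Summary of ${this.dropped} dropped]`, ...this.items]
        : [...this.items];
    this.items = [];
    this.dropped = 0;
    return out;
  }
}
```

With a cap of 2 and four pushes (a, b, c, d), `summarize` drains to the note plus c and d, `oldest` drains to c and d, and `newest` drains to a and b.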

Session Lane

Underneath the queue is the Session Lane, a per-session mutex that allows only one run per session at a time. The lane manager also enforces a global concurrency cap (default: 4 parallel runs across all sessions). To override it:

OPENVESPER_MAX_CONCURRENT=8 vesper gateway start
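The interaction between the per-session mutex and the global cap can be sketched as a small synchronous scheduler. This is an illustration under assumptions, not the code in session-lane.ts: `LaneManager`, `request`, and `release` are hypothetical names, and FIFO ordering of waiting runs is assumed.

```typescript
class LaneManager {
  private active = new Set<string>(); // sessions with a run in flight
  private waiting: string[] = [];     // session ids waiting for a slot, FIFO
  private maxConcurrent: number;

  constructor(maxConcurrent = 4) {
    this.maxConcurrent = maxConcurrent;
  }

  private canStart(sessionId: string): boolean {
    // Per-session mutex plus global concurrency cap.
    return !this.active.has(sessionId) && this.active.size < this.maxConcurrent;
  }

  // Requests a run for a session. Returns true if it may start immediately;
  // otherwise the request is queued until a later release() admits it.
  request(sessionId: string): boolean {
    if (this.canStart(sessionId)) {
      this.active.add(sessionId);
      return true;
    }
    this.waiting.push(sessionId);
    return false;
  }

  // Marks a session's run as finished and returns the sessions whose
  // waiting runs can start now.
  release(sessionId: string): string[] {
    this.active.delete(sessionId);
    const started: string[] = [];
    const stillWaiting: string[] = [];
    for (const id of this.waiting) {
      if (this.canStart(id)) {
        this.active.add(id);
        started.push(id);
      } else {
        stillWaiting.push(id);
      }
    }
    this.waiting = stillWaiting;
    return started;
  }
}
```

A second request for a session that is already running waits on the mutex even when global slots are free, while a request for a new session waits only when the cap is reached.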

Source

Implementation: apps/gateway/src/queue.ts and apps/gateway/src/session-lane.ts.