Streaming

Stream agent replies as they are generated. Streaming is available via SSE (POST /agent/stream) or WebSocket (/ws).

Event types

Type           Fired when
start          Run begins
message_start  LLM starts generating
block_start    A block (text / tool_use / thinking) begins
token          Each text token
thinking       Extended-thinking content (Anthropic, OpenAI o-series)
tool-call      LLM decided to call a tool
tool-result    Tool returned
block_end      Current block finished
message_end    LLM done, includes token usage
done           Whole run complete, includes reply + durationMs
error          Something failed
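
A client usually folds these events into a running state. The sketch below shows one way to do that. It reads only the fields that appear in the examples on this page (text on token, reply and durationMs on done). The names initialState and applyEvent and the state shape are hypothetical, and treating error as the end of the run is an assumption.

```javascript
// Hypothetical reducer: folds stream events into a running state.
// Only fields shown on this page are read: token.text, done.reply, done.durationMs.
function initialState() {
  return { text: "", reply: null, durationMs: null, error: null, finished: false };
}

function applyEvent(state, event) {
  switch (event.type) {
    case "token":
      // Append each text token as it arrives.
      return { ...state, text: state.text + event.text };
    case "done":
      // The final event carries the complete reply and the run duration.
      return { ...state, reply: event.reply, durationMs: event.durationMs, finished: true };
    case "error":
      // The error payload shape is not documented here, so keep the whole event.
      // Assumption: an error ends the run.
      return { ...state, error: event, finished: true };
    default:
      // start, message_start, block_start, thinking, tool-call, tool-result,
      // block_end, and message_end are ignored in this sketch.
      return state;
  }
}
```

Call applyEvent once per incoming event, whichever transport delivers it, and render state.text while state.finished is false.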

SSE

curl -N -X POST http://127.0.0.1:18789/agent/stream \
  -H "Content-Type: application/json" \
  -d '{"sessionKey": "user-123", "message": "Hello"}'

# data: {"type":"start","sessionId":"s_...","agent":"auto"}
# data: {"type":"token","text":"Hi"}
# data: {"type":"token","text":" there"}
# data: {"type":"done","reply":"Hi there!","durationMs":1240}
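
The browser EventSource API only issues GET requests, so a POST endpoint like this one is typically read with fetch and a small parser. The sketch below assumes each event arrives as a single data: line holding JSON, followed by a blank line, as in the curl output above. parseSseBuffer and streamAgent are hypothetical helper names, not part of the gateway.

```javascript
// Splits buffered SSE text into parsed events plus any incomplete trailing data.
// Assumes one `data: <json>` line per event, with a blank line between events.
function parseSseBuffer(buffer) {
  const events = [];
  const frames = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = frames.pop(); // the last piece may be an incomplete frame
  for (const frame of frames) {
    for (const line of frame.split("\n")) {
      if (line.startsWith("data:")) {
        events.push(JSON.parse(line.slice(5).trim()));
      }
    }
  }
  return { events, rest };
}

// Usage sketch: POST to /agent/stream and pass each event to onEvent.
async function streamAgent(sessionKey, message, onEvent) {
  const res = await fetch("http://127.0.0.1:18789/agent/stream", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ sessionKey, message }),
  });
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    // Network chunks can split an event, so keep the unparsed tail in the buffer.
    buffer += decoder.decode(value, { stream: true });
    const parsed = parseSseBuffer(buffer);
    buffer = parsed.rest;
    for (const event of parsed.events) onEvent(event);
  }
}
```

For example, streamAgent("user-123", "Hello", (event) => console.log(event.type)) logs each event type as it arrives.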

WebSocket

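// Runs as-is in Node.js 22+ (global WebSocket). In a browser, replace
// process.stdout.write with your own output handling.
const ws = new WebSocket("ws://127.0.0.1:18789/ws");
ws.onopen = () => {
  // Register the session first, then send the message.
  ws.send(JSON.stringify({type:"register",sessionKey:"u1",channel:"web"}));
  ws.send(JSON.stringify({type:"message",sessionKey:"u1",message:"Hi"}));
};
ws.onmessage = (e) => {
  const event = JSON.parse(e.data);
  // Print text tokens as they arrive.
  if (event.type === "token") process.stdout.write(event.text);
};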
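
When only the final reply matters, the message exchange can be wrapped in a Promise. The sketch below assumes the socket is already open and registered as in the example above, and that done and error events arrive over WebSocket with the same shape as over SSE. sendAndCollect is a hypothetical helper name.

```javascript
// Hypothetical helper: sends one message and resolves with the `done` event,
// or rejects when an `error` event arrives.
// `socket` is any object with a WebSocket-style `send` method and `onmessage` hook.
// Note: this replaces any existing onmessage handler on the socket.
function sendAndCollect(socket, sessionKey, message) {
  return new Promise((resolve, reject) => {
    socket.onmessage = (e) => {
      const event = JSON.parse(e.data);
      if (event.type === "done") {
        resolve(event);
      } else if (event.type === "error") {
        // The error payload shape is not documented here, so pass it through as JSON.
        reject(new Error(JSON.stringify(event)));
      }
      // Other event types (token, tool-call, ...) are ignored in this sketch.
    };
    socket.send(JSON.stringify({ type: "message", sessionKey, message }));
  });
}
```

For example, sendAndCollect(ws, "u1", "Hi").then((done) => console.log(done.reply, done.durationMs)) prints the reply once the run completes.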

Source

Implementation: apps/gateway/src/streaming.ts
