A vacation project that explores the OpenAI Agents SDK while integrating concepts closer to the capability frontier.
My goal was to combine multi-agent collaboration, realtime voice, Twilio's Media Streams API, MCP servers, and computer use into a practical demonstration.
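To make the plumbing concrete, here is roughly how the Twilio leg can be wired up in TypeScript: a Media Streams WebSocket handed to the Agents SDK's Twilio transport, which feeds a realtime session. This is a hedged sketch rather than the project's actual code; in particular, the `@openai/agents-extensions` package and the `twilioWebSocket` option are written from memory of the SDK docs and may differ in the current release.

```typescript
import { WebSocketServer } from 'ws';
import { RealtimeAgent, RealtimeSession } from '@openai/agents/realtime';
// NOTE: package and option names below are from memory and may differ.
import { TwilioRealtimeTransportLayer } from '@openai/agents-extensions';

// A bare agent for illustration; a fuller version with tools and a handoff
// is sketched further down.
const agent = new RealtimeAgent({
  name: 'Incident responder',
  instructions: 'You are joining a production incident call. Keep answers brief.',
});

// Twilio's <Connect><Stream> TwiML verb opens a WebSocket to this endpoint and
// streams the caller's audio as base64-encoded G.711 mu-law frames.
const wss = new WebSocketServer({ port: 8080, path: '/media-stream' });

wss.on('connection', async (twilioWebSocket) => {
  // The transport layer translates between Twilio media frames and the
  // Realtime API, so the session treats the phone call as ordinary audio I/O.
  const transport = new TwilioRealtimeTransportLayer({ twilioWebSocket });
  const session = new RealtimeSession(agent, { transport });
  await session.connect({ apiKey: process.env.OPENAI_API_KEY! });
});
```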
The problem
Picture a critical incident in production.
You gather vendors and system integrators on a Zoom call.
Now add the agents to that call: they take notes, answer routine questions by using MCP for system access, and carry out basic system changes through computer use.
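For a sense of what those agents look like in code, below is a hedged sketch of the agent side: a primary responder with a tool standing in for MCP-backed system access, and a handoff to a specialist that would drive the computer-use changes. The names (`lookupRunbook`, `opsSpecialist`, `incidentAgent`) are mine for illustration, not the project's, and the exact import paths are from memory of the Agents SDK and may differ.

```typescript
import { RealtimeAgent, tool } from '@openai/agents/realtime';
import { z } from 'zod';

// Stand-in for a lookup that the real project routes through an MCP server.
const lookupRunbook = tool({
  name: 'lookup_runbook',
  description: 'Fetch the runbook entry for a given service',
  parameters: z.object({ service: z.string() }),
  execute: async ({ service }) => {
    return `Runbook for ${service}: restart the ingest workers, then confirm the queue drains.`;
  },
});

// A specialist the primary agent can hand the conversation to, e.g. for
// changes that go through computer use rather than a quick spoken answer.
const opsSpecialist = new RealtimeAgent({
  name: 'Ops specialist',
  instructions: 'Carry out the requested system change and report back briefly.',
});

// The agent that joins the incident bridge.
export const incidentAgent = new RealtimeAgent({
  name: 'Incident responder',
  instructions:
    'You are on a production incident call. Take notes, answer routine status ' +
    'questions with your tools, and hand off to the ops specialist for changes.',
  tools: [lookupRunbook],
  handoffs: [opsSpecialist],
});
```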
Reflections
- The Agents SDK is sparsely documented. You will have to read the source code to get a real-world implementation to work.
- No parity between Python and TypeScript. I started with Python, but limitations in realtime voice agents and computer use forced me to switch to TypeScript. Reading the documentation again now, I can see some of those issues have already been partially resolved.
- Voice latency is a hurdle, especially when tools are involved. The demo below is sped up 4x.
- I never managed to make the agent acknowledge the question before diving into tool use, leaving an awkward silence at the start of every interaction.
- Visualizing the conversation is neither trivial nor available out-of-the-box. Transcripts arrive (understandably) late and need extra processing, and it is hard to line the conversation up with the computer-use screenshots, especially when several agents and tools are involved (a timestamping workaround is sketched after this list).
- Few external tracing options exist for TypeScript. I tried AgentOps but was underwhelmed; Python has more mature options.
- Reliability is okay if you catch and ignore the occasional error (also sketched below).
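For the last two points, the workaround I would sketch is to timestamp transcripts as they arrive, so they can be aligned with the computer-use screenshots afterwards, and to log session errors instead of letting them end the call. The event names used here (`transport_event`, `error`, and the underlying Realtime API transcript events) are assumptions from memory of the SDK, not a verified API.

```typescript
import type { RealtimeSession } from '@openai/agents/realtime';

type TimedEntry = { at: string; speaker: 'caller' | 'agent'; text: string };
const timeline: TimedEntry[] = [];

export function monitor(session: RealtimeSession) {
  // Transcripts trail the audio, so stamp each one on arrival; the timestamps
  // are what later let you line the conversation up with screenshots.
  // NOTE: event names are from memory and may differ in the current SDK.
  session.on('transport_event', (event: any) => {
    if (event.type === 'conversation.item.input_audio_transcription.completed') {
      timeline.push({ at: new Date().toISOString(), speaker: 'caller', text: event.transcript });
    }
    if (event.type === 'response.audio_transcript.done') {
      timeline.push({ at: new Date().toISOString(), speaker: 'agent', text: event.transcript });
    }
  });

  // Log and carry on; an occasional transient error otherwise ends the call.
  session.on('error', (err) => {
    console.error('realtime session error (ignored):', err);
  });
}
```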
Demo
A short video of the agent in action. You only see the conversation monitoring; the call itself happened from my phone through Twilio.
The code can be found here and here.