---
IDE: VS Code
model: claude-sonnet-4-5
mode: Agent
---

Implement the streaming transcription based on the latest documentation in #file:openai-transcription-streaming.md. Preferences:
1. No VAD. We still want to use manual silence detection to trigger a response.
2. Use `gpt-realtime` model
3. Manually stream audio to server and trigger response
4. Do NOT use the transcription. Instead, we want `text` modality output. The result will be present in the completed event.