Kolbo Code
Voice Input (Push-to-Talk)
Dictate your next message by holding Ctrl+Y in the Kolbo Code prompt
Kolbo Code has built-in push-to-talk voice input. You never have to
open a separate transcription tool: just hold Ctrl+Y in the prompt
and speak.
How it works
- Focus the prompt in the TUI.
- Press and hold Ctrl+Y. The prompt switches to a listening state and recording begins immediately.
- Speak your message. Partial transcripts stream in live so you can see what the model is hearing.
- Release Ctrl+Y when you are done. The final transcript is appended to the prompt buffer; press Enter to send.
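
The interaction above can be pictured as a small state model. The following is only an illustrative sketch, not Kolbo Code's actual implementation: the class name PromptBuffer, its method names, and the choice to join existing text and dictated text with a single space are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PromptBuffer:
    """Hypothetical model of the prompt while push-to-talk is active."""

    text: str = ""           # text already typed or dictated
    partial: str = ""        # live preview of the current utterance
    listening: bool = False  # True while Ctrl+Y is held

    def key_down(self) -> None:
        # Ctrl+Y pressed: enter the listening state with an empty preview.
        self.listening = True
        self.partial = ""

    def on_partial(self, transcript: str) -> None:
        # Each partial replaces the previous preview; nothing is committed yet.
        if self.listening:
            self.partial = transcript

    def key_up(self, final_transcript: str) -> None:
        # Ctrl+Y released: append the final transcript to the existing text.
        # Joining with a single space is an assumption of this sketch.
        separator = " " if self.text and final_transcript else ""
        self.text += separator + final_transcript
        self.partial = ""
        self.listening = False

    def display(self) -> str:
        # What the prompt would show: committed text plus the live preview.
        return " ".join(part for part in (self.text, self.partial) if part)
```

In this model nothing is sent on release; the dictated text simply lands in the buffer, which matches the description that Enter is still required to send.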
Why Ctrl+Y?
- It is a dedicated shortcut, so normal typing (including spaces) is never affected.
- It is easy to hold with one hand while you think, and easy to release the instant you are done speaking.
- It works consistently across macOS, Linux, and Windows terminals.
What happens under the hood
- Audio is captured via the bundled FFmpeg binary (no separate install required). On macOS it uses avfoundation, on Linux pulse, on Windows dshow.
- Raw PCM16 mono @ 16 kHz is streamed over a Socket.IO connection to api.kolbo.ai, which proxies to ElevenLabs Scribe v2 Realtime.
- Partial and committed transcripts stream back in real time.
- Cost: voice transcription from Kolbo Code is billed as source: "chat", which is free; zero credits are deducted.
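
To make the capture format concrete, here is a rough sketch of what an FFmpeg invocation producing raw PCM16 mono at 16 kHz could look like on each platform. The input formats (avfoundation, pulse, dshow) and the audio format come from the description above; the device strings, the function names, and the exact flags Kolbo Code passes are assumptions, so treat this as an approximation rather than the real command line.

```python
SAMPLE_RATE_HZ = 16_000
CHANNELS = 1
BYTES_PER_SAMPLE = 2  # PCM16 means 16 bits (2 bytes) per sample

# Keys are sys.platform values. The input formats are from the text above;
# the device strings are assumed defaults and will differ per machine.
CAPTURE_INPUT = {
    "darwin": ("avfoundation", ":0"),       # ":0" = no video, audio device 0
    "linux": ("pulse", "default"),          # default PulseAudio source
    "win32": ("dshow", "audio=Microphone"), # device name is hypothetical
}


def ffmpeg_capture_args(platform: str, ffmpeg: str = "ffmpeg") -> list[str]:
    """Build an FFmpeg command that writes raw PCM16 mono 16 kHz to stdout."""
    fmt, device = CAPTURE_INPUT[platform]
    return [
        ffmpeg, "-hide_banner", "-loglevel", "error",
        "-f", fmt, "-i", device,     # platform-specific capture input
        "-ac", str(CHANNELS),        # downmix to mono
        "-ar", str(SAMPLE_RATE_HZ),  # resample to 16 kHz
        "-f", "s16le",               # headerless signed 16-bit little-endian
        "pipe:1",                    # write samples to stdout
    ]


def chunk_size_bytes(chunk_ms: int) -> int:
    """Size in bytes of one audio chunk of the given duration."""
    return SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE * chunk_ms // 1000
```

At this format the stream carries 32,000 bytes per second, so a 100 ms chunk (a chunk length chosen only for illustration) is 3,200 bytes.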
Requirements
- A working microphone the OS can see.
- You must be signed in (kolbo auth login); the socket uses your Kolbo API key for authentication.
- FFmpeg is bundled; no install needed. If the bundled binary is missing for any reason, Kolbo Code will fall back to system ffmpeg, then system sox on your PATH.
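
The fallback order described above (bundled FFmpeg, then system ffmpeg, then system sox) might be expressed roughly as follows. This is a sketch under assumptions: the function name, the backend labels, and the idea that the bundled binary lives at a path known to the caller are all hypothetical. Only the lookup order comes from the text.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional


def resolve_mic_backend(
    bundled_ffmpeg: Path,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[tuple[str, str]]:
    """Return (label, executable path) for the first usable backend, or None."""
    # 1. Prefer the binary shipped with the application.
    if bundled_ffmpeg.is_file():
        return ("bundled-ffmpeg", str(bundled_ffmpeg))
    # 2. Otherwise search PATH for ffmpeg first, then sox.
    for label, command in (("system-ffmpeg", "ffmpeg"), ("system-sox", "sox")):
        found = which(command)
        if found:
            return (label, found)
    # 3. Nothing resolved: the "No mic backend" case.
    return None
```

Passing the PATH lookup in as a parameter keeps the order easy to check without depending on what happens to be installed on the machine.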
Troubleshooting
- "Not logged in" → run kolbo auth login.
- "No mic backend" → none of bundled FFmpeg, system FFmpeg, or sox could be resolved. Reinstall Kolbo Code so the bundled FFmpeg is restored.
- Backend timeout → the CLI waited >5s for the server to acknowledge the session. Check your network and try again.
- Permission denied on macOS → grant your terminal app microphone access in System Settings → Privacy & Security → Microphone.
- Windows: no audio device found → Kolbo Code enumerates dshow audio devices automatically. If none are listed, plug in a mic or set a default recording device in Windows sound settings.
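
For readers curious about the backend timeout case, the pattern is a bounded wait for an acknowledgement. The sketch below shows one way to express a 5-second wait with asyncio. It is not taken from Kolbo Code: the function name and the use of an asyncio.Event to represent the server's acknowledgement are assumptions for illustration.

```python
import asyncio

ACK_TIMEOUT_SECONDS = 5.0  # the 5-second limit mentioned above


async def wait_for_session_ack(
    ack: asyncio.Event, timeout: float = ACK_TIMEOUT_SECONDS
) -> bool:
    """Return True if the acknowledgement arrives in time, False on timeout."""
    try:
        # ack.wait() completes once something calls ack.set(), for example
        # a (hypothetical) socket handler for the server's acknowledgement.
        await asyncio.wait_for(ack.wait(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        return False
```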
Debug logs for push-to-talk sessions are written to:
<data-dir>/log/push-to-talk.log
where <data-dir> is ~/.local/share/kolbo (Linux), ~/Library/Application Support/kolbo (macOS),
or %LOCALAPPDATA%\kolbo (Windows).
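
As a quick way to locate the log on a given machine, the path rules above can be written as a small helper. This is a sketch only: the function name and its parameters are hypothetical, and it covers just the three directories listed here without consulting any other environment settings.

```python
from pathlib import PurePosixPath, PureWindowsPath


def push_to_talk_log(platform: str, home: str, local_app_data: str = ""):
    """Build the push-to-talk log path for a sys.platform-style name."""
    if platform == "win32":
        # %LOCALAPPDATA%\kolbo
        data_dir = PureWindowsPath(local_app_data) / "kolbo"
    elif platform == "darwin":
        # ~/Library/Application Support/kolbo
        data_dir = PurePosixPath(home) / "Library" / "Application Support" / "kolbo"
    else:
        # ~/.local/share/kolbo
        data_dir = PurePosixPath(home) / ".local" / "share" / "kolbo"
    return data_dir / "log" / "push-to-talk.log"
```

On a real system the arguments would typically come from sys.platform, the user's home directory, and the LOCALAPPDATA environment variable.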