Kolbo Bot
Commands
Full reference for Kolbo Bot slash commands and auto-detected intents
The Kolbo bot automatically detects what you want most of the time — send a plain message and it figures out whether you want an image, a video, a chat reply, and so on. Slash commands are the manual override when you want to force a specific action.
Auto-detected intents
Plain messages route based on what you sent:
| You send | Kolbo does |
|---|---|
| A text description | Chat reply, or a matching generation if the phrasing clearly asks for one |
| A photo + instruction | Image edit (e.g. "remove the background", "add sunglasses") |
| A photo with the caption "animate it" | Turns the photo into a short video |
| A voice note | Transcribes it and runs the transcribed message as a normal request |
| A plain question | Conversational chat with memory and web search |
If you want a specific tool, use one of the slash commands below.
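The routing table above can be read as a small decision function. The sketch below is illustrative only: `IncomingMessage`, `detect_intent`, and the returned labels are hypothetical names, and the keyword and question-mark checks are crude stand-ins for whatever detection the bot actually performs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncomingMessage:
    # Hypothetical message shape; these fields are not Kolbo's real API.
    text: Optional[str] = None
    has_photo: bool = False
    has_voice_note: bool = False


def detect_intent(msg: IncomingMessage) -> str:
    """Return a label for the row of the routing table that matches msg."""
    text = (msg.text or "").strip()
    if text.startswith("/"):
        return "slash_command"  # manual override, handled separately
    if msg.has_voice_note:
        return "transcribe_then_route"  # transcript is routed as a new message
    if msg.has_photo:
        if "animate" in text.lower():  # assumed keyword check
            return "photo_to_video"
        if text:
            return "image_edit"
        return "photo_only"  # case not covered by the table
    if text.endswith("?"):  # assumed heuristic for a plain question
        return "chat"
    return "chat_or_generation"
```

For example, `detect_intent(IncomingMessage(text="add sunglasses", has_photo=True))` returns `"image_edit"` under these assumptions.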
Slash commands
| Command | Purpose |
|---|---|
| `/image <prompt>` | Generate a still image from text |
| `/video <prompt>` | Generate a video from text |
| `/music <prompt>` | Generate music |
| `/speech <text>` | Convert text to natural speech |
| `/sound <prompt>` | Generate a sound effect |
| `/credits` | Show your remaining credits |
| `/help` | Show the in-bot help message with the command list |
| `/logout` | Unlink your Kolbo account from this WhatsApp / Telegram identity |
Commands work identically on WhatsApp and Telegram.
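To make the command shape concrete, here is a minimal sketch of how a message could be split into a command and its argument. The `COMMANDS` mapping mirrors the table above, but `parse_command` is a hypothetical helper and the error behavior for a missing prompt is an assumption, not documented bot behavior.

```python
from typing import Optional, Tuple

# Commands from the table above; True means the command expects an argument.
COMMANDS = {
    "/image": True,
    "/video": True,
    "/music": True,
    "/speech": True,
    "/sound": True,
    "/credits": False,
    "/help": False,
    "/logout": False,
}


def parse_command(message: str) -> Optional[Tuple[str, str]]:
    """Split '/image a cat' into ('/image', 'a cat').

    Returns None when the message is not one of the known commands.
    """
    stripped = message.strip()
    if not stripped.startswith("/"):
        return None
    parts = stripped.split(maxsplit=1)
    name = parts[0]
    arg = parts[1].strip() if len(parts) > 1 else ""
    if name not in COMMANDS:
        return None
    if COMMANDS[name] and not arg:
        # Assumed behavior: reject a generation command with no prompt.
        raise ValueError(f"{name} needs a prompt or text after the command")
    return name, arg
```

Because the parser only looks at the message text, the same logic would apply to either messaging platform.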
Examples
/image a neon-lit Tokyo street at night, 35mm film look
/video slow dolly-in on a samurai meditating in an empty dojo
/music upbeat 80s synthwave, analog synths, gated reverb drums, 120 BPM, instrumental
/speech Welcome to Kolbo. This message is being read aloud by our TTS model.
/sound heavy wooden door creaking open slowly, echoing in a stone hallway
/credits
/help
Prompt tips
The same prompt-engineering rules that apply to the Kolbo web app apply to the bot. Short version:
- Images — subject → action → environment → lighting → style. Clean prompts only (no "Output:", "Tips:", resolution specs).
- Videos — 80–280 words, always include at least one camera movement (`slow dolly-in`, `tracking shot`, `360° orbit`), max 3 shots per prompt. For character consistency across shots, start with `same character throughout all shots`.
- Music — genre → mood → instrumentation → tempo → era. Short genre tags beat vague descriptions.
- Sound effects — describe the sound literally and physically, not emotionally. "Heavy wooden door creaking" beats "scary sound".
- Speech — the voice model has a per-request character limit; for long text, split into natural sentence boundaries and send multiple messages.
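The speech tip above can be automated before sending. The sketch below groups whole sentences into chunks under a limit. The actual per-request character limit is not stated here, so `max_chars` is a parameter you would set yourself, and `split_for_speech` is a hypothetical helper rather than part of the bot.

```python
import re
from typing import List


def split_for_speech(text: str, max_chars: int) -> List[str]:
    """Group whole sentences into chunks no longer than max_chars."""
    # Split after ., ! or ? followed by whitespace (a simple heuristic).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single sentence longer than max_chars is kept whole;
            # it would still need manual trimming.
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as its own `/speech` message.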
See the Kolbo skill prompt guide for the full prompt-engineering toolkit.
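The video guidelines in the list above lend themselves to a quick self-check before sending a long prompt. This is a rough sketch: `check_video_prompt` is a hypothetical helper, the camera-move list contains only the three examples named in the tip, and the results are meant as soft warnings rather than rules the bot enforces.

```python
from typing import List

# Only the three camera movements named in the tip; a real list would be longer.
CAMERA_MOVES = ("slow dolly-in", "tracking shot", "360° orbit")


def check_video_prompt(prompt: str) -> List[str]:
    """Return a list of warnings; an empty list means no issue was found."""
    problems: List[str] = []
    word_count = len(prompt.split())
    if not 80 <= word_count <= 280:
        problems.append(f"word count is {word_count}, expected 80-280")
    lowered = prompt.lower()
    if not any(move in lowered for move in CAMERA_MOVES):
        problems.append("no camera movement found")
    return problems
```

The shot-count guideline is left out because counting shots reliably depends on how the prompt is written.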
What the bot will NOT do
- Identify real people in photos, even well-known public figures.
- Reproduce copyrighted text/images verbatim. Style references are fine; verbatim reproduction is not.
- Send fabricated URLs. Every media URL the bot sends you is a real one that came back from a generation.