Kolbo.AI Docs
Kolbo Bot

Commands

Full reference for Kolbo Bot slash commands and auto-detected intents

The Kolbo bot automatically detects what you want most of the time — send a plain message and it figures out whether you want an image, a video, a chat reply, and so on. Slash commands are the manual override when you want to force a specific action.

Auto-detected intents

Plain messages are routed based on what you send:

| You send | Kolbo does |
| --- | --- |
| A text description | Chat reply, or a matching generation if the phrasing clearly asks for one |
| A photo + instruction | Image edit (e.g. "remove the background", "add sunglasses") |
| A photo alone with "animate it" | Turns the photo into a short video |
| A voice note | Transcribes it and runs the transcribed message as a normal request |
| A plain question | Conversational chat with memory and web search |

If you want a specific tool, use one of the slash commands below.
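
The routing table above can be read as a small decision procedure. The following is a minimal sketch of that reading, not Kolbo's actual implementation: the `Message` type, the `route` function, and the intent labels are all hypothetical, and the real bot presumably interprets phrasing with something more capable than a keyword check.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    """Hypothetical incoming message; field names are illustrative only."""
    text: Optional[str] = None
    has_photo: bool = False
    is_voice_note: bool = False


def route(msg: Message) -> str:
    """Return a hypothetical intent label that mirrors the routing table."""
    if msg.is_voice_note:
        # Voice notes are transcribed, then handled like a typed request.
        return "transcribe_then_route"

    text = (msg.text or "").strip().lower()

    if msg.has_photo:
        if "animate" in text:
            # A photo with an "animate it" style instruction becomes a video.
            return "photo_to_video"
        if text:
            # Any other instruction on a photo is treated as an image edit.
            return "image_edit"
        # Assumption: the docs do not say what happens to a photo with no text.
        return "unspecified"

    # Plain text defaults to chat. Detecting phrasing that "clearly asks"
    # for a generation is deliberately left out of this sketch.
    return "chat"
```

For example, `route(Message(text="remove the background", has_photo=True))` yields `"image_edit"` in this sketch, while a bare question falls through to `"chat"`.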

Slash commands

| Command | Purpose |
| --- | --- |
| /image <prompt> | Generate a still image from text |
| /video <prompt> | Generate a video from text |
| /music <prompt> | Generate music |
| /speech <text> | Convert text to natural speech |
| /sound <prompt> | Generate a sound effect |
| /credits | Show your remaining credits |
| /help | Show the in-bot help message with the command list |
| /logout | Unlink your Kolbo account from this WhatsApp / Telegram identity |

Commands work identically on WhatsApp and Telegram.
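
To make the command syntax concrete, here is a rough sketch of how a message could be split into a command name and its argument. It is an illustration only: `parse_command` and `COMMANDS` are hypothetical names, and both the case-insensitive matching and the rejection of a missing prompt are assumptions rather than documented bot behavior.

```python
from typing import Optional, Tuple

# Commands from the table above; True means the command takes an argument.
COMMANDS = {
    "image": True,
    "video": True,
    "music": True,
    "speech": True,
    "sound": True,
    "credits": False,
    "help": False,
    "logout": False,
}


def parse_command(message: str) -> Optional[Tuple[str, str]]:
    """Return (command, argument) for a recognised slash command, else None."""
    stripped = message.strip()
    if not stripped.startswith("/"):
        return None  # plain message: left to auto-detection

    # Split on the first run of whitespace: command name, then the rest.
    parts = stripped[1:].split(None, 1)
    if not parts:
        return None  # the message was just "/"

    name = parts[0].lower()  # assumption: command names are case-insensitive
    if name not in COMMANDS:
        return None

    arg = parts[1].strip() if len(parts) > 1 else ""
    if COMMANDS[name] and not arg:
        return None  # assumption: a missing prompt is treated as invalid here
    return name, arg
```

With this sketch, `parse_command("/credits")` returns `("credits", "")`, and a message without a leading slash returns `None`.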

Examples

/image a neon-lit Tokyo street at night, 35mm film look
/video slow dolly-in on a samurai meditating in an empty dojo
/music upbeat 80s synthwave, analog synths, gated reverb drums, 120 BPM, instrumental
/speech Welcome to Kolbo. This message is being read aloud by our TTS model.
/sound heavy wooden door creaking open slowly, echoing in a stone hallway
/credits
/help

Prompt tips

The same prompt-engineering rules that apply to the Kolbo web app apply to the bot. Short version:

  • Images: subject → action → environment → lighting → style. Clean prompts only (no "Output:", "Tips:", resolution specs).
  • Videos: 80–280 words, always include at least one camera movement (slow dolly-in, tracking shot, 360° orbit), max 3 shots per prompt. For character consistency across shots, open the prompt by stating that the same character appears throughout all shots.
  • Music: genre → mood → instrumentation → tempo → era. Short genre tags beat vague descriptions.
  • Sound effects: describe the sound literally and physically, not emotionally. "Heavy wooden door creaking" beats "scary sound".
  • Speech: the voice model has a per-request character limit; for long text, split at natural sentence boundaries and send multiple messages.
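
The advice on long speech text can be sketched as a small helper that packs whole sentences into chunks, each of which would then be sent as its own /speech message. This is a minimal sketch under assumptions: the actual character limit is not stated here, so `max_chars` is a placeholder the caller must supply, and the sentence splitter is a simple punctuation heuristic.

```python
import re
from typing import List


def split_for_speech(text: str, max_chars: int) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    # Split after ., ! or ? when followed by whitespace (rough heuristic).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single sentence longer than the limit is kept whole here;
            # a real client would need to break it up further.
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Greedy packing keeps the number of messages low while never cutting a sentence in half, which tends to matter for how natural the spoken result sounds.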

See the Kolbo skill prompt guide for the full prompt-engineering toolkit — the same rules apply here.
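
The numeric video guidelines in the tips above lend themselves to a quick pre-flight check before sending a prompt. The sketch below is hypothetical and deliberately partial: `check_video_prompt` and `CAMERA_TERMS` are illustrative names, the term list covers only the camera movements named as examples in the tips, and the three-shot limit is not checked because the tips do not say how shots are delimited.

```python
from typing import List

# Camera movements named as examples in the tips; not an exhaustive list.
CAMERA_TERMS = ("dolly", "tracking shot", "orbit")


def check_video_prompt(prompt: str) -> List[str]:
    """Return guideline problems found; an empty list means none were found."""
    problems: List[str] = []

    # Guideline: 80 to 280 words, counted here by whitespace splitting.
    word_count = len(prompt.split())
    if not 80 <= word_count <= 280:
        problems.append(f"word count {word_count} is outside 80-280")

    # Guideline: at least one camera movement (substring match on examples).
    lowered = prompt.lower()
    if not any(term in lowered for term in CAMERA_TERMS):
        problems.append("no known camera movement found")

    return problems
```

An empty result only means the prompt passed these two mechanical checks; it says nothing about prompt quality.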

What the bot will NOT do

  • Identify real people in photos, even well-known public figures.
  • Reproduce copyrighted text/images verbatim. Style references are fine; verbatim reproduction is not.
  • Send fabricated URLs. Every media URL the bot sends you is a real one that came back from a generation.