OpenAI provides a wide range of AI capabilities for your agent, from language models to image generation and voice features.

GPT models

OpenAI’s GPT models are available as an alternative to the default model for your agent. Configure model selection under Settings → Advanced Config.
See Model Selection for the full list of available models and guidance on choosing the right one.

Image generation

Generate images from text descriptions using OpenAI’s GPT image generation.
How to enable: Turn on the Image generation capability for your agent, then select GPT.
Example use cases:
  • “Generate a hero image for our blog post about sustainable energy”
  • “Create an illustration showing our product workflow”

Voice transcription

Transcribe spoken audio to text using OpenAI’s Whisper model.
How to use: Press the mic button in the chat, or drop an audio file to transcribe it.
Example use cases:
  • Talk to your agent instead of typing
  • Drop a meeting recording file and ask the agent to transcribe it
See Voice Communication for details on voice input and walk-and-talk mode.

Text to speech

Convert text to natural-sounding speech using OpenAI’s TTS models.
How to use: Press the speak button under any chat message to hear it read aloud.
Example use cases:
  • Listen to a long response while making coffee or taking a walk
  • Have your agent read out a summary or report
See Voice Communication for details on text to speech and walk-and-talk mode.

Image recognition

Your agent can analyze and understand images automatically.
How to use: Provide an image through the chat interface, agent documents, or email attachments.
Example use cases:
  • “Look at this screenshot and tell me what’s wrong with the layout”
  • “Read this invoice image and extract the key details”
  • “Read this chart image and add the data to my spreadsheet”