
When to use the chat
The role of the chat depends a lot on what the agent’s job is:
- Some agents are very chat-centric, with most of the interaction happening in the chat. For example, if the agent’s job is to analyze the competitive landscape, you might use the chat as the primary interface for discussing this.
- Some agents mostly use other channels, such as email or Slack. In that case, the built-in chat has a secondary role.
- Some agents handle a workflow automatically and rarely need to chat at all. For example, once the invoice router is up and running, it will mostly work silently in the background, except when it needs tuning.
During initial setup and tuning, the chat is particularly useful for:
- Discussing and updating the agent’s instructions: “I want you to help me handle incoming invoices, check that they are valid, and then route them to the right team. What do you need from me to do this?”
- Testing the agent: “Check this incoming order PDF and process it according to the instructions, as if you received it in an email. Don’t interact with any external systems, just tell me what you would have done.”
- Understanding and improving the agent’s behavior: “You flagged the latest invoice as potentially fraudulent, even though you’ve processed similar invoices before without any issues. Can you analyze your instructions and docs and figure out why? And then suggest how we can improve your instructions to avoid this in the future?”
- Adding more context: “Since you keep asking me about people’s contact info, here is a link to our contacts page. Update your instructions to use this as a source of information.”
File uploads
You can upload files directly in the chat:
- Images: The agent can analyze and describe images, extract text, or use them as context
- Documents: PDFs, Word documents, spreadsheets, and presentations
- Audio: Recordings that can be transcribed and analyzed
- Data files: CSV, JSON, and other structured data
Working with agent documents
From the chat, you can ask the agent to read, create, or update its agent documents and databases. This makes it easy to build up the agent’s knowledge base through natural conversation.
- Saving documents: “Save these updated guidelines as an agent doc please”
- Reading documents: (drop a PDF) “Check if this PDF complies with the guidelines document”
- Updating databases: (after brainstorming) “Those are great ideas, please add them to the brainstorm database”
- Creating databases: (paste a whiteboard photo) “Here are some notes from our product planning session, please create a database and store the product ideas, with suitable fields like category, prio, etc.”
Rich responses
Agents can respond with more than plain text, including formatted text, diagrams, images, and voiceovers.
- Formatted text: “Make a nice-looking draft blog post based on my messy brainstorm notes”
- Diagrams: “Show me a flowchart of your workflow”
- Images: “Create a nice-looking infographic of the given meeting notes”
- Voiceovers: “Email me a voiceover of this article, using a casual female British accent”
- Mixed formats: “Create a nice-looking draft blog post based on my messy brainstorm notes, include a suitable image and visual overview, and a voiceover link.”

Interactive applications
Agents can generate interactive applications directly in the chat. This allows agents to create custom interfaces for specific tasks:
- Dashboards: “Create a nice-looking interactive dashboard for our OKRs, where you can browse progress per team, and aggregate the progress by quarter.”
- Forms: “Create a user feedback form that saves data to the user feedback database”
Interactive applications are rendered using React components that the agent generates on the fly. The agent can iterate on these based on your feedback.
Code execution
If you enable the Code execution capability, the agent can execute code directly in the chat. This is useful when large amounts of data need to be processed, or when the agent needs to perform complex calculations.
- “Calculate the total cost of the project, including materials and labor”
- “Generate a report of the sales data, including charts and tables”
- “Aggregate the sales data in the database and store in a new database called ‘Sales by product category’”
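To give a feel for the last request above, here is a minimal sketch of the kind of code an agent might write and run for an aggregation task. It is illustrative only: the sample rows, the field names (`product`, `category`, `amount`), and the `aggregate_by_category` helper are all hypothetical, and the code an agent actually generates, as well as how it reads from and writes to databases, may look quite different.

```python
from collections import defaultdict

# Hypothetical rows, standing in for records read from a sales database.
sales_rows = [
    {"product": "Desk lamp", "category": "Lighting", "amount": 40.0},
    {"product": "Floor lamp", "category": "Lighting", "amount": 85.0},
    {"product": "Office chair", "category": "Furniture", "amount": 120.0},
]


def aggregate_by_category(rows):
    """Sum the 'amount' field per 'category', returning one summary row per category."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["category"]] += row["amount"]
    # Sort by category name so the output order is stable.
    return [{"category": name, "total": total} for name, total in sorted(totals.items())]


# In a real agent, these summary rows would be stored in the new database.
summary = aggregate_by_category(sales_rows)
for entry in summary:
    print(f"{entry['category']}: {entry['total']:.2f}")
```

Running code like this lets the agent compute exact totals instead of estimating them, which matters most when the data set is large.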
Voice input
You don’t have to type. Press the microphone button to speak, and your speech is transcribed to text. This works in most languages and is often a huge time saver.
Text-to-speech
If you hover over a chat message, you can click the “speak” button below it to have the message read aloud. This is useful in situations such as taking a walk, where you don’t want to look at your phone more than necessary. There is also a walk-and-talk mode that is optimized for this use case. See Voice communication for more on voice features, including walk-and-talk mode.
Multi-user chat
The chat is collaborative: different users on your team can talk to the same agent, and multiple people can participate in the same conversation. Responses are live-streamed, so if several users are watching the same chat, they all see the response as it is generated. See Collaboration for more on working with agents as a team.
Conversation history
All conversations are saved automatically. You can:
- Review past conversations anytime
- Continue previous conversations
- Rename and remove conversations
- Share conversation links with teammates

