Every agent needs a brain. On Abundly, you can choose which large language model (LLM) powers your agent—or simply let the platform pick the best one for you.
Figure: Agent settings showing model selection with Claude Sonnet Latest selected

Default behavior

If you don’t have a preference, leave the model set to (no preference). This is the default, and it means Abundly will use whatever model we think works best for general agentic behavior. We continuously evaluate models and update this default as better options become available. This is the right choice for most users—you get great performance without having to think about model selection at all.

Model aliases vs specific versions

When selecting a model, you can choose between:
  • Aliases like “Claude Sonnet Latest” — automatically points to the latest version that we have verified works well
  • Specific versions like “Claude Sonnet 4.5” — locked to that exact version
We recommend using aliases in most cases. Models improve constantly, and when a new version of Claude Sonnet arrives, it usually makes sense to upgrade. By selecting an alias, you won’t need to remember to do that—Abundly will point the alias to the latest version when we’ve verified it works well (typically after some internal testing). However, if you want full control and want to minimize the risk of surprises, pick a specific model version. Although newer models usually perform better, agent behavior can sometimes change subtly. Locking to a specific version lets you manually decide when to upgrade.
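The difference between an alias and a pinned version can be sketched in a few lines. This is an illustration only, not Abundly's implementation: the `ALIASES` table and the `resolve_model` function are hypothetical names, and the version strings are just examples taken from this page.

```python
# Hypothetical alias table: each alias maps to the newest verified version.
# Illustrative data only; this is not how Abundly stores its catalog.
ALIASES = {"Claude Sonnet Latest": "Claude Sonnet 4.5"}


def resolve_model(selection: str, aliases: dict[str, str]) -> str:
    """Return the concrete version behind a selection.

    An alias follows whatever the alias table currently points to;
    a specific version is returned unchanged.
    """
    return aliases.get(selection, selection)


pinned = "Claude Sonnet 4.5"
floating = "Claude Sonnet Latest"

before = (resolve_model(pinned, ALIASES), resolve_model(floating, ALIASES))

# Simulate the alias being re-pointed once a newer version has been verified.
upgraded = {**ALIASES, "Claude Sonnet Latest": "Claude Sonnet 4.6"}
after = (resolve_model(pinned, upgraded), resolve_model(floating, upgraded))
```

In this sketch the pinned selection resolves to the same version before and after the upgrade, while the alias silently moves to the newer one. That is the trade-off described above: convenience versus full control over when behavior changes.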

Available models

Model availability is workspace-specific. Workspace admins can turn off specific catalog models for the entire workspace from Workspace management → Model selection—useful for controlling costs or standardizing on a smaller set of models. Models that Abundly has disabled platform-wide remain locked and cannot be re-enabled at the workspace level.

Policy for new models

When Abundly adds a new model to the catalog, your workspace can either make it available automatically or hold it back until an admin reviews it. Choose the policy under Workspace management → Settings → LLM models:
  • New models on by default (the default) — New catalog models become available to your agents as soon as Abundly adds them. You can still turn any model off afterwards on the Model selection tab.
  • Require approval for new models — New catalog models stay off until an admin enables them on the Model selection tab. Switching to this mode does not disable any model that is currently available; only models added to the catalog after you switch require an explicit enable.
Pick “Require approval” if you want a controlled set of models and prefer to evaluate new additions yourself before agents can use them. Pick the default if you want your workspace to benefit from new models automatically.
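The availability rules described above (platform-wide locks, admin switches, and the new-model policy) can be summarized as a small decision function. This is a sketch under assumptions: `NewModelPolicy`, `is_model_available`, and all parameter names are hypothetical and only mirror the rules stated on this page.

```python
from enum import Enum


class NewModelPolicy(Enum):
    ON_BY_DEFAULT = "on_by_default"
    REQUIRE_APPROVAL = "require_approval"


def is_model_available(
    model: str,
    policy: NewModelPolicy,
    platform_disabled: frozenset[str] = frozenset(),
    admin_disabled: frozenset[str] = frozenset(),
    admin_enabled: frozenset[str] = frozenset(),
    added_since_switch: frozenset[str] = frozenset(),
) -> bool:
    """Decide whether agents in a workspace may use a model (hypothetical logic)."""
    if model in platform_disabled:
        return False  # locked platform-wide; the workspace cannot re-enable it
    if model in admin_disabled:
        return False  # turned off by a workspace admin
    if policy is NewModelPolicy.REQUIRE_APPROVAL and model in added_since_switch:
        return model in admin_enabled  # new models stay off until explicitly enabled
    return True  # already-available models, or new models under the default policy


approval = NewModelPolicy.REQUIRE_APPROVAL
new_models = frozenset({"Grok 4.3"})

existing_ok = is_model_available("Claude Sonnet 4.6", approval, added_since_switch=new_models)
held_back = is_model_available("Grok 4.3", approval, added_since_switch=new_models)
approved = is_model_available(
    "Grok 4.3", approval, admin_enabled=new_models, added_since_switch=new_models
)
```

Note how switching to the approval policy leaves `Claude Sonnet 4.6` available in this sketch: only models added after the switch need an explicit enable.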
Most workspaces use a mix of:
  • Anthropic Claude
  • OpenAI GPT
  • Google Gemini
  • xAI Grok
Some workspaces also include additional families such as Qwen (EU-hosted via Scaleway serverless inference), DeepSeek, or self-hosted models. You can switch models at any time—even mid-conversation.

Claude

Anthropic’s models, known for nuanced reasoning and safety

GPT

OpenAI’s models, versatile and widely capable

Gemini

Google’s models, strong at knowledge tasks and multimodal understanding

Comparing models

Not sure which model to choose? We tend to default to the Claude models (Sonnet or Opus), which we find work very well for agentic behaviour. But the other models have improved a lot lately, and some have specific strengths that you may want to leverage. This is a changing landscape, so you’re probably best off researching online yourself (or asking an agent to do it for you…). Here’s a high-level overview of how each provider’s models compare with one another.

Anthropic Claude models

| Model | Best for | Speed | Cost | Notes |
| --- | --- | --- | --- | --- |
| Claude Fable 5 | The very hardest problems | Moderate | $$ | Anthropic’s most powerful model, a new tier above Opus. 1M token context window with adaptive thinking. |
| Claude Opus 4.8 | Complex reasoning, high-stakes tasks | Moderate | $ | Most capable Opus model. 1M token context window with adaptive thinking. Best for difficult problems that defeat Sonnet. |
| Claude Sonnet 4.6 | Everyday work, coding, analysis | Fast | $ | Recommended default. Excellent balance of capability and speed. |
| Claude Haiku 4.5 | High-volume, speed-critical tasks | Fastest | $ | ~3x faster than Sonnet at 1/3 the cost. Near-Sonnet quality for most tasks. |
Start with Claude Sonnet for most use cases. Use Opus—or Fable, at a higher cost—for your hardest problems, and Haiku when you need speed at scale.

Google Gemini models

| Model | Best for | Speed | Cost | Notes |
| --- | --- | --- | --- | --- |
| Gemini 3.5 Flash | Fast reasoning with large context | Fast | $$ | 1M token context. Thinking support. Strong all-rounder for tasks that need both speed and depth. |
| Gemini Pro 3 | Maximum reasoning depth | Moderate | $$$ | Strongest reasoning. 1M token context for huge documents. |
| Gemini Flash 2.5 | Balanced performance | Fast | $$ | Good multimodal capabilities. Native audio support. |
| Gemini Flash Lite 2.5 | High-volume processing | Fastest | $ | Most cost-effective. Great for bulk summarization and routing. |
Gemini 3.5 Flash is the new workhorse—fast with thinking support and a 1M token context. Choose Pro for maximum reasoning depth and Flash Lite for high-volume, simple operations.

OpenAI GPT models

| Model | Best for | Speed | Cost | Notes |
| --- | --- | --- | --- | --- |
| GPT 5.5 | Premium reasoning with massive context | Moderate | $$ | 1.05M token context window. Thinking support. Best for tasks needing a very large context and top-tier reasoning. |
| GPT 5.2 Pro | Mission-critical, complex reasoning | Moderate | $$ | Extended reasoning for highest accuracy. Best for high-stakes decisions. |
| GPT 5.2 | Latest improvements, reduced hallucinations | Fast | $$$ | 30% fewer hallucinations. Strong on coding (55.6% SWE-Bench Pro) and reasoning. |
| GPT 5 Mini | Well-defined tasks, interactive apps | Faster | $$ | Great balance of speed and capability. Ideal for chat and coding assistants. |
| GPT 5 Nano | High-volume, simple tasks | Fastest | $ | Ultra-cheap. Best for summarization, classification, and bulk processing. |
GPT 5.2 is the recommended default for most tasks. Use Mini for speed-sensitive applications and Nano for high-volume simple operations.

xAI Grok models

| Model | Best for | Speed | Cost | Notes |
| --- | --- | --- | --- | --- |
| Grok 4.3 | Long-context reasoning, multimodal tasks | Fast | $$ | 1M token context window. Adjustable reasoning intensity. Supports text, image, and video input. |
| Grok 4.1 Fast | Fast agentic workflows and tool-heavy tasks | Fast | $$ | Available in reasoning and non-reasoning variants. Some Grok variants use integrated reasoning and may enforce model-specific thinking behavior. |
Use Grok 4.3 for long-context tasks (up to 1M tokens) or when you need multimodal reasoning. It’s a strong alternative to Gemini Pro and GPT 5.5 for document-heavy workloads.
Grok models are available when your workspace has xAI configured.

Per-context model selection

Different contexts in your agent’s life may benefit from different models. You can configure a model and thinking preference per context:
  • Chat — Conversations you start with the agent
  • Scheduled tasks — When the agent runs on a schedule
  • Email — When the agent responds to incoming emails
  • Slack — When the agent responds to Slack messages
  • Microsoft Teams — When the agent responds to Teams messages
  • Agent messages — When other agents send your agent a message
If a context has no override, it inherits the agent’s default model. This lets you use a fast, cheap model for routine scheduled work while keeping a more capable model for interactive chat.

Per-task model overrides

Individual scheduled tasks can have their own model and thinking setting on top of the per-context default. Open a task’s settings card to configure it. This is useful when one particular task needs heavier reasoning than your other scheduled tasks.
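The inheritance described in the last two sections amounts to a simple lookup order: task override first, then context override, then the agent's default. The sketch below assumes hypothetical names (`effective_model`, `AGENT_DEFAULT`, `CONTEXT_OVERRIDES`, and the context keys); the platform exposes these settings through its UI, not through this function.

```python
from typing import Optional

# Hypothetical agent configuration, for illustration only.
AGENT_DEFAULT = "Claude Sonnet Latest"
CONTEXT_OVERRIDES = {"scheduled_tasks": "Claude Haiku 4.5"}


def effective_model(
    agent_default: str,
    context_overrides: dict[str, str],
    context: str,
    task_override: Optional[str] = None,
) -> str:
    """Pick the model for one run.

    Order: per-task override (scheduled tasks only), then the per-context
    override, then the agent's default model.
    """
    if task_override is not None:
        return task_override
    return context_overrides.get(context, agent_default)


chat_model = effective_model(AGENT_DEFAULT, CONTEXT_OVERRIDES, "chat")
routine_task_model = effective_model(AGENT_DEFAULT, CONTEXT_OVERRIDES, "scheduled_tasks")
heavy_task_model = effective_model(
    AGENT_DEFAULT, CONTEXT_OVERRIDES, "scheduled_tasks", task_override="Claude Opus 4.8"
)
```

Here chat falls back to the agent default, routine scheduled work uses the cheaper context override, and one demanding task gets its own, more capable model.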

Thinking mode

Many models support a thinking mode where the model spends more effort reasoning before it responds.
Figure: Thinking settings shown in model selection controls
Thinking makes your agent smarter at complex reasoning tasks, but it also increases processing time and credit usage. You can configure thinking in the default model settings, per context, and on individual scheduled tasks.
Thinking behavior depends on the selected model:
  • Supports thinking — You can toggle thinking on or off
  • Requires thinking — Thinking is forced on
  • Does not support thinking — The thinking toggle is hidden
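The three cases above determine whether your thinking preference actually takes effect. A minimal sketch, assuming a hypothetical `ThinkingSupport` enum and `effective_thinking` function (neither is part of the platform):

```python
from enum import Enum


class ThinkingSupport(Enum):
    OPTIONAL = "supports thinking"
    REQUIRED = "requires thinking"
    UNSUPPORTED = "does not support thinking"


def effective_thinking(support: ThinkingSupport, requested: bool) -> bool:
    """Return whether thinking is actually used for a given model capability."""
    if support is ThinkingSupport.REQUIRED:
        return True  # forced on regardless of the preference
    if support is ThinkingSupport.UNSUPPORTED:
        return False  # no toggle is shown, so the preference cannot apply
    return requested  # the user's toggle decides


forced = effective_thinking(ThinkingSupport.REQUIRED, requested=False)
ignored = effective_thinking(ThinkingSupport.UNSUPPORTED, requested=True)
chosen = effective_thinking(ThinkingSupport.OPTIONAL, requested=True)
```

Only in the optional case does the preference matter, which is why switching models can change whether thinking is used even if you never touch the toggle.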

When to change models

| Scenario | Recommendation |
| --- | --- |
| Just getting started | Leave it on “(no preference)” |
| Complex reasoning, planning | Claude Opus, Gemini Pro, or GPT 5.2 Pro |
| Everyday tasks, coding | Claude Sonnet, GPT 5.2, or GPT 5 Mini |
| High-volume, simple tasks | Claude Haiku, Gemini Flash Lite, or GPT 5 Nano |
| Speed is critical | Claude Haiku, Gemini Flash, or GPT 5 Nano |
| Lowest hallucination rates | GPT 5.2 or GPT 5.2 Pro |
| Working with long documents | Claude Opus (1M token context), Gemini Pro or Gemini 3.5 Flash (1M token context), GPT 5.5 (1.05M token context), or Grok 4.3 (1M token context) |

Unified interface

Regardless of which model you choose, the platform provides a consistent experience:
  • Switch models even mid-conversation
  • Same capabilities work across all models
  • Consistent behavior and tool usage
  • No need to learn different APIs
You can switch models at any time. If you’re not happy with the results, try a different model—your agent’s instructions and capabilities stay the same.
Each model has different credit costs per token. More capable models like Opus and GPT 5.2 cost more per request, while Haiku and Flash Lite are very cost-effective. Check your usage dashboard to monitor credit consumption.
Each agent can have its own model preference. You might use a fast, cheap model for a high-volume support agent and a powerful reasoning model for a research agent.
Within a single agent, you can set a model per context (chat, scheduled tasks, email, Slack, Microsoft Teams, agent messages) and even override the model on individual scheduled tasks. This lets you optimize cost and capability for each type of work the agent does.

Learn more

Anthropic (Claude)

Learn about Claude models and capabilities

OpenAI (GPT)

Learn about GPT models and capabilities

Google Gemini

Learn about Gemini models and capabilities