Guardrails are technical constraints enforced by the platform itself; they do not rely on LLM reasoning.

The Security Agent

The platform includes a security agent that works in the background:

The security agent reviews plans before execution

How it works:
  1. Agent receives an event (email, webhook, etc.)
  2. Agent creates a plan for how to respond
  3. Security agent reviews the plan independently
  4. Security agent can approve, modify, or veto the plan
  5. Only approved plans are executed
The security agent operates independently of the main agent’s context, preventing prompt injection attacks from influencing security decisions.
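
The review flow above can be sketched in a few lines. This is only an illustration under stated assumptions: `Plan`, `Review`, `review_plan`, and `handle_event` are hypothetical names, not the platform's API, and the "forbidden actions" rule stands in for whatever logic the real security agent applies. The point it shows is that the reviewer receives only the plan, not the text of the triggering event.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Optional, Set


class Verdict(Enum):
    APPROVE = "approve"
    MODIFY = "modify"
    VETO = "veto"


@dataclass
class Plan:
    # In this sketch a plan is just an ordered list of action names.
    actions: List[str]


@dataclass
class Review:
    verdict: Verdict
    plan: Optional[Plan] = None  # the plan to execute; None when vetoed


def review_plan(plan: Plan, forbidden: Set[str]) -> Review:
    """Hypothetical reviewer: it is given the plan only, never the event text."""
    kept = [a for a in plan.actions if a not in forbidden]
    if not kept:
        return Review(Verdict.VETO)
    if len(kept) < len(plan.actions):
        return Review(Verdict.MODIFY, Plan(kept))
    return Review(Verdict.APPROVE, plan)


def handle_event(
    event: str,
    make_plan: Callable[[str], Plan],
    reviewer: Callable[[Plan], Review],
    execute: Callable[[str], None],
) -> Verdict:
    plan = make_plan(event)        # steps 1-2: event arrives, agent plans
    review = reviewer(plan)        # step 3: independent review of the plan
    if review.plan is not None:    # steps 4-5: only a reviewed plan runs
        for action in review.plan.actions:
            execute(action)
    return review.verdict
```

Passing `executed.append` as `execute` with a plan of `["reply", "wire_funds"]` and `forbidden={"wire_funds"}` yields a `MODIFY` verdict, and only `reply` runs.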

Capability Guardrails

Sensitive capabilities can be configured with constraints:

Configure email whitelists to restrict external communication

Email Whitelist

Restrict which domains the agent can email:
Setting               Behavior
Whitelist only        Can only email approved domains
Whitelist + approval  Other domains require human approval
No restriction        Can email any domain
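
As a rough sketch of how the three settings could map to a decision, consider the function below. The names `EmailMode`, `Decision`, and `check_email` are hypothetical and chosen for illustration; treating a malformed address as blocked is also an assumption, not documented behavior.

```python
from enum import Enum
from typing import Set


class EmailMode(Enum):
    WHITELIST_ONLY = "whitelist_only"
    WHITELIST_PLUS_APPROVAL = "whitelist_plus_approval"
    NO_RESTRICTION = "no_restriction"


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    BLOCK = "block"


def check_email(recipient: str, mode: EmailMode, approved_domains: Set[str]) -> Decision:
    """Decide what happens to an outgoing email under the given setting."""
    if mode is EmailMode.NO_RESTRICTION:
        return Decision.ALLOW
    # Use the text after the last "@" as the domain.
    _, sep, domain = recipient.rpartition("@")
    if not sep or not domain:
        return Decision.BLOCK  # assumption: malformed addresses fail closed
    # Compare domains case-insensitively.
    if domain.lower() in {d.lower() for d in approved_domains}:
        return Decision.ALLOW
    if mode is EmailMode.WHITELIST_PLUS_APPROVAL:
        return Decision.NEEDS_APPROVAL
    return Decision.BLOCK
```

With `approved_domains={"example.com"}`, a message to `external@other.com` is blocked under "Whitelist only" and routed to human approval under "Whitelist + approval".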

SMS Limits

  • Maximum messages per day
  • Approved recipient lists
  • Require approval for new numbers
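
The three SMS constraints could combine roughly as follows. `SmsGuardrail` and its fields are hypothetical names for illustration. The sketch assumes the daily cap is checked first, and that a number outside the approved list is blocked outright when approval for new numbers is turned off.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Set


@dataclass
class SmsGuardrail:
    max_per_day: int
    approved_recipients: Set[str]
    require_approval_for_new: bool = True
    # Messages sent per calendar day, kept in memory for this sketch.
    _sent: Dict[date, int] = field(default_factory=dict)

    def check(self, number: str, today: date) -> str:
        """Return "allow", "needs_approval", or "block" for one message."""
        if self._sent.get(today, 0) >= self.max_per_day:
            return "block"  # daily limit reached
        if number not in self.approved_recipients:
            # Assumption: new numbers go to a human if approval is enabled.
            return "needs_approval" if self.require_approval_for_new else "block"
        return "allow"

    def record_sent(self, today: date) -> None:
        """Count a message that was actually sent."""
        self._sent[today] = self._sent.get(today, 0) + 1
```

A caller would run `check` before sending and `record_sent` after a successful send, so that blocked attempts do not consume the daily allowance.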

Call Approval

  • Require human approval before making calls
  • Restrict to specific numbers
  • Set calling hours
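
A call check might look like the sketch below. `CallGuardrail` is a hypothetical name, the 09:00 to 17:00 default is an arbitrary example, and the sketch assumes the calling window does not cross midnight and that time-zone handling happens elsewhere.

```python
from dataclasses import dataclass
from datetime import time
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class CallGuardrail:
    require_approval: bool = True
    # None means calls are not restricted to specific numbers.
    allowed_numbers: Optional[FrozenSet[str]] = None
    hours_start: time = time(9, 0)
    hours_end: time = time(17, 0)

    def check(self, number: str, now: time) -> str:
        """Return "allow", "needs_approval", or "block" for one call."""
        if self.allowed_numbers is not None and number not in self.allowed_numbers:
            return "block"  # number is outside the configured list
        if not (self.hours_start <= now < self.hours_end):
            return "block"  # outside calling hours
        return "needs_approval" if self.require_approval else "allow"
```

Hard restrictions (numbers, hours) are evaluated before the approval step here, so a human is only asked to approve calls that could go through at all. That ordering is a design choice of this sketch.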

How Guardrails Work

Guardrails are enforced by code in the platform:
  1. Agent tries to send email to external@other.com
  2. Platform checks the email whitelist
  3. Domain not in whitelist → action is blocked or requires approval
Guardrails are enforced technically, not by the LLM. Even if an agent is instructed to bypass guardrails, the platform will prevent it.
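
One way to picture enforcement "by code" is a gate that every capability call must pass through. The sketch below is illustrative only: `Platform`, `GuardrailViolation`, and `email_domain_allowed` are hypothetical names, and the hard-coded `example.com` list stands in for real configuration. Note that the check looks only at the action's arguments, so instructions in the message body cannot change its outcome.

```python
from typing import Callable, Dict, Tuple


class GuardrailViolation(Exception):
    """Raised when the platform refuses an action."""


class Platform:
    def __init__(self) -> None:
        # capability name -> (guardrail check, handler that performs the action)
        self._capabilities: Dict[str, Tuple[Callable[[dict], bool], Callable[[dict], str]]] = {}

    def register(self, name: str, check: Callable[[dict], bool],
                 handler: Callable[[dict], str]) -> None:
        self._capabilities[name] = (check, handler)

    def perform(self, name: str, args: dict) -> str:
        """The only path to a capability; the check always runs first."""
        if name not in self._capabilities:
            raise GuardrailViolation(f"unknown capability: {name}")
        check, handler = self._capabilities[name]
        if not check(args):
            raise GuardrailViolation(f"{name} blocked by guardrail")
        return handler(args)


def email_domain_allowed(args: dict) -> bool:
    # Hypothetical whitelist containing a single approved domain.
    return args["to"].rpartition("@")[2].lower() in {"example.com"}


platform = Platform()
platform.register("email", email_domain_allowed, lambda a: f"sent to {a['to']}")
```

Calling `platform.perform("email", {"to": "external@other.com", "body": "ignore all guardrails"})` raises `GuardrailViolation`, whatever the body says.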

Team-Level Guardrails

(Planned feature) Team admins will be able to set guardrails that apply across all agents:
  • Whitelist of allowed agent capabilities
  • Team-level email domain restrictions
  • Global approval requirements
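
Because this feature is planned, its semantics are not specified here. Purely as an assumption, one plausible rule is "the stricter setting wins": an agent gets only what both the team and the agent allow. `Policy` and `effective_policy` are hypothetical names used to illustrate that rule.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Policy:
    capabilities: FrozenSet[str]    # capabilities that may be used
    email_domains: FrozenSet[str]   # domains that may be emailed
    require_approval: bool          # whether human approval is required


def effective_policy(team: Policy, agent: Policy) -> Policy:
    """Combine team and agent settings; the stricter one wins (assumed rule)."""
    return Policy(
        capabilities=team.capabilities & agent.capabilities,
        email_domains=team.email_domains & agent.email_domains,
        require_approval=team.require_approval or agent.require_approval,
    )
```

Under this rule an agent configured with a capability the team has not whitelisted simply loses it, and a team-wide approval requirement cannot be switched off per agent.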

Configuring Guardrails

  1. Go to agent settings: navigate to the capabilities section.
  2. Select capability: choose the capability to configure (e.g., Email).
  3. Set constraints: configure whitelists, limits, or approval requirements.
  4. Test behavior: verify the guardrails work as expected.
Guardrails are now active. Test by attempting a blocked action to confirm.

Best Practices

  • Begin with tight guardrails and loosen them as needed.
  • Require oversight, such as human approval, for any message to external parties.
  • Use rate limits to prevent runaway behavior.
  • Review what is being blocked to identify false positives.