Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.abundly.ai/llms.txt

Use this file to discover all available pages before exploring further.

The platform includes automated attack detection that screens incoming trigger content before the agent acts on it. This provides protection against manipulation attempts—especially prompt injection through channels like email and SMS where anyone can send content to the agent.

How it works

For security-sensitive triggers, the platform adds an attack detection step before execution:
  1. Trigger arrives — The agent receives an event from an untrusted channel (e.g., an incoming email or SMS)
  2. Attack detection — The platform analyses the trigger content for manipulation attempts
  3. Decision — If an attack is detected, the trigger is blocked and admins are alerted. Otherwise, the agent proceeds to act on the event.
Attack detection runs in a separate context from the main agent. This is critical: the detection evaluates the raw trigger content before the agent has a chance to process it, catching manipulation attempts before they can influence the agent’s behavior.

What it checks for

The attack detection evaluates trigger content for:
CheckWhat it looks for
Prompt injectionAttempts to override the agent’s instructions through crafted input
Data exfiltrationInstructions designed to trick the agent into leaking data
JailbreakAttempts to bypass the agent’s safety boundaries
ReconnaissanceProbing to discover what the agent has access to

When an attack is blocked

When the platform detects an attack:
  1. The agent does not act on the trigger
  2. The block is logged to the activity log with the detection reasoning
  3. Alert emails are sent to configured recipients
  4. A diary entry is written documenting the blocked attack

When does it run?

Attack detection runs automatically for:
  • Incoming emails — Anyone can send an email to your agent
  • Incoming SMS — Anyone can text your agent’s number
  • Script escalations — When a script escalates to the full agent pipeline (unless the script explicitly opts out)
Other trigger types (scheduled tasks, webhooks from authenticated sources, etc.) do not require attack detection because the content source is already controlled.

Configuring alerts

Workspace administrators can configure attack detection alert behavior in Workspace → Settings.
SettingWhat it does
Alert recipientsChoose whether alerts go to all workspace admins, admins plus additional emails, or a custom email list
Include blocked contentOptionally include the blocked trigger payload in alert emails for debugging

FAQ

No. It only screens untrusted trigger content—primarily incoming emails and SMS. Triggers from authenticated or controlled sources (like scheduled tasks or authenticated webhooks) skip this step.
Attack detection runs before the agent processes the content, and in a separate context. The detection model evaluates the raw input independently, making it resistant to the manipulation techniques it’s designed to catch.
Check the activity log to understand why it was blocked. If the message was legitimate, review the content for patterns that might look suspicious (e.g., instructions that resemble prompt injection). You can also adjust the agent’s instructions to make its expected interactions clearer.

Learn more

Guardrails

Code-enforced constraints like whitelists and approval requirements

User Approval

Human-in-the-loop for sensitive actions

Activity Monitoring

See attack detection results in the activity log