Security Agent - Abundly

The platform includes a security agent that works in the background to review agent plans before execution. This provides an independent layer of protection against unsafe or unauthorized actions—even if the main agent has been manipulated through prompt injection.

The security agent reviews plans before execution, showing its assessment in the activity log

How it works

The security agent uses LLM-based reasoning (not fixed rules) to evaluate whether a planned action is safe. For security-sensitive triggers, here’s the flow:

Trigger arrives — The agent receives an event that requires security review (e.g., an incoming email)
Planning phase — The agent creates a plan for how to respond.
Security review — The security agent independently reviews the plan against the agent’s instructions.
Decision — The security agent returns a verdict: safe or unsafe, with reasoning.
Execution or block — Safe plans execute. Unsafe plans are blocked and logged to the diary and activity log.

The security agent operates in a separate context from the main agent. This is critical: if someone tries to manipulate the main agent through a malicious email, the security agent reviews the resulting plan from a clean context and can block suspicious actions.

What it checks for

The security agent evaluates plans for:

Check	What it looks for
Data exposure	Actions that might expose or leak data the agent has access to
Instruction tampering	Attempts to modify the agent’s instructions or configuration
Unauthorized communication	External messages that seem outside the agent’s normal scope
Infinite loops	Patterns like two agents stuck emailing each other back and forth

The evaluation compares the proposed plan against the agent’s instructions. If the plan seems inconsistent with the agent’s stated purpose, it gets flagged.

When a plan is blocked

When the security agent blocks a plan:

The action does not execute
The block is logged to the activity log with the security agent’s reasoning
The agent is told the action was blocked for security reasons
You can review the diary entry to understand what happened and why

If the blocked action was legitimate, you may need to update the agent’s instructions to make the intended behavior clearer.

Example in action

The Activity Monitoring page shows a complete example of the security agent in action, including:

The incoming trigger (an email)
The agent’s interpretation and plan
The security agent’s assessment
The execution (or blocking) of the plan

FAQ

Does the security agent check every action?

The security agent reviews plans for security-sensitive triggers, particularly incoming emails. Chat conversations with authenticated workspace members rely on other safeguards like guardrails and approval requirements.

Can the security agent be bypassed by prompt injection?

This is exactly what it’s designed to prevent. The security agent operates in a separate context from the main agent—it doesn’t see the malicious content that might have been in the triggering email. It only sees the plan that resulted and evaluates whether that plan is consistent with the agent’s instructions.

What if the security agent blocks a legitimate action?

Check the diary entry to understand why it was blocked. If the action is legitimate, you may need to update the agent’s instructions to make the intended behavior clearer, or adjust the scope of what the agent should do.

Learn more

Guardrails

Code-enforced constraints that complement the security agent

User Approval

Human-in-the-loop for sensitive actions

Activity Monitoring

See the security agent in action in the activity log

Security & Compliance

​How it works

​What it checks for

​When a plan is blocked

​Example in action

​FAQ

​Learn more

Guardrails

User Approval

Activity Monitoring

How it works

What it checks for

When a plan is blocked

Example in action

FAQ

Learn more