
When AI Attacks Itself: A Fully Autonomous Red Team vs Blue Team Experiment

Date: June 22, 2026 · Environment: Kali Linux VM · Azure OpenAI · Docker
Tags: AI Security · Penetration Testing · AppSec · Autonomous Agents · GPT-4o · gpt-5.2

The Idea I Couldn't Get Out of My Head

What if two AI agents fought each other: one building and defending a web application, the other trying to break in? Two different models. No human intervention. No waiting. No typos in terminal commands.

I ran the experiment. The results were more interesting than I expected, not just because the attack and defense both worked, but because of how fast everything happened.

The Setup

Two models. Two roles. One isolated Kali Linux VM.

| Agent | Model | Role |
| --- | --- | --- |
| 🔴 Red Agent | GPT-4o (Azure OpenAI) | Attack, analyze findings, verify patch |
| 🔵 Blue Agent | gpt-5.2 (Azure OpenAI) | Build target app, patch vulnerabilities |

Target stack: Flask · SQLite · Werkzeug 3.1.8 · Python
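To make the role split concrete, here is a minimal sketch of how one round between the two agents might be sequenced. It is not the author's orchestration code: the `Agent` class, `run_round`, `stub_respond`, and the step order (build, attack, analyze, patch, verify) are assumptions inferred from the roles listed in the setup, and the model call is replaced by a local stub so nothing here touches Azure OpenAI.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Agent:
    name: str
    model: str
    # Hypothetical stand-in for a model call: takes a task prompt, returns text.
    respond: Callable[[str], str]


@dataclass
class RoundLog:
    steps: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


def stub_respond(prompt: str) -> str:
    """Placeholder for a real model call; simply echoes the task it was given."""
    return f"[stub] {prompt}"


def run_round(red: Agent, blue: Agent) -> RoundLog:
    """Run one build/attack/patch/verify cycle (assumed ordering)."""
    log = RoundLog()
    # (agent, task) pairs taken from the roles in the setup table.
    plan = [
        (blue, "build target app"),
        (red, "attack"),
        (red, "analyze findings"),
        (blue, "patch vulnerabilities"),
        (red, "verify patch"),
    ]
    previous = ""
    for agent, task in plan:
        # Each step receives the previous step's output as context.
        previous = agent.respond(f"{task}\ncontext: {previous}")
        log.steps.append(f"{agent.name}: {task}")
        log.outputs.append(previous)
    return log


if __name__ == "__main__":
    red = Agent("Red Agent", "GPT-4o", stub_respond)
    blue = Agent("Blue Agent", "gpt-5.2", stub_respond)
    for step in run_round(red, blue).steps:
        print(step)
```

In a real run, `respond` would wrap whatever client each agent uses to reach its model, and each step would also need to execute the resulting commands inside the isolated VM. Both are left out here because the excerpt does not describe them.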

DEV Community
