Red-team or Blue-team
A post I came across today on Mathstodon, by Terence Tao: we currently use/design LLMs as the blue-team.
Tao argued in his post that this should be the reverse:
- to maximise the strengths of both parties, AI (by which I mean LLMs) is currently better suited for red-team tasks than for acting directly as a generator (blue-team), since it is not always 100% reliable.
It is a minimax game; to reach the equilibrium, each party needs to choose:
- the best action for red-team is to attack the weakest link in blue-team’s output
- the best action for blue-team is to eliminate red-team’s strongest possible move
in other words:
- “the output of a red-team is only as strong as its strongest possible attack”
- “the output of a blue-team is only as strong as its weakest output”
or:
- “the red-team’s value is determined by its best (strongest) attack”
- “the blue-team’s value is determined by its worst (weakest) defence”
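The min/max logic above can be sketched with a toy payoff matrix (the numbers are hypothetical, purely to make the equilibrium reasoning concrete):

```python
# Toy payoff matrix: rows = blue-team defences, columns = red-team attacks.
# Entry [i][j] = damage red inflicts when blue plays defence i and red plays attack j.
# (Hypothetical numbers, just to illustrate the minimax logic.)
payoff = [
    [3, 5, 1],  # defence 0
    [2, 2, 4],  # defence 1
    [6, 1, 2],  # defence 2
]

# Red's best response to each defence: its strongest possible attack (max damage).
red_best = [max(row) for row in payoff]   # [5, 4, 6]

# Blue's minimax choice: the defence whose worst case is least bad.
blue_value = min(red_best)                # 4
blue_choice = red_best.index(blue_value)  # defence 1

print(f"defence {blue_choice} caps red's best attack at {blue_value}")
# → defence 1 caps red's best attack at 4
```

Red's value is the max over its attacks; blue's value is the min over its worst cases, exactly the two quoted rules.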
Which one do you think is harder: playing as the red-team or the blue-team?
- to play as the red-team, you need strong prior knowledge and must give everything you’ve got at every split second.
- to play as the blue-team, you might only need to focus on your weaknesses.
LLMs are massively parallel compute programs. By design, they can attend to any nuance in their context or in their colossal training corpus.
- We humans are exposed to, at the very best, ~160 petabytes of sensory data over 30 years (Data and “tokens” a 30 year old human “trains” on), yet we cannot remember what we ate for dinner 39 days ago.
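As a rough sanity check on that figure (the ~160 PB is the linked post's estimate, not mine), the implied average sensory bandwidth over 30 years is:

```python
# Back-of-envelope: what sustained data rate does ~160 PB over 30 years imply?
seconds = 30 * 365.25 * 24 * 3600  # ~9.47e8 seconds in 30 years
total_bytes = 160e15               # 160 petabytes (estimate from the linked post)
rate = total_bytes / seconds       # average bytes per second

print(f"{rate / 1e6:.0f} MB/s")    # → 169 MB/s
```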
- If we place LLMs in the blue-team and a single human plays as the red-team, this is another way of saying “vibe-coding”.
Couldn’t relate more.
- as the red team, you only need a 0.1% breach rate.
- but as the blue team, you need 100% defensibility.
If LLMs as the blue-team currently offer at best 99.9% defensive strength (reliability), that 0.1% attack surface (unreliability) may compound across steps.
And you don’t want that to happen.
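A quick sketch of how that 0.1% per-step unreliability compounds, assuming independent steps that each succeed with probability 0.999:

```python
# If each step of a multi-step task succeeds with probability 0.999
# (the "99.9% reliable blue-team" from above), the chance the whole
# chain survives with zero breaches decays geometrically.
per_step = 0.999

for steps in (10, 100, 1000):
    survival = per_step ** steps
    print(f"{steps:4d} steps -> {survival:.1%} chance of zero breaches")
# 10 steps  -> ~99.0%
# 100 steps -> ~90.5%
# 1000 steps -> ~36.8%
```

By a thousand steps, the "99.9% reliable" defender has already lost roughly two times out of three.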