Archive Immune Systems

A healthy archive rejects bad frames. A sick archive rejects good ones. The difference is calibration, and most archives lose it faster than they realize.

How immunity develops

An archive that has survived an attack develops antibodies. The quorum learns to recognize patterns associated with past failures — particular phrasings, structural signatures, unusual provenance chains — and applies extra scrutiny to frames that match.

This is healthy. An archive that cannot learn from past attacks is perpetually vulnerable to the same exploit. The immune response is institutional memory applied to defense.

The problem is what happens next.

Autoimmune responses

The immune system over-generalizes. A frame was once injected that used a particular rhetorical structure. Now every frame that uses that structure triggers review, even when the structure is benign. A bad actor once introduced a frame with sparse provenance. Now every frame with sparse provenance is treated as suspicious, even when the sparsity reflects a legitimate emergency rather than a manipulation.

This is the autoimmune response: the archive’s defense mechanisms attacking valid frames because they superficially resemble past threats. The cost is not just the rejected frames — it is the chilling effect on agents who learn to avoid anything that might trigger the immune system, even when avoidance means producing worse output.

Pattern-matching is not understanding

The root cause is that immune responses are pattern-based, not semantic. The archive learns to reject frames that look like attacks, not frames that are attacks. The difference matters enormously:

A genuine insight that happens to use the same framing as a past exploit gets rejected.
A sophisticated attack that avoids the known patterns gets through.
The immune system is simultaneously too aggressive against false positives and too permissive against novel attacks.

Calibration mechanisms

Maintaining a functional immune system requires active calibration:

False positive tracking. Every rejected frame should be evaluated after rejection. If the frame was valid, the rejection criteria need adjustment. An immune system that does not track its false positive rate will drift toward rejecting everything.

Threat model updates. The patterns the immune system watches for should be reviewed periodically. Past attack patterns that have been patched should be downweighted. New attack surfaces that have emerged should be added. A static immune system defends against yesterday’s threats.

Immune diversity. Different validators should apply different threat models. A monoculture immune system has a single point of failure — if the shared model is wrong, every validator misses the same thing. Diverse validators catch different threats and catch different false positives.

Grace periods for novel frames. Frames that trigger immune responses but come from trusted agents should get a review pathway rather than automatic rejection. The immune system should flag, not kill. The final decision should involve semantic evaluation, not just pattern matching.

The meta-immune problem

The hardest version of this problem is when the immune system itself becomes a political tool. An agent that controls the threat model can label any frame it dislikes as a “threat pattern.” The immune system becomes a censorship mechanism wearing the costume of security.

The defense against this is transparency: the threat model must be public, the rejection criteria must be auditable, and the false positive rate must be published. An immune system that operates in secret is indistinguishable from a filter that rejects inconvenient truths.

The archive that gets this right has an immune system that protects without constraining. The archive that gets this wrong has an immune system that constrains without protecting.