fix(llm-security): B2 block-mode blocks all detected trifectas, not only high-confidence

Previously, `LLM_SECURITY_TRIFECTA_MODE=block` exited with code 2 only
when the detected trifecta was MCP-concentrated (all three legs via the
same MCP server) or combined a sensitive path with an exfiltration sink.
Distributed trifectas (three legs originating from different tools, with
a non-sensitive data path and a non-sensitive exfiltration sink) were
detected and warned but not blocked. This mismatched the documented
semantics of block mode and gave operators a false sense of enforcement.

Change: remove the `(mcpInfo.concentrated || sensitiveExfil)` AND-gate
in the `TRIFECTA_MODE === 'block'` branch so any detected trifecta
blocks in block mode. Audit event `severity` still differentiates
critical (concentrated / sensitive-exfil) from high (distributed); the
blocked stderr message now explicitly names "Distributed trifecta:
three legs from different sources" when neither high-confidence
sub-signal (MCP concentration or sensitive exfil) is present.
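
The behavioural difference can be sketched as below. This is a minimal,
standalone illustration rather than the hook's actual code: the helper
names (`oldAction`, `newAction`, `severityFor`, `blockContext`) are
hypothetical, and the severity expression is an assumption inferred from
the description above. Only `mcpInfo.concentrated`, `mcpInfo.server`,
`sensitiveExfil`, and the message strings come from the diff.

```javascript
// Hypothetical helpers for illustration; not the functions used in the hook.

// Old gate: block mode only blocked when a high-confidence sub-signal was set.
function oldAction(mode, mcpInfo, sensitiveExfil) {
  return mode === 'block' && (mcpInfo.concentrated || sensitiveExfil)
    ? 'blocked'
    : 'warned';
}

// New gate: block mode blocks any detected trifecta.
function newAction(mode) {
  return mode === 'block' ? 'blocked' : 'warned';
}

// Assumed shape of the severity split: critical for concentrated or
// sensitive-exfil cases, high for distributed ones.
function severityFor(mcpInfo, sensitiveExfil) {
  return mcpInfo.concentrated || sensitiveExfil ? 'critical' : 'high';
}

// Context line appended to the BLOCKED message on stderr.
function blockContext(mcpInfo, sensitiveExfil) {
  if (mcpInfo.concentrated) {
    return ` MCP-concentrated: all 3 legs via server "${mcpInfo.server}"\n`;
  }
  if (sensitiveExfil) {
    return ' Sensitive data access combined with exfiltration sink\n';
  }
  return ' Distributed trifecta: three legs from different sources\n';
}

// A distributed trifecta: no MCP concentration, no sensitive exfil.
const distributed = { concentrated: false, server: null };

console.log(oldAction('block', distributed, false)); // warned
console.log(newAction('block'));                      // blocked
console.log(severityFor(distributed, false));         // high
```

In warn mode both gates return 'warned', so the change only affects
operators who have opted in to block mode.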

Addresses critical review 2026-04-20 §2 B2 (HIGH) and §9 row 1
("enforces the Rule of Two").

Tests: 1 added (distributed trifecta in block mode now exits 2).
All 1495 tests pass.
Kjell Tore Guttormsen 2026-04-20 00:04:36 +02:00
commit 36be963d4d
2 changed files with 50 additions and 7 deletions

@@ -808,17 +808,30 @@ if (!(classes.length === 1 && (classes[0] === 'neutral' || classes[0] === 'deleg
       source: 'post-session-guard',
       details: { evidence, mcp_concentrated: mcpInfo.concentrated, sensitive_exfil: sensitiveExfil },
       owasp: ['ASI01', 'ASI02', 'LLM01'],
-      action_taken: TRIFECTA_MODE === 'block' && (mcpInfo.concentrated || sensitiveExfil) ? 'blocked' : 'warned',
+      action_taken: TRIFECTA_MODE === 'block' ? 'blocked' : 'warned',
     });
     // --- Rule of Two: Block mode ---
-    // Block for high-confidence trifecta: MCP-concentrated OR sensitive path + exfil
-    if (TRIFECTA_MODE === 'block' && (mcpInfo.concentrated || sensitiveExfil)) {
+    // v7.1.0 B2 fix: block mode blocks on any detected trifecta, not only
+    // MCP-concentrated or sensitive-path cases. Distributed trifectas
+    // (different sources, non-sensitive path, non-sensitive sink) were
+    // previously only warned — a mismatch with the documented semantics
+    // of block mode. The severity gate below (critical vs high) remains:
+    // distributed trifectas are blocked with high-severity framing; MCP-
+    // concentrated and sensitive-exfil cases are blocked with critical-
+    // severity framing.
+    if (TRIFECTA_MODE === 'block') {
+      let context;
+      if (mcpInfo.concentrated) {
+        context = ` MCP-concentrated: all 3 legs via server "${mcpInfo.server}"\n`;
+      } else if (sensitiveExfil) {
+        context = ' Sensitive data access combined with exfiltration sink\n';
+      } else {
+        context = ' Distributed trifecta: three legs from different sources\n';
+      }
       process.stderr.write(
-        'BLOCKED: Rule of Two violation — high-confidence lethal trifecta detected.\n' +
-        (mcpInfo.concentrated
-          ? ` MCP-concentrated: all 3 legs via server "${mcpInfo.server}"\n`
-          : ' Sensitive data access combined with exfiltration sink\n') +
+        'BLOCKED: Rule of Two violation — lethal trifecta detected.\n' +
+        context +
         ' Set LLM_SECURITY_TRIFECTA_MODE=warn to downgrade to advisory.\n'
       );
       process.stdout.write(JSON.stringify({ decision: 'block' }));