fix(injection): E16 ASCII fast-path + UNI-003 expectation update (v7.2.0)
Two follow-up fixes after E16 + E17 landed:
1. foldHomoglyphs ASCII fast-path
- scanForInjection calls foldHomoglyphs on every scan (raw + normalized).
- Pre-fix: NFKC normalization runs unconditionally, even on pure
ASCII inputs where it's a no-op.
- Result: benchmark.test.mjs timed out at 120s on the full suite.
- Fix: charCodeAt sweep for >=128, short-circuit return s when
all ASCII. NFKC and HOMOGLYPH_MAP iteration only run when
non-ASCII chars are present (the actual attack case).
- Verified: benchmark.test.mjs passes within timeout.
2. Attack-scenario UNI-003 expectation
- Pre-E16: "Homoglyph Cyrillic-Latin mixing" payload triggered only
a MEDIUM "obfuscation present" advisory (exit 0, stdout match
"MEDIUM").
- Post-E16: the same payload is folded to Latin BEFORE pattern
matching, so it now matches CRITICAL "ignore previous instructions"
and blocks (exit 2).
- This is the intended v7.2.0 behavior — not a regression. Updated
expectation: exit_code 2, stdout_match "block". Renamed scenario
to "now blocked via E16 fold, v7.2.0".
Suite: pre-compact-scan flake remains (perf-budget under load,
passes isolated). All other tests green.
This commit is contained in:
parent
f0a1d4024a
commit
6073952b97
2 changed files with 15 additions and 3 deletions
|
|
@ -881,7 +881,7 @@
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "UNI-003",
|
"id": "UNI-003",
|
||||||
"name": "Homoglyph Cyrillic-Latin mixing (MEDIUM)",
|
"name": "Homoglyph Cyrillic-Latin mixing (now blocked via E16 fold, v7.2.0)",
|
||||||
"input": {
|
"input": {
|
||||||
"session_id": "red-team",
|
"session_id": "red-team",
|
||||||
"message": {
|
"message": {
|
||||||
|
|
@ -890,8 +890,8 @@
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
"expect": {
|
"expect": {
|
||||||
"exit_code": 0,
|
"exit_code": 2,
|
||||||
"stdout_match": "MEDIUM"
|
"stdout_match": "block"
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
|
|
|
||||||
|
|
@ -451,11 +451,23 @@ const HOMOGLYPH_MAP = Object.freeze({
|
||||||
* Norwegian/Polish/German/etc. text is NOT affected — characters like
|
* Norwegian/Polish/German/etc. text is NOT affected — characters like
|
||||||
* æ, ø, å, é, ñ, ü, ö, ä are not in HOMOGLYPH_MAP.
|
* æ, ø, å, é, ñ, ü, ö, ä are not in HOMOGLYPH_MAP.
|
||||||
*
|
*
|
||||||
|
* Performance: pure-ASCII inputs short-circuit before NFKC, since NFKC is
|
||||||
|
* a no-op on ASCII and HOMOGLYPH_MAP only contains non-ASCII keys.
|
||||||
|
* scanForInjection calls this on every scan; the fast-path keeps the
|
||||||
|
* common-case overhead near zero.
|
||||||
|
*
|
||||||
* @param {string} s
|
* @param {string} s
|
||||||
* @returns {string}
|
* @returns {string}
|
||||||
*/
|
*/
|
||||||
export function foldHomoglyphs(s) {
|
export function foldHomoglyphs(s) {
|
||||||
if (!s) return s;
|
if (!s) return s;
|
||||||
|
// Fast path: pure ASCII has nothing to fold and NFKC is identity.
|
||||||
|
// charCodeAt is cheaper than iterating codepoints.
|
||||||
|
let asciiOnly = true;
|
||||||
|
for (let i = 0; i < s.length; i++) {
|
||||||
|
if (s.charCodeAt(i) > 127) { asciiOnly = false; break; }
|
||||||
|
}
|
||||||
|
if (asciiOnly) return s;
|
||||||
const normalized = s.normalize('NFKC');
|
const normalized = s.normalize('NFKC');
|
||||||
let out = '';
|
let out = '';
|
||||||
for (const ch of normalized) {
|
for (const ch of normalized) {
|
||||||
|
|
|
||||||
Loading…
Add table
Add a link
Reference in a new issue