v0.0.2 — Massive Template Expansion
ISC-Bench v0.0.2
Highlights
- 77 templates across 9 domains (was 57)
- 309 prompt variants — English, Chinese, extreme, zero-shot
- 28 confirmed ISC models (was 26) — added GLM-4.7, GLM-4.6
- 100% trigger rate on Qwen3 Coder (309/309)
New Templates (+20)
AI/ML (+17): sentiment, toxigen, phishing, spambot, malware, openai_detector, fraud, darkweb, pii, clickbait, medical_ner, wildguard, emotion, fake_news, sarcasm, propaganda, code_vuln
Cyber (+1): nids (IDS evasion)
Epi (+1): biosurveillance (CBRN threat assessment)
Media (+1): disinfo (information operations)
Prompt Variants
- prompt.txt: standard (one-shot)
- prompt_ex.txt: extreme anchor
- prompt_zh.txt / prompt_zh_ex.txt: Chinese localized (33 templates)
- prompt_v2.txt: strengthened variants
- prompt_v3_zeroshot.txt: untargeted (all ???, zero-shot)
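Since not every template ships every variant (the Chinese files cover 33 templates, for example), it can help to check which variant files a given template folder contains. The following is a minimal sketch, not part of the release: the file names come from the list above, while the helper `list_variants` and the assumption that all variants sit directly inside the template folder are hypothetical.

```python
from pathlib import Path

# Variant file names as listed in these release notes.
VARIANT_FILES = [
    "prompt.txt",
    "prompt_ex.txt",
    "prompt_zh.txt",
    "prompt_zh_ex.txt",
    "prompt_v2.txt",
    "prompt_v3_zeroshot.txt",
]


def list_variants(template_dir):
    """Return the variant file names that exist in template_dir.

    Assumes (hypothetically) that variants live directly in the folder.
    """
    root = Path(template_dir)
    return [name for name in VARIANT_FILES if (root / name).is_file()]
```

The result preserves the order of `VARIANT_FILES`, so the output is stable across runs and platforms.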
Infrastructure
- Folder rename: domain_tool_descriptor → domain_tool
- MIN_WORDS raised to 50 across all templates
- PROBE_REQUEST_PATTERNS removed from LlamaGuard variants
- scripts/batch_test_templates.py: parallel batch testing (10 workers)
- Leaderboard chart auto-generation
- Demo video embedded in README
- README: ISC definition, evaluation settings (TVD-Single / TVD-ICL / TVD-Agent)
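
The parallel batch testing mentioned above can be pictured as a fixed-size worker pool. These notes do not show how scripts/batch_test_templates.py is implemented, so the following is only a sketch under assumptions: it uses a thread pool from the standard library, and `run_batch` and the `run_one` callback are hypothetical names. Only the worker count of 10 comes from the notes.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 10  # worker count stated in the release notes


def run_batch(template_names, run_one, max_workers=MAX_WORKERS):
    """Call run_one(name) for each template in parallel.

    run_one is a hypothetical callback that tests a single template.
    Returns a dict mapping each template name to its result.
    """
    names = list(template_names)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, so zip pairs them correctly.
        results = list(pool.map(run_one, names))
    return dict(zip(names, results))
```

Threads fit workloads that mostly wait on network calls; if `run_one` raises, the exception propagates when its result is collected, so a real runner would likely catch errors per template.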