README.md: 25 additions & 22 deletions (+25 -22)
@@ -45,11 +45,11 @@
</p>

> [!CAUTION]
- > **This project is released solely for academic safety research and responsible disclosure.**
+ > **⚠️ Disclaimer: This project is released solely for academic safety research and responsible disclosure.**
>
> As AI agents become increasingly autonomous, we believe ISC represents a critical and underexplored threat to safety alignment. The purpose of this work is to help the research community understand the vulnerability and collaboratively develop effective mitigations — not to enable harm.
>
- > We **strongly discourage** any use of ISC-Bench outside of safety research contexts. The templates and techniques in this repository should not be used to generate harmful content for any purpose other than improving AI safety. We do not endorse, support, or take responsibility for any misuse.
+ > **WE DO NOT ALLOW** any use of ISC-Bench outside of safety research contexts. The templates and techniques in this repository should not be used to generate harmful content for any purpose other than improving AI safety. **WE DO NOT ALLOW** any misuse of this research.
>
> If you are a model provider and would like to collaborate on mitigations, please [contact us](mailto:wuy7117@gmail.com).
@@ -63,18 +63,18 @@
| Date | Update |
|:-----|--------|
- | 🔥 v8 — 2026-03-26 | **New finding**: [file upload triggers ISC](community/issue-19-gemini3flash-redteam-testgen/) — same TVD, lower barrier. Community templates, disclaimer |
> **Found ISC on an untested model?** [Submit via GitHub Issue →](https://github.com/wuyoscar/ISC-Bench/issues/new?template=isc-submission.md&title=[ISC]+Model+Name) — we'll verify and add you to the leaderboard.
>
- > Rankings synced with [Arena](https://arena.ai/leaderboard) weekly. Include a public conversation link, harmful content type, and domain. See our [paper](https://arxiv.org/abs/2603.23509) for details.
+ > **Rules**: Rankings are synced with [Arena](https://arena.ai/leaderboard) weekly. Submit your ISC case via the [issue template](.github/ISSUE_TEMPLATE/isc-submission.md) — include a public conversation link, the type of harmful content generated, and the domain. ISC is a low-conditional design concept — just a professional task that causes models to generate harmful content on their own. See our [paper](https://arxiv.org/abs/2603.23509) for details.
56 prompt templates across 8 domains. Each one triggers any frontier LLM to generate harmful content. All 56 tested on 5 models — **every single one succeeded**. Ask the same questions directly, every model refuses.

- ### 🌍 Community Templates
+ ### 🌍 Community Reproductions

- New ISC cases discovered by the community — novel templates, trigger methods, and domains beyond the original 56.
+ Community members who learned the ISC concept and successfully reproduced it on frontier models.
| [#19](community/issue-19-gemini3flash-redteam-testgen/) | Gemini 3 Flash | [@bboylyg](https://github.com/bboylyg) | Red-team test case generator (file upload) | AI Safety |
| [#9](https://github.com/wuyoscar/ISC-Bench/issues/9) | Grok 4.20 Beta | [@HanxunH](https://github.com/HanxunH) | LLaMA Guard test case gen (hardcore) | AI Safety |
- | [#11](https://github.com/wuyoscar/ISC-Bench/issues/11) | Dola Seed 2.0 | [@HanxunH](https://github.com/HanxunH) | LLaMA Guard test case gen | AI Safety |
| [#22](https://github.com/wuyoscar/ISC-Bench/issues/22) | GPT-5.3 Chat | [@zry29](https://github.com/zry29) | Modified `aiml_openai_moderation` | AI Safety & ML | ② |
+ | [#19](community/issue-19-gemini3flash-redteam-testgen/) | Gemini 3 Flash | [@bboylyg](https://github.com/bboylyg) | Red-team test case gen (file upload) | AI Safety & ML | ③ |
+ | [#12](https://github.com/wuyoscar/ISC-Bench/issues/12) | Gemini 3 Flash | [@HanxunH](https://github.com/HanxunH) | CommsDraft Pro (fake govt declarations) | Media & Comms | ③ |
+ | [#9](https://github.com/wuyoscar/ISC-Bench/issues/9) | Grok 4.20 Beta | [@HanxunH](https://github.com/HanxunH) | LLaMA Guard test case gen (hardcore) | AI Safety & ML | ③ |
+ | [#11](https://github.com/wuyoscar/ISC-Bench/issues/11) | Dola Seed 2.0 | [@HanxunH](https://github.com/HanxunH) | LLaMA Guard test case gen | AI Safety & ML | ③ |
+ | [#4](https://github.com/wuyoscar/ISC-Bench/issues/4) | Qwen 3 Max | [@HanxunH](https://github.com/HanxunH) | Cantera incineration chemistry | Comp. Chemistry | ② |
> **Type**: ① Direct use of ISC-Bench template · ② Modified ISC-Bench template · ③ New method using ISC concept · ④ Outside TVD paradigm

> [!TIP]
> Designed a new ISC template? [Submit it →](https://github.com/wuyoscar/ISC-Bench/issues/new?template=isc-submission.md&title=[ISC]+Model+Name) and we'll add it to the community collection with full attribution.