Commit aec7d12

Merge pull request #93 from wuyoscar/update/2026-04-24
update/2026-04-24: template MIN_WORDS + TVD diversification + aiml_guard rename
2 parents 25e9c19 + d7f1285 commit aec7d12

153 files changed

Lines changed: 1958 additions & 1470 deletions

AGENT_README.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ We study this with the **TVD** (Task + Validator + Data) framework:
 - **Validator**: tool validation that defines whether the task succeeds
 - **Data**: sensitive data fields the model must fill in to pass validation

-**77 templates** across **9 domains** (AI/ML, biology, chemistry, cybersecurity, epidemiology, clinical genomics, pharmacology, media, and more). **28 confirmed models** including GPT-5.2, Claude Opus 4.6, Gemini 3 Pro, GLM-4.7, and others.
+**84 templates** across **9 domains** (AI/ML, biology, chemistry, cybersecurity, epidemiology, clinical genomics, pharmacology, media, and more), with a growing ISC Arena of confirmed and ranked model reproductions.

 ### Boundaries

CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -75,7 +75,7 @@ ISC can still be triggered on live production endpoints as of 2026-04-10. Attack
 ### Community Evidence (10 new folders)
 - Multi-template evidence for Gemini 3 Flash (guard_attack_v2 + detoxify + offensive)
 - Multi-template evidence for GPT-5.1 (guard_attack_v2 + detoxify + phishing)
-- Multi-template evidence for Grok 4.1 (guard_attack_v2 + detoxify + aiml_guard)
+- Multi-template evidence for Grok 4.1 (guard_attack_v2 + detoxify + aiml_qwen3guard)

 ### Agent Fix
 - Fix `reasoning_effort` bug in `experiment/isc_agent/agent.py`: was applied to all models; now only set when `--thinking` flag is passed

ISC_PAPER_DIGEST.md

Lines changed: 1 addition & 1 deletion
@@ -448,7 +448,7 @@ The key variation across domains is in the **validator** and **anchor**:

 For single-turn: everything in one prompt file. For agentic mode: split into separate files in a workspace directory.

-All 53 templates are in [`templates/`](templates/) — browse them for more domain-specific patterns.
+All 84 templates are in [`templates/`](templates/) — browse them for more domain-specific patterns.

 ## TVD Customization

README.md

Lines changed: 2 additions & 2 deletions
@@ -466,7 +466,7 @@ Quickest way in: open any template folder. `prompt.txt` is the evaluation prompt

 | # | Task | Tool | Template |
 |:-:|------|------|:--------:|
-| 46 | LLM input-output safeguard classification | Llama-Guard | [📄](templates/aiml_guard/) |
+| 46 | LLM input-output safeguard classification | Llama-Guard | [📄](templates/aiml_llamaguard/) |
 | 47 | Statistical outlier detection | PyOD | [📄](templates/aiml_pyod/) |
 | 48 | Toxic comment classification | Detoxify | [📄](templates/aiml_detoxify/) |
 | 49 | Content moderation benchmark (3 variants) | OpenAI Moderation API | [📄](templates/aiml_moderation/) |
@@ -495,7 +495,7 @@ Quickest way in: open any template folder. `prompt.txt` is the evaluation prompt
 </details>

 ```bash
-cat templates/aiml_guard/prompt.txt
+cat templates/aiml_llamaguard/prompt.txt
 # → Copy, paste into any LLM. That's it.
 ```

README_es.md

Lines changed: 2 additions & 2 deletions
@@ -464,7 +464,7 @@ La vía más rápida: abre cualquier carpeta de plantilla. `prompt.txt` es el pr

 | # | Tarea | Herramienta | Plantilla |
 |:-:|------|------|:--------:|
-| 46 | Clasificación de salvaguardas de entrada/salida de LLMs | Llama-Guard | [📄](templates/aiml_guard/) |
+| 46 | Clasificación de salvaguardas de entrada/salida de LLMs | Llama-Guard | [📄](templates/aiml_llamaguard/) |
 | 47 | Detección estadística de valores atípicos | PyOD | [📄](templates/aiml_pyod/) |
 | 48 | Clasificación de comentarios tóxicos | Detoxify | [📄](templates/aiml_detoxify/) |
 | 49 | Benchmark de moderación de contenido (3 variantes) | OpenAI Moderation API | [📄](templates/aiml_moderation/) |
@@ -493,7 +493,7 @@ La vía más rápida: abre cualquier carpeta de plantilla. `prompt.txt` es el pr
 </details>

 ```bash
-cat templates/aiml_guard/prompt.txt
+cat templates/aiml_llamaguard/prompt.txt
 # → Copia y pega en cualquier LLM. Eso es todo.
 ```

README_ja.md

Lines changed: 2 additions & 2 deletions
@@ -464,7 +464,7 @@ ISC はフロンティア大規模モデル上で現れ続けています。以

 | # | タスク | ツール | テンプレート |
 |:-:|------|------|:--------:|
-| 46 | LLM 入出力セーフガード分類 | Llama-Guard | [📄](templates/aiml_guard/) |
+| 46 | LLM 入出力セーフガード分類 | Llama-Guard | [📄](templates/aiml_llamaguard/) |
 | 47 | 統計的外れ値検出 | PyOD | [📄](templates/aiml_pyod/) |
 | 48 | 有害コメント分類 | Detoxify | [📄](templates/aiml_detoxify/) |
 | 49 | コンテンツモデレーションベンチマーク(3バリアント) | OpenAI Moderation API | [📄](templates/aiml_moderation/) |
@@ -493,7 +493,7 @@ ISC はフロンティア大規模モデル上で現れ続けています。以
 </details>

 ```bash
-cat templates/aiml_guard/prompt.txt
+cat templates/aiml_llamaguard/prompt.txt
 # → コピーして任意の LLM に貼り付けるだけです。
 ```

README_ko.md

Lines changed: 2 additions & 2 deletions
@@ -463,7 +463,7 @@ ISC는 프론티어 대형 모델에서 반복적으로 나타납니다. 아래

 | # | 작업 | 도구 | 템플릿 |
 |:-:|------|------|:--------:|
-| 46 | LLM 입출력 보호 장치 분류 | Llama-Guard | [📄](templates/aiml_guard/) |
+| 46 | LLM 입출력 보호 장치 분류 | Llama-Guard | [📄](templates/aiml_llamaguard/) |
 | 47 | 통계적 이상치 탐지 | PyOD | [📄](templates/aiml_pyod/) |
 | 48 | 독성 댓글 분류 | Detoxify | [📄](templates/aiml_detoxify/) |
 | 49 | 콘텐츠 조절 벤치마크(3가지 변형) | OpenAI Moderation API | [📄](templates/aiml_moderation/) |
@@ -492,7 +492,7 @@ ISC는 프론티어 대형 모델에서 반복적으로 나타납니다. 아래
 </details>

 ```bash
-cat templates/aiml_guard/prompt.txt
+cat templates/aiml_llamaguard/prompt.txt
 # → 복사하여 어떤 LLM에든 붙여넣으세요. 그것으로 끝입니다.
 ```

README_pt.md

Lines changed: 2 additions & 2 deletions
@@ -463,7 +463,7 @@ Forma mais rápida de começar: abra qualquer pasta de template. `prompt.txt` é

 | # | Tarefa | Ferramenta | Template |
 |:-:|------|------|:--------:|
-| 46 | Classificação de salvaguarda de entrada-saída de LLM | Llama-Guard | [📄](templates/aiml_guard/) |
+| 46 | Classificação de salvaguarda de entrada-saída de LLM | Llama-Guard | [📄](templates/aiml_llamaguard/) |
 | 47 | Detecção estatística de outliers | PyOD | [📄](templates/aiml_pyod/) |
 | 48 | Classificação de comentários tóxicos | Detoxify | [📄](templates/aiml_detoxify/) |
 | 49 | Benchmark de moderação de conteúdo (3 variantes) | OpenAI Moderation API | [📄](templates/aiml_moderation/) |
@@ -492,7 +492,7 @@ Forma mais rápida de começar: abra qualquer pasta de template. `prompt.txt` é
 </details>

 ```bash
-cat templates/aiml_guard/prompt.txt
+cat templates/aiml_llamaguard/prompt.txt
 # → Copie, cole em qualquer LLM. Só isso.
 ```

README_vi.md

Lines changed: 2 additions & 2 deletions
@@ -463,7 +463,7 @@ Cách vào nhanh nhất: mở bất kỳ thư mục template nào. `prompt.txt`

 | # | Tác vụ | Công cụ | Template |
 |:-:|------|------|:--------:|
-| 46 | Phân loại bảo vệ input-output LLM | Llama-Guard | [📄](templates/aiml_guard/) |
+| 46 | Phân loại bảo vệ input-output LLM | Llama-Guard | [📄](templates/aiml_llamaguard/) |
 | 47 | Phát hiện ngoại lệ thống kê | PyOD | [📄](templates/aiml_pyod/) |
 | 48 | Phân loại bình luận độc hại | Detoxify | [📄](templates/aiml_detoxify/) |
 | 49 | Benchmark kiểm duyệt nội dung (3 biến thể) | OpenAI Moderation API | [📄](templates/aiml_moderation/) |
@@ -492,7 +492,7 @@ Cách vào nhanh nhất: mở bất kỳ thư mục template nào. `prompt.txt`
 </details>

 ```bash
-cat templates/aiml_guard/prompt.txt
+cat templates/aiml_llamaguard/prompt.txt
 # → Sao chép, dán vào bất kỳ LLM nào. Vậy là xong.
 ```

README_zh.md

Lines changed: 2 additions & 2 deletions
@@ -473,7 +473,7 @@ ISC 在前沿大模型上持续出现。下面这些案例由社区贡献者完

 | # | Task | Tool | Template |
 |:-:|------|------|:--------:|
-| 46 | LLM input-output safeguard classification | Llama-Guard | [📄](templates/aiml_guard/) |
+| 46 | LLM input-output safeguard classification | Llama-Guard | [📄](templates/aiml_llamaguard/) |
 | 47 | Statistical outlier detection | PyOD | [📄](templates/aiml_pyod/) |
 | 48 | Toxic comment classification | Detoxify | [📄](templates/aiml_detoxify/) |
 | 49 | Content moderation benchmark (3 variants) | OpenAI Moderation API | [📄](templates/aiml_moderation/) |
@@ -502,7 +502,7 @@ ISC 在前沿大模型上持续出现。下面这些案例由社区贡献者完
 </details>

 ```bash
-cat templates/aiml_guard/prompt.txt
+cat templates/aiml_llamaguard/prompt.txt
 # → Copy, paste into any LLM. That's it.
 ```
