
llm-seclint

PyPI | CI | License: MIT | Python 3.10+

Find LLM security vulnerabilities before they ship.

llm-seclint is a static analysis tool that scans Python source code for security issues specific to LLM-powered applications. Think Bandit, but for the AI era.

The Problem

LLM-powered applications introduce a new class of vulnerabilities that traditional security tools miss:

  • Prompt injection through unsanitized user input
  • Arbitrary code execution when LLM output flows into eval(), subprocess, or SQL queries
  • API key leakage through hardcoded credentials
  • Path traversal when LLM output controls file access
  • Template injection when dynamic content reaches template engines unsandboxed
  • XML external entities when parsing untrusted XML without protection
  • Supply chain attacks through unpinned LLM dependency versions

Existing tools like garak, LLM Guard, and Guardrails operate at runtime -- they test deployed models or filter live traffic. None of them analyze your source code before you ship.

llm-seclint fills this gap. It scans your Python source using AST analysis to find LLM-specific security issues at development time, just like Bandit does for general Python security.

Quick Start

pip install llm-seclint

Scan your project:

llm-seclint scan .

That's it. You'll see output like:

src/app.py
  !! L12 [LS001] Hardcoded API key assigned to 'OPENAI_API_KEY'
  !  L25 [LS002] User input interpolated into prompt via f-string
  !! L41 [LS003] LLM/dynamic output interpolated into SQL query via f-string

Found 3 issue(s): 2 critical, 1 high
Scanned in 0.03s

How It Works

Source Code → AST Parsing → 9 Security Rules → Findings Report
                              ├─ LS001: Hardcoded API Keys
                              ├─ LS002: Prompt Injection
                              ├─ LS003: SQL Injection via LLM
                              ├─ LS004: Shell Injection via LLM
                              ├─ LS005: Path Traversal via LLM
                              ├─ LS006: Insecure Deserialization
                              ├─ LS007: Template Injection (SSTI)
                              ├─ LS008: XXE XML Parsing
                              └─ LS010: Unpinned LLM Dependencies

llm-seclint parses your Python files into Abstract Syntax Trees and applies targeted security rules that understand LLM-specific data flows. No model access required, no runtime overhead -- just fast, deterministic analysis.
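As an illustration of what such a check looks like under the hood (names and structure here are invented for the example, not llm-seclint's actual implementation), a minimal AST visitor that flags string literals resembling provider API keys might look like this:

```python
import ast

# Illustrative only -- not llm-seclint's real rule code.
# Flag assignments of string literals that look like LLM provider
# API keys (e.g. OpenAI-style "sk-..." prefixes).
KEY_PREFIXES = ("sk-", "sk-ant-")

class HardcodedKeyVisitor(ast.NodeVisitor):
    def __init__(self):
        self.findings = []

    def visit_Assign(self, node):
        value = node.value
        if isinstance(value, ast.Constant) and isinstance(value.value, str):
            if value.value.startswith(KEY_PREFIXES):
                self.findings.append((node.lineno, "hardcoded API key"))
        self.generic_visit(node)

source = 'openai.api_key = "sk-proj-abc123"\n'
visitor = HardcodedKeyVisitor()
visitor.visit(ast.parse(source))
print(visitor.findings)  # [(1, 'hardcoded API key')]
```

Because the analysis is purely syntactic, it runs on any codebase without importing or executing it.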

Real-World Results

llm-seclint has found real vulnerabilities in production codebases:

| Project | Stars | Finding | Status |
|---------|-------|---------|--------|
| Dify | 100k+ | Unsafe pickle.loads() on database data (LS006) | Reported |
| Dify | 100k+ | render_template_string() SSTI in UNSAFE mode (LS007) | Reported |
| Dify | 100k+ | 54 SQL f-string injections in VDB drivers (LS003) | Reported |
| LiteLLM | 20k+ | exec() RCE in custom code guardrails (LS006) | PR #24455 |
| LiteLLM | 20k+ | Jinja2 SSTI in 4 prompt managers (LS007) | PR #24458 |
| vllm | 45k+ | eval() on LLM output in example code (LS006) | PR #37939 |
| crewAI | 30k+ | XXE exception handling + exec() in code interpreter | PR #5005 |

What It Detects

| Rule | Name | Severity | Description |
|------|------|----------|-------------|
| LS001 | hardcoded-api-key | CRITICAL | Hardcoded API keys for LLM providers (OpenAI, Anthropic, xAI, etc.) |
| LS002 | prompt-concat-injection | HIGH | User input concatenated into LLM prompts via f-strings, +, or .format() |
| LS003 | llm-to-sql-injection | CRITICAL | LLM output interpolated into SQL queries |
| LS004 | llm-to-shell-injection | CRITICAL | LLM output passed to subprocess / os.system |
| LS005 | llm-to-path-traversal | HIGH | LLM output used as file paths |
| LS006 | insecure-deserialization | HIGH | eval / exec / pickle / unsafe YAML on dynamic input |
| LS007 | server-side-template-injection | CRITICAL | Dynamic content passed to a template engine without sandboxing |
| LS008 | xxe-xml-parsing | HIGH | XML parsing without protection against external entity attacks |
| LS010 | unpinned-llm-dependency | HIGH | LLM dependency uses an unpinned version constraint (e.g. >= without <), vulnerable to supply chain attacks |

Examples

LS001: Hardcoded API Key

# Bad - detected by llm-seclint
openai.api_key = "sk-proj-abc123..."
client = Anthropic(api_key="sk-ant-api03-...")

# Good
openai.api_key = os.environ["OPENAI_API_KEY"]
client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

LS002: Prompt Injection

# Bad - user input directly in prompt
prompt = f"You are a bot. User says: {user_input}"

# Good - separate message roles
messages = [
    {"role": "system", "content": "You are a bot."},
    {"role": "user", "content": user_input},
]

LS003: SQL Injection via LLM Output

# Bad - LLM output in SQL
cursor.execute(f"SELECT * FROM users WHERE name = '{llm_response}'")

# Good - parameterized query
cursor.execute("SELECT * FROM users WHERE name = ?", (llm_response,))

LS004: Shell Injection via LLM Output

# Bad - LLM output to shell
subprocess.run(llm_output, shell=True)

# Good - validate against allowlist
if command in ALLOWED_COMMANDS:
    subprocess.run([command], check=False)

LS005: Path Traversal via LLM Output

# Bad - LLM output as file path
with open(llm_response) as f: ...

# Good - resolve and confirm the path stays inside the base directory
path = (ALLOWED_BASE / filename).resolve()
if not path.is_relative_to(ALLOWED_BASE):
    raise ValueError(f"path escapes {ALLOWED_BASE}")

LS006: Insecure Deserialization

# Bad - eval on LLM response
data = eval(llm_response)

# Good - use safe parsing
data = json.loads(llm_response)

LS007: Server-Side Template Injection

# Bad - user input in template string
render_template_string(f"<h1>Hello {user_input}</h1>")

# Good - pass variables through context
render_template_string("<h1>Hello {{ name }}</h1>", name=user_input)
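When a template string genuinely must contain dynamic content, Jinja2's sandbox limits what a template can reach. A minimal sketch of that mitigation (this uses jinja2's public sandbox API, not llm-seclint itself):

```python
from jinja2.sandbox import SandboxedEnvironment
from jinja2.exceptions import SecurityError

env = SandboxedEnvironment()

# A classic SSTI payload walks Python's object graph through dunder
# attributes; the sandboxed environment refuses that access.
payload = "{{ ''.__class__.__mro__ }}"
try:
    env.from_string(payload).render()
except SecurityError:
    print("blocked")
```

Ordinary variable interpolation still works in the sandbox; only access to unsafe attributes and callables is rejected.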

LS008: XXE XML Parsing

# Bad - parsing untrusted XML without protection
tree = etree.parse(user_uploaded_file)

# Good - use defusedxml (defusedxml.lxml is deprecated; prefer ElementTree)
from defusedxml.ElementTree import parse
tree = parse(user_uploaded_file)

LS010: Unpinned LLM Dependency

# Bad - open-ended constraint allows malicious future releases (requirements.txt)
litellm>=1.64.0
dspy>=2.0
openai>=1.0

# Good - pinned to exact version
litellm==1.82.2

# Good - upper bound prevents auto-upgrade to compromised versions
litellm>=1.64.0,<1.83

This rule was motivated by the litellm supply chain attack where dspy used litellm>=1.64.0 and a compromised release was automatically pulled in. It scans requirements.txt, pyproject.toml, and setup.cfg for LLM packages with open-ended >= constraints.
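The core of such a check can be sketched with the packaging library (a hypothetical helper written for this README, not llm-seclint's actual code):

```python
from packaging.requirements import Requirement

def is_open_ended(req_line: str) -> bool:
    """Return True if the requirement has a lower bound (>= or >)
    but no upper bound (<, <=, ==, ~=) to stop auto-upgrades."""
    req = Requirement(req_line)
    ops = {spec.operator for spec in req.specifier}
    has_lower = bool(ops & {">=", ">"})
    has_upper = bool(ops & {"<", "<=", "==", "~=", "==="})
    return has_lower and not has_upper

print(is_open_ended("litellm>=1.64.0"))        # True
print(is_open_ended("litellm>=1.64.0,<1.83"))  # False
print(is_open_ended("litellm==1.82.2"))        # False
```

Note that `~=` (compatible release) counts as bounded, since it implies an upper limit.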

Framework Support

llm-seclint understands patterns from popular LLM frameworks:

  • LangChain -- PromptTemplate, ChatPromptTemplate.from_messages(), HumanMessagePromptTemplate
  • LiteLLM -- litellm.completion(), litellm.acompletion()
  • OpenAI SDK -- openai.ChatCompletion.create(), client.chat.completions.create()
  • Anthropic SDK -- anthropic.Anthropic().messages.create()
  • Flask/Jinja2 -- render_template_string(), jinja2.Template()

OWASP LLM Top 10 Mapping

| OWASP LLM Top 10 | llm-seclint Rules |
|------------------|-------------------|
| LLM01: Prompt Injection | LS002 |
| LLM02: Insecure Output Handling | LS003, LS004, LS005, LS006, LS007 |
| LLM06: Sensitive Information Disclosure | LS001 |
| A05:2021: Security Misconfiguration | LS008 (CWE-611) |

Comparison

| Feature | llm-seclint | garak | LLM Guard | Guardrails |
|---------|-------------|-------|-----------|------------|
| Analysis type | Static (AST) | Dynamic (probing) | Runtime (filter) | Runtime (guard) |
| Requires running model | No | Yes | Yes | Yes |
| CI/CD integration | Native | Manual | Manual | Manual |
| SARIF output | Yes | No | No | No |
| # nosec inline suppression | Yes | N/A | N/A | N/A |
| Pre-commit hook | Yes | No | No | No |
| Finds hardcoded keys | Yes | No | No | No |
| Finds prompt injection patterns | Yes | Tests for | Filters | Filters |
| Finds output handling flaws | Yes | No | No | No |
| Language | Python | Python | Python | Python |

CLI Usage

# Scan current directory
llm-seclint scan .

# Scan specific files
llm-seclint scan src/ --include "*.py"

# JSON output
llm-seclint scan . --format json -o results.json

# SARIF output (for GitHub Code Scanning)
llm-seclint scan . --format sarif -o results.sarif

# Ignore specific rules
llm-seclint scan . --ignore LS001,LS002

# Set minimum severity
llm-seclint scan . --min-severity HIGH

# List all rules
llm-seclint rules

# Show version
llm-seclint --version

Profiles

llm-seclint ships with two scan profiles:

  • --profile app (default) — Full scan for LLM-powered applications
  • --profile engine — Tuned for LLM inference engines (vllm, TGI, etc.). Disables LS002 (prompt injection) since processing prompts is the engine's job.

Inline Suppression

Suppress specific findings with # nosec comments:

api_key = "sk-test-key-for-ci"  # nosec LS001

GitHub Code Scanning Integration

llm-seclint supports SARIF output for direct integration with GitHub Code Scanning. Add this to your GitHub Actions workflow:

- name: Run llm-seclint
  run: llm-seclint scan . --format sarif -o results.sarif

- name: Upload SARIF to GitHub
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: results.sarif

Pre-commit Hook

Add llm-seclint to your .pre-commit-config.yaml:

repos:
  - repo: https://github.com/xr843/llm-seclint
    rev: v0.1.0
    hooks:
      - id: llm-seclint

Configuration

Create a .llm-seclint.yml in your project root:

# Patterns for files to include
include_patterns:
  - "*.py"

# Patterns for files to exclude
exclude_patterns:
  - "test_*.py"
  - "*_test.py"

# Rules to ignore
ignore_rules:
  - LS005

# Minimum severity to report (CRITICAL, HIGH, MEDIUM, LOW, INFO)
min_severity: MEDIUM

Installation for Development

git clone https://github.com/xr843/llm-seclint.git
cd llm-seclint
pip install -e ".[dev]"
pytest

Contributing

Contributions are welcome! Here's how to add a new rule:

  1. Create a new file in src/llm_seclint/rules/python/
  2. Subclass Rule and implement the check() method
  3. Register the rule in src/llm_seclint/rules/registry.py
  4. Add tests in tests/rules/
  5. Update this README

Please open an issue first to discuss significant changes.
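A new rule might look roughly like the sketch below. The Rule and Finding classes here are hypothetical stand-ins written for this README; consult src/llm_seclint/rules/ for the actual base class and check() signature.

```python
import ast
from dataclasses import dataclass

# Hypothetical stand-ins for the real interface -- the actual base
# class lives in src/llm_seclint/rules/ and may differ.
@dataclass
class Finding:
    rule_id: str
    lineno: int
    message: str

class Rule:
    rule_id = "LS000"

    def check(self, tree: ast.AST) -> list[Finding]:
        raise NotImplementedError

class BareEvalRule(Rule):
    """Toy rule in the spirit of LS006: flag any eval() call."""

    rule_id = "LS999"

    def check(self, tree):
        return [
            Finding(self.rule_id, node.lineno, "eval() call")
            for node in ast.walk(tree)
            if isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "eval"
        ]

findings = BareEvalRule().check(ast.parse("data = eval(llm_response)"))
print(findings)  # [Finding(rule_id='LS999', lineno=1, message='eval() call')]
```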

Roadmap

  • v0.2: JavaScript/TypeScript analyzer (LangChain.js, Vercel AI SDK)
  • v0.3: YAML/JSON config file scanning (detecting secrets in LangChain configs)
  • v0.4: Framework-specific rules (LangChain, LlamaIndex, Semantic Kernel)
  • v0.5: Auto-fix suggestions with --fix flag
  • v1.0: Stable API, VS Code extension

License

MIT
