AI Slop Detector v2.6.1 Self-Audit: I planted 3 bad files — did it catch them? #32
flamehaven01
announced in
Announcements
AI code can look “production-ready” while implementing almost nothing.
Trace the execution paths, though, and there’s barely any load-bearing logic.
That’s AI slop: not broken code, just convincingly empty code.
I built AI Slop Detector to turn that vague “this feels off” into inspectable signals.
Then I had to face the uncomfortable credibility test:
Would the tool accuse my own repo?
v2.6.1 is a trust release
This release is about auditability and stability:
The goal: make the tool harder to regress, and easier to trust.
The self-audit
I ran the detector on its own codebase the same way anyone would:
slop-detector --project .
Overall result: CLEAN
Verbatim report note: I’m publishing the complete self-inspection report as a PDF exactly as generated — no edits, no redactions, no reformatting, and no interpretive changes.
“CLEAN” didn’t mean “perfect.”
It meant: there’s enough real implementation here that review is worthwhile.
The real experiment: I planted 3 “known-bad” files
A score alone is easy to doubt.
So I did something deliberate: I planted three intentionally-bad fixtures inside the repo — on purpose — so I’d know whether the detector detonates for the right reasons.
1) Empty shell
A 0-line file scored 100.00, and the report explains it plainly:
Translation: sometimes code “exists” without doing anything — and that should be visible.
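To make "exists without doing anything" concrete, here is a minimal sketch of one way to measure it. This is not how AI Slop Detector computes its score; the function names `count_load_bearing_statements` and `is_empty_shell` are hypothetical, and the rule (ignore `pass`, `...`, and docstrings) is an assumption chosen for illustration.

```python
import ast


def count_load_bearing_statements(source: str) -> int:
    """Count statements that are not pass, '...', or a docstring (illustrative rule)."""
    count = 0
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.stmt):
            continue
        # Definitions are containers; only what is inside them counts.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            continue
        if isinstance(node, ast.Pass):
            continue
        # A bare constant expression covers docstrings and the `...` placeholder.
        if isinstance(node, ast.Expr) and isinstance(node.value, ast.Constant):
            continue
        count += 1
    return count


def is_empty_shell(source: str) -> bool:
    """True when the file parses but contains no load-bearing statements."""
    return count_load_bearing_statements(source) == 0
```

Under this rule a 0-line file and a file full of documented stubs both come out as empty shells. A body that only raises `NotImplementedError` would still count as one statement, so a real check would likely need more cases than this sketch handles.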
2) Dangerous structure (structural_issues.py)
Score 71.21, flagging patterns that rot systems quietly:
Translation: code can run and still become hard to trust or debug later.
3) Buzzword + slop cocktail (generated_slop.py)
Score 96.77, triggering jargon like:
…alongside classic slop patterns.
Translation: impressive language and weak substance often travel together.
The twist
It even flagged my own wording: “production-ready” inside cli.py.
The tool didn’t just audit code.
It audited claims.
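Auditing claims can be as simple as looking for confident wording in the source text. The sketch below is an assumption about the general approach, not the detector's actual rule set: `find_claims` and `CLAIM_TERMS` are hypothetical names, and only "production-ready" in the term list comes from this post.

```python
import re

# Hypothetical term list; only "production-ready" is taken from the post.
CLAIM_TERMS = ("production-ready", "enterprise-grade", "state-of-the-art")


def find_claims(source: str, terms=CLAIM_TERMS):
    """Return (line number, matched term) pairs for every claim term found."""
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((lineno, match.group(0).lower()))
    return hits
```

A hit is not a verdict. It only marks a line where the wording promises something the reviewer should then verify against the code.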
Why this is the AI-era review problem
Most tools answer yesterday’s questions:
AI-era workflows need a prior question:
Is there meaningful implementation here at all?
AI Slop Detector isn’t a verdict machine.
It’s a review signal that turns “looks good, but…” into measurable follow-ups.
Quick start
CI Gate modes (soft / hard / quarantine)
If you want one fast review question
If the answer circles back to docstring promises instead of concrete behavior, it’s usually scaffolding.
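One rough way to see "docstring promises instead of concrete behavior" is to put the two side by side for each function. This is a sketch under stated assumptions, not the tool's metric: `docstring_vs_logic` is a hypothetical helper, and counting docstring words against top-level body statements is an illustrative proxy.

```python
import ast


def docstring_vs_logic(source: str) -> list:
    """Return (function name, docstring word count, body statement count) tuples."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        doc = ast.get_docstring(node)
        promise_words = len(doc.split()) if doc else 0
        behavior = 0
        for stmt in node.body:
            # Skip placeholders: pass, `...`, and the docstring itself.
            if isinstance(stmt, ast.Pass):
                continue
            if isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Constant):
                continue
            behavior += 1
        results.append((node.name, promise_words, behavior))
    return results
```

A function with many docstring words and zero body statements is a candidate for the scaffolding described above; where to draw the line is a judgment call for the reviewer.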
If you try it
I’d love real-world cases: