LueBangs-coder/README.md

Hi, I'm Luis 👋

Learner and developer focused on AI safety and agent reliability. I'm teaching myself to build tools that hold AI systems to the same bar we'd hold an engineer to: verify before you claim "done."

Founder & Operator, Onslaught Gaming LLC.


🔭 What I'm building: Nemesis

A Python evaluation harness that turns real, observed AI-agent failure modes into automated detectors. When an agent reports success, Nemesis checks whether it actually verified the work (the tests, the files, the repository state) and reports the truth, with evidence.

  • 20 detectors grounded in a documented failure-mode catalog
  • Ships three ways: a CLI, a GitHub Action, and a pip package
  • Tokenless OIDC publishing, CodeQL + dependency scanning, full test suite, green CI

I'm building Nemesis in the open partly to learn: the best way to understand how an AI-safety harness works is to build one that actually runs.
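
To make the "check the claim against the real state" idea concrete, here is a minimal sketch of what a detector of this kind might look like. It is not Nemesis's actual code or API: the `Finding` type, both function names, and the "look for `pytest` in the command log" heuristic are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Finding:
    """Result of one detector run (hypothetical shape, not Nemesis's real type)."""
    detector: str
    passed: bool
    evidence: str


def detect_unverified_files(claimed_files, repo_root):
    """Flag files the agent claims to have written that are absent on disk."""
    root = Path(repo_root)
    missing = sorted(name for name in claimed_files if not (root / name).is_file())
    if missing:
        return Finding("unverified_files", False, f"missing on disk: {missing}")
    return Finding("unverified_files", True, f"all {len(claimed_files)} claimed files exist")


def detect_unrun_tests(claims_tests_pass, commands_run):
    """Flag a 'tests pass' claim when no test command appears in the command log."""
    # Assumption: a substring match on "pytest" stands in for real log analysis.
    ran_tests = any("pytest" in cmd for cmd in commands_run)
    if claims_tests_pass and not ran_tests:
        return Finding("unrun_tests", False, "claimed passing tests, but no pytest run was logged")
    return Finding("unrun_tests", True, "claim is consistent with the command log")
```

Each function reads the actual state (the filesystem, the command log) rather than the agent's own summary, and returns the evidence alongside the verdict so a reviewer can see why a claim was rejected.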


🛠️ How I work

  • Evidence over assertion: verify the real state, never trust the transcript
  • Test-driven, small reviewable PRs, CI green before anything ships
  • Learning in public, with clean docs and reproducible builds

🧰 Tools I'm learning and using

Python · pytest · pre-commit · ruff · black · GitHub Actions · hatchling / packaging · Git

🌱 Right now

Going deep on Python and AI-safety tooling, and open to opportunities where rigor and correctness matter.

📫 Reach me

Pinned

  1. nemesis-eval (Public)

    She who catches hubris in agents: a Python evaluation harness for agentic failure modes.

    Python 1