security: replace predictable production token with cryptographic randomness #3243
Snakinya wants to merge 2 commits into
The previous implementation used zlib.adler32 as a PRNG seed with random.sample to generate 4-letter suffixes. This made tokens fully deterministic: anyone who knows the flow name can compute the exact token, bypassing deployment authorization. Replace with secrets.choice() producing 16-character alphanumeric tokens (36^16 ≈ 7.96×10²⁴ possibilities), making brute-force infeasible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
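The keyspace figures quoted in this PR can be checked with a few lines of arithmetic. This is only a sketch; the variable names are illustrative. One caveat worth noting: random.sample draws without replacement, so a 4-letter suffix built that way has distinct letters, and the old space is somewhat smaller than the 26⁴ figure in the PR description.

```python
import math

# Upper bound quoted in the PR for 4 lowercase letters.
old_bound = 26 ** 4
# Ordered draws of 4 distinct letters (sampling without replacement).
old_distinct = math.perm(26, 4)
# 16 symbols from a 36-symbol alphanumeric alphabet, repeats allowed.
new_space = 36 ** 16

print(old_bound)            # 456976
print(old_distinct)         # 358800
print(f"{new_space:.2e}")   # 7.96e+24
```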
Greptile Summary
This PR addresses a real predictability vulnerability in production token generation by replacing the seeded
Confidence Score: 5/5
The core change is correct and safe to merge: the cryptographic fix is sound, existing stored tokens are unaffected, and no new failure paths are introduced in production_token.py itself.

The secrets.choice() implementation is correct, and load_token/store_token are untouched, so backward compatibility holds. The only concern, the dead "if token is None" branches and the lost --new-token guard in the CLI callers, was already surfaced in prior review threads and is outside the changed file.

No files in the diff require additional attention; the unchanged CLI callers carry stale dead-code branches that were flagged in a previous review thread.
Reviews (2): Last reviewed commit: "address review: document prev_token rete..."
The prev_token parameter is no longer used for sequence-based generation but is kept for call-site compatibility. With cryptographic randomness, new_token() can always succeed, so the old None-return path (which guarded against exhausting a deterministic sequence) is no longer reachable. This is intentional: the --generate-new-token flow now always produces a valid token rather than erroring when the old sequence was exhausted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
The production token mechanism in metaflow/plugins/aws/step_functions/production_token.py uses a fully deterministic algorithm to generate deployment authorization tokens.

This means anyone who knows the flow name (which is public information) can compute the exact production token and bypass deployment authorization checks in Step Functions, Airflow, and Argo Workflows.
Before (predictable — same output every time):
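A minimal sketch of the scheme as the PR describes it, assuming the seed is derived from the token prefix alone. The function name predictable_token and the prefix "helloflow" are hypothetical, chosen for illustration; this is not the code removed by the PR.

```python
import random
import string
import zlib


def predictable_token(token_prefix: str) -> str:
    """Hypothetical reconstruction: the suffix depends only on the prefix."""
    # adler32 is a checksum, so the seed is a pure function of public input.
    seed = zlib.adler32(token_prefix.encode("utf-8"))
    rng = random.Random(seed)  # local PRNG, leaves the global one untouched
    # Draw 4 distinct lowercase letters from the seeded PRNG.
    suffix = "".join(rng.sample(string.ascii_lowercase, 4))
    return f"{token_prefix}-{suffix}"


# Same input, same token: anyone who knows the prefix can recompute it offline.
first = predictable_token("helloflow")
second = predictable_token("helloflow")
assert first == second
```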
After (cryptographically random):
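A hedged sketch of the replacement, assuming the 36-symbol alphabet is lowercase letters plus digits (the PR only says "alphanumeric") and that the token keeps a prefix-suffix shape. The constants ALPHABET and SUFFIX_LENGTH are illustrative names, and the body is not copied from the diff.

```python
import secrets
import string
from typing import Optional

# Assumption: 26 lowercase letters + 10 digits = 36 symbols.
ALPHABET = string.ascii_lowercase + string.digits
SUFFIX_LENGTH = 16


def new_token(token_prefix: str, prev_token: Optional[str] = None) -> str:
    """Return a fresh token; never returns None."""
    # prev_token is accepted only for call-site compatibility and is unused.
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(SUFFIX_LENGTH))
    return f"{token_prefix}-{suffix}"


token = new_token("helloflow")
print(token)  # differs on every run
```

Because each character is drawn independently from the operating system's CSPRNG, knowing the prefix or any earlier token gives no information about the next one.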
Vulnerability Details
zlib.adler32 (a checksum, not a CSPRNG) used as the seed for random.sample with only 4 lowercase letters (26⁴ = 456,976 possibilities, all enumerable)
Fix
Replace with secrets.choice() generating 16-character alphanumeric tokens (36¹⁶ ≈ 7.96×10²⁴ possibilities).

Backwards compatible: existing stored tokens continue to work via load_token(). Only newly generated tokens use the secure algorithm.
Test plan
new_token() produces different output on each call
Token format {prefix}-{suffix} (compatible with existing storage/comparison)
load_token()/store_token() unchanged
create