Feat/echo OSV upstream and python matcher #1
Open
orizerah wants to merge 2 commits into
OSV schema 1.7.0 introduced an `upstream` field carrying IDs of the vulnerabilities a record was derived from (CVEs, GHSAs, etc). osv-scanner v1.9.2, the last v1.x release, doesn't model this field, so the v6 OSV transformer was silently dropping it. Producers that place CVE/GHSA cross-references only in `upstream` (e.g. Echo's OSV feed) lost all cross-reference data on DB build.

Wrap models.Vulnerability with a thin struct that adds an Upstream slice. JSON decoding flattens correctly because the embedded osv-scanner type has no custom UnmarshalJSON. The transformer now folds aliases + upstream + related-if-advisory into VulnerabilityBlob.Aliases with strset-backed dedup and first-occurrence ordering, matching the house style in kev/transform.go.

Embedding the osv-scanner type rather than switching to github.com/ossf/osv-schema/bindings/go/osvschema keeps this a one-field extension instead of a full proto-based migration.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
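A minimal sketch of the wrapper and the alias fold described in this commit, not the code in the diff. `Vulnerability` here is a trimmed local stand-in for osv-scanner's `models.Vulnerability`, `isAdvisoryID` is a hypothetical reading of "related-if-advisory" (the commit message does not say how an advisory is recognized), a plain map replaces the strset-backed dedup, and the sample record IDs are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Vulnerability is a local stand-in for osv-scanner's models.Vulnerability,
// trimmed to the fields this sketch needs.
type Vulnerability struct {
	ID      string   `json:"id"`
	Aliases []string `json:"aliases,omitempty"`
	Related []string `json:"related,omitempty"`
}

// withUpstream embeds the base record and adds the OSV 1.7.0 `upstream` field.
// encoding/json promotes the embedded fields because the embedded type
// defines no UnmarshalJSON of its own.
type withUpstream struct {
	Vulnerability
	Upstream []string `json:"upstream,omitempty"`
}

// isAdvisoryID is a hypothetical predicate for "related-if-advisory".
func isAdvisoryID(id string) bool {
	return strings.HasPrefix(id, "CVE-") || strings.HasPrefix(id, "GHSA-")
}

// mergeAliases folds aliases, upstream, and advisory-like related IDs into
// one slice, dropping duplicates and keeping first-occurrence order.
func mergeAliases(v withUpstream) []string {
	seen := map[string]struct{}{}
	var out []string
	add := func(id string) {
		if id == "" {
			return
		}
		if _, dup := seen[id]; dup {
			return
		}
		seen[id] = struct{}{}
		out = append(out, id)
	}
	for _, id := range v.Aliases {
		add(id)
	}
	for _, id := range v.Upstream {
		add(id)
	}
	for _, id := range v.Related {
		if isAdvisoryID(id) {
			add(id)
		}
	}
	return out
}

// decode unmarshals one OSV JSON record into the wrapper.
func decode(raw string) (withUpstream, error) {
	var v withUpstream
	err := json.Unmarshal([]byte(raw), &v)
	return v, err
}

func main() {
	// Made-up sample record; only the field layout follows the OSV schema.
	raw := `{"id":"EXAMPLE-0001","aliases":["CVE-2024-0001"],
	         "upstream":["CVE-2024-0001","GHSA-aaaa-bbbb-cccc"],
	         "related":["GHSA-aaaa-bbbb-cccc","EXAMPLE-0002"]}`
	v, err := decode(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(v.ID, mergeAliases(v))
}
```

If the embedded type did define `UnmarshalJSON`, that method would be promoted to the wrapper and the `upstream` key would never reach the extra field, which is why the commit message calls this out.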
Echo distributes patched Python packages identified by a PEP 440 local
version segment (e.g. `django==5.2.1+echo.1`) and ships its
vulnerability records under the `Echo:PyPi` ecosystem. Without this
change a Python scan can't reach those records (the v6 store queries by
syft package type, which never resolves to `Echo:PyPi`) and falsely
reports upstream CVEs whose fixes Echo has already backported.
Two layers:
1. New `search.ByExactEcosystem(string)` criterion. The existing
   `ByEcosystem(lang, type)` always derives the wire ecosystem from a
   syft Language/Type, with no way to inject a literal string. Extend
   EcosystemCriteria with an ExactEcosystem field and teach the v6
   search query builder to honor it first in handleEcosystem.
2. Python matcher gates on `\+echo\.\d+` in the installed version. When
   present:
   - Compute a suppression set: query Echo:PyPi for every record
     covering this package, keep those whose fix version is at-or-below
     the installed version, collect their RelatedVulnerabilities (the
     upstream CVE/GHSA aliases). Drop standard matches whose ID is in
     that set; these are vulnerabilities Echo has backported on this
     system.
   - Append positive Echo matches (records where installed < Echo fix)
     so Echo-specific records surface for users running below the patch.
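The matcher flow in the second layer can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: `match`, `echoRecord`, and `applyEcho` are hypothetical names, and the PEP 440 comparison between the installed version and the Echo fix version is replaced by a precomputed `FixAtOrBelowInstalled` flag so the sketch stays self-contained.

```go
package main

import (
	"fmt"
	"regexp"
)

// echoMarker is the gate quoted in the commit message.
var echoMarker = regexp.MustCompile(`\+echo\.\d+`)

// match is a stand-in for a vulnerability match, reduced to its ID.
type match struct{ ID string }

// echoRecord is a stand-in for one Echo:PyPi record covering the package.
type echoRecord struct {
	ID      string
	Related []string // upstream CVE/GHSA IDs the record refers to
	// FixAtOrBelowInstalled stands in for a real PEP 440 comparison
	// (fix version <= installed version); it is assumed precomputed here.
	FixAtOrBelowInstalled bool
}

// applyEcho drops standard matches that Echo has backported and appends
// Echo records the installed version is still below.
func applyEcho(installed string, standard []match, echo []echoRecord) []match {
	if !echoMarker.MatchString(installed) {
		return standard // no marker: behave exactly as before
	}
	suppressed := map[string]struct{}{}
	var positives []match
	for _, r := range echo {
		if r.FixAtOrBelowInstalled {
			for _, id := range r.Related {
				suppressed[id] = struct{}{}
			}
			continue
		}
		positives = append(positives, match{ID: r.ID})
	}
	var out []match
	for _, m := range standard {
		if _, drop := suppressed[m.ID]; !drop {
			out = append(out, m)
		}
	}
	return append(out, positives...)
}

func main() {
	// All IDs below are made-up sample data.
	standard := []match{{ID: "CVE-2024-0001"}, {ID: "CVE-2024-0002"}}
	echo := []echoRecord{
		{ID: "EXAMPLE-1", Related: []string{"CVE-2024-0001"}, FixAtOrBelowInstalled: true},
		{ID: "EXAMPLE-2", Related: []string{"CVE-2024-0003"}, FixAtOrBelowInstalled: false},
	}
	fmt.Println(applyEcho("5.2.1+echo.1", standard, echo))
	fmt.Println(applyEcho("5.2.1", standard, echo))
}
```

How records without any fix version are treated is not stated in the commit message; the sketch lumps them with the positive matches only because the flag is false for them.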
Packages without the marker take a single regex check and otherwise
behave exactly as before.
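For the first layer, a rough sketch of a criterion that lets a literal ecosystem string win over the derived one. The struct and function names below are local stand-ins rather than grype's actual definitions, and the fallback derivation (package type, then language) is an assumption made only to give the "derived" branch something to return.

```go
package main

import "fmt"

// ecosystemCriteria is a stand-in for the criterion described above.
type ecosystemCriteria struct {
	Language       string // stand-in for a syft language
	PackageType    string // stand-in for a syft package type
	ExactEcosystem string // literal wire ecosystem; wins when non-empty
}

// byEcosystem mirrors the existing constructor: the ecosystem is derived.
func byEcosystem(lang, pkgType string) ecosystemCriteria {
	return ecosystemCriteria{Language: lang, PackageType: pkgType}
}

// byExactEcosystem mirrors the new constructor: the ecosystem is literal.
func byExactEcosystem(name string) ecosystemCriteria {
	return ecosystemCriteria{ExactEcosystem: name}
}

// resolveEcosystem returns the string a query would filter on,
// honoring the exact value first.
func resolveEcosystem(c ecosystemCriteria) string {
	if c.ExactEcosystem != "" {
		return c.ExactEcosystem
	}
	if c.PackageType != "" {
		return c.PackageType // assumed derivation, for illustration only
	}
	return c.Language
}

func main() {
	fmt.Println(resolveEcosystem(byEcosystem("python", "python")))
	fmt.Println(resolveEcosystem(byExactEcosystem("Echo:PyPi")))
}
```

Checking the exact field first keeps every existing `ByEcosystem` caller on its current path, since those callers never set it.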
Regex-marker unit tests cover PEP 440 epochs/post/dev/stacked locals
and reject near-misses (no digits, case mismatch, missing dot).
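The kinds of cases listed above can be illustrated with a small table. The version strings are examples chosen for this sketch, not the ones in the PR's test file, and the unanchored pattern is taken verbatim from the commit message; whether the real matcher anchors it is not stated.

```go
package main

import (
	"fmt"
	"regexp"
)

// echoMarker is the pattern quoted in the commit message.
var echoMarker = regexp.MustCompile(`\+echo\.\d+`)

// hasEchoMarker reports whether a version carries the Echo local segment.
func hasEchoMarker(version string) bool {
	return echoMarker.MatchString(version)
}

func main() {
	cases := []struct {
		version string
		want    bool
	}{
		{"5.2.1+echo.1", true},          // plain local segment
		{"1!5.2.1+echo.1", true},        // PEP 440 epoch
		{"5.2.1.post1+echo.2", true},    // post-release
		{"5.2.1.dev0+echo.3", true},     // dev release
		{"5.2.1+echo.1.internal", true}, // further local parts after the marker
		{"5.2.1+echo", false},           // no digits
		{"5.2.1+echo.", false},          // dot but no digits
		{"5.2.1+Echo.1", false},         // case mismatch
		{"5.2.1+echo1", false},          // missing dot
		{"5.2.1", false},                // no local segment at all
	}
	for _, c := range cases {
		got := hasEchoMarker(c.version)
		fmt.Printf("%-24s got=%v want=%v\n", c.version, got, c.want)
	}
}
```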
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>