
Feat/echo OSV upstream and python matcher#1

Open
orizerah wants to merge 2 commits into main from feat/echo-osv-upstream-and-python-matcher

Conversation

@orizerah

Owner

No description provided.

orizerah and others added 2 commits May 13, 2026 17:17
OSV schema 1.7.0 introduced an `upstream` field carrying IDs of the
vulnerabilities a record was derived from (CVEs, GHSAs, etc). osv-scanner
v1.9.2 — the last v1.x release — doesn't model this field, so the v6 OSV
transformer was silently dropping it. Producers that place CVE/GHSA
cross-references only in `upstream` (e.g. Echo's OSV feed) lost all
cross-reference data on DB build.

Wrap models.Vulnerability with a thin struct that adds an Upstream slice.
JSON decoding flattens correctly because the embedded osv-scanner type has
no custom UnmarshalJSON. The transformer now folds aliases + upstream +
related-if-advisory into VulnerabilityBlob.Aliases with strset-backed
dedup and first-occurrence ordering, matching the house style in
kev/transform.go.
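
A minimal sketch of the wrapper and the alias fold described above, using only the standard library. `Vulnerability` here is a hypothetical stand-in for osv-scanner's `models.Vulnerability`, `record`, `decode`, and `foldAliases` are illustrative names, a plain map replaces the strset the commit mentions, and the sample record is made up. Whether `related` is included is left to the caller, since the exact "advisory" condition is not spelled out here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Vulnerability is a hypothetical stand-in for osv-scanner's
// models.Vulnerability; only the fields used in this sketch are modeled.
type Vulnerability struct {
	ID      string   `json:"id"`
	Aliases []string `json:"aliases,omitempty"`
	Related []string `json:"related,omitempty"`
}

// record embeds the stand-in type and adds the one field it lacks.
// encoding/json promotes the embedded struct's fields, so a single
// Unmarshal call fills both the embedded fields and Upstream.
type record struct {
	Vulnerability
	Upstream []string `json:"upstream,omitempty"`
}

// decode parses one OSV-style JSON document into the wrapper.
func decode(data []byte) (record, error) {
	var r record
	err := json.Unmarshal(data, &r)
	return r, err
}

// foldAliases merges the ID lists in argument order, skipping empty
// strings and repeats so that the first occurrence of each ID wins.
func foldAliases(lists ...[]string) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, list := range lists {
		for _, id := range list {
			if id == "" {
				continue
			}
			if _, dup := seen[id]; dup {
				continue
			}
			seen[id] = struct{}{}
			out = append(out, id)
		}
	}
	return out
}

func main() {
	// Made-up sample record: the CVE appears in both aliases and upstream.
	raw := []byte(`{"id":"EXAMPLE-0001","aliases":["CVE-2024-0001"],"upstream":["CVE-2024-0001","GHSA-aaaa-bbbb-cccc"]}`)
	r, err := decode(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(foldAliases(r.Aliases, r.Upstream))
}
```

The embedding only works this way because the embedded type has no `UnmarshalJSON` of its own. If it did, Go would promote that method to the wrapper and the decoder would call it instead, leaving `Upstream` empty.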

Embedding the osv-scanner type rather than switching to
github.com/ossf/osv-schema/bindings/go/osvschema keeps this a one-field
extension instead of a full proto-based migration.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Echo distributes patched Python packages identified by a PEP 440 local
version segment (e.g. `django==5.2.1+echo.1`) and ships its
vulnerability records under the `Echo:PyPi` ecosystem. Without this
change a Python scan can't reach those records (the v6 store queries by
syft package type, which never resolves to `Echo:PyPi`), so it reports
upstream CVEs that Echo has already backported, which are false positives.

Two layers:

1. New `search.ByExactEcosystem(string)` criterion. The existing
   `ByEcosystem(lang, type)` always derives the wire ecosystem from a
   syft Language/Type, with no way to inject a literal string. Extend
   EcosystemCriteria with an ExactEcosystem field and teach the v6
   search query builder to honor it first in handleEcosystem.

2. Python matcher gates on `\+echo\.\d+` in the installed version. When
   present:
   - Compute a suppression set: query Echo:PyPi for every record
     covering this package, keep those whose fix version is at-or-below
     the installed version, collect their RelatedVulnerabilities (the
     upstream CVE/GHSA aliases). Drop standard matches whose ID is in
     that set — these are vulnerabilities Echo has backported on this
     system.
   - Append positive Echo matches (records where installed < Echo fix)
     so Echo-specific records surface for users running below the patch.
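
For the first layer, the "literal wins over derivation" rule might look roughly like the sketch below. The types are simplified stand-ins rather than the real search package: `PackageType` replaces the syft Language/Type pair, and `resolveEcosystem` is a hypothetical name for the check that `handleEcosystem` is described as performing first.

```go
package main

import "fmt"

// EcosystemCriteria is a simplified, hypothetical stand-in for the
// criterion described above. The real type carries syft Language/Type
// values; a plain string is used here to keep the sketch self-contained.
type EcosystemCriteria struct {
	PackageType    string // stand-in for the syft package type
	ExactEcosystem string // literal ecosystem; empty means "derive it"
}

// ByExactEcosystem builds a criterion that bypasses derivation entirely.
func ByExactEcosystem(name string) EcosystemCriteria {
	return EcosystemCriteria{ExactEcosystem: name}
}

// resolveEcosystem honors the literal first and only falls back to the
// derived value when no literal was given. The fallback here simply
// returns the package type string, which is an assumption of this sketch.
func resolveEcosystem(c EcosystemCriteria) string {
	if c.ExactEcosystem != "" {
		return c.ExactEcosystem
	}
	return c.PackageType
}

func main() {
	fmt.Println(resolveEcosystem(EcosystemCriteria{PackageType: "python"}))
	fmt.Println(resolveEcosystem(ByExactEcosystem("Echo:PyPi")))
}
```

Keeping the literal as an extra field rather than a separate criterion type means existing callers of `ByEcosystem` are unaffected, since an empty `ExactEcosystem` preserves the old path.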

Packages without the marker take a single regex check and otherwise
behave exactly as before.

Regex-marker unit tests cover PEP 440 epochs/post/dev/stacked locals
and reject near-misses (no digits, case mismatch, missing dot).
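
As an illustration of the marker check, here is a small table-driven sketch using the pattern quoted above. The version strings are made-up examples, not the PR's actual test cases, and the reading of "stacked locals" as extra segments after `+echo.N` is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// echoMarker is the unanchored pattern from the commit message.
var echoMarker = regexp.MustCompile(`\+echo\.\d+`)

// hasEchoMarker reports whether the version carries the Echo local segment.
func hasEchoMarker(version string) bool {
	return echoMarker.MatchString(version)
}

func main() {
	cases := []struct {
		version string
		want    bool
	}{
		{"5.2.1+echo.1", true},
		{"1!5.2.1+echo.2", true},      // epoch
		{"5.2.1.post1+echo.3", true},  // post-release
		{"5.2.1.dev0+echo.1", true},   // dev release
		{"5.2.1+echo.1.extra", true},  // assumed "stacked" local segment
		{"5.2.1", false},              // no local segment
		{"5.2.1+echo.", false},        // no digits
		{"5.2.1+Echo.1", false},       // case mismatch
		{"5.2.1+echo1", false},        // missing dot
	}
	for _, c := range cases {
		fmt.Printf("%-20s got=%v want=%v\n", c.version, hasEchoMarker(c.version), c.want)
	}
}
```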

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>