[kernel-spark] Add failing tests for 17 DSv2 streaming feature-matrix bugs (V1+V2 parity) #6765

Draft
zikangh wants to merge 4 commits into delta-io:master from zikangh:stack/failing-tests

zikangh (Collaborator) commented May 12, 2026

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

Adds 11 test files (1 Scala helper + 10 Java) that reproduce 17 DSv2 streaming bugs catalogued in the DSv2 Streaming AI-assisted Feature Matrix Testing doc. Every test exercises both the DSv1 (spark.readStream().format("delta").load(path)) and DSv2 (spark.readStream().table("dsv2.delta.`…`")) streaming paths and asserts that they produce identical results, with V1 as the oracle, so each bug surfaces as a V1-vs-V2 divergence.
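
For readers unfamiliar with the pattern, the V1-as-oracle comparison can be sketched in plain Java as below. This is only an illustration, not code from this PR: the class and method names (V1V2Parity, diff, assertParity) are hypothetical, and collecting the rows from each streaming path is assumed to happen elsewhere. Rows are compared as multisets so that ordering differences are ignored while duplicated rows still count as a divergence.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of this PR): order-insensitive oracle comparison. */
final class V1V2Parity {

  /** Counts occurrences of each row so that duplicated rows remain significant. */
  static <T> Map<T, Integer> toMultiset(List<T> rows) {
    Map<T, Integer> counts = new HashMap<>();
    for (T row : rows) {
      counts.merge(row, 1, Integer::sum);
    }
    return counts;
  }

  /** Returns "" when both result sets match, otherwise one line per diverging row. */
  static <T> String diff(List<T> oracleRows, List<T> candidateRows) {
    Map<T, Integer> expected = toMultiset(oracleRows);
    Map<T, Integer> actual = toMultiset(candidateRows);
    StringBuilder sb = new StringBuilder();
    // Rows the oracle produced: missing or wrong multiplicity in the candidate.
    for (Map.Entry<T, Integer> e : expected.entrySet()) {
      int got = actual.getOrDefault(e.getKey(), 0);
      if (got != e.getValue()) {
        sb.append("row ").append(e.getKey())
          .append(": oracle=").append(e.getValue())
          .append(", candidate=").append(got).append('\n');
      }
    }
    // Rows only the candidate produced.
    for (Map.Entry<T, Integer> e : actual.entrySet()) {
      if (!expected.containsKey(e.getKey())) {
        sb.append("row ").append(e.getKey())
          .append(": oracle=0, candidate=").append(e.getValue()).append('\n');
      }
    }
    return sb.toString();
  }

  /** Fails with a description of the divergence when V2 does not match V1. */
  static <T> void assertParity(List<T> v1Rows, List<T> v2Rows) {
    String divergence = diff(v1Rows, v2Rows);
    if (!divergence.isEmpty()) {
      throw new AssertionError("V1/V2 divergence:\n" + divergence);
    }
  }

  public static void main(String[] args) {
    // Same rows in a different order: parity holds.
    assertParity(List.of("1,a", "2,b"), List.of("2,b", "1,a"));
    // A duplicated row on the candidate side is reported as a divergence.
    System.out.print(diff(List.of("1,a"), List.of("1,a", "1,a")));
  }
}
```

A multiset comparison is used rather than a set comparison because a set would hide a duplicated row, such as the restart row-dup case listed in the table.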

Bugs covered:

| # | File | Bug |
|---|------|-----|
| 1 | V2StreamingDeletionVectorVariantTest | DV reads silently corrupt VARIANT columns |
| 2 | V2StreamingReadTest (added 1 method) | NPE on partitioned tables when partition col isn't last |
| 3, 7, 16, 17, 21 | SparkGoldenTableTest (already differential) | Silent value swap / restart row-dup / projection+filter+rate-limit |
| 4, 5, 6 | V2PartitionValueBoundaryTest | Partition-value encoding boundaries |
| 10 | V2StreamingRowTrackingTest | AIOOBE projecting _metadata.row_id |
| 11 | V2StreamingIctTest + IctTestUtils.scala | ICT mid-history resolves startingTimestamp to v0 |
| 18, 19, 20 | UCDeltaStreamingEdgeDataReadTest | UC MANAGED null-column NPE + empty-table precondition |
| 23 | V2StreamingRaceLifecycleTest | Long-running stream misses mid-stream protocol upgrade |
| 24 | V2StreamingMidPriorityScenarios13to18Test | maxBytesPerTrigger off-by-one (V1+V2 shared bug → asserts against oracle, not V1==V2) |
| 25, 26 | V2StreamingSchemaRejectionTest | VOID column unloadable; user-schema silently accepted |
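
Bug 24 is the one case where V1==V2 parity cannot catch the problem, because both paths share the bug. A rough sketch of what an independent oracle might look like is below. It is not the code in this PR: ByteBudgetOracle and expectedBatches are hypothetical names, and the admission rule is an assumption made for illustration (a soft limit under which a file is admitted whenever the batch still has budget before it, so a batch may overshoot by at most one file).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical oracle (not part of this PR) for an ASSUMED soft byte limit. */
final class ByteBudgetOracle {

  /**
   * Splits file sizes into expected batches. Assumed rule: a file is admitted
   * whenever the current batch still has budget left before it, so every batch
   * holds at least one file and may exceed maxBytes by at most one file.
   */
  static List<List<Long>> expectedBatches(List<Long> fileSizes, long maxBytes) {
    if (maxBytes <= 0) {
      throw new IllegalArgumentException("maxBytes must be positive");
    }
    List<List<Long>> batches = new ArrayList<>();
    List<Long> current = new ArrayList<>();
    long remaining = maxBytes;
    for (long size : fileSizes) {
      if (remaining <= 0) {
        // Budget exhausted: close the current batch and start a new one.
        batches.add(current);
        current = new ArrayList<>();
        remaining = maxBytes;
      }
      current.add(size);
      remaining -= size;
    }
    if (!current.isEmpty()) {
      batches.add(current);
    }
    return batches;
  }

  public static void main(String[] args) {
    // Two 100-byte files with a 100-byte budget: the boundary case where a
    // "<" versus "<=" mistake would change the batch layout.
    System.out.println(expectedBatches(List.of(100L, 100L), 100L));
  }
}
```

A test built this way would compare each path's observed batch layout against expectedBatches separately, rather than comparing the two paths with each other.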

These tests are intended to fail on master — they exercise the bug, with V1 acting as the oracle. The fixes live on a separate stack and will land in follow-up PRs; this PR is the regression net.

How was this patch tested?

sparkV2/Test/compile, sparkUnityCatalog/Test/compile, sparkV2/Test/javafmtCheck, sparkUnityCatalog/Test/javafmtCheck all pass on Java 17. Tests themselves are expected to fail (intentional — V2 path broken; V1 is oracle).

Does this PR introduce any user-facing changes?

No.

zikangh changed the title from "[Spark] Add failing tests for 17 DSv2 streaming feature-matrix bugs (V1+V2 parity)" to "[kernel-spark] Add failing tests for 17 DSv2 streaming feature-matrix bugs (V1+V2 parity)" on May 13, 2026