feat: Add migrations-functional-testing agent skill for Dataflow pipeline testing #3883

Draft
aasthabharill wants to merge 6 commits into main from add-functional-testing-skill

@aasthabharill aasthabharill commented Jun 4, 2026

This PR introduces the migrations-functional-testing agent skill under .agent/skills/. This skill equips the AI agent with a modular, gated workflow to perform end-to-end functional testing of Dataflow templates against local code changes using GCP resources.

IMPORTANT: The skill states that it is to be used ONLY for migration-specific pipelines, i.e. sourcedb-to-spanner, spanner-to-sourcedb, datastream-to-spanner, and gcs-spanner-dv, because it was written with only these in mind.

Key Features

  1. Topology & Schema Planning: Analyzes repository code changes, maps source/sink configurations, and plans test cases.
  2. Autonomous Provisioning: Autonomously provisions ephemeral Spanner and Cloud SQL instances once the proposed plans are approved.
  3. Database Credential Management: Safely requests database user credentials or offers to autonomously generate a temporary DB user and password for Cloud SQL.
  4. Optimized Staging: Stages jobs using standard public Dataflow template paths, or packages local modifications using optimized Maven flags (skipping Spotless, Checkstyle, and unit tests to minimize latency).
  5. Custom & Sharding Transformations: Detects if a custom/sharding JAR is required, plans the transformation logic, builds the JAR, uploads it to GCS, and wires the paths into the Terraform configurations.
  6. Strict Safety & Approval Gates: Enforces "Stop and Wait" gates for code analysis, schema setups, configuration files, and Terraform variables. Once approved, the agent executes tasks (like running Terraform apply) autonomously.
  7. Verification & Success Criteria: Proposes destination table verification queries, explicit DLQ checks (dlq/ and filteredEvents/ GCS folders), and generates a final markdown verification report.
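As an illustration of the "optimized Maven flags" in item 4, here is a minimal sketch of a fast packaging command. The PR description does not list the exact flags the skill uses, so this is an assumption: it relies on the standard `skipTests`, `spotless.check.skip`, and `checkstyle.skip` properties, and the helper name `fast_package_cmd` and the module path are hypothetical. The helper only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper (not taken from SKILL.md): prints, but does not run,
# a Maven command that packages one module and the modules it depends on,
# while skipping unit tests, the Spotless check, and Checkstyle.
fast_package_cmd() {
  module="$1"   # module path relative to the repository root (illustrative)
  printf 'mvn package -pl %s -am -DskipTests -Dspotless.check.skip=true -Dcheckstyle.skip=true\n' "$module"
}

# Example: show the command for an assumed module path.
fast_package_cmd "v2/sourcedb-to-spanner"
```

Printing rather than executing keeps the sketch in line with the skill's approval gates: the user can inspect the command before the agent runs it.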

Files Added/Modified

  • [NEW] SKILL.md: The core instruction set defining the orchestrator workflow, safety gates, and automation rules.
  • [NEW] TEST.md: A step-by-step manual test case designed for reviewers to verify the skill using a custom transformation scenario.
  • [NEW] skills_index.md: Added the index references to map to the new skill directory and name.

Verification Run

Two tests were run:

  1. Testing Custom transformation in sourcedb-to-spanner without any new changes
  2. Testing a new feature to support custom transformations in gcs-spanner-dv (Github PR)

Analysis:

  • Because many of these steps are gated on user approval, the workflow is fairly hands-on, but that is by design.
  • It improves the functional testing experience considerably: the agent does most of the work autonomously and needs user attention mainly at the beginning (during test-case curation and setup).
  • The test cases do require attention and customization from the user; this is important.
  • Setup creation follows the pre-decided test cases well and needs minimal prompting from the user, and the agent clearly defines the success criteria.
  • The agent sometimes asks the user to perform a step (for example, creating the custom transformation JAR), but it can perform the step itself once the user prompts it to. Some improvements were made to this experience.
  • The agent successfully used the debugging skill to track job progress, debug issues, make changes, and re-run the job. The follow-up jobs succeeded.
  • The agent tracks job progress, autonomously queries the destination, and prepares a test report that clearly states successes and failures after test execution.

Follow up work

The test cases do require some attention and customization from the user. A follow-up effort will add a skill for generating edge cases; it will be referenced here once it's completed.

@aasthabharill aasthabharill changed the title initial changes [Migrations] Add skill to functionally test PRs Jun 4, 2026
@aasthabharill aasthabharill changed the title [Migrations] Add skill to functionally test PRs feat: Add migrations-functional-testing agent skill for Dataflow pipeline testing Jun 4, 2026
@aasthabharill aasthabharill added the addition New feature or request label Jun 4, 2026
@aasthabharill aasthabharill marked this pull request as ready for review June 4, 2026 09:08
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds a new functional testing skill for the AI agent, designed to streamline the validation of Dataflow migration templates. By automating the provisioning of ephemeral GCP resources and integrating verification steps, the skill reduces manual overhead while maintaining strict safety and approval requirements for infrastructure changes.

Highlights

  • New Agent Skill: Introduced the migrations-functional-testing skill to enable autonomous, gated end-to-end functional testing for specific Dataflow migration templates.
  • Workflow Automation: The skill automates environment provisioning, configuration staging, and verification reporting while enforcing strict safety gates for user approval.
  • Template Support: Supports functional testing for sourcedb-to-spanner, spanner-to-sourcedb, datastream-to-spanner, and gcs-spanner-dv templates.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@aasthabharill aasthabharill requested a review from manitgupta June 4, 2026 09:08
@aasthabharill aasthabharill marked this pull request as draft June 4, 2026 09:09
@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces the migrations-functional-testing skill, which provides a modular and gated workflow for functionally testing local Dataflow pipeline changes. It includes the skill definition, test cases, and an updated skills index. The review feedback highlights two main issues: a typo in the directory path for the smt-e2e-dataflow-debugging skill in the index file, and a non-portable shell command used for generating unique run IDs in the skill definition which can fail on macOS/BSD platforms.

Comment thread .agents/skills_index.md
Auto-generated index of available skills.

## smt-e2e-dataflow-debugging
**Directory**: `.agents/skills/smt-e2e-dataflow-debugging`
medium

There is a typo in the directory path. It refers to `.agents/skills/...` (plural) instead of `.agent/skills/...` (singular), which is the actual directory structure used in this repository. Please update the directory path to: **Directory**: `.agent/skills/smt-e2e-dataflow-debugging`
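A mismatch like this could be caught mechanically. The following is a minimal sketch, assuming every index entry uses the `**Directory**: ` line format shown in the diff above; the function name `check_index_dirs` is hypothetical and not part of this PR.

```shell
# check_index_dirs INDEX_FILE [REPO_ROOT]
# Hypothetical check: prints each **Directory** path from the index that
# does not exist under REPO_ROOT and returns non-zero if any are missing.
check_index_dirs() {
  index_file="$1"
  repo_root="${2:-.}"
  missing="$(
    # Extract the text between backticks on lines starting with **Directory**:
    sed -n 's/^\*\*Directory\*\*: *`\([^`]*\)`.*/\1/p' "$index_file" |
    while IFS= read -r dir; do
      [ -d "$repo_root/$dir" ] || printf 'missing: %s\n' "$dir"
    done
  )"
  if [ -n "$missing" ]; then
    printf '%s\n' "$missing"
    return 1
  fi
  return 0
}
```

Run from the repository root, `check_index_dirs .agents/skills_index.md` would report any indexed directory that is absent on disk.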

## Workflow Phases

### Phase 1: Code Analysis & Test Case Generation
1. **Sourcing State**: Execute `source .env.testing` in the terminal or load the variables into context. Generate a unique run ID: `export TEST_RUN_ID=$(head /dev/urandom | tr -dc a-z0-9 | head -c 6)`.
medium

The command `head /dev/urandom | tr -dc a-z0-9 | head -c 6` is not portable and can fail on macOS/BSD platforms. There, `tr` can abort with an `Illegal byte sequence` error when it reads raw binary data from `/dev/urandom` under a UTF-8 locale, and the leading `head` reads by lines, which is not meaningful for binary input. A more robust and portable way to generate a random 6-character alphanumeric string across both Linux and macOS is: `LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom | head -c 6`. Please update the command to: `export TEST_RUN_ID=$(LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom | head -c 6)`
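For reference, the suggested command can be wrapped in a small function and sanity-checked. This is a sketch only; the function name `gen_run_id` is hypothetical and does not appear in SKILL.md.

```shell
# Generate a 6-character run ID made of lowercase letters and digits.
# LC_ALL=C makes tr treat its input as raw bytes, which avoids
# "Illegal byte sequence" errors on macOS/BSD under a UTF-8 locale.
gen_run_id() {
  LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom | head -c 6
}

TEST_RUN_ID="$(gen_run_id)"
export TEST_RUN_ID
echo "TEST_RUN_ID=${TEST_RUN_ID}"
```

One caveat: `tr` is terminated by a broken pipe once `head` exits, so under `set -o pipefail` the pipeline may report a non-zero status even though the ID was generated correctly.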

Labels

addition New feature or request size/L
