
chore(evalhub): sync provider ConfigMaps from upstream eval-hub #720

Open
gnaulak-redhat wants to merge 5 commits into trustyai-explainability:main from gnaulak-redhat:chore-sync-evalhub-config

Conversation

gnaulak-redhat (Collaborator) commented May 3, 2026

Sync the eval-hub/eval-hub main-branch provider config YAML files. These files are auto-synced with the hack/sync-evalhub-providers.py script.

This is in part to fix the CI pipeline on the eval-hub/eval-hub repository side.

Run python scripts/check_configmap_sync.py
Fetching ConfigMap listing from trustyai-explainability/trustyai-service-operator...
Found 9 ConfigMap(s) to check.

OK  collection-leaderboard-v2.yaml <-> config/collections/leaderboard-v2.yaml
OK  collection-safety-and-fairness-v1.yaml <-> config/collections/safety-and-fairness-v1.yaml
OK  collection-toxicity-and-ethical-principles.yaml <-> config/collections/toxicity-and-ethical-principles.yaml

Drift detected:

provider-garak-kfp.yaml vs config/providers/garak-kfp.yaml:
  ~ runtime.local.command: remote='true'  local='python tests/features/test_data/runtime/main.py'
provider-garak.yaml vs config/providers/garak.yaml:
  ~ runtime.local.command: remote='true'  local='python tests/features/test_data/runtime/main.py'
provider-guidellm.yaml vs config/providers/guidellm.yaml:
  ~ runtime.local.command: remote='true'  local='python tests/features/test_data/runtime/main.py'
provider-ibm-clear.yaml vs config/providers/ibm-clear.yaml:
  ~ runtime.local.command: remote='true'  local='python tests/features/test_data/runtime/main.py'
provider-lighteval.yaml vs config/providers/lighteval.yaml:
  ~ runtime.local.command: remote='true'  local='python tests/features/test_data/runtime/main.py'
provider-lm-evaluation-harness.yaml vs config/providers/lm_evaluation_harness.yaml:
  ~ runtime.local.command: remote='true'  local='python tests/features/test_data/runtime/main.py'
Error: Process completed with exit code 1.
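For illustration, the comparison behind the drift report can be sketched like this (a minimal sketch; find_drift and the data shapes are hypothetical, not the actual check_configmap_sync.py code):

```python
# Hypothetical sketch of the comparison a sync-drift checker performs:
# walk both mappings and report per-key differences at leaf values.

def find_drift(remote: dict, local: dict, prefix: str = "") -> list:
    """Return '~ key: remote=...  local=...' lines for differing leaf values."""
    drift = []
    for key in sorted(set(remote) | set(local)):
        path = f"{prefix}.{key}" if prefix else key
        r, l = remote.get(key), local.get(key)
        if isinstance(r, dict) and isinstance(l, dict):
            drift.extend(find_drift(r, l, path))  # recurse into nested maps
        elif r != l:
            drift.append(f"~ {path}: remote={r!r}  local={l!r}")
    return drift

# Mirrors the garak example from the log above.
remote = {"runtime": {"local": {"command": "true"}}}
local = {"runtime": {"local": {"command": "python tests/features/test_data/runtime/main.py"}}}
print("\n".join(find_drift(remote, local)))
```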

Summary by CodeRabbit

  • Chores
    • Updated runtime execution configurations across multiple evaluation providers.
    • Reformatted benchmark description text across provider configurations to multi-line/wrapped YAML for improved readability.
    • Reformatted an embedded collection description for consistent line-wrapping.
    • Adjusted the provider/collection serialization to change generated ConfigMap line-wrapping.

Co-Authored-By: Claude <noreply@anthropic.com>
coderabbitai Bot (Contributor) commented May 3, 2026

Warning

Rate limit exceeded

@gnaulak-redhat has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 56 minutes and 7 seconds before requesting another review.

To keep reviews running without waiting, you can enable the usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 88bf6d1b-481b-40fc-a25e-1cefa35f7cbd

📥 Commits

Reviewing files that changed from the base of the PR and between 5640c54 and 29f6b49.

📒 Files selected for processing (6)
  • config/configmaps/evalhub/collection-toxicity-and-ethical-principles.yaml
  • config/configmaps/evalhub/provider-garak-kfp.yaml
  • config/configmaps/evalhub/provider-ibm-clear.yaml
  • config/configmaps/evalhub/provider-lighteval.yaml
  • config/configmaps/evalhub/provider-lm-evaluation-harness.yaml
  • hack/sync-evalhub-providers.py
📝 Walkthrough

Walkthrough

Six EvalHub provider ConfigMaps update runtime commands from a no-op ('true') to execute python tests/features/test_data/runtime/main.py, and many benchmark description strings are reformatted (wrapped/quoted/escaped) into multi-line YAML representations across those ConfigMaps.

Changes

EvalHub provider ConfigMap formatting and runtime command updates

  • Runtime command update (config/configmaps/evalhub/provider-garak-kfp.yaml, .../provider-garak.yaml, .../provider-guidellm.yaml, .../provider-ibm-clear.yaml, .../provider-lighteval.yaml, .../provider-lm-evaluation-harness.yaml):
    Replace command: 'true' with command: python tests/features/test_data/runtime/main.py in provider ConfigMap runtime sections (local / k8s as applicable).
  • Benchmark description reflow (config/configmaps/evalhub/provider-garak-kfp.yaml, .../provider-garak.yaml, .../provider-ibm-clear.yaml, .../provider-lighteval.yaml, .../provider-lm-evaluation-harness.yaml):
    Reformat many benchmarks[*].description values from single-line strings to multi-line/quoted YAML forms; some entries use escaped Unicode (e.g., \u2014) or escaped non-ASCII in name fields.
  • Collection description reflow (config/configmaps/evalhub/collection-toxicity-and-ethical-principles.yaml):
    Adjust line-wrapping/indentation of the embedded description field without changing content.
  • Generator formatting tweak (hack/sync-evalhub-providers.py):
    When serializing provider/collection YAML payloads, call yaml.dump(..., width=100) to change the line-wrapping behavior of the generated ConfigMap data.
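The width argument mentioned here is a standard PyYAML parameter; its effect on line-wrapping can be seen in a small sketch (assuming the PyYAML package is installed; the data is illustrative):

```python
import yaml  # PyYAML

# A long single-line value, similar to the benchmark descriptions that
# get reflowed in the generated ConfigMaps.
long_text = " ".join(["benchmark"] * 40)

# A small width makes the emitter wrap the plain scalar across lines;
# a very large width keeps it on one line. Either way the loaded value
# is identical, since wrapped plain-scalar line breaks fold back into spaces.
wrapped = yaml.dump({"description": long_text}, width=40)
unwrapped = yaml.dump({"description": long_text}, width=10000)

print(wrapped.count("\n"), unwrapped.count("\n"))
```

This is why changing only width reflows the generated files without changing the values they parse to.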

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

lgtm

Suggested reviewers

  • ruivieira
  • saichandrapandraju
  • julpayne

Poem

🐰
I hopped through YAML, neat and spry,
Reflowed the text and changed the try,
From true to Python, now it runs,
Descriptions wrapped like tiny buns,
A quiet patch—hooray, goodbye dry.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main change: syncing provider ConfigMaps from the upstream eval-hub repository using the sync script.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


coderabbitai Bot (Contributor) left a comment


Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@config/configmaps/evalhub/provider-garak-kfp.yaml`:
- Around line 29-30: Replace the modified local command value with the upstream
placeholder: locate the YAML block with the local key and the command: python
tests/features/test_data/runtime/main.py entry in the provider-garak-kfp
configuration and restore it to upstream’s value (true) so the local block
matches the original upstream content exactly; do not change any other keys or
formatting in that provider-*.yaml file to preserve byte-for-byte alignment.

In `@config/configmaps/evalhub/provider-garak.yaml`:
- Around line 28-29: Revert the local-only override in provider-garak.yaml by
restoring runtime.local.command to the upstream value (true) instead of the
local test path; locate the runtime.local.command entry in provider-garak.yaml
and replace "python tests/features/test_data/runtime/main.py" with the upstream
placeholder/boolean true so the file matches upstream exactly and stops
check_configmap_sync.py from reporting drift.

In `@config/configmaps/evalhub/provider-guidellm.yaml`:
- Around line 24-25: This file alters the upstream config for provider-guidellm
by setting the local.command override; revert the change so the ConfigMap
matches upstream exactly: remove or restore the local.command entry in
provider-guidellm.yaml (the local block / local.command key) back to the
upstream placeholder/template value so the file is no longer intentionally
unsynced and scripts/check_configmap_sync.py will pass.

In `@config/configmaps/evalhub/provider-ibm-clear.yaml`:
- Around line 24-25: The local ConfigMap change replacing the upstream
placeholder for runtime.local.command must be reverted: restore the upstream
placeholder (set runtime.local.command back to true) in
config/configmaps/evalhub/provider-ibm-clear.yaml so the file exactly matches
upstream; do not replace the placeholder with the repo-specific test entrypoint
("python tests/features/test_data/runtime/main.py")—either revert this local
edit or land the equivalent change upstream and resync, ensuring the file stays
identical to eval-hub so python scripts/check_configmap_sync.py no longer
reports drift.

In `@config/configmaps/evalhub/provider-lighteval.yaml`:
- Around line 24-25: This PR hardcodes runtime.local.command to "python
tests/features/test_data/runtime/main.py" in provider-lighteval.yaml which
breaks the upstream sync; revert that change so runtime.local.command is
restored to the upstream placeholder/value (set runtime.local.command: true or
the exact upstream token), removing the local test script, and ensure
provider-*.yaml files (specifically the runtime.local.command entry) remain
identical to upstream templates rather than being modified for local tests.

In `@config/configmaps/evalhub/provider-lm-evaluation-harness.yaml`:
- Around line 29-30: Revert the hardcoded test runner in
provider-lm-evaluation-harness.yaml by restoring the upstream placeholder for
runtime.local.command (set it back to true) so the file matches upstream
exactly; locate the runtime.local.command entry in
provider-lm-evaluation-harness.yaml and replace the custom "python
tests/features/test_data/runtime/main.py" value with the upstream value true to
satisfy scripts/check_configmap_sync.py.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1a2d0b8f-f572-42ff-8555-9cf325c22537

📥 Commits

Reviewing files that changed from the base of the PR and between 2432800 and 540be9d.

📒 Files selected for processing (6)
  • config/configmaps/evalhub/provider-garak-kfp.yaml
  • config/configmaps/evalhub/provider-garak.yaml
  • config/configmaps/evalhub/provider-guidellm.yaml
  • config/configmaps/evalhub/provider-ibm-clear.yaml
  • config/configmaps/evalhub/provider-lighteval.yaml
  • config/configmaps/evalhub/provider-lm-evaluation-harness.yaml

Comment thread config/configmaps/evalhub/provider-garak-kfp.yaml
Comment thread config/configmaps/evalhub/provider-garak.yaml
Comment thread config/configmaps/evalhub/provider-guidellm.yaml
Comment thread config/configmaps/evalhub/provider-ibm-clear.yaml
Comment thread config/configmaps/evalhub/provider-lighteval.yaml
Comment thread config/configmaps/evalhub/provider-lm-evaluation-harness.yaml
ppadashe-psp (Collaborator) left a comment


lgtm

openshift-ci Bot commented May 4, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ppadashe-psp

The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gnaulak-redhat (Collaborator, Author) commented:

/retest-required

openshift-ci Bot removed the lgtm label May 4, 2026

openshift-ci Bot commented May 4, 2026

New changes are detected. LGTM label has been removed.

gnaulak-redhat force-pushed the chore-sync-evalhub-config branch 4 times, most recently from 9519019 to f6d1064 on May 4, 2026 at 09:31
…onfigMaps

Replace yaml.dump round-trip with raw content embedding in sync-evalhub-providers.py
to preserve upstream YAML formatting exactly. Re-sync all provider and collection
ConfigMaps to reflect the preserved original formatting.

Co-Authored-By: Claude <noreply@anthropic.com>
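The raw-embedding approach described in this commit can be sketched as follows (helper name and manifest shape are illustrative, not the actual sync-evalhub-providers.py code):

```python
import yaml  # PyYAML

def build_configmap(name: str, key: str, raw_content: str) -> str:
    """Embed upstream file content verbatim as a ConfigMap data value."""
    configmap = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        # Embedding the raw string, instead of load/dump round-tripping it,
        # keeps the upstream formatting byte-for-byte inside the value.
        "data": {key: raw_content},
    }
    return yaml.dump(configmap, default_flow_style=False)

upstream = "runtime:\n  local:\n    command: 'true'\n"
manifest = build_configmap("provider-garak", "garak.yaml", upstream)
```

Loading the manifest back with yaml.safe_load recovers the embedded value unchanged, which is the property a byte-for-byte sync check relies on.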
gnaulak-redhat force-pushed the chore-sync-evalhub-config branch from f6d1064 to a96100d on May 4, 2026 at 09:32
gnaulak-redhat and others added 2 commits May 5, 2026 12:23
Limit line length to 100 characters in yaml.dump to reduce excessive wrapping
while keeping output consistent. Re-sync all provider and collection ConfigMaps.

Co-Authored-By: Claude <noreply@anthropic.com>
Adjust line width to 117 characters in yaml.dump to better match yamllint
line-length limits. Re-sync all affected provider and collection ConfigMaps.

Co-Authored-By: Claude <noreply@anthropic.com>
coderabbitai Bot (Contributor) left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
config/configmaps/evalhub/provider-lm-evaluation-harness.yaml (1)

1-3100: ⚠️ Potential issue | 🟠 Major

Fix yamllint violations before merge. The file currently has 102 yamllint warnings that prevent CI/CD compliance:

  • 101 line-length violations (lines exceeding 80 characters)
  • 1 missing document start ("---") warning

Address these issues by either rewrapping long lines or adjusting yamllint configuration rules in CI/CD to match the file's needs.
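If the second option is taken, a repository .yamllint override along these lines would relax the failing rules (a sketch; the exact max value and whether to scope it to generated files are maintainer choices):

```yaml
# .yamllint (sketch): relax the rules that the generated ConfigMaps trip.
extends: default

rules:
  line-length:
    max: 130        # matches the width used by the sync script's yaml.dump
  document-start: disable
```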

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@config/configmaps/evalhub/provider-lm-evaluation-harness.yaml` around lines 1
- 3100, The ConfigMap value lm_evaluation_harness.yaml is failing yamllint due
to a missing document start and many >80-character lines; add a YAML document
start ("---") at the top of the lm_evaluation_harness.yaml value and either
rewrap long scalar lines (especially long description and name fields inside the
lm_evaluation_harness.yaml multi-line string) to <=80 chars or update the CI
yamllint config to allow longer line-lengths for this provider config; focus
changes around the metadata/data key where lm_evaluation_harness.yaml is defined
and the long "description" and "name" fields in benchmark entries (e.g., id:
lm_evaluation_harness, id: arc_easy, and the many benchmark description blocks).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@config/configmaps/evalhub/provider-lm-evaluation-harness.yaml`:
- Around line 1-3100: The ConfigMap value lm_evaluation_harness.yaml is failing
yamllint due to a missing document start and many >80-character lines; add a
YAML document start ("---") at the top of the lm_evaluation_harness.yaml value
and either rewrap long scalar lines (especially long description and name fields
inside the lm_evaluation_harness.yaml multi-line string) to <=80 chars or update
the CI yamllint config to allow longer line-lengths for this provider config;
focus changes around the metadata/data key where lm_evaluation_harness.yaml is
defined and the long "description" and "name" fields in benchmark entries (e.g.,
id: lm_evaluation_harness, id: arc_easy, and the many benchmark description
blocks).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fa6652b6-b146-4787-ae1e-adf769f2e73b

📥 Commits

Reviewing files that changed from the base of the PR and between 540be9d and 5640c54.

📒 Files selected for processing (8)
  • config/configmaps/evalhub/collection-toxicity-and-ethical-principles.yaml
  • config/configmaps/evalhub/provider-garak-kfp.yaml
  • config/configmaps/evalhub/provider-garak.yaml
  • config/configmaps/evalhub/provider-guidellm.yaml
  • config/configmaps/evalhub/provider-ibm-clear.yaml
  • config/configmaps/evalhub/provider-lighteval.yaml
  • config/configmaps/evalhub/provider-lm-evaluation-harness.yaml
  • hack/sync-evalhub-providers.py
✅ Files skipped from review due to trivial changes (3)
  • config/configmaps/evalhub/collection-toxicity-and-ethical-principles.yaml
  • hack/sync-evalhub-providers.py
  • config/configmaps/evalhub/provider-ibm-clear.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
  • config/configmaps/evalhub/provider-guidellm.yaml
  • config/configmaps/evalhub/provider-lighteval.yaml
  • config/configmaps/evalhub/provider-garak-kfp.yaml

Adjust line width to 130 characters in yaml.dump. Re-sync affected
provider and collection ConfigMaps.

Co-Authored-By: Claude <noreply@anthropic.com>
openshift-ci Bot commented May 5, 2026

@gnaulak-redhat: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/trustyai-service-operator-e2e
Commit: 29f6b49
Required: true
Rerun command: /test trustyai-service-operator-e2e

Full PR test history. Your PR dashboard.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
