
chore(evalhub): sync provider ConfigMaps from upstream eval-hub #720

Open
gnaulak-redhat wants to merge 5 commits into trustyai-explainability:main from gnaulak-redhat:chore-sync-evalhub-config

Conversation

gnaulak-redhat (Collaborator) commented May 3, 2026

Sync the eval-hub/eval-hub main-branch provider config YAML files. These files are auto-synced with the hack/sync-evalhub-providers.py script.

This is in part to fix the CI pipeline on the eval-hub/eval-hub repository side.

Run python scripts/check_configmap_sync.py
Fetching ConfigMap listing from trustyai-explainability/trustyai-service-operator...
Found 9 ConfigMap(s) to check.

OK  collection-leaderboard-v2.yaml <-> config/collections/leaderboard-v2.yaml
OK  collection-safety-and-fairness-v1.yaml <-> config/collections/safety-and-fairness-v1.yaml
OK  collection-toxicity-and-ethical-principles.yaml <-> config/collections/toxicity-and-ethical-principles.yaml

Drift detected:

provider-garak-kfp.yaml vs config/providers/garak-kfp.yaml:
  ~ runtime.local.command: remote='true'  local='python tests/features/test_data/runtime/main.py'
provider-garak.yaml vs config/providers/garak.yaml:
  ~ runtime.local.command: remote='true'  local='python tests/features/test_data/runtime/main.py'
provider-guidellm.yaml vs config/providers/guidellm.yaml:
  ~ runtime.local.command: remote='true'  local='python tests/features/test_data/runtime/main.py'
provider-ibm-clear.yaml vs config/providers/ibm-clear.yaml:
  ~ runtime.local.command: remote='true'  local='python tests/features/test_data/runtime/main.py'
provider-lighteval.yaml vs config/providers/lighteval.yaml:
  ~ runtime.local.command: remote='true'  local='python tests/features/test_data/runtime/main.py'
provider-lm-evaluation-harness.yaml vs config/providers/lm_evaluation_harness.yaml:
  ~ runtime.local.command: remote='true'  local='python tests/features/test_data/runtime/main.py'
Error: Process completed with exit code 1.
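For illustration, the comparison behind the drift report can be sketched like this (a minimal sketch; find_drift and the data shapes are hypothetical, not the actual check_configmap_sync.py code):

```python
# Hypothetical sketch of the comparison a sync-drift checker performs:
# walk both mappings and report per-key differences at leaf values.

def find_drift(remote: dict, local: dict, prefix: str = "") -> list:
    """Return '~ key: remote=...  local=...' lines for differing leaf values."""
    drift = []
    for key in sorted(set(remote) | set(local)):
        path = f"{prefix}.{key}" if prefix else key
        r, l = remote.get(key), local.get(key)
        if isinstance(r, dict) and isinstance(l, dict):
            drift.extend(find_drift(r, l, path))  # recurse into nested maps
        elif r != l:
            drift.append(f"~ {path}: remote={r!r}  local={l!r}")
    return drift

# Mirrors the garak example from the log above.
remote = {"runtime": {"local": {"command": "true"}}}
local = {"runtime": {"local": {"command": "python tests/features/test_data/runtime/main.py"}}}
print("\n".join(find_drift(remote, local)))
```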

Summary by CodeRabbit

  • Chores
    • Updated runtime execution configurations across multiple evaluation providers.
    • Reformatted benchmark description text across provider configurations to multi-line/wrapped YAML for improved readability.
    • Reformatted an embedded collection description for consistent line-wrapping.
    • Adjusted the provider/collection serialization to change generated ConfigMap line-wrapping.

Co-Authored-By: Claude <noreply@anthropic.com>
coderabbitai Bot (Contributor) commented May 3, 2026

Warning

Rate limit exceeded

@gnaulak-redhat has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 56 minutes and 7 seconds before requesting another review.

To keep reviews running without waiting, you can enable the usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 88bf6d1b-481b-40fc-a25e-1cefa35f7cbd

📥 Commits

Reviewing files that changed from the base of the PR and between 5640c54 and 29f6b49.

📒 Files selected for processing (6)
  • config/configmaps/evalhub/collection-toxicity-and-ethical-principles.yaml
  • config/configmaps/evalhub/provider-garak-kfp.yaml
  • config/configmaps/evalhub/provider-ibm-clear.yaml
  • config/configmaps/evalhub/provider-lighteval.yaml
  • config/configmaps/evalhub/provider-lm-evaluation-harness.yaml
  • hack/sync-evalhub-providers.py
📝 Walkthrough

Walkthrough

Six EvalHub provider ConfigMaps update runtime commands from a no-op ('true') to execute python tests/features/test_data/runtime/main.py, and many benchmark description strings are reformatted (wrapped/quoted/escaped) into multi-line YAML representations across those ConfigMaps.

Changes

EvalHub provider ConfigMap formatting and runtime command updates

  • Runtime command update (config/configmaps/evalhub/provider-garak-kfp.yaml, .../provider-garak.yaml, .../provider-guidellm.yaml, .../provider-ibm-clear.yaml, .../provider-lighteval.yaml, .../provider-lm-evaluation-harness.yaml):
    Replace command: 'true' with command: python tests/features/test_data/runtime/main.py in provider ConfigMap runtime sections (local / k8s as applicable).
  • Benchmark description reflow (config/configmaps/evalhub/provider-garak-kfp.yaml, .../provider-garak.yaml, .../provider-ibm-clear.yaml, .../provider-lighteval.yaml, .../provider-lm-evaluation-harness.yaml):
    Reformat many benchmarks[*].description values from single-line strings to multi-line/quoted YAML forms; some entries use escaped Unicode (e.g., \u2014) or escaped non-ASCII in name fields.
  • Collection description reflow (config/configmaps/evalhub/collection-toxicity-and-ethical-principles.yaml):
    Adjust line-wrapping/indentation of the embedded description field without changing content.
  • Generator formatting tweak (hack/sync-evalhub-providers.py):
    When serializing provider/collection YAML payloads, call yaml.dump(..., width=100) to change the line-wrapping behavior of the generated ConfigMap data.
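The width argument mentioned here is a standard PyYAML parameter; its effect on line-wrapping can be seen in a small sketch (assuming the PyYAML package is installed; the data is illustrative):

```python
import yaml  # PyYAML

# A long single-line value, similar to the benchmark descriptions that
# get reflowed in the generated ConfigMaps.
long_text = " ".join(["benchmark"] * 40)

# A small width makes the emitter wrap the plain scalar across lines;
# a very large width keeps it on one line. Either way the loaded value
# is identical, since wrapped plain-scalar line breaks fold back into spaces.
wrapped = yaml.dump({"description": long_text}, width=40)
unwrapped = yaml.dump({"description": long_text}, width=10000)

print(wrapped.count("\n"), unwrapped.count("\n"))
```

This is why changing only width reflows the generated files without changing the values they parse to.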

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

lgtm

Suggested reviewers

  • ruivieira
  • saichandrapandraju
  • julpayne

Poem

🐰
I hopped through YAML, neat and spry,
Reflowed the text and changed the try,
From true to Python, now it runs,
Descriptions wrapped like tiny buns,
A quiet patch—hooray, goodbye dry.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main change: syncing provider ConfigMaps from the upstream eval-hub repository using the sync script.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


coderabbitai Bot (Contributor) left a comment


Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@config/configmaps/evalhub/provider-garak-kfp.yaml`:
- Around line 29-30: Replace the modified local command value with the upstream
placeholder: locate the YAML block with the local key and the command: python
tests/features/test_data/runtime/main.py entry in the provider-garak-kfp
configuration and restore it to upstream’s value (true) so the local block
matches the original upstream content exactly; do not change any other keys or
formatting in that provider-*.yaml file to preserve byte-for-byte alignment.

In `@config/configmaps/evalhub/provider-garak.yaml`:
- Around line 28-29: Revert the local-only override in provider-garak.yaml by
restoring runtime.local.command to the upstream value (true) instead of the
local test path; locate the runtime.local.command entry in provider-garak.yaml
and replace "python tests/features/test_data/runtime/main.py" with the upstream
placeholder/boolean true so the file matches upstream exactly and stops
check_configmap_sync.py from reporting drift.

In `@config/configmaps/evalhub/provider-guidellm.yaml`:
- Around line 24-25: This file alters the upstream config for provider-guidellm
by setting the local.command override; revert the change so the ConfigMap
matches upstream exactly: remove or restore the local.command entry in
provider-guidellm.yaml (the local block / local.command key) back to the
upstream placeholder/template value so the file is no longer intentionally
unsynced and scripts/check_configmap_sync.py will pass.

In `@config/configmaps/evalhub/provider-ibm-clear.yaml`:
- Around line 24-25: The local ConfigMap change replacing the upstream
placeholder for runtime.local.command must be reverted: restore the upstream
placeholder (set runtime.local.command back to true) in
config/configmaps/evalhub/provider-ibm-clear.yaml so the file exactly matches
upstream; do not replace the placeholder with the repo-specific test entrypoint
("python tests/features/test_data/runtime/main.py")—either revert this local
edit or land the equivalent change upstream and resync, ensuring the file stays
identical to eval-hub so python scripts/check_configmap_sync.py no longer
reports drift.

In `@config/configmaps/evalhub/provider-lighteval.yaml`:
- Around line 24-25: This PR hardcodes runtime.local.command to "python
tests/features/test_data/runtime/main.py" in provider-lighteval.yaml which
breaks the upstream sync; revert that change so runtime.local.command is
restored to the upstream placeholder/value (set runtime.local.command: true or
the exact upstream token), removing the local test script, and ensure
provider-*.yaml files (specifically the runtime.local.command entry) remain
identical to upstream templates rather than being modified for local tests.

In `@config/configmaps/evalhub/provider-lm-evaluation-harness.yaml`:
- Around line 29-30: Revert the hardcoded test runner in
provider-lm-evaluation-harness.yaml by restoring the upstream placeholder for
runtime.local.command (set it back to true) so the file matches upstream
exactly; locate the runtime.local.command entry in
provider-lm-evaluation-harness.yaml and replace the custom "python
tests/features/test_data/runtime/main.py" value with the upstream value true to
satisfy scripts/check_configmap_sync.py.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1a2d0b8f-f572-42ff-8555-9cf325c22537

📥 Commits

Reviewing files that changed from the base of the PR and between 2432800 and 540be9d.

📒 Files selected for processing (6)
  • config/configmaps/evalhub/provider-garak-kfp.yaml
  • config/configmaps/evalhub/provider-garak.yaml
  • config/configmaps/evalhub/provider-guidellm.yaml
  • config/configmaps/evalhub/provider-ibm-clear.yaml
  • config/configmaps/evalhub/provider-lighteval.yaml
  • config/configmaps/evalhub/provider-lm-evaluation-harness.yaml

Comment thread config/configmaps/evalhub/provider-garak-kfp.yaml
Comment thread config/configmaps/evalhub/provider-garak.yaml
Comment thread config/configmaps/evalhub/provider-guidellm.yaml
Comment thread config/configmaps/evalhub/provider-ibm-clear.yaml
Comment thread config/configmaps/evalhub/provider-lighteval.yaml
Comment thread config/configmaps/evalhub/provider-lm-evaluation-harness.yaml
ppadashe-psp (Collaborator) left a comment


lgtm

openshift-ci Bot commented May 4, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ppadashe-psp

The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gnaulak-redhat (Collaborator, Author) commented:

/retest-required

openshift-ci Bot removed the lgtm label May 4, 2026

openshift-ci Bot commented May 4, 2026

New changes are detected. LGTM label has been removed.

gnaulak-redhat force-pushed the chore-sync-evalhub-config branch 4 times, most recently from 9519019 to f6d1064 on May 4, 2026 at 09:31
…onfigMaps

Replace yaml.dump round-trip with raw content embedding in sync-evalhub-providers.py
to preserve upstream YAML formatting exactly. Re-sync all provider and collection
ConfigMaps to reflect the preserved original formatting.

Co-Authored-By: Claude <noreply@anthropic.com>
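The raw-embedding approach described in this commit can be sketched as follows (helper name and manifest shape are illustrative, not the actual sync-evalhub-providers.py code):

```python
import yaml  # PyYAML

def build_configmap(name: str, key: str, raw_content: str) -> str:
    """Embed upstream file content verbatim as a ConfigMap data value."""
    configmap = {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        # Embedding the raw string, instead of load/dump round-tripping it,
        # keeps the upstream formatting byte-for-byte inside the value.
        "data": {key: raw_content},
    }
    return yaml.dump(configmap, default_flow_style=False)

upstream = "runtime:\n  local:\n    command: 'true'\n"
manifest = build_configmap("provider-garak", "garak.yaml", upstream)
```

Loading the manifest back with yaml.safe_load recovers the embedded value unchanged, which is the property a byte-for-byte sync check relies on.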
gnaulak-redhat force-pushed the chore-sync-evalhub-config branch from f6d1064 to a96100d on May 4, 2026 at 09:32
gnaulak-redhat and others added 2 commits May 5, 2026 12:23
Limit line length to 100 characters in yaml.dump to reduce excessive wrapping
while keeping output consistent. Re-sync all provider and collection ConfigMaps.

Co-Authored-By: Claude <noreply@anthropic.com>
Adjust line width to 117 characters in yaml.dump to better match yamllint
line-length limits. Re-sync all affected provider and collection ConfigMaps.

Co-Authored-By: Claude <noreply@anthropic.com>
coderabbitai Bot (Contributor) left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
config/configmaps/evalhub/provider-lm-evaluation-harness.yaml (1)

1-3100: ⚠️ Potential issue | 🟠 Major

Fix yamllint violations before merge. The file currently has 102 yamllint warnings that prevent CI/CD compliance:

  • 101 line-length violations (lines exceeding 80 characters)
  • 1 missing document start ("---") warning

Address these issues by either rewrapping long lines or adjusting yamllint configuration rules in CI/CD to match the file's needs.
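If the second option is taken, a repository .yamllint override along these lines would relax the failing rules (a sketch; the exact max value and whether to scope it to generated files are maintainer choices):

```yaml
# .yamllint (sketch): relax the rules that the generated ConfigMaps trip.
extends: default

rules:
  line-length:
    max: 130        # matches the width used by the sync script's yaml.dump
  document-start: disable
```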

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@config/configmaps/evalhub/provider-lm-evaluation-harness.yaml` around lines 1
- 3100, The ConfigMap value lm_evaluation_harness.yaml is failing yamllint due
to a missing document start and many >80-character lines; add a YAML document
start ("---") at the top of the lm_evaluation_harness.yaml value and either
rewrap long scalar lines (especially long description and name fields inside the
lm_evaluation_harness.yaml multi-line string) to <=80 chars or update the CI
yamllint config to allow longer line-lengths for this provider config; focus
changes around the metadata/data key where lm_evaluation_harness.yaml is defined
and the long "description" and "name" fields in benchmark entries (e.g., id:
lm_evaluation_harness, id: arc_easy, and the many benchmark description blocks).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@config/configmaps/evalhub/provider-lm-evaluation-harness.yaml`:
- Around line 1-3100: The ConfigMap value lm_evaluation_harness.yaml is failing
yamllint due to a missing document start and many >80-character lines; add a
YAML document start ("---") at the top of the lm_evaluation_harness.yaml value
and either rewrap long scalar lines (especially long description and name fields
inside the lm_evaluation_harness.yaml multi-line string) to <=80 chars or update
the CI yamllint config to allow longer line-lengths for this provider config;
focus changes around the metadata/data key where lm_evaluation_harness.yaml is
defined and the long "description" and "name" fields in benchmark entries (e.g.,
id: lm_evaluation_harness, id: arc_easy, and the many benchmark description
blocks).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fa6652b6-b146-4787-ae1e-adf769f2e73b

📥 Commits

Reviewing files that changed from the base of the PR and between 540be9d and 5640c54.

📒 Files selected for processing (8)
  • config/configmaps/evalhub/collection-toxicity-and-ethical-principles.yaml
  • config/configmaps/evalhub/provider-garak-kfp.yaml
  • config/configmaps/evalhub/provider-garak.yaml
  • config/configmaps/evalhub/provider-guidellm.yaml
  • config/configmaps/evalhub/provider-ibm-clear.yaml
  • config/configmaps/evalhub/provider-lighteval.yaml
  • config/configmaps/evalhub/provider-lm-evaluation-harness.yaml
  • hack/sync-evalhub-providers.py
✅ Files skipped from review due to trivial changes (3)
  • config/configmaps/evalhub/collection-toxicity-and-ethical-principles.yaml
  • hack/sync-evalhub-providers.py
  • config/configmaps/evalhub/provider-ibm-clear.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
  • config/configmaps/evalhub/provider-guidellm.yaml
  • config/configmaps/evalhub/provider-lighteval.yaml
  • config/configmaps/evalhub/provider-garak-kfp.yaml

Adjust line width to 130 characters in yaml.dump. Re-sync affected
provider and collection ConfigMaps.

Co-Authored-By: Claude <noreply@anthropic.com>
openshift-ci Bot commented May 5, 2026

@gnaulak-redhat: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/trustyai-service-operator-e2e
Commit: 29f6b49
Required: true
Rerun command: /test trustyai-service-operator-e2e

Full PR test history. Your PR dashboard.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
