[YUNIKORN-3121] Add REST Endpoint for Scheduling Order Visibility#1042
[YUNIKORN-3121] Add REST Endpoint for Scheduling Order Visibility#1042mitdesai wants to merge 3 commits into
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #1042 +/- ##
==========================================
+ Coverage 81.76% 81.83% +0.06%
==========================================
Files 103 103
Lines 13535 13557 +22
==========================================
+ Hits 11067 11094 +27
+ Misses 2208 2203 -5
Partials 260 260 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
4e2c901 to
52b7e4a
Compare
|
@mitdesai please take care of the linter problem. You can check locally by running "make lint". |
9b91b8e to
f76bad8
Compare
manirajv06
left a comment
There was a problem hiding this comment.
Yes, it helps.
I propose orderLog or something like that similar to allocationLog to capture the order of events. We can come up with list of constant or message to be logged for each important step. As and when each step pass through, log the corresponding constant/message to the orderLog from the originating place as opposed to simulating the code given in this PR as it does not require us to update both the places. Finally, it can be dumped in the state dump. Not really sure this is something needs to be exposed through REST API.
In addition, we can also capture these kind of cases as well going forward.
|
I like the idea of in-place updates and maintaining the orderLog. But I still feel that it should be a separate endpoint even though we put this information in the state dump. State dumps are huge. And when we are debugging a scenario where some pods are not getting scheduled and we just need to know the order of what is getting scheduled, it is quicker and less expensive to get selective information. |
|
State dump is the place where we keep all these diagnostic info in general. It is just a matter of looking for "orderLog" string in the dump :) |
de98649 to
3b47b63
Compare
3b47b63 to
28895fa
Compare
| // Build a map of queue -> applications with pending requests | ||
| queueAppMap := make(map[string][]string) | ||
|
|
||
| for _, app := range pc.applications { |
There was a problem hiding this comment.
Map containing applications does not guarantee specific order. Since we are looking for scheduling order at any given moment, I would rather log into data structure which strictly maintains order based on the insertion order so that it can consumed as is in the state dump. I can see Queue.TryAllocate as the best place to push the application into the above mentioned slice.
There was a problem hiding this comment.
Ah, Good Catch. Not sure how I missed that. Let me fix it.
|
Just wanted to echo my agreement with @mitdesai on the utility of exposing this via REST interface; I also already use the REST endpoints for selective debugging over the massive state dumps. Fwiw, in my experience genai agents also do better with interfaces that let them grab specific information vs trying to deal with something too big for them to fit in context. I think as the utility of agents continues to increase it makes sense to keep agent ergonomics in mind too. |
What is this PR for?
Added a new Rest Endpoint for getting the scheduling order - These come in handy when debugging issues in scheduling.
What type of PR is it?
Todos
What is the Jira issue?
https://issues.apache.org/jira/browse/YUNIKORN-3121
How should this be tested?
Screenshots (if appropriate)
Questions: