Skip to content

fix: AIX memory statistics: add Available field and improve swap calculations#1982

Open
Dylan-M wants to merge 6 commits into
shirou:masterfrom
Dylan-M:dylanmyers/aix_mem
Open

fix: AIX memory statistics: add Available field and improve swap calculations#1982
Dylan-M wants to merge 6 commits into
shirou:masterfrom
Dylan-M:dylanmyers/aix_mem

Conversation

@Dylan-M
Copy link
Copy Markdown
Contributor

@Dylan-M Dylan-M commented Dec 23, 2025

Summary

Fixes and improves AIX memory statistics for the nocgo path:

Memory (callSVMon)

  • Available field: Populated from the real kernel-computed available column in svmon -G -O unit=KB, which includes free pages plus reclaimable cache. This value cannot be derived from raw page counts alone.
  • Precision preserved: Primary stats (Total, Used, Free, swap) are read from raw svmon -G page output and multiplied by os.Getpagesize(), avoiding the sub-KB truncation that -O unit=KB introduces. A second svmon call with -O unit=KB is made only to read the available column.
  • Swap calculations fixed: swap.Used is now explicitly set, and swap.UsedPercent is calculated.
  • Parsing extracted for testability: parseSVMonPages() and parseSVMonAvailable() are separate functions with table-driven unit tests using real AIX sample output.

Swap Devices (SwapDevicesWithContext)

  • GB/TB unit handling: parseLspsSize() now handles MB, GB, and TB suffixes in lsps -a output.
  • Unit tests: parseLspsOutput() and parseLspsSize() have comprehensive tests covering MB, GB, NaNQ (NFS paging spaces), multiple devices, and edge cases.

CGO path (mem_aix_cgo.go)

  • SwapDevicesWithContext: Implemented using perfstat.PagingSpaceStat().

Only affects AIX — no cross-platform changes.

@Dylan-M Dylan-M changed the title Fix AIX memory statistics: add Available field and improve swap calculations fix: AIX memory statistics: add Available field and improve swap calculations Dec 23, 2025
@Dylan-M Dylan-M marked this pull request as ready for review December 23, 2025 18:04
@Dylan-M Dylan-M force-pushed the dylanmyers/aix_mem branch from cd63eb7 to af34779 Compare January 24, 2026 14:42
@Dylan-M Dylan-M force-pushed the dylanmyers/aix_mem branch from af34779 to 0fa1bb4 Compare January 26, 2026 14:41
@Dylan-M Dylan-M force-pushed the dylanmyers/aix_mem branch 2 times, most recently from cf6175e to 964cf10 Compare February 16, 2026 14:34
@Dylan-M Dylan-M force-pushed the dylanmyers/aix_mem branch 2 times, most recently from fc8f669 to 1233a38 Compare March 20, 2026 21:33
Copy link
Copy Markdown
Owner

@shirou shirou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Dylan-M, I've reviewed this PR more carefully. Unlike your other PRs, this one is relatively small, has a single concern, and is focused only on AIX — which made it much easier to review thoroughly. I appreciate that.

I'm happy to merge this with the following changes:

  1. Handle GB (and potentially other units) in lsps -a parsing: The current code assumes the size column always ends with "MB", but lsps can report sizes in GB on larger systems. Please handle at least MB and GB.
  2. Add a unit test for SwapDevicesWithContext (nocgo path): The lsps -a parsing logic should have a test with sample output, similar to how other packages test their command parsing.

These are small fixes — once addressed, this is good to merge. Thanks!

@Dylan-M Dylan-M force-pushed the dylanmyers/aix_mem branch from 0ec928c to a420654 Compare March 21, 2026 12:40
@Dylan-M
Copy link
Copy Markdown
Contributor Author

Dylan-M commented Mar 21, 2026

Hi @Dylan-M, I've reviewed this PR more carefully. Unlike your other PRs, this one is relatively small, has a single concern, and is focused only on AIX — which made it much easier to review thoroughly. I appreciate that.

All of the rest should be resubmitted soon, in atomic PR "stacks".

I'm happy to merge this with the following changes:

1. Handle GB (and potentially other units) in `lsps -a` parsing: The current code assumes the size column always ends with "MB", but `lsps` can report sizes in GB on larger systems. Please handle at least MB and GB.

Thanks for catching that, none of my test systems have had larger. I've now updated it to match as high as AIX kernel docs show (TB). If we ever run into a case of needed higher than TB, we can revisit.

2. Add a unit test for `SwapDevicesWithContext` (nocgo path): The `lsps -a` parsing logic should have a test with sample output, similar to how other packages test their command parsing.

Yup, That was just me forgetting to add it. Thank you.

These are small fixes — once addressed, this is good to merge. Thanks!

No, thank you sir. Sorry for being such a pain - it is a lot of work and you were entirely correct it should have been split up better.

@Dylan-M Dylan-M requested a review from shirou March 23, 2026 13:05
@Dylan-M
Copy link
Copy Markdown
Contributor Author

Dylan-M commented Mar 26, 2026

@shirou This PR has been updated to address your review feedback:

  1. GB/TB unit handlingparseLspsSize() now handles MB, GB, and TB suffixes in lsps -a output.
  2. Unit test for nocgo path — Added mem_aix_nocgo_test.go with tests for parseLspsOutput() and parseLspsSize(), covering MB, GB, NaNQ (NFS paging spaces), and edge cases.
  3. Bonus: real Available memory — Switched from svmon -G (raw pages, no available column) to svmon -G -O unit=KB which provides a real available value distinct from free.

Ready for re-review when you have a moment. Thanks!

@shirou
Copy link
Copy Markdown
Owner

shirou commented Mar 26, 2026

unit=KB in callSVMon

Adding -O unit=KB to svmon causes it to convert from pages to KB internally, which truncates any sub-KB fractions. This loses precision compared to the current approach of reading raw page counts and multiplying by the page size.

I'd suggest keeping the default page-based output and replacing the hardcoded pagesize := uint64(4096) with os.Getpagesize() instead. The Available field addition and swap calculation improvements can be applied independently of this unit change.

PR description doesn't match implementation

The PR description says "The Available field is now explicitly set equal to Free memory", but the implementation actually parses the available column (p[6]) directly from svmon output — which is the correct approach. Please update the PR description to reflect what the code actually does.

Missing parse tests for callSVMon

There are good tests for parseLspsOutput, but the svmon output parsing logic in callSVMon has no corresponding test. Given the column index changes (now reading p[6]) and the unit discussion, it would be valuable to extract the parsing logic and add table-driven tests with sample svmon output, similar to what was done for lsps.

@Dylan-M
Copy link
Copy Markdown
Contributor Author

Dylan-M commented Mar 26, 2026

@shirou All three points addressed:

  1. Precision: Switched back to raw svmon -G page output with os.Getpagesize() for all primary stats (Total, Used, Free, swap). A separate svmon -G -O unit=KB call is made only to read the available column, which doesn't exist in the raw page output. This preserves full precision while still providing real available memory.

  2. PR description: Updated to accurately reflect the implementation — available is read from the kernel-computed column, not set equal to free.

  3. Parse tests: Extracted parseSVMonPages() and parseSVMonAvailable() as separate testable functions with table-driven tests using sample AIX output, matching the pattern used for parseLspsOutput().

@Dylan-M Dylan-M force-pushed the dylanmyers/aix_mem branch 2 times, most recently from d6c4887 to 53d3db6 Compare March 30, 2026 16:44
@Dylan-M Dylan-M force-pushed the dylanmyers/aix_mem branch from 53d3db6 to 085438a Compare April 9, 2026 22:53
Copy link
Copy Markdown
Owner

@shirou shirou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. I'd like to see tightened before merge.

Comment thread mem/mem_test.go Outdated
@Dylan-M
Copy link
Copy Markdown
Contributor Author

Dylan-M commented Apr 27, 2026

Tested empirically on AIX 7.3.0.0 — across 30 sequential samples with a 500ms cadence, Available > Free held strictly with a stable gap of ~253 MB (range 252.5–254.6 MB). The kernel maintains a non-trivial reclaimable cache footprint in normal operation, so the equality case isn't hit in practice.

The relaxation was appropriate for an earlier iteration of this PR where Available was set equal to Free as a fallback, but became stale once the parsing switched to reading the kernel's real available value. Tightened back to > and dropped the AIX-specific note. Pushed.

@Dylan-M Dylan-M force-pushed the dylanmyers/aix_mem branch from fbb29ee to a2a3865 Compare May 1, 2026 15:05
@Dylan-M
Copy link
Copy Markdown
Contributor Author

Dylan-M commented May 6, 2026

@shirou — going to follow your guidance and close out the AIX stack to restart with the one-PR-at-a-time discipline you asked for. The other 20+ PRs and the tracking issue close today.

One carve-out question on this PR specifically: it's been through three rounds of your review, the changes you asked for in the last round are in, and the diff is strictly AIX-only (mem/mem_aix*.go plus a single test-assertion edit). Would you prefer I close this one too and re-submit as the first PR under the new pattern, or keep it open since the review investment is already largely there?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants