Skip to content

Implement process info collection on AIX#2028

Open
pgimalac wants to merge 4 commits into
shirou:masterfrom
pgimalac:pgimalac/aix-process-info
Open

Implement process info collection on AIX#2028
pgimalac wants to merge 4 commits into
shirou:masterfrom
pgimalac:pgimalac/aix-process-info

Conversation

@pgimalac
Copy link
Copy Markdown
Contributor

This PR implements basic process info collection on AIX, reading from the /proc/<pid>/psinfo file.

psinfo doesn't provide all the information that we need but it's readable by any user so this is a good start, its content is described in https://www.ibm.com/docs/en/aix/7.1.0?topic=files-proc-file.
Note that the fields are not ordered the same way in the doc than their actual memory layout.

The implementation has been tested on an AIX 7.3 host and compared to ps output.

Known limitations (on top of many functions just not being implemented):

  • Process name is limited to 15 characters
  • Full command line is limited to 79 characters
  • CPU times are combined user + system
  • Nice is a raw value from psinfo, doesn't match ps (ps seems to apply a formula but not always the same depending on processes, so returning raw value was simpler)
  • pr_size field is (slightly) bigger compared to ps vsz, probably includes more types of memory than ps does

This can be improved by reading other files but requires same user or running as root, and/or using perfstat with CGO, but starting with only psinfo makes the PR much simpler.

Copy link
Copy Markdown
Owner

@shirou shirou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution. This is a clean, well-tested implementation. The code follows project conventions and the test coverage is thorough. A few items to address:

  1. Struct definitions should use cgo -godefs: In this project, kernel structs are defined in types_<platform>.go as input to cgo -godefs, which generates the corresponding Go structs (see types_darwin.go, types_freebsd.go, types_openbsd go, etc.). Hand-written structs risk padding/alignment mismatches with the actual memory layout — especially since you noted that "the fields are not ordered the same way in the doc than their actual memory layout." While perhaps your implementation has been tested on a real machine, the cgo -godefs approach gives us a stronger guarantee going forward.

  2. Could you create a process/types_aix.go and generate the structs from the AIX headers? Something like:

//go:build ignore

/*
#include <sys/procfs.h>
*/
import "C"

type psinfo C.struct_psinfo
type lwpSinfo C.struct_lwpsinfo
type prTimestruc64 C.struct_timestruc64

Then run go tool cgo -godefs types_aix.go on your AIX 7.3 host to produce the generated output. Since you already have an AIX test environment, this should be straightforward.
2. Verify golang.org/x/sys/unix symbols on AIX: Adding aix to the build tag in process_posix.go pulls in symbols like Stat_t, Lstat, SIGSTOP, SIGCONT, etc. Please verify these all compile on AIX.
3. uint64 → int32/uint32 casts: The psinfo struct uses uint64 for PIDs and UIDs, but the return types are int32/uint32. These are safe in practice on AIX, but a brief comment noting this assumption would be helpful.
4. Minor: Consider renaming nullTerminatedBytesnullTerminatedString since it returns a string.

@pgimalac
Copy link
Copy Markdown
Contributor Author

pgimalac commented Apr 3, 2026

👋 @shirou Thanks for the comments, I agree !

Struct definitions should use cgo -godefs

I didn't know about that, that's super convenient !
FYI I had to pipe the output to sed 's/\*byte/uint64/' (similarly to what is done on BSD) so that binary.Read works

gh-worker-dd-mergequeue-cf854d Bot pushed a commit to DataDog/datadog-agent that referenced this pull request Apr 16, 2026
### What does this PR do?
Add live process monitoring support for AIX:
- Add `pkg/process/procutil/process_aix.go`: implements the `Probe` interface by reading `/proc/<pid>/psinfo` directly (world-readable 448-byte binary, no elevated privileges required). The psinfo struct layout is vendored from the DataDog/gopsutil fork — there is no `DataDog/gopsutil` import.
- Add `pkg/process/checks/system_info_aix.go`: `CollectSystemInfo` implementation for AIX via gopsutil v4.
- Guard `fp.Stats.CtxSwitches` and `fp.CtxSwitches` nil dereferences in `process.go` and `process_rt.go` (AIX does not provide context switch data).
- Exclude AIX from the generic `process_fallback.go` and the Linux-specific `system_info.go`.

### Motivation
Follow-up to #49052 and #49108. Cherry-picked and adapted from #47041.

### Describe how you validated your changes
Cherry-picked from #47041, which has been tested manually on AIX.

### Additional Notes
The process collection logic was originally added in the `DataDog/gopsutil` fork. Since that fork is no longer used, the logic has been vendored directly into the agent.
I'm working on upstreaming the logic to shirou/gopsutil but it's still WIP shirou/gopsutil#2028

Co-authored-by: pierre.gimalac <pierre.gimalac@datadoghq.com>
@pgimalac pgimalac requested a review from shirou April 28, 2026 13:16
@pgimalac
Copy link
Copy Markdown
Contributor Author

👋 Hey @shirou, would you mind giving this PR another review whenever you have time ? Thanks !

Copy link
Copy Markdown
Owner

@shirou shirou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the well-structured PR — tests and the cgo -godefs reproduction recipe are very nice. A few things I'd like fixed before merging:

Required

  1. Sort ChildrenWithContext (and ProcessesWithContext) by PID for parity with the other ports — see process_linux.go:369, process_freebsd.go:305.

  2. Don't bypass NewProcessWithContext in Processes/Children.: The comment explains skipping the PidExists check, but NewProcessWithContext also seeds p.createTime, which IsRunningWithContext uses to detect PID recycling. As written, AIX Process objects from these helpers silently lose that guarantee. Please either call NewProcessWithContext like the other ports, or seed createTime explicitly from the psinfo you already read.

  3. readPidsFromDir is currently duplicated in process_linux.go:1167 and process_solaris.go:278. Rather than adding a third copy, please move it to a shared location (e.g. process_posix.go, which now includes aix) and drop the linux/solaris copies. Happy to take that consolidation in this PR.

Please also address

  1. Move the "raw pr_nice" and "combined user+system in .User, .System=0" caveats into the godoc on NiceWithContext / TimesWithContext so API consumers see them.

  2. RlimitWithContextRlimitUsageWithContextErrNotImplementedError adds nothing while both are unimplemented; just return the error directly.

  3. The "psinfo stores PIDs/UIDs/GIDs as uint64; safe to truncate..." comment appears three times — once at the top of the file is enough.

Repeated readPsinfo opens per call (Name + Ppid + Times + MemoryInfo = 4 reads of the same file) is consistent with other ports, so feel free to leave a TODO and address later.

Thanks again for testing on AIX 7.3 and documenting the limitations so clearly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants