DADP-71 Add ADP point telemetry to Agent telemetry #50750
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -260,7 +260,13 @@ var defaultProfiles = ` | |
| - name: logs.auto_multi_line_default_would_truncate | ||
| - name: logs_destination.destination_workers | ||
| - name: point.sent | ||
| aggregate_tags: | ||
| - domain | ||
| - remote_agent | ||
| - name: point.dropped | ||
| aggregate_tags: | ||
| - domain | ||
| - remote_agent | ||
|
Contributor
Author
My understanding is that these will change only the COAT version of the metrics to start including these two tags. Since COAT is internal, strict compatibility on the shape of these metrics is not required.
Contributor
should not be a problem |
||
| - name: transactions.input_count | ||
| - name: transactions.requeued | ||
| - name: transactions.retries | ||
|
|
||
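As a rough illustration of what adding `aggregate_tags` to `point.sent` and `point.dropped` is expected to change, the sketch below sums samples per unique combination of the listed tags and drops every other tag. This is an assumption about the aggregation semantics, not the Agent's implementation: the `sample` type, the `aggregateByTags` helper, the `endpoint` tag, and the tag values are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sample is a hypothetical stand-in for one telemetry data point.
type sample struct {
	labels map[string]string
	value  float64
}

// aggregateByTags sums sample values per unique combination of the kept tags.
// Labels not listed in keep are dropped; a missing kept label is treated as "".
func aggregateByTags(samples []sample, keep []string) map[string]float64 {
	out := make(map[string]float64)
	for _, s := range samples {
		parts := make([]string, 0, len(keep))
		for _, k := range keep {
			parts = append(parts, k+":"+s.labels[k])
		}
		out[strings.Join(parts, ",")] += s.value
	}
	return out
}

func main() {
	// Hypothetical samples for a single metric such as point.sent.
	samples := []sample{
		{labels: map[string]string{"domain": "a.example", "remote_agent": "adp", "endpoint": "series"}, value: 3},
		{labels: map[string]string{"domain": "a.example", "remote_agent": "adp", "endpoint": "sketches"}, value: 4},
		{labels: map[string]string{"domain": "a.example"}, value: 5},
	}
	agg := aggregateByTags(samples, []string{"domain", "remote_agent"})

	// Print groups in a stable order.
	keys := make([]string, 0, len(agg))
	for k := range agg {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s => %v\n", k, agg[k])
	}
}
```

Under this assumption, the two samples that differ only in the unlisted `endpoint` tag collapse into one series, while the sample without a `remote_agent` value stays separate.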
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -38,21 +38,37 @@ func (c *checkImpl) Run() error { | |
| return err | ||
| } | ||
|
|
||
| // Remote Agent Registry telemetry lives in the regular registry. Gather it on a best-effort basis so failures there | ||
| // do not prevent the customer-facing telemetry check from reporting Core Agent default telemetry values. | ||
| var regularMfs []*dto.MetricFamily | ||
| if gathered, err := c.telemetry.Gather(false); err != nil { | ||
|
Contributor
Author
Note that this will now call out to remote agents via RAR every 15 seconds (default interval on
Contributor
But why is it important to change if it will be sent out only every 15m?
Contributor
COAT is emitted every 15m, but this PR also includes the telemetry in the |
||
| log.Warnf("failed to gather regular telemetry metrics for default telemetry merge: %v", err) | ||
|
Contributor
Author
If this fails it could get pretty noisy, since it would emit every 15 seconds. Should I remove it or make it debug level?
Contributor
I would vouch for debug level
Contributor
I think if it is only every 15s, it seems reasonable to me. The impact is that this telemetry is missing, which is significant. |
||
| } else { | ||
| regularMfs = gathered | ||
| } | ||
|
|
||
| mergeLabelsByMetric := discoverMergeLabels(mfs, regularMfs) | ||
| mergedMetrics := collectMergeMetrics(mfs, false, mergeLabelsByMetric) | ||
| mergedMetrics.merge(collectMergeMetrics(regularMfs, true, mergeLabelsByMetric)) | ||
|
|
||
| sender, err := c.GetSender() | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| sender.SetNoIndex(true) | ||
|
|
||
| c.sendMergedMetrics(mergedMetrics, sender) | ||
| c.handleMetricFamilies(mfs, sender) | ||
|
|
||
| return nil | ||
| } | ||
|
|
||
| func (c *checkImpl) handleMetricFamilies(mfs []*dto.MetricFamily, sender sender.Sender) { | ||
| for _, mf := range mfs { | ||
| - if mf.Name == nil || mf.Type == nil || len(mf.Metric) == 0 { | ||
| + // Merged metrics are emitted explicitly by sendMergedMetrics so overlapping regular-registry values can be included | ||
| + // without changing customer-facing metric names or tags. | ||
| + if mf == nil || mf.Name == nil || mf.Type == nil || len(mf.Metric) == 0 || isMergedMetric(mf.GetName()) { | ||
| continue | ||
| } | ||
|
|
||
|
|
||
Interesting.