Take entity data into account for payload size #1483

varunch77 · 2025-01-02T19:30:00Z

Description of the issue

Because of the new EntityMetricData field in the metrics payload, the current payload size calculation is inaccurate. The agent undercounts the payload size by not accounting for entity data, so a payload can appear to be under the 1 MB limit when it is actually over it. As a result, the metrics are not batched correctly and cannot be sent to CloudWatch.

Description of changes

Update the overheads and existing metric sizes to account for the entity data
Add a function to calculate entity size
Update existing unit tests to account for the increase in metric sizes
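
As a rough illustration of the second item, an entity size estimate could sum the lengths of the entity's keys and values plus a fixed framing overhead. This is only a sketch: the `entity` type, the `estimateEntitySize` name, and both overhead constants are assumptions made for the example, not the values or names used in this PR.

```go
package main

import "fmt"

// entity is a hypothetical stand-in for the SDK entity type:
// two string maps that end up serialized into the request.
type entity struct {
	KeyAttributes map[string]string
	Attributes    map[string]string
}

// Illustrative overheads only; the real values depend on how the
// request is serialized and are not taken from this PR.
const (
	entityPrefixOverhead = 50 // assumed fixed bytes per entity
	perPairOverhead      = 20 // assumed framing bytes per key/value pair
)

// estimateEntitySize returns an approximate serialized size in bytes.
func estimateEntitySize(e entity) int {
	size := entityPrefixOverhead
	for k, v := range e.KeyAttributes {
		size += perPairOverhead + len(k) + len(v)
	}
	for k, v := range e.Attributes {
		size += perPairOverhead + len(k) + len(v)
	}
	return size
}

func main() {
	e := entity{
		KeyAttributes: map[string]string{"Type": "Service", "Name": "checkout"},
		Attributes:    map[string]string{"PlatformType": "AWS::EC2"},
	}
	fmt.Println(estimateEntitySize(e)) // prints 153
}
```

An estimate like this only needs to be a safe upper bound on the serialized size, since overestimating costs an extra request while underestimating causes a rejected one.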

License

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Tests

I added a test that reproduces the original scenario that resulted in a RequestEntityTooLarge error:

The test creates two identical sets of metrics, except that one set has entity data and the other does not.
The set of metrics without entity data falls just below the 1 MB threshold.
The set of metrics with entity data is pushed over the 1 MB limit.
The test checks that the set of metrics with entity data takes 2 API calls and the set without takes 1 API call.
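
The scenario can be illustrated with a small simulation. This is a sketch under assumed numbers, not the test added in this PR: `countCalls`, `uniform`, the 1,000-byte datum size, and the 2,000-byte entity overhead are all hypothetical, chosen so that the same metrics fit in one request without entity data and spill into a second request with it.

```go
package main

import "fmt"

// Assumed 1 MB request limit, for illustration.
const maxPayloadBytes = 1_000_000

// countCalls greedily packs datum sizes into requests and returns how
// many requests are needed. baseOverhead is charged once per request.
func countCalls(datumSizes []int, baseOverhead int) int {
	if len(datumSizes) == 0 {
		return 0
	}
	calls := 1
	used := baseOverhead
	for _, s := range datumSizes {
		if used+s > maxPayloadBytes {
			// Current request is full: start a new one.
			calls++
			used = baseOverhead
		}
		used += s
	}
	return calls
}

// uniform builds n datum sizes of the same value.
func uniform(n, size int) []int {
	sizes := make([]int, n)
	for i := range sizes {
		sizes[i] = size
	}
	return sizes
}

func main() {
	metrics := uniform(999, 1000) // 999,000 bytes, just under the limit

	fmt.Println(countCalls(metrics, 0))    // without entity data: prints 1
	fmt.Println(countCalls(metrics, 2000)) // with entity data: prints 2
}
```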

Requirements

Before committing the code, please complete the following steps.

Run make fmt and make fmt-sh ✅
Run make lint ✅

chadpatel

It is more complicated than this.

MetricData and EntityData are not directly related in the PutMetricData (PMD) request structure:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_PutMetricData.html#API_PutMetricData_RequestParameters

We have a new top-level item, EntityMetricData.

The current splitting code is here

amazon-cloudwatch-agent/plugins/outputs/cloudwatch/cloudwatch.go

Line 180 in ef15aa4

c.metricDatumBatch.Size += payload(datums[i])

It looks like it increments the payload size based on the MetricDatum and then calls isFull to determine whether the batch is full.

The entity information is appended on line 179.

It looks like we only pass the datum into the payload calculation on line 180.

I think we need to either update payload to also process the entityStr/Partition, or add the entity's size to Size separately.

Lisa's commit might help. Looks like Partition changed from a list to a map
9849658

I am not 100% sure where the serialization happens

Maybe BuildMetricDatum or WriteToCloudWatch. We need to estimate the payload size for the entity that is getting serialized
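
One way to picture the suggestion above is a batch whose Partition is a map keyed by the serialized entity, where the entity's size is added to Size the first time that entity appears and each datum adds its own size. This is a minimal sketch, not the agent's code: the `datum` and `batch` types, the `add` method, and the choice to charge the entity once per partition are assumptions for illustration.

```go
package main

import "fmt"

// datum is a hypothetical stand-in for a metric datum with a
// pre-computed payload estimate.
type datum struct {
	Name string
	Size int
}

// batch groups datums by serialized entity and tracks estimated size.
type batch struct {
	Partition map[string][]datum
	Size      int
	MaxSize   int
}

func newBatch(maxSize int) *batch {
	return &batch{Partition: map[string][]datum{}, MaxSize: maxSize}
}

// add appends d under entityKey. The entity's own size is charged only
// when that entity first appears in the batch (an assumption here).
func (b *batch) add(entityKey string, entitySize int, d datum) {
	if _, seen := b.Partition[entityKey]; !seen {
		b.Size += entitySize
	}
	b.Partition[entityKey] = append(b.Partition[entityKey], d)
	b.Size += d.Size
}

// isFull reports whether the estimated size has reached the limit.
func (b *batch) isFull() bool {
	return b.Size >= b.MaxSize
}

func main() {
	b := newBatch(1000)
	b.add("entityA", 100, datum{Name: "cpu", Size: 300})
	b.add("entityA", 100, datum{Name: "mem", Size: 300})
	fmt.Println(b.Size, b.isFull()) // prints 700 false

	b.add("entityB", 150, datum{Name: "disk", Size: 200})
	fmt.Println(b.Size, b.isFull()) // prints 1050 true
}
```

Whether the entity should be counted once per partition or once per datum depends on how the request is actually serialized, which is the open question in this comment.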

chadpatel · 2025-01-02T21:37:16Z

I was expecting testing beyond unit tests: maybe end-to-end testing, or reproducing the current >1 MB payload scenario and demonstrating that the payload is batched appropriately. Maybe adding something to cloudwatch-agent-tests.


nathalapooja · 2025-01-09T22:53:27Z

plugins/outputs/cloudwatch/cloudwatch_test.go

@@ -482,7 +485,7 @@ func TestPublish(t *testing.T) {
 	interval := 60 * time.Second
 	// The buffer holds 50 batches of 1,000 metrics. So choose 5x.
 	numMetrics := 5 * datumBatchChanBufferSize * defaultMaxDatumsPerCall
-	expectedCalls := numMetrics / defaultMaxDatumsPerCall
+	expectedCalls := 388 // Updated to match the observed number of calls


Is it possible to write this as a formula instead of a hard-coded number, so that the math behind the size calculation is clear?
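
For illustration, a derived value could take the smaller of the count limit and the size limit per call and then round up. This is a sketch only: `expectedCalls`, `ceilDiv`, the uniform 1,500-byte datum size, and the 1,000-byte per-call overhead are assumptions for the example and are not meant to reproduce the 388 observed in the test.

```go
package main

import "fmt"

// ceilDiv returns ceil(a/b) for non-negative a and positive b.
func ceilDiv(a, b int) int {
	return (a + b - 1) / b
}

// expectedCalls estimates the number of requests for numMetrics datums
// of uniform size. Each request holds at most maxDatumsPerCall datums
// and at most maxPayload bytes including perCallOverhead.
// It returns 0 if a single datum cannot fit in a request.
func expectedCalls(numMetrics, maxDatumsPerCall, datumSize, perCallOverhead, maxPayload int) int {
	if datumSize <= 0 || maxDatumsPerCall <= 0 {
		return 0
	}
	// How many datums fit in one request by size alone.
	perCall := (maxPayload - perCallOverhead) / datumSize
	if perCall < 1 {
		return 0
	}
	// The count limit may be the tighter constraint.
	if perCall > maxDatumsPerCall {
		perCall = maxDatumsPerCall
	}
	return ceilDiv(numMetrics, perCall)
}

func main() {
	numMetrics := 5 * 50 * 1000 // 5x a buffer of 50 batches of 1,000 metrics

	// Size-bound: 666 datums fit per call, so 250,000 metrics need 376 calls.
	fmt.Println(expectedCalls(numMetrics, 1000, 1500, 1000, 1_000_000)) // prints 376

	// Count-bound: small datums, so the 1,000-per-call limit applies.
	fmt.Println(expectedCalls(numMetrics, 1000, 100, 1000, 1_000_000)) // prints 250
}
```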


lisguo · 2025-01-14T21:51:30Z

I would recommend getting consensus on the expected payload size from CW front end for a given metric. Let's make sure our assumptions are correct before merging this.

Also, the PR builds seem to be failing?

Update estimated metric sizes

c337b44

varunch77 requested a review from a team as a code owner January 2, 2025 19:30

Update unit tests to reflect new metrics sizes

f5f4d46

varunch77 changed the title ~~Update estimated metric sizes~~ Take entity data into account for payload size Jan 2, 2025

chadpatel reviewed Jan 2, 2025


varunch77 and others added 10 commits January 6, 2025 17:13

Update to add more comprehensive calculation of entity data

08b9d0f

nonamederrors fix

2a63f71

real nonamedreturns fix

6ec98a6

Decrease number of metrics for test due to increased size

f764ce7

Merge branch 'main' into update-payload-calculations

c4d99fa

Update number of expected calls in test

6e08a7b

Update unit test

32df23e

Add test to reproduce >1MB scenario

28d78af

Merge branch 'main' into update-payload-calculations

89a2ee9

Move entity size calculations

1865755

chadpatel reviewed Jan 9, 2025


nathalapooja reviewed Jan 9, 2025


varunch77 and others added 8 commits January 13, 2025 00:58

Rename strictEntityValidationsize to strictEntityValidationSize

38b022c

Fix log message typo

417f2b0

Revert overhead changes + add new entity metric data prefix overhead

9b8300a

Update all consts

29e133f

Update/revert test changes

c1d9494

Upload payload function

6ca1cc0

Fix typo

d81805d

Merge branch 'main' into update-payload-calculations

ee3e148

varunch77 and others added 3 commits January 14, 2025 22:46

Merge branch 'main' into update-payload-calculations

f4d9be6

Update payload function for entity calculation

807f4b7

Fix unit tests

4b0a375

varunch77 and others added 2 commits January 15, 2025 14:41

Add comments

459ef7c

Merge branch 'main' into update-payload-calculations

b301e9f
