Take entity data into account for payload size #1483
base: main
Conversation
It is more complicated than this.
MetricData / EntityData are not directly related with regard to the PMD request structure:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_PutMetricData.html#API_PutMetricData_RequestParameters
We have a new top-level item, EntityMetricData.
The current splitting code is here:
c.metricDatumBatch.Size += payload(datums[i])
It looks like it increments the payload size based on the MetricDatum and then calls isFull to determine whether the batch is full.
The entity information is appended on line 179, but it looks like we only send the datum into the payload calculation on line 180. I think we need to either update payload to also process the entityStr/Partition, or += the additional Size separately.
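The fix described above (count the entity bytes alongside the datum before calling isFull) could be sketched roughly as follows. This is an illustration, not the agent's actual code: payload, entityPayload, and the byte sizes are hypothetical stand-ins for whatever size estimation the real implementation uses.

```go
package main

import "fmt"

// bytesPerPayloadCap is the 1 MB PutMetricData payload limit.
const bytesPerPayloadCap = 1_000_000

// datum and entity stand in for the real MetricDatum / entity types;
// serializedBytes is an assumed pre-computed size estimate.
type datum struct{ serializedBytes int }
type entity struct{ serializedBytes int }

func payload(d datum) int        { return d.serializedBytes }
func entityPayload(e entity) int { return e.serializedBytes }

type batch struct {
	Size  int
	Count int
}

func (b *batch) isFull() bool { return b.Size >= bytesPerPayloadCap }

// append adds both the datum and its entity overhead to Size, so that
// isFull reflects the full EntityMetricData request structure instead
// of undercounting by the entity bytes.
func (b *batch) append(d datum, e entity) {
	b.Size += payload(d) + entityPayload(e)
	b.Count++
}

func main() {
	var b batch
	// A datum that looks safely under the cap on its own (900 KB)
	// tips over once its entity overhead (150 KB) is counted.
	b.append(datum{900_000}, entity{150_000})
	fmt.Println(b.Size, b.isFull()) // 1050000 true
}
```

The point of the sketch: whichever of the two options is chosen (folding entity bytes into payload, or a separate +=), the entity contribution has to land in Size before the isFull check runs.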
Lisa's commit might help; it looks like Partition changed from a list to a map:
9849658
I am not 100% sure where the serialization happens, maybe BuildMetricDatum or WriteToCloudWatch. We need to estimate the payload size for the entity that is getting serialized.
I was expecting testing beyond unit testing, maybe testing end to end, or reproducing the current >1 MB payload scenario and demonstrating the payload is batched appropriately. Maybe add something to cloudwatch-agent-tests.
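The >1 MB reproduction the comment asks for could look roughly like this: feed in enough entity-heavy datums to exceed the cap in total, then assert every emitted batch stays under it. splitIntoBatches and the 2,600-byte per-datum size (datum plus entity) are illustrative assumptions, not the agent's real splitter.

```go
package main

import "fmt"

// capBytes is the 1 MB PutMetricData payload limit.
const capBytes = 1_000_000

// splitIntoBatches greedily packs per-datum byte sizes into batches
// that each stay under capBytes, flushing before an overflow.
func splitIntoBatches(sizes []int) [][]int {
	var batches [][]int
	var cur []int
	curSize := 0
	for _, s := range sizes {
		if curSize+s > capBytes && len(cur) > 0 {
			batches = append(batches, cur)
			cur, curSize = nil, 0
		}
		cur = append(cur, s)
		curSize += s
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}

func main() {
	// 600 datums at an assumed ~2.6 KB each (datum + entity) is ~1.56 MB
	// in total, so a correct splitter must produce at least two batches,
	// none of which exceeds the cap.
	sizes := make([]int, 600)
	for i := range sizes {
		sizes[i] = 2600
	}
	batches := splitIntoBatches(sizes)
	underCap := true
	for _, b := range batches {
		total := 0
		for _, s := range b {
			total += s
		}
		if total > capBytes {
			underCap = false
		}
	}
	fmt.Println(len(batches), underCap) // 2 true
}
```

A splitter that ignores entity bytes would size these datums at their datum-only weight, pack too many per batch, and trip RequestEntityTooLarge on send; the assertion above is what the end-to-end check would verify.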
@@ -482,7 +485,7 @@ func TestPublish(t *testing.T) {
 	interval := 60 * time.Second
 	// The buffer holds 50 batches of 1,000 metrics. So choose 5x.
 	numMetrics := 5 * datumBatchChanBufferSize * defaultMaxDatumsPerCall
-	expectedCalls := numMetrics / defaultMaxDatumsPerCall
+	expectedCalls := 388 // Updated to match the observed number of calls
Is it possible to write out the math instead of a hard-coded number, so the size calculation behind it is understandable?
I would recommend getting consensus on the expected payload size from the CW front end for a given metric. Let's make sure our assumptions are correct before merging this. Also, PR builds seem to be failing?
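One possible shape for the math being asked for: derive the per-call datum count from the payload cap and the per-datum byte cost, then take the ceiling over the total metric count. The byte sizes below are pure assumptions for illustration and do not reproduce the observed 388; the real values would come from the agreed-upon per-metric payload size.

```go
package main

import "fmt"

// Illustrative constants; defaultMaxDatums mirrors the agent's
// defaultMaxDatumsPerCall, the byte sizes are assumptions.
const (
	payloadLimit     = 1_000_000 // 1 MB PutMetricData payload cap
	defaultMaxDatums = 1000
	avgDatumSize     = 2200 // assumed serialized MetricDatum bytes
	avgEntitySize    = 400  // assumed per-datum entity overhead
)

// expectedCallsFor derives the call count from the payload cap rather
// than hard-coding an observed number.
func expectedCallsFor(numMetrics int) (perCall, calls int) {
	perCall = defaultMaxDatums
	if byCap := payloadLimit / (avgDatumSize + avgEntitySize); byCap < perCall {
		perCall = byCap // the byte cap, not the datum-count cap, binds
	}
	calls = (numMetrics + perCall - 1) / perCall // ceiling division
	return perCall, calls
}

func main() {
	numMetrics := 5 * 50 * defaultMaxDatums // mirrors the test's 5x buffer sizing
	perCall, calls := expectedCallsFor(numMetrics)
	fmt.Println(perCall, calls) // the exact numbers depend on the assumed sizes
}
```

Written this way, the test's expectation changes with the agreed per-metric size instead of silently drifting from a magic number.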
Description of the issue
Due to the addition of the new EntityMetricData field to the metrics, the current calculation of the payload size is not accurate. The agent undercounts the payload size by not accounting for entity data, which leads to instances where the data appears to be under the 1 MB limit but is actually over it. As a result, the metrics are not batched properly and cannot be sent to CloudWatch.
Description of changes
License
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Tests
I added a test that replicates the original scenario that was resulting in a RequestEntityTooLarge error.
Requirements
Before committing the code, please do the following steps.
make fmt and make fmt-sh ✅
make lint ✅