Add pod identity credential integration test #450

zhihonl · 2025-01-16T16:23:04Z

Description of the issue

Pod identity credential support has been added to CloudWatch Agent and FluentBit, but we lack an integration test for it.

Description of changes

Add a new Terraform folder for credential-related testing.
Add a new test case for when the deployment strategy is pod identity. This mode only checks for the basic Container Insights metrics that are present on any EKS cluster.

License

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Tests

Integration test run: https://github.com/aws/amazon-cloudwatch-agent/actions/runs/12797706927/job/35689119613. Note that the other test failures predate this change, and fixing them is out of scope for this PR.

zhihonl · 2025-01-16T16:24:24Z

test/metric/container_insights_util.go

@@ -145,11 +145,30 @@ func validateMetricsAvailability(dims string, expected []string, actual map[stri
 func compareMetrics(expected []string, actual map[string][][]types.Dimension) bool {
 	if len(expected) != len(actual) {
 		log.Printf("the count of fetched metrics do not match with expected count: expected-%v, actual-%v\n", len(expected), len(actual))
+
+		expectedSet := make(map[string]struct{})


This was added for debugging purposes, and I think it would be nice to keep. Currently, if a metric_value_benchmark test fails in the EKS scenario, the output tells us which tests failed but not which expected metrics were missing.
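A minimal sketch of the kind of debug output described above, assuming the comparison only needs metric names. The helper names `missingMetrics` and `unexpectedMetrics` are hypothetical and are not taken from the PR; the map value type is left generic so the sketch does not depend on the AWS SDK's `types.Dimension`.

```go
package main

import (
	"fmt"
	"sort"
)

// missingMetrics returns the expected metric names that are absent from
// the fetched metrics, sorted for stable log output.
// Hypothetical helper, not the PR's actual code.
func missingMetrics[V any](expected []string, actual map[string]V) []string {
	var missing []string
	for _, name := range expected {
		if _, ok := actual[name]; !ok {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

// unexpectedMetrics returns the fetched metric names that are not in the
// expected list, sorted for stable log output.
func unexpectedMetrics[V any](expected []string, actual map[string]V) []string {
	expectedSet := make(map[string]struct{}, len(expected))
	for _, name := range expected {
		expectedSet[name] = struct{}{}
	}
	var extra []string
	for name := range actual {
		if _, ok := expectedSet[name]; !ok {
			extra = append(extra, name)
		}
	}
	sort.Strings(extra)
	return extra
}

func main() {
	// Illustrative metric names only.
	expected := []string{"pod_cpu_utilization", "pod_memory_utilization"}
	actual := map[string]int{"pod_cpu_utilization": 1, "apiserver_request_total": 1}

	fmt.Println("missing:", missingMetrics(expected, actual))
	fmt.Println("unexpected:", unexpectedMetrics(expected, actual))
}
```

Logging both directions of the difference makes a count mismatch actionable, since the reader can see at once whether a metric disappeared or a new one showed up.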

zhihonl · 2025-01-16T16:26:45Z

test/metric_value_benchmark/eks_daemonset_test.go

@@ -31,7 +32,9 @@ type EKSDaemonTestRunner struct {

 func (e *EKSDaemonTestRunner) Validate() status.TestGroupResult {
 	var testResults []status.TestResult
-	testResults = append(testResults, metric.ValidateMetrics(e.env, "", eks_resources.GetExpectedDimsToMetrics(e.env))...)
+	if e.env.EksDeploymentStrategy != eksdeploymenttype.PODIDENTITY {


For more context, Container Insights provides a different set of metrics based on what's present on the cluster. When testing Pod Identity, I see a different set of metrics from this list:

amazon-cloudwatch-agent-test/test/metric_value_benchmark/eks_resources/util.go

Line 104 in 070c37d

var ExpectedDimsToMetrics = map[string][]string{

So instead of creating a new expected-metrics list that could be hundreds of lines long, this solution is simpler: we only need a sanity check that some metrics are present, which shows the credential is working.

For more context, Container Insights provides a different set of metrics based on what's present on the cluster. When testing Pod Identity, I see a different set of metrics from this list:

Is that a problem? What is the diff?

Not a problem. Container Insights only shows certain metrics in certain situations. For example, some error metrics are only shown when the failure case occurs for pods. IIRC, for pod identity I saw additional API-server-related metrics, which I'm guessing come from the pod identity add-on's API interactions.
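The branching discussed in this thread can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `validateNames` and its parameters are hypothetical, the metric names are examples, and the pod identity branch is assumed to pass when every basic metric is present, tolerating extras such as the API server metrics mentioned above.

```go
package main

import "fmt"

// EKSDeploymentType mirrors the string-typed enum shown in the diff.
type EKSDeploymentType string

const (
	REPLICA     EKSDeploymentType = "REPLICA"
	PODIDENTITY EKSDeploymentType = "PODIDENTITY"
)

// validateNames is a hypothetical stand-in for the Validate step.
// For PODIDENTITY it runs a sanity check: every basic metric must be
// present and extra metrics are tolerated. For other strategies it
// requires the fetched names to match the full expected list exactly.
func validateNames[V any](strategy EKSDeploymentType, full, basic []string, fetched map[string]V) bool {
	required := full
	if strategy == PODIDENTITY {
		required = basic
	}
	for _, name := range required {
		if _, ok := fetched[name]; !ok {
			return false
		}
	}
	if strategy == PODIDENTITY {
		return true // extras are fine in sanity-check mode
	}
	return len(fetched) == len(full)
}

func main() {
	full := []string{"pod_cpu_utilization", "pod_memory_utilization", "node_cpu_utilization"}
	basic := []string{"pod_cpu_utilization"}
	fetched := map[string]bool{"pod_cpu_utilization": true, "apiserver_request_total": true}

	fmt.Println(validateNames(PODIDENTITY, full, basic, fetched))
	fmt.Println(validateNames(REPLICA, full, basic, fetched))
}
```

The subset check keeps the pod identity test stable when the cluster emits situational metrics, at the cost of not catching a regression in any metric outside the basic list.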

lisguo · 2025-01-22T18:50:46Z

environment/eksdeploymenttype/eks_deployment_type.go

+	REPLICA     EKSDeploymentType = "REPLICA"
+	SIDECAR     EKSDeploymentType = "SIDECAR"
+	STATEFUL    EKSDeploymentType = "STATEFUL"
+	PODIDENTITY EKSDeploymentType = "PODIDENTITY"


It seems a bit odd to put PodIdentity here. It's not really a deployment type but a credential provider. Technically, pod identity can be enabled with any of these deployment options, right?

Correct. I added it here as an easy way to run the EKS tests with only the basic checks, without adding another variable to the class, which felt redundant.
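For comparison, the separation the reviewer describes could look something like the sketch below. It is purely illustrative: `CredentialProvider`, `testEnv`, and `basicChecksOnly` are hypothetical names that do not exist in the repository, and the PR deliberately does not take this route.

```go
package main

import "fmt"

// EKSDeploymentType holds only deployment shapes in this sketch.
type EKSDeploymentType string

const (
	REPLICA  EKSDeploymentType = "REPLICA"
	SIDECAR  EKSDeploymentType = "SIDECAR"
	STATEFUL EKSDeploymentType = "STATEFUL"
)

// CredentialProvider is a hypothetical second axis, independent of the
// deployment type.
type CredentialProvider string

const (
	DefaultCredentials CredentialProvider = "DEFAULT"
	PodIdentity        CredentialProvider = "PODIDENTITY"
)

// testEnv is a hypothetical, trimmed-down stand-in for the test environment.
type testEnv struct {
	Deployment EKSDeploymentType
	Credential CredentialProvider
}

// basicChecksOnly reports whether the run should use the reduced
// sanity-check validation instead of the full expected-metrics list.
func (e testEnv) basicChecksOnly() bool {
	return e.Credential == PodIdentity
}

func main() {
	env := testEnv{Deployment: SIDECAR, Credential: PodIdentity}
	fmt.Println(env.Deployment, env.Credential, env.basicChecksOnly())
}
```

Two fields allow any deployment type to be combined with pod identity, but they also add the extra variable the author wanted to avoid for a single test.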

lisguo · 2025-01-22T18:53:26Z

terraform/eks/daemon/credentials/pod_identity/main.tf

+// Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
+// SPDX-License-Identifier: MIT
+
+module "common" {


Just curious: is there a way to reuse the existing Terraform modules from our other EKS tests, so that we aren't adding all this Terraform code for every new EKS test?

I think it should be possible, but it would probably need some refactoring. I don't think we should do it in this PR, since it would add code that isn't necessarily related to this PR's goal.

lisguo · 2025-01-22T18:56:11Z

terraform/eks/daemon/credentials/pod_identity/main.tf

+  type                     = "ingress"
+}
+
+resource "null_resource" "clone_helm_chart" {


Is there a reason we are using Helm over the EKS add-on?

I'm wondering what this integ test is testing. I would assume we want to install the add-on with pod identity enabled and the release artifact of the agent, then verify that the functionality works and that calls to CloudWatch succeed.

I used the Helm chart for easier testing. For example, I can check out a new branch of the Helm chart repo with custom changes, then run Terraform locally with those changes. Using the add-on would have the same effect, but custom changes would be harder to test.

lisguo · 2025-01-22T18:57:50Z

terraform/eks/daemon/credentials/pod_identity/main.tf

+
+  provisioner "local-exec" {
+    command = <<-EOT
+      echo "Validating CloudWatch Agent and FluentBit with pod identity credential"


Technically, we are testing against the CloudWatch Agent, not FluentBit. Or are we pulling the latest FluentBit version in this test?

And if we are, I don't think we should block our agent release if FluentBit fails.

Fixed in new commit

lisguo · 2025-01-22T18:59:28Z

test/metric_value_benchmark/eks_daemonset_test.go

@@ -31,7 +32,9 @@ type EKSDaemonTestRunner struct {

 func (e *EKSDaemonTestRunner) Validate() status.TestGroupResult {
 	var testResults []status.TestResult
-	testResults = append(testResults, metric.ValidateMetrics(e.env, "", eks_resources.GetExpectedDimsToMetrics(e.env))...)
+	if e.env.EksDeploymentStrategy != eksdeploymenttype.PODIDENTITY {


For more context, Container Insights provides a different set of metrics based on what's present on the cluster. When testing Pod Identity, I see a different set of metrics from this list:

Is that a problem? What is the diff?

Add pod identity credential integration test

9ee01da

zhihonl requested a review from a team as a code owner January 16, 2025 16:23

zhihonl and others added 2 commits January 16, 2025 11:31

Fix linter error

054f67c

Merge branch 'main' into pod-identity-test

740f6b6

varunch77 previously approved these changes Jan 21, 2025

Merge branch 'main' into pod-identity-test

0a3f058

lisguo reviewed Jan 22, 2025

Remove fluentbit changes

03b5fe8

zhihonl dismissed varunch77’s stale review via 03b5fe8 January 24, 2025 02:32
