Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Write Terraform CLI logs to container stdout #258

Merged
merged 16 commits into from
Aug 23, 2024

Conversation

suramasamy
Copy link
Contributor

Description of your changes

Added optional field enableTerraformCLILogging to the Workspace resource to enable writing of Terraform CLI logs to the container stdout in JSON format to assist with debugging and to view detailed information about Terraform operations.

Fixes #163

I have:

  • Run make reviewable to ensure this PR is ready for review.
  • Run make e2e with UPTEST_EXAMPLE_LIST="examples/workspace-enable-logging.yaml"

How has this code been tested

  • On the local cluster, applied the Workspace resource present in the examples workspace-enable-logging.yaml
  • Observed that the container logs has the expected Terraform Plan and Apply logs
  • Made changes to the Workspace YAML and re-applied
  • Observed that the container logs are updated with new Terraform Plan and Apply logs
  • Changed enableTerraformCLILogging to false, made few other Terraform code changes in the Workspace YAML and re-applied
  • Observed that the container logs doesn't have the new Terraform Plan and Apply logs
  • Toggled enableTerraformCLILogging to true and re-applied the Workspace
  • Deleted the Workspace resource, Observed that the container logs has the expected Terraform Destroy logs.

Sample Logs:

{"level":"info","ts":"2024-04-10T20:24:04.678Z","logger":"provider-terraform","msg":"\nTerraform used the selected providers to generate the following execution\nplan. Resource actions are indicated with the following symbols:\n + create\n\nTerraform will perform the following actions:\n\n # random_id.example_id will be created\n + resource \"random_id\" \"example_id\" {\n + b64_std = (known after apply)\n + b64_url = (known after apply)\n + byte_length = 7\n + dec = (known after apply)\n + hex = (known after apply)\n + id = (known after apply)\n }\n\n # random_password.password will be created\n + resource \"random_password\" \"password\" {\n + bcrypt_hash = (sensitive value)\n + id = (known after apply)\n + length = 16\n + lower = true\n + min_lower = 0\n + min_numeric = 0\n + min_special = 0\n + min_upper = 0\n + number = true\n + numeric = true\n + result = (sensitive value)\n + special = true\n + upper = true\n }\n\nPlan: 2 to add, 0 to change, 0 to destroy.\n\nChanges to Outputs:\n + random_id_hex = (known after apply)\n + random_password = (sensitive value)\n","request":"example-random-generator"}

@suramasamy
Copy link
Contributor Author

Hello @ytsarev @bobh66,
I'm wondering if anyone would be able to share a timeline for the review of this PR?
Thanks so much!

Copy link
Collaborator

@bobh66 bobh66 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @suramasamy - thanks for the PR! I took it for a spin and it looks good, just a couple of suggested changes.

ForProvider WorkspaceParameters `json:"forProvider"`
xpv1.ResourceSpec `json:",inline"`
ForProvider WorkspaceParameters `json:"forProvider"`
EnableTerraformCLILogging bool `json:"enableTerraformCLILogging,omitempty"`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this new attribute should be inside the ForProvider structure since it is used by the provider.

@@ -89,7 +91,7 @@ func main() {

// controller-runtime uses both ConfigMaps and Leases for leader
// election by default. Leases expire after 15 seconds, with a
// 10 second renewal deadline. We've observed leader loss due to
// 10 seconds renewal deadline. We've observed leader loss due to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The singular is actually ok here - English is a strange language.


## Enable Terraform CLI logs

Terraform CLI logs can be written to the container logs as standard output to assist with debugging and to view detailed information about Terraform operations.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Terraform CLI logs can be written to the container logs as standard output to assist with debugging and to view detailed information about Terraform operations.
Terraform CLI output can be written to the container logs to assist with debugging and to view detailed information about Terraform operations.

...
```

- `enableTerraformCLILogging`: Specifies whether logging is enabled (`true`) or disabled (`false`). When enabled, Terraform CLI command logs will be written to the container logs as standard output. Default is `false`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `enableTerraformCLILogging`: Specifies whether logging is enabled (`true`) or disabled (`false`). When enabled, Terraform CLI command logs will be written to the container logs as standard output. Default is `false`
- `enableTerraformCLILogging`: Specifies whether logging is enabled (`true`) or disabled (`false`). When enabled, Terraform CLI command output will be written to the container logs. Default is `false`

}
case 2:
if h.EnableTerraformCLILogging {
h.Logger.Info(string(log))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be useful to the user to prefix the specific command that produced the output in each case - meaning "plan", "apply" or "delete"?

@Upbound-CLA
Copy link

Upbound-CLA commented May 10, 2024

CLA assistant check
All committers have signed the CLA.

@ramiradaideh ramiradaideh force-pushed the support-terraform-cli-log branch from 1c7ed9b to 3ae263c Compare May 10, 2024 19:33
@ramiradaideh ramiradaideh force-pushed the support-terraform-cli-log branch from 3ae263c to 1d2cf22 Compare May 10, 2024 19:47
@ramiradaideh
Copy link
Contributor

New log Structure: Which includes the {"operation":value} block

{"level":"debug","ts":"2024-05-10T16:34:07.437-0400","logger":"provider-terraform","msg":"Reconciling","controller":"providerconfig/providerconfig.tf.upbound.io","request":{"name":"default"}}
{"level":"info","ts":"2024-05-10T16:34:09.895-0400","logger":"provider-terraform","msg":"Refreshing Terraform state in-memory prior to plan...\nThe refreshed state will be used to calculate this plan, but will not be\npersisted to local or remote state storage.\n\n\n------------------------------------------------------------------------\n\nAn execution plan has been generated and is shown below.\nResource actions are indicated with the following symbols:\n + create\n\nTerraform will perform the following actions:\n\n # random_id.example_id will be created\n + resource "random_id" "example_id" {\n + b64_std = (known after apply)\n + b64_url = (known after apply)\n + byte_length = 8\n + dec = (known after apply)\n + hex = (known after apply)\n + id = (known after apply)\n }\n\n # random_password.password will be created\n + resource "random_password" "password" {\n + bcrypt_hash = (sensitive value)\n + id = (known after apply)\n + length = 16\n + lower = true\n + min_lower = 0\n + min_numeric = 0\n + min_special = 0\n + min_upper = 0\n + number = true\n + numeric = true\n + result = (sensitive value)\n + special = true\n + upper = true\n }\n\nPlan: 2 to add, 0 to change, 0 to destroy.\n\nChanges to Outputs:\n + random_id_hex = (known after apply)\n + random_password = (sensitive value)\n\n------------------------------------------------------------------------\n\nNote: You didn't specify an "-out" parameter to save this plan, so Terraform\ncan't guarantee that exactly these actions will be performed if\n"terraform apply" is subsequently run.\n\n","request":"example-random-generators","operation":"plan"}
{"level":"info","ts":"2024-05-10T16:34:11.517-0400","logger":"provider-terraform","msg":"random_id.example_id: Creating...\nrandom_id.example_id: Creation complete after 0s [id=CS5FbALoRxU]\nrandom_password.password: Creating...\nrandom_password.password: Creation complete after 0s [id=none]\n\nApply complete! Resources: 2 added, 0 changed, 0 destroyed.\n\nOutputs:\n\nrandom_id_hex = 092e456c02e84715\nrandom_password = \n","request":"example-random-generators","operation":"apply"}

@tammy-young tammy-young force-pushed the support-terraform-cli-log branch from 17f560e to e0aab2c Compare May 13, 2024 19:12
@ramiradaideh
Copy link
Contributor

Hello @ytsarev @bobh66,

When available, could you please review the PR.

We also wanted to discuss if our changes could impact existing users of provider terraform

- zl := zap.New(zap.UseDevMode(*debug), UseISO8601())
+ zl := zap.New(zap.UseDevMode(*debug), UseJSONencoder())

Will this change impact the format of their container logs?

Currently on main, logs are shown in a structured and unstructured format.

2024-05-13T12:18:17.635-0400 DEBUG provider-terraform Reconciling {"controller": "managed/workspace.tf.upbound.io", "request": {"name":"example-inline"}}

With the new change the logs are only structured

{"level":"debug","ts":"2024-05-10T16:34:07.437-0400","logger":"provider-terraform","msg":"Reconciling","controller":"providerconfig/providerconfig.tf.upbound.io","request":{"name":"default"}}

@zach-source
Copy link
Contributor

This almost feels like it should be extrapolated as a provider config option across all providers so you can like opt into all json logging if you want that. Having it as a flag on the workspace feels too low level here.

@bobh66
Copy link
Collaborator

bobh66 commented May 18, 2024

This almost feels like it should be extrapolated as a provider config option across all providers so you can like opt into all json logging if you want that. Having it as a flag on the workspace feels too low level here.

Debugging tends to happen at the individual Workspace level so I think it's appropriate to enable/disable logging at that level.

@jamesdobson
Copy link

Hi @bobh66 and @zach-source, I think there is a bit of a miscommunication as there are two separate questions here:

  1. The first question was, for Terraform CLI output, should the option to control whether it is logged or not be at the Workspace level or the ProviderConfig level. As @bobh66 said, the Workspace level makes the most sense because that's where debugging of Terraform configurations tends to happen.
  2. Regarding @ramiradaideh's question: this PR makes a change to the provider that appears to switch all of the controller's log output from plain text to structure/JSON format. Given that we can't know all the possible ways that users of this provider may have set up their log monitoring systems and alerts, we probably shouldn't change this unconditionally.

To address the second item, I think an option to select the controller's log format should be introduced. Correct me if I'm wrong, but I don't think the option can be put at the Workspace level or the ProviderConfig level, since that will result in mixed-format output from the controller process. I think the only place the option can be is as an option to the controller process itself; it's something that could be changed via an environment variable on the Deployment for the controller.

Thoughts?

Copy link
Collaborator

@bobh66 bobh66 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's separate the logging format changes into a separate PR and make it configurable so we can get the rest of the changes merged. Thanks

@@ -64,7 +66,7 @@ func main() {
)
kingpin.MustParse(app.Parse(os.Args[1:]))

zl := zap.New(zap.UseDevMode(*debug), UseISO8601())
zl := zap.New(zap.UseDevMode(*debug), UseJSONencoder())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree that the logging format change should be in a separate PR and should be able to be specified as a runtime argument. It should default to the current format to prevent breaking existing deployments and allow for a json option to be specified. If we want to change the default log format in the future we can plan for it and announce it ahead of time, but for now I think it's better not to change the default format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @bobh66, we reverted the changes for the logging format. Please review them when you get the chance and let us know if you require anything else to get this PR merged!

@tammy-young tammy-young force-pushed the support-terraform-cli-log branch from 3a174a1 to b0723c5 Compare July 2, 2024 15:16
Signed-off-by: Katrina Ronquillo <[email protected]>
@tammy-young tammy-young force-pushed the support-terraform-cli-log branch from b0723c5 to 04387b3 Compare July 2, 2024 15:18
Signed-off-by: Katrina Ronquillo <[email protected]>
@suramasamy
Copy link
Contributor Author

Hello @ytsarev @bobh66,
Could you please help in reviewing this PR, Thanks so much!

@denniskniep
Copy link
Contributor

Does it make sense to encapsulate the recurring logic

	switch cmd.ProcessState.ExitCode() {
	case 0:
		if h.EnableTerraformCLILogging {
			h.Logger.Info(string(log), "operation", "<operation>")
		}
	default:
		ee := &exec.ExitError{}
		errors.As(err, &ee)
		if h.EnableTerraformCLILogging {
			h.Logger.Info(string(ee.Stderr), "operation", "<operation>")
		}
	} 

into the runCommand function? A nice side effect is, that all commands are logged automatically. i.e. currently the workspace cmd is not logged.

Furthermore it would be nice to emit also a log before the command is executed including the cmd name and all the args. This would help a lot during troubleshooting, because one can reproduce with the exact commands. I think it makes also sense to implement this generally in the runCommand function.

What do you think @tammy-young @suramasamy @bobh66 ?

@negz
Copy link
Member

negz commented Aug 9, 2024

I know it's late in the game, but it might be worth taking a look at crossplane/crossplane-runtime#754.

I like the pattern we landed on for that feature (provider change logs):

  • The typical, existing provider logs are written to the main provider container's stdout. These logs are structured, but heterogenous in their schema - different log lines might have different fields.
  • The new change logs are sent over RPC to a sidecar container, which writes them to its stdout. These logs are structured and homogenous.

This way our change logs aren't mixed in with regular logs. You can easily configure fluent bit or similar to scrape all logs from the sidecar container and do something different with them from regular logs.

Copy link
Collaborator

@bobh66 bobh66 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - sorry for the delay. We can start with this implementation and make changes as needed. Thanks!

@bobh66 bobh66 merged commit 8850b8a into upbound:main Aug 23, 2024
9 checks passed
@suramasamy
Copy link
Contributor Author

@bobh66 Thanks for merging the changes, Could you please let us know when this planned to be released?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Terraform apply logs