Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Summarize compact snapshot logs to reduce noise #467

Merged
merged 7 commits into from
Nov 9, 2023

Conversation

seanlinsley
Copy link
Member

@seanlinsley seanlinsley commented Oct 24, 2023

The high frequency of compact snapshots causes them to create the vast majority of log lines in the default (info) log level. That makes it difficult to find logs related to errors when debugging an issue.

This PR summarizes the 6+ compact snapshots sent per minute into a single log line:

Compact snapshots submitted: 6 activity, 2 logs

@seanlinsley seanlinsley requested a review from a team October 24, 2023 14:46
@seanlinsley seanlinsley changed the title Move compact snapshot logs from info to verbose level Move successful compact snapshot logs from info to verbose level Oct 24, 2023
@seanlinsley
Copy link
Member Author

The significantly quieter logs may surprise some users, especially when updating the collector. Some alternatives:

  • show these at the info level once, then use verbose for the future calls, including this note: future instances of this message will be skipped
  • once a minute, generate a single log line that's a summary of this activity, e.g. showing that 10 activity snapshots and 2 log snapshots have been submitted

@lfittl
Copy link
Member

lfittl commented Oct 24, 2023

once a minute, generate a single log line that's a summary of this activity, e.g. showing that 10 activity snapshots and 2 log snapshots have been submitted

I like that idea - I do often use the logs line ("Submitted compact logs snapshot successfully") in particular to debug whether logs are being sent successfully, but wouldn't be a big deal if that gets sent once a minute vs right away. It would be a tad harder to work with if that didn't get written at all by default (since usually the background logs are not in verbose mode, and with systemd its a bit of effort to turn on verbose logging for the background process).

output/compact.go Outdated Show resolved Hide resolved
@seanlinsley
Copy link
Member Author

Okay, the logs now look like this:

./pganalyze-collector          
2023/10/25 11:20:00 I [server1] Submitted snapshot successfully
2023/10/25 11:20:02 I [server1] Compact snapshots submitted over the past minute: 6 activity
2023/10/25 11:21:10 I [server1] Compact snapshots submitted over the past minute: 6 activity, 4 logs
2023/10/25 11:22:10 I [server1] Compact snapshots submitted over the past minute: 5 activity, 2 logs
2023/10/25 11:23:10 I [server1] Compact snapshots submitted over the past minute: 5 activity, 2 logs
2023/10/25 11:24:20 I [server1] Compact snapshots submitted over the past minute: 6 activity

Note that it currently doesn't track the stats per-server. It wasn't clear to me what server identity value should be used for the map

@seanlinsley seanlinsley changed the title Move successful compact snapshot logs from info to verbose level Summarize compact snapshot logs to reduce noise Oct 25, 2023
@seanlinsley seanlinsley requested a review from lfittl October 25, 2023 16:29
@seanlinsley
Copy link
Member Author

@lfittl per our discussion, this has been updated to have a shorter log message and to track the stats per-server

Copy link
Contributor

@msakrejda msakrejda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments, but looks good.

details += ", "
}
}
logger.PrintInfo("Compact snapshots submitted: " + details)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we ever get here with an empty CompactLogStats map? If so, that would print a weird
Compact snapshots submitted: message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK that would only happen if all snapshots failed, which would be logged as errors separately. I'll go ahead and add a presence check here.

server.CompactLogTime = time.Now()
server.CompactLogStats = make(map[string]uint8)
} else if time.Now().Sub(server.CompactLogTime) > time.Minute {
server.CompactLogTime = time.Now()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a big deal, but since this is part of resetting the log stats state, it feels like it belongs together with resetting the map below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intentionally didn't do that, to avoid clock skew from the time it takes to build the log message. But it should never take a long time and this code path is non-critical so I'll move it down to improve readability.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 a way to avoid the timestamp changing over time would be to truncate it to the nearest even minute, instead of simply using time.Now()

Copy link
Member Author

@seanlinsley seanlinsley Nov 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would ensure we always print the log at an even minute. Though it does mean that when initially starting up the collector you may have to wait was long as 1m59s before anything is logged.

Or is it 1m30s? I'm not 100% sure what Truncate does

@@ -140,7 +140,7 @@ func submitCompactSnapshot(ctx context.Context, server *state.Server, collection
        } else if !quiet {
                logger.PrintVerbose("Submitted compact %s snapshot successfully", kind)
                if server.CompactLogTime.IsZero() {
-                       server.CompactLogTime = time.Now()
+                       server.CompactLogTime = time.Now().Truncate(time.Minute)
                        server.CompactLogStats = make(map[string]uint8)
                } else if time.Now().Sub(server.CompactLogTime) > time.Minute {
                        var keys []string
@@ -158,7 +158,7 @@ func submitCompactSnapshot(ctx context.Context, server *state.Server, collection
                        if len(details) > 0 {
                                logger.PrintInfo("Compact snapshots submitted: " + details)
                        }
-                       server.CompactLogTime = time.Now()
+                       server.CompactLogTime = time.Now().Truncate(time.Minute)
                        server.CompactLogStats = make(map[string]uint8)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example output:

./pganalyze-collector
2023/11/08 15:50:00 I [server1] Compact snapshots submitted: 5 activity
2023/11/08 15:50:00 I [server1] Submitted snapshot successfully
2023/11/08 15:51:00 I [server1] Compact snapshots submitted: 5 activity
2023/11/08 15:52:00 I [server1] Compact snapshots submitted: 5 activity
2023/11/08 15:53:00 I [server1] Compact snapshots submitted: 5 activity, 1 logs
2023/11/08 15:54:00 I [server1] Compact snapshots submitted: 5 activity, 1 logs
2023/11/08 15:55:00 I [server1] Compact snapshots submitted: 5 activity

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and added this change; it's nice that the logs show up reliably at the same time every minute.

@seanlinsley seanlinsley merged commit 9a7b89d into main Nov 9, 2023
3 checks passed
@seanlinsley seanlinsley deleted the compact-logs-verbose branch November 9, 2023 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants