Add arg_min, arg_max #182

Open
wants to merge 1 commit into
base: main
from

Conversation

Contributor

mhr3 commented Jan 10, 2025

Do we not have preview links for these?

Comment on lines +6 to +12
The `arg_max` aggregation in APL helps you identify the record with the maximum value for a specific numeric field and return one or more additional fields from that record. Use `arg_max` when you want to determine key details associated with a record having the maximum value, such as the longest request duration, highest transaction amount, or most significant span duration.

This aggregation is particularly useful in scenarios like:

- Pinpointing the slowest HTTP requests in log data.
- Identifying the longest span durations in OpenTelemetry traces.
- Highlighting the highest severity security alerts in logs.
Contributor

@mhr3 mhr3 Jan 10, 2025

I think this mostly describes a `max()` aggregation and doesn't focus on the 'arg' part of it.

o1 makes it clearer, IMO:

- You group your data by one or more columns (using `by` in `summarize`).
- Within each group, Kusto finds the row where a particular expression (often a column) has the maximum value.
- It then returns specified columns from that “maximum” row.
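
To make the difference concrete, here is a rough sketch of two separate queries. It reuses the `['otel-demo-traces']` fields from this PR's own example (`duration`, `span_id`, `['service.name']`) and has not been run against the dataset:

```kusto
// Query 1: plain max() returns only the largest duration per service.
['otel-demo-traces']
| summarize max(duration) by ['service.name']

// Query 2: arg_max() finds the row with the largest duration per service
// and also returns span_id from that same row.
['otel-demo-traces']
| summarize arg_max(duration, span_id) by ['service.name']
```

The first query says how long the slowest span was; the second also says which span it was, which is the 'arg' part the intro should lead with.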

Contributor


<CodeGroup>
```sql Splunk example
| stats max(req_duration_ms) as max_duration by id, uri
```
Contributor

AFAIK Splunk doesn't have an equivalent for `arg_max`; this is just

| summarize max(req_duration_ms) by id, uri


```kusto
['sample-http-logs']
| summarize arg_max(req_duration_ms, method) by uri
```
Contributor

@mhr3 mhr3 Jan 10, 2025

IMO a better example would swap `uri` and `method`, i.e. `arg_max(duration, uri) by method`, which would return the slowest path for each HTTP method.

> Find the slowest HTTP request for each URI in the ['sample-http-logs'] dataset.

That doesn't need `arg_max`; it would be just `summarize max(duration) by uri`.

Contributor

@mhr3 mhr3 Jan 10, 2025

Or perhaps replace `method` with `request_id` in the original; that also makes sense.
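
For reference, the two alternatives suggested in this thread might look roughly like the sketch below. It is untested and makes two assumptions: the duration field is `req_duration_ms`, as in the PR's query, and the request identifier is the `id` field that appears in the Splunk snippet (written as `request_id` in the comment).

```kusto
// Alternative 1 (swapped): the slowest URI for each HTTP method.
['sample-http-logs']
| summarize arg_max(req_duration_ms, uri) by method

// Alternative 2 (original grouping): the id of the slowest request for each URI.
// Assumes the request identifier field is named id.
['sample-http-logs']
| summarize arg_max(req_duration_ms, id) by uri
```

Both return a column from the row that holds the maximum, which a plain `max()` cannot do, so either one would justify using `arg_max` in the example.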


```kusto
['otel-demo-traces']
| summarize arg_max(duration, span_id, trace_id) by ['service.name']
```
Contributor

yep, that's a good one 👍🏼

Labels
None yet
Projects
None yet
2 participants