Turning parsed strings into field/value pairs #2999

Closed
philrz opened this issue Sep 2, 2021 · 1 comment · Fixed by #4827

Comments

@philrz
Contributor

philrz commented Sep 2, 2021

#2989 provides an example of domain-specific parsing that could benefit from a purpose-built function that produces a specific structured record. However, parseable key/value pairings are likely to show up in other contexts as well, such as when working with generic data or building ETL pipelines.

For example, consider the following record that contains a string:

{myfield:"country=USA,state=TX,city=Dallas"}

Reading the string, we can deduce that the "," delimiter separates the key/value pairs and the "=" delimiter separates each key from its value. A user may therefore look for Zed functionality to extract these pairs so they can be treated as fields in the record, leading to an end state such as:

{country:"USA",state:"TX",city:"Dallas"}

Zed already has some building blocks that can extract the necessary pieces (e.g. split(), array references, etc.) but it can't currently go all the way to generating the field/value pairs. That's the enhancement this issue is tracking.
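
To make the requested transformation concrete, here is a minimal sketch of it in Python rather than Zed. It only illustrates the desired behavior and is not a proposal for how Zed should expose it. The function name parse_pairs and its pair_sep/kv_sep parameters are hypothetical, chosen for this example.

```python
def parse_pairs(text, pair_sep=",", kv_sep="="):
    """Turn a delimited string such as "a=1,b=2" into a dict of field/value pairs.

    parse_pairs, pair_sep, and kv_sep are illustrative names, not part of Zed.
    """
    record = {}
    for pair in text.split(pair_sep):
        if not pair:
            # Tolerate empty segments, e.g. from a trailing delimiter.
            continue
        key, sep, value = pair.partition(kv_sep)
        if not sep:
            raise ValueError(f"no {kv_sep!r} found in segment {pair!r}")
        record[key.strip()] = value.strip()
    return record


record = {"myfield": "country=USA,state=TX,city=Dallas"}
print(parse_pairs(record["myfield"]))
```

Because the keys are discovered from the data, this approach needs no advance knowledge of which fields are present or in what order, which is the generic behavior this issue asks for.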

@mccanne correctly pointed out that the hard part here is figuring out how we want to present this functionality to the user. Implementing what we decide should be easy.

Just to offer some precedent that should be familiar to some of our users, Splunk's approach to addressing it is described here.

@philrz
Contributor Author

philrz commented Apr 19, 2024

This can now be achieved via the grok function that was added in #4827. Showing how to parse the example above at Zed commit 20a867d:

$ zq -version
Version: v1.15.0-8-g20a867d9

$ echo '{myfield:"country=USA,state=TX,city=Dallas"}' | zq -Z 'yield grok("country=%{GREEDYDATA:country},state=%{GREEDYDATA:state},city=%{GREEDYDATA:city}",myfield)' -
{
    country: "USA",
    state: "TX",
    city: "Dallas"
}
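
For readers unfamiliar with grok, the pattern above behaves roughly like a regular expression with named capture groups, assuming GREEDYDATA expands to `.*` as it does in the standard grok pattern set. The Python sketch below shows that analogy only; Zed's grok implementation may differ in its details, and the names GROK_ANALOG and extract_fields are hypothetical.

```python
import re

# Approximate regex equivalent of the grok pattern, assuming GREEDYDATA is ".*".
GROK_ANALOG = re.compile(
    r"country=(?P<country>.*),state=(?P<state>.*),city=(?P<city>.*)"
)


def extract_fields(text):
    """Return the named captures as a dict, or None if the text does not match."""
    match = GROK_ANALOG.fullmatch(text)
    return match.groupdict() if match else None


print(extract_fields("country=USA,state=TX,city=Dallas"))
```

Note that a pattern written this way hard-codes the key names and their order, so a string with different keys, or the same keys in another order, would not match.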

Thanks @mattnibs!

@philrz closed this as completed Apr 19, 2024