Add .order and .frame arguments to mutate(). #1542

Open
iangow opened this issue Sep 20, 2024 · 3 comments

Comments

@iangow

iangow commented Sep 20, 2024

Back in 2017, @hadley suggested "there are two possible APIs" for implementing what became window_frame() and window_order() (see tidyverse/dplyr#2874; @edgararuiz-zz).

At the time, I believe mutate() had no .by argument, so the window_frame()/window_order() approach seemed to make the most sense. One of the options back then was:

df %>%
  group_by(gvkey) %>%
  window(
    .order = vars(datadate),
    .frame = c(-3, 0),
    
    sale_ttm = sum(sale),
    cogs_ttm = sum(cogs),
    sga_ttm = sum(sga)
  )

But now this could be something like:

df |>
  mutate(
    sale_ttm = sum(sale),
    cogs_ttm = sum(cogs),
    sga_ttm = sum(sga),
    .by = gvkey,
    .order = vars(datadate),
    .frame = c(-3, 0)
  )

This would seem to have the merit of making it easier for dbplyr to infer that a window function was being sought (currently there are cases where dbplyr does not get the hint).
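For comparison, here is a minimal sketch of how the same trailing four-row window (three preceding rows plus the current one) can be written with the existing dbplyr verbs. The table is a hypothetical one-row lazy_frame() on a simulated Postgres connection, used only so the generated SQL can be inspected without a database; the column names follow the example above.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in table; nothing is executed against a real database.
df <- lazy_frame(
  gvkey = "001", datadate = 1, sale = 1, cogs = 1, sga = 1,
  con = simulate_postgres()
)

ttm <- df |>
  group_by(gvkey) |>          # becomes PARTITION BY
  window_order(datadate) |>   # becomes ORDER BY inside OVER (...)
  window_frame(-3, 0) |>      # 3 preceding rows through the current row
  mutate(
    sale_ttm = sum(sale, na.rm = TRUE),
    cogs_ttm = sum(cogs, na.rm = TRUE),
    sga_ttm = sum(sga, na.rm = TRUE)
  ) |>
  ungroup()

# Print the SQL that dbplyr would send.
show_query(ttm)
```

The ordering and frame are set in separate verbs that apply to every window function in the following mutate(), which is the state the proposed .order and .frame arguments would fold into a single call.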

I am surprised that I have only one instance of window_frame() in my book. It seems like a very handy pattern (e.g., moving averages, windowed regressions).

I had a comical exchange with ChatGPT about this earlier this afternoon (Australia time) (see here).

@DavisVaughan
Member

I think if this was going to be considered then it would only be an argument to the dbplyr data frame method, so I'm going to move it there and let them decide. I don't think it is super useful for the general case because we already have other means of doing windowed evaluations per column, like with {slider}.
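As a rough illustration of the per-column approach on an in-memory data frame, the sketch below uses slider::slide_dbl() with .before = 3 to sum the current row and up to three preceding rows within each gvkey. The data are made up for the example, and the rows are sorted explicitly because slide_dbl() works on the vector in whatever order it receives.

```r
library(dplyr)
library(slider)

# Made-up quarterly data for two firms.
quarters <- seq(as.Date("2020-03-01"), by = "quarter", length.out = 5)
df <- tibble(
  gvkey = c(rep("001", 5), rep("002", 2)),
  datadate = c(quarters, quarters[1:2]),
  sale = c(1, 2, 3, 4, 5, 10, 20)
)

ttm <- df |>
  arrange(gvkey, datadate) |>  # ordering is the caller's responsibility
  mutate(
    # Window of the current row plus up to 3 rows before it;
    # incomplete windows at the start are summed over the rows available.
    sale_ttm = slide_dbl(sale, sum, .before = 3),
    .by = gvkey
  )

ttm
```

Passing .complete = TRUE to slide_dbl() would instead return NA until a full four-row window is available.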

@DavisVaughan DavisVaughan transferred this issue from tidyverse/dplyr Sep 20, 2024
@hadley
Member

hadley commented Sep 20, 2024

Yeah, this syntax looks pretty reasonable to me now.

@iangow
Author

iangow commented Nov 21, 2024

> Yeah, this syntax looks pretty reasonable to me now.

I figure that this would also be useful in duckplyr as it would allow users of that package to access window-function functionality if implemented in the relevant version of mutate().
