-
Notifications
You must be signed in to change notification settings - Fork 158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert fseq to function #106
Conversation
Excellent idea! @smbache, did you consider this approach? There might be some drawbacks of course, but at the first look, it looks very neat. |
From a conceptual point of view, I like the "purity" of each RHS as each their own function, and that they are applied sequentially; rather than a sequence of ...
. <- rhs1(...)
. <- rhs2(...)
... On the other hand I can see that debugging (and perhaps @hadley (Dr. purrr) do you have an opinion about this? Another, perhaps minor (not sure) point is that this would break indexing the functional sequences via |
I'm using this code since I created it, and I literally forgot how painful the debugging of pipes used to be. We can easily keep the storage format for the fseq, and cache the evaluator function redundantly. If lhs is |
- an fseq is now a function with an attribute magrittr:function_list that contains the list of functions - the function contains an unrolled representation of the pipe, a sequence of assignments . <- f(., ...) followed by a final function call - converting an fseq to a function happens simply by removing the fseq class
Updated the original description. I'm still not sure what the overhead of assembling the function is, and probably there's no point in forcibly byte-compiling. Ready for review. The freduce() function is not needed anymore, either. |
I don't have any strong feelings either way, but I do like that you can see exactly what magrittr is doing behind the scenes. It will make clear to people why magrittr doesn't (by design) work well with functions that use NSE. |
I have made a branch "simplified" with the following changes:
This has simplified the package a great deal, and several files was deleted. |
I'm glad you like the idea, please feel free to take over at this point. Removal of |
Removed aliases. Removed [ and [[ getters (as there are no more fseqs. Could be implemented though) Removed no-longer-needed tests
I'm not really sure what I like best; But I probably wouldn't want to do both; too much complexity for little value. I guess the class could still be there, and a getter can be defined that can subset the pipeline. Wouldn't be to difficult. |
seems to be a bit faster than the classic version as well... |
@smbache @jimhester @kevinushey This breaks r-lib/lintr@4369aa80092. Here's a session using Stefan's "simplified" branch for an empty package with an empty source file:
|
The error seems to imply an expression that depends on |
@smbache Seems like it. You could do this in |
yeah; still not sure that this particular issue should appeal to safeguard the private API, as e.g. the suggestion to have helper functions defined inside |
I think so. If you just take an object from another package, and put it in your package, then the responsibility is yours.... |
Agreed so far. Thanks for the insights. @kevinushey: Can you call register() in .onLoad() in your package, as suggested by Gábor? Currently, it seems to happen during build time, and this means that your copy of the function (in the hidden environment) may have a different implementation (requiring internal APIs not available anymore) than the installed version. |
FWIW this same procedure, importing We can upload a new version of rex to CRAN after the magrittr release to encourage people to update. But because the user-side fix is straightforward I don't think it is worth changing the method of import. |
Actually I just read Gabor's comment about the register call, so I see this is actually a rex specific issue. I can change the code to call register on |
but it is still an interesting discussion about where to place the "responsibility"... |
I guess if the pipe is to be re-exported, there is no choice but to live with "build-time import"? |
@smbache You can export a dummy, and then replace it with the run time imported one in |
@smbache, @gaborcsardi: That's what I thought, too -- but reexporting happens via NAMESPACE, isn't this a different mechanism? |
@krlmlr I am not sure what re-exporting does TBH. EDIT: but it is easy to try. |
In any case, if re-exporting is not good (I suspect it is not), then the dummy, replaced in |
A reprex would be great. Anyone? |
I'm too old to know what that means 😆 well I have an idea, but not sure what you want exactly. |
A package with a NAMESPACE of
And the following somewhere in if ("split_chain" %in% codetools::findGlobals(`%>%`, merge = FALSE)$functions) message("using old magrittr") else message("using new magrittr") Should be a simple reproducible example. |
My reprex tests:
I don't see irregularities: The correct private API is called without reinstalling BB. |
@krlmlr Sure, this is fine, of course. You would need to do test <- private_fun_a1 here: https://github.com/krlmlr/imexport.reprex/blob/master/A1/R/a.R#L2 And similarly in the other package. The problem with rex is that it stores a copy of |
@gaborcsardi: You mean I'd need to |
@krlmlr Yes, to break it. Or to see why I am not sure why If you really need to store an actual object (instead of a reference) from another package, then IMO the only working solution is to import/create it in |
|
Instead of calling the constructed function, perhaps we should simply evaluate the body of that function in the parent frame? I think this would finally solve #38. Downside~~~/Feature~~~: The |
IMO that's an unacceptable downside. |
I agree that the leaking
|
I think this is messy. You can restore promises, but need to write C code for it imo.
If you generate a random id every time that might work. This said, I am not a fan of evaluating the body, it seems somewhat messy. You lose the nice call stack as well. I think it is better to call the function. |
It's a very impure approach. Perhaps it could work, but I'm very much against it. |
The update branch looks much better than this. Looking forward to seeing it released, because it also helps a lot with profiling code that uses pipes. |
Does this story continue? Until I saw the thread, I have been using the following to de-pipe code for profiling, and I am looking for a better alternative. depipe <- function(expr) {
expr <- substitute(expr)
chain <- magrittr:::split_chain(expr)
calls <- c(chain$lhs, chain$rhs)
calls <- purrr::map(calls, ~as.call(list(quote(`<-`), quote(.), .x)))
as.call(c(quote(`{`), calls))
}
depipe(
mtcars %>%
group_by(cyl) %>%
summarize(mpg = mean(mpg)) %>%
ungroup()
)
#> {
#> . <- mtcars
#> . <- group_by(., cyl)
#> . <- summarize(., mpg = mean(mpg))
#> . <- ungroup(.)
#> } |
Example:
This gives much shorter call stacks and better error messages. Example with
options(error = expression(traceback(1)))
:Another advantage is that visibility is handled implicitly.
Calling an fseq still usesfreduce()
. I think we can get rid offreduce()
entirely, and implement the fseq as a function right away. This requires some more work, we should discuss first.A functional sequence is now a plain old function with class
fseq
and an attributemagrittr:function_list
that holds the list of functions (previouslyenv[["_function_list"]]
). Theenv
with all the underscore-prefixed variables has gone. I think this simplifies things a lot.All tests pass locally. Closes #107, which adds more tests and is included here. Also added documentation stub in vignette.
The idea is stolen from vadr's
mkchain()
function; the implementation is mine.CC @crowding, @gaborcsardi.
Related: #94, #95
Probably breaks #70