Convert fseq to function #106

krlmlr · 2015-10-09T12:38:39Z

Example:

> as.function(. %>% rexp(n=10) %>% sort)
function (.) 
{
    . <- rexp(., n = 10)
    sort(.)
}

This gives much shorter call stacks and better error messages. Example with options(error = expression(traceback(1))):

> 5 %>% runif %>% paste(collapse=" ") %>% stop
Error in magrittr(.) : 
  0.668534632306546 0.386715253116563 0.194241847842932 0.261686375364661 0.222338400548324
3: stop(.)
2: magrittr(.) at pipe.R#39
1: 5 %>% runif %>% paste(collapse = " ") %>% stop

Another advantage is that visibility is handled implicitly.
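To illustrate the visibility point: in an unrolled function, the value of the last call keeps its visibility flag, so no extra bookkeeping is needed. A minimal base-R sketch (the name `quiet_step` is made up for this example and is not part of the PR):

```r
# Hypothetical unrolled pipeline whose last step returns invisibly.
quiet_step <- function(.) {
  . <- sqrt(.)
  invisible(.)
}

# withVisible() reports the value and whether it would auto-print.
res <- withVisible(quiet_step(4))
res$value    # 2
res$visible  # FALSE
```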

~~Calling an fseq still uses freduce(). I think we can get rid of freduce() entirely, and implement the fseq as a function right away. This requires some more work, we should discuss first.~~

A functional sequence is now a plain old function with class fseq and an attribute magrittr:function_list that holds the list of functions (previously env[["_function_list"]]). The env with all the underscore-prefixed variables has gone. I think this simplifies things a lot.
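The unrolled form can be sketched in a few lines of base R. This is only an illustration of the idea, not the code in this PR; the helper name `unroll_calls` and its arguments are assumptions made for the example.

```r
# Hypothetical helper, not the PR's actual code: turn a list of calls that
# use `.` into function(.) { . <- step1; ...; last_step }.
unroll_calls <- function(calls, env = parent.frame()) {
  n <- length(calls)
  stopifnot(n >= 1)
  # Every step except the last becomes an assignment `. <- step`.
  assignments <- lapply(calls[-n], function(step) call("<-", quote(.), step))
  fn_body <- as.call(c(list(quote(`{`)), assignments, calls[n]))
  fn <- function(.) NULL
  body(fn) <- fn_body
  environment(fn) <- env
  fn
}

f <- unroll_calls(list(quote(sqrt(.)), quote(sum(.))))
f(c(4, 9))  # 5
```

Because the result is an ordinary closure, printing `f` shows the steps, and a traceback contains only the calls of the individual steps.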

All tests pass locally. Closes #107, which adds more tests and is included here. Also added documentation stub in vignette.

The idea is stolen from vadr's mkchain() function; the implementation is mine.

CC @crowding, @gaborcsardi.

Related: #94, #95

Probably breaks #70

gaborcsardi · 2015-10-09T20:11:49Z

Excellent idea! @smbache, did you consider this approach? There might be some drawbacks of course, but at the first look, it looks very neat.

smbache · 2015-10-18T08:25:17Z

From a conceptual point of view, I like the "purity" of each RHS being its own function, applied sequentially, rather than a sequence of

...

. <- rhs1(...)
. <- rhs2(...)

...

On the other hand I can see that debugging (and perhaps tamper) may benefit from this other approach.

@hadley (Dr. purrr) do you have an opinion about this?

Another, perhaps minor (not sure) point is that this would break indexing the functional sequences via [ and [[.

krlmlr · 2015-10-18T12:28:49Z

I've been using this code since I created it, and I literally forgot how painful the debugging of pipes used to be.

We can easily keep the storage format for the fseq, and cache the evaluator function redundantly. If lhs is ., we may want to byte-compile the returned function. I'll do some more work here, perhaps also a benchmark, so that we can discuss further in a second iteration.

- an fseq is now a function with an attribute magrittr:function_list that contains the list of functions
- the function contains an unrolled representation of the pipe, a sequence of assignments . <- f(., ...) followed by a final function call
- converting an fseq to a function happens simply by removing the fseq class

krlmlr · 2015-10-18T21:46:32Z

Updated the original description. I'm still not sure what the overhead of assembling the function is, and probably there's no point in forcibly byte-compiling. Ready for review.

The freduce() function is not needed anymore, either.

hadley · 2015-10-19T12:29:16Z

I don't have any strong feelings either way, but I do like that you can see exactly what magrittr is doing behind the scenes. It will make clear to people why magrittr doesn't (by design) work well with functions that use NSE.
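A small base-R illustration of the NSE point, assuming the unrolled form discussed in this thread (the function names `label` and `piped` are made up for the example):

```r
# A function that uses non-standard evaluation to capture its argument's name.
label <- function(x) deparse(substitute(x))

label(mtcars)  # "mtcars"

# Inside an unrolled pipeline the argument is always the placeholder `.`,
# so the NSE function sees "." instead of the original expression.
piped <- function(.) {
  label(.)
}
piped(mtcars)  # "."
```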

smbache · 2015-10-26T19:58:09Z

I have made a branch "simplified" with the following changes:

- No more fseq and freduce: only composition of a single pipelined function.
- No more aliases, per the request in "Drop aliases?" #108 (not sure how I feel about this, but let's play with the idea).
- Removed tests related to the above.
- Removed vignette until settling on what's to be included, etc.

This has simplified the package a great deal, and several files were deleted.

cc @hadley @gaborcsardi @krlmlr

krlmlr · 2015-10-26T20:17:47Z

I'm glad you like the idea, please feel free to take over at this point. Removal of [ and [[, and of the fseq class, is a breaking change, but this might have been a rarely used feature indeed.

smbache · 2015-10-26T20:23:49Z

I'm not really sure what I like best, but I probably wouldn't want to do both; too much complexity for little value. I guess the class could still be there, and a getter can be defined that can subset the pipeline. Wouldn't be too difficult.
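A rough sketch of what such a getter could look like, assuming the function list is kept in the magrittr:function_list attribute described earlier in the thread. The constructor `make_fseq` is hypothetical, and for brevity it composes the functions with a loop instead of unrolling them.

```r
# Hypothetical constructor: a plain function that also carries its steps.
make_fseq <- function(funs) {
  composed <- function(.) {
    for (f in funs) . <- f(.)
    .
  }
  structure(composed, class = "fseq", `magrittr:function_list` = funs)
}

# `[[` returns a single step, `[` returns a shorter functional sequence.
`[[.fseq` <- function(x, i) attr(x, "magrittr:function_list")[[i]]
`[.fseq` <- function(x, i) make_fseq(attr(x, "magrittr:function_list")[i])

s <- make_fseq(list(sqrt, sum))
s(c(4, 9))       # 5
s[[1]]           # the sqrt function
s[2](c(1, 2))    # 3
```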

smbache · 2015-10-27T14:27:18Z

seems to be a bit faster than the classic version as well...

krlmlr · 2015-11-12T21:27:08Z

@smbache @jimhester @kevinushey This breaks r-lib/lintr@4369aa80092. Here's a session using Stefan's "simplified" branch for an empty package with an empty source file:

> lintr::lint_package()

> lintr::lint_package()
.Error in "\t" %>% one_or_more() : could not find function "split_chain"
32: "\t" %>% one_or_more()
31: eval(expr, envir, enclos)
30: eval(x$expr, data, x$env) at eval.R#27
29: FUN(X[[i]], ...)
28: lapply(x, lazy_eval, data = data) at eval.R#21
27: lazyeval::lazy_eval(args, as.list(.rex$env))
26: escape(lazyeval::lazy_eval(args, as.list(.rex$env)))
25: paste(sep = "", collapse = "", ...)
24: structure(x, class = "regex")
23: regex(paste(sep = "", collapse = "", ...))
22: p(escape(lazyeval::lazy_eval(args, as.list(.rex$env))))
21: structure(x, class = "regex")
20: regex(p(escape(lazyeval::lazy_eval(args, as.list(.rex$env)))))
19: rex_(args, env)
18: rex("\t" %>% one_or_more())
17: add_options(pattern, options)
16: re_matches(source_file$lines, rex("\t" %>% one_or_more()), locations = TRUE, 
        global = TRUE)
15: linters[[linter]](expr)
14: inherits(x, class)
13: assign_item(x)
12: flatten_list(x, class = "lint")
11: structure(flatten_list(x, class = "lint"), class = "lints")
10: flatten_lints(linters[[linter]](expr))
9: lint(file, ..., parse_settings = FALSE)
8: FUN(X[[i]], ...)
7: lapply(files, function(file) {
       if (interactive()) {
     ...
6: inherits(x, class)
5: assign_item(x)
4: flatten_list(x, class = "lint")
3: structure(flatten_list(x, class = "lint"), class = "lints")
2: flatten_lints(lapply(files, function(file) {
       if (interactive()) {
     ...
1: lintr::lint_package()
> devtools::session_info()
Session info ------------------------------------------------------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.2.2 (2015-08-14)
 system   x86_64, linux-gnu           
 ui       RStudio (0.99.486)          
 language en_US:en                    
 collate  en_US.UTF-8                 
 tz       <NA>                        
 date     2015-11-12                  

Packages ----------------------------------------------------------------------------------------------------------------------------------------------
 package    * version     date       source        
 devtools     1.9.1.9000  2015-11-03 local         
 digest       0.6.8       2014-12-31 CRAN (R 3.2.0)
 igraph       1.0.1       2015-06-26 CRAN (R 3.2.1)
 knitr        1.11        2015-08-14 CRAN (R 3.2.1)
 lazyeval     0.1.10.9000 2015-08-21 local         
 lintr        0.3.3       2015-11-12 local         
 magrittr   * 1.5         2015-11-12 local         
 memoise      0.2.99.9000 2015-10-08 local         
 rex          1.0.1       2015-04-28 CRAN (R 3.2.0)
 rstudioapi   0.3.1       2015-04-07 CRAN (R 3.2.0)
 ulimit       0.0-2       2015-04-14 local

kevinushey · 2015-11-12T21:45:24Z

The error seems to imply an expression that depends on split_chain() is being evaluated in an environment where it's not available; no idea what could be causing that.

gaborcsardi · 2015-11-13T11:05:32Z

@smbache Seems like it. You could do this in .onLoad in rex, and then it happens at run time (well, load time, really).

smbache · 2015-11-13T11:43:19Z

yeah; still not sure that this particular issue should appeal to safeguard the private API, as e.g. the suggestion to have helper functions defined inside pipe... It's more an issue on the importing side, rather than the exporting side, no?

gaborcsardi · 2015-11-13T11:45:01Z

I think so. If you just take an object from another package, and put it in your package, then the responsibility is yours....

krlmlr · 2015-11-13T11:56:30Z

Agreed so far. Thanks for the insights.

@kevinushey: Can you call register() in .onLoad() in your package, as suggested by Gábor? Currently, it seems to happen during build time, and this means that your copy of the function (in the hidden environment) may have a different implementation (requiring internal APIs not available anymore) than the installed version.

jimhester · 2015-11-13T12:35:21Z

FWIW this same procedure, importing %>%, then exporting is used in dplyr, ggvis, rvest among many others, so this same issue will happen with all these packages as well until they are re-installed.

We can upload a new version of rex to CRAN after the magrittr release to encourage people to update. But because the user-side fix is straightforward I don't think it is worth changing the method of import.

jimhester · 2015-11-13T12:38:41Z

Actually I just read Gabor's comment about the register call, so I see this is actually a rex-specific issue. I can change the code to call register in .onLoad, which should fix it. Disregard previous message.

smbache · 2015-11-13T12:42:03Z

but it is still an interesting discussion about where to place the "responsibility"...

smbache · 2015-11-13T12:43:52Z

I guess if the pipe is to be re-exported, there is no choice but to live with "build-time import"?

gaborcsardi · 2015-11-13T12:46:19Z

@smbache You can export a dummy, and then replace it with the run time imported one in .onLoad.

krlmlr · 2015-11-13T12:46:53Z

@smbache, @gaborcsardi: That's what I thought, too -- but reexporting happens via NAMESPACE, isn't this a different mechanism?

gaborcsardi · 2015-11-13T12:48:02Z

@krlmlr I am not sure what re-exporting does TBH.

EDIT: but it is easy to try.

gaborcsardi · 2015-11-13T12:49:22Z

In any case, if re-exporting is not good (I suspect it is not), then the dummy, replaced in .onLoad will still work imo.

krlmlr · 2015-11-13T12:52:29Z

A reprex would be great. Anyone?

smbache · 2015-11-13T13:25:14Z

I'm too old to know what that means 😆 well I have an idea, but not sure what you want exactly.

jimhester · 2015-11-13T14:13:02Z

A package with a NAMESPACE of

importFrom(magrittr,"%>%")
export("%>%")

And the following somewhere in R/

if ("split_chain" %in% codetools::findGlobals(`%>%`, merge = FALSE)$functions) message("using old magrittr") else message("using new magrittr")

Should be a simple reproducible example.

krlmlr · 2015-11-13T14:50:47Z

My reprex tests:

Two versions of the same package AA with the same public API but different private APIs
One package BB that reexports the public API from AA
Installing first version of AA, BB, then second version of AA
Calling the reexported API from BB

I don't see irregularities: The correct private API is called without reinstalling BB.

gaborcsardi · 2015-11-13T14:55:39Z

@krlmlr Sure, this is fine, of course. You would need to do

test <- private_fun_a1

here: https://github.com/krlmlr/imexport.reprex/blob/master/A1/R/a.R#L2

And similarly in the other package.

The problem with rex is that it stores a copy of %>% in an environment. If it just called %>%, then it would be fine.
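The failure mode can be simulated without any packages. In this sketch an environment stands in for the providing package's namespace; all names (`provider`, `pipe`, `stored_copy`) are illustrative, and only `split_chain` is taken from the error message reported earlier in the thread.

```r
# A stand-in for the providing package's namespace (hypothetical names).
provider <- new.env(parent = baseenv())
evalq({
  split_chain <- function(x) x          # private helper, "version 1"
  pipe <- function(x) split_chain(x)    # public function using the helper
}, provider)

# The consumer stores a copy of the public function at "build time".
stored_copy <- provider$pipe

# The provider is upgraded: the helper disappears, the public function changes.
rm(list = "split_chain", envir = provider)
evalq(pipe <- function(x) x, provider)

# The stored copy still has the old body, which looks for the removed helper.
msg <- tryCatch(stored_copy(1), error = function(e) conditionMessage(e))
msg               # mentions the missing function split_chain
provider$pipe(1)  # looking the function up at call time works
```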

krlmlr · 2015-11-13T14:57:21Z

@gaborcsardi: You mean I'd need to <- to break it? I guess this would do, but I think most packages are rather interested in a working solution.

gaborcsardi · 2015-11-13T15:13:23Z

@krlmlr Yes, to break it. Or to see why rex is currently broken.

I am not sure why rex needs to store %>% locally? To reexport it?

If you really need to store an actual object (instead of a reference) from another package, then IMO the only working solution is to import/create it in .onLoad.
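A minimal sketch of the .onLoad approach for a hypothetical importing package. The `.cache` environment and the file name are assumptions for illustration; rex's actual register call is not shown here.

```r
# R/zzz.R of a hypothetical importing package.
# Private environment that holds objects fetched at load time.
.cache <- new.env(parent = emptyenv())

.onLoad <- function(libname, pkgname) {
  # Fetch the pipe when the package is loaded, so the stored object always
  # comes from the magrittr version that is installed right now.
  .cache$pipe <- getExportedValue("magrittr", "%>%")
}
```

Anything that is computed at the top level of a file in R/ is evaluated when the package is built or installed and is then stored with it, which is why the lookup is moved into .onLoad here.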

jimhester · 2015-11-13T15:35:18Z

rex only exposes most of its functions within a rex() expression (including %>%). Exporting %>% is legacy behavior and should be removed.

krlmlr · 2015-11-16T10:29:16Z

Instead of calling the constructed function, perhaps we should simply evaluate the body of that function in the parent frame? I think this would finally solve #38.

Downside~~/Feature~~: The . would escape to the calling environment.

smbache · 2015-11-16T10:46:28Z

IMO that's an unacceptable downside.

krlmlr · 2015-11-18T11:22:58Z

I agree that the leaking . can be surprising. But I can think of ways to fix that:

- Restore the original value of . on exit if it existed before, remove otherwise
  - Is this possible if . is a promise?
- Use a unique identifier in the generated function instead of .
  - We probably still want to remove this unique identifier from the caller's environment.
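For the sake of discussion, the first option could look roughly like this in base R. The helper name `restore_dot_eval` is hypothetical, and the sketch does not answer the promise question: `get()` forces a promise, so the restored `.` would be a plain value.

```r
# Hypothetical helper: evaluate an unrolled body in `env`, then put `.` back
# the way it was (restore the old value, or remove it if it did not exist).
restore_dot_eval <- function(expr, env = parent.frame()) {
  had_dot <- exists(".", envir = env, inherits = FALSE)
  if (had_dot) old_dot <- get(".", envir = env, inherits = FALSE)
  on.exit({
    if (had_dot) {
      assign(".", old_dot, envir = env)
    } else if (exists(".", envir = env, inherits = FALSE)) {
      rm(list = ".", envir = env)
    }
  })
  eval(expr, env)
}

e <- new.env()
res <- restore_dot_eval(quote({ . <- 4; sqrt(.) }), e)
res                                       # 2
exists(".", envir = e, inherits = FALSE)  # FALSE, the temporary `.` is gone
```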

gaborcsardi · 2015-11-18T11:32:28Z

> Restore the original value of . on exit if it existed before, remove otherwise
> Is this possible if . is a promise?

I think this is messy. You can restore promises, but need to write C code for it imo.

> Use a unique identifier in the generated function instead of .
> We probably still want to remove this unique identifier from the caller's environment.

If you generate a random id every time that might work.

This said, I am not a fan of evaluating the body, it seems somewhat messy. You lose the nice call stack as well. I think it is better to call the function.

smbache · 2015-11-18T11:45:06Z

It's a very impure approach. Perhaps it could work, but I'm very much against it.

krlmlr · 2017-12-30T21:40:34Z

The update branch looks much better than this. Looking forward to seeing it released, because it also helps a lot with profiling code that uses pipes.

wlandau · 2020-01-09T21:12:54Z

Does this story continue? Until I saw this thread, I had been using the following to de-pipe code for profiling, and I am looking for a better alternative.

depipe <- function(expr) {
  expr <- substitute(expr)
  chain <- magrittr:::split_chain(expr)
  calls <- c(chain$lhs, chain$rhs)
  calls <- purrr::map(calls, ~as.call(list(quote(`<-`), quote(.), .x)))
  as.call(c(quote(`{`), calls))
}

depipe(
  mtcars %>%
    group_by(cyl) %>%
    summarize(mpg = mean(mpg)) %>%
    ungroup()
)
#> {
#>     . <- mtcars
#>     . <- group_by(., cyl)
#>     . <- summarize(., mpg = mean(mpg))
#>     . <- ungroup(.)
#> }
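A variant that avoids the internal magrittr:::split_chain() can be sketched by walking the parsed expression directly. The names `add_dot` and `depipe2` are made up, and only the simplest placeholder rule is covered (`.` becomes the first argument unless it already appears as a direct argument); nested dots, braces and the other pipe operators are not handled.

```r
# Hypothetical helper: make `.` the first argument unless the step already
# uses `.` as a direct argument.
add_dot <- function(rhs) {
  if (is.symbol(rhs)) return(as.call(list(rhs, quote(.))))
  args <- as.list(rhs)[-1]
  has_dot <- any(vapply(args, identical, logical(1), quote(.)))
  if (has_dot) rhs else as.call(c(list(rhs[[1]], quote(.)), args))
}

depipe2 <- function(expr) {
  expr <- substitute(expr)
  steps <- list()
  # `a %>% b %>% c` parses as `(a %>% b) %>% c`, so peel steps off the right.
  while (is.call(expr) && identical(expr[[1]], quote(`%>%`))) {
    steps <- c(list(add_dot(expr[[3]])), steps)
    expr <- expr[[2]]
  }
  calls <- lapply(c(list(expr), steps),
                  function(step) call("<-", quote(.), step))
  as.call(c(list(quote(`{`)), calls))
}

x <- c(4, 9, NA)
code <- depipe2(x %>% sqrt %>% sum(na.rm = TRUE))
code
```

The argument is only captured with substitute() and never evaluated, so magrittr does not need to be attached to build the de-piped expression.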

Kirill Müller added 4 commits October 9, 2015 12:19

first draft

3eeb30f

take care of visibility

44ef23c

use as.function

46cc3f7

don't need fseq in env

76b9f85

Kirill Müller added 2 commits October 18, 2015 15:52

format the function call as magrittr(.)

8643a8b

add a few more fseq-related tests

8b8295e

krlmlr mentioned this pull request Oct 18, 2015

Add a few more fseq-related tests #107

Closed

Kirill Müller added 9 commits October 18, 2015 16:43

Merge branch 'fseq-tests' into as.function

18cadcc

extract function unroll_function_list

3650ddc

move code

d63b401

add test, currently failing

1d6fcb4

need to remove attributes in as.function

a372f95

also add test for visibility

af15f4d

Merge branch 'fseq-tests' into as.function

fd103af

documentation stub in the vignette

d7c097f

krlmlr changed the title ~~WIP: Convert fseq to function~~ Convert fseq to function Oct 18, 2015

document

0992f14

krlmlr referenced this pull request Oct 26, 2015

No fseq and freduce -> compose one function instead.

ac683dc

Removed aliases. Removed [ and [[ getters (as there are no more fseqs; could be implemented though). Removed no-longer-needed tests.

jimhester added a commit to r-lib/rex that referenced this pull request Nov 13, 2015

Move register for magrittr pipe to onLoad

3433667

fixes tidyverse/magrittr#106

gaborcsardi mentioned this pull request Nov 13, 2015

Spell out argument tamper vs recover gaborcsardi/tamper#6

Merged

krlmlr closed this Dec 30, 2017
