Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved discussion of source() #289

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Improved discussion of source() #289

wants to merge 1 commit into from

Conversation

hadley
Copy link
Member

@hadley hadley commented Oct 20, 2020

  • Don't mention sys.source() as people shouldn't be using it anyway
  • Use local = TRUE which is simpler and still correct
  • Precisely describe the problem caused the default behaviour of source()

* Don't mention sys.source() as people shouldn't be using it anyway
* Use `local = TRUE` which is simpler and still correct
* Precisely describe the problem caused the default behaviour of `source()`
Copy link
Member

@yihui yihui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Don't mention sys.source() as people shouldn't be using it anyway

I don't understand that.

  • Use local = TRUE which is simpler and still correct

Simpler, true. Correct, I don't think it is guranteed.

  • Precisely describe the problem caused the default behaviour of source()

Yes, I agree this needs more clarification.

Thank you!


````md
```{r, include=FALSE}`r ''`
source("your-script.R", local = knitr::knit_global())
Copy link
Member

@yihui yihui Oct 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is more robust than local = TRUE. It works anywhere (e.g. inside a function). It's an absolute environment, and local = TRUE is a relative environment, which is subject to your current environment, and could be confusing when it is not called directly in a code chunk but from another function, e.g.,

load_scripts <- function(files) {
  for (f in files) source(f, local = TRUE)
}
load_scripts(c('a.R', 'b.R'))

Both local = TRUE and FALSE have their usefulness. There isn't an absolutely correct choice, so I didn't strongly emphasize one or another. Perhaps the distinction between the two choices should be explained more clearly in this section. In general, I feel source("your-script.R", local = knitr::knit_global()) is still the safest choice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, if you want to run source() inside a function and have it impact code outside of the function, you're going to have to do something special. As a general principle, I don't think functions should mutate the global environment in this way, so I do believe that local = TRUE is the correct recommendation here. (And then mention local = knitr::knitr_global() as a special case if you really want to do that)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is now really about advice for writing functions that call source, not about calling source() in knitr chunks.

```
````

We recommend that you use the argument `local` in `source()` or `envir` in `sys.source()` explicitly to make sure the code is evaluated in the correct environment, i.e., `knitr::knit_global()`\index{knitr!knit\_global()}. The default values for them may not be the appropriate environment: you may end up creating variables in the wrong environment, and being surprised that certain objects are not found in later code chunks.
[^sys-source]: We don't recomend ever using `sys.source()` as it is designed specifically for internal use by base R.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't recommend ever using sys.source() as it is designed specifically for internal use by base R.

I haven't heard of this before and am surprised to learn it. Could you point out the source of this claim? The help page ?sys.source didn't say it was designed only for internal use by base R. I mentioned sys.source() because it does a much simpler job than source(), i.e., it basically does eval(parse()), whereas source() does several other things that are irrelevant to evaluating the code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's implied by the envir argument which defaults to baseenv() — there is no case in which a user would want this. Do you have any evidence to suggest that people are using sys.source() in the wild?


Next in the R Markdown document, you can use objects created in these scripts (e.g., data objects or functions). This way will not only make your R Markdown document cleaner, but also make it more convenient for you to develop R code (e.g., debugging R code is often easier with pure R scripts than R Markdown).
We recommend using the `local = TRUE` argument to `source()`. The default behaviour is to evaluate the code in the global environment, which can cause confusion. Usually, knitr will evaluate code in a child of the global environment, so any assignments in the source file will be overridden by assignments in **earlier** chunks. `local = TRUE` forces `source()` to evaluate in the correct environment, avoiding any problems.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually, knitr will evaluate code in a child of the global environment

Usually knitr evaluates code just in the global environment, instead of a child of the global environment. knitr::knit() and rmarkdown::render() have envir = parent.frame() by default, which points to globalenv() if you call them in the global environment.

so any assignments in the source file will be overridden by assignments in earlier chunks

I'm confused: did you mean earlier or later? If you have x <- 1 in the source file, after you source() it, the assignment will be overridden by x <- 2 in a later chunk. Similarly, if you have x <- 0 in an earlier chunk, source() will override it with x <- 1.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read the following paragraph to imply that knitr usually evaluates in some special environment:

The default values for them may not be the appropriate environment: you may end up creating variables in the wrong environment, and being surprised that certain objects are not found in later code chunks.

If knitr usually evaluates in the global environment, then calling source() with no arguments will usually be safe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants