# Using packages {#packages}
## What are packages?
R comes with a lot of functions - commands - built in to do a broad range of tasks. You could, if you really wanted, import a dataset, clean it up, estimate a model, and make a plot just using the functions that come with R - known as 'base R'^[Technically some of the 'built-in' functions are part of packages, like the `tools`, `utils` and `stats` packages that come with R. We'll refer to all these as base R.]. But using packages will make your life easier.
Like R itself, packages are free and open source. You can install them from within RStudio using the methods described below.
## How to install packages {#install-packages}
You'll typically install packages using the console in RStudio. That's the part of the window that, by default, sits in the bottom-left corner of the screen.
In our work at Grattan, we use packages from two different sources: the Comprehensive R Archive Network (CRAN) and Github. The main difference you need to know about is that we use different commands to install packages from these two sources.
To install a package from CRAN, we use the command `install.packages()`.
For example, this code will install the `ggplot2` package from CRAN:
```{r eval=FALSE}
install.packages("ggplot2")
```
The easiest way to install a package from Github is to use the function `install_github()`. Unfortunately, this function doesn't come with base R. The `install_github()` function is part of the `remotes` package. To use it, we first need to install `remotes` from CRAN:
```{r eval=FALSE}
install.packages("remotes")
```
Now we can install packages from Github using the `install_github()` function from the `remotes` package. For example, here's how we would install the Grattan `ggplot2` theme, which we'll discuss later in this website:
```{r eval=FALSE}
remotes::install_github("grattan/grattantheme", dependencies = TRUE, upgrade = "always")
```
## Get set up: install packages for Grattan {#install-grattan-packages}
Just starting out or setting up a new machine? Run this block of code to get yourself all set up:
```{r eval = FALSE}
# Packages from CRAN
cran_packages <- c("devtools", "tidyverse", "readabs", "janitor",
                   "rio", "sf")
install.packages(cran_packages)
# Packages from Github
github_packages <- c("grattan/grattantheme", "grattan/grattandata",
                     "runapp-aus/strayr", "grattan/grattanReporter")
remotes::install_github(github_packages,
                        dependencies = TRUE,
                        upgrade = "always")
```
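If you rerun the setup on a machine that already has some of these packages, it can be quicker to install only the ones that are missing. The block below is a minimal sketch of that idea, not part of the official Grattan setup: `missing_packages()` is a hypothetical helper written for this example, built on base R's `installed.packages()`. The install call is left commented out so the chunk is safe to run as a check.

```{r eval=FALSE}
# Hypothetical helper: return the packages in `pkgs` that aren't installed yet
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

cran_packages <- c("devtools", "tidyverse", "readabs", "janitor",
                   "rio", "sf")
to_install <- missing_packages(cran_packages)

if (length(to_install) > 0) {
  message("Not yet installed: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install the missing ones
} else {
  message("All of these packages are already installed.")
}
```

This only checks whether a package is present, not whether it's up to date.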
## Using packages
Before using a function that comes from a package, you need to tell R where to look for the function. There are two main ways to do that.
The first is to load (aka 'attach') the package with the `library()` function. We typically do this at the top of a script.
```{r eval=FALSE}
library(remotes)
# Now that the `remotes` package is loaded, we can use its `install_github()` function:
install_github("grattan/grattantheme")
```
Or, we can use two colons `::` to tell R to use an individual function from a package without loading it:
```{r eval=FALSE}
remotes::install_github("grattan/grattantheme")
```
It usually makes sense to load a package with `library()`, unless you only need to use one of its functions once or twice. There's no harm in using the `::` operator even if you have already loaded a package with `library()`. This can remove ambiguity both for R and for humans reading your code, particularly if you're using an obscure function - it makes it clearer where the function comes from.
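As a small illustration that needs nothing beyond base R, the sketch below uses the `tools` package, which ships with R but isn't attached by default. The example string is made up for illustration.

```{r eval=FALSE}
# Call a single function with `::`, without attaching the package
with_colons <- tools::toTitleCase("grattan institute")

# Attach the whole package, then call the function by name
library(tools)
with_library <- toTitleCase("grattan institute")

# Both approaches call the same function, so the results match
identical(with_colons, with_library)
```

Either style works. The `::` version has the advantage that a reader can see at a glance that `toTitleCase()` comes from `tools`.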
## Package versions
### Updating packages
It's generally a good idea to keep your packages up-to-date. The easiest way to do this is to run this code:
```{r eval = FALSE}
devtools::update_packages()
```
This will upgrade all your packages - including those you've installed from CRAN and Github.
When you run the above command, it will prompt you to ask which packages you want to update - press 1 for 'All'.
If it asks you 'Do you want to install from sources the package which needs compilation?' type 'no' and press enter.^[Nothing against installing from source, but this part of the guide is aimed at people who are not familiar with R and may not have the tools installed to build from source.]
### Downgrading packages
Sometimes, when packages change, their functions evolve. The arguments to a function might change, or a function might be phased out ('deprecated') in favour of another. You can usually just adapt your workflow to the package's new version without much fuss. If you find this isn't the case, and you want to downgrade to an earlier version of a package, it's straightforward. Just use the `install_version()` function, like this:
```{r eval = FALSE}
devtools::install_version("devtools", "1.13.3")
```
It's rare that you'd need to downgrade. Better to stay up to date, and adapt your code when necessary to changes in packages.
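Before downgrading, it's worth checking which version you currently have. Here's a minimal sketch using base R's `packageVersion()`. It queries `stats` only because that package is always installed, so substitute the package you're interested in.

```{r eval=FALSE}
# Which version of a package is installed? ("stats" is a stand-in here)
installed_version <- packageVersion("stats")
print(installed_version)

# Compare versions as version objects rather than as text:
# version 1.13.3 is newer than 1.9.0, because 13 is greater than 9
newer <- package_version("1.13.3") > package_version("1.9.0")
print(newer)

# Compared as plain text, the same two versions sort the other way around
print("1.13.3" > "1.9.0")
```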
## Packages commonly used at Grattan
Some packages we use at Grattan - like the `tidyverse` collection of packages - are very popular among R users. Some - like the `grattantheme` package - are specific to Grattan Institute. Others - like the `readabs` package - are made by Grattan people, useful at Grattan, but also used outside of the Institute. To install a core set of packages we use at Grattan, [click here and run the code chunk](#install-grattan-packages).
### Using the `tidyverse`
The main packages in the `tidyverse` include:
* *ggplot2* for making beautiful, customisable graphs
* *dplyr* for manipulating data frames
* *tidyr* for tidying your data
* *readr* for importing data from a broad range of formats
* *purrr* for functional programming
* *stringr* for manipulating strings of text
All these packages (and more!) will automatically be loaded for you when you run the command^[There's no need to install or load the individual `tidyverse` packages - like `dplyr` - separately. Just install them all together, and load them with the single `library(tidyverse)` command. That way, you don't need to remember which functions come from `tidyr` and which from `dplyr` - they're all just `tidyverse` functions.]:
```{r library-tidyverse}
library(tidyverse)
```
A range of other packages are installed on your machine as part of the `tidyverse`. These include:
* *readxl* for importing Excel spreadsheets into R
* *haven* for importing Stata, SAS and SPSS data
* *rvest* for scraping websites
Although these packages are installed as part of the `tidyverse`, they aren't loaded automatically when you run `library(tidyverse)`. You'll need to load them individually, like:
```{r, eval = FALSE}
library(readxl)
library(haven)
```
The *lubridate* package, for working with dates, used to be in this group. Since version 2.0.0 of the `tidyverse`, `library(tidyverse)` loads it for you as well.
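To see which packages were installed as part of the `tidyverse` on your machine, one option is the `tidyverse_packages()` function from the `tidyverse` package itself. The sketch below guards the call with `requireNamespace()`, so it simply reports that the `tidyverse` is missing if you haven't installed it yet.

```{r eval=FALSE}
# Check whether the tidyverse is installed, without attaching it
has_tidyverse <- requireNamespace("tidyverse", quietly = TRUE)

if (has_tidyverse) {
  # List every package that belongs to the tidyverse
  print(tidyverse::tidyverse_packages())
} else {
  message("The tidyverse isn't installed yet.")
}
```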
### Grattan-specific packages {#grattan-specific-packages}
A range of Grattan people have written packages that come in handy at Grattan.
* *grattantheme* The `grattantheme` package, by Matt Cowgill and Will Mackey, helps to make your ggplot2 charts Grattan-y. We cover the package extensively in the data visualisation chapter. Find it on [Github](https://github.com/grattan/grattantheme).
* *grattandata* The `grattandata` package, by Matt Cowgill and Jonathan Nolan, is used to load microdata from the Grattan microdata repository. We cover this in the [reading data](#reading-data) chapter. Find it on [Github](https://github.com/grattan/grattandata).
* *grattan* The `grattan` package, created by Hugh Parsonage, contains two broad sets of functions. One set of functions (sometimes known by the nickname "Grattax") is used for modelling the personal income tax system. Another set of functions ("Grattools") is useful for a lot of our work, like converting dates to financial years (`grattan::date2fy()`) or a version of `dplyr::ntile()` that uses weights (`grattan::weighted_ntile()`). Find it on [Github](https://github.com/hughparsonage/grattan).
* *grattanReporter* The `grattanReporter` package, created by Hugh Parsonage (and currently maintained by Will Mackey), runs a series of checks on Grattan's LaTeX reports to ensure consistent style. Find it on [Github](https://github.com/grattan/grattanreporter).
### Other commonly-used, useful packages
There are other packages we commonly use at Grattan, including some developed by Grattan staff. These include:
* *strayr* This package, by Will Mackey and others, is very handy for working with Australian classifications (eg ANZSCO, ANZSIC, ASCED), and for using Australian spatial data. You'll want it if you're going to be making maps. Find it on [Github](https://github.com/runapp-aus/strayr).
* *readabs* The `readabs` package, by Matt Cowgill, provides an easy way to download, tidy, and import ABS time series data in R. Find it on [Github](https://github.com/mattcowgill/readabs).