-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathwg_tools.qmd
57 lines (30 loc) · 6.2 KB
/
wg_tools.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
---
title: "Suggested Tools"
---
We primarily work on _collaborative projects where we synthesize existing data to draw larger inferences than any single data set would allow._ Because of this, we strongly recommend that each tool used by a team accomplish as many purposes as possible to avoid a project accruing endless "one off" tools that fit a specific purpose but do not accomplish any other tasks. **Streamlining your workflow to just a few broadly useful programs also helps reduce the barrier to entry** to training new team members and ensures that within team protocols are clear and concise to follow.
The analytical software options available at NCEAS follow directly from this ethos. Although occasionally providing specialty programs (upon request), we have otherwise carefully assembled a powerful lineup of _scripted_, _cross-platform_, _scalable_ applications that are _well-supported_, generate _robust results_, and permit _batch processing_. Although these packages require an initial time investment to learn, and may seem intimidating to scientists familiar with only "point-and-click" software, we strongly argue that the long-term payoff is well worth the time investment at the start.
We recommend the following programs, websites, and platforms:
## Collaborative Tools
### Video Conferencing
<img src="images/logos/zoom.png" alt="Zoom logo" width="15%" align="right"/>
Between your in-person meetings at NCEAS (and during them if you have remote participants!) we recommend that you use [Zoom](https://zoom.us/). During the COVID-19 pandemic, Zoom became--arguably--_the_ video chatting program and the value of your group's familiarity with this program should not be underestimated.
### Messaging
<img src="images/logos/slack.png" alt="Slack logo" width="25%" align="right"/>
Your group will be doing _a lot_ of written communication so we recommend creating a [Slack community](https://slack.com/). **The advantage of Slack over email is that you can have faster real-time conversations among multiple people without crowding your inbox.** Slack also supports attaching files and "pinning" posts to make it easier to find them later. You can also create "channels" for sub-topics within your group (e.g., `#paper 1`, `#analysis`, etc.) that will allow your group members to track only the subtopics that are relevant to them. You can [download the Slack desktop app here](https://slack.com/download). Another team at NCEAS has created [this guide](https://help.nceas.ucsb.edu/NCEAS/Computing/chat_room_setup.html) for getting set up on Slack.
### Document Sharing
<img src="images/logos/google-drive.png" alt="Google Drive logo" width="25%" align="right"/>
We recommend that you share documents among your group using Google Drive to help you centralize your information. Our team will set up a Shared Google Drive for every working group. This drive can also be used as a staging area for your data during the data collection and exploration phases of your project. This is in part because many people are already at least somewhat familiar with Google Drive. In addition, there is an R package called `googledrive` that will allow you to directly connect any R scripts to Google Drive files and folders. This can be great for ensuring that all scripts use the same raw data and can be useful for exporting synthesized data products to a given Drive folder.
For the more analytical intensive phase of your project, NCEAS also has several analytical servers that can be used to store large data that need to be processed frequently. We'll discuss NCEAS' servers in greater detail further on in the onboarding process.
## Computing Tools
### Code Storage & Versioning
<img src="images/logos/git.png" alt="Git logo" width="20%" align="right"/>
Git is a great program to use for tracking changes to your code as your project grows and evolves. Git is one type of "version control" software which is specifically built to track changes to code in an informative and useful way. It is analogous to using "track changes" in Microsoft Word but is built in a more seamless way and is purpose-built for working with code. You can install Git following the instructions [here](https://lter.github.io/workshop-github/#install-git).
<img src="images/logos/github.png" alt="GitHub logo" width="15%" align="right"/>
Similarly to R versus RStudio, **we recommend that you create an account with [GitHub](https://github.com/)** so that the changes you track with Git can be viewed and interacted with in a straightforward way. GitHub is the industry standard for tracking changes to code and is a great way of making code for a specific project publicly-accessible once you are at a stage where that feels appropriate. We offer an introductory [workshop on Git and GitHub](https://lter.github.io/workshop-github/) that we are happy to offer to your group if that is of interest!
### Analysis
<img src="images/logos/r.png" alt="Logo for the R programming language" width="15%" align="right"/>
Our team is most well-versed in [R](https://cran.r-project.org/) and many data synthesis ends can be accomplished in this language. R's primary advantage is its amazing user community! R users have developed all kinds of custom functions and packages so even when what you want to do is not supported by an R package on CRAN, chances are that you can find the tools you need online!
<img src="images/logos/rstudio.png" alt="RStudio logo" width="25%" align="right"/>
**We strongly recommend using R through [RStudio](https://www.rstudio.com/)** as the RStudio interface allows several 'quality of life' improvements that you will grow to greatly appreciate if you're not already an RStudio user. RStudio enables an easier connection to Git (see below), facilitates use of the command line, and allows you to generate PDF or HTML reports with embedded code chunks for easy sharing in and outside of your group.
<img src="images/logos/python.png" alt="Python logo" width="30%" align="right"/>
If your group is working with genomes or other large data files, you may find R's memory limit to be a serious hindrance. If this is the case (or if you prefer to not use R!) we suggest [Python](https://www.python.org/) as an alternative. Python works better with larger files and is also a well-respected programming language.