-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathMia_Dust_FunGuilds.Rmd
190 lines (141 loc) · 5.58 KB
/
Mia_Dust_FunGuilds.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
---
title: "Mia_Dust_FunGuilds"
author: "Nat Pombubpa"
date: "Updated on October 19, 2018"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
###STEP1: Load all necessary packages for analysis
More information about Phyloseq can be found at the following link: [Phyloseq](https://joey711.github.io/phyloseq/)
If you get error in this step, you probably need to install any packages which causes error.
```{r message=FALSE, warning=FALSE}
library(ape)
library(vegan)
library(dplyr)
library(scales)
library(grid)
library(reshape2)
library(phyloseq)
library(magrittr)
library(ggplot2)
library(ggpubr)
library(plyr)
```
###STEP2: Import Mapping file (metadate file)
1.Check mapping file before import to R, R will automatically change sample name that starts with number or contain “-” in sample name. If you get error in this step, you should check sample name first.
2.First column of first row should not start with #, R will not read the first row that starts with #
3.Please make sure that your metadata and FunGuilds SampleID are the same.
```{r}
meta = read.table("SierraMapIsoF.csv",header=TRUE,row.names=1,sep=",",stringsAsFactors=FALSE)
```
###STEP3: Check if your metadata file has been import successfully and correctly
The output will show a table of your metadata file (mapping file).
*If you do not have header, you might start your first row with #
```{r}
head(meta)
```
```{r}
dim(meta)
```
###STEP4: Construct sample_data-class using imported metadata
```{r}
sampleData <- sample_data(meta)
```
###STEP5: Import FunGuilds table
FunGuilds table from Dust data is “DustFungi.guilds.txt”. This is otu table which also includes Taxon,Trophic Mode, Guild, Growth Morphology etc.)
'FG' contains the entire table, but 'FGotus' is used to select only the otu table part from FunGuilds table (abundance + OTU ID)
```{r}
FG <- read.table("DustFungi.guilds.txt",header=T,sep="\t",row.names=1)
FGotus <- select(FG, -(Taxonomy:Citation.Source))
FGotumat <- as(as.matrix(FGotus), "matrix")
FGOTU = otu_table(FGotumat, taxa_are_rows = TRUE)
```
Entire Data from FunGuilds table (first 6 lines)
```{r}
head(FG)
```
```{r}
dim(FG)
```
Selcted Data from FunGuilds table (first 6 lines)
```{r}
head(FGOTU)
```
```{r}
dim(FGOTU)
```
###STEP6: Import Functional Guilds (Trophic Mode, Guild, Growth.Morphology) as Taxonomy feature
```{r}
FGtaxmat <- select(FG, Confidence.Ranking, Trophic.Mode, Guild, Growth.Morphology, Taxon)
FGtaxmat <- as(as.matrix(FGtaxmat),"matrix")
FGTAX = tax_table(FGtaxmat)
```
```{r}
head(FGTAX)
```
###STEP7: Construct Phyloseq object
To construct phyloseq object, otu table, taxonomy table, and sampleData are required.
```{r}
fgps = phyloseq(FGOTU,FGTAX,sampleData)
```
Check phyloseq object
```{r}
fgps
```
###STEP8: Filtering FunGuilds Data
Remove any OTUs which has less than 10 reads
Remove Guilds that have Confidence.Ranking = Possible
Remove na data
Remove NULL data
Note: you might want to check the entire data before you use these filters.
```{r}
fgps.prune = prune_taxa(taxa_sums(fgps) > 10, fgps)
#fgps.prune.confidence1 = subset_taxa(fgps.prune, Confidence.Ranking!="Possible")
fgps.prune.no.na = subset_taxa(fgps.prune, Trophic.Mode!="-")
#fgps.prune.no.null = subset_taxa(fgps.prune.no.na, Guild!="NULL")
#fgps.prune.no.null = subset_taxa(fgps.prune.no.null, Growth.Morphology!="NULL")
```
Check phyloseq object after filtering
```{r}
fgps.prune.no.na
```
###STEP9: Plot all FunGuilds data from Trophic.Mode, Guild, Growth.Morphology
x = any environmental variable (in this case, I tested with Site)
fill = Functional Guilds that you want to plot (Trophic.Mode, Guild, Growth.Morphology)
```{r fig.height=6, fig.width=10, fig.align="center"}
ggplot(data = psmelt(fgps.prune), mapping = aes_string(x = "Site" ,y = "Abundance",
fill = "Trophic.Mode" )) +
geom_bar(stat="identity", position="fill") + ggtitle("Dust Fungi Trophic.Mode by Site (Relative Abundance)")
```
```{r fig.height=6, fig.width=10, fig.align="center"}
plot_bar(fgps.prune, x="Site",fill = "Trophic.Mode") +
ggtitle("Dust Fungi Trophic.Mode by Site (Actual Abundance)") +
geom_bar(aes(color=Trophic.Mode, fill=Trophic.Mode), stat="identity", position="stack")
```
```{r fig.height=6, fig.width=10, fig.align="center"}
ggplot(data = psmelt(fgps.prune.no.na), mapping = aes_string(x = "Site" ,y = "Abundance",
fill = "Trophic.Mode" )) +
geom_bar(stat="identity", position="fill") +
ggtitle("Dust Fungi Trophic.Mode (no NA) by Site (Relative Abundance)")
```
```{r fig.height=6, fig.width=10, fig.align="center"}
plot_bar(fgps.prune.no.na, x="Site",fill = "Trophic.Mode") +
ggtitle("Dust Fungi Trophic.Mode (no NA) by Site (Actual Abundance)") +
geom_bar(aes(color=Trophic.Mode, fill=Trophic.Mode), stat="identity", position="stack")
```
###STEP10: Get the top 50 OTUs and plot only the top abundance
```{r}
ps.prune = prune_taxa(names(sort(taxa_sums(fgps.prune), TRUE))[1:50], fgps.prune)
```
```{r fig.height=6, fig.width=10, fig.align="center"}
plot_bar(ps.prune, x="Site",fill = "Trophic.Mode") +
ggtitle("Dust Fungi Trophic.Mode by Site (Actual Abundance of top 50 OTUs)") +
geom_bar(aes(color=Trophic.Mode, fill=Trophic.Mode), stat="identity", position="stack")
```
```{r fig.height=6, fig.width=12, fig.align="center"}
ggplot(data = psmelt(ps.prune), mapping = aes_string(x = "Site" ,y = "Abundance", fill = "Guild" )) +
geom_bar(stat="identity", position="fill") +
ggtitle("Dust Fungi Guild by Site (Relative Abundance of top 50 OTUs)")
```