The GEO Cancer Analysis Suite (GCAS) is a versatile R package designed for analyzing and visualizing gene expression data in cancer research.

WangJin93/GCAS

Graphic abstract

  1. Introduction

Title: Visualization of Functional Enrichment Result

Version: 1.0.0

Author: Jin wang ([email protected])

Maintainer: Jin wang ([email protected])

Description: The GEO Cancer Analysis Suite (GCAS) is a versatile R package designed for analyzing and visualizing gene expression data in cancer research. GCAS allows for the comparison of gene expression between normal and tumor samples, correlation analysis, immune infiltration analysis, differential expression analysis, co-expression analysis, and enrichment analysis. It includes a Shiny app for interactive visualization and can also be used directly within the R environment for advanced scripting. GCAS is ideal for researchers, clinicians, and bioinformaticians seeking to explore cancer genomics data efficiently and effectively.

Depends: R (>= 3.5.0)

Imports: RobustRankAggreg, VennDiagram, digest, dplyr, ggpubr, ggrepel, httr, jsonlite, meta, psych, shiny, stringr, sva, tibble, RColorBrewer, clusterProfiler, grid, ggplot2

Encoding: UTF-8

URL: https://github.com/WangJin93/GCAS

Bug Reports: https://github.com/WangJin93/GCAS/issues

License: MIT License

  2. Installation
remotes::install_github("WangJin93/GCAS")
  3. Function Reference

3.1. datasets_summary

Description

This function summarizes the sample counts and pairing status for datasets based on a specified tumor subtype.

Usage

datasets_summary(tumor_subtype = "NSCLC")

Arguments

tumor_subtype A character string specifying the tumor subtype to filter the datasets. Default is "NSCLC".

Value

A data frame summarizing the number of Normal, Tumor, and Premalignant samples, as well as the pairing status for each dataset.

Examples

dt_sum <- datasets_summary("NSCLC")

3.2. get_expr_data

Description

Retrieve expression data for specified genes from given datasets.

Usage

get_expr_data(datasets, genes)

Arguments

datasets A character vector of dataset identifiers.
genes A character vector of gene identifiers.

Value

A dataframe containing expression data for the specified genes from the given datasets.

Examples

results <- get_expr_data(datasets = "GSE74706", genes = c("GAPDH","TNS1"))
results <- get_expr_data(datasets = c("GSE62113","GSE74706"), genes = "GAPDH")
results <- get_expr_data(datasets = c("GSE62113","GSE74706"), genes = c("SIRPA","CTLA4","TIGIT","LAG3","VSIR","LILRB2","SIGLEC7","HAVCR2","LILRB4","PDCD1","BTLA"))

3.3. viz_TvsN

Description

Visualize differential mRNA expression between tumor and normal tissues in GEO datasets.

Usage

viz_TvsN(
     df,
     df_type = c("single", "multi_gene", "multi_set"),
     tumor_subtype = NULL,
     Show.P.Value = TRUE,
     Show.P.label = TRUE,
     Method = "t.test",
     Values = c("#00AFBB", "#FC4E07"),
     Show.n = TRUE,
     Show.n.location = "default"
)

Arguments

df Gene expression data obtained from get_expr_data().
df_type The type of gene expression data: one of "single", "multi_gene", or "multi_set".
tumor_subtype Tumor subtype to plot. Default is NULL.
Show.P.Value Whether to display the results of differential analysis. Default is TRUE.
Show.P.label Whether to display significance markers for differential analysis. Default is TRUE.
Method Method of differential analysis, "t.test" or "limma". Default is "t.test".
Values Color palette for the normal and tumor groups. Default is c("#00AFBB", "#FC4E07").
Show.n Whether to display the sample size. Default is TRUE.
Show.n.location Y-axis position of the sample size label. Default is "default".

Examples

df_single <- get_expr_data(datasets = "GSE27262",genes = c("TP53"))
df_multi_gene <- get_expr_data(datasets = "GSE27262",genes = c("TP53","TNS1"))
df_multi_set <- get_expr_data(datasets = c("GSE27262","GSE7670","GSE19188","GSE19804","GSE30219","GSE31210","GSE32665","GSE32863","GSE43458","GSE46539","GSE75037","GSE10072","GSE74706","GSE18842","GSE62113"), genes = "GAPDH")
viz_TvsN(df_single,df_type = "single")
viz_TvsN(df_multi_gene,df_type = "multi_gene",tumor_subtype ="LC")
viz_TvsN(df_multi_set,df_type = "multi_set")

3.4. data_summary

Description

Compute summary statistics (mean, standard deviation, etc.) and perform hypothesis testing (t-test or Wilcoxon test) for gene expression data across different datasets.

Usage

data_summary(df, tumor_subtype = NULL, method = "t.test")

Arguments

df A dataframe containing gene expression data, with columns 'dataset', 'subtype', and the gene expression values.
tumor_subtype A character string specifying the tumor subtype to be analyzed. Default is NULL, which means all tumor subtypes will be included.
method A character string specifying the method for hypothesis testing. Options are "t.test" for t-test and "wilcox" for Wilcoxon test. Default is "t.test".

Value

A dataframe with summary statistics and p-values for each dataset.

Examples

df <- get_expr_data(datasets = c("GSE27262","GSE7670","GSE19188","GSE19804","GSE30219","GSE31210","GSE32665","GSE32863","GSE43458","GSE46539","GSE75037","GSE10072","GSE74706","GSE18842","GSE62113"), genes = "GAPDH")
results <- data_summary(df, tumor_subtype = "LUAD")
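
To make the output of data_summary() more concrete, the sketch below computes the same kind of per-dataset statistics in base R on toy data. It is not the package's implementation: the data frame `toy`, its columns `group` and `expr`, and the helper `summarise_one()` are hypothetical, and the log fold change is taken as a difference of means on the assumption that expression values are already log-transformed.

```r
# Toy data: two hypothetical datasets with 4 normal and 4 tumor samples each.
toy <- data.frame(
  dataset = rep(c("SetA", "SetB"), each = 8),
  group   = rep(rep(c("Normal", "Tumor"), each = 4), times = 2),
  expr    = c(5.1, 4.9, 5.3, 5.0, 6.2, 6.0, 6.5, 6.1,
              7.0, 7.2, 6.9, 7.1, 7.1, 6.8, 7.3, 7.0)
)

# Hypothetical helper: summary statistics and a Welch t-test for one dataset.
summarise_one <- function(d) {
  tumor  <- d$expr[d$group == "Tumor"]
  normal <- d$expr[d$group == "Normal"]
  data.frame(
    dataset     = d$dataset[1],
    n_Tumor     = length(tumor),
    mean_Tumor  = mean(tumor),
    sd_Tumor    = sd(tumor),
    n_Normal    = length(normal),
    mean_Normal = mean(normal),
    sd_Normal   = sd(normal),
    logFC       = mean(tumor) - mean(normal),  # assumes log-scale values
    p.value     = t.test(tumor, normal)$p.value
  )
}

# Apply the helper to each dataset and stack the rows.
toy_summary <- do.call(rbind, lapply(split(toy, toy$dataset), summarise_one))
print(toy_summary)
```

Replacing t.test() with wilcox.test() in the helper would give the rank-based alternative mentioned in the method argument.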

3.5. plot_meta_forest

Description

This function performs a meta-analysis on multiple datasets and generates a forest plot. It also tests for publication bias.

Usage

plot_meta_forest(results, method = "wilcox", k.min = 10)

Arguments

results Data frame. The results data frame containing columns for dataset, n_Tumor, mean_Tumor, sd_Tumor, n_Normal, mean_Normal, and sd_Normal.
method Character. The statistical method to use (default is "wilcox").
k.min Integer. Minimum number of studies for bias test (default is 10).

Value

A forest plot object.

Examples

df <- get_expr_data(datasets = c("GSE27262","GSE7670","GSE19188","GSE19804","GSE30219","GSE31210","GSE32665","GSE32863","GSE43458","GSE46539","GSE75037","GSE10072","GSE74706","GSE18842","GSE62113"), genes = "GAPDH")
results <- data_summary(df, tumor_subtype = "LUAD")
plot_meta_forest(results)
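
The pooling step behind a forest plot can be illustrated in base R. The sketch below is a simplified fixed-effect, inverse-variance meta-analysis of raw mean differences on a toy table that has the columns listed under Arguments. The numbers are invented for illustration, and the package itself may compute something different (for example standardized mean differences or a random-effects model).

```r
# Toy summary table with the columns expected by plot_meta_forest().
toy_results <- data.frame(
  dataset     = c("SetA", "SetB", "SetC"),
  n_Tumor     = c(20, 30, 25),
  mean_Tumor  = c(6.2, 6.0, 6.4),
  sd_Tumor    = c(0.5, 0.6, 0.4),
  n_Normal    = c(20, 25, 25),
  mean_Normal = c(5.1, 5.3, 5.2),
  sd_Normal   = c(0.4, 0.5, 0.5)
)

# Per-study mean difference (tumor minus normal) and its sampling variance.
md     <- toy_results$mean_Tumor - toy_results$mean_Normal
var_md <- toy_results$sd_Tumor^2 / toy_results$n_Tumor +
          toy_results$sd_Normal^2 / toy_results$n_Normal

# Inverse-variance weights, pooled estimate, and 95% confidence interval.
w         <- 1 / var_md
pooled_md <- sum(w * md) / sum(w)
pooled_se <- sqrt(1 / sum(w))
ci        <- pooled_md + c(-1, 1) * qnorm(0.975) * pooled_se

print(round(c(estimate = pooled_md, lower = ci[1], upper = ci[2]), 3))
```

Studies with smaller variance receive larger weights, which is why larger or less noisy datasets dominate the pooled estimate in a forest plot.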

3.6. plot_logFC_heatmap

Description

This function generates a heatmap of log fold changes (logFC) for a gene across different datasets. It includes significance annotations based on p-values.

Usage

plot_logFC_heatmap(results, direction = "horizontal")

Arguments

results Data frame. The results data frame containing columns for dataset, gene, logFC, and p.Value.
direction Plotting direction, "horizontal" or "vertical". Default is "horizontal".

Value

A ggplot object representing the heatmap.

Examples

df <- get_expr_data(datasets = c("GSE27262","GSE7670","GSE19188","GSE19804","GSE30219","GSE31210","GSE32665","GSE32863","GSE43458","GSE46539","GSE75037","GSE10072","GSE74706","GSE18842","GSE62113"), genes = "GAPDH")
results <- data_summary(df, tumor_subtype = "LUAD")
heatmap <- plot_logFC_heatmap(results)
print(heatmap)
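
Significance annotations of this kind are commonly drawn as stars derived from p-value bins. The base R sketch below shows one way to do that mapping. The thresholds 0.05, 0.01, and 0.001 are a common convention and an assumption here, not something confirmed from the package source.

```r
# Toy p-values, one per dataset.
p <- c(0.2, 0.03, 0.004, 0.0002)

# Bin the p-values and label each bin (assumed thresholds).
stars <- as.character(cut(
  p,
  breaks = c(0, 0.001, 0.01, 0.05, 1),
  labels = c("***", "**", "*", "ns"),
  include.lowest = TRUE
))

print(data.frame(p = p, label = stars))
```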

3.7. plot_logFC_scatter

Description

This function generates a scatter plot of log fold changes (logFC) for a gene across different datasets. It includes significance annotations based on p-values.

Usage

plot_logFC_scatter(
     results,
     p.cut = 0.05,
     logFC.cut = 1,
     colors = c("blue", "grey20", "red")
)

Arguments

results Data frame. The results data frame containing columns for dataset, gene, logFC, and p.Value.
p.cut Numeric. The cutoff for adjusted p-value to determine significance. Default is 0.05.
logFC.cut Numeric. The cutoff for log fold change to determine significance. Default is 1.
colors A vector of colors. Default is c("blue", "grey20", "red").

Value

A ggplot object representing the scatter plot.

Examples

df <- get_expr_data(datasets = c("GSE27262","GSE7670","GSE19188","GSE19804","GSE30219","GSE31210","GSE32665","GSE32863","GSE43458","GSE46539","GSE75037","GSE10072","GSE74706","GSE18842","GSE62113"), genes = "GAPDH")
results <- data_summary(df, tumor_subtype = "LUAD")
scatter <- plot_logFC_scatter(results, logFC.cut = 0.5, colors = c("blue","grey20", "red"))
print(scatter)

3.8. cor_cancer_genelist

Description

Perform correlation analysis between genes using the expression data of a GEO dataset.

Usage

cor_cancer_genelist(
     dataset = "GSE62113",
     id1 = "STAT3",
     id2 = c("TNS1", "TP53"),
     tumor_subtype = NULL,
     sample_type = c("Tumor", "Normal"),
     cor_method = "pearson"
)

Arguments

dataset Dataset name. Use 'dataset$Abbre' to get all datasets.
id1 Gene symbol. Exactly one gene symbol.
id2 Gene symbols. One or more gene symbols.
tumor_subtype Tumor subtype used for correlation analysis, default is NULL.
sample_type Sample type used for correlation analysis, default all types: c("Tumor", "Normal").
cor_method Method for correlation analysis, default "pearson".

Value

A list containing the correlation results and the merged data.

Examples

results <- cor_cancer_genelist(dataset = "GSE62113",
     id1 = "STAT3",tumor_subtype = "LC",
     id2 = c("TNS1", "TP53"),
     sample_type = c("Tumor", "Normal"),
     cor_method = "pearson")

3.9. cor_gcas_drug

Description

Calculate the correlation between target gene expression and anti-tumor drug sensitivity in multiple datasets.

Usage

cor_gcas_drug(df, cor_method = "pearson", Target.pathway = c("Cell cycle"))

Arguments

df The expression data of the target gene in multiple datasets, obtained by the get_expr_data() function.
cor_method Method for correlation analysis, default "pearson".
Target.pathway The signaling pathways of anti-tumor drug targets, default "Cell cycle". Use "drug_info" to get the detailed information of these drugs.

Examples

dataset <- c("GSE27262","GSE7670","GSE19188","GSE19804","GSE30219","GSE31210",
     "GSE32665","GSE32863","GSE43458","GSE46539","GSE75037","GSE10072",
     "GSE74706","GSE18842","GSE62113")
df <- get_expr_data(genes = "TNS1", datasets = dataset)
result <- cor_gcas_drug(df, Target.pathway = c("Cell cycle"))

3.10. cor_gcas_genelist

Description

Perform correlation analysis of the expression data in multiple datasets.

Usage

cor_gcas_genelist(
     df,
     geneset_data,
     tumor_subtype = NULL,
     sample_type = c("Tumor", "Normal"),
     cor_method = "pearson"
)

Arguments

df The expression data of the target gene in multiple datasets, obtained by the get_expr_data() function.
geneset_data The expression data of a genelist in multiple datasets, obtained by the get_expr_data() function.
tumor_subtype Tumor subtype used for correlation analysis, default is NULL.
sample_type Sample type used for correlation analysis, default all types: c("Tumor", "Normal").
cor_method Method for correlation analysis, default "pearson".

Examples

genelist <- c("SIRPA","CTLA4","TIGIT","LAG3","VSIR","LILRB2","SIGLEC7","HAVCR2","LILRB4","PDCD1","BTLA")
dataset <- c("GSE27262","GSE7670","GSE19188","GSE19804","GSE30219","GSE31210","GSE32665","GSE32863","GSE43458","GSE46539","GSE75037","GSE10072","GSE74706","GSE18842","GSE62113")
df <- get_expr_data(genes = "TNS1",datasets = dataset)
geneset_data <- get_expr_data(genes = genelist ,datasets = dataset)
result <- cor_gcas_genelist(df, geneset_data, sample_type = c("Tumor"))

3.11. cor_gcas_TIL

Description

Calculate the correlation between target gene expression and immune cell infiltration in multiple datasets.

Usage

cor_gcas_TIL(df, cor_method = "spearman", TIL_type = "TIMER")

Arguments

df The expression data of the target gene in multiple datasets, obtained by the get_expr_data() function.
cor_method Method for correlation analysis, default "spearman".
TIL_type Algorithm for calculating immune cell infiltration, default "TIMER".

Examples

dataset <- c("GSE27262", "GSE7670", "GSE19188", "GSE19804", "GSE30219",
     "GSE31210", "GSE32665", "GSE32863", "GSE43458", "GSE46539",
     "GSE75037", "GSE10072", "GSE74706", "GSE18842", "GSE62113")
df <- get_expr_data(genes = "TNS1", datasets = dataset)
result <- cor_gcas_TIL(df, cor_method = "spearman", TIL_type = "TIMER")

3.12. viz_cor_heatmap

Description

Presenting correlation analysis results using heat maps based on ggplot2.

Usage

viz_cor_heatmap(r, p)

Arguments

r The correlation coefficient matrix r of the correlation analysis results obtained from the functions cor_gcas_genelist(), cor_gcas_TIL(), and cor_gcas_drug().
p The p-value matrix p of the correlation analysis results obtained from the functions cor_gcas_genelist(), cor_gcas_TIL(), and cor_gcas_drug().

Examples

genelist <- c("SIRPA","CTLA4","TIGIT","LAG3","VSIR","LILRB2","SIGLEC7","HAVCR2","LILRB4","PDCD1","BTLA")
dataset <-  c("GSE27262","GSE7670","GSE19188","GSE19804","GSE30219","GSE31210","GSE32665","GSE32863","GSE43458","GSE46539","GSE75037","GSE10072","GSE74706","GSE18842","GSE62113")
df <- get_expr_data(genes = "TNS1", datasets = dataset)
geneset_data <- get_expr_data(genes = genelist ,datasets = dataset)
result <- cor_gcas_genelist(df, geneset_data, sample_type = c("Tumor"))
viz_cor_heatmap(result$r, result$p)

3.13. viz_cor_volcano

Description

Plotting volcano plot for the correlation analysis of a specific gene from the result of cor_gcas_genelist().

Usage

viz_cor_volcano(
     cor_result,
     item,
     p.cut = 0.05,
     r.cut = 0.3,
     colors = c("blue", "grey20", "red")
)

Arguments

cor_result Data frame. The results from the cor_gcas_genelist() analysis.
item Character. Gene symbol, drug name, or immune cell type.
p.cut Numeric. The cutoff for adjusted p-value to determine significance. Default is 0.05.
r.cut Numeric. The cutoff for the correlation coefficient to determine significance. Default is 0.3.
colors A vector of colors. Default is c("blue", "grey20", "red").

Value

A ggplot2 object representing the volcano plot.

Examples

genelist <- c("SIRPA","CTLA4","TIGIT","LAG3","VSIR","LILRB2","SIGLEC7","HAVCR2","LILRB4","PDCD1","BTLA")
dataset <- c("GSE27262","GSE7670","GSE19188","GSE19804","GSE30219","GSE31210","GSE32665","GSE32863","GSE43458","GSE46539","GSE75037","GSE10072","GSE74706","GSE18842","GSE62113")
df <- get_expr_data(genes = "TNS1",datasets = dataset)
geneset_data <- get_expr_data(genes = genelist ,datasets = dataset)
cor_result <- cor_gcas_genelist(df, geneset_data, sample_type = c("Tumor"))
viz_cor_volcano(cor_result, "LILRB4", p.cut = 0.5, r.cut = 0.1,colors = c("blue" ,"grey20", "red"))

3.14. viz_corplot

Description

Scatter plot annotated with sample size (n), correlation coefficient (r), and p-value (p).

Usage

viz_corplot(
     data,
     a,
     b,
     method = "pearson",
     x_lab = " relative expression",
     y_lab = " relative expression"
)

Arguments

data A gene expression dataset with at least two genes included, rows represent samples, and columns represent gene expression in the matrix.
a Gene A
b Gene B
method Method for correlation analysis, "pearson" or "spearman".
x_lab X-axis label.
y_lab Y-axis label.

3.15. get_OSF_data

Description

Retrieve GEO expression datasets and sample information from the OSF repository.

Usage

get_OSF_data(table = "GSE19188", action = "geo_data")

Arguments

table A character string specifying the GEO dataset identifier (e.g., "GSE19188").
action A character string specifying the action, either "geo_data" to retrieve the expression data or "sample_info" to retrieve the sample information.

Value

A data frame containing the requested data.

Examples

df <- get_OSF_data(table = "GSE74706", action = "sample_info")
df2 <- get_OSF_data(table = "GSE74706", action = "geo_data")

3.16. DEGs_analysis

Description

Perform differential gene expression analysis on a given dataset.

Usage

DEGs_analysis(df, tumor_subtype = NULL, ...)

Arguments

df A dataframe containing gene expression data with sample IDs as columns.
tumor_subtype A character vector specifying the tumor subtypes to be analyzed. Default is NULL, which means all tumor subtypes will be included.
... Additional arguments passed to 'lmFit', 'contrasts.fit', and 'eBayes'.

Value

A dataframe with DEG analysis results, including log fold changes and p-values.

Examples

df <- get_OSF_data(table = "GSE74706", action = "geo_data")
results <- DEGs_analysis(df, tumor_subtype = c("NSCLC"))

3.17. plot_volcano

Description

Draw a volcano plot of DEGs between tumor and normal samples in GEO datasets.

Usage

plot_volcano(
     results,
     p.cut = 0.05,
     logFC.cut = 1,
     show.top = FALSE,
     show.labels = NULL,
     colors = c("blue", "grey20", "red")
)

Arguments

results DataFrame. The results from DEGs analysis containing columns 'adj.P.Val', 'P.Value', 'logFC', and 'gene'.
p.cut Numeric. The cutoff for adjusted p-value to determine significance. Default is 0.05.
logFC.cut Numeric. The cutoff for log fold change to determine significance. Default is 1.
show.top Logical. If TRUE, labels the top 5 up- and downregulated genes. Default is FALSE.
show.labels Character vector. Specific gene labels to show. Default is NULL.
colors A vector of colors. Default is c("blue", "grey20", "red").

Value

A ggplot2 object representing the volcano plot.

Examples

df <- get_OSF_data(table = "GSE74706", action = "geo_data")
results <- DEGs_analysis(df)
plot_volcano(results)
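
The p.cut and logFC.cut thresholds partition genes into up-regulated, down-regulated, and non-significant groups, which is what the three colors encode. The base R sketch below shows that partition on a toy results table. The gene names and values are invented, the column `status` is hypothetical, and the exact comparison operators used by the package are an assumption.

```r
# Toy DEG table with the column names described under Arguments.
toy_degs <- data.frame(
  gene      = c("G1", "G2", "G3", "G4", "G5"),
  logFC     = c(2.3, -1.8, 0.4, 1.5, -0.2),
  adj.P.Val = c(0.001, 0.01, 0.03, 0.20, 0.60)
)

p.cut     <- 0.05
logFC.cut <- 1

# A gene is called only if it passes both the p-value and the logFC cutoff.
toy_degs$status <- "NS"
toy_degs$status[toy_degs$adj.P.Val < p.cut & toy_degs$logFC >=  logFC.cut] <- "Up"
toy_degs$status[toy_degs$adj.P.Val < p.cut & toy_degs$logFC <= -logFC.cut] <- "Down"

print(table(toy_degs$status))
```

Note that G3 is significant but has a small effect, and G4 has a large effect but is not significant, so both stay in the non-significant group.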

3.18. coexpression_analysis

Description

This function calculates the correlation between a given gene and all other genes in the provided expression matrix. It also provides the corresponding p-values.

Usage

coexpression_analysis(expression_matrix, gene, method = "pearson")

Arguments

expression_matrix A numeric matrix where rows represent genes and columns represent samples.
gene A character string representing the gene for which the correlations will be calculated.
method A character string specifying the correlation method to be used. Default is "pearson".

Value

A data frame containing gene names, correlation coefficients, and p-values.

Examples

expression_matrix <- get_OSF_data(table = "GSE74706", action = "geo_data")
results <- coexpression_analysis(expression_matrix, "RPN1")
print(results)
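
The idea of one-versus-all co-expression can be sketched in base R with cor.test(). The example below uses a simulated matrix with genes in rows and samples in columns. The object names `toy_mat`, `target`, and `toy_coexp` are hypothetical, and the package's own implementation may be vectorized or differ in detail.

```r
# Simulated expression matrix: 5 genes (rows) by 12 samples (columns).
set.seed(1)
toy_mat <- matrix(rnorm(5 * 12), nrow = 5,
                  dimnames = list(paste0("Gene", 1:5), paste0("S", 1:12)))
# Make Gene2 track Gene1 closely so a strong correlation is expected.
toy_mat["Gene2", ] <- toy_mat["Gene1", ] + rnorm(12, sd = 0.1)

target <- "Gene1"
others <- setdiff(rownames(toy_mat), target)

# Correlate the target gene with every other gene, one test per gene.
toy_coexp <- do.call(rbind, lapply(others, function(g) {
  ct <- cor.test(toy_mat[target, ], toy_mat[g, ], method = "pearson")
  data.frame(gene = g, r = unname(ct$estimate), p.value = ct$p.value)
}))

# Rank genes by the absolute strength of the correlation.
toy_coexp <- toy_coexp[order(-abs(toy_coexp$r)), ]
print(toy_coexp)
```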

3.19. GSEA_analysis

Description

This function performs Gene Set Enrichment Analysis (GSEA) based on either correlation results or limma differential analysis results.

Usage

GSEA_analysis(data, gmt_file, pValue_cutoff = 0.05, data_type = "correlation")

Arguments

data A data frame containing gene names and corresponding values. For correlation results, the columns should be named 'gene' and 'r'. For limma results, the columns should be named 'gene' and 'logFC'.
gmt_file Path to the GMT file containing gene sets, or directly pass GO/KEGG/Reactome datasets.
pValue_cutoff Numeric, the p-value threshold for significance. Default is 0.05.
data_type Character, type of the input data. Either "correlation" for correlation analysis results or "limma" for limma differential analysis results. Default is "correlation".

Value

A GSEA analysis result object.

Examples

df <- get_OSF_data(table = "GSE74706", action = "geo_data")
results <- DEGs_analysis(df,tumor_subtype =c("NSCLC"))
gsea_result <- GSEA_analysis(results,  gmt_file = BP_GMT_7.5.1, data_type = "limma")
results <- coexpression_analysis(df,"RPN1")
gsea_result <- GSEA_analysis(results,  gmt_file = BP_GMT_7.5.1)

3.20. get_DEGs_list

Description

Extract significantly upregulated and downregulated genes from multiple DEG analysis results.

Usage

get_DEGs_list(DEGs_lists, logFC_cut = 1, p_cut = 0.05)

Arguments

DEGs_lists A list of dataframes containing DEG analysis results.
logFC_cut A numeric value specifying the log fold change cutoff for significant DEGs. Default is 1.
p_cut A numeric value specifying the p-value cutoff for significant DEGs. Default is 0.05.

Value

A list containing two lists: one for upregulated genes and one for downregulated genes across the provided datasets.

Examples

df1 <- get_OSF_data(table = "GSE31210", action = "geo_data")
results1 <- DEGs_analysis(df1)
df2 <- get_OSF_data(table = "GSE19188", action = "geo_data")
results2 <- DEGs_analysis(df2)
DEGs_lists <- list("GSE31210" = results1, "GSE19188" = results2)
results <- get_DEGs_list(DEGs_lists)

3.21. plot_venn

Description

This function plots a Venn diagram for lists of differentially expressed genes (DEGs) across multiple datasets.

Usage

plot_venn(results, fill_colors = NULL, palette = "Set1", lty = 2, ...)

Arguments

results List of character vectors. Each vector contains DEGs for a specific dataset.
fill_colors Character vector. Colors to fill the Venn diagram circles. Default is NULL, which uses a palette.
palette Character. Name of the RColorBrewer palette to use if fill_colors is not specified. Default is "Set1".
lty Numeric. Line type for the circles in the Venn diagram. Default is 2 (dashed line).
... Additional arguments passed to the venn.diagram function.

Value

A list of intersected DEGs.

Examples

df1 <- get_OSF_data(table = "GSE31210", action = "geo_data")
results1 <- DEGs_analysis(df1)
df2 <- get_OSF_data(table = "GSE19188", action = "geo_data")
results2 <- DEGs_analysis(df2)
DEGs_lists <- list("GSE31210" = results1, "GSE19188" = results2)
results <- get_DEGs_list(DEGs_lists)
plot_venn(results$DEG_up, palette = "Set1")
plot_venn(results$DEG_up, fill_colors = c("red", "green", "blue"), alpha = 0.5, cex = 1.5)
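
The central region of such a Venn diagram is the set of genes shared by every list. A minimal base R sketch of that intersection is shown below. The list `toy_up` and its placeholder gene names are invented for illustration and do not come from any real dataset.

```r
# Toy up-regulated gene lists, one character vector per dataset.
toy_up <- list(
  SetA = c("GeneA", "GeneB", "GeneC", "GeneD"),
  SetB = c("GeneB", "GeneD", "GeneE"),
  SetC = c("GeneD", "GeneB", "GeneF")
)

# Fold intersect() over the list to keep genes present in every dataset.
shared_up <- Reduce(intersect, toy_up)
print(shared_up)
```

Reduce() applies intersect() pairwise from left to right, so the same call works for any number of datasets.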

3.22. RRA_analysis

Description

This function performs Robust Rank Aggregation (RRA) analysis on lists of differentially expressed genes (DEGs) obtained from various studies. It ranks genes based on their differential expression and aggregates the ranks to identify consistently regulated genes across studies.

Usage

RRA_analysis(
     DEGs_lists,
     top.num = 0,
     rra.p = 0.05,
     logFC_cut = 1,
     p_cut = 0.05
)

Arguments

DEGs_lists A list of DEGs data frames. Each data frame should contain at least a 'gene' column and a 'logFC' column.
top.num Numeric, the number of top genes to select based on their ranks. Default is 0, which selects all genes passing the thresholds.
rra.p Numeric, the p-value threshold for RRA. Default is 0.05.
logFC_cut Numeric, the log fold change threshold for filtering genes. Default is 1.
p_cut Numeric, the p-value threshold for filtering genes. Default is 0.05.

Value

A list containing the number of up- and down-regulated genes and a data matrix of aggregated log fold changes.

Examples

df1 <- get_OSF_data(table = "GSE31210", action = "geo_data")
results1 <- DEGs_analysis(df1)
df2 <- get_OSF_data(table = "GSE19188", action = "geo_data")
results2 <- DEGs_analysis(df2)
DEGs_lists <- list("GSE31210" = results1, "GSE19188" = results2)
RRA_results <- RRA_analysis(DEGs_lists)
ComplexHeatmap::pheatmap(RRA_results$RRA_results)

3.23. combat_datasets

Description

This function performs batch correction on multiple datasets using the ComBat function from the sva package.

Usage

combat_datasets(tables, tumor_subtype = NULL)

Arguments

tables A character vector of table names to be processed.
tumor_subtype A character string specifying the tumor subtype to filter the datasets. If NULL, all subtypes are included.

Value

A list containing the combined and batch-corrected data matrix and the sample information.

Examples

tables <- c("GSE31210", "GSE74706")
result <- combat_datasets(tables, tumor_subtype = "LC")
combined_data <- result$combined_data
sample_info <- result$sample_info

3.24. merge_clinic_data

Description

Get sample_info data and merge it with expression data.

Usage

merge_clinic_data(table = "GSE19188", data_input)

Arguments

table Character. The name of the dataset table to retrieve sample information from. Default is "GSE19188".
data_input Data frame. Expression data obtained from get_expr_data() function.

Value

Data frame. Merged data containing both expression data and sample information.

Examples

data_input <- get_expr_data("GSE19188", "TP53")
results <- merge_clinic_data("GSE19188",data_input)

3.25. extract_subset

Description

This function searches for specified names within a nested list structure and extracts the names of found subsets.

Usage

extract_subset(lst, names_to_find)

Arguments

lst A list which may contain nested lists.
names_to_find A character vector of names to search for within the list.

Value

A character vector of unique names from the found subsets.

Examples

nested_list <- list(
     a = list(
     b = 1,
     c = list(d = 2)
     ),
     e = 3
)
names_to_search <- c("b", "d", "e")
result <- extract_subset(nested_list, names_to_search)
print(result)
  4. Datasets

4.1. dataset_info

Description

Summary of the general information on the GEO datasets in this tool.

Usage

data("dataset_info")

Format

A data frame with 288 observations and 6 variables.

4.2. abbr_full

Description

Full names of the tumor abbreviations.

Usage

data("abbr_full")

Format

A data frame with 132 observations and 3 variables.

4.3. subtype

Description

Subtype list for all cancer types.

Usage

data("subtype")

Format

The format is: List of 21

4.4. TIL_map

Description

Mapping between immune cell types and the algorithms used to estimate them.

Usage

data("TIL_map")

Format

A data frame with 137 observations and 2 variables.

4.5. sample_subtype

Description

Subtype information of all samples

Usage

data("sample_subtype")

Format

A data frame with 28032 observations and 5 variables.

4.6. GCAS_TIL

Description

Immune cell infiltration scores of all samples, calculated with the IOBR package.

Usage

data("GCAS_TIL")

Format

A data frame with 19538 observations and 138 variables.

4.7. GCAS_drug

Description

Anti-tumor drug sensitivity of all samples, calculated with the oncoPredict package and the GDSC database.

Usage

data("GCAS_drug")

Format

A data frame with 19816 observations and 199 variables.

4.8. drug_info

Description

Anti-tumor drug information obtained from the GDSC2.0 datasets.

Usage

data("drug_info")

Format

A data frame with 198 observations and 6 variables.
