This is my (atb) first time making any changes to this document. My goals are:
my current understanding of the experiment is that RNA was collected from white adipose tissue from strains of mice which were iron deficient (by diet) and strains with an adipose specific knockout of TfR1. In addition, some mice were treated with a mimic of long-term cold stress.
Thus the categories of note:
In the set of figures I see that the original colors were:
IAD-Saline: #808080 IAD-CL: #8a4412 IDD-Saline: #aed8e4 IDD-CL: #6592ef
I am using mm38_100.
## The biomart annotations file already exists, loading from it.
color_choices <- list(
"IAD-Saline" = "#808080",
"IAD-CL" = "#8a4412",
"IDD-Saline" = "#aed8e4",
"IDD-CL" = "#6592ef")
So, I now have a table of mouse annotations.
The sample sheet is called ‘Experimental_design_Kim_v1.0.xlsx’ and has a column to point to the names of the count tables to load. Here I combine the metadata, count data, and annotations.
hisat_annot <- mm_annot
##rownames(hisat_annot) <- paste0("gene:", rownames(hisat_annot))
mm38_hisat <- create_expt("sample_sheets/Experimental_design_Kim_v1.0.xlsx",
gene_info = hisat_annot) %>%
set_expt_colors(color_choices) %>%
sanitize_expt_pData(columns = "treatment", spaces = TRUE)
## Reading the sample metadata.
## Did not find the column: sampleid.
## Setting the ID column to the first column.
## Checking the state of the condition column.
## Checking the state of the batch column.
## Checking the condition factor.
## The sample definitions comprises: 16 rows(samples) and 23 columns(metadata fields).
## Matched 25583 annotations and counts.
## Bringing together the count matrix and gene information.
## Some annotations were lost in merging, setting them to 'undefined'.
## Saving the expressionset to 'expt.rda'.
## The final expressionset has 25760 features and 16 samples.
## The colors used in the expressionset are: #aed8e4, #6592ef, #808080, #8a4412.
## Library sizes of 16 samples,
## ranging from 30,859,887 to 60,727,032.
## The following samples have less than 16744 genes.
## [1] "5214_S1" "3740_S3" "3750_S4" "3772_S5" "3774_S6" "3775_S7" "3766_S9" "3767_S10"
## [9] "3742_S12" "3747_S13" "3741b_S14"
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## A non-zero genes plot of 16 samples.
## These samples have an average 48.73 CPM coverage and 16621 genes observed, ranging from 16058 to
## 17079.
First we will filter out the genes which have low counts across all samples. Then I will normalize without background correction (it was determined there is no need for adjusting for background noise after evaluating the SVA correction).
## Removing 13318 low-count genes (12442 remaining).
## transform_counts: Found 2 values equal to 0, adding 1 to the matrix.
mm38_nb <- normalize_expt(mm38_hisat, filter = TRUE, convert = "cpm",
transform = "log2", batch = "svaseq")
## Removing 13318 low-count genes (12442 remaining).
## Setting 128 low elements to zero.
## transform_counts: Found 128 values equal to 0, adding 1 to the matrix.
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by IDD-Saline, IDD-CL, IAD-Saline, IAD-CL
## Shapes are defined by 1, 2.
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by IDD-Saline, IDD-CL, IAD-Saline, IAD-CL
## Shapes are defined by 1, 2.
## A sankey plot describing the metadata of 16 samples,
## including 14 out of 0 nodes and traversing metadata factors:
## .
How many samples do we have currently for each group?
## treatment
## diet cl salinevehicle
## IAD 4 4
## IDD 5 3
In the conditions column, we have IDD-Saline, IDD-CL, IAD-Saline, and IAD-CL. For the differential expression analysis, we will perform the following contrasts:
Normal Pairwise Contrasts:
Contrasts after adjustment:
I will conduct these contrasts below and provide a volcano plot (both interactive and static), as well as a searchable table of the significant DE results for each contrast.
##
## IDD-Saline IDD-CL IAD-Saline IAD-CL
## 3 5 4 4
## Starting basic pairwise comparison.
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## Basic step 0/3: Transforming data.
## Basic step 1/3: Creating mean and variance tables.
## Basic step 2/3: Performing 10 comparisons.
## The provided conditions are:
## conditions
## IADCL IADSaline IDDCL IDDSaline
## 4 4 5 3
## Choosing among model matrix columns: conditionIADCL, conditionIADSaline, conditionIDDCL, conditionIDDSaline.
## Basic step 3/3: Creating faux DE Tables.
## Basic: Returning tables.
## The data should be suitable for EdgeR/DESeq/EBSeq.
## If they freak out, check the state of the count table
## and ensure that it is in integer counts.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## The provided conditions are:
## conditions
## IADCL IADSaline IDDCL IDDSaline
## 4 4 5 3
## Choosing among model matrix columns: conditionIADCL, conditionIADSaline, conditionIDDCL, conditionIDDSaline.
## Starting edgeR pairwise comparisons.
## The data should be suitable for EdgeR/DESeq/EBSeq.
## If they freak out, check the state of the count table
## and ensure that it is in integer counts.
## EdgeR step 1/9: Importing and normalizing data.
## EdgeR step 2/9: Estimating the common dispersion.
## EdgeR step 3/9: Estimating dispersion across genes.
## EdgeR step 4/9: Estimating GLM Common dispersion.
## EdgeR step 5/9: Estimating GLM Trended dispersion.
## EdgeR step 6/9: Estimating GLM Tagged dispersion.
## EdgeR step 7/9: Running glmFit, switch to glmQLFit by changing the argument 'edger_test'.
## EdgeR step 8/9: Making pairwise contrasts.
## The provided conditions are:
## conditions
## IADCL IADSaline IDDCL IDDSaline
## 4 4 5 3
## Choosing among model matrix columns: conditionIADCL, conditionIADSaline, conditionIDDCL, conditionIDDSaline.
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$best_libsize.
## Limma step 1/6: choosing model.
## Limma step 2/6: running limma::voom(), switch with the argument 'which_voom'.
## Using normalize.method = quantile for voom.
## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## The provided conditions are:
## conditions
## IADCL IADSaline IDDCL IDDSaline
## 4 4 5 3
## Choosing among model matrix columns: conditionIADCL, conditionIADSaline, conditionIDDCL, conditionIDDSaline.
## Finished make_pairwise_contrasts.
## The provided conditions are:
## conditions
## IADCL IADSaline IDDCL IDDSaline
## 4 4 5 3
## Choosing among model matrix columns: conditionIADCL, conditionIADSaline, conditionIDDCL, conditionIDDSaline.
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/6: Creating table: IADSaline_vs_IADCL. Adjust = BH
## Limma step 6/6: 2/6: Creating table: IDDCL_vs_IADCL. Adjust = BH
## Limma step 6/6: 3/6: Creating table: IDDSaline_vs_IADCL. Adjust = BH
## Limma step 6/6: 4/6: Creating table: IDDCL_vs_IADSaline. Adjust = BH
## Limma step 6/6: 5/6: Creating table: IDDSaline_vs_IADSaline. Adjust = BH
## Limma step 6/6: 6/6: Creating table: IDDSaline_vs_IDDCL. Adjust = BH
## Limma step 6/6: 1/4: Creating table: IADCL. Adjust = BH
## Limma step 6/6: 2/4: Creating table: IADSaline. Adjust = BH
## Limma step 6/6: 3/4: Creating table: IDDCL. Adjust = BH
## Limma step 6/6: 4/4: Creating table: IDDSaline. Adjust = BH
## Starting noiseq pairwise comparisons.
## The data should be suitable for EdgeR/DESeq/EBSeq.
## If they freak out, check the state of the count table
## and ensure that it is in integer counts.
## The provided conditions are:
## conditions
## IADCL IADSaline IDDCL IDDSaline
## 4 4 5 3
## Choosing among model matrix columns: conditionIADCL, conditionIADSaline, conditionIDDCL, conditionIDDSaline.
## A pairwise differential expression with results from: basic, deseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 10 comparisons.
keepers <- list("IDDCL_vs_IADCL" = c("IDDCL", "IADCL"),
"IDDSaline_vs_IADSaline" = c("IDDSaline", "IADSaline"),
"IDDSaline_vs_IDDCL" = c("IDDSaline", "IDDCL"),
"IADSaline_vs_IADCL" = c("IADSaline", "IADCL"))
mm_de_tables <- combine_de_tables(mm_de_normal, excel="excel/DE_20221003.xlsx",
keepers = keepers, label_column = "mgi_symbol")
## Deleting the file excel/DE_20221003.xlsx before writing the tables.
## A set of combined differential expression results.
## table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup limma_sigdown
## 1 IDDCL_vs_IADCL 132 115 142 117 84 71
## 2 IDDSaline_vs_IADSaline 48 55 53 58 15 7
## 3 IDDSaline_vs_IDDCL 4 27 5 40 0 0
## 4 IADSaline_vs_IADCL 12 5 12 7 1 0
## Plot describing unique/shared genes in a differential expression table.
mm_de_sig <- extract_significant_genes(mm_de_tables,
excel = "excel/DE_sig_20221003.xlsx",
according_to = "deseq")
## Deleting the file excel/DE_sig_20221003.xlsx before writing the tables.
## A set of genes deemed significant according to deseq.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## deseq_up deseq_down
## IDDCL_vs_IADCL 132 115
## IDDSaline_vs_IADSaline 48 55
## IDDSaline_vs_IDDCL 4 27
## IADSaline_vs_IADCL 12 5
res_tbls <- c()
for (t in seq_along(keepers)) {
tbl <- names(keepers)[t]
deseq_tbl <- mm_de_normal$deseq$all_tables[[tbl]]
res_tbls[[tbl]] <- merge(deseq_tbl, hisat_annot, by = "row.names")
res_tbls[[tbl]] <- set_sig_limma(res_tbls[[tbl]],
factors = c(paste(gsub("\\_.*","",tbl), "\nEnriched"),
paste(sub('.*\\_', '', tbl), "\nEnriched")))
res_tbls[[paste0(tbl, "volc_plotly")]] <- volc_plot(res_tbls[[tbl]],
title = paste("Volcano Plot:", tbl),
type = "plotly",
tbl = "limma",
gene_name_col = "external_gene_name")
res_tbls[[paste0(tbl, "volc_ggplot")]] <- volc_plot(res_tbls[[tbl]],
title = paste("Volcano Plot:", tbl),
tbl = "limma")
}
pp(file = "pdf/iddcl_vs_iadcl_volcano.pdf")
mm_de_tables[["plots"]][["IDDCL_vs_IADCL"]][["deseq_vol_plots"]]
dev.off()
## png
## 2
pp(file = "pdf/iddsaline_vs_iadsaline_volcano.pdf")
mm_de_tables[["plots"]][["IDDSaline_vs_IADSaline"]][["deseq_vol_plots"]]
dev.off()
## png
## 2
pp(file = "pdf/iddsaline_vs_iddcl_volcano.pdf")
mm_de_tables[["plots"]][["IDDSAline_vs_IDDCL"]][["deseq_vol_plots"]]
## NULL
## png
## 2
Volcano Plot
Table of Significant DE results
This table is significant results with adjusted p-value < 0.05 and log2FC greater than 0.5.
res_tbls$IDDSaline_vs_IDDCL %>%
filter(abs(logFC) > 0.5 & adj.P.Val < 0.05) %>%
arrange(desc(logFC)) %>%
select("ensembl_gene_id", "mgi_symbol", "description", "logFC", "adj.P.Val", "Significance") %>%
DT::datatable(rownames = FALSE)
GSEA
IDD_genes <- res_tbls$IDDSaline_vs_IDDCL %>%
filter(abs(logFC) >= 0.5 & adj.P.Val <= 0.05) %>%
arrange(desc(abs(logFC)))
IDD_gost <- gost(query = IDD_genes$ensembl_gene_id,
organism = "mmusculus", ordered_query = TRUE,
evcodes = TRUE)
#IDD Saline Enriched
IDD_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "GO:BP" | source == "GO:MF" | source == "GO:CC") %>%
datatable(rownames = FALSE, caption = "GO: IDD Significant GSEA")
IDD_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "REAC") %>%
datatable(rownames = FALSE, caption = "REAC: IDD Significant GSEA")
Volcano Plot
Table of Significant DE results
res_tbls$IADSaline_vs_IADCL %>%
filter(abs(logFC) > 0.5 & adj.P.Val < 0.05) %>%
arrange(desc(logFC)) %>%
select("ensembl_gene_id", "mgi_symbol", "description", "logFC", "adj.P.Val", "Significance") %>%
DT::datatable(rownames = FALSE)
GSEA
IAD_genes <- res_tbls$IADSaline_vs_IADCL %>%
filter(abs(logFC) >= 0.5 & adj.P.Val <= 0.05) %>%
arrange(desc(abs(logFC)))
IAD_gost <- gost(query = IAD_genes$ensembl_gene_id,
organism = "mmusculus", ordered_query = TRUE,
evcodes = TRUE)
Volcano Plot
Table of Significant DE results
res_tbls$IDDCL_vs_IADCL %>%
filter(abs(logFC) > 0.5 & adj.P.Val < 0.05) %>%
arrange(desc(logFC)) %>%
select("ensembl_gene_id", "mgi_symbol", "description", "logFC", "adj.P.Val", "Significance") %>%
DT::datatable(rownames = FALSE)
GSEA
CL_genes <- res_tbls$IDDCL_vs_IADCL %>%
filter(abs(logFC) >= 0.5 & adj.P.Val <= 0.05) %>%
arrange(desc(abs(logFC)))
CL_gost <- gost(query = CL_genes$ensembl_gene_id,
organism = "mmusculus", ordered_query = TRUE,
evcodes = TRUE)
#CL Saline Enriched
CL_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "GO:BP" | source == "GO:MF" | source == "GO:CC") %>%
datatable(rownames = FALSE, caption = "GO: CL Significant GSEA")
CL_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "REAC") %>%
datatable(rownames = FALSE, caption = "REAC: CL Significant GSEA")
Volcano Plot
Table of Significant DE results
res_tbls$IDDSaline_vs_IADSaline %>%
filter(abs(logFC) > 1 & adj.P.Val < 0.05) %>%
arrange(desc(logFC)) %>%
select("ensembl_gene_id", "mgi_symbol", "description", "logFC", "adj.P.Val", "Significance") %>%
DT::datatable(rownames = FALSE)
GSEA
Saline_genes <- res_tbls$IDDSaline_vs_IADSaline %>%
filter(abs(logFC) >= 0.5 & adj.P.Val <= 0.05) %>%
arrange(desc(abs(logFC)))
Saline_gost <- gost(query = Saline_genes$ensembl_gene_id,
organism = "mmusculus", ordered_query = TRUE,
evcodes = TRUE)
#CL Saline Enriched
Saline_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "GO:BP" | source == "GO:MF" | source == "GO:CC") %>%
datatable(rownames = FALSE, caption = "GO: Saline Significant GSEA")
Saline_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "REAC") %>%
datatable(rownames = FALSE, caption = "REAC: Saline Significant GSEA")
countdata <- exprs(mm38_filt)
coldata <- colData(mm38_filt)
ddsMat <- DESeqDataSetFromMatrix(countData = countdata,
colData = coldata,
design = ~ diet + treatment)
## converting counts to integer mode
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
all_pairwise() can do this too!
mm_diet_treatment <- set_expt_conditions(mm38_filt, fact = "diet") %>%
set_expt_batches(fact = "treatment")
## The numbers of samples by condition are:
##
## IAD IDD
## 8 8
## The number of samples by batch are:
##
## cl salinevehicle
## 9 7
##
## IAD IDD
## 8 8
##
## cl salinevehicle
## 9 7
## A pairwise differential expression with results from: basic, deseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: batch in model/limma.
## The primary analysis performed 10 comparisons.
## The logFC agreement among the methods follows:
## IDD_vs_IAD
## limma_vs_deseq 0.8627
## limma_vs_edger 0.8861
## limma_vs_basic 0.9907
## limma_vs_noiseq 0.8551
## deseq_vs_edger 0.9966
## deseq_vs_basic 0.8709
## deseq_vs_noiseq 0.7911
## edger_vs_basic 0.8938
## edger_vs_noiseq 0.8102
## basic_vs_noiseq 0.8877
mm_diet_treatment_table <- combine_de_tables(mm_diet_treatment_de, excel = "excel/DE_diet_treatment-table.xlsx",
label_column = "mgi_symbol")
## Deleting the file excel/DE_diet_treatment-table.xlsx before writing the tables.
## A set of combined differential expression results.
## table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup limma_sigdown
## 1 IDD_vs_IAD 74 70 77 76 75 48
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
## Plot describing unique/shared genes in a differential expression table.
mm_diet_treatment_sig <- extract_significant_genes(mm_diet_treatment_table, excel = "excel/DE_diet_treatment-sig.xlsx")
## Deleting the file excel/DE_diet_treatment-sig.xlsx before writing the tables.
## A set of genes deemed significant according to limma, edger, deseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down basic_up basic_down
## IDD_vs_IAD 75 48 77 76 74 70 66 29
GSEA
adj_genes <- res %>%
filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
arrange(desc(abs(log2FoldChange)))
adj_gost <- gost(query = adj_genes$ensembl_gene_id,
organism = "mmusculus", ordered_query = TRUE,
evcodes = TRUE)
adj_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "GO:BP" | source == "GO:MF" | source == "GO:CC") %>%
datatable(rownames = FALSE, caption = "GO: Adjusted for Diet Significant GSEA")
adj_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "REAC") %>%
datatable(rownames = FALSE, caption = "REAC: Adjusted for Diet Significant GSEA")
adj_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "KEGG") %>%
datatable(rownames = FALSE, caption = "KEGG: Adjusted for Diet Significant GSEA")
library(openxlsx)
res_df <- res %>%
filter(padj < 0.05 & abs(log2FoldChange) >= 0.5) %>%
arrange(desc(log2FoldChange)) %>%
select(-Row.names, -lfcSE, -stat, -pvalue, -ensembl_gene_id, -version, -transcript_version, -mgi_symbol)
openxlsx::write.xlsx(res_df, file = "excel/adjdiet_DE.xlsx")
countdata <- exprs(mm38_filt)
coldata <- colData(mm38_filt)
ddsMat <- DESeqDataSetFromMatrix(countData = countdata,
colData = coldata,
design = ~ treatment + diet)
## converting counts to integer mode
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing
res <- results(dds)
res <- merge(as.data.frame(res), hisat_annot, by = 0)
res <- res[!is.na(res$padj),]
res <- set_sig(res, factors = c("IDD \n Enriched", "IAD \n Enriched"))
volc_plot(res_tbl = res, type = "plotly", title = "IDD versus IAD after adjusting for Treatment")
res %>%
filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
select("ensembl_gene_id", "mgi_symbol", "description", "log2FoldChange", "padj", "Significance") %>%
arrange(desc(log2FoldChange)) %>%
datatable(rownames = FALSE)
mm_treatment_diet <- set_expt_conditions(mm38_filt, fact = "treatment") %>%
set_expt_batches(fact = "diet")
## The numbers of samples by condition are:
##
## cl salinevehicle
## 9 7
## The number of samples by batch are:
##
## IAD IDD
## 8 8
##
## cl salinevehicle
## 9 7
##
## IAD IDD
## 8 8
## A pairwise differential expression with results from: basic, deseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: batch in model/limma.
## The primary analysis performed 10 comparisons.
## The logFC agreement among the methods follows:
## slnvhcl_v_
## limma_vs_deseq 0.8734
## limma_vs_edger 0.8871
## limma_vs_basic 0.9733
## limma_vs_noiseq 0.8583
## deseq_vs_edger 0.9974
## deseq_vs_basic 0.8454
## deseq_vs_noiseq 0.7202
## edger_vs_basic 0.8614
## edger_vs_noiseq 0.7421
## basic_vs_noiseq 0.8916
mm_treatment_diet_table <- combine_de_tables(mm_treatment_diet_de, excel = "excel/DE_treatment_diet-table.xlsx",
label_column = "mgi_symbol")
## Deleting the file excel/DE_treatment_diet-table.xlsx before writing the tables.
## A set of combined differential expression results.
## table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup limma_sigdown
## 1 salinevehicle_vs_cl 70 27 80 27 1 26
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
## Plot describing unique/shared genes in a differential expression table.
mm_treatment_diet_sig <- extract_significant_genes(mm_treatment_diet_table, excel = "excel/DE_treatment_diet-sig.xlsx")
## Deleting the file excel/DE_treatment_diet-sig.xlsx before writing the tables.
## A set of genes deemed significant according to limma, edger, deseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
## limma_up limma_down edger_up edger_down deseq_up deseq_down basic_up basic_down
## salinevehicle_vs_cl 1 26 80 27 70 27 0 1
GSEA
adj_genes <- res %>%
filter(padj < 0.05 & abs(log2FoldChange) >= 1) %>%
arrange(desc(abs(log2FoldChange)))
adj_gost <- gost(query = adj_genes$ensembl_gene_id,
organism = "mmusculus",
ordered_query = TRUE,
evcodes = TRUE)
adj_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "GO:BP" | source == "GO:MF" | source == "GO:CC") %>%
datatable(rownames = FALSE, caption = "GO: Diet after Adjusted for Treatment Significant GSEA")
adj_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "REAC") %>%
datatable(rownames = FALSE, caption = "REAC: Diet after Adjusted for Treatment Significant GSEA")
adj_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "KEGG") %>%
datatable(rownames = FALSE, caption = "KEGG: Diet after Adjusted for Treatment Significant GSEA")
gseaplot_GO_diet_adjtreat <- adj_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "GO:BP" | source == "GO:MF" | source == "GO:CC") %>%
head(n = 10) %>%
ggplot(aes(x = recall, y = reorder(term_name, recall), fill = p_value)) +
geom_bar(stat = "identity")+
scale_fill_continuous(low = "blue", high = "red") +
theme_bw()+
ylab("") +
xlab("GSEA Score")
gseaplot_REAC_diet_adjtreat <- adj_gost$result %>%
select(term_name, p_value, term_size, intersection_size, recall, source, intersection) %>%
arrange(desc(recall)) %>%
filter(source == "REAC") %>%
head(n = 10) %>%
ggplot(aes(x = recall, y = reorder(term_name, recall), fill = p_value)) +
geom_bar(stat = "identity")+
scale_fill_continuous(low = "blue", high = "red") +
theme_bw()+
ylab("") +
xlab("GSEA Score")
gseaplot_GO_diet_adjtreat
I am going to use the three data structures I created in order to do the various GO/reactome/etc searches.
## There are only, 4 returning null.
## There are only, 5 returning null.
iddcl_vs_iadcl_up_gp <- write_gprofiler_data(
mm_de_gp[["IDDCL_vs_IADCL_up"]],
excel = "excel/iddcl_vs_iadcl_up_gp.xlsx")
## Deleting the file excel/iddcl_vs_iadcl_up_gp.xlsx before writing the tables.
iddcl_vs_iadcl_down_gp <- write_gprofiler_data(
mm_de_gp[["IDDCL_vs_IADCL_down"]],
excel = "excel/iddcl_vs_iadcl_down_gp.xlsx")
iddsaline_vs_iadsaline_up_gp <- write_gprofiler_data(
mm_de_gp[["IDDSaline_vs_IADSaline_up"]],
excel = "excel/iddsaline_vs_iadsaline_up_gp.xlsx")
## Deleting the file excel/iddsaline_vs_iadsaline_up_gp.xlsx before writing the tables.
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
iddsaline_vs_iadsaline_down_gp <- write_gprofiler_data(
mm_de_gp[["IDDSaline_vs_IADSaline_down"]],
excel = "excel/iddsaline_vs_iadsaline_up_gp.xlsx")
## Deleting the file excel/iddsaline_vs_iadsaline_up_gp.xlsx before writing the tables.
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?`geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?`geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?`geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
iddsaline_vs_iddcl_up_gp <- write_gprofiler_data(
mm_de_gp[["IDDSaline_vs_IDDCL_up"]],
excel = "excel/iddsaline_vs_iddcl_up_gp.xlsx")
## Warning: Workbook does not contain any worksheets. A worksheet will be added.
iddsaline_vs_iddcl_down_gp <- write_gprofiler_data(
mm_de_gp[["IDDSaline_vs_IDDCL_down"]],
excel = "excel/iddsaline_vs_iddcl_down_gp.xlsx")
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
The excel files created above have some pictures in them, but we can make some here as well…
Najib like the tree plots the best, let us try that first.
iddcl_iadcl_up_tree <- enrichplot::pairwise_termsim(
mm_de_gp[["IDDCL_vs_IADCL_up"]][["BP_enrich"]])
sm(enrichplot::treeplot(iddcl_iadcl_up_tree))
iddsaline_iadsaline_up_tree <- enrichplot::pairwise_termsim(
mm_de_gp[["IDDSaline_vs_IADSaline_up"]][["BP_enrich"]])
sm(enrichplot::treeplot(iddsaline_iadsaline_up_tree))
iddsaline_iddcl_up_tree <- enrichplot::pairwise_termsim(
mm_de_gp[["IDDSaline_vs_IDDCL_up"]][["BP_enrich"]])
## Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'pairwise_termsim' for signature '"NULL"'
mm_diet_treatment_gp <- all_gprofiler(mm_diet_treatment_sig, species = "mmusculus")
mm_diet_treatment_up_gp <- write_gprofiler_data(
mm_diet_treatment_gp[["IDD_vs_IAD_up"]],
excel = "excel/mm_diet_treatment_idd_vs_iad_up_gp.xlsx")
## Deleting the file excel/mm_diet_treatment_idd_vs_iad_up_gp.xlsx before writing the tables.
mm_diet_treatment_down_gp <- write_gprofiler_data(
mm_diet_treatment_gp[["IDD_vs_IAD_down"]],
excel = "excel/mm_diet_treatment_idd_vs_iad_down_gp.xlsx")
## Deleting the file excel/mm_diet_treatment_idd_vs_iad_down_gp.xlsx before writing the tables.
And some plots!
idd_iad_up_tree <- enrichplot::pairwise_termsim(
mm_diet_treatment_gp[["IDD_vs_IAD_up"]][["BP_enrich"]])
sm(enrichplot::treeplot(idd_iad_up_tree))
idd_iad_down_tree <- enrichplot::pairwise_termsim(
mm_diet_treatment_gp[["IDD_vs_IAD_down"]][["BP_enrich"]])
sm(enrichplot::treeplot(idd_iad_down_tree))
idd_iad_up_tree <- enrichplot::pairwise_termsim(
mm_diet_treatment_gp[["IDD_vs_IAD_up"]][["REAC_enrich"]])
sm(enrichplot::dotplot(idd_iad_up_tree))
idd_iad_down_tree <- enrichplot::pairwise_termsim(
mm_diet_treatment_gp[["IDD_vs_IAD_down"]][["REAC_enrich"]])
sm(enrichplot::dotplot(idd_iad_down_tree))
mm_treatment_diet_gp <- all_gprofiler(mm_treatment_diet_sig, species = "mmusculus")
mm_treatment_diet_up_gp <- write_gprofiler_data(
mm_treatment_diet_gp[["IDD_vs_IAD_up"]],
excel = "excel/mm_treatment_diet_idd_vs_iad_up_gp.xlsx")
## Deleting the file excel/mm_treatment_diet_idd_vs_iad_up_gp.xlsx before writing the tables.
## Warning: Workbook does not contain any worksheets. A worksheet will be added.
mm_treatment_diet_down_gp <- write_gprofiler_data(
mm_treatment_diet_gp[["IDD_vs_IAD_down"]],
excel = "excel/mm_treatment_diet_idd_vs_iad_down_gp.xlsx")
## Deleting the file excel/mm_treatment_diet_idd_vs_iad_down_gp.xlsx before writing the tables.
## Warning: Workbook does not contain any worksheets. A worksheet will be added.
And some plots!
saline_cl_up_tree <- enrichplot::pairwise_termsim(
mm_treatment_diet_gp[["salinevehicle_vs_cl_up"]][["BP_enrich"]])
sm(enrichplot::treeplot(idd_iad_up_tree))
saline_cl_down_tree <- enrichplot::pairwise_termsim(
mm_treatment_diet_gp[["salinevehicle_vs_cl_down"]][["BP_enrich"]])
sm(enrichplot::dotplot(idd_iad_down_tree))
## preparing geneSet collections...
## GSEA analysis...
## leading edge analysis...
## done...
## Reading KEGG annotation online: "https://rest.kegg.jp/link/mmu/pathway"...
## Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/mmu"...
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## preparing geneSet collections...
##
## GSEA analysis...
##
## leading edge analysis...
##
## done...
##
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## preparing geneSet collections...
##
## GSEA analysis...
##
## leading edge analysis...
##
## done...
##
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## preparing geneSet collections...
##
## GSEA analysis...
##
## leading edge analysis...
##
## done...
##
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## No gene sets have size between 5 and 500 ...
##
## --> return NULL...
##
## No gene sets have size between 5 and 500 ...
##
## --> return NULL...
##
## No gene sets have size between 5 and 500 ...
##
## --> return NULL...
##
## No gene sets have size between 5 and 500 ...
##
## --> return NULL...
##
## preparing geneSet collections...
##
## GSEA analysis...
##
## leading edge analysis...
##
## done...
##
## --> No gene can be mapped....
##
## --> Expected input gene ID: 30963,14120,78625,56752,69983,74246
##
## --> return NULL...
##
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## preparing geneSet collections...
##
## GSEA analysis...
##
## leading edge analysis...
##
## done...
##
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## preparing geneSet collections...
##
## GSEA analysis...
##
## leading edge analysis...
##
## done...
##
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## preparing geneSet collections...
##
## GSEA analysis...
##
## leading edge analysis...
##
## done...
##
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.
iddcl_vs_iadcl_gsea <- plot_topn_gsea(mm_de_cp[["IDDCL_vs_IADCL_up"]])
iddcl_vs_iadcl_gsea[["GO"]][[1]]
iddsaline_vs_iadsaline_gsea <- plot_topn_gsea(mm_de_cp[["IDDSaline_vs_IADSaline_up"]])
iddsaline_vs_iadsaline_gsea[["GO"]][[1]]
iadsaline_vs_iadcl_gsea <- plot_topn_gsea(mm_de_cp[["IADSaline_vs_IADCL_up"]])
iadsaline_vs_iadcl_gsea[["GO"]][[1]]