The various differential expression analyses of the data generated in tmrc3_datasets will occur in this document.
I am going to try to standardize how I name the various data structures created in this document. Most of the large data created are either sets of differential expression analyses, their combined results, or the set of results deemed ‘significant’.
Hopefully by now they all follow these guidelines:
{clinic(s)}_{sample-subset}_{primary-question(s)}_{datatype}_{batch-method}
With this in mind, ‘tc_biopsies_clinic_de_sva’ should be the Tumaco+Cali biopsy data after performing the differential expression analyses comparing the clinics using sva.
I suspect there remain some exceptions and/or errors.
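As a small illustration of the naming guideline, the following sketch assembles a name from its pieces. The helper make_de_name() is hypothetical and exists only for this example; it is not part of hpgltools or of the analyses below.

```r
# Hypothetical helper (an assumption for illustration only): assemble a
# variable name from the pieces of the naming guideline, skipping empty pieces.
make_de_name <- function(clinics, subset, question, datatype, batch = NULL) {
  pieces <- c(clinics, subset, question, datatype, batch)
  paste(pieces[nzchar(pieces)], collapse = "_")
}

# Tumaco+Cali, biopsies, clinic comparison, DE result, sva.
make_de_name("tc", "biopsies", "clinic", "de", "sva")
```

A name built this way can be compared against the variables defined later to spot the exceptions mentioned above.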
Each of the following lists describes the set of contrasts that I think are interesting for the various ways one might consider the TMRC3 dataset. The variables are named according to the data with which they are expected to be used; thus tc_cf_contrasts is expected to be used for the Tumaco+Cali data and to provide a series of cure/fail comparisons which (to the extent possible) work across both locations. In every case, the name of the list element will be used as the contrast name, and will thus be seen as the sheet name in the output xlsx file(s); the two pieces of the character vector value are the numerator and denominator of the associated contrast.
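To make the convention concrete, here is a minimal sketch that tabulates the sheet name, numerator, and denominator implied by a contrast list. The function describe_contrasts() and the list example_contrasts are hypothetical names used only for this illustration.

```r
# Hypothetical helper (not part of hpgltools): tabulate the numerator and
# denominator implied by each element of a contrast list.
describe_contrasts <- function(contrasts) {
  data.frame(
    sheet = names(contrasts),
    numerator = vapply(contrasts, function(x) x[1], character(1),
                       USE.NAMES = FALSE),
    denominator = vapply(contrasts, function(x) x[2], character(1),
                         USE.NAMES = FALSE),
    row.names = NULL,
    stringsAsFactors = FALSE)
}

# A toy list in the same shape as the contrast lists in this document.
example_contrasts <- list(
  "tumaco" = c("Tumacofailure", "Tumacocure"),
  "cure" = c("Tumacocure", "Calicure"))
describe_contrasts(example_contrasts)
```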
clinic_contrasts <- list(
  "clinics" = c("Cali", "Tumaco"))
## In some cases we have no Cali failure samples, so there remain only 2
## contrasts that are likely of interest
tc_cf_contrasts <- list(
  "tumaco" = c("Tumacofailure", "Tumacocure"),
  "cure" = c("Tumacocure", "Calicure"))
## In other cases, we have cure/fail for both places.
clinic_cf_contrasts <- list(
  "cali" = c("Califailure", "Calicure"),
  "tumaco" = c("Tumacofailure", "Tumacocure"),
  "cure" = c("Tumacocure", "Calicure"),
  "fail" = c("Tumacofailure", "Califailure"))
cf_contrast <- list(
  "outcome" = c("Tumacofailure", "Tumacocure"))
t_cf_contrast <- list(
  "outcome" = c("failure", "cure"))
visitcf_contrasts <- list(
  "v1cf" = c("v1failure", "v1cure"),
  "v2cf" = c("v2failure", "v2cure"),
  "v3cf" = c("v3failure", "v3cure"))
visit_contrasts <- list(
  "v2v1" = c("c2", "c1"),
  "v3v1" = c("c3", "c1"),
  "v3v2" = c("c3", "c2"))
visit_v1later <- list(
  "later_vs_first" = c("later", "first"))
celltypes <- list(
  "eo_mono" = c("eosinophils", "monocytes"),
  "ne_mono" = c("neutrophils", "monocytes"),
  "eo_ne" = c("eosinophils", "neutrophils"))
Perform a svaseq-guided comparison of the two clinics. Ideally this will give some clue about just how strong the clinic-based batch effect really is and what its causes are.
tc_clinic_type <- tc_valid %>%
  set_expt_conditions(fact = "clinic") %>%
  set_expt_batches(fact = "typeofcells")
##
## Cali Tumaco
## 61 123
##
## biopsy eosinophils monocytes neutrophils
## 18 41 63 62
table(pData(tc_clinic_type)[["condition"]])
##
## Cali Tumaco
## 61 123
tc_all_clinic_de_sva <- all_pairwise(tc_clinic_type, model_batch = "svaseq",
                                     filter = TRUE)
##
## Cali Tumaco
## 61 123
## Removing 0 low-count genes (14290 remaining).
## Setting 31271 low elements to zero.
## transform_counts: Found 31271 values equal to 0, adding 1 to the matrix.
tc_all_clinic_de_sva[["deseq"]][["contrasts_performed"]]
## [1] "Tumaco_vs_Cali"
tc_all_clinic_table_sva <- combine_de_tables(
  tc_all_clinic_de_sva, keepers = clinic_contrasts,
  # rda = glue("rda/tc_all_clinic_table_sva-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/compare_clinics/tc_all_clinic_table_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/compare_clinics/tc_all_clinic_table_sva-v202304.xlsx before writing the tables.
## Adding venn plots for clinics.
tc_all_clinic_sig_sva <- extract_significant_genes(
  tc_all_clinic_table_sva,
  excel = glue("analyses/3_cali_and_tumaco/compare_clinics/tc_clinic_type_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/compare_clinics/tc_clinic_type_sig_sva-v202304.xlsx before writing the tables.
Let us take a quick look at the results of the Tumaco/Cali comparison.
Note: I keep re-introducing an error which causes these (volcano and MA) plots to be reversed with respect to the logFC values. Pay careful attention to these and make sure that they agree with the numbers of genes observed in the contrast.
## Check that up is up
summary(tc_all_clinic_table_sva[["data"]][["clinics"]][["deseq_logfc"]])
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -20.580 -0.584 -0.155 -0.255 0.172 3.515
## I think we can assume that most genes are down when considering Tumaco/Cali.
sum(tc_all_clinic_table_sva$data$clinics$deseq_logfc < -1.0 &
    tc_all_clinic_table_sva$data$clinics$deseq_adjp < 0.05)
## [1] 1792
tc_all_clinic_table_sva[["plots"]][["clinics"]][["deseq_vol_plots"]]
## Ok, so it says 1794 up, but that is clearly the down side... Something is definitely messed up.
## The points are on the correct sides of the plot, but the categories of up/down are reversed.
## Theresa noted that she colors differently, and I think better: left side gets called
## 'increased in denominator', right side gets called 'increased in numerator';
## these two groups are colored according to their condition colors, and everything else is gray.
## I am checking out Theresa's helper_functions.R to get a sense of how she handles this, I think
## I can use a variant of her idea pretty easily:
## 1. Add a column 'Significance', which is a factor, and contains either 'Not enriched',
## 'Enriched in x', or 'Enriched in y' according to the logfc/adjp.
## 2. use the significance column for the geom_point color/fill in the volcano plot.
## My change to this idea would be to extract the colors from the input expressionset.
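The two-step idea in the comments above can be sketched with base R alone. Everything in this block is an assumption for illustration: the toy table, the helper add_significance(), the default cutoffs, and the group labels "x" and "y" are invented. Only the column names deseq_logfc and deseq_adjp are taken from the tables used in this document.

```r
# Toy stand-in for a combined DE table; the values are invented.
toy <- data.frame(
  deseq_logfc = c(-2.5, -0.3, 0.1, 1.8, 3.0),
  deseq_adjp = c(0.001, 0.20, 0.90, 0.01, 0.30),
  row.names = paste0("gene", 1:5))

# Hypothetical helper: add a 'Significance' factor with three levels,
# labelled by the numerator and denominator of the contrast.
add_significance <- function(tbl, numerator, denominator,
                             lfc = 1.0, adjp = 0.05) {
  sig <- rep("Not enriched", nrow(tbl))
  # Treat missing adjusted p-values as not significant.
  passed <- !is.na(tbl[["deseq_adjp"]]) & tbl[["deseq_adjp"]] <= adjp
  up <- which(passed & tbl[["deseq_logfc"]] >= lfc)
  down <- which(passed & tbl[["deseq_logfc"]] <= -lfc)
  sig[up] <- paste("Enriched in", numerator)
  sig[down] <- paste("Enriched in", denominator)
  tbl[["Significance"]] <- factor(
    sig, levels = c(paste("Enriched in", denominator), "Not enriched",
                    paste("Enriched in", numerator)))
  tbl
}

toy <- add_significance(toy, numerator = "x", denominator = "y")
table(toy[["Significance"]])
```

The resulting factor could then be mapped to the point color in a volcano plot, so that the legend labels name the groups instead of 'up' and 'down'.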
increased_tumaco_categories <- simple_gprofiler(
  tc_all_clinic_sig_sva[["deseq"]][["ups"]][["clinics"]])
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
increased_tumaco_categories[["pvalue_plots"]][["BP"]]
increased_cali_categories <- simple_gprofiler(
  tc_all_clinic_sig_sva[["deseq"]][["downs"]][["clinics"]])
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
increased_cali_categories[["pvalue_plots"]][["BP"]]
There appear to be many more genes which are increased in the Tumaco samples with respect to the Cali samples.
The remaining cell types all have pretty strong clinic-based variance; but I am not certain if it is consistent across cell types.
table(pData(tc_eosinophils)[["condition"]])
##
## Cali_cure Tumaco_cure Tumaco_failure
## 15 17 9
tc_eosinophils_clinic_de_nobatch <- all_pairwise(tc_eosinophils,
                                                 model_batch = FALSE, filter = TRUE)
##
## Cali_cure Tumaco_cure Tumaco_failure
## 15 17 9
tc_eosinophils_clinic_de_nobatch[["deseq"]][["contrasts_performed"]]
## [1] "Tumacofailure_vs_Tumacocure" "Tumacofailure_vs_Calicure"
## [3] "Tumacocure_vs_Calicure"
tc_eosinophils_clinic_table_nobatch <- combine_de_tables(
  tc_eosinophils_clinic_de_nobatch, keepers = tc_cf_contrasts,
  # rda = glue("rda/tc_eosinophils_clinic_table_nobatch-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Eosinophils/tc_eosinophils_clinic_table_nobatch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Eosinophils/tc_eosinophils_clinic_table_nobatch-v202304.xlsx before writing the tables.
## Adding venn plots for tumaco.
## Adding venn plots for cure.
tc_eosinophils_clinic_sig_nobatch <- extract_significant_genes(
  tc_eosinophils_clinic_table_nobatch,
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Eosinophils/tc_eosinophils_clinic_sig_nobatch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Eosinophils/tc_eosinophils_clinic_sig_nobatch-v202304.xlsx before writing the tables.
tc_eosinophils_clinic_de_sva <- all_pairwise(tc_eosinophils, model_batch = "svaseq", filter = TRUE)
##
## Cali_cure Tumaco_cure Tumaco_failure
## 15 17 9
## Removing 0 low-count genes (10864 remaining).
## Setting 1043 low elements to zero.
## transform_counts: Found 1043 values equal to 0, adding 1 to the matrix.
tc_eosinophils_clinic_de_sva[["deseq"]][["contrasts_performed"]]
## [1] "Tumacofailure_vs_Tumacocure" "Tumacofailure_vs_Calicure"
## [3] "Tumacocure_vs_Calicure"
tc_eosinophils_clinic_table_sva <- combine_de_tables(
  tc_eosinophils_clinic_de_sva, keepers = tc_cf_contrasts,
  # rda = glue("rda/tc_eosinophils_clinic_table_sva-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Eosinophils/tc_eosinophils_clinic_table_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Eosinophils/tc_eosinophils_clinic_table_sva-v202304.xlsx before writing the tables.
## Adding venn plots for tumaco.
## Adding venn plots for cure.
tc_eosinophils_clinic_sig_sva <- extract_significant_genes(
  tc_eosinophils_clinic_table_sva,
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Eosinophils/tc_eosinophils_clinic_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Eosinophils/tc_eosinophils_clinic_sig_sva-v202304.xlsx before writing the tables.
Interestingly to me, the biopsy samples appear to have the least location-based variance. But we can perform an explicit DE and see how well that hypothesis holds up.
Note that these data include cure and fail samples for Tumaco, but only cure samples for Cali.
table(pData(tc_biopsies)[["condition"]])
##
## Cali_cure Tumaco_cure Tumaco_failure
## 4 9 5
tc_biopsies_clinic_de_sva <- all_pairwise(tc_biopsies,
                                          model_batch = "svaseq", filter = TRUE)
##
## Cali_cure Tumaco_cure Tumaco_failure
## 4 9 5
## Removing 0 low-count genes (13608 remaining).
## Setting 290 low elements to zero.
## transform_counts: Found 290 values equal to 0, adding 1 to the matrix.
tc_biopsies_clinic_de_sva[["deseq"]][["contrasts_performed"]]
## [1] "Tumacofailure_vs_Tumacocure" "Tumacofailure_vs_Calicure"
## [3] "Tumacocure_vs_Calicure"
tc_biopsies_clinic_table_sva <- combine_de_tables(
  tc_biopsies_clinic_de_sva, keepers = tc_cf_contrasts,
  # rda = glue("rda/tc_biopsies_clinic_table_sva-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Biopsies/tc_biopsies_clinic_table_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Biopsies/tc_biopsies_clinic_table_sva-v202304.xlsx before writing the tables.
## Adding venn plots for tumaco.
## Adding venn plots for cure.
tc_biopsies_clinic_sig_sva <- extract_significant_genes(
  tc_biopsies_clinic_table_sva,
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Biopsies/tc_biopsies_clinic_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Biopsies/tc_biopsies_clinic_sig_sva-v202304.xlsx before writing the tables.
At least for the moment, I am only looking at the differences between no-batch vs. sva across clinics for the monocyte samples. This was chosen mostly arbitrarily.
Our baseline is the comparison of the monocytes samples without batch in the model or surrogate estimation. In theory at least, this should correspond to the PCA plot above when no batch estimation was performed.
tc_monocytes_de_nobatch <- all_pairwise(tc_monocytes, model_batch = FALSE, filter = TRUE)
##
## Cali_cure Cali_failure Tumaco_cure Tumaco_failure
## 18 3 21 21
tc_monocytes_table_nobatch <- combine_de_tables(
  tc_monocytes_de_nobatch, keepers = clinic_cf_contrasts,
  # rda = glue("rda/tc_monocytes_clinic_table_nobatch-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Monocytes/tc_monocytes_clinic_table_nobatch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Monocytes/tc_monocytes_clinic_table_nobatch-v202304.xlsx before writing the tables.
## Adding venn plots for cali.
## Adding venn plots for tumaco.
## Adding venn plots for cure.
## Adding venn plots for fail.
tc_monocytes_sig_nobatch <- extract_significant_genes(
  tc_monocytes_table_nobatch,
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Monocytes/tc_monocytes_clinic_sig_nobatch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Monocytes/tc_monocytes_clinic_sig_nobatch-v202304.xlsx before writing the tables.
In contrast, the following comparison should give a view of the data corresponding to the svaseq PCA plot above. In the best-case scenario, we should therefore be able to see some significant differences between the Tumaco cure and fail samples.
tc_monocytes_de_sva <- all_pairwise(tc_monocytes, model_batch = "svaseq", filter = TRUE)
##
## Cali_cure Cali_failure Tumaco_cure Tumaco_failure
## 18 3 21 21
## Removing 0 low-count genes (11104 remaining).
## Setting 1447 low elements to zero.
## transform_counts: Found 1447 values equal to 0, adding 1 to the matrix.
tc_monocytes_table_sva <- combine_de_tables(
  tc_monocytes_de_sva, keepers = clinic_cf_contrasts,
  # rda = glue("rda/tc_monocytes_clinic_table_sva-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Monocytes/tc_monocytes_clinic_table_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Monocytes/tc_monocytes_clinic_table_sva-v202304.xlsx before writing the tables.
## Adding venn plots for cali.
## Adding venn plots for tumaco.
## Adding venn plots for cure.
## Adding venn plots for fail.
tc_monocytes_sig_sva <- extract_significant_genes(
  tc_monocytes_table_sva,
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Monocytes/tc_monocytes_clinic_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Monocytes/tc_monocytes_clinic_sig_sva-v202304.xlsx before writing the tables.
The following block shows that these two results are exceedingly different, suggesting that the Cali cure/fail and Tumaco cure/fail comparisons cannot easily be considered in the same analysis. I did some playing around with my calculate_aucc function in this block and found that it is broken in some important way, at least if one expands the top-n genes to more than 20% of the number of genes in the data.
cali_table <- tc_monocytes_table_nobatch[["data"]][["cali"]]
table <- tc_monocytes_table_nobatch[["data"]][["tumaco"]]

cali_merged <- merge(cali_table, table, by = "row.names")
cor.test(cali_merged[, "deseq_logfc.x"], cali_merged[, "deseq_logfc.y"])
##
## Pearson's product-moment correlation
##
## data: cali_merged[, "deseq_logfc.x"] and cali_merged[, "deseq_logfc.y"]
## t = 0.92, df = 11102, p-value = 0.4
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.009917 0.027280
## sample estimates:
## cor
## 0.008685
cali_aucc <- calculate_aucc(cali_table, table, px = "deseq_adjp", py = "deseq_adjp",
                            lx = "deseq_logfc", ly = "deseq_logfc")
cali_aucc$plot
cali_table_sva <- tc_monocytes_table_sva[["data"]][["cali"]]
tumaco_table_sva <- tc_monocytes_table_sva[["data"]][["tumaco"]]

cali_merged_sva <- merge(cali_table_sva, tumaco_table_sva, by = "row.names")
cor.test(cali_merged_sva[, "deseq_logfc.x"], cali_merged_sva[, "deseq_logfc.y"])
##
## Pearson's product-moment correlation
##
## data: cali_merged_sva[, "deseq_logfc.x"] and cali_merged_sva[, "deseq_logfc.y"]
## t = 16, df = 11102, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1356 0.1720
## sample estimates:
## cor
## 0.1539
cali_aucc_sva <- calculate_aucc(cali_table_sva, tumaco_table_sva, px = "deseq_adjp",
                                py = "deseq_adjp", lx = "deseq_logfc", ly = "deseq_logfc")
cali_aucc_sva$plot
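For readers who want to see the shape of the comparison above without the real data, here is a minimal sketch on invented tables. The merge() and cor.test() steps mirror the code above. The function simple_aucc() is a hypothetical stand-in which assumes one common definition of a concordance curve (the number of genes shared by the top-k of each ranking, summed over k and scaled by the value for identical rankings); it is not the calculate_aucc() implementation.

```r
set.seed(1)
genes <- paste0("gene", 1:200)
# Invented tables standing in for two contrasts from a combined DE result.
tab_x <- data.frame(deseq_logfc = rnorm(200), deseq_adjp = runif(200),
                    row.names = genes)
tab_y <- data.frame(deseq_logfc = rnorm(200), deseq_adjp = runif(200),
                    row.names = genes)

# merge() by row names suffixes the shared column names with .x and .y.
merged <- merge(tab_x, tab_y, by = "row.names")
cor.test(merged[["deseq_logfc.x"]], merged[["deseq_logfc.y"]])

# Hypothetical concordance sketch: for each k up to top_n, count the genes
# shared by the k most significant genes of each table, then scale the sum
# by the sum expected when both rankings are identical.
simple_aucc <- function(x, y, p = "deseq_adjp", top_n = 20) {
  rank_x <- rownames(x)[order(x[[p]])]
  rank_y <- rownames(y)[order(y[[p]])]
  top_n <- min(top_n, length(rank_x), length(rank_y))
  shared <- vapply(seq_len(top_n), function(k) {
    length(intersect(rank_x[seq_len(k)], rank_y[seq_len(k)]))
  }, integer(1))
  sum(shared) / sum(seq_len(top_n))
}

simple_aucc(tab_x, tab_x)  # identical rankings
simple_aucc(tab_x, tab_y)  # unrelated rankings
```

Under this definition identical rankings score 1 and unrelated rankings score near 0, which gives a rough scale for reading the plots above.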
tc_neutrophils_de_nobatch <- all_pairwise(tc_neutrophils,
                                          model_batch = FALSE, filter = TRUE)
##
## Cali_cure Cali_failure Tumaco_cure Tumaco_failure
## 18 3 20 21
tc_neutrophils_table_nobatch <- combine_de_tables(
  tc_neutrophils_de_nobatch, keepers = clinic_cf_contrasts,
  # rda = glue("rda/tc_neutrophils_clinic_table_nobatch-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Neutrophils/tc_neutrophils_table_nobatch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Neutrophils/tc_neutrophils_table_nobatch-v202304.xlsx before writing the tables.
## Adding venn plots for cali.
## Adding venn plots for tumaco.
## Adding venn plots for cure.
## Adding venn plots for fail.
tc_neutrophils_sig_nobatch <- extract_significant_genes(
  tc_neutrophils_table_nobatch,
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Neutrophils/tc_neutrophils_sig_nobatch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Neutrophils/tc_neutrophils_sig_nobatch-v202304.xlsx before writing the tables.
tc_neutrophils_de_sva <- all_pairwise(tc_neutrophils,
                                      model_batch = "svaseq", filter = TRUE)
##
## Cali_cure Cali_failure Tumaco_cure Tumaco_failure
## 18 3 20 21
## Removing 0 low-count genes (9242 remaining).
## Setting 1541 low elements to zero.
## transform_counts: Found 1541 values equal to 0, adding 1 to the matrix.
tc_neutrophils_table_sva <- combine_de_tables(
  tc_neutrophils_de_sva, keepers = clinic_cf_contrasts,
  # rda = glue("rda/tc_neutrophils_clinic_table_sva-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Neutrophils/tc_neutrophils_table_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Neutrophils/tc_neutrophils_table_sva-v202304.xlsx before writing the tables.
## Adding venn plots for cali.
## Adding venn plots for tumaco.
## Adding venn plots for cure.
## Adding venn plots for fail.
tc_neutrophils_sig_sva <- extract_significant_genes(
  tc_neutrophils_table_sva,
  excel = glue("analyses/3_cali_and_tumaco/clinic_cf/Neutrophils/tc_neutrophils_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/clinic_cf/Neutrophils/tc_neutrophils_sig_sva-v202304.xlsx before writing the tables.
Conversely, I can load some of the MSigDB categories from the Broad Institute and perform a similar analysis using goseq to see if there are over-represented categories.
broad_c7 <- load_gmt_signatures(signatures = "reference/msigdb/c7.all.v7.5.1.entrez.gmt",
                                signature_category = "c7")
broad_c2 <- load_gmt_signatures(signatures = "reference/msigdb/c2.all.v7.5.1.entrez.gmt",
                                signature_category = "c2")
broad_h <- load_gmt_signatures(signatures = "reference/msigdb/h.all.v7.5.1.entrez.gmt",
                               signature_category = "h")
clinic_gsea_msig_c2 <- goseq_msigdb(clinic_sigenes, length_db = hs_length,
                                    signatures = broad_c2, signature_category = "c2")
## Error in "character" %in% class(sig_genes): object 'clinic_sigenes' not found
In the following block, I am looking at the gProfiler over-represented groups observed across clinics in only the Eosinophils. First I do so for all genes (up or down), followed by only the up and down groups. Each of the following will include only the Reactome and GO:BP plots. These searches did not have too many other hits, excepting the transcription factor database.
tc_eosinophils_gp <- simple_gprofiler(tc_eosinophils_sigenes)
## Error in "character" %in% class(sig_genes): object 'tc_eosinophils_sigenes' not found
tc_eosinophils_gp$pvalue_plots$REAC
## Error in eval(expr, envir, enclos): object 'tc_eosinophils_gp' not found
tc_eosinophils_gp$pvalue_plots$BP
## Error in eval(expr, envir, enclos): object 'tc_eosinophils_gp' not found
tc_eosinophils_up_gp <- simple_gprofiler(tc_eosinophils_sigenes_up)
## Error in "character" %in% class(sig_genes): object 'tc_eosinophils_sigenes_up' not found
tc_eosinophils_up_gp$pvalue_plots$REAC
## Error in eval(expr, envir, enclos): object 'tc_eosinophils_up_gp' not found
tc_eosinophils_up_gp$pvalue_plots$BP
## Error in eval(expr, envir, enclos): object 'tc_eosinophils_up_gp' not found
tc_eosinophils_down_gp <- simple_gprofiler(tc_eosinophils_sigenes_down)
## Error in "character" %in% class(sig_genes): object 'tc_eosinophils_sigenes_down' not found
tc_eosinophils_down_gp$pvalue_plots$REAC
## Error in eval(expr, envir, enclos): object 'tc_eosinophils_down_gp' not found
tc_eosinophils_down_gp$pvalue_plots$BP
## Error in eval(expr, envir, enclos): object 'tc_eosinophils_down_gp' not found
In the following block I repeated the above query, but this time looking at the monocyte samples.
tc_monocytes_gp <- simple_gprofiler(tc_monocytes_sigenes)
## Error in "character" %in% class(sig_genes): object 'tc_monocytes_sigenes' not found
tc_monocytes_gp$pvalue_plots$REAC
## Error in eval(expr, envir, enclos): object 'tc_monocytes_gp' not found
tc_monocytes_gp$pvalue_plots$BP
## Error in eval(expr, envir, enclos): object 'tc_monocytes_gp' not found
tc_monocytes_up_gp <- simple_gprofiler(tc_monocytes_sigenes_up)
## Error in "character" %in% class(sig_genes): object 'tc_monocytes_sigenes_up' not found
tc_monocytes_up_gp$pvalue_plots$REAC
## Error in eval(expr, envir, enclos): object 'tc_monocytes_up_gp' not found
tc_monocytes_up_gp$pvalue_plots$BP
## Error in eval(expr, envir, enclos): object 'tc_monocytes_up_gp' not found
tc_monocytes_down_gp <- simple_gprofiler(tc_monocytes_sigenes_down)
## Error in "character" %in% class(sig_genes): object 'tc_monocytes_sigenes_down' not found
tc_monocytes_down_gp$pvalue_plots$REAC
## Error in eval(expr, envir, enclos): object 'tc_monocytes_down_gp' not found
tc_monocytes_down_gp$pvalue_plots$BP
## Error in eval(expr, envir, enclos): object 'tc_monocytes_down_gp' not found
Ibid. This time looking at the Neutrophils. Thus the first two images should be a superset of the second and third pairs of images; assuming that the genes in the up/down list do not cause the groups to no longer be significant. Interestingly, the reactome search did not return any hits for the increased search.
tc_neutrophils_gp <- simple_gprofiler(tc_neutrophils_sigenes)
## Error in "character" %in% class(sig_genes): object 'tc_neutrophils_sigenes' not found
## tc_neutrophils_gp$pvalue_plots$REAC ## no hits
tc_neutrophils_gp$pvalue_plots$BP
## Error in eval(expr, envir, enclos): object 'tc_neutrophils_gp' not found
tc_neutrophils_gp$pvalue_plots$TF
## Error in eval(expr, envir, enclos): object 'tc_neutrophils_gp' not found
tc_neutrophils_up_gp <- simple_gprofiler(tc_neutrophils_sigenes_up)
## Error in "character" %in% class(sig_genes): object 'tc_neutrophils_sigenes_up' not found
## tc_neutrophils_up_gp$pvalue_plots$REAC ## No hits
tc_neutrophils_up_gp$pvalue_plots$BP
## Error in eval(expr, envir, enclos): object 'tc_neutrophils_up_gp' not found
tc_neutrophils_down_gp <- simple_gprofiler(tc_neutrophils_sigenes_down)
## Error in "character" %in% class(sig_genes): object 'tc_neutrophils_sigenes_down' not found
tc_neutrophils_down_gp$pvalue_plots$REAC
## Error in eval(expr, envir, enclos): object 'tc_neutrophils_down_gp' not found
tc_neutrophils_down_gp$pvalue_plots$BP
## Error in eval(expr, envir, enclos): object 'tc_neutrophils_down_gp' not found
The following expands the cross-clinic query above to also test the neutrophils. Once again, I think it will pretty strongly support the hypothesis that the two clinics are not compatible.
We are concerned that the clinic-based batch effect may make our results essentially useless. One way to test this concern is to compare the set of genes observed different between the Cali Cure/Fail vs. the Tumaco Cure/Fail.
cali_table_nobatch <- tc_neutrophils_table_nobatch[["data"]][["cali"]]
tumaco_table_nobatch <- tc_neutrophils_table_nobatch[["data"]][["tumaco"]]

cali_merged_nobatch <- merge(cali_table_nobatch, tumaco_table_nobatch, by = "row.names")
cor.test(cali_merged_nobatch[, "deseq_logfc.x"], cali_merged_nobatch[, "deseq_logfc.y"])
##
## Pearson's product-moment correlation
##
## data: cali_merged_nobatch[, "deseq_logfc.x"] and cali_merged_nobatch[, "deseq_logfc.y"]
## t = -16, df = 9240, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1800 -0.1403
## sample estimates:
## cor
## -0.1602
cali_aucc_nobatch <- calculate_aucc(cali_table_nobatch, tumaco_table_nobatch, px = "deseq_adjp",
                                    py = "deseq_adjp", lx = "deseq_logfc", ly = "deseq_logfc")
cali_aucc_nobatch$plot
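A complementary and much cruder check is to compare the significant gene sets directly. The following sketch uses invented gene IDs; the vectors cali_sig and tumaco_sig are hypothetical stand-ins for the significant Cali and Tumaco cure/fail genes, and the Jaccard index is my choice for the illustration rather than something computed elsewhere in this document.

```r
# Invented gene IDs; in practice these would be the row names of the
# significant tables for the Cali and Tumaco cure/fail contrasts.
cali_sig <- c("geneA", "geneB", "geneC", "geneD")
tumaco_sig <- c("geneC", "geneD", "geneE")

# Genes called significant in both clinics.
shared <- intersect(cali_sig, tumaco_sig)
# Jaccard index: shared genes divided by all genes seen in either set.
jaccard <- length(shared) / length(union(cali_sig, tumaco_sig))
shared
jaccard
```

A value near 0 would agree with the weak correlations above; a value near 1 would suggest that the two clinics recover similar genes despite the batch effect.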
Given the above comparisons, we can extract some gene sets which resulted from those DE analyses and eventually perform some ontology/KEGG/reactome/etc. searches. This reminds me: I want my extract_significant_ functions to return gene-set data structures and my various ontology searches to take them as inputs. This should help avoid potential errors when extracting up/down genes.
clinic_sigenes_up <- rownames(tc_all_clinic_sig_sva[["deseq"]][["ups"]][["clinics"]])
clinic_sigenes_down <- rownames(tc_all_clinic_sig_sva[["deseq"]][["downs"]][["clinics"]])
clinic_sigenes <- c(clinic_sigenes_up, clinic_sigenes_down)

tc_eosinophils_sigenes_up <- rownames(tc_eosinophils_clinic_sig_sva[["deseq"]][["ups"]][["cure"]])
tc_eosinophils_sigenes_down <- rownames(tc_eosinophils_clinic_sig_sva[["deseq"]][["downs"]][["cure"]])
tc_monocytes_sigenes_up <- rownames(tc_monocytes_sig_sva[["deseq"]][["ups"]][["cure"]])
tc_monocytes_sigenes_down <- rownames(tc_monocytes_sig_sva[["deseq"]][["downs"]][["cure"]])
tc_neutrophils_sigenes_up <- rownames(tc_neutrophils_sig_sva[["deseq"]][["ups"]][["cure"]])
tc_neutrophils_sigenes_down <- rownames(tc_neutrophils_sig_sva[["deseq"]][["downs"]][["cure"]])

tc_eosinophils_sigenes <- c(tc_eosinophils_sigenes_up,
                            tc_eosinophils_sigenes_down)
tc_monocytes_sigenes <- c(tc_monocytes_sigenes_up,
                          tc_monocytes_sigenes_down)
tc_neutrophils_sigenes <- c(tc_neutrophils_sigenes_up,
                            tc_neutrophils_sigenes_down)
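One possible shape for the gene-set data structure mentioned above is a single named list per contrast. The sketch below is only an illustration under that assumption; collect_gene_sets() and the toy tables are hypothetical and do not reflect how extract_significant_genes() currently returns its results.

```r
# Hypothetical helper: gather the up, down, and combined gene IDs from a
# pair of tables into one named list so the directions stay together.
collect_gene_sets <- function(ups, downs) {
  up_ids <- rownames(ups)
  down_ids <- rownames(downs)
  list(up = up_ids, down = down_ids, both = union(up_ids, down_ids))
}

# Toy tables standing in for the 'ups' and 'downs' of a single contrast.
toy_ups <- data.frame(logfc = c(2.1, 1.4), row.names = c("geneA", "geneB"))
toy_downs <- data.frame(logfc = -1.9, row.names = "geneC")

sets <- collect_gene_sets(toy_ups, toy_downs)
lengths(sets)
```

Passing one such list to an ontology search, instead of three separately named vectors, would remove one opportunity to swap the up and down genes.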
I was curious to try to understand why the two clinics appear to be so different vis a vis their PCA/DE; so I thought that gProfiler might help boil those results down to something more digestible.
Note that in the following block I used the function simple_gprofiler(), but later in this document I will use all_gprofiler(). The first invocation limits the search to a single table, while the second will iterate over every result in a pairwise differential expression analysis.
In this instance, we are looking at the vector of gene IDs deemed significantly different between the two clinics in either the up or down direction.
One other thing worth noting, the new version of gProfiler provides some fun interactive plots. I will add an example here.
tc_eosionphil_gprofiler <- simple_gprofiler(tc_eosinophils_sigenes_up)
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
clinic_gp <- simple_gprofiler(clinic_sigenes)
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
## No results to show
## Please make sure that the organism is correct or set significant = FALSE
clinic_gp$pvalue_plots$REAC
clinic_gp$pvalue_plots$BP
clinic_gp$pvalue_plots$TF
clinic_gp$interactive_plots$GO
In all of the above, we are looking to understand the differences between the two locations. Let us now step back and ask the original question: fail/cure without regard to location.
I performed this query with a few different parameters, notably with(out) sva and again using each cell type, including biopsies. The main reason I am keeping these comparisons is the relatively weak hope that there will be sufficient signal in the full dataset to overcome the apparently ridiculous batch effect from the two clinics.
tc_all_cf_de_sva <- all_pairwise(tc_valid, filter = TRUE, model_batch = "svaseq")
##
## cure failure
## 122 62
## Removing 0 low-count genes (14290 remaining).
## Setting 27033 low elements to zero.
## transform_counts: Found 27033 values equal to 0, adding 1 to the matrix.
tc_all_cf_table_sva <- combine_de_tables(
  tc_all_cf_de_sva, keepers = t_cf_contrast,
  # rda = glue("rda/tc_valid_cf_table_sva-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_valid_cf_table_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_valid_cf_table_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
tc_all_cf_sig_sva <- extract_significant_genes(
  tc_all_cf_table_sva,
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_valid_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_valid_cf_sig_sva-v202304.xlsx before writing the tables.
tc_all_cf_de_batch <- all_pairwise(tc_valid, filter = TRUE, model_batch = TRUE)
##
## cure failure
## 122 62
##
## 1 2 3
## 83 50 51
tc_all_cf_table_batch <- combine_de_tables(
  tc_all_cf_de_batch, keepers = t_cf_contrast,
  # rda = glue("rda/tc_valid_cf_table_batch-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_valid_cf_table_batch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_valid_cf_table_batch-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
tc_all_cf_sig_batch <- extract_significant_genes(
  tc_all_cf_table_batch,
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_valid_cf_sig_batch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_valid_cf_sig_batch-v202304.xlsx before writing the tables.
In the following block, we repeat the same question, but using only the biopsy samples from both clinics.
tc_biopsies_cf <- set_expt_conditions(tc_biopsies, fact = "finaloutcome")
##
## cure failure
## 13 5
tc_biopsies_cf_de_sva <- all_pairwise(tc_biopsies_cf, filter = TRUE, model_batch = "svaseq")
##
## cure failure
## 13 5
## Removing 0 low-count genes (13608 remaining).
## Setting 222 low elements to zero.
## transform_counts: Found 222 values equal to 0, adding 1 to the matrix.
tc_biopsies_cf_table_sva <- combine_de_tables(
  tc_biopsies_cf_de_sva, keepers = t_cf_contrast,
  # rda = glue("rda/tc_biopsies_cf_table_sva-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/cf/Biopsies/tc_biopsies_cf_table_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/Biopsies/tc_biopsies_cf_table_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
tc_biopsies_cf_sig_sva <- extract_significant_genes(
  tc_biopsies_cf_table_sva,
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_biopsies_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_biopsies_cf_sig_sva-v202304.xlsx before writing the tables.
tc_biopsies_cf_de_batch <- all_pairwise(tc_biopsies_cf, filter = TRUE, model_batch = TRUE)
##
## cure failure
## 13 5
##
## 1
## 18
tc_biopsies_cf_table_batch <- combine_de_tables(
  tc_biopsies_cf_de_batch, keepers = t_cf_contrast,
  # rda = glue("rda/tc_biopsies_cf_table_batch-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_biopsies_cf_table_batch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_biopsies_cf_table_batch-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
tc_biopsies_cf_sig_batch <- extract_significant_genes(
  tc_biopsies_cf_table_batch,
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_biopsies_cf_sig_batch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_biopsies_cf_sig_batch-v202304.xlsx before writing the tables.
In the following block, we repeat the same question, but using only the Eosinophil samples from both clinics.
tc_eosinophils_cf <- set_expt_conditions(tc_eosinophils, fact = "finaloutcome")
##
## cure failure
## 32 9
tc_eosinophils_cf_de_sva <- all_pairwise(tc_eosinophils_cf, filter = TRUE, model_batch = "svaseq")
##
## cure failure
## 32 9
## Removing 0 low-count genes (10864 remaining).
## Setting 856 low elements to zero.
## transform_counts: Found 856 values equal to 0, adding 1 to the matrix.
tc_eosinophils_cf_table_sva <- combine_de_tables(
  tc_eosinophils_cf_de_sva, keepers = t_cf_contrast,
  # rda = glue("rda/tc_eosinophils_cf_table_sva-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/cf/Eosinophils/tc_eosinophils_cf_table_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/Eosinophils/tc_eosinophils_cf_table_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
tc_eosinophils_cf_sig_sva <- extract_significant_genes(
  tc_eosinophils_cf_table_sva,
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_eosinophils_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_eosinophils_cf_sig_sva-v202304.xlsx before writing the tables.
tc_eosinophils_cf_de_batch <- all_pairwise(tc_eosinophils_cf, filter = TRUE, model_batch = TRUE)
##
## cure failure
## 32 9
##
## 3 2 1
## 13 14 14
tc_eosinophils_cf_table_batch <- combine_de_tables(
  tc_eosinophils_cf_de_batch, keepers = t_cf_contrast,
  # rda = glue("rda/tc_eosinophils_cf_table_batch-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_eosinophils_cf_table_batch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_eosinophils_cf_table_batch-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
tc_eosinophils_cf_sig_batch <- extract_significant_genes(
  tc_eosinophils_cf_table_batch,
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_eosinophils_cf_sig_batch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_eosinophils_cf_sig_batch-v202304.xlsx before writing the tables.
Repeat yet again, this time with the monocyte samples. The idea is to see if there is a cell type which is particularly good (or bad) at discriminating cure from failure.
tc_monocytes_cf <- set_expt_conditions(tc_monocytes, fact = "finaloutcome")
##
## cure failure
## 39 24
tc_monocytes_cf_de_sva <- all_pairwise(tc_monocytes_cf, filter = TRUE, model_batch = "svaseq")
##
## cure failure
## 39 24
## Removing 0 low-count genes (11104 remaining).
## Setting 1326 low elements to zero.
## transform_counts: Found 1326 values equal to 0, adding 1 to the matrix.
tc_monocytes_cf_table_sva <- combine_de_tables(
  tc_monocytes_cf_de_sva, keepers = t_cf_contrast,
  # rda = glue("rda/tc_monocytes_cf_table_sva-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/cf/Monocytes/tc_monocytes_cf_table_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/Monocytes/tc_monocytes_cf_table_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
tc_monocytes_cf_sig_sva <- extract_significant_genes(
  tc_monocytes_cf_table_sva,
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_monocytes_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_monocytes_cf_sig_sva-v202304.xlsx before writing the tables.
tc_monocytes_cf_de_batch <- all_pairwise(tc_monocytes_cf, filter = TRUE, model_batch = TRUE)
##
## cure failure
## 39 24
##
## 3 2 1
## 19 18 26
tc_monocytes_cf_table_batch <- combine_de_tables(
  tc_monocytes_cf_de_batch, keepers = t_cf_contrast,
  # rda = glue("rda/tc_monocytes_cf_table_batch-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_monocytes_cf_table_batch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_monocytes_cf_table_batch-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
tc_monocytes_cf_sig_batch <- extract_significant_genes(
  tc_monocytes_cf_table_batch,
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_monocytes_cf_sig_batch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_monocytes_cf_sig_batch-v202304.xlsx before writing the tables.
Last try, this time using the Neutrophil samples.
tc_neutrophils_cf <- set_expt_conditions(tc_neutrophils, fact = "finaloutcome")
##
## cure failure
## 38 24
tc_neutrophils_cf_de_sva <- all_pairwise(tc_neutrophils_cf,
                                         filter = TRUE, model_batch = "svaseq")
##
## cure failure
## 38 24
## Removing 0 low-count genes (9242 remaining).
## Setting 1562 low elements to zero.
## transform_counts: Found 1562 values equal to 0, adding 1 to the matrix.
tc_neutrophils_cf_table_sva <- combine_de_tables(
  tc_neutrophils_cf_de_sva, keepers = t_cf_contrast,
  # rda = glue("rda/tc_neutrophils_cf_table_sva-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/cf/Neutrophils/tc_neutrophils_cf_table_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/Neutrophils/tc_neutrophils_cf_table_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
tc_neutrophils_cf_sig_sva <- extract_significant_genes(
  tc_neutrophils_cf_table_sva,
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_neutrophils_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_neutrophils_cf_sig_sva-v202304.xlsx before writing the tables.
tc_neutrophils_cf_de_batch <- all_pairwise(tc_neutrophils_cf, filter = TRUE, model_batch = TRUE)
##
## cure failure
## 38 24
##
## 3 2 1
## 19 18 25
## Error in e$fun(obj, substitute(ex), parent.frame(), e$data): worker initialization failed: there is no package called ‘hpgltools’
tc_neutrophils_cf_table_batch <- combine_de_tables(
  tc_neutrophils_cf_de_batch, keepers = t_cf_contrast,
  # rda = glue("rda/tc_neutrophils_cf_table_batch-v{ver}.rda"),
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_neutrophils_cf_table_batch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_neutrophils_cf_table_batch-v202304.xlsx before writing the tables.
## Error in get_expt_colors(apr[["input"]]): object 'tc_neutrophils_cf_de_batch' not found
tc_neutrophils_cf_sig_batch <- extract_significant_genes(
  tc_neutrophils_cf_table_batch,
  excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_neutrophils_cf_sig_batch-v{ver}.xlsx"))
## Deleting the file analyses/3_cali_and_tumaco/cf/All_Samples/tc_neutrophils_cf_sig_batch-v202304.xlsx before writing the tables.
## Error in extract_significant_genes(tc_neutrophils_cf_table_batch, excel = glue("analyses/3_cali_and_tumaco/cf/All_Samples/tc_neutrophils_cf_sig_batch-v{ver}.xlsx")): object 'tc_neutrophils_cf_table_batch' not found
Start over, this time with only the samples from Tumaco. We currently are assuming these will prove to be the only analyses used for final interpretation. This is primarily because we have insufficient failed treatment samples from Cali.
xlsx_prefix <- "analyses/4_tumaco/DE_Cure_vs_Fail"
Start by considering all Tumaco cell types. Note that in this case we only use SVA, primarily because I am not certain what would be an appropriate batch factor, perhaps visit?
t_cf_clinical_de_sva <- all_pairwise(t_clinical, model_batch = "svaseq", filter = TRUE)
##
## cure failure
## 67 56
## Removing 0 low-count genes (14149 remaining).
## Setting 17282 low elements to zero.
## transform_counts: Found 17282 values equal to 0, adding 1 to the matrix.
t_cf_clinical_table_sva <- combine_de_tables(
  t_cf_clinical_de_sva, keepers = t_cf_contrast,
  # rda = glue("rda/t_clinical_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/All_Samples/t_clinical_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/All_Samples/t_clinical_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_clinical_sig_sva <- extract_significant_genes(
  t_cf_clinical_table_sva,
  excel = glue("{xlsx_prefix}/All_Samples/t_clinical_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/All_Samples/t_clinical_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_clinical_sig_sva$deseq$ups[[1]])
## [1] 93 50
dim(t_cf_clinical_sig_sva$deseq$downs[[1]])
## [1] 183 50
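The two dim() calls above report the number of rows (genes) and columns in the up and down tables. As a rough illustration of how such a split can be made, here is a minimal base R sketch on an invented table. The cutoffs (|logFC| >= 1 and adjusted p <= 0.05), the helper name split_significant(), and the toy values are all assumptions for illustration; they are not taken from extract_significant_genes().

```r
## Toy DE table; gene names and values are invented for illustration.
toy_table <- data.frame(
  deseq_logfc = c(2.1, -1.6, 0.3, 1.2, -2.4, -0.2),
  deseq_adjp = c(0.001, 0.02, 0.50, 0.20, 0.004, 0.90),
  row.names = paste0("gene", 1:6))

## Hypothetical helper: split a table into up and down sets using
## assumed cutoffs on log2 fold change and adjusted p-value.
split_significant <- function(tbl, lfc = 1.0, adjp = 0.05,
                              fc_col = "deseq_logfc", p_col = "deseq_adjp") {
  sig <- tbl[[p_col]] <= adjp
  list(
    ups = tbl[sig & tbl[[fc_col]] >= lfc, , drop = FALSE],
    downs = tbl[sig & tbl[[fc_col]] <= -lfc, , drop = FALSE])
}

toy_sig <- split_significant(toy_table)
dim(toy_sig[["ups"]])
dim(toy_sig[["downs"]])
```

In this toy case gene4 has a large fold change but fails the adjusted p-value cutoff, so it is excluded from the up set.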
The following gProfiler searches use the all_gprofiler() function instead of simple_gprofiler(). As a result, the results are separated by {contrast}_{direction}, for example ‘outcome_down’.
The same plots are available as in the previous gProfiler searches, but in many of the following runs, I used the dotplot() function to get a slightly different view of the results.
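To make the {contrast}_{direction} naming concrete, the following base R sketch builds the element names one would expect for a single contrast called "outcome". The placeholder list toy_gp is hypothetical and only stands in for the real result object.

```r
## Contrast names as used for the keepers lists in this document.
contrast_names <- c("outcome")
directions <- c("up", "down")

## Build every {contrast}_{direction} combination.
result_names <- as.vector(outer(contrast_names, directions, paste, sep = "_"))
result_names

## A stand-in for the result list; the real elements hold enrichment results.
toy_gp <- setNames(vector("list", length(result_names)), result_names)
names(toy_gp)
```

With more than one contrast in the keepers list, the same call would yield one up and one down name per contrast.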
t_cf_clinical_gp <- all_gprofiler(t_cf_clinical_sig_sva)
## Wikipathways of the up c/f genes
enrichplot::dotplot(t_cf_clinical_gp[["outcome_up"]][["WP_enrich"]])
## Transcription factor database of the up c/f genes
enrichplot::dotplot(t_cf_clinical_gp[["outcome_up"]][["TF_enrich"]])
## Reactome of the up c/f genes
enrichplot::dotplot(t_cf_clinical_gp[["outcome_up"]][["REAC_enrich"]])
## GO of the down c/f genes
enrichplot::dotplot(t_cf_clinical_gp[["outcome_down"]][["GO_enrich"]])
t_cf_clinical_gp[["outcome_up"]][["pvalue_plots"]][["BP"]]
## Reactome of the down c/f genes
enrichplot::dotplot(t_cf_clinical_gp[["outcome_down"]][["REAC_enrich"]])
Later in this document I do a bunch of visit/cf comparisons. In this block I want to explicitly only compare v1 to other visits. This is something I did quite a lot in the 2019 datasets, but never actually moved to this document.
v1_vs_later <- all_pairwise(tc_v1vs, model_batch = "svaseq", filter = TRUE)
## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'pData': object 'tc_v1vs' not found
v1_vs_later_table <- combine_de_tables(
  v1_vs_later, keepers = visit_v1later,
  excel = glue("excel/v1_vs_later_tables-v{ver}.xlsx"))
## Deleting the file excel/v1_vs_later_tables-v202304.xlsx before writing the tables.
## Error in get_expt_colors(apr[["input"]]): object 'v1_vs_later' not found
v1_vs_later_sig <- extract_significant_genes(
  v1_vs_later_table,
  excel = glue("excel/v1_vs_later_sig-v{ver}.xlsx"))
## Deleting the file excel/v1_vs_later_sig-v202304.xlsx before writing the tables.
## Error in extract_significant_genes(v1_vs_later_table, excel = glue("excel/v1_vs_later_sig-v{ver}.xlsx")): object 'v1_vs_later_table' not found
v1later_gp <- all_gprofiler(v1_vs_later_sig)
## Error in all_gprofiler(v1_vs_later_sig): object 'v1_vs_later_sig' not found
v1later_gp[[1]]$pvalue_plots$REAC
## Error in eval(expr, envir, enclos): object 'v1later_gp' not found
v1later_gp[[2]]$pvalue_plots$REAC
## Error in eval(expr, envir, enclos): object 'v1later_gp' not found
tv1_vs_later <- all_pairwise(t_v1vs, model_batch = "svaseq", filter = TRUE)
## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'pData': object 't_v1vs' not found
tv1_vs_later_table <- combine_de_tables(
  tv1_vs_later, keepers = visit_v1later,
  excel = glue("excel/tv1_vs_later_tables-v{ver}.xlsx"))
## Deleting the file excel/tv1_vs_later_tables-v202304.xlsx before writing the tables.
## Error in get_expt_colors(apr[["input"]]): object 'tv1_vs_later' not found
tv1_vs_later_sig <- extract_significant_genes(
  tv1_vs_later_table,
  excel = glue("excel/tv1_vs_later_sig-v{ver}.xlsx"))
## Deleting the file excel/tv1_vs_later_sig-v202304.xlsx before writing the tables.
## Error in extract_significant_genes(tv1_vs_later_table, excel = glue("excel/tv1_vs_later_sig-v{ver}.xlsx")): object 'tv1_vs_later_table' not found
v1later_gp <- all_gprofiler(v1_vs_later_sig)
## Error in all_gprofiler(v1_vs_later_sig): object 'v1_vs_later_sig' not found
v1later_gp[[1]]$pvalue_plots$REAC
## Error in eval(expr, envir, enclos): object 'v1later_gp' not found
v1later_gp[[2]]$pvalue_plots$REAC
## Error in eval(expr, envir, enclos): object 'v1later_gp' not found
tv1later_gp <- all_gprofiler(tv1_vs_later_sig)
## Error in all_gprofiler(tv1_vs_later_sig): object 'tv1_vs_later_sig' not found
tv1later_gp[[1]]$pvalue_plots$BP
## Error in eval(expr, envir, enclos): object 'tv1later_gp' not found
tv1later_gp[[2]]$pvalue_plots$BP
## Error in eval(expr, envir, enclos): object 'tv1later_gp' not found
tc_sex_de <- all_pairwise(tc_sex, model_batch = "svaseq", filter = TRUE)
## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'pData': object 'tc_sex' not found
tc_sex_table <- combine_de_tables(
  tc_sex_de, excel = glue("excel/tc_sex_table-v{ver}.xlsx"))
## Deleting the file excel/tc_sex_table-v202304.xlsx before writing the tables.
## Error in get_expt_colors(apr[["input"]]): object 'tc_sex_de' not found
tc_sex_sig <- extract_significant_genes(
  tc_sex_table, excel = glue("excel/tc_sex_sig-v{ver}.xlsx"))
## Deleting the file excel/tc_sex_sig-v202304.xlsx before writing the tables.
## Error in extract_significant_genes(tc_sex_table, excel = glue("excel/tc_sex_sig-v{ver}.xlsx")): object 'tc_sex_table' not found
tc_sex_gp <- all_gprofiler(tc_sex_sig)
## Error in all_gprofiler(tc_sex_sig): object 'tc_sex_sig' not found
t_sex <- subset_expt(tc_sex, subset = "clinic == 'Tumaco'")
## Error in h(simpleError(msg, call)): error in evaluating the argument 'expt' in selecting a method for function 'subset_expt': object 'tc_sex' not found
t_sex_de <- all_pairwise(t_sex, model_batch = "svaseq", filter = TRUE)
## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'pData': object 't_sex' not found
t_sex_table <- combine_de_tables(
  t_sex_de, excel = glue("excel/t_sex_table-v{ver}.xlsx"))
## Deleting the file excel/t_sex_table-v202304.xlsx before writing the tables.
## Error in get_expt_colors(apr[["input"]]): object 't_sex_de' not found
t_sex_sig <- extract_significant_genes(
  t_sex_table, excel = glue("excel/t_sex_sig-v{ver}.xlsx"))
## Deleting the file excel/t_sex_sig-v202304.xlsx before writing the tables.
## Error in extract_significant_genes(t_sex_table, excel = glue("excel/t_sex_sig-v{ver}.xlsx")): object 't_sex_table' not found
t_sex_gp <- all_gprofiler(t_sex_sig)
## Error in all_gprofiler(t_sex_sig): object 't_sex_sig' not found
One of the most compelling ideas in the data is the opportunity to find genes in the first visit which may help predict the likelihood that a person will respond well to treatment. The following block will therefore look at cure/fail from Tumaco at visit 1.
t_cf_clinical_v1_de_sva <- all_pairwise(tv1_samples, model_batch = "svaseq", filter = TRUE)
##
## cure failure
## 30 24
## Removing 0 low-count genes (14016 remaining).
## Setting 7615 low elements to zero.
## transform_counts: Found 7615 values equal to 0, adding 1 to the matrix.
t_cf_clinical_v1_table_sva <- combine_de_tables(
  t_cf_clinical_v1_de_sva, keepers = t_cf_contrast,
  # rda = glue("rda/t_clinical_v1_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/All_Samples/t_clinical_v1_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/All_Samples/t_clinical_v1_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_clinical_v1_sig_sva <- extract_significant_genes(
  t_cf_clinical_v1_table_sva,
  excel = glue("{xlsx_prefix}/All_Samples/t_clinical_v1_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/All_Samples/t_clinical_v1_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_clinical_v1_sig_sva$deseq$ups[[1]])
## [1] 28 50
dim(t_cf_clinical_v1_sig_sva$deseq$downs[[1]])
## [1] 74 50
The visit 2 and visit 3 samples are interesting because they provide an opportunity to see if we can observe changes in response in the middle and end of treatment…
t_cf_clinical_v2_de_sva <- all_pairwise(tv2_samples, model_batch = "svaseq", filter = TRUE)
##
## cure failure
## 20 15
## Removing 0 low-count genes (11559 remaining).
## Setting 2848 low elements to zero.
## transform_counts: Found 2848 values equal to 0, adding 1 to the matrix.
t_cf_clinical_v2_table_sva <- combine_de_tables(
  t_cf_clinical_v2_de_sva, keepers = t_cf_contrast,
  # rda = glue("rda/t_clinical_v2_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/All_Samples/t_clinical_v2_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/All_Samples/t_clinical_v2_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_clinical_v2_sig_sva <- extract_significant_genes(
  t_cf_clinical_v2_table_sva,
  excel = glue("{xlsx_prefix}/All_Samples/t_clinical_v2_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/All_Samples/t_clinical_v2_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_clinical_v2_sig_sva$deseq$ups[[1]])
## [1] 51 50
dim(t_cf_clinical_v2_sig_sva$deseq$downs[[1]])
## [1] 15 50
t_cf_clinical_v3_de_sva <- all_pairwise(tv3_samples, model_batch = "svaseq", filter = TRUE)
##
## cure failure
## 17 17
## Removing 0 low-count genes (11449 remaining).
## Setting 1878 low elements to zero.
## transform_counts: Found 1878 values equal to 0, adding 1 to the matrix.
t_cf_clinical_v3_table_sva <- combine_de_tables(
  t_cf_clinical_v3_de_sva, keepers = t_cf_contrast,
  # rda = glue("rda/t_clinical_v3_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/All_Samples/t_clinical_v3_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/All_Samples/t_clinical_v3_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_clinical_v3_sig_sva <- extract_significant_genes(
  t_cf_clinical_v3_table_sva,
  excel = glue("{xlsx_prefix}/All_Samples/t_clinical_v3_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/All_Samples/t_clinical_v3_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_clinical_v3_sig_sva$deseq$ups[[1]])
## [1] 120 50
dim(t_cf_clinical_v3_sig_sva$deseq$downs[[1]])
## [1] 62 50
It looks like there are very few groups in the visit 1 significant genes.
t_cf_clinical_v1_sig_sva_gp <- all_gprofiler(t_cf_clinical_v1_sig_sva)
## GO of the up c/f genes
enrichplot::dotplot(t_cf_clinical_v1_sig_sva_gp[["outcome_up"]][["GO_enrich"]])
## GO of the down c/f genes
enrichplot::dotplot(t_cf_clinical_v1_sig_sva_gp[["outcome_down"]][["GO_enrich"]])
Up: 74 GO, 4 KEGG, 6 reactome, 4 WP, 56 TF, 1 miRNA, 0 HP/HPA/CORUM. Down: 19 GO, 1 KEGG, 1 HP, 2 HPA, 0 reactome/wp/tf/corum
t_cf_clinical_v2_sig_sva_gp <- all_gprofiler(t_cf_clinical_v2_sig_sva)
## GO, Reactome, and transcription factors of the up c/f genes
enrichplot::dotplot(t_cf_clinical_v2_sig_sva_gp[["outcome_up"]][["GO_enrich"]])
enrichplot::dotplot(t_cf_clinical_v2_sig_sva_gp[["outcome_up"]][["REAC_enrich"]])
enrichplot::dotplot(t_cf_clinical_v2_sig_sva_gp[["outcome_up"]][["TF_enrich"]])
## GO of the down c/f genes
enrichplot::dotplot(t_cf_clinical_v2_sig_sva_gp[["outcome_down"]][["GO_enrich"]])
Up: 120 genes; 141 GO, 1 KEGG, 5 Reactome, 2 WP, 30 TF, 1 miRNA, 0 HPA/CORUM/HP. Down: 62 genes; 30 GO, 2 KEGG, 1 Reactome, 0 WP/TF/miRNA/HPA/CORUM/HP.
t_cf_clinical_v3_sig_sva_gp <- all_gprofiler(t_cf_clinical_v3_sig_sva)
## GO, Reactome, and transcription factors of the up c/f genes
enrichplot::dotplot(t_cf_clinical_v3_sig_sva_gp[["outcome_up"]][["GO_enrich"]])
enrichplot::dotplot(t_cf_clinical_v3_sig_sva_gp[["outcome_up"]][["REAC_enrich"]])
enrichplot::dotplot(t_cf_clinical_v3_sig_sva_gp[["outcome_up"]][["TF_enrich"]])
## GO of the down c/f genes
enrichplot::dotplot(t_cf_clinical_v3_sig_sva_gp[["outcome_down"]][["GO_enrich"]])
The biopsy samples are problematic for a few reasons, so let us repeat without them.
t_cf_clinical_nobiop_de_sva <- all_pairwise(t_clinical_nobiop,
                                            model_batch = "svaseq", filter = TRUE)
##
## cure failure
## 58 51
## Removing 0 low-count genes (11907 remaining).
## Setting 9578 low elements to zero.
## transform_counts: Found 9578 values equal to 0, adding 1 to the matrix.
t_cf_clinical_nobiop_table_sva <- combine_de_tables(
  t_cf_clinical_nobiop_de_sva, keepers = t_cf_contrast,
  # rda = glue("rda/t_clinical_nobiop_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/No_Biopsies/t_clinical_nobiop_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/No_Biopsies/t_clinical_nobiop_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_clinical_nobiop_sig_sva <- extract_significant_genes(
  t_cf_clinical_nobiop_table_sva,
  excel = glue("{xlsx_prefix}/No_Biopsies/t_clinical_nobiop_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/No_Biopsies/t_clinical_nobiop_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_clinical_nobiop_sig_sva$deseq$ups[[1]])
## [1] 137 50
dim(t_cf_clinical_nobiop_sig_sva$deseq$downs[[1]])
## [1] 73 50
Up: 137 genes; 88 GO, 0 KEGG, 6 Reactome, 1 WP, 46 TF, 1 miRNA, 0 others Down: 73 genes; 78 GO, 1 KEGG, 1 Reactome, 9 TF, 0 others
t_cf_clinical_nobiop_sig_sva_gp <- all_gprofiler(t_cf_clinical_nobiop_sig_sva)
enrichplot::dotplot(t_cf_clinical_nobiop_sig_sva_gp[["outcome_up"]][["GO_enrich"]])
enrichplot::dotplot(t_cf_clinical_nobiop_sig_sva_gp[["outcome_up"]][["TF_enrich"]])
enrichplot::dotplot(t_cf_clinical_nobiop_sig_sva_gp[["outcome_down"]][["GO_enrich"]])
Now let us switch our view to each individual cell type collected. The hope here is that we will be able to learn some cell-specific differences in the response between people who did and did not respond well.
t_cf_biopsy_de_sva <- all_pairwise(t_biopsies, model_batch = "svaseq", filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 9 5
## Removing 0 low-count genes (13506 remaining).
## Setting 145 low elements to zero.
## transform_counts: Found 145 values equal to 0, adding 1 to the matrix.
t_cf_biopsy_table_sva <- combine_de_tables(
  t_cf_biopsy_de_sva, keepers = cf_contrast,
  # rda = glue("rda/t_biopsy_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Biopsies/t_biopsy_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Biopsies/t_biopsy_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_biopsy_sig_sva <- extract_significant_genes(
  t_cf_biopsy_table_sva,
  excel = glue("{xlsx_prefix}/Biopsies/t_cf_biopsy_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Biopsies/t_cf_biopsy_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_biopsy_sig_sva$deseq$ups[[1]])
## [1] 17 50
dim(t_cf_biopsy_sig_sva$deseq$downs[[1]])
## [1] 11 50
Up: 17 genes; 74 GO, 3 KEGG, 1 Reactome, 3 WP, 1 TF, 0 others Down: 11 genes; 2 GO, 0 others
t_cf_biopsy_sig_sva_gp <- all_gprofiler(t_cf_biopsy_sig_sva)
enrichplot::dotplot(t_cf_biopsy_sig_sva_gp[["outcome_up"]][["GO_enrich"]])
enrichplot::dotplot(t_cf_biopsy_sig_sva_gp[["outcome_up"]][["WP_enrich"]])
enrichplot::dotplot(t_cf_biopsy_sig_sva_gp[["outcome_down"]][["GO_enrich"]])
Same question, but this time looking at monocytes. In addition, this comparison was done twice, once using SVA and once using visit as a batch factor.
t_cf_monocyte_de_sva <- all_pairwise(t_monocytes, model_batch = "svaseq",
                                     filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 21 21
## Removing 0 low-count genes (10859 remaining).
## Setting 730 low elements to zero.
## transform_counts: Found 730 values equal to 0, adding 1 to the matrix.
t_cf_monocyte_tables_sva <- combine_de_tables(
  t_cf_monocyte_de_sva, keepers = cf_contrast,
  # rda = glue("rda/t_monocyte_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Monocytes/t_monocyte_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Monocytes/t_monocyte_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_monocyte_sig_sva <- extract_significant_genes(
  t_cf_monocyte_tables_sva,
  excel = glue("{xlsx_prefix}/Monocytes/t_monocyte_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Monocytes/t_monocyte_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_monocyte_sig_sva$deseq$ups[[1]])
## [1] 60 50
dim(t_cf_monocyte_sig_sva$deseq$downs[[1]])
## [1] 53 50
t_cf_monocyte_de_batchvisit <- all_pairwise(t_monocytes, model_batch = TRUE, filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 21 21
##
## 3 2 1
## 13 13 16
t_cf_monocyte_tables_batchvisit <- combine_de_tables(
  t_cf_monocyte_de_batchvisit, keepers = cf_contrast,
  # rda = glue("rda/t_monocyte_cf_table_batchvisit-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Monocytes/t_monocyte_cf_tables_batchvisit-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Monocytes/t_monocyte_cf_tables_batchvisit-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_monocyte_sig_batchvisit <- extract_significant_genes(
  t_cf_monocyte_tables_batchvisit,
  excel = glue("{xlsx_prefix}/Monocytes/t_monocyte_cf_sig_batchvisit-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Monocytes/t_monocyte_cf_sig_batchvisit-v202304.xlsx before writing the tables.
dim(t_cf_monocyte_sig_batchvisit$deseq$ups[[1]])
## [1] 43 50
dim(t_cf_monocyte_sig_batchvisit$deseq$downs[[1]])
## [1] 93 50
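The difference between the two monocyte runs above is what is added to the statistical model next to the cure/failure condition: an explicit batch factor (visit) or estimated surrogate variables. The base R sketch below shows the two design shapes on an invented sample sheet. How hpgltools builds its models internally is not shown in this document, so this is only an illustration of the idea; the surrogate values are made up, whereas in practice they would be estimated from the expression data (for example by sva::svaseq()).

```r
## Invented sample sheet: 6 samples, two outcomes, three visits.
toy_meta <- data.frame(
  condition = factor(c("cure", "cure", "cure", "failure", "failure", "failure")),
  batch = factor(c("1", "2", "3", "1", "2", "3")))

## Explicit batch factor: one column per condition plus batch offsets.
design_batch <- model.matrix(~ 0 + condition + batch, data = toy_meta)
colnames(design_batch)

## Surrogate-variable version: the batch columns are replaced by numeric
## surrogate variables (hypothetical values here).
toy_sv <- cbind(SV1 = c(0.4, -0.1, -0.3, 0.2, 0.1, -0.3))
design_sva <- cbind(model.matrix(~ 0 + condition, data = toy_meta), toy_sv)
colnames(design_sva)
```

The batch design spends one coefficient per additional visit level, while the surrogate design spends one per estimated variable, which is one reason the two approaches can return different gene lists.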
Now that I am looking back over these results, I am not completely certain why I only did the gprofiler search for the sva data…
Up: 60 genes; 12 GO, 1 KEGG, 1 WP, 4 TF, 0 others Down: 53 genes; 26 GO, 1 KEGG, 1 Reactome, 2 TF, 0 others
t_cf_monocyte_sig_sva_gp <- all_gprofiler(t_cf_monocyte_sig_sva)
enrichplot::dotplot(t_cf_monocyte_sig_sva_gp[["outcome_up"]][["GO_enrich"]])
enrichplot::dotplot(t_cf_monocyte_sig_sva_gp[["outcome_up"]][["TF_enrich"]])
enrichplot::dotplot(t_cf_monocyte_sig_sva_gp[["outcome_down"]][["GO_enrich"]])
t_cf_monocyte_sig_batch_gp <- all_gprofiler(t_cf_monocyte_sig_batchvisit)
enrichplot::dotplot(t_cf_monocyte_sig_batch_gp[["outcome_up"]][["GO_enrich"]])
enrichplot::dotplot(t_cf_monocyte_sig_batch_gp[["outcome_up"]][["HP_enrich"]])
Now focus in on the monocyte samples on a per-visit basis.
t_cf_monocyte_v1_de_sva <- all_pairwise(tv1_monocytes, model_batch = "svaseq", filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 8 8
## Removing 0 low-count genes (10479 remaining).
## Setting 187 low elements to zero.
## transform_counts: Found 187 values equal to 0, adding 1 to the matrix.
t_cf_monocyte_v1_tables_sva <- combine_de_tables(
  t_cf_monocyte_v1_de_sva, keepers = cf_contrast,
  # rda = glue("rda/t_monocyte_v1_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Monocytes/t_monocyte_v1_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Monocytes/t_monocyte_v1_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_monocyte_v1_sig_sva <- extract_significant_genes(
  t_cf_monocyte_v1_tables_sva,
  excel = glue("{xlsx_prefix}/Monocytes/t_monocyte_v1_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Monocytes/t_monocyte_v1_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_monocyte_v1_sig_sva$deseq$ups[[1]])
## [1] 14 50
dim(t_cf_monocyte_v1_sig_sva$deseq$downs[[1]])
## [1] 52 50
t_cf_monocyte_v2_de_sva <- all_pairwise(tv2_monocytes, model_batch = "svaseq", filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 7 6
## Removing 0 low-count genes (10520 remaining).
## Setting 115 low elements to zero.
## transform_counts: Found 115 values equal to 0, adding 1 to the matrix.
t_cf_monocyte_v2_tables_sva <- combine_de_tables(
  t_cf_monocyte_v2_de_sva, keepers = cf_contrast,
  # rda = glue("rda/t_monocyte_v2_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Monocytes/t_monocyte_v2_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Monocytes/t_monocyte_v2_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_monocyte_v2_sig_sva <- extract_significant_genes(
  t_cf_monocyte_v2_tables_sva,
  excel = glue("{xlsx_prefix}/Monocytes/t_monocyte_v2_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Monocytes/t_monocyte_v2_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_monocyte_v2_sig_sva$deseq$ups[[1]])
## [1] 0 50
dim(t_cf_monocyte_v2_sig_sva$deseq$downs[[1]])
## [1] 1 50
t_cf_monocyte_v3_de_sva <- all_pairwise(tv3_monocytes, model_batch = "svaseq", filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 6 7
## Removing 0 low-count genes (10374 remaining).
## Setting 55 low elements to zero.
## transform_counts: Found 55 values equal to 0, adding 1 to the matrix.
t_cf_monocyte_v3_tables_sva <- combine_de_tables(
  t_cf_monocyte_v3_de_sva, keepers = cf_contrast,
  # rda = glue("rda/t_monocyte_v3_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Monocytes/t_monocyte_v3_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Monocytes/t_monocyte_v3_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_monocyte_v3_sig_sva <- extract_significant_genes(
  t_cf_monocyte_v3_tables_sva,
  excel = glue("{xlsx_prefix}/Monocytes/t_monocyte_v3_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Monocytes/t_monocyte_v3_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_monocyte_v3_sig_sva$deseq$ups[[1]])
## [1] 0 50
dim(t_cf_monocyte_v3_sig_sva$deseq$downs[[1]])
## [1] 4 50
sva_aucc <- calculate_aucc(t_cf_monocyte_tables_sva[["data"]][[1]],
                           tbl2 = t_cf_monocyte_tables_batchvisit[["data"]][[1]],
                           py = "deseq_adjp", ly = "deseq_logfc")
sva_aucc
## $aucc
## [1] 0.6943
##
## $cor
##
## Pearson's product-moment correlation
##
## data: tbl[[lx]] and tbl[[ly]]
## t = 180, df = 10857, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8611 0.8705
## sample estimates:
## cor
## 0.8659
##
##
## $plot
shared_ids <- rownames(t_cf_monocyte_tables_sva[["data"]][[1]]) %in%
  rownames(t_cf_monocyte_tables_batchvisit[["data"]][[1]])
first <- t_cf_monocyte_tables_sva[["data"]][[1]][shared_ids, ]
second <- t_cf_monocyte_tables_batchvisit[["data"]][[1]][rownames(first), ]
cor.test(first[["deseq_logfc"]], second[["deseq_logfc"]])
##
## Pearson's product-moment correlation
##
## data: first[["deseq_logfc"]] and second[["deseq_logfc"]]
## t = 180, df = 10857, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8611 0.8705
## sample estimates:
## cor
## 0.8659
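The comparison above aligns two result tables on their shared gene IDs and then asks how similar they are. The sketch below repeats that idea on two invented tables using only base R. The toy_aucc() helper is hypothetical, and its concordance definition (overlap of the top-k genes ranked by adjusted p-value, summed over k and scaled by the maximum possible sum) is an assumption that may differ from what calculate_aucc() computes.

```r
## Two invented result tables with partly overlapping gene IDs.
tbl_a <- data.frame(
  deseq_logfc = c(2.0, -1.5, 0.5, 1.0, -0.8),
  deseq_adjp = c(0.001, 0.01, 0.40, 0.03, 0.20),
  row.names = c("g1", "g2", "g3", "g4", "g5"))
tbl_b <- data.frame(
  deseq_logfc = c(-1.2, 1.8, 0.9, 0.1, -0.5),
  deseq_adjp = c(0.02, 0.002, 0.50, 0.60, 0.04),
  row.names = c("g2", "g1", "g4", "g6", "g5"))

## Align the tables on the IDs present in both.
shared <- intersect(rownames(tbl_a), rownames(tbl_b))
a <- tbl_a[shared, ]
b <- tbl_b[shared, ]
logfc_cor <- cor(a[["deseq_logfc"]], b[["deseq_logfc"]])

## Assumed concordance definition: for each k, count the genes shared
## between the two top-k lists, then scale the summed counts.
toy_aucc <- function(x, y, p_col = "deseq_adjp", k_max = 3) {
  top_x <- rownames(x)[order(x[[p_col]])]
  top_y <- rownames(y)[order(y[[p_col]])]
  overlap <- vapply(seq_len(k_max), function(k) {
    length(intersect(top_x[seq_len(k)], top_y[seq_len(k)]))
  }, integer(1))
  sum(overlap) / sum(seq_len(k_max))
}
toy_aucc(a, b)
```

The correlation uses every shared gene, while the concordance score only looks at the most significant ones, so the two numbers can disagree, as they do in the real comparison above.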
V1: Up: 14 genes; No categories V1: Down: 52 genes; 20 GO, 5 TF
t_cf_monocyte_v1_sig_sva_gp <- all_gprofiler(t_cf_monocyte_v1_sig_sva)
enrichplot::dotplot(t_cf_monocyte_v1_sig_sva_gp[["outcome_down"]][["GO_enrich"]])
V2: Up: 0 genes. V2: Down: 1 gene.
V3: Up: 0 genes. V3: Down: 4 genes.
Switch context to the neutrophils and once again repeat the analysis twice, once using SVA and once using visit as a batch factor.
t_cf_neutrophil_de_sva <- all_pairwise(t_neutrophils, model_batch = "svaseq", filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 20 21
## Removing 0 low-count genes (9099 remaining).
## Setting 750 low elements to zero.
## transform_counts: Found 750 values equal to 0, adding 1 to the matrix.
t_cf_neutrophil_tables_sva <- combine_de_tables(
  t_cf_neutrophil_de_sva, keepers = cf_contrast,
  # rda = glue("rda/t_neutrophil_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_neutrophil_sig_sva <- extract_significant_genes(
  t_cf_neutrophil_tables_sva,
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_neutrophil_sig_sva$deseq$ups[[1]])
## [1] 84 50
dim(t_cf_neutrophil_sig_sva$deseq$downs[[1]])
## [1] 29 50
t_cf_neutrophil_de_batchvisit <- all_pairwise(t_neutrophils, model_batch = TRUE, filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 20 21
##
## 3 2 1
## 12 13 16
t_cf_neutrophil_tables_batchvisit <- combine_de_tables(
  t_cf_neutrophil_de_batchvisit, keepers = cf_contrast,
  # rda = glue("rda/t_neutrophil_cf_table_batchvisit-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_cf_tables_batchvisit-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_cf_tables_batchvisit-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_neutrophil_sig_batchvisit <- extract_significant_genes(
  t_cf_neutrophil_tables_batchvisit,
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_cf_sig_batchvisit-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_cf_sig_batchvisit-v202304.xlsx before writing the tables.
dim(t_cf_neutrophil_sig_batchvisit$deseq$ups[[1]])
## [1] 92 50
dim(t_cf_neutrophil_sig_batchvisit$deseq$downs[[1]])
## [1] 47 50
Up: 84 genes; 5 GO, 2 Reactome, 3 TF, no others. Down: 29 genes: 12 GO, 1 Reactome, 1 TF, 1 miRNA, 11 HP, 0 others
t_cf_neutrophil_sig_sva_gp <- all_gprofiler(t_cf_neutrophil_sig_sva)
enrichplot::dotplot(t_cf_neutrophil_sig_sva_gp[["outcome_up"]][["GO_enrich"]])
enrichplot::dotplot(t_cf_neutrophil_sig_sva_gp[["outcome_up"]][["TF_enrich"]])
enrichplot::dotplot(t_cf_neutrophil_sig_sva_gp[["outcome_down"]][["GO_enrich"]])
enrichplot::dotplot(t_cf_neutrophil_sig_sva_gp[["outcome_down"]][["HP_enrich"]])
When I did this with the monocytes, I split it up into multiple blocks for each visit. This time I am just going to run them all together.
visitcf_factor <- paste0("v", pData(t_neutrophils)[["visitnumber"]],
                         pData(t_neutrophils)[["finaloutcome"]])
t_neutrophil_visitcf <- set_expt_conditions(t_neutrophils, fact = visitcf_factor)
##
## v1cure v1failure v2cure v2failure v3cure v3failure
## 8 8 7 6 5 7
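The combined visit/outcome condition used above is a plain string concatenation of two metadata columns. A minimal sketch on a toy data frame follows; the `meta` table and its values are invented for illustration, and only the column names `visitnumber` and `finaloutcome` come from the code above.

```r
## Toy metadata standing in for pData(t_neutrophils); the values are invented.
meta <- data.frame(
  visitnumber = c(1, 1, 2, 3, 3),
  finaloutcome = c("cure", "failure", "cure", "cure", "failure"),
  stringsAsFactors = FALSE)
## Concatenate visit and outcome into one condition label per sample.
visit_outcome <- paste0("v", meta[["visitnumber"]], meta[["finaloutcome"]])
## Tabulate the number of samples per combined condition.
visit_outcome_counts <- table(visit_outcome)
visit_outcome_counts
```

The resulting labels (v1cure, v1failure, ...) are the strings that a contrast list must use as numerator and denominator.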
t_cf_neutrophil_visits_de_sva <- all_pairwise(t_neutrophil_visitcf, model_batch = "svaseq",
  filter = TRUE)
##
## v1cure v1failure v2cure v2failure v3cure v3failure
## 8 8 7 6 5 7
## Removing 0 low-count genes (9099 remaining).
## Setting 686 low elements to zero.
## transform_counts: Found 686 values equal to 0, adding 1 to the matrix.
t_cf_neutrophil_visits_tables_sva <- combine_de_tables(
  t_cf_neutrophil_visits_de_sva, keepers = visitcf_contrasts,
  # rda = glue("rda/t_neutrophil_visitcf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_visitcf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_visitcf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for v1cf.
## Adding venn plots for v2cf.
## Adding venn plots for v3cf.
t_cf_neutrophil_visits_sig_sva <- extract_significant_genes(
  t_cf_neutrophil_visits_tables_sva,
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_visitcf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_visitcf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_neutrophil_visits_sig_sva$deseq$ups[[1]])
## [1] 12 50
dim(t_cf_neutrophil_visits_sig_sva$deseq$downs[[1]])
## [1] 6 50
t_cf_neutrophil_v1_de_sva <- all_pairwise(tv1_neutrophils, model_batch = "svaseq", filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 8 8
## Removing 0 low-count genes (8715 remaining).
## Setting 145 low elements to zero.
## transform_counts: Found 145 values equal to 0, adding 1 to the matrix.
t_cf_neutrophil_v1_tables_sva <- combine_de_tables(
  t_cf_neutrophil_v1_de_sva, keepers = cf_contrast,
  # rda = glue("rda/t_neutrophil_v1_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_v1_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_v1_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_neutrophil_v1_sig_sva <- extract_significant_genes(
  t_cf_neutrophil_v1_tables_sva,
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_v1_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_v1_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_neutrophil_v1_sig_sva$deseq$ups[[1]])
## [1] 5 50
dim(t_cf_neutrophil_v1_sig_sva$deseq$downs[[1]])
## [1] 8 50
t_cf_neutrophil_v2_de_sva <- all_pairwise(tv2_neutrophils, model_batch = "svaseq", filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 7 6
## Removing 0 low-count genes (8450 remaining).
## Setting 78 low elements to zero.
## transform_counts: Found 78 values equal to 0, adding 1 to the matrix.
t_cf_neutrophil_v2_tables_sva <- combine_de_tables(
  t_cf_neutrophil_v2_de_sva,
  keepers = cf_contrast,
  # rda = glue("rda/t_neutrophil_v2_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_v2_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_v2_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_neutrophil_v2_sig_sva <- extract_significant_genes(
  t_cf_neutrophil_v2_tables_sva,
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_v2_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_v2_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_neutrophil_v2_sig_sva$deseq$ups[[1]])
## [1] 9 50
dim(t_cf_neutrophil_v2_sig_sva$deseq$downs[[1]])
## [1] 3 50
t_cf_neutrophil_v3_de_sva <- all_pairwise(tv3_neutrophils, model_batch = "svaseq", filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 5 7
## Removing 0 low-count genes (8503 remaining).
## Setting 83 low elements to zero.
## transform_counts: Found 83 values equal to 0, adding 1 to the matrix.
t_cf_neutrophil_v3_tables_sva <- combine_de_tables(
  t_cf_neutrophil_v3_de_sva, keepers = cf_contrast,
  # rda = glue("rda/t_neutrophil_v3_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_v3_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_v3_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_neutrophil_v3_sig_sva <- extract_significant_genes(
  t_cf_neutrophil_v3_tables_sva,
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_v3_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_v3_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_neutrophil_v3_sig_sva$deseq$ups[[1]])
## [1] 5 50
dim(t_cf_neutrophil_v3_sig_sva$deseq$downs[[1]])
## [1] 4 50
V1: Up: 5 genes. V1: Down: 8 genes; 14 GO.
t_cf_neutrophil_v1_sig_sva_gp <- all_gprofiler(t_cf_neutrophil_v1_sig_sva)
enrichplot::dotplot(t_cf_neutrophil_v1_sig_sva_gp[["outcome_down"]][["GO_enrich"]])
Up: 5 genes; 3 GO, 10 TF. Down: 1 gene.
sva_aucc <- calculate_aucc(t_cf_neutrophil_tables_sva[["data"]][[1]],
  tbl2 = t_cf_neutrophil_tables_batchvisit[["data"]][[1]],
  py = "deseq_adjp", ly = "deseq_logfc")
sva_aucc
## $aucc
## [1] 0.611
##
## $cor
##
## Pearson's product-moment correlation
##
## data: tbl[[lx]] and tbl[[ly]]
## t = 192, df = 9097, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8915 0.8996
## sample estimates:
## cor
## 0.8956
##
##
## $plot
shared_ids <- rownames(t_cf_neutrophil_tables_sva[["data"]][[1]]) %in%
  rownames(t_cf_neutrophil_tables_batchvisit[["data"]][[1]])
first <- t_cf_neutrophil_tables_sva[["data"]][[1]][shared_ids, ]
second <- t_cf_neutrophil_tables_batchvisit[["data"]][[1]][rownames(first), ]
cor.test(first[["deseq_logfc"]], second[["deseq_logfc"]])
##
## Pearson's product-moment correlation
##
## data: first[["deseq_logfc"]] and second[["deseq_logfc"]]
## t = 192, df = 9097, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8915 0.8996
## sample estimates:
## cor
## 0.8956
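The manual check above restricts both tables to shared gene IDs and aligns their rows by name before correlating the log fold changes. A small sketch of that alignment on two toy tables follows; the gene names and values are invented, and only the column name `deseq_logfc` is taken from the code above.

```r
## Two toy result tables with partially overlapping, differently ordered genes.
tbl_a <- data.frame(deseq_logfc = c(1.2, -0.5, 0.3, 2.0),
                    row.names = c("geneA", "geneB", "geneC", "geneD"))
tbl_b <- data.frame(deseq_logfc = c(0.4, 1.0, -0.7),
                    row.names = c("geneC", "geneA", "geneB"))
## Keep only genes present in both tables, then index the second table
## by the row names of the first so that the rows line up.
shared <- rownames(tbl_a) %in% rownames(tbl_b)
first_tbl <- tbl_a[shared, , drop = FALSE]
second_tbl <- tbl_b[rownames(first_tbl), , drop = FALSE]
## With the rows aligned, the logFC columns can be correlated directly.
logfc_cor <- cor(first_tbl[["deseq_logfc"]], second_tbl[["deseq_logfc"]])
logfc_cor
```

Indexing the second table by `rownames(first_tbl)` is what guarantees a gene-by-gene comparison even when the two tables are sorted differently.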
This time, with feeling! Repeating the same set of tasks with the eosinophil samples.
t_cf_eosinophil_de_sva <- all_pairwise(t_eosinophils, model_batch = "svaseq", filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 17 9
## Removing 0 low-count genes (10530 remaining).
## Setting 325 low elements to zero.
## transform_counts: Found 325 values equal to 0, adding 1 to the matrix.
t_cf_eosinophil_tables_sva <- combine_de_tables(
  t_cf_eosinophil_de_sva, keepers = cf_contrast,
  # rda = glue("rda/t_eosinophil_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Eosinophils/t_eosinophil_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Eosinophils/t_eosinophil_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_eosinophil_sig_sva <- extract_significant_genes(
  t_cf_eosinophil_tables_sva,
  excel = glue("{xlsx_prefix}/Eosinophils/t_eosinophil_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Eosinophils/t_eosinophil_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_eosinophil_sig_sva$deseq$ups[[1]])
## [1] 116 50
dim(t_cf_eosinophil_sig_sva$deseq$downs[[1]])
## [1] 74 50
t_cf_eosinophil_de_batchvisit <- all_pairwise(t_eosinophils, model_batch = TRUE, filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 17 9
##
## 3 2 1
## 9 9 8
t_cf_eosinophil_tables_batchvisit <- combine_de_tables(
  t_cf_eosinophil_de_batchvisit, keepers = cf_contrast,
  # rda = glue("rda/t_eosinophil_cf_table_batchvisit-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Eosinophils/t_eosinophil_cf_tables_batchvisit-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Eosinophils/t_eosinophil_cf_tables_batchvisit-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_eosinophil_sig_batchvisit <- extract_significant_genes(
  t_cf_eosinophil_tables_batchvisit,
  excel = glue("excel/t_eosinophil_cf_sig_batchvisit-v{ver}.xlsx"))
## Deleting the file excel/t_eosinophil_cf_sig_batchvisit-v202304.xlsx before writing the tables.
dim(t_cf_eosinophil_sig_batchvisit$deseq$ups[[1]])
## [1] 99 50
dim(t_cf_eosinophil_sig_batchvisit$deseq$downs[[1]])
## [1] 35 50
visitcf_factor <- paste0("v", pData(t_eosinophils)[["visitnumber"]],
                         pData(t_eosinophils)[["finaloutcome"]])
t_eosinophil_visitcf <- set_expt_conditions(t_eosinophils, fact = visitcf_factor)
##
## v1cure v1failure v2cure v2failure v3cure v3failure
## 5 3 6 3 6 3
t_cf_eosinophil_visits_de_sva <- all_pairwise(t_eosinophil_visitcf, model_batch = "svaseq",
  filter = TRUE)
##
## v1cure v1failure v2cure v2failure v3cure v3failure
## 5 3 6 3 6 3
## Removing 0 low-count genes (10530 remaining).
## Setting 374 low elements to zero.
## transform_counts: Found 374 values equal to 0, adding 1 to the matrix.
t_cf_eosinophil_visits_tables_sva <- combine_de_tables(
  t_cf_eosinophil_visits_de_sva, keepers = visitcf_contrasts,
  # rda = glue("rda/t_eosinophil_visitcf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Eosinophils/t_eosinophil_visitcf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Eosinophils/t_eosinophil_visitcf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for v1cf.
## Adding venn plots for v2cf.
## Adding venn plots for v3cf.
t_cf_eosinophil_visits_sig_sva <- extract_significant_genes(
  t_cf_eosinophil_visits_tables_sva,
  excel = glue("{xlsx_prefix}/Eosinophils/t_eosinophil_visitcf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Eosinophils/t_eosinophil_visitcf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_eosinophil_visits_sig_sva$deseq$ups[[1]])
## [1] 9 50
dim(t_cf_eosinophil_visits_sig_sva$deseq$downs[[1]])
## [1] 11 50
num_color <- color_choices[["clinic_cf"]][["Tumaco_failure"]]
den_color <- color_choices[["clinic_cf"]][["Tumaco_cure"]]
wanted_genes <- c("FI44L", "IFI27", "PRR5", "PRR5-ARHGAP8", "RHCE",
                  "FBXO39", "RSAD2", "SMTNL1", "USP18", "AFAP1")
cf_monocyte_table <- t_cf_monocyte_tables_sva[["data"]][["outcome"]]
cf_monocyte_volcano <- plot_volcano_condition_de(
  cf_monocyte_table, "outcome", label = wanted_genes,
  fc_col = "deseq_logfc", p_col = "deseq_adjp", line_position = NULL,
  color_high = num_color, color_low = den_color, label_size = 6)
pp(file = glue("images/cf_monocyte_volcano_labeled-v{ver}.svg"))
cf_monocyte_volcano$plot
dev.off()
## png
## 2
cf_monocyte_volcano$plot
cf_eosinophil_table <- t_cf_eosinophil_tables_sva[["data"]][["outcome"]]
cf_eosinophil_volcano <- plot_volcano_condition_de(
  cf_eosinophil_table, "outcome", label = wanted_genes,
  fc_col = "deseq_logfc", p_col = "deseq_adjp", line_position = NULL,
  color_high = num_color, color_low = den_color, label_size = 6)
pp(file = glue("images/cf_eosinophil_volcano_labeled-v{ver}.svg"))
cf_eosinophil_volcano$plot
dev.off()
## png
## 2
cf_eosinophil_volcano$plot
cf_neutrophil_table <- t_cf_neutrophil_tables_sva[["data"]][["outcome"]]
cf_neutrophil_volcano <- plot_volcano_condition_de(
  cf_neutrophil_table, "outcome", label = wanted_genes,
  fc_col = "deseq_logfc", p_col = "deseq_adjp", line_position = NULL,
  color_high = num_color, color_low = den_color, label_size = 6)
pp(file = glue("images/cf_neutrophil_volcano_labeled-v{ver}.svg"))
cf_neutrophil_volcano$plot
dev.off()
## png
## 2
cf_neutrophil_volcano$plot
Up: 116 genes; 123 GO, 2 KEGG, 7 Reactome, 5 WP, 69 TF, 1 miRNA, 0 others. Down: 74 genes; 5 GO, 1 Reactome, 4 TF, 0 others.
t_cf_eosinophil_sig_sva_gp <- all_gprofiler(t_cf_eosinophil_sig_sva)
enrichplot::dotplot(t_cf_eosinophil_sig_sva_gp[["outcome_up"]][["GO_enrich"]])
enrichplot::dotplot(t_cf_eosinophil_sig_sva_gp[["outcome_up"]][["REAC_enrich"]])
enrichplot::dotplot(t_cf_eosinophil_sig_sva_gp[["outcome_up"]][["WP_enrich"]])
enrichplot::dotplot(t_cf_eosinophil_sig_sva_gp[["outcome_up"]][["TF_enrich"]])
enrichplot::dotplot(t_cf_eosinophil_sig_sva_gp[["outcome_down"]][["GO_enrich"]])
enrichplot::dotplot(t_cf_eosinophil_sig_sva_gp[["outcome_down"]][["TF_enrich"]])
t_cf_eosinophil_v1_de_sva <- all_pairwise(tv1_eosinophils, model_batch = "svaseq", filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 5 3
## Removing 0 low-count genes (9977 remaining).
## Setting 57 low elements to zero.
## transform_counts: Found 57 values equal to 0, adding 1 to the matrix.
t_cf_eosinophil_v1_tables_sva <- combine_de_tables(
  t_cf_eosinophil_v1_de_sva, keepers = cf_contrast,
  # rda = glue("rda/t_eosinophil_v1_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Eosinophils/t_eosinophil_v1_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Eosinophils/t_eosinophil_v1_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_eosinophil_v1_sig_sva <- extract_significant_genes(
  t_cf_eosinophil_v1_tables_sva,
  excel = glue("{xlsx_prefix}/Eosinophils/t_eosinophil_v1_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Eosinophils/t_eosinophil_v1_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_eosinophil_v1_sig_sva$deseq$ups[[1]])
## [1] 13 50
dim(t_cf_eosinophil_v1_sig_sva$deseq$downs[[1]])
## [1] 19 50
t_cf_eosinophil_v2_de_sva <- all_pairwise(tv2_eosinophils, model_batch = "svaseq", filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 6 3
## Removing 0 low-count genes (10115 remaining).
## Setting 90 low elements to zero.
## transform_counts: Found 90 values equal to 0, adding 1 to the matrix.
t_cf_eosinophil_v2_tables_sva <- combine_de_tables(
  t_cf_eosinophil_v2_de_sva, keepers = cf_contrast,
  # rda = glue("rda/t_eosinophil_v2_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Eosinophils/t_eosinophil_v2_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Eosinophils/t_eosinophil_v2_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_eosinophil_v2_sig_sva <- extract_significant_genes(
  t_cf_eosinophil_v2_tables_sva,
  excel = glue("{xlsx_prefix}/Eosinophils/t_eosinophil_v2_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Eosinophils/t_eosinophil_v2_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_eosinophil_v2_sig_sva$deseq$ups[[1]])
## [1] 9 50
dim(t_cf_eosinophil_v2_sig_sva$deseq$downs[[1]])
## [1] 4 50
t_cf_eosinophil_v3_de_sva <- all_pairwise(tv3_eosinophils, model_batch = "svaseq", filter = TRUE)
##
## Tumaco_cure Tumaco_failure
## 6 3
## Removing 0 low-count genes (10078 remaining).
## Setting 48 low elements to zero.
## transform_counts: Found 48 values equal to 0, adding 1 to the matrix.
t_cf_eosinophil_v3_tables_sva <- combine_de_tables(
  t_cf_eosinophil_v3_de_sva, keepers = cf_contrast,
  # rda = glue("rda/t_eosinophil_v3_cf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Eosinophils/t_eosinophil_v3_cf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Eosinophils/t_eosinophil_v3_cf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for outcome.
t_cf_eosinophil_v3_sig_sva <- extract_significant_genes(
  t_cf_eosinophil_v3_tables_sva,
  excel = glue("{xlsx_prefix}/Eosinophils/t_eosinophil_v3_cf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Eosinophils/t_eosinophil_v3_cf_sig_sva-v202304.xlsx before writing the tables.
dim(t_cf_eosinophil_v3_sig_sva$deseq$ups[[1]])
## [1] 68 50
dim(t_cf_eosinophil_v3_sig_sva$deseq$downs[[1]])
## [1] 29 50
Up: 13 genes, no hits. Down: 19 genes; 11 GO, 1 Reactome, 1 TF
t_cf_eosinophil_v1_sig_sva_gp <- all_gprofiler(t_cf_eosinophil_v1_sig_sva)
enrichplot::dotplot(t_cf_eosinophil_v1_sig_sva_gp[["outcome_down"]][["GO_enrich"]])
enrichplot::dotplot(t_cf_eosinophil_v1_sig_sva_gp[["outcome_down"]][["TF_enrich"]])
Up: 9 genes; 23 GO, 2 KEGG, 2 Reactome, 4 WP. Down: 4 genes; no hits.
t_cf_eosinophil_v2_sig_sva_gp <- all_gprofiler(t_cf_eosinophil_v2_sig_sva)
enrichplot::dotplot(t_cf_eosinophil_v2_sig_sva_gp[["outcome_up"]][["GO_enrich"]])
enrichplot::dotplot(t_cf_eosinophil_v2_sig_sva_gp[["outcome_up"]][["WP_enrich"]])
Up: 68 genes; 95 GO, 2 KEGG, 12 Reactome, 3 WP, 63 TF, 1 miRNA. Down: 29 genes; 3 GO, 1 WP, 1 TF, 3 miRNA.
t_cf_eosinophil_v3_sig_sva_gp <- all_gprofiler(t_cf_eosinophil_v3_sig_sva)
enrichplot::dotplot(t_cf_eosinophil_v3_sig_sva_gp[["outcome_up"]][["GO_enrich"]])
enrichplot::dotplot(t_cf_eosinophil_v3_sig_sva_gp[["outcome_up"]][["WP_enrich"]])
sva_aucc <- calculate_aucc(t_cf_eosinophil_tables_sva[["data"]][[1]],
  tbl2 = t_cf_eosinophil_tables_batchvisit[["data"]][[1]],
  py = "deseq_adjp", ly = "deseq_logfc")
sva_aucc
## $aucc
## [1] 0.5764
##
## $cor
##
## Pearson's product-moment correlation
##
## data: tbl[[lx]] and tbl[[ly]]
## t = 152, df = 10528, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8232 0.8352
## sample estimates:
## cor
## 0.8293
##
##
## $plot
shared_ids <- rownames(t_cf_eosinophil_tables_sva[["data"]][[1]]) %in%
  rownames(t_cf_eosinophil_tables_batchvisit[["data"]][[1]])
first <- t_cf_eosinophil_tables_sva[["data"]][[1]][shared_ids, ]
second <- t_cf_eosinophil_tables_batchvisit[["data"]][[1]][rownames(first), ]
cor.test(first[["deseq_logfc"]], second[["deseq_logfc"]])
##
## Pearson's product-moment correlation
##
## data: first[["deseq_logfc"]] and second[["deseq_logfc"]]
## t = 152, df = 10528, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.8232 0.8352
## sample estimates:
## cor
## 0.8293
t_mono_neut_sva_aucc <- calculate_aucc(t_cf_monocyte_tables_sva[["data"]][["outcome"]],
  tbl2 = t_cf_neutrophil_tables_sva[["data"]][["outcome"]],
  py = "deseq_adjp", ly = "deseq_logfc")
t_mono_neut_sva_aucc
## $aucc
## [1] 0.2058
##
## $cor
##
## Pearson's product-moment correlation
##
## data: tbl[[lx]] and tbl[[ly]]
## t = 43, df = 8575, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.4033 0.4381
## sample estimates:
## cor
## 0.4209
##
##
## $plot
t_mono_eo_sva_aucc <- calculate_aucc(t_cf_monocyte_tables_sva[["data"]][["outcome"]],
  tbl2 = t_cf_eosinophil_tables_sva[["data"]][["outcome"]],
  py = "deseq_adjp", ly = "deseq_logfc")
t_mono_eo_sva_aucc
## $aucc
## [1] 0.09657
##
## $cor
##
## Pearson's product-moment correlation
##
## data: tbl[[lx]] and tbl[[ly]]
## t = 22, df = 9763, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2028 0.2405
## sample estimates:
## cor
## 0.2217
##
##
## $plot
t_neut_eo_sva_aucc <- calculate_aucc(t_cf_neutrophil_tables_sva[["data"]][["outcome"]],
  tbl2 = t_cf_eosinophil_tables_sva[["data"]][["outcome"]],
  py = "deseq_adjp", ly = "deseq_logfc")
t_neut_eo_sva_aucc
## $aucc
## [1] 0.1583
##
## $cor
##
## Pearson's product-moment correlation
##
## data: tbl[[lx]] and tbl[[ly]]
## t = 36, df = 8569, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.3467 0.3834
## sample estimates:
## cor
## 0.3652
##
##
## $plot
For these contrasts, we want to see fail_v1 vs. cure_v1, fail_v2 vs. cure_v2 etc. As a result, we will need to juggle the data slightly and add another set of contrasts.
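The per-visit contrasts follow the same convention as the other contrast lists: the element name becomes the contrast (and sheet) name, and the two strings are numerator and denominator. The definition of `visitcf_contrasts` is not shown in this part of the document, so the following is only a hypothetical sketch of what such a list could look like, assuming condition names of the form "v1failure" and "v1cure" and contrast names v1cf, v2cf, and v3cf.

```r
## Hypothetical per-visit contrasts; each element is c(numerator, denominator).
## The variable name and the exact condition strings are assumptions.
example_visitcf_contrasts <- list(
  "v1cf" = c("v1failure", "v1cure"),
  "v2cf" = c("v2failure", "v2cure"),
  "v3cf" = c("v3failure", "v3cure"))
names(example_visitcf_contrasts)
```

If the conditions are built with a separator (for example "v1_failure"), the strings in the list would need to match that spelling exactly.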
t_visit_cf_all_de_sva <- all_pairwise(t_visitcf, model_batch = "svaseq", filter = TRUE)
##
## v1cure v1failure v2cure v2failure v3cure v3failure
## 30 24 20 15 17 17
## Removing 0 low-count genes (14149 remaining).
## Setting 17117 low elements to zero.
## transform_counts: Found 17117 values equal to 0, adding 1 to the matrix.
t_visit_cf_all_tables_sva <- combine_de_tables(
  t_visit_cf_all_de_sva, keepers = visitcf_contrasts,
  # rda = glue("rda/t_all_visitcf_table_sva-v{ver}.rda"),
  excel = glue("analyses/4_tumaco/DE_Cure_vs_Fail/t_all_visitcf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/t_all_visitcf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for v1cf.
## Adding venn plots for v2cf.
## Adding venn plots for v3cf.
t_visit_cf_all_sig_sva <- extract_significant_genes(
  t_visit_cf_all_tables_sva,
  excel = glue("analyses/4_tumaco/DE_Cure_vs_Fail/t_all_visitcf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/t_all_visitcf_sig_sva-v202304.xlsx before writing the tables.
t_visit_cf_all_gp <- all_gprofiler(t_visit_cf_all_sig_sva)
visitcf_factor <- paste0("v", pData(t_monocytes)[["visitnumber"]], "_",
                         pData(t_monocytes)[["finaloutcome"]])
t_monocytes_visitcf <- set_expt_conditions(t_monocytes, fact = visitcf_factor)
##
## v1_cure v1_failure v2_cure v2_failure v3_cure v3_failure
## 8 8 7 6 6 7
t_visit_cf_monocyte_de_sva <- all_pairwise(t_monocytes_visitcf, model_batch = "svaseq",
  filter = TRUE)
##
## v1_cure v1_failure v2_cure v2_failure v3_cure v3_failure
## 8 8 7 6 6 7
## Removing 0 low-count genes (10859 remaining).
## Setting 688 low elements to zero.
## transform_counts: Found 688 values equal to 0, adding 1 to the matrix.
t_visit_cf_monocyte_tables_sva <- combine_de_tables(
  t_visit_cf_monocyte_de_sva, keepers = visitcf_contrasts,
  # rda = glue("rda/t_monocyte_visitcf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Monocytes/t_monocyte_visitcf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Monocytes/t_monocyte_visitcf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for v1cf.
## Adding venn plots for v2cf.
## Adding venn plots for v3cf.
t_visit_cf_monocyte_sig_sva <- extract_significant_genes(
  t_visit_cf_monocyte_tables_sva,
  excel = glue("{xlsx_prefix}/Monocytes/t_monocyte_visitcf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Monocytes/t_monocyte_visitcf_sig_sva-v202304.xlsx before writing the tables.
t_v1fc_deseq_ma <- t_visit_cf_monocyte_tables_sva[["plots"]][["v1cf"]][["deseq_ma_plots"]][["plot"]]
dev <- pp(file = "images/monocyte_cf_de_v1_maplot.png")
t_v1fc_deseq_ma
## NULL
closed <- dev.off()
t_v1fc_deseq_ma
## NULL
t_v2fc_deseq_ma <- t_visit_cf_monocyte_tables_sva[["plots"]][["v2cf"]][["deseq_ma_plots"]][["plot"]]
dev <- pp(file = "images/monocyte_cf_de_v2_maplot.png")
t_v2fc_deseq_ma
## NULL
closed <- dev.off()
t_v2fc_deseq_ma
## NULL
t_v3fc_deseq_ma <- t_visit_cf_monocyte_tables_sva[["plots"]][["v3cf"]][["deseq_ma_plots"]][["plot"]]
dev <- pp(file = "images/monocyte_cf_de_v3_maplot.png")
t_v3fc_deseq_ma
## NULL
closed <- dev.off()
t_v3fc_deseq_ma
## NULL
One query from Alejandro is to look at the genes shared up/down across visits. I am not entirely certain we have enough samples for this to work, but let us find out.
I am thinking this is a good place to use the AUCC curves I learned about thanks to Julie Cridland.
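As a rough illustration of the idea, one common formulation of the area under the concordance curve ranks the genes of each result by significance, counts the genes shared between the two top-k lists for k from 1 to some cutoff, and normalizes the summed counts by the largest possible sum. The sketch below implements that formulation on invented gene IDs; the function `sketch_aucc()` and its `topn` parameter are hypothetical, and it is not claimed to match what `calculate_aucc()` does internally.

```r
## A sketch of one common AUCC formulation; an assumption, not calculate_aucc() itself.
## For k = 1..topn, count the genes shared by the top-k of each ranking, then
## divide the summed counts by the maximum possible sum, topn * (topn + 1) / 2.
sketch_aucc <- function(ranked_a, ranked_b, topn = 100) {
  topn <- min(topn, length(ranked_a), length(ranked_b))
  shared_counts <- vapply(seq_len(topn), function(k) {
    length(intersect(ranked_a[seq_len(k)], ranked_b[seq_len(k)]))
  }, integer(1))
  sum(shared_counts) / (topn * (topn + 1) / 2)
}

## Toy example: gene IDs ordered from most to least significant (invented).
genes <- paste0("gene", 1:20)
identical_rank <- sketch_aucc(genes, genes, topn = 10)
reversed_rank <- sketch_aucc(genes, rev(genes), topn = 10)
```

Under this definition two identical rankings score 1 and two rankings with disjoint top lists score 0, which gives some intuition for reading the values reported by the real analyses.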
Note that the following uses all of the monocyte samples; perhaps it should be moved up, with a version using only the Tumaco samples placed here?
v1cf <- t_visit_cf_monocyte_tables_sva[["data"]][["v1cf"]]
v2cf <- t_visit_cf_monocyte_tables_sva[["data"]][["v2cf"]]
v3cf <- t_visit_cf_monocyte_tables_sva[["data"]][["v3cf"]]

v1_sig <- c(
  rownames(t_visit_cf_monocyte_sig_sva[["deseq"]][["ups"]][["v1cf"]]),
  rownames(t_visit_cf_monocyte_sig_sva[["deseq"]][["downs"]][["v1cf"]]))
length(v1_sig)
## [1] 25
v2_sig <- c(
  rownames(t_visit_cf_monocyte_sig_sva[["deseq"]][["ups"]][["v2cf"]]),
  rownames(t_visit_cf_monocyte_sig_sva[["deseq"]][["downs"]][["v2cf"]]))
length(v2_sig)
## [1] 0
v3_sig <- c(
  rownames(t_visit_cf_monocyte_sig_sva[["deseq"]][["ups"]][["v3cf"]]),
  rownames(t_visit_cf_monocyte_sig_sva[["deseq"]][["downs"]][["v3cf"]]))
length(v3_sig)
## [1] 0
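Once per-visit sets of significant gene IDs exist, the genes shared across visits can be taken with plain set intersections. A minimal sketch with invented gene names follows; the list `sig_by_visit` is hypothetical and stands in for vectors like `v1_sig`, `v2_sig`, and `v3_sig`.

```r
## Toy significant-gene sets per visit (gene names invented).
sig_by_visit <- list(
  v1 = c("geneA", "geneB", "geneC"),
  v2 = c("geneB", "geneC", "geneD"),
  v3 = c("geneC", "geneE"))
## Genes significant at every visit.
shared_all <- Reduce(intersect, sig_by_visit)
## Pairwise overlap between the first two visits.
shared_v1_v2 <- intersect(sig_by_visit[["v1"]], sig_by_visit[["v2"]])
```

When one of the sets is empty, every intersection involving it is empty as well, which is one reason a rank-based comparison can be more informative than strict overlap with small sample sizes.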
t_monocyte_visit_aucc_v2v1 <- calculate_aucc(v1cf, tbl2 = v2cf,
  py = "deseq_adjp", ly = "deseq_logfc")
dev <- pp(file = "images/monocyte_visit_v2v1_aucc.png")
t_monocyte_visit_aucc_v2v1[["plot"]]
closed <- dev.off()
t_monocyte_visit_aucc_v2v1[["plot"]]
t_monocyte_visit_aucc_v3v1 <- calculate_aucc(v1cf, tbl2 = v3cf,
  py = "deseq_adjp", ly = "deseq_logfc")
dev <- pp(file = "images/monocyte_visit_v3v1_aucc.png")
t_monocyte_visit_aucc_v3v1[["plot"]]
closed <- dev.off()
t_monocyte_visit_aucc_v3v1[["plot"]]
visitcf_factor <- paste0("v", pData(t_neutrophils)[["visitnumber"]], "_",
                         pData(t_neutrophils)[["finaloutcome"]])
t_neutrophil_visitcf <- set_expt_conditions(t_neutrophils, fact = visitcf_factor)
##
## v1_cure v1_failure v2_cure v2_failure v3_cure v3_failure
## 8 8 7 6 5 7
t_visit_cf_neutrophil_de_sva <- all_pairwise(t_neutrophil_visitcf, model_batch = "svaseq",
  filter = TRUE)
##
## v1_cure v1_failure v2_cure v2_failure v3_cure v3_failure
## 8 8 7 6 5 7
## Removing 0 low-count genes (9099 remaining).
## Setting 686 low elements to zero.
## transform_counts: Found 686 values equal to 0, adding 1 to the matrix.
t_visit_cf_neutrophil_tables_sva <- combine_de_tables(
  t_visit_cf_neutrophil_de_sva, keepers = visitcf_contrasts,
  # rda = glue("rda/t_neutrophil_visitcf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_visitcf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_visitcf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for v1cf.
## Adding venn plots for v2cf.
## Adding venn plots for v3cf.
t_visit_cf_neutrophil_sig_sva <- extract_significant_genes(
  t_visit_cf_neutrophil_tables_sva,
  excel = glue("{xlsx_prefix}/Neutrophils/t_neutrophil_visitcf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Neutrophils/t_neutrophil_visitcf_sig_sva-v202304.xlsx before writing the tables.
visitcf_factor <- paste0("v", pData(t_eosinophils)[["visitnumber"]], "_",
                         pData(t_eosinophils)[["finaloutcome"]])
t_eosinophil_visitcf <- set_expt_conditions(t_eosinophils, fact = visitcf_factor)
##
## v1_cure v1_failure v2_cure v2_failure v3_cure v3_failure
## 5 3 6 3 6 3
t_visit_cf_eosinophil_de_sva <- all_pairwise(t_eosinophil_visitcf, model_batch = "svaseq",
  filter = TRUE)
##
## v1_cure v1_failure v2_cure v2_failure v3_cure v3_failure
## 5 3 6 3 6 3
## Removing 0 low-count genes (10530 remaining).
## Setting 374 low elements to zero.
## transform_counts: Found 374 values equal to 0, adding 1 to the matrix.
t_visit_cf_eosinophil_tables_sva <- combine_de_tables(
  t_visit_cf_eosinophil_de_sva, keepers = visitcf_contrasts,
  # rda = glue("rda/t_eosinophil_visitcf_table_sva-v{ver}.rda"),
  excel = glue("{xlsx_prefix}/Eosinophils/t_eosinophil_visitcf_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Eosinophils/t_eosinophil_visitcf_tables_sva-v202304.xlsx before writing the tables.
## Adding venn plots for v1cf.
## Adding venn plots for v2cf.
## Adding venn plots for v3cf.
t_visit_cf_eosinophil_sig_sva <- extract_significant_genes(
  t_visit_cf_eosinophil_tables_sva,
  excel = glue("{xlsx_prefix}/Eosinophils/t_eosinophil_visitcf_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Cure_vs_Fail/Eosinophils/t_eosinophil_visitcf_sig_sva-v202304.xlsx before writing the tables.
Having put some SL read mapping information in the sample sheet, Maria Adelaida used it to add a new column with the putative persistence state on a per-sample basis. One question which arose from that: what differences are observable between the persistent (Y) and non-persistent (N) samples, on a per-cell-type basis, among the visit 3 samples?
First things first, create the datasets.
persistence_expt <- subset_expt(t_clinical, subset = "persistence=='Y'|persistence=='N'") %>%
  subset_expt(subset = 'visitnumber==3') %>%
  set_expt_conditions(fact = 'persistence')
## subset_expt(): There were 123, now there are 97 samples.
## subset_expt(): There were 97, now there are 30 samples.
##
## N Y
## 6 24
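The two-step subset above first keeps samples with a called persistence state and then keeps only visit 3, before using persistence as the condition. The same logic on a toy sample sheet, in base R, might look like the sketch below; the `samples` data frame and its values are invented, and only the column names come from the subset strings above.

```r
## Toy sample sheet; column names follow the subsets above, values are invented.
samples <- data.frame(
  persistence = c("Y", "N", "Y", "unknown", "Y", "N"),
  visitnumber = c(3, 3, 1, 3, 3, 2),
  typeofcells = c("monocytes", "neutrophils", "monocytes",
                  "eosinophils", "eosinophils", "monocytes"),
  stringsAsFactors = FALSE)
## Keep samples with a called persistence state, then only visit 3.
called <- samples[samples[["persistence"]] %in% c("Y", "N"), ]
visit3 <- called[called[["visitnumber"]] == 3, ]
## Count samples per persistence state, analogous to the table printed above.
persistence_counts <- table(visit3[["persistence"]])
persistence_counts
```

Tabulating the groups before any modelling makes imbalances such as the 6 versus 24 split above visible early.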
## persistence_biopsy <- subset_expt(persistence_expt, subset = "typeofcells=='biopsy'")
persistence_monocyte <- subset_expt(persistence_expt, subset = "typeofcells=='monocytes'")
## subset_expt(): There were 30, now there are 12 samples.
persistence_neutrophil <- subset_expt(persistence_expt, subset = "typeofcells=='neutrophils'")
## subset_expt(): There were 30, now there are 10 samples.
persistence_eosinophil <- subset_expt(persistence_expt, subset = "typeofcells=='eosinophils'")
## subset_expt(): There were 30, now there are 8 samples.
See if there are any patterns which look usable.
## All
persistence_norm <- normalize_expt(persistence_expt, transform = "log2", convert = "cpm",
  norm = "quant", filter = TRUE)
## Removing 8537 low-count genes (11386 remaining).
## transform_counts: Found 15 values equal to 0, adding 1 to the matrix.
plot_pca(persistence_norm)$plot
persistence_nb <- normalize_expt(persistence_expt, transform = "log2", convert = "cpm",
  batch = "svaseq", filter = TRUE)
## Removing 8537 low-count genes (11386 remaining).
## Setting 1538 low elements to zero.
## transform_counts: Found 1538 values equal to 0, adding 1 to the matrix.
plot_pca(persistence_nb)$plot
## Biopsies
##persistence_biopsy_norm <- normalize_expt(persistence_biopsy, transform = "log2", convert = "cpm",
## norm = "quant", filter = TRUE)
##plot_pca(persistence_biopsy_norm)$plot
## Insufficient data
## Monocytes
persistence_monocyte_norm <- normalize_expt(persistence_monocyte, transform = "log2", convert = "cpm",
  norm = "quant", filter = TRUE)
## Removing 9597 low-count genes (10326 remaining).
## transform_counts: Found 1 values equal to 0, adding 1 to the matrix.
plot_pca(persistence_monocyte_norm)$plot
persistence_monocyte_nb <- normalize_expt(persistence_monocyte, transform = "log2", convert = "cpm",
  batch = "svaseq", filter = TRUE)
## Removing 9597 low-count genes (10326 remaining).
## Setting 46 low elements to zero.
## transform_counts: Found 46 values equal to 0, adding 1 to the matrix.
plot_pca(persistence_monocyte_nb)$plot
## Neutrophils
persistence_neutrophil_norm <- normalize_expt(persistence_neutrophil, transform = "log2", convert = "cpm",
  norm = "quant", filter = TRUE)
## Removing 11531 low-count genes (8392 remaining).
## transform_counts: Found 2 values equal to 0, adding 1 to the matrix.
plot_pca(persistence_neutrophil_norm)$plot
persistence_neutrophil_nb <- normalize_expt(persistence_neutrophil, transform = "log2", convert = "cpm",
  batch = "svaseq", filter = TRUE)
## Removing 11531 low-count genes (8392 remaining).
## Setting 46 low elements to zero.
## transform_counts: Found 46 values equal to 0, adding 1 to the matrix.
plot_pca(persistence_neutrophil_nb)$plot
## Eosinophils
persistence_eosinophil_norm <- normalize_expt(persistence_eosinophil, transform = "log2", convert = "cpm",
  norm = "quant", filter = TRUE)
## Removing 9895 low-count genes (10028 remaining).
## transform_counts: Found 1 values equal to 0, adding 1 to the matrix.
plot_pca(persistence_eosinophil_norm)$plot
persistence_eosinophil_nb <- normalize_expt(persistence_eosinophil, transform = "log2", convert = "cpm",
  batch = "svaseq", filter = TRUE)
## Removing 9895 low-count genes (10028 remaining).
## Setting 25 low elements to zero.
## transform_counts: Found 25 values equal to 0, adding 1 to the matrix.
plot_pca(persistence_eosinophil_nb)$plot
persistence_de_sva <- all_pairwise(persistence_expt, filter = TRUE, model_batch = "svaseq")
##
## N Y
## 6 24
## Removing 0 low-count genes (11386 remaining).
## Setting 1538 low elements to zero.
## transform_counts: Found 1538 values equal to 0, adding 1 to the matrix.
persistence_table_sva <- combine_de_tables(
  persistence_de_sva,
  excel = glue("analyses/4_tumaco/DE_Persistence/persistence_all_de_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Persistence/persistence_all_de_sva-v202304.xlsx before writing the tables.
## Adding venn plots for Y_vs_N.
persistence_monocyte_de_sva <- all_pairwise(persistence_monocyte, filter = TRUE, model_batch = "svaseq")
##
## N Y
## 2 10
## Removing 0 low-count genes (10326 remaining).
## Setting 46 low elements to zero.
## transform_counts: Found 46 values equal to 0, adding 1 to the matrix.
persistence_monocyte_table_sva <- combine_de_tables(
  persistence_monocyte_de_sva,
  excel = glue("analyses/4_tumaco/DE_Persistence/persistence_monocyte_de_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Persistence/persistence_monocyte_de_sva-v202304.xlsx before writing the tables.
## Adding venn plots for Y_vs_N.
persistence_neutrophil_de_sva <- all_pairwise(persistence_neutrophil, filter = TRUE, model_batch = "svaseq")
##
## N Y
## 3 7
## Removing 0 low-count genes (8392 remaining).
## Setting 46 low elements to zero.
## transform_counts: Found 46 values equal to 0, adding 1 to the matrix.
persistence_neutrophil_table_sva <- combine_de_tables(
  persistence_neutrophil_de_sva,
  excel = glue("analyses/4_tumaco/DE_Persistence/persistence_neutrophil_de_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Persistence/persistence_neutrophil_de_sva-v202304.xlsx before writing the tables.
## Adding venn plots for Y_vs_N.
persistence_eosinophil_de_sva <- all_pairwise(persistence_eosinophil, filter = TRUE, model_batch = "svaseq")
##
## N Y
## 1 7
## Removing 0 low-count genes (10028 remaining).
## Setting 25 low elements to zero.
## transform_counts: Found 25 values equal to 0, adding 1 to the matrix.
## Error in checkForRemoteErrors(val): one node produced an error: c("Error in NOISeq::noiseqbio(norm, k = 0.5, norm = \"rpkm\", factor = \"condition\", : \n ERROR: To run NOISeqBIO at least two replicates per condition are needed.\n Please, run NOISeq if there are not enough replicates in your experiment.\n\n", "noiseq")
persistence_eosinophil_table_sva <- combine_de_tables(
  persistence_eosinophil_de_sva,
  excel = glue("analyses/4_tumaco/DE_Persistence/persistence_eosinophil_de_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Persistence/persistence_eosinophil_de_sva-v202304.xlsx before writing the tables.
## Error in get_expt_colors(apr[["input"]]): object 'persistence_eosinophil_de_sva' not found
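The eosinophil persistence comparison fails because the N condition has a single sample and NOISeqBIO wants at least two replicates per condition. The follow-on error happens because persistence_eosinophil_de_sva was never created. A guard along the following lines could skip such subsets before calling all_pairwise(). This is only a sketch: underreplicated() is a hypothetical helper, not part of hpgltools, and the condition labels are typed in by hand to match the table above.

```r
## Hypothetical helper (not part of hpgltools): list conditions with too few replicates.
underreplicated <- function(conditions, min_reps = 2) {
  counts <- table(conditions)
  names(counts)[counts < min_reps]
}
## Condition labels matching the eosinophil table above: 1 N and 7 Y.
eo_conditions <- c("N", rep("Y", 7))
too_few <- underreplicated(eo_conditions)
if (length(too_few) > 0) {
  message("Skipping pairwise comparison; under-replicated: ", toString(too_few))
}
```

The same check passes for the monocyte (2 N, 10 Y) and neutrophil (3 N, 7 Y) subsets, which is consistent with those runs completing.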
t_visit_all_de_sva <- all_pairwise(t_visit, filter = TRUE, model_batch = "svaseq")
##
## 3 2 1
## 34 35 40
## Removing 0 low-count genes (11907 remaining).
## Setting 9614 low elements to zero.
## transform_counts: Found 9614 values equal to 0, adding 1 to the matrix.
t_visit_all_table_sva <- combine_de_tables(
  t_visit_all_de_sva, keepers = visit_contrasts,
  # rda = glue("rda/t_all_visit_table_sva-v{ver}.rda"),
  excel = glue("analyses/4_tumaco/DE_Visits/t_all_visit_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Visits/t_all_visit_tables_sva-v202304.xlsx before writing the tables.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Adding venn plots for v2v1.
## Adding venn plots for v3v1.
## Adding venn plots for v3v2.
t_visit_all_sig_sva <- extract_significant_genes(
  t_visit_all_table_sva,
  excel = glue("analyses/4_tumaco/DE_Visits/t_all_visit_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Visits/t_all_visit_sig_sva-v202304.xlsx before writing the tables.
t_visit_monocytes <- set_expt_conditions(t_monocytes, fact = "visitnumber")
##
## 3 2 1
## 13 13 16
t_visit_monocyte_de_sva <- all_pairwise(t_visit_monocytes, filter = TRUE, model_batch = "svaseq")
##
## 3 2 1
## 13 13 16
## Removing 0 low-count genes (10859 remaining).
## Setting 648 low elements to zero.
## transform_counts: Found 648 values equal to 0, adding 1 to the matrix.
t_visit_monocyte_table_sva <- combine_de_tables(
  t_visit_monocyte_de_sva, keepers = visit_contrasts,
  # rda = glue("rda/t_monocyte_visit_table_sva-v{ver}.rda"),
  excel = glue("analyses/4_tumaco/DE_Visits/Monocytes/t_monocyte_visit_tables_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Visits/Monocytes/t_monocyte_visit_tables_sva-v202304.xlsx before writing the tables.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Adding venn plots for v2v1.
## Adding venn plots for v3v1.
## Adding venn plots for v3v2.
t_visit_monocyte_sig_sva <- extract_significant_genes(
  t_visit_monocyte_table_sva,
  excel = glue("analyses/4_tumaco/DE_Visits/Monocytes/t_monocyte_visit_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Visits/Monocytes/t_monocyte_visit_sig_sva-v202304.xlsx before writing the tables.
t_visit_neutrophils <- set_expt_conditions(t_neutrophils, fact = "visitnumber")
##
## 3 2 1
## 12 13 16
t_visit_neutrophil_de_sva <- all_pairwise(t_visit_neutrophils, filter = TRUE, model_batch = "svaseq")
##
## 3 2 1
## 12 13 16
## Removing 0 low-count genes (9099 remaining).
## Setting 589 low elements to zero.
## transform_counts: Found 589 values equal to 0, adding 1 to the matrix.
t_visit_neutrophil_table_sva <- combine_de_tables(
  t_visit_neutrophil_de_sva, keepers = visit_contrasts,
  # rda = glue("rda/t_neutrophil_visit_table_sva-v{ver}.rda"),
  excel = glue("analyses/4_tumaco/DE_Visits/Neutrophils/t_neutrophil_visit_table_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Visits/Neutrophils/t_neutrophil_visit_table_sva-v202304.xlsx before writing the tables.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Adding venn plots for v2v1.
## Adding venn plots for v3v1.
## Adding venn plots for v3v2.
t_visit_neutrophil_sig_sva <- extract_significant_genes(
  t_visit_neutrophil_table_sva,
  excel = glue("analyses/4_tumaco/DE_Visits/Neutrophils/t_neutrophil_visit_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Visits/Neutrophils/t_neutrophil_visit_sig_sva-v202304.xlsx before writing the tables.
t_visit_eosinophils <- set_expt_conditions(t_eosinophils, fact = "visitnumber")
##
## 3 2 1
## 9 9 8
t_visit_eosinophil_de <- all_pairwise(t_visit_eosinophils, filter = TRUE, model_batch = "svaseq")
##
## 3 2 1
## 9 9 8
## Removing 0 low-count genes (10530 remaining).
## Setting 272 low elements to zero.
## transform_counts: Found 272 values equal to 0, adding 1 to the matrix.
t_visit_eosinophil_table <- combine_de_tables(
  t_visit_eosinophil_de, keepers = visit_contrasts,
  # rda = glue("rda/t_eosinophil_visit_table_sva-v{ver}.rda"),
  excel = glue("analyses/4_tumaco/DE_Visits/Eosinophils/t_eosinophil_visit_table_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Visits/Eosinophils/t_eosinophil_visit_table_sva-v202304.xlsx before writing the tables.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Warning in combine_extracted_plots(entry_name, combined, wanted_denominator, :
## I think this is an extra contrast table, the plots may be weird.
## Adding venn plots for v2v1.
## Adding venn plots for v3v1.
## Adding venn plots for v3v2.
t_visit_eosinophil_sig <- extract_significant_genes(
  t_visit_eosinophil_table,
  excel = glue("analyses/4_tumaco/DE_Visits/Eosinophils/t_eosinophil_visit_sig_sva-v{ver}.xlsx"))
## Deleting the file analyses/4_tumaco/DE_Visits/Eosinophils/t_eosinophil_visit_sig_sva-v202304.xlsx before writing the tables.
Alejandro showed some ROC curves for the eosinophil data, plotting sensitivity vs. specificity for a couple of genes observed in v1 eosinophils vs. all-times eosinophils across cure/fail. I am curious to better understand how this was done and what utility it might have in other contexts.
To that end, I want to try something similar myself. In order to properly perform the analysis with these various tools, I need to reconfigure the data in a pretty specific format:
If I intend to use this for our tx data, I will likely need a utility function to create the properly formatted input df.
For the purposes of my playing, I will choose three genes from the eosinophil C/F table: one which is significant, one which is not, and one chosen arbitrarily.
The input genes will therefore be chosen from the data structure: t_cf_eosinophil_tables_sva:
ENSG00000198178, ENSG00000179344, ENSG00000182628
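As a way of convincing myself what such a curve measures, here is a rough base R sketch of a per-gene ROC calculation: for each expression threshold, compute the sensitivity and specificity of calling a sample a failure, then take the area under the curve. The helper simple_roc() and the expression values are made up for illustration. This is not how Alejandro produced his curves, and a dedicated package would presumably be used for the real analysis.

```r
## Hypothetical helper: ROC points for one gene's expression against a binary outcome.
## 'scores' are expression values; 'is_positive' is TRUE for the class to detect (e.g. failure).
simple_roc <- function(scores, is_positive) {
  thresholds <- sort(unique(scores), decreasing = TRUE)
  ## Sensitivity: fraction of positives at or above each threshold.
  sens <- vapply(thresholds, function(th) mean(scores[is_positive] >= th), numeric(1))
  ## Specificity: fraction of negatives below each threshold.
  spec <- vapply(thresholds, function(th) mean(scores[!is_positive] < th), numeric(1))
  ## Anchor the curve at (FPR = 0, TPR = 0) and integrate with the trapezoid rule.
  fpr <- c(0, 1 - spec)
  tpr <- c(0, sens)
  auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
  list(threshold = thresholds, sensitivity = sens, specificity = spec, auc = auc)
}
## Made-up expression values for 8 samples; not taken from the TMRC3 data.
expression <- c(8.1, 7.4, 6.9, 6.5, 5.2, 4.8, 4.1, 3.9)
failed <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
roc <- simple_roc(expression, failed)
roc$auc
```

The input here is just one numeric vector per gene plus one logical outcome vector of the same length, so a utility function for the real data would mostly need to pull a gene's row from the expression matrix and line it up with the cure/fail metadata.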
eo_rpkm <- normalize_expt(tv1_eosinophils, convert = "rpkm", column = "cds_length")
## There appear to be 5391 genes without a length.
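The note about genes without a length matters because RPKM divides by gene length, so a gene with no cds_length cannot get a meaningful value. The toy sketch below shows the arithmetic only. The matrix and lengths are invented, and I am assuming (not asserting) that the missing lengths behave like NA here; how normalize_expt() actually handles them may differ.

```r
## Hypothetical toy inputs; the real values would come from the expt's counts and cds_length column.
counts <- matrix(c(100, 200, 300,
                   50, 150, 300),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))
cds_length <- c(geneA = 1000, geneB = 2000, geneC = NA)  # geneC has no annotated length
## RPKM = reads / (gene length in kb * library size in millions).
lib_millions <- colSums(counts) / 1e6
rpkm <- t(t(counts / (cds_length / 1000)) / lib_millions)
missing_length <- sum(is.na(cds_length))
```

In this toy case geneC ends up NA in every sample, which is the kind of gene that would have to be dropped or handled separately before plotting.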
test <- all_pairwise(tmrc_external, model_batch = "svaseq", filter = "simple")
## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'pData': object 'tmrc_external' not found
test_table <- combine_de_tables(test, excel = "excel/tmrc3_scott_biopsies.xlsx")
## Deleting the file excel/tmrc3_scott_biopsies.xlsx before writing the tables.
## Error in get_expt_colors(apr[["input"]]): object 'test' not found
test_sig <- extract_significant_genes(test_table, excel = "excel/tmrc3_scott_biopsies_sig.xlsx")
## Deleting the file excel/tmrc3_scott_biopsies_sig.xlsx before writing the tables.
## Error in extract_significant_genes(test_table, excel = "excel/tmrc3_scott_biopsies_sig.xlsx"): object 'test_table' not found
tmrc_external_species <- set_expt_conditions(tmrc_external, fact = "ParasiteSpecies") %>%
  set_expt_colors(color_choices[["parasite"]])
## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'pData': error in evaluating the argument 'object' in selecting a method for function 'pData': object 'tmrc_external' not found
## Skipping this because it is taking too long.
##if (!isTRUE(get0("skip_load"))) {
## pander::pander(sessionInfo())
## message(paste0("This is hpgltools commit: ", get_git_commit()))
## message(paste0("Saving to ", savefile))
## tmp <- sm(saveme(filename=savefile))
##}
tmp <- loadme(filename = savefile)