index.html macrophage_estimation.html

1 RNA sequencing of Leishmania panamensis during infection of human macrophages: Differential Expression

In ‘macrophage_estimation’, we did a series of analyses to try to pick out some of the surrogate variables in the data. Now we will perform a set of differential expression analyses using the results from that. Since the ‘batch’ element of the data is reasonably well explained, we will not abuse the data with sva/combat, but instead include batch in the experimental model.

2 Differential expression analyses

It appears that it is possible though somewhat difficult to apply batch estimations generated by sva to the model given to DESeq/EdgeR/limma. In the case of limma it is fairly simple, but in the other two it is a bit more difficult. There is a nice discussion of this at: https://www.biostars.org/p/156186/ I am attempting to apply that logic to this data with limited success.

my_contrasts <- list(
    "macro_chr-sh" = c("macro_ch","macro_sh"),
    "macro_chr-nil" = c("macro_ch","macro_nil"),
    "macro_sh-nil" = c("macro_sh", "macro_nil"))
macro_norm <- sm(normalize_expt(macrophage_human, filter=TRUE, convert="cpm"))
macro_combat_norm <- sm(normalize_expt(macrophage_human, filter=TRUE, batch="combat_scale"))
macro_sva_norm <- sm(normalize_expt(macrophage_human, filter=TRUE, batch="svaseq"))
macro_lowfilt <- sm(normalize_expt(macrophage_human, filter=TRUE))
## Set up the data used in the comparative contrast sets.

2.1 No batch in the model

macro_nobatch <- sm(all_pairwise(input=macro_lowfilt, model_batch=FALSE))
## wow, all tools including basic agree almost completely
medians_by_condition <- macro_nobatch$basic$medians
macro_nobatch_tables <- sm(combine_de_tables(macro_nobatch,
                                             excel=paste0("excel/macrophage_nobatch-v", ver, ".xlsx"),
                                             keepers=my_contrasts,
                                             extra_annot=medians_by_condition))
macro_nobatch_sig <- sm(extract_significant_genes(macro_nobatch_tables,
                                                  excel=paste0("excel/macrophage_nobatch_significant-v", ver, ".xlsx"),
                                                  p_type="unadjusted",
                                                  according_to="all"))

2.2 Batch in the model

In this attempt, we add a batch factor in the experimental model and see how it does.

## Here just let all_pairwise run on filtered data and do its normal ~ 0 + condition + batch analyses
macro_batch <- sm(all_pairwise(input=macro_lowfilt))
medians_by_condition <- macro_batch$basic$medians
macro_batch_tables <- sm(combine_de_tables(macro_batch, excel=paste0("excel/macrophage_batchmodel-v", ver, ".xlsx"),
                                           keepers=my_contrasts,
                                           extra_annot=medians_by_condition))
macro_batch_sig <- sm(extract_significant_genes(macro_batch_tables,
                                             excel=paste0("excel/macrophage_batchmodel_significant-v", ver, ".xlsx"),
                                             p_type="unadjusted",
                                             according_to="limma"))

2.3 Batch estimated with SVA

## Here just let all_pairwise run on filtered data and do its normal ~ 0 + condition + batch analyses
macro_sva <- sm(all_pairwise(input=macro_lowfilt, model_batch="svaseq"))
medians_by_condition <- macro_sva$basic$medians
macro_sva_tables <- sm(combine_de_tables(macro_sva, excel=paste0("excel/macrophage_sva-v", ver, ".xlsx"),
                                         keepers=my_contrasts,
                                         extra_annot=medians_by_condition))
macro_sva_sig <- sm(extract_significant_genes(macro_sva_tables,
                                              excel=paste0("excel/macrophage_sva_significant-v", ver, ".xlsx"),
                                              p_type="unadjusted",
                                              according_to="all"))
macro_sva_ma_limma <- extract_de_ma(pairwise=macro_sva, type="limma", table="macro_sh_vs_macro_ch")
macro_sva_ma_limma$plot

2.4 Batch correction via ruv residuals

## Here just let all_pairwise run on filtered data and do its normal ~ 0 + condition + batch analyses
## Bizarrely, sometimes if one runs this, it gives an error "Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'RUVr' for signature '"matrix", "logical", "numeric", "NULL"'"  -- however, if one then simply runs it again it works fine.
## I am going to assume that it is because I do not explicitly invoke the library.
library(ruv)  ## hopefully a small code change made this not needed.
testme <- get_model_adjust(data=macro_lowfilt, estimate_type="ruv_residuals")
## The be method chose 1 surrogate variable(s).
## Attempting ruvseq residual surrogate estimation.
macro_ruvres <- sm(all_pairwise(input=macro_lowfilt, model_batch="ruv_residuals"))
medians_by_condition <- macro_ruvres$basic$medians
macro_ruvres_tables <- sm(combine_de_tables(macro_ruvres, excel=paste0("excel/macrophage_ruvres-v", ver, ".xlsx"),
                                            keepers=my_contrasts,
                                            extra_annot=medians_by_condition))
macro_ruvres_sig <- sm(extract_significant_genes(macro_ruvres_tables,
                                                 excel=paste0("excel/macrophage_ruvres_significant-v", ver, ".xlsx"),
                                                 p_type="unadjusted",
                                                 according_to="all"))

2.5 Batch correction with pca

## Here just let all_pairwise run on filtered data and do its normal ~ 0 + condition + batch analyses
macro_pca <- sm(all_pairwise(input=macro_lowfilt, model_batch="pca"))
medians_by_condition <- macro_pca$basic$medians
macro_pca_tables <- sm(combine_de_tables(macro_pca, excel=paste0("excel/macrophage_pca-v", ver, ".xlsx"),
                                         keepers=my_contrasts,
                                         extra_annot=medians_by_condition))
macro_pca_sig <- sm(extract_significant_genes(macro_pca_tables,
                                              excel=paste0("excel/macrophage_pca_significant-v", ver, ".xlsx"),
                                              p_type="unadjusted",
                                              according_to="all"))

2.6 Batch correction with ruv empirical

## Here just let all_pairwise run on filtered data and do its normal ~ 0 + condition + batch analyses
macro_ruvemp <- sm(all_pairwise(input=macro_lowfilt, model_batch="ruv_empirical"))
medians_by_condition <- macro_ruvemp$basic$medians
macro_ruvemp_tables <- sm(combine_de_tables(macro_ruvemp, excel=paste0("excel/macrophage_ruvemp-v", ver, ".xlsx"),
                                            keepers=my_contrasts,
                                            extra_annot=medians_by_condition))
macro_ruvemp_sig <- sm(extract_significant_genes(macro_ruvemp_tables,
                                                 excel=paste0("excel/macrophage_ruvemp_significant-v", ver, ".xlsx"),
                                                 p_type="unadjusted",
                                                 according_to="all"))

2.7 Batch correction with combat

Then repeat with the batch-corrected data and see the differences.

macro_combat <- sm(all_pairwise(input=macro_combat_norm, force=TRUE))
medians_by_condition <- macro_combat$basic$medians
macro_combat_tables <- sm(combine_de_tables(macro_combat,
                                            excel=paste0("excel/macrophage_combat-v", ver, ".xlsx"),
                                            keepers=my_contrasts,
                                            extra_annot=medians_by_condition))
macro_combat_sig <- sm(extract_significant_genes(macro_combat_tables,
                                                 excel=paste0("excel/macrophage_combat_significant-v", ver, ".xlsx"),
                                                 p_type="unadjusted",
                                                 according_to="all"))
macro_combat_ma_limma <- limma_ma(output=macro_combat, table="macro_sh_vs_macro_ch")
## Error in eval(expr, envir, enclos): could not find function "limma_ma"
png(file="images/macro_combat_ma-limma.png")
macro_combat_ma_limma$plot
## Error in eval(expr, envir, enclos): object 'macro_combat_ma_limma' not found
dev.off()
## X11cairo
##        2
macro_combat_ma_edger <- edger_ma(output=macro_combat, table="macro_sh_vs_macro_ch")
## Error in eval(expr, envir, enclos): could not find function "edger_ma"

png(file="images/macro_combat_ma-edger.png")
macro_combat_ma_edger$plot
## Error in eval(expr, envir, enclos): object 'macro_combat_ma_edger' not found
dev.off()
## X11cairo
##        2
macro_combat_ma_deseq <- deseq_ma(output=macro_combat, table="macro_sh_vs_macro_ch")
## Error in eval(expr, envir, enclos): could not find function "deseq_ma"
png(file="images/macro_combat_ma-deseq.png")
macro_combat_ma_deseq$plot
## Error in eval(expr, envir, enclos): object 'macro_combat_ma_deseq' not found
dev.off()
## X11cairo
##        2

3 Figure out how to compare these results

I have 4 methods of performing this differential expression analysis. Each one comes with a set of metrics defining ‘significant’. Perhaps I can make a table of the # of genes defined as significant by contrast for each. In addition it may be worth while to do a scatter plots of the logFCs between these comparisons and see how well they agree?

4 Look first at the de counts

macro_nobatch_sig$limma$counts
##               change_counts_up change_counts_down
## macro_chr-sh               104                 55
## macro_chr-nil              187                137
## macro_sh-nil                 7                 25
macro_batch_sig$limma$counts
##               change_counts_up change_counts_down
## macro_chr-sh               134                129
## macro_chr-nil              255                390
## macro_sh-nil                24                 43
macro_sva_sig$limma$counts
##               change_counts_up change_counts_down
## macro_chr-sh               166                173
## macro_chr-nil              496                260
## macro_sh-nil                25                 31
macro_ruvres_sig$limma$counts
##               change_counts_up change_counts_down
## macro_chr-sh               492                275
## macro_chr-nil              643                318
## macro_sh-nil                26                 34
macro_pca_sig$limma$counts
##               change_counts_up change_counts_down
## macro_chr-sh                84                 73
## macro_chr-nil              221                232
## macro_sh-nil                25                 33
macro_ruvemp_sig$limma$counts
##               change_counts_up change_counts_down
## macro_chr-sh               102                107
## macro_chr-nil              286                207
## macro_sh-nil                24                 30
macro_combat_sig$limma$counts
##               change_counts_up change_counts_down
## macro_chr-sh               156                757
## macro_chr-nil              420                361
## macro_sh-nil                43                  5

4.1 Compare DeSeq / Basic without batch in model

nobatch_basic <- merge(macro_nobatch$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$basic$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(nobatch_basic) <- nobatch_basic[["Row.names"]]
nobatch_logfc <- nobatch_basic[, c("logFC.x","logFC.y")]
colnames(nobatch_logfc) <- c("nobatch","basic")
lfc_nb_b <- sm(plot_linear_scatter(nobatch_logfc, pretty_colors=FALSE))
lfc_nb_b$scatter

lfc_nb_b$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 230, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.8899 0.8968
## sample estimates:
##    cor
## 0.8934
nobatch_p <- nobatch_basic[, c("P.Value","p")]
nobatch_p[[2]] <- as.numeric(nobatch_p[[2]])
colnames(nobatch_p) <- c("nobatch","basic")
nobatch_p <- -1 * log(nobatch_p)
p_nb_b <- sm(plot_linear_scatter(nobatch_p, pretty_colors=FALSE))
p_nb_b$scatter

p_nb_b$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 120, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.7083 0.7250
## sample estimates:
##    cor
## 0.7167

4.2 Compare SVA to batch in model, DESeq

sva_batch <- merge(macro_sva$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$deseq$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(sva_batch) <- sva_batch[["Row.names"]]
sva_logfc <- sva_batch[, c("logFC.x","logFC.y")]
colnames(sva_logfc) <- c("sva","batch")
lfc_b_s <- sm(plot_linear_scatter(sva_logfc, pretty_colors=FALSE))
lfc_b_s$scatter

lfc_b_s$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 49, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.379 0.408
## sample estimates:
##    cor
## 0.3936

4.2.1 Include p-value estimations

Try putting some information of the p-values with the comparative log2fc

lfcp_b_s <- sva_batch[, c("logFC.x", "logFC.y", "P.Value.x", "P.Value.y")]
colnames(lfcp_b_s) <- c("l2fcsva", "l2fcbatch", "psva", "pbatch")
lfc_b_s$scatter

cutoff <- 0.1
lfcp_b_s$state <- ifelse(lfcp_b_s$psva > cutoff & lfcp_b_s$pbatch > cutoff, "bothinsig",
                  ifelse(lfcp_b_s$psva <= cutoff & lfcp_b_s$pbatch <= cutoff, "bothsig",
                  ifelse(lfcp_b_s$psva <= cutoff, "svasig", "batchsig")))
##lfcp_b_s$lfcstate <- ifelse(lfcp_b_s$l2fcsva >= 0.75 & lfcp_b_s$l2fcbatch, "", "")
num_bothinsig <- sum(lfcp_b_s$state == "bothinsig")
num_bothsig <- sum(lfcp_b_s$state == "bothsig")
num_svasig <- sum(lfcp_b_s$state == "svasig")
num_batchsig <- sum(lfcp_b_s$state == "batchsig")

library(ggplot2)
aes_color = "(l2fcsva >= 0.75 | l2fcsva <= -0.75 | l2fcbatch >= 0.75 | l2fcbatch <= -0.75)"

plt <- ggplot2::ggplot(lfcp_b_s, aes_string(x="l2fcsva", y="l2fcbatch")) +
    ## ggplot2::geom_point(stat="identity", size=2, alpha=0.2, aes_string(shape="as.factor(aes_color)", colour="as.factor(state)", fill="as.factor(state)")) +
    ggplot2::geom_abline(colour="blue", slope=1, intercept=0, size=0.5) +
    ggplot2::geom_hline(yintercept=c(-0.75, 0.75), color="red", size=0.5) +
    ggplot2::geom_vline(xintercept=c(-0.75, 0.75), color="red", size=0.5) +
    ggplot2::geom_point(stat="identity", size=2, alpha=0.2, aes_string(colour="as.factor(state)", fill="as.factor(state)")) +
    ggplot2::scale_color_manual(name="state", values=c("bothinsig"="grey", "bothsig"="forestgreen", "svasig"="darkred", "batchsig"="darkblue")) +
    ggplot2::scale_fill_manual(name="state", values=c("bothinsig"="grey", "bothsig"="forestgreen", "svasig"="darkred", "batchsig"="darkblue"),
                               labels=c(
                                   paste0("Both InSig.: ", num_bothinsig),
                                   paste0("Both Sig.: ", num_bothsig),
                                   paste0("Sva Sig.: ", num_svasig),
                                   paste0("Batch Sig.: ", num_batchsig)),
                               guide=ggplot2::guide_legend(override.aes=aes(size=3, fill="grey"))) +
    ggplot2::guides(fill=ggplot2::guide_legend(override.aes=list(size=3))) +
    ggplot2::theme_bw()
plt

4.3 Compare ruvresid to batch in model, DESeq

batch_ruvresid_deseq <- merge(macro_ruvres$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$basic$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_ruvresid_deseq) <- batch_ruvresid_deseq[["Row.names"]]
batch_ruvresid_logfc <- batch_ruvresid_deseq[, c("logFC.x","logFC.y")]
colnames(batch_ruvresid_logfc) <- c("nobatch","basic")
lfc_ruv_bat <- plot_linear_scatter(batch_ruvresid_logfc, pretty_colors=FALSE)
## Used Bon Ferroni corrected t test(s) between columns.
lfc_ruv_bat$scatter

lfc_ruv_bat$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 230, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.8899 0.8968
## sample estimates:
##    cor
## 0.8934

4.4 Compare pca to batch in model, DESeq

batch_pca_deseq <- merge(macro_pca$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$basic$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_pca_deseq) <- batch_pca_deseq[["Row.names"]]
batch_pca_logfc <- batch_pca_deseq[, c("logFC.x","logFC.y")]
colnames(batch_pca_logfc) <- c("nobatch","basic")
lfc_pca_bat <- plot_linear_scatter(batch_pca_logfc, pretty_colors=FALSE)
## Used Bon Ferroni corrected t test(s) between columns.
lfc_pca_bat$scatter

lfc_pca_bat$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 230, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.8899 0.8968
## sample estimates:
##    cor
## 0.8934

4.5 Compare ruv empirical to batch in model, DESeq

batch_ruvemp_deseq <- merge(macro_ruvemp$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$basic$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_ruvemp_deseq) <- batch_ruvemp_deseq[["Row.names"]]
batch_ruvemp_logfc <- batch_ruvemp_deseq[, c("logFC.x","logFC.y")]
colnames(batch_ruvemp_logfc) <- c("nobatch","basic")
lfc_ruvemp_bat <- sm(plot_linear_scatter(batch_ruvemp_logfc, pretty_colors=FALSE))
lfc_ruvemp_bat$scatter

lfc_ruvemp_bat$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 230, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.8899 0.8968
## sample estimates:
##    cor
## 0.8934

4.6 Compare combat to batch in model, DESeq

combat_batch <- merge(macro_combat$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$deseq$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(combat_batch) <- combat_batch[["Row.names"]]
combat_batch <- combat_batch[, c("logFC.x","logFC.y")]
colnames(combat_batch) <- c("batch","combat")
b_c <- plot_linear_scatter(combat_batch, pretty_colors=FALSE)
## Used Bon Ferroni corrected t test(s) between columns.
b_c$scatter

b_c$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 560, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.9794 0.9807
## sample estimates:
##    cor
## 0.9801

4.7 Compare no batch to batch in model, limma

nobatch_batch <- merge(macro_nobatch$limma$all_tables$macro_sh_vs_macro_ch, macro_batch$limma$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(nobatch_batch) <- nobatch_batch[["Row.names"]]
nobatch_batch <- nobatch_batch[, c("logFC.x","logFC.y")]
colnames(nobatch_batch) <- c("nobatch","batch")
nb_b <- plot_linear_scatter(nobatch_batch, pretty_colors=FALSE)
## Used Bon Ferroni corrected t test(s) between columns.
nb_b$scatter

nb_b$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 56, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4278 0.4554
## sample estimates:
##    cor
## 0.4417

4.8 Batch in model vs. SVA, limma

batch_sva <- merge(macro_batch$limma$all_tables$macro_sh_vs_macro_ch, macro_sva$limma$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_sva) <- batch_sva[["Row.names"]]
batch_sva <- batch_sva[, c("logFC.x","logFC.y")]
colnames(batch_sva) <- c("batch","sva")
b_s <- plot_linear_scatter(batch_sva, pretty_colors=FALSE)
## Used Bon Ferroni corrected t test(s) between columns.
b_s$scatter

b_s$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 65, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4826 0.5084
## sample estimates:
##    cor
## 0.4956

4.9 Batch in model vs. combat, limma

batch_combat <- merge(macro_batch$limma$all_tables$macro_sh_vs_macro_ch, macro_combat$limma$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_combat) <- batch_combat[["Row.names"]]
batch_combat <- batch_combat[, c("logFC.x","logFC.y")]
colnames(batch_combat) <- c("batch","combat")
b_c <- plot_linear_scatter(batch_combat, pretty_colors=FALSE)
## Used Bon Ferroni corrected t test(s) between columns.
b_c$scatter

b_c$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 410, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.9614 0.9639
## sample estimates:
##    cor
## 0.9627

4.10 Nobatch vs. batch in model, edger

nobatch_batch <- merge(macro_nobatch$edger$all_tables$macro_sh_vs_macro_ch, macro_batch$edger$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(nobatch_batch) <- nobatch_batch[["Row.names"]]
nobatch_batch <- nobatch_batch[, c("logFC.x","logFC.y")]
colnames(nobatch_batch) <- c("nobatch","batch")
nb_b <- sm(plot_linear_scatter(nobatch_batch, pretty_colors=FALSE))
nb_b$scatter

nb_b$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 22, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1724 0.2055
## sample estimates:
##   cor
## 0.189

4.11 Batch in model vs. SVA, edger

batch_sva <- merge(macro_batch$edger$all_tables$macro_sh_vs_macro_ch, macro_sva$edger$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_sva) <- batch_sva[["Row.names"]]
batch_sva <- batch_sva[, c("logFC.x","logFC.y")]
colnames(batch_sva) <- c("batch","sva")
b_s <- plot_linear_scatter(batch_sva, pretty_colors=FALSE)
## Used Bon Ferroni corrected t test(s) between columns.
b_s$scatter

b_s$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 65, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4824 0.5083
## sample estimates:
##    cor
## 0.4955

4.12 Batch in model vs. combat, edger

batch_combat <- merge(macro_batch$edger$all_tables$macro_sh_vs_macro_ch, macro_combat$edger$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_combat) <- batch_combat[["Row.names"]]
batch_combat <- batch_combat[, c("logFC.x","logFC.y")]
colnames(batch_combat) <- c("batch","combat")
b_c <- plot_linear_scatter(batch_combat, pretty_colors=FALSE)
## Used Bon Ferroni corrected t test(s) between columns.
b_c$scatter

b_c$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 370, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.9534 0.9564
## sample estimates:
##    cor
## 0.9549

4.13 Compare nobatch vs. batch, deseq

nobatch_batch <- merge(macro_nobatch$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$deseq$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(nobatch_batch) <- nobatch_batch[["Row.names"]]
nobatch_batch <- nobatch_batch[, c("logFC.x","logFC.y")]
colnames(nobatch_batch) <- c("nobatch","batch")
nb_b <- sm(plot_linear_scatter(nobatch_batch, pretty_colors=FALSE))
nb_b$scatter

nb_b$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 49, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.379 0.408
## sample estimates:
##    cor
## 0.3936

4.14 Compare batch vs. SVA, deseq

batch_sva <- merge(macro_batch$deseq$all_tables$macro_sh_vs_macro_ch, macro_sva$deseq$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_sva) <- batch_sva[["Row.names"]]
batch_sva <- batch_sva[, c("logFC.x","logFC.y")]
colnames(batch_sva) <- c("batch","sva")
b_s <- sm(plot_linear_scatter(batch_sva, pretty_colors=FALSE))
b_s$scatter

b_s$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 49, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.379 0.408
## sample estimates:
##    cor
## 0.3936

4.15 Batch in model vs. combat, deseq

batch_combat <- merge(macro_batch$deseq$all_tables$macro_sh_vs_macro_ch, macro_combat$deseq$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_combat) <- batch_combat[["Row.names"]]
batch_combat <- batch_combat[, c("logFC.x","logFC.y")]
colnames(batch_combat) <- c("batch","combat")
b_c <- sm(plot_linear_scatter(batch_combat, pretty_colors=FALSE))
b_c$scatter

b_c$correlation
##
##  Pearson's product-moment correlation
##
## data:  df[, 1] and df[, 2]
## t = 560, df = 13000, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.9794 0.9807
## sample estimates:
##    cor
## 0.9801
tmp <- sm(saveme(filename="macrophage_expression_human.rda.xz"))

index.html macrophage_estimation.html

---
title: "RNAseq of L.panamensis in human macrophages, during infection, in biopsies, and with antimonial treatment."
author: "atb abelew@gmail.com"
date: "`r Sys.Date()`"
output:
 html_document:
  code_download: true
  code_folding: show
  fig_caption: true
  fig_height: 7
  fig_width: 7
  highlight: tango
  keep_md: false
  mode: selfcontained
  number_sections: true
  self_contained: true
  theme: cosmo
  toc: true
  toc_float:
    collapsed: false
    smooth_scroll: false
---

<style>
  <!-- Document prelude revision 2016-10 -->
  body .main-container {
    max-width: 1600px;
}
</style>

```{r options, include=FALSE}
## These are the options I tend to favor
library("hpgltools")
devtools::load_all("~/hpgltools")
knitr::opts_knit$set(
    progress = TRUE,
    verbose = TRUE,
    width = 90,
    echo = TRUE)
knitr::opts_chunk$set(
    error = TRUE,
    fig.width = 8,
    fig.height = 8,
    dpi = 96)
options(
    digits = 4,
    stringsAsFactors = FALSE,
    knitr.duplicate.label = "allow")
ggplot2::theme_set(ggplot2::theme_bw(base_size=10))
set.seed(1)
rmd_file <- "macrophage_expression_human.Rmd"
ver <- "20170206"
```

[index.html](index.html) [macrophage_estimation.html](macrophage_estimation.html)

```{r rendering, include=FALSE, eval=FALSE}
## This block is used to render a document from within it.
rmarkdown::render(rmd_file)

## An extra renderer for pdf output
rmarkdown::render(rmd_file, output_format="pdf_document", output_options=c("skip_html"))
## Or to save/load large Rdata files.
hpgltools:::saveme()
hpgltools:::loadme()
rm(list=ls())
```

RNA sequencing of Leishmania panamensis during infection of human macrophages: Differential Expression
======================================================================================================

In 'macrophage_estimation', we did a series of analyses to try to pick out some of the surrogate
variables in the data.  Now we will perform a set of differential expression analyses using the
results from that.  Since the 'batch' element of the data is reasonably well explained, we will not
abuse the data with sva/combat, but instead include batch in the experimental model.

```{r loadme, include=FALSE}
tmp <- sm(loadme(filename="macrophage_estimation_human.rda.xz"))
```

# Differential expression analyses

It appears that it is possible though somewhat difficult to apply batch estimations generated by sva
to the model given to DESeq/EdgeR/limma.  In the case of limma it is fairly simple, but in the other
two it is a bit more difficult.  There is a nice discussion of this at: https://www.biostars.org/p/156186/
I am attempting to apply that logic to this data with limited success.

```{r setup_de, fig.show="hide"}
my_contrasts <- list(
    "macro_chr-sh" = c("macro_ch","macro_sh"),
    "macro_chr-nil" = c("macro_ch","macro_nil"),
    "macro_sh-nil" = c("macro_sh", "macro_nil"))
macro_norm <- sm(normalize_expt(macrophage_human, filter=TRUE, convert="cpm"))
macro_combat_norm <- sm(normalize_expt(macrophage_human, filter=TRUE, batch="combat_scale"))
macro_sva_norm <- sm(normalize_expt(macrophage_human, filter=TRUE, batch="svaseq"))
macro_lowfilt <- sm(normalize_expt(macrophage_human, filter=TRUE))
## Set up the data used in the comparative contrast sets.
```

## No batch in the model

```{r macro_nobatch, fig.show="hide"}
macro_nobatch <- sm(all_pairwise(input=macro_lowfilt, model_batch=FALSE))
## wow, all tools including basic agree almost completely
medians_by_condition <- macro_nobatch$basic$medians
macro_nobatch_tables <- sm(combine_de_tables(macro_nobatch,
                                             excel=paste0("excel/macrophage_nobatch-v", ver, ".xlsx"),
                                             keepers=my_contrasts,
                                             extra_annot=medians_by_condition))
macro_nobatch_sig <- sm(extract_significant_genes(macro_nobatch_tables,
                                                  excel=paste0("excel/macrophage_nobatch_significant-v", ver, ".xlsx"),
                                                  p_type="unadjusted",
                                                  according_to="all"))
```

## Batch in the model

In this  attempt, we add a batch factor in the experimental model and see how it does.

```{r macro_batch, fig.show="hide"}
## Here just let all_pairwise run on filtered data and do its normal ~ 0 + condition + batch analyses
macro_batch <- sm(all_pairwise(input=macro_lowfilt))
medians_by_condition <- macro_batch$basic$medians
macro_batch_tables <- sm(combine_de_tables(macro_batch, excel=paste0("excel/macrophage_batchmodel-v", ver, ".xlsx"),
                                           keepers=my_contrasts,
                                           extra_annot=medians_by_condition))
macro_batch_sig <- sm(extract_significant_genes(macro_batch_tables,
                                             excel=paste0("excel/macrophage_batchmodel_significant-v", ver, ".xlsx"),
                                             p_type="unadjusted",
                                             according_to="limma"))
```

## Batch estimated with SVA

```{r macro_sva, fig.show="hide"}
## Here just let all_pairwise run on filtered data and do its normal ~ 0 + condition + batch analyses
macro_sva <- sm(all_pairwise(input=macro_lowfilt, model_batch="svaseq"))
medians_by_condition <- macro_sva$basic$medians
macro_sva_tables <- sm(combine_de_tables(macro_sva, excel=paste0("excel/macrophage_sva-v", ver, ".xlsx"),
                                         keepers=my_contrasts,
                                         extra_annot=medians_by_condition))
macro_sva_sig <- sm(extract_significant_genes(macro_sva_tables,
                                              excel=paste0("excel/macrophage_sva_significant-v", ver, ".xlsx"),
                                              p_type="unadjusted",
                                              according_to="all"))
macro_sva_ma_limma <- extract_de_ma(pairwise=macro_sva, type="limma", table="macro_sh_vs_macro_ch")
```

```{r sva_ma_plot}
macro_sva_ma_limma$plot
```

## Batch correction via ruv residuals

```{r macro_ruvresid, fig.show="hide"}
## Here just let all_pairwise run on filtered data and do its normal ~ 0 + condition + batch analyses
## Bizarrely, sometimes if one runs this, it gives an error "Error in (function (classes, fdef, mtable) : unable to find an inherited method for function 'RUVr' for signature '"matrix", "logical", "numeric", "NULL"'"  -- however, if one then simply runs it again it works fine.
## I am going to assume that it is because I do not explicitly invoke the library.
library(ruv)  ## hopefully a small code change made this not needed.
testme <- get_model_adjust(data=macro_lowfilt, estimate_type="ruv_residuals")
macro_ruvres <- sm(all_pairwise(input=macro_lowfilt, model_batch="ruv_residuals"))
medians_by_condition <- macro_ruvres$basic$medians
macro_ruvres_tables <- sm(combine_de_tables(macro_ruvres, excel=paste0("excel/macrophage_ruvres-v", ver, ".xlsx"),
                                            keepers=my_contrasts,
                                            extra_annot=medians_by_condition))
macro_ruvres_sig <- sm(extract_significant_genes(macro_ruvres_tables,
                                                 excel=paste0("excel/macrophage_ruvres_significant-v", ver, ".xlsx"),
                                                 p_type="unadjusted",
                                                 according_to="all"))
```

## Batch correction with pca

```{r macro_pca, fig.show="hide"}
## Here just let all_pairwise run on filtered data and do its normal ~ 0 + condition + batch analyses
macro_pca <- sm(all_pairwise(input=macro_lowfilt, model_batch="pca"))
medians_by_condition <- macro_pca$basic$medians
macro_pca_tables <- sm(combine_de_tables(macro_pca, excel=paste0("excel/macrophage_pca-v", ver, ".xlsx"),
                                         keepers=my_contrasts,
                                         extra_annot=medians_by_condition))
macro_pca_sig <- sm(extract_significant_genes(macro_pca_tables,
                                              excel=paste0("excel/macrophage_pca_significant-v", ver, ".xlsx"),
                                              p_type="unadjusted",
                                              according_to="all"))
```

## Batch correction with ruv empirical

```{r macro_ruvemp, fig.show="hide"}
## Here just let all_pairwise run on filtered data and do its normal ~ 0 + condition + batch analyses
macro_ruvemp <- sm(all_pairwise(input=macro_lowfilt, model_batch="ruv_empirical"))
medians_by_condition <- macro_ruvemp$basic$medians
macro_ruvemp_tables <- sm(combine_de_tables(macro_ruvemp, excel=paste0("excel/macrophage_ruvemp-v", ver, ".xlsx"),
                                            keepers=my_contrasts,
                                            extra_annot=medians_by_condition))
macro_ruvemp_sig <- sm(extract_significant_genes(macro_ruvemp_tables,
                                                 excel=paste0("excel/macrophage_ruvemp_significant-v", ver, ".xlsx"),
                                                 p_type="unadjusted",
                                                 according_to="all"))
```

## Batch correction with combat

Then repeat with the batch-corrected data and see the differences.

```{r repeat_pairwise_batch, fig.show="hide"}
macro_combat <- sm(all_pairwise(input=macro_combat_norm, force=TRUE))
medians_by_condition <- macro_combat$basic$medians
macro_combat_tables <- sm(combine_de_tables(macro_combat,
                                            excel=paste0("excel/macrophage_combat-v", ver, ".xlsx"),
                                            keepers=my_contrasts,
                                            extra_annot=medians_by_condition))
macro_combat_sig <- sm(extract_significant_genes(macro_combat_tables,
                                                 excel=paste0("excel/macrophage_combat_significant-v", ver, ".xlsx"),
                                                 p_type="unadjusted",
                                                 according_to="all"))
```

```{r finished_ma_plots}
macro_combat_ma_limma <- limma_ma(output=macro_combat, table="macro_sh_vs_macro_ch")
png(file="images/macro_combat_ma-limma.png")
macro_combat_ma_limma$plot
dev.off()
macro_combat_ma_edger <- edger_ma(output=macro_combat, table="macro_sh_vs_macro_ch")
png(file="images/macro_combat_ma-edger.png")
macro_combat_ma_edger$plot
dev.off()
macro_combat_ma_deseq <- deseq_ma(output=macro_combat, table="macro_sh_vs_macro_ch")
png(file="images/macro_combat_ma-deseq.png")
macro_combat_ma_deseq$plot
dev.off()
```

# Figure out how to compare these results

I have 4 methods of performing this differential expression analysis.  Each one comes with a set of
metrics defining 'significant'.  Perhaps I can make a table of the # of genes defined as significant
by contrast for each.  In addition it may be worth while to do a scatter plots of the logFCs between
these comparisons and see how well they agree?

# Look first at the de counts

```{r compare_de_setup}
macro_nobatch_sig$limma$counts
macro_batch_sig$limma$counts
macro_sva_sig$limma$counts
macro_ruvres_sig$limma$counts
macro_pca_sig$limma$counts
macro_ruvemp_sig$limma$counts
macro_combat_sig$limma$counts
```

## Compare DeSeq / Basic without batch in model

```{r basic_deseq_nobatch}
nobatch_basic <- merge(macro_nobatch$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$basic$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(nobatch_basic) <- nobatch_basic[["Row.names"]]
nobatch_logfc <- nobatch_basic[, c("logFC.x","logFC.y")]
colnames(nobatch_logfc) <- c("nobatch","basic")
lfc_nb_b <- sm(plot_linear_scatter(nobatch_logfc, pretty_colors=FALSE))
lfc_nb_b$scatter
lfc_nb_b$correlation
nobatch_p <- nobatch_basic[, c("P.Value","p")]
nobatch_p[[2]] <- as.numeric(nobatch_p[[2]])
colnames(nobatch_p) <- c("nobatch","basic")
nobatch_p <- -1 * log(nobatch_p)
p_nb_b <- sm(plot_linear_scatter(nobatch_p, pretty_colors=FALSE))
p_nb_b$scatter
p_nb_b$correlation
```

## Compare SVA to batch in model, DESeq

```{r deseq_sva_batch}
sva_batch <- merge(macro_sva$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$deseq$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(sva_batch) <- sva_batch[["Row.names"]]
sva_logfc <- sva_batch[, c("logFC.x","logFC.y")]
colnames(sva_logfc) <- c("sva","batch")
lfc_b_s <- sm(plot_linear_scatter(sva_logfc, pretty_colors=FALSE))
lfc_b_s$scatter
lfc_b_s$correlation
```

### Include p-value estimations

Try putting some information of the p-values with the comparative log2fc

```{r l2fs_pvals}
lfcp_b_s <- sva_batch[, c("logFC.x", "logFC.y", "P.Value.x", "P.Value.y")]
colnames(lfcp_b_s) <- c("l2fcsva", "l2fcbatch", "psva", "pbatch")
lfc_b_s$scatter
cutoff <- 0.1
lfcp_b_s$state <- ifelse(lfcp_b_s$psva > cutoff & lfcp_b_s$pbatch > cutoff, "bothinsig",
                  ifelse(lfcp_b_s$psva <= cutoff & lfcp_b_s$pbatch <= cutoff, "bothsig",
                  ifelse(lfcp_b_s$psva <= cutoff, "svasig", "batchsig")))
##lfcp_b_s$lfcstate <- ifelse(lfcp_b_s$l2fcsva >= 0.75 & lfcp_b_s$l2fcbatch, "", "")
num_bothinsig <- sum(lfcp_b_s$state == "bothinsig")
num_bothsig <- sum(lfcp_b_s$state == "bothsig")
num_svasig <- sum(lfcp_b_s$state == "svasig")
num_batchsig <- sum(lfcp_b_s$state == "batchsig")

library(ggplot2)
aes_color = "(l2fcsva >= 0.75 | l2fcsva <= -0.75 | l2fcbatch >= 0.75 | l2fcbatch <= -0.75)"

plt <- ggplot2::ggplot(lfcp_b_s, aes_string(x="l2fcsva", y="l2fcbatch")) +
    ## ggplot2::geom_point(stat="identity", size=2, alpha=0.2, aes_string(shape="as.factor(aes_color)", colour="as.factor(state)", fill="as.factor(state)")) +
    ggplot2::geom_abline(colour="blue", slope=1, intercept=0, size=0.5) +
    ggplot2::geom_hline(yintercept=c(-0.75, 0.75), color="red", size=0.5) +
    ggplot2::geom_vline(xintercept=c(-0.75, 0.75), color="red", size=0.5) +
    ggplot2::geom_point(stat="identity", size=2, alpha=0.2, aes_string(colour="as.factor(state)", fill="as.factor(state)")) +
    ggplot2::scale_color_manual(name="state", values=c("bothinsig"="grey", "bothsig"="forestgreen", "svasig"="darkred", "batchsig"="darkblue")) +
    ggplot2::scale_fill_manual(name="state", values=c("bothinsig"="grey", "bothsig"="forestgreen", "svasig"="darkred", "batchsig"="darkblue"),
                               labels=c(
                                   paste0("Both InSig.: ", num_bothinsig),
                                   paste0("Both Sig.: ", num_bothsig),
                                   paste0("Sva Sig.: ", num_svasig),
                                   paste0("Batch Sig.: ", num_batchsig)),
                               guide=ggplot2::guide_legend(override.aes=aes(size=3, fill="grey"))) +
    ggplot2::guides(fill=ggplot2::guide_legend(override.aes=list(size=3))) +
    ggplot2::theme_bw()
plt
```

## Compare ruvresid to batch in model, DESeq

```{r batch_ruvresid_deseq}
batch_ruvresid_deseq <- merge(macro_ruvres$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$basic$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_ruvresid_deseq) <- batch_ruvresid_deseq[["Row.names"]]
batch_ruvresid_logfc <- batch_ruvresid_deseq[, c("logFC.x","logFC.y")]
colnames(batch_ruvresid_logfc) <- c("nobatch","basic")
lfc_ruv_bat <- plot_linear_scatter(batch_ruvresid_logfc, pretty_colors=FALSE)
lfc_ruv_bat$scatter
lfc_ruv_bat$correlation
```

## Compare pca to batch in model, DESeq

```{r batch_pca_deseq}
batch_pca_deseq <- merge(macro_pca$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$basic$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_pca_deseq) <- batch_pca_deseq[["Row.names"]]
batch_pca_logfc <- batch_pca_deseq[, c("logFC.x","logFC.y")]
colnames(batch_pca_logfc) <- c("nobatch","basic")
lfc_pca_bat <- plot_linear_scatter(batch_pca_logfc, pretty_colors=FALSE)
lfc_pca_bat$scatter
lfc_pca_bat$correlation
```

## Compare ruv empirical to batch in model, DESeq

```{r batch_ruvemp_deseq}
batch_ruvemp_deseq <- merge(macro_ruvemp$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$basic$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_ruvemp_deseq) <- batch_ruvemp_deseq[["Row.names"]]
batch_ruvemp_logfc <- batch_ruvemp_deseq[, c("logFC.x","logFC.y")]
colnames(batch_ruvemp_logfc) <- c("nobatch","basic")
lfc_ruvemp_bat <- sm(plot_linear_scatter(batch_ruvemp_logfc, pretty_colors=FALSE))
lfc_ruvemp_bat$scatter
lfc_ruvemp_bat$correlation
```

## Compare combat to batch in model, DESeq

```{r compare_batch_combat}
combat_batch <- merge(macro_combat$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$deseq$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(combat_batch) <- combat_batch[["Row.names"]]
combat_batch <- combat_batch[, c("logFC.x","logFC.y")]
colnames(combat_batch) <- c("batch","combat")
b_c <- plot_linear_scatter(combat_batch, pretty_colors=FALSE)
b_c$scatter
b_c$correlation
```

## Compare no batch to batch in model, limma

```{r compare_batch_nobatch_limma}
nobatch_batch <- merge(macro_nobatch$limma$all_tables$macro_sh_vs_macro_ch, macro_batch$limma$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(nobatch_batch) <- nobatch_batch[["Row.names"]]
nobatch_batch <- nobatch_batch[, c("logFC.x","logFC.y")]
colnames(nobatch_batch) <- c("nobatch","batch")
nb_b <- plot_linear_scatter(nobatch_batch, pretty_colors=FALSE)
nb_b$scatter
nb_b$correlation
```

## Batch in model vs. SVA, limma

```{r compare_batch_sva_limma}
batch_sva <- merge(macro_batch$limma$all_tables$macro_sh_vs_macro_ch, macro_sva$limma$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_sva) <- batch_sva[["Row.names"]]
batch_sva <- batch_sva[, c("logFC.x","logFC.y")]
colnames(batch_sva) <- c("batch","sva")
b_s <- plot_linear_scatter(batch_sva, pretty_colors=FALSE)
b_s$scatter
b_s$correlation
```

## Batch in model vs. combat, limma

```{r compare_batch_combat_limma}
batch_combat <- merge(macro_batch$limma$all_tables$macro_sh_vs_macro_ch, macro_combat$limma$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_combat) <- batch_combat[["Row.names"]]
batch_combat <- batch_combat[, c("logFC.x","logFC.y")]
colnames(batch_combat) <- c("batch","combat")
b_c <- plot_linear_scatter(batch_combat, pretty_colors=FALSE)
b_c$scatter
b_c$correlation
```

## Nobatch vs. batch in model, edger

```{r compare_nobatch_batch_edger}
nobatch_batch <- merge(macro_nobatch$edger$all_tables$macro_sh_vs_macro_ch, macro_batch$edger$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(nobatch_batch) <- nobatch_batch[["Row.names"]]
nobatch_batch <- nobatch_batch[, c("logFC.x","logFC.y")]
colnames(nobatch_batch) <- c("nobatch","batch")
nb_b <- sm(plot_linear_scatter(nobatch_batch, pretty_colors=FALSE))
nb_b$scatter
nb_b$correlation
```

## Batch in model vs. SVA, edger

```{r compare_batch_sva_edger}
batch_sva <- merge(macro_batch$edger$all_tables$macro_sh_vs_macro_ch, macro_sva$edger$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_sva) <- batch_sva[["Row.names"]]
batch_sva <- batch_sva[, c("logFC.x","logFC.y")]
colnames(batch_sva) <- c("batch","sva")
b_s <- plot_linear_scatter(batch_sva, pretty_colors=FALSE)
b_s$scatter
b_s$correlation
```

## Batch in model vs. combat, edger

```{r compare_batch_combat_edger}
batch_combat <- merge(macro_batch$edger$all_tables$macro_sh_vs_macro_ch, macro_combat$edger$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_combat) <- batch_combat[["Row.names"]]
batch_combat <- batch_combat[, c("logFC.x","logFC.y")]
colnames(batch_combat) <- c("batch","combat")
b_c <- plot_linear_scatter(batch_combat, pretty_colors=FALSE)
b_c$scatter
b_c$correlation
```

## Compare nobatch vs. batch, deseq

```{r compare_nobatch_batch_deseq}
nobatch_batch <- merge(macro_nobatch$deseq$all_tables$macro_sh_vs_macro_ch, macro_batch$deseq$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(nobatch_batch) <- nobatch_batch[["Row.names"]]
nobatch_batch <- nobatch_batch[, c("logFC.x","logFC.y")]
colnames(nobatch_batch) <- c("nobatch","batch")
nb_b <- sm(plot_linear_scatter(nobatch_batch, pretty_colors=FALSE))
nb_b$scatter
nb_b$correlation
```

## Compare batch vs. SVA, deseq

```{r compare_batch_sva_deseq}
batch_sva <- merge(macro_batch$deseq$all_tables$macro_sh_vs_macro_ch, macro_sva$deseq$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_sva) <- batch_sva[["Row.names"]]
batch_sva <- batch_sva[, c("logFC.x","logFC.y")]
colnames(batch_sva) <- c("batch","sva")
b_s <- sm(plot_linear_scatter(batch_sva, pretty_colors=FALSE))
b_s$scatter
b_s$correlation
```

## Batch in model vs. combat, deseq

```{r batch_combat_deseq}
batch_combat <- merge(macro_batch$deseq$all_tables$macro_sh_vs_macro_ch, macro_combat$deseq$all_tables$macro_sh_vs_macro_ch, by="row.names")
rownames(batch_combat) <- batch_combat[["Row.names"]]
batch_combat <- batch_combat[, c("logFC.x","logFC.y")]
colnames(batch_combat) <- c("batch","combat")
b_c <- sm(plot_linear_scatter(batch_combat, pretty_colors=FALSE))
b_c$scatter
b_c$correlation
```

```{r save_macro_expression}
tmp <- sm(saveme(filename="macrophage_expression_human.rda.xz"))
```

[index.html](index.html) [macrophage_estimation.html](macrophage_estimation.html)
