index.html

This document is concerned with analyzing RNAseq data of human and parasite during the infection of human macrophages. A few different strains of L. panamensis were used; the experiment is therefore segregated into groups named ‘self-healing’, ‘chronic’, and ‘uninfected’. Two separate sets of libraries were generated, one earlier set with greater coverage, and a later set including only 1 uninfected sample, and 2-3 chronic samples.

1 Sample metrics

I just realized that what I need to do is to take the surrogate estimates from the human experiment. Using those I think I will be able to get a more realistic estimate of what is going on.

1.1 But first, attempt a normal estimation

macrophage_parasite <- expt_subset(parasite_expt, subset="experimentname=='macrophage'")
chosen_colors <- c("#990000", "#000099")
names(chosen_colors) <- c("macro_ch", "macro_sh")
macrophage_parasite <- set_expt_colors(macrophage_parasite, colors=chosen_colors)
testing <- sm(write_expt(macrophage_parasite,
                         excel=paste0("excel/macrophage_parasite_data-v", ver, ".xlsx"),
                         violin=TRUE)
## Error: <text>:8:0: unexpected end of input
## 6:                          excel=paste0("excel/macrophage_parasite_data-v", ver, ".xlsx"),
## 7:                          violin=TRUE)
##   ^

Now lets try removing some of the surrogates, the two primary candidates are the strains which I proxy with 3 columns: snpclade, snpcladev2, and snpcladev3; and the batch. In this data set batch is either a or b.

In theory, sva should pick up both of those in one invocation.

## Perform a 'normal' sva
macrophage_sva_norm <- sm(normalize_expt(macrophage_parasite, batch="svaseq", filter=TRUE))
## Error in normalize_expt(macrophage_parasite, batch = "svaseq", filter = TRUE): object 'macrophage_parasite' not found
## Play with the normalization of it
macrophage_sva_norm <- sm(normalize_expt(macrophage_sva_norm, filter=TRUE, convert="cpm",
                                         norm="quant", transform="log2"))
## Error in normalize_expt(macrophage_sva_norm, filter = TRUE, convert = "cpm", : object 'macrophage_sva_norm' not found
macrophage_sva_met <- sm(graph_metrics(macrophage_sva_norm))
## Error in corheat$plot: $ operator is invalid for atomic vectors
## Try limma's method
macrophage_limma_norm <- sm(normalize_expt(macrophage_parasite, batch="limma", filter=TRUE,
                                           batch2="snpcladev3", convert="cpm", norm="quant"))
## Error in normalize_expt(macrophage_parasite, batch = "limma", filter = TRUE, : object 'macrophage_parasite' not found
macrophage_limma_norm <- sm(normalize_expt(macrophage_limma_norm, convert="cpm",
                                           norm="quant", filter=TRUE))
## Error in normalize_expt(macrophage_limma_norm, convert = "cpm", norm = "quant", : object 'macrophage_limma_norm' not found
macrophage_limma_met <- sm(graph_metrics(macrophage_limma_norm))
## Error in corheat$plot: $ operator is invalid for atomic vectors

1.2 Start with a pca from sva

macrophage_sva_met$pcaplot
## Error in eval(expr, envir, enclos): object 'macrophage_sva_met' not found

1.3 Try limma’s removebatcheffect

Another method might be to try using limmaresid to explicitly pull both columns.

macrophage_limma_met$pcaplot
## Error in eval(expr, envir, enclos): object 'macrophage_limma_met' not found

Another method might be to try pca on two separate invocations.

macrophage_pca_norm <- sm(normalize_expt(macrophage_parasite, batch="pca", filter=TRUE))
## Error in normalize_expt(macrophage_parasite, batch = "pca", filter = TRUE): object 'macrophage_parasite' not found
macrophage_pca_norm <- set_expt_batch(expt=macrophage_pca_norm, fact="snpcladev3")
## Error in set_expt_batch(expt = macrophage_pca_norm, fact = "snpcladev3"): object 'macrophage_pca_norm' not found
macrophage_pca_norm <- sm(normalize_expt(macrophage_pca_norm, batch="pca"))
## Error in normalize_expt(macrophage_pca_norm, batch = "pca"): object 'macrophage_pca_norm' not found
macrophage_pca_norm <- sm(normalize_expt(macrophage_pca_norm, convert="cpm",
                                         norm="quant", transform="log2",
                                         filter=TRUE))
## Error in normalize_expt(macrophage_pca_norm, convert = "cpm", norm = "quant", : object 'macrophage_pca_norm' not found
macrophage_pca_met <- sm(graph_metrics(macrophage_pca_norm))
## Error in corheat$plot: $ operator is invalid for atomic vectors
macrophage_pca_met$pcaplot
## Error in eval(expr, envir, enclos): object 'macrophage_pca_met' not found

1.4 Attempting count extraction from human surrogates

If indeed the human surrogates are a primary determinant in the parasite data, then we should be able to see that in some sample estimations of the parasite data. Let us try!

surrogate_estimations <- new.env()
load("savefiles/macrophage_estimation_human.rda.xz", envir=surrogate_estimations)
surrogate_estimations <- surrogate_estimations$surrogate_estimations
summary(surrogate_estimations)
##                         Length Class        Mode   
## pca_adjust               7     -none-       list   
## sva_supervised_adjust    7     -none-       list   
## sva_unsupervised_adjust  7     -none-       list   
## ruv_supervised_adjust    7     -none-       list   
## ruv_residual_adjust      7     -none-       list   
## ruv_empirical_adjust     7     -none-       list   
## adjustments              8     -none-       list   
## correlations            64     -none-       numeric
## plot                     3     recordedplot list   
## pca_plots                7     -none-       list   
## catplots                 0     -none-       NULL
adjusts <- surrogate_estimations$sva_supervised_adjust$model_adjust
## Remove surrogate estimates #1 and #9
new_adjusts <- as.matrix(adjusts[c(-1, -9), ])
## The question of course,  how do I use these estimations to adjust the data?

test_adjusted <- counts_from_surrogates(macrophage_parasite, new_adjusts)
## Error in counts_from_surrogates(macrophage_parasite, new_adjusts): object 'macrophage_parasite' not found
sv_adjusted <- macrophage_parasite
## Error in eval(expr, envir, enclos): object 'macrophage_parasite' not found
Biobase::exprs(sv_adjusted$expressionset) <- test_adjusted
## Error in eval(expr, envir, enclos): object 'test_adjusted' not found
human_adjusted <- subset_expt(sv_adjusted, subset="sampleid!='HPGL0248'")
## Error in expt_subset(...): object 'sv_adjusted' not found
human_adjusted_norm <- sm(normalize_expt(human_adjusted, convert="cpm", filter="simple",
                                         batch="sva", norm="quant", transform="log2"))
## Error in normalize_expt(human_adjusted, convert = "cpm", filter = "simple", : object 'human_adjusted' not found
human_adjusted_norm_metrics <- graph_metrics(human_adjusted_norm)
## Graphing number of non-zero genes with respect to CPM by library.
## Graphing library sizes.
## Warning in plot_libsize(expt, title = libsize_title, ...): restarting
## interrupted promise evaluation
## Graphing a boxplot.
## Warning in plot_boxplot(expt, title = boxplot_title, ...): restarting
## interrupted promise evaluation
## Graphing a correlation heatmap.
## Warning in plot_heatmap(expt_data, expt_colors = expt_colors, expt_design =
## expt_design, : restarting interrupted promise evaluation
## Graphing a standard median correlation.
## Warning in plot_sm(expt, method = cormethod, title = smc_title, ...):
## restarting interrupted promise evaluation
## Graphing a distance heatmap.
## Warning in plot_heatmap(expt_data, expt_colors = expt_colors, expt_design =
## expt_design, : restarting interrupted promise evaluation
## Graphing a standard median distance.
## Warning in plot_sm(expt, method = distmethod, title = smd_title, ...):
## restarting interrupted promise evaluation
## Graphing a PCA plot.
## Warning in plot_pca(expt, title = pca_title, ...): restarting interrupted
## promise evaluation
## Plotting a density plot.
## Warning in plot_density(expt, title = dens_title): restarting interrupted
## promise evaluation
## Printing a color to condition legend.
## Error in corheat$plot: $ operator is invalid for atomic vectors
sv_adjusted <- subset_expt(sv_adjusted, subset="sampleid!='HPGL0248'")
## Error in expt_subset(...): object 'sv_adjusted' not found
sv_adjusted_norm <- sm(normalize_expt(sv_adjusted, convert="cpm", filter="simple",
                                      batch="sva", norm="quant", transform="log2"))
## Error in normalize_expt(sv_adjusted, convert = "cpm", filter = "simple", : object 'sv_adjusted' not found
sv_adjusted_norm_metrics <- sm(graph_metrics(sv_adjusted_norm))
## Error in corheat$plot: $ operator is invalid for atomic vectors
macropara_metrics <- sm(graph_metrics(macrophage_parasite))
## Error in corheat$plot: $ operator is invalid for atomic vectors
macropara_norm <- sm(normalize_expt(macrophage_parasite, convert="cpm", filter=TRUE, norm="quant"))
## Error in normalize_expt(macrophage_parasite, convert = "cpm", filter = TRUE, : object 'macrophage_parasite' not found
macropara_norm_metrics <- sm(graph_metrics(macropara_norm))
## Error in corheat$plot: $ operator is invalid for atomic vectors
macropara_normsva <- sm(normalize_expt(macrophage_parasite, convert="cpm", filter=TRUE, norm="quant",
                                       batch="fsva", transform="log2"))
## Error in normalize_expt(macrophage_parasite, convert = "cpm", filter = TRUE, : object 'macrophage_parasite' not found
macropara_normsva_metrics <- sm(graph_metrics(macropara_normsva))
## Error in corheat$plot: $ operator is invalid for atomic vectors
macropara_normcombat <- sm(normalize_expt(macrophage_parasite, convert="cpm", filter=TRUE,
                                          norm="quant", batch="combat_scale", transform="log2"))
## Error in normalize_expt(macrophage_parasite, convert = "cpm", filter = TRUE, : object 'macrophage_parasite' not found
macropara_normcombat_metrics <- sm(graph_metrics(macropara_normcombat))
## Error in corheat$plot: $ operator is invalid for atomic vectors
macropara_normsvaseq <- sm(normalize_expt(macrophage_parasite, convert="cpm", filter="cbcbseq",
                                          transform="log2", batch="svaseq"))
## Error in normalize_expt(macrophage_parasite, convert = "cpm", filter = "cbcbseq", : object 'macrophage_parasite' not found
macropara_normsvaseq_metrics <- sm(graph_metrics(macropara_normsvaseq))
## Error in corheat$plot: $ operator is invalid for atomic vectors

2 Compare results

Now lets compare the results of using the human data to inform the parasite estimations.

2.1 Before including the new estimates

Lets first print a few using different metrics before using the human SV estimates. It looks like sva etc are not too bad.

macropara_norm_metrics$pcaplot
## Error in eval(expr, envir, enclos): object 'macropara_norm_metrics' not found
## Just normalizing shows the central concern.
macropara_normsva_metrics$pcaplot
## Error in eval(expr, envir, enclos): object 'macropara_normsva_metrics' not found
## Not too bad
macropara_normcombat_metrics$pcaplot
## Error in eval(expr, envir, enclos): object 'macropara_normcombat_metrics' not found
## Combat cannot deal with this data very well.
macropara_normsvaseq_metrics$pcaplot
## Error in eval(expr, envir, enclos): object 'macropara_normsvaseq_metrics' not found
## svaseq gets a little confused, but not bad.
macropara_normsvaseq_metrics$pcaplot
## Error in eval(expr, envir, enclos): object 'macropara_normsvaseq_metrics' not found

2.2 Using human SVA data

Seems pretty good to me.

sv_adjusted_norm_metrics$pcaplot
## Error in eval(expr, envir, enclos): object 'sv_adjusted_norm_metrics' not found
## This is at least as good as the other methods.

index.html

3 Variance Partition

In the following, I will attempt to find the variance associated with different experimental factors in the macrophage data with mappings against the parasite transcriptome.

3.1 Setting up

As in macrophage_estimation_parasite.html, I need to pull out the relevant samples:

4 Perform variancePartition analyses

4.1 Condition + batch

Start out with just condition and batch

## The default arguments are to query ~ condition + (1|batch)
## Which, because condition is categorical, will end badly.
vp_cb <- sm(varpart(macrophage_parasite, predictor=NULL, factors=c("condition","batch")))
## Error in is.factor(x): object 'macrophage_parasite' not found
vp_cb$percent_plot
## Error in eval(expr, envir, enclos): object 'vp_cb' not found
vp_cb$partition_plot
## Error in eval(expr, envir, enclos): object 'vp_cb' not found
vp_norm <- sm(varpart(macropara_norm, predictor=NULL, factors=c("condition","batch")))
## Error in is.factor(x): object 'macropara_norm' not found
vp_norm$percent_plot
## Error in eval(expr, envir, enclos): object 'vp_norm' not found
vp_norm$partition_plot
## Error in eval(expr, envir, enclos): object 'vp_norm' not found
vp_normsva <- sm(varpart(macropara_normsva, predictor=NULL, factors=c("condition","batch")))
## Error in is.factor(x): object 'macropara_normsva' not found
vp_normsva$percent_plot
## Error in eval(expr, envir, enclos): object 'vp_normsva' not found
vp_normsva$partition_plot
## Error in eval(expr, envir, enclos): object 'vp_normsva' not found

index.html annotation.html infection_estimation.html

---
title: "RNAseq of L.panamensis:  Parasite Macrophage Sample Estimation."
author: "atb abelew@gmail.com"
date: "`r Sys.Date()`"
output:
 html_document:
  code_download: true
  code_folding: show
  fig_caption: true
  fig_height: 7
  fig_width: 7
  highlight: tango
  keep_md: false
  mode: selfcontained
  number_sections: true
  self_contained: true
  theme: cosmo
  toc: true
  toc_float:
   collapsed: false
   smooth_scroll: false
---

<style>
  <!-- Document prelude revision 2016-10 -->
  body .main-container {
    max-width: 1600px;
}
</style>

```{r options, include=FALSE}
## These are the options I tend to favor
library("hpgltools")
tt <- sm(devtools::load_all("~/hpgltools"))
knitr::opts_knit$set(
    progress = TRUE,
    verbose = TRUE,
    width = 90,
    echo = TRUE)
knitr::opts_chunk$set(
    error = TRUE,
    fig.width = 8,
    fig.height = 8,
    dpi = 96)
options(
    digits = 4,
    stringsAsFactors = FALSE,
    knitr.duplicate.label = "allow")
ggplot2::theme_set(ggplot2::theme_bw(base_size=10))
set.seed(1)
rmd_file <- "macrophage_estimation_parasite.Rmd"
ver <- "20170202"
```

[index.html](index.html)

```{r rendering, include=FALSE, eval=FALSE}
## This block is used to render a document from within it.
rmarkdown::render(rmd_file)

## An extra renderer for pdf output
rmarkdown::render(rmd_file, output_format="pdf_document", output_options=c("skip_html"))
## Or to save/load large Rdata files.
hpgltools:::saveme()
hpgltools:::loadme()
rm(list=ls())
```

This document is concerned with analyzing RNAseq data of human and parasite during the infection of
human macrophages.  A few different strains of L. panamensis were used; the experiment is therefore
segregated into groups named 'self-healing', 'chronic', and 'uninfected'.  Two separate sets of
libraries were generated, one earlier set with greater coverage, and a later set including only 1
uninfected sample, and 2-3 chronic samples.

```{r loadme, include=FALSE}
tmp <- sm(loadme(filename="annotation.rda.xz"))
```

# Sample metrics

I just realized that what I need to do is to take the surrogate estimates from the human experiment.
Using those I think I will be able to get a more realistic estimate of what is going on.

## But first, attempt a normal estimation

```{r parasite_estimates_normal, fig.show="hide"}
macrophage_parasite <- expt_subset(parasite_expt, subset="experimentname=='macrophage'")
chosen_colors <- c("#990000", "#000099")
names(chosen_colors) <- c("macro_ch", "macro_sh")
macrophage_parasite <- set_expt_colors(macrophage_parasite, colors=chosen_colors)
testing <- sm(write_expt(macrophage_parasite,
                         excel=paste0("excel/macrophage_parasite_data-v", ver, ".xlsx"),
                         violin=TRUE)
```

Now lets try removing some of the surrogates, the two primary candidates are the strains
which I proxy with 3 columns: snpclade, snpcladev2, and snpcladev3; and the batch.  In this
data set batch is either a or b.

In theory, sva should pick up both of those in one invocation.

```{r generate_graphs, fig.show="hide"}
## Perform a 'normal' sva
macrophage_sva_norm <- sm(normalize_expt(macrophage_parasite, batch="svaseq", filter=TRUE))
## Play with the normalization of it
macrophage_sva_norm <- sm(normalize_expt(macrophage_sva_norm, filter=TRUE, convert="cpm",
                                         norm="quant", transform="log2"))
macrophage_sva_met <- sm(graph_metrics(macrophage_sva_norm))
## Try limma's method
macrophage_limma_norm <- sm(normalize_expt(macrophage_parasite, batch="limma", filter=TRUE,
                                           batch2="snpcladev3", convert="cpm", norm="quant"))
macrophage_limma_norm <- sm(normalize_expt(macrophage_limma_norm, convert="cpm",
                                           norm="quant", filter=TRUE))
macrophage_limma_met <- sm(graph_metrics(macrophage_limma_norm))
```

## Start with a pca from sva

```{r sva_attempt}
macrophage_sva_met$pcaplot
```

## Try limma's removebatcheffect

Another method might be to try using limmaresid to explicitly pull both columns.

```{r resid_attempt}
macrophage_limma_met$pcaplot
```

Another method might be to try pca on two separate invocations.

```{r perform_residuals, fig.show=FALSE}
macrophage_pca_norm <- sm(normalize_expt(macrophage_parasite, batch="pca", filter=TRUE))
macrophage_pca_norm <- set_expt_batch(expt=macrophage_pca_norm, fact="snpcladev3")
macrophage_pca_norm <- sm(normalize_expt(macrophage_pca_norm, batch="pca"))
macrophage_pca_norm <- sm(normalize_expt(macrophage_pca_norm, convert="cpm",
                                         norm="quant", transform="log2",
                                         filter=TRUE))
macrophage_pca_met <- sm(graph_metrics(macrophage_pca_norm))
```

```{r pca_attempt}
macrophage_pca_met$pcaplot
```

## Attempting count extraction from human surrogates

If indeed the human surrogates are a primary determinant in the parasite data, then we should
be able to see that in some sample estimations of the parasite data.  Let us try!

```{r parasite_estimation_human_surrogates, fig.show="hide"}
surrogate_estimations <- new.env()
load("savefiles/macrophage_estimation_human.rda.xz", envir=surrogate_estimations)
surrogate_estimations <- surrogate_estimations$surrogate_estimations
summary(surrogate_estimations)
adjusts <- surrogate_estimations$sva_supervised_adjust$model_adjust
## Remove surrogate estimates #1 and #9
new_adjusts <- as.matrix(adjusts[c(-1, -9), ])
## The question of course,  how do I use these estimations to adjust the data?

test_adjusted <- counts_from_surrogates(macrophage_parasite, new_adjusts)
sv_adjusted <- macrophage_parasite
Biobase::exprs(sv_adjusted$expressionset) <- test_adjusted

human_adjusted <- subset_expt(sv_adjusted, subset="sampleid!='HPGL0248'")
human_adjusted_norm <- sm(normalize_expt(human_adjusted, convert="cpm", filter="simple",
                                         batch="sva", norm="quant", transform="log2"))
human_adjusted_norm_metrics <- graph_metrics(human_adjusted_norm)

sv_adjusted <- subset_expt(sv_adjusted, subset="sampleid!='HPGL0248'")
sv_adjusted_norm <- sm(normalize_expt(sv_adjusted, convert="cpm", filter="simple",
                                      batch="sva", norm="quant", transform="log2"))
sv_adjusted_norm_metrics <- sm(graph_metrics(sv_adjusted_norm))

macropara_metrics <- sm(graph_metrics(macrophage_parasite))
macropara_norm <- sm(normalize_expt(macrophage_parasite, convert="cpm", filter=TRUE, norm="quant"))
macropara_norm_metrics <- sm(graph_metrics(macropara_norm))

macropara_normsva <- sm(normalize_expt(macrophage_parasite, convert="cpm", filter=TRUE, norm="quant",
                                       batch="fsva", transform="log2"))
macropara_normsva_metrics <- sm(graph_metrics(macropara_normsva))

macropara_normcombat <- sm(normalize_expt(macrophage_parasite, convert="cpm", filter=TRUE,
                                          norm="quant", batch="combat_scale", transform="log2"))
macropara_normcombat_metrics <- sm(graph_metrics(macropara_normcombat))

macropara_normsvaseq <- sm(normalize_expt(macrophage_parasite, convert="cpm", filter="cbcbseq",
                                          transform="log2", batch="svaseq"))
macropara_normsvaseq_metrics <- sm(graph_metrics(macropara_normsvaseq))
```

# Compare results

Now lets compare the results of using the human data to inform the parasite estimations.

## Before including the new estimates

Lets first print a few using different metrics before using the human SV estimates.
It looks like sva etc are not too bad.

```{r compare_pca_among_estimates}
macropara_norm_metrics$pcaplot
## Just normalizing shows the central concern.
macropara_normsva_metrics$pcaplot
## Not too bad
macropara_normcombat_metrics$pcaplot
## Combat cannot deal with this data very well.
macropara_normsvaseq_metrics$pcaplot
## svaseq gets a little confused, but not bad.
macropara_normsvaseq_metrics$pcaplot
```

## Using human SVA data

Seems pretty good to me.

```{r using human sva}
sv_adjusted_norm_metrics$pcaplot
## This is at least as good as the other methods.
```

```{r saveme, include=FALSE}
tmp <- sm(saveme(filename="macrophage_estimation_parasite.rda.xz"))
```

[index.html](index.html)

# Variance Partition

In the following, I will attempt to find the variance associated with different experimental factors
in the macrophage data with mappings against the parasite transcriptome.

## Setting up

As in [macrophage_estimation_parasite.html](macrophage_estimation_parasite.html), I need to pull out the
relevant samples:

# Perform variancePartition analyses

## Condition + batch

Start out with just condition and batch

```{r varpart_condbatch}
## The default arguments are to query ~ condition + (1|batch)
## Which, because condition is categorical, will end badly.
vp_cb <- sm(varpart(macrophage_parasite, predictor=NULL, factors=c("condition","batch")))
vp_cb$percent_plot
vp_cb$partition_plot

vp_norm <- sm(varpart(macropara_norm, predictor=NULL, factors=c("condition","batch")))
vp_norm$percent_plot
vp_norm$partition_plot

vp_normsva <- sm(varpart(macropara_normsva, predictor=NULL, factors=c("condition","batch")))
vp_normsva$percent_plot
vp_normsva$partition_plot
```

[index.html](index.html) [annotation.html](annotation.html) [infection_estimation.html](infection_estimation.html)
