1 Sample Estimation: 20180410

2 L. major sample estimation

3 Creating expressionset(s)

lm_annotations <- lm_annotv2
lm_expt <- create_expt(metadata="sample_sheets/all_samples_201804.xlsx",
4.1 Look at metrics

Ok, before removing the uninfected samples, let's look at some metrics with them included and see if any worrisome patterns jump out.


## pink and green are the uninfected samples

## Still more reads than ideal for the uninfected samples, but way better than before.
## I feel like this will probably work?
## Let's see how they cluster before removing them and looking more seriously at the infected samples.

## wow not much variance is in either axis, interesting.
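
For reference, the per-axis variance quoted on a PCA plot is usually the squared standard deviation of each component divided by the total. The following is a minimal base R sketch of that arithmetic on made-up data; `toy_log` and its dimensions are hypothetical and this is not necessarily how the plotting function used in this worksheet computes its labels.

```r
# Toy log-scale expression matrix: rows are genes, columns are samples.
# The data are random and purely illustrative.
set.seed(1)
toy_log <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6,
                  dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))

# prcomp() expects observations (samples) in rows, so transpose.
pca <- prcomp(t(toy_log), center = TRUE, scale. = FALSE)

# Percent of total variance carried by each principal component.
pct_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)

# The first two entries correspond to the x and y axes of a PC1/PC2 plot.
round(pct_var[1:2], 1)
```

When both numbers are small, as noted above, no single axis dominates and the two-dimensional picture shows only a small slice of the structure in the data.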

## t-SNE suggests 1088 is suspicious

## It seems to me the clustering is actually beautiful, I bet the uninfected
## samples are just making it look bad.

4.2 Drop uninfected

Now let's get rid of the uninfected samples, try again, and see if I am right.

lmpara_expt <- subset_expt(lm_expt, subset="parasitesp=='yes'")
paranormal_batch <- normalize_expt(lmpara_expt, transform="log2", convert="cpm",
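
Conceptually, the subset followed by a cpm conversion and log2 transform amounts to something like the base R sketch below. It illustrates the arithmetic only and is not the hpgltools implementation; the `counts` matrix, the `meta` table, and the pseudocount of 1 are assumptions made for the example.

```r
# Hypothetical raw counts: 4 genes x 3 samples.
counts <- matrix(c(10, 0, 5, 85,
                   200, 50, 150, 600,
                   30, 30, 20, 20),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), c("s1", "s2", "s3")))

# Hypothetical sample sheet with a 'parasitesp' column like the one used above.
meta <- data.frame(sample = colnames(counts),
                   parasitesp = c("yes", "no", "yes"),
                   stringsAsFactors = FALSE)

# Keep only the samples flagged as containing parasite reads.
keep <- meta$parasitesp == "yes"
kept_counts <- counts[, keep, drop = FALSE]

# Counts per million: divide each column by its library size, times 1e6.
cpm <- t(t(kept_counts) / colSums(kept_counts)) * 1e6

# log2 transform; the pseudocount of 1 (an assumption) avoids log2(0).
log_cpm <- log2(cpm + 1)
log_cpm
```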
4.3 Show paranormal plots!




## hmm interesting




Well, it looks like I was not right. I don’t think I was entirely wrong either, but I was certainly not right. Let's see if varpart can pick out anything.

## ok, so my arbitrary creation of 'batch' out of the replicate number is useless.
## But damn there is a lot of unexplained variance in this data!
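
To make the idea of "explained" versus "unexplained" variance concrete, here is a rough single-gene sketch in base R. It uses a fixed-effects ANOVA as a simplified stand-in, which is an assumption on my part: variancePartition fits its own models and will not give identical numbers. The design, factor names, and simulated expression values are all hypothetical.

```r
# Hypothetical balanced design: 2 conditions x 4 'batches' (replicate numbers).
set.seed(2)
design <- data.frame(condition = factor(rep(c("a", "b"), each = 4)),
                     batch = factor(rep(1:4, times = 2)))

# Simulated expression for one gene: a condition effect plus noise, no batch effect.
expr <- 5 + 2 * (design$condition == "b") + rnorm(8, sd = 0.5)

# Fit a linear model and split the total sum of squares by term.
fit <- lm(expr ~ condition + batch, data = design)
ss <- anova(fit)[["Sum Sq"]]

# Fraction of variance attributed to each term; 'residual' is the unexplained part.
frac <- setNames(ss / sum(ss), c("condition", "batch", "residual"))
round(frac, 2)
```

A factor that contributes almost nothing, as the made-up 'batch' does in the comment above, would show a fraction near zero across most genes, while a large residual share means the modeled factors leave much of the variation unaccounted for.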

5 Write out the data

lm_fun <- write_expt(lmpara_expt, excel=paste0("excel/lmparasite_samples_written-v", ver, ".xlsx"),
                       filter=TRUE, norm="quant", convert="raw", transform="log2")
R version 3.4.4 (2018-03-15)

**Platform:** x86_64-pc-linux-gnu (64-bit)


attached base packages: stats, graphics, grDevices, utils, datasets, methods and base

other attached packages: ruv(v.0.9.7) and hpgltools(v.2018.03)

loaded via a namespace (and not attached): nlme(v.3.1-131.1), bitops(v.1.0-6), matrixStats(v.0.53.1), pbkrtest(v.0.4-7), devtools(v.1.13.5), bit64(v.0.9-7), doParallel(v.1.0.11), RColorBrewer(v.1.1-2), rprojroot(v.1.3-2), tools(v.3.4.4), backports(v.1.1.2), R6(v.2.2.2), KernSmooth(v.2.23-15), DBI(v.0.8), lazyeval(v.0.2.1), BiocGenerics(v.0.24.0), mgcv(v.1.8-23), colorspace(v.1.3-2), withr(v.2.1.2), gridExtra(v.2.3), bit(v.1.1-12), compiler(v.3.4.4), preprocessCore(v.1.40.0), Biobase(v.2.38.0), xml2(v.1.2.0), labeling(v.0.3), caTools(v.1.17.1), scales(v., readr(v.1.1.1), genefilter(v.1.60.0), quadprog(v.1.5-5), commonmark(v.1.4), stringr(v.1.3.0), digest(v.0.6.15), minqa(v.1.2.4), rmarkdown(v.1.9), variancePartition(v.1.8.1), colorRamps(v.2.3), base64enc(v.0.1-3), pkgconfig(v.2.0.1), htmltools(v.0.3.6), lme4(v.1.1-15), limma(v.3.34.9), rlang(v., RSQLite(v.2.0), BiocParallel(v.1.12.0), gtools(v.3.5.0), RCurl(v.1.95-4.10), magrittr(v.1.5), Matrix(v.1.2-12), Rcpp(v.0.12.16), munsell(v.0.4.3), S4Vectors(v.0.16.0), stringi(v.1.1.7), yaml(v.2.1.18), edgeR(v.3.20.9), MASS(v.7.3-49), gplots(v.3.0.1), Rtsne(v.0.13), plyr(v.1.8.4), grid(v.3.4.4), blob(v.1.1.0), parallel(v.3.4.4), gdata(v.2.18.0), ggrepel(v.0.7.0), lattice(v.0.20-35), splines(v.3.4.4), annotate(v.1.56.1), pander(v.0.6.1), hms(v.0.4.2), locfit(v.1.5-9.1), knitr(v.1.20), pillar(v.1.2.1), rjson(v.0.2.15), corpcor(v.1.6.9), reshape2(v.1.4.3), codetools(v.0.2-15), stats4(v.3.4.4), XML(v.3.98-1.10), evaluate(v.0.10.1), data.table(v.1.10.4-3), nloptr(v.1.0.4), foreach(v.1.4.4), gtable(v.0.2.0), ggplot2(v.2.2.1), openxlsx(v.4.0.17), xtable(v.1.8-2), roxygen2(v.6.0.1), survival(v.2.41-3), tibble(v.1.4.2), iterators(v.1.0.9), AnnotationDbi(v.1.40.0), memoise(v.1.1.0), IRanges(v.2.12.0), tximport(v.1.6.0), sva(v.3.26.0) and directlabels(v.2017.03.31)

this_save <- paste0(gsub(pattern="\\.Rmd", replace="", x=rmd_file), "-v", ver, ".rda.xz")
