1 An analysis of data taken from RNAseq of Ixodes scapularis

2 TODO

  • 2017-04-20:
  1. Perform analyses of the mouse transcriptomic data.
  2. Make 100% certain for myself that limma is working properly.
  • 2016-12-13:
  1. Analysis for gut: As we discussed today, the first mix-up samples for uninfected and infected gut (dated 3/1/15) need to be compared and, if necessary, added to the second experiments for naïve and infected gut (4/1/16). Please provide us with additional analysis of differential expression for genes and pathways (or any routine analysis for such studies).
  2. Analysis of salivary glands: This has only been done for the second experiment (4/1/16), and we need a similar analysis to the one above.
  3. We could have transcripts that are unique to either gut or salivary gland, or present in both (or ones equally modulated by borrelial infection).
  • Previous:
  1. Split sample estimation more (done)
  2. For sending to Utpal, remove all mention of old samples and the last 4 (done)
  3. Otherwise rerun analyses as-is and send along results. (done)
  4. KEGG of infected/uninfected for the 2 tissues.
  5. Ontology of infected/uninfected for the 2 tissues.
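
Item 3 of the 2016-12-13 list (transcripts unique to one tissue or present in both) comes down to set operations on the transcript identifiers detected in each tissue. The following is only a minimal sketch of that idea: the identifiers are invented placeholders, not results from this experiment, and `partition_transcripts()` is a hypothetical helper rather than part of hpgltools.

```r
## Placeholder transcript IDs; these are NOT real I. scapularis results.
gut <- c("tx_a", "tx_b", "tx_c")
salivary <- c("tx_b", "tx_c", "tx_d")

## Hypothetical helper: split two ID vectors into tissue-specific and shared sets.
partition_transcripts <- function(gut, salivary) {
  list(
    gut_only = setdiff(gut, salivary),
    salivary_only = setdiff(salivary, gut),
    shared = intersect(gut, salivary)
  )
}

parts <- partition_transcripts(gut, salivary)
str(parts)
```

In practice the input vectors would presumably come from the filtered count tables or the differential expression results for each tissue, so the outcome depends on the detection threshold chosen there.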

I hope this document makes it easy for anyone to redo or improve what I did. Most (if not all) of the preprocessing tasks covered here are run through a small command line utility that handles many of the repetitive tasks performed when preprocessing new data. It may be found here:

http://github.com/abelew/CYOA

It invokes all the various commands for me and passes them to the computer cluster. In doing this, it writes the commands to a series of shell scripts to (hopefully) make it easier to see what was actually done. With that in mind, here is what I have done so far.
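
The pattern described above (write each command to a shell script, then hand the script to the cluster) can be illustrated in a few lines of R. This is a sketch of the general idea only and does not reproduce how CYOA is implemented; `write_job_script()` and its arguments are hypothetical names, and the submission step is left out because it depends on the cluster's scheduler.

```r
## Hypothetical helper: record a command in a shell script so it can be inspected later.
write_job_script <- function(command, script_path, job_name = "job") {
  lines <- c(
    "#!/usr/bin/env bash",
    paste0("## Job: ", job_name),
    "set -euo pipefail",
    command
  )
  writeLines(lines, script_path)
  ## Mark the script as executable (has no effect on Windows).
  Sys.chmod(script_path, mode = "0755")
  invisible(script_path)
}

## Example with a harmless placeholder command.
script <- write_job_script("echo 'hello from the example job'",
                           tempfile(fileext = ".sh"),
                           job_name = "example")
cat(readLines(script), sep = "\n")
```

Keeping the scripts on disk gives a plain-text record of exactly which commands were run, independent of whatever tool generated them.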

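The main topics of this are:

  • Preprocessing (preprocessing.html): From raw sequencing data to count tables.
  • Annotation (annotation.html): Collecting annotation data.
  • Annotation mouse (annotation_mmusculus.html): Collecting annotation data.
  • Sample Estimation (sample_estimation.html): Estimating batch effects etc. in the data.
  • Sample Estimation mouse (sample_estimation_mmusculus.html): Estimating batch effects etc. in the data.
  • Differential expression (differential_expression.html): Perform the DE analysis.
  • Differential expression mouse (differential_expression_mmusculus.html): Perform the DE analysis.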

3 Installation and setup

These are rmarkdown documents which make heavy use of the hpgltools package. The following section demonstrates how to set that up in a clean R environment.

## Use R's install.packages to install devtools.
install.packages("devtools")
## Use devtools to install hpgltools.
devtools::install_github("elsayedlab/hpgltools")
## Load hpgltools into the R environment.
library(hpgltools)
## Use hpgltools' autoloads_all() function to install the many packages used by hpgltools.
autoloads_all()

## Load pander and print the session information used to build this document.
library("pander")
pander(sessionInfo())

R version 3.4.4 (2018-03-15)

Platform: x86_64-pc-linux-gnu (64-bit)

locale: LC_CTYPE=en_US.utf8, LC_NUMERIC=C, LC_TIME=en_US.utf8, LC_COLLATE=en_US.utf8, LC_MONETARY=en_US.utf8, LC_MESSAGES=en_US.utf8, LC_PAPER=en_US.utf8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.utf8 and LC_IDENTIFICATION=C

attached base packages: stats, graphics, grDevices, utils, datasets, methods and base

other attached packages: pander(v.0.6.1) and hpgltools(v.2018.03)

loaded via a namespace (and not attached): Rcpp(v.0.12.16), xml2(v.1.2.0), roxygen2(v.6.0.1), knitr(v.1.20), magrittr(v.1.5), devtools(v.1.13.5), BiocGenerics(v.0.24.0), munsell(v.0.4.3), colorspace(v.1.3-2), R6(v.2.2.2), rlang(v.0.2.0.9001), foreach(v.1.4.4), plyr(v.1.8.4), stringr(v.1.3.0), tools(v.3.4.4), parallel(v.3.4.4), grid(v.3.4.4), Biobase(v.2.38.0), data.table(v.1.10.4-3), gtable(v.0.2.0), withr(v.2.1.2), commonmark(v.1.4), htmltools(v.0.3.6), iterators(v.1.0.9), lazyeval(v.0.2.1), yaml(v.2.1.18), rprojroot(v.1.3-2), digest(v.0.6.15), tibble(v.1.4.2), ggplot2(v.2.2.1), base64enc(v.0.1-3), codetools(v.0.2-15), memoise(v.1.1.0), evaluate(v.0.10.1), rmarkdown(v.1.9), stringi(v.1.1.7), pillar(v.1.2.1), compiler(v.3.4.4), scales(v.0.5.0.9000) and backports(v.1.1.2)
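
When setting this up on another machine, it may help to compare the locally installed package versions against the ones recorded above. The following is a minimal sketch under a few assumptions: `compare_versions()` is a hypothetical helper, only a handful of the listed packages are checked, and a version mismatch does not by itself mean the analysis will fail.

```r
## Versions copied from the session information above.
recorded <- c(pander = "0.6.1", knitr = "1.20", rmarkdown = "1.9", ggplot2 = "2.2.1")

## Hypothetical helper: report recorded vs. installed versions, NA when not installed.
compare_versions <- function(recorded) {
  pkgs <- names(recorded)
  installed <- vapply(pkgs, function(pkg) {
    if (requireNamespace(pkg, quietly = TRUE)) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    }
  }, character(1), USE.NAMES = FALSE)
  recorded <- unname(recorded)
  data.frame(
    package = pkgs,
    recorded = recorded,
    installed = installed,
    matches = !is.na(installed) & installed == recorded,
    stringsAsFactors = FALSE
  )
}

print(compare_versions(recorded))
```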
