TODO
- Perform analyses of the mouse transcriptomic data.
- Make 100% certain for myself that limma is working properly.
- Analysis for gut: As we discussed today, the first mix-up samples for uninfected and infected gut (dated 3/1/15) need to be compared and, if necessary, added to the second experiment for naïve and infected gut (4/1/16). Please provide us with additional analysis of differential expression for genes and pathways (or any routine analysis for such studies).
- Analysis of salivary glands: This has only been done for the second experiment (4/1/16), and we need an analysis similar to the one above.
- We could have transcripts that are unique to either gut or salivary gland, or present in both (or ones equally modulated by borrelial infection).
- Split sample estimation more (done)
- For sending to Utpal, remove all mention of old samples and the last 4 (done)
- Otherwise rerun analyses as-is and send along results. (done)
- KEGG of infected/uninfected for the 2 tissues.
- Ontology of infected/uninfected for the 2 tissues.
I hope this document makes it easy for anyone to redo or improve what I did. Most (if not all) of the preprocessing tasks covered here were performed with a small command-line utility that handles many of the repetitive tasks involved in preprocessing new data. It may be found here:
http://github.com/abelew/CYOA
It invokes all the various commands for me and passes them to the computer cluster. In doing this, it writes the commands to a series of shell scripts to (hopefully) make it easier to see what was actually done. With that in mind, here is what I have done so far.
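The main topics of this document are:
- Preprocessing (preprocessing.html): From raw sequencing data to count tables.
- Annotation (annotation.html): Collecting annotation data.
- Annotation mouse (annotation_mmusculus.html): Collecting annotation data.
- Sample Estimation (sample_estimation.html): Estimating batch effects etc. in the data.
- Sample Estimation mouse (sample_estimation_mmusculus.html): Estimating batch effects etc. in the data.
- Differential expression (differential_expression.html): Perform the DE analysis.
- Differential expression mouse (differential_expression_mmusculus.html): Perform the DE analysis.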
Installation and setup
These are rmarkdown documents which make heavy use of the hpgltools package. The following section demonstrates how to set that up in a clean R environment.
## Use R's install.packages to install devtools.
install.packages("devtools")
## Use devtools to install hpgltools.
devtools::install_github("elsayedlab/hpgltools")
## Load hpgltools into the R environment.
library(hpgltools)
## Use hpgltools' autoloads_all() function to install the many packages used by hpgltools.
autoloads_all()
library('pander')
pander(sessionInfo())
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
locale: LC_CTYPE=en_US.utf8, LC_NUMERIC=C, LC_TIME=en_US.utf8, LC_COLLATE=en_US.utf8, LC_MONETARY=en_US.utf8, LC_MESSAGES=en_US.utf8, LC_PAPER=en_US.utf8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.utf8 and LC_IDENTIFICATION=C
attached base packages: stats, graphics, grDevices, utils, datasets, methods and base
other attached packages: pander(v.0.6.1) and hpgltools(v.2018.03)
loaded via a namespace (and not attached): Rcpp(v.0.12.16), xml2(v.1.2.0), roxygen2(v.6.0.1), knitr(v.1.20), magrittr(v.1.5), devtools(v.1.13.5), BiocGenerics(v.0.24.0), munsell(v.0.4.3), colorspace(v.1.3-2), R6(v.2.2.2), rlang(v.0.2.0.9001), foreach(v.1.4.4), plyr(v.1.8.4), stringr(v.1.3.0), tools(v.3.4.4), parallel(v.3.4.4), grid(v.3.4.4), Biobase(v.2.38.0), data.table(v.1.10.4-3), gtable(v.0.2.0), withr(v.2.1.2), commonmark(v.1.4), htmltools(v.0.3.6), iterators(v.1.0.9), lazyeval(v.0.2.1), yaml(v.2.1.18), rprojroot(v.1.3-2), digest(v.0.6.15), tibble(v.1.4.2), ggplot2(v.2.2.1), base64enc(v.0.1-3), codetools(v.0.2-15), memoise(v.1.1.0), evaluate(v.0.10.1), rmarkdown(v.1.9), stringi(v.1.1.7), pillar(v.1.2.1), compiler(v.3.4.4), scales(v.0.5.0.9000) and backports(v.1.1.2)