The goal of this project is to look for changes in the yeast transcriptome that result from one or more mutations in the CBF5 gene. CBF5 is responsible for the pseudouridylation of important ribosomal bases, and mutations in it lower the fidelity of the yeast ribosome with respect to programmed -1 ribosomal frameshifting (among other things). This document is intended to make it easier to reproduce and improve the analyses performed during an RNA sequencing experiment on 2 yeast strains.
The following are some requests I have received and whether or not I think I did them.
These are rmarkdown documents which make heavy use of the hpgltools package. The following section demonstrates how to set that up in a clean R environment.
## Use R's install.packages to install devtools.
install.packages("devtools")
## Use devtools to install hpgltools.
devtools::install_github("elsayedlab/hpgltools")
## Load hpgltools into the R environment.
library(hpgltools)
## Use hpgltools' autoloads_all() function to install the many packages used by hpgltools.
autoloads_all()
For some projects, I have been relying heavily on the Illumina iGenomes. They seem to me to be fairly consistent and well-annotated data sets for the species I have worked with so far.
http://support.illumina.com/sequencing/sequencing_software/igenome.html
I left a copy of the Ensembl data set in $LAB/ref_data/illumina/ and made symbolic links into $LAB/ref_data/scerevisiae/
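The linking step can be sketched in base R with `file.symlink()`. This is only an illustration: it uses a temporary directory as a stand-in for `$LAB` so that it runs anywhere, and `genes.gtf` is a hypothetical placeholder file name, not the actual iGenomes layout.

```r
# Stand-in for $LAB; in real use this would be Sys.getenv("LAB").
lab <- tempfile("lab_")
src_dir <- file.path(lab, "ref_data", "illumina")
dst_dir <- file.path(lab, "ref_data", "scerevisiae")
dir.create(src_dir, recursive = TRUE)
dir.create(dst_dir, recursive = TRUE)

# Hypothetical placeholder for one file from the downloaded data set.
src_file <- file.path(src_dir, "genes.gtf")
file.create(src_file)

# Create the symbolic link in the species directory, pointing back at the copy.
link <- file.path(dst_dir, "genes.gtf")
ok <- file.symlink(from = src_file, to = link)

# Sys.readlink() reports where the link points.
print(Sys.readlink(link))
```

Linking rather than copying keeps a single authoritative copy of the reference data while letting each species directory use its own naming.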
Some tools I use ask for .gff files, while others look for .gtf files. I have a converter (gff_convert.pl) which I am copying to the local bin/ directory.
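One of the main differences between the two formats is the ninth (attribute) column: GFF3 writes `key=value` pairs separated by semicolons, while GTF writes `key "value";` entries. The sketch below illustrates that difference in base R. It is not the logic of gff_convert.pl, which is not shown here, and the function name `gff_attrs_to_gtf` is hypothetical.

```r
# Convert one GFF3 attribute string into GTF-style attribute syntax.
# Illustration only: a real converter must also map keys, because GTF
# requires gene_id and transcript_id attributes.
gff_attrs_to_gtf <- function(attrs) {
  # Split "key=value;key=value" into its pairs and drop empty pieces.
  pairs <- strsplit(attrs, ";", fixed = TRUE)[[1]]
  pairs <- pairs[nzchar(pairs)]
  if (length(pairs) == 0) {
    return("")
  }
  kv <- strsplit(pairs, "=", fixed = TRUE)
  keys <- vapply(kv, `[`, character(1), 1)
  # Rejoin on "=" in case a value itself contains that character.
  values <- vapply(kv, function(x) paste(x[-1], collapse = "="), character(1))
  # GTF writes each attribute as: key "value";
  paste0(keys, ' "', values, '";', collapse = " ")
}

converted <- gff_attrs_to_gtf("ID=YLR175W;Name=CBF5")
print(converted)
```

YLR175W is the systematic name of CBF5; the attribute string itself is a made-up example rather than a line taken from the project's annotation files.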
In addition, I have recently been using a mix of the Bioconductor OrganismDbi/TxDb/OrgDb interfaces along with Ensembl's BioMart. The triumvirate of gff annotations, BioMart, and extant OrgDb instances provides a powerful, if confusing, combination.
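As a small example of the OrgDb side, the sketch below looks up CBF5 by its systematic name (YLR175W) through `AnnotationDbi::select()`. It assumes the Bioconductor package org.Sc.sgd.db as the yeast OrgDb instance, which this document does not name explicitly; the lookup is skipped when that package is not installed.

```r
# Systematic (ORF) name of CBF5.
orf <- "YLR175W"
annot <- NULL

# Assumption: org.Sc.sgd.db is the OrgDb instance in use for S. cerevisiae.
if (requireNamespace("org.Sc.sgd.db", quietly = TRUE) &&
    requireNamespace("AnnotationDbi", quietly = TRUE)) {
  # Map the ORF identifier to its gene name.
  annot <- AnnotationDbi::select(
    org.Sc.sgd.db::org.Sc.sgd.db,
    keys = orf,
    columns = "GENENAME",
    keytype = "ORF"
  )
  print(annot)
} else {
  message("org.Sc.sgd.db is not installed; skipping the OrgDb lookup.")
}
```

Running the same identifier through the gff annotations and BioMart is a quick way to see where the three sources disagree.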
library('pander')
pander(sessionInfo())
R version 3.3.1 (2016-06-21)
**Platform:** x86_64-pc-linux-gnu (64-bit)
locale: LC_CTYPE=en_US.UTF-8, LC_NUMERIC=C, LC_TIME=en_US.UTF-8, LC_COLLATE=en_US.UTF-8, LC_MONETARY=en_US.UTF-8, LC_MESSAGES=en_US.UTF-8, LC_PAPER=en_US.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.UTF-8 and LC_IDENTIFICATION=C
attached base packages: stats, graphics, grDevices, utils, datasets, methods and base
other attached packages: pander(v.0.6.0) and hpgltools(v.2016.02)
loaded via a namespace (and not attached): Rcpp(v.0.12.6), formatR(v.1.4), compiler(v.3.3.1), RColorBrewer(v.1.1-2), plyr(v.1.8.4), iterators(v.1.0.8), tools(v.3.3.1), testthat(v.1.0.2), digest(v.0.6.9), lattice(v.0.20-33), preprocessCore(v.1.35.0), evaluate(v.0.9), memoise(v.1.0.0), gtable(v.0.2.0), openxlsx(v.3.0.0), foreach(v.1.4.3), yaml(v.2.1.13), ggrepel(v.0.5), parallel(v.3.3.1), withr(v.1.0.2), stringr(v.1.0.0), roxygen2(v.5.0.1), knitr(v.1.13), gtools(v.3.5.0), devtools(v.1.12.0), locfit(v.1.5-9.1), grid(v.3.3.1), data.table(v.1.9.6), Biobase(v.2.33.0), R6(v.2.1.2), rmarkdown(v.1.0), limma(v.3.29.14), edgeR(v.3.15.2), ggplot2(v.2.1.0), reshape2(v.1.4.1), corpcor(v.1.6.8), magrittr(v.1.5), scales(v.0.4.0), codetools(v.0.2-14), htmltools(v.0.3.5), matrixStats(v.0.50.2), BiocGenerics(v.0.19.2), colorspace(v.1.2-6), labeling(v.0.3), stringi(v.1.1.1), munsell(v.0.4.3), chron(v.2.3-47) and crayon(v.1.3.2)