This document is meant to make it easy for anyone to reproduce or improve on what I did. The logic has been split into a few pieces:
So far I have extended the 2…n analysis only to sequences of 2-10 nucleotides. This is done via a small Makefile, which is run as follows:
make
## The Makefile contains the following:
##.PHONY: clean all
##all: copy 2mers.html 3mers.html 4mers.html 5mers.html 6mers.html 7mers.html 8mers.html 9mers.html 10mers.html
##%.html: %.Rmd
## @rm -f $@ && echo "Compiling $<" ;
## @Rscript -e "setwd('$(dir $<)');\
## library('rmarkdown');\
## render('$(notdir $<)', output_format='html_document')" 2>$<.out 1>&2
##
##copy: 2mers.Rmd
## cp 2mers.Rmd 3mers.Rmd && sed -i s'/mers <- 2/mers <- 3/g' 3mers.Rmd
## cp 2mers.Rmd 4mers.Rmd && sed -i s'/mers <- 2/mers <- 4/g' 4mers.Rmd
## cp 2mers.Rmd 5mers.Rmd && sed -i s'/mers <- 2/mers <- 5/g' 5mers.Rmd
## cp 2mers.Rmd 6mers.Rmd && sed -i s'/mers <- 2/mers <- 6/g' 6mers.Rmd
## cp 2mers.Rmd 7mers.Rmd && sed -i s'/mers <- 2/mers <- 7/g' 7mers.Rmd
## cp 2mers.Rmd 8mers.Rmd && sed -i s'/mers <- 2/mers <- 8/g' 8mers.Rmd
## cp 2mers.Rmd 9mers.Rmd && sed -i s'/mers <- 2/mers <- 9/g' 9mers.Rmd
## cp 2mers.Rmd 10mers.Rmd && sed -i s'/mers <- 2/mers <- 10/g' 10mers.Rmd
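The `copy` target above repeats one `cp`/`sed` pair for each length. As a minimal sketch, the same step could be written as a loop. `make_copies` is a hypothetical helper and is not part of the original Makefile. It assumes the template is named `<N>mers.Rmd` and sets the length with a line of the form `mers <- N`.

```shell
#!/bin/sh
# Hypothetical helper (not in the original Makefile): create one copy of a
# template Rmd per length and rewrite its "mers <- N" assignment.
# Usage: make_copies TEMPLATE_LENGTH FIRST LAST
make_copies() {
    template_k="$1"
    first="$2"
    last="$3"
    template="${template_k}mers.Rmd"

    # Stop early if the template is not in the current directory.
    if [ ! -f "$template" ]; then
        echo "missing template: $template" >&2
        return 1
    fi

    k="$first"
    while [ "$k" -le "$last" ]; do
        # Never overwrite the template with itself.
        if [ "$k" -ne "$template_k" ]; then
            # Redirect to the new file rather than using "sed -i",
            # whose syntax differs between GNU and BSD sed.
            sed "s/mers <- ${template_k}/mers <- ${k}/g" "$template" > "${k}mers.Rmd"
        fi
        k=$((k + 1))
    done
}
```

Running `make_copies 2 3 10` in the directory that holds `2mers.Rmd` would produce `3mers.Rmd` through `10mers.Rmd`, which the `%.html` pattern rule can then render.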
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
locale: LC_CTYPE=en_US.utf8, LC_NUMERIC=C, LC_TIME=en_US.utf8, LC_COLLATE=en_US.utf8, LC_MONETARY=en_US.utf8, LC_MESSAGES=en_US.utf8, LC_PAPER=en_US.utf8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.utf8 and LC_IDENTIFICATION=C
attached base packages: stats, graphics, grDevices, utils, datasets, methods and base
other attached packages: pander(v.0.6.2) and hpgltools(v.2018.03)
loaded via a namespace (and not attached): Rcpp(v.0.12.17), compiler(v.3.5.1), pillar(v.1.3.0), plyr(v.1.8.4), bindr(v.0.1.1), iterators(v.1.0.10), tools(v.3.5.1), digest(v.0.6.15), evaluate(v.0.11), tibble(v.1.4.2), gtable(v.0.2.0), pkgconfig(v.2.0.1), rlang(v.0.2.1), foreach(v.1.4.4), yaml(v.2.1.19), parallel(v.3.5.1), bindrcpp(v.0.2.2), stringr(v.1.3.1), dplyr(v.0.7.6), knitr(v.1.20), rprojroot(v.1.3-2), grid(v.3.5.1), tidyselect(v.0.2.4), glue(v.1.3.0), data.table(v.1.11.4), Biobase(v.2.40.0), R6(v.2.2.2), rmarkdown(v.1.10), ggplot2(v.3.0.0), purrr(v.0.2.5), magrittr(v.1.5), backports(v.1.1.2), scales(v.0.5.0), codetools(v.0.2-15), htmltools(v.0.3.6), BiocGenerics(v.0.26.0), assertthat(v.0.2.0), colorspace(v.1.3-2), stringi(v.1.2.3), lazyeval(v.0.2.1), munsell(v.0.5.0) and crayon(v.1.3.4)