1 Mus musculus sample estimation

2 Visualizing raw data

3 Normalize and visualize

4 Try a batch estimation

5 Write the expt

1 Mus musculus sample estimation

2 Visualizing raw data

There are lots of toys we have learned to use to play with with raw data and explore stuff like batch effects or non-canonical distributions or skewed counts. hpgltools provides some functionality to make this process easier. The graphs shown below and many more are generated with the wrapper ‘graph_metrics()’ but that takes away the chance to explain the graphs as I generate them.

mm_expt <- expt_exclude_genes(expt=mm_expt, column="Type", method="keep", patterns=c("protein_coding"))

## Before removal, there were 88198 entries.

## Now there are 78166 entries.

## Percent kept: 94.6918356393496, 95.6014030466883, 96.0192325721982, 94.1560724825351, 95.4277875550635, 95.5467362467663, 94.8747441594825, 95.3755486209204, 94.5476892156688, 94.7540455861558, 93.5291291731876, 93.0865081043525, 94.1866870128134, 94.3584950782656, 92.7195382093603, 94.3755866812252

## Percent removed: 5.30816436065035, 4.39859695331166, 3.98076742780181, 5.84392751746488, 4.57221244493648, 4.45326375323367, 5.12525584051753, 4.62445137907965, 5.45231078433121, 5.24595441384418, 6.47087082681239, 6.9134918956475, 5.81331298718665, 5.64150492173442, 7.2804617906397, 5.62441331877485

mm_filt <- normalize_expt(mm_expt, low_to_zero=TRUE)

## This function will replace the expt$expressionset slot with:

## raw(data)

## It backs up the current data into a slot named:
##  expt$backup_expressionset. It will also save copies of each step along the way
##  in expt$normalized with the corresponding libsizes. Keep the libsizes in mind
##  when invoking limma.  The appropriate libsize is the non-log(cpm(normalized)).
##  This is most likely kept at:
##  'new_expt$normalized$intermediate_counts$normalization$libsizes'
##  A copy of this may also be found at:
##  new_expt$best_libsize

## Filter is false, this should likely be set to something, good
##  choices include cbcb, kofa, pofa (anything but FALSE).  If you want this to
##  stay FALSE, keep in mind that if other normalizations are performed, then the
##  resulting libsizes are likely to be strange (potentially negative!)

## Leaving the data in its current base format, keep in mind that
##  some metrics are easier to see when the data is log2 transformed, but
##  EdgeR/DESeq do not accept transformed data.

## Leaving the data unconverted.  It is often advisable to cpm/rpkm
##  the data to normalize for sampling differences, keep in mind though that rpkm
##  has some annoying biases, and voom() by default does a cpm (though hpgl_voom()
##  will try to detect this).

## Leaving the data unnormalized.  This is necessary for DESeq, but
##  EdgeR/limma might benefit from normalization.  Good choices include quantile,
##  size-factor, tmm, etc.

## Not correcting the count-data for batch effects.  If batch is
##  included in EdgerR/limma's model, then this is probably wise; but in extreme
##  batch effects this is a good parameter to play with.

## Step 1: not doing count filtering.

## Step 2: not normalizing the data.

## Step 3: not converting the data.

## Step 4: not transforming the data.

## Step 5: not doing batch correction.

mm_filt <- normalize_expt(mm_filt, filter="cbcb")

## This function will replace the expt$expressionset slot with:

## cbcb(data)

## It backs up the current data into a slot named:
##  expt$backup_expressionset. It will also save copies of each step along the way
##  in expt$normalized with the corresponding libsizes. Keep the libsizes in mind
##  when invoking limma.  The appropriate libsize is the non-log(cpm(normalized)).
##  This is most likely kept at:
##  'new_expt$normalized$intermediate_counts$normalization$libsizes'
##  A copy of this may also be found at:
##  new_expt$best_libsize

## Leaving the data in its current base format, keep in mind that
##  some metrics are easier to see when the data is log2 transformed, but
##  EdgeR/DESeq do not accept transformed data.

## Leaving the data unconverted.  It is often advisable to cpm/rpkm
##  the data to normalize for sampling differences, keep in mind though that rpkm
##  has some annoying biases, and voom() by default does a cpm (though hpgl_voom()
##  will try to detect this).

## Leaving the data unnormalized.  This is necessary for DESeq, but
##  EdgeR/limma might benefit from normalization.  Good choices include quantile,
##  size-factor, tmm, etc.

## Not correcting the count-data for batch effects.  If batch is
##  included in EdgerR/limma's model, then this is probably wise; but in extreme
##  batch effects this is a good parameter to play with.

## Step 1: performing count filter with option: cbcb

## Removing 40835 low-count genes (37331 remaining).

## Step 2: not normalizing the data.

## Step 3: not converting the data.

## Step 4: not transforming the data.

## Step 5: not doing batch correction.

mm_libsize <- plot_libsize(mm_filt)

## The scale difference between the smallest and largest
##    libraries is > 10. Assuming a log10 scale is better, set scale=FALSE if not.

mm_libsize

mm_nonzero <- plot_nonzero(mm_filt)
mm_nonzero

mm_density <- plot_density(mm_filt) + ggplot2::scale_x_continuous(limits=c(0, 5))

## This data will benefit from being displayed on the log scale.

## If this is not desired, set scale='raw'

## Some entries are 0.  We are on log scale, setting them to 0.5.

## Changed 280477 zero count features.

## Scale for 'x' is already present. Adding another scale for 'x', which will replace the
## existing scale.

mm_density

mm_boxplot <- plot_boxplot(mm_filt)

## This data will benefit from being displayed on the log scale.

## If this is not desired, set scale='raw'

## Some entries are 0.  We are on log scale, setting them to 0.5.

## Changed 280477 zero count features.

mm_boxplot

3 Normalize and visualize

mm_norm <- sm(normalize_expt(mm_expt, transform="log2", norm="quant", convert="cpm", filter=TRUE))
mm_metrics <- sm(graph_metrics(mm_norm))

mm_metrics$pcaplot

##mm_varpart <- varpart(expt=mm_expt, predictor=NULL, factors=c("condition", "batch"))
##mm_varpart$partition_plot

The data should now be normalized, lets view some metrics post-facto.

mm_metrics$corheat

mm_metrics$smc

mm_metrics$pcaplot

4 Try a batch estimation

mm_sva <- sm(normalize_expt(mm_expt, transform="log2", norm="quant",
                             filter=TRUE, batch="ssva", low_to_zero=TRUE))
mm_sva_metrics <- graph_metrics(mm_sva)

## Graphing number of non-zero genes with respect to CPM by library.

## Graphing library sizes.

## Graphing a boxplot.

## This data will benefit from being displayed on the log scale.

## If this is not desired, set scale='raw'

## Some entries are 0.  We are on log scale, setting them to 0.5.

## Changed 111759 zero count features.

## Graphing a correlation heatmap.

## Graphing a standard median correlation.

## Performing correlation.

## Graphing a distance heatmap.

## Graphing a standard median distance.

## Performing distance.

## Graphing a PCA plot.

## Plotting a density plot.

## This data will benefit from being displayed on the log scale.

## If this is not desired, set scale='raw'

## Some entries are 0.  We are on log scale, setting them to 0.5.

## Changed 111759 zero count features.

## Printing a color to condition legend.

mm_sva_metrics$pcaplot

mm_fsva <- sm(normalize_expt(mm_expt, transform="log2", norm="quant",
                              filter=TRUE, batch="fsva", low_to_zero=TRUE))
mm_fsva_metrics <- graph_metrics(mm_fsva)

## Graphing number of non-zero genes with respect to CPM by library.

## Graphing library sizes.

## Graphing a boxplot.

## Graphing a correlation heatmap.

## Graphing a standard median correlation.

## Performing correlation.

## Graphing a distance heatmap.

## Graphing a standard median distance.

## Performing distance.

## Graphing a PCA plot.

## Plotting a density plot.

## Printing a color to condition legend.

mm_fsva_metrics$pcaplot

5 Write the expt

mm_first <- write_expt(mm_expt, excel=paste0("excel/mm_samples_written_default-v", ver, ".xlsx"),
                       filter=TRUE, norm="quant", convert="cpm", batch="raw", transform="log2")

## Writing the legend.

## The sheet: legend is in legend.

## Writing the raw reads.

## Graphing the raw reads.

## The sheet: raw_graphs is in legend, raw_reads, raw_graphs.

## The sheet: raw_graphs is in legend, raw_reads, raw_graphs.

## Writing the normalized reads.

## Graphing the normalized reads.

## The sheet: norm_graphs is in legend, raw_reads, raw_graphs, norm_data, norm_graphs.

## The sheet: norm_graphs is in legend, raw_reads, raw_graphs, norm_data, norm_graphs.

## Writing the median reads by factor.

## The factor pmn_inf_12hr has 4 rows.

## The factor pmn_inf_14d has 4 rows.

## The factor pmn_uninf_12hr has 4 rows.

## The factor pmn_uninf_14d has 4 rows.

mm_second <- write_expt(mm_expt, excel=paste0("excel/mm_samples_written_ssva-v", ver, ".xlsx"),
                        filter=TRUE, norm="quant", convert="raw", batch="ssva", transform="log2")

## Writing the legend.

## The sheet: legend is in legend.

## Writing the raw reads.

## Graphing the raw reads.

## The sheet: raw_graphs is in legend, raw_reads, raw_graphs.

## The sheet: raw_graphs is in legend, raw_reads, raw_graphs.

## Writing the normalized reads.

## Graphing the normalized reads.

## The sheet: norm_graphs is in legend, raw_reads, raw_graphs, norm_data, norm_graphs.

## The sheet: norm_graphs is in legend, raw_reads, raw_graphs, norm_data, norm_graphs.

## Writing the median reads by factor.

## The factor pmn_inf_12hr has 4 rows.

## The factor pmn_inf_14d has 4 rows.

## The factor pmn_uninf_12hr has 4 rows.

## The factor pmn_uninf_14d has 4 rows.

mm_third <- write_expt(mm_expt, excel=paste0("excel/mm_samples_written_fsva-v", ver, ".xlsx"),
                       filter=TRUE, norm="quant", convert="raw", batch="fsva", transform="log2")

## Writing the legend.

## The sheet: legend is in legend.

## Writing the raw reads.

## Graphing the raw reads.

## The sheet: raw_graphs is in legend, raw_reads, raw_graphs.

## The sheet: raw_graphs is in legend, raw_reads, raw_graphs.

## Writing the normalized reads.

## Graphing the normalized reads.

## The sheet: norm_graphs is in legend, raw_reads, raw_graphs, norm_data, norm_graphs.

## The sheet: norm_graphs is in legend, raw_reads, raw_graphs, norm_data, norm_graphs.

## Writing the median reads by factor.

## The factor pmn_inf_12hr has 4 rows.

## The factor pmn_inf_14d has 4 rows.

## The factor pmn_uninf_12hr has 4 rows.

## The factor pmn_uninf_14d has 4 rows.

pander::pander(sessionInfo())

R version 3.3.3 (2017-03-06)

**Platform:** x86_64-pc-linux-gnu (64-bit)

locale: LC_CTYPE=en_US.utf8, LC_NUMERIC=C, LC_TIME=en_US.utf8, LC_COLLATE=en_US.utf8, LC_MONETARY=en_US.utf8, LC_MESSAGES=en_US.utf8, LC_PAPER=en_US.utf8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.utf8 and LC_IDENTIFICATION=C

attached base packages: stats, graphics, grDevices, utils, datasets, methods and base

other attached packages: ruv(v.0.9.6) and hpgltools(v.2017.01)

loaded via a namespace (and not attached): ggrepel(v.0.6.5), Rcpp(v.0.12.11), locfit(v.1.5-9.1), lattice(v.0.20-35), corpcor(v.1.6.9), gtools(v.3.5.0), rprojroot(v.1.2), digest(v.0.6.12), foreach(v.1.4.3), R6(v.2.2.2), plyr(v.1.8.4), backports(v.1.1.0), stats4(v.3.3.3), RSQLite(v.2.0), evaluate(v.0.10.1), sva(v.3.22.0), ggplot2(v.2.2.1), rlang(v.0.1.1), lazyeval(v.0.2.0), data.table(v.1.10.4), annotate(v.1.52.1), blob(v.1.1.0), S4Vectors(v.0.12.2), Matrix(v.1.2-10), preprocessCore(v.1.36.0), rmarkdown(v.1.6), splines(v.3.3.3), labeling(v.0.3), devtools(v.1.13.2), pander(v.0.6.0), stringr(v.1.2.0), RCurl(v.1.95-4.8), bit(v.1.1-12), munsell(v.0.4.3), BiocGenerics(v.0.20.0), base64enc(v.0.1-3), mgcv(v.1.8-17), htmltools(v.0.3.6), tibble(v.1.3.3), roxygen2(v.6.0.1), edgeR(v.3.16.5), IRanges(v.2.8.2), codetools(v.0.2-15), matrixStats(v.0.52.2), XML(v.3.98-1.9), crayon(v.1.3.2), withr(v.1.0.2), bitops(v.1.0-6), commonmark(v.1.2), grid(v.3.3.3), nlme(v.3.1-131), xtable(v.1.8-2), gtable(v.0.2.0), DBI(v.0.7), magrittr(v.1.5), scales(v.0.4.1), stringi(v.1.1.5), reshape2(v.1.4.2), genefilter(v.1.56.0), testthat(v.1.0.2), limma(v.3.30.13), xml2(v.1.1.1), openxlsx(v.4.0.17), RColorBrewer(v.1.1-2), iterators(v.1.0.8), tools(v.3.3.3), bit64(v.0.9-7), Biobase(v.2.34.0), parallel(v.3.3.3), survival(v.2.41-3), yaml(v.2.1.14), AnnotationDbi(v.1.36.2), colorspace(v.1.3-2), memoise(v.1.1.0) and knitr(v.1.16)

this_save <- paste0(gsub(pattern="\\.Rmd", replace="", x=rmd_file), "-v", ver, ".rda.xz")
message(paste0("Saving to ", this_save))

## Saving to 02_sample_estimation_mmusculus-v20170706.rda.xz

tt <- sm(saveme(filename=this_save))

LS0tCnRpdGxlOiAiTS4gbXVzY3VsdXMgKHVuKWluZmVjdGVkIHNhbXBsZSBlc3RpbWF0aW9uLiIKYXV0aG9yOiAiYXRiIGFiZWxld0BnbWFpbC5jb20iCmRhdGU6ICJgciBTeXMuRGF0ZSgpYCIKb3V0cHV0OgogaHRtbF9kb2N1bWVudDoKICBjb2RlX2Rvd25sb2FkOiB0cnVlCiAgY29kZV9mb2xkaW5nOiBzaG93CiAgZmlnX2NhcHRpb246IHRydWUKICBmaWdfaGVpZ2h0OiA3CiAgZmlnX3dpZHRoOiA3CiAgaGlnaGxpZ2h0OiBkZWZhdWx0CiAga2VlcF9tZDogZmFsc2UKICBtb2RlOiBzZWxmY29udGFpbmVkCiAgbnVtYmVyX3NlY3Rpb25zOiB0cnVlCiAgc2VsZl9jb250YWluZWQ6IHRydWUKICB0aGVtZTogcmVhZGFibGUKICB0b2M6IHRydWUKICB0b2NfZmxvYXQ6CiAgICBjb2xsYXBzZWQ6IGZhbHNlCiAgICBzbW9vdGhfc2Nyb2xsOiBmYWxzZQotLS0KCjxzdHlsZT4KICBib2R5IC5tYWluLWNvbnRhaW5lciB7CiAgICBtYXgtd2lkdGg6IDE2MDBweDsKfQo8L3N0eWxlPgoKYGBge3Igb3B0aW9ucywgaW5jbHVkZT1GQUxTRX0KbGlicmFyeSgiaHBnbHRvb2xzIikKdHQgPC0gZGV2dG9vbHM6OmxvYWRfYWxsKCJ+L2hwZ2x0b29scyIpCmtuaXRyOjpvcHRzX2tuaXQkc2V0KHByb2dyZXNzPVRSVUUsCiAgICAgICAgICAgICAgICAgICAgIHZlcmJvc2U9VFJVRSwKICAgICAgICAgICAgICAgICAgICAgd2lkdGg9OTAsCiAgICAgICAgICAgICAgICAgICAgIGVjaG89VFJVRSkKa25pdHI6Om9wdHNfY2h1bmskc2V0KGVycm9yPVRSVUUsCiAgICAgICAgICAgICAgICAgICAgICBmaWcud2lkdGg9OCwKICAgICAgICAgICAgICAgICAgICAgIGZpZy5oZWlnaHQ9OCwKICAgICAgICAgICAgICAgICAgICAgIGRwaT05NikKb2xkX29wdGlvbnMgPC0gb3B0aW9ucyhkaWdpdHM9NCwKICAgICAgICAgICAgICAgICAgICAgICBzdHJpbmdzQXNGYWN0b3JzPUZBTFNFLAogICAgICAgICAgICAgICAgICAgICAgIGtuaXRyLmR1cGxpY2F0ZS5sYWJlbD0iYWxsb3ciKQpnZ3Bsb3QyOjp0aGVtZV9zZXQoZ2dwbG90Mjo6dGhlbWVfYncoYmFzZV9zaXplPTEwKSkKc2V0LnNlZWQoMSkKdmVyIDwtICIyMDE3MDcwNiIKcHJldmlvdXNfZmlsZSA8LSAiMDFfYW5ub3RhdGlvbl9tbXVzY3VsdXMuUm1kIgoKdG1wIDwtIHRyeShzbShsb2FkbWUoZmlsZW5hbWU9cGFzdGUwKGdzdWIocGF0dGVybj0iXFwuUm1kIiwgcmVwbGFjZT0iIiwgeD1wcmV2aW91c19maWxlKSwgIi12IiwgdmVyLCAiLnJkYS54eiIpKSkpCgpybWRfZmlsZSA8LSAiMDJfc2FtcGxlX2VzdGltYXRpb25fbW11c2N1bHVzLlJtZCIKYGBgCgpgYGB7ciByZW5kZXIsIGV2YWw9RkFMU0UsIGluY2x1ZGU9RkFMU0V9CnJtYXJrZG93bjo6cmVuZGVyKHJtZF9maWxlKQoKcm1hcmtkb3duOjpyZW5kZXIocm1kX2ZpbGUsIG91dHB1dF9mb3JtYXQ9InBkZl9kb2N1bWVudCIsIG91dHB1dF9vcHRpb25zPWMoInNraXBfaHRtbCIpKQpgYGAKCk11cyBtdXNjdWx1cyBzYW1wbGUgZXN0aW1hdGlvbgo9PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KCiMgVmlzdWFsaXppbmcgcmF3IGRhdGEKClRoZXJlIGFyZSBsb3RzIG9mIHRveXMgd2UgaGF2ZSBsZWFybmVkIHRvIHVzZSB0byBwbGF5IHdpdGggd2l0aCByYXcgZGF0YSBhbmQgZXhwbG9yZSBzdHVmZiBsaWtlCmJhdGNoIGVmZmVjdHMgb3Igbm9uLWNhbm9uaWNhbCBkaXN0cmlidXRpb25zIG9yIHNrZXdlZCBjb3VudHMuICBocGdsdG9vbHMgcHJvdmlkZXMgc29tZSBmdW5jdGlvbmFsaXR5CnRvIG1ha2UgdGhpcyBwcm9jZXNzIGVhc2llci4gIFRoZSBncmFwaHMgc2hvd24gYmVsb3cgYW5kIG1hbnkgbW9yZSBhcmUgZ2VuZXJhdGVkIHdpdGggdGhlIHdyYXBwZXIKJ2dyYXBoX21ldHJpY3MoKScgYnV0IHRoYXQgdGFrZXMgYXdheSB0aGUgY2hhbmNlIHRvIGV4cGxhaW4gdGhlIGdyYXBocyBhcyBJIGdlbmVyYXRlIHRoZW0uCgpgYGB7ciByYXdfZXhwbG9yZX0KbW1fZXhwdCA8LSBleHB0X2V4Y2x1ZGVfZ2VuZXMoZXhwdD1tbV9leHB0LCBjb2x1bW49IlR5cGUiLCBtZXRob2Q9ImtlZXAiLCBwYXR0ZXJucz1jKCJwcm90ZWluX2NvZGluZyIpKQoKCm1tX2ZpbHQgPC0gbm9ybWFsaXplX2V4cHQobW1fZXhwdCwgbG93X3RvX3plcm89VFJVRSkKbW1fZmlsdCA8LSBub3JtYWxpemVfZXhwdChtbV9maWx0LCBmaWx0ZXI9ImNiY2IiKQptbV9saWJzaXplIDwtIHBsb3RfbGlic2l6ZShtbV9maWx0KQptbV9saWJzaXplCgptbV9ub256ZXJvIDwtIHBsb3Rfbm9uemVybyhtbV9maWx0KQptbV9ub256ZXJvCgptbV9kZW5zaXR5IDwtIHBsb3RfZGVuc2l0eShtbV9maWx0KSArIGdncGxvdDI6OnNjYWxlX3hfY29udGludW91cyhsaW1pdHM9YygwLCA1KSkKbW1fZGVuc2l0eQoKbW1fYm94cGxvdCA8LSBwbG90X2JveHBsb3QobW1fZmlsdCkKbW1fYm94cGxvdApgYGAKCiMgTm9ybWFsaXplIGFuZCB2aXN1YWxpemUKCmBgYHtyIG5vcm1hbGl6ZSwgZmlnLnNob3c9ImhpZGUifQptbV9ub3JtIDwtIHNtKG5vcm1hbGl6ZV9leHB0KG1tX2V4cHQsIHRyYW5zZm9ybT0ibG9nMiIsIG5vcm09InF1YW50IiwgY29udmVydD0iY3BtIiwgZmlsdGVyPVRSVUUpKQptbV9tZXRyaWNzIDwtIHNtKGdyYXBoX21ldHJpY3MobW1fbm9ybSkpCm1tX21ldHJpY3MkcGNhcGxvdAoKIyNtbV92YXJwYXJ0IDwtIHZhcnBhcnQoZXhwdD1tbV9leHB0LCBwcmVkaWN0b3I9TlVMTCwgZmFjdG9ycz1jKCJjb25kaXRpb24iLCAiYmF0Y2giKSkKIyNtbV92YXJwYXJ0JHBhcnRpdGlvbl9wbG90CmBgYAoKVGhlIGRhdGEgc2hvdWxkIG5vdyBiZSBub3JtYWxpemVkLCBsZXRzIHZpZXcgc29tZSBtZXRyaWNzIHBvc3QtZmFjdG8uCgpgYGB7ciBub3Jtdml6fQptbV9tZXRyaWNzJGNvcmhlYXQKCm1tX21ldHJpY3Mkc21jCgptbV9tZXRyaWNzJHBjYXBsb3QKYGBgCgojIFRyeSBhIGJhdGNoIGVzdGltYXRpb24KCmBgYHtyIGJhdGNoX2V4b30KbW1fc3ZhIDwtIHNtKG5vcm1hbGl6ZV9leHB0KG1tX2V4cHQsIHRyYW5zZm9ybT0ibG9nMiIsIG5vcm09InF1YW50IiwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICBmaWx0ZXI9VFJVRSwgYmF0Y2g9InNzdmEiLCBsb3dfdG9femVybz1UUlVFKSkKbW1fc3ZhX21ldHJpY3MgPC0gZ3JhcGhfbWV0cmljcyhtbV9zdmEpCm1tX3N2YV9tZXRyaWNzJHBjYXBsb3QKCm1tX2ZzdmEgPC0gc20obm9ybWFsaXplX2V4cHQobW1fZXhwdCwgdHJhbnNmb3JtPSJsb2cyIiwgbm9ybT0icXVhbnQiLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICBmaWx0ZXI9VFJVRSwgYmF0Y2g9ImZzdmEiLCBsb3dfdG9femVybz1UUlVFKSkKbW1fZnN2YV9tZXRyaWNzIDwtIGdyYXBoX21ldHJpY3MobW1fZnN2YSkKbW1fZnN2YV9tZXRyaWNzJHBjYXBsb3QKYGBgCgojIFdyaXRlIHRoZSBleHB0CgpgYGB7ciBmc3ZhX3dyaXR0ZW4sIGZpZy5zaG93PSJoaWRlIn0KbW1fZmlyc3QgPC0gd3JpdGVfZXhwdChtbV9leHB0LCBleGNlbD1wYXN0ZTAoImV4Y2VsL21tX3NhbXBsZXNfd3JpdHRlbl9kZWZhdWx0LXYiLCB2ZXIsICIueGxzeCIpLAogICAgICAgICAgICAgICAgICAgICAgIGZpbHRlcj1UUlVFLCBub3JtPSJxdWFudCIsIGNvbnZlcnQ9ImNwbSIsIGJhdGNoPSJyYXciLCB0cmFuc2Zvcm09ImxvZzIiKQptbV9zZWNvbmQgPC0gd3JpdGVfZXhwdChtbV9leHB0LCBleGNlbD1wYXN0ZTAoImV4Y2VsL21tX3NhbXBsZXNfd3JpdHRlbl9zc3ZhLXYiLCB2ZXIsICIueGxzeCIpLAogICAgICAgICAgICAgICAgICAgICAgICBmaWx0ZXI9VFJVRSwgbm9ybT0icXVhbnQiLCBjb252ZXJ0PSJyYXciLCBiYXRjaD0ic3N2YSIsIHRyYW5zZm9ybT0ibG9nMiIpCm1tX3RoaXJkIDwtIHdyaXRlX2V4cHQobW1fZXhwdCwgZXhjZWw9cGFzdGUwKCJleGNlbC9tbV9zYW1wbGVzX3dyaXR0ZW5fZnN2YS12IiwgdmVyLCAiLnhsc3giKSwKICAgICAgICAgICAgICAgICAgICAgICBmaWx0ZXI9VFJVRSwgbm9ybT0icXVhbnQiLCBjb252ZXJ0PSJyYXciLCBiYXRjaD0iZnN2YSIsIHRyYW5zZm9ybT0ibG9nMiIpCmBgYAoKYGBge3Igc2F2ZW1lfQpwYW5kZXI6OnBhbmRlcihzZXNzaW9uSW5mbygpKQp0aGlzX3NhdmUgPC0gcGFzdGUwKGdzdWIocGF0dGVybj0iXFwuUm1kIiwgcmVwbGFjZT0iIiwgeD1ybWRfZmlsZSksICItdiIsIHZlciwgIi5yZGEueHoiKQptZXNzYWdlKHBhc3RlMCgiU2F2aW5nIHRvICIsIHRoaXNfc2F2ZSkpCnR0IDwtIHNtKHNhdmVtZShmaWxlbmFtZT10aGlzX3NhdmUpKQpgYGAK

M. musculus (un)infected sample estimation.

atb abelew@gmail.com

2017-07-07

1 Mus musculus sample estimation

2 Visualizing raw data

3 Normalize and visualize

4 Try a batch estimation

5 Write the expt