This document is concerned with analyzing RNAseq data of human and parasite during the infection of human macrophages. A few different strains of L. panamensis were used; the experiment is therefore segregated into groups named ‘self-healing’, ‘chronic’, and ‘uninfected’. Two separate sets of libraries were generated, one earlier set with greater coverage, and a later set including only 1 uninfected sample, and 2-3 chronic samples.
The following subset operation extract the samples used for the macrophage experiment. The next three lines then change the colors from the defaults.
hs_expt <- sm(create_expt(metadata="sample_sheets/all_samples-combined.xlsx",
gene_info=hs_annotations,
file_column="humanfile"))
hs_macr <- subset_expt(hs_expt, subset="experimentname=='macrophage'")
## There were 50, now there are 11 samples.
new_colors <- c("#009900", "#990000", "#000099")
names(new_colors) <- c("uninf", "chr", "sh")
hs_macr <- set_expt_colors(hs_macr, colors=new_colors)
## The new colors are a character, changing according to condition.
labels <- as.character(pData(hs_macr)[["label"]])
hs_macr <- set_expt_samplenames(hs_macr, labels)
hs_cds_expt <- exclude_genes_expt(hs_expt, column='Type', patterns="protein_coding")
## Before removal, there were 51041 entries.
## Now there are 32233 entries.
## Percent kept: 6.800, 4.913, 4.463, 3.420, 4.095, 3.758, 4.250, 3.901, 2.844, 2.354, 3.398, 2.979, 3.360, 2.824, 3.827, 3.483, 5.058, 5.009, 4.342, 4.748, 5.045, 5.140, 5.233, 3.496, 3.612, 4.589, 2.427, 2.134, 2.082, 1.978, 12.185, 10.054, 9.846, 9.990, 16.555, 7.559, 5.374, 3.647, 3.770, 4.198, 3.756, 3.777, 3.723, 6.072, 5.246, 5.389, 5.451, 5.071, 5.186, 4.931
## Percent removed: 93.200, 95.087, 95.537, 96.580, 95.905, 96.242, 95.750, 96.099, 97.156, 97.646, 96.602, 97.021, 96.640, 97.176, 96.173, 96.517, 94.942, 94.991, 95.658, 95.252, 94.955, 94.860, 94.767, 96.504, 96.388, 95.411, 97.573, 97.866, 97.918, 98.022, 87.815, 89.946, 90.154, 90.010, 83.445, 92.441, 94.626, 96.353, 96.230, 95.802, 96.244, 96.223, 96.277, 93.928, 94.754, 94.611, 94.549, 94.929, 94.814, 95.069
hs_cds_macr <- subset_expt(hs_cds_expt, subset="experimentname=='macrophage'")
## There were 50, now there are 11 samples.
hs_cds_macr <- set_expt_colors(hs_cds_macr, colors=new_colors)
## The new colors are a character, changing according to condition.
hs_cds_macr <- set_expt_samplenames(hs_cds_macr, labels)
Figure S1 should include nice versions of the sample metrics. The normalization chosen is batch-in-model.
First, however, we will make some plots of the raw data.
Sample names are going to be ‘infectionstate_strainnumber’ : chr_7721
fig_s1 <- sm(write_expt(hs_macr, norm="quant", violin=FALSE, convert="cpm",
transform="log2", batch="pca", filter=TRUE,
excel=paste0("excel/figure_s1_sample_estimation-v", ver, ".xlsx")))
fig_s1_cds <- sm(write_expt(hs_cds_macr, norm="quant", violin=FALSE, convert="cpm",
transform="log2", batch="pca", filter=TRUE,
excel=paste0("excel/figure_s1_sample_estimation_cds_only-v", ver, ".xlsx")))
First look for clustering patterns in the raw data, in all data followed by only the CDS regions.
plot_tsne(hs_macr)$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
plot_tsne(hs_cds_macr)$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
Once the experiment has been written with the various normalizations in place, we can use that to print and view some representative plots.
pp(file="images/figure_s1a.pdf", image=fig_s1$raw_libsize)
## Writing the image to: images/figure_s1a.pdf and calling dev.off().
## Show the library sizes of all features.
pp(file="images/figure_s1b.pdf", image=fig_s1$raw_disheat)
## Writing the image to: images/figure_s1b.pdf and calling dev.off().
## Show the clustering by euclidean distance of the raw data all features.
pp(file="images/figure_s1c.pdf", image=fig_s1$raw_scaled_pca)
## Writing the image to: images/figure_s1c.pdf and calling dev.off().
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Do a PCA plot of the log2(normalized(data)) without any batch-in-model
pp(file="images/figure_s1d.pdf", image=fig_s1$norm_tsne)
## Writing the image to: images/figure_s1d.pdf and calling dev.off().
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Do a TSNE of the normalized data after condition is in the model.
In a similar fashion to what is above, I am printing the figure 1 images.
pp(file="images/figure_1a.pdf", fig_s1$norm_disheat)
## Writing the image to: images/figure_1a.pdf and calling dev.off().
## Show the changes in clustering after adding batch to the model and normalizing.
pp(file="images/figure_1a_cds.pdf", image=fig_s1_cds$norm_disheat)
## Writing the image to: images/figure_1a_cds.pdf and calling dev.off().
## Observe that the clustering is nearly identical when only looking at the CDS features.
pp(file="images/figure_1b.pdf", image=fig_s1$norm_pca)
## Writing the image to: images/figure_1b.pdf and calling dev.off().
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Visualize the clustering after normalization via PCA.
knitr::kable(fig_s1$norm_pca_table)
sampleid | condition | batch | batch_int | colors | labels | PC1 | PC2 |
---|---|---|---|---|---|---|---|
HPGL0241 | uninf | a | 1 | #009900 | uninf_1 | -0.3001 | 0.5411 |
HPGL0242 | sh | a | 1 | #000099 | sh_2271 | -0.1826 | -0.1392 |
HPGL0243 | sh | a | 1 | #000099 | sh_2272 | -0.1623 | -0.0610 |
HPGL0244 | chr | a | 1 | #990000 | chr_5433 | 0.5607 | 0.1367 |
HPGL0245 | chr | a | 1 | #990000 | chr_1320 | 0.1739 | -0.2052 |
HPGL0246 | chr | a | 1 | #990000 | chr_2504 | 0.3305 | -0.1978 |
HPGL0247 | chr | a | 1 | #990000 | chr_5430 | -0.0119 | -0.1666 |
HPGL0248 | chr | a | 1 | #990000 | chr_5397 | 0.2537 | -0.2304 |
HPGL0637 | uninf | b | 2 | #009900 | uninf | 0.0142 | 0.5954 |
HPGL0638 | sh | b | 2 | #000099 | sh_2189 | -0.1111 | 0.0995 |
HPGL0639 | sh | b | 2 | #000099 | sh_1022 | -0.5651 | -0.3727 |
write.csv(fig_s1$norm_pca_table, file="images/fig_1a.csv")
pp(file="images/figure_1b_cds.pdf", image=fig_s1_cds$norm_pca)
## Writing the image to: images/figure_1b_cds.pdf and calling dev.off().
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Once again, see that it is mostly unchanged when using only CDS features.
knitr::kable(fig_s1_cds$norm_pca_table)
sampleid | condition | batch | batch_int | colors | labels | PC1 | PC2 |
---|---|---|---|---|---|---|---|
HPGL0241 | uninf | a | 1 | #009900 | uninf_1 | -0.7944 | 0.1829 |
HPGL0242 | sh | a | 1 | #000099 | sh_2271 | 0.1567 | -0.2831 |
HPGL0243 | sh | a | 1 | #000099 | sh_2272 | 0.1705 | -0.4056 |
HPGL0244 | chr | a | 1 | #990000 | chr_5433 | 0.1587 | 0.7099 |
HPGL0245 | chr | a | 1 | #990000 | chr_1320 | 0.1713 | 0.2040 |
HPGL0246 | chr | a | 1 | #990000 | chr_2504 | 0.1925 | 0.1750 |
HPGL0247 | chr | a | 1 | #990000 | chr_5430 | 0.1967 | 0.1279 |
HPGL0248 | chr | a | 1 | #990000 | chr_5397 | 0.1991 | -0.1948 |
HPGL0637 | uninf | b | 2 | #009900 | uninf | -0.3759 | -0.2131 |
HPGL0638 | sh | b | 2 | #000099 | sh_2189 | -0.0127 | -0.1561 |
HPGL0639 | sh | b | 2 | #000099 | sh_1022 | -0.0624 | -0.1470 |
write.csv(fig_s1_cds$norm_pca_table, file="images/fig_1a_cds.csv")
The following is mostly used to compare different methodologies and is therefore not likely to be useful for most.
The various metrics used to describe and examine the data come once before, and once after normalization.
graph_metrics creates a set of plots we can use to look at the data before messing with it.
written <- sm(write_expt(hs_macr,
excel=paste0("excel/macrophage_human_data-v", ver, ".xlsx"),
violin=TRUE))
hs_macr_metrics <- sm(graph_metrics(hs_macr))
The two most likely plots one will want to see immediately are the library sizes and data densities.
hs_macr_metrics$libsize
## The libraries > 600 in HPGL id have lower coverage as they were sequenced many more to a lane.
hs_macr_metrics$density
## This same phenomenon is visible in the density plot
hs_macr_metrics$nonzero
## and the non-zero gene / cpm plot.
The plotted library sizes suggest that differences observed in the raw data are due at least partially to less saturation in the new samples. Therefore, low-count filtering is almost certainly to prove an important parameter. Similarly, a batch correction should help alleviate these differences.
hs_macr_nonil <- subset_expt(hs_macr, subset="condition=='sh'|condition=='chr'")
## There were 11, now there are 9 samples.
hs_macr_cpmlow <- sm(normalize_expt(hs_macr, filter="cbcb", convert="cpm"))
hs_macr_l2cpmlow <- sm(normalize_expt(hs_macr, filter="cbcb", convert="cpm", transform="log2"))
hs_macr_l2cpmpofa <- sm(normalize_expt(hs_macr, filter="pofa", convert="cpm", transform="log2"))
hs_macr_cpmrlelow <- sm(normalize_expt(hs_macr, filter="cbcb", convert="cpm", norm="rle"))
hs_macr_l2cpmrlelow <- sm(normalize_expt(hs_macr, filter="cbcb", convert="cpm",
norm="rle", transform="log2"))
The normalizations in the block above do not attempt any batch correction. Now take a moment to view how they affect pca plots.
hs_macr_pca <- plot_pca(hs_macr)
## Start with the unmodified data
## It is clearly a mess.
hs_macr_pca$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
knitr::kable(hs_macr_pca$table)
sampleid | condition | batch | batch_int | colors | labels | PC1 | PC2 |
---|---|---|---|---|---|---|---|
HPGL0241 | uninf | a | 1 | #009900 | uninf_1 | -0.5683 | -0.3528 |
HPGL0242 | sh | a | 1 | #000099 | sh_2271 | -0.1753 | -0.2831 |
HPGL0243 | sh | a | 1 | #000099 | sh_2272 | -0.2902 | -0.1149 |
HPGL0244 | chr | a | 1 | #990000 | chr_5433 | -0.0763 | 0.6999 |
HPGL0245 | chr | a | 1 | #990000 | chr_1320 | 0.0355 | 0.0086 |
HPGL0246 | chr | a | 1 | #990000 | chr_2504 | -0.2197 | 0.3669 |
HPGL0247 | chr | a | 1 | #990000 | chr_5430 | -0.0294 | -0.0494 |
HPGL0248 | chr | a | 1 | #990000 | chr_5397 | 0.1368 | 0.0229 |
HPGL0637 | uninf | b | 2 | #009900 | uninf | 0.3007 | 0.1848 |
HPGL0638 | sh | b | 2 | #000099 | sh_2189 | 0.4894 | -0.1976 |
HPGL0639 | sh | b | 2 | #000099 | sh_1022 | 0.3966 | -0.2854 |
hs_macr_cpmlow_pca <- plot_pca(hs_macr_cpmlow)
hs_macr_pca$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## low-filtering and cpm is insufficient to clarify the data, but it is a start
knitr::kable(hs_macr_pca$table)
sampleid | condition | batch | batch_int | colors | labels | PC1 | PC2 |
---|---|---|---|---|---|---|---|
HPGL0241 | uninf | a | 1 | #009900 | uninf_1 | -0.5683 | -0.3528 |
HPGL0242 | sh | a | 1 | #000099 | sh_2271 | -0.1753 | -0.2831 |
HPGL0243 | sh | a | 1 | #000099 | sh_2272 | -0.2902 | -0.1149 |
HPGL0244 | chr | a | 1 | #990000 | chr_5433 | -0.0763 | 0.6999 |
HPGL0245 | chr | a | 1 | #990000 | chr_1320 | 0.0355 | 0.0086 |
HPGL0246 | chr | a | 1 | #990000 | chr_2504 | -0.2197 | 0.3669 |
HPGL0247 | chr | a | 1 | #990000 | chr_5430 | -0.0294 | -0.0494 |
HPGL0248 | chr | a | 1 | #990000 | chr_5397 | 0.1368 | 0.0229 |
HPGL0637 | uninf | b | 2 | #009900 | uninf | 0.3007 | 0.1848 |
HPGL0638 | sh | b | 2 | #000099 | sh_2189 | 0.4894 | -0.1976 |
HPGL0639 | sh | b | 2 | #000099 | sh_1022 | 0.3966 | -0.2854 |
hs_macr_l2cpmlow_pca <- plot_pca(hs_macr_l2cpmlow)
hs_macr_l2cpmlow_pca$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Moving to the log scale makes much clearer the batch effect
## and shows that the data is actually pretty well separated by condition already.
knitr::kable(hs_macr_l2cpmlow_pca$table)
sampleid | condition | batch | batch_int | colors | labels | PC1 | PC2 |
---|---|---|---|---|---|---|---|
HPGL0241 | uninf | a | 1 | #009900 | uninf_1 | -0.2740 | -0.6084 |
HPGL0242 | sh | a | 1 | #000099 | sh_2271 | -0.2364 | -0.3174 |
HPGL0243 | sh | a | 1 | #000099 | sh_2272 | -0.1964 | -0.1616 |
HPGL0244 | chr | a | 1 | #990000 | chr_5433 | -0.1603 | 0.4705 |
HPGL0245 | chr | a | 1 | #990000 | chr_1320 | -0.1480 | 0.2027 |
HPGL0246 | chr | a | 1 | #990000 | chr_2504 | -0.1549 | 0.2445 |
HPGL0247 | chr | a | 1 | #990000 | chr_5430 | -0.1323 | 0.1467 |
HPGL0248 | chr | a | 1 | #990000 | chr_5397 | -0.1605 | 0.2113 |
HPGL0637 | uninf | b | 2 | #009900 | uninf | 0.5074 | 0.1015 |
HPGL0638 | sh | b | 2 | #000099 | sh_2189 | 0.5063 | 0.0315 |
HPGL0639 | sh | b | 2 | #000099 | sh_1022 | 0.4492 | -0.3213 |
hs_macr_l2cpmpofa_pca <- plot_pca(hs_macr_l2cpmpofa)
hs_macr_l2cpmpofa_pca$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## I have access to a few low-count filters, but they appear to make little or no difference.
knitr::kable(hs_macr_l2cpmpofa_pca$table)
sampleid | condition | batch | batch_int | colors | labels | PC1 | PC2 |
---|---|---|---|---|---|---|---|
HPGL0241 | uninf | a | 1 | #009900 | uninf_1 | -0.2784 | -0.6197 |
HPGL0242 | sh | a | 1 | #000099 | sh_2271 | -0.2380 | -0.3080 |
HPGL0243 | sh | a | 1 | #000099 | sh_2272 | -0.1973 | -0.1527 |
HPGL0244 | chr | a | 1 | #990000 | chr_5433 | -0.1575 | 0.4586 |
HPGL0245 | chr | a | 1 | #990000 | chr_1320 | -0.1471 | 0.2057 |
HPGL0246 | chr | a | 1 | #990000 | chr_2504 | -0.1526 | 0.2483 |
HPGL0247 | chr | a | 1 | #990000 | chr_5430 | -0.1320 | 0.1522 |
HPGL0248 | chr | a | 1 | #990000 | chr_5397 | -0.1590 | 0.2123 |
HPGL0637 | uninf | b | 2 | #009900 | uninf | 0.5043 | 0.0919 |
HPGL0638 | sh | b | 2 | #000099 | sh_2189 | 0.5071 | 0.0361 |
HPGL0639 | sh | b | 2 | #000099 | sh_1022 | 0.4506 | -0.3249 |
hs_macr_cpmrlelow_pca <- plot_pca(hs_macr_cpmrlelow)
hs_macr_cpmrlelow_pca$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## RLE normalization has an interesting effect on the data
knitr::kable(hs_macr_cpmrlelow_pca$table)
sampleid | condition | batch | batch_int | colors | labels | PC1 | PC2 |
---|---|---|---|---|---|---|---|
HPGL0241 | uninf | a | 1 | #009900 | uninf_1 | -0.3996 | 0.1372 |
HPGL0242 | sh | a | 1 | #000099 | sh_2271 | -0.3481 | 0.0650 |
HPGL0243 | sh | a | 1 | #000099 | sh_2272 | -0.3362 | -0.0977 |
HPGL0244 | chr | a | 1 | #990000 | chr_5433 | 0.3113 | -0.5311 |
HPGL0245 | chr | a | 1 | #990000 | chr_1320 | -0.1105 | -0.1660 |
HPGL0246 | chr | a | 1 | #990000 | chr_2504 | 0.0868 | -0.2796 |
HPGL0247 | chr | a | 1 | #990000 | chr_5430 | -0.1558 | -0.1185 |
HPGL0248 | chr | a | 1 | #990000 | chr_5397 | 0.0940 | -0.1828 |
HPGL0637 | uninf | b | 2 | #009900 | uninf | 0.5731 | 0.1938 |
HPGL0638 | sh | b | 2 | #000099 | sh_2189 | 0.3514 | 0.4037 |
HPGL0639 | sh | b | 2 | #000099 | sh_1022 | -0.0665 | 0.5759 |
hs_macr_l2cpmrlelow_pca <- plot_pca(hs_macr_l2cpmrlelow, size_column="pctcategory")
hs_macr_l2cpmrlelow_pca$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Still the batch effect is pretty ridiculous
## This plot resizes the glyphs to indicate the % mapping.
knitr::kable(hs_macr_l2cpmrlelow_pca$table)
sampleid | condition | batch | batch_int | colors | labels | PC1 | PC2 | pctcategory | size |
---|---|---|---|---|---|---|---|---|---|
HPGL0241 | uninf | a | 1 | #009900 | uninf_1 | -0.2742 | -0.5998 | 0 | 1 |
HPGL0242 | sh | a | 1 | #000099 | sh_2271 | -0.2458 | -0.3404 | 3 | 4 |
HPGL0243 | sh | a | 1 | #000099 | sh_2272 | -0.1979 | -0.1365 | 3 | 4 |
HPGL0244 | chr | a | 1 | #990000 | chr_5433 | -0.1559 | 0.4444 | 1 | 2 |
HPGL0245 | chr | a | 1 | #990000 | chr_1320 | -0.1431 | 0.2312 | 4 | 5 |
HPGL0246 | chr | a | 1 | #990000 | chr_2504 | -0.1560 | 0.2440 | 2 | 3 |
HPGL0247 | chr | a | 1 | #990000 | chr_5430 | -0.1326 | 0.1528 | 2 | 3 |
HPGL0248 | chr | a | 1 | #990000 | chr_5397 | -0.1557 | 0.2075 | 4 | 5 |
HPGL0637 | uninf | b | 2 | #009900 | uninf | 0.5065 | 0.1159 | 0 | 1 |
HPGL0638 | sh | b | 2 | #000099 | sh_2189 | 0.5091 | 0.0200 | 5 | 6 |
HPGL0639 | sh | b | 2 | #000099 | sh_1022 | 0.4456 | -0.3390 | 3 | 4 |
hs_macr_pca_info <- sm(pca_information(hs_macr,
expt_factors=c("condition","batch","pctmappedparasite",
"state","pathogenstrain","rnangul")))
##write.csv(macro_pca_info$rsquared_table, file="images/rsquared.csv")
##write.csv(macro_pca_info$anova_p, file="images/anova_p.csv")
hs_macr_pca_info$anova_fstat_heatmap
I could continue the above normalizations and plots, but it quickly becomes boring and redundant. The batch effect is too significant to continue without addressing it. One thing that was brought up in previous discussions: does the mapping percentage act as a surrogate variable? This suggests that if it does, it is pretty weak, and the last PCA plot shows that is cosegregates with the existing conditions: uninfected has the lowest mapping percent to the parasite [0%]; while chronic has the lower percentage of the infections, but not so much lower than self-healing in general.
We have good information that suggests a strong batch effect in this data. Let us put that information to use in the following block.
How do these modifications affect the separation of the data?
hs_macr_cpmlowcbcb <- sm(normalize_expt(hs_macr, filter="hpgl", convert="cpm", batch=TRUE))
hs_macr_l2cpmlowcom <- sm(normalize_expt(hs_macr, filter="hpgl", convert="cpm",
batch="combat_scale", transform="log2"))
hs_macr_l2cpmrlelowcom <- sm(normalize_expt(hs_macr, filter="hpgl", convert="cpm",
batch="combat_scale", transform="log2", norm="rle"))
hs_macr_l2cpmquantlowcmod <- sm(normalize_expt(hs_macr, filter="hpgl", convert="cpm",
batch="combatmod", transform="log2", norm="quant"))
hs_macr_l2cpmquantlowcom <- sm(normalize_expt(hs_macr, filter="hpgl", convert="cpm",
batch="combat_scale", transform="log2", norm="quant"))
hs_macr_l2cpmquantlowsva <- sm(normalize_expt(hs_macr, filter="hpgl", convert="cpm",
batch="svaseq", norm="quant"))
hs_macr_l2cpmquantlowfsva <- sm(normalize_expt(hs_macr, filter="hpgl", convert="cpm",
batch="fsva", norm="quant"))
hs_macr_l2cpmquantlowruv <- sm(normalize_expt(hs_macr, filter="hpgl", convert="cpm",
batch="combat_scale", transform="log2", norm="quant"))
tt <- sm(graph_metrics(hs_macr_l2cpmquantlowruv))
hs_macr_cpmlowcbcb_pca <- plot_pca(hs_macr_cpmlowcbcb)
hs_macr_cpmlowcbcb_pca$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## cbcb's batch correction is insufficient, at least without log2
hs_macr_cpmlowcbcb_pca$table
## sampleid condition batch batch_int colors labels PC1 PC2
## 1 HPGL0241 uninf a 1 #009900 uninf_1 -0.09337 0.66888
## 2 HPGL0242 sh a 1 #000099 sh_2271 -0.23685 -0.15485
## 3 HPGL0243 sh a 1 #000099 sh_2272 -0.06401 -0.01482
## 4 HPGL0244 chr a 1 #990000 chr_5433 0.60514 -0.04000
## 5 HPGL0245 chr a 1 #990000 chr_1320 0.09059 -0.20559
## 6 HPGL0246 chr a 1 #990000 chr_2504 0.21245 -0.28906
## 7 HPGL0247 chr a 1 #990000 chr_5430 -0.05050 -0.20619
## 8 HPGL0248 chr a 1 #990000 chr_5397 0.19782 -0.22250
## 9 HPGL0637 uninf b 2 #009900 uninf 0.13221 0.51916
## 10 HPGL0638 sh b 2 #000099 sh_2189 -0.13449 0.11016
## 11 HPGL0639 sh b 2 #000099 sh_1022 -0.65901 -0.16518
hs_macr_l2cpmlowcom_pca <- plot_pca(hs_macr_l2cpmlowcom)
hs_macr_l2cpmlowcom_pca$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## oooo pretty
knitr::kable(hs_macr_l2cpmlowcom_pca$table)
sampleid | condition | batch | batch_int | colors | labels | PC1 | PC2 |
---|---|---|---|---|---|---|---|
HPGL0241 | uninf | a | 1 | #009900 | uninf_1 | -0.5445 | 0.0803 |
HPGL0242 | sh | a | 1 | #000099 | sh_2271 | -0.2561 | 0.2708 |
HPGL0243 | sh | a | 1 | #000099 | sh_2272 | -0.0981 | -0.2656 |
HPGL0244 | chr | a | 1 | #990000 | chr_5433 | 0.4802 | 0.1559 |
HPGL0245 | chr | a | 1 | #990000 | chr_1320 | 0.2490 | -0.3045 |
HPGL0246 | chr | a | 1 | #990000 | chr_2504 | 0.2955 | 0.4898 |
HPGL0247 | chr | a | 1 | #990000 | chr_5430 | 0.2064 | -0.5245 |
HPGL0248 | chr | a | 1 | #990000 | chr_5397 | 0.2660 | 0.2246 |
HPGL0637 | uninf | b | 2 | #009900 | uninf | -0.2121 | -0.2582 |
HPGL0638 | sh | b | 2 | #000099 | sh_2189 | -0.1054 | -0.1486 |
HPGL0639 | sh | b | 2 | #000099 | sh_1022 | -0.2810 | 0.2800 |
hs_macr_l2cpmrlelowcom_pca <- plot_pca(hs_macr_l2cpmrlelowcom)
hs_macr_l2cpmrlelowcom_pca$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## also looking good
knitr::kable(hs_macr_l2cpmrlelowcom_pca$table)
sampleid | condition | batch | batch_int | colors | labels | PC1 | PC2 |
---|---|---|---|---|---|---|---|
HPGL0241 | uninf | a | 1 | #009900 | uninf_1 | -0.5606 | 0.0681 |
HPGL0242 | sh | a | 1 | #000099 | sh_2271 | -0.2444 | -0.2858 |
HPGL0243 | sh | a | 1 | #000099 | sh_2272 | -0.0863 | 0.0807 |
HPGL0244 | chr | a | 1 | #990000 | chr_5433 | 0.4560 | 0.1062 |
HPGL0245 | chr | a | 1 | #990000 | chr_1320 | 0.2571 | 0.0781 |
HPGL0246 | chr | a | 1 | #990000 | chr_2504 | 0.2825 | -0.3644 |
HPGL0247 | chr | a | 1 | #990000 | chr_5430 | 0.2205 | 0.4784 |
HPGL0248 | chr | a | 1 | #990000 | chr_5397 | 0.2767 | -0.3788 |
HPGL0637 | uninf | b | 2 | #009900 | uninf | -0.2143 | 0.4161 |
HPGL0638 | sh | b | 2 | #000099 | sh_2189 | -0.0971 | 0.2104 |
HPGL0639 | sh | b | 2 | #000099 | sh_1022 | -0.2902 | -0.4090 |
hs_macr_l2cpmquantlowcmod_pca <- plot_pca(hs_macr_l2cpmquantlowcmod)
hs_macr_l2cpmquantlowcmod_pca$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## not as good
knitr::kable(hs_macr_l2cpmquantlowcmod_pca$table)
sampleid | condition | batch | batch_int | colors | labels | PC1 | PC2 |
---|---|---|---|---|---|---|---|
HPGL0241 | uninf | a | 1 | #009900 | uninf_1 | -0.6808 | 0.1257 |
HPGL0242 | sh | a | 1 | #000099 | sh_2271 | -0.3746 | -0.1335 |
HPGL0243 | sh | a | 1 | #000099 | sh_2272 | -0.2021 | 0.2806 |
HPGL0244 | chr | a | 1 | #990000 | chr_5433 | 0.3934 | -0.1607 |
HPGL0245 | chr | a | 1 | #990000 | chr_1320 | 0.1805 | 0.2191 |
HPGL0246 | chr | a | 1 | #990000 | chr_2504 | 0.1726 | -0.4126 |
HPGL0247 | chr | a | 1 | #990000 | chr_5430 | 0.1340 | 0.3985 |
HPGL0248 | chr | a | 1 | #990000 | chr_5397 | 0.1736 | -0.2283 |
HPGL0637 | uninf | b | 2 | #009900 | uninf | 0.2009 | 0.3443 |
HPGL0638 | sh | b | 2 | #000099 | sh_2189 | 0.1591 | 0.1152 |
HPGL0639 | sh | b | 2 | #000099 | sh_1022 | -0.1566 | -0.5484 |
hs_macr_l2cpmquantlowfsva_pca <- plot_pca(hs_macr_l2cpmquantlowfsva)
hs_macr_l2cpmquantlowfsva_pca$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
pretty_pca <- plot_pca(hs_macr_l2cpmquantlowcom)
pretty_pca$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## perhaps the best?
knitr::kable(pretty_pca$table)
sampleid | condition | batch | batch_int | colors | labels | PC1 | PC2 |
---|---|---|---|---|---|---|---|
HPGL0241 | uninf | a | 1 | #009900 | uninf_1 | -0.5479 | 0.0811 |
HPGL0242 | sh | a | 1 | #000099 | sh_2271 | -0.2444 | 0.2656 |
HPGL0243 | sh | a | 1 | #000099 | sh_2272 | -0.0984 | -0.2749 |
HPGL0244 | chr | a | 1 | #990000 | chr_5433 | 0.4800 | 0.1569 |
HPGL0245 | chr | a | 1 | #990000 | chr_1320 | 0.2584 | -0.3114 |
HPGL0246 | chr | a | 1 | #990000 | chr_2504 | 0.2798 | 0.5005 |
HPGL0247 | chr | a | 1 | #990000 | chr_5430 | 0.2043 | -0.5226 |
HPGL0248 | chr | a | 1 | #990000 | chr_5397 | 0.2774 | 0.2092 |
HPGL0637 | uninf | b | 2 | #009900 | uninf | -0.2271 | -0.2421 |
HPGL0638 | sh | b | 2 | #000099 | sh_2189 | -0.1144 | -0.1436 |
HPGL0639 | sh | b | 2 | #000099 | sh_1022 | -0.2676 | 0.2813 |
##write.csv(pretty_pca$table, file="excel/pretty_pca.csv")
## Now try and combine this with
## https://github.com/jtleek/svaseq/blob/master/recount.Rmd
## To see if things become clearer
hs_macr_filt <- sm(normalize_expt(expt=hs_macr, filter=TRUE))
hs_macr_se <- sm(compare_surrogate_estimates(hs_macr_filt))
## I think my implementation of ruv_empirical may be borken.
hs_macr_se$pca_plots$ruvsup
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
hs_macr_se$ruv_supervised_adjust$svs_sample
hs_macr_se$pca_plots$svasup
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
hs_macr_se$sva_supervised_adjust$svs_sample
## The new uninfected sample still seems oddly placed
hs_macr_se$pca_plots$pca
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
hs_macr_se$pca_adjust$svs_sample
## Ditto here
hs_macr_se$pca_plots$svaunsup
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## This seems pretty reasonable to me
hs_macr_se$pca_plots$ruvresid
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## This looks a lot like pca and ruvsup, but the corrplot suggested to me that it would look weirder.
Lets see how these modification affect the visualizations
hs_macr_batchnorm_metrics <- sm(graph_metrics(hs_macr_l2cpmquantlowcom))
print(hs_macr_batchnorm_metrics$corheat)
## The combat adjusted counts seem to split relatively clearly between types.
print(hs_macr_batchnorm_metrics$smc)
## This one sample is still problematic, though.
print(hs_macr_batchnorm_metrics$pcaplot)
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## But quantile normalization really does smash them into the same distribution.
Some other analyses performed suggest a possible switch between two samples. We can artifically see what the data would look like if that is true by switching those two samples in a separate experimental design.
PBMC: variance partition human, “Are there any genes with variance we can ascribe to condition?” That set of genes is the most interesting. – use the variance parition table and pull the top 200. Report back the % variance by condition for all of these genes.
In the following, I will attempt to find the variance associated with different experimental factors in the macrophage data with mappings against the human transcriptome.
As in macrophage_estimation_human.html, I need to pull out the relevant samples:
Start out with just condition and batch
## The default arguments are to query ~ condition + (1|batch)
## Which, because condition is categorical, will end badly.
hs_macr_vp_cb <- sm(varpart(hs_macr_filt,
predictor=NULL,
factors=c("condition","batch")))
hs_macr_vp_cb$percent_plot
hs_macr_vp_cb$partition_plot
## Batch is dominant, but there is still quite a lot left in residuals.
hs_macr_vp_cbs3 <- sm(varpart(hs_macr_l2cpmquantlowsva,
predictor=NULL,
factors=c("condition","batch","snpcladev3")))
hs_macr_vp_cbs3$percent_plot
hs_macr_vp_cbs3$partition_plot
I just realized that what I need to do is to take the surrogate estimates from the human experiment. Using those I think I will be able to get a more realistic estimate of what is going on.
parasite_expt <- create_expt(metadata="sample_sheets/all_samples-combined.xlsx",
gene_info=lp_annotations,
file_column="parasitefile")
## Reading the sample metadata.
## The sample definitions comprises: 50, 37 rows, columns.
## Reading count tables.
## Reading count tables with read.table().
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0242/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0243/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0244/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0245/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0246/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0247/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0248/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0316/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0318/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0320/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0322/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0631/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0632/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0633/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0634/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0635/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0636/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0638/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0639/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0641/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0643/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0651/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0652/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0653/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0654/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0655/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0656/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0658/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0659/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0660/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0661/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0662/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## /cbcb/nelsayed-scratch/atb/rnaseq/lpanamensis_2016/preprocessing/hpgl0663/outputs/tophat_lpanamensis/accepted_paired.count.xz contains 8846 rows and merges to 8846 rows.
## Finished reading count tables.
## Matched 8778 annotations and counts.
## Bringing together the count matrix and gene information.
## Some annotations were lost in merging, setting them to 'undefined'.
lp_macr <- subset_expt(parasite_expt, subset="experimentname=='macrophage'")
## There were 33, now there are 9 samples.
chosen_colors <- c("#990000", "#000099")
names(chosen_colors) <- c("chr", "sh")
lp_macr <- set_expt_colors(lp_macr, colors=chosen_colors)
## The new colors are a character, changing according to condition.
lp_testing <- sm(write_expt(lp_macr,
excel=paste0("excel/macrophage_parasite_data-v", ver, ".xlsx"),
violin=TRUE))
lp_testing$norm_pca
Now lets try removing some of the surrogates, the two primary candidates are the strains which I proxy with 3 columns: snpclade, snpcladev2, and snpcladev3; and the batch. In this data set batch is either a or b.
In theory, sva should pick up both of those in one invocation.
## Perform a 'normal' sva
lp_macr_sva <- sm(normalize_expt(lp_macr, batch="svaseq", filter=TRUE))
## Play with the normalization of it
lp_macr_sva_norm <- sm(normalize_expt(lp_macr_sva, filter=TRUE, convert="cpm",
norm="quant", transform="log2"))
lp_macr_sva_norm_metr <- sm(graph_metrics(lp_macr_sva_norm))
## Try limma's method
lp_macr_limma_fcqlr <- sm(normalize_expt(lp_macr, batch="limma", filter=TRUE,
batch2="snpcladev3", convert="cpm", norm="quant"))
lp_macr_sva_norm_metr$pcaplot
Another method might be to try using limmaresid to explicitly pull both columns.
plot_pca(lp_macr_limma_fcqlr)$plot
Another method might be to try pca on two separate invocations.
## frrpr rprrf
## raw(pca(raw(raw(filter(data)))))
## thus fcqsl would be log2(sva(quant(cpm(filter(data)))))
lp_macr_rprrf <- sm(normalize_expt(lp_macr, batch="pca", filter=TRUE))
lp_macr_rprrf <- set_expt_batches(expt=lp_macr_rprrf, fact="snpcladev3")
lp_macr_rprrf <- sm(normalize_expt(lp_macr_rprrf, batch="pca"))
lp_macr_lpqcf <- sm(normalize_expt(lp_macr_rprrf, convert="cpm",
norm="quant", transform="log2",
filter=TRUE))
lp_macr_lpqcf_metr <- sm(graph_metrics(lp_macr_lpqcf))
lp_macr_lpqcf_metr$pcaplot
If indeed the human surrogates are a primary determinant in the parasite data, then we should be able to see that in some sample estimations of the parasite data. Let us try!
summary(hs_macr_se)
## Length Class Mode
## pca_adjust 7 -none- list
## sva_supervised_adjust 7 -none- list
## sva_unsupervised_adjust 7 -none- list
## ruv_supervised_adjust 7 -none- list
## ruv_residual_adjust 7 -none- list
## ruv_empirical_adjust 7 -none- list
## adjustments 8 -none- list
## correlations 64 -none- numeric
## plot 3 recordedplot list
## pca_plots 7 -none- list
## catplots 0 -none- NULL
adjusts <- hs_macr_se$sva_supervised_adjust$model_adjust
## Remove surrogate estimates #1 and #9
new_adjusts <- as.matrix(adjusts[c(-1, -9), ])
## The question of course, how do I use these estimations to adjust the data?
test_adjusted <- counts_from_surrogates(lp_macr, new_adjusts)
sv_adjusted <- lp_macr
Biobase::exprs(sv_adjusted$expressionset) <- test_adjusted
hs_adjusted <- subset_expt(sv_adjusted, subset="sampleid!='HPGL0248'")
## There were 9, now there are 8 samples.
hs_adjusted_fcqsl <- sm(normalize_expt(hs_adjusted, convert="cpm", filter=TRUE,
batch="sva", norm="quant", transform="log2"))
hs_adjusted_fcqsl_metr <- sm(graph_metrics(hs_adjusted_fcqsl))
hs_sv_adjusted <- subset_expt(sv_adjusted, subset="sampleid!='HPGL0248'")
## There were 9, now there are 8 samples.
hs_sv_adjusted_fcqsl <- sm(normalize_expt(hs_sv_adjusted, convert="cpm", filter=TRUE,
batch="sva", norm="quant", transform="log2"))
hs_sv_adjusted_fcqsl_metr <- sm(graph_metrics(hs_sv_adjusted_fcqsl))
lp_macr_rrrrr_metr <- sm(graph_metrics(lp_macr))
lp_macr_fcqrr <- sm(normalize_expt(lp_macr, convert="cpm", filter=TRUE, norm="quant"))
lp_macr_fcqrr_metr <- sm(graph_metrics(lp_macr_fcqrr))
lp_macr_fcqfl <- sm(normalize_expt(lp_macr, convert="cpm", filter=TRUE, norm="quant",
batch="fsva", transform="log2"))
lp_macr_fcqfl_metr <- sm(graph_metrics(lp_macr_fcqfl))
lp_macr_fcqcl <- sm(normalize_expt(lp_macr, convert="cpm", filter=TRUE,
norm="quant", batch="combat_scale", transform="log2"))
lp_macr_fcqcl_metr <- sm(graph_metrics(lp_macr_fcqcl))
lp_macr_fcrfl <- sm(normalize_expt(lp_macr, filter=TRUE, norm="quant",
transform="log2", batch="fsva"))
lp_macr_fcrfl_metr <- sm(graph_metrics(lp_macr_fcrfl))
Now lets compare the results of using the human data to inform the parasite estimations.
Lets first print a few using different metrics before using the human SV estimates. It looks like sva etc are not too bad.
lp_macr_fcqrr_metr$pcaplot
## Just normalizing shows the central concern.
hs_sv_adjusted_fcqsl_metr$pcaplot
## The human surrogates work pretty well
lp_macr_fcqcl_metr$pcaplot
## Combat cannot deal with this data very well.
lp_macr_fcrfl_metr$pcaplot
## fsva gets a little confused, but not bad.
Start out with just condition and batch
## The default arguments are to query ~ condition + (1|batch)
## Which, because condition is categorical, will end badly.
lp_macr_vp_cb <- sm(varpart(lp_macr, predictor=NULL, factors=c("condition","batch")))
lp_macr_vp_cb$percent_plot
lp_macr_vp_cb$partition_plot
lp_macr_vp_lpqcf <- sm(varpart(lp_macr_lpqcf, predictor=NULL, factors=c("condition","batch")))
lp_macr_vp_lpqcf$percent_plot
lp_macr_vp_lpqcf$partition_plot
lp_macr_vp_hsfcqsl <- sm(varpart(hs_adjusted_fcqsl, predictor=NULL, factors=c("condition","batch")))
lp_macr_vp_hsfcqsl$percent_plot
lp_macr_vp_hsfcqsl$partition_plot
pander::pander(sessionInfo())
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
locale: LC_CTYPE=en_US.utf8, LC_NUMERIC=C, LC_TIME=en_US.utf8, LC_COLLATE=en_US.utf8, LC_MONETARY=en_US.utf8, LC_MESSAGES=en_US.utf8, LC_PAPER=en_US.utf8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.utf8 and LC_IDENTIFICATION=C
attached base packages: stats, graphics, grDevices, utils, datasets, methods and base
other attached packages: ruv(v.0.9.7) and hpgltools(v.2018.03)
loaded via a namespace (and not attached): backports(v.1.1.2), corrplot(v.0.84), aroma.light(v.3.8.0), plyr(v.1.8.4), lazyeval(v.0.2.1), survJamda.data(v.1.0.2), splines(v.3.4.4), BiocParallel(v.1.12.0), GenomeInfoDb(v.1.14.0), ggplot2(v.2.2.1), sva(v.3.26.0), digest(v.0.6.15), SuppDists(v.1.1-9.4), foreach(v.1.4.4), BiocInstaller(v.1.28.0), htmltools(v.0.3.6), gdata(v.2.18.0), magrittr(v.1.5), memoise(v.1.1.0), survJamda(v.1.1.4), doParallel(v.1.0.11), openxlsx(v.4.0.17), limma(v.3.34.9), Biostrings(v.2.46.0), annotate(v.1.56.2), matrixStats(v.0.53.1), R.utils(v.2.6.0), prettyunits(v.1.0.2), colorspace(v.1.3-2), blob(v.1.1.1), ggrepel(v.0.7.0), RCurl(v.1.95-4.10), graph(v.1.56.0), roxygen2(v.6.0.1), genefilter(v.1.60.0), lme4(v.1.1-17), survival(v.2.42-3), iterators(v.1.0.9), gtable(v.0.2.0), zlibbioc(v.1.24.0), XVector(v.0.18.0), DelayedArray(v.0.4.1), BiocGenerics(v.0.24.0), scales(v.0.5.0.9000), DESeq(v.1.30.0), DBI(v.0.8), edgeR(v.3.20.9), Rcpp(v.0.12.16), xtable(v.1.8-2), progress(v.1.1.2), bit(v.1.1-12), OrganismDbi(v.1.20.0), preprocessCore(v.1.40.0), stats4(v.3.4.4), lava(v.1.6.1), prodlim(v.2018.04.18), ecodist(v.2.0.1), httr(v.1.3.1), gplots(v.3.0.1), RColorBrewer(v.1.1-2), XML(v.3.98-1.11), R.methodsS3(v.1.7.1), locfit(v.1.5-9.1), labeling(v.0.3), rlang(v.0.2.0.9001), reshape2(v.1.4.3), AnnotationDbi(v.1.40.0), munsell(v.0.4.3), tools(v.3.4.4), RSQLite(v.2.1.0), devtools(v.1.13.5), evaluate(v.0.10.1), stringr(v.1.3.0), yaml(v.2.1.19), bootstrap(v.2017.2), knitr(v.1.20), bit64(v.0.9-7), pander(v.0.6.1), caTools(v.1.17.1), EDASeq(v.2.12.0), RBGL(v.1.54.0), nlme(v.3.1-137), R.oo(v.1.22.0), xml2(v.1.2.0), biomaRt(v.2.34.2), compiler(v.3.4.4), pbkrtest(v.0.4-7), testthat(v.2.0.0), variancePartition(v.1.8.1), tibble(v.1.4.2), statmod(v.1.4.30), geneplotter(v.1.56.0), stringi(v.1.1.7), highr(v.0.6), GenomicFeatures(v.1.30.3), lattice(v.0.20-35), Matrix(v.1.2-14), commonmark(v.1.5), survcomp(v.1.28.5), nloptr(v.1.0.4), survivalROC(v.1.0.3), pillar(v.1.2.2), data.table(v.1.10.4-3), bitops(v.1.0-6), corpcor(v.1.6.9), rtracklayer(v.1.38.3), GenomicRanges(v.1.30.3), colorRamps(v.2.3), R6(v.2.2.2), latticeExtra(v.0.6-28), directlabels(v.2017.03.31), hwriter(v.1.3.2), RMySQL(v.0.10.14), ShortRead(v.1.36.1), KernSmooth(v.2.23-15), gridExtra(v.2.3), IRanges(v.2.12.0), codetools(v.0.2-15), MASS(v.7.3-50), gtools(v.3.5.0), assertthat(v.0.2.0), SummarizedExperiment(v.1.8.1), rprojroot(v.1.3-2), withr(v.2.1.2), RUVSeq(v.1.12.0), GenomicAlignments(v.1.14.2), Rsamtools(v.1.30.0), S4Vectors(v.0.16.0), GenomeInfoDbData(v.1.0.0), mgcv(v.1.8-23), parallel(v.3.4.4), quadprog(v.1.5-5), grid(v.3.4.4), minqa(v.1.2.4), rmarkdown(v.1.9), Rtsne(v.0.13), Biobase(v.2.38.0), base64enc(v.0.1-3) and rmeta(v.3.0)
message(paste0("This is hpgltools commit: ", get_git_commit()))
## If you wish to reproduce this exact build of hpgltools, invoke the following:
## > git clone http://github.com/abelew/hpgltools.git
## > git reset 19d49d10f887097fff276c8f5a1db2dbc9168e8d
## R> packrat::restore()
## This is hpgltools commit: Wed May 2 10:24:05 2018 -0400: 19d49d10f887097fff276c8f5a1db2dbc9168e8d
this_save <- paste0(gsub(pattern="\\.Rmd", replace="", x=rmd_file), "-v", ver, ".rda.xz")
message(paste0("Saving to ", this_save))
## Saving to 02_estimation_macrophage-v20170820.rda.xz
tmp <- sm(saveme(filename=this_save))