
1 Raw global metrics

## Graph the raw metrics of all samples mapped against the transcripts and miRNAs
mmmi_mature_metrics <- sm(graph_metrics(mmmi_mature))
mmmi_mature_norm <- default_norm(mmmi_mature, transform="log2")
mmmi_mature_norm_metrics <- sm(graph_metrics(mmmi_mature_norm))
mmmi_mature_writer <- write_expt(mmmi_mature, excel=paste0("excel/mature_data-v", ver, ".xlsx"))
## Writing the legend.
## The sheet: legend is in legend.
## Writing the raw reads.
## Graphing the raw reads.
## Warning: Transformation introduced infinite values in continuous x-axis
## Warning: Transformation introduced infinite values in continuous y-axis
## This function will replace the expt$expressionset slot with:
## log2(cpm(data))
## It backs up the current data into a slot named:
##  expt$backup_expressionset. It will also save copies of each step along the way
##  in expt$normalized with the corresponding libsizes. Keep the libsizes in mind
##  when invoking limma.  The appropriate libsize is the non-log(cpm(normalized)).
##  This is most likely kept at:
##  'new_expt$normalized$intermediate_counts$normalization$libsizes'
##  A copy of this may also be found at:
##  new_expt$best_libsize
## Filter is false, this should likely be set to something, good
##  choices include cbcb, kofa, pofa (anything but FALSE).  If you want this to
##  stay FALSE, keep in mind that if other normalizations are performed, then the
##  resulting libsizes are likely to be strange (potentially negative!)
## Leaving the data unnormalized.  This is necessary for DESeq, but
##  EdgeR/limma might benefit from normalization.  Good choices include quantile,
##  size-factor, tmm, etc.
## Not correcting the count-data for batch effects.  If batch is
##  included in EdgerR/limma's model, then this is probably wise; but in extreme
##  batch effects this is a good parameter to play with.
## Step 1: not doing count filtering.
## Step 2: not normalizing the data.
## Step 3: converting the data with cpm.
## Step 4: transforming the data with log2.
## transform_counts: Found 5653 values equal to 0, adding 1 to the matrix.
## Step 5: not doing batch correction.
## The sheet: raw_graphs is in legend, raw_reads, raw_graphs.
## The sheet: raw_graphs is in legend, raw_reads, raw_graphs.
## Writing the normalized reads.
## This function will replace the expt$expressionset slot with:
## log2(sva(cpm(quant(cbcb(data)))))
## It backs up the current data into a slot named:
##  expt$backup_expressionset. It will also save copies of each step along the way
##  in expt$normalized with the corresponding libsizes. Keep the libsizes in mind
##  when invoking limma.  The appropriate libsize is the non-log(cpm(normalized)).
##  This is most likely kept at:
##  'new_expt$normalized$intermediate_counts$normalization$libsizes'
##  A copy of this may also be found at:
##  new_expt$best_libsize
## Warning in normalize_expt(expt = expt, transform = transform, norm = norm, : Quantile
## normalization and sva do not always play well together.
## Step 1: performing count filter with option: cbcb
## Removing 801 low-count genes (427 remaining).
## Step 2: normalizing the data with quant.
## Step 3: converting the data with cpm.
## Step 4: transforming the data with log2.
## transform_counts: Found 2 values equal to 0, adding 1 to the matrix.
## Step 5: doing batch correction with sva.
## Note to self:  If you get an error like 'x contains missing values'; I think this means that the data has too many 0's and needs to have a better low-count filter applied.
## Note to self:  I keep forgetting this, but the most common batch correction performed in Dr. El-Sayed's lab is implemented here as 'limmaresid'.
## batch_counts: Before batch correction, 390 entries 0<x<1.
## batch_counts: Before batch correction, 2 entries are >= 0.
## Passing the batch method to get_model_adjust().
## It currently understands: 'sva_(un)supervised', 'ruv_empirical', 'svaseq', 'pca', 'ruv_supervised', and 'ruv_residuals'.
## Not able to discern the state of the data.  Going to use a simplistic metric to guess if it is log scale.
## The be method chose 1 surrogate variable(s).
## Estimate type 'sva' is shorthand for 'sva_unsupervised'.
## Other sva options include: sva_supervised and svaseq.
## Attempting sva unsupervised surrogate estimation.
## The number of elements which are < 0 after batch correction is: 383
## The variable low_to_zero sets whether to change <0 values to 0 and is: FALSE
## Graphing the normalized reads.
## Warning in self$trans$transform(x): NaNs produced

## Warning in self$trans$transform(x): Transformation introduced infinite values in
## continuous x-axis
## Warning: Transformation introduced infinite values in continuous y-axis
## The sheet: norm_graphs is in legend, raw_reads, raw_graphs, norm_data, norm_graphs.
## The sheet: norm_graphs is in legend, raw_reads, raw_graphs, norm_data, norm_graphs.
## Writing the median reads by factor.
## The factor cell_mirna_mut has 2 rows.
## The factor cell_mirna_wt has 2 rows.
## The factor exo_mirna_mut has 2 rows.
## The factor exo_mirna_wt has 2 rows.
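
The first log above reports the default pipeline as log2(cpm(data)), with 1 added to the matrix because zeros were found. The following is a minimal base-R sketch of that arithmetic on a toy count matrix. The matrix values and feature names are hypothetical, and this is not the hpgltools implementation, only an illustration of the conversion and transformation steps.

```r
## Toy count matrix: 3 features x 2 samples (hypothetical values).
counts <- matrix(c(10, 0, 90, 5, 5, 40), nrow = 3,
                 dimnames = list(c("mir_a", "mir_b", "mir_c"), c("s1", "s2")))

## Library size is the total count per sample (column).
libsizes <- colSums(counts)

## Counts per million: divide each column by its library size, then scale.
cpm <- sweep(counts, 2, libsizes, "/") * 1e6

## Add 1 before taking log2 so that zero counts stay finite.
log_cpm <- log2(cpm + 1)
```

Without the added 1, the zero count for mir_b in s1 would become -Inf, which is the kind of value behind the "Transformation introduced infinite values" plotting warnings.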

1.1 A Legend!

First, print the legend showing which colors are used and what each one means:

mmmi_mature_metrics$legend$plot

## This should be the same for the mm_mi and mm_tx objects.

1.2 Start with some global metrics

mmmi_mature_metrics$libsize

mmmi_mature_norm_metrics$corheat

mmmi_mature_norm_metrics$pcaplot

mmmi_mature_de <- all_pairwise(mmmi_mature, model_batch="sva")
## The be method chose 1 surrogate variable(s).
## Estimate type 'sva' is shorthand for 'sva_unsupervised'.
## Other sva options include: sva_supervised and svaseq.
## Attempting sva unsupervised surrogate estimation.
## Finished running DE analyses, collecting outputs.
## Comparing analyses 1/6: cell_mirna_wt_vs_cell_mirna_mut
## Comparing analyses 2/6: exo_mirna_mut_vs_cell_mirna_mut
## Comparing analyses 3/6: exo_mirna_wt_vs_cell_mirna_mut
## Comparing analyses 4/6: exo_mirna_mut_vs_cell_mirna_wt
## Comparing analyses 5/6: exo_mirna_wt_vs_cell_mirna_wt
## Comparing analyses 6/6: exo_mirna_wt_vs_exo_mirna_mut
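
The six comparisons in the log follow from the four conditions: choose(4, 2) = 6 unordered pairs. A small sketch that enumerates them in base R and reproduces the names printed above is shown here. The naming rule (second member of each pair as numerator) is inferred from the log, not taken from the all_pairwise() source.

```r
## The four conditions, in the sorted order implied by the log.
conditions <- c("cell_mirna_mut", "cell_mirna_wt", "exo_mirna_mut", "exo_mirna_wt")

## Every unordered pair of conditions, one pair per column.
pairs <- combn(conditions, 2)

## Name each contrast as <later condition>_vs_<earlier condition>.
contrast_names <- paste0(pairs[2, ], "_vs_", pairs[1, ])
print(contrast_names)
```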

mature_keepers <- list(
    "mutvwt_cell" = c("cell_mirna_mut", "cell_mirna_wt"),
    "mutvwt_exo" = c("exo_mirna_mut", "exo_mirna_wt"),
    "exovcell_wt" = c("exo_mirna_wt", "cell_mirna_wt"),
    "exovcell_mut" = c("exo_mirna_mut", "cell_mirna_mut"))
mmmi_mature_tables <- combine_de_tables(mmmi_mature_de,
                                        excel=paste0("excel/mature", ver, ".xlsx"),
                                        keepers=mature_keepers)
## Deleting the file excel/mature15.xlsx before writing the tables.
## Writing a legend of columns.
## Working on 1/4: mutvwt_cell
## Found inverse table with cell_mirna_wt_vs_cell_mirna_mut
## Working on 2/4: mutvwt_exo
## Found inverse table with exo_mirna_wt_vs_exo_mirna_mut
## Working on 3/4: exovcell_wt
## Found table with exo_mirna_wt_vs_cell_mirna_wt
## Working on 4/4: exovcell_mut
## Found table with exo_mirna_mut_vs_cell_mirna_mut
## Adding venn plots for mutvwt_cell.

## Limma expression coefficients for mutvwt_cell; R^2: 0.975; equation: y = 0.969x - 0.161

## Warning: Removed 838 rows containing non-finite values (stat_smooth).
## Warning: Removed 1 rows containing missing values (geom_vline).

## Warning: Removed 1 rows containing missing values (geom_vline).
## Warning: Removed 1 rows containing missing values (geom_hline).

## Warning: Removed 1 rows containing missing values (geom_hline).

## Warning: Removed 1 rows containing missing values (geom_hline).
## Warning: Removed 1 rows containing missing values (geom_vline).
## Warning: Removed 838 rows containing missing values (geom_point).
## Warning: Removed 212 rows containing missing values (geom_point).
## Warning: Removed 73 rows containing missing values (geom_point).
## Warning: Removed 838 rows containing missing values (geom_point).
## Edger expression coefficients for mutvwt_cell; R^2: 0.971; equation: y = 0.993x + 0.195

## DESeq2 expression coefficients for mutvwt_cell; R^2: 0.985; equation: y = 1.01x - 0.04

## Warning: Removed 237 rows containing non-finite values (stat_smooth).
## Warning: Removed 1 rows containing missing values (geom_vline).
## Warning: Removed 1 rows containing missing values (geom_hline).
## Warning: Removed 237 rows containing missing values (geom_point).
## Warning: Removed 125 rows containing missing values (geom_point).
## Warning: Removed 64 rows containing missing values (geom_point).
## Warning: Removed 237 rows containing missing values (geom_point).
## Adding venn plots for mutvwt_exo.

## Limma expression coefficients for mutvwt_exo; R^2: 0.252; equation: y = 0.0113x - 1.26

## Warning: Removed 887 rows containing non-finite values (stat_smooth).
## Warning: Removed 1 rows containing missing values (geom_vline).

## Warning: Removed 1 rows containing missing values (geom_vline).
## Warning: Removed 1 rows containing missing values (geom_hline).

## Warning: Removed 1 rows containing missing values (geom_hline).

## Warning: Removed 1 rows containing missing values (geom_hline).
## Warning: Removed 1 rows containing missing values (geom_vline).
## Warning: Removed 887 rows containing missing values (geom_point).
## Warning: Removed 198 rows containing missing values (geom_point).
## Warning: Removed 66 rows containing missing values (geom_point).
## Warning: Removed 887 rows containing missing values (geom_point).
## Edger expression coefficients for mutvwt_exo; R^2: 0.351; equation: y = -0.009x + 17.5

## DESeq2 expression coefficients for mutvwt_exo; R^2: 0.997; equation: y = 0.986x + 0.043

## Warning: Removed 232 rows containing non-finite values (stat_smooth).
## Warning: Removed 1 rows containing missing values (geom_vline).
## Warning: Removed 1 rows containing missing values (geom_hline).
## Warning: Removed 232 rows containing missing values (geom_point).
## Warning: Removed 47 rows containing missing values (geom_point).
## Warning: Removed 123 rows containing missing values (geom_point).
## Warning: Removed 232 rows containing missing values (geom_point).
## Adding venn plots for exovcell_wt.

## Limma expression coefficients for exovcell_wt; R^2: 0.903; equation: y = 1.11x - 0.685

## Warning: Removed 861 rows containing non-finite values (stat_smooth).
## Warning: Removed 1 rows containing missing values (geom_vline).

## Warning: Removed 1 rows containing missing values (geom_vline).
## Warning: Removed 1 rows containing missing values (geom_hline).

## Warning: Removed 1 rows containing missing values (geom_hline).

## Warning: Removed 1 rows containing missing values (geom_hline).
## Warning: Removed 1 rows containing missing values (geom_vline).
## Warning: Removed 861 rows containing missing values (geom_point).
## Warning: Removed 23 rows containing missing values (geom_point).
## Warning: Removed 199 rows containing missing values (geom_point).
## Warning: Removed 861 rows containing missing values (geom_point).
## Edger expression coefficients for exovcell_wt; R^2: 0.998; equation: y = 1.05x - 0.898

## DESeq2 expression coefficients for exovcell_wt; R^2: 0.989; equation: y = 0.98x - 0.0233

## Warning: Removed 232 rows containing non-finite values (stat_smooth).
## Warning: Removed 1 rows containing missing values (geom_vline).
## Warning: Removed 1 rows containing missing values (geom_hline).
## Warning: Removed 232 rows containing missing values (geom_point).
## Warning: Removed 87 rows containing missing values (geom_point).
## Warning: Removed 56 rows containing missing values (geom_point).
## Warning: Removed 232 rows containing missing values (geom_point).
## Adding venn plots for exovcell_mut.

## Limma expression coefficients for exovcell_mut; R^2: 0.989; equation: y = 1.24x - 1.78

## Warning: Removed 889 rows containing non-finite values (stat_smooth).
## Warning: Removed 1 rows containing missing values (geom_vline).

## Warning: Removed 1 rows containing missing values (geom_vline).
## Warning: Removed 1 rows containing missing values (geom_hline).

## Warning: Removed 1 rows containing missing values (geom_hline).

## Warning: Removed 1 rows containing missing values (geom_hline).
## Warning: Removed 1 rows containing missing values (geom_vline).
## Warning: Removed 889 rows containing missing values (geom_point).
## Warning: Removed 38 rows containing missing values (geom_point).
## Warning: Removed 241 rows containing missing values (geom_point).
## Warning: Removed 889 rows containing missing values (geom_point).
## Edger expression coefficients for exovcell_mut; R^2: 0.893; equation: y = 0.939x + 0.605

## DESeq2 expression coefficients for exovcell_mut; R^2: 0.988; equation: y = 0.997x - 0.0747

## Warning: Removed 249 rows containing non-finite values (stat_smooth).
## Warning: Removed 1 rows containing missing values (geom_vline).
## Warning: Removed 1 rows containing missing values (geom_hline).
## Warning: Removed 249 rows containing missing values (geom_point).
## Warning: Removed 137 rows containing missing values (geom_point).
## Warning: Removed 30 rows containing missing values (geom_point).
## Warning: Removed 249 rows containing missing values (geom_point).
## Writing summary information.
## The sheet: pairwise_summary is in legend, mutvwt_cell, mutvwt_exo, exovcell_wt, exovcell_mut, pairwise_summary.
## Attempting to add the comparison plot to pairwise_summary at row: 22 and column: 1

## Performing save of the workbook.
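
The "Found table" and "Found inverse table" messages above depend on whether a requested keeper matches an existing contrast name directly or only with numerator and denominator swapped. The sketch below illustrates that lookup. The helper find_table() is hypothetical, and it assumes that inverting a table amounts to negating the log fold change; combine_de_tables() may do more than this.

```r
## Contrast names as reported by the all_pairwise() log.
available <- c("cell_mirna_wt_vs_cell_mirna_mut", "exo_mirna_mut_vs_cell_mirna_mut",
               "exo_mirna_wt_vs_cell_mirna_mut", "exo_mirna_mut_vs_cell_mirna_wt",
               "exo_mirna_wt_vs_cell_mirna_wt", "exo_mirna_wt_vs_exo_mirna_mut")

## Hypothetical helper: locate the table for a c(numerator, denominator) pair.
find_table <- function(pair, available) {
  direct <- paste0(pair[1], "_vs_", pair[2])
  inverse <- paste0(pair[2], "_vs_", pair[1])
  if (direct %in% available) {
    list(table = direct, invert = FALSE)
  } else if (inverse %in% available) {
    list(table = inverse, invert = TRUE)
  } else {
    stop("No table found for ", direct)
  }
}

## The mutvwt_cell keeper only exists in the opposite orientation.
hit <- find_table(c("cell_mirna_mut", "cell_mirna_wt"), available)

## Assumed meaning of inversion: flip the sign of the log fold change.
toy_logfc <- 1.5
reported <- if (hit$invert) -toy_logfc else toy_logfc
```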

saveme(filename="mature.rda.xz")
## The savefile is: /cbcb/nelsayed-scratch/atb/small_rna/mmusculus_exosomev2/savefiles/mature.rda.xz
## Renaming /cbcb/nelsayed-scratch/atb/small_rna/mmusculus_exosomev2/savefiles/mature.rda.xz to /cbcb/nelsayed-scratch/atb/small_rna/mmusculus_exosomev2/savefiles/mature.rda.xz.01.
## The save string is: con <- base::pipe(paste0('pxz -T4 > /cbcb/nelsayed-scratch/atb/small_rna/mmusculus_exosomev2/savefiles/mature.rda.xz'), 'wb');
##  save(list=ls(all.names=TRUE, envir=globalenv()), envir=globalenv(), file=con, compress=FALSE);
##  close(con)
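
The save string above streams save() output through the external pxz program. Where pxz is unavailable, a similar result can be sketched with base R's built-in xz compression. This is a portable alternative, not what saveme() does; the object and the temporary file path are hypothetical.

```r
## A toy object standing in for the analysis workspace.
toy <- list(label = "mature", note = "toy object")

## Write to a temporary file using R's built-in xz compression.
path <- tempfile(fileext = ".rda.xz")
save(toy, file = path, compress = "xz")

## Load into a separate environment to avoid overwriting existing objects.
restored <- new.env()
load(path, envir = restored)
```

Loading into a dedicated environment makes it easy to check what a savefile contains before merging it into the global environment.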
