Keith sent me a sample sheet describing some Mtb data. I do not quite know what the long-term goals are, but the immediate goal is to understand better some of the sources of variance and/or batch effect(s) in the data.
Because I am an asshat, I changed 2 column headings: Sample Name -> Sample ID and filepath -> file
## Reading the sample metadata.
## The sample definitions comprises: 19, 8 rows, columns.
## Reading count tables.
## Reading count tables with read.table().
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0618.count contains 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0619.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0620.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0621.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0622.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0623.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0624.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0625.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0626.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0628.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0629.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0664.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0665.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0666.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0668.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0669.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0670.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0671.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0672.count contains 41096 rows and merges to 41096 rows.
## Finished reading count tables.
## Matched 41091 annotations and counts.
## Bringing together the count matrix and gene information.
## sampleid condition batch infectiondate
## HPGL0618 HPGL0618 Uninfected_plus_IFNb A 2015/05/06
## HPGL0619 HPGL0619 Rv A 2015/05/06
## HPGL0620 HPGL0620 delta_esxA A 2015/05/06
## HPGL0621 HPGL0621 delta_esxA_comp A 2015/05/06
## HPGL0622 HPGL0622 Uninfected_plus_IFNb B 2015/05/06
## HPGL0623 HPGL0623 Rv B 2015/05/06
## HPGL0624 HPGL0624 delta_esxA B 2015/05/06
## HPGL0625 HPGL0625 delta_esxA_comp B 2015/05/06
## HPGL0626 HPGL0626 Uninfected_plus_IFNb C 2015/05/12
## HPGL0628 HPGL0628 delta_esxA C 2015/05/12
## HPGL0629 HPGL0629 delta_esxA_comp C 2015/05/12
## HPGL0664 HPGL0664 Uninfected D 2015/11/04
## HPGL0665 HPGL0665 Uninfected_plus_IFNb D 2015/11/04
## HPGL0666 HPGL0666 Uninfected E 2015/11/13
## HPGL0668 HPGL0668 Rv E 2015/11/13
## HPGL0669 HPGL0669 delta_esxA E 2015/11/13
## HPGL0670 HPGL0670 delta_esxA_comp E 2015/11/13
## HPGL0671 HPGL0671 Uninfected F 2015/05/06
## HPGL0672 HPGL0672 Uninfected_plus_IFNb F 2015/05/12
## rnacollectiondate rnaisolationdate libraryprepdate
## HPGL0618 2015/05/06 2015/06/28 2015/07/01
## HPGL0619 2015/05/06 2015/06/28 2015/07/01
## HPGL0620 2015/05/06 2015/06/28 2015/07/01
## HPGL0621 2015/05/06 2015/06/28 2015/07/01
## HPGL0622 2015/05/06 2015/06/28 2015/07/01
## HPGL0623 2015/05/06 2015/06/28 2015/07/01
## HPGL0624 2015/05/06 2015/06/28 2015/07/01
## HPGL0625 2015/05/06 2015/06/28 2015/07/01
## HPGL0626 2015/05/12 2015/06/28 2015/07/01
## HPGL0628 2015/05/12 2015/06/28 2015/07/01
## HPGL0629 2015/05/12 2015/06/28 2015/07/01
## HPGL0664 2015/11/04 2015/12/16 2015/12/18
## HPGL0665 2015/11/04 2015/12/16 2015/12/18
## HPGL0666 2015/11/13 2015/12/16 2015/12/18
## HPGL0668 2015/11/13 2015/12/16 2015/12/18
## HPGL0669 2015/11/13 2015/12/16 2015/12/18
## HPGL0670 2015/11/13 2015/12/16 2015/12/18
## HPGL0671 2015/12/10 2015/12/16 2015/12/18
## HPGL0672 2015/12/10 2015/12/16 2015/12/18
## file
## HPGL0618 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0618.count
## HPGL0619 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0619.count
## HPGL0620 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0620.count
## HPGL0621 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0621.count
## HPGL0622 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0622.count
## HPGL0623 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0623.count
## HPGL0624 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0624.count
## HPGL0625 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0625.count
## HPGL0626 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0626.count
## HPGL0628 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0628.count
## HPGL0629 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0629.count
## HPGL0664 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0664.count
## HPGL0665 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0665.count
## HPGL0666 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0666.count
## HPGL0668 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0668.count
## HPGL0669 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0669.count
## HPGL0670 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0670.count
## HPGL0671 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0671.count
## HPGL0672 /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0672.count
## Reading the sample metadata.
## The sample definitions comprises: 19, 11 rows, columns.
## Reading count tables.
## Reading count tables with read.table().
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0618.count contains 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0619.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0620.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0621.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0622.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0623.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0624.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0625.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0626.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0628.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0629.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0664.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0665.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0666.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0668.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0669.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0670.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0671.count contains 41096 rows and merges to 41096 rows.
## /cbcb/nelsayed-scratch/keith/mtb_ifnb/counts/mmusculus/HPGL0672.count contains 41096 rows and merges to 41096 rows.
## Finished reading count tables.
## Matched 41091 annotations and counts.
## Bringing together the count matrix and gene information.
initial_metrics <- sm(graph_metrics(mtb_expt, qq=TRUE, ma=FALSE))
dirty_norm <- sm(normalize_expt(mtb_expt, transform="log2", convert="cpm", norm="quant", filter=TRUE))
dirty_metrics <- sm(graph_metrics(dirty_norm))
initial_metrics$libsize
initial_metrics$nonzero
initial_metrics$boxplot
initial_metrics$density
## Now the normalized plots
dirty_metrics$pcaplot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
### hmm too many things going on for me to see
dirty_info <- pca_information(
dirty_norm,
expt_factors=c("condition", "batch", "rnaisolationdate", "libraryprepdate",
"infectiondate", "rnacollectiondate"),
plot_pcas=TRUE)
## More shallow curves in these plots suggest more genes in this principle component.
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
dirty_info$anova_fstat_heatmap
## hmm so PC2 is pretty hard up for all the metadata factors we don't want
dirty_info$pca_plots$PC1_PC3
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## to my eyes, pulling PCs 1 and 3 puts everybody except the esx samples in reasonably good order.
## In a moment, lets look at those jerks alone
## Pull just Rv vs. uninfected
rv_uninf <- subset_expt(dirty_norm, subset="condition=='Rv'|condition=='Uninfected'")
## There were 19, now there are 6 samples.
rv_uninf_pca <- plot_pca(rv_uninf)
summary(rv_uninf_pca)
## Length Class Mode
## pca 3 -none- list
## plot 9 gg list
## table 8 data.frame list
## res 4 data.frame list
## variance 5 -none- numeric
rv_uninf_pca$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## ok, so RV is left with 3 batches intersecting on E while uninfected is on right with 2 separate batches.
## 66+% of the variance is in PCA1/2
## the two strains labeled 'esx' arbitrarily chosen
esx <- subset_expt(dirty_norm, subset="condition=='delta_esxA'|condition=='delta_esxA_comp'")
## There were 19, now there are 8 samples.
esx_pca <- plot_pca(esx)
esx_pca$plot
def <- subset_expt(dirty_norm, subset="batch=='D'|batch=='E'|batch=='F'")
## There were 19, now there are 8 samples.
tt <- pca_information(
def,
expt_factors=c("condition", "batch", "rnaisolationdate", "libraryprepdate",
"infectiondate", "rnacollectiondate"),
plot_pcas=TRUE)
## More shallow curves in these plots suggest more genes in this principle component.
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## The standard deviation was 0 for rnaisolationdate and PC1.
## The standard deviation was 0 for rnaisolationdate and PC2.
## The standard deviation was 0 for rnaisolationdate and PC3.
## The standard deviation was 0 for rnaisolationdate and PC4.
## The standard deviation was 0 for rnaisolationdate and PC5.
## The standard deviation was 0 for rnaisolationdate and PC6.
## The standard deviation was 0 for libraryprepdate and PC1.
## The standard deviation was 0 for libraryprepdate and PC2.
## The standard deviation was 0 for libraryprepdate and PC3.
## The standard deviation was 0 for libraryprepdate and PC4.
## The standard deviation was 0 for libraryprepdate and PC5.
## The standard deviation was 0 for libraryprepdate and PC6.
plot_pca(def)$plot
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
Quick conversation with Keith, acquired some other ideas about the meta-information Therefore made a new design sheet with a couple extra factors
atb_norm <- normalize_expt(mtb_atb, transform="log2", convert="cpm", filter=TRUE, norm="quant")
## This function will replace the expt$expressionset slot with:
## log2(cpm(quant(hpgl(data))))
## It backs up the current data into a slot named:
## expt$backup_expressionset. It will also save copies of each step along the way
## in expt$normalized with the corresponding libsizes. Keep the libsizes in mind
## when invoking limma. The appropriate libsize is the non-log(cpm(normalized)).
## This is most likely kept at:
## 'new_expt$normalized$intermediate_counts$normalization$libsizes'
## A copy of this may also be found at:
## new_expt$best_libsize
## Not correcting the count-data for batch effects. If batch is
## included in EdgerR/limma's model, then this is probably wise; but in extreme
## batch effects this is a good parameter to play with.
## Step 1: performing count filter with option: hpgl
## Removing 29475 low-count genes (11616 remaining).
## Step 2: normalizing the data with quant.
## Step 3: converting the data with cpm.
## Step 4: transforming the data with log2.
## transform_counts: Found 58 values equal to 0, adding 1 to the matrix.
## Step 5: not doing batch correction.
atb_pca <- pca_information(
atb_norm,
num_components=5,
expt_factors=c("condition1", "condition", "batch", "ifnadded", "infectstatus",
"rnaisolationdate", "libraryprepdate", "infectiondate", "rnacollectiondate"),
plot_pcas=TRUE)
## More shallow curves in these plots suggest more genes in this principle component.
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
atb_pca$anova_p
## PC1 PC2 PC3 PC4 PC5
## condition1 0.0001992444 0.035224 0.36320890 0.8672 0.8384
## condition 0.0689378520 0.309709 0.72108528 0.7170 0.5377
## batch 0.0151229165 0.384079 0.00871834 0.3846 0.5685
## ifnadded 0.0001003657 0.006406 0.74148980 0.8211 0.8428
## infectstatus 0.0000002038 0.176583 0.32590940 0.8795 0.6271
## rnaisolationdate 0.0034424334 0.441193 0.00142764 0.4846 0.6499
## libraryprepdate 0.0034424334 0.441193 0.00142764 0.4846 0.6499
## infectiondate 0.2267300056 0.537462 0.00009331 0.8140 0.6277
## rnacollectiondate 0.0103726974 0.413962 0.00610853 0.2198 0.7327
fun <- compare_surrogate_estimates(
mtb_atb,
extra_factors=c("ifnadded", "infectstatus", "rnaisolationdate"))
## The expt has not been filtered, set filter_type/filter_it if you want other options.
## The be method chose 3 surrogate variable(s).
## Attempting pca surrogate estimation with 3 surrogates.
## The be method chose 3 surrogate variable(s).
## Attempting sva supervised surrogate estimation with 3 surrogates.
## The be method chose 3 surrogate variable(s).
## Attempting sva unsupervised surrogate estimation with 3 surrogates.
## The be method chose 3 surrogate variable(s).
## Attempting ruvseq supervised surrogate estimation with 3 surrogates.
## The be method chose 3 surrogate variable(s).
## Attempting ruvseq residual surrogate estimation with 3 surrogates.
## The be method chose 3 surrogate variable(s).
## Attempting ruvseq empirical surrogate estimation with 3 surrogates.
## 1/8: Performing lmFit(data) etc. with null in the model.
## Setting options('download.file.method.GEOquery'='auto')
## Setting options('GEOquery.inmemory.gpl'=FALSE)
## 2/8: Performing lmFit(data) etc. with + batch_adjustments$batch in the model.
## 3/8: Performing lmFit(data) etc. with + batch_adjustments$pca in the model.
## 4/8: Performing lmFit(data) etc. with + batch_adjustments$sva_sup in the model.
## 5/8: Performing lmFit(data) etc. with + batch_adjustments$sva_unsup in the model.
## 6/8: Performing lmFit(data) etc. with + batch_adjustments$ruv_sup in the model.
## 7/8: Performing lmFit(data) etc. with + batch_adjustments$ruv_resid in the model.
## 8/8: Performing lmFit(data) etc. with + batch_adjustments$ruv_emp in the model.
fun$pca_plots
## $null
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
##
## $pca
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
##
## $svasup
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
##
## $svaunsup
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
##
## $ruvsup
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
##
## $ruvresid
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
##
## $ruvemp
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
if (!isTRUE(get0("skip_load"))) {
pander::pander(sessionInfo())
message(paste0("This is hpgltools commit: ", get_git_commit()))
this_save <- paste0(gsub(pattern="\\.Rmd", replace="", x=rmd_file), "-v", ver, ".rda.xz")
message(paste0("Saving to ", this_save))
tmp <- sm(saveme(filename=this_save))
}
## If you wish to reproduce this exact build of hpgltools, invoke the following:
## > git clone http://github.com/abelew/hpgltools.git
## > git reset 7de4503f6bb5724c28cce24af5dbee22bb1c0cae
## R> packrat::restore()
## This is hpgltools commit: Thu Apr 12 22:08:53 2018 -0400: 7de4503f6bb5724c28cce24af5dbee22bb1c0cae
## Saving to index-v20180330.rda.xz