1 Lots of samples!

1.1 Load all the data

## The biomart annotations file already exists, loading from it.
## Reading the sample metadata.
## The sample definitions comprises: 285 rows(samples) and 30 columns(metadata fields).
## Reading count tables.
## Using the transcript to gene mapping.
## Reading salmon data with tximport.
## Finished reading count tables.
## Matched 19629 annotations and counts.
## Bringing together the count matrix and gene information.
## Hey, your new gene map IDs are not the rownames of your gene information, changing them now.
## Some annotations were lost in merging, setting them to 'undefined'.

3 Perform a sample run of GSVA

Take the expressionset loaded above and run GSVA on it, using a simple, controlled subset of the samples as the trainer.

## There were 266, now there are 203 samples.
## This function will replace the expt$expressionset slot with:
## rpkm(hpgl(data))
## It backs up the current data into a slot named:
##  expt$backup_expressionset. It will also save copies of each step along the way
##  in expt$normalized with the corresponding libsizes. Keep the libsizes in mind
##  when invoking limma.  The appropriate libsize is the non-log(cpm(normalized)).
##  This is most likely kept at:
##  'new_expt$normalized$intermediate_counts$normalization$libsizes'
##  A copy of this may also be found at:
##  new_expt$best_libsize
## Leaving the data in its current base format, keep in mind that
##  some metrics are easier to see when the data is log2 transformed, but
##  EdgeR/DESeq do not accept transformed data.
## Leaving the data unnormalized.  This is necessary for DESeq, but
##  EdgeR/limma might benefit from normalization.  Good choices include quantile,
##  size-factor, tmm, etc.
## Not correcting the count-data for batch effects.  If batch is
##  included in EdgerR/limma's model, then this is probably wise; but in extreme
##  batch effects this is a good parameter to play with.
## Step 1: performing count filter with option: hpgl
## Removing 5595 low-count genes (14034 remaining).
## Step 2: not normalizing the data.
## Step 3: converting the data with rpkm.
## Step 4: not transforming the data.
## Step 5: not doing batch correction.
## Converting the rownames() of the expressionset to ENTREZID.
## Converting the rownames() of the expressionset to ENTREZID.
## Converting the rownames() of the expressionset to ENTREZID.
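
The log above reports a low-count filter followed by an RPKM conversion. As a rough illustration of what those two steps do to a count matrix, here is a minimal base-R sketch. The helper names `filter_low_counts()` and `to_rpkm()`, the toy counts, the gene lengths, and the `min_total` threshold are all assumptions made for this example; the simple row-sum filter stands in for, and is not, the 'hpgl' filter named in the log.

```r
# Toy count matrix: 4 genes x 3 samples (made-up numbers).
counts <- matrix(
  c(100, 200, 150,
      0,   1,   0,
     50,  80,  60,
    500, 400, 450),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)
# Made-up gene lengths in bases, named by gene.
gene_lengths <- c(gene1 = 1000, gene2 = 2000, gene3 = 500, gene4 = 4000)

# Hypothetical filter: keep genes whose summed count reaches min_total.
filter_low_counts <- function(mtrx, min_total = 10) {
  mtrx[rowSums(mtrx) >= min_total, , drop = FALSE]
}

# RPKM: reads per kilobase of gene length per million reads in the library.
# Library sizes are taken from the matrix that is passed in.
to_rpkm <- function(mtrx, lengths) {
  libsizes <- colSums(mtrx)
  per_million <- sweep(mtrx, 2, libsizes / 1e6, FUN = "/")
  sweep(per_million, 1, lengths[rownames(mtrx)] / 1e3, FUN = "/")
}

filtered <- filter_low_counts(counts)
rpkm_values <- to_rpkm(filtered, gene_lengths)
print(dim(rpkm_values))
```

Because the filter runs first, the library sizes here are computed from the filtered matrix; computing them from the unfiltered counts would be an equally defensible choice and gives slightly different values.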

4 Invoke GSVA

##   [1] "Melanocytes%FANTOM%1.txt"                                                 
##   [2] "GSE9988_ANTI_TREM1_VS_VEHICLE_TREATED_MONOCYTES_UP"                       
##   [3] "GSE9988_ANTI_TREM1_VS_CTRL_TREATED_MONOCYTES_UP"                          
##   [4] "GSE9988_ANTI_TREM1_AND_LPS_VS_CTRL_TREATED_MONOCYTES_UP"                  
##   [5] "Mesangial cells%ENCODE%2.txt"                                             
##   [6] "GSE9988_ANTI_TREM1_AND_LPS_VS_VEHICLE_TREATED_MONOCYTES_UP"               
##   [7] "Epithelial cells%HPCA%2.txt"                                              
##   [8] "GSE22140_HEALTHY_VS_ARTHRITIC_GERMFREE_MOUSE_CD4_TCELL_DN"                
##   [9] "Chondrocytes%HPCA%3.txt"                                                  
##  [10] "GSE9988_LOW_LPS_VS_ANTI_TREM1_AND_LPS_MONOCYTE_DN"                        
##  [11] "ELSAYED_MACROPHAGE_INFECTED_HGH"                                          
##  [12] "Epithelial cells%HPCA%1.txt"                                              
##  [13] "GSE9988_LPS_VS_LPS_AND_ANTI_TREM1_MONOCYTE_DN"                            
##  [14] "NK cells%NOVERSHTERN%1.txt"                                               
##  [15] "Monocytes%IRIS%2.txt"                                                     
##  [16] "GSE9988_LOW_LPS_VS_CTRL_TREATED_MONOCYTE_UP"                              
##  [17] "Neurons%ENCODE%1.txt"                                                     
##  [18] "Monocytes%BLUEPRINT%3.txt"                                                
##  [19] "GSE2706_UNSTIM_VS_2H_R848_DC_DN"                                          
##  [20] "GSE9988_LOW_LPS_VS_VEHICLE_TREATED_MONOCYTE_UP"                           
##  [21] "GSE9988_LPS_VS_VEHICLE_TREATED_MONOCYTE_UP"                               
##  [22] "ELSAYED_MACROPHAGE_UNINFECTED_HGH"                                        
##  [23] "cDC%HPCA%3.txt"                                                           
##  [24] "GSE22140_GERMFREE_VS_SPF_MOUSE_CD4_TCELL_DN"                              
##  [25] "GSE22886_DAY0_VS_DAY1_MONOCYTE_IN_CULTURE_DN"                             
##  [26] "Monocytes%NOVERSHTERN%3.txt"                                              
##  [27] "Chondrocytes%HPCA%2.txt"                                                  
##  [28] "GSE19401_NAIVE_VS_IMMUNIZED_MOUSE_PLN_FOLLICULAR_DC_UP"                   
##  [29] "Keratinocytes%ENCODE%3.txt"                                               
##  [30] "GSE9988_LPS_VS_CTRL_TREATED_MONOCYTE_UP"                                  
##  [31] "GSE2706_UNSTIM_VS_2H_LPS_DC_DN"                                           
##  [32] "HSC%FANTOM%3.txt"                                                         
##  [33] "HSC%NOVERSHTERN%2.txt"                                                    
##  [34] "GSE22196_HEALTHY_VS_OBESE_MOUSE_SKIN_GAMMADELTA_TCELL_DN"                 
##  [35] "Monocytes%IRIS%1.txt"                                                     
##  [36] "Eosinophils%BLUEPRINT%3.txt"                                              
##  [37] "Sebocytes%FANTOM%2.txt"                                                   
##  [38] "GSE29617_CTRL_VS_DAY7_TIV_FLU_VACCINE_PBMC_2008_UP"                       
##  [39] "HSC%FANTOM%2.txt"                                                         
##  [40] "Monocytes%HPCA%1.txt"                                                     
##  [41] "Eosinophils%NOVERSHTERN%1.txt"                                            
##  [42] "Monocytes%IRIS%3.txt"                                                     
##  [43] "Monocytes%BLUEPRINT%1.txt"                                                
##  [44] "GSE2706_UNSTIM_VS_2H_LPS_AND_R848_DC_DN"                                  
##  [45] "Eosinophils%BLUEPRINT%1.txt"                                              
##  [46] "GSE36891_POLYIC_TLR3_VS_PAM_TLR2_STIM_PERITONEAL_MACROPHAGE_UP"           
##  [47] "GSE9988_ANTI_TREM1_VS_LOW_LPS_MONOCYTE_UP"                                
##  [48] "GSE1432_1H_VS_24H_IFNG_MICROGLIA_UP"                                      
##  [49] "GSE9988_ANTI_TREM1_VS_LPS_MONOCYTE_UP"                                    
##  [50] "GSE14769_UNSTIM_VS_80MIN_LPS_BMDM_DN"                                     
##  [51] "GSE18791_CTRL_VS_NEWCASTLE_VIRUS_DC_2H_DN"                                
##  [52] "Monocytes%NOVERSHTERN%2.txt"                                              
##  [53] "pDC%FANTOM%2.txt"                                                         
##  [54] "aDC%IRIS%1.txt"                                                           
##  [55] "GMP%NOVERSHTERN%1.txt"                                                    
##  [56] "GMP%NOVERSHTERN%2.txt"                                                    
##  [57] "GMP%NOVERSHTERN%3.txt"                                                    
##  [58] "GSE37605_TREG_VS_TCONV_NOD_FOXP3_FUSION_GFP_UP"                           
##  [59] "GSE22886_DAY1_VS_DAY7_MONOCYTE_IN_CULTURE_UP"                             
##  [60] "NK cells%NOVERSHTERN%2.txt"                                               
##  [61] "Chondrocytes%HPCA%1.txt"                                                  
##  [62] "CMP%BLUEPRINT%1.txt"                                                      
##  [63] "GSE9988_ANTI_TREM1_VS_ANTI_TREM1_AND_LPS_MONOCYTE_DN"                     
##  [64] "Tregs%HPCA%1.txt"                                                         
##  [65] "Erythrocytes%FANTOM%2.txt"                                                
##  [66] "GSE36891_UNSTIM_VS_POLYIC_TLR3_STIM_PERITONEAL_MACROPHAGE_UP"             
##  [67] "GSE14769_UNSTIM_VS_40MIN_LPS_BMDM_DN"                                     
##  [68] "GSE30971_CTRL_VS_LPS_STIM_MACROPHAGE_WBP7_HET_2H_UP"                      
##  [69] "CMP%BLUEPRINT%2.txt"                                                      
##  [70] "Tregs%HPCA%3.txt"                                                         
##  [71] "GSE19401_UNSTIM_VS_RETINOIC_ACID_AND_PAM2CSK4_STIM_FOLLICULAR_DC_DN"      
##  [72] "GSE14769_UNSTIM_VS_60MIN_LPS_BMDM_DN"                                     
##  [73] "GSE12392_IFNAR_KO_VS_IFNB_KO_CD8_NEG_SPLEEN_DC_UP"                        
##  [74] "iDC%HPCA%3.txt"                                                           
##  [75] "CMP%BLUEPRINT%3.txt"                                                      
##  [76] "GSE25123_CTRL_VS_ROSIGLITAZONE_STIM_PPARG_KO_MACROPHAGE_UP"               
##  [77] "Myocytes%FANTOM%2.txt"                                                    
##  [78] "Sebocytes%FANTOM%1.txt"                                                   
##  [79] "pDC%FANTOM%3.txt"                                                         
##  [80] "GSE46606_UNSTIM_VS_CD40L_IL2_IL5_1DAY_STIMULATED_IRF4HIGH_SORTED_BCELL_DN"
##  [81] "GSE3920_UNTREATED_VS_IFNG_TREATED_ENDOTHELIAL_CELL_UP"                    
##  [82] "NK cells%FANTOM%1.txt"                                                    
##  [83] "Eosinophils%BLUEPRINT%2.txt"                                              
##  [84] "CD4+ Tem%NOVERSHTERN%2.txt"                                               
##  [85] "CD4+ Tem%NOVERSHTERN%3.txt"                                               
##  [86] "GSE30971_CTRL_VS_LPS_STIM_MACROPHAGE_WBP7_KO_2H_UP"                       
##  [87] "GSE30971_CTRL_VS_LPS_STIM_MACROPHAGE_WBP7_KO_4H_UP"                       
##  [88] "Sebocytes%FANTOM%3.txt"                                                   
##  [89] "GSE369_PRE_VS_POST_IL6_INJECTION_SOCS3_KO_LIVER_UP"                       
##  [90] "GSE6269_FLU_VS_STAPH_AUREUS_INF_PBMC_DN"                                  
##  [91] "GSE40666_STAT1_KO_VS_STAT4_KO_CD8_TCELL_DN"                               
##  [92] "GSE45365_WT_VS_IFNAR_KO_CD8A_DC_DN"                                       
##  [93] "Pericytes%FANTOM%2.txt"                                                   
##  [94] "GSE27434_WT_VS_DNMT1_KO_TREG_DN"                                          
##  [95] "Astrocytes%ENCODE%3.txt"                                                  
##  [96] "GSE34156_NOD2_LIGAND_VS_TLR1_TLR2_LIGAND_6H_TREATED_MONOCYTE_DN"          
##  [97] "GSE19923_WT_VS_HEB_AND_E2A_KO_DP_THYMOCYTE_DN"                            
##  [98] "GSE21360_SECONDARY_VS_QUATERNARY_MEMORY_CD8_TCELL_UP"                     
##  [99] "GSE29617_CTRL_VS_TIV_FLU_VACCINE_PBMC_2008_UP"                            
## [100] "Neurons%FANTOM%1.txt"
##  [1] "Melanocytes%FANTOM%1.txt"                                  
##  [2] "GSE9988_ANTI_TREM1_VS_VEHICLE_TREATED_MONOCYTES_UP"        
##  [3] "GSE9988_ANTI_TREM1_VS_CTRL_TREATED_MONOCYTES_UP"           
##  [4] "GSE9988_ANTI_TREM1_AND_LPS_VS_CTRL_TREATED_MONOCYTES_UP"   
##  [5] "Mesangial cells%ENCODE%2.txt"                              
##  [6] "GSE9988_ANTI_TREM1_AND_LPS_VS_VEHICLE_TREATED_MONOCYTES_UP"
##  [7] "Epithelial cells%HPCA%2.txt"                               
##  [8] "GSE22140_HEALTHY_VS_ARTHRITIC_GERMFREE_MOUSE_CD4_TCELL_DN" 
##  [9] "Chondrocytes%HPCA%3.txt"                                   
## [10] "GSE9988_LOW_LPS_VS_ANTI_TREM1_AND_LPS_MONOCYTE_DN"         
## [11] "ELSAYED_MACROPHAGE_INFECTED_HGH"                           
## [12] "Epithelial cells%HPCA%1.txt"                               
## [13] "GSE9988_LPS_VS_LPS_AND_ANTI_TREM1_MONOCYTE_DN"             
## [14] "NK cells%NOVERSHTERN%1.txt"                                
## [15] "Monocytes%IRIS%2.txt"                                      
## [16] "GSE9988_LOW_LPS_VS_CTRL_TREATED_MONOCYTE_UP"               
## [17] "Neurons%ENCODE%1.txt"                                      
## [18] "Monocytes%BLUEPRINT%3.txt"                                 
## [19] "GSE2706_UNSTIM_VS_2H_R848_DC_DN"                           
## [20] "GSE9988_LOW_LPS_VS_VEHICLE_TREATED_MONOCYTE_UP"            
## [21] "GSE9988_LPS_VS_VEHICLE_TREATED_MONOCYTE_UP"                
## [22] "ELSAYED_MACROPHAGE_UNINFECTED_HGH"                         
## [23] "cDC%HPCA%3.txt"                                            
## [24] "GSE22140_GERMFREE_VS_SPF_MOUSE_CD4_TCELL_DN"               
## [25] "GSE22886_DAY0_VS_DAY1_MONOCYTE_IN_CULTURE_DN"              
## [26] "Monocytes%NOVERSHTERN%3.txt"                               
## [27] "Chondrocytes%HPCA%2.txt"                                   
## [28] "GSE19401_NAIVE_VS_IMMUNIZED_MOUSE_PLN_FOLLICULAR_DC_UP"    
## [29] "Keratinocytes%ENCODE%3.txt"                                
## [30] "GSE9988_LPS_VS_CTRL_TREATED_MONOCYTE_UP"                   
## [31] "GSE2706_UNSTIM_VS_2H_LPS_DC_DN"                            
## [32] "HSC%FANTOM%3.txt"                                          
## [33] "HSC%NOVERSHTERN%2.txt"                                     
## [34] "GSE22196_HEALTHY_VS_OBESE_MOUSE_SKIN_GAMMADELTA_TCELL_DN"  
## [35] "Monocytes%IRIS%1.txt"                                      
## [36] "Eosinophils%BLUEPRINT%3.txt"                               
## [37] "Sebocytes%FANTOM%2.txt"                                    
## [38] "GSE29617_CTRL_VS_DAY7_TIV_FLU_VACCINE_PBMC_2008_UP"        
## [39] "HSC%FANTOM%2.txt"                                          
## [40] "Monocytes%HPCA%1.txt"

6 Venn the elements of the intersections

A corollary question from scoring: which signatures are shared between the infected and uninfected sets, and which are unique to each? We can use a set of Venn diagrams to explore that question…
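
The regions of a two-set Venn diagram reduce to three set operations. The sketch below shows them in base R; the signature names and the helper `venn_regions()` are hypothetical and exist only for illustration.

```r
# Hypothetical signature names assigned to each group (illustrative only).
infected_sigs <- c("sig_A", "sig_B", "sig_C", "sig_D")
uninfected_sigs <- c("sig_C", "sig_D", "sig_E")

# Split two sets into the three regions of a two-circle Venn diagram.
venn_regions <- function(a, b) {
  list(
    only_a = setdiff(a, b),
    shared = intersect(a, b),
    only_b = setdiff(b, a)
  )
}

regions <- venn_regions(infected_sigs, uninfected_sigs)
# Number of signatures in each region.
print(lengths(regions))
```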

7 Venn genes in signatures for infected and uninfected samples

Now let's go one step deeper, completing a logical circle…

## 'select()' returned 1:many mapping between keys and columns

7.1 Repeat the above query but better

The following block attempts to formalize the logic above and make it possible to query these intersections in more flexible and hopefully informative ways.

## Examining the mean score for experimental factor: infected from column: infectstate.
## Examining the mean score for experimental factor: uninfected from column: infectstate.
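
The messages above suggest that a mean score is computed for each level of the infectstate column. A minimal sketch of that idea follows, assuming a score matrix with signatures in rows and samples in columns. The helper `mean_scores_by_factor()`, the toy scores, and the cutoff of 0.2 are assumptions for this example and do not reproduce the function used in this report.

```r
# Made-up GSVA-like scores: signatures in rows, samples in columns.
scores <- matrix(
  c( 0.6,  0.5, -0.2, -0.3,
    -0.4, -0.1,  0.7,  0.4,
     0.5,  0.3,  0.4,  0.6),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("sig_A", "sig_B", "sig_C"), paste0("s", 1:4))
)
# Sample metadata using the column name seen in the log output.
meta <- data.frame(
  infectstate = c("infected", "infected", "uninfected", "uninfected"),
  row.names = colnames(scores)
)

# Hypothetical helper: mean score per signature for each level of a column.
mean_scores_by_factor <- function(score_mtrx, metadata, column) {
  lvls <- unique(as.character(metadata[[column]]))
  sapply(lvls, function(lvl) {
    rowMeans(score_mtrx[, metadata[[column]] == lvl, drop = FALSE])
  })
}

means <- mean_scores_by_factor(scores, meta, "infectstate")
cutoff <- 0.2  # arbitrary illustrative threshold
# Signatures whose mean score exceeds the cutoff, per factor level.
high <- lapply(colnames(means), function(lvl) {
  rownames(means)[means[, lvl] > cutoff]
})
names(high) <- colnames(means)
shared <- intersect(high$infected, high$uninfected)
print(shared)
```

Keeping the per-level means in one matrix makes it straightforward to try other cutoffs or to compare more than two factor levels.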

7.2 The final piece of circular logic

What about genes implicated in the infected-only, uninfected-only, and shared signature sets? Well, our intersect_signatures() function provides an opportunity to query even that…

##  9816  3559  6723 23212 29889  1503 
##    25    23    22    22    21    21
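
The two rows above read like the head of a sorted tally: gene IDs on top and the number of signatures containing each one below. That interpretation is an assumption. Under it, a tally of this shape could be produced in base R roughly as follows, using made-up signatures and gene IDs.

```r
# Hypothetical signatures mapped to made-up gene IDs.
signature_genes <- list(
  sig_A = c("101", "102", "103"),
  sig_B = c("102", "103", "104"),
  sig_C = c("103", "105")
)

# Count how many signatures contain each gene ID, most frequent first.
gene_tally <- sort(table(unlist(signature_genes)), decreasing = TRUE)
print(head(gene_tally))
```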
## If you wish to reproduce this exact build of hpgltools, invoke the following:
## > git clone http://github.com/abelew/hpgltools.git
## > git reset 7c4477bb4fa3639cc6cf7940216e4c4b8cbee7ce
## This is hpgltools commit: Fri Oct 26 17:27:11 2018 -0400: 7c4477bb4fa3639cc6cf7940216e4c4b8cbee7ce
## Saving to 05_gsva_testing_v20180828.rda.xz
