1 Sample Estimation, PBMC Infections: 20200220

Start out by extracting the relevant data and querying it to see the general quality.

This first block sets the names of the samples and colors. It also makes separate data sets for:

  • All features with infected and uninfected samples.
  • All features with only the infected samples.
  • Only CDS features with infected and uninfected samples.
  • Only CDS features with only the infected samples.
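
As a rough illustration of how these four data sets relate to each other, here is a minimal Python sketch. It is not the R code that produced this report; the sample sheet, the feature table, and the make_subset() helper are hypothetical placeholders.

```python
from itertools import product

# Hypothetical sample sheet and feature table; all names are placeholders.
samples = [
    {"id": "a1", "infected": False},
    {"id": "a2", "infected": True},
    {"id": "a3", "infected": True},
]
features = [
    {"id": "g1", "type": "CDS"},
    {"id": "g2", "type": "lncRNA"},
    {"id": "g3", "type": "CDS"},
]

def make_subset(cds_only, infected_only):
    """Return the (feature ids, sample ids) kept for one data set."""
    kept_features = [f["id"] for f in features
                     if not cds_only or f["type"] == "CDS"]
    kept_samples = [s["id"] for s in samples
                    if not infected_only or s["infected"]]
    return kept_features, kept_samples

# The four combinations: all/CDS features crossed with all/infected samples.
datasets = {
    (cds_only, infected_only): make_subset(cds_only, infected_only)
    for cds_only, infected_only in product([False, True], repeat=2)
}

for key, (feats, samps) in datasets.items():
    print(key, len(feats), "features,", len(samps), "samples")
```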
## Using a subset expression.
## There were 50, now there are 21 samples.
## The new colors are a character, changing according to condition.
## Using a subset expression.
## There were 21, now there are 18 samples.
## Using a subset expression.
## There were 50, now there are 21 samples.
## The new colors are a character, changing according to condition.
## Using a subset expression.
## There were 21, now there are 18 samples.
## Using a subset expression.
## There were 21, now there are 18 samples.
## Using a subset expression.
## There were 21, now there are 18 samples.
## Using a subset expression.
## There were 21, now there are 18 samples.
## Using a subset expression.
## There were 18, now there are 15 samples.

1.1 Generate plots describing the data

The following creates metric plots of the raw data.

Now let us visualize some of these metrics for the data set of all features including the uninfected samples.

Start with the relative library sizes. Note that this includes all feature types.
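
For readers unfamiliar with the term, a library size is simply the total number of counts in a sample. The following Python sketch uses made-up counts and is only meant to show the arithmetic; it is not the plotting code used for this report.

```python
# Hypothetical raw counts: one list of per-feature counts for each sample.
counts = {
    "sample_a": [120, 0, 35, 900],
    "sample_b": [80, 5, 20, 400],
}

# Library size = sum of all counts observed in the sample.
library_sizes = {name: sum(values) for name, values in counts.items()}

# Express each library relative to the largest one.
largest = max(library_sizes.values())
relative_sizes = {name: size / largest for name, size in library_sizes.items()}

print(library_sizes)
print(relative_sizes)
```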

The picture is slightly different if we only look at coding sequences.

Look at the density of counts per feature for all samples. Use density plots and boxplots to view this information.

Now let us look at how the samples relate to each other via pairwise correlation heatmaps. Once again, show this first for all features, then for only the CDS features.
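
A correlation heatmap is a picture of the matrix of pairwise correlations between samples. The sketch below computes such a matrix in plain Python, assuming Pearson correlation (the report does not state which coefficient was used) and using invented expression values.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-sample expression vectors (one value per feature).
expression = {
    "s1": [1.0, 2.0, 3.0, 4.0],
    "s2": [2.0, 4.1, 5.9, 8.2],
    "s3": [4.0, 3.0, 2.0, 1.0],
}

names = sorted(expression)
# Every pair of samples, including each sample with itself.
corr = {(a, b): pearson(expression[a], expression[b])
        for a in names for b in names}

for a in names:
    print(a, [round(corr[(a, b)], 3) for b in names])
```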

2 Write the experiments

Having looked at these metrics, let us now write out the results as four Excel workbooks, one for each of the same four data sets.

2.0.0.1 Extra writeout for Julieth

The following is in response to a query from Julieth on 20190507.

“Also, I’m going back to our data, and I would like to ask you if I can see the transcriptome per each patient in the PBMCs data. Just to show to Adelaida, if this time the gene expression values among them have less variability.”

## Warning in `[<-.factor`(`*tmp*`, null_ids, value = "null"): invalid factor
## level, NA generated
## This function will replace the expt$expressionset slot with:
## cpm(data)
## It will save copies of each step along the way
##  in expt$normalized with the corresponding libsizes. Keep libsizes in mind
##  when invoking limma.  The appropriate libsize is non-log(cpm(normalized)).
##  This is most likely kept at:
##  'new_expt$normalized$intermediate_counts$normalization$libsizes'
##  A copy of this may also be found at:
##  new_expt$best_libsize
## Filter is false, this should likely be set to something, good
##  choices include cbcb, kofa, pofa (anything but FALSE).  If you want this to
##  stay FALSE, keep in mind that if other normalizations are performed, then the
##  resulting libsizes are likely to be strange (potentially negative!)
## Leaving the data in its current base format, keep in mind that
##  some metrics are easier to see when the data is log2 transformed, but
##  EdgeR/DESeq do not accept transformed data.
## Leaving the data unnormalized.  This is necessary for DESeq, but
##  EdgeR/limma might benefit from normalization.  Good choices include quantile,
##  size-factor, tmm, etc.
## Not correcting the count-data for batch effects.  If batch is
##  included in EdgerR/limma's model, then this is probably wise; but in extreme
##  batch effects this is a good parameter to play with.
## Step 1: not doing count filtering.
## Step 2: not normalizing the data.
## Step 3: converting the data with cpm.
## Step 4: not transforming the data.
## Step 5: not doing batch correction.
## The factor d108 has 6 rows.
## The factor d110 has 6 rows.
## The factor d107 has 6 rows.
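
The log above reports that the only step applied here was the cpm conversion. As a reminder of what that does, here is a small Python sketch of counts per million with no additional normalization factors; the counts are invented and the counts_per_million() helper is a hypothetical stand-in, not the function used in this report.

```python
def counts_per_million(counts):
    """Scale each sample's counts so that they sum to one million."""
    scaled = {}
    for sample, values in counts.items():
        library_size = sum(values)
        scaled[sample] = [v / library_size * 1e6 for v in values]
    return scaled

# Two hypothetical samples with the same composition but different depth.
raw_counts = {
    "sample_a": [150, 50, 800],
    "sample_b": [30, 10, 160],
}
cpm_values = counts_per_million(raw_counts)

for sample, values in cpm_values.items():
    print(sample, [round(v) for v in values])
```

After the conversion the two samples are directly comparable even though one was sequenced five times more deeply.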

3 Figure 4

Construct figure 4. This should include the following panels:

  1. Library sizes of the PBMC data
  2. PCA of log2(quant(data)), with uninfected
  3. PCA of log2(quant(data)), without uninfected
  4. TSNE of the data in panel 2
  5. TSNE of the data in panel 3
## Going to write the image to: images/figure_4a.pdf when dev.off() is called.
## png 
##   2
## Going to write the image to: images/figure_4b.pdf when dev.off() is called.
## png 
##   2
## Going to write the image to: images/figure_4c.pdf when dev.off() is called.
## png 
##   2
## Going to write the image to: images/figure_4d.pdf when dev.off() is called.
## png 
##   2
## Going to write the image to: images/figure_4e.pdf when dev.off() is called.
## png 
##   2
## Going to write the image to: images/figure_4a_cds.pdf when dev.off() is called.
## png 
##   2
## Going to write the image to: images/figure_4b_cds.pdf when dev.off() is called.
## png 
##   2
## Going to write the image to: images/figure_4c_cds.pdf when dev.off() is called.
## png 
##   2
## Going to write the image to: images/figure_4d_cds.pdf when dev.off() is called.
## png 
##   2
## Going to write the image to: images/figure_4e_cds.pdf when dev.off() is called.
## png 
##   2

3.1 Start without the uninfected: no, patient, strain

Now let us try a few different ways of dealing with the batch effects/surrogate variables. In each case, I will use a PCA plot to see how the method changes the sample clustering.
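
To make concrete what a PCA of the samples computes, the sketch below finds the first principal component of a tiny invented matrix by power iteration in plain Python. This is an illustration under simplifying assumptions (features centered, no scaling, only PC1), not the routine that produced the tables in this report; pc1_scores() is a hypothetical helper.

```python
import math

def pc1_scores(matrix, iterations=200):
    """Unit-length PC1 coordinates for the samples (columns) of matrix."""
    n_cols = len(matrix[0])
    # Center each feature (row) across the samples.
    centered = []
    for row in matrix:
        mean = sum(row) / n_cols
        centered.append([v - mean for v in row])
    # Sample-by-sample cross-product (Gram) matrix.
    gram = [[sum(r[i] * r[j] for r in centered) for j in range(n_cols)]
            for i in range(n_cols)]
    # Power iteration: repeated multiplication converges on the top eigenvector.
    vec = [float(i + 1) for i in range(n_cols)]  # arbitrary starting vector
    for _ in range(iterations):
        nxt = [sum(gram[i][j] * vec[j] for j in range(n_cols))
               for i in range(n_cols)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec

# Hypothetical data: rows are features, columns are four samples.
# Samples 1-2 and samples 3-4 come from two different batches.
data = [
    [1.0, 1.2, 5.0, 5.1],
    [2.0, 2.1, 8.0, 7.9],
    [3.0, 2.9, 3.1, 3.0],
]
scores = pc1_scores(data)
print([round(s, 3) for s in scores])
```

The sign of a principal component is arbitrary, so only the grouping matters: samples from the same batch receive scores of the same sign.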

3.1.1 PCA: No Batch correction

In this first iteration, we will log2(cpm(quant(filter()))) the data and leave the experimental parameters at their defaults:

  • condition == the 6 strains, 3 chronic and 3 self-healing
  • batch == the three patients p107
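
The nested shorthand log2(cpm(quant(filter()))) reads from the inside out: filter low-count features, quantile normalize, convert to counts per million, then log2 transform. The Python sketch below walks through those four steps on an invented matrix. The filter threshold and the pseudocount of 1 are assumptions made for the example, and none of these helpers are the functions used in this report.

```python
import math

def low_count_filter(matrix, min_total=2):
    """Keep features (rows) whose summed count reaches min_total."""
    return [row for row in matrix if sum(row) >= min_total]

def quantile_normalize(matrix):
    """Give every sample (column) the same distribution of values."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    sorted_cols = [sorted(col) for col in columns]
    # Mean of the values sharing each rank across samples.
    rank_means = [sum(col[i] for col in sorted_cols) / n_cols
                  for i in range(n_rows)]
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = rank_means[rank]
    return result

def cpm(matrix):
    """Counts per million, computed per sample (column)."""
    n_cols = len(matrix[0])
    lib = [sum(row[j] for row in matrix) for j in range(n_cols)]
    return [[row[j] / lib[j] * 1e6 for j in range(n_cols)] for row in matrix]

def log2_transform(matrix, pseudocount=1.0):
    """log2 with a pseudocount so that zeros stay finite."""
    return [[math.log2(v + pseudocount) for v in row] for row in matrix]

# Hypothetical counts: rows are features, columns are three samples.
raw = [
    [10, 20, 30],
    [0, 0, 1],      # removed by the low-count filter
    [5, 40, 15],
    [100, 80, 120],
]
processed = log2_transform(cpm(quantile_normalize(low_count_filter(raw))))
for row in processed:
    print([round(v, 2) for v in row])
```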

sampleid condition batch batch_int colors labels PC1 PC2 pc_1 pc_2 pc_3 pc_4 pc_5 pc_6 pc_7 pc_8 pc_9 pc_10 pc_11 pc_12 pc_13 pc_14 pc_15 pc_16 pc_17
chr_5430_d108 HPGL0631 chr d108 1 #990000 HPGL0631 -0.1611 -0.3489 -0.1611 -0.3489 0.0175 0.0500 0.2739 -0.1742 0.0085 0.3742 0.3141 -0.3494 0.0302 -0.4659 0.0027 0.2831 -0.0969 0.1376 0.0359
chr_5397_d108 HPGL0632 chr d108 1 #990000 HPGL0632 -0.1371 -0.3165 -0.1371 -0.3165 0.1619 0.0851 -0.0078 -0.2895 -0.1536 -0.5641 -0.1409 0.3518 -0.0417 -0.2268 -0.2738 0.1949 -0.0864 -0.2028 -0.0898
chr_2504_d108 HPGL0633 chr d108 1 #990000 HPGL0633 -0.0603 -0.2326 -0.0603 -0.2326 -0.3443 -0.1573 -0.0773 0.2400 -0.3291 -0.1835 0.2584 -0.2486 -0.3200 0.2421 -0.2556 -0.2938 -0.2348 0.1599 -0.1262
sh_2272_d108 HPGL0634 sh d108 1 #000099 HPGL0634 -0.2064 -0.3508 -0.2064 -0.3508 -0.0520 -0.0138 -0.0706 -0.3329 0.3878 0.2936 -0.1981 0.2079 -0.3153 0.2244 0.0513 -0.3498 0.2433 0.0571 0.0614
sh_1022_d108 HPGL0635 sh d108 1 #000099 HPGL0635 -0.0561 -0.2715 -0.0561 -0.2715 -0.1747 -0.3092 0.0068 0.2653 -0.3167 -0.0266 -0.0432 0.2000 0.1539 0.0329 0.6144 0.1565 0.2289 -0.1421 0.1697
sh_2189_d108 HPGL0636 sh d108 1 #000099 HPGL0636 -0.0979 -0.2974 -0.0979 -0.2974 0.0565 0.3311 -0.2410 0.4418 0.2958 0.0237 -0.1382 -0.0996 0.5382 0.1843 -0.1773 0.0371 -0.0787 0.0031 -0.0243
chr_5430_d110 HPGL0651 chr d110 2 #990000 HPGL0651 0.2928 -0.0196 0.2928 -0.0196 0.3686 0.1002 0.2949 0.1006 -0.1359 -0.0518 -0.0204 -0.2008 0.0743 -0.2060 0.0178 -0.6104 0.1955 -0.3022 0.0551
chr_5397_d110 HPGL0652 chr d110 2 #990000 HPGL0652 0.3039 0.0113 0.3039 0.0113 0.4205 -0.2108 -0.0020 0.1728 -0.0744 0.3666 -0.0049 0.3971 -0.1452 0.1083 -0.1054 0.1016 -0.4769 0.0765 0.1253
chr_2504_d110 HPGL0653 chr d110 2 #990000 HPGL0653 0.3358 0.0548 0.3358 0.0548 -0.1369 -0.5220 -0.1394 -0.0875 0.5226 -0.2721 -0.0907 -0.2165 0.0759 -0.2193 0.1120 -0.0022 -0.1810 0.0069 -0.0939
sh_2272_d110 HPGL0654 sh d110 2 #000099 HPGL0654 0.3020 0.0296 0.3020 0.0296 0.1251 0.1788 -0.2149 -0.0797 -0.2441 0.1080 -0.2757 -0.1524 -0.1235 0.0260 0.1417 0.1926 0.2496 0.2837 -0.6048
sh_1022_d110 HPGL0655 sh d110 2 #000099 HPGL0655 0.3684 0.1068 0.3684 0.1068 -0.3788 0.2019 0.2048 -0.1897 0.0539 0.0015 0.4444 0.4206 0.3123 0.0881 -0.0694 -0.0608 0.1246 0.1404 -0.0937
sh_2189_d110 HPGL0656 sh d110 2 #000099 HPGL0656 0.3447 0.0641 0.3447 0.0641 -0.1350 0.2721 -0.1265 -0.1287 -0.0225 -0.0512 -0.0607 -0.2885 -0.2474 0.2228 -0.0799 0.3306 0.1022 -0.2287 0.5634
chr_5430_d107 HPGL0658 chr d107 3 #990000 HPGL0658 -0.2481 0.1907 -0.2481 0.1907 0.2123 -0.1483 0.5268 -0.1961 0.0366 -0.1378 -0.0469 -0.2107 0.1579 0.5687 0.0662 0.1566 -0.0737 0.0097 -0.1172
chr_5397_d107 HPGL0659 chr d107 3 #990000 HPGL0659 -0.2296 0.2644 -0.2296 0.2644 0.1715 -0.0996 -0.2535 0.1823 0.1764 0.0822 0.4762 0.0475 -0.1967 0.0024 -0.0909 0.1541 0.2845 -0.4517 -0.2498
chr_2504_d107 HPGL0660 chr d107 3 #990000 HPGL0660 -0.1645 0.2974 -0.1645 0.2974 -0.1983 -0.3384 0.0060 0.0071 -0.2208 0.2287 -0.3465 0.0225 0.2212 -0.1732 -0.5146 0.0014 0.2868 0.0792 0.1452
sh_2272_d107 HPGL0661 sh d107 3 #000099 HPGL0661 -0.2237 0.2504 -0.2237 0.2504 0.3195 0.0669 -0.3098 -0.0730 -0.0352 -0.2641 0.2244 -0.0032 0.0521 -0.0894 0.1789 -0.1384 0.0798 0.5624 0.3436
sh_1022_d107 HPGL0662 sh d107 3 #000099 HPGL0662 -0.1457 0.2712 -0.1457 0.2712 -0.2147 0.2992 0.3804 0.4301 0.2327 -0.1132 -0.2373 0.1515 -0.3729 -0.2455 0.1190 0.0560 -0.0737 0.1253 -0.0199
sh_2189_d107 HPGL0663 sh d107 3 #000099 HPGL0663 -0.2170 0.2965 -0.2170 0.2965 -0.2187 0.2142 -0.2506 -0.2886 -0.1820 0.1859 -0.1139 -0.0292 0.1466 -0.0738 0.2630 -0.2091 -0.4930 -0.3144 -0.0799
## NULL

3.1.2 PCA: Repeat with combat adjustment

For the second iteration, use the same normalization, but add a combat correction in an attempt to minimize the patients' contribution to the variance.
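
To show the idea behind a batch adjustment, the following Python sketch removes per-patient mean differences from a single feature. Note that this is plain mean-centering by batch, a much simpler technique than combat, which additionally adjusts batch variances and shrinks its per-feature estimates using information from all features. The expression values are invented and center_by_batch() is a hypothetical helper.

```python
from collections import defaultdict

def center_by_batch(values, batches):
    """Subtract each batch's mean, then add back the overall mean."""
    overall = sum(values) / len(values)
    grouped = defaultdict(list)
    for value, batch in zip(values, batches):
        grouped[batch].append(value)
    batch_means = {b: sum(vs) / len(vs) for b, vs in grouped.items()}
    return [v - batch_means[b] + overall for v, b in zip(values, batches)]

# One hypothetical feature measured in two samples from each patient.
expr = [5.0, 5.2, 7.0, 7.4, 3.1, 2.9]
patient = ["d108", "d108", "d110", "d110", "d107", "d107"]

adjusted = center_by_batch(expr, patient)
print([round(v, 2) for v in adjusted])
```

After the adjustment every patient has the same mean for this feature, while the differences between samples within a patient are left unchanged.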

sampleid condition batch batch_int colors labels PC1 PC2 pc_1 pc_2 pc_3 pc_4 pc_5 pc_6 pc_7 pc_8 pc_9 pc_10 pc_11 pc_12 pc_13 pc_14 pc_15 pc_16 pc_17
chr_5430_d108 HPGL0631 chr d108 1 #990000 HPGL0631 -0.2243 0.0618 -0.2243 0.0618 -0.1738 0.0994 0.1807 -0.0309 -0.0519 -0.3219 -0.0779 -0.0746 0.5048 0.2196 -0.0576 0.3475 0.5040 -0.0129 -0.1320
chr_5397_d108 HPGL0632 chr d108 1 #990000 HPGL0632 -0.2346 -0.0373 -0.2346 -0.0373 -0.2685 -0.1030 -0.1316 -0.2640 -0.2451 0.4797 -0.0514 0.0850 0.1285 0.0756 0.0251 0.1813 -0.3464 -0.3608 -0.3352
chr_2504_d108 HPGL0633 chr d108 1 #990000 HPGL0633 0.4068 -0.0334 0.4068 -0.0334 -0.0041 -0.2138 -0.1543 -0.2776 0.4786 -0.1843 -0.1131 0.4568 0.0349 -0.1676 -0.0956 -0.0441 0.1217 -0.1594 -0.2573
sh_2272_d108 HPGL0634 sh d108 1 #000099 HPGL0634 -0.3095 -0.0203 -0.3095 -0.0203 -0.0849 -0.3288 0.1453 0.2562 -0.0842 -0.5106 -0.0662 0.0282 -0.3922 0.1469 -0.2169 -0.1342 -0.2056 -0.2995 -0.0283
sh_1022_d108 HPGL0635 sh d108 1 #000099 HPGL0635 0.4041 -0.0484 0.4041 -0.0484 0.2526 0.1768 0.1347 -0.0160 -0.0445 -0.0156 0.4978 -0.4217 -0.1784 0.0275 -0.1273 0.2138 0.0149 -0.2983 -0.2319
sh_2189_d108 HPGL0636 sh d108 1 #000099 HPGL0636 -0.1588 0.0372 -0.1588 0.0372 0.2590 0.4078 -0.1489 0.5533 0.1858 0.1344 -0.2831 0.0015 0.0719 -0.2131 0.1717 -0.1598 0.0613 -0.2653 -0.2089
chr_5430_d110 HPGL0651 chr d110 2 #990000 HPGL0651 -0.0671 -0.3062 -0.0671 -0.3062 -0.3705 0.2754 0.1361 0.0292 0.5337 0.2037 0.0436 -0.1050 -0.0122 0.1652 -0.2994 -0.0098 -0.1455 0.0359 0.3671
chr_5397_d110 HPGL0652 chr d110 2 #990000 HPGL0652 -0.0473 -0.1996 -0.0473 -0.1996 0.1561 0.3570 0.2516 -0.4305 -0.3863 -0.0281 -0.2139 0.1742 -0.1671 -0.2882 -0.0946 -0.1784 0.1700 -0.1316 0.2791
chr_2504_d110 HPGL0653 chr d110 2 #990000 HPGL0653 -0.0747 0.0931 -0.0747 0.0931 0.4458 -0.5279 0.3060 0.1193 0.0815 0.4013 0.0495 0.0704 0.1699 0.0147 -0.0663 0.0368 0.1568 -0.0735 0.3240
sh_2272_d110 HPGL0654 sh d110 2 #000099 HPGL0654 -0.0882 -0.2269 -0.0882 -0.2269 -0.0304 0.0384 -0.3594 0.0995 -0.1013 -0.1168 0.5221 0.3529 0.0095 0.1165 0.4443 0.0387 0.0872 -0.0691 0.3128
sh_1022_d110 HPGL0655 sh d110 2 #000099 HPGL0655 0.2304 0.4492 0.2304 0.4492 -0.1456 -0.0047 -0.0672 -0.1666 0.0678 0.0335 -0.3273 -0.2554 -0.2605 0.3266 0.4375 -0.0214 0.0894 -0.1313 0.2577
sh_2189_d110 HPGL0656 sh d110 2 #000099 HPGL0656 0.1941 0.2992 0.1941 0.2992 -0.0648 -0.0303 -0.4177 0.0749 -0.2252 -0.1504 0.0093 -0.1723 0.3945 -0.2468 -0.3734 -0.1118 -0.2341 -0.0080 0.3250
chr_5430_d107 HPGL0658 chr d107 3 #990000 HPGL0658 0.1002 -0.1494 0.1002 -0.1494 -0.3679 -0.2275 0.3555 0.0844 -0.0192 -0.0517 0.0770 -0.2458 0.1383 -0.4974 0.4204 -0.1965 -0.0706 0.1606 -0.0972
chr_5397_d107 HPGL0659 chr d107 3 #990000 HPGL0659 0.1550 -0.3130 0.1550 -0.3130 0.3145 0.0038 0.0017 0.0429 -0.0514 -0.1744 -0.3479 0.0098 0.0426 0.0950 0.1530 0.4711 -0.4365 0.3461 0.0312
chr_2504_d107 HPGL0660 chr d107 3 #990000 HPGL0660 -0.3053 0.2694 -0.3053 0.2694 0.3031 0.1496 0.0865 -0.3418 0.1928 -0.1226 0.2397 -0.0291 0.1563 0.1840 0.0848 -0.3986 -0.2791 0.2947 -0.2012
sh_2272_d107 HPGL0661 sh d107 3 #000099 HPGL0661 0.1340 -0.3948 0.1340 -0.3948 0.0069 -0.1724 -0.2825 0.0615 -0.2193 0.1675 -0.1285 -0.1905 -0.0531 0.3269 -0.1420 -0.4152 0.3072 0.2917 -0.2160
sh_1022_d107 HPGL0662 sh d107 3 #000099 HPGL0662 0.2443 0.3620 0.2443 0.3620 -0.2158 0.1822 0.2706 0.3101 -0.2346 0.1704 0.1057 0.4634 -0.1610 0.1028 -0.1506 0.0434 -0.0214 0.3399 -0.1326
sh_2189_d107 HPGL0663 sh d107 3 #000099 HPGL0663 -0.3593 0.1574 -0.3593 0.1574 -0.0117 -0.0819 -0.3071 -0.1041 0.1228 0.0859 0.0647 -0.1478 -0.4266 -0.3883 -0.1132 0.3373 0.2270 0.3407 -0.0562
## NULL

3.1.3 Look at correlations between experimental factors and variance

## More shallow curves in these plots suggest more genes in this principle component.

Look for significant correlations between the PCs and some factors in the experimental design.
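
One simple way to quantify how strongly a categorical factor tracks a component is the fraction of that component's variance explained by the factor (eta squared). The Python sketch below applies this to the PC1 values and labels printed in the table of section 3.1.1. It is offered as an illustration only; the report does not say which statistic was used for this step, and eta_squared() is a hypothetical helper.

```python
from collections import defaultdict

# PC1 values copied from the table in section 3.1.1 (no batch correction),
# in the printed row order: six d108, six d110, then six d107 samples.
pc1 = [-0.1611, -0.1371, -0.0603, -0.2064, -0.0561, -0.0979,
       0.2928, 0.3039, 0.3358, 0.3020, 0.3684, 0.3447,
       -0.2481, -0.2296, -0.1645, -0.2237, -0.1457, -0.2170]
patient = ["d108"] * 6 + ["d110"] * 6 + ["d107"] * 6
# Within each patient the rows are three chronic then three self-healing.
state = (["chr"] * 3 + ["sh"] * 3) * 3

def eta_squared(values, groups):
    """Between-group sum of squares divided by total sum of squares."""
    grand_mean = sum(values) / len(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    between_ss = sum(len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
                     for vs in by_group.values())
    return between_ss / total_ss

patient_effect = eta_squared(pc1, patient)
state_effect = eta_squared(pc1, state)
print("patient:", round(patient_effect, 3), "state:", round(state_effect, 3))
```

With these numbers, patient accounts for well over 90% of the variance in PC1, while the chronic/self-healing split accounts for almost none of it.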

3.1.4 PCA: Change batch to strain and condition to patient+state

Here we will set the batch to the humansite strains and the condition to a combination of the patient and state, then perform the PCA again.
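
The relabeling can be pictured as a small string manipulation on the row names of the tables in this report (for example chr_5430_d108). The Python sketch below derives the new condition and batch labels in the form they appear in the table for this section; relabel() is a hypothetical helper and is not the code used to modify the experimental design here.

```python
# Map the short state prefixes used in the row names to the longer labels.
STATE_NAMES = {"chr": "chronic", "sh": "self_heal"}

def relabel(sample_name):
    """Split state_strain_patient and build the new condition and batch."""
    state, strain, patient = sample_name.split("_")
    return {
        "condition": f"{STATE_NAMES[state]}_{patient}",
        "batch": f"s{strain}",
    }

for name in ["chr_5430_d108", "sh_2272_d110", "sh_1022_d107"]:
    print(name, relabel(name))
```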

sampleid condition batch batch_int colors labels PC1 PC2 pc_1 pc_2 pc_3 pc_4 pc_5 pc_6 pc_7 pc_8 pc_9 pc_10 pc_11 pc_12 pc_13 pc_14 pc_15 pc_16 pc_17
chr_5430_d108 HPGL0631 chronic_d108 s5430 6 #1B9E77 HPGL0631 -0.2461 -0.3340 -0.2461 -0.3340 0.1227 0.1293 -0.0621 0.2893 0.3116 -0.2105 0.1267 -0.0251 -0.5026 0.0298 -0.1305 -0.1411 0.3074 -0.3284 0.0472
chr_5397_d108 HPGL0632 chronic_d108 s5397 5 #1B9E77 HPGL0632 -0.2400 -0.3215 -0.2400 -0.3215 0.1822 0.3904 0.3038 -0.3043 -0.2679 0.2419 0.1933 -0.0879 -0.0444 -0.0005 -0.1256 -0.1420 0.0260 0.4460 -0.0101
chr_2504_d108 HPGL0633 chronic_d108 s2504 4 #1B9E77 HPGL0633 -0.2342 -0.3000 -0.2342 -0.3000 -0.1008 0.3011 -0.2960 0.1002 0.0045 0.1124 -0.5541 0.0755 0.1677 -0.0740 -0.0829 0.3309 -0.3350 -0.1055 -0.0486
sh_2272_d108 HPGL0634 self_heal_d108 s2272 3 #D95F02 HPGL0634 0.1518 -0.3703 0.1518 -0.3703 0.2434 -0.2180 0.0782 -0.0544 0.0502 -0.4095 0.0467 0.0619 0.4451 0.2498 0.0420 -0.2587 -0.1832 -0.0878 -0.3489
sh_1022_d108 HPGL0635 self_heal_d108 s1022 1 #D95F02 HPGL0635 0.1733 -0.3071 0.1733 -0.3071 -0.0807 -0.4396 0.0959 0.0349 0.1886 0.4479 -0.0904 -0.3117 0.0277 0.1050 0.1864 0.3086 0.3332 0.1043 -0.0682
sh_2189_d108 HPGL0636 self_heal_d108 s2189 2 #D95F02 HPGL0636 0.1737 -0.3450 0.1737 -0.3450 -0.3421 -0.2519 -0.2129 -0.0681 -0.2849 -0.0414 0.2859 0.2885 -0.0648 -0.2974 0.0952 -0.0740 -0.1474 -0.0219 0.4331
chr_5430_d110 HPGL0651 chronic_d110 s5430 6 #7570B3 HPGL0651 -0.0378 0.1807 -0.0378 0.1807 -0.4260 0.2027 -0.0213 -0.0617 0.0135 0.1604 0.2327 -0.0540 0.4635 0.1947 -0.3720 -0.0879 0.2914 -0.3381 0.0505
chr_5397_d110 HPGL0652 chronic_d110 s5397 5 #7570B3 HPGL0652 -0.0401 0.1595 -0.0401 0.1595 -0.3054 -0.2224 0.1113 -0.2300 0.4114 -0.2043 -0.2949 0.2429 -0.1521 -0.0060 -0.3535 -0.1254 0.0074 0.4342 0.0007
chr_2504_d110 HPGL0653 chronic_d110 s2504 4 #7570B3 HPGL0653 -0.0525 0.1480 -0.0525 0.1480 -0.0429 -0.1885 0.5184 0.3256 -0.2631 -0.1934 0.1138 -0.1999 -0.0887 -0.1215 -0.3073 0.3681 -0.2914 -0.0985 -0.0308
sh_2272_d110 HPGL0654 self_heal_d110 s2272 3 #E7298A HPGL0654 0.3698 0.1392 0.3698 0.1392 0.0586 0.0641 -0.2870 -0.1507 0.0380 0.1217 0.0320 -0.3945 -0.1724 -0.4673 -0.1769 -0.2302 -0.1542 -0.0877 -0.3727
sh_1022_d110 HPGL0655 self_heal_d110 s1022 1 #E7298A HPGL0655 0.3563 0.1394 0.3563 0.1394 0.3988 0.1496 -0.0804 0.1389 -0.0252 0.0316 0.0974 0.5405 0.0780 -0.1310 -0.1404 0.3516 0.3166 0.1150 -0.0875
sh_2189_d110 HPGL0656 self_heal_d110 s2189 2 #E7298A HPGL0656 0.3695 0.1344 0.3695 0.1344 0.2197 0.0526 -0.1325 -0.0337 -0.1746 0.0036 -0.2234 -0.1937 -0.2006 0.5644 -0.1246 -0.0828 -0.1389 -0.0080 0.4492
chr_5430_d107 HPGL0658 chronic_d107 s5430 6 #66A61E HPGL0658 -0.3279 0.2139 -0.3279 0.2139 0.2613 -0.3041 0.0898 -0.2866 -0.3376 0.0314 -0.3768 0.0889 0.0198 -0.2074 0.1009 -0.1786 0.2837 -0.3250 0.0607
chr_5397_d107 HPGL0659 chronic_d107 s5397 5 #66A61E HPGL0659 -0.3296 0.2319 -0.3296 0.2319 0.0505 -0.1608 -0.4004 0.5205 -0.1390 -0.0115 0.1046 -0.1437 0.1733 0.0253 0.0831 -0.2061 0.0500 0.4341 -0.0170
chr_2504_d107 HPGL0660 chronic_d107 s2504 4 #66A61E HPGL0660 -0.3209 0.2329 -0.3209 0.2329 0.1589 -0.1270 -0.2180 -0.3868 0.2575 0.0859 0.4189 0.0590 -0.1030 0.1959 0.1548 0.3267 -0.3219 -0.1048 -0.0426
sh_2272_d107 HPGL0661 self_heal_d107 s2272 3 #E6AB02 HPGL0661 0.0868 0.1523 0.0868 0.1523 -0.2705 0.1135 0.2275 0.2180 -0.0959 0.3175 -0.0445 0.3566 -0.2657 0.2280 0.3595 -0.2641 -0.1711 -0.1104 -0.3554
sh_1022_d107 HPGL0662 self_heal_d107 s1022 1 #E6AB02 HPGL0662 0.0796 0.1052 0.0796 0.1052 -0.2807 0.2981 -0.0291 -0.1796 -0.1458 -0.5312 -0.0456 -0.2200 -0.0589 0.0139 0.4544 0.2573 0.2747 0.1079 -0.0937
sh_2189_d107 HPGL0663 self_heal_d107 s2189 2 #E6AB02 HPGL0663 0.0683 0.1406 0.0683 0.1406 0.1532 0.2109 0.3148 0.1286 0.4588 0.0475 -0.0221 -0.0833 0.2781 -0.3014 0.3372 -0.1525 -0.1476 -0.0253 0.4343
## NULL

3.1.5 PCA: Repeat but with just chronic/self-state

Now change only the condition to chronic/self-healing, making the split in the samples explicit.

## The new colors are a character, changing according to condition.

sampleid condition batch batch_int colors labels PC1 PC2 pc_1 pc_2 pc_3 pc_4 pc_5 pc_6 pc_7 pc_8 pc_9 pc_10 pc_11 pc_12 pc_13 pc_14 pc_15 pc_16 pc_17
chr_5430_d108 HPGL0631 chronic s5430 6 #880000 HPGL0631 -0.2461 -0.3340 -0.2461 -0.3340 0.1227 0.1293 -0.0621 0.2893 0.3116 -0.2105 0.1267 -0.0251 -0.5026 0.0298 -0.1305 -0.1411 0.3074 -0.3284 0.0472
chr_5397_d108 HPGL0632 chronic s5397 5 #880000 HPGL0632 -0.2400 -0.3215 -0.2400 -0.3215 0.1822 0.3904 0.3038 -0.3043 -0.2679 0.2419 0.1933 -0.0879 -0.0444 -0.0005 -0.1256 -0.1420 0.0260 0.4460 -0.0101
chr_2504_d108 HPGL0633 chronic s2504 4 #880000 HPGL0633 -0.2342 -0.3000 -0.2342 -0.3000 -0.1008 0.3011 -0.2960 0.1002 0.0045 0.1124 -0.5541 0.0755 0.1677 -0.0740 -0.0829 0.3309 -0.3350 -0.1055 -0.0486
sh_2272_d108 HPGL0634 self_heal s2272 3 #000088 HPGL0634 0.1518 -0.3703 0.1518 -0.3703 0.2434 -0.2180 0.0782 -0.0544 0.0502 -0.4095 0.0467 0.0619 0.4451 0.2498 0.0420 -0.2587 -0.1832 -0.0878 -0.3489
sh_1022_d108 HPGL0635 self_heal s1022 1 #000088 HPGL0635 0.1733 -0.3071 0.1733 -0.3071 -0.0807 -0.4396 0.0959 0.0349 0.1886 0.4479 -0.0904 -0.3117 0.0277 0.1050 0.1864 0.3086 0.3332 0.1043 -0.0682
sh_2189_d108 HPGL0636 self_heal s2189 2 #000088 HPGL0636 0.1737 -0.3450 0.1737 -0.3450 -0.3421 -0.2519 -0.2129 -0.0681 -0.2849 -0.0414 0.2859 0.2885 -0.0648 -0.2974 0.0952 -0.0740 -0.1474 -0.0219 0.4331
chr_5430_d110 HPGL0651 chronic s5430 6 #880000 HPGL0651 -0.0378 0.1807 -0.0378 0.1807 -0.4260 0.2027 -0.0213 -0.0617 0.0135 0.1604 0.2327 -0.0540 0.4635 0.1947 -0.3720 -0.0879 0.2914 -0.3381 0.0505
chr_5397_d110 HPGL0652 chronic s5397 5 #880000 HPGL0652 -0.0401 0.1595 -0.0401 0.1595 -0.3054 -0.2224 0.1113 -0.2300 0.4114 -0.2043 -0.2949 0.2429 -0.1521 -0.0060 -0.3535 -0.1254 0.0074 0.4342 0.0007
chr_2504_d110 HPGL0653 chronic s2504 4 #880000 HPGL0653 -0.0525 0.1480 -0.0525 0.1480 -0.0429 -0.1885 0.5184 0.3256 -0.2631 -0.1934 0.1138 -0.1999 -0.0887 -0.1215 -0.3073 0.3681 -0.2914 -0.0985 -0.0308
sh_2272_d110 HPGL0654 self_heal s2272 3 #000088 HPGL0654 0.3698 0.1392 0.3698 0.1392 0.0586 0.0641 -0.2870 -0.1507 0.0380 0.1217 0.0320 -0.3945 -0.1724 -0.4673 -0.1769 -0.2302 -0.1542 -0.0877 -0.3727
sh_1022_d110 HPGL0655 self_heal s1022 1 #000088 HPGL0655 0.3563 0.1394 0.3563 0.1394 0.3988 0.1496 -0.0804 0.1389 -0.0252 0.0316 0.0974 0.5405 0.0780 -0.1310 -0.1404 0.3516 0.3166 0.1150 -0.0875
sh_2189_d110 HPGL0656 self_heal s2189 2 #000088 HPGL0656 0.3695 0.1344 0.3695 0.1344 0.2197 0.0526 -0.1325 -0.0337 -0.1746 0.0036 -0.2234 -0.1937 -0.2006 0.5644 -0.1246 -0.0828 -0.1389 -0.0080 0.4492
chr_5430_d107 HPGL0658 chronic s5430 6 #880000 HPGL0658 -0.3279 0.2139 -0.3279 0.2139 0.2613 -0.3041 0.0898 -0.2866 -0.3376 0.0314 -0.3768 0.0889 0.0198 -0.2074 0.1009 -0.1786 0.2837 -0.3250 0.0607
chr_5397_d107 HPGL0659 chronic s5397 5 #880000 HPGL0659 -0.3296 0.2319 -0.3296 0.2319 0.0505 -0.1608 -0.4004 0.5205 -0.1390 -0.0115 0.1046 -0.1437 0.1733 0.0253 0.0831 -0.2061 0.0500 0.4341 -0.0170
chr_2504_d107 HPGL0660 chronic s2504 4 #880000 HPGL0660 -0.3209 0.2329 -0.3209 0.2329 0.1589 -0.1270 -0.2180 -0.3868 0.2575 0.0859 0.4189 0.0590 -0.1030 0.1959 0.1548 0.3267 -0.3219 -0.1048 -0.0426
sh_2272_d107 HPGL0661 self_heal s2272 3 #000088 HPGL0661 0.0868 0.1523 0.0868 0.1523 -0.2705 0.1135 0.2275 0.2180 -0.0959 0.3175 -0.0445 0.3566 -0.2657 0.2280 0.3595 -0.2641 -0.1711 -0.1104 -0.3554
sh_1022_d107 HPGL0662 self_heal s1022 1 #000088 HPGL0662 0.0796 0.1052 0.0796 0.1052 -0.2807 0.2981 -0.0291 -0.1796 -0.1458 -0.5312 -0.0456 -0.2200 -0.0589 0.0139 0.4544 0.2573 0.2747 0.1079 -0.0937
sh_2189_d107 HPGL0663 self_heal s2189 2 #000088 HPGL0663 0.0683 0.1406 0.0683 0.1406 0.1532 0.2109 0.3148 0.1286 0.4588 0.0475 -0.0221 -0.0833 0.2781 -0.3014 0.3372 -0.1525 -0.1476 -0.0253 0.4343

3.2 Restart but include the uninfected samples

For the next few blocks we will just repeat what we did but include the uninfected samples. Ideally doing so will have ~0 effect on the positions of the sample types.

3.2.1 PCA: +uninfected: No Batch correction

I think this first example shows why the uninfected samples were initially removed from the analyses.

sampleid condition batch batch_int colors labels PC1 PC2 pc_1 pc_2 pc_3 pc_4 pc_5 pc_6 pc_7 pc_8 pc_9 pc_10 pc_11 pc_12 pc_13 pc_14 pc_15 pc_16 pc_17 pc_18 pc_19 pc_20
uninf_d108 HPGL0630 uninf d108 1 #009900 HPGL0630 -0.1207 -0.4890 -0.1207 -0.4890 0.2042 -0.0010 0.1801 0.0256 -0.4680 -0.1982 -0.3184 -0.0699 -0.0664 0.4153 -0.2499 0.0389 -0.0509 0.1066 -0.0158 -0.0075 0.0563 -0.0325
chr_5430_d108 HPGL0631 chr d108 1 #990000 HPGL0631 0.1685 0.0230 0.1685 0.0230 0.3256 0.0978 -0.2264 -0.2030 0.3077 0.0827 0.0784 0.2725 -0.3607 0.1960 -0.0891 0.0297 -0.4937 0.0983 0.2568 0.0891 0.1666 0.0342
chr_5397_d108 HPGL0632 chr d108 1 #990000 HPGL0632 0.1457 0.0320 0.1457 0.0320 0.3047 -0.0173 -0.2518 0.0624 -0.0225 -0.3685 -0.2999 -0.1566 0.3814 -0.3376 0.1286 0.1561 -0.1731 -0.2600 0.2409 0.1364 -0.1880 -0.0950
chr_2504_d108 HPGL0633 chr d108 1 #990000 HPGL0633 0.0706 0.0257 0.0706 0.0257 0.2163 0.1976 0.4273 -0.0552 0.1167 -0.0814 -0.1244 0.0979 -0.3156 -0.0588 0.4164 0.0309 0.2728 -0.3486 -0.2359 0.2165 0.2184 -0.0799
sh_2272_d108 HPGL0634 sh d108 1 #000099 HPGL0634 0.1907 -0.0513 0.1907 -0.0513 0.3313 -0.0248 -0.1537 0.1210 0.2523 0.1033 0.2052 0.0055 0.2964 0.2762 -0.0048 0.3935 0.2580 0.0906 -0.3973 -0.3007 -0.0680 0.0392
sh_1022_d108 HPGL0635 sh d108 1 #000099 HPGL0635 0.0678 0.0401 0.0678 0.0401 0.2576 0.0547 0.3849 -0.2119 0.0834 -0.0711 0.1009 -0.2277 0.1075 -0.1689 0.1521 -0.4172 -0.0025 0.4983 0.1018 -0.2146 -0.2121 0.1507
sh_2189_d108 HPGL0636 sh d108 1 #000099 HPGL0636 0.1273 0.0977 0.1273 0.0977 0.2842 0.1943 0.0012 0.2873 -0.3449 0.3629 0.2714 0.0996 0.0249 -0.2745 -0.4240 -0.2398 0.1628 -0.1832 0.0671 0.1072 0.0443 -0.0068
uninf_d110 HPGL0650 uninf d110 2 #009900 HPGL0650 -0.2913 -0.4788 -0.2913 -0.4788 0.0564 -0.4752 -0.1570 0.1051 0.2184 0.4044 -0.0311 -0.1753 -0.1120 -0.1968 0.1911 -0.1416 0.0343 -0.1149 0.0618 0.0125 0.0269 0.0619
chr_5430_d110 HPGL0651 chr d110 2 #990000 HPGL0651 -0.2272 0.2436 -0.2272 0.2436 0.0150 -0.0325 -0.2961 -0.2572 -0.2514 -0.0551 0.0068 -0.0568 -0.2474 -0.0495 0.0163 -0.1874 -0.2325 -0.1760 -0.5584 -0.1674 -0.3045 -0.0009
chr_5397_d110 HPGL0652 chr d110 2 #990000 HPGL0652 -0.2341 0.2651 -0.2341 0.2651 -0.0011 -0.2458 -0.0048 -0.1762 -0.1144 -0.0529 0.3493 -0.1016 0.3389 0.3939 0.1270 -0.0949 0.0362 -0.0549 0.0785 0.4933 0.1680 0.1021
chr_2504_d110 HPGL0653 chr d110 2 #990000 HPGL0653 -0.3187 0.0923 -0.3187 0.0923 -0.0558 -0.2409 0.3500 -0.0811 0.0240 0.0190 -0.0008 0.3896 0.0768 -0.3614 -0.2284 0.4341 -0.1357 0.2388 -0.1009 0.1274 -0.0790 -0.1183
sh_2272_d110 HPGL0654 sh d110 2 #000099 HPGL0654 -0.2463 0.2089 -0.2463 0.2089 -0.0356 0.0203 -0.1105 0.2538 -0.0409 -0.1979 0.2038 -0.2510 -0.2110 -0.0236 0.0863 0.0816 0.0651 0.1614 0.1965 -0.3061 0.3137 -0.5516
sh_1022_d110 HPGL0655 sh d110 2 #000099 HPGL0655 -0.3266 0.1515 -0.3266 0.1515 -0.1321 0.3142 -0.0194 -0.0262 0.3589 0.1614 -0.4678 0.0915 0.3426 0.1293 -0.2135 -0.2864 0.0367 -0.0840 -0.0282 -0.1282 0.1775 -0.0682
sh_2189_d110 HPGL0656 sh d110 2 #000099 HPGL0656 -0.2994 0.1667 -0.2994 0.1667 -0.0822 0.1970 -0.0818 0.2609 -0.0090 -0.0845 -0.0435 0.0596 -0.2643 0.0925 0.0806 0.2164 0.2405 0.0080 0.3207 -0.0492 -0.3223 0.5506
uninf_d107 HPGL0657 uninf d107 3 #009900 HPGL0657 -0.0698 -0.5181 -0.0698 -0.5181 -0.3205 0.4187 -0.1384 -0.1032 0.0196 -0.2389 0.4195 0.2541 0.1686 -0.1315 0.1258 -0.0747 -0.0302 -0.0381 -0.0279 0.0164 -0.0312 -0.0405
chr_5430_d107 HPGL0658 chr d107 3 #990000 HPGL0658 0.2616 0.0048 0.2616 0.0048 -0.1625 -0.1857 -0.2868 -0.4835 -0.0134 -0.0896 -0.1580 0.0686 -0.1558 -0.0868 -0.2176 -0.0185 0.5848 0.1342 0.1310 0.0628 0.0168 -0.0902
chr_5397_d107 HPGL0659 chr d107 3 #990000 HPGL0659 0.2622 0.0740 0.2622 0.0740 -0.2126 -0.2403 0.1645 0.1467 -0.1003 0.1508 -0.0401 0.3859 0.0590 0.2984 0.2323 -0.1649 -0.0537 -0.1630 0.2124 -0.1955 -0.3945 -0.3179
chr_2504_d107 HPGL0660 chr d107 3 #990000 HPGL0660 0.1762 -0.0059 0.1762 -0.0059 -0.2659 -0.1175 0.3293 -0.1482 0.1725 -0.1100 0.1807 -0.3585 -0.0439 0.0106 -0.3655 0.1324 -0.1585 -0.4589 0.0970 -0.2631 0.0585 0.1802
sh_2272_d107 HPGL0661 sh d107 3 #000099 HPGL0661 0.2603 0.0881 0.2603 0.0881 -0.2009 -0.2487 -0.0836 0.2894 -0.1482 -0.1987 -0.1267 0.2289 0.0611 -0.1381 0.1260 -0.1483 -0.1221 0.1477 -0.1830 -0.1493 0.4939 0.3961
sh_1022_d107 HPGL0662 sh d107 3 #000099 HPGL0662 0.1734 0.0376 0.1734 0.0376 -0.2562 0.2841 -0.0201 -0.1952 -0.3068 0.5276 -0.1446 -0.2784 0.0805 -0.0449 0.2782 0.3436 -0.1612 0.1504 0.0390 0.0265 0.1005 0.0179
sh_2189_d107 HPGL0663 sh d107 3 #000099 HPGL0663 0.2298 -0.0079 0.2298 -0.0079 -0.2697 0.0509 -0.0071 0.3890 0.2664 -0.0653 -0.0609 -0.2781 -0.1605 0.0603 -0.1678 -0.0837 -0.0771 0.2473 -0.2562 0.4936 -0.2416 -0.1310
## NULL

3.2.2 PCA: +uninfected Repeat with combat adjustment

For the second iteration, use the same normalization, but add a combat correction in an attempt to minimize the patients' contribution to the variance.

sampleid condition batch batch_int colors labels PC1 PC2 pc_1 pc_2 pc_3 pc_4 pc_5 pc_6 pc_7 pc_8 pc_9 pc_10 pc_11 pc_12 pc_13 pc_14 pc_15 pc_16 pc_17 pc_18 pc_19 pc_20
uninf_d108 HPGL0630 uninf d108 1 #009900 HPGL0630 -0.4328 0.3125 -0.4328 0.3125 -0.3075 0.3322 -0.3029 0.2646 -0.2314 -0.1371 0.0492 0.0876 -0.0158 0.2765 -0.1781 0.0035 -0.0383 0.0790 -0.0641 0.0437 -0.0182 0.3094
chr_5430_d108 HPGL0631 chr d108 1 #990000 HPGL0631 -0.0020 -0.2454 -0.0020 -0.2454 0.2015 -0.1907 0.1525 0.1149 0.0718 -0.0421 -0.0157 -0.1604 -0.1072 0.0472 -0.6706 0.0409 0.0453 -0.0883 0.2341 0.1030 -0.3873 0.2355
chr_5397_d108 HPGL0632 chr d108 1 #990000 HPGL0632 0.0011 -0.2485 0.0011 -0.2485 0.1550 -0.1758 -0.2850 -0.0719 -0.2837 0.1763 0.2271 0.0146 0.0767 -0.3153 -0.0495 0.3425 0.2953 0.0818 -0.3933 -0.0479 0.2105 0.2562
chr_2504_d108 HPGL0633 chr d108 1 #990000 HPGL0633 0.2062 0.2527 0.2062 0.2527 -0.0586 -0.2170 0.0234 -0.3983 0.0543 -0.5106 -0.2493 -0.0424 -0.0497 0.1811 0.1129 0.4279 0.0552 0.1371 0.0362 0.1871 0.0367 0.1346
sh_2272_d108 HPGL0634 sh d108 1 #000099 HPGL0634 -0.0570 -0.2727 -0.0570 -0.2727 -0.0072 -0.2770 0.0970 -0.1505 0.1007 -0.1035 0.2729 -0.0024 0.0228 0.5193 0.0506 -0.4940 0.0806 0.2488 -0.1642 -0.0797 0.2225 0.0510
sh_1022_d108 HPGL0635 sh d108 1 #000099 HPGL0635 0.2034 0.2278 0.2034 0.2278 -0.1749 0.0914 0.3003 0.1246 0.0878 0.2047 -0.3459 -0.3576 0.1784 -0.2602 0.0193 -0.2794 0.1387 0.3120 -0.2069 -0.0212 -0.0053 0.2820
sh_2189_d108 HPGL0636 sh d108 1 #000099 HPGL0636 0.0335 -0.2035 0.0335 -0.2035 0.1607 0.3723 0.0483 0.1749 0.5810 -0.1552 0.0828 0.3160 -0.1658 -0.1953 0.0406 0.0650 -0.1566 0.1259 -0.1422 0.2704 0.1710 0.0987
uninf_d110 HPGL0650 uninf d110 2 #009900 HPGL0650 -0.6055 -0.1592 -0.6055 -0.1592 -0.2942 -0.1298 0.2161 0.0047 0.1048 -0.0667 -0.0045 -0.2601 -0.0030 -0.2169 0.0568 0.1718 0.1162 0.0681 -0.0152 0.1245 -0.0059 -0.4619
chr_5430_d110 HPGL0651 chr d110 2 #990000 HPGL0651 0.1397 -0.1375 0.1397 -0.1375 0.0593 -0.1474 -0.2538 0.4069 -0.1169 -0.3065 -0.4345 0.2846 -0.0205 -0.0871 -0.0125 -0.1385 0.1810 0.1277 0.0513 -0.3189 -0.0074 -0.3171
chr_5397_d110 HPGL0652 chr d110 2 #990000 HPGL0652 0.1654 -0.1594 0.1654 -0.1594 0.1015 0.1512 0.1442 0.1273 -0.3246 0.4274 -0.1170 0.0452 -0.1800 0.3736 0.1754 0.2491 -0.1208 0.3093 -0.0050 0.2724 -0.1593 -0.2260
chr_2504_d110 HPGL0653 chr d110 2 #990000 HPGL0653 0.0608 0.1262 0.0608 0.1262 -0.0801 0.1391 0.2977 -0.3608 -0.2892 -0.0739 0.2197 0.4989 0.2633 -0.2698 -0.2208 -0.1551 -0.0724 0.1505 0.0584 -0.0012 -0.1673 -0.1665
sh_2272_d110 HPGL0654 sh d110 2 #000099 HPGL0654 0.1001 -0.0932 0.1001 -0.0932 -0.0049 0.0927 -0.3510 -0.1073 0.2010 0.0786 -0.1310 -0.0776 0.6277 0.1754 -0.0174 -0.0431 -0.0575 -0.3181 -0.2167 0.2310 -0.2221 -0.1986
sh_1022_d110 HPGL0655 sh d110 2 #000099 HPGL0655 0.1180 0.3448 0.1180 0.3448 0.2322 -0.0083 -0.0113 0.0594 -0.1723 -0.1355 0.1868 -0.1510 -0.4046 -0.0957 0.0573 -0.3048 0.1819 -0.3514 -0.2723 0.3166 -0.0683 -0.2268
sh_2189_d110 HPGL0656 sh d110 2 #000099 HPGL0656 0.1121 0.3014 0.1121 0.3014 0.1685 0.0939 -0.2457 -0.1171 0.1717 0.1386 0.1664 -0.3036 -0.0638 0.0079 -0.3406 0.1014 -0.2715 0.2355 0.0958 -0.3275 0.2717 -0.3452
uninf_d107 HPGL0657 uninf d107 3 #009900 HPGL0657 -0.4555 0.2762 -0.4555 0.2762 0.4683 -0.1907 0.0330 -0.1858 0.1520 0.3188 -0.2998 0.2682 -0.0238 0.0253 0.1716 -0.0556 -0.0181 -0.0998 0.0487 -0.1439 -0.0536 0.1440
chr_5430_d107 HPGL0658 chr d107 3 #990000 HPGL0658 0.0905 0.0068 0.0905 0.0068 -0.2742 -0.4739 0.0289 0.1969 -0.1005 0.0787 -0.0099 0.0813 -0.0310 -0.1417 0.0371 -0.0229 -0.6763 -0.2236 -0.0499 0.0858 0.1731 0.1080
chr_5397_d107 HPGL0659 chr d107 3 #990000 HPGL0659 0.1480 -0.0661 0.1480 -0.0661 -0.3694 0.1383 0.1400 -0.1342 0.1884 0.1346 0.0247 0.1087 -0.3051 0.1341 0.0286 0.1723 0.0952 -0.3394 -0.2604 -0.5240 -0.2422 0.0052
chr_2504_d107 HPGL0660 chr d107 3 #990000 HPGL0660 -0.0218 -0.2109 -0.0218 -0.2109 0.1444 0.3553 0.2942 -0.0523 -0.2776 -0.1088 -0.2013 -0.1460 0.1156 0.0780 -0.0248 0.0129 0.0006 -0.4142 0.1954 -0.0893 0.5288 0.0526
sh_2272_d107 HPGL0661 sh d107 3 #000099 HPGL0661 0.1271 -0.0458 0.1271 -0.0458 -0.3392 -0.0223 -0.3205 -0.1785 0.1086 0.3125 0.0037 0.0977 -0.1785 -0.1069 0.0553 -0.1701 0.3029 -0.0613 0.5585 0.2350 0.1561 0.0524
sh_1022_d107 HPGL0662 sh d107 3 #000099 HPGL0662 0.1184 0.2394 0.1184 0.2394 0.0875 -0.1154 0.2051 0.4394 0.1121 -0.0185 0.4540 0.0066 0.3338 0.0803 0.2844 0.2202 0.2022 -0.0775 0.3042 -0.1234 -0.0515 0.0363
sh_2189_d107 HPGL0663 sh d107 3 #000099 HPGL0663 -0.0495 -0.2454 -0.0495 -0.2454 0.1313 0.1820 -0.2103 -0.1569 -0.1381 -0.2118 0.1214 -0.3083 -0.0694 -0.2097 0.4244 -0.1441 -0.2835 0.0978 0.2079 -0.1924 -0.3819 0.1762
## NULL

3.2.3 PCA: +uninfected, change the condition to chr/sh

Including the uninfected samples and changing the condition should not matter much.

3.2.4 PCA: +uninfected, Change batch to strain and condition to patient+state

## Error in solve.default(t(mod) %*% mod) : 
##   system is computationally singular: reciprocal condition number = 2.14038e-19
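
The error above means that the cross-product of the model matrix could not be inverted. A common cause is a rank-deficient design, in which some columns are linear combinations of others; whether that is what happened here cannot be determined from the output alone. The Python sketch below shows one way to check, assuming a model with an intercept plus condition and batch terms, and using labels in the style of the table in this section. The design construction and both helpers are hypothetical.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank by Gauss-Jordan elimination with exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def one_hot(labels):
    """Indicator columns for every level except the first (the reference)."""
    levels = sorted(set(labels))[1:]
    return [[1 if lab == lev else 0 for lev in levels] for lab in labels]

# Assumed design: per patient, one uninfected sample (batch "none"),
# three chronic strains and three self-healing strains.
patients = ["d108", "d110", "d107"]
strains = {"chronic": ["s5430", "s5397", "s2504"],
           "self_heal": ["s2272", "s1022", "s2189"]}
condition, batch = [], []
for p in patients:
    condition.append(f"uninfected_{p}")
    batch.append("none")
    for state, names in strains.items():
        for s in names:
            condition.append(f"{state}_{p}")
            batch.append(s)

# Intercept, then condition indicators, then batch indicators.
design = [[1] + c + b for c, b in zip(one_hot(condition), one_hot(batch))]
n_columns = len(design[0])
rank = matrix_rank(design)
print("columns:", n_columns, "rank:", rank)
```

In this assumed design the rank is smaller than the number of columns, because every strain occurs in only one state and the batch "none" occurs only in uninfected samples. The batch and condition indicators therefore overlap, and a matrix of this kind cannot be inverted.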

sampleid condition batch batch_int colors labels PC1 PC2 pc_1 pc_2 pc_3 pc_4 pc_5 pc_6 pc_7 pc_8 pc_9 pc_10 pc_11 pc_12 pc_13 pc_14 pc_15 pc_16 pc_17 pc_18 pc_19 pc_20
uninf_d108 HPGL0630 uninfected_d108 none 1 #1B9E77 HPGL0630 -0.3753 -0.1723 -0.3753 -0.1723 0.2976 -0.1561 -0.4133 -0.2250 -0.0323 -0.0591 0.3523 -0.2305 -0.3156 -0.2055 0.2930 -0.1479 0.0949 0.0281 0.0794 0.0056 0.0035 -0.0047
chr_5430_d108 HPGL0631 chronic_d108 s5430 7 #C16610 HPGL0631 0.2311 -0.3073 0.2311 -0.3073 0.0162 0.2276 0.0481 0.2750 -0.1501 -0.0264 -0.1846 -0.3633 -0.1954 0.1743 -0.0369 -0.4250 0.0008 -0.0166 -0.0675 0.0804 -0.2644 -0.3918
chr_5397_d108 HPGL0632 chronic_d108 s5397 6 #C16610 HPGL0632 0.2274 -0.3021 0.2274 -0.3021 0.0102 0.2846 0.0492 -0.6372 0.0127 -0.1782 -0.1092 0.0476 -0.0772 0.1011 -0.1470 0.0507 -0.0107 -0.0120 -0.0729 0.0374 -0.2019 0.4310
chr_2504_d108 HPGL0633 chronic_d108 s2504 5 #C16610 HPGL0633 0.2244 -0.2875 0.2244 -0.2875 -0.0693 0.1629 -0.3342 0.0759 -0.1800 -0.0781 0.0270 -0.1560 0.5639 -0.1138 0.1066 0.1447 0.0781 -0.0194 -0.0910 -0.0913 0.4690 -0.0281
sh_2272_d108 HPGL0634 self_heal_d108 s2272 4 #8D6B86 HPGL0634 -0.0404 -0.3746 -0.0404 -0.3746 -0.0554 0.0758 0.3352 0.1404 0.1877 0.2943 -0.0313 0.1193 -0.1335 -0.2984 0.2625 0.2559 -0.2399 0.3373 0.0205 0.3295 0.1037 -0.0037
sh_1022_d108 HPGL0635 self_heal_d108 s1022 2 #8D6B86 HPGL0635 -0.0602 -0.3113 -0.0602 -0.3113 -0.0627 -0.2698 0.1859 0.2105 0.1595 -0.2240 0.4142 0.0134 0.0521 0.0337 -0.3869 0.1242 -0.1618 -0.4831 0.1246 0.0554 -0.0383 0.0145
sh_2189_d108 HPGL0636 self_heal_d108 s2189 3 #8D6B86 HPGL0636 -0.0594 -0.3424 -0.0594 -0.3424 -0.1401 -0.3204 -0.1685 0.0986 0.0221 0.2102 -0.1683 0.4647 -0.0822 0.3090 0.0123 -0.0159 0.2684 0.1436 -0.0063 -0.4266 -0.0686 -0.0136
uninf_d110 HPGL0650 uninfected_d110 none 1 #BC4399 HPGL0650 -0.4394 0.0059 -0.4394 0.0059 0.5986 -0.0667 0.2037 0.0346 -0.0420 -0.0580 -0.4570 -0.0428 0.2765 0.0507 -0.1063 0.0889 -0.0332 -0.0726 -0.1819 -0.0123 -0.0085 0.0038
chr_5430_d110 HPGL0651 chronic_d110 s5430 7 #A66753 HPGL0651 0.0849 0.1976 0.0849 0.1976 -0.2162 -0.2285 -0.2524 -0.1321 -0.0448 -0.2155 -0.1766 -0.0440 -0.1432 -0.0432 0.0522 0.5303 -0.1792 -0.0303 -0.2978 0.0770 -0.2411 -0.3903
chr_5397_d110 HPGL0652 chronic_d110 s5397 6 #A66753 HPGL0652 0.0865 0.1798 0.0865 0.1798 -0.1548 -0.2481 -0.0025 0.1738 0.4736 0.0415 -0.0026 -0.3698 0.1998 0.1244 0.2673 -0.1380 0.0661 0.0515 -0.2605 0.0214 -0.1919 0.4195
chr_2504_d110 HPGL0653 chronic_d110 s2504 5 #A66753 HPGL0653 0.0925 0.1645 0.0925 0.1645 -0.0327 -0.1862 0.3172 -0.2285 -0.2290 0.4186 0.1347 -0.2801 -0.2534 0.1477 -0.1752 0.0433 0.1175 -0.0692 -0.2661 -0.0897 0.4326 -0.0278
sh_2272_d110 HPGL0654 self_heal_d110 s2272 4 #96A713 HPGL0654 -0.2001 0.1410 -0.2001 0.1410 -0.1986 0.0769 -0.1141 0.0812 0.0669 -0.1707 0.0227 0.1837 -0.0169 -0.2049 -0.4680 -0.1998 0.4015 0.2830 -0.2304 0.3919 0.1021 -0.0046
sh_1022_d110 HPGL0655 self_heal_d110 s1022 2 #96A713 HPGL0655 -0.1932 0.1543 -0.1932 0.1543 -0.1662 0.3384 0.1677 0.0709 -0.1784 -0.0516 0.1101 0.3360 -0.0290 0.1064 0.4835 -0.0533 0.1718 -0.4730 -0.2060 0.1062 -0.0191 0.0197
sh_2189_d110 HPGL0656 self_heal_d110 s2189 3 #96A713 HPGL0656 -0.2023 0.1421 -0.2023 0.1421 -0.1729 0.1852 -0.0110 -0.0376 -0.0601 0.0760 0.1707 0.0979 0.1202 -0.1989 -0.1419 -0.2898 -0.5843 0.1587 -0.2172 -0.4434 -0.0932 -0.0092
uninf_d107 HPGL0657 uninfected_d107 none 1 #D59D08 HPGL0657 -0.3190 0.1615 -0.3190 0.1615 -0.1413 0.4199 -0.1850 0.0670 0.1360 0.1878 0.0282 -0.2216 -0.0175 0.4373 -0.1436 0.3091 -0.0199 0.0888 0.4120 0.0085 0.0022 0.0036
chr_5430_d107 HPGL0658 chronic_d107 s5430 7 #9D7426 HPGL0658 0.2932 0.1802 0.2932 0.1802 0.3389 0.0566 0.0554 -0.2218 0.2485 0.2764 0.3327 0.1993 0.3325 -0.0194 -0.0029 -0.0195 0.1797 0.0474 0.0876 0.0389 -0.2617 -0.3893
chr_5397_d107 HPGL0659 chronic_d107 s5397 6 #9D7426 HPGL0659 0.2956 0.1981 0.2956 0.1981 0.2860 -0.0082 -0.1886 0.3998 -0.4477 0.1498 0.0944 0.1134 -0.1361 -0.1083 -0.1024 0.1609 -0.0520 0.0427 0.0635 0.0687 -0.2180 0.4200
chr_2504_d107 HPGL0660 chronic_d107 s2504 5 #9D7426 HPGL0660 0.2892 0.2053 0.2892 0.2053 0.2671 0.1012 -0.1022 0.1074 0.4162 -0.2836 -0.1328 0.2073 -0.3454 0.0795 0.0355 -0.1063 -0.1847 -0.0532 0.0611 -0.1115 0.4653 -0.0321
sh_2272_d107 HPGL0661 self_heal_d107 s2272 4 #666666 HPGL0661 0.0158 0.1300 0.0158 0.1300 -0.0952 -0.3450 0.1269 -0.1561 -0.3173 -0.2473 0.0204 0.1019 0.2052 0.3408 0.1706 -0.2422 -0.2410 0.2682 0.3296 0.3119 0.1086 -0.0173
sh_1022_d107 HPGL0662 self_heal_d107 s1022 2 #666666 HPGL0662 0.0195 0.1093 0.0195 0.1093 -0.2165 -0.1325 -0.1336 -0.1486 0.0379 0.3117 -0.4356 -0.0107 0.0393 -0.4024 -0.0571 -0.2110 -0.0152 -0.4076 0.4037 0.0798 0.0008 0.0194
sh_2189_d107 HPGL0663 self_heal_d107 s2189 3 #666666 HPGL0663 0.0292 0.1277 0.0292 0.1277 -0.0928 0.0325 0.4160 0.0514 -0.0793 -0.3739 -0.0094 -0.1660 -0.0440 -0.3101 0.0850 0.1407 0.3431 0.1874 0.3157 -0.4378 -0.0809 -0.0189
## NULL

3.2.5 PCA: +uninfected, Repeat but with just chronic/self-state

## The new colors are a character, changing according to condition.

sampleid condition batch batch_int colors labels PC1 PC2 pc_1 pc_2 pc_3 pc_4 pc_5 pc_6 pc_7 pc_8 pc_9 pc_10 pc_11 pc_12 pc_13 pc_14 pc_15 pc_16 pc_17 pc_18 pc_19 pc_20
uninf_d108 HPGL0630 uninfected none 1 #008800 HPGL0630 -0.3753 -0.1723 -0.3753 -0.1723 0.2976 -0.1561 -0.4133 -0.2250 -0.0323 -0.0591 0.3523 -0.2305 -0.3156 -0.2055 0.2930 -0.1479 0.0949 0.0281 0.0794 0.0056 0.0035 -0.0047
chr_5430_d108 HPGL0631 chronic s5430 7 #880000 HPGL0631 0.2311 -0.3073 0.2311 -0.3073 0.0162 0.2276 0.0481 0.2750 -0.1501 -0.0264 -0.1846 -0.3633 -0.1954 0.1743 -0.0369 -0.4250 0.0008 -0.0166 -0.0675 0.0804 -0.2644 -0.3918
chr_5397_d108 HPGL0632 chronic s5397 6 #880000 HPGL0632 0.2274 -0.3021 0.2274 -0.3021 0.0102 0.2846 0.0492 -0.6372 0.0127 -0.1782 -0.1092 0.0476 -0.0772 0.1011 -0.1470 0.0507 -0.0107 -0.0120 -0.0729 0.0374 -0.2019 0.4310
chr_2504_d108 HPGL0633 chronic s2504 5 #880000 HPGL0633 0.2244 -0.2875 0.2244 -0.2875 -0.0693 0.1629 -0.3342 0.0759 -0.1800 -0.0781 0.0270 -0.1560 0.5639 -0.1138 0.1066 0.1447 0.0781 -0.0194 -0.0910 -0.0913 0.4690 -0.0281
sh_2272_d108 HPGL0634 self_heal s2272 4 #000088 HPGL0634 -0.0404 -0.3746 -0.0404 -0.3746 -0.0554 0.0758 0.3352 0.1404 0.1877 0.2943 -0.0313 0.1193 -0.1335 -0.2984 0.2625 0.2559 -0.2399 0.3373 0.0205 0.3295 0.1037 -0.0037
sh_1022_d108 HPGL0635 self_heal s1022 2 #000088 HPGL0635 -0.0602 -0.3113 -0.0602 -0.3113 -0.0627 -0.2698 0.1859 0.2105 0.1595 -0.2240 0.4142 0.0134 0.0521 0.0337 -0.3869 0.1242 -0.1618 -0.4831 0.1246 0.0554 -0.0383 0.0145
sh_2189_d108 HPGL0636 self_heal s2189 3 #000088 HPGL0636 -0.0594 -0.3424 -0.0594 -0.3424 -0.1401 -0.3204 -0.1685 0.0986 0.0221 0.2102 -0.1683 0.4647 -0.0822 0.3090 0.0123 -0.0159 0.2684 0.1436 -0.0063 -0.4266 -0.0686 -0.0136
uninf_d110 HPGL0650 uninfected none 1 #008800 HPGL0650 -0.4394 0.0059 -0.4394 0.0059 0.5986 -0.0667 0.2037 0.0346 -0.0420 -0.0580 -0.4570 -0.0428 0.2765 0.0507 -0.1063 0.0889 -0.0332 -0.0726 -0.1819 -0.0123 -0.0085 0.0038
chr_5430_d110 HPGL0651 chronic s5430 7 #880000 HPGL0651 0.0849 0.1976 0.0849 0.1976 -0.2162 -0.2285 -0.2524 -0.1321 -0.0448 -0.2155 -0.1766 -0.0440 -0.1432 -0.0432 0.0522 0.5303 -0.1792 -0.0303 -0.2978 0.0770 -0.2411 -0.3903
chr_5397_d110 HPGL0652 chronic s5397 6 #880000 HPGL0652 0.0865 0.1798 0.0865 0.1798 -0.1548 -0.2481 -0.0025 0.1738 0.4736 0.0415 -0.0026 -0.3698 0.1998 0.1244 0.2673 -0.1380 0.0661 0.0515 -0.2605 0.0214 -0.1919 0.4195
chr_2504_d110 HPGL0653 chronic s2504 5 #880000 HPGL0653 0.0925 0.1645 0.0925 0.1645 -0.0327 -0.1862 0.3172 -0.2285 -0.2290 0.4186 0.1347 -0.2801 -0.2534 0.1477 -0.1752 0.0433 0.1175 -0.0692 -0.2661 -0.0897 0.4326 -0.0278
sh_2272_d110 HPGL0654 self_heal s2272 4 #000088 HPGL0654 -0.2001 0.1410 -0.2001 0.1410 -0.1986 0.0769 -0.1141 0.0812 0.0669 -0.1707 0.0227 0.1837 -0.0169 -0.2049 -0.4680 -0.1998 0.4015 0.2830 -0.2304 0.3919 0.1021 -0.0046
sh_1022_d110 HPGL0655 self_heal s1022 2 #000088 HPGL0655 -0.1932 0.1543 -0.1932 0.1543 -0.1662 0.3384 0.1677 0.0709 -0.1784 -0.0516 0.1101 0.3360 -0.0290 0.1064 0.4835 -0.0533 0.1718 -0.4730 -0.2060 0.1062 -0.0191 0.0197
sh_2189_d110 HPGL0656 self_heal s2189 3 #000088 HPGL0656 -0.2023 0.1421 -0.2023 0.1421 -0.1729 0.1852 -0.0110 -0.0376 -0.0601 0.0760 0.1707 0.0979 0.1202 -0.1989 -0.1419 -0.2898 -0.5843 0.1587 -0.2172 -0.4434 -0.0932 -0.0092
uninf_d107 HPGL0657 uninfected none 1 #008800 HPGL0657 -0.3190 0.1615 -0.3190 0.1615 -0.1413 0.4199 -0.1850 0.0670 0.1360 0.1878 0.0282 -0.2216 -0.0175 0.4373 -0.1436 0.3091 -0.0199 0.0888 0.4120 0.0085 0.0022 0.0036
chr_5430_d107 HPGL0658 chronic s5430 7 #880000 HPGL0658 0.2932 0.1802 0.2932 0.1802 0.3389 0.0566 0.0554 -0.2218 0.2485 0.2764 0.3327 0.1993 0.3325 -0.0194 -0.0029 -0.0195 0.1797 0.0474 0.0876 0.0389 -0.2617 -0.3893
chr_5397_d107 HPGL0659 chronic s5397 6 #880000 HPGL0659 0.2956 0.1981 0.2956 0.1981 0.2860 -0.0082 -0.1886 0.3998 -0.4477 0.1498 0.0944 0.1134 -0.1361 -0.1083 -0.1024 0.1609 -0.0520 0.0427 0.0635 0.0687 -0.2180 0.4200
chr_2504_d107 HPGL0660 chronic s2504 5 #880000 HPGL0660 0.2892 0.2053 0.2892 0.2053 0.2671 0.1012 -0.1022 0.1074 0.4162 -0.2836 -0.1328 0.2073 -0.3454 0.0795 0.0355 -0.1063 -0.1847 -0.0532 0.0611 -0.1115 0.4653 -0.0321
sh_2272_d107 HPGL0661 self_heal s2272 4 #000088 HPGL0661 0.0158 0.1300 0.0158 0.1300 -0.0952 -0.3450 0.1269 -0.1561 -0.3173 -0.2473 0.0204 0.1019 0.2052 0.3408 0.1706 -0.2422 -0.2410 0.2682 0.3296 0.3119 0.1086 -0.0173
sh_1022_d107 HPGL0662 self_heal s1022 2 #000088 HPGL0662 0.0195 0.1093 0.0195 0.1093 -0.2165 -0.1325 -0.1336 -0.1486 0.0379 0.3117 -0.4356 -0.0107 0.0393 -0.4024 -0.0571 -0.2110 -0.0152 -0.4076 0.4037 0.0798 0.0008 0.0194
sh_2189_d107 HPGL0663 self_heal s2189 3 #000088 HPGL0663 0.0292 0.1277 0.0292 0.1277 -0.0928 0.0325 0.4160 0.0514 -0.0793 -0.3739 -0.0094 -0.1660 -0.0440 -0.3101 0.0850 0.1407 0.3431 0.1874 0.3157 -0.4378 -0.0809 -0.0189

3.2.6 PCA: Try only using samples for 1 patient

As per a conversation with Maria Adelaida on Skype, let's remove all samples except those for one patient, then see if some aspect of the data jumps out (strain-to-strain variation, for example).

## Using a subset expression.
## There were 18, now there are 6 samples.

## Using a subset expression.
## There were 18, now there are 6 samples.

## Using a subset expression.
## There were 18, now there are 6 samples.
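The per-patient split reported above (18 infected samples, 6 per patient) can be illustrated with a short sketch. It is written in Python and is not the subset call used to produce this report. The sample names follow the state_strain_patient pattern seen in the tables, and the helper name split_by_patient is made up for this example.

```python
from collections import defaultdict

# Infected sample names, following the pattern used in the tables of this report.
infections = [("chr", "5430"), ("chr", "5397"), ("chr", "2504"),
              ("sh", "2272"), ("sh", "1022"), ("sh", "2189")]
sample_names = [f"{state}_{strain}_{patient}"
                for patient in ("d108", "d110", "d107")
                for state, strain in infections]

def split_by_patient(names):
    """Group sample names by the patient field (third underscore-separated part)."""
    groups = defaultdict(list)
    for name in names:
        state, strain, patient = name.split("_")
        groups[patient].append(name)
    return dict(groups)

by_patient = split_by_patient(sample_names)
for patient, members in by_patient.items():
    print(patient, len(members), members)
```

Each subset keeps three chronic and three self-healing strains, so within one patient the only remaining design factors are state and strain.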

sampleid condition batch batch_int colors labels PC1 PC2 pc_1 pc_2 pc_3 pc_4 pc_5
chr_5430_d110 HPGL0651 chr chronic 1 #990000 HPGL0651 -0.4377 0.2318 -0.4377 0.2318 -0.3217 -0.6957 -0.0233
chr_5397_d110 HPGL0652 chr chronic 1 #990000 HPGL0652 -0.5377 -0.1637 -0.5377 -0.1637 -0.2420 0.6266 0.2573
chr_2504_d110 HPGL0653 chr chronic 1 #990000 HPGL0653 0.2104 -0.8458 0.2104 -0.8458 0.1272 -0.2313 -0.0633
sh_2272_d110 HPGL0654 sh self_heal 2 #000099 HPGL0654 -0.1264 0.2503 -0.1264 0.2503 0.5490 0.1704 -0.6513
sh_1022_d110 HPGL0655 sh self_heal 2 #000099 HPGL0655 0.6214 0.2294 0.6214 0.2294 -0.5632 0.1923 -0.2010
sh_2189_d110 HPGL0656 sh self_heal 2 #000099 HPGL0656 0.2700 0.2980 0.2700 0.2980 0.4507 -0.0623 0.6816

3.2.7 PCA: Re-label one sample

In our previous discussion, Hector suggested that sample ‘HPGL0635’ is sufficiently dissimilar to its cohort samples that it might actually be a member of strain ‘2504’ rather than ‘1022’. Let us look and see what happens if that is changed.

I am going to leave out the uninfected samples to avoid the confusion they generate.

## Setting condition for ids sh_1022_d108 to chr.
## The new colors are a list, changing according to sampleID.

I may be biased, but I think this suggests that the samples were not switched.

4 Testing out some ideas

One query was to see if there is a reversal of two samples.

## This function will replace the expt$expressionset slot with:
## cpm(cbcb(data))
## It will save copies of each step along the way
##  in expt$normalized with the corresponding libsizes. Keep libsizes in mind
##  when invoking limma.  The appropriate libsize is non-log(cpm(normalized)).
##  This is most likely kept at:
##  'new_expt$normalized$intermediate_counts$normalization$libsizes'
##  A copy of this may also be found at:
##  new_expt$best_libsize
## Leaving the data in its current base format, keep in mind that
##  some metrics are easier to see when the data is log2 transformed, but
##  EdgeR/DESeq do not accept transformed data.
## Leaving the data unnormalized.  This is necessary for DESeq, but
##  EdgeR/limma might benefit from normalization.  Good choices include quantile,
##  size-factor, tmm, etc.
## Not correcting the count-data for batch effects.  If batch is
##  included in EdgerR/limma's model, then this is probably wise; but in extreme
##  batch effects this is a good parameter to play with.
## Step 1: performing count filter with option: cbcb
## Removing 37453 low-count genes (13588 remaining).
## Step 2: not normalizing the data.
## Step 3: converting the data with cpm.
## Step 4: not transforming the data.
## Step 5: not doing batch correction.
##                 chr_5430_d108 chr_5397_d108 chr_2504_d108 sh_2272_d108
## ENSG00000000419        19.929        19.090        16.925       19.811
## ENSG00000000457        25.968        30.203        22.446       29.716
## ENSG00000000460        13.890        11.024         7.845       10.340
## ENSG00000000938       531.128       522.412       385.291      403.864
## ENSG00000000971         4.429         4.929         4.358        4.605
## ENSG00000001036        38.851        37.193        38.791       35.625
##                 sh_1022_d108 sh_2189_d108 chr_5430_d110 chr_5397_d110
## ENSG00000000419       17.834       17.853        16.113        17.211
## ENSG00000000457       23.491       27.999        18.310        20.261
## ENSG00000000460        9.205        8.195         7.263         7.115
## ENSG00000000938      401.943      435.796       618.578       552.865
## ENSG00000000971        4.698        3.415         2.563         1.762
## ENSG00000001036       40.367       43.999        46.019        39.302
##                 chr_2504_d110 sh_2272_d110 sh_1022_d110 sh_2189_d110
## ENSG00000000419        15.027       13.010       12.678       15.247
## ENSG00000000457        20.135       20.095       17.554       18.089
## ENSG00000000460         6.533        4.637        4.632        4.458
## ENSG00000000938       329.400      460.325      477.447      452.104
## ENSG00000000971         3.029        2.834        1.544        2.972
## ENSG00000001036        35.933       37.357       35.595       35.920
##                 chr_5430_d107 chr_5397_d107 chr_2504_d107 sh_2272_d107
## ENSG00000000419        16.313        14.825        13.675       16.460
## ENSG00000000457        27.493        26.041        24.836       28.394
## ENSG00000000460         9.925         8.471         5.737        6.349
## ENSG00000000938       461.217       332.263       280.345      348.545
## ENSG00000000971         4.221         7.844         5.816        6.584
## ENSG00000001036        31.714        23.610        24.757       27.042
##                 sh_1022_d107 sh_2189_d107
## ENSG00000000419       14.623       15.870
## ENSG00000000457       24.613       27.969
## ENSG00000000460        7.448        6.903
## ENSG00000000938      391.453      298.651
## ENSG00000000971        3.724        3.807
## ENSG00000001036       30.517       20.994
## The new colors are a character, changing according to condition.
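The log above describes a low-count filter followed by a counts-per-million (cpm) conversion. The following Python sketch shows the arithmetic on made-up counts. The threshold rule is a simple stand-in and not the cbcb filter itself, and computing library sizes from the genes that survive the filter is an assumption.

```python
def filter_low_counts(counts, min_total=10):
    """Keep genes whose summed count reaches min_total (hypothetical threshold)."""
    return {gene: row for gene, row in counts.items() if sum(row) >= min_total}

def counts_per_million(counts):
    """Scale each sample (column) so that its counts sum to one million."""
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[i] for row in counts.values()) for i in range(n_samples)]
    return {gene: [1e6 * c / lib for c, lib in zip(row, lib_sizes)]
            for gene, row in counts.items()}

# Toy counts for three samples; these are illustrative values, not report data.
toy_counts = {
    "geneA": [500, 800, 650],
    "geneB": [0, 1, 0],
    "geneC": [1500, 1200, 1350],
    "geneD": [2, 0, 1],
}
kept = filter_low_counts(toy_counts)
cpm = counts_per_million(kept)
for gene, row in cpm.items():
    print(gene, [round(v, 1) for v in row])
```

Because cpm only rescales by library size, the values stay on the linear scale, which matches the "not transforming the data" step in the log.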

## The new colors are a character, changing according to condition.

## Going to write the image to: images/varpart_donor_strain.png when dev.off() is called.
## png 
##   2
## Going to write the image to: images/varpart_donor_strain_pct.png when dev.off() is called.
## png 
##   2

5 Try out some limma invocations with interaction models

The experimental design does not fully support interaction models, but I want to see how it looks.

6 Switch to the parasite transcriptome data

7 Look during infection

“Changes during infection hpgl0630-0636 and hpgl0650-hpgl0663”

Start out by creating the expt and poking at it to see how well/badly behaved the data is.

## Using a subset expression.
## There were 33, now there are 18 samples.
## The new colors are a character, changing according to condition.

7.2 PCA: Parasite edition

In this section, try out some normalizations/batch corrections and see their effect on the PCA plots. Start by applying the default normalization to the parasite data and see what there is to see.
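As a reminder of what a PCA table holds, here is a minimal Python sketch that computes the first principal component for a toy genes-by-samples matrix. It is not the plotting function used for this report. The values are made up, and power iteration is used only to keep the example free of dependencies.

```python
import math

def center_rows(matrix):
    """Subtract each gene's mean across samples."""
    return [[v - sum(row) / len(row) for v in row] for row in matrix]

def sample_gram(centered):
    """Sample-by-sample cross-product matrix of the centered data."""
    n = len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) for j in range(n)]
            for i in range(n)]

def top_component(gram, iterations=200):
    """Leading eigenvector of the Gram matrix by power iteration (unit length)."""
    n = len(gram)
    vec = [1.0] + [0.0] * (n - 1)
    for _ in range(iterations):
        nxt = [sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec

# Toy log-scale expression: rows are genes, columns are 4 samples (2 per group).
toy_log_expr = [
    [10.0, 10.2, 2.0, 2.1],
    [1.0, 1.1, 8.0, 8.2],
    [5.0, 5.1, 5.0, 4.9],
    [3.0, 2.9, 3.1, 3.0],
    [7.0, 7.2, 1.0, 1.3],
]
pc1 = top_component(sample_gram(center_rows(toy_log_expr)))
print([round(v, 3) for v in pc1])
```

The result has one value per sample, and samples from the same toy group share a sign. In a real data set, a normalization or batch correction is useful when it moves the factor of interest onto the first components in this way.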

sampleid condition batch batch_int colors labels PC1 PC2 pc_1 pc_2 pc_3 pc_4 pc_5 pc_6 pc_7 pc_8 pc_9 pc_10 pc_11 pc_12 pc_13 pc_14 pc_15 pc_16 pc_17
hpgl0631 HPGL0631 chr d108 1 #990000 HPGL0631 -0.4992 -0.1885 -0.4992 -0.1885 -0.0068 0.0362 -0.0272 0.0093 -0.0260 0.0196 -0.0289 0.0534 -0.0418 0.1431 -0.2097 0.0013 -0.3293 0.0750 -0.6874
hpgl0632 HPGL0632 chr d108 1 #990000 HPGL0632 0.1613 -0.1429 0.1613 -0.1429 -0.1274 0.1863 -0.1225 0.3713 -0.0461 0.3133 -0.0582 0.2185 -0.0593 0.3581 0.4647 -0.3873 -0.1644 0.1079 0.0800
hpgl0633 HPGL0633 chr d108 1 #990000 HPGL0633 -0.0126 0.2958 -0.0126 0.2958 -0.0822 0.1668 -0.4751 -0.0552 -0.4827 -0.0333 -0.3239 0.1915 0.1110 -0.4261 0.0066 0.0779 -0.0511 0.1207 0.0226
hpgl0634 HPGL0634 sh d108 1 #000099 HPGL0634 0.1341 -0.2141 0.1341 -0.2141 -0.2980 -0.4110 -0.4832 -0.2458 0.4694 -0.0927 -0.2167 -0.0798 -0.2042 -0.0147 -0.0231 -0.0562 0.0113 -0.0016 0.0324
hpgl0635 HPGL0635 sh d108 1 #000099 HPGL0635 -0.0461 0.3475 -0.0461 0.3475 -0.2900 -0.1231 0.2651 0.0594 0.0462 0.0291 -0.1392 0.4006 -0.2189 0.3249 -0.0149 0.5409 0.1028 -0.0602 0.0725
hpgl0636 HPGL0636 sh d108 1 #000099 HPGL0636 0.1800 -0.1629 0.1800 -0.1629 -0.1428 0.3860 0.2767 -0.7050 -0.0051 0.3671 -0.0357 -0.0380 0.0579 -0.0262 0.0106 0.0107 -0.0189 -0.0221 -0.0003
hpgl0651 HPGL0651 chr d110 2 #990000 HPGL0651 -0.4534 -0.1638 -0.4534 -0.1638 0.2094 -0.0501 0.0712 -0.0348 0.0455 -0.0148 -0.0108 -0.1480 -0.0897 -0.2378 0.6461 0.1909 0.3396 -0.0233 0.0278
hpgl0652 HPGL0652 chr d110 2 #990000 HPGL0652 0.2041 -0.0995 0.2041 -0.0995 0.3359 0.1086 0.0495 0.3088 0.2703 0.2789 -0.0355 -0.1525 -0.1447 -0.2564 -0.2156 0.3498 -0.1836 0.4402 0.1097
hpgl0653 HPGL0653 chr d110 2 #990000 HPGL0653 0.0629 0.3550 0.0629 0.3550 0.5533 0.0517 -0.2607 -0.2173 0.2413 0.0096 0.3263 0.3216 0.0193 0.1675 -0.0493 -0.1636 0.2260 -0.0118 -0.1163
hpgl0654 HPGL0654 sh d110 2 #000099 HPGL0654 0.1983 -0.1994 0.1983 -0.1994 0.3100 -0.5760 0.2267 -0.0851 -0.5220 0.0856 -0.1157 0.0273 -0.1835 0.0601 -0.1245 -0.1614 0.0764 0.0158 0.0036
hpgl0655 HPGL0655 sh d110 2 #000099 HPGL0655 -0.0074 0.3512 -0.0074 0.3512 0.0796 -0.1325 0.3007 0.0766 0.2402 0.0080 -0.1466 -0.0390 0.0339 -0.3303 0.0873 -0.2573 -0.4603 -0.4720 0.0367
hpgl0656 HPGL0656 sh d110 2 #000099 HPGL0656 0.2124 -0.1380 0.2124 -0.1380 0.1951 0.2313 0.1747 0.0180 0.0788 -0.6498 -0.4221 -0.0518 0.2876 0.2378 0.0355 0.0120 0.0331 0.0784 -0.0258
hpgl0658 HPGL0658 chr d107 3 #990000 HPGL0658 -0.5097 -0.1977 -0.5097 -0.1977 0.0121 0.0933 -0.0225 0.0090 -0.0029 -0.0280 0.0422 0.1052 0.0877 0.0952 -0.3858 -0.1686 -0.0194 -0.0524 0.6522
hpgl0659 HPGL0659 chr d107 3 #990000 HPGL0659 0.1384 -0.1391 0.1384 -0.1391 -0.1025 0.1857 -0.0530 0.3623 0.0065 0.1909 -0.0630 -0.1174 0.0595 -0.1015 -0.2869 -0.0191 0.5316 -0.4988 -0.2114
hpgl0660 HPGL0660 chr d107 3 #990000 HPGL0660 -0.0022 0.3059 -0.0022 0.3059 0.0093 0.0369 -0.2093 -0.0285 -0.2137 0.0097 0.1555 -0.7141 -0.0556 0.3979 0.0278 0.1483 -0.1430 -0.1111 0.0891
hpgl0661 HPGL0661 sh d107 3 #000099 HPGL0661 0.1396 -0.1910 0.1396 -0.1910 -0.1302 -0.2813 -0.0473 0.0571 -0.0343 0.0060 0.4347 0.1145 0.6830 -0.0515 0.1015 0.2788 -0.1547 -0.0496 -0.0010
hpgl0662 HPGL0662 sh d107 3 #000099 HPGL0662 -0.0788 0.3291 -0.0788 0.3291 -0.3490 -0.1210 0.2791 0.0637 0.0745 -0.0415 0.1183 -0.1852 0.1476 -0.1182 -0.0787 -0.3757 0.3066 0.5205 -0.0891
hpgl0663 HPGL0663 sh d107 3 #000099 HPGL0663 0.1781 -0.1476 0.1781 -0.1476 -0.1757 0.2121 0.0570 0.0363 -0.1397 -0.4575 0.5191 0.0933 -0.4898 -0.2220 0.0084 -0.0215 -0.1027 -0.0557 0.0048

Now repeat the same thing, but let sva minimize surrogate variables.

Now plot the result and see if things make more sense.

Adding SVA to the normalization does not help much.

sampleid condition batch batch_int colors labels PC1 PC2 pc_1 pc_2 pc_3 pc_4 pc_5 pc_6 pc_7 pc_8 pc_9 pc_10 pc_11 pc_12 pc_13 pc_14 pc_15 pc_16 pc_17
hpgl0631 HPGL0631 chr d108 1 #990000 HPGL0631 -0.0051 0.0495 -0.0051 0.0495 0.0456 -0.0335 0.0142 0.0256 -0.0321 0.0335 0.0201 0.2255 -0.0794 0.1451 -0.6009 0.4313 0.1743 -0.0063 0.5288
hpgl0632 HPGL0632 chr d108 1 #990000 HPGL0632 0.1099 0.2011 0.1099 0.2011 0.0959 -0.3584 -0.0046 0.3244 -0.0644 0.2028 0.0459 0.2449 0.4034 -0.5180 -0.1452 -0.0962 -0.1927 -0.1730 -0.1147
hpgl0633 HPGL0633 chr d108 1 #990000 HPGL0633 0.0018 0.2333 0.0018 0.2333 0.4716 0.0404 0.3946 -0.0403 -0.4179 0.2475 -0.0579 -0.3896 -0.0784 0.1041 -0.0361 0.0220 -0.1052 0.2758 -0.1077
hpgl0634 HPGL0634 sh d108 1 #000099 HPGL0634 0.3378 -0.3211 0.3378 -0.3211 0.4498 0.2824 -0.5160 -0.1218 -0.2258 -0.0724 0.1925 0.0287 -0.0682 -0.0546 0.0177 -0.0222 -0.0154 -0.2409 -0.0565
hpgl0635 HPGL0635 sh d108 1 #000099 HPGL0635 0.3178 -0.1266 0.3178 -0.1266 -0.2855 -0.0444 -0.0397 0.0249 -0.1124 0.3616 0.1833 0.3365 0.2036 0.4591 0.1779 -0.1320 0.0069 0.3862 -0.0320
hpgl0636 HPGL0636 sh d108 1 #000099 HPGL0636 0.1396 0.3660 0.1396 0.3660 -0.3382 0.6972 0.1006 0.3241 -0.0529 -0.0759 -0.0715 -0.0223 0.0098 0.0078 -0.0297 -0.0201 0.0105 -0.2257 -0.0933
hpgl0651 HPGL0651 chr d110 2 #990000 HPGL0651 -0.1940 -0.0786 -0.1940 -0.0786 -0.0667 0.0289 -0.0269 -0.0118 -0.0209 -0.1516 0.2122 -0.4215 0.5561 -0.0072 0.3065 0.0260 0.0268 -0.0321 0.4927
hpgl0652 HPGL0652 chr d110 2 #990000 HPGL0652 -0.3157 0.0420 -0.3157 0.0420 -0.1335 -0.2813 -0.2792 0.2910 -0.0462 -0.1239 0.1554 -0.1753 -0.2314 0.4038 -0.0749 0.0546 -0.4738 -0.1779 -0.1723
hpgl0653 HPGL0653 chr d110 2 #990000 HPGL0653 -0.5899 -0.0010 -0.5899 -0.0010 0.1461 0.2753 -0.2841 0.0146 0.2847 0.3699 -0.0532 0.1441 0.0053 -0.1528 0.1203 0.1780 0.1202 0.2551 -0.1669
hpgl0654 HPGL0654 sh d110 2 #000099 HPGL0654 -0.1985 -0.6410 -0.1985 -0.6410 -0.1118 0.0502 0.5500 0.0835 -0.1091 0.0346 0.1395 0.1364 -0.1483 -0.1268 0.0542 0.0570 0.0131 -0.2511 -0.1049
hpgl0655 HPGL0655 sh d110 2 #000099 HPGL0655 -0.0521 -0.1716 -0.0521 -0.1716 -0.3118 -0.0781 -0.1923 0.0047 -0.1377 -0.1194 -0.0243 -0.3166 -0.1115 -0.2488 -0.4131 -0.4161 0.3150 0.3306 -0.1059
hpgl0656 HPGL0656 sh d110 2 #000099 HPGL0656 -0.1883 0.1672 -0.1883 0.1672 -0.2410 -0.0726 -0.0443 -0.6755 -0.3533 -0.1157 -0.2863 0.2053 0.1496 0.0043 0.0254 0.0747 -0.0576 -0.2010 -0.1574
hpgl0658 HPGL0658 chr d107 3 #990000 HPGL0658 -0.0351 0.1100 -0.0351 0.1100 0.0334 -0.0307 0.0021 -0.0286 0.0411 0.0841 -0.2024 0.1805 -0.4531 -0.1344 0.2763 -0.4677 -0.1958 -0.0240 0.5372
hpgl0659 HPGL0659 chr d107 3 #990000 HPGL0659 0.0867 0.1944 0.0867 0.1944 0.0392 -0.3474 -0.0395 0.2007 -0.0536 -0.0843 -0.0484 -0.0243 -0.2074 0.0675 0.3909 0.1761 0.6664 -0.1841 -0.1130
hpgl0660 HPGL0660 chr d107 3 #990000 HPGL0660 -0.0896 0.0932 -0.0896 0.0932 0.3149 0.0319 0.1972 0.0187 0.2137 -0.6717 0.1023 0.3400 0.1289 0.1285 -0.0288 -0.1927 0.0070 0.2781 -0.1298
hpgl0661 HPGL0661 sh d107 3 #000099 HPGL0661 0.1671 -0.2447 0.1671 -0.2447 0.1286 -0.0555 0.0354 0.0247 0.4162 0.0974 -0.6520 -0.1889 0.2200 0.2724 -0.1254 -0.1088 -0.0055 -0.1919 -0.0688
hpgl0662 HPGL0662 sh d107 3 #000099 HPGL0662 0.3459 -0.0890 0.3459 -0.0890 -0.1905 -0.0541 -0.0192 -0.0345 0.1535 -0.2028 -0.1486 -0.1190 -0.1714 -0.3280 0.1942 0.5137 -0.3098 0.3740 -0.0270
hpgl0663 HPGL0663 sh d107 3 #000099 HPGL0663 0.1616 0.2170 0.1616 0.2170 -0.0462 -0.0504 0.1519 -0.4247 0.5171 0.1862 0.4934 -0.1844 -0.1275 -0.0220 -0.1093 -0.0775 0.0156 -0.1920 -0.1085

No, not really, so let's change things by setting the ‘SNP status’ as the “batch” factor and minimizing it with sva/ComBat.

SNP status does not clarify things.

sampleid condition batch batch_int colors labels PC1 PC2 pc_1 pc_2 pc_3 pc_4 pc_5 pc_6 pc_7 pc_8 pc_9 pc_10 pc_11 pc_12 pc_13 pc_14 pc_15 pc_16 pc_17
hpgl0631 HPGL0631 chronic red 4 #1B9E77 HPGL0631 -0.4437 -0.0072 -0.4437 -0.0072 0.0763 0.1085 -0.0763 0.0942 0.0562 0.1645 0.1653 -0.6065 -0.4385 -0.0927 0.0719 0.0910 0.0682 -0.0010 -0.2662
hpgl0632 HPGL0632 chronic yellow 6 #1B9E77 HPGL0632 0.0149 0.0885 0.0149 0.0885 0.1998 0.1492 -0.3055 -0.0920 0.3659 -0.1000 0.1949 0.0360 -0.0107 0.4218 -0.5126 0.2899 -0.1490 0.0376 0.2020
hpgl0633 HPGL0633 chronic blue_chronic 1 #1B9E77 HPGL0633 0.2682 -0.0203 0.2682 -0.0203 0.1462 0.2964 -0.0123 0.4048 -0.0403 -0.1588 0.3938 0.2168 -0.0107 0.1767 0.5238 0.0946 -0.1086 0.1002 -0.1829
hpgl0634 HPGL0634 self_heal white 5 #7570B3 HPGL0634 -0.1016 0.4539 -0.1016 0.4539 -0.3317 0.4555 0.5197 -0.1670 -0.0722 0.0946 0.1224 -0.0553 0.1573 -0.0051 -0.0235 0.0173 -0.0083 -0.0058 0.2339
hpgl0635 HPGL0635 self_heal blue_self 2 #7570B3 HPGL0635 0.1900 0.3133 0.1900 0.3133 -0.1082 -0.2970 -0.1013 -0.1842 0.0308 0.1001 0.0872 -0.1202 0.0314 0.3647 0.0254 -0.6040 -0.1438 -0.0922 -0.3232
hpgl0636 HPGL0636 self_heal pink 3 #7570B3 HPGL0636 -0.0494 0.1556 -0.0494 0.1556 0.3880 -0.5227 0.5083 0.3351 0.2367 0.0489 -0.0367 0.0410 -0.0593 -0.0327 0.0003 0.0235 -0.0146 -0.0327 0.2355
hpgl0651 HPGL0651 chronic red 4 #1B9E77 HPGL0651 -0.4407 -0.2573 -0.4407 -0.2573 -0.1341 -0.1446 0.1994 -0.2461 0.0224 -0.6519 0.1154 0.2212 0.0477 0.0276 -0.0204 -0.0500 0.0383 0.0040 -0.2226
hpgl0652 HPGL0652 chronic yellow 6 #1B9E77 HPGL0652 0.0588 -0.3278 0.0588 -0.3278 0.0475 0.0279 -0.0305 -0.3674 0.3336 0.1190 -0.1014 -0.1556 0.1421 -0.2180 0.3392 -0.0961 -0.4355 0.3036 0.2466
hpgl0653 HPGL0653 chronic blue_chronic 1 #1B9E77 HPGL0653 0.3014 -0.4585 0.3014 -0.4585 -0.0189 0.1803 0.3156 0.0252 0.0050 0.0671 -0.3804 -0.1793 -0.0256 0.3869 -0.1159 -0.0210 0.3747 0.1052 -0.1152
hpgl0654 HPGL0654 self_heal white 5 #7570B3 HPGL0654 -0.0539 -0.2255 -0.0539 -0.2255 -0.6493 -0.2720 -0.2105 0.4208 0.0229 0.1217 0.1939 -0.0625 0.1809 0.0011 -0.0597 0.0157 0.0772 0.0374 0.2702
hpgl0655 HPGL0655 self_heal blue_self 2 #7570B3 HPGL0655 0.2523 -0.0511 0.2523 -0.0511 -0.1297 -0.1752 0.0286 -0.2615 0.0285 0.1414 -0.0107 0.0300 0.0349 -0.1151 0.1105 0.5188 -0.0762 -0.6210 -0.2287
hpgl0656 HPGL0656 self_heal pink 3 #7570B3 HPGL0656 0.0345 -0.2216 0.0345 -0.2216 0.1629 -0.1193 0.0118 -0.2673 -0.6175 0.2509 0.3124 0.2137 -0.3193 0.0074 -0.1249 -0.0568 -0.0071 0.0853 0.2573
hpgl0658 HPGL0658 chronic red 4 #1B9E77 HPGL0658 -0.4587 -0.0548 -0.4587 -0.0548 0.1697 0.1436 -0.1069 0.1601 -0.0949 0.4444 -0.2701 0.3597 0.3784 0.0486 -0.0600 -0.0325 -0.1236 -0.0354 -0.2627
hpgl0659 HPGL0659 chronic yellow 6 #1B9E77 HPGL0659 0.0011 0.0617 0.0011 0.0617 0.1979 0.1161 -0.2663 -0.1474 0.2421 0.0124 0.0958 0.1268 0.0989 -0.2354 0.1647 -0.2506 0.6807 -0.2149 0.2160
hpgl0660 HPGL0660 chronic blue_chronic 1 #1B9E77 HPGL0660 0.2809 -0.1044 0.2809 -0.1044 0.0552 0.2273 0.0337 0.2475 -0.0125 -0.1645 0.0131 -0.0247 -0.0297 -0.5493 -0.4742 -0.2913 -0.2263 -0.1484 -0.1549
hpgl0661 HPGL0661 self_heal white 5 #7570B3 HPGL0661 -0.0413 0.1884 -0.0413 0.1884 -0.2216 0.0838 -0.1891 0.0871 0.0134 -0.1406 -0.5408 0.2825 -0.5748 0.0375 0.1530 -0.0403 -0.1233 -0.0761 0.1944
hpgl0662 HPGL0662 self_heal blue_self 2 #7570B3 HPGL0662 0.1865 0.3185 0.1865 0.3185 -0.0728 -0.2002 -0.1020 -0.1061 -0.0348 0.0119 -0.0917 0.0939 0.0308 -0.2716 -0.0919 0.2987 0.2232 0.6324 -0.3168
hpgl0663 HPGL0663 self_heal pink 3 #7570B3 HPGL0663 0.0007 0.1487 0.0007 0.1487 0.2229 -0.0573 -0.2165 0.0643 -0.4854 -0.3610 -0.2625 -0.4174 0.3661 0.0475 0.0941 0.0931 -0.0460 -0.0782 0.2174

OK, so let us remove the healing state with ComBat and see if that allows us to see a split on some other factor.
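The idea of removing a known factor can be sketched as follows. This Python example is a deliberately simplified stand-in: it only recenters each group on the overall mean for one gene, whereas ComBat also adjusts variances and shrinks its estimates with empirical Bayes. The values and the function name remove_group_means are made up for illustration.

```python
from statistics import mean

def remove_group_means(values, groups):
    """Shift each group so that its mean equals the overall mean."""
    overall = mean(values)
    group_means = {
        g: mean(v for v, label in zip(values, groups) if label == g)
        for g in set(groups)
    }
    return [v - group_means[g] + overall for v, g in zip(values, groups)]

# One toy gene measured in six samples; the state shifts its expression level.
states = ["chronic", "chronic", "chronic", "self_heal", "self_heal", "self_heal"]
toy_gene = [8.0, 8.4, 7.9, 5.1, 5.5, 4.9]
adjusted = remove_group_means(toy_gene, states)
print([round(v, 3) for v in adjusted])
```

After the adjustment both states share the same mean, while the differences between samples within a state are untouched. Any remaining structure, such as strain, is then what a subsequent PCA can pick up.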

Hmm, OK, I think I will quit for today.

sampleid condition batch batch_int colors labels PC1 PC2 pc_1 pc_2 pc_3 pc_4 pc_5 pc_6 pc_7 pc_8 pc_9 pc_10 pc_11 pc_12 pc_13 pc_14 pc_15 pc_16 pc_17
hpgl0631 HPGL0631 s5430 chronic 1 #1B9E77 HPGL0631 -0.4424 0.2552 -0.4424 0.2552 0.0542 0.0441 0.0137 0.0211 -0.0073 0.0289 -0.0440 -0.0482 0.1262 -0.2239 0.0081 0.3247 -0.0247 0.6942 -0.1373
hpgl0632 HPGL0632 s5397 chronic 1 #D95F02 HPGL0632 -0.1406 -0.3656 -0.1406 -0.3656 0.2663 -0.0946 -0.2488 0.0062 0.0631 0.0510 -0.2827 -0.0857 0.3650 0.4524 -0.3634 0.2396 -0.0806 -0.0771 0.1249
hpgl0633 HPGL0633 s2504 chronic 1 #7570B3 HPGL0633 -0.0367 -0.0009 -0.0367 -0.0009 0.1353 -0.2143 0.3940 0.3720 -0.2496 0.3117 -0.2789 0.1293 -0.4154 0.0190 0.0543 0.0681 -0.1049 -0.0242 0.3731
hpgl0634 HPGL0634 s2272 self_heal 2 #E7298A HPGL0634 0.1173 -0.0720 0.1173 -0.0720 0.1279 -0.4971 0.3771 -0.5152 0.0353 0.2292 0.1420 -0.2211 -0.0228 -0.0249 -0.0659 -0.0209 -0.0061 -0.0280 -0.3534
hpgl0635 HPGL0635 s1022 self_heal 2 #66A61E HPGL0635 0.2648 0.3432 0.2648 0.3432 0.1208 -0.0876 -0.2303 0.0029 0.0811 0.1500 -0.3119 -0.2721 0.3267 -0.0261 0.5783 -0.1228 0.0513 -0.0711 0.1273
hpgl0636 HPGL0636 s2189 self_heal 2 #E6AB02 HPGL0636 0.1861 -0.0833 0.1861 -0.0833 0.2718 0.4733 0.3709 0.1710 0.6471 0.0327 0.0445 0.0490 -0.0177 0.0132 0.0114 -0.0059 0.0154 0.0052 -0.1142
hpgl0651 HPGL0651 s5430 chronic 1 #1B9E77 HPGL0651 -0.4179 0.2143 -0.4179 0.2143 -0.1841 0.0983 -0.0264 -0.0164 0.0299 0.0299 0.1616 -0.0734 -0.2317 0.6505 0.1675 -0.3470 -0.0199 -0.0279 -0.1402
hpgl0652 HPGL0652 s5397 chronic 1 #D95F02 HPGL0652 -0.1100 -0.4019 -0.1100 -0.4019 -0.2012 0.1088 -0.2989 -0.2496 0.1564 0.0480 0.0883 -0.0714 -0.3296 -0.2096 0.3425 0.2658 -0.4012 -0.1063 0.1201
hpgl0653 HPGL0653 s2504 chronic 1 #7570B3 HPGL0653 0.0118 -0.0640 0.0118 -0.0640 -0.5127 0.0682 0.3533 -0.2721 0.0493 -0.3477 -0.3093 -0.0339 0.1967 -0.0530 -0.1304 -0.2268 -0.0184 0.1070 0.3634
hpgl0654 HPGL0654 s2272 self_heal 2 #E7298A HPGL0654 0.1451 -0.1490 0.1451 -0.1490 -0.5027 -0.2035 -0.1292 0.5749 0.0872 0.1363 0.0159 -0.2081 0.0695 -0.1365 -0.1734 -0.0840 -0.0224 -0.0009 -0.3593
hpgl0655 HPGL0655 s1022 self_heal 2 #66A61E HPGL0655 0.2833 0.2988 0.2833 0.2988 -0.2234 0.0927 -0.2283 -0.1969 0.0765 0.1369 -0.0026 0.0503 -0.3156 0.1043 -0.2670 0.3838 0.5038 -0.0326 0.1205
hpgl0656 HPGL0656 s2189 self_heal 2 #E6AB02 HPGL0656 0.2048 -0.1184 0.2048 -0.1184 -0.0869 0.4675 0.0502 -0.1209 -0.5267 0.4045 0.1369 0.2620 0.2920 0.0260 0.0200 -0.0625 -0.0903 0.0335 -0.1311
hpgl0658 HPGL0658 s5430 chronic 1 #1B9E77 HPGL0658 -0.4470 0.2608 -0.4470 0.2608 0.0617 0.1130 0.0346 -0.0092 -0.0522 -0.0487 -0.1066 0.0572 0.1048 -0.3817 -0.1611 0.0322 0.0472 -0.6571 -0.1464
hpgl0659 HPGL0659 s5397 chronic 1 #D95F02 HPGL0659 -0.1491 -0.3408 -0.1491 -0.3408 0.2385 -0.0503 -0.2766 -0.0385 0.0129 0.0604 0.0473 0.1072 -0.1225 -0.2782 -0.0408 -0.5368 0.4694 0.1765 0.1321
hpgl0660 HPGL0660 s2504 chronic 1 #7570B3 HPGL0660 -0.0313 -0.0028 -0.0313 -0.0028 -0.0111 -0.2130 0.1196 0.1825 -0.0067 -0.1692 0.7000 0.0649 0.3102 0.0318 0.1495 0.1924 0.1366 -0.0948 0.3784
hpgl0661 HPGL0661 s2272 self_heal 2 #E7298A HPGL0661 0.1307 -0.0704 0.1307 -0.0704 0.0165 -0.2302 -0.0462 0.0208 -0.0354 -0.4287 -0.1983 0.6441 -0.0263 0.1105 0.2611 0.1126 0.0516 0.0073 -0.3619
hpgl0662 HPGL0662 s1022 self_heal 2 #66A61E HPGL0662 0.2440 0.3732 0.2440 0.3732 0.1789 -0.0930 -0.2617 -0.0316 0.0635 -0.1067 0.1691 0.1619 -0.0871 -0.0837 -0.3790 -0.2716 -0.5555 0.0894 0.1290
hpgl0663 HPGL0663 s2189 self_heal 2 #E6AB02 HPGL0663 0.1869 -0.0766 0.1869 -0.0766 0.2502 0.2176 0.0330 0.0988 -0.4245 -0.5182 0.0289 -0.5120 -0.2224 0.0102 -0.0117 0.0590 0.0487 0.0069 -0.1249
