1 Extract some count tables

I will use this sheet for exporting data to collaborators. Currently, that is only the folks interested in doing some analyses with the first PLOS NTD paper.

It is worth noting that I recently found some errors in my accounting for these and the plosntd2 samples (I had the pathogen strains reversed).

## The biomart annotations file already exists, loading from it.
## Reading the sample metadata.
## The sample definitions comprises: 441 rows(samples) and 68 columns(metadata fields).
## Reading count tables.
## Reading count tables with read.table().
## /mnt/sshfs/cbcbsub/fs/cbcb-lab/nelsayed/scratch/atb/rnaseq/multiple_leishmania_2018/preprocessing/hpgl1108/outputs/hisat2_hg38_91/E1.count.xz contains 58307 rows.
## preprocessing/hpgl1109/outputs/hisat2_hg38_91/E2.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1110/outputs/hisat2_hg38_91/E3.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1111/outputs/hisat2_hg38_91/E4.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1112/outputs/hisat2_hg38_91/E5.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1113/outputs/hisat2_hg38_91/E6.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1114/outputs/hisat2_hg38_91/E7.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1115/outputs/hisat2_hg38_91/E8.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1116/outputs/hisat2_hg38_91/L1.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1117/outputs/hisat2_hg38_91/L2.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1118/outputs/hisat2_hg38_91/L3.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1119/outputs/hisat2_hg38_91/L4.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1120/outputs/hisat2_hg38_91/L5.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1121/outputs/hisat2_hg38_91/L6.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1122/outputs/hisat2_hg38_91/L7.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1123/outputs/hisat2_hg38_91/L8.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1124/outputs/hisat2_hg38_91/L9.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1125/outputs/hisat2_hg38_91/L10.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1126/outputs/hisat2_hg38_91/L11.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1127/outputs/hisat2_hg38_91/L12.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1128/outputs/hisat2_hg38_91/L13.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1129/outputs/hisat2_hg38_91/L14.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1130/outputs/hisat2_hg38_91/L15.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1131/outputs/hisat2_hg38_91/L16.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1132/outputs/hisat2_hg38_91/L17.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1133/outputs/hisat2_hg38_91/N1.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1134/outputs/hisat2_hg38_91/N2.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1135/outputs/hisat2_hg38_91/N3.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1136/outputs/hisat2_hg38_91/N4.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1137/outputs/hisat2_hg38_91/N5.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1138/outputs/hisat2_hg38_91/N6.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1139/outputs/hisat2_hg38_91/N7.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1140/outputs/hisat2_hg38_91/N8.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1141/outputs/hisat2_hg38_91/N9.count.xz contains 58307 rows and merges to 58307 rows.
## preprocessing/hpgl1142/outputs/hisat2_hg38_91/N10.count.xz contains 58307 rows and merges to 58307 rows.
## Finished reading count tables.
## Matched 57645 annotations and counts.
## Bringing together the count matrix and gene information.
## Some annotations were lost in merging, setting them to 'undefined'.
## The final expressionset has 58302 rows and 35 columns.
## Writing the first sheet, containing a legend and some summary data.
## Writing the raw reads.
## Graphing the raw reads.
## varpart sees only 1 batch, adjusting the model accordingly.
## Attempting mixed linear model with: ~  (1|condition)
## Fitting the expressionset to the model, this is slow.
## A couple of common errors:
## An error like 'vtv downdated' may be because there are too many 0s, filter the data and rerun.
## An error like 'number of levels of each grouping factor must be < number of observations' means
## that the factor used is not appropriate for the analysis - it really only works for factors
## which are shared among multiple samples.
## Retrying with only condition in the model.
## Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
##   contrasts can be applied only to factors with 2 or more levels
## Attempting again with only condition failed.
## Error in simple_varpart(expt, predictor = NULL, factors = c("condition",  : 
## 
## Writing the normalized reads.
## Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
##   contrasts can be applied only to factors with 2 or more levels
## Graphing the normalized reads.
## varpart sees only 1 batch, adjusting the model accordingly.
## Attempting mixed linear model with: ~  (1|condition)
## Fitting the expressionset to the model, this is slow.
## A couple of common errors:
## An error like 'vtv downdated' may be because there are too many 0s, filter the data and rerun.
## An error like 'number of levels of each grouping factor must be < number of observations' means
## that the factor used is not appropriate for the analysis - it really only works for factors
## which are shared among multiple samples.
## Retrying with only condition in the model.
## Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
##   contrasts can be applied only to factors with 2 or more levels
## Attempting again with only condition failed.
## Error in simple_varpart(norm_data, predictor = NULL, factors = c("condition",  : 
## 
## Writing the median reads by factor.
LS0tCnRpdGxlOiAiVXNlIHRoaXMgc2hlZXQgdG8gZXh0cmFjdCBzdWJzZXRzIG9mIHRoZSBkYXRhLiIKYXV0aG9yOiAiYXRiIGFiZWxld0BnbWFpbC5jb20iCmRhdGU6ICJgciBTeXMuRGF0ZSgpYCIKb3V0cHV0OgogIGh0bWxfZG9jdW1lbnQ6CiAgICBjb2RlX2Rvd25sb2FkOiB0cnVlCiAgICBjb2RlX2ZvbGRpbmc6IHNob3cKICAgIGZpZ19jYXB0aW9uOiB0cnVlCiAgICBmaWdfaGVpZ2h0OiA3CiAgICBmaWdfd2lkdGg6IDcKICAgIGhpZ2hsaWdodDogdGFuZ28KICAgIGtlZXBfbWQ6IGZhbHNlCiAgICBtb2RlOiBzZWxmY29udGFpbmVkCiAgICBudW1iZXJfc2VjdGlvbnM6IHRydWUKICAgIHNlbGZfY29udGFpbmVkOiB0cnVlCiAgICB0aGVtZTogcmVhZGFibGUKICAgIHRvYzogdHJ1ZQogICAgdG9jX2Zsb2F0OgogICAgICBjb2xsYXBzZWQ6IGZhbHNlCiAgICAgIHNtb290aF9zY3JvbGw6IGZhbHNlCiAgcm1kZm9ybWF0czo6cmVhZHRoZWRvd246CiAgICBjb2RlX2Rvd25sb2FkOiB0cnVlCiAgICBjb2RlX2ZvbGRpbmc6IHNob3cKICAgIGRmX3ByaW50OiBwYWdlZAogICAgZmlnX2NhcHRpb246IHRydWUKICAgIGZpZ19oZWlnaHQ6IDcKICAgIGZpZ193aWR0aDogNwogICAgaGlnaGxpZ2h0OiB0YW5nbwogICAgd2lkdGg6IDMwMAogICAga2VlcF9tZDogZmFsc2UKICAgIG1vZGU6IHNlbGZjb250YWluZWQKICAgIHRvY19mbG9hdDogdHJ1ZQogIEJpb2NTdHlsZTo6aHRtbF9kb2N1bWVudDoKICAgIGNvZGVfZG93bmxvYWQ6IHRydWUKICAgIGNvZGVfZm9sZGluZzogc2hvdwogICAgZmlnX2NhcHRpb246IHRydWUKICAgIGZpZ19oZWlnaHQ6IDcKICAgIGZpZ193aWR0aDogNwogICAgaGlnaGxpZ2h0OiB0YW5nbwogICAga2VlcF9tZDogZmFsc2UKICAgIG1vZGU6IHNlbGZjb250YWluZWQKICAgIHRvY19mbG9hdDogdHJ1ZQotLS0KCjxzdHlsZSB0eXBlPSJ0ZXh0L2NzcyI+CmJvZHksIHRkIHsKICBmb250LXNpemU6IDE2cHg7Cn0KY29kZS5yewogIGZvbnQtc2l6ZTogMTZweDsKfQpwcmUgewogZm9udC1zaXplOiAxNnB4Cn0KPC9zdHlsZT4KCmBgYHtyIG9wdGlvbnMsIGluY2x1ZGU9RkFMU0V9CmxpYnJhcnkoImhwZ2x0b29scyIpCnR0IDwtIGRldnRvb2xzOjpsb2FkX2FsbCgiL2RhdGEvaHBnbHRvb2xzIikKa25pdHI6Om9wdHNfa25pdCRzZXQod2lkdGg9MTIwLAogICAgICAgICAgICAgICAgICAgICBwcm9ncmVzcz1UUlVFLAogICAgICAgICAgICAgICAgICAgICB2ZXJib3NlPVRSVUUsCiAgICAgICAgICAgICAgICAgICAgIGVjaG89VFJVRSkKa25pdHI6Om9wdHNfY2h1bmskc2V0KGVycm9yPVRSVUUsCiAgICAgICAgICAgICAgICAgICAgICBkcGk9OTYpCm9sZF9vcHRpb25zIDwtIG9wdGlvbnMoZGlnaXRzPTQsCiAgICAgICAgICAgICAgICAgICAgICAgc3RyaW5nc0FzRmFjdG9ycz1GQUxTRSwKICAgICAgICAgICAgICAgICAgICAgICBrbml0ci5kdXBsaWNhdGUubGFiZWw9ImFsbG93IikKZ2dwbG90Mjo6dGhlbWVfc2V0KGdncGxvdDI6OnRoZW1lX2J3KGJhc2Vfc2l6ZT0xMCkpCnJ1bmRhdGUgPC0gZm9ybWF0KFN5cy5EYXRlKCksIGZvcm1hdD0iJVklbSVkIikKcHJldmlvdXNfZmlsZSA8LSAiMDJfZXN0aW1hdGlvbl9pbmZlY3Rpb25fMjAxODA4MjIuUm1kIgp2ZXIgPC0gIjIwMjAwMjI3IgoKIyN0bXAgPC0gc20obG9hZG1lKGZpbGVuYW1lPXBhc3RlMChnc3ViKHBhdHRlcm49IlxcLlJtZCIsIHJlcGxhY2U9IiIsIHg9cHJldmlvdXNfZmlsZSksICItdiIsIHZlciwgIi5yZGEueHoiKSkpCiMjcm1kX2ZpbGUgPC0gIjAzX2V4cHJlc3Npb25faW5mZWN0aW9uXzIwMTgwODIyLlJtZCIKYGBgCgojIEV4dHJhY3Qgc29tZSBjb3VudCB0YWJsZXMKCkkgd2lsbCB1c2UgdGhpcyBzaGVldCBmb3IgZXhwb3J0aW5nIGRhdGEgdG8gY29sbGFib3JhdG9ycy4gIEN1cnJlbnRseSwgdGhhdCBpcwpvbmx5IHRoZSBmb2xrcyBpbnRlcmVzdGVkIGluIGRvaW5nIHNvbWUgYW5hbHlzZXMgd2l0aCB0aGUgZmlyc3QgUExPUyBOVEQgcGFwZXIuCgpJdCBpcyB3b3J0aCBub3RpbmcgdGhhdCBJIHJlY2VudGx5IGZvdW5kIHNvbWUgZXJyb3JzIGluIG15IGFjY291bnRpbmcgZm9yIHRoZXNlCmFuZCB0aGUgcGxvc250ZDIgc2FtcGxlcyAoSSBoYWQgdGhlIHBhdGhvZ2VuIHN0cmFpbnMgcmV2ZXJzZWQpLgoKYGBge3IgZXh0cmFjdF9wbG9zbnRkMSwgZmlnLnNob3c9ImhpZGUifQpoc19hbm5vdCA8LSBsb2FkX2Jpb21hcnRfYW5ub3RhdGlvbnMoKSRhbm5vdGF0aW9uCnJvd25hbWVzKGhzX2Fubm90KSA8LSBtYWtlLm5hbWVzKAogIHBhc3RlMChoc19hbm5vdFtbImVuc2VtYmxfdHJhbnNjcmlwdF9pZCJdXSwgIi4iLAogICAgICAgICBoc19hbm5vdFtbInRyYW5zY3JpcHRfdmVyc2lvbiJdXSksCiAgdW5pcXVlPVRSVUUpCmhzX3R4X2dlbmUgPC0gaHNfYW5ub3RbLCBjKCJlbnNlbWJsX2dlbmVfaWQiLCAiZW5zZW1ibF90cmFuc2NyaXB0X2lkIildCmhzX3R4X2dlbmVbWyJpZCJdXSA8LSByb3duYW1lcyhoc190eF9nZW5lKQpoc190eF9nZW5lIDwtIGhzX3R4X2dlbmVbLCBjKCJpZCIsICJlbnNlbWJsX2dlbmVfaWQiKV0KbmV3X2hzX2Fubm90IDwtIGhzX2Fubm90CnJvd25hbWVzKG5ld19oc19hbm5vdCkgPC0gbWFrZS5uYW1lcyhoc19hbm5vdFtbImVuc2VtYmxfZ2VuZV9pZCJdXSwgdW5pcXVlPVRSVUUpCgpzYW1wbGVfc2hlZXQgPC0gInNhbXBsZV9zaGVldHMvVU1EX2xlaXNobWFuaWFfaG9zdF9tZXRhc2hlZXRfMjAyMDAyMjcueGxzeCIKIyMgQXMgb2YgMjAyMDAyMjIsIEkgaGF2ZSBvbmx5IHBlcmZvcm1lZCBoaXNhdDIgbWFwcGluZyBmb3IgdGhlIHBsb3MgbnRkMiBkYXRhLgpwcmVmaXggPC0gImV4Y2VsL3Bsb3NfbnRkX2hvc3RfaGlzYXQyLXYiCnBsb3NudGQyX2V4cHQgPC0gY3JlYXRlX2V4cHQoc2FtcGxlX3NoZWV0LAogICAgICAgICAgICAgICAgICAgICAgICAgICAgIGZpbGVfY29sdW1uPSJoc2FwaWVuc2hpc2F0ZmlsZSIsCiAgICAgICAgICAgICAgICAgICAgICAgICAgICAgZ2VuZV9pbmZvPW5ld19oc19hbm5vdCwKICAgICAgICAgICAgICAgICAgICAgICAgICAgICBzYXZlZmlsZT1nbHVlOjpnbHVlKCJ7cHJlZml4fXt2ZXJ9LnJkYSIpKQoKd3JpdHRlbl9jc3YgPC0gcmVhZHI6OndyaXRlX2Nzdih4PWFzLmRhdGEuZnJhbWUoZXhwcnMocGxvc250ZDJfZXhwdCkpLAogICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIHBhdGg9Z2x1ZTo6Z2x1ZSgie3ByZWZpeH17dmVyfS5jc3YiKSkKd3JpdHRlbl94bHMgPC0gd3JpdGVfZXhwdChwbG9zbnRkMl9leHB0LAogICAgICAgICAgICAgICAgICAgICAgICAgIGV4Y2VsPWdsdWU6OmdsdWUoIntwcmVmaXh9e3Zlcn0ueGxzeCIpKQpgYGAKCmBgYHtyIHNhdmVtZSwgZXZhbD1GQUxTRX0KcGFuZGVyOjpwYW5kZXIoc2Vzc2lvbkluZm8oKSkKbWVzc2FnZShwYXN0ZTAoIlRoaXMgaXMgaHBnbHRvb2xzIGNvbW1pdDogIiwgZ2V0X2dpdF9jb21taXQoKSkpCnRoaXNfc2F2ZSA8LSBwYXN0ZTAoZ3N1YihwYXR0ZXJuPSJcXC5SbWQiLCByZXBsYWNlPSIiLCB4PXJtZF9maWxlKSwgIi12IiwgdmVyLCAiLnJkYS54eiIpCm1lc3NhZ2UocGFzdGUwKCJTYXZpbmcgdG8gIiwgdGhpc19zYXZlKSkKdG1wIDwtIHNtKHNhdmVtZShmaWxlbmFtZT10aGlzX3NhdmUpKQpgYGAK