1 Introduction

This document is intended to create the data structures used to evaluate our TMRC2 samples. In some cases, this includes only those samples starting in 2019; in other instances I am including our previous (2015-2016) samples.

In all cases the processing performed was:

  1. Default trimming was performed.
  2. Hisat2 was used to map the remaining reads against the Leishmania panamensis genome revision 36.
  3. The alignments from hisat2 were used to count reads/gene against the revision 36 annotations with htseq.
  4. These alignments were also passed to the pileup functionality of samtools and the vcf/bcf utilities in order to make a matrix of all observed differences between each sample and the reference.
  5. The freebayes variant estimation tool was used in addition to step 4 to search for variant positions in a more robust fashion.
  6. The trimmed reads were passed to kraken2 using a viral database in order to look for samples with potential LRV sequence.
  7. An explicit, grep-based search for spliced leader reads was used against all human-derived samples. The results from this were copy/pasted into the sample sheet.
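
As a rough illustration of step 7, the sketch below counts reads containing a short motif in either orientation using base R. The reads and the `sl_motif` value are invented placeholders (this is not the real spliced leader sequence), and searching both strands is an assumption; the actual search was performed with grep outside of R.

```r
## Toy reads; in practice these would come from the trimmed FASTQ files.
reads <- c("GGGTTTACTGACTGAAAC", "CCCCCCCCCCCC", "TTTCAGTCAGTAAA", "ACGTACGT")
## Hypothetical placeholder motif, NOT the real spliced leader sequence.
sl_motif <- "ACTGACTG"

## Reverse complement a single DNA string.
reverse_complement <- function(dna) {
  flipped <- chartr("ACGT", "TGCA", dna)
  paste(rev(strsplit(flipped, "")[[1]]), collapse = "")
}

sl_rc <- reverse_complement(sl_motif)
## fixed = TRUE performs a literal substring match, like grep -F.
forward_hits <- grepl(sl_motif, reads, fixed = TRUE)
reverse_hits <- grepl(sl_rc, reads, fixed = TRUE)
sl_counts <- c(forward = sum(forward_hits), reverse = sum(reverse_hits))
sl_counts
```

Counts of this kind, one pair per sample, are what would be pasted into the sample sheet.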

2 Notes 20221206 meeting

I am thinking that this meeting will bring Maria Adelaida fully back into the analyses of the parasite data, and therefore may focus primarily on the goals rather than the analyses?

  • Maria Adelaida meeting with Olgla/Mariana: integrating transcriptomics/genomics question.
  • Paper on relationship between primary metadata factors via transcriptome/genome.
  • Second on drug susceptibility without those factors (I think this means the macrophages)
  • Definition of species? MAG: Define consensus sequences for various strains/species. We effectively have this on hand, though the quality may be a little less good for 2.3.
  • Resulting goal: Create a tree of the strains (I am just going to call zymodemes strains from now on).
    • What organisms would we include in a tree to describe these relationships: guyanensis, braziliensis 2904, 2.2, 2.3, 2.1, 2.4, panamensis reference, peruviana (I have not seen this genome), panama, 2903? Actually, this may be tricky because we have always done this with a specific reference strain (panamensis col), which is one of the strains in the comparison.
    • Check the most variant strains for identity (Luc).
    • Methods for creating the tree: traditional phylogeny vs. variant hclust?
  • PCR queries: these work well if one performs Sanger sequencing.

2.1 Multiple datasets

In a couple of important ways the TMRC2 data is much more complex than the TMRC3:

  1. It comprises multiple, completely separate queries:
    1. Sequencing the parasite samples
    2. Sequencing a set of human macrophage samples which were infected with specific parasite samples.
  2. The parasite transcriptomic samples comprise multiple different types of queries:
    1. Differential expression to look at strain, susceptibility, and clinical outcomes.
    2. Individual variant searches to look for potentially useful SNPs for classification of parasite samples.
  3. The human macrophage samples may be used to query both the host and parasite transcriptomes because (at least when not drug treated) there is a tremendous population of parasite reads in them.

2.2 Sample sheet(s)

Our shared online sample sheet is nearly static at the time of this writing (202209). At this point, I expect the only likely updates will be to annotate some strains as more or less susceptible to drug treatment.

sample_sheet <- "sample_sheets/macrophage_samples.xlsx"

3 Annotations

Everything that follows depends on the existing TriTrypDB annotations (revision 46, circa 2019). The following block loads a database of these annotations and turns it into a matrix where the rows are genes and the columns are the annotation types provided by TriTrypDB.

The same database was used to create a matrix of orthologous genes between L. panamensis and all of the other species in the TriTrypDB.

The same database of annotations also provides mappings to the set of annotated GO categories for the L. panamensis genome along with gene lengths.

meta <- download_eupath_metadata(webservice = "tritrypdb", overwrite = FALSE)
## Loading taxonomy and species database to cross reference against the download.
## Working on: 1: org.Tvivax.Y486.v67.eg.db.
## Working on: 2: org.Tcongolense.IL3000.v67.eg.db.
## Working on: 3: org.Laethiopica.L147.v67.eg.db.
## Working on: 4: org.Ltropica.L590.v67.eg.db.
## Working on: 5: org.Tcruzi.Tula.cl2.v67.eg.db.
## Working on: 6: org.Lpanamensis.MHOMCOL81L13.v67.eg.db.
## Working on: 7: org.Lbraziliensis.MHOMBR75M2903.v67.eg.db.
## Working on: 8: org.Tcruzi.Dm28c.2014.v67.eg.db.
## Working on: 9: org.Tbrucei.brucei.TREU927.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species/strain.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 10: org.Lmajor.Friedlin.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania major.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 11: org.Tcruzi.CL.Brener.v67.eg.db.
## Working on: 12: org.Tcruzi.Esmeraldo.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species/strain.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 13: org.Lbraziliensis.MHOMBR75M2904.v67.eg.db.
## Working on: 14: org.Trangeli.SC58.v67.eg.db.
## Working on: 15: org.Linfantum.JPCM5.v67.eg.db.
## Working on: 16: org.Tbrucei.gambiense.DAL972.v67.eg.db.
## Working on: 17: org.Lmajor.LV39c5.v67.eg.db.
## Working on: 18: org.Lmajor.SD.75.1.v67.eg.db.
## Working on: 19: org.Tcruzi.JR.cl.4.v67.eg.db.
## Working on: 20: org.Lmexicana.MHOMGT2001U1103.v67.eg.db.
## Working on: 21: org.Ldonovani.BPK282A1.v67.eg.db.
## Working on: 22: org.Adeanei.Cavalho.ATCC.PRA.265.v67.eg.db.
## Found 4 candidate genera matching Angomonas
## Found an exact match for the combination genus/species not strain for Angomonas deanei.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 23: org.Bayalai.B08.376.v67.eg.db.
## Found 20 candidate genera matching Blechomonas
## Found an exact match for the combination genus/species not strain for Blechomonas ayalai.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 24: org.Bnonstop.P57.v67.eg.db.
## Found 28 candidate genera matching Blastocrithidia
## Found a genus, but not species for Blastocrithidia nonstop P57, not adding taxon ID number.
## Working on: 25: org.Bsaltans.Lake.Konstanz.v67.eg.db.
## Found 28 candidate genera matching Bodo
## Found an exact match for the combination genus/species not strain for Bodo saltans.
## Working on: 26: org.Cfasciculata.Cf.Cl.v67.eg.db.
## Found 53 candidate genera matching Crithidia
## Found an exact match for the combination genus/species not strain for Crithidia fasciculata.
## Working on: 27: org.Emonterogeii.LV88.v67.eg.db.
## Found 13 candidate genera matching Endotrypanum
## Found an exact match for the combination genus/species not strain for Endotrypanum monterogeii.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 28: org.Lamazonensis.MHOMBR71973M2269.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania amazonensis.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 29: org.Lamazonensis.PH8.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania amazonensis.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 30: org.Larabica.LEM1108.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania arabica.
## Working on: 31: org.Lbraziliensis.MHOMBR75M2904.2019.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania braziliensis.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 32: org.Ldonovani.BHU.1220.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania donovani.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 33: org.Ldonovani.CL.SL.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania donovani.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 34: org.Ldonovani.HU3.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania donovani.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 35: org.Ldonovani.LV9.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania donovani.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 36: org.Lenriettii.LEM3045.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania enriettii.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 37: org.Lenriettii.MCAVBR2001CUR178.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania enriettii.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 38: org.Lgerbilli.LEM452.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania gerbilli.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 39: org.Lmajor.Friedlin.2021.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania major.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 40: org.Lmartiniquensis.LEM2494.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania martiniquensis.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 41: org.Lmartiniquensis.MHOMTH2012LSCM1.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania martiniquensis.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 42: org.Lorientalis.MHOMTH2014LSCM4.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania orientalis.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 43: org.Lpanamensis.MHOMPA94PSC.1.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania panamensis.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 44: org.Lpyrrhocoris.H10.v67.eg.db.
## Found 73 candidate genera matching Leptomonas
## Found an exact match for the combination genus/species not strain for Leptomonas pyrrhocoris.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 45: org.Lseymouri.ATCC.30220.v67.eg.db.
## Found 73 candidate genera matching Leptomonas
## Found an exact match for the combination genus/species not strain for Leptomonas seymouri.
## Working on: 46: org.Lsp.Ghana.MHOMGH2012GH5.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania sp..
## Working on: 47: org.Lsp.Namibia.MPRONA1975252LV425.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania sp..
## Working on: 48: org.Ltarentolae.Parrot.TarII.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania tarentolae.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 49: org.Ltarentolae.Parrot.Tar.II.2019.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania tarentolae.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 50: org.Lturanica.LEM423.v67.eg.db.
## Found 279 candidate genera matching Leishmania
## Found an exact match for the combination genus/species not strain for Leishmania turanica.
## Working on: 51: org.Pconfusum.CUL13.v67.eg.db.
## Found 4 candidate genera matching Paratrypanosoma
## Found an exact match for the combination genus/species not strain for Paratrypanosoma confusum.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 52: org.Phertigi.MCOEPA1965C119.v67.eg.db.
## Found 3 candidate genera matching Porcisia
## Found an exact match for the combination genus/species not strain for Porcisia hertigi.
## Working on: 53: org.Tbrucei.EATRO1125.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma brucei.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 54: org.Tbrucei.Lister.427.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma brucei.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 55: org.Tbrucei.Lister.427.2018.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma brucei.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 56: org.Tcongolense.IL3000.2019.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma congolense.
## Working on: 57: org.Tcongolense.Tc1148.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma congolense.
## Working on: 58: org.Tcruzi.231.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 59: org.Tcruzi.Berenice.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 60: org.Tcruzi.Brazil.A4.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 61: org.Tcruzi.Bug2148.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 62: org.Tcruzi.CL.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 63: org.Tcruzi.CL.Brener.Esmeraldo.like.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 64: org.Tcruzi.CL.Brener.Non.Esmeraldo.like.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 65: org.Tcruzi.Dm28c.2017.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 66: org.Tcruzi.Dm28c.2018.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 67: org.Tcruzi.G.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 68: org.Tcruzi.S11.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 69: org.Tcruzi.S15.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 70: org.Tcruzi.S154a.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 71: org.Tcruzi.S162a.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 72: org.Tcruzi.S23b.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 73: org.Tcruzi.S44a.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 74: org.Tcruzi.S92a.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 75: org.Tcruzi.Sylvio.X101.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 76: org.Tcruzi.Sylvio.X101.2012.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 77: org.Tcruzi.TCC.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 78: org.Tcruzi.Y.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 79: org.Tcruzi.Y.C6.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 80: org.Tcruzi.Ycl2.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 81: org.Tcruzi.Ycl4.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 82: org.Tcruzi.Ycl6.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 83: org.Tcruzi.marinkellei.B7.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma cruzi.
## Working on: 84: org.Tequiperdum.OVI.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma equiperdum.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 85: org.Tevansi.STIB.805.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma evansi.
## Found more than one taxonomy ID match, returning the first match.
## Working on: 86: org.Tgrayi.ANR4.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma grayi.
## Working on: 87: org.Tmelophagium.St.Kilda.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma melophagium.
## Working on: 88: org.Ttheileri.isolate.Edinburgh.v67.eg.db.
## Found 549 candidate genera matching Trypanosoma
## Found an exact match for the combination genus/species not strain for Trypanosoma theileri.
panamensis_entry <- get_eupath_entry("MHOM", metadata = meta[["valid"]])
## Found the following hits: Leishmania panamensis MHOM/COL/81/L13, Leishmania braziliensis MHOM/BR/75/M2903, Leishmania braziliensis MHOM/BR/75/M2904, Leishmania mexicana MHOM/GT/2001/U1103, Leishmania sp. Ghana MHOM/GH/2012/GH5, choosing the first.
## Using: Leishmania panamensis MHOM/COL/81/L13.
panamensis_db <- make_eupath_orgdb(panamensis_entry)
##  org.Lpanamensis.MHOMCOL81L13.v67.eg.db is already installed and a copy should be found at: /sw/local/conda/202404/envs/hpgltools/lib/R/library/org.Lpanamensis.MHOMCOL81L13.v67.eg.db/extdata/org.Lpanamensis.MHOMCOL81L13.v67.eg.sqlite.
panamensis_pkg <- panamensis_db[["pkgname"]]
package_name <- panamensis_db[["pkgname"]]
if (is.null(panamensis_pkg)) {
  panamensis_pkg <- panamensis_entry[["OrgdbPkg"]]
  package_name <- panamensis_pkg
}

tt <- library(panamensis_pkg, character.only = TRUE)
## Loading required package: AnnotationDbi
## 
## Attaching package: 'AnnotationDbi'
## The following object is masked from 'package:dplyr':
## 
##     select
## 
panamensis_env <- get0(panamensis_pkg)
all_fields <- columns(panamensis_env)
all_lp_annot <- sm(load_orgdb_annotations(
    panamensis_env,
    keytype = "gid",
    fields = c("annot_gene_entrez_id", "annot_gene_name",
               "annot_strand", "annot_chromosome", "annot_cds_length",
               "annot_gene_product")))$genes
## Testing to see just how big the full database is.
## testing <- load_orgdb_annotations(panamensis_pkg, keytype = "gid", fields = "all")

lp_go <- load_orgdb_go(panamensis_pkg)
lp_go <- lp_go[, c("GID", "GO")]
lp_lengths <- all_lp_annot[, c("gid", "annot_cds_length")]
colnames(lp_lengths)  <- c("ID", "length")
all_lp_annot[["annot_gene_product"]] <- tolower(all_lp_annot[["annot_gene_product"]])
orthos <- sm(extract_eupath_orthologs(db = panamensis_pkg))
data_structures <- c(data_structures, "lp_lengths", "lp_go", "all_lp_annot")
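
To make the shape of these tables concrete, here is a minimal base R sketch using an invented three-gene annotation table. The gene IDs, lengths, products, and GO assignments are all hypothetical; only the column names mirror the ones used above.

```r
## Toy stand-in for all_lp_annot; every value here is invented.
toy_annot <- data.frame(
  gid = c("gene_a", "gene_b", "gene_c"),
  annot_cds_length = c(1200, 450, 2310),
  annot_gene_product = c("Hypothetical Protein", "Protein Kinase", "ABC Transporter"),
  stringsAsFactors = FALSE)

## Two-column length table, renamed in the same way as lp_lengths.
toy_lengths <- toy_annot[, c("gid", "annot_cds_length")]
colnames(toy_lengths) <- c("ID", "length")

## Lower-case the product descriptions so later text matching is case-insensitive.
toy_annot[["annot_gene_product"]] <- tolower(toy_annot[["annot_gene_product"]])

## Toy stand-in for the GO table: one row per gene/category pair.
toy_go <- data.frame(
  GID = c("gene_a", "gene_a", "gene_c"),
  GO = c("GO:0005515", "GO:0016301", "GO:0005524"),
  stringsAsFactors = FALSE)
go_per_gene <- table(toy_go[["GID"]])
```

A gene may appear in several rows of the GO table, which is why it is kept separate from the one-row-per-gene length table.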

4 Load a genome

The following block loads the full genome sequence for L. panamensis. We may use this later to attempt to design PCR primers to distinguish strains.

I am not sure how to increase the number of open files allowed in a container; as a result, this block does not currently work.

testing_panamensis <- make_eupath_bsgenome(entry = panamensis_entry)
library(as.character(testing_panamensis), character.only = TRUE)
lp_genome <- get0(as.character(testing_panamensis))
data_structures <- c(data_structures, "lp_genome")

5 Generate Expressionsets and Sample Estimation

The process of sample estimation takes two primary inputs:

  1. The sample sheet, which contains all the metadata we currently have on hand, including filenames for the outputs of steps 3 and 4 in the Introduction.
  2. The gene annotations.

An ExpressionSet (or SummarizedExperiment) is a data structure used in R to examine RNASeq data. It comprises annotations, metadata, and expression data. In the case of our processing pipeline, the location of the expression data is provided by the filenames in the metadata.
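
The three components can be illustrated with a minimal sketch that builds a tiny ExpressionSet directly with Biobase. The gene names, sample names, and counts are invented, and this bypasses the sample sheet entirely; it is only meant to show how the pieces line up.

```r
library(Biobase)

## Expression data: rows are genes, columns are samples (invented counts).
counts <- matrix(c(10, 0, 5, 3, 8, 1), nrow = 3,
                 dimnames = list(c("gene1", "gene2", "gene3"),
                                 c("sampleA", "sampleB")))

## Sample metadata: rownames must match the column names of the counts.
sample_meta <- data.frame(condition = c("inf", "uninf"),
                          row.names = c("sampleA", "sampleB"))

## Gene annotations: rownames must match the row names of the counts.
gene_annot <- data.frame(product = c("hypothetical protein", "kinase", "transporter"),
                         row.names = c("gene1", "gene2", "gene3"))

toy_eset <- ExpressionSet(assayData = counts,
                          phenoData = AnnotatedDataFrame(sample_meta),
                          featureData = AnnotatedDataFrame(gene_annot))

## Accessors for each component.
dim(exprs(toy_eset))
pData(toy_eset)
fData(toy_eset)
```

The same `exprs()` and `pData()` accessors are used on the real data structures later in this document.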

5.1 Define colors

The following list contains the colors we have chosen to use when plotting the various ways of grouping the samples.

color_choices <- list(
    "strain" = list(
        ## "z1.0" = "#333333", ## Changed this to 'braz' to make it easier to find them.
        "z2.0" = "#555555",
        "z3.0" = "#777777",
        "z2.1" = "#874400",
        "z2.2" = "#0000cc",
        "z2.3" = "#cc0000",
        "z2.4" = "#df7000",
        "z3.2" = "#888888",
        "z1.0" = "#cc00cc",
        "z1.5" = "#cc00cc",
        "b2904" = "#cc00cc",
        "unknown" = "#cbcbcb"),
    ## "null" = "#000000"),
    "zymo" = list(
      "z22" = "#0000cc",
      "z23" = "#cc0000"),
    "cf" = list(
        "cure" = "#006f00",
        "fail" = "#9dffa0",
        "unknown" = "#cbcbcb",
        "notapplicable" = "#000000"),
    "condition" = list(
      "inf" = "#199c75",
      "inf_sb" = "#d65d00",
      "uninf" = "#6e6ea3",
      "uninf_sb" = "#d83956"),
    "significance" = list(
      "lt0" = "#ffe0e0",
      "lt1" = "#ffa0a0",
      "lt2" = "#f94040",
      "lt4" = "#a00000",
      "gt0" = "#eeccf9",
      "gt1" = "#de8bf9",
      "gt2" = "#ad07e3",
      "gt4" = "#410257"),
    "drug" = list(
      "none" = "#989898",
      "antimony" = "#088b64"),
    "oldnew" = list(
      "previous" = "#2233aa",
      "current" =  "#9c0303"),
    "infectedp" = list(
      "uninfected" = "#676767",
      "infected" = "#ac06e2"),
    "treatment_zymo" = list(
      "infsb_z23" = "#E7298A",
      "inf_z23" = "#D95F02",
      "uninf_none" = "#66A61E",
      "uninfsb_none" = "#E6AB02",
      "inf_z22" = "#1B9E77",
      "infsb_z22" = "#7570B3"),
    "susceptibility" = list(
        "resistant" = "#8563a7",
        "sensitive" = "#8d0000",
        "ambiguous" = "#cbcbcb",
        "unknown" = "#555555"))
data_structures <- c(data_structures, "color_choices")
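
A nested list like this can be turned into per-sample colors by name lookup. The sketch below copies only the "condition" entries from the list above into a hypothetical `toy_colors` object and applies them to an invented vector of sample conditions; how the plotting functions consume the list in practice may differ.

```r
## A copy of the "condition" entries from color_choices.
toy_colors <- list(
  "condition" = list(
    "inf" = "#199c75",
    "inf_sb" = "#d65d00",
    "uninf" = "#6e6ea3",
    "uninf_sb" = "#d83956"))

## Invented conditions for four samples.
sample_conditions <- c("inf", "uninf", "inf_sb", "inf")

## Flatten the inner list to a named character vector, then index by name.
condition_palette <- unlist(toy_colors[["condition"]])
sample_colors <- unname(condition_palette[sample_conditions])
sample_colors
```

Indexing by name means a condition missing from the palette yields NA, which is a quick way to spot unexpected factor levels.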

6 Macrophage data

All of the above focused entirely on the parasite samples. Now let us pull up the infected macrophage samples. This will comprise two datasets, one for the human and one for the parasite.

6.1 Macrophage host data

The metadata for the macrophage samples contains a couple of columns for mapped human and parasite reads. We will therefore use them separately to create two expressionsets, one for each species.

hs_annot <- load_biomart_annotations(year = "2020")
## Using mart: ENSEMBL_MART_ENSEMBL from host: Apr2020.archive.ensembl.org.
## Successfully connected to the hsapiens_gene_ensembl database.
## Finished downloading ensembl gene annotations.
## Finished downloading ensembl structure annotations.
## symbol columns is null, pattern matching 'symbol'.
## Including symbols, there are 68503 vs the 249740 gene annotations.
## Dropping haplotype chromosome annotations, set drop_haplotypes = FALSE if this is bad.
## Saving annotations to hsapiens_biomart_annotations.rda.
## Finished save().
hs_annot <- hs_annot[["annotation"]]
hs_annot[["transcript"]] <- paste0(rownames(hs_annot), ".", hs_annot[["transcript_version"]])
rownames(hs_annot) <- make.names(hs_annot[["ensembl_gene_id"]], unique = TRUE)
rownames(hs_annot) <- paste0("gene:", rownames(hs_annot))
tx_gene_map <- hs_annot[, c("transcript", "ensembl_gene_id")]

sanitize_columns <- c("drug", "macrophagetreatment", "macrophagezymodeme")
macr_annot <- hs_annot
rownames(macr_annot) <- gsub(x = rownames(macr_annot),
                             pattern = "^gene:",
                             replacement = "")

hs_macrophage <- create_expt(sample_sheet, gene_info = macr_annot,
                             file_column = "hg38100hisatfile") %>%
  set_expt_conditions(fact = "macrophagetreatment") %>%
  set_expt_batches(fact = "macrophagezymodeme") %>%
  sanitize_expt_pData(columns = sanitize_columns) %>%
  subset_expt(nonzero = 12000)
## Reading the sample metadata.
## Did not find the condition column in the sample sheet.
## Filling it in as undefined.
## Did not find the batch column in the sample sheet.
## Filling it in as undefined.
## The sample definitions comprises: 69 rows(samples) and 83 columns(metadata fields).
## Matched 21452 annotations and counts.
## Bringing together the count matrix and gene information.
## Some annotations were lost in merging, setting them to 'undefined'.
## Saving the expressionset to 'expt.rda'.
## The final expressionset has 21481 features and 69 samples.
## The numbers of samples by condition are:
## 
##      inf   inf_sb    uninf uninf_sb 
##       30       29        5        5
## The number of samples by batch are:
## 
## none z2.2 z2.3 
##   10   30   29
## The samples (and read coverage) removed when filtering 12000 non-zero genes are: 
## subset_expt(): There were 69, now there are 68 samples.
fixed_genenames <- gsub(x = rownames(exprs(hs_macrophage)), pattern = "^gene:",
                        replacement = "")
hs_macrophage <- set_expt_genenames(hs_macrophage, ids = fixed_genenames)
table(pData(hs_macrophage)$condition)
## 
##      inf   inf_sb    uninf uninf_sb 
##       29       29        5        5
## The following 3 lines were copy/pasted to datastructures and should be removed soon.
nostrain <- is.na(pData(hs_macrophage)[["strainid"]])
pData(hs_macrophage)[nostrain, "strainid"] <- "none"

pData(hs_macrophage)[["strain_zymo"]] <- paste0("s", pData(hs_macrophage)[["strainid"]],
                                                "_", pData(hs_macrophage)[["macrophagezymodeme"]])
uninfected <- pData(hs_macrophage)[["strain_zymo"]] == "snone_none"
pData(hs_macrophage)[uninfected, "strain_zymo"] <- "uninfected"

pData(hs_macrophage)[["infectedp"]] <- "infected"
pData(hs_macrophage)[uninfected, "infectedp"] <- "uninfected"

data_structures <- c(data_structures, "hs_macrophage")
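
The strain/zymodeme labelling above can be checked in isolation on a small data frame. The following is a minimal base-R sketch, assuming a toy `meta` table standing in for `pData(hs_macrophage)`; the sample values are invented for illustration and are not taken from the sample sheet.

```r
## Toy metadata standing in for pData(hs_macrophage); values are illustrative only.
meta <- data.frame(
  strainid = c("10772", NA, "2169"),
  macrophagezymodeme = c("z22", "none", "z23"),
  stringsAsFactors = FALSE)

## Replace missing strain IDs with the placeholder "none".
nostrain <- is.na(meta[["strainid"]])
meta[nostrain, "strainid"] <- "none"

## Combine strain and zymodeme into a single label.
meta[["strain_zymo"]] <- paste0("s", meta[["strainid"]],
                                "_", meta[["macrophagezymodeme"]])

## Samples with neither a strain nor a zymodeme are the uninfected controls.
uninfected <- meta[["strain_zymo"]] == "snone_none"
meta[uninfected, "strain_zymo"] <- "uninfected"

## Derive a two-level infected/uninfected indicator from the same index.
meta[["infectedp"]] <- "infected"
meta[uninfected, "infectedp"] <- "uninfected"
print(meta)
```

Replacing the `NA` strain IDs first matters: `paste0()` would otherwise produce the literal string "sNA_none", and the comparison against "snone_none" would miss the uninfected samples.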

6.2 Subset and create different groupings

all_human <- sanitize_expt_pData(hs_macrophage, columns = "drug") %>%
  set_expt_conditions(fact = "drug") %>%
  set_expt_batches(fact = "typeofcells")
## The numbers of samples by condition are:
## 
## antimony     none 
##       34       34
## The number of samples by batch are:
## 
## Macrophages        U937 
##          54          14
data_structures <- c(data_structures, "all_human")

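## The following 3 lines were copy/pasted to datastructures and should be removed soon.
## Missing strain IDs were already set to "none" in hs_macrophage above,
## so the assignment below currently changes nothing.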
no_strain_idx <- pData(all_human)[["strainid"]] == "none"
##pData(all_human)[["strainid"]] <- paste0("s", pData(all_human)[["strainid"]],
##                                         "_", pData(all_human)[["macrophagezymodeme"]])
pData(all_human)[no_strain_idx, "strainid"] <- "none"
table(pData(all_human)[["strainid"]])
## 
## 10763 10772 10977 11026 11075 11126 12251 12309 12355 12367  2169  7158  none 
##     2     8     2     2     2     8     7     8     2     7     8     2    10
all_human_types <- set_expt_conditions(all_human, fact = "typeofcells") %>%
  set_expt_batches(fact = "drug")
## The numbers of samples by condition are:
## 
## Macrophages        U937 
##          54          14
## The number of samples by batch are:
## 
## antimony     none 
##       34       34
data_structures <- c(data_structures, "all_human_types")

type_zymo_fact <- paste0(pData(all_human_types)[["condition"]], "_",
                         pData(all_human_types)[["macrophagezymodeme"]])
type_zymo <- set_expt_conditions(all_human_types, fact = type_zymo_fact)
## The numbers of samples by condition are:
## 
## Macrophages_none  Macrophages_z22  Macrophages_z23        U937_none 
##                8               23               23                2 
##         U937_z22         U937_z23 
##                6                6
data_structures <- c(data_structures, "type_zymo")

type_drug_fact <- paste0(pData(all_human_types)[["condition"]], "_",
                         pData(all_human_types)[["drug"]])
type_drug <- set_expt_conditions(all_human_types, fact = type_drug_fact)
## The numbers of samples by condition are:
## 
## Macrophages_antimony     Macrophages_none        U937_antimony 
##                   27                   27                    7 
##            U937_none 
##                    7
data_structures <- c(data_structures, "type_drug")

strain_fact <- pData(all_human_types)[["strainid"]]
table(strain_fact)
## strain_fact
## 10763 10772 10977 11026 11075 11126 12251 12309 12355 12367  2169  7158  none 
##     2     8     2     2     2     8     7     8     2     7     8     2    10
new_conditions <- paste0(pData(hs_macrophage)[["macrophagetreatment"]], "_",
                         pData(hs_macrophage)[["macrophagezymodeme"]])
## Note the sanitize() call is redundant with the addition of sanitize() in the
## datastructures file, but I don't want to wait to rerun that.
hs_macr <- set_expt_conditions(hs_macrophage, fact = new_conditions) %>%
  sanitize_expt_pData(columns = "drug") %>%
  set_expt_colors(color_choices[["treatment_zymo"]]) %>%
  subset_expt(subset = "typeofcells!='U937'")
## The numbers of samples by condition are:
## 
##      inf_z22      inf_z23    infsb_z22    infsb_z23   uninf_none uninfsb_none 
##           14           15           15           14            5            5
## The samples excluded are: TMRC30309, TMRC30293, TMRC30294, TMRC30291, TMRC30292, TMRC30307, TMRC30308, TMRC30310, TMRC30331, TMRC30311, TMRC30332, TMRC30305, TMRC30306, TMRC30330.
## subset_expt(): There were 68, now there are 54 samples.
data_structures <- c(data_structures, "hs_macr")

hs_macr_drug_expt <- set_expt_conditions(hs_macr, fact = "drug")
## The numbers of samples by condition are:
## 
## antimony     none 
##       27       27
hs_macr_strain_expt <- set_expt_conditions(hs_macr, fact = "macrophagezymodeme") %>%
  subset_expt(subset = "macrophagezymodeme != 'none'")
## The numbers of samples by condition are:
## 
## none  z22  z23 
##    8   23   23
## The samples excluded are: TMRC30059, TMRC30060, TMRC30266, TMRC30268, TMRC30326, TMRC30327, TMRC30312, TMRC30313.
## subset_expt(): There were 54, now there are 46 samples.
data_structures <- c(data_structures, "hs_macr_strain_expt")

table(pData(hs_macr)[["strainid"]])
## 
## 10763 10772 10977 11026 11075 11126 12251 12309 12355 12367  2169  7158  none 
##     2     6     2     2     2     6     5     6     2     5     6     2     8

Let us see if the sankey plot of these samples looks useful…

macr_sankey <- plot_meta_sankey(hs_macrophage, color_choices = color_choices,
                                factors = c("oldnew", "drug", "infectedp", "macrophagezymodeme"))
macr_sankey
## A sankey plot describing the metadata of 68 samples,
## including 26 out of 0 nodes and traversing metadata factors:
## .

Finally, split off the U937 samples.

hs_u937 <- subset_expt(hs_macrophage, subset = "typeofcells!='Macrophages'")
## The samples excluded are
## subset_expt(): There were 68, now there are 14 samples.
data_structures <- c(data_structures, "hs_u937")
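
The macrophage and U937 subsets are each defined by excluding the other cell type, which partitions the samples only while `typeofcells` has exactly two levels (here 54 + 14 = 68). A minimal base-R sketch of that sanity check, assuming a toy `meta` data frame in place of the real metadata, might look like this:

```r
## Toy metadata; sample names and cell-type assignments are made up.
meta <- data.frame(
  sample = paste0("s", 1:6),
  typeofcells = c("Macrophages", "Macrophages", "U937",
                  "Macrophages", "U937", "Macrophages"),
  stringsAsFactors = FALSE)

## Mirror the two negated subsets used above.
macr <- subset(meta, typeofcells != "U937")
u937 <- subset(meta, typeofcells != "Macrophages")

## With only two cell types the subsets are disjoint and cover every sample.
stopifnot(nrow(macr) + nrow(u937) == nrow(meta))
stopifnot(length(intersect(macr[["sample"]], u937[["sample"]])) == 0)
table(meta[["typeofcells"]])
```

If a third cell type were ever added to the sample sheet, both negated subsets would include it and the first check would fail, which is the point of running it.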

6.3 Macrophage parasite data

The previous block used a new invocation of Ensembl-derived annotation data; this time we can simply reuse our existing parasite gene annotations.

lp_macrophage <- create_expt(sample_sheet,
                             file_column = "lpanamensisv36hisatfile",
                             gene_info = all_lp_annot,
                             savefile = glue("rda/lp_macrophage-v{ver}.rda"),
                             annotation = "org.Lpanamensis.MHOMCOL81L13.v46.eg.db") %>%
  set_expt_conditions(fact = "macrophagezymodeme") %>%
  set_expt_batches(fact = "macrophagetreatment")
## Reading the sample metadata.
## Did not find the condition column in the sample sheet.
## Filling it in as undefined.
## Did not find the batch column in the sample sheet.
## Filling it in as undefined.
## The sample definitions comprises: 69 rows(samples) and 83 columns(metadata fields).
## Warning in create_expt(sample_sheet, file_column = "lpanamensisv36hisatfile", :
## Some samples were removed when cross referencing the samples against the count
## data.
## Matched 8778 annotations and counts.
## Bringing together the count matrix and gene information.
## The final expressionset has 8778 features and 66 samples.
## The numbers of samples by condition are:
## 
## none z2.2 z2.3 
##    8   29   29
## The number of samples by batch are:
## 
##      inf   inf_sb    uninf uninf_sb 
##       29       29        4        4
unfilt_written <- write_expt(
  lp_macrophage,
  excel = glue("analyses/macrophage_de/{ver}/read_counts/lp_macrophage_reads_unfiltered-v{ver}.xlsx"))
## Writing the first sheet, containing a legend and some summary data.
## The following samples have less than 5705.7 genes.
##  [1] "TMRC30066" "TMRC30117" "TMRC30244" "TMRC30246" "TMRC30249" "TMRC30266"
##  [7] "TMRC30268" "TMRC30326" "TMRC30323" "TMRC30319" "TMRC30325" "TMRC30327"
## [13] "TMRC30312" "TMRC30300" "TMRC30304" "TMRC30302" "TMRC30313" "TMRC30309"
## [19] "TMRC30292" "TMRC30331" "TMRC30332" "TMRC30330"
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## 175550 entries are 0.  We are on a log scale, adding 1 to the data.
## 
## Changed 175550 zero count features.
## 
## Naively calculating coefficient of variation/dispersion with respect to condition.
## 
## Finished calculating dispersion estimates.
## 
## `geom_smooth()` using formula = 'y ~ x'
## Error in .fitExtractVarPartModel(exprObj, formula, data, REML = REML,  : 
##   Initial model failed:
## The variables specified in this model are redundant,
## so the design matrix is not full rank
## Error in density.default(x, adjust = adj) : 'x' contains missing values
## Error in density.default(x, adjust = adj) : 'x' contains missing values
## `geom_smooth()` using formula = 'y ~ x'
## Subsetting on features.
## 
## remove_genes_expt(), before removal, there were 8691 genes, now there are 2795.
## 
## There are 58 samples which kept less than 90 percent counts.
## TMRC30051 TMRC30057 TMRC30061 TMRC30062 TMRC30063 TMRC30064 TMRC30065 TMRC30066 
##     36.85     36.34     36.46     39.51     36.21     36.06     39.45     46.54 
## TMRC30067 TMRC30069 TMRC30117 TMRC30162 TMRC30244 TMRC30245 TMRC30246 TMRC30247 
##     36.26     36.61     52.21     36.29     45.60     36.41     47.17     36.38 
## TMRC30248 TMRC30249 TMRC30250 TMRC30251 TMRC30252 TMRC30267 TMRC30286 TMRC30316 
##     36.26     41.02     36.33     37.60     36.40     36.21     36.30     36.21 
## TMRC30317 TMRC30322 TMRC30323 TMRC30328 TMRC30318 TMRC30319 TMRC30324 TMRC30325 
##     36.22     36.31     55.67     36.46     36.39     49.93     36.51     51.17 
## TMRC30320 TMRC30321 TMRC30297 TMRC30298 TMRC30299 TMRC30300 TMRC30295 TMRC30296 
##     36.37     36.50     36.59     37.11     36.34     41.32     36.70     37.96 
## TMRC30303 TMRC30304 TMRC30301 TMRC30302 TMRC30314 TMRC30315 TMRC30293 TMRC30294 
##     36.56     50.00     36.86     43.80     36.60     37.24     36.40     36.63 
## TMRC30291 TMRC30292 TMRC30307 TMRC30308 TMRC30310 TMRC30331 TMRC30311 TMRC30332 
##     36.15     41.53     36.80     37.43     36.14     42.76     36.95     44.27 
## TMRC30305 TMRC30306 
##     36.67     37.29 
## Error in .fitExtractVarPartModel(exprObj, formula, data, REML = REML,  : 
##   Initial model failed:
## The variables specified in this model are redundant,
## so the design matrix is not full rank
## Retrying with only condition in the model.
lp_macrophage_filt <- subset_expt(lp_macrophage, nonzero = 2500) %>%
  semantic_expt_filter(semantic = c("amastin", "gp63", "leishmanolysin"),
                       semantic_column = "annot_gene_product")
## The samples (and read coverage) removed when filtering 2500 non-zero genes are: 
## subset_expt(): There were 66, now there are 50 samples.
## semantic_expt_filter(): Removed 68 genes.
data_structures <- c(data_structures, "lp_macrophage", "lp_macrophage_filt")
filt_written <- write_expt(lp_macrophage_filt,
  excel = glue("analyses/macrophage_de/{ver}/read_counts/lp_macrophage_reads_filtered-v{ver}.xlsx"))
## Writing the first sheet, containing a legend and some summary data.
## The following samples have less than 5661.5 genes.
## [1] "TMRC30249" "TMRC30300" "TMRC30302" "TMRC30292" "TMRC30331" "TMRC30332"
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## 44583 entries are 0.  We are on a log scale, adding 1 to the data.
## 
## Changed 44583 zero count features.
## 
## Naively calculating coefficient of variation/dispersion with respect to condition.
## 
## Finished calculating dispersion estimates.
## 
## `geom_smooth()` using formula = 'y ~ x'
## Error in density.default(x, adjust = adj) : 'x' contains missing values
## Error in density.default(x, adjust = adj) : 'x' contains missing values
## `geom_smooth()` using formula = 'y ~ x'
## Subsetting on features.
## 
## remove_genes_expt(), before removal, there were 8623 genes, now there are 8601.
lp_macrophage <- lp_macrophage_filt
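
The `nonzero = 2500` filter above dropped 16 of the 66 samples. As a rough illustration of the idea, the sketch below counts the genes with at least one read in each sample of a toy matrix and keeps the samples that reach a threshold. This is an assumption about what the filter does conceptually, not the hpgltools implementation; in particular, whether the real comparison is `>=` or `>` is not confirmed here, and `min_genes` is a hypothetical stand-in for the `nonzero` argument.

```r
## Toy count matrix: 5 genes (rows) by 3 samples (columns); values are made up.
counts <- matrix(c(10, 0, 3, 0, 7,
                   0, 0, 1, 0, 0,
                   5, 2, 0, 9, 4),
                 nrow = 5,
                 dimnames = list(paste0("gene", 1:5), c("sA", "sB", "sC")))

## Count, per sample, how many genes have at least one read.
nonzero_genes <- colSums(counts > 0)

## Hypothetical threshold standing in for nonzero = 2500.
min_genes <- 3

## Keep only the samples which observe enough genes (assumed to be >=).
keep <- nonzero_genes >= min_genes
filtered <- counts[, keep, drop = FALSE]
print(nonzero_genes)
print(colnames(filtered))
```

For the parasite reads in infected macrophages, a filter of this kind mostly removes the uninfected and poorly infected samples, which observe few parasite genes.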

lp_macrophage_nosb <- subset_expt(lp_macrophage, subset = "batch!='inf_sb'")
## The samples excluded are: TMRC30051, TMRC30062, TMRC30065, TMRC30069, TMRC30248, TMRC30249, TMRC30251, TMRC30252, TMRC30317, TMRC30321, TMRC30298, TMRC30300, TMRC30296, TMRC30302, TMRC30315, TMRC30294, TMRC30292, TMRC30308, TMRC30331, TMRC30332, TMRC30306.
## subset_expt(): There were 50, now there are 29 samples.
lp_nosb_write <- write_expt(
  lp_macrophage_nosb,
  excel = glue("analyses/macrophage_de/{ver}/read_counts/lp_macrophage_nosb_reads-v{ver}.xlsx"))
## Writing the first sheet, containing a legend and some summary data.
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## 6396 entries are 0.  We are on a log scale, adding 1 to the data.
## Changed 6396 zero count features.
## Naively calculating coefficient of variation/dispersion with respect to condition.
## Finished calculating dispersion estimates.
## `geom_smooth()` using formula = 'y ~ x'
## Error in filterInputData(exprObj, formula, data, useWeights = useWeights) : 
##   Variable in formula not found in data: condition
## `geom_smooth()` using formula = 'y ~ x'
## varpart sees only 1 batch, adjusting the model accordingly.
## 
## Subsetting on features.
## 
## remove_genes_expt(), before removal, there were 8591 genes, now there are 8591.
## Error in filterInputData(exprObj, formula, data, useWeights = useWeights) : 
##   Variable in formula not found in data: condition
## Retrying with only condition in the model.
data_structures <- c(data_structures, "lp_macrophage_nosb")

spec <- make_rnaseq_spec()
test <- sm(gather_preprocessing_metadata(sample_sheet, specification = spec))

7 Save all data structures into one rda

save(list = data_structures, file = glue("rda/tmrc2_data_structures-v{ver}.rda"))
pander::pander(sessionInfo())

R version 4.3.3 (2024-02-29)

Platform: x86_64-conda-linux-gnu (64-bit)

locale: C

attached base packages: stats4, stats, graphics, grDevices, utils, datasets, methods and base

other attached packages: ruv(v.0.9.7.1), BiocParallel(v.1.36.0), variancePartition(v.1.32.5), org.Lpanamensis.MHOMCOL81L13.v67.eg.db(v.2024.04), AnnotationDbi(v.1.64.1), futile.logger(v.1.4.3), EuPathDB(v.1.6.0), GenomeInfoDbData(v.1.2.11), dplyr(v.1.1.4), Heatplus(v.3.10.0), ggplot2(v.3.5.0), hpgltools(v.1.0), Matrix(v.1.6-5), glue(v.1.7.0), SummarizedExperiment(v.1.32.0), GenomicRanges(v.1.54.1), GenomeInfoDb(v.1.38.8), IRanges(v.2.36.0), S4Vectors(v.0.40.2), MatrixGenerics(v.1.14.0), matrixStats(v.1.2.0), Biobase(v.2.62.0) and BiocGenerics(v.0.48.1)

loaded via a namespace (and not attached): fs(v.1.6.3), bitops(v.1.0-7), lubridate(v.1.9.3), doParallel(v.1.0.17), HDO.db(v.0.99.1), httr(v.1.4.7), RColorBrewer(v.1.1-3), numDeriv(v.2016.8-1.1), backports(v.1.4.1), tools(v.4.3.3), utf8(v.1.2.4), R6(v.2.5.1), lazyeval(v.0.2.2), mgcv(v.1.9-1), withr(v.3.0.0), gridExtra(v.2.3), prettyunits(v.1.2.0), cli(v.3.6.2), formatR(v.1.14), AnnotationHubData(v.1.32.1), labeling(v.0.4.3), sass(v.0.4.9), mvtnorm(v.1.2-4), genefilter(v.1.84.0), readr(v.2.1.5), Rsamtools(v.2.18.0), yulab.utils(v.0.1.4), DOSE(v.3.28.2), stringdist(v.0.9.12), AnnotationForge(v.1.44.0), limma(v.3.58.1), RSQLite(v.2.3.6), generics(v.0.1.3), BiocIO(v.1.12.0), gtools(v.3.9.5), vroom(v.1.6.5), zip(v.2.3.1), GO.db(v.3.18.0), fansi(v.1.0.6), abind(v.1.4-5), lifecycle(v.1.0.4), yaml(v.2.3.8), edgeR(v.4.0.16), gplots(v.3.1.3.1), biocViews(v.1.70.0), qvalue(v.2.34.0), SparseArray(v.1.2.4), BiocFileCache(v.2.10.2), Rtsne(v.0.17), grid(v.4.3.3), blob(v.1.2.4), promises(v.1.3.0), crayon(v.1.5.2), lattice(v.0.22-6), cowplot(v.1.1.3), GenomicFeatures(v.1.54.4), annotate(v.1.80.0), KEGGREST(v.1.42.0), pillar(v.1.9.0), knitr(v.1.46), varhandle(v.2.0.6), fgsea(v.1.28.0), rjson(v.0.2.21), boot(v.1.3-30), corpcor(v.1.6.10), codetools(v.0.2-20), fastmatch(v.1.1-4), data.table(v.1.15.4), vctrs(v.0.6.5), png(v.0.1-8), Rdpack(v.2.6), testthat(v.3.2.1), gtable(v.0.3.4), cachem(v.1.0.8), xfun(v.0.43), openxlsx(v.4.2.5.2), rbibutils(v.2.2.16), S4Arrays(v.1.2.1), mime(v.0.12), survival(v.3.5-8), iterators(v.1.0.14), statmod(v.1.5.0), directlabels(v.2024.1.21), interactiveDisplayBase(v.1.40.0), nlme(v.3.1-164), pbkrtest(v.0.5.2), bit64(v.4.0.5), EnvStats(v.2.8.1), progress(v.1.2.3), filelock(v.1.0.3), rprojroot(v.2.0.4), bslib(v.0.7.0), KernSmooth(v.2.23-22), colorspace(v.2.1-0), DBI(v.1.2.2), tidyselect(v.1.2.1), bit(v.4.0.5), compiler(v.4.3.3), curl(v.5.2.1), rvest(v.1.0.4), httr2(v.1.0.1), graph(v.1.80.0), BiocCheck(v.1.38.2), xml2(v.1.3.6), desc(v.1.4.3), 
DelayedArray(v.0.28.0), plotly(v.4.10.4), rtracklayer(v.1.62.0), scales(v.1.3.0), caTools(v.1.18.2), remaCor(v.0.0.18), quadprog(v.1.5-8), RBGL(v.1.78.0), rappdirs(v.0.3.3), stringr(v.1.5.1), digest(v.0.6.35), ggsankey(v.0.0.99999), minqa(v.1.2.6), rmarkdown(v.2.26), aod(v.1.3.3), XVector(v.0.42.0), RhpcBLASctl(v.0.23-42), htmltools(v.0.5.8.1), pkgconfig(v.2.0.3), lme4(v.1.1-35.2), highr(v.0.10), dbplyr(v.2.3.4), fastmap(v.1.1.1), rlang(v.1.1.3), htmlwidgets(v.1.6.4), shiny(v.1.8.1.1), farver(v.2.1.1), jquerylib(v.0.1.4), jsonlite(v.1.8.8), GOSemSim(v.2.28.1), RCurl(v.1.98-1.14), magrittr(v.2.0.3), munsell(v.0.5.1), Rcpp(v.1.0.12), stringi(v.1.8.3), brio(v.1.1.4), zlibbioc(v.1.48.2), MASS(v.7.3-60.0.1), AnnotationHub(v.3.10.1), plyr(v.1.8.9), parallel(v.4.3.3), ggrepel(v.0.9.5), Biostrings(v.2.70.3), splines(v.4.3.3), pander(v.0.6.5), hms(v.1.1.3), locfit(v.1.5-9.9), RUnit(v.0.4.33), fastcluster(v.1.2.6), reshape2(v.1.4.4), biomaRt(v.2.58.2), pkgload(v.1.3.4), futile.options(v.1.0.1), BiocVersion(v.3.18.1), XML(v.3.99-0.16.1), evaluate(v.0.23), lambda.r(v.1.2.4), BiocManager(v.1.30.22), nloptr(v.2.0.3), tzdb(v.0.4.0), foreach(v.1.5.2), httpuv(v.1.6.15), tidyr(v.1.3.1), purrr(v.1.0.2), BiocBaseUtils(v.1.4.0), broom(v.1.0.5), xtable(v.1.8-4), restfulr(v.0.0.15), fANCOVA(v.0.6-1), later(v.1.3.2), viridisLite(v.0.4.2), OrganismDbi(v.1.44.0), tibble(v.3.2.1), lmerTest(v.3.1-3), memoise(v.2.0.1), GenomicAlignments(v.1.38.2), sva(v.3.50.0), timechange(v.0.3.0) and GSEABase(v.1.64.0)

message("This is hpgltools commit: ", get_git_commit())
## If you wish to reproduce this exact build of hpgltools, invoke the following:
## > git clone http://github.com/abelew/hpgltools.git
## > git reset 31bacdf03b6f37f1277e44be276420f3db4d4bd6
## This is hpgltools commit: Mon Apr 8 13:21:41 2024 -0400: 31bacdf03b6f37f1277e44be276420f3db4d4bd6
message("Saving to ", savefile)
## Saving to 01datasets.rda.xz
# tmp <- sm(saveme(filename = savefile))
tmp <- loadme(filename = savefile)
