Note, I am using my previous worksheet from when last I worked with Rezia as a template for this (I copied it from there and am modifying it now).

1 S.pyogenes 5448 rofA RNASeq version: 202301

1.1 Preprocessing

I used my cyoa tool to process these samples by doing the following:

  1. Copying the data from the sequencer into the directory ‘preprocessing/’
  2. Used a slightly involved shell command to create a directory for each sample and copy the reads for it to the ‘unprocessed/’ subdirectory within it.
  3. invoked the following:
cd preprocessing
start=$(pwd)
for i in $(/bin/ls -d ./*)
do
  cd $i
  rm -rf outputs scripts
  cyoa --task pipe --method prnas --species spyogenes_5448 \
       --gff_type gene --gff_tag locus_tag \
       --input $(/bin/ls unprocessed/* | tr '\n' ':' | sed 's/:$//g')
  cd $start
done

The above for loop goes into each sample and does the following:

  1. Trims the data, heavily compresses the outputs.
  2. Runs fastqc
  3. Runs hisat2 using my spyogenes_5448 indices.
  4. Converts the sam alignment to sorted/indexed bam.
  5. Makes a couple of extra copies of it with some filters.
  6. Compresses the aligned/unaligned reads.
  7. Runs htseq-count on the alignments to count reads/gene.

Note the following steps were not actually run because I had a speeling error. But since they are not necessary for the explicitly RNASeq analyses I first want to do, I ignored it. I am curious though to see if there are other mutations in these strains, so I will likely run those portions manually.

  1. Runs freebayes on the alignments to look for variants.
  2. Sorts/compresses the freebayes output.
  3. Does some parsing of the freebayes output and provides some tables about where mutations were observed.

1.2 Collect annotation information

Same two primary annotation sources, the gff file used for mapping/counting, and microbesonline.org. Note that since I moved to just downloading the material from the web interface, I no longer have a handy method to get the taxon ID, so I go there and hunt down the taxId manually.

Now that I am thinking about it, my 5448 genome/annotations are kind of old, I will ask and check to see if there is anything newer.

Also, 5448 does not have an entry at microbesonline.org, a fact which I forgot. I need to go poking in my notes to reconnect 5005 and 5448.

gff_annot <- load_gff_annotations("reference/spyogenes_5448.gff", type = "gene")
## Trying attempt: rtracklayer::import.gff3(gff, sequenceRegionsAsSeqinfo = TRUE)
## Trying attempt: rtracklayer::import.gff3(gff, sequenceRegionsAsSeqinfo = FALSE)
## Had a successful gff import with rtracklayer::import.gff3(gff, sequenceRegionsAsSeqinfo = FALSE)
## Returning a df with 14 columns and 1814 rows.
rownames(gff_annot) <- gff_annot[["locus_tag"]]
head(gff_annot)
##              seqnames start  end width strand                 source type score
## SP5448_00005 CP008776   232 1587  1356      + EMBL/GenBank/SwissProt gene    NA
## SP5448_00010 CP008776  1742 2878  1137      + EMBL/GenBank/SwissProt gene    NA
## SP5448_00015 CP008776  2953 3150   198      + EMBL/GenBank/SwissProt gene    NA
## SP5448_00020 CP008776  3480 4595  1116      + EMBL/GenBank/SwissProt gene    NA
## SP5448_00025 CP008776  4665 5234   570      + EMBL/GenBank/SwissProt gene    NA
## SP5448_00030 CP008776  5237 8740  3504      + EMBL/GenBank/SwissProt gene    NA
##              phase    locus_tag gene gene_synonym note pseudo
## SP5448_00005     1 SP5448_00005 <NA>              <NA>   <NA>
## SP5448_00010     1 SP5448_00010 <NA>              <NA>   <NA>
## SP5448_00015     1 SP5448_00015 <NA>              <NA>   <NA>
## SP5448_00020     1 SP5448_00020 <NA>              <NA>   <NA>
## SP5448_00025     1 SP5448_00025 <NA>              <NA>   <NA>
## SP5448_00030     1 SP5448_00030 <NA>              <NA>   <NA>
mgas_data <- load_genbank_annotations(accession="CP008776")
## Loading required namespace: rentrez
## Done Parsing raw GenBank file text. [ 14.042 seconds ]
## 2023-02-01 14:06:06 Starting creation of gene GRanges
## 2023-02-01 14:06:07 Starting creation of CDS GRanges
## 2023-02-01 14:06:09 Starting creation of exon GRanges
## No exons read from genbank file. Assuming sections of CDS are full exons
## 2023-02-01 14:06:09 Starting creation of variant VRanges
## 2023-02-01 14:06:09 Starting creation of transcript GRanges
## No transcript features (mRNA) found, using spans of CDSs
## 2023-02-01 14:06:09 Starting creation of misc feature GRanges
## Warning in fill_stack_df(feats[!typs %in% c("gene", "exon", "CDS",
## "variation", : Got unexpected multi-value field(s) [ inference ]. The resulting
## column(s) will be of class CharacterList, rather than vector(s). Please contact
## the maintainer if multi-valuedness is expected/meaningful for the listed
## field(s).
## 2023-02-01 14:06:09 - Done creating GenBankRecord object [ 3.937 seconds ]
genome_size <- GenomicRanges::width(mgas_data$seq)  ## This fails on travis?
mgas_cds <- as.data.frame(mgas_data$cds)
## Get rid of amino acid sequence
rownames(mgas_cds) <- mgas_cds[["locus_tag"]]
wanted <- ! colnames(mgas_cds) %in% c("translation", "type", "strand", "seqnames", "start", "end", "locus_tag", "note", "gene", "gene_synonym", "width")
mgas_cds <- mgas_cds[, wanted]
## And EC_number because wtf is that?
mgas_annot <- merge(mgas_cds, gff_annot, by="row.names")
rownames(mgas_annot) <- mgas_annot[["Row.names"]]
mgas_annot[["Row.names"]] <- NULL

2 5005 Annotations

I remapped the data using 5005 because the annotations are much better.

ref_gff_annot <- load_gff_annotations("reference/spyogenes_5005.gff", type="CDS")
## Trying attempt: rtracklayer::import.gff3(gff, sequenceRegionsAsSeqinfo = TRUE)
## Trying attempt: rtracklayer::import.gff3(gff, sequenceRegionsAsSeqinfo = FALSE)
## Had a successful gff import with rtracklayer::import.gff3(gff, sequenceRegionsAsSeqinfo = FALSE)
## Returning a df with 19 columns and 1841 rows.
rownames(ref_gff_annot) <- ref_gff_annot[["locus_tag"]]

microbes_annot <- load_microbesonline_annotations(species="5005")
## Found 1 entry.
## Streptococcus pyogenes MGAS5005Firmicutesyes2005-08-25yes101950293653
## The species being downloaded is: Streptococcus pyogenes MGAS5005
## Downloading: http://www.microbesonline.org/cgi-bin/genomeInfo.cgi?tId=293653;export=tab
microbes_df <- as.data.frame(microbes_annot)
rownames(microbes_df) <- make.names(gsub(x=microbes_df[["sysName"]], pattern="Spy_", replacement="Spy"), unique=TRUE)

s5005_annot <- merge(ref_gff_annot, microbes_df, by="row.names")
rownames(s5005_annot) <- s5005_annot[["Row.names"]]
s5005_annot[["Row.names"]] <- NULL

microbes_go <- load_microbesonline_go(species="5005", id_column="sysName")
## Found 1 entry.
## Streptococcus pyogenes MGAS5005Firmicutesyes2005-08-25yes101950293653
## The species being downloaded is: Streptococcus pyogenes MGAS5005 and is being downloaded as 293653.tab.
microbes_go[["sysName"]] <- gsub(x=microbes_go[["sysName"]], pattern="Spy_", replacement="Spy")
colnames(microbes_go) <- c("ID", "GO")
single_hits <- readr::read_tsv("single_multi/outputs/fasta_spyogenes_5448_cds_spyogenes_5005_cds/spyogenes_5448_cds_singles.txt")
## Rows: 1340 Columns: 2
## ── Column specification ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (2): AKK69518.1, AAZ50620.1:1.000:0
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
colnames(single_hits) <- c("from", "to")
single_hits[["to"]] <- gsub(pattern="(^.*?):.*$", replacement="\\1",
                            x=single_hits[["to"]], perl=TRUE)
## The gff annotation has the protein_id column matching this cds entry.
single_both <- merge(mgas_annot, single_hits, by.x="protein_id", by.y="from", all.x=TRUE)
single_both <- merge(single_both, ref_gff_annot, by.x="to", by.y="protein_id", all.x=TRUE)

## Now pick up the missing old_locus_tags and put in the 5448 IDs
missing_ids <- is.na(single_both[["old_locus_tag"]])
new_ids <- single_both[missing_ids, "locus_tag.x"]
single_both[missing_ids, "old_locus_tag"] <- new_ids

single_both <- merge(single_both, microbes_annot, by.x="old_locus_tag", by.y="sysName", all.x=TRUE)
rownames(single_both) <- make.names(single_both[["gene_id"]], unique=TRUE)

2.1 Create expressionSet

Note that I did the following to the sample sheet provided by Dr. McIver:

  1. Changed the dP_R20_4 sample to dP_R20_2 (The original sample name is still there are the original column).
  2. Added columns at the end containing the count table locations.
  3. Added columns ‘short_media’, ‘growth_phase’, ‘genotype’ which hopefully contain the relevant metadata extracted from the sample descriptions.
  4. Added a column ‘Experiment’ which is either ‘rofA’ or ‘pdxR’.
rofa_expt <- create_expt(metadata="sample_sheets/all_samples.xlsx",
                        gene_info=s5005_annot, file_column="spyogenes5005genecounts") %>%
  subset_expt(subset="experiment=='rofA'") %>%
  set_expt_conditions(fact="genotype") %>%
  set_expt_batches(fact="growthphase")
## Reading the sample metadata.
## Did not find the condition column in the sample sheet.
## Filling it in as undefined.
## Did not find the batch column in the sample sheet.
## Filling it in as undefined.
## The sample definitions comprises: 72 rows(samples) and 27 columns(metadata fields).
## Matched 1841 annotations and counts.
## Bringing together the count matrix and gene information.
## Some annotations were lost in merging, setting them to 'undefined'.
## Saving the expressionset to 'expt.rda'.
## The final expressionset has 1926 features and 72 samples.
## subset_expt(): There were 72, now there are 36 samples.
## 
##  delta revert     WT 
##     12     12     12 
## 
## exponential  stationary  transition 
##          12          12          12

I have 36 samples to play with, let us see what they look like.

2.2 Poke expressionSet

rofa_libsize <- plot_libsize(rofa_expt)
rofa_libsize$plot

rofa_filter_plot <- plot_libsize_prepost(rofa_expt)
rofa_filter_plot$count_plot

rofa_filter_plot$lowgene_plot
## Warning: Using alpha for a discrete variable is not advised.

rofa_nonzero <- plot_nonzero(rofa_expt)
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
rofa_nonzero$plot
## Warning: ggrepel: 5 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps

## awesome, there is a range of ~ 25 genes.

written <- write_expt(rofa_expt, excel = glue::glue("excel/rofA_expt-v{ver}.xlsx"))
## Deleting the file excel/rofA_expt-v202301.xlsx before writing the tables.
## Writing the first sheet, containing a legend and some summary data.
## `geom_smooth()` using formula = 'y ~ x'Loading required package: Matrix
## 
## Attaching package: 'Matrix'
## 
## The following object is masked from 'package:S4Vectors':
## 
##     expand
## 
## 
## Total:14 s
## `geom_smooth()` using formula = 'y ~ x'
## Total:13 s

2.3 Quick visualizations

2.3.1 Without considering time

Let us start with some views of the data without thinking about batch effects.

rofa_norm <- normalize_expt(rofa_expt, filter=TRUE, norm="quant", convert="cpm", transform="log2")
## Removing 103 low-count genes (1823 remaining).
rofa_pca <- plot_pca(rofa_norm)
rofa_pca$plot
## Warning: The following aesthetics were dropped during statistical transformation: text
## ℹ This can happen when ggplot fails to infer the correct grouping structure in the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical variable into a factor?
## The following aesthetics were dropped during statistical transformation: text
## ℹ This can happen when ggplot fails to infer the correct grouping structure in the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical variable into a factor?

rofa_heatmap <- plot_disheat(rofa_norm)
rofa_heatmap$plot

2.3.2 Considering time

Repeat the previous plots, but this time using limma’s batch removal method (which is just a residuals).

rofa_time <- set_expt_conditions(rofa_expt, fact="growthphase") %>%
  set_expt_batches(fact="genotype")
## 
## exponential  stationary  transition 
##          12          12          12 
## 
##  delta revert     WT 
##     12     12     12
rofa_time_norm <- normalize_expt(rofa_time, filter=TRUE, norm="quant", convert="cpm",
                                 transform="log2")
## Removing 103 low-count genes (1823 remaining).
rofa_time_pca <- plot_pca(rofa_time_norm)
rofa_time_pca$plot
## Warning: The following aesthetics were dropped during statistical transformation: text
## ℹ This can happen when ggplot fails to infer the correct grouping structure in the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical variable into a factor?
## The following aesthetics were dropped during statistical transformation: text
## ℹ This can happen when ggplot fails to infer the correct grouping structure in the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical variable into a factor?

rofa_time_nb <- normalize_expt(rofa_time, filter=TRUE, norm="quant", convert="cpm",
                             transform="log2", batch="limma")
## Removing 103 low-count genes (1823 remaining).
## If you receive a warning: 'NANs produced', one potential reason is that the data was quantile normalized.
## Setting 46 low elements to zero.
## transform_counts: Found 46 values equal to 0, adding 1 to the matrix.
rofa_time_nb_pca <- plot_pca(rofa_time_nb)
rofa_time_nb_pca$plot
## Warning: The following aesthetics were dropped during statistical transformation: text
## ℹ This can happen when ggplot fails to infer the correct grouping structure in the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical variable into a factor?
## The following aesthetics were dropped during statistical transformation: text
## ℹ This can happen when ggplot fails to infer the correct grouping structure in the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical variable into a factor?

Well, it is pretty clear that time is the dominant factor.

2.4 Differential Expression analyses

I am going to do the DE in 4 separate pieces:

  1. Only compare strains
  2. Only compare times
  3. Compare the concatenation of strains and times.

2.4.1 Strain only comparisons

strain_de <- all_pairwise(rofa_expt, model_batch=TRUE, filter=TRUE)
## This DE analysis will perform all pairwise comparisons among:
## 
##  delta revert     WT 
##     12     12     12
## This analysis will include a batch factor in the model comprised of:
## 
## exponential  stationary  transition 
##          12          12          12
## This will pre-filter the input data using normalize_expt's: TRUE argument.
## Using limma's removeBatchEffect to visualize with(out) batch inclusion.
## Finished running DE analyses, collecting outputs.
## Comparing analyses.

strain_keepers <- list(
    "delta_wt" = c("delta", "WT"),
    "revert_wt" = c("revert", "WT"),
    "delta_revert" = c("delta", "revert"))
strain_tables <- combine_de_tables(
    strain_de, keepers = strain_keepers,
    excel = glue::glue("excel/rofa_strain_tables-v{ver}.xlsx"))
## Deleting the file excel/rofa_strain_tables-v202301.xlsx before writing the tables.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of WT_vs_delta according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WT_vs_delta according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WT_vs_delta according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of WT_vs_revert according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WT_vs_revert according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WT_vs_revert according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of revert_vs_delta according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of revert_vs_delta according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of revert_vs_delta according to the columns: logFC and adj.P.Val using the expressionset colors.
strain_sig <- extract_significant_genes(
    strain_tables,
    excel = glue::glue("excel/rofa_strain_sig-v{ver}.xlsx"))
## Deleting the file excel/rofa_strain_sig-v202301.xlsx before writing the tables.
## Plotting volcano plot of the DE results of WT_vs_delta according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WT_vs_revert according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of revert_vs_delta according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WT_vs_delta according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WT_vs_revert according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of revert_vs_delta according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WT_vs_delta according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WT_vs_revert according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of revert_vs_delta according to the columns: logFC and adj.P.Val using the expressionset colors.

2.4.1.1 Ontology search of delta vs wt

ups <- strain_sig[["deseq"]][["ups"]][["delta_wt"]]
downs <- strain_sig[["deseq"]][["downs"]][["delta_wt"]]
both <- rbind(ups, downs)
length_db <- fData(rofa_expt)[, c("start.y", "stop")]
length_db[["ID"]] <- rownames(length_db)
undef_idx <- length_db[["stop"]] == "undefined"
length_db <- length_db[!undef_idx, ]
length_db[["length"]] <- abs(as.numeric(length_db[["stop"]]) - as.numeric(length_db[["start.y"]]))
test <- simple_goseq(both, go_db = microbes_go, length_db = length_db)
## Found 25 go_db genes and 42 length_db genes out of 46.
## Testing that go categories are defined.
## Removing undefined categories.
## Gathering synonyms.
## Gathering category definitions.
## The score column is null, defaulting to score.
## Possible columns are:
## [1] "category"                 "over_represented_pvalue" 
## [3] "under_represented_pvalue" "numDEInCat"              
## [5] "numInCat"                 "term"                    
## [7] "ontology"                 "qvalue"
## The over_represented_pvalue column is null, defaulting to score.
## Possible columns are:
## [1] "term"    "pvalue"  "score"   "num_de"  "num_cat"
## The score column is null, defaulting to score.
## Possible columns are:
## [1] "category"                 "over_represented_pvalue" 
## [3] "under_represented_pvalue" "numDEInCat"              
## [5] "numInCat"                 "term"                    
## [7] "ontology"                 "qvalue"
## The over_represented_pvalue column is null, defaulting to score.
## Possible columns are:
## [1] "term"    "pvalue"  "score"   "num_de"  "num_cat"
## The score column is null, defaulting to score.
## Possible columns are:
## [1] "category"                 "over_represented_pvalue" 
## [3] "under_represented_pvalue" "numDEInCat"              
## [5] "numInCat"                 "term"                    
## [7] "ontology"                 "qvalue"
## The over_represented_pvalue column is null, defaulting to score.
## Possible columns are:
## [1] "term"    "pvalue"  "score"   "num_de"  "num_cat"
library(enrichplot)
dotplot(test$bp_enrich)

2.4.2 Time only comparisons

time_de <- all_pairwise(rofa_time, model_batch=TRUE)
## This DE analysis will perform all pairwise comparisons among:
## 
## exponential  stationary  transition 
##          12          12          12
## This analysis will include a batch factor in the model comprised of:
## 
##  delta revert     WT 
##     12     12     12
## Using limma's removeBatchEffect to visualize with(out) batch inclusion.
## Finished running DE analyses, collecting outputs.
## Comparing analyses.

time_keepers <- list(
    "transition_exponential" = c("transition", "exponential"),
    "stationary_exponential" = c("stationary", "exponential"),
    "stationary_transition" = c("stationary", "transition"))
time_tables <- combine_de_tables(
    time_de, keepers = time_keepers,
    excel = glue::glue("excel/rofa_time_tables-v{ver}.xlsx"))
## Deleting the file excel/rofa_time_tables-v202301.xlsx before writing the tables.
## Starting combine_extracted_plots() with do_inverse as: FALSE.
## Plotting volcano plot of the DE results of transition_vs_exponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of transition_vs_exponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of transition_vs_exponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: FALSE.
## Plotting volcano plot of the DE results of stationary_vs_exponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of stationary_vs_exponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of stationary_vs_exponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of transition_vs_stationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of transition_vs_stationary according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of transition_vs_stationary according to the columns: logFC and adj.P.Val using the expressionset colors.
time_sig <- extract_significant_genes(
    time_tables,
    excel = glue::glue("excel/rofa_time_sig-v{ver}.xlsx"))
## Deleting the file excel/rofa_time_sig-v202301.xlsx before writing the tables.
## Plotting volcano plot of the DE results of transition_vs_exponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of stationary_vs_exponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of transition_vs_stationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of transition_vs_exponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of stationary_vs_exponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of transition_vs_stationary according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of transition_vs_exponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of stationary_vs_exponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of transition_vs_stationary according to the columns: logFC and adj.P.Val using the expressionset colors.

2.4.2.1 Ontology search of stationary vs exponential

ups <- time_sig[["deseq"]][["ups"]][["stationary_exponential"]]
downs <- time_sig[["deseq"]][["downs"]][["stationary_exponential"]]
both <- rbind(ups, downs)
up_time_goseq <- simple_goseq(ups, go_db = microbes_go, length_db = length_db)
## Found 255 go_db genes and 376 length_db genes out of 379.
## Testing that go categories are defined.
## Removing undefined categories.
## Gathering synonyms.
## Gathering category definitions.
## The score column is null, defaulting to score.
## Possible columns are:
## [1] "category"                 "over_represented_pvalue" 
## [3] "under_represented_pvalue" "numDEInCat"              
## [5] "numInCat"                 "term"                    
## [7] "ontology"                 "qvalue"
## The over_represented_pvalue column is null, defaulting to score.
## Possible columns are:
## [1] "term"    "pvalue"  "score"   "num_de"  "num_cat"
## The score column is null, defaulting to score.
## Possible columns are:
## [1] "category"                 "over_represented_pvalue" 
## [3] "under_represented_pvalue" "numDEInCat"              
## [5] "numInCat"                 "term"                    
## [7] "ontology"                 "qvalue"
## The over_represented_pvalue column is null, defaulting to score.
## Possible columns are:
## [1] "term"    "pvalue"  "score"   "num_de"  "num_cat"
## The score column is null, defaulting to score.
## Possible columns are:
## [1] "category"                 "over_represented_pvalue" 
## [3] "under_represented_pvalue" "numDEInCat"              
## [5] "numInCat"                 "term"                    
## [7] "ontology"                 "qvalue"
## The over_represented_pvalue column is null, defaulting to score.
## Possible columns are:
## [1] "term"    "pvalue"  "score"   "num_de"  "num_cat"
fc_values <- time_tables[["data"]][["stationary_exponential"]][["deseq_logfc"]]
names(fc_values) <- rownames(time_tables[["data"]][["stationary_exponential"]])
head(fc_values)
## M5005_Spy0001 M5005_Spy0002 M5005_Spy0003 M5005_Spy0004 M5005_Spy0005 
##       -0.1191       -0.1996       -1.2100       -0.2377       -0.8275 
## M5005_Spy0006 
##       -0.9872
dotplot(up_time_goseq$mf_enrich)

cnetplot(up_time_goseq$mf_enrich, categorysize="pvalue", foldChange=fc_values)
## Warning in cnetplot.enrichResult(x, ...): Use 'color.params = list(foldChange = your_value)' instead of 'foldChange'.
##  The foldChange parameter will be removed in the next version.
## Scale for size is already present.
## Adding another scale for size, which will replace the existing scale.

heatplot(up_time_goseq$mf_enrich, foldChange=fc_values)

term_sim <- pairwise_termsim(up_time_goseq$mf_enrich)
treeplot(term_sim)

emapplot(term_sim)

both_time_goseq <- simple_goseq(both, go_db = microbes_go, length_db = length_db)
## Found 378 go_db genes and 533 length_db genes out of 541.
## Testing that go categories are defined.
## Removing undefined categories.
## Gathering synonyms.
## Gathering category definitions.
## The score column is null, defaulting to score.
## Possible columns are:
## [1] "category"                 "over_represented_pvalue" 
## [3] "under_represented_pvalue" "numDEInCat"              
## [5] "numInCat"                 "term"                    
## [7] "ontology"                 "qvalue"
## The over_represented_pvalue column is null, defaulting to score.
## Possible columns are:
## [1] "term"    "pvalue"  "score"   "num_de"  "num_cat"
## The score column is null, defaulting to score.
## Possible columns are:
## [1] "category"                 "over_represented_pvalue" 
## [3] "under_represented_pvalue" "numDEInCat"              
## [5] "numInCat"                 "term"                    
## [7] "ontology"                 "qvalue"
## The over_represented_pvalue column is null, defaulting to score.
## Possible columns are:
## [1] "term"    "pvalue"  "score"   "num_de"  "num_cat"
## The score column is null, defaulting to score.
## Possible columns are:
## [1] "category"                 "over_represented_pvalue" 
## [3] "under_represented_pvalue" "numDEInCat"              
## [5] "numInCat"                 "term"                    
## [7] "ontology"                 "qvalue"
## The over_represented_pvalue column is null, defaulting to score.
## Possible columns are:
## [1] "term"    "pvalue"  "score"   "num_de"  "num_cat"
dotplot(both_time_goseq$mf_enrich)

2.4.2.2 Compare times vs strains

I always worry that comparing a data set across multiple conditions results in not what I think it will. Let us therefore plot the logFCs of likely contrasts against each other.

x_axis <- time_tables[["data"]][["transition_exponential"]][, c("deseq_logfc", "deseq_adjp")]
y_axis <- strain_tables[["data"]][["delta_wt"]][, c("deseq_logfc", "deseq_adjp")]
both <- merge(x_axis, y_axis, by="row.names")
rownames(both) <- both[["Row.names"]]
both[["Row.names"]] <- NULL
cor.test(both[["deseq_logfc.x"]], both[["deseq_logfc.y"]], method="spearman")
## Warning in cor.test.default(both[["deseq_logfc.x"]], both[["deseq_logfc.y"]], :
## Cannot compute exact p-value with ties
## 
##  Spearman's rank correlation rho
## 
## data:  both[["deseq_logfc.x"]] and both[["deseq_logfc.y"]]
## S = 6.2e+08, p-value <2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##    rho 
## 0.3811
plotted <- plot_linear_scatter(both[, c("deseq_logfc.x", "deseq_logfc.y")])
## Warning in plot_multihistogram(df): NAs introduced by coercion
plotted$scatter

2.4.3 Interaction model

I want to make sure that my methods of performing interaction models work as I think it does, and this data set looks to me to be a perfect place to test that.

combined_factors <- paste0(pData(rofa_expt)[["genotype"]], "_",
                           pData(rofa_expt)[["growthphase"]])
combined_factors
##  [1] "WT_transition"      "WT_transition"      "WT_transition"     
##  [4] "WT_transition"      "delta_transition"   "delta_transition"  
##  [7] "delta_transition"   "delta_transition"   "revert_transition" 
## [10] "revert_transition"  "revert_transition"  "revert_transition" 
## [13] "WT_exponential"     "WT_exponential"     "WT_exponential"    
## [16] "WT_exponential"     "delta_exponential"  "delta_exponential" 
## [19] "delta_exponential"  "delta_exponential"  "revert_exponential"
## [22] "revert_exponential" "revert_exponential" "WT_stationary"     
## [25] "WT_stationary"      "WT_stationary"      "delta_stationary"  
## [28] "delta_stationary"   "delta_stationary"   "delta_stationary"  
## [31] "revert_stationary"  "revert_stationary"  "revert_stationary" 
## [34] "revert_stationary"  "revert_exponential" "WT_stationary"
combined_expt <- set_expt_conditions(rofa_expt, fact = combined_factors)
## 
##  delta_exponential   delta_stationary   delta_transition revert_exponential 
##                  4                  4                  4                  4 
##  revert_stationary  revert_transition     WT_exponential      WT_stationary 
##                  4                  4                  4                  4 
##      WT_transition 
##                  4
combined_de <- all_pairwise(combined_expt, model_batch = "svaseq", surrogates = 2,
                            filter = TRUE, parallel = FALSE)
## This DE analysis will perform all pairwise comparisons among:
## 
##  delta_exponential   delta_stationary   delta_transition revert_exponential 
##                  4                  4                  4                  4 
##  revert_stationary  revert_transition     WT_exponential      WT_stationary 
##                  4                  4                  4                  4 
##      WT_transition 
##                  4
## This analysis will include surrogate estimates from: svaseq.
## This will pre-filter the input data using normalize_expt's: TRUE argument.
## Removing 0 low-count genes (1823 remaining).
## Setting 3755 low elements to zero.
## transform_counts: Found 3755 values equal to 0, adding 1 to the matrix.
## Starting basic_pairwise().
## Starting basic pairwise comparison.
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## Basic step 0/3: Transforming data.
## Basic step 1/3: Creating mean and variance tables.
## Basic step 2/3: Performing 45 comparisons.
## Basic step 3/3: Creating faux DE Tables.
## Basic: Returning tables.
## Starting deseq_pairwise().
## Starting DESeq2 pairwise comparisons.
## The data should be suitable for EdgeR/DESeq/EBSeq. If they freak out, check the state of the count table and ensure that it is in integer counts.
## Including batch estimates from sva/ruv/pca in the model.
## Choosing the non-intercept containing model.
## DESeq2 step 1/5: Including a matrix of batch estimates in the deseq model.
## converting counts to integer mode
## DESeq2 step 2/5: Estimate size factors.
## DESeq2 step 3/5: Estimate dispersions.
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Using a parametric fitting seems to have worked.
## DESeq2 step 4/5: nbinomWaldTest.
## Starting edger_pairwise().
## Starting edgeR pairwise comparisons.
## The data should be suitable for EdgeR/DESeq/EBSeq. If they freak out, check the state of the count table and ensure that it is in integer counts.
## Including batch estimates from sva/ruv/pca in the model.
## Choosing the non-intercept containing model.
## EdgeR step 1/9: Importing and normalizing data.
## EdgeR step 2/9: Estimating the common dispersion.
## EdgeR step 3/9: Estimating dispersion across genes.
## EdgeR step 4/9: Estimating GLM Common dispersion.
## EdgeR step 5/9: Estimating GLM Trended dispersion.
## EdgeR step 6/9: Estimating GLM Tagged dispersion.
## EdgeR step 7/9: Running glmFit, switch to glmQLFit by changing the argument 'edger_test'.
## EdgeR step 8/9: Making pairwise contrasts.

## Starting limma_pairwise().
##                        Length Class         Mode     
## title                   1     -none-        character
## notes                   1     -none-        character
## initial_metadata       28     data.frame    list     
## expressionset           1     ExpressionSet S4       
## design                 28     data.frame    list     
## conditions             36     -none-        character
## batches                36     -none-        character
## samplenames            36     -none-        character
## colors                 36     -none-        character
## state                   5     -none-        list     
## libsize                36     -none-        numeric  
## original_expressionset  1     ExpressionSet S4       
## normalized              6     -none-        list     
## best_libsize           36     -none-        numeric  
## norm_result             6     -none-        list
## Starting limma pairwise comparison.
## libsize was not specified, this parameter has profound effects on limma's result.
## Using the libsize from expt$best_libsize.
## Limma step 1/6: choosing model.
## Including batch estimates from sva/ruv/pca in the model.
## Choosing the non-intercept containing model.
## Limma step 2/6: running limma::voom(), switch with the argument 'which_voom'.
## Using normalize.method = quantile for voom.

## Limma step 3/6: running lmFit with method: ls.
## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)
## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.
## Limma step 6/6: Writing limma outputs.
## Limma step 6/6: 1/36: Creating table: deltastationary_vs_deltaexponential.  Adjust = BH
## Limma step 6/6: 2/36: Creating table: deltatransition_vs_deltaexponential.  Adjust = BH
## Limma step 6/6: 3/36: Creating table: revertexponential_vs_deltaexponential.  Adjust = BH
## Limma step 6/6: 4/36: Creating table: revertstationary_vs_deltaexponential.  Adjust = BH
## Limma step 6/6: 5/36: Creating table: reverttransition_vs_deltaexponential.  Adjust = BH
## Limma step 6/6: 6/36: Creating table: WTexponential_vs_deltaexponential.  Adjust = BH
## Limma step 6/6: 7/36: Creating table: WTstationary_vs_deltaexponential.  Adjust = BH
## Limma step 6/6: 8/36: Creating table: WTtransition_vs_deltaexponential.  Adjust = BH
## Limma step 6/6: 9/36: Creating table: deltatransition_vs_deltastationary.  Adjust = BH
## Limma step 6/6: 10/36: Creating table: revertexponential_vs_deltastationary.  Adjust = BH
## Limma step 6/6: 11/36: Creating table: revertstationary_vs_deltastationary.  Adjust = BH
## Limma step 6/6: 12/36: Creating table: reverttransition_vs_deltastationary.  Adjust = BH
## Limma step 6/6: 13/36: Creating table: WTexponential_vs_deltastationary.  Adjust = BH
## Limma step 6/6: 14/36: Creating table: WTstationary_vs_deltastationary.  Adjust = BH
## Limma step 6/6: 15/36: Creating table: WTtransition_vs_deltastationary.  Adjust = BH
## Limma step 6/6: 16/36: Creating table: revertexponential_vs_deltatransition.  Adjust = BH
## Limma step 6/6: 17/36: Creating table: revertstationary_vs_deltatransition.  Adjust = BH
## Limma step 6/6: 18/36: Creating table: reverttransition_vs_deltatransition.  Adjust = BH
## Limma step 6/6: 19/36: Creating table: WTexponential_vs_deltatransition.  Adjust = BH
## Limma step 6/6: 20/36: Creating table: WTstationary_vs_deltatransition.  Adjust = BH
## Limma step 6/6: 21/36: Creating table: WTtransition_vs_deltatransition.  Adjust = BH
## Limma step 6/6: 22/36: Creating table: revertstationary_vs_revertexponential.  Adjust = BH
## Limma step 6/6: 23/36: Creating table: reverttransition_vs_revertexponential.  Adjust = BH
## Limma step 6/6: 24/36: Creating table: WTexponential_vs_revertexponential.  Adjust = BH
## Limma step 6/6: 25/36: Creating table: WTstationary_vs_revertexponential.  Adjust = BH
## Limma step 6/6: 26/36: Creating table: WTtransition_vs_revertexponential.  Adjust = BH
## Limma step 6/6: 27/36: Creating table: reverttransition_vs_revertstationary.  Adjust = BH
## Limma step 6/6: 28/36: Creating table: WTexponential_vs_revertstationary.  Adjust = BH
## Limma step 6/6: 29/36: Creating table: WTstationary_vs_revertstationary.  Adjust = BH
## Limma step 6/6: 30/36: Creating table: WTtransition_vs_revertstationary.  Adjust = BH
## Limma step 6/6: 31/36: Creating table: WTexponential_vs_reverttransition.  Adjust = BH
## Limma step 6/6: 32/36: Creating table: WTstationary_vs_reverttransition.  Adjust = BH
## Limma step 6/6: 33/36: Creating table: WTtransition_vs_reverttransition.  Adjust = BH
## Limma step 6/6: 34/36: Creating table: WTstationary_vs_WTexponential.  Adjust = BH
## Limma step 6/6: 35/36: Creating table: WTtransition_vs_WTexponential.  Adjust = BH
## Limma step 6/6: 36/36: Creating table: WTtransition_vs_WTstationary.  Adjust = BH
## Limma step 6/6: 1/9: Creating table: deltaexponential.  Adjust = BH
## Limma step 6/6: 2/9: Creating table: deltastationary.  Adjust = BH
## Limma step 6/6: 3/9: Creating table: deltatransition.  Adjust = BH
## Limma step 6/6: 4/9: Creating table: revertexponential.  Adjust = BH
## Limma step 6/6: 5/9: Creating table: revertstationary.  Adjust = BH
## Limma step 6/6: 6/9: Creating table: reverttransition.  Adjust = BH
## Limma step 6/6: 7/9: Creating table: WTexponential.  Adjust = BH
## Limma step 6/6: 8/9: Creating table: WTstationary.  Adjust = BH
## Limma step 6/6: 9/9: Creating table: WTtransition.  Adjust = BH
##                        Length Class         Mode     
## title                   1     -none-        character
## notes                   1     -none-        character
## initial_metadata       28     data.frame    list     
## expressionset           1     ExpressionSet S4       
## design                 28     data.frame    list     
## conditions             36     -none-        character
## batches                36     -none-        character
## samplenames            36     -none-        character
## colors                 36     -none-        character
## state                   5     -none-        list     
## libsize                36     -none-        numeric  
## original_expressionset  1     ExpressionSet S4       
## normalized              6     -none-        list     
## best_libsize           36     -none-        numeric  
## norm_result             6     -none-        list
## Comparing analyses.

combined_keepers <- list(
    "exponential_delta_vs_wt" = c("deltaexponential", "WTexponential"),
    "exponential_delta_vs_revert" = c("deltaexponential", "revertexponential"),
    "stationary_delta_vs_wt" = c("deltastationary", "WTstationary"),
    "stationary_delta_vs_revert" = c("deltastationary", "revertstationary"),
    "transition_delta_vs_wt" = c("deltatransition", "WTtransition"),
    "transition_delta_vs_revert" = c("deltatransition", "reverttransition"),
    "WT_exponential_vs_transition" = c("WTexponential", "WTtransition"),
    "delta_exponential_vs_transition" = c("deltaexponential", "deltatransition"),
    "revert_exponential_vs_transition" = c("revertexponential", "reverttransition"),
    "WT_stationary_vs_transition" = c("WTstationary", "WTtransition"),
    "delta_stationary_vs_transition" = c("deltastationary", "deltatransition"),
    "revert_stationary_vs_transition" = c("revertstationary", "reverttransition"),
    "WT_stationary_vs_exponential" = c("WTstationary", "WTexponential"),
    "delta_stationary_vs_exponential" = c("deltastationary", "deltaexponential"),
    "revert_stationary_vs_exponential" = c("revertstationary", "revertexponential"))


combined_tables <- combine_de_tables(
    combined_de, keepers = combined_keepers,
    excel = glue::glue("excel/rofa_combined_tables-v{ver}.xlsx"))
## Deleting the file excel/rofa_combined_tables-v202301.xlsx before writing the tables.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of WTexponential_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTexponential_vs_deltaexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WTexponential_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of revertexponential_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of revertexponential_vs_deltaexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of revertexponential_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of WTstationary_vs_deltastationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTstationary_vs_deltastationary according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WTstationary_vs_deltastationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of revertstationary_vs_deltastationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of revertstationary_vs_deltastationary according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of revertstationary_vs_deltastationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of WTtransition_vs_deltatransition according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_deltatransition according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_deltatransition according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of reverttransition_vs_deltatransition according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_deltatransition according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_deltatransition according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of WTtransition_vs_WTexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_WTexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_WTexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of deltatransition_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of deltatransition_vs_deltaexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of deltatransition_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of reverttransition_vs_revertexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_revertexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_revertexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of WTtransition_vs_WTstationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_WTstationary according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_WTstationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of deltatransition_vs_deltastationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of deltatransition_vs_deltastationary according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of deltatransition_vs_deltastationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: TRUE.
## Plotting volcano plot of the DE results of reverttransition_vs_revertstationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_revertstationary according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_revertstationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: FALSE.
## Plotting volcano plot of the DE results of WTstationary_vs_WTexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTstationary_vs_WTexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WTstationary_vs_WTexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: FALSE.
## Plotting volcano plot of the DE results of deltastationary_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of deltastationary_vs_deltaexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of deltastationary_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Starting combine_extracted_plots() with do_inverse as: FALSE.
## Plotting volcano plot of the DE results of revertstationary_vs_revertexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of revertstationary_vs_revertexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of revertstationary_vs_revertexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
combined_sig <- extract_significant_genes(
    combined_tables,
    excel = glue::glue("excel/rofa_combined_sig-v{ver}.xlsx"))
## Deleting the file excel/rofa_combined_sig-v202301.xlsx before writing the tables.
## Plotting volcano plot of the DE results of WTexponential_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of revertexponential_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTstationary_vs_deltastationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of revertstationary_vs_deltastationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_deltatransition according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_deltatransition according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_WTexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of deltatransition_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_revertexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_WTstationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of deltatransition_vs_deltastationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_revertstationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTstationary_vs_WTexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of deltastationary_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of revertstationary_vs_revertexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTexponential_vs_deltaexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of revertexponential_vs_deltaexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WTstationary_vs_deltastationary according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of revertstationary_vs_deltastationary according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_deltatransition according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_deltatransition according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_WTexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of deltatransition_vs_deltaexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_revertexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_WTstationary according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of deltatransition_vs_deltastationary according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_revertstationary according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WTstationary_vs_WTexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of deltastationary_vs_deltaexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of revertstationary_vs_revertexponential according to the columns: logFC and FDR using the expressionset colors.
## Plotting volcano plot of the DE results of WTexponential_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of revertexponential_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTstationary_vs_deltastationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of revertstationary_vs_deltastationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_deltatransition according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_deltatransition according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_WTexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of deltatransition_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_revertexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTtransition_vs_WTstationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of deltatransition_vs_deltastationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of reverttransition_vs_revertstationary according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of WTstationary_vs_WTexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of deltastationary_vs_deltaexponential according to the columns: logFC and adj.P.Val using the expressionset colors.
## Plotting volcano plot of the DE results of revertstationary_vs_revertexponential according to the columns: logFC and adj.P.Val using the expressionset colors.

3 Make a circos picture

Work on making a pretty picture, there are kind of a lot of potential contrasts, so lets just start with one.

circos_cfg <- circos_prefix(fData(rofa_expt), name="rofa", chr_column = "seqnames",
                            start_column = "start.x", stop_column = "end",
                            strand_column = "strand.x", cog_column = "COGFun")
## This assumes you have a colors.conf in circos/colors/ and fonts.conf in circos/fonts/
## It also assumes you have conf/ideogram.conf, conf/ticks.conf, and conf/housekeeping.conf
## It will write circos/conf/rofa.conf with a reasonable first approximation config file.
## Setting 86 undefined entries to the plus strand.
## Warning in circos_prefix(fData(rofa_expt), name = "rofa", chr_column =
## "seqnames", : NAs introduced by coercion

## Warning in circos_prefix(fData(rofa_expt), name = "rofa", chr_column =
## "seqnames", : NAs introduced by coercion
## Wrote karyotype to circos/conf/ideograms/rofa.conf
## This should match the ideogram= line in rofa.conf
## Wrote ticks to circos/conf/ticks_rofa.conf
kary <- circos_karyotype(circos_cfg, fasta = "reference/spyogenes_5005.fasta")
## Wrote karyotype to circos/conf/karyotypes/rofa.conf
## This should match the karyotype= line in rofa.conf
plus_minus <- circos_plus_minus(circos_cfg, width = 0.06, thickness = 40,
                                url_string = "http://www.microbesonline.org/cgi-bin/fetchLocus.cgi?locus=[id]")
## Writing data file: circos/data/rofa_plus_go.txt with the + strand GO data.
## Writing data file: circos/data/rofa_minus_go.txt with the - strand GO data.
## Wrote the +/- config files.  Appending their inclusion to the master file.
## Returning the inner width: 0.88.  Use it as the outer for the next ring.
exp_delta_vs_wt_hist <- circos_hist(circos_cfg, combined_tables[["data"]][["exponential_delta_vs_wt"]],
                                    colname="deseq_logfc", basename="exp_deltawt",
                                    outer=plus_minus)
## Assuming the input is a dataframe.
## Writing data file: circos/data/rofa_exp_deltawtdeseq_logfc_hist.txt with the exp_deltawtdeseq_logfc column.
## Returning the inner width: 0.8.  Use it as the outer for the next ring.
cjb_suffix <- circos_suffix(circos_cfg)
made <- circos_make(circos_cfg, target="rofa")

circos/rofa.png

if (!isTRUE(get0("skip_load"))) {
  pander::pander(sessionInfo())
  message(paste0("This is hpgltools commit: ", get_git_commit()))
  this_save <- paste0(gsub(pattern = "\\.Rmd", replace = "", x = rmd_file), "-v", ver, ".rda.xz")
  message(paste0("Saving to ", this_save))
  tmp <- sm(saveme(filename = this_save))
}
## If you wish to reproduce this exact build of hpgltools, invoke the following:
## > git clone http://github.com/abelew/hpgltools.git
## > git reset 140f18734ff1a7e49669ad91dd4f469f0560ca1a
## This is hpgltools commit: Sun Jan 29 15:36:20 2023 -0500: 140f18734ff1a7e49669ad91dd4f469f0560ca1a
## Saving to index_rofA-v202301.rda.xz
---
title: "S.pyogenes 5448 202301: RNASeq comparing rofA strains across time."
author: "atb abelew@gmail.com"
date: "`r Sys.Date()`"
output:
 html_document:
  code_download: true
  code_folding: show
  fig_caption: true
  fig_height: 7
  fig_width: 7
  highlight: default
  keep_md: false
  mode: selfcontained
  number_sections: true
  self_contained: true
  theme: readable
  toc: true
  toc_float:
    collapsed: false
    smooth_scroll: false
---

<style>
  body .main-container {
    max-width: 1600px;
  }
</style>

```{r options, include = FALSE}
library(hpgltools)
tt <- devtools::load_all("~/hpgltools")
knitr::opts_knit$set(progress = TRUE,
                     verbose = TRUE,
                     width = 90,
                     echo = TRUE)
knitr::opts_chunk$set(error = TRUE,
                      fig.width = 8,
                      fig.height = 8,
                      dpi = 96)
old_options <- options(digits = 4,
                       stringsAsFactors = FALSE,
                       knitr.duplicate.label = "allow")
ggplot2::theme_set(ggplot2::theme_bw(base_size = 10))
set.seed(1)
ver <- "202301"
rmd_file <- "index_rofA.Rmd"
```

Note, I am using my previous worksheet from when last I worked with
Rezia as a template for this (I copied it from there and am modifying
it now).

# S.pyogenes 5448 rofA RNASeq version: `r ver`

## Preprocessing

I used my cyoa tool to process these samples by doing the following:

1.  Copying the data from the sequencer into the directory 'preprocessing/'
2.  Used a slightly involved shell command to create a directory for
    each sample and copy the reads for it to the 'unprocessed/'
    subdirectory within it.
3.  invoked the following:

```{bash, eval=FALSE}
cd preprocessing
start=$(pwd)
for i in $(/bin/ls -d ./*)
do
  cd $i
  rm -rf outputs scripts
  cyoa --task pipe --method prnas --species spyogenes_5448 \
       --gff_type gene --gff_tag locus_tag \
       --input $(/bin/ls unprocessed/* | tr '\n' ':' | sed 's/:$//g')
  cd $start
done
```

The above for loop goes into each sample and does the following:

1.  Trims the data, heavily compresses the outputs.
2.  Runs fastqc
3.  Runs hisat2 using my spyogenes_5448 indices.
4.  Converts the sam alignment to sorted/indexed bam.
5.  Makes a couple of extra copies of it with some filters.
6.  Compresses the aligned/unaligned reads.
7.  Runs htseq-count on the alignments to count reads/gene.

  Note the following steps were not actually run because I had a
  speeling error.  But since they are not necessary for the explicitly
  RNASeq analyses I first want to do, I ignored it.  I am curious
  though to see if there are other mutations in these strains, so I
  will likely run those portions manually.

8.  Runs freebayes on the alignments to look for variants.
9.  Sorts/compresses the freebayes output.
10. Does some parsing of the freebayes output and provides some tables
    about where mutations were observed.

## Collect annotation information

Same two primary annotation sources, the gff file used for mapping/counting,
and microbesonline.org.  Note that since I moved to just downloading the
material from the web interface, I no longer have a handy method to get the
taxon ID, so I go there and hunt down the taxId manually.

Now that I am thinking about it, my 5448 genome/annotations are kind
of old, I will ask and check to see if there is anything newer.

Also, 5448 does not have an entry at microbesonline.org, a fact which
I forgot.  I need to go poking in my notes to reconnect 5005 and 5448.

```{r annotation}
gff_annot <- load_gff_annotations("reference/spyogenes_5448.gff", type = "gene")
rownames(gff_annot) <- gff_annot[["locus_tag"]]
head(gff_annot)

mgas_data <- load_genbank_annotations(accession="CP008776")
genome_size <- GenomicRanges::width(mgas_data$seq)  ## This fails on travis?
mgas_cds <- as.data.frame(mgas_data$cds)
## Get rid of amino acid sequence
rownames(mgas_cds) <- mgas_cds[["locus_tag"]]
wanted <- ! colnames(mgas_cds) %in% c("translation", "type", "strand", "seqnames", "start", "end", "locus_tag", "note", "gene", "gene_synonym", "width")
mgas_cds <- mgas_cds[, wanted]
## And EC_number because wtf is that?
mgas_annot <- merge(mgas_cds, gff_annot, by="row.names")
rownames(mgas_annot) <- mgas_annot[["Row.names"]]
mgas_annot[["Row.names"]] <- NULL
```

# 5005 Annotations

I remapped the data using 5005 because the annotations are much better.

```{r spyogenes_5005_annot}
ref_gff_annot <- load_gff_annotations("reference/spyogenes_5005.gff", type="CDS")
rownames(ref_gff_annot) <- ref_gff_annot[["locus_tag"]]

microbes_annot <- load_microbesonline_annotations(species="5005")
microbes_df <- as.data.frame(microbes_annot)
rownames(microbes_df) <- make.names(gsub(x=microbes_df[["sysName"]], pattern="Spy_", replacement="Spy"), unique=TRUE)

s5005_annot <- merge(ref_gff_annot, microbes_df, by="row.names")
rownames(s5005_annot) <- s5005_annot[["Row.names"]]
s5005_annot[["Row.names"]] <- NULL

microbes_go <- load_microbesonline_go(species="5005", id_column="sysName")
microbes_go[["sysName"]] <- gsub(x=microbes_go[["sysName"]], pattern="Spy_", replacement="Spy")
colnames(microbes_go) <- c("ID", "GO")
```

```{r combine_strain_annotations}
single_hits <- readr::read_tsv("single_multi/outputs/fasta_spyogenes_5448_cds_spyogenes_5005_cds/spyogenes_5448_cds_singles.txt")
colnames(single_hits) <- c("from", "to")
single_hits[["to"]] <- gsub(pattern="(^.*?):.*$", replacement="\\1",
                            x=single_hits[["to"]], perl=TRUE)
## The gff annotation has the protein_id column matching this cds entry.
single_both <- merge(mgas_annot, single_hits, by.x="protein_id", by.y="from", all.x=TRUE)
single_both <- merge(single_both, ref_gff_annot, by.x="to", by.y="protein_id", all.x=TRUE)

## Now pick up the missing old_locus_tags and put in the 5448 IDs
missing_ids <- is.na(single_both[["old_locus_tag"]])
new_ids <- single_both[missing_ids, "locus_tag.x"]
single_both[missing_ids, "old_locus_tag"] <- new_ids

single_both <- merge(single_both, microbes_annot, by.x="old_locus_tag", by.y="sysName", all.x=TRUE)
rownames(single_both) <- make.names(single_both[["gene_id"]], unique=TRUE)
```

## Create expressionSet

Note that I did the following to the sample sheet provided by
Dr. McIver:

1.  Changed the dP_R20_4 sample to dP_R20_2 (The original sample name
    is still there are the original column).
2.  Added columns at the end containing the count table locations.
3.  Added columns 'short_media', 'growth_phase', 'genotype' which
    hopefully contain the relevant metadata extracted from the sample
    descriptions.
4.  Added a column 'Experiment' which is either 'rofA' or 'pdxR'.

```{r expt}
rofa_expt <- create_expt(metadata="sample_sheets/all_samples.xlsx",
                        gene_info=s5005_annot, file_column="spyogenes5005genecounts") %>%
  subset_expt(subset="experiment=='rofA'") %>%
  set_expt_conditions(fact="genotype") %>%
  set_expt_batches(fact="growthphase")
```

I have 36 samples to play with, let us see what they look like.

## Poke expressionSet

```{r poke}
rofa_libsize <- plot_libsize(rofa_expt)
rofa_libsize$plot

rofa_filter_plot <- plot_libsize_prepost(rofa_expt)
rofa_filter_plot$count_plot
rofa_filter_plot$lowgene_plot

rofa_nonzero <- plot_nonzero(rofa_expt)
rofa_nonzero$plot
## awesome, there is a range of ~ 25 genes.

written <- write_expt(rofa_expt, excel = glue::glue("excel/rofA_expt-v{ver}.xlsx"))
```

## Quick visualizations

### Without considering time

Let us start with some views of the data without thinking about batch effects.

```{r view_nobatch}
rofa_norm <- normalize_expt(rofa_expt, filter=TRUE, norm="quant", convert="cpm", transform="log2")
rofa_pca <- plot_pca(rofa_norm)
rofa_pca$plot

rofa_heatmap <- plot_disheat(rofa_norm)
rofa_heatmap$plot
```

### Considering time

Repeat the previous plots, but this time using limma's batch removal
method (which is just a residuals).

```{r nb_view}
rofa_time <- set_expt_conditions(rofa_expt, fact="growthphase") %>%
  set_expt_batches(fact="genotype")

rofa_time_norm <- normalize_expt(rofa_time, filter=TRUE, norm="quant", convert="cpm",
                                 transform="log2")
rofa_time_pca <- plot_pca(rofa_time_norm)
rofa_time_pca$plot

rofa_time_nb <- normalize_expt(rofa_time, filter=TRUE, norm="quant", convert="cpm",
                             transform="log2", batch="limma")
rofa_time_nb_pca <- plot_pca(rofa_time_nb)
rofa_time_nb_pca$plot
```

Well, it is pretty clear that time is the dominant factor.

## Differential Expression analyses

I am going to do the DE in 4 separate pieces:

1.  Only compare strains
2.  Only compare times
3.  Compare the concatenation of strains and times.

### Strain only comparisons

```{r pairwise_strain}
strain_de <- all_pairwise(rofa_expt, model_batch=TRUE, filter=TRUE)
strain_keepers <- list(
    "delta_wt" = c("delta", "WT"),
    "revert_wt" = c("revert", "WT"),
    "delta_revert" = c("delta", "revert"))
strain_tables <- combine_de_tables(
    strain_de, keepers = strain_keepers,
    excel = glue::glue("excel/rofa_strain_tables-v{ver}.xlsx"))
strain_sig <- extract_significant_genes(
    strain_tables,
    excel = glue::glue("excel/rofa_strain_sig-v{ver}.xlsx"))
```

#### Ontology search of delta vs wt

```{r goseq_delta_wt01}
ups <- strain_sig[["deseq"]][["ups"]][["delta_wt"]]
downs <- strain_sig[["deseq"]][["downs"]][["delta_wt"]]
both <- rbind(ups, downs)
length_db <- fData(rofa_expt)[, c("start.y", "stop")]
length_db[["ID"]] <- rownames(length_db)
undef_idx <- length_db[["stop"]] == "undefined"
length_db <- length_db[!undef_idx, ]
length_db[["length"]] <- abs(as.numeric(length_db[["stop"]]) - as.numeric(length_db[["start.y"]]))
test <- simple_goseq(both, go_db = microbes_go, length_db = length_db)
library(enrichplot)
dotplot(test$bp_enrich)
```

### Time only comparisons

```{r pairwise_time}
time_de <- all_pairwise(rofa_time, model_batch=TRUE)
time_keepers <- list(
    "transition_exponential" = c("transition", "exponential"),
    "stationary_exponential" = c("stationary", "exponential"),
    "stationary_transition" = c("stationary", "transition"))
time_tables <- combine_de_tables(
    time_de, keepers = time_keepers,
    excel = glue::glue("excel/rofa_time_tables-v{ver}.xlsx"))
time_sig <- extract_significant_genes(
    time_tables,
    excel = glue::glue("excel/rofa_time_sig-v{ver}.xlsx"))
```

#### Ontology search of stationary vs exponential

```{r goseq_delta_wt02}
ups <- time_sig[["deseq"]][["ups"]][["stationary_exponential"]]
downs <- time_sig[["deseq"]][["downs"]][["stationary_exponential"]]
both <- rbind(ups, downs)
up_time_goseq <- simple_goseq(ups, go_db = microbes_go, length_db = length_db)

fc_values <- time_tables[["data"]][["stationary_exponential"]][["deseq_logfc"]]
names(fc_values) <- rownames(time_tables[["data"]][["stationary_exponential"]])
head(fc_values)

dotplot(up_time_goseq$mf_enrich)
cnetplot(up_time_goseq$mf_enrich, categorysize="pvalue", foldChange=fc_values)
heatplot(up_time_goseq$mf_enrich, foldChange=fc_values)
term_sim <- pairwise_termsim(up_time_goseq$mf_enrich)
treeplot(term_sim)
emapplot(term_sim)

both_time_goseq <- simple_goseq(both, go_db = microbes_go, length_db = length_db)
dotplot(both_time_goseq$mf_enrich)
```

#### Compare times vs strains

I always worry that comparing a data set across multiple conditions
results in not what I think it will.  Let us therefore plot the logFCs
of likely contrasts against each other.

```{r compare_times_vs_strains}
x_axis <- time_tables[["data"]][["transition_exponential"]][, c("deseq_logfc", "deseq_adjp")]
y_axis <- strain_tables[["data"]][["delta_wt"]][, c("deseq_logfc", "deseq_adjp")]
both <- merge(x_axis, y_axis, by="row.names")
rownames(both) <- both[["Row.names"]]
both[["Row.names"]] <- NULL
cor.test(both[["deseq_logfc.x"]], both[["deseq_logfc.y"]], method="spearman")
plotted <- plot_linear_scatter(both[, c("deseq_logfc.x", "deseq_logfc.y")])
plotted$scatter
```

### Interaction model

I want to make sure that my methods of performing interaction models
work as I think it does, and this data set looks to me to be a perfect
place to test that.

```{r combined_de}
combined_factors <- paste0(pData(rofa_expt)[["genotype"]], "_",
                           pData(rofa_expt)[["growthphase"]])
combined_factors

combined_expt <- set_expt_conditions(rofa_expt, fact = combined_factors)
combined_de <- all_pairwise(combined_expt, model_batch = "svaseq", surrogates = 2,
                            filter = TRUE, parallel = FALSE)

combined_keepers <- list(
    "exponential_delta_vs_wt" = c("deltaexponential", "WTexponential"),
    "exponential_delta_vs_revert" = c("deltaexponential", "revertexponential"),
    "stationary_delta_vs_wt" = c("deltastationary", "WTstationary"),
    "stationary_delta_vs_revert" = c("deltastationary", "revertstationary"),
    "transition_delta_vs_wt" = c("deltatransition", "WTtransition"),
    "transition_delta_vs_revert" = c("deltatransition", "reverttransition"),
    "WT_exponential_vs_transition" = c("WTexponential", "WTtransition"),
    "delta_exponential_vs_transition" = c("deltaexponential", "deltatransition"),
    "revert_exponential_vs_transition" = c("revertexponential", "reverttransition"),
    "WT_stationary_vs_transition" = c("WTstationary", "WTtransition"),
    "delta_stationary_vs_transition" = c("deltastationary", "deltatransition"),
    "revert_stationary_vs_transition" = c("revertstationary", "reverttransition"),
    "WT_stationary_vs_exponential" = c("WTstationary", "WTexponential"),
    "delta_stationary_vs_exponential" = c("deltastationary", "deltaexponential"),
    "revert_stationary_vs_exponential" = c("revertstationary", "revertexponential"))


combined_tables <- combine_de_tables(
    combined_de, keepers = combined_keepers,
    excel = glue::glue("excel/rofa_combined_tables-v{ver}.xlsx"))
combined_sig <- extract_significant_genes(
    combined_tables,
    excel = glue::glue("excel/rofa_combined_sig-v{ver}.xlsx"))
```

# Make a circos picture

Work on making a pretty picture, there are kind of a lot of potential contrasts, so lets just
start with one.

```{r circos_result}
circos_cfg <- circos_prefix(fData(rofa_expt), name="rofa", chr_column = "seqnames",
                            start_column = "start.x", stop_column = "end",
                            strand_column = "strand.x", cog_column = "COGFun")
kary <- circos_karyotype(circos_cfg, fasta = "reference/spyogenes_5005.fasta")
plus_minus <- circos_plus_minus(circos_cfg, width = 0.06, thickness = 40,
                                url_string = "http://www.microbesonline.org/cgi-bin/fetchLocus.cgi?locus=[id]")
exp_delta_vs_wt_hist <- circos_hist(circos_cfg, combined_tables[["data"]][["exponential_delta_vs_wt"]],
                                    colname="deseq_logfc", basename="exp_deltawt",
                                    outer=plus_minus)
cjb_suffix <- circos_suffix(circos_cfg)
made <- circos_make(circos_cfg, target="rofa")
```

[circos/rofa.png](circos/rofa.png)


```{r saveme}
if (!isTRUE(get0("skip_load"))) {
  pander::pander(sessionInfo())
  message(paste0("This is hpgltools commit: ", get_git_commit()))
  this_save <- paste0(gsub(pattern = "\\.Rmd", replace = "", x = rmd_file), "-v", ver, ".rda.xz")
  message(paste0("Saving to ", this_save))
  tmp <- sm(saveme(filename = this_save))
}
```
