This file contains the code used to generate all of the tables and figures for the transcriptome analysis portion of this study. In addition to the figures and tables in the main text, several supplementary figures and tables are included here.
If you would like to regenerate all of the results from this text, see the section on “Reproducing this analysis” below.
This PDF was generated using the knitr and rmarkdown packages for R.
To reproduce the figures and results in this PDF, download a copy of the source .Rmd file and associated annotations and count tables.
Using the Bioconductor biocLite() function, install the following dependencies:
Next, open up an R console in the directory containing XXX.Rmd, and run:
rmarkdown::render('XXX.Rmd', output_format='pdf_document')
Note that you can also generate an HTML version of the output by switching pdf_document for html_document in the function call above.
If all of the dependencies specified above are properly installed, and the versions are not significantly different from those used at the time of writing this manuscript, then you should be able to regenerate all of the tables and figures in the PDF as they appear below.
To find out which specific library versions were used, refer to the “System Information” section at the bottom of this document.
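As a quick check before rendering, the installed package versions can be compared against those recorded in the "System Information" section. The following is a minimal sketch in base R, not part of the original analysis: the helper name `compare_versions` is hypothetical, and the package list is a hand-picked subset of the versions listed in that section.

```r
# Versions copied from the "System Information" section of this document
# (a subset chosen for illustration; extend as needed).
expected <- c(limma = "3.30.4", goseq = "1.26.0", preprocessCore = "1.36.0",
              ggplot2 = "2.2.0", knitr = "1.15.1", rmarkdown = "1.2")

# Hypothetical helper: list the installed version of each package next to
# the recorded one. Packages that are not installed are reported as NA.
compare_versions <- function(expected) {
  installed <- vapply(names(expected), function(pkg) {
    tryCatch(as.character(utils::packageVersion(pkg)),
             error = function(e) NA_character_)
  }, character(1))
  data.frame(package   = names(expected),
             expected  = unname(expected),
             installed = unname(installed),
             matches   = unname(!is.na(installed) & installed == expected),
             stringsAsFactors = FALSE)
}

version_report <- compare_versions(expected)
print(version_report)
```

A mismatch does not necessarily mean the analysis will fail; it only flags where results might differ from those shown here.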
| HPGL_ID | Condition | Reads |
|---|---|---|
| HPGL0130 | Mtb | 6,135,076 |
| HPGL0131 | Mtb | 5,624,817 |
| HPGL0132 | Mtb | 5,290,752 |
| HPGL0133 | MtbΔRv3167c | 6,201,775 |
| HPGL0134 | MtbΔRv3167c | 7,332,840 |
| HPGL0135 | MtbΔRv3167c | 4,581,716 |
| HPGL0136 | MtbΔRv3167c::Comp | 15,053,918 |
| HPGL0137 | MtbΔRv3167c::Comp | 16,388,081 |
| HPGL0138 | MtbΔRv3167c::Comp | 12,962,623 |
RNA-Seq sample counts. The number of RNA-Seq reads successfully mapped for each sample, before any normalization is applied.
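The per-sample read counts in the table above can be summarized per condition to see how sequencing depth differs between groups. This is an illustrative sketch using only the values from that table; the variable names are assumptions.

```r
# Mapped read counts copied from the table above.
# "\u0394" is the Greek capital delta used in the strain names.
reads <- data.frame(
  HPGL_ID   = sprintf("HPGL%04d", 130:138),
  Condition = rep(c("Mtb", "Mtb\u0394Rv3167c", "Mtb\u0394Rv3167c::Comp"),
                  each = 3),
  Reads     = c(6135076, 5624817, 5290752,
                6201775, 7332840, 4581716,
                15053918, 16388081, 12962623),
  stringsAsFactors = FALSE
)

# Total and mean mapped reads per condition.
total_reads <- tapply(reads$Reads, reads$Condition, sum)
mean_reads  <- tapply(reads$Reads, reads$Condition, mean)

# Ratio between the deepest and shallowest library.
depth_ratio <- max(reads$Reads) / min(reads$Reads)

print(total_reads)
print(round(mean_reads))
print(round(depth_ratio, 2))
```

The complemented strain was sequenced considerably deeper than the other two conditions, which is one reason normalization is applied before samples are compared.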
Sample PCA plot. PCA was used to examine the relationship between samples and conditions after count filtering and normalization. The tight clustering of replicates within each condition suggests that each condition represents a distinct transcriptional state.
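The general shape of a sample-level PCA can be sketched with base R's `prcomp`. The example below is only an illustration: it uses the three-gene table of decimal values listed later in this document (assumed here to be the normalized counts), whereas the actual analysis used the full filtered expression matrix, and the log2 transform is an assumption.

```r
# Values for three genes, copied from the tables in this document.
# Three genes are far too few for a meaningful PCA; illustration only.
norm_counts <- rbind(
  Rv3167c = c(134.7778, 143.2222, 114.2778, 374.4444, 76.500, 321.000,
              88543.889, 88543.889, 88543.889),
  Rv3168  = c(1483.0000, 1148.0000, 1193.8333, 8248.0000, 8473.667, 8384.667,
              2194.778, 1871.222, 2271.667),
  Rv3169  = c(1381.3333, 1080.7778, 1159.6667, 5491.5556, 5754.222, 5754.222,
              1111.333, 1019.222, 1150.389)
)
colnames(norm_counts) <- sprintf("HPGL%04d", 130:138)

# Log-transform (an assumption) and run PCA with samples as rows.
log_counts <- log2(norm_counts + 1)
pca <- prcomp(t(log_counts), center = TRUE, scale. = FALSE)

# Fraction of variance captured by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Color points by condition (three consecutive samples per condition).
condition <- rep(1:3, each = 3)
plot(pca$x[, 1], pca$x[, 2], col = condition, pch = 19,
     xlab = sprintf("PC1 (%.1f%%)", 100 * var_explained[1]),
     ylab = sprintf("PC2 (%.1f%%)", 100 * var_explained[2]))
```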
Sample heatmap. Pearson correlation was used to measure the similarity between each pair of RNA-Seq samples, and biclustering was applied to generate a heatmap depicting the relationship between samples.
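A sample-correlation heatmap of this kind can be sketched with base R. The example below is an assumption-laden illustration, not the original code: it uses the three-gene table of decimal values listed later in this document, a log2 transform, complete-linkage clustering, and `stats::heatmap` in place of whatever plotting function the analysis actually used.

```r
# Values for three genes, copied from the tables in this document.
# Correlations over three genes are not meaningful; illustration only.
norm_counts <- rbind(
  Rv3167c = c(134.7778, 143.2222, 114.2778, 374.4444, 76.500, 321.000,
              88543.889, 88543.889, 88543.889),
  Rv3168  = c(1483.0000, 1148.0000, 1193.8333, 8248.0000, 8473.667, 8384.667,
              2194.778, 1871.222, 2271.667),
  Rv3169  = c(1381.3333, 1080.7778, 1159.6667, 5491.5556, 5754.222, 5754.222,
              1111.333, 1019.222, 1150.389)
)
colnames(norm_counts) <- sprintf("HPGL%04d", 130:138)
log_counts <- log2(norm_counts + 1)

# Pearson correlation between every pair of samples (columns).
sample_cor <- cor(log_counts, method = "pearson")

# Cluster samples using 1 - correlation as the distance.
sample_tree <- hclust(as.dist(1 - sample_cor), method = "complete")

# Draw the heatmap with the same dendrogram on rows and columns.
heatmap(sample_cor, Rowv = as.dendrogram(sample_tree), symm = TRUE,
        scale = "none")
```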
| Gene | HPGL0130 | HPGL0131 | HPGL0132 |
|---|---|---|---|
| Rv3167c | 89 | 82 | 80 |
| Rv3168 | 1014 | 686 | 742 |
| Rv3169 | 952 | 645 | 723 |
| Gene | HPGL0133 | HPGL0134 | HPGL0135 |
|---|---|---|---|
| Rv3167c | 263 | 69 | 185 |
| Rv3168 | 6052 | 7508 | 4571 |
| Rv3169 | 3944 | 5066 | 3179 |
| Gene | HPGL0136 | HPGL0137 | HPGL0138 |
|---|---|---|---|
| Rv3167c | 144585 | 198535 | 134943 |
| Rv3168 | 3494 | 3377 | 3364 |
| Rv3169 | 1800 | 1874 | 1715 |
| Gene | HPGL0130 | HPGL0131 | HPGL0132 |
|---|---|---|---|
| Rv3167c | 134.7778 | 143.2222 | 114.2778 |
| Rv3168 | 1483.0000 | 1148.0000 | 1193.8333 |
| Rv3169 | 1381.3333 | 1080.7778 | 1159.6667 |
| Gene | HPGL0133 | HPGL0134 | HPGL0135 |
|---|---|---|---|
| Rv3167c | 374.4444 | 76.500 | 321.000 |
| Rv3168 | 8248.0000 | 8473.667 | 8384.667 |
| Rv3169 | 5491.5556 | 5754.222 | 5754.222 |
| Gene | HPGL0136 | HPGL0137 | HPGL0138 |
|---|---|---|---|
| Rv3167c | 88543.889 | 88543.889 | 88543.889 |
| Rv3168 | 2194.778 | 1871.222 | 2271.667 |
| Rv3169 | 1111.333 | 1019.222 | 1150.389 |
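Quantile normalization, mentioned in the captions of this document, forces every sample to share the same distribution of values. The sketch below is a plain base-R version for illustration, applied to the three-gene integer counts listed above (assumed to be the unnormalized counts). It is not the routine used in the analysis: the `preprocessCore` package listed under "System Information" handles ties differently, and the real normalization ran over all genes, so the output will not match the tables above.

```r
# Integer counts for three genes, copied from the tables above.
raw_counts <- rbind(
  Rv3167c = c(89, 82, 80, 263, 69, 185, 144585, 198535, 134943),
  Rv3168  = c(1014, 686, 742, 6052, 7508, 4571, 3494, 3377, 3364),
  Rv3169  = c(952, 645, 723, 3944, 5066, 3179, 1800, 1874, 1715)
)
colnames(raw_counts) <- sprintf("HPGL%04d", 130:138)

# Illustrative quantile normalization:
#   1. rank the values within each sample,
#   2. average the sorted values across samples to get a target distribution,
#   3. replace each value with the target value of the same rank.
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")
  sorted <- unname(apply(x, 2, sort))
  target <- rowMeans(sorted)
  normalized <- apply(ranks, 2, function(r) target[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

normalized <- quantile_normalize(raw_counts)
print(round(normalized, 1))
```

Because each value is replaced by the target value for its rank, a gene that holds the same rank in several samples receives an identical normalized value in each of them.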
| condition | mean | sd |
|---|---|---|
| Mtb | 1274.94 | 181.63 |
| MtbΔRv3167c | 8368.78 | 113.67 |
| MtbΔRv3167c::Comp | 2112.56 | 212.51 |
| condition | mean | sd |
|---|---|---|
| Mtb | 1207.259 | 155.82747 |
| MtbΔRv3167c | 5666.667 | 151.65067 |
| MtbΔRv3167c::Comp | 1093.648 | 67.34796 |
Rv3168 expression. For each of the three experimental conditions, mRNA was collected and sequenced in triplicate. Above, mean quantile-normalized mRNA counts are shown for Rv3168 in each of the conditions.
Rv3169 expression. For each of the three experimental conditions, mRNA was collected and sequenced in triplicate. Above, mean quantile-normalized mRNA counts are shown for Rv3169 in each of the conditions.
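The per-condition means and standard deviations in the two summary tables can be recomputed from the per-sample values of Rv3168 and Rv3169 listed earlier. The sketch below uses base R only; the helper name `summarize_gene` is hypothetical.

```r
# Per-sample values copied from the tables in this document.
rv3168 <- c(1483.0000, 1148.0000, 1193.8333,
            8248.0000, 8473.667, 8384.667,
            2194.778, 1871.222, 2271.667)
rv3169 <- c(1381.3333, 1080.7778, 1159.6667,
            5491.5556, 5754.222, 5754.222,
            1111.333, 1019.222, 1150.389)

# Three consecutive replicates per condition, in table order.
cond_levels <- c("Mtb", "Mtb\u0394Rv3167c", "Mtb\u0394Rv3167c::Comp")
condition <- factor(rep(cond_levels, each = 3), levels = cond_levels)

# Hypothetical helper: mean and sample standard deviation per condition.
summarize_gene <- function(values, condition) {
  data.frame(condition = levels(condition),
             mean = as.vector(tapply(values, condition, mean)),
             sd   = as.vector(tapply(values, condition, sd)),
             stringsAsFactors = FALSE)
}

rv3168_summary <- summarize_gene(rv3168, condition)
rv3169_summary <- summarize_gene(rv3169, condition)
print(rv3168_summary)
print(rv3169_summary)
```

Up to rounding of the inputs, the results agree with the two condition-level tables above, which identifies the first as Rv3168 and the second as Rv3169.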
Expression of genes in the PDIM operon. Biclustering heatmap showing the normalized expression levels for each gene in the PDIM operon, across all samples.
Mtb vs. MtbΔRv3167c. Limma was used to detect genes that were differentially expressed between Mtb and MtbΔRv3167c samples, identifying a total of 1407 genes. In the above figure, each point represents a single gene, with red points indicating genes that are differentially expressed.
MtbΔRv3167c vs. MtbΔRv3167c::Comp. Limma was used to detect genes that were differentially expressed between MtbΔRv3167c and MtbΔRv3167c::Comp samples, identifying a total of 929 genes. In the above figure, each point represents a single gene, with red points indicating genes that are differentially expressed.
Mtb vs. MtbΔRv3167c::Comp. Limma was used to detect genes that were differentially expressed between Mtb and MtbΔRv3167c::Comp samples, identifying a total of 1404 genes. In the above figure, each point represents a single gene, with red points indicating genes that are differentially expressed.
Determination of putative Rv3167c-regulated genes. To determine which genes are potentially regulated (either directly or indirectly) by Rv3167c, mRNA sequencing was performed on three replicates each of Mtb, MtbΔRv3167c, and MtbΔRv3167c::Comp. Candidate regulated genes were required to be differentially expressed in both the Mtb vs. MtbΔRv3167c contrast and the MtbΔRv3167c vs. MtbΔRv3167c::Comp contrast.
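The selection rule described above amounts to intersecting the differentially expressed gene sets from the two contrasts. A minimal sketch follows; the gene identifiers are hypothetical placeholders, not results from this study, and the variable names are assumptions.

```r
# Hypothetical placeholder gene lists, standing in for the differentially
# expressed genes reported by each contrast.
de_mtb_vs_ko  <- c("geneA", "geneB", "geneC", "geneD")  # Mtb vs. knockout
de_ko_vs_comp <- c("geneB", "geneD", "geneE")           # knockout vs. Comp

# Candidates must be differentially expressed in both contrasts.
candidates <- intersect(de_mtb_vs_ko, de_ko_vs_comp)
print(candidates)
```

Requiring both contrasts filters out changes that appear in the knockout but are not reversed by complementation, which are less likely to be caused by the loss of Rv3167c itself.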
| | category | TERM | numDEInCat | numInCat | over_pval_adj |
|---|---|---|---|---|---|
| 65 | GO:0005618 | cell wall | 295 | 659 | 0.0134636 |
| 312 | GO:0071770 | DIM/DIP cell wall layer assembly | 9 | 9 | 0.0348816 |
| | category | TERM | numDEInCat | numInCat | over_pval_adj |
|---|---|---|---|---|---|
| 312 | GO:0071770 | DIM/DIP cell wall layer assembly | 9 | 9 | 0.0000054 |
| 65 | GO:0005618 | cell wall | 123 | 659 | 0.0010785 |
| 93 | GO:0006633 | fatty acid biosynthetic process | 6 | 7 | 0.0039282 |
| | category | TERM | numDEInCat | numInCat | over_pval_adj |
|---|---|---|---|---|---|
| 65 | GO:0005618 | cell wall | 54 | 659 | 0.010955 |
| | category | TERM | numDEInCat | numInCat | over_pval_adj |
|---|---|---|---|---|---|
| 312 | GO:0071770 | DIM/DIP cell wall layer assembly | 9 | 9 | 0.0000001 |
| 93 | GO:0006633 | fatty acid biosynthetic process | 6 | 7 | 0.0004165 |
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
locale: LC_CTYPE=en_US.UTF-8, LC_NUMERIC=C, LC_TIME=en_US.UTF-8, LC_COLLATE=en_US.UTF-8, LC_MONETARY=en_US.UTF-8, LC_MESSAGES=en_US.UTF-8, LC_PAPER=en_US.UTF-8, LC_NAME=en_US.UTF-8, LC_ADDRESS=en_US.UTF-8, LC_TELEPHONE=en_US.UTF-8, LC_MEASUREMENT=en_US.UTF-8 and LC_IDENTIFICATION=en_US.UTF-8
attached base packages: parallel, stats4, tools, stats, graphics, grDevices, utils, datasets, methods and base
other attached packages: pander(v.0.6.0), viridis(v.0.3.4), knitcitations(v.1.0.7), venneuler(v.1.1-0), rJava(v.0.9-8), preprocessCore(v.1.36.0), limma(v.3.30.4), dplyr(v.0.5.0), genomeIntervals(v.1.30.0), intervals(v.0.15.1), RColorBrewer(v.1.1-2), GO.db(v.3.4.0), AnnotationDbi(v.1.36.0), IRanges(v.2.8.1), S4Vectors(v.0.12.0), Biobase(v.2.34.0), BiocGenerics(v.0.20.0), gplots(v.3.0.1), goseq(v.1.26.0), geneLenDataBase(v.1.10.0), BiasedUrn(v.1.07), ggplot2(v.2.2.0), knitr(v.1.15.1), rmarkdown(v.1.2), nvimcom(v.0.9-25) and colorout(v.1.1-0)
loaded via a namespace (and not attached): httr(v.1.2.1), gtools(v.3.5.0), assertthat(v.0.1), highr(v.0.6), Rsamtools(v.1.26.1), yaml(v.2.1.14), RSQLite(v.1.0.0), backports(v.1.0.4), lattice(v.0.20-34), digest(v.0.6.10), GenomicRanges(v.1.26.1), XVector(v.0.14.0), RefManageR(v.0.13.1), colorspace(v.1.3-1), htmltools(v.0.3.5), Matrix(v.1.2-7.1), plyr(v.1.8.4), XML(v.3.98-1.5), bibtex(v.0.4.0), biomaRt(v.2.30.0), zlibbioc(v.1.20.0), scales(v.0.4.1), gdata(v.2.17.0), BiocParallel(v.1.8.1), tibble(v.1.2), mgcv(v.1.8-16), SummarizedExperiment(v.1.4.0), GenomicFeatures(v.1.26.0), lazyeval(v.0.2.0.9000), RJSONIO(v.1.3-0), magrittr(v.1.5), evaluate(v.0.10), nlme(v.3.1-128), stringr(v.1.1.0), munsell(v.0.4.3), Biostrings(v.2.42.0), GenomeInfoDb(v.1.10.1), caTools(v.1.17.1), grid(v.3.3.2), RCurl(v.1.95-4.8), labeling(v.0.3), bitops(v.1.0-6), gtable(v.0.2.0), DBI(v.0.5-1), R6(v.2.2.0), GenomicAlignments(v.1.10.0), gridExtra(v.2.2.1), lubridate(v.1.6.0), rtracklayer(v.1.34.1), rprojroot(v.1.1), KernSmooth(v.2.23-15), stringi(v.1.1.2) and Rcpp(v.0.12.8)