1 TODO

1.1 202602

  1. Create a factor of {detection}_{library_type} so that we have 4 colors to view the clustering of all together.
  2. Create separate blocks for the different library types etc to make it easier to step through.
  3. Explicit kraken2 of nasal and skin samples – potentially use kmcp and/or deBruijn methods
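
For item 1 above, a minimal sketch of one way to build the combined factor. The column names mirror the sample sheet, but the values (including the "ribozero" label) are invented for illustration. Samples without a 7SL call are kept as their own "undetermined" level, which is my assumption about how they should be handled; without that step paste() would silently produce the string "NA_mRNA".

```r
## Hypothetical metadata: column names follow the sample sheet, values are invented.
meta <- data.frame(
  detectionparasiteby7sl = c("positive", "negative", "positive", "negative", NA),
  library_type = c("mRNA", "mRNA", "ribozero", "ribozero", "mRNA"),
  stringsAsFactors = FALSE)

detection <- meta[["detectionparasiteby7sl"]]
## Assumption: keep samples without a 7SL call as an explicit level.
detection[is.na(detection)] <- "undetermined"
## Join the two columns into one {detection}_{library_type} factor for coloring.
meta[["detection_library"]] <- factor(
  paste(detection, meta[["library_type"]], sep = "_"))
levels(meta[["detection_library"]])
```

If the undetermined samples are dropped before plotting, this leaves the four positive/negative by library type levels.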

1.2 202511

  1. We have some samples which are defined as persistent via 7SL. Can we see some reads in those samples?
  2. Send file with mapped reads per sample for all human samples.
  3. Queue SL read counter for all human samples.
  4. All samples have nasal, skin, PBMC; all patients are nasal + or nasal -; perform contrasts of nasal+/nasal- of the PBMC samples. For the PBMC samples, recast them as +/-
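
Item 4 above amounts to copying each patient's nasal status onto that patient's PBMC sample. A rough sketch follows, assuming the nasal status comes from the 7SL detection column and that each participant has at most one nasal swab; both are assumptions, and the participant codes and values are invented.

```r
## Hypothetical metadata: column names follow the sample sheet, values are invented.
meta <- data.frame(
  participant_code = c("P1", "P1", "P2", "P2", "P3"),
  sample_type = c("nasal swab", "PBMCs", "nasal swab", "PBMCs", "PBMCs"),
  detectionparasiteby7sl = c("positive", NA, "negative", NA, NA),
  stringsAsFactors = FALSE)

## Build a participant -> nasal status lookup from the nasal swabs only.
nasal <- meta[meta[["sample_type"]] == "nasal swab", ]
nasal_status <- setNames(nasal[["detectionparasiteby7sl"]], nasal[["participant_code"]])

## Copy the status to every sample of the same participant;
## participants without a nasal swab get NA.
meta[["nasal_status"]] <- unname(nasal_status[meta[["participant_code"]]])

## Recast the PBMC samples as nasal +/- for the contrast.
pbmc <- meta[meta[["sample_type"]] == "PBMCs", ]
pbmc[["condition"]] <- factor(pbmc[["nasal_status"]], levels = c("negative", "positive"))
pbmc[, c("participant_code", "sample_type", "condition")]
```

PBMC samples whose participant has no nasal call end up with an NA condition and would need to be excluded from the contrast.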

1.3 202601

  1. 7SL positive vs negative for
  2. Add plot of transcriptome clustering of nasal swabs between nasal positive/negative with the caveat that there are only 2 positives, 8 negatives, and 1 undetermined in the stranded library; the ribozero libraries have 5 positive, 3 negative, and 1 undetermined.
  3. Create a relative abundance of bacteria/viruses observed in the nasal samples; counting at genus.

Note that this requires splitting the data into 4 groups: salmon+rz, salmon+mrna, hisat+rz, hisat+mrna.
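
For the relative abundance item in the list above, a minimal sketch of collapsing species-level counts to genus and converting them to per-sample proportions. The count matrix is invented, and taking the first word of the species name as the genus is a shortcut I am assuming for illustration; it will misassign names such as "Proteus virus Isfahan", so the genus-rank rows of the kraken2 report are probably the safer source.

```r
## Hypothetical species-level counts (rows: taxa, columns: samples); values invented.
counts <- matrix(
  c(120, 30, 50, 0,
    10, 90, 0, 100),
  nrow = 4,
  dimnames = list(
    c("Staphylococcus aureus", "Staphylococcus epidermidis",
      "Klebsiella pneumoniae", "Priestia megaterium"),
    c("sample_a", "sample_b")))

## Shortcut (assumption): the first word of each species name is the genus.
genus <- sub(pattern = " .*$", replacement = "", x = rownames(counts))
## rowsum() adds together the rows which share a genus.
genus_counts <- rowsum(counts, group = genus)
## Divide each column by its total so that every sample sums to 1.
relative <- sweep(genus_counts, MARGIN = 2, STATS = colSums(genus_counts), FUN = "/")
relative
```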

2 Changelog

2.1 202511

Following my conversation with Maria Adelaida, I downloaded a new copy of our online sample sheet and made a sub-copy containing only the human samples. It is named (creatively) sample_sheets/human_samples_202511.xlsx.

3 Introduction

I want to use this document to examine our first round of persistence samples. I checked my email from Najib and did not find a sample sheet but did find an explanation of the three sample types we expect.

In preparation for this, I downloaded a new hg38 genome. Since the panamensis assembly has not significantly changed (excepting the putative long read genome which I have not yet seen), I am just using the same one.

4 Loading annotation

The hg38 genome I got is brand new (202405), so do not use the archive for a while.

## Ok, so useast.ensembl is failing today, let us use the jan2024 archive?
#hs_annot <- load_biomart_annotations(archive = FALSE, species = "hsapiens")
## Seems like the 202401 archive is a good choice, it is explicitly the hg38_111 release.
## and it is waaaaay faster (like 100x) than useast right now.
hs_annot <- load_biomart_annotations(archive = FALSE, species = "hsapiens", overwrite = FALSE,
                                     year = 2025, month = "08")
## The biomart annotations file already exists, loading from it.
panamensis_orgdb_idx <- grep(pattern = "^org.+panamen.+MHOM.+db$", x = rownames(installed.packages()))
panamensis_orgdb <- tail(rownames(installed.packages())[panamensis_orgdb_idx], n = 1)
lp_annot <- load_orgdb_annotations(panamensis_orgdb, keytype = "gid")
## Loading required package: AnnotationDbi
## Loading required package: stats4
## Loading required package: BiocGenerics
## Loading required package: generics
## 
## Attaching package: 'generics'
## The following objects are masked from 'package:base':
## 
##     as.difftime, as.factor, as.ordered, intersect, is.element, setdiff, setequal, union
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:hpgltools':
## 
##     annotation<-, conditions, conditions<-
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     anyDuplicated, aperm, append, as.data.frame, basename, cbind, colnames, dirname,
##     do.call, duplicated, eval, evalq, Filter, Find, get, grep, grepl, is.unsorted,
##     lapply, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
##     Position, rank, rbind, Reduce, rownames, sapply, saveRDS, table, tapply, unique,
##     unsplit, which.max, which.min
## Loading required package: Biobase
## Welcome to Bioconductor
## 
##     Vignettes contain introductory material; view with 'browseVignettes()'. To cite
##     Bioconductor, see 'citation("Biobase")', and for packages 'citation("pkgname")'.
## 
## Attaching package: 'Biobase'
## The following object is masked from 'package:hpgltools':
## 
##     notes
## Loading required package: IRanges
## Loading required package: S4Vectors
## 
## Attaching package: 'S4Vectors'
## The following object is masked from 'package:utils':
## 
##     findMatches
## The following objects are masked from 'package:base':
## 
##     expand.grid, I, unname
## 
## Attaching package: 'IRanges'
## The following object is masked from 'package:hpgltools':
## 
##     trim
## 
## Unable to find CDSNAME, setting it to ANNOT_EXTERNAL_DB_NAME.
## Unable to find CDSCHROM in the db, removing it.
## Unable to find CDSSTRAND in the db, removing it.
## Unable to find CDSSTART in the db, removing it.
## Unable to find CDSEND in the db, removing it.
## Extracted all gene ids.
## Attempting to select: ANNOT_EXTERNAL_DB_NAME, GENE_TYPE
## 'select()' returned 1:1 mapping between keys and columns

4.1 org.Hs annotations

Recently there have been problems connecting to Ensembl, so I want fallback annotations from our local dbi annotation databases.

hs_dbi <- load_orgdb_annotations()
## Assuming Homo.sapiens.
## Loading required package: OrganismDbi
## Loading required package: Seqinfo
## Loading required package: GenomicFeatures
## Loading required package: GenomicRanges
## Loading required package: GO.db
## Loading required package: org.Hs.eg.db
## 
## Loading required package: TxDb.Hsapiens.UCSC.hg19.knownGene
## Unable to find GENE_TYPE in the db, removing it.
## Extracted all gene ids.
## Attempting to select: CDSNAME, CDSCHROM, CDSSTRAND, CDSSTART, CDSEND
## 'select()' returned 1:many mapping between keys and columns
## 'select()' returned 1:1 mapping between keys and columns
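
One way to make the fallback automatic is to wrap the two loaders in tryCatch(). The sketch below uses stand-in functions so it runs without network access; load_with_fallback(), flaky_biomart() and local_orgdb() are hypothetical names, not part of hpgltools.

```r
## Hypothetical helper: try the primary loader, use the fallback if it errors.
load_with_fallback <- function(primary, fallback) {
  tryCatch(
    primary(),
    error = function(e) {
      message("Primary annotation source failed: ", conditionMessage(e))
      fallback()
    })
}

## Stand-ins for the real loaders so that the sketch runs offline.
flaky_biomart <- function() stop("useast.ensembl is not responding")
local_orgdb <- function() data.frame(gene_id = "ENSG00000000003", source = "orgdb")

annot <- load_with_fallback(flaky_biomart, local_orgdb)
annot
```

In this document the two arguments would be small wrappers around load_biomart_annotations() and load_orgdb_annotations(). The two sources may not return the same columns or structure, so downstream code would still need to check which one it received.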

This is a little silly, but I am going to reload the annotations using the previous invocation to extract the annotation table without having to think. The previous block loads the orgdb for me, so I can just use that to get the fun annotations.

lp_annot <- load_orgdb_annotations(panamensis_orgdb, keytype = "gid", fields = "^annot")
## Unable to find CDSNAME, setting it to ANNOT_EXTERNAL_DB_NAME.
## Unable to find CDSCHROM in the db, removing it.
## Unable to find CDSSTRAND in the db, removing it.
## Unable to find CDSSTART in the db, removing it.
## Unable to find CDSEND in the db, removing it.
## Extracted all gene ids.
## Attempting to select: ANNOT_EXTERNAL_DB_NAME, GENE_TYPE, ANNOT_AA_SEQUENCE_ID, ANNOT_ANNOTATED_GO_COMPONENT, ANNOT_ANNOTATED_GO_FUNCTION, ANNOT_ANNOTATED_GO_ID_COMPONENT, ANNOT_ANNOTATED_GO_ID_FUNCTION, ANNOT_ANNOTATED_GO_ID_PROCESS, ANNOT_ANNOTATED_GO_PROCESS, ANNOT_ANTICODON, ANNOT_APOLLO_LINK_OUT, ANNOT_APOLLO_TRANSCRIPT_DESCRIPTION, ANNOT_CDS, ANNOT_CDS_LENGTH, ANNOT_CHROMOSOME, ANNOT_CODING_END, ANNOT_CODING_START, ANNOT_EC_NUMBERS, ANNOT_EC_NUMBERS_DERIVED, ANNOT_END_MAX, ANNOT_EXON_COUNT, ANNOT_EXTERNAL_DB_NAME, ANNOT_EXTERNAL_DB_VERSION, ANNOT_FIVE_PRIME_UTR_LENGTH, ANNOT_GENE_CONTEXT_END, ANNOT_GENE_CONTEXT_START, ANNOT_GENE_END_MAX, ANNOT_GENE_END_MAX_TEXT, ANNOT_GENE_ENTREZ_ID, ANNOT_GENE_ENTREZ_LINK, ANNOT_GENE_EXON_COUNT, ANNOT_GENE_HTS_NONCODING_SNPS, ANNOT_GENE_HTS_NONSYN_SYN_RATIO, ANNOT_GENE_HTS_NONSYNONYMOUS_SNPS, ANNOT_GENE_HTS_STOP_CODON_SNPS, ANNOT_GENE_HTS_SYNONYMOUS_SNPS, ANNOT_GENE_LOCATION_TEXT, ANNOT_GENE_NAME, ANNOT_GENE_ORTHOLOG_NUMBER, ANNOT_GENE_ORTHOMCL_NAME, ANNOT_GENE_PARALOG_NUMBER, ANNOT_GENE_PREVIOUS_IDS, ANNOT_GENE_PRODUCT, ANNOT_GENE_START_MIN, ANNOT_GENE_START_MIN_TEXT, ANNOT_GENE_TOTAL_HTS_SNPS, ANNOT_GENE_TRANSCRIPT_COUNT, ANNOT_GENE_TYPE, ANNOT_GENOMIC_SEQUENCE_LENGTH, ANNOT_GENUS_SPECIES, ANNOT_HAS_MISSING_TRANSCRIPTS, ANNOT_INTERPRO_DESCRIPTION, ANNOT_INTERPRO_ID, ANNOT_IS_DEPRECATED, ANNOT_IS_PSEUDO, ANNOT_ISOELECTRIC_POINT, ANNOT_LOCATION_TEXT, ANNOT_MAP_LOCATION, ANNOT_MCMC_LOCATION, ANNOT_MOLECULAR_WEIGHT, ANNOT_NCBI_TAX_ID, ANNOT_ORTHOMCL_LINK, ANNOT_OVERVIEW, ANNOT_PFAM_DESCRIPTION, ANNOT_PFAM_ID, ANNOT_PIRSF_DESCRIPTION, ANNOT_PIRSF_ID, ANNOT_PREDICTED_GO_COMPONENT, ANNOT_PREDICTED_GO_FUNCTION, ANNOT_PREDICTED_GO_ID_COMPONENT, ANNOT_PREDICTED_GO_ID_FUNCTION, ANNOT_PREDICTED_GO_ID_PROCESS, ANNOT_PREDICTED_GO_PROCESS, ANNOT_PRIMARY_KEY, ANNOT_PROB_MAP, ANNOT_PROB_MCMC, ANNOT_PROSITEPROFILES_DESCRIPTION, ANNOT_PROSITEPROFILES_ID, ANNOT_PROTEIN_LENGTH, ANNOT_PROTEIN_SEQUENCE, ANNOT_PROTEIN_SOURCE_ID, 
ANNOT_PSEUDO_STRING, ANNOT_SEQUENCE_DATABASE_NAME, ANNOT_SEQUENCE_ID, ANNOT_SIGNALP_PEPTIDE, ANNOT_SMART_DESCRIPTION, ANNOT_SMART_ID, ANNOT_SNPOVERVIEW, ANNOT_SO_ID, ANNOT_SO_TERM_DEFINITION, ANNOT_SO_TERM_NAME, ANNOT_SO_VERSION, ANNOT_START_MIN, ANNOT_STRAND, ANNOT_STRAND_PLUS_MINUS, ANNOT_SUPERFAMILY_DESCRIPTION, ANNOT_SUPERFAMILY_ID, ANNOT_THREE_PRIME_UTR_LENGTH, ANNOT_TIGRFAM_DESCRIPTION, ANNOT_TIGRFAM_ID, ANNOT_TM_COUNT, ANNOT_TRANS_FOUND_PER_GENE_INTERNAL, ANNOT_TRANSCRIPT_INDEX_PER_GENE, ANNOT_TRANSCRIPT_LENGTH, ANNOT_TRANSCRIPT_LINK, ANNOT_TRANSCRIPT_PRODUCT, ANNOT_TRANSCRIPT_SEQUENCE, ANNOT_TRANSCRIPTS_FOUND_PER_GENE, ANNOT_UNIPROT_IDS, ANNOT_UNIPROT_LINKS
## 'select()' returned 1:1 mapping between keys and columns

5 Collect preprocessed metadata

Use my new sample sheet here.

current_samplesheet <- "sample_sheets/human_samples_202511.xlsx"
first_spec <- make_rnaseq_spec()
input <- read_metadata(current_samplesheet)
colnames(input)
##  [1] "seq"                                         "hpgl_identifier"                            
##  [3] "aim"                                         "participant_code"                           
##  [5] "sample_type"                                 "tube_label_origin"                          
##  [7] "number_of_vials"                             "sample_collection_date"                     
##  [9] "exp_person"                                  "clinical_presentation"                      
## [11] "parasite"                                    "prior_host_hpgl_code"                       
## [13] "prior_parasite_hpgl_code"                    "initial_recurrent"                          
## [15] "drug_susceptibility_perc_reduction_gluc"     "drug_susceptibility_perc_reduction_mlf"     
## [17] "zymodeme_by_electrophoresis"                 "zymodeme_by_pca"                            
## [19] "rna_extraction_date"                         "rna_volume_ul"                              
## [21] "rna_qc_date"                                 "rna_nanodrop_ng_ul"                         
## [23] "X260_280_ratio"                              "X260_230_ratio"                             
## [25] "rna_bioanalyzer_or_qubit_ng_ul"              "rin"                                        
## [27] "rna_qc_passed"                               "library_type"                               
## [29] "library_version"                             "library_name"                               
## [31] "library_const_date"                          "rna_used_to_construct_libraries_ul"         
## [33] "rna_used_to_construct_libraries_ng"          "library_qc_date"                            
## [35] "lib_qc_passed"                               "library_volume_ul"                          
## [37] "unique_dual_index_set"                       "unique_dual_index_plate_coordinate"         
## [39] "unique_dual_index_id"                        "concentrations_determined_by_1"             
## [41] "primer_conc_ng_ul_30100bp_region_1"          "adapter_dimer_conc_ng_ul_100200bp_region_1" 
## [43] "library_conc_ng_ul_2001000bp_region_1"       "library_molarity_nm_2001000bp_region_1"     
## [45] "library_ave_frag_size_bp_2001000bp_region_1" "calculated_adapter_dimer_percent_1"         
## [47] "concentrations_determined_by_2"              "primer_conc_ng_ul_30100bp_region_2"         
## [49] "adapter_dimer_conc_ng_ul_100200bp_region_2"  "library_conc_ng_ul_2001000bp_region_2"      
## [51] "library_molarity_nm_2001000bp_region_2"      "library_ave_frag_size_bp_2001000bp_region_2"
## [53] "calculated_adapter_dimer_percent_2"          "library_volume_ul_sent_to_umd"              
## [55] "shipment_date"                               "bbiagtc_date_received"                      
## [57] "bbiagtc_library_cleanup"                     "bbiagtc_date_sequenced"                     
## [59] "bbiagtc_sequence_batch"                      "bbiagtc_pe_reads_pf"                        
## [61] "bbiagtc_fastp_duplication_rate"              "hisat2inputreads"                           
## [63] "lpsingleconaligned"                          "hg38singleconaligned"                       
## [65] "lpmulticonaligned"                           "hg38multiconaligned"                        
## [67] "lpsingleallaligned"                          "hg38singleallaligned"                       
## [69] "lpmultiallaligned"                           "hg38multiallaligned"                        
## [71] "lppercentmappedfromlog"                      "hg38percentmappedfromlog"                   
## [73] "lphisatobservedcds"                          "hg38hisatobservedcds"                       
## [75] "krakenmatrix"                                "lphisatgenematrix"                          
## [77] "hg38hisatgenematrix"                         "lpsalmontranscriptmatrix"                   
## [79] "hg38salmontranscriptmatrix"                  "detectionparasiteby7sl"                     
## [81] "detectionparasiteby18s"                      "detectionparasitebykdnasec"                 
## [83] "metabolomicnasalswaborplasma"                "immunophenotypingnasalswaborpbmcs"
summary(as.factor(input[["detectionparasiteby7sl"]]))
## negative positive     NA's 
##       63       11       30
pre_meta <- gather_preprocessing_metadata(
  starting_metadata = current_samplesheet, id_column = "hpgl_identifier",
  specification = first_spec, new_metadata = "persistence_hu_modified.xlsx",
  basedir = "preprocessing", species = c("hg38_115", "lpanamensis_mhomcol_v68"))
## Dropped 1 rows from the sample metadata because the sample ID is blank.
## Did not find the condition column in the sample sheet.
## Filling it in as undefined.
## Did not find the batch column in the sample sheet.
## Filling it in as undefined.
## Checking the state of the condition column.
## Checking the state of the batch column.
## Checking the condition factor.
## Warning in dispatch_regex_search(meta, search, replace, input_file_spec, : NAs introduced by
## coercion
## Warning in dispatch_regex_search(meta, search, replace, input_file_spec, : NAs introduced by
## coercion
## Writing new metadata to: persistence_hu_modified.xlsx
## Deleting the file persistence_hu_modified.xlsx before writing the tables.
summary(as.factor(pre_meta[["new_meta"]][["detectionparasiteby7sl"]]))
## negative positive     NA's 
##       63       11       29
modified_meta <- pre_meta[["new_meta"]]
## Added the following line to gather_preprocessing_metadata()
rownames(modified_meta) <- make.names(modified_meta[["hpgl_identifier"]], unique = TRUE)
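
Since make.names() will quietly rewrite duplicated or syntactically invalid identifiers, it may be worth checking whether any rowname differs from its original ID. A small sketch with invented IDs:

```r
## Invented identifiers: one duplicate and one which starts with a digit.
ids <- c("PRHU0001", "PRHU0002", "PRHU0002", "2024_sample")
safe <- make.names(ids, unique = TRUE)
## Report every identifier which make.names() had to change.
changed <- data.frame(original = ids, rowname = safe)[ids != safe, ]
changed
```

An empty table here means the rownames still match hpgl_identifier exactly.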

## FIXME: 202511: I broke something in some of these functions and it is pulling
## the wrong information for number of observed genes.
head(modified_meta)
##          seq hpgl_identifier  aim participant_code sample_type tube_label_origin number_of_vials
## PRHU0001   1        PRHU0001 aim1           PP1006       PBMCs      PP1006 PBMCs               1
## PRHU0002   2        PRHU0002 aim1           PP2001  nasal swab         PP2001 HN               1
## PRHU0009   4        PRHU0009 aim1           PP2003       PBMCs      PP2003 PBMCs               1
## PRHU0010   5        PRHU0010 aim1           PP2004       PBMCs      PP2004 PBMCs               1
## PRHU0011   6        PRHU0011 aim1           PP2005       PBMCs      PP2005 PBMCs               1
## PRHU0018   7        PRHU0018 aim1           PP2003        WBCs       PP2003 WBCs               1
##          sample_collection_date exp_person clinical_presentation            prior_host_hpgl_code
## PRHU0001             2024-01-29         LG                    HD                            <NA>
## PRHU0002             2024-02-05         LG                    HD                            <NA>
## PRHU0009                   <NA>         LG                  H-CL TMRC30130; TMRC30124; TMRC30131
## PRHU0010             2024-04-05         LG                  H-CL                            <NA>
## PRHU0011             2024-04-12         LG                  H-CL                            <NA>
## PRHU0018             2024-03-22         LG                  H-CL TMRC30130; TMRC30124; TMRC30131
##          rna_extraction_date rna_volume_ul rna_qc_date rna_nanodrop_ng_ul X260_280_ratio
## PRHU0001          2024-01-31            25  2024-02-15             338.43           2.04
## PRHU0002          2024-02-14            25  2024-02-15              84.82           2.08
## PRHU0009          2024-05-06            25  2024-05-06              115.1           1.99
## PRHU0010          2024-05-06            25  2024-05-06             185.28           2.03
## PRHU0011          2024-05-06            25  2024-05-06              70.12           1.95
## PRHU0018          2024-03-22            30  2024-05-02              49.55           2.06
##          X260_230_ratio rna_bioanalyzer_or_qubit_ng_ul rin rna_qc_passed library_type
## PRHU0001           1.73                            407  NA           yes         mRNA
## PRHU0002           1.97                            101  NA           yes         mRNA
## PRHU0009           1.34                            167 8.0           yes         mRNA
## PRHU0010           1.91                            264 8.5           yes         mRNA
## PRHU0011           0.85                            113 8.7           yes         mRNA
## PRHU0018           1.56                             70 9.1           yes         mRNA
##          library_version             library_name library_const_date
## PRHU0001               1      PP1006.PBMCs.mRNA.1         2024-02-09
## PRHU0002               1 PP2001.nasal swab.mRNA.1         2024-02-15
## PRHU0009               1      PP2003.PBMCs.mRNA.1         2024-06-05
## PRHU0010               1      PP2004.PBMCs.mRNA.1         2024-06-05
## PRHU0011               1      PP2005.PBMCs.mRNA.1         2024-06-05
## PRHU0018               1       PP2003.WBCs.mRNA.1         2024-06-05
##          rna_used_to_construct_libraries_ul rna_used_to_construct_libraries_ng library_qc_date
## PRHU0001                               1.48                                500      2024-02-12
## PRHU0002                               3.54                                300      2024-02-16
## PRHU0009                               1.79                                300      2024-06-11
## PRHU0010                               1.14                                300      2024-06-11
## PRHU0011                               2.65                                300      2024-06-11
## PRHU0018                               4.29                                300      2024-06-11
##          lib_qc_passed library_volume_ul                 unique_dual_index_set
## PRHU0001           yes                15 IDT for Illumina RNA UD Indexes Set A
## PRHU0002           yes                15 IDT for Illumina RNA UD Indexes Set A
## PRHU0009           yes                15 IDT for Illumina RNA UD Indexes Set A
## PRHU0010           yes                15 IDT for Illumina RNA UD Indexes Set A
## PRHU0011           yes                15 IDT for Illumina RNA UD Indexes Set A
## PRHU0018           yes                15 IDT for Illumina RNA UD Indexes Set A
##          unique_dual_index_plate_coordinate unique_dual_index_id concentrations_determined_by_1
## PRHU0001                                C04              UDP0027                    TapeStation
## PRHU0002                                F04              UDP0030                    TapeStation
## PRHU0009                                B05              UDP0034                    Bioanalyzer
## PRHU0010                                C05              UDP0035                    Bioanalyzer
## PRHU0011                                D05              UDP0036                    Bioanalyzer
## PRHU0018                                E05              UDP0037                    Bioanalyzer
##          primer_conc_ng_ul_30100bp_region_1 adapter_dimer_conc_ng_ul_100200bp_region_1
## PRHU0001                                  -                                          -
## PRHU0002                                  -                                          -
## PRHU0009                               0.49                                       0.27
## PRHU0010                               0.58                                       0.35
## PRHU0011                                0.7                                       0.27
## PRHU0018                               0.58                                       0.12
##          library_conc_ng_ul_2001000bp_region_1 library_molarity_nm_2001000bp_region_1
## PRHU0001                                 69.50                                  321.0
## PRHU0002                                 17.50                                   82.3
## PRHU0009                                 60.64                                  293.3
## PRHU0010                                 64.84                                  314.0
## PRHU0011                                 56.86                                  279.9
## PRHU0018                                 85.33                                  421.9
##          library_ave_frag_size_bp_2001000bp_region_1 calculated_adapter_dimer_percent_1
## PRHU0001                                         353                                  0
## PRHU0002                                         345                                  0
## PRHU0009                                         336                0.00445250659630607
## PRHU0010                                         340                 0.0053979025293029
## PRHU0011                                         330                0.00474850510024622
## PRHU0018                                         328                0.00140630493378648
##          concentrations_determined_by_2 primer_conc_ng_ul_30100bp_region_2
## PRHU0001                           <NA>                               <NA>
## PRHU0002                           <NA>                               <NA>
## PRHU0009                           <NA>                               <NA>
## PRHU0010                           <NA>                               <NA>
## PRHU0011                           <NA>                               <NA>
## PRHU0018                           <NA>                               <NA>
##          adapter_dimer_conc_ng_ul_100200bp_region_2 library_conc_ng_ul_2001000bp_region_2
## PRHU0001                                       <NA>                                    NA
## PRHU0002                                       <NA>                                    NA
## PRHU0009                                       <NA>                                    NA
## PRHU0010                                       <NA>                                    NA
## PRHU0011                                       <NA>                                    NA
## PRHU0018                                       <NA>                                    NA
##          library_molarity_nm_2001000bp_region_2 library_ave_frag_size_bp_2001000bp_region_2
## PRHU0001                                     NA                                          NA
## PRHU0002                                     NA                                          NA
## PRHU0009                                     NA                                          NA
## PRHU0010                                     NA                                          NA
## PRHU0011                                     NA                                          NA
## PRHU0018                                     NA                                          NA
##          calculated_adapter_dimer_percent_2 library_volume_ul_sent_to_umd shipment_date
## PRHU0001                                 NA                            14    2024-02-20
## PRHU0002                                 NA                            14    2024-02-20
## PRHU0009                                 NA                            14    2024-08-26
## PRHU0010                                 NA                            14    2024-08-26
## PRHU0011                                 NA                            14    2024-08-26
## PRHU0018                                 NA                            14    2024-08-26
##          bbiagtc_date_received bbiagtc_library_cleanup bbiagtc_date_sequenced
## PRHU0001            2024-04-05              not needed             2024-05-03
## PRHU0002            2024-04-05              not needed             2024-05-03
## PRHU0009            2024-11-05              not needed             2024-11-08
## PRHU0010            2024-11-05                     yes             2024-11-08
## PRHU0011            2024-11-05              not needed             2024-11-08
## PRHU0018            2024-11-05              not needed             2024-11-08
##          bbiagtc_sequence_batch bbiagtc_pe_reads_pf bbiagtc_fastp_duplication_rate
## PRHU0001                PERS001            13167287                           13.9
## PRHU0002                PERS001             9815642                           18.4
## PRHU0009                PERS002            23614271                           19.9
## PRHU0010                PERS002            22777166                           16.9
## PRHU0011                PERS002            23458700                           20.4
## PRHU0018                PERS002            20907177                           16.6
##          detectionparasiteby7sl detectionparasiteby18s detectionparasitebykdnasec
## PRHU0001                   <NA>                   <NA>                       <NA>
## PRHU0002                   <NA>                   <NA>                       <NA>
## PRHU0009                   <NA>                   <NA>                       <NA>
## PRHU0010                   <NA>                   <NA>                       <NA>
## PRHU0011                   <NA>                   <NA>                       <NA>
## PRHU0018               positive               positive                       <NA>
##          metabolomicnasalswaborplasma immunophenotypingnasalswaborpbmcs condition     batch
## PRHU0001                         <NA>                              <NA> undefined undefined
## PRHU0002                         <NA>                              <NA> undefined undefined
## PRHU0009                          yes                              <NA> undefined undefined
## PRHU0010                          yes                              <NA> undefined undefined
## PRHU0011                          yes                              <NA> undefined undefined
## PRHU0018                          yes                              <NA> undefined undefined
##          sampleid trimomatic_input trimomatic_output trimomatic_percent fastqc_pct_gc
## PRHU0001 PRHU0001         13167287          12085247              0.918            49
## PRHU0002 PRHU0002               NA                NA                 NA            50
## PRHU0009 PRHU0009         23614271          22272928              0.943            51
## PRHU0010 PRHU0010         22777166          21621102              0.949            51
## PRHU0011 PRHU0011         23458700          22252717              0.949            51
## PRHU0018 PRHU0018         20907177          19815284              0.948            52
##          kraken_bacterial_classified kraken_bacterial_unclassified kraken_first_bacterial_species
## PRHU0001                      418110                      11667137          Staphylococcus aureus
## PRHU0002                      324998                       8681618         Bacillus thuringiensis
## PRHU0009                       86840                        866454        Porphyrobacter sp. GA68
## PRHU0010                       76438                       1056370          Klebsiella pneumoniae
## PRHU0011                       70332                        803596          Klebsiella pneumoniae
## PRHU0018                       46603                        619808            Priestia megaterium
##          kraken_first_bacterial_species_reads kraken_viral_classified kraken_viral_unclassified
## PRHU0001                               117628                   58135                  12027112
## PRHU0002                                48125                   43789                   8962827
## PRHU0009                                11937                  197113                  22075815
## PRHU0010                                 7070                  132560                  21488542
## PRHU0011                                 6002                  204047                  22048670
## PRHU0018                                 7831                  108172                  19707112
##          kraken_first_viral_species kraken_first_viral_species_reads
## PRHU0001      Proteus virus Isfahan                            35851
## PRHU0002      Proteus virus Isfahan                            28527
## PRHU0009      Proteus virus Isfahan                           146195
## PRHU0010      Proteus virus Isfahan                            90884
## PRHU0011      Proteus virus Isfahan                           160954
## PRHU0018      Proteus virus Isfahan                            74208
##                                                                   kraken_matrix_viral
## PRHU0001 preprocessing/PRHU0001/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## PRHU0002 preprocessing/PRHU0002/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## PRHU0009 preprocessing/PRHU0009/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## PRHU0010 preprocessing/PRHU0010/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## PRHU0011 preprocessing/PRHU0011/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## PRHU0018 preprocessing/PRHU0018/outputs/20250918kraken_viral/kraken_report_matrix.tsv
##                                                            kraken_matrix_bacterial
## PRHU0001 preprocessing/PRHU0001/outputs/02kraken_bacteria/kraken_report_matrix.tsv
## PRHU0002 preprocessing/PRHU0002/outputs/02kraken_bacteria/kraken_report_matrix.tsv
## PRHU0009 preprocessing/PRHU0009/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## PRHU0010 preprocessing/PRHU0010/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## PRHU0011 preprocessing/PRHU0011/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## PRHU0018 preprocessing/PRHU0018/outputs/06kraken_bacteria/kraken_report_matrix.tsv
##          hisat_rrna_input_reads_hg38_115 hisat_rrna_input_reads_lpanamensis_mhomcol_v68
## PRHU0001                        12085247                                             NA
## PRHU0002                         9006616                                             NA
## PRHU0009                        22272928                                             NA
## PRHU0010                        21621102                                             NA
## PRHU0011                        22252717                                             NA
## PRHU0018                        19815284                                             NA
##          hisat_rrna_single_concordant_hg38_115
## PRHU0001                                 73426
## PRHU0002                                 80682
## PRHU0009                                408744
## PRHU0010                                149371
## PRHU0011                                200482
## PRHU0018                                117268
##          hisat_rrna_single_concordant_lpanamensis_mhomcol_v68
## PRHU0001                                                   NA
## PRHU0002                                                   NA
## PRHU0009                                                   NA
## PRHU0010                                                   NA
## PRHU0011                                                   NA
## PRHU0018                                                   NA
##          hisat_rrna_multi_concordant_hg38_115 hisat_rrna_multi_concordant_lpanamensis_mhomcol_v68
## PRHU0001                                    0                                                  NA
## PRHU0002                                    3                                                  NA
## PRHU0009                                   40                                                  NA
## PRHU0010                                    6                                                  NA
## PRHU0011                                    1                                                  NA
## PRHU0018                                    3                                                  NA
##          hisat_rrna_percent_log_hg38_115 hisat_rrna_percent_log_lpanamensis_mhomcol_v68
## PRHU0001                            0.63                                             NA
## PRHU0002                            0.94                                             NA
## PRHU0009                            1.93                                             NA
## PRHU0010                            0.73                                             NA
## PRHU0011                            0.94                                             NA
## PRHU0018                            0.62                                             NA
##          hisat_genome_input_reads_hg38_115 hisat_genome_input_reads_lpanamensis_mhomcol_v68
## PRHU0001                          12085247                                         12085247
## PRHU0002                           9006616                                          9006616
## PRHU0009                          22272928                                         22272928
## PRHU0010                          21621102                                         21621102
## PRHU0011                          22252717                                         22252717
## PRHU0018                          19815284                                         19815284
##          hisat_genome_single_concordant_hg38_115
## PRHU0001                                11027565
## PRHU0002                                 8119315
## PRHU0009                                19763135
## PRHU0010                                19276929
## PRHU0011                                19973324
## PRHU0018                                18103721
##          hisat_genome_single_concordant_lpanamensis_mhomcol_v68
## PRHU0001                                                    838
## PRHU0002                                                    760
## PRHU0009                                                   1382
## PRHU0010                                                    559
## PRHU0011                                                    662
## PRHU0018                                                    415
##          hisat_genome_multi_concordant_hg38_115
## PRHU0001                                 621289
## PRHU0002                                 505363
## PRHU0009                                1516414
## PRHU0010                                1171082
## PRHU0011                                1356659
## PRHU0018                                1010925
##          hisat_genome_multi_concordant_lpanamensis_mhomcol_v68 hisat_genome_single_all_hg38_115
## PRHU0001                                                    99                           365382
## PRHU0002                                                   102                           282669
## PRHU0009                                                   102                           790119
## PRHU0010                                                    96                           920497
## PRHU0011                                                    82                           758139
## PRHU0018                                                    62                           536623
##          hisat_genome_single_all_lpanamensis_mhomcol_v68 hisat_genome_multi_all_hg38_115
## PRHU0001                                           14329                           84362
## PRHU0002                                           13432                           71982
## PRHU0009                                           22507                          235198
## PRHU0010                                           10941                          242320
## PRHU0011                                           12590                          206128
## PRHU0018                                            8241                          133869
##          hisat_genome_multi_all_lpanamensis_mhomcol_v68 hisat_unmapped_hg38_115
## PRHU0001                                           6702                   36996
## PRHU0002                                           7507                  131347
## PRHU0009                                          17740                  220613
## PRHU0010                                          12474                  189403
## PRHU0011                                          13330                  233569
## PRHU0018                                           9468                  177472
##          hisat_unmapped_lpanamensis_mhomcol_v68 hisat_genome_percent_log_hg38_115
## PRHU0001                               24147571                             99.85
## PRHU0002                               17990551                             99.27
## PRHU0009                               44502561                             99.50
## PRHU0010                               43217431                             99.56
## PRHU0011                               44477964                             99.48
## PRHU0018                               39611887                             99.55
##          hisat_genome_percent_log_lpanamensis_mhomcol_v68
## PRHU0001                                             0.09
## PRHU0002                                             0.13
## PRHU0009                                             0.10
## PRHU0010                                             0.06
## PRHU0011                                             0.06
## PRHU0018                                             0.05
##                                                           hisat_alignment_hg38_115
## PRHU0001 preprocessing/PRHU0001/outputs/20250918hisat_hg38_115/hg38_115_genome.bam
## PRHU0002 preprocessing/PRHU0002/outputs/20250918hisat_hg38_115/hg38_115_genome.bam
## PRHU0009 preprocessing/PRHU0009/outputs/20250918hisat_hg38_115/hg38_115_genome.bam
## PRHU0010 preprocessing/PRHU0010/outputs/20250918hisat_hg38_115/hg38_115_genome.bam
## PRHU0011 preprocessing/PRHU0011/outputs/20250918hisat_hg38_115/hg38_115_genome.bam
## PRHU0018 preprocessing/PRHU0018/outputs/20250918hisat_hg38_115/hg38_115_genome.bam
##                                                                          hisat_alignment_lpanamensis_mhomcol_v68
## PRHU0001 preprocessing/PRHU0001/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome.bam
## PRHU0002 preprocessing/PRHU0002/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome.bam
## PRHU0009 preprocessing/PRHU0009/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome.bam
## PRHU0010 preprocessing/PRHU0010/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome.bam
## PRHU0011 preprocessing/PRHU0011/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome.bam
## PRHU0018 preprocessing/PRHU0018/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome.bam
##          salmon_mapped_hg38_115 salmon_mapped_lpanamensis_mhomcol_v68 salmon_percent_hg38_115
## PRHU0001                     NA                                   228                   54.62
## PRHU0002                     NA                                    NA                   61.09
## PRHU0009                     NA                                   514                   53.76
## PRHU0010                     NA                                   532                   55.29
## PRHU0011                     NA                                   564                   56.48
## PRHU0018                     NA                                   411                   57.38
##          salmon_percent_lpanamensis_mhomcol_v68 salmon_observed_genes_hg38_115
## PRHU0001                               0.001887                          40892
## PRHU0002                               0.023971                          37639
## PRHU0009                               0.002308                          47176
## PRHU0010                               0.002461                          47162
## PRHU0011                               0.002535                          46983
## PRHU0018                               0.002074                          43731
##          salmon_observed_genes_lpanamensis_mhomcol_v68                                 input_r1
## PRHU0001                                            12 unprocessed/PRHU0001_S49_R1_001.fastq.gz
## PRHU0002                                            12                                         
## PRHU0009                                            14  unprocessed/PRHU0009_S7_R1_001.fastq.gz
## PRHU0010                                            13  unprocessed/PRHU0010_S8_R1_001.fastq.gz
## PRHU0011                                            15  unprocessed/PRHU0011_S9_R1_001.fastq.gz
## PRHU0018                                            14 unprocessed/PRHU0018_S16_R1_001.fastq.gz
##                                          input_r2
## PRHU0001 unprocessed/PRHU0001_S49_R2_001.fastq.gz
## PRHU0002                                         
## PRHU0009  unprocessed/PRHU0009_S7_R2_001.fastq.gz
## PRHU0010  unprocessed/PRHU0010_S8_R2_001.fastq.gz
## PRHU0011  unprocessed/PRHU0011_S9_R2_001.fastq.gz
## PRHU0018 unprocessed/PRHU0018_S16_R2_001.fastq.gz
##                                                                                      hisat_count_table_hg38_115
## PRHU0001 preprocessing/PRHU0001/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz
## PRHU0002 preprocessing/PRHU0002/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz
## PRHU0009 preprocessing/PRHU0009/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz
## PRHU0010 preprocessing/PRHU0010/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz
## PRHU0011 preprocessing/PRHU0011/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz
## PRHU0018 preprocessing/PRHU0018/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz
##                                                                                                                    hisat_count_table_lpanamensis_mhomcol_v68
## PRHU0001 preprocessing/PRHU0001/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome-paired_s2_protein_coding_gene_ID_fcounts.csv.xz
## PRHU0002 preprocessing/PRHU0002/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome-paired_s2_protein_coding_gene_ID_fcounts.csv.xz
## PRHU0009 preprocessing/PRHU0009/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome-paired_s2_protein_coding_gene_ID_fcounts.csv.xz
## PRHU0010 preprocessing/PRHU0010/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome-paired_s2_protein_coding_gene_ID_fcounts.csv.xz
## PRHU0011 preprocessing/PRHU0011/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome-paired_s2_protein_coding_gene_ID_fcounts.csv.xz
## PRHU0018 preprocessing/PRHU0018/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome-paired_s2_protein_coding_gene_ID_fcounts.csv.xz
##                                            salmon_count_table_hg38_115
## PRHU0001 preprocessing/PRHU0001/outputs/80salmon_hg38_115_CDS/quant.sf
## PRHU0002 preprocessing/PRHU0002/outputs/80salmon_hg38_115_CDS/quant.sf
## PRHU0009 preprocessing/PRHU0009/outputs/80salmon_hg38_115_CDS/quant.sf
## PRHU0010 preprocessing/PRHU0010/outputs/80salmon_hg38_115_CDS/quant.sf
## PRHU0011 preprocessing/PRHU0011/outputs/80salmon_hg38_115_CDS/quant.sf
## PRHU0018 preprocessing/PRHU0018/outputs/80salmon_hg38_115_CDS/quant.sf
##                                                  salmon_count_table_lpanamensis_mhomcol_v68
## PRHU0001 preprocessing/PRHU0001/outputs/20250918salmon_lpanamensis_mhomcol_v68_CDS/quant.sf
## PRHU0002 preprocessing/PRHU0002/outputs/20250918salmon_lpanamensis_mhomcol_v68_CDS/quant.sf
## PRHU0009 preprocessing/PRHU0009/outputs/20250918salmon_lpanamensis_mhomcol_v68_CDS/quant.sf
## PRHU0010 preprocessing/PRHU0010/outputs/20250918salmon_lpanamensis_mhomcol_v68_CDS/quant.sf
## PRHU0011 preprocessing/PRHU0011/outputs/20250918salmon_lpanamensis_mhomcol_v68_CDS/quant.sf
## PRHU0018 preprocessing/PRHU0018/outputs/20250918salmon_lpanamensis_mhomcol_v68_CDS/quant.sf
head(modified_meta[["salmon_observed_genes_hg38_115"]])
## [1] 40892 37639 47176 47162 46983 43731
summary(as.factor(modified_meta[["detectionparasiteby7sl"]]))
## negative positive     NA's 
##       63       11       29
modified_meta[["detectionparasiteby7sl"]] <- sanitize_metadata(modified_meta[["detectionparasiteby7sl"]])
summary(modified_meta[["detectionparasiteby7sl"]])
##      negative notapplicable      positive 
##            63            29            11

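Create a factor recording the 7SL detection status of the nasal samples, and propagate it to every sample from the same participant. Note that the NAs need to be recast as an explicit level (sanitize_metadata() turns them into 'notapplicable' above); I would have sworn that my gather function already did that.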

modified_meta[["nasal_7sl_status"]] <- modified_meta[["detectionparasiteby7sl"]]

nasal_samples <- modified_meta[["sample_type"]] == "nasal swab"
summary(nasal_samples)
##    Mode   FALSE    TRUE 
## logical      83      20
sl_positive <- modified_meta[["nasal_7sl_status"]] == "positive"
summary(sl_positive)
##    Mode   FALSE    TRUE 
## logical      92      11
nasal_positive <- nasal_samples & sl_positive
summary(nasal_positive)
##    Mode   FALSE    TRUE 
## logical      96       7
nasal_positive_samples <- rownames(modified_meta)[nasal_positive]
nasal_positive_people <- modified_meta[nasal_positive_samples, "participant_code"]
nasal_positive_people
## [1] "PP1009" "PP2020" "PP1009" "PP2005" "PP2006" "PP2019" "PP2020"
nasal_positive_people_samples <- modified_meta[["participant_code"]] %in% nasal_positive_people
modified_meta[["nasal_7sl_status"]] <- "negative"
modified_meta[nasal_positive_people_samples, "nasal_7sl_status"] <- "positive"
modified_meta[["nasal_7sl_status"]] <- as.factor(modified_meta[["nasal_7sl_status"]])
summary(modified_meta[["nasal_7sl_status"]])
## negative positive 
##       84       19
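
The participant-level propagation above can be sketched on a toy table. This is only an illustration in base R: the participant codes, sample names, and values below are made up, and only the column names mirror the real sheet.

```r
## Toy metadata; participants and samples are hypothetical.
toy_meta <- data.frame(
  participant_code = c("P1", "P1", "P1", "P2", "P2", "P3"),
  sample_type = c("nasal swab", "PBMCs", "skin biopsy scar",
                  "nasal swab", "PBMCs", "nasal swab"),
  detectionparasiteby7sl = c("positive", "notapplicable", "negative",
                             "negative", "notapplicable", "notapplicable"),
  stringsAsFactors = FALSE)
rownames(toy_meta) <- paste0("S", seq_len(nrow(toy_meta)))

## Participants with at least one 7SL-positive nasal swab.
is_nasal_positive <- toy_meta[["sample_type"]] == "nasal swab" &
  toy_meta[["detectionparasiteby7sl"]] == "positive"
positive_people <- unique(toy_meta[is_nasal_positive, "participant_code"])

## Every sample from those participants is labelled positive, all others negative.
toy_meta[["nasal_7sl_status"]] <- factor(
  ifelse(toy_meta[["participant_code"]] %in% positive_people, "positive", "negative"),
  levels = c("negative", "positive"))
table(toy_meta[["nasal_7sl_status"]])
```

As in the real code, a participant whose nasal swab has no 7SL call (P3 here) ends up in the negative group, which is worth keeping in mind when interpreting the contrasts.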

5.1 Add combination of detection and library type

Also add a category separating all skin samples from everything else.

modified_meta[["detection_type"]] <- as.factor(paste0(modified_meta[["detectionparasiteby7sl"]], "_",
                                                      modified_meta[["library_type"]]))

summary(modified_meta[["detection_type"]])
##      negative_mRNA        negative_RZ notapplicable_mRNA   notapplicable_RZ      positive_mRNA 
##                 57                  6                 28                  1                  6 
##        positive_RZ 
##                  5
modified_meta[["skinp"]] <- "not_skin"
skin_idx <- grepl(x = modified_meta[["sample_type"]], pattern = "^skin")
summary(skin_idx)
##    Mode   FALSE    TRUE 
## logical      68      35
modified_meta[skin_idx, "skinp"] <- "skin"
modified_meta[["skinp"]] <- as.factor(modified_meta[["skinp"]])
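
For reference, the two factors built in this section can be reproduced on a toy table as follows. This is a minimal base R sketch with made-up rows; the column name `detection` is a stand-in for `detectionparasiteby7sl`.

```r
## Hypothetical rows; only the shape mirrors the real metadata.
toy <- data.frame(
  detection = c("negative", "positive", "notapplicable", "positive"),
  library_type = c("mRNA", "RZ", "mRNA", "mRNA"),
  sample_type = c("skin biopsy scar", "nasal swab", "PBMCs", "skin biopsy healthy"),
  stringsAsFactors = FALSE)

## paste0() only produces the combinations which actually occur in the data.
toy[["detection_type"]] <- factor(paste0(toy[["detection"]], "_", toy[["library_type"]]))
## ifelse() over a grepl() index builds the two-level skin flag in one step.
toy[["skinp"]] <- factor(ifelse(grepl("^skin", toy[["sample_type"]]), "skin", "not_skin"))

levels(toy[["detection_type"]])
table(toy[["skinp"]])
```
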
hisat_idx <- grep(pattern = "^hisat", x = names(first_spec))
second_spec <- first_spec[hisat_idx]
post_meta <- gather_preprocessing_metadata(
  starting_metadata = pre_meta[["new_meta"]],
  specification = second_spec, basedir = "preprocessing/202405", species = "hg38_111",
  new_metadata = "sample_sheets/tmrc2_persistence_202405_lp_hg.xlsx")

both_meta <- gather_preprocessing_metadata(
  starting_metadata = "sample_sheets/tmrc_persistence_202405.xlsx",
  specification = first_spec,
  basedir = "preprocessing/202405", species= c("lpanamensis_v68", "hg38_111"),
  new_metadata = "sample_sheets/tmrc_persistence_202405_both.xlsx")

6 Collect gene annotations

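I should have all my load_xyz_annotation functions return the same element names in their retlists; note that the gene annotations currently live under 'genes' for L. panamensis but under 'gene_annotations' for human.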

lp_genes <- lp_annot[["genes"]]
hg_genes <- hs_annot[["gene_annotations"]]

7 Quick peek at the SL samples, hg38 release 115

7.1 Gather the transcript and gene annotations.

hg_tx <- hs_annot[["annotation"]]
hg_map <- hs_annot[["gene_tx_map"]]
lp_genes <- lp_annot[["genes"]]

7.2 Create initial hisat/salmon tables

hu_se_salmon <- create_se(modified_meta, gene_info = hg_tx,
                          tx_gene_map = hg_map, file_column = "salmon_count_table_hg38_115") %>%
  set_conditions(fact = "sample_type") %>%
  set_batches(fact = "library_type")
## Reading the sample metadata.
## Checking the state of the condition column.
## Checking the state of the batch column.
## Checking the condition factor.
## The sample definitions comprises: 103 rows(samples) and 110 columns(metadata fields).
## In some cases, (notably salmon) the format of the IDs used by this can be tricky.
## It is likely to require the transcript ID followed by a '.' and the ensembl column:
## 'transcript_version', which is explicitly different than the gene version column.
## If this is not correctly performed, very few genes will be observed
## Rewriting the transcript<->gene map to remove tx versions.
## reading in files with read_tsv
## 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 
## transcripts missing from tx2gene: 147702
## summarizing abundance
## summarizing counts
## summarizing length
## Matched 14414 annotations and counts.
## Saving the summarized experiment to 'se.rda'.
## The final summarized experiment has 14414 rows and 110 columns.
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion 
##                     20                     25                     15                      4 
##       skin biopsy scar                   WBCs 
##                     16                     23
## The number of samples by batch are:
## 
## mRNA   RZ 
##   91   12
hu_se_hisat_gene <- create_se(modified_meta, gene_info = hg_genes,
                              file_column = "hisat_count_table_hg38_115") %>%
  set_conditions(fact = "sample_type") %>%
  set_batches(fact = "library_type")
## Reading the sample metadata.
## Checking the state of the condition column.
## Checking the state of the batch column.
## Checking the condition factor.
## The sample definitions comprises: 103 rows(samples) and 110 columns(metadata fields).
## Matched 21571 annotations and counts.
## Some annotations were lost in merging, setting them to 'undefined'.
## Saving the summarized experiment to 'se.rda'.
## The final summarized experiment has 21571 rows and 110 columns.
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion 
##                     20                     25                     15                      4 
##       skin biopsy scar                   WBCs 
##                     16                     23
## The number of samples by batch are:
## 
## mRNA   RZ 
##   91   12
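
The salmon messages above mention removing transcript versions from the transcript<->gene map and report transcripts missing from tx2gene. The following is a minimal sketch of that idea in base R, not the code used by create_se() or tximport; the IDs and the `strip_version()` helper are hypothetical.

```r
## Hypothetical versioned transcript IDs, as they might appear in a quant.sf file.
quant_ids <- c("tx0001.2", "tx0002.11", "tx0003.1")
## Hypothetical transcript-to-gene map keyed on unversioned IDs.
tx2gene <- data.frame(
  TXNAME = c("tx0001", "tx0002"),
  GENEID = c("gene_a", "gene_a"),
  stringsAsFactors = FALSE)

## Drop a trailing '.<digits>' version suffix, leaving other IDs untouched.
strip_version <- function(ids) {
  sub("\\.[0-9]+$", "", ids)
}

matched <- strip_version(quant_ids) %in% tx2gene[["TXNAME"]]
## Number of quantified transcripts without an entry in the map.
sum(!matched)
```

If the versions are not stripped consistently on both sides, nearly every transcript would fail to match, which is the situation the message warns about.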

7.3 Figure out which samples do not have 7SL categories

undef_7sl <- is.na(colData(hu_se_salmon)[["detectionparasiteby7sl"]])
colData(hu_se_salmon)[undef_7sl, "detectionparasiteby7sl"] <- "unknown"
colData(hu_se_hisat_gene)[undef_7sl, "detectionparasiteby7sl"] <- "unknown"
sample_7sl <- paste0(colData(hu_se_salmon)[["sample_type"]], "_",
                     colData(hu_se_salmon)[["detectionparasiteby7sl"]])
sample_7sl <- gsub(x = sample_7sl, pattern = "[[:space:]]", replacement = "_")
colData(hu_se_salmon)[["sample_7sl"]] <- sample_7sl
colData(hu_se_hisat_gene)[["sample_7sl"]] <- sample_7sl

7.4 Healthy vs scar samples

healthy_vs_scar <- gsub(x = colData(hu_se_salmon)[["sample_type"]],
                        pattern = "^skin biopsy ", replacement = "")
colData(hu_se_salmon)[["hs"]] <- healthy_vs_scar
colData(hu_se_hisat_gene)[["hs"]] <- healthy_vs_scar
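
A quick illustration of what the gsub() above does to each sample type, using the six types present in this dataset. Only the skin biopsy entries are shortened; everything else passes through unchanged, which is why the later subsets must ask for 'healthy' or 'scar' explicitly.

```r
sample_type <- c("skin biopsy healthy", "skin biopsy scar", "skin biopsy non-lesion",
                 "nasal swab", "PBMCs", "WBCs")
## Remove the leading 'skin biopsy ' prefix where it exists.
hs <- gsub(x = sample_type, pattern = "^skin biopsy ", replacement = "")
hs
```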

7.5 Clean up missing 7SL detection samples

I think the last of these has been fixed since the last time I updated this sheet.

undef_7sl <- is.na(colData(hu_se_hisat_gene)[["detectionparasiteby7sl"]])
summary(undef_7sl)
##    Mode   FALSE 
## logical     103
if (any(undef_7sl)) {
  colData(hu_se_salmon)[undef_7sl, "detectionparasiteby7sl"] <- "unknown"
  colData(hu_se_hisat_gene)[undef_7sl, "detectionparasiteby7sl"] <- "unknown"
} else {
  message("There appear to be no missing 7SL entries.")
}
## There appear to be no missing 7SL entries.
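
One caveat for the replacement above, should missing entries ever reappear: if the column is stored as a factor (I have not checked how it is stored in the colData here), assigning a value which is not an existing level produces NA with a warning. A minimal base R sketch, using a made-up vector, of adding the level first:

```r
## Hypothetical status vector with two missing entries.
status <- factor(c("negative", NA, "positive", NA))
missing_idx <- is.na(status)
if (any(missing_idx)) {
  ## The new level must exist before it can be assigned to a factor.
  levels(status) <- c(levels(status), "unknown")
  status[missing_idx] <- "unknown"
} else {
  message("There appear to be no missing entries.")
}
table(status)
```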

7.7 Combine 7SL status and the sample type into one factor

sample_7sl <- as.factor(paste0(colData(hu_se_hisat_gene)[["sample_type"]], "_",
                               colData(hu_se_hisat_gene)[["detectionparasiteby7sl"]]))
summary(sample_7sl)
##               nasal swab_negative          nasal swab_notapplicable 
##                                11                                 2 
##               nasal swab_positive               PBMCs_notapplicable 
##                                 7                                25 
##      skin biopsy healthy_negative skin biopsy healthy_notapplicable 
##                                14                                 1 
##   skin biopsy non-lesion_negative         skin biopsy scar_negative 
##                                 4                                15 
##    skin biopsy scar_notapplicable                     WBCs_negative 
##                                 1                                19 
##                     WBCs_positive 
##                                 4
colData(hu_se_hisat_gene)[["sample_7sl"]] <- sample_7sl
colData(hu_se_salmon)[["sample_7sl"]] <- sample_7sl

table(colData(hu_se_hisat_gene)[["sample_7sl"]])
## 
##               nasal swab_negative          nasal swab_notapplicable 
##                                11                                 2 
##               nasal swab_positive               PBMCs_notapplicable 
##                                 7                                25 
##      skin biopsy healthy_negative skin biopsy healthy_notapplicable 
##                                14                                 1 
##   skin biopsy non-lesion_negative         skin biopsy scar_negative 
##                                 4                                15 
##    skin biopsy scar_notapplicable                     WBCs_negative 
##                                 1                                19 
##                     WBCs_positive 
##                                 4
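
The same counts can also be viewed as a two-way table, which makes the empty combinations explicit. A small sketch with made-up vectors standing in for the sample_type and detectionparasiteby7sl columns:

```r
## Hypothetical values; the real columns come from colData().
sample_type <- c("nasal swab", "nasal swab", "PBMCs", "WBCs", "WBCs")
status <- c("negative", "positive", "notapplicable", "negative", "positive")

## Rows are sample types, columns are 7SL calls; absent combinations show as 0.
counts <- table(sample_type, status)
counts
counts["nasal swab", "positive"]
```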

7.8 Separate the mRNA samples from ribo-zero

Note: when we are finished, we will use only the mRNA samples and ignore the ribo-zero ones. There are, however, still some questions about the data provided by the two library types.

hu_se_salmon_mrna <- set_conditions(hu_se_salmon, fact = "sample_type") %>%
  subset_se(subset = "library_type=='mRNA'") %>%
  set_batches("detectionparasiteby7sl")
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion 
##                     20                     25                     15                      4 
##       skin biopsy scar                   WBCs 
##                     16                     23
## The number of samples by batch are:
## 
##      negative notapplicable      positive 
##            57            28             6
hu_se_salmon_rz <- set_conditions(hu_se_salmon, fact = "sample_type") %>%
  subset_se(subset = "library_type=='RZ'") %>%
  set_batches("detectionparasiteby7sl")
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion 
##                     20                     25                     15                      4 
##       skin biopsy scar                   WBCs 
##                     16                     23
## The number of samples by batch are:
## 
##      negative notapplicable      positive 
##             6             1             5
hu_se_hisat_gene_mrna <- set_conditions(hu_se_hisat_gene, fact = "sample_type") %>%
  subset_se(subset = "library_type=='mRNA'") %>%
  set_batches("detectionparasiteby7sl")
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion 
##                     20                     25                     15                      4 
##       skin biopsy scar                   WBCs 
##                     16                     23
## The number of samples by batch are:
## 
##      negative notapplicable      positive 
##            57            28             6
hu_se_hisat_gene_rz <- set_conditions(hu_se_hisat_gene, fact = "sample_type") %>%
  subset_se(subset = "library_type=='RZ'") %>%
  set_batches("detectionparasiteby7sl")
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion 
##                     20                     25                     15                      4 
##       skin biopsy scar                   WBCs 
##                     16                     23
## The number of samples by batch are:
## 
##      negative notapplicable      positive 
##             6             1             5
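
The condition counts printed above are identical for all four objects (and sum to 103), presumably because set_conditions() runs before subset_se() and so reports on the full dataset; only the batch counts reflect the subsets. A minimal base R sketch, on a made-up table, of getting per-library condition counts after subsetting:

```r
## Hypothetical metadata with two library types.
toy <- data.frame(
  sample_type = factor(c("nasal swab", "PBMCs", "WBCs", "nasal swab", "PBMCs")),
  library_type = c("mRNA", "mRNA", "mRNA", "RZ", "RZ"))

## droplevels() removes sample types which no longer occur in the subset.
mrna <- droplevels(toy[toy[["library_type"]] == "mRNA", ])
rz <- droplevels(toy[toy[["library_type"]] == "RZ", ])

table(mrna[["sample_type"]])
table(rz[["sample_type"]])
```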

7.9 Extract only the healthy or scar samples, only mRNA

hu_hs_salmon_mrna <- subset_se(hu_se_salmon_mrna, subset = "hs=='healthy'|hs=='scar'") %>%
  set_conditions(fact = "hs")
## The numbers of samples by condition are:
## 
## healthy    scar 
##      15      15
hu_hs_hisat_mrna <- subset_se(hu_se_hisat_gene_mrna, subset = "hs=='healthy'|hs=='scar'") %>%
  set_conditions(fact = "hs")
## The numbers of samples by condition are:
## 
## healthy    scar 
##      15      15

8 HU Metadata

8.1 Percent of reads mapped to the genome by hisat

hu_mapped_mrna <- plot_metadata_factors(hu_se_hisat_gene_mrna, column = "hisat_genome_percent_log_hg38_115")
hu_mapped_mrna

hu_mapped_rz <- plot_metadata_factors(hu_se_hisat_gene_rz, column = "hisat_genome_percent_log_hg38_115")
hu_mapped_rz
## Warning: Groups with fewer than two datapoints have been dropped.
## ℹ Set `drop = FALSE` to consider such groups for position adjustment purposes.

8.2 Number of genes observed by salmon

hu_observed_mrna <- plot_metadata_factors(hu_se_hisat_gene_mrna, column = "salmon_observed_genes_hg38_115")
hu_observed_mrna

hu_observed_rz <- plot_metadata_factors(hu_se_hisat_gene_rz, column = "salmon_observed_genes_hg38_115")
hu_observed_rz
## Warning: Groups with fewer than two datapoints have been dropped.
## ℹ Set `drop = FALSE` to consider such groups for position adjustment purposes.

8.3 Percent reads quantified to the human transcriptome by salmon

hu_pct_mrna <- plot_metadata_factors(hu_se_salmon_mrna, column = "salmon_percent_hg38_115")
hu_pct_mrna

hu_pct_rz <- plot_metadata_factors(hu_se_salmon_rz, column = "salmon_percent_hg38_115")
hu_pct_rz
## Warning: Groups with fewer than two datapoints have been dropped.
## ℹ Set `drop = FALSE` to consider such groups for position adjustment purposes.

8.4 Number of kraken identified bacterial reads

While at it, add a column dividing the kraken bacterial classified reads by the total reads (the trimomatic_output column).

hu_kraken_mrna <- plot_metadata_factors(hu_se_salmon_mrna, column = "kraken_bacterial_classified")
hu_kraken_mrna

hu_kraken_rz <- plot_metadata_factors(hu_se_salmon_rz, column = "kraken_bacterial_classified")
hu_kraken_rz
## Warning: Groups with fewer than two datapoints have been dropped.
## ℹ Set `drop = FALSE` to consider such groups for position adjustment purposes.

colData(hu_se_salmon_mrna)[["pct_kraken_bacterial"]] <-
  colData(hu_se_salmon_mrna)[["kraken_bacterial_classified"]] /
  colData(hu_se_salmon_mrna)[["trimomatic_output"]]
colData(hu_se_salmon_rz)[["pct_kraken_bacterial"]] <-
  colData(hu_se_salmon_rz)[["kraken_bacterial_classified"]] /
  colData(hu_se_salmon_rz)[["trimomatic_output"]]
hu_kraken_pct_mrna <- plot_metadata_factors(hu_se_salmon_mrna, column = "pct_kraken_bacterial")
hu_kraken_pct_mrna
## Warning: Removed 1 row containing non-finite outside the scale range (`stat_ydensity()`).
## Warning: Removed 1 row containing non-finite outside the scale range (`stat_boxplot()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).

hu_kraken_pct_rz <- plot_metadata_factors(hu_se_salmon_rz, column = "pct_kraken_bacterial")
hu_kraken_pct_rz
## Warning: Removed 1 row containing non-finite outside the scale range (`stat_ydensity()`).
## Warning: Groups with fewer than two datapoints have been dropped.
## ℹ Set `drop = FALSE` to consider such groups for position adjustment purposes.
## Warning: Removed 1 row containing non-finite outside the scale range (`stat_boxplot()`).
## Warning: Removed 1 row containing missing values or values outside the scale range
## (`geom_point()`).
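
The 'non-finite' warnings above suggest that one sample in each library type has a missing or zero value in the numerator or denominator; I have not checked which. A minimal base R sketch of computing the ratio while making such cases explicit, using made-up numbers:

```r
## Hypothetical read counts; the second sample has no trimmed reads recorded
## and the fourth has no kraken result.
classified <- c(1200, 300, 50, NA)
total <- c(1e6, 0, 2e5, 5e5)

## Return NA instead of Inf/NaN when the denominator is missing or zero.
frac <- ifelse(is.na(total) | total == 0, NA_real_, classified / total)
frac
## Samples which would be dropped from the plot.
which(!is.finite(frac))
```

Note that this is a fraction rather than a percentage, as is the pct_kraken_bacterial column above; multiply by 100 if a percent scale is wanted.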

8.5 Sankey of a few factors

hu_sankey <- plot_meta_sankey(hu_se_salmon, factors = c("detectionparasiteby7sl", "sample_type", "library_type"))
## Warning: attributes are not identical across measure variables; they will be dropped
## Warning: The `size` argument of `element_rect()` is deprecated as of ggplot2 3.4.0.
## ℹ Please use the `linewidth` argument instead.
## ℹ The deprecated feature was likely used in the ggsankey package.
##   Please report the issue at <https://github.com/davidsjoberg/ggsankey/issues>.
## This warning is displayed once per session.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
hu_sankey
## A sankey plot describing the metadata of 103 samples,
## including 30 out of 0 nodes and traversing metadata factors:
## detectionparasiteby7sl, sample_type, library_type.

9 Write out the metadata in its current state

write_xlsx(data = modified_meta, excel = "sample_sheets/human_samples_202511_with_nasal_factor.xlsx")
## Deleting the file sample_sheets/human_samples_202511_with_nasal_factor.xlsx before writing the tables.
## write_xlsx() wrote sample_sheets/human_samples_202511_with_nasal_factor.xlsx.
## The cursor is on sheet first, row: 106 column: 117.

10 nonzero/libsize/etc

plot_legend(hu_se_salmon)
## The colors used in the expressionset are: #1B9E77, #66A61E, #7570B3, #D95F02, #E6AB02, #E7298A.

plot_libsize(hu_se_salmon)
## Library sizes of 103 samples, 
## ranging from 571,379 to 8,603,759.

plot_nonzero(hu_se_salmon, y_intercept = 0.75)
## The following samples have less than 9369.1 genes.
##   [1] "PRHU0001" "PRHU0002" "PRHU0009" "PRHU0010" "PRHU0011" "PRHU0018" "PRHU0019" "PRHU0020"
##   [9] "PRHU0012" "PRHU0013" "PRHU0014" "PRHU0021" "PRHU0022" "PRHU0023" "PRHU0015" "PRHU0016"
##  [17] "PRHU0017" "PRHU0024" "PRHU0025" "PRHU0026" "PRHU0038" "PRHU0006" "PRHU0007" "PRHU0008"
##  [25] "PRHU0005" "PRHU0004" "PRHU0003" "PRHU0027" "PRHU0028" "PRHU0029" "PRHU0030" "PRHU0031"
##  [33] "PRHU0032" "PRHU0035" "PRHU0033" "PRHU0034" "PRHU0036" "PRHU0037" "PRHU0039" "PRHU0040"
##  [41] "PRHU0041" "PRHU0042" "PRHU0043" "PRHU0044" "PRHU0045" "PRHU0046" "PRHU0047" "PRHU0048"
##  [49] "PRHU0049" "PRHU0050" "PRHU0051" "PRHU0052" "PRHU0053" "PRHU0054" "PRHU0055" "PRHU0056"
##  [57] "PRHU0057" "PRHU0058" "PRHU0059" "PRHU0060" "PRHU0061" "PRHU0062" "PRHU0063" "PRHU0064"
##  [65] "PRHU0065" "PRHU0066" "PRHU0067" "PRHU0068" "PRHU0069" "PRHU0070" "PRHU0071" "PRHU0072"
##  [73] "PRHU0073" "PRHU0074" "PRHU0075" "PRHU0076" "PRHU0077" "PRHU0078" "PRHU0079" "PRHU0080"
##  [81] "PRHU0081" "PRHU0082" "PRHU0083" "PRHU0084" "PRHU0085" "PRHU0086" "PRHU0087" "PRHU0088"
##  [89] "PRHU0089" "PRHU0090" "PRHU0091" "PRHU0092" "PRHU0093" "PRHU0094" "PRHU0095" "PRHU0096"
##  [97] "PRHU0097" "PRHU0098" "PRHU0099" "PRHU0100" "PRHU0101" "PRHU0102" "PRHU0103"
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## ℹ The deprecated feature was likely used in the hpgltools package.
##   Please report the issue to the authors.
## This warning is displayed once per session.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
## A non-zero genes plot of 103 samples.
## These samples have an average 2.286 CPM coverage and 4342 genes observed, ranging from 3303 to
## 4932.
## Warning: ggrepel: 95 unlabeled data points (too many overlaps). Consider increasing max.overlaps

plot_libsize(hu_se_hisat_gene)
## Library sizes of 103 samples, 
## ranging from 4,395,408 to 18,970,049.

plot_nonzero(hu_se_hisat_gene)
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## A non-zero genes plot of 103 samples.
## These samples have an average 12.61 CPM coverage and 15853 genes observed, ranging from 14800 to
## 17913.
## Warning: ggrepel: 61 unlabeled data points (too many overlaps). Consider increasing max.overlaps

plot_libsize(hu_se_hisat_gene_mrna)
## Library sizes of 91 samples, 
## ranging from 6,563,145 to 18,970,049.

plot_libsize(hu_se_hisat_gene_rz)
## Library sizes of 12 samples, 
## ranging from 4,395,408 to 16,550,439.

11 Normalize

A couple of plots of all the salmon samples, colored by sample type.

hu_sesn <- normalize(hu_se_salmon, transform = "log2", convert = "cpm",
                     filter = TRUE, norm = "quant")
## Removing 3906 low-count genes (10508 remaining).
## transform_counts: Found 646654 values equal to 0, adding 1 to the matrix.
plot_corheat(hu_sesn)
## A heatmap of pairwise sample correlations ranging from: 
## 0.410843298800327 to 0.894948457993829.

hu_sesn_pca <- plot_pca(hu_sesn)

pp(file = "images/hu_pca_sampletype.png")
hu_sesn_pca$plot
dev.off()
## png 
##   2
hu_sesn_pca
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by nasal swab, PBMCs, skin biopsy healthy, skin biopsy non-lesion, skin biopsy scar, WBCs
## Shapes are defined by mRNA, RZ.
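
For my own reference, here is a rough base-R sketch of what the cpm conversion and log2 transform amount to, including the pseudocount of 1 that the message above mentions. This is not the hpgltools implementation; the toy matrix, the filter cutoff, and all object names are made up for illustration, and the quantile normalization step is left out.

```r
# Toy count matrix: 4 genes (rows) by 3 samples (columns); the values are invented.
counts <- matrix(c(10, 0, 30, 60,
                   5, 5, 0, 40,
                   0, 20, 20, 160),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3)))

# Counts per million: scale every column by its library size.
lib_sizes <- colSums(counts)
cpm <- sweep(counts, 2, lib_sizes, "/") * 1e6

# Drop genes with no signal at all; this cutoff is arbitrary and only illustrative.
keep <- rowSums(cpm) > 0
cpm_kept <- cpm[keep, , drop = FALSE]

# log2 with a pseudocount of 1 so that zeros stay finite (log2(0 + 1) = 0).
log_cpm <- log2(cpm_kept + 1)
```

The pseudocount matters most for the salmon data, where a large share of the matrix is zero.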

11.1 Remove samples without 7SL detection state

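While we are at it, color by the 7SL detection state and set the shape by sample type (nasal swab, PBMCs, skin biopsies, WBCs). Note that no sample is currently labelled 'unknown', so the filter keeps all 103 samples, including the 29 marked notapplicable.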

hu_detected <- subset_se(hu_se_salmon, subset = "detectionparasiteby7sl!='unknown'") %>%
  set_conditions(fact = "detectionparasiteby7sl") %>%
  set_batches("sample_type")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##            63            29            11
## The number of samples by batch are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion 
##                     20                     25                     15                      4 
##       skin biopsy scar                   WBCs 
##                     16                     23
hu_detect_nb <- normalize(hu_detected, transform = "log2", convert = "cpm",
                          filter = TRUE, batch = "svaseq")
## Removing 3906 low-count genes (10508 remaining).
## transform_counts: Found 226065 values less than 0.
## transform_counts: Found 226065 values equal to 0, adding 1 to the matrix.
hu_detect_pca <- plot_pca(hu_detect_nb)
## Warning in ggplot2::guide_legend(overwrite.aes = list(size = plot_size)): Arguments in `...` must be used.
## ✖ Problematic argument:
## • overwrite.aes = list(size = plot_size)
## ℹ Did you misspell an argument name?
pp(file = "images/hu_pca_detect_sva.png")
hu_detect_pca$plot
dev.off()
## png 
##   2
hu_detect_pca
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, notapplicable, positive
## Shapes are defined by nasal swab, PBMCs, skin biopsy healthy, skin biopsy non-lesion, skin biopsy scar, WBCs.
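
The condition counts printed above add up to 103 (63 + 29 + 11), so the 'unknown' filter did not remove anything. A quick way to catch this kind of no-op is to tabulate the metadata column before and after filtering. The sketch below uses plain data frames rather than the real SummarizedExperiment; the column name mirrors the sample sheet, but the rows and the sampleid values are invented.

```r
# Hypothetical metadata; only the column name is taken from the sample sheet.
meta <- data.frame(
  sampleid = paste0("S", 1:6),
  detectionparasiteby7sl = c("negative", "positive", "notapplicable",
                             "negative", "notapplicable", "positive"),
  stringsAsFactors = FALSE)

# Tabulate the column, apply the same style of filter, and tabulate again.
before <- table(meta$detectionparasiteby7sl)
kept <- meta[meta$detectionparasiteby7sl != "unknown", ]
after <- table(kept$detectionparasiteby7sl)

# No row is labelled 'unknown' in this toy sheet, so nothing is removed.
n_removed <- nrow(meta) - nrow(kept)
```

If the intent is to drop the samples without a detection state, the label to filter on would presumably be notapplicable, as in the later sections.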

12 Compare distribution of RZ/Stranded libraries

Maria Adelaida is interested in how the relatively few ribozero (RZ) libraries (12 samples) are distributed compared with the much larger set of stranded mRNA libraries (91 samples).

I think it is likely that the nasal samples are of primary interest.
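
The chunks in this section rely on a combined detection_type factor with levels such as negative_mRNA and positive_RZ. As a reminder of how such a factor can be assembled, here is a small base-R sketch; the toy rows are invented, and I am assuming the factor is simply the 7SL detection state pasted onto the library type, which may not be exactly how the real column was made.

```r
# Hypothetical metadata with the two source columns; the values are made up.
meta <- data.frame(
  detectionparasiteby7sl = c("negative", "positive", "notapplicable",
                             "negative", "positive"),
  library_type = c("mRNA", "mRNA", "RZ", "RZ", "RZ"),
  stringsAsFactors = FALSE)

# Enumerate every detection/library combination so empty groups keep a level.
all_levels <- as.vector(outer(c("mRNA", "RZ"),
                              c("negative", "notapplicable", "positive"),
                              function(lib, det) paste(det, lib, sep = "_")))

# Paste the two columns together and fix the level order.
meta$detection_type <- factor(
  paste(meta$detectionparasiteby7sl, meta$library_type, sep = "_"),
  levels = all_levels)

# Counts per combined level, including the zeros.
counts_by_type <- table(meta$detection_type)
```

Keeping all six levels, even the empty ones, matches the tables printed below, where unused combinations show up with a count of 0.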

12.1 Salmon tx quantifications

salmon_mrna_7sl <- set_conditions(hu_se_salmon_mrna, fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'") %>%
  set_conditions(fact = "detection_type") %>%
  set_batches(fact = "sample_type")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##            57            28             6
## The numbers of samples by condition are:
## 
##      negative_mRNA        negative_RZ notapplicable_mRNA   notapplicable_RZ      positive_mRNA 
##                 57                  0                  0                  0                  6 
##        positive_RZ 
##                  0
## The number of samples by batch are:
## 
##             nasal swab    skin biopsy healthy skin biopsy non-lesion       skin biopsy scar 
##                     10                     14                      2                     14 
##                   WBCs 
##                     23
salmon_mrna_7sl_norm <- normalize(salmon_mrna_7sl, convert = "cpm", filter = TRUE,
                                  norm = "quant", transform = "log2")
## Removing 5333 low-count genes (9081 remaining).
## transform_counts: Found 317582 values equal to 0, adding 1 to the matrix.
## This still clusters primarily by sample type, and there are precious few positive samples.
plot_pca(salmon_mrna_7sl_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab, skin biopsy healthy, skin biopsy non-lesion, skin biopsy scar, WBCs.
## Warning in MASS::cov.trob(data[, vars], wt = weight * nrow(data)): Probable convergence failure
## Warning in MASS::cov.trob(data[, vars], wt = weight * nrow(data)): Probable convergence failure

salmon_mrna_7sl_nb <- normalize(salmon_mrna_7sl, convert = "cpm", filter = TRUE,
                                batch = "sva", transform = "log2")
## Removing 5333 low-count genes (9081 remaining).
## transform_counts: Found 104689 values less than 0.
## transform_counts: Found 104689 values equal to 0, adding 1 to the matrix.
plot_pca(salmon_mrna_7sl_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab, skin biopsy healthy, skin biopsy non-lesion, skin biopsy scar, WBCs.
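
The PCA summaries above report a fast_svd dimension reduction. As a rough illustration of the idea, the sketch below runs a centered SVD with base R's svd() on simulated data. It is a stand-in and not the routine the plotting function actually calls; the matrix and all names are invented.

```r
set.seed(1)
# Toy log2 expression matrix: 50 genes by 6 samples, purely simulated.
expr <- matrix(rnorm(50 * 6), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("sample", 1:6)))

# Center each gene so the decomposition captures variation between samples.
centered <- expr - rowMeans(expr)

# Singular value decomposition; the columns of v hold the sample coordinates.
decomp <- svd(centered)
pcs <- data.frame(PC1 = decomp$v[, 1], PC2 = decomp$v[, 2],
                  row.names = colnames(expr))

# Fraction of the total variance carried by each component.
var_explained <- decomp$d^2 / sum(decomp$d^2)
```

Plotting PC1 against PC2 from pcs, colored by condition, gives the same kind of picture as the plots in this section.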

12.2 Hisat genes, mRNA samples

hisat_mrna_7sl <- set_conditions(hu_se_hisat_gene_mrna, fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'") %>%
  set_conditions(fact = "detection_type") %>%
  set_batches(fact = "sample_type")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##            57            28             6
## The numbers of samples by condition are:
## 
##      negative_mRNA        negative_RZ notapplicable_mRNA   notapplicable_RZ      positive_mRNA 
##                 57                  0                  0                  0                  6 
##        positive_RZ 
##                  0
## The number of samples by batch are:
## 
##             nasal swab    skin biopsy healthy skin biopsy non-lesion       skin biopsy scar 
##                     10                     14                      2                     14 
##                   WBCs 
##                     23
hisat_mrna_7sl_norm <- normalize(hisat_mrna_7sl, convert = "cpm", filter = TRUE,
                                 norm = "quant", transform = "log2")
## Removing 6173 low-count genes (15398 remaining).
## transform_counts: Found 14191 values equal to 0, adding 1 to the matrix.
## This still clusters primarily by sample type, and there are precious few positive samples.
plot_pca(hisat_mrna_7sl_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab, skin biopsy healthy, skin biopsy non-lesion, skin biopsy scar, WBCs.
## Warning in MASS::cov.trob(data[, vars], wt = weight * nrow(data)): Probable convergence failure
## Warning in MASS::cov.trob(data[, vars], wt = weight * nrow(data)): Probable convergence failure

hisat_mrna_7sl_nb <- normalize(hisat_mrna_7sl, convert = "cpm", filter = TRUE,
                               batch = "sva", transform = "log2")
## Removing 6173 low-count genes (15398 remaining).
## transform_counts: Found 6654 values less than 0.
## transform_counts: Found 6654 values equal to 0, adding 1 to the matrix.
plot_pca(hisat_mrna_7sl_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab, skin biopsy healthy, skin biopsy non-lesion, skin biopsy scar, WBCs.

salmon_mrna_7sl_de <- all_pairwise(salmon_mrna_7sl, model_fstring = "~ 0 + condition",
                                   model_svs = "svaseq", filter = TRUE, force = TRUE)
## negative_mRNA positive_mRNA 
##            57             6
## Removing 5333 low-count genes (9081 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 340140 entries to zero.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## negative_mRNA positive_mRNA 
##            57             6
## conditions
## negative_mRNA positive_mRNA 
##            57             6
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## negative_mRNA positive_mRNA 
##            57             6
salmon_mrna_7sl_de
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
##                 p_RNA___RN
## basic_vs_deseq      0.2161
## basic_vs_dream      0.3679
## basic_vs_ebseq      0.5795
## basic_vs_edger      0.2731
## basic_vs_limma      0.3580
## basic_vs_noiseq     0.3156
## deseq_vs_dream      0.3559
## deseq_vs_ebseq      0.5695
## deseq_vs_edger      0.6088
## deseq_vs_limma      0.3231
## deseq_vs_noiseq     0.5692
## dream_vs_ebseq      0.4002
## dream_vs_edger      0.6833
## dream_vs_limma      0.9114
## dream_vs_noiseq     0.6110
## ebseq_vs_edger      0.5873
## ebseq_vs_limma      0.3796
## ebseq_vs_noiseq     0.6072
## edger_vs_limma      0.6136
## edger_vs_noiseq     0.7090
## limma_vs_noiseq     0.5769
salmon_mrna_7sl_table <- combine_de_tables(salmon_mrna_7sl_de, excel = "excel/salmon_mrna_7sl_table.xlsx")
## Deleting the file excel/salmon_mrna_7sl_table.xlsx before writing the tables.
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.
salmon_mrna_7sl_table
## A set of combined differential expression results.
##                            table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup
## 1 positive_mRNA_vs_negative_mRNA          19           445          24            37          15
##   limma_sigdown
## 1            20
## Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
## ℹ Please use tidy evaluation idioms with `aes()`.
## ℹ See also `vignette("ggplot2-in-packages")` for more information.
## ℹ The deprecated feature was likely used in the UpSetR package.
##   Please report the issue to the authors.
## This warning is displayed once per session.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
## Warning: The `size` argument of `element_line()` is deprecated as of ggplot2 3.4.0.
## ℹ Please use the `linewidth` argument instead.
## ℹ The deprecated feature was likely used in the UpSetR package.
##   Please report the issue to the authors.
## This warning is displayed once per session.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
## Plot describing unique/shared genes in a differential expression table.
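
The logFC agreement table printed above compares the methods pair by pair. I am assuming those scores behave like correlations between the per-gene logFC values from each method; I have not confirmed that against the package source. Under that assumption, the calculation looks roughly like the following base-R sketch, with simulated logFC values and placeholder method names.

```r
set.seed(2)
# Simulated per-gene logFC values for three hypothetical methods.
truth <- rnorm(200)
logfc <- data.frame(
  method_a = truth + rnorm(200, sd = 0.2),
  method_b = truth + rnorm(200, sd = 0.2),
  method_c = truth + rnorm(200, sd = 1.5))

# Pairwise Pearson correlation between the methods.
agreement <- cor(logfc, method = "pearson")

# Flatten the upper triangle into named pairs, similar in shape to the printed table.
idx <- which(upper.tri(agreement), arr.ind = TRUE)
pairwise <- setNames(agreement[idx],
                     paste(rownames(agreement)[idx[, 1]],
                           colnames(agreement)[idx[, 2]], sep = "_vs_"))
```

Low values, like several of the basic_vs_* rows for the salmon mRNA contrast, would then mean the methods disagree about fold changes for many genes, which is not surprising with 57 negative and only 6 positive samples.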

12.2.1 Salmon nasal samples, mRNA libraries

Now restrict to just the nasal samples.

salmon_nasal_mrna <- subset_se(hu_se_salmon_mrna, subset = "sample_type=='nasal swab'") %>%
  set_conditions(fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'") %>%
  set_conditions(fact = "detection_type") %>%
  set_batches(fact = "sample_type")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             8             1             2
## The numbers of samples by condition are:
## 
##      negative_mRNA        negative_RZ notapplicable_mRNA   notapplicable_RZ      positive_mRNA 
##                  8                  0                  0                  0                  2 
##        positive_RZ 
##                  0
## The number of samples by batch are:
## 
## nasal swab 
##         10
salmon_nasal_mrna_norm <- normalize(salmon_nasal_mrna, convert = "cpm", filter = TRUE,
                                    norm = "quant", transform = "log2")
## Removing 10144 low-count genes (4270 remaining).
## transform_counts: Found 10719 values equal to 0, adding 1 to the matrix.
plot_pca(salmon_nasal_mrna_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab.

salmon_nasal_mrna_nb <- normalize(salmon_nasal_mrna, convert = "cpm", filter = TRUE,
                                  batch = "sva", transform = "log2")
## Removing 10144 low-count genes (4270 remaining).
## transform_counts: Found 3569 values less than 0.
## transform_counts: Found 3569 values equal to 0, adding 1 to the matrix.
plot_pca(salmon_nasal_mrna_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab.

salmon_nasal_mrna_7sl_de <- all_pairwise(salmon_nasal_mrna, model_fstring = "~ 0 + condition",
                                         model_svs = "svaseq", filter = TRUE, force = TRUE)
## negative_mRNA positive_mRNA 
##             8             2
## Removing 10144 low-count genes (4270 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 13141 entries to zero.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## negative_mRNA positive_mRNA 
##             8             2
## conditions
## negative_mRNA positive_mRNA 
##             8             2
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## negative_mRNA positive_mRNA 
##             8             2
salmon_nasal_mrna_7sl_de
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
##                 p_RNA___RN
## basic_vs_deseq      0.4983
## basic_vs_dream      0.7292
## basic_vs_ebseq      0.6543
## basic_vs_edger      0.6276
## basic_vs_limma      0.7234
## basic_vs_noiseq     0.7864
## deseq_vs_dream      0.6132
## deseq_vs_ebseq      0.6434
## deseq_vs_edger      0.7982
## deseq_vs_limma      0.5877
## deseq_vs_noiseq     0.6565
## dream_vs_ebseq      0.6895
## dream_vs_edger      0.8311
## dream_vs_limma      0.9716
## dream_vs_noiseq     0.7470
## ebseq_vs_edger      0.7483
## ebseq_vs_limma      0.6357
## ebseq_vs_noiseq     0.9642
## edger_vs_limma      0.7964
## edger_vs_noiseq     0.7663
## limma_vs_noiseq     0.7051
salmon_nasal_mrna_7sl_table <- combine_de_tables(salmon_nasal_mrna_7sl_de, excel = "excel/salmon_nasal_mrna_7sl_table.xlsx")
## Deleting the file excel/salmon_nasal_mrna_7sl_table.xlsx before writing the tables.
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.
salmon_nasal_mrna_7sl_table
## A set of combined differential expression results.
##                            table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup
## 1 positive_mRNA_vs_negative_mRNA          29           126           4            29           9
##   limma_sigdown
## 1             0
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
## Plot describing unique/shared genes in a differential expression table.

12.2.2 Hisat nasal samples, mRNA libraries

hisat_nasal_mrna <- subset_se(hu_se_hisat_gene_mrna, subset = "sample_type=='nasal swab'") %>%
  set_conditions(fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'") %>%
  set_conditions(fact = "detection_type") %>%
  set_batches(fact = "sample_type")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             8             1             2
## The numbers of samples by condition are:
## 
##      negative_mRNA        negative_RZ notapplicable_mRNA   notapplicable_RZ      positive_mRNA 
##                  8                  0                  0                  0                  2 
##        positive_RZ 
##                  0
## The number of samples by batch are:
## 
## nasal swab 
##         10
hisat_nasal_mrna_norm <- normalize(hisat_nasal_mrna, convert = "cpm", filter = TRUE,
                                   norm = "quant", transform = "log2")
## Removing 8412 low-count genes (13159 remaining).
## transform_counts: Found 12 values equal to 0, adding 1 to the matrix.
plot_pca(hisat_nasal_mrna_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab.

hisat_nasal_mrna_nb <- normalize(hisat_nasal_mrna, convert = "cpm", filter = TRUE,
                                 batch = "sva", transform = "log2")
## Removing 8412 low-count genes (13159 remaining).
## transform_counts: Found 108 values less than 0.
## transform_counts: Found 108 values equal to 0, adding 1 to the matrix.
plot_pca(hisat_nasal_mrna_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab.

12.3 Repeat with the ribo zero samples

12.3.1 Salmon 7SL samples, ribozero

salmon_rz_7sl <- set_conditions(hu_se_salmon_rz, fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'") %>%
  set_conditions(fact = "detection_type") %>%
  set_batches(fact = "sample_type")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             6             1             5
## The numbers of samples by condition are:
## 
##      negative_mRNA        negative_RZ notapplicable_mRNA   notapplicable_RZ      positive_mRNA 
##                  0                  6                  0                  0                  0 
##        positive_RZ 
##                  5
## The number of samples by batch are:
## 
##             nasal swab skin biopsy non-lesion       skin biopsy scar 
##                      8                      2                      1
salmon_rz_7sl_norm <- normalize(salmon_rz_7sl, convert = "cpm", filter = TRUE,
                                norm = "quant", transform = "log2")
## Removing 7999 low-count genes (6415 remaining).
## transform_counts: Found 27489 values equal to 0, adding 1 to the matrix.
## This still clusters primarily by sample type, and there are few samples overall (6 negative, 5 positive).
plot_pca(salmon_rz_7sl_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab, skin biopsy non-lesion, skin biopsy scar.

salmon_rz_7sl_nb <- normalize(salmon_rz_7sl, convert = "cpm", filter = TRUE,
                              batch = "sva", transform = "log2")
## Removing 7999 low-count genes (6415 remaining).
## transform_counts: Found 8841 values less than 0.
## transform_counts: Found 8841 values equal to 0, adding 1 to the matrix.
plot_pca(salmon_rz_7sl_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab, skin biopsy non-lesion, skin biopsy scar.

salmon_rz_7sl_de <- all_pairwise(salmon_rz_7sl, model_fstring = "~ 0 + condition",
                                 model_svs = "svaseq", filter = TRUE, force = TRUE)
## negative_RZ positive_RZ 
##           6           5
## Removing 7999 low-count genes (6415 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 29027 entries to zero.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## negative_RZ positive_RZ 
##           6           5
## conditions
## negative_RZ positive_RZ 
##           6           5
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## negative_RZ positive_RZ 
##           6           5
salmon_rz_7sl_de
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
##                 ps_RZ___RZ
## basic_vs_deseq      0.4620
## basic_vs_dream      0.6389
## basic_vs_ebseq      0.6936
## basic_vs_edger      0.5229
## basic_vs_limma      0.6401
## basic_vs_noiseq     0.6094
## deseq_vs_dream      0.7216
## deseq_vs_ebseq      0.5161
## deseq_vs_edger      0.8905
## deseq_vs_limma      0.7081
## deseq_vs_noiseq     0.6824
## dream_vs_ebseq      0.5468
## dream_vs_edger      0.8219
## dream_vs_limma      0.9929
## dream_vs_noiseq     0.7829
## ebseq_vs_edger      0.5925
## ebseq_vs_limma      0.5352
## ebseq_vs_noiseq     0.6974
## edger_vs_limma      0.8047
## edger_vs_noiseq     0.7644
## limma_vs_noiseq     0.7901
salmon_rz_7sl_table <- combine_de_tables(salmon_rz_7sl_de, excel = "excel/salmon_rz_7sl_table.xlsx")
## Deleting the file excel/salmon_rz_7sl_table.xlsx before writing the tables.
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.
salmon_rz_7sl_table
## A set of combined differential expression results.
##                        table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup
## 1 positive_RZ_vs_negative_RZ          31            30          45            40           4
##   limma_sigdown
## 1            13
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
## Plot describing unique/shared genes in a differential expression table.

12.3.2 Hisat 7SL samples, ribozero

hisat_rz_7sl <- set_conditions(hu_se_hisat_gene_rz, fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'") %>%
  set_conditions(fact = "detection_type") %>%
  set_batches(fact = "sample_type")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             6             1             5
## The numbers of samples by condition are:
## 
##      negative_mRNA        negative_RZ notapplicable_mRNA   notapplicable_RZ      positive_mRNA 
##                  0                  6                  0                  0                  0 
##        positive_RZ 
##                  5
## The number of samples by batch are:
## 
##             nasal swab skin biopsy non-lesion       skin biopsy scar 
##                      8                      2                      1
hisat_rz_7sl_norm <- normalize(hisat_rz_7sl, convert = "cpm", filter = TRUE,
                               norm = "quant", transform = "log2")
## Removing 7116 low-count genes (14455 remaining).
## transform_counts: Found 306 values equal to 0, adding 1 to the matrix.
## This still clusters primarily by sample type, and there are few samples overall (6 negative, 5 positive).
plot_pca(hisat_rz_7sl_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab, skin biopsy non-lesion, skin biopsy scar.

hisat_rz_7sl_nb <- normalize(hisat_rz_7sl, convert = "cpm", filter = TRUE,
                             batch = "sva", transform = "log2")
## Removing 7116 low-count genes (14455 remaining).
## transform_counts: Found 322 values less than 0.
## transform_counts: Found 322 values equal to 0, adding 1 to the matrix.
plot_pca(hisat_rz_7sl_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab, skin biopsy non-lesion, skin biopsy scar.

hisat_rz_7sl_de <- all_pairwise(hisat_rz_7sl, model_fstring = "~ 0 + condition",
                                model_svs = "svaseq", filter = TRUE)
## negative_RZ positive_RZ 
##           6           5
## Removing 7116 low-count genes (14455 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 11842 entries to zero.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## conditions
## negative_RZ positive_RZ 
##           6           5
## conditions
## negative_RZ positive_RZ 
##           6           5
## conditions
## negative_RZ positive_RZ 
##           6           5
hisat_rz_7sl_de
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
##                 ps_RZ___RZ
## basic_vs_deseq      0.7651
## basic_vs_dream      0.7920
## basic_vs_ebseq      0.7624
## basic_vs_edger      0.7132
## basic_vs_limma      0.8185
## basic_vs_noiseq     0.4647
## deseq_vs_dream      0.9638
## deseq_vs_ebseq      0.7106
## deseq_vs_edger      0.9463
## deseq_vs_limma      0.9561
## deseq_vs_noiseq     0.7600
## dream_vs_ebseq      0.6907
## dream_vs_edger      0.9003
## dream_vs_limma      0.9944
## dream_vs_noiseq     0.7741
## ebseq_vs_edger      0.7353
## ebseq_vs_limma      0.6867
## ebseq_vs_noiseq     0.4985
## edger_vs_limma      0.8904
## edger_vs_noiseq     0.7077
## limma_vs_noiseq     0.7468
hisat_rz_7sl_table <- combine_de_tables(hisat_rz_7sl_de, excel = "excel/hisat_rz_7sl_table.xlsx")
## Deleting the file excel/hisat_rz_7sl_table.xlsx before writing the tables.
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.
hisat_rz_7sl_table
## A set of combined differential expression results.
##                        table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup
## 1 positive_RZ_vs_negative_RZ         572           306         638           333          38
##   limma_sigdown
## 1            16
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
## Plot describing unique/shared genes in a differential expression table.

12.3.3 Salmon 7SL nasal samples, ribozero

Now restrict to just the nasal samples.

salmon_nasal_rz <- subset_se(hu_se_salmon_rz, subset = "sample_type=='nasal swab'") %>%
  set_conditions(fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'") %>%
  set_conditions(fact = "detection_type") %>%
  set_batches(fact = "sample_type")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             3             1             5
## The numbers of samples by condition are:
## 
##      negative_mRNA        negative_RZ notapplicable_mRNA   notapplicable_RZ      positive_mRNA 
##                  0                  3                  0                  0                  0 
##        positive_RZ 
##                  5
## The number of samples by batch are:
## 
## nasal swab 
##          8
salmon_nasal_rz_norm <- normalize(salmon_nasal_rz, convert = "cpm", filter = TRUE,
                                  norm = "quant", transform = "log2")
## Removing 9288 low-count genes (5126 remaining).
## transform_counts: Found 12445 values equal to 0, adding 1 to the matrix.
plot_pca(salmon_nasal_rz_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab.
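
As a reminder of what the filter/cpm/log2 options amount to, here is a minimal base R sketch on a made-up count matrix. It skips the quantile step, the filter threshold is a hypothetical one, and it is not the code hpgltools runs; the pseudocount of 1 is inferred from the "adding 1 to the matrix" message above.

```r
# Toy count matrix: 4 genes x 3 samples (made-up numbers, filled column by column).
counts <- matrix(c(10, 0, 250, 40,
                   20, 5, 300, 0,
                   0, 0, 100, 35),
                 nrow = 4,
                 dimnames = list(paste0("gene", 1:4), paste0("s", 1:3)))

# Low-count filter (hypothetical threshold): keep genes with at least 10 reads in total.
keep <- rowSums(counts) >= 10
filtered <- counts[keep, , drop = FALSE]

# Counts per million: divide each column by its library size and scale by 1e6.
lib_sizes <- colSums(filtered)
cpm <- sweep(filtered, 2, lib_sizes, "/") * 1e6

# log2 with a pseudocount of 1 so that zeros remain defined.
log_cpm <- log2(cpm + 1)
log_cpm
```

Zeros stay at exactly 0 after the transform, which is why heavily zero-inflated inputs like the salmon nasal data end up with many tied values at the bottom of the scale.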

salmon_nasal_rz_nb <- normalize(salmon_nasal_rz, convert = "cpm", filter = TRUE,
                                batch = "sva", transform = "log2")
## Removing 9288 low-count genes (5126 remaining).
## transform_counts: Found 2938 values less than 0.
## transform_counts: Found 2938 values equal to 0, adding 1 to the matrix.
plot_pca(salmon_nasal_rz_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab.

salmon_rz_7sl_nasal_de <- all_pairwise(salmon_nasal_rz, model_fstring = "~ 0 + condition",
                                       model_svs = "svaseq", filter = TRUE, force = TRUE)
## negative_RZ positive_RZ 
##           3           5
## Removing 9288 low-count genes (5126 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 13365 entries to zero.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## negative_RZ positive_RZ 
##           3           5
## conditions
## negative_RZ positive_RZ 
##           3           5
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## negative_RZ positive_RZ 
##           3           5
salmon_rz_7sl_nasal_de
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
##                 ps_RZ___RZ
## basic_vs_deseq      0.6978
## basic_vs_dream      0.8808
## basic_vs_ebseq      0.7100
## basic_vs_edger      0.7549
## basic_vs_limma      0.9121
## basic_vs_noiseq     0.8388
## deseq_vs_dream      0.7444
## deseq_vs_ebseq      0.7559
## deseq_vs_edger      0.9405
## deseq_vs_limma      0.7470
## deseq_vs_noiseq     0.7843
## dream_vs_ebseq      0.7597
## dream_vs_edger      0.8131
## dream_vs_limma      0.9897
## dream_vs_noiseq     0.8514
## ebseq_vs_edger      0.8398
## ebseq_vs_limma      0.7619
## ebseq_vs_noiseq     0.9421
## edger_vs_limma      0.8119
## edger_vs_noiseq     0.8542
## limma_vs_noiseq     0.8586
salmon_rz_7sl_nasal_table <- combine_de_tables(
  salmon_rz_7sl_nasal_de, excel = "excel/salmon_rz_7sl_nasal_table.xlsx")
## Deleting the file excel/salmon_rz_7sl_nasal_table.xlsx before writing the tables.
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.
salmon_rz_7sl_nasal_table
## A set of combined differential expression results.
##                        table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup
## 1 positive_RZ_vs_negative_RZ          18            27          16            49           0
##   limma_sigdown
## 1             8
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
## Plot describing unique/shared genes in a differential expression table.
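
The logFC agreement tables printed above read like pairwise correlations between the per-method logFC vectors. Assuming that is what they are (I have not checked this against the hpgltools source), the following sketch builds the same kind of table from simulated values; the method names and noise levels are invented for illustration.

```r
set.seed(1)
n_genes <- 200
true_lfc <- rnorm(n_genes)

# Hypothetical per-method logFC estimates: a shared signal plus method-specific noise.
lfc <- data.frame(
  deseq = true_lfc + rnorm(n_genes, sd = 0.2),
  edger = true_lfc + rnorm(n_genes, sd = 0.2),
  limma = true_lfc + rnorm(n_genes, sd = 0.5)
)

# Pearson correlation between every pair of methods.
agreement <- cor(lfc)

# Flatten the upper triangle into named "a_vs_b" entries.
method_pairs <- combn(colnames(lfc), 2)
pairwise <- setNames(
  apply(method_pairs, 2, function(p) agreement[p[1], p[2]]),
  apply(method_pairs, 2, paste, collapse = "_vs_")
)
round(pairwise, 4)
```

Under this reading, a low value such as basic_vs_noiseq in the hisat ribozero table means the two methods rank the fold changes differently, not that either one is wrong.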

12.3.4 Hisat 7SL nasal samples, ribozero

hisat_nasal_rz <- subset_se(hu_se_hisat_gene_rz, subset = "sample_type=='nasal swab'") %>%
  set_conditions(fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'") %>%
  set_conditions(fact = "detection_type") %>%
  set_batches(fact = "sample_type")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             3             1             5
## The numbers of samples by condition are:
## 
##      negative_mRNA        negative_RZ notapplicable_mRNA   notapplicable_RZ      positive_mRNA 
##                  0                  3                  0                  0                  0 
##        positive_RZ 
##                  5
## The number of samples by batch are:
## 
## nasal swab 
##          8
hisat_nasal_rz_norm <- normalize(hisat_nasal_rz, convert = "cpm", filter = TRUE,
                                 norm = "quant", transform = "log2")
## Removing 8212 low-count genes (13359 remaining).
## transform_counts: Found 9 values equal to 0, adding 1 to the matrix.
plot_pca(hisat_nasal_rz_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab.

hisat_nasal_rz_nb <- normalize(hisat_nasal_rz, convert = "cpm", filter = TRUE,
                               batch = "sva", transform = "log2")
## Removing 8212 low-count genes (13359 remaining).
## transform_counts: Found 137 values less than 0.
## transform_counts: Found 137 values equal to 0, adding 1 to the matrix.
plot_pca(hisat_nasal_rz_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative_mRNA, negative_RZ, notapplicable_mRNA, notapplicable_RZ, positive_mRNA, positive_RZ
## Shapes are defined by nasal swab.

hisat_rz_7sl_nasal_de <- all_pairwise(hisat_nasal_rz, model_fstring = "~ 0 + condition",
                                      model_svs = "svaseq", filter = TRUE, force = TRUE)
## negative_RZ positive_RZ 
##           3           5
## Removing 8212 low-count genes (13359 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 2833 entries to zero.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## negative_RZ positive_RZ 
##           3           5
## conditions
## negative_RZ positive_RZ 
##           3           5
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## negative_RZ positive_RZ 
##           3           5
hisat_rz_7sl_nasal_de
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
##                 ps_RZ___RZ
## basic_vs_deseq      0.8838
## basic_vs_dream      0.8874
## basic_vs_ebseq      0.8892
## basic_vs_edger      0.8851
## basic_vs_limma      0.8999
## basic_vs_noiseq     0.8937
## deseq_vs_dream      0.9548
## deseq_vs_ebseq      0.8968
## deseq_vs_edger      0.9995
## deseq_vs_limma      0.9510
## deseq_vs_noiseq     0.8922
## dream_vs_ebseq      0.9191
## dream_vs_edger      0.9563
## dream_vs_limma      0.9950
## dream_vs_noiseq     0.9162
## ebseq_vs_edger      0.8978
## ebseq_vs_limma      0.9170
## ebseq_vs_noiseq     0.9992
## edger_vs_limma      0.9526
## edger_vs_noiseq     0.8931
## limma_vs_noiseq     0.9156
hisat_rz_7sl_nasal_table <- combine_de_tables(
  hisat_rz_7sl_nasal_de, excel = "excel/hisat_rz_7sl_nasal_table.xlsx")
## Deleting the file excel/hisat_rz_7sl_nasal_table.xlsx before writing the tables.
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.
hisat_rz_7sl_nasal_table
## A set of combined differential expression results.
##                        table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup
## 1 positive_RZ_vs_negative_RZ         307           164         278           148           0
##   limma_sigdown
## 1             0
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
## Plot describing unique/shared genes in a differential expression table.
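
The salmon and hisat versions of the nasal ribozero contrast give very different numbers of significant genes, and the two started from different gene sets after filtering (5126 genes for salmon, 13359 for hisat). Collecting the counts printed above into one small table makes the difference easier to see; the data frame and its column names are just a convenience for this sketch.

```r
# Significant gene counts copied from the two combine_de_tables() summaries above.
nasal_rz_counts <- data.frame(
  quantifier = c("salmon", "hisat"),
  genes_after_filter = c(5126, 13359),
  deseq_up = c(18, 307),
  deseq_down = c(27, 164),
  edger_up = c(16, 278),
  edger_down = c(49, 148),
  limma_up = c(0, 0),
  limma_down = c(8, 0)
)

# Totals per method, to compare the two quantifications at a glance.
nasal_rz_counts$deseq_total <- with(nasal_rz_counts, deseq_up + deseq_down)
nasal_rz_counts$edger_total <- with(nasal_rz_counts, edger_up + edger_down)
nasal_rz_counts$limma_total <- with(nasal_rz_counts, limma_up + limma_down)
nasal_rz_counts
```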

13 Look at sample type and 7SL

Start out by recategorizing all samples by the combination of sample type and 7SL state. Then once again extract the relatively small number of nasal samples.

For the moment, just do this with the salmon quantifications of the mRNA libraries.
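
The combined factor is simply the sample type pasted to the 7SL call. A minimal sketch of how such a column can be built, using a toy metadata frame; the real `sample_7sl` column was created earlier in this document, and the exact construction here is an assumption based on the level names it produces.

```r
# Toy metadata with the two source columns (values chosen for illustration).
meta <- data.frame(
  sample_type = c("nasal swab", "nasal swab", "WBCs", "PBMCs"),
  detectionparasiteby7sl = c("positive", "negative", "positive", "notapplicable"),
  stringsAsFactors = FALSE
)

# Join the two columns with an underscore and store the result as a factor.
meta$sample_7sl <- factor(paste0(meta$sample_type, "_", meta$detectionparasiteby7sl))
table(meta$sample_7sl)
```

Because the 7SL call is always the part after the last underscore, it can be recovered later with a regular expression.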

13.1 All samples: color by (sample type + 7SL)

hu_s7sl <- set_conditions(hu_se_salmon_mrna, fact = "sample_7sl")
## The numbers of samples by condition are:
## 
##               nasal swab_negative          nasal swab_notapplicable 
##                                 8                                 1 
##               nasal swab_positive               PBMCs_notapplicable 
##                                 2                                25 
##      skin biopsy healthy_negative skin biopsy healthy_notapplicable 
##                                14                                 1 
##   skin biopsy non-lesion_negative         skin biopsy scar_negative 
##                                 2                                14 
##    skin biopsy scar_notapplicable                     WBCs_negative 
##                                 1                                19 
##                     WBCs_positive 
##                                 4
hu_s7sl_norm <- normalize(hu_s7sl, transform = "log2", convert = "cpm",
                          norm = "quant", filter = TRUE)
## Removing 4625 low-count genes (9789 remaining).
## transform_counts: Found 511587 values equal to 0, adding 1 to the matrix.
plot_pca(hu_s7sl_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by nasal swab_negative, nasal swab_notapplicable, nasal swab_positive, PBMCs_notapplicable, skin biopsy healthy_negative, skin biopsy healthy_notapplicable, skin biopsy non-lesion_negative, skin biopsy scar_negative, skin biopsy scar_notapplicable, WBCs_negative, WBCs_positive
## Shapes are defined by negative, notapplicable, positive.
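
plot_pca() reports that it used a fast_svd reduction. The sketch below shows the same idea with base R's svd() on a random matrix; it is an illustration of PCA by singular value decomposition, not the package's implementation, and every object name in it is made up.

```r
set.seed(42)
# Toy log-scale expression matrix: 50 genes x 6 samples.
expr <- matrix(rnorm(50 * 6), nrow = 50,
               dimnames = list(NULL, paste0("s", 1:6)))

# Center each gene so the decomposition describes variation among samples.
centered <- expr - rowMeans(expr)
decomp <- svd(centered)

# Sample coordinates: columns of V scaled by the singular values.
coords <- decomp$v %*% diag(decomp$d)
rownames(coords) <- colnames(expr)

# Fraction of variance carried by each component.
var_explained <- decomp$d^2 / sum(decomp$d^2)

pc_df <- data.frame(sample = rownames(coords),
                    PC1 = coords[, 1], PC2 = coords[, 2])
pc_df
round(var_explained[1:2], 3)
```

The PC1/PC2 columns are what gets plotted; the sign of each component is arbitrary, so two runs of different SVD routines can produce mirror-image plots of the same structure.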

13.2 Extract the nasal subset

There is not much to work with here: the mRNA libraries contain only 11 nasal swabs (8 negative, 2 positive, and 1 not applicable).

hu_nasal <- subset_se(hu_s7sl, subset = "sample_type=='nasal swab'")
hu_nasal_norm <- normalize(hu_nasal, transform = "log2", convert = "cpm",
                           norm = "quant", filter = TRUE)
## Removing 9942 low-count genes (4472 remaining).
## transform_counts: Found 13344 values equal to 0, adding 1 to the matrix.
plot_pca(hu_nasal_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by nasal swab_negative, nasal swab_notapplicable, nasal swab_positive
## Shapes are defined by negative, notapplicable, positive.

hu_nasal_nb <- normalize(hu_nasal, transform = "log2", convert = "cpm",
                         batch = "svaseq", filter = TRUE)
## Removing 9942 low-count genes (4472 remaining).
## transform_counts: Found 4426 values less than 0.
## transform_counts: Found 4426 values equal to 0, adding 1 to the matrix.
nasal_pca <- plot_pca(hu_nasal_nb)
nasal_pca
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by nasal swab_negative, nasal swab_notapplicable, nasal swab_positive
## Shapes are defined by negative, notapplicable, positive.

pp(file = "images/nasal_sample_np.png")
nasal_pca$plot
dev.off()
## png 
##   2

13.3 Extract the blood cells (WBCs, not PBMCs)

hu_wbc <- subset_se(hu_s7sl, subset = "sample_type=='WBCs'")
hu_wbc_nb <- normalize(hu_wbc, transform = "log2", convert = "cpm",
                       batch = "svaseq", filter = TRUE)
## Removing 7669 low-count genes (6745 remaining).
## transform_counts: Found 22099 values less than 0.
## transform_counts: Found 22099 values equal to 0, adding 1 to the matrix.
wbc_pca <- plot_pca(hu_wbc_nb)
wbc_pca
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by WBCs_negative, WBCs_positive
## Shapes are defined by negative, positive.

pp(file = "images/wbc_sample_np.png")
wbc_pca$plot
dev.off()
## png 
##   2

13.4 Skin samples

hu_skin <- subset_se(hu_s7sl, subset = "skinp=='skin'")
hu_skin_nb <- normalize(hu_skin, transform = "log2", convert = "cpm",
                        batch = "svaseq", filter = TRUE)
## Removing 7754 low-count genes (6660 remaining).
## transform_counts: Found 32990 values less than 0.
## transform_counts: Found 32990 values equal to 0, adding 1 to the matrix.
skin_pca <- plot_pca(hu_skin_nb)
skin_pca
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by skin biopsy healthy_negative, skin biopsy healthy_notapplicable, skin biopsy non-lesion_negative, skin biopsy scar_negative, skin biopsy scar_notapplicable
## Shapes are defined by negative, notapplicable.

pp(file = "images/skin_sample_np.png")
skin_pca$plot
dev.off()
## png 
##   2

Return to the nasal samples: reduce the condition to the bare 7SL state, drop the one sample marked notapplicable, and contrast positive against negative.

short_factor <- gsub(x = as.character(colData(hu_nasal)[["condition"]]),
                     pattern = ".*_(.*)$", replacement = "\\1")
hu_nasal <- set_conditions(hu_nasal, fact = as.factor(short_factor))
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             8             1             2
hu_nasal_np <- subset_se(hu_nasal, subset = "condition!='notapplicable'")
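
The gsub() call above keeps only the text after the last underscore of each condition name. A quick check of that pattern on the three nasal labels, plus one of the longer skin labels; the vector names are only for this example.

```r
conditions <- c("nasal swab_negative", "nasal swab_notapplicable",
                "nasal swab_positive")

# The greedy ".*_" consumes everything up to the last underscore;
# the captured group is whatever follows it.
short <- gsub(x = conditions, pattern = ".*_(.*)$", replacement = "\\1")
short

# Labels containing hyphens and spaces behave the same way.
skin_short <- gsub(x = "skin biopsy non-lesion_negative",
                   pattern = ".*_(.*)$", replacement = "\\1")
skin_short
```

A label with no underscore at all is returned unchanged, since the pattern does not match.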

hu_nasal_de <- all_pairwise(hu_nasal_np, filter = TRUE, force = TRUE,
                            model_fstring = "~ 0 + condition", model_svs = "svaseq")
## negative positive 
##        8        2
## Removing 10144 low-count genes (4270 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 13141 entries to zero.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## negative positive 
##        8        2
## conditions
## negative positive 
##        8        2
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## negative positive 
##        8        2
hu_nasal_de
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
##                 pstv_vs_ng
## basic_vs_deseq      0.4983
## basic_vs_dream      0.7292
## basic_vs_ebseq      0.6543
## basic_vs_edger      0.6276
## basic_vs_limma      0.7234
## basic_vs_noiseq     0.1646
## deseq_vs_dream      0.6132
## deseq_vs_ebseq      0.6434
## deseq_vs_edger      0.7982
## deseq_vs_limma      0.5877
## deseq_vs_noiseq     0.4067
## dream_vs_ebseq      0.6895
## dream_vs_edger      0.8311
## dream_vs_limma      0.9716
## dream_vs_noiseq     0.2401
## ebseq_vs_edger      0.7483
## ebseq_vs_limma      0.6357
## ebseq_vs_noiseq     0.6407
## edger_vs_limma      0.7964
## edger_vs_noiseq     0.3815
## limma_vs_noiseq     0.2164
hu_nasal_table <- combine_de_tables(hu_nasal_de, excel = "excel/persist_table.xlsx")
## Deleting the file excel/persist_table.xlsx before writing the tables.
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.
hu_nasal_table
## A set of combined differential expression results.
##                  table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup
## 1 positive_vs_negative          29           126           4            29           9
##   limma_sigdown
## 1             0
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
## Plot describing unique/shared genes in a differential expression table.

hu_nasal_sig <- extract_significant_genes(hu_nasal_table, excel = "excel/persist_sig.xlsx")
## Deleting the file excel/persist_sig.xlsx before writing the tables.
hu_nasal_sig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
##                      limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up
## positive_vs_negative        9          0        4         29       29        126       91
##                      ebseq_down basic_up basic_down
## positive_vs_negative         77        0          0
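
The cutoffs reported above (|logFC| of at least 1, adjusted p of at most 0.05) amount to a simple two-sided filter. A minimal sketch on an invented table; the column names `logFC` and `adj_p` are placeholders and not the names used in the hpgltools output, and whether the package treats the boundaries as inclusive is an assumption.

```r
# Invented results table for six genes.
de_table <- data.frame(
  gene = paste0("gene", 1:6),
  logFC = c(2.3, -1.4, 0.6, 1.1, -3.0, -0.2),
  adj_p = c(0.001, 0.03, 0.0004, 0.20, 0.049, 0.01),
  stringsAsFactors = FALSE
)

lfc_cutoff <- 1
p_cutoff <- 0.05

# A gene must pass both the fold-change and the adjusted p-value threshold.
up <- subset(de_table, logFC >= lfc_cutoff & adj_p <= p_cutoff)
down <- subset(de_table, logFC <= -lfc_cutoff & adj_p <= p_cutoff)

up$gene
down$gene
```

gene3 has a very small adjusted p-value but is dropped for its small fold change, and gene4 has a large enough fold change but fails on the p-value.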

14 Healthy vs Scar samples

One request from our last meeting, which I forgot about until I reread my TODO notes: compare the samples marked as healthy against those marked as scar. These are two skin biopsies taken from widely separated sites on the same person.

hu_hs_de <- all_pairwise(hu_hs_hisat_mrna, filter = TRUE, force = TRUE,
                         model_svs = "svaseq", model_fstring = "~ 0 + condition")
## healthy    scar 
##      15      15
## Removing 7549 low-count genes (14022 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 20103 entries to zero.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## healthy    scar 
##      15      15
## conditions
## healthy    scar 
##      15      15
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into
## integers.
## conditions
## healthy    scar 
##      15      15
hu_hs_table <- combine_de_tables(hu_hs_de, excel = "excel/healthy_vs_scar_table.xlsx")
## Deleting the file excel/healthy_vs_scar_table.xlsx before writing the tables.
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.
hu_hs_table
## A set of combined differential expression results.
##             table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup limma_sigdown
## 1 scar_vs_healthy          70             2          85             4           5             0
## `geom_line()`: Each group consists of only one observation.
## ℹ Do you need to adjust the group aesthetic?
## Plot describing unique/shared genes in a differential expression table.

hu_hs_sig <- extract_significant_genes(hu_hs_table, excel = "excel/healthy_vs_scar_sig.xlsx")
## Deleting the file excel/healthy_vs_scar_sig.xlsx before writing the tables.
hu_hs_sig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
##                 limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up ebseq_down
## scar_vs_healthy        5          0       85          4       70          2        6          0
##                 basic_up basic_down
## scar_vs_healthy       61          5
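
For reference, the `~ 0 + condition` model string used above describes a cell-means design: one column per group, with the contrast taken as scar minus healthy. The sketch below builds that design for 15 + 15 samples with base R and fits one simulated gene with lm(); the surrogate variables that svaseq adds are left out, and the expression values are made up.

```r
condition <- factor(rep(c("healthy", "scar"), each = 15))

# Cell-means design: one indicator column per condition, no intercept.
design <- model.matrix(~ 0 + condition)
colnames(design) <- levels(condition)
colSums(design)

# One simulated gene on the log scale, slightly higher in scar.
set.seed(7)
expr <- c(rnorm(15, mean = 5), rnorm(15, mean = 6))

# With this design the coefficients are simply the two group means.
fit <- lm(expr ~ 0 + condition)
coefs <- setNames(coef(fit), levels(condition))

# The scar_vs_healthy contrast is the difference of those means.
contrast <- c(healthy = -1, scar = 1)
scar_vs_healthy <- sum(contrast * coefs[names(contrast)])
scar_vs_healthy
```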

15 Take a peek at the kraken results

hu_kraken_viral <- create_se(pre_meta[["new_meta"]], file_column = "kraken_matrix_viral",
                             handle_na = "zero")
## Reading the sample metadata.
## Checking the state of the condition column.
## Checking the state of the batch column.
## Checking the condition factor.
## The sample definitions comprises: 103 rows(samples) and 107 columns(metadata fields).
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0002/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0009/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0010/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0011/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0018/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0019/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0020/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0012/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0013/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0014/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0021/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0022/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0023/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0015/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0016/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0017/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0024/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0025/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0026/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0038/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0006/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0007/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0008/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0005/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0004/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0003/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0027/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0028/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0029/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0030/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0031/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0032/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0035/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0033/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0034/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0036/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0037/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0039/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0040/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0041/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0042/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0043/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0044/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0045/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0046/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0047/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0048/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0049/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0050/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0051/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0052/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0053/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0054/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0055/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0056/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0057/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0058/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0059/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0060/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0061/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0062/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0063/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0064/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0065/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0066/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0067/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0068/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0069/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0070/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0071/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0072/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0073/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0074/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0075/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0076/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0077/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0078/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0079/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0080/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0081/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0082/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0083/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0084/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0085/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0086/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0087/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0088/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0089/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0090/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0091/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0092/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0093/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0094/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0095/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0096/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0097/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0098/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0099/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0100/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0101/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0102/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0103/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in create_se(pre_meta[["new_meta"]], file_column = "kraken_matrix_viral", : There are
## some NAs in this data, the 'handle_nas' parameter may be required.
## Matched 487 annotations and counts.
## Saving the summarized experiment to 'se.rda'.
## The final summarized experiment has 487 rows and 107 columns.
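The repeated warnings above say that the per-sample kraken matrices do not share one identical set of rownames, and that `first_rownames != current_rownames` is comparing vectors of different lengths. The following is a minimal base-R sketch of one way to check for and reconcile that kind of mismatch: merge on the union of taxa and fill absent taxa with 0. The helper name `merge_by_union` and the toy counts are hypothetical; this is not a description of what `create_se()` or `read_counts()` do internally.

```r
# Hypothetical helper: combine named count vectors on the union of their names,
# filling taxa that a sample never reported with 0.
merge_by_union <- function(count_list) {
  all_taxa <- sort(unique(unlist(lapply(count_list, names))))
  mat <- matrix(0, nrow = length(all_taxa), ncol = length(count_list),
                dimnames = list(all_taxa, names(count_list)))
  for (sample_id in names(count_list)) {
    counts <- count_list[[sample_id]]
    mat[names(counts), sample_id] <- counts
  }
  mat
}

# Toy data (made up): two samples reporting different sets of genera.
toy <- list(
  s1 = c(Alphapapillomavirus = 3, Lymphocryptovirus = 10),
  s2 = c(Lymphocryptovirus = 4, Roseolovirus = 1, Simplexvirus = 2))
merged <- merge_by_union(toy)

# Comparing rownames of different lengths with != recycles the shorter vector
# and warns; setdiff() and identical() are safe ways to make the same check.
missing_in_s1 <- setdiff(names(toy$s2), names(toy$s1))
same_rows <- identical(names(toy$s1), names(toy$s2))
```

If the real matrices differ only because absent taxa are omitted per sample, a union merge like this would explain why NAs appear and why a zero fill is a sensible way to handle them.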
hu_kraken_viral <- set_conditions(hu_kraken_viral, fact = "sample_type") %>%
  set_batches("detectionparasiteby7sl")
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion 
##                     20                     25                     15                      4 
##       skin biopsy scar                   WBCs 
##                     16                     23
## The number of samples by batch are:
## 
## negative positive  unknown 
##       63       11       29
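## NOTE: norm = "cpm" is not recognized (see the message below); cpm is probably meant as convert = "cpm".
kraken_viral_norm <- normalize(hu_kraken_viral, filter = TRUE, norm = "cpm", transform = "log2")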
## Removing 0 low-count genes (487 remaining).
## Did not recognize the normalization, leaving the table alone.
##   Recognized normalizations include: 'qsmooth', 'sf', 'sf2', 'vsd', 'quant',
##   'tmm', 'qsmooth_median', 'upperquartile', and 'rle.'
## transform_counts: Found 45295 values equal to 0, adding 1 to the matrix.
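Because the normalization request was not recognized, the table above is log2-transformed but not scaled by library size. As a point of comparison, here is a minimal base-R sketch of what a counts-per-million conversion followed by log2(x + 1) computes. The function name `cpm_log2` and the toy matrix are hypothetical, and the pseudocount of 1 is an assumption taken from the "adding 1 to the matrix" message rather than from the package source.

```r
# Hypothetical helper: counts-per-million followed by log2(x + 1).
# Assumes every column (sample) has a non-zero total; a zero total gives NaN.
cpm_log2 <- function(counts) {
  lib_sizes <- colSums(counts)
  cpm <- sweep(counts, 2, lib_sizes, "/") * 1e6
  log2(cpm + 1)
}

# Toy matrix (made up): 3 taxa by 2 samples with different library sizes.
toy_counts <- matrix(c(0, 10, 90, 5, 5, 40), nrow = 3,
                     dimnames = list(c("taxA", "taxB", "taxC"), c("s1", "s2")))
toy_norm <- cpm_log2(toy_counts)
```

After this conversion every sample sums to one million before the log, so differences in sequencing depth no longer drive the sample-to-sample correlations.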
plot_corheat(kraken_viral_norm)
## A heatmap of pairwise sample correlations ranging from: 
## 0.745956883236431 to 0.973829842329974.

plot_disheat(kraken_viral_norm)
## A heatmap of pairwise sample distances ranging from: 
## 7.3460130601336 to 25.9738855355352.
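The two ranges reported above summarize pairwise sample comparisons. As a rough sketch of what such numbers represent, the following computes off-diagonal ranges with base R on a made-up matrix. Pearson correlation and Euclidean distance are assumptions here; the methods that `plot_corheat()` and `plot_disheat()` actually use may differ.

```r
# Toy matrix (made up): 30 features by 4 samples.
set.seed(2)
toy_mat <- matrix(rpois(30 * 4, lambda = 5), nrow = 30,
                  dimnames = list(NULL, paste0("s", 1:4)))

# Pairwise sample correlations (columns are samples).
cor_mat <- cor(toy_mat, method = "pearson")

# Pairwise sample distances; dist() works on rows, so transpose first.
dist_mat <- as.matrix(dist(t(toy_mat), method = "euclidean"))

# Report ranges over sample pairs only, excluding the diagonal.
cor_range <- range(cor_mat[upper.tri(cor_mat)])
dist_range <- range(dist_mat[upper.tri(dist_mat)])
```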

plot_pca(kraken_viral_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by nasal swab, PBMCs, skin biopsy healthy, skin biopsy non-lesion, skin biopsy scar, WBCs
## Shapes are defined by negative, positive, unknown.
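For readers who want to reproduce the PC1/PC2 coordinates outside of `plot_pca()`, here is a minimal sketch using base R `prcomp()`. Note that this swaps in `prcomp()` for the fast_svd reduction named in the output above, and the input matrix is random toy data, so only the shape of the computation carries over.

```r
# Toy matrix (made up): 40 features by 6 samples.
set.seed(1)
toy_mat <- matrix(rnorm(40 * 6), nrow = 40,
                  dimnames = list(paste0("tax", 1:40), paste0("s", 1:6)))

# prcomp() expects observations in rows, so samples must be transposed into rows.
pca <- prcomp(t(toy_mat), center = TRUE, scale. = FALSE)

# Fraction of variance carried by each component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Coordinates for a PC1 vs PC2 scatter plot, one row per sample.
pc_df <- data.frame(sample = rownames(pca$x),
                    PC1 = pca$x[, 1],
                    PC2 = pca$x[, 2])
```

Joining `pc_df` to the sample metadata by sample name would then allow coloring by sample type and shaping by 7SL detection, as in the plot described above.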

nasal_kraken_viral <- subset_se(hu_kraken_viral, subset = "condition=='nasal swab'")
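## NOTE: norm = "cpm" is not recognized (see the message below); cpm is probably meant as convert = "cpm".
nasal_norm <- normalize(nasal_kraken_viral, filter = TRUE, norm = "cpm", transform = "log2")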
## Removing 0 low-count genes (487 remaining).
## Did not recognize the normalization, leaving the table alone.
##   Recognized normalizations include: 'qsmooth', 'sf', 'sf2', 'vsd', 'quant',
##   'tmm', 'qsmooth_median', 'upperquartile', and 'rle.'
## transform_counts: Found 8804 values equal to 0, adding 1 to the matrix.
plot_corheat(nasal_norm)
## A heatmap of pairwise sample correlations ranging from: 
## 0.765126924228018 to 0.959000311538253.

hu_kraken_bacteria <- create_se(pre_meta[["new_meta"]], file_column = "kraken_matrix_bacterial",
                       handle_na = "zero")
## Reading the sample metadata.
## Checking the state of the condition column.
## Checking the state of the batch column.
## Checking the condition factor.
## The sample definitions comprises: 103 rows(samples) and 107 columns(metadata fields).
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0002/outputs/02kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0009/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0010/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0011/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0018/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0019/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0020/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0012/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0013/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0014/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0021/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0022/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0023/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0015/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0016/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0017/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0024/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0025/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0026/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0038/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0006/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0007/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0008/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0005/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0004/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0003/outputs/02kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0027/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0028/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0029/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0030/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0031/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0032/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0035/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0033/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0034/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0036/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0037/outputs/06kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0039/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0040/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0041/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0042/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0043/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0044/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0045/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0046/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0047/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0048/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0049/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0050/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0051/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0052/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0053/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0054/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0055/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0056/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0057/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0058/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0059/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0060/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0061/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0062/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0063/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0064/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0065/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0066/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0067/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0068/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0069/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0070/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0071/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0072/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0073/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0074/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0075/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0076/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0077/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0078/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0079/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0080/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0081/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0082/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0083/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0084/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0085/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0086/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0087/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0088/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0089/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0090/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0091/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0092/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0093/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0094/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0095/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0096/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0097/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0098/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0099/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0100/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0101/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0102/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter
## object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0103/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv
## has mismatched rownames.
## Warning in create_se(pre_meta[["new_meta"]], file_column = "kraken_matrix_bacterial", : There are
## some NAs in this data, the 'handle_nas' parameter may be required.
## Matched 1754 annotations and counts.
## Saving the summarized experiment to 'se.rda'.
## The final summarized experiment has 1754 rows and 107 columns.
hu_kraken_bacteria <- set_conditions(hu_kraken_bacteria, fact = "sample_type") %>%
  set_batches("detectionparasiteby7sl")
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion 
##                     20                     25                     15                      4 
##       skin biopsy scar                   WBCs 
##                     16                     23
## The number of samples by batch are:
## 
## negative positive  unknown 
##       63       11       29
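# Note: "cpm" is a conversion, not a normalization, so norm = "cpm" is ignored here (see the
# message below); convert = "cpm" is probably what was intended.
kraken_bacteria_norm <- normalize(hu_kraken_bacteria, filter = TRUE,
                                  norm = "cpm", transform = "log2")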
## Removing 0 low-count genes (1754 remaining).
## Did not recognize the normalization, leaving the table alone.
##   Recognized normalizations include: 'qsmooth', 'sf', 'sf2', 'vsd', 'quant',
##   'tmm', 'qsmooth_median', 'upperquartile', and 'rle.'
## transform_counts: Found 88803 values equal to 0, adding 1 to the matrix.
plot_corheat(kraken_bacteria_norm)
## A heatmap of pairwise sample correlations ranging from: 
## 0.604850876191328 to 0.934803391036582.

plot_disheat(kraken_bacteria_norm)
## A heatmap of pairwise sample distances ranging from: 
## 38.2027362700424 to 129.211351296268.

plot_pca(kraken_bacteria_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by nasal swab, PBMCs, skin biopsy healthy, skin biopsy non-lesion, skin biopsy scar, WBCs
## Shapes are defined by negative, positive, unknown.
## Warning in MASS::cov.trob(data[, vars], wt = weight * nrow(data)): Probable convergence failure
## Warning in MASS::cov.trob(data[, vars], wt = weight * nrow(data)): Probable convergence failure

16 Nasal as a proxy for everything else

At the beginning of this document, I created a peculiar factor from the 7SL state of each person's nasal sample and applied it to every other sample from that person; thus a person whose nasal sample was positive was deemed positive for everything. Let us see what that looks like…
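For reference, the propagation described above can be sketched in base R. This is only an illustration under stated assumptions: the toy metadata, the `person` and `detection_7sl` column names, and the sample values are all hypothetical; only `sample_type` and `nasal_7sl_status` are names that appear in this document.

```r
# Hypothetical metadata: one row per sample, several samples per person.
meta <- data.frame(
  sample_id = c("s1", "s2", "s3", "s4", "s5", "s6"),
  person = c("p1", "p1", "p1", "p2", "p2", "p2"),
  sample_type = c("nasal swab", "PBMCs", "WBCs", "nasal swab", "PBMCs", "WBCs"),
  detection_7sl = c("positive", "negative", "negative", "negative", "negative", "positive"),
  stringsAsFactors = FALSE)

# Take the 7SL call from each person's nasal swab...
nasal <- meta[meta[["sample_type"]] == "nasal swab", ]
nasal_status <- setNames(nasal[["detection_7sl"]], nasal[["person"]])

# ...and apply it to every sample from that person, whatever its own 7SL call was.
meta[["nasal_7sl_status"]] <- factor(unname(nasal_status[meta[["person"]]]),
                                     levels = c("negative", "positive"))
table(meta[["nasal_7sl_status"]])
```

In this sketch a person without a nasal swab ends up with `NA`, and a person with more than one nasal swab takes the status of the first one listed; the real sample sheet may need an explicit rule for both cases.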

nasal_7sl_se <- set_conditions(hu_se_salmon, fact = "nasal_7sl_status")
## The numbers of samples by condition are:
## 
## negative positive 
##       84       19
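# Note: 'hu_se_hisat_genes' is a typo for 'hu_se_hisat_gene' (the name used in the PBMC block
# below), hence the error here and in the hisat steps that depend on this object.
nasal_7sl_hisat_se <- set_conditions(hu_se_hisat_genes, fact = "nasal_7sl_status")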
## Error in `h()`:
## ! error in evaluating the argument 'exp' in selecting a method for function 'set_conditions': object 'hu_se_hisat_genes' not found
nasal_7sl_se_nb <- normalize(nasal_7sl_se, transform = "log2", convert = "cpm", filter = TRUE,
                             batch = "svaseq")
## Removing 3906 low-count genes (10508 remaining).
## transform_counts: Found 213491 values less than 0.
## transform_counts: Found 213491 values equal to 0, adding 1 to the matrix.
plot_pca(nasal_7sl_se_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by mRNA, RZ.

nasal_7sl_de <- all_pairwise(nasal_7sl_se, filter = TRUE,
                             model_svs = "svaseq", model_fstring = "~ 0 + condition")
## negative positive 
##       84       19
## Removing 3906 low-count genes (10508 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 680865 entries to zero.
## This received a matrix of SVs.
## Error in DESeqDataSet(se, design = design, ignoreRank) : 
##   some values in assay are not integers
## conditions
## negative positive 
##       84       19
## conditions
## negative positive 
##       84       19
## conditions
## negative positive 
##       84       19
nasal_7sl_de
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
##                 pstv_vs_ng
## basic_vs_dream      0.5235
## basic_vs_ebseq      0.4216
## basic_vs_edger      0.3924
## basic_vs_limma      0.5050
## basic_vs_noiseq     0.5955
## dream_vs_ebseq      0.3986
## dream_vs_edger      0.6788
## dream_vs_limma      0.9221
## dream_vs_noiseq     0.5671
## ebseq_vs_edger      0.6263
## ebseq_vs_limma      0.3332
## ebseq_vs_noiseq     0.6965
## edger_vs_limma      0.6006
## edger_vs_noiseq     0.6471
## limma_vs_noiseq     0.5301
nasal_7sl_hisat_de <- all_pairwise(nasal_7sl_hisat_se, filter = TRUE,
                                   model_svs = "svaseq", model_fstring = "~ 0 + condition")
## Error in `h()`:
## ! error in evaluating the argument 'object' in selecting a method for function 'pData': object 'nasal_7sl_hisat_se' not found
nasal_7sl_hisat_de
## Error:
## ! object 'nasal_7sl_hisat_de' not found
nasal_7sl_table <- combine_de_tables(nasal_7sl_de, excel = "excel/nasal_7sl_proxy_table.xlsx")
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.
nasal_7sl_table
## A set of combined differential expression results.
##                  table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup
## 1 positive_vs_negative           0             0          50            54          25
##   limma_sigdown
## 1            10
## Only  has information, cannot create an UpSet.
## Plot describing unique/shared genes in a differential expression table.
## NULL
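# Note: this reuses the excel path of the salmon table written above, so that file is deleted
# before the error occurs (see the message below); a distinct filename is probably intended.
nasal_7sl_hisat_table <- combine_de_tables(nasal_7sl_hisat_de, excel = "excel/nasal_7sl_proxy_table.xlsx")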
## Deleting the file excel/nasal_7sl_proxy_table.xlsx before writing the tables.
## Error:
## ! object 'nasal_7sl_hisat_de' not found
nasal_7sl_hisat_table
## Error:
## ! object 'nasal_7sl_hisat_table' not found

Oh, Maria Adelaida was actually interested only in the PBMC samples, so repeat the comparison on that subset.

pbmc_nasal_7sl_se <- subset_se(hu_se_salmon, subset = "condition=='PBMCs'") %>%
  set_conditions(fact = "nasal_7sl_status")
## The numbers of samples by condition are:
## 
## negative positive 
##       21        4
pbmc_nasal_7sl_hisat_se <- subset_se(hu_se_hisat_gene, subset = "condition=='PBMCs'") %>%
  set_conditions(fact = "nasal_7sl_status")
## The numbers of samples by condition are:
## 
## negative positive 
##       21        4
pbmc_nasal_hisat_norm <- normalize(pbmc_nasal_7sl_hisat_se, transform = "log2", convert = "cpm",
                                   norm = "quant", filter = TRUE)
## Removing 9124 low-count genes (12447 remaining).
## transform_counts: Found 39 values equal to 0, adding 1 to the matrix.
plot_pca(pbmc_nasal_hisat_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by mRNA.

pbmc_nasal_hisat_nb <- normalize(pbmc_nasal_7sl_hisat_se, transform = "log2", convert = "cpm",
                                 batch = "svaseq", filter = "simple")
## Removing 3931 low-count genes (17640 remaining).
## transform_counts: Found 20961 values less than 0.
## transform_counts: Found 20961 values equal to 0, adding 1 to the matrix.
plot_pca(pbmc_nasal_hisat_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by mRNA.

pbmc_nasal_7sl_de <- all_pairwise(pbmc_nasal_7sl_se, filter = "simple",
                                  model_svs = "svaseq", model_fstring = "~ 0 + condition")
## negative positive 
##       21        4
## Removing 4969 low-count genes (9445 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 130146 entries to zero.
## This received a matrix of SVs.
## Error in DESeqDataSet(se, design = design, ignoreRank) : 
##   some values in assay are not integers
## Warning in variancePartition::voomWithDreamWeights(counts = data, formula = model_fstring, : The maximum precision weight is 2.004e+12, suggesting a poor smoothing fit
## on the mean-variance plot for large expression values. Such large weights can
## have unexpected effects downstream.  Consider examining the mean-variance plot
## and reducing the span parameter.
## conditions
## negative positive 
##       21        4
## conditions
## negative positive 
##       21        4
## conditions
## negative positive 
##       21        4
pbmc_nasal_7sl_table <- combine_de_tables(pbmc_nasal_7sl_de, excel = "excel/pbmc_nasal_proxy.xlsx")
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.

17 TODO 202512

Repeat this nasal proxy test using each of the other cell types.

The factors of likely interest are: “WBCs” and “nasal swab”, and ideally both “skin biopsy healthy” and “skin biopsy scar”, but perhaps only a combined “skin biopsy”.
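Before repeating the contrast per sample type, it may help to check that each subset contains both nasal proxy levels. The following base R sketch does that on a toy table; the rows are hypothetical, and the `group` column used to merge the skin biopsy subtypes is an assumption for illustration rather than a column in the sample sheet.

```r
# Hypothetical metadata using the sample_type levels seen in this document.
meta <- data.frame(
  sample_type = c("WBCs", "WBCs", "nasal swab", "nasal swab",
                  "skin biopsy healthy", "skin biopsy scar", "skin biopsy scar"),
  nasal_7sl_status = c("negative", "positive", "negative", "positive",
                       "negative", "negative", "positive"),
  stringsAsFactors = FALSE)

# Optionally merge the skin biopsy subtypes into a single "skin biopsy" group.
meta[["group"]] <- sub("^skin biopsy.*$", "skin biopsy", meta[["sample_type"]])

# Tabulate the nasal proxy status within each group of interest.
wanted <- c("WBCs", "nasal swab", "skin biopsy")
counts_by_type <- lapply(wanted, function(type) {
  keep <- meta[["group"]] == type
  table(factor(meta[keep, "nasal_7sl_status"], levels = c("negative", "positive")))
})
names(counts_by_type) <- wanted

# A positive/negative contrast is only possible when both levels are present.
usable <- vapply(counts_by_type, function(x) all(x > 0), logical(1))
usable
```

Groups flagged `FALSE` in `usable` would have no positive or no negative samples and could be skipped before calling the pairwise comparison.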

wbc_nasal_7sl_se <- subset_se(hu_se_salmon, subset = "condition=='WBCs'") %>%
  set_conditions(fact = "nasal_7sl_status")
## The numbers of samples by condition are:
## 
## negative positive 
##       19        4
wbc_nasal_7sl_hisat_se <- subset_se(hu_se_hisat_gene, subset = "condition=='WBCs'") %>%
  set_conditions(fact = "nasal_7sl_status")
## The numbers of samples by condition are:
## 
## negative positive 
##       19        4
wbc_nasal_hisat_norm <- normalize(wbc_nasal_7sl_hisat_se, transform = "log2", convert = "cpm",
                                  filter = "simple", norm = "quant")
## Removing 4161 low-count genes (17410 remaining).
## transform_counts: Found 52115 values equal to 0, adding 1 to the matrix.
plot_pca(wbc_nasal_hisat_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by mRNA.

wbc_nasal_hisat_nb <- normalize(wbc_nasal_7sl_hisat_se, transform = "log2", convert = "cpm",
                                filter = "simple", batch = "svaseq")
## Removing 4161 low-count genes (17410 remaining).
## transform_counts: Found 19565 values less than 0.
## transform_counts: Found 19565 values equal to 0, adding 1 to the matrix.
plot_pca(wbc_nasal_hisat_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by mRNA.

wbc_nasal_7sl_de <- all_pairwise(wbc_nasal_7sl_se, filter = "simple",
                                 model_svs = "svaseq", model_fstring = "~ 0 + condition")
## negative positive 
##       19        4
## Removing 5290 low-count genes (9124 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 114757 entries to zero.
## This received a matrix of SVs.
## Error in DESeqDataSet(se, design = design, ignoreRank) : 
##   some values in assay are not integers
## conditions
## negative positive 
##       19        4
## conditions
## negative positive 
##       19        4
## conditions
## negative positive 
##       19        4
wbc_nasal_7sl_table <- combine_de_tables(wbc_nasal_7sl_de, excel = "excel/wbc_nasal_proxy.xlsx")
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.

18 Upload count tables for Mariana

ready <- tar_meta_column(hu_se_hisat_gene, column = "hisat_count_table_hg38_115")
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0001/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0002/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0009/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0010/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0011/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0018/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0019/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0020/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0012/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0013/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0014/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0021/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0022/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0023/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0015/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0016/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0017/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0024/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0025/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0026/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0038/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0006/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0007/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0008/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0005/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0004/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0003/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0027/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0028/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0029/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0030/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0031/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0032/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0035/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0033/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0034/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0036/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0037/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0039/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0040/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0041/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0042/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0043/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0044/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0045/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0046/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0047/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0048/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0049/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0050/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0051/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0052/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0053/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## Warning in utils::tar(output, files = include_list, compression = compression): storing paths of more than 100 bytes is not portable:
##   'preprocessing/PRHU0054/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz'
## (The same warning repeats for the matching hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz path of every sample from PRHU0055 through PRHU0103.)
## Warning: invalid uid value replaced by that for user 'nobody'
ready <- tar_meta_column(hu_se_hisat_gene, column = "salmon_count_table_hg38_115")
## Warning: invalid uid value replaced by that for user 'nobody'
