1 TODO

1.1 202511

  1. We have some samples which are defined as persistent via 7SL. Can we see some reads in those samples?
  2. Send file with mapped reads per sample for all human samples.
  3. Queue SL read counter for all human samples.
  4. All patients have nasal, skin, and PBMC samples, and every patient is either nasal+ or nasal-. Perform nasal+/nasal- contrasts of the PBMC samples by recasting each PBMC sample as +/- according to its patient's nasal status.
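
The recasting in item 4 could be sketched roughly as follows. This is a toy example, not the real sheet: the column names are taken from the sample sheet, but the participants and values are invented, and it assumes that the nasal call comes from the 7SL column and that each participant has at most one nasal swab.

```r
## Toy metadata; column names mirror the sample sheet, the values are invented.
toy_meta <- data.frame(
  participant_code = c("P1", "P1", "P2", "P2", "P3"),
  sample_type = c("nasal swab", "PBMCs", "nasal swab", "PBMCs", "PBMCs"),
  detectionparasiteby7sl = c("positive", NA, "negative", NA, NA),
  stringsAsFactors = FALSE)

## One nasal call per participant, taken from that participant's nasal swab.
## (Assumption: the 7SL column is what defines nasal +/-.)
nasal_rows <- toy_meta[["sample_type"]] == "nasal swab"
nasal_status <- setNames(toy_meta[nasal_rows, "detectionparasiteby7sl"],
                         toy_meta[nasal_rows, "participant_code"])

## Copy the participant-level call onto every sample of that participant.
toy_meta[["nasal_status"]] <- unname(nasal_status[toy_meta[["participant_code"]]])
## Participants without a nasal call are labelled rather than dropped.
missing_call <- is.na(toy_meta[["nasal_status"]])
toy_meta[missing_call, "nasal_status"] <- "undetermined"

## Keep only the PBMCs and use the recast status as the condition.
pbmc_meta <- toy_meta[toy_meta[["sample_type"]] == "PBMCs", ]
pbmc_meta[["condition"]] <- factor(pbmc_meta[["nasal_status"]],
                                   levels = c("negative", "positive", "undetermined"))
table(pbmc_meta[["condition"]])
```

Keeping an explicit "undetermined" level makes it easy to see how many PBMC samples would fall out of a strict positive/negative contrast.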

1.2 202601

  1. 7SL positive vs negative for
  2. Add a plot of transcriptome clustering of nasal swabs between nasal positive/negative, with the caveat that there are only 2 positives, 8 negatives, and 1 undetermined in the stranded library; the ribozero libraries have 5 positive, 3 negative, and 1 undetermined.
  3. Compute the relative abundance of bacteria/viruses observed in the nasal samples, counting at the genus level.

Note that this requires splitting the data into 4 groups: salmon+rz, salmon+mrna, hisat+rz, hisat+mrna.
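
A minimal sketch of that four-way split, on invented sample IDs. The "ribozero" label for the rRNA-depleted libraries is an assumption (only "mRNA" has been checked against the sheet), and the grouping simply pairs each quantifier with each library type.

```r
## Toy metadata; "ribozero" is an assumed label for the rRNA-depleted libraries.
toy_meta <- data.frame(
  sampleid = paste0("S", 1:4),
  library_type = c("mRNA", "mRNA", "ribozero", "ribozero"),
  stringsAsFactors = FALSE)

## Every quantifier/library pairing becomes its own group.
groups <- expand.grid(quantifier = c("salmon", "hisat"),
                      library = c("ribozero", "mRNA"),
                      stringsAsFactors = FALSE)

## The samples in a group depend only on the library type; the quantifier
## decides which count table would be read for them later.
group_samples <- lapply(seq_len(nrow(groups)), function(i) {
  toy_meta[toy_meta[["library_type"]] == groups[i, "library"], "sampleid"]
})
names(group_samples) <- paste(groups[["quantifier"]], groups[["library"]], sep = "+")
str(group_samples)
```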

2 Changelog

2.1 202511

Following my conversation with Maria Adelaida, I downloaded a new copy of our online sample sheet and made a sub-copy containing only the human samples. It is named (creatively) sample_sheets/human_samples_202511.xlsx.

3 Introduction

I want to use this document to examine our first round of persistence samples. I checked my email from Najib and did not find a sample sheet but did find an explanation of the three sample types we expect.

In preparation for this, I downloaded a new hg38 genome. Since the panamensis assembly has not changed significantly (except for the putative long-read genome, which I have not yet seen), I am just using the same one.

4 Loading annotation

The hg38 genome I got is brand new (202405), so do not use the archive for a while.

## Ok, so useast.ensembl is failing today, let us use the jan2024 archive?
#hs_annot <- load_biomart_annotations(archive = FALSE, species = "hsapiens")
## Seems like the 202401 archive is a good choice, it is explicitly the hg38_111 release.
## and it is waaaaay faster (like 100x) than useast right now.
hs_annot <- load_biomart_annotations(archive = FALSE, species = "hsapiens", overwrite = TRUE,
                                     year = 2025, month = "08")
## Using mart: ENSEMBL_MART_ENSEMBL from host: useast.ensembl.org.
## Successfully connected to the hsapiens_gene_ensembl database.
## Finished downloading ensembl gene annotations.
## Finished downloading ensembl structure annotations.
## symbol columns is null, pattern matching 'symbol' and taking the first.
## Found 2 potential symbol columns, using the first:hgnc_symbol.
## Including symbols, there are 86371 vs the 533740 gene annotations.
## Not dropping haplotype chromosome annotations, set drop_haplotypes = TRUE if this is bad.
## Saving annotations to hsapiens_biomart_annotations.rda.
## Finished save().
panamensis_orgdb_idx <- grep(pattern = "^org.+panamen.+MHOM.+db$", x = rownames(installed.packages()))
panamensis_orgdb <- tail(rownames(installed.packages())[panamensis_orgdb_idx], n = 1)
lp_annot <- load_orgdb_annotations(panamensis_orgdb, keytype = "gid")
## Loading required package: AnnotationDbi
## Loading required package: stats4
## Loading required package: BiocGenerics
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:hpgltools':
## 
##     annotation<-, conditions, conditions<-
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     anyDuplicated, aperm, append, as.data.frame, basename, cbind, colnames, dirname, do.call, duplicated, eval, evalq, Filter,
##     Find, get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
##     pmin.int, Position, rank, rbind, Reduce, rownames, sapply, saveRDS, setdiff, table, tapply, union, unique, unsplit,
##     which.max, which.min
## Loading required package: Biobase
## Welcome to Bioconductor
## 
##     Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see 'citation("Biobase")',
##     and for packages 'citation("pkgname")'.
## 
## Attaching package: 'Biobase'
## The following object is masked from 'package:hpgltools':
## 
##     notes
## Loading required package: IRanges
## Loading required package: S4Vectors
## 
## Attaching package: 'S4Vectors'
## The following object is masked from 'package:utils':
## 
##     findMatches
## The following objects are masked from 'package:base':
## 
##     expand.grid, I, unname
## 
## Attaching package: 'IRanges'
## The following object is masked from 'package:hpgltools':
## 
##     trim
## 
## Unable to find CDSNAME, setting it to ANNOT_EXTERNAL_DB_NAME.
## Unable to find CDSCHROM in the db, removing it.
## Unable to find CDSSTRAND in the db, removing it.
## Unable to find CDSSTART in the db, removing it.
## Unable to find CDSEND in the db, removing it.
## Extracted all gene ids.
## Attempting to select: ANNOT_EXTERNAL_DB_NAME, GENE_TYPE
## 'select()' returned 1:1 mapping between keys and columns
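
The orgdb lookup above keeps the last installed package whose name matches the pattern, in whatever order installed.packages() lists them. A small sketch of that selection on made-up package names (the version suffixes here are hypothetical):

```r
## Hypothetical package names standing in for rownames(installed.packages()).
pkgs <- c("AnnotationDbi",
          "org.Lpanamensis.MHOMCOL81L13.v46.eg.db",
          "org.Lpanamensis.MHOMCOL81L13.v68.eg.db",
          "org.Hs.eg.db")

## Same pattern as above: an org package for panamensis MHOM ending in db.
idx <- grep(pattern = "^org.+panamen.+MHOM.+db$", x = pkgs)

## tail(n = 1) keeps the last match; with no match it returns character(0).
chosen <- tail(pkgs[idx], n = 1)
chosen
```

If nothing matches, the result is an empty character vector, so checking its length before loading the package would catch a missing install early.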

This is a little silly, but I am going to reload the annotations using the previous invocation to extract the annotation table without having to think. The previous block loads the orgdb for me, so I can just use that to get the fun annotations.

lp_annot <- load_orgdb_annotations(panamensis_orgdb, keytype = "gid", fields = "^annot")
## Unable to find CDSNAME, setting it to ANNOT_EXTERNAL_DB_NAME.
## Unable to find CDSCHROM in the db, removing it.
## Unable to find CDSSTRAND in the db, removing it.
## Unable to find CDSSTART in the db, removing it.
## Unable to find CDSEND in the db, removing it.
## Extracted all gene ids.
## Attempting to select: ANNOT_EXTERNAL_DB_NAME, GENE_TYPE, ANNOT_AA_SEQUENCE_ID, ANNOT_ANNOTATED_GO_COMPONENT, ANNOT_ANNOTATED_GO_FUNCTION, ANNOT_ANNOTATED_GO_ID_COMPONENT, ANNOT_ANNOTATED_GO_ID_FUNCTION, ANNOT_ANNOTATED_GO_ID_PROCESS, ANNOT_ANNOTATED_GO_PROCESS, ANNOT_ANTICODON, ANNOT_APOLLO_LINK_OUT, ANNOT_APOLLO_TRANSCRIPT_DESCRIPTION, ANNOT_CDS, ANNOT_CDS_LENGTH, ANNOT_CHROMOSOME, ANNOT_CODING_END, ANNOT_CODING_START, ANNOT_EC_NUMBERS, ANNOT_EC_NUMBERS_DERIVED, ANNOT_END_MAX, ANNOT_EXON_COUNT, ANNOT_EXTERNAL_DB_NAME, ANNOT_EXTERNAL_DB_VERSION, ANNOT_FIVE_PRIME_UTR_LENGTH, ANNOT_GENE_CONTEXT_END, ANNOT_GENE_CONTEXT_START, ANNOT_GENE_END_MAX, ANNOT_GENE_END_MAX_TEXT, ANNOT_GENE_ENTREZ_ID, ANNOT_GENE_ENTREZ_LINK, ANNOT_GENE_EXON_COUNT, ANNOT_GENE_HTS_NONCODING_SNPS, ANNOT_GENE_HTS_NONSYN_SYN_RATIO, ANNOT_GENE_HTS_NONSYNONYMOUS_SNPS, ANNOT_GENE_HTS_STOP_CODON_SNPS, ANNOT_GENE_HTS_SYNONYMOUS_SNPS, ANNOT_GENE_LOCATION_TEXT, ANNOT_GENE_NAME, ANNOT_GENE_ORTHOLOG_NUMBER, ANNOT_GENE_ORTHOMCL_NAME, ANNOT_GENE_PARALOG_NUMBER, ANNOT_GENE_PREVIOUS_IDS, ANNOT_GENE_PRODUCT, ANNOT_GENE_START_MIN, ANNOT_GENE_START_MIN_TEXT, ANNOT_GENE_TOTAL_HTS_SNPS, ANNOT_GENE_TRANSCRIPT_COUNT, ANNOT_GENE_TYPE, ANNOT_GENOMIC_SEQUENCE_LENGTH, ANNOT_GENUS_SPECIES, ANNOT_HAS_MISSING_TRANSCRIPTS, ANNOT_INTERPRO_DESCRIPTION, ANNOT_INTERPRO_ID, ANNOT_IS_DEPRECATED, ANNOT_IS_PSEUDO, ANNOT_ISOELECTRIC_POINT, ANNOT_LOCATION_TEXT, ANNOT_MAP_LOCATION, ANNOT_MCMC_LOCATION, ANNOT_MOLECULAR_WEIGHT, ANNOT_NCBI_TAX_ID, ANNOT_ORTHOMCL_LINK, ANNOT_OVERVIEW, ANNOT_PFAM_DESCRIPTION, ANNOT_PFAM_ID, ANNOT_PIRSF_DESCRIPTION, ANNOT_PIRSF_ID, ANNOT_PREDICTED_GO_COMPONENT, ANNOT_PREDICTED_GO_FUNCTION, ANNOT_PREDICTED_GO_ID_COMPONENT, ANNOT_PREDICTED_GO_ID_FUNCTION, ANNOT_PREDICTED_GO_ID_PROCESS, ANNOT_PREDICTED_GO_PROCESS, ANNOT_PRIMARY_KEY, ANNOT_PROB_MAP, ANNOT_PROB_MCMC, ANNOT_PROSITEPROFILES_DESCRIPTION, ANNOT_PROSITEPROFILES_ID, ANNOT_PROTEIN_LENGTH, ANNOT_PROTEIN_SEQUENCE, ANNOT_PROTEIN_SOURCE_ID, 
ANNOT_PSEUDO_STRING, ANNOT_SEQUENCE_DATABASE_NAME, ANNOT_SEQUENCE_ID, ANNOT_SIGNALP_PEPTIDE, ANNOT_SMART_DESCRIPTION, ANNOT_SMART_ID, ANNOT_SNPOVERVIEW, ANNOT_SO_ID, ANNOT_SO_TERM_DEFINITION, ANNOT_SO_TERM_NAME, ANNOT_SO_VERSION, ANNOT_START_MIN, ANNOT_STRAND, ANNOT_STRAND_PLUS_MINUS, ANNOT_SUPERFAMILY_DESCRIPTION, ANNOT_SUPERFAMILY_ID, ANNOT_THREE_PRIME_UTR_LENGTH, ANNOT_TIGRFAM_DESCRIPTION, ANNOT_TIGRFAM_ID, ANNOT_TM_COUNT, ANNOT_TRANS_FOUND_PER_GENE_INTERNAL, ANNOT_TRANSCRIPT_INDEX_PER_GENE, ANNOT_TRANSCRIPT_LENGTH, ANNOT_TRANSCRIPT_LINK, ANNOT_TRANSCRIPT_PRODUCT, ANNOT_TRANSCRIPT_SEQUENCE, ANNOT_TRANSCRIPTS_FOUND_PER_GENE, ANNOT_UNIPROT_IDS, ANNOT_UNIPROT_LINKS
## 'select()' returned 1:1 mapping between keys and columns

5 Collect preprocessed metadata

Use my new sample sheet here.

current_samplesheet <- "sample_sheets/human_samples_202511.xlsx"
first_spec <- make_rnaseq_spec()
input <- read_metadata(current_samplesheet)
colnames(input)
##  [1] "seq"                                         "hpgl_identifier"                             "aim"                                        
##  [4] "participant_code"                            "sample_type"                                 "tube_label_origin"                          
##  [7] "number_of_vials"                             "sample_collection_date"                      "exp_person"                                 
## [10] "clinical_presentation"                       "parasite"                                    "prior_host_hpgl_code"                       
## [13] "prior_parasite_hpgl_code"                    "initial_recurrent"                           "drug_susceptibility_perc_reduction_gluc"    
## [16] "drug_susceptibility_perc_reduction_mlf"      "zymodeme_by_electrophoresis"                 "zymodeme_by_pca"                            
## [19] "rna_extraction_date"                         "rna_volume_ul"                               "rna_qc_date"                                
## [22] "rna_nanodrop_ng_ul"                          "X260_280_ratio"                              "X260_230_ratio"                             
## [25] "rna_bioanalyzer_or_qubit_ng_ul"              "rin"                                         "rna_qc_passed"                              
## [28] "library_type"                                "library_version"                             "library_name"                               
## [31] "library_const_date"                          "rna_used_to_construct_libraries_ul"          "rna_used_to_construct_libraries_ng"         
## [34] "library_qc_date"                             "lib_qc_passed"                               "library_volume_ul"                          
## [37] "unique_dual_index_set"                       "unique_dual_index_plate_coordinate"          "unique_dual_index_id"                       
## [40] "concentrations_determined_by_1"              "primer_conc_ng_ul_30100bp_region_1"          "adapter_dimer_conc_ng_ul_100200bp_region_1" 
## [43] "library_conc_ng_ul_2001000bp_region_1"       "library_molarity_nm_2001000bp_region_1"      "library_ave_frag_size_bp_2001000bp_region_1"
## [46] "calculated_adapter_dimer_percent_1"          "concentrations_determined_by_2"              "primer_conc_ng_ul_30100bp_region_2"         
## [49] "adapter_dimer_conc_ng_ul_100200bp_region_2"  "library_conc_ng_ul_2001000bp_region_2"       "library_molarity_nm_2001000bp_region_2"     
## [52] "library_ave_frag_size_bp_2001000bp_region_2" "calculated_adapter_dimer_percent_2"          "library_volume_ul_sent_to_umd"              
## [55] "shipment_date"                               "bbiagtc_date_received"                       "bbiagtc_library_cleanup"                    
## [58] "bbiagtc_date_sequenced"                      "bbiagtc_sequence_batch"                      "bbiagtc_pe_reads_pf"                        
## [61] "bbiagtc_fastp_duplication_rate"              "hisat2inputreads"                            "lpsingleconaligned"                         
## [64] "hg38singleconaligned"                        "lpmulticonaligned"                           "hg38multiconaligned"                        
## [67] "lpsingleallaligned"                          "hg38singleallaligned"                        "lpmultiallaligned"                          
## [70] "hg38multiallaligned"                         "lppercentmappedfromlog"                      "hg38percentmappedfromlog"                   
## [73] "lphisatobservedcds"                          "hg38hisatobservedcds"                        "krakenmatrix"                               
## [76] "lphisatgenematrix"                           "hg38hisatgenematrix"                         "lpsalmontranscriptmatrix"                   
## [79] "hg38salmontranscriptmatrix"                  "detectionparasiteby7sl"                      "detectionparasiteby18s"                     
## [82] "detectionparasitebykdnasec"                  "metabolomicnasalswaborplasma"                "immunophenotypingnasalswaborpbmcs"
pre_meta <- gather_preprocessing_metadata(
  starting_metadata = current_samplesheet, id_column = "hpgl_identifier",
  specification = first_spec, new_metadata = "persistence_hu_modified.xlsx",
  basedir = "preprocessing", species = c("hg38_115", "lpanamensis_mhomcol_v68"))
## Dropped 1 rows from the sample metadata because the sample ID is blank.
## Did not find the condition column in the sample sheet.
## Filling it in as undefined.
## Did not find the batch column in the sample sheet.
## Filling it in as undefined.
## Checking the state of the condition column.
## Checking the state of the batch column.
## Checking the condition factor.
## Warning in dispatch_regex_search(meta, search, replace, input_file_spec, : NAs introduced by coercion
## Warning in dispatch_regex_search(meta, search, replace, input_file_spec, : NAs introduced by coercion
## Writing new metadata to: persistence_hu_modified.xlsx
## Deleting the file persistence_hu_modified.xlsx before writing the tables.
modified_meta <- pre_meta[["new_meta"]]
## Added the following line to gather_preprocessing_metadata()
rownames(modified_meta) <- make.names(modified_meta[["hpgl_identifier"]], unique = TRUE)

## FIXME: 202511: I broke something in some of these functions and it is pulling
## the wrong information for number of observed genes.
head(modified_meta)
##          seq hpgl_identifier  aim participant_code sample_type tube_label_origin number_of_vials sample_collection_date exp_person
## PRHU0001   1        PRHU0001 aim1           PP1006       PBMCs      PP1006 PBMCs               1             2024-01-29         LG
## PRHU0002   2        PRHU0002 aim1           PP2001  nasal swab         PP2001 HN               1             2024-02-05         LG
## PRHU0009   4        PRHU0009 aim1           PP2003       PBMCs      PP2003 PBMCs               1                   <NA>         LG
## PRHU0010   5        PRHU0010 aim1           PP2004       PBMCs      PP2004 PBMCs               1             2024-04-05         LG
## PRHU0011   6        PRHU0011 aim1           PP2005       PBMCs      PP2005 PBMCs               1             2024-04-12         LG
## PRHU0018   7        PRHU0018 aim1           PP2003        WBCs       PP2003 WBCs               1             2024-03-22         LG
##          clinical_presentation            prior_host_hpgl_code rna_extraction_date rna_volume_ul rna_qc_date rna_nanodrop_ng_ul X260_280_ratio
## PRHU0001                    HD                            <NA>          2024-01-31            25  2024-02-15             338.43           2.04
## PRHU0002                    HD                            <NA>          2024-02-14            25  2024-02-15              84.82           2.08
## PRHU0009                  H-CL TMRC30130; TMRC30124; TMRC30131          2024-05-06            25  2024-05-06              115.1           1.99
## PRHU0010                  H-CL                            <NA>          2024-05-06            25  2024-05-06             185.28           2.03
## PRHU0011                  H-CL                            <NA>          2024-05-06            25  2024-05-06              70.12           1.95
## PRHU0018                  H-CL TMRC30130; TMRC30124; TMRC30131          2024-03-22            30  2024-05-02              49.55           2.06
##          X260_230_ratio rna_bioanalyzer_or_qubit_ng_ul rin rna_qc_passed library_type library_version             library_name
## PRHU0001           1.73                            407  NA           yes         mRNA               1      PP1006.PBMCs.mRNA.1
## PRHU0002           1.97                            101  NA           yes         mRNA               1 PP2001.nasal swab.mRNA.1
## PRHU0009           1.34                            167 8.0           yes         mRNA               1      PP2003.PBMCs.mRNA.1
## PRHU0010           1.91                            264 8.5           yes         mRNA               1      PP2004.PBMCs.mRNA.1
## PRHU0011           0.85                            113 8.7           yes         mRNA               1      PP2005.PBMCs.mRNA.1
## PRHU0018           1.56                             70 9.1           yes         mRNA               1       PP2003.WBCs.mRNA.1
##          library_const_date rna_used_to_construct_libraries_ul rna_used_to_construct_libraries_ng library_qc_date lib_qc_passed
## PRHU0001         2024-02-09                               1.48                                500      2024-02-12           yes
## PRHU0002         2024-02-15                               3.54                                300      2024-02-16           yes
## PRHU0009         2024-06-05                               1.79                                300      2024-06-11           yes
## PRHU0010         2024-06-05                               1.14                                300      2024-06-11           yes
## PRHU0011         2024-06-05                               2.65                                300      2024-06-11           yes
## PRHU0018         2024-06-05                               4.29                                300      2024-06-11           yes
##          library_volume_ul                 unique_dual_index_set unique_dual_index_plate_coordinate unique_dual_index_id
## PRHU0001                15 IDT for Illumina RNA UD Indexes Set A                                C04              UDP0027
## PRHU0002                15 IDT for Illumina RNA UD Indexes Set A                                F04              UDP0030
## PRHU0009                15 IDT for Illumina RNA UD Indexes Set A                                B05              UDP0034
## PRHU0010                15 IDT for Illumina RNA UD Indexes Set A                                C05              UDP0035
## PRHU0011                15 IDT for Illumina RNA UD Indexes Set A                                D05              UDP0036
## PRHU0018                15 IDT for Illumina RNA UD Indexes Set A                                E05              UDP0037
##          concentrations_determined_by_1 primer_conc_ng_ul_30100bp_region_1 adapter_dimer_conc_ng_ul_100200bp_region_1
## PRHU0001                    TapeStation                                  -                                          -
## PRHU0002                    TapeStation                                  -                                          -
## PRHU0009                    Bioanalyzer                               0.49                                       0.27
## PRHU0010                    Bioanalyzer                               0.58                                       0.35
## PRHU0011                    Bioanalyzer                                0.7                                       0.27
## PRHU0018                    Bioanalyzer                               0.58                                       0.12
##          library_conc_ng_ul_2001000bp_region_1 library_molarity_nm_2001000bp_region_1 library_ave_frag_size_bp_2001000bp_region_1
## PRHU0001                                 69.50                                  321.0                                         353
## PRHU0002                                 17.50                                   82.3                                         345
## PRHU0009                                 60.64                                  293.3                                         336
## PRHU0010                                 64.84                                  314.0                                         340
## PRHU0011                                 56.86                                  279.9                                         330
## PRHU0018                                 85.33                                  421.9                                         328
##          calculated_adapter_dimer_percent_1 concentrations_determined_by_2 primer_conc_ng_ul_30100bp_region_2
## PRHU0001                                  0                           <NA>                               <NA>
## PRHU0002                                  0                           <NA>                               <NA>
## PRHU0009                0.00445250659630607                           <NA>                               <NA>
## PRHU0010                 0.0053979025293029                           <NA>                               <NA>
## PRHU0011                0.00474850510024622                           <NA>                               <NA>
## PRHU0018                0.00140630493378648                           <NA>                               <NA>
##          adapter_dimer_conc_ng_ul_100200bp_region_2 library_conc_ng_ul_2001000bp_region_2 library_molarity_nm_2001000bp_region_2
## PRHU0001                                       <NA>                                    NA                                     NA
## PRHU0002                                       <NA>                                    NA                                     NA
## PRHU0009                                       <NA>                                    NA                                     NA
## PRHU0010                                       <NA>                                    NA                                     NA
## PRHU0011                                       <NA>                                    NA                                     NA
## PRHU0018                                       <NA>                                    NA                                     NA
##          library_ave_frag_size_bp_2001000bp_region_2 calculated_adapter_dimer_percent_2 library_volume_ul_sent_to_umd shipment_date
## PRHU0001                                          NA                                 NA                            14    2024-02-20
## PRHU0002                                          NA                                 NA                            14    2024-02-20
## PRHU0009                                          NA                                 NA                            14    2024-08-26
## PRHU0010                                          NA                                 NA                            14    2024-08-26
## PRHU0011                                          NA                                 NA                            14    2024-08-26
## PRHU0018                                          NA                                 NA                            14    2024-08-26
##          bbiagtc_date_received bbiagtc_library_cleanup bbiagtc_date_sequenced bbiagtc_sequence_batch bbiagtc_pe_reads_pf
## PRHU0001            2024-04-05              not needed             2024-05-03                PERS001            13167287
## PRHU0002            2024-04-05              not needed             2024-05-03                PERS001             9815642
## PRHU0009            2024-11-05              not needed             2024-11-08                PERS002            23614271
## PRHU0010            2024-11-05                     yes             2024-11-08                PERS002            22777166
## PRHU0011            2024-11-05              not needed             2024-11-08                PERS002            23458700
## PRHU0018            2024-11-05              not needed             2024-11-08                PERS002            20907177
##          bbiagtc_fastp_duplication_rate detectionparasiteby7sl detectionparasiteby18s detectionparasitebykdnasec metabolomicnasalswaborplasma
## PRHU0001                           13.9                   <NA>                   <NA>                       <NA>                         <NA>
## PRHU0002                           18.4                   <NA>                   <NA>                       <NA>                         <NA>
## PRHU0009                           19.9                   <NA>                   <NA>                       <NA>                          yes
## PRHU0010                           16.9                   <NA>                   <NA>                       <NA>                          yes
## PRHU0011                           20.4                   <NA>                   <NA>                       <NA>                          yes
## PRHU0018                           16.6               positive               positive                       <NA>                          yes
##          immunophenotypingnasalswaborpbmcs condition     batch sampleid trimomatic_input trimomatic_output trimomatic_percent fastqc_pct_gc
## PRHU0001                              <NA> undefined undefined PRHU0001         13167287          12085247              0.918            49
## PRHU0002                              <NA> undefined undefined PRHU0002               NA                NA                 NA            50
## PRHU0009                              <NA> undefined undefined PRHU0009         23614271          22272928              0.943            51
## PRHU0010                              <NA> undefined undefined PRHU0010         22777166          21621102              0.949            51
## PRHU0011                              <NA> undefined undefined PRHU0011         23458700          22252717              0.949            51
## PRHU0018                              <NA> undefined undefined PRHU0018         20907177          19815284              0.948            52
##          kraken_bacterial_classified kraken_bacterial_unclassified kraken_first_bacterial_species kraken_first_bacterial_species_reads
## PRHU0001                      418110                      11667137          Staphylococcus aureus                               117628
## PRHU0002                      324998                       8681618         Bacillus thuringiensis                                48125
## PRHU0009                       86840                        866454        Porphyrobacter sp. GA68                                11937
## PRHU0010                       76438                       1056370          Klebsiella pneumoniae                                 7070
## PRHU0011                       70332                        803596          Klebsiella pneumoniae                                 6002
## PRHU0018                       46603                        619808            Priestia megaterium                                 7831
##          kraken_viral_classified kraken_viral_unclassified kraken_first_viral_species kraken_first_viral_species_reads
## PRHU0001                   58135                  12027112      Proteus virus Isfahan                            35851
## PRHU0002                   43789                   8962827      Proteus virus Isfahan                            28527
## PRHU0009                  197113                  22075815      Proteus virus Isfahan                           146195
## PRHU0010                  132560                  21488542      Proteus virus Isfahan                            90884
## PRHU0011                  204047                  22048670      Proteus virus Isfahan                           160954
## PRHU0018                  108172                  19707112      Proteus virus Isfahan                            74208
##                                                                   kraken_matrix_viral
## PRHU0001 preprocessing/PRHU0001/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## PRHU0002 preprocessing/PRHU0002/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## PRHU0009 preprocessing/PRHU0009/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## PRHU0010 preprocessing/PRHU0010/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## PRHU0011 preprocessing/PRHU0011/outputs/20250918kraken_viral/kraken_report_matrix.tsv
## PRHU0018 preprocessing/PRHU0018/outputs/20250918kraken_viral/kraken_report_matrix.tsv
##                                                            kraken_matrix_bacterial hisat_rrna_input_reads_hg38_115
## PRHU0001 preprocessing/PRHU0001/outputs/02kraken_bacteria/kraken_report_matrix.tsv                        12085247
## PRHU0002 preprocessing/PRHU0002/outputs/02kraken_bacteria/kraken_report_matrix.tsv                         9006616
## PRHU0009 preprocessing/PRHU0009/outputs/06kraken_bacteria/kraken_report_matrix.tsv                        22272928
## PRHU0010 preprocessing/PRHU0010/outputs/06kraken_bacteria/kraken_report_matrix.tsv                        21621102
## PRHU0011 preprocessing/PRHU0011/outputs/06kraken_bacteria/kraken_report_matrix.tsv                        22252717
## PRHU0018 preprocessing/PRHU0018/outputs/06kraken_bacteria/kraken_report_matrix.tsv                        19815284
##          hisat_rrna_input_reads_lpanamensis_mhomcol_v68 hisat_rrna_single_concordant_hg38_115
## PRHU0001                                             NA                                 73426
## PRHU0002                                             NA                                 80682
## PRHU0009                                             NA                                408744
## PRHU0010                                             NA                                149371
## PRHU0011                                             NA                                200482
## PRHU0018                                             NA                                117268
##          hisat_rrna_single_concordant_lpanamensis_mhomcol_v68 hisat_rrna_multi_concordant_hg38_115
## PRHU0001                                                   NA                                    0
## PRHU0002                                                   NA                                    3
## PRHU0009                                                   NA                                   40
## PRHU0010                                                   NA                                    6
## PRHU0011                                                   NA                                    1
## PRHU0018                                                   NA                                    3
##          hisat_rrna_multi_concordant_lpanamensis_mhomcol_v68 hisat_rrna_percent_log_hg38_115 hisat_rrna_percent_log_lpanamensis_mhomcol_v68
## PRHU0001                                                  NA                            0.63                                             NA
## PRHU0002                                                  NA                            0.94                                             NA
## PRHU0009                                                  NA                            1.93                                             NA
## PRHU0010                                                  NA                            0.73                                             NA
## PRHU0011                                                  NA                            0.94                                             NA
## PRHU0018                                                  NA                            0.62                                             NA
##          hisat_genome_input_reads_hg38_115 hisat_genome_input_reads_lpanamensis_mhomcol_v68 hisat_genome_single_concordant_hg38_115
## PRHU0001                          12085247                                         12085247                                11027565
## PRHU0002                           9006616                                          9006616                                 8119315
## PRHU0009                          22272928                                         22272928                                19763135
## PRHU0010                          21621102                                         21621102                                19276929
## PRHU0011                          22252717                                         22252717                                19973324
## PRHU0018                          19815284                                         19815284                                18103721
##          hisat_genome_single_concordant_lpanamensis_mhomcol_v68 hisat_genome_multi_concordant_hg38_115
## PRHU0001                                                    838                                 621289
## PRHU0002                                                    760                                 505363
## PRHU0009                                                   1382                                1516414
## PRHU0010                                                    559                                1171082
## PRHU0011                                                    662                                1356659
## PRHU0018                                                    415                                1010925
##          hisat_genome_multi_concordant_lpanamensis_mhomcol_v68 hisat_genome_single_all_hg38_115
## PRHU0001                                                    99                           365382
## PRHU0002                                                   102                           282669
## PRHU0009                                                   102                           790119
## PRHU0010                                                    96                           920497
## PRHU0011                                                    82                           758139
## PRHU0018                                                    62                           536623
##          hisat_genome_single_all_lpanamensis_mhomcol_v68 hisat_genome_multi_all_hg38_115 hisat_genome_multi_all_lpanamensis_mhomcol_v68
## PRHU0001                                           14329                           84362                                           6702
## PRHU0002                                           13432                           71982                                           7507
## PRHU0009                                           22507                          235198                                          17740
## PRHU0010                                           10941                          242320                                          12474
## PRHU0011                                           12590                          206128                                          13330
## PRHU0018                                            8241                          133869                                           9468
##          hisat_unmapped_hg38_115 hisat_unmapped_lpanamensis_mhomcol_v68 hisat_genome_percent_log_hg38_115
## PRHU0001                   36996                               24147571                             99.85
## PRHU0002                  131347                               17990551                             99.27
## PRHU0009                  220613                               44502561                             99.50
## PRHU0010                  189403                               43217431                             99.56
## PRHU0011                  233569                               44477964                             99.48
## PRHU0018                  177472                               39611887                             99.55
##          hisat_genome_percent_log_lpanamensis_mhomcol_v68                                                  hisat_alignment_hg38_115
## PRHU0001                                             0.09 preprocessing/PRHU0001/outputs/20250918hisat_hg38_115/hg38_115_genome.bam
## PRHU0002                                             0.13 preprocessing/PRHU0002/outputs/20250918hisat_hg38_115/hg38_115_genome.bam
## PRHU0009                                             0.10 preprocessing/PRHU0009/outputs/20250918hisat_hg38_115/hg38_115_genome.bam
## PRHU0010                                             0.06 preprocessing/PRHU0010/outputs/20250918hisat_hg38_115/hg38_115_genome.bam
## PRHU0011                                             0.06 preprocessing/PRHU0011/outputs/20250918hisat_hg38_115/hg38_115_genome.bam
## PRHU0018                                             0.05 preprocessing/PRHU0018/outputs/20250918hisat_hg38_115/hg38_115_genome.bam
##                                                                          hisat_alignment_lpanamensis_mhomcol_v68 salmon_mapped_hg38_115
## PRHU0001 preprocessing/PRHU0001/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome.bam                     NA
## PRHU0002 preprocessing/PRHU0002/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome.bam                     NA
## PRHU0009 preprocessing/PRHU0009/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome.bam                     NA
## PRHU0010 preprocessing/PRHU0010/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome.bam                     NA
## PRHU0011 preprocessing/PRHU0011/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome.bam                     NA
## PRHU0018 preprocessing/PRHU0018/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome.bam                     NA
##          salmon_mapped_lpanamensis_mhomcol_v68 salmon_percent_hg38_115 salmon_percent_lpanamensis_mhomcol_v68 salmon_observed_genes_hg38_115
## PRHU0001                                   228                   54.62                               0.001887                          40892
## PRHU0002                                    NA                   61.09                               0.023971                          37639
## PRHU0009                                   514                   53.76                               0.002308                          47176
## PRHU0010                                   532                   55.29                               0.002461                          47162
## PRHU0011                                   564                   56.48                               0.002535                          46983
## PRHU0018                                   411                   57.38                               0.002074                          43731
##          salmon_observed_genes_lpanamensis_mhomcol_v68                                 input_r1                                 input_r2
## PRHU0001                                            12 unprocessed/PRHU0001_S49_R1_001.fastq.gz unprocessed/PRHU0001_S49_R2_001.fastq.gz
## PRHU0002                                            12                                                                                  
## PRHU0009                                            14  unprocessed/PRHU0009_S7_R1_001.fastq.gz  unprocessed/PRHU0009_S7_R2_001.fastq.gz
## PRHU0010                                            13  unprocessed/PRHU0010_S8_R1_001.fastq.gz  unprocessed/PRHU0010_S8_R2_001.fastq.gz
## PRHU0011                                            15  unprocessed/PRHU0011_S9_R1_001.fastq.gz  unprocessed/PRHU0011_S9_R2_001.fastq.gz
## PRHU0018                                            14 unprocessed/PRHU0018_S16_R1_001.fastq.gz unprocessed/PRHU0018_S16_R2_001.fastq.gz
##                                                                                      hisat_count_table_hg38_115
## PRHU0001 preprocessing/PRHU0001/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz
## PRHU0002 preprocessing/PRHU0002/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz
## PRHU0009 preprocessing/PRHU0009/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz
## PRHU0010 preprocessing/PRHU0010/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz
## PRHU0011 preprocessing/PRHU0011/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz
## PRHU0018 preprocessing/PRHU0018/outputs/20250918hisat_hg38_115/hg38_115_genome-paired_s2_gene_ID_fcounts.csv.xz
##                                                                                                                    hisat_count_table_lpanamensis_mhomcol_v68
## PRHU0001 preprocessing/PRHU0001/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome-paired_s2_protein_coding_gene_ID_fcounts.csv.xz
## PRHU0002 preprocessing/PRHU0002/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome-paired_s2_protein_coding_gene_ID_fcounts.csv.xz
## PRHU0009 preprocessing/PRHU0009/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome-paired_s2_protein_coding_gene_ID_fcounts.csv.xz
## PRHU0010 preprocessing/PRHU0010/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome-paired_s2_protein_coding_gene_ID_fcounts.csv.xz
## PRHU0011 preprocessing/PRHU0011/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome-paired_s2_protein_coding_gene_ID_fcounts.csv.xz
## PRHU0018 preprocessing/PRHU0018/outputs/20250918hisat_lpanamensis_mhomcol_v68/lpanamensis_mhomcol_v68_genome-paired_s2_protein_coding_gene_ID_fcounts.csv.xz
##                                            salmon_count_table_hg38_115
## PRHU0001 preprocessing/PRHU0001/outputs/80salmon_hg38_115_CDS/quant.sf
## PRHU0002 preprocessing/PRHU0002/outputs/80salmon_hg38_115_CDS/quant.sf
## PRHU0009 preprocessing/PRHU0009/outputs/80salmon_hg38_115_CDS/quant.sf
## PRHU0010 preprocessing/PRHU0010/outputs/80salmon_hg38_115_CDS/quant.sf
## PRHU0011 preprocessing/PRHU0011/outputs/80salmon_hg38_115_CDS/quant.sf
## PRHU0018 preprocessing/PRHU0018/outputs/80salmon_hg38_115_CDS/quant.sf
##                                                  salmon_count_table_lpanamensis_mhomcol_v68
## PRHU0001 preprocessing/PRHU0001/outputs/20250918salmon_lpanamensis_mhomcol_v68_CDS/quant.sf
## PRHU0002 preprocessing/PRHU0002/outputs/20250918salmon_lpanamensis_mhomcol_v68_CDS/quant.sf
## PRHU0009 preprocessing/PRHU0009/outputs/20250918salmon_lpanamensis_mhomcol_v68_CDS/quant.sf
## PRHU0010 preprocessing/PRHU0010/outputs/20250918salmon_lpanamensis_mhomcol_v68_CDS/quant.sf
## PRHU0011 preprocessing/PRHU0011/outputs/20250918salmon_lpanamensis_mhomcol_v68_CDS/quant.sf
## PRHU0018 preprocessing/PRHU0018/outputs/20250918salmon_lpanamensis_mhomcol_v68_CDS/quant.sf
head(modified_meta[["salmon_observed_genes_hg38_115"]])
## [1] 40892 37639 47176 47162 46983 43731
modified_meta[["detectionparasiteby7sl"]] <- sanitize_metadata(modified_meta[["detectionparasiteby7sl"]])
summary(modified_meta[["detectionparasiteby7sl"]])
##      negative notapplicable      positive 
##            63            29            11

Create a factor describing 7SL detection in the nasal samples and apply it to every sample from the same participant. Note that the NAs need to be recast as undefined; I would have sworn that my gather function already did that.

modified_meta[["nasal_7sl_status"]] <- modified_meta[["detectionparasiteby7sl"]]

nasal_samples <- modified_meta[["sample_type"]] == "nasal swab"
summary(nasal_samples)
##    Mode   FALSE    TRUE 
## logical      83      20
sl_positive <- modified_meta[["nasal_7sl_status"]] == "positive"
summary(sl_positive)
##    Mode   FALSE    TRUE 
## logical      92      11
nasal_positive <- nasal_samples & sl_positive
summary(nasal_positive)
##    Mode   FALSE    TRUE 
## logical      96       7
nasal_positive_samples <- rownames(modified_meta)[nasal_positive]
nasal_positive_people <- modified_meta[nasal_positive_samples, "participant_code"]
nasal_positive_people
## [1] "PP1009" "PP2020" "PP1009" "PP2005" "PP2006" "PP2019" "PP2020"
nasal_positive_people_samples <- modified_meta[["participant_code"]] %in% nasal_positive_people
modified_meta[["nasal_7sl_status"]] <- "negative"
modified_meta[nasal_positive_people_samples, "nasal_7sl_status"] <- "positive"
write_xlsx(data = modified_meta, excel = "sample_sheets/human_samples_202511_with_nasal_factor.xlsx")
## Deleting the file sample_sheets/human_samples_202511_with_nasal_factor.xlsx before writing the tables.
## write_xlsx() wrote sample_sheets/human_samples_202511_with_nasal_factor.xlsx.
## The cursor is on sheet first, row: 106 column: 115.
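
As a sanity check on the logic above, here is a minimal sketch in base R on invented data. The participants, sample names, and values are hypothetical and do not come from the sample sheet; only the column names mirror the real metadata.

```r
## Toy metadata; every value here is invented for illustration.
toy_meta <- data.frame(
  participant_code = c("P1", "P1", "P1", "P2", "P2", "P2"),
  sample_type = c("nasal swab", "PBMCs", "skin biopsy scar",
                  "nasal swab", "PBMCs", "skin biopsy scar"),
  detectionparasiteby7sl = c("positive", "negative", "notapplicable",
                             "negative", "positive", "notapplicable"),
  row.names = paste0("S", 1:6),
  stringsAsFactors = FALSE)

## A sample counts only if it is both a nasal swab and 7SL positive.
nasal_positive <- toy_meta[["sample_type"]] == "nasal swab" &
  toy_meta[["detectionparasiteby7sl"]] == "positive"
## unique() drops repeated participants; %in% would tolerate repeats anyway.
positive_people <- unique(toy_meta[nasal_positive, "participant_code"])

## Default every sample to negative, then flip all samples from a positive participant.
toy_meta[["nasal_7sl_status"]] <- "negative"
from_positive_person <- toy_meta[["participant_code"]] %in% positive_people
toy_meta[from_positive_person, "nasal_7sl_status"] <- "positive"

status_table <- table(toy_meta[["participant_code"]], toy_meta[["nasal_7sl_status"]])
status_table
```

In this toy, P2 has a 7SL-positive PBMC sample but stays negative, because only the nasal swab decides the participant-level status.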
hisat_idx <- grep(pattern = "^hisat", x = names(first_spec))
second_spec <- first_spec[hisat_idx]
post_meta <- gather_preprocessing_metadata(
  starting_metadata = pre_meta[["new_meta"]],
  specification = second_spec, basedir = "preprocessing/202405", species = "hg38_111",
  new_metadata = "sample_sheets/tmrc2_persistence_202405_lp_hg.xlsx")

both_meta <- gather_preprocessing_metadata(
  starting_metadata = "sample_sheets/tmrc_persistence_202405.xlsx",
  specification = first_spec,
  basedir = "preprocessing/202405", species = c("lpanamensis_v68", "hg38_111"),
  new_metadata = "sample_sheets/tmrc_persistence_202405_both.xlsx")

6 Collect gene annotations

I should have all my load_xyz_annotation functions return some of the same elements in their retlists.

lp_genes <- lp_annot[["genes"]]
hg_genes <- hs_annot[["gene_annotations"]]

7 Quick peek at the SL samples, hg38 release 115

7.1 Gather the transcript and gene annotations.

hg_tx <- hs_annot[["annotation"]]
hg_map <- hs_annot[["gene_tx_map"]]
lp_genes <- lp_annot[["genes"]]

7.2 Create initial hisat/salmon tables

hu_se_salmon <- create_se(modified_meta, gene_info = hg_tx,
                          tx_gene_map = hg_map, file_column = "salmon_count_table_hg38_115") %>%
  set_conditions(fact = "sample_type") %>%
  set_batches(fact = "library_type")
## Reading the sample metadata.
## Checking the state of the condition column.
## Checking the state of the batch column.
## Checking the condition factor.
## The sample definitions comprises: 103 rows(samples) and 108 columns(metadata fields).
## In some cases, (notably salmon) the format of the IDs used by this can be tricky.
## It is likely to require the transcript ID followed by a '.' and the ensembl column:
## 'transcript_version', which is explicitly different than the gene version column.
## If this is not correctly performed, very few genes will be observed
## Rewriting the transcript<->gene map to remove tx versions.
## reading in files with read_tsv
## 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 
## summarizing abundance
## summarizing counts
## summarizing length
## Matched 22263 annotations and counts.
## Some annotations were lost in merging, setting them to 'undefined'.
## Saving the summarized experiment to 'se.rda'.
## The final summarized experiment has 22263 rows and 108 columns.
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion       skin biopsy scar                   WBCs 
##                     20                     25                     15                      4                     16                     23
## The number of samples by batch are:
## 
## mRNA   RZ 
##   91   12
hu_se_hisat_gene <- create_se(modified_meta, gene_info = hg_genes,
                              file_column = "hisat_count_table_hg38_115") %>%
  set_conditions(fact = "sample_type") %>%
  set_batches(fact = "library_type")
## Reading the sample metadata.
## Checking the state of the condition column.
## Checking the state of the batch column.
## Checking the condition factor.
## The sample definitions comprises: 103 rows(samples) and 108 columns(metadata fields).
## Matched 21571 annotations and counts.
## Some annotations were lost in merging, setting them to 'undefined'.
## Saving the summarized experiment to 'se.rda'.
## The final summarized experiment has 21571 rows and 108 columns.
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion       skin biopsy scar                   WBCs 
##                     20                     25                     15                      4                     16                     23
## The number of samples by batch are:
## 
## mRNA   RZ 
##   91   12

7.3 Figure out which samples do not have 7SL categories

undef_7sl <- is.na(colData(hu_se_salmon)[["detectionparasiteby7sl"]])
colData(hu_se_salmon)[undef_7sl, "detectionparasiteby7sl"] <- "unknown"
colData(hu_se_hisat_gene)[undef_7sl, "detectionparasiteby7sl"] <- "unknown"
sample_7sl <- paste0(colData(hu_se_salmon)[["sample_type"]], "_",
                     colData(hu_se_salmon)[["detectionparasiteby7sl"]])
sample_7sl <- gsub(x = sample_7sl, pattern = "[[:space:]]", replacement = "_")
colData(hu_se_salmon)[["sample_7sl"]] <- sample_7sl
colData(hu_se_hisat_gene)[["sample_7sl"]] <- sample_7sl

7.4 Healthy vs scar samples

healthy_vs_scar <- gsub(x = colData(hu_se_salmon)[["sample_type"]],
                        pattern = "^skin biopsy ", replacement = "")
colData(hu_se_salmon)[["hs"]] <- healthy_vs_scar
colData(hu_se_hisat_gene)[["hs"]] <- healthy_vs_scar
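
A small sketch of what the prefix removal does, using a hand-written vector of the sample types seen in this sheet (the vector itself is an assumption, not pulled from the metadata). Types that do not start with "skin biopsy " pass through unchanged, so a later subset has to ask for "healthy" and "scar" explicitly.

```r
## Hand-written stand-in for the sample_type column.
sample_type <- c("skin biopsy healthy", "skin biopsy scar",
                 "skin biopsy non-lesion", "nasal swab", "PBMCs")
## The ^ anchor restricts the replacement to the start of each string.
hs <- gsub(pattern = "^skin biopsy ", replacement = "", x = sample_type)
hs
## Only the two categories of interest survive this filter.
keep <- hs %in% c("healthy", "scar")
hs[keep]
```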

7.5 Clean up missing 7SL detection samples

I think the last of the missing 7SL entries has been filled in since the last time I updated this sheet.

undef_7sl <- is.na(colData(hu_se_hisat_gene)[["detectionparasiteby7sl"]])
summary(undef_7sl)
##    Mode   FALSE 
## logical     103
if (any(undef_7sl)) {
  colData(hu_se_salmon)[undef_7sl, "detectionparasiteby7sl"] <- "unknown"
  colData(hu_se_hisat_gene)[undef_7sl, "detectionparasiteby7sl"] <- "unknown"
} else {
  message("There appear to be no missing 7SL entries.")
}
## There appear to be no missing 7SL entries.

7.7 Combine 7SL status and the sample type into one factor

sample_7sl <- paste0(colData(hu_se_hisat_gene)[["sample_type"]], "_",
                     colData(hu_se_hisat_gene)[["detectionparasiteby7sl"]])
colData(hu_se_hisat_gene)[["sample_7sl"]] <- sample_7sl
colData(hu_se_salmon)[["sample_7sl"]] <- sample_7sl
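
For reference, a minimal sketch of building the combined factor on invented values. It includes the NA recoding and the whitespace replacement from the earlier section, since paste0() would otherwise write the literal string "NA" and keep the spaces in the level names.

```r
## Invented values, chosen to include a space and a missing entry.
sample_type <- c("nasal swab", "PBMCs", "skin biopsy scar", "WBCs")
detection <- c("positive", "negative", NA, "notapplicable")

## Recode missing entries before pasting.
detection[is.na(detection)] <- "unknown"
combined <- paste0(sample_type, "_", detection)
## Replace each whitespace character with an underscore.
combined <- gsub(pattern = "[[:space:]]", replacement = "_", x = combined)
combined
```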

7.8 Separate the mRNA samples from ribo-zero

Note, when we are finished, we will be using only the mRNA samples and ignoring the ribo-zero. But there are some questions about the data provided by the two libraries.

hu_se_salmon_mrna <- set_conditions(hu_se_salmon, fact = "sample_type") %>%
  subset_se(subset = "library_type=='mRNA'") %>%
  set_batches("detectionparasiteby7sl")
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion       skin biopsy scar                   WBCs 
##                     20                     25                     15                      4                     16                     23
## The number of samples by batch are:
## 
##      negative notapplicable      positive 
##            57            28             6
hu_se_salmon_rz <- set_conditions(hu_se_salmon, fact = "sample_type") %>%
  subset_se(subset = "library_type=='RZ'") %>%
  set_batches("detectionparasiteby7sl")
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion       skin biopsy scar                   WBCs 
##                     20                     25                     15                      4                     16                     23
## The number of samples by batch are:
## 
##      negative notapplicable      positive 
##             6             1             5
hu_se_hisat_gene_mrna <- set_conditions(hu_se_hisat_gene, fact = "sample_type") %>%
  subset_se(subset = "library_type=='mRNA'") %>%
  set_batches("detectionparasiteby7sl")
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion       skin biopsy scar                   WBCs 
##                     20                     25                     15                      4                     16                     23
## The number of samples by batch are:
## 
##      negative notapplicable      positive 
##            57            28             6
hu_se_hisat_gene_rz <- set_conditions(hu_se_hisat_gene, fact = "sample_type") %>%
  subset_se(subset = "library_type=='RZ'") %>%
  set_batches("detectionparasiteby7sl")
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion       skin biopsy scar                   WBCs 
##                     20                     25                     15                      4                     16                     23
## The number of samples by batch are:
## 
##      negative notapplicable      positive 
##             6             1             5

7.9 Extract only the healthy or scar samples, only mRNA

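## Note: this starts from hu_se_salmon (mRNA and RZ), not hu_se_salmon_mrna, which likely
## explains the 16 scar samples here versus 15 in the hisat subset below.
hu_hs_salmon_mrna <- subset_se(hu_se_salmon, subset = "hs=='healthy'|hs=='scar'") %>%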
  set_conditions(fact = "hs")
## The numbers of samples by condition are:
## 
## healthy    scar 
##      15      16
hu_hs_hisat_mrna <- subset_se(hu_se_hisat_gene_mrna, subset = "hs=='healthy'|hs=='scar'") %>%
  set_conditions(fact = "hs")
## The numbers of samples by condition are:
## 
## healthy    scar 
##      15      15
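
When two subsets that should match report different counts, a cross-tabulation of the category against the library type is a quick way to see where the extra sample sits. This is a sketch on an invented data frame; on the real objects the same table() call would presumably take the "hs" and "library_type" columns of colData().

```r
## Invented metadata: one scar sample comes from an RZ library.
toy <- data.frame(
  hs = c("healthy", "healthy", "scar", "scar", "scar", "nasal swab"),
  library_type = c("mRNA", "mRNA", "mRNA", "mRNA", "RZ", "mRNA"),
  stringsAsFactors = FALSE)

## Keep the two categories of interest, then count them per library type.
wanted <- toy[["hs"]] %in% c("healthy", "scar")
counts <- table(toy[wanted, "hs"], toy[wanted, "library_type"])
counts
```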

8 HU Metadata

hu_mapped_mrna <- plot_metadata_factors(hu_se_hisat_gene_mrna, column = "hisat_genome_percent_log_hg38_115")
hu_mapped_mrna

hu_mapped_rz <- plot_metadata_factors(hu_se_hisat_gene_rz, column = "hisat_genome_percent_log_hg38_115")
hu_mapped_rz
## Warning: Groups with fewer than two datapoints have been dropped.
## ℹ Set `drop = FALSE` to consider such groups for position adjustment purposes.

hu_observed_mrna <- plot_metadata_factors(hu_se_hisat_gene_mrna, column = "salmon_observed_genes_hg38_115")
hu_observed_mrna

hu_observed_rz <- plot_metadata_factors(hu_se_hisat_gene_rz, column = "salmon_observed_genes_hg38_115")
hu_observed_rz
## Warning: Groups with fewer than two datapoints have been dropped.
## ℹ Set `drop = FALSE` to consider such groups for position adjustment purposes.

hu_pct_mrna <- plot_metadata_factors(hu_se_salmon_mrna, column = "salmon_percent_hg38_115")
hu_pct_mrna

hu_pct_rz <- plot_metadata_factors(hu_se_salmon_rz, column = "salmon_percent_hg38_115")
hu_pct_rz
## Warning: Groups with fewer than two datapoints have been dropped.
## ℹ Set `drop = FALSE` to consider such groups for position adjustment purposes.

hu_sankey <- plot_meta_sankey(hu_se_salmon, factors = c("detectionparasiteby7sl", "sample_type", "library_type"))
## Warning: attributes are not identical across measure variables; they will be dropped
## Warning: The `size` argument of `element_rect()` is deprecated as of ggplot2 3.4.0.
## ℹ Please use the `linewidth` argument instead.
## ℹ The deprecated feature was likely used in the ggsankey package.
##   Please report the issue at <https://github.com/davidsjoberg/ggsankey/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
hu_sankey
## A sankey plot describing the metadata of 103 samples,
## including 30 out of 0 nodes and traversing metadata factors:
## detectionparasiteby7sl, sample_type, library_type.

9 nonzero/libsize/etc

plot_legend(hu_se_salmon)
## The colors used in the expressionset are: #1B9E77, #66A61E, #7570B3, #D95F02, #E6AB02, #E7298A.

plot_libsize(hu_se_salmon)
## Library sizes of 103 samples, 
## ranging from 2,281,699 to 14,374,842.

plot_nonzero(hu_se_salmon, y_intercept = 0.75)
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## ℹ The deprecated feature was likely used in the hpgltools package.
##   Please report the issue to the authors.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was generated.
## A non-zero genes plot of 103 samples.
## These samples have an average 8.943 CPM coverage and 16231 genes observed, ranging from 15093 to
## 17527.
## Warning: ggrepel: 56 unlabeled data points (too many overlaps). Consider increasing max.overlaps

plot_libsize(hu_se_hisat_gene)
## Library sizes of 103 samples, 
## ranging from 4,395,408 to 18,970,049.

plot_nonzero(hu_se_hisat_gene)
## Scale for colour is already present.
## Adding another scale for colour, which will replace the existing scale.
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
## A non-zero genes plot of 103 samples.
## These samples have an average 12.61 CPM coverage and 15853 genes observed, ranging from 14800 to
## 17913.
## Warning: ggrepel: 61 unlabeled data points (too many overlaps). Consider increasing max.overlaps

plot_libsize(hu_se_hisat_gene_mrna)
## Library sizes of 91 samples, 
## ranging from 6,563,145 to 18,970,049.

plot_libsize(hu_se_hisat_gene_rz)
## Library sizes of 12 samples, 
## ranging from 4,395,408 to 16,550,439.

10 Normalize

hu_sesn <- normalize(hu_se_salmon, transform = "log2", convert = "cpm",
                     filter = TRUE, norm = "quant")
## Removing 4599 low-count genes (17664 remaining).
## transform_counts: Found 200385 values equal to 0, adding 1 to the matrix.
plot_corheat(hu_sesn)
## A heatmap of pairwise sample correlations ranging from: 
## 0.472624447742625 to 0.975818273903798.
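
To make the convert and transform arguments concrete, here is a minimal base R sketch of counts-per-million followed by log2 on an invented matrix. It covers only those two steps; the low-count filter and quantile normalization that normalize() also performs are left out, and the order of operations inside that function is not assumed here.

```r
## Invented counts: 3 genes (rows) by 2 samples (columns), filled column-wise.
counts <- matrix(c(10, 0, 90,
                   200, 300, 500),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))
lib_sizes <- colSums(counts)
## Divide each column by its library size and scale to one million reads.
cpm <- sweep(counts, 2, lib_sizes, FUN = "/") * 1e6
## Adding 1 keeps zero counts finite under log2, matching the message above.
log_cpm <- log2(cpm + 1)
round(log_cpm, 2)
```

After the CPM step every column sums to one million, so the two samples are comparable despite a tenfold difference in depth.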

hu_sesn_pca <- plot_pca(hu_sesn)

pp(file = "images/hu_pca_sampletype.png")
hu_sesn_pca$plot
dev.off()
## png 
##   2
hu_sesn_pca
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by nasal swab, PBMCs, skin biopsy healthy, skin biopsy non-lesion, skin biopsy scar, WBCs
## Shapes are defined by mRNA, RZ.

hu_detected <- subset_se(hu_se_salmon, subset = "detectionparasiteby7sl!='unknown'") %>%
  set_conditions(fact = "detectionparasiteby7sl") %>%
  set_batches("sample_type")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##            63            29            11
## The number of samples by batch are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion       skin biopsy scar                   WBCs 
##                     20                     25                     15                      4                     16                     23
hu_detect_nb <- normalize(hu_detected, transform = "log2", convert = "cpm",
                          filter = TRUE, batch = "svaseq")
## Removing 4599 low-count genes (17664 remaining).
## transform_counts: Found 64135 values less than 0.
## transform_counts: Found 64135 values equal to 0, adding 1 to the matrix.
hu_detect_pca <- plot_pca(hu_detect_nb)
## Warning in ggplot2::guide_legend(overwrite.aes = list(size = plot_size)): Arguments in `...` must be used.
## ✖ Problematic argument:
## • overwrite.aes = list(size = plot_size)
## ℹ Did you misspell an argument name?
pp(file = "images/hu_pca_detect_sva.png")
hu_detect_pca$plot
dev.off()
## png 
##   2
hu_detect_pca
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, notapplicable, positive
## Shapes are defined by nasal swab, PBMCs, skin biopsy healthy, skin biopsy non-lesion, skin biopsy scar, WBCs.
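
plot_pca() reports a fast_svd reduction. As a rough stand-in for what the PC1/PC2 axes represent, the sketch below uses base R prcomp() on an invented log-scale matrix; this is a swapped-in technique for illustration, not the method hpgltools uses, and the data, group labels, and effect size are all hypothetical.

```r
set.seed(1)
## Invented log-scale expression: 50 genes by 6 samples, two groups of three.
expr <- matrix(rnorm(50 * 6), nrow = 50,
               dimnames = list(paste0("g", 1:50), paste0("s", 1:6)))
group <- rep(c("negative", "positive"), each = 3)
## Shift the first ten genes in the positive group so there is signal to find.
expr[1:10, group == "positive"] <- expr[1:10, group == "positive"] + 5
## prcomp() expects samples in rows, hence t(); centering is on by default.
pca <- prcomp(t(expr))
## Percent of variance carried by each component.
pct_var <- 100 * pca[["sdev"]]^2 / sum(pca[["sdev"]]^2)
scores <- data.frame(sample = colnames(expr), group = group,
                     PC1 = pca[["x"]][, 1], PC2 = pca[["x"]][, 2])
scores
round(pct_var[1:2], 1)
```

With as few positives as the nasal subsets have, a couple of samples can dominate a component, so the percent variance is worth reading alongside the plot.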

11 Compare distribution of RZ/Stranded libraries

Maria Adelaida is interested in how the relatively few ribo-zero (RZ) samples are distributed compared with the much larger number of stranded mRNA libraries.

I think it is likely that the nasal samples are of primary interest.

salmon_mrna_7sl <- set_conditions(hu_se_salmon_mrna, fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##            57            28             6
salmon_mrna_7sl_norm <- normalize(salmon_mrna_7sl, convert = "cpm", filter = TRUE,
                                  norm = "quant", transform = "log2")
## Removing 5488 low-count genes (16775 remaining).
## transform_counts: Found 64911 values equal to 0, adding 1 to the matrix.
## This still clusters primarily by sample type, and there are precious few positive samples.
plot_pca(salmon_mrna_7sl_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.
## Warning in MASS::cov.trob(data[, vars], wt = weight * nrow(data)): Probable convergence failure
## Warning in MASS::cov.trob(data[, vars], wt = weight * nrow(data)): Probable convergence failure

salmon_mrna_7sl_nb <- normalize(salmon_mrna_7sl, convert = "cpm", filter = TRUE,
                                batch = "sva", transform = "log2")
## Removing 5488 low-count genes (16775 remaining).
## transform_counts: Found 20417 values less than 0.
## transform_counts: Found 20417 values equal to 0, adding 1 to the matrix.
plot_pca(salmon_mrna_7sl_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

hisat_mrna_7sl <- set_conditions(hu_se_hisat_gene_mrna, fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##            57            28             6
hisat_mrna_7sl_norm <- normalize(hisat_mrna_7sl, convert = "cpm", filter = TRUE,
                                 norm = "quant", transform = "log2")
## Removing 6173 low-count genes (15398 remaining).
## transform_counts: Found 14191 values equal to 0, adding 1 to the matrix.
## This still clusters primarily by sample type, and there are precious few positive samples.
plot_pca(hisat_mrna_7sl_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.
## Warning in MASS::cov.trob(data[, vars], wt = weight * nrow(data)): Probable convergence failure
## Warning in MASS::cov.trob(data[, vars], wt = weight * nrow(data)): Probable convergence failure

hisat_mrna_7sl_nb <- normalize(hisat_mrna_7sl, convert = "cpm", filter = TRUE,
                               batch = "sva", transform = "log2")
## Removing 6173 low-count genes (15398 remaining).
## transform_counts: Found 6654 values less than 0.
## transform_counts: Found 6654 values equal to 0, adding 1 to the matrix.
plot_pca(hisat_mrna_7sl_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

Now restrict to just the nasal samples.

salmon_nasal_mrna <- subset_se(hu_se_salmon_mrna, subset = "sample_type=='nasal swab'") %>%
  set_conditions(fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             8             1             2
salmon_nasal_mrna_norm <- normalize(salmon_nasal_mrna, convert = "cpm", filter = TRUE,
                                    norm = "quant", transform = "log2")
## Removing 8374 low-count genes (13889 remaining).
## transform_counts: Found 2353 values equal to 0, adding 1 to the matrix.
plot_pca(salmon_nasal_mrna_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

salmon_nasal_mrna_nb <- normalize(salmon_nasal_mrna, convert = "cpm", filter = TRUE,
                                  batch = "sva", transform = "log2")
## Removing 8374 low-count genes (13889 remaining).
## transform_counts: Found 841 values less than 0.
## transform_counts: Found 841 values equal to 0, adding 1 to the matrix.
plot_pca(salmon_nasal_mrna_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

hisat_nasal_mrna <- subset_se(hu_se_hisat_gene_mrna, subset = "sample_type=='nasal swab'") %>%
  set_conditions(fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             8             1             2
hisat_nasal_mrna_norm <- normalize(hisat_nasal_mrna, convert = "cpm", filter = TRUE,
                                    norm = "quant", transform = "log2")
## Removing 8412 low-count genes (13159 remaining).
## transform_counts: Found 12 values equal to 0, adding 1 to the matrix.
plot_pca(hisat_nasal_mrna_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

hisat_nasal_mrna_nb <- normalize(hisat_nasal_mrna, convert = "cpm", filter = TRUE,
                                  batch = "sva", transform = "log2")
## Removing 8412 low-count genes (13159 remaining).
## transform_counts: Found 108 values less than 0.
## transform_counts: Found 108 values equal to 0, adding 1 to the matrix.
plot_pca(hisat_nasal_mrna_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

11.1 Repeat with the ribo zero samples

salmon_rz_7sl <- set_conditions(hu_se_salmon_rz, fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             6             1             5
salmon_rz_7sl_norm <- normalize(salmon_rz_7sl, convert = "cpm", filter = TRUE,
                                norm = "quant", transform = "log2")
## Removing 6360 low-count genes (15903 remaining).
## transform_counts: Found 8271 values equal to 0, adding 1 to the matrix.
## This still clusters primarily by sample type, and there are precious few positive samples.
plot_pca(salmon_rz_7sl_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

salmon_rz_7sl_nb <- normalize(salmon_rz_7sl, convert = "cpm", filter = TRUE,
                              batch = "sva", transform = "log2")
## Removing 6360 low-count genes (15903 remaining).
## transform_counts: Found 1964 values less than 0.
## transform_counts: Found 1964 values equal to 0, adding 1 to the matrix.
plot_pca(salmon_rz_7sl_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

hisat_rz_7sl <- set_conditions(hu_se_hisat_gene_rz, fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             6             1             5
hisat_rz_7sl_norm <- normalize(hisat_rz_7sl, convert = "cpm", filter = TRUE,
                               norm = "quant", transform = "log2")
## Removing 7116 low-count genes (14455 remaining).
## transform_counts: Found 306 values equal to 0, adding 1 to the matrix.
## This still clusters primarily by sample type, and there are precious few positive samples.
plot_pca(hisat_rz_7sl_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

hisat_rz_7sl_nb <- normalize(hisat_rz_7sl, convert = "cpm", filter = TRUE,
                             batch = "sva", transform = "log2")
## Removing 7116 low-count genes (14455 remaining).
## transform_counts: Found 322 values less than 0.
## transform_counts: Found 322 values equal to 0, adding 1 to the matrix.
plot_pca(hisat_rz_7sl_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

Now restrict to just the nasal samples.

salmon_nasal_rz <- subset_se(hu_se_salmon_rz, subset = "sample_type=='nasal swab'") %>%
  set_conditions(fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             3             1             5
salmon_nasal_rz_norm <- normalize(salmon_nasal_rz, convert = "cpm", filter = TRUE,
                                    norm = "quant", transform = "log2")
## Removing 7557 low-count genes (14706 remaining).
## transform_counts: Found 2498 values equal to 0, adding 1 to the matrix.
plot_pca(salmon_nasal_rz_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

salmon_nasal_rz_nb <- normalize(salmon_nasal_rz, convert = "cpm", filter = TRUE,
                                  batch = "sva", transform = "log2")
## Removing 7557 low-count genes (14706 remaining).
## transform_counts: Found 1303 values less than 0.
## transform_counts: Found 1303 values equal to 0, adding 1 to the matrix.
plot_pca(salmon_nasal_rz_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

hisat_nasal_rz <- subset_se(hu_se_hisat_gene_rz, subset = "sample_type=='nasal swab'") %>%
  set_conditions(fact = "detectionparasiteby7sl") %>%
  subset_se(subset = "condition!='notapplicable'")
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             3             1             5
hisat_nasal_rz_norm <- normalize(hisat_nasal_rz, convert = "cpm", filter = TRUE,
                                    norm = "quant", transform = "log2")
## Removing 8212 low-count genes (13359 remaining).
## transform_counts: Found 9 values equal to 0, adding 1 to the matrix.
plot_pca(hisat_nasal_rz_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

hisat_nasal_rz_nb <- normalize(hisat_nasal_rz, convert = "cpm", filter = TRUE,
                                  batch = "sva", transform = "log2")
## Removing 8212 low-count genes (13359 remaining).
## transform_counts: Found 137 values less than 0.
## transform_counts: Found 137 values equal to 0, adding 1 to the matrix.
plot_pca(hisat_nasal_rz_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by negative, positive.

12 Look at sample type and 7sl

hu_s7sl <- set_conditions(hu_se_salmon_mrna, fact = "sample_7sl")
## The numbers of samples by condition are:
## 
##               nasal swab_negative          nasal swab_notapplicable               nasal swab_positive               PBMCs_notapplicable 
##                                 8                                 1                                 2                                25 
##      skin biopsy healthy_negative skin biopsy healthy_notapplicable   skin biopsy non-lesion_negative         skin biopsy scar_negative 
##                                14                                 1                                 2                                14 
##    skin biopsy scar_notapplicable                     WBCs_negative                     WBCs_positive 
##                                 1                                19                                 4
hu_nasal <- subset_se(hu_s7sl, subset = "sample_type=='nasal swab'")
hu_nasal_nb <- normalize(hu_nasal, transform = "log2", convert = "cpm",
                          batch = "svaseq", filter = TRUE)
## Removing 8223 low-count genes (14040 remaining).
## transform_counts: Found 1045 values less than 0.
## transform_counts: Found 1045 values equal to 0, adding 1 to the matrix.
pp(file = "images/nasal_sample_np.png")
plot_pca(hu_nasal_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by nasal swab_negative, nasal swab_notapplicable, nasal swab_positive
## Shapes are defined by negative, notapplicable, positive.
dev.off()
## png 
##   2
hu_wbc <- subset_se(hu_s7sl, subset = "sample_type=='WBCs'")
hu_wbc_nb <- normalize(hu_wbc, transform = "log2", convert = "cpm",
                       batch = "svaseq", filter = TRUE)
## Removing 8982 low-count genes (13281 remaining).
## transform_counts: Found 2601 values less than 0.
## transform_counts: Found 2601 values equal to 0, adding 1 to the matrix.
pp(file = "images/wbc_sample_np.png")
plot_pca(hu_wbc_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by WBCs_negative, WBCs_positive
## Shapes are defined by negative, positive.
dev.off()
## png 
##   2
## Strip the "<sample type>_" prefix so that the condition is only the 7SL status.
short_factor <- gsub(x = as.character(colData(hu_nasal)[["condition"]]),
                     pattern = ".*_(.*)$", replacement = "\\1")
hu_nasal <- set_conditions(hu_nasal, fact = as.factor(short_factor))
## The numbers of samples by condition are:
## 
##      negative notapplicable      positive 
##             8             1             2
hu_nasal_np <- subset_se(hu_nasal, subset = "condition!='notapplicable'")
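
To make the regular expression used for short_factor concrete, here is a small base-R sketch of what that gsub() call does to the combined sample_7sl labels. The label vector is typed in by hand from the condition names printed earlier in this section; it is an illustration only and does not touch the real SummarizedExperiment.

```r
## Toy illustration: these labels are copied by hand from the sample_7sl
## condition names shown above; they are not read from hu_nasal.
labels <- c("nasal swab_negative", "nasal swab_notapplicable",
            "nasal swab_positive", "skin biopsy non-lesion_negative")
## ".*_" is greedy, so it consumes everything up to the last underscore;
## the capture group keeps whatever follows it.
short <- gsub(x = labels, pattern = ".*_(.*)$", replacement = "\\1")
short_factor_demo <- as.factor(short)
levels(short_factor_demo)
```

Because the match is anchored on the last underscore, sample types which contain spaces or hyphens are handled the same way; a sample type containing its own underscore would still be reduced to the trailing 7SL status.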

hu_nasal_de <- all_pairwise(hu_nasal_np, filter = TRUE, force = TRUE,
                            model_fstring = "~ 0 + condition", model_svs = "svaseq")
## negative positive 
##        8        2
## Removing 8374 low-count genes (13889 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 6305 entries to zero.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into integers.
## This received a matrix of SVs.
## converting counts to integer mode
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into integers.
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into integers.
## conditions
## negative positive 
##        8        2
## conditions
## negative positive 
##        8        2
## Warning in choose_binom_dataset(input, force = force): This data was inappropriately forced into integers.
## conditions
## negative positive 
##        8        2
hu_nasal_de
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
##                 pstv_vs_ng
## basic_vs_deseq     0.52525
## basic_vs_dream     0.70791
## basic_vs_ebseq     0.67788
## basic_vs_edger     0.60628
## basic_vs_limma     0.71872
## basic_vs_noiseq    0.11718
## deseq_vs_dream     0.69773
## deseq_vs_ebseq     0.74945
## deseq_vs_edger     0.90410
## deseq_vs_limma     0.65985
## deseq_vs_noiseq    0.33641
## dream_vs_ebseq     0.69816
## dream_vs_edger     0.80879
## dream_vs_limma     0.97772
## dream_vs_noiseq    0.09634
## ebseq_vs_edger     0.80917
## ebseq_vs_limma     0.68784
## ebseq_vs_noiseq    0.54862
## edger_vs_limma     0.76926
## edger_vs_noiseq    0.33872
## limma_vs_noiseq    0.10929
hu_nasal_table <- combine_de_tables(hu_nasal_de, excel = "excel/persist_table.xlsx")
## Deleting the file excel/persist_table.xlsx before writing the tables.
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.
hu_nasal_table
## A set of combined differential expression results.
##                  table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup limma_sigdown
## 1 positive_vs_negative          21            69          19            69           0             0
## Plot describing unique/shared genes in a differential expression table.

hu_nasal_sig <- extract_significant_genes(hu_nasal_table, excel = "excel/persist_sig.xlsx")
## Deleting the file excel/persist_sig.xlsx before writing the tables.
hu_nasal_sig
## A set of genes deemed significant according to limma, edger, deseq, ebseq, basic.
## The parameters defining significant were:
## LFC cutoff: 1 adj P cutoff: 0.05
##                      limma_up limma_down edger_up edger_down deseq_up deseq_down ebseq_up ebseq_down basic_up basic_down
## positive_vs_negative        0          0       19         69       21         69       85         54        0          0

13 Healthy vs Scar samples

One query from our last meeting, which I forgot about until I reread my TODO notes: compare the samples marked as healthy against those marked as scar. These are two skin biopsies taken from widely separated sites on the same person. Note that hu_hs is not defined when the block below runs, so every step in it currently fails.

hu_hs_de <- all_pairwise(hu_hs, filter = TRUE, force = TRUE,
                         model_svs = "svaseq", model_fstring = "~ 0 + condition")
## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'pData': object 'hu_hs' not found
hu_hs_table <- combine_de_tables(hu_hs_de, excel = "excel/healthy_vs_scar_table.xlsx")
## Deleting the file excel/healthy_vs_scar_table.xlsx before writing the tables.
## Error: object 'hu_hs_de' not found
hu_hs_table
## Error: object 'hu_hs_table' not found
hu_hs_sig <- extract_significant_genes(hu_hs_table, excel = "excel/healthy_vs_scar_sig.xlsx")
## Deleting the file excel/healthy_vs_scar_sig.xlsx before writing the tables.
## Error: object 'hu_hs_table' not found
hu_hs_sig
## Error: object 'hu_hs_sig' not found
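
The errors above all trace back to hu_hs not existing. The following is a minimal sketch of the selection it would presumably need to hold, written against a hand-made metadata table so that it runs on its own. The toy_meta table, its sample IDs, and the shortened healthy/scar labels are assumptions for illustration; the commented hpgltools call at the end is untested and assumes subset_se() accepts an '|' inside its subset string.

```r
## Hypothetical stand-in for the sample metadata; the real values live in
## colData(hu_se_salmon_mrna).
toy_meta <- data.frame(
  sampleid = paste0("s", 1:6),
  sample_type = c("skin biopsy healthy", "skin biopsy scar", "nasal swab",
                  "skin biopsy healthy", "PBMCs", "skin biopsy scar"),
  stringsAsFactors = FALSE)

## Keep only the two skin biopsy types of interest.
wanted <- c("skin biopsy healthy", "skin biopsy scar")
keep <- toy_meta[["sample_type"]] %in% wanted
toy_hs <- toy_meta[keep, ]

## Shorten the labels so the two groups read as healthy and scar.
toy_hs[["condition"]] <- factor(
  gsub(x = toy_hs[["sample_type"]], pattern = "^skin biopsy ", replacement = ""),
  levels = c("healthy", "scar"))
table(toy_hs[["condition"]])

## With the real data, the same selection might look like this (untested):
## hu_hs <- subset_se(hu_se_salmon_mrna,
##                    subset = "sample_type=='skin biopsy healthy'|sample_type=='skin biopsy scar'") %>%
##   set_conditions(fact = "sample_type")
```

Since the two biopsies come from the same person, a paired design that includes a patient term may be more appropriate than "~ 0 + condition"; that depends on which patient identifier column is available in the metadata.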

14 Take a peek at the kraken results

hu_kraken_viral <- create_se(pre_meta[["new_meta"]], file_column = "kraken_matrix_viral",
                             handle_na = "zero")
## Reading the sample metadata.
## Checking the state of the condition column.
## Checking the state of the batch column.
## Checking the condition factor.
## The sample definitions comprises: 103 rows(samples) and 107 columns(metadata fields).
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0002/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## (The same two warnings were also emitted for the 20250918kraken_viral/kraken_report_matrix.tsv files of:
## PRHU0009, PRHU0010, PRHU0011, PRHU0018, PRHU0019, PRHU0020, PRHU0012, PRHU0013, PRHU0014, PRHU0021,
## PRHU0022, PRHU0023, PRHU0015, PRHU0016, PRHU0017, PRHU0024, PRHU0025, PRHU0026, PRHU0038, PRHU0006,
## PRHU0007, PRHU0008, PRHU0005, PRHU0004, PRHU0003, PRHU0027, PRHU0028, PRHU0029, PRHU0030, PRHU0031,
## PRHU0032, PRHU0035, PRHU0033, PRHU0034, PRHU0036, PRHU0037, PRHU0039, PRHU0040, PRHU0041, PRHU0042,
## PRHU0043, PRHU0044, PRHU0045, PRHU0046, PRHU0047, PRHU0048.)
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0049/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0050/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0051/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0052/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0053/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0054/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0055/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0056/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0057/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0058/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0059/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0060/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0061/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0062/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0063/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0064/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0065/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0066/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0067/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0068/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0069/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0070/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0071/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0072/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0073/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0074/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0075/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0076/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0077/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0078/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0079/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0080/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0081/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0082/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0083/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0084/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0085/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0086/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0087/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0088/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0089/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0090/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0091/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0092/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0093/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0094/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0095/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0096/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0097/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0098/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0099/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0100/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0101/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0102/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0103/outputs/20250918kraken_viral/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in create_se(pre_meta[["new_meta"]], file_column = "kraken_matrix_viral", : There are some NAs in this data, the 'handle_nas'
## parameter may be required.
## Matched 487 annotations and counts.
## Saving the summarized experiment to 'se.rda'.
## The final summarized experiment has 487 rows and 107 columns.
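The warnings above suggest that the per-sample kraken matrices do not all report the same set of taxa, and the NAs noted by create_se() presumably come from that mismatch. As a minimal sketch of one way to handle it (not what create_se() does internally), the per-sample tables can be merged on the union of their row names with missing taxa set to zero. The helper name merge_count_tables and the toy data are hypothetical.

```r
## Hypothetical helper: merge single-column count tables whose row names differ.
## Taxa absent from a sample are filled with 0 instead of NA.
merge_count_tables <- function(tables) {
  all_ids <- sort(unique(unlist(lapply(tables, rownames))))
  merged <- matrix(0, nrow = length(all_ids), ncol = length(tables),
                   dimnames = list(all_ids, names(tables)))
  for (sample_id in names(tables)) {
    current <- tables[[sample_id]]
    merged[rownames(current), sample_id] <- current[, 1]
  }
  merged
}

## Toy input: two samples reporting partially overlapping taxa.
toy <- list(
  sampleA = data.frame(count = c(10, 5), row.names = c("taxon1", "taxon2")),
  sampleB = data.frame(count = c(3, 7, 1), row.names = c("taxon2", "taxon3", "taxon4")))
merged <- merge_count_tables(toy)

## Comparing row names of unequal length with != recycles the shorter vector,
## which is the source of the 'longer object length' warning. setequal() and
## setdiff() compare the names as sets instead.
same_taxa <- setequal(rownames(toy$sampleA), rownames(toy$sampleB))
only_in_b <- setdiff(rownames(toy$sampleB), rownames(toy$sampleA))
```

Zero-filling assumes that a taxon missing from a report was simply not observed in that sample, which seems reasonable for kraken reports but is an assumption.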
hu_kraken_viral <- set_conditions(hu_kraken_viral, fact = "sample_type") %>%
  set_batches("detectionparasiteby7sl")
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion       skin biopsy scar                   WBCs 
##                     20                     25                     15                      4                     16                     23
## The number of samples by batch are:
## 
## negative positive  unknown 
##       63       11       29
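## Note: "cpm" is a conversion rather than a normalization, so norm = "cpm" is ignored
## (see the message below) and the result is log2(counts + 1); convert = "cpm" was probably intended.
kraken_viral_norm <- normalize(hu_kraken_viral, filter = TRUE, norm = "cpm", transform = "log2")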
## Removing 0 low-count genes (487 remaining).
## Did not recognize the normalization, leaving the table alone.
##   Recognized normalizations include: 'qsmooth', 'sf', 'sf2', 'vsd', 'quant',
##   'tmm', 'qsmooth_median', 'upperquartile', and 'rle.'
## transform_counts: Found 45295 values equal to 0, adding 1 to the matrix.
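Since the message above says the "cpm" request was not recognized as a normalization, it may help to spell out what a counts-per-million conversion followed by log2(x + 1) looks like. This is a base R sketch on a made-up matrix, not the hpgltools implementation.

```r
## Toy count matrix: rows are taxa, columns are samples (values are invented).
counts <- matrix(c(0, 10, 90, 5, 0, 45), nrow = 3,
                 dimnames = list(c("taxon1", "taxon2", "taxon3"), c("s1", "s2")))

## Counts per million: divide each column by its library size, then scale.
lib_sizes <- colSums(counts)
cpm <- sweep(counts, 2, lib_sizes, "/") * 1e6

## Add a pseudocount of 1 before log2 so that zeros remain defined.
log_cpm <- log2(cpm + 1)
```

After the conversion every column sums to one million, so samples with very different sequencing depth become comparable before the correlation and distance heatmaps are computed.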
plot_corheat(kraken_viral_norm)
## A heatmap of pairwise sample correlations ranging from: 
## 0.745956883236431 to 0.973829842329974.

plot_disheat(kraken_viral_norm)
## A heatmap of pairwise sample distances ranging from: 
## 7.3460130601336 to 25.9738855355352.

plot_pca(kraken_viral_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by nasal swab, PBMCs, skin biopsy healthy, skin biopsy non-lesion, skin biopsy scar, WBCs
## Shapes are defined by negative, positive, unknown.

nasal_kraken <- subset_se(hu_kraken_viral, subset = "condition=='nasal swab'")
nasal_norm <- normalize(nasal_kraken, filter = TRUE, norm = "cpm", transform = "log2")
plot_corheat(nasal_norm)
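For the nasal swab subset, a per-sample relative abundance table may be a useful complement to the heatmap. The following is a base R sketch on invented counts; the genus and sample names are placeholders and nothing here is read from the real kraken matrices.

```r
## Toy nasal swab counts: rows are genera, columns are samples (invented values).
nasal_counts <- matrix(c(20, 0, 80, 10, 15, 25), nrow = 3,
                       dimnames = list(c("genusA", "genusB", "genusC"),
                                       c("swab1", "swab2")))

## Relative abundance: each column is divided by its own total.
rel_abundance <- prop.table(nasal_counts, margin = 2)

## Rank genera by their mean relative abundance across samples.
mean_abundance <- sort(rowMeans(rel_abundance), decreasing = TRUE)
```

Working with proportions rather than raw counts keeps a deeply sequenced swab from dominating the ranking.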
hu_kraken_bacteria <- create_se(pre_meta[["new_meta"]], file_column = "kraken_matrix_bacterial",
                       handle_na = "zero")
## Reading the sample metadata.
## Checking the state of the condition column.
## Checking the state of the batch column.
## Checking the condition factor.
## The sample definitions comprises: 103 rows(samples) and 107 columns(metadata fields).
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0002/outputs/02kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0009/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0010/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0011/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0018/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0019/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0020/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0012/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0013/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0014/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0021/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0022/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0023/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0015/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0016/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0017/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0024/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0025/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0026/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0038/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0006/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0007/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0008/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0005/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0004/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0003/outputs/02kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0027/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0028/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0029/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0030/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0031/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0032/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0035/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0033/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0034/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0036/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0037/outputs/06kraken_bacteria/kraken_report_matrix.tsv has mismatched
## rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0039/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0040/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0041/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0042/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0043/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0044/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0045/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0046/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0047/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0048/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0049/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0050/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0051/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0052/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0053/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0054/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0055/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0056/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0057/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0058/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0059/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0060/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0061/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0062/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0063/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0064/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0065/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0066/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0067/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0068/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0069/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0070/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0071/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0072/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0073/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0074/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0075/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0076/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0077/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0078/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0079/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0080/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0081/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0082/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0083/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0084/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0085/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0086/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0087/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0088/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0089/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0090/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0091/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0092/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0093/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0094/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0095/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0096/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0097/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0098/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0099/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0100/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0101/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0102/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in first_rownames != current_rownames: longer object length is not a multiple of shorter object length
## Warning in read_counts(ids, files, countdir = countdir, file_type = file_type, : The file:
## /z1/scratch/atb/rnaseq/lpanamensis_persistence_2023/preprocessing/PRHU0103/outputs/20250918kraken_bacteria/kraken_report_matrix.tsv has
## mismatched rownames.
## Warning in create_se(pre_meta[["new_meta"]], file_column = "kraken_matrix_bacterial", : There are some NAs in this data, the 'handle_nas'
## parameter may be required.
## Matched 1754 annotations and counts.
## Saving the summarized experiment to 'se.rda'.
## The final summarized experiment has 1754 rows and 107 columns.
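The repeated "mismatched rownames" warnings above suggest that the per-sample kraken report matrices do not all list the same taxa, and "longer object length is not a multiple of shorter object length" is the warning R gives when two vectors of unequal length are compared element by element. What follows is a minimal base-R sketch of one way to line such tables up on the union of their taxon names before combining them. The helper `merge_by_rownames()`, the toy counts, and the choice to fill absent taxa with 0 are all assumptions for illustration; this is not how `read_counts()` or `create_se()` handle the situation.

```r
## Hypothetical helper (not part of hpgltools): combine named count vectors
## on the union of their names, filling taxa absent from a sample with 'fill'.
merge_by_rownames <- function(count_list, fill = 0) {
  all_taxa <- sort(unique(unlist(lapply(count_list, names))))
  merged <- matrix(fill, nrow = length(all_taxa), ncol = length(count_list),
                   dimnames = list(all_taxa, names(count_list)))
  for (sample_id in names(count_list)) {
    counts <- count_list[[sample_id]]
    ## Index by name so that row order in the input does not matter.
    merged[names(counts), sample_id] <- counts
  }
  merged
}

## Toy data: the two samples share only one genus.
toy_counts <- list(
  sampleA = c(Staphylococcus = 10, Cutibacterium = 5),
  sampleB = c(Staphylococcus = 3, Corynebacterium = 7, Moraxella = 1))
toy_matrix <- merge_by_rownames(toy_counts)
print(toy_matrix)
```

Filling with 0 instead of NA treats "taxon not reported" as "taxon not observed", which may or may not be appropriate here; leaving NAs in place would instead require something like the `handle_nas` parameter mentioned in the warning.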
hu_kraken_bacteria <- set_conditions(hu_kraken_bacteria, fact = "sample_type") %>%
  set_batches("detectionparasiteby7sl")
## The numbers of samples by condition are:
## 
##             nasal swab                  PBMCs    skin biopsy healthy skin biopsy non-lesion       skin biopsy scar                   WBCs 
##                     20                     25                     15                      4                     16                     23
## The number of samples by batch are:
## 
## negative positive  unknown 
##       63       11       29
colData(hu_kraken_bacteria)[["kraken_bacteria"]] <- kraken_bacteria
## Error: object 'kraken_bacteria' not found
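## NOTE: "cpm" is passed as 'norm' here, and the message below shows it was not recognized,
## so no cpm step was applied. Elsewhere in this document cpm is requested via convert = "cpm".
kraken_bacteria_norm <- normalize(hu_kraken_bacteria, filter = TRUE, norm = "cpm", transform = "log2")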
## Removing 0 low-count genes (1754 remaining).
## Did not recognize the normalization, leaving the table alone.
##   Recognized normalizations include: 'qsmooth', 'sf', 'sf2', 'vsd', 'quant',
##   'tmm', 'qsmooth_median', 'upperquartile', and 'rle.'
## transform_counts: Found 88803 values equal to 0, adding 1 to the matrix.
plot_corheat(kraken_bacteria_norm)
## A heatmap of pairwise sample correlations ranging from: 
## 0.604850876191328 to 0.934803391036582.

plot_disheat(kraken_bacteria_norm)
## A heatmap of pairwise sample distances ranging from: 
## 38.2027362700424 to 129.211351296268.

plot_pca(kraken_bacteria_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by nasal swab, PBMCs, skin biopsy healthy, skin biopsy non-lesion, skin biopsy scar, WBCs
## Shapes are defined by negative, positive, unknown.
## Warning in MASS::cov.trob(data[, vars], wt = weight * nrow(data)): Probable convergence failure
## Warning in MASS::cov.trob(data[, vars], wt = weight * nrow(data)): Probable convergence failure

15 Nasal as a proxy for everything else

At the beginning of this document, I created a somewhat peculiar factor from each person's nasal swab 7SL status and applied it to all of that person's other samples; thus a person whose nasal sample was positive is deemed positive for every sample type. Let us see what that looks like…
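
As a reminder of how such a proxy factor can be built, here is a minimal base R sketch on toy metadata. The column names (patient, sample_type, sl_status) and the values are assumptions made up for illustration; they are not the real sample sheet columns, and this is not necessarily how the factor was created earlier in this document.

```r
## Toy metadata; every name and value here is hypothetical.
meta <- data.frame(
  patient = c("p1", "p1", "p1", "p2", "p2", "p2"),
  sample_type = c("nasal swab", "PBMCs", "WBCs", "nasal swab", "PBMCs", "WBCs"),
  sl_status = c("positive", "negative", "unknown", "negative", "negative", "positive"),
  stringsAsFactors = FALSE)

## Pull the 7SL call from each person's nasal swab ...
nasal <- meta[meta[["sample_type"]] == "nasal swab", ]
nasal_lookup <- setNames(nasal[["sl_status"]], nasal[["patient"]])

## ... and copy it onto every sample from that person.
meta[["nasal_7sl_status"]] <- factor(unname(nasal_lookup[meta[["patient"]]]),
                                     levels = c("negative", "positive"))

## Cross-tabulate sample type against the proxy status.
table(meta[["sample_type"]], meta[["nasal_7sl_status"]])
```

In this sketch a person without a nasal swab ends up as NA, and a person with several nasal swabs takes the status of the first one listed; both cases would need an explicit decision on the real data.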

nasal_7sl_se <- set_conditions(hu_se_salmon, fact = "nasal_7sl_status")
## The numbers of samples by condition are:
## 
## negative positive 
##       84       19
## The knitted run misspelled this object as 'hu_se_hisat_genes' and failed with "object
## 'hu_se_hisat_genes' not found"; the name used successfully below is 'hu_se_hisat_gene'.
nasal_7sl_hisat_se <- set_conditions(hu_se_hisat_gene, fact = "nasal_7sl_status")
nasal_7sl_se_nb <- normalize(nasal_7sl_se, transform = "log2", convert = "cpm", filter = TRUE,
                             batch = "svaseq")
## Removing 4599 low-count genes (17664 remaining).
## transform_counts: Found 43532 values less than 0.
## transform_counts: Found 43532 values equal to 0, adding 1 to the matrix.
plot_pca(nasal_7sl_se_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by mRNA, RZ.

nasal_7sl_de <- all_pairwise(nasal_7sl_se, filter = TRUE,
                             model_svs = "svaseq", model_fstring = "~ 0 + condition")
## negative positive 
##       84       19
## Removing 4599 low-count genes (17664 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 440621 entries to zero.
## This received a matrix of SVs.
## Error in DESeqDataSet(se, design = design, ignoreRank) : 
##   some values in assay are not integers
## conditions
## negative positive 
##       84       19
## conditions
## negative positive 
##       84       19
## conditions
## negative positive 
##       84       19
nasal_7sl_de
## A pairwise differential expression with results from: basic, deseq, ebseq, edger, limma, noiseq.
## This used a surrogate/batch estimate from: svaseq.
## The primary analysis performed 1 comparisons.
## The logFC agreement among the methods follows:
##                 pstv_vs_ng
## basic_vs_dream      0.5423
## basic_vs_ebseq      0.5845
## basic_vs_edger      0.4407
## basic_vs_limma      0.6008
## basic_vs_noiseq     0.5254
## dream_vs_ebseq      0.5379
## dream_vs_edger      0.6983
## dream_vs_limma      0.9815
## dream_vs_noiseq     0.6734
## ebseq_vs_edger      0.7371
## ebseq_vs_limma      0.5669
## ebseq_vs_noiseq     0.7698
## edger_vs_limma      0.6834
## edger_vs_noiseq     0.7224
## limma_vs_noiseq     0.6947
nasal_7sl_hisat_de <- all_pairwise(nasal_7sl_hisat_se, filter = TRUE,
                                   model_svs = "svaseq", model_fstring = "~ 0 + condition")
## Error in h(simpleError(msg, call)): error in evaluating the argument 'object' in selecting a method for function 'pData': object 'nasal_7sl_hisat_se' not found
nasal_7sl_hisat_de
## Error: object 'nasal_7sl_hisat_de' not found
nasal_7sl_table <- combine_de_tables(nasal_7sl_de, excel = "excel/nasal_7sl_proxy_table.xlsx")
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.
nasal_7sl_table
## A set of combined differential expression results.
##                  table deseq_sigup deseq_sigdown edger_sigup edger_sigdown limma_sigup limma_sigdown
## 1 positive_vs_negative           0             0          26            49          80            54
## Only  has information, cannot create an UpSet.
## Plot describing unique/shared genes in a differential expression table.
## NULL
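## Write the hisat table to its own (newly chosen) file name; the knitted run reused
## excel/nasal_7sl_proxy_table.xlsx and reported deleting the salmon table written above.
nasal_7sl_hisat_table <- combine_de_tables(nasal_7sl_hisat_de, excel = "excel/nasal_7sl_hisat_proxy_table.xlsx")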
## Error: object 'nasal_7sl_hisat_de' not found
nasal_7sl_hisat_table
## Error: object 'nasal_7sl_hisat_table' not found
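
The DESeq2 error above ("some values in assay are not integers") is consistent with fractional values reaching DESeqDataSet(), which accepts only integer counts; salmon reports fractional estimated counts, so that is one plausible source, though I have not confirmed where in all_pairwise() the values become non-integer. A minimal base R sketch of the check and a rounding workaround, using a made-up toy matrix rather than the real data:

```r
## Toy matrix standing in for salmon estimated counts (fractional values).
est_counts <- matrix(c(10.4, 0, 3.7, 250.2, 12, 0.49), nrow = 3,
                     dimnames = list(paste0("gene", 1:3), c("s1", "s2")))

## TRUE only when every entry is already a whole number.
is_whole <- all(est_counts == round(est_counts))

## Round and store as integers so integer-only tools accept the matrix.
int_counts <- round(est_counts)
storage.mode(int_counts) <- "integer"
```

Rounding is only one option; it would also explain why the deseq columns in the combined table are all zero while edger and limma report hits.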

Oh, Maria Adelaida was actually interested only in the PBMC samples, so here is the same nasal proxy contrast restricted to them.

pbmc_nasal_7sl_se <- subset_se(hu_se_salmon, subset = "condition=='PBMCs'") %>%
  set_conditions(fact = "nasal_7sl_status")
## The numbers of samples by condition are:
## 
## negative positive 
##       21        4
pbmc_nasal_7sl_hisat_se <- subset_se(hu_se_hisat_gene, subset = "condition=='PBMCs'") %>%
  set_conditions(fact = "nasal_7sl_status")
## The numbers of samples by condition are:
## 
## negative positive 
##       21        4
pbmc_nasal_hisat_norm <- normalize(pbmc_nasal_7sl_hisat_se, transform = "log2", convert = "cpm",
                                   norm = "quant", filter = TRUE)
## Removing 9124 low-count genes (12447 remaining).
## transform_counts: Found 39 values equal to 0, adding 1 to the matrix.
plot_pca(pbmc_nasal_hisat_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by mRNA.

pbmc_nasal_hisat_nb <- normalize(pbmc_nasal_7sl_hisat_se, transform = "log2", convert = "cpm",
                                 batch = "svaseq", filter = "simple")
## Removing 3931 low-count genes (17640 remaining).
## transform_counts: Found 20961 values less than 0.
## transform_counts: Found 20961 values equal to 0, adding 1 to the matrix.
plot_pca(pbmc_nasal_hisat_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by mRNA.

pbmc_nasal_7sl_de <- all_pairwise(pbmc_nasal_7sl_se, filter = "simple",
                                  model_svs = "svaseq", model_fstring = "~ 0 + condition")
## negative positive 
##       21        4
## Removing 3479 low-count genes (18784 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 140680 entries to zero.
## This received a matrix of SVs.
## Error in DESeqDataSet(se, design = design, ignoreRank) : 
##   some values in assay are not integers
## conditions
## negative positive 
##       21        4
## conditions
## negative positive 
##       21        4
## conditions
## negative positive 
##       21        4
pbmc_nasal_7sl_table <- combine_de_tables(pbmc_nasal_7sl_de, excel = "excel/pbmc_nasal_proxy.xlsx")
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.

16 TODO 202512

Repeat this nasal proxy test using each of the other sample types.

The factors of likely interest are: “WBCs”, “nasal swab”, and ideally both “skin biopsy healthy” and “skin biopsy scar”, but perhaps only a combined “skin biopsy”.
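
Before running each contrast, it may help to check which sample types have both positive and negative members. The following base R sketch does that on toy metadata and builds the subset strings in the same form passed to subset_se() in this document; the data frame contents are invented for illustration.

```r
## Toy metadata; the values are hypothetical.
meta <- data.frame(
  condition = c("WBCs", "WBCs", "nasal swab", "nasal swab",
                "skin biopsy healthy", "skin biopsy scar"),
  nasal_7sl_status = c("negative", "positive", "negative", "positive",
                       "negative", "negative"),
  stringsAsFactors = FALSE)

types <- c("WBCs", "nasal swab", "skin biopsy healthy", "skin biopsy scar")

## Subset strings, one per sample type; note that the match is case sensitive.
subset_strings <- sprintf("condition=='%s'", types)
names(subset_strings) <- types

## Count negatives and positives per type.
status <- factor(meta[["nasal_7sl_status"]], levels = c("negative", "positive"))
counts <- table(factor(meta[["condition"]], levels = types), status)

## A type lacking either group cannot be contrasted.
testable <- types[counts[, "positive"] > 0 & counts[, "negative"] > 0]
```

With the real metadata, each entry of subset_strings for a testable type could then be passed as the subset argument, as in the WBC block below.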

wbc_nasal_7sl_se <- subset_se(hu_se_salmon, subset = "condition=='WBCs'") %>%
  set_conditions(fact = "nasal_7sl_status")
## The numbers of samples by condition are:
## 
## negative positive 
##       19        4
wbc_nasal_7sl_hisat_se <- subset_se(hu_se_hisat_gene, subset = "condition=='WBCs'") %>%
  set_conditions(fact = "nasal_7sl_status")
## The numbers of samples by condition are:
## 
## negative positive 
##       19        4
wbc_nasal_hisat_norm <- normalize(wbc_nasal_7sl_hisat_se, transform = "log2", convert = "cpm",
                                  filter = "simple", norm = "quant")
## Removing 4161 low-count genes (17410 remaining).
## transform_counts: Found 52115 values equal to 0, adding 1 to the matrix.
plot_pca(wbc_nasal_hisat_norm)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by mRNA.

wbc_nasal_hisat_nb <- normalize(wbc_nasal_7sl_hisat_se, transform = "log2", convert = "cpm",
                                filter = "simple", batch = "svaseq")
## Removing 4161 low-count genes (17410 remaining).
## transform_counts: Found 19565 values less than 0.
## transform_counts: Found 19565 values equal to 0, adding 1 to the matrix.
plot_pca(wbc_nasal_hisat_nb)
## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by negative, positive
## Shapes are defined by mRNA.

wbc_nasal_7sl_de <- all_pairwise(wbc_nasal_7sl_se, filter = "simple",
                                 model_svs = "svaseq", model_fstring = "~ 0 + condition")
## negative positive 
##       19        4
## Removing 3727 low-count genes (18536 remaining).
## Basic step 0/3: Normalizing data.
## Basic step 0/3: Converting data.
## I think this is failing? SummarizedExperiment
## Basic step 0/3: Transforming data.
## Setting 130319 entries to zero.
## This received a matrix of SVs.
## Error in DESeqDataSet(se, design = design, ignoreRank) : 
##   some values in assay are not integers
## conditions
## negative positive 
##       19        4
## conditions
## negative positive 
##       19        4
## conditions
## negative positive 
##       19        4
wbc_nasal_7sl_table <- combine_de_tables(wbc_nasal_7sl_de, excel = "excel/wbc_nasal_proxy.xlsx")
## Looking for subscript invalid names, start of extract_keepers.
## Looking for subscript invalid names, end of extract_keepers.
