I wrote the function ‘create_matrices()’ to collect mutation counts. In theory, its results should be able to address most questions about the mutation counts observed in the data.
Categorize the data with at least 3 indexes per mutant
devtools::load_all("Rerrrt")
## Loading Rerrrt
## Loading required package: dplyr
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:hpgltools':
##
## combine
## The following object is masked from 'package:testthat':
##
## matches
## The following object is masked from 'package:Biobase':
##
## combine
## The following objects are masked from 'package:BiocGenerics':
##
## combine, intersect, setdiff, union
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## Loading required package: tidyr
##
## Attaching package: 'tidyr'
## The following object is masked from 'package:testthat':
##
## matches
sample_sheet <- "sample_sheets/new_samples.xlsx"
ident_column <- "identtable"
mut_column <- "mutationtable"
min_reads <- 3
min_indexes <- 3
min_sequencer <- 6
min_position <- 5
max_position <- 200
max_mutations_per_read <- NULL
prune_n <- TRUE
verbose <- TRUE
excel <- glue::glue("excel/{rundate}_new_triples-v{ver}.xlsx")
triples <- create_matrices(sample_sheet=sample_sheet,
    ident_column=ident_column, mut_column=mut_column,
    min_reads=min_reads, min_indexes=min_indexes,
    min_sequencer=min_sequencer,
    min_position=min_position, max_position=max_position,
    prune_n=prune_n, verbose=verbose, excel=excel)
## Dropped 6 rows from the sample metadata because they were blank.
## Starting sample: control_dna.
## Reading the file containing mutations: preprocessing/control_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/control_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 344009 reads.
## Mutation data: after min-position pruning, there are: 344007 reads: 2 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 344007 reads.
## Mutation data: after max-position pruning, there are: 344007 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 337124 reads: 6883 lost or 2.00%.
## Mutation data: all filters removed 6885 reads, or 2.00%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1392352 indexes in all the data.
## After reads/index pruning, there are: 258638 indexes: 1133714 lost or 81.42%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 337124 changed reads.
## All data: before reads/index pruning, there are: 2240589 identical reads.
## All data: after index pruning, there are: 117053 changed reads: 34.72%.
## All data: after index pruning, there are: 870547 identical reads: 38.85%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 870547 identical reads.
## Before classification, there are 117053 reads with mutations.
## After classification, there are 741108 reads/indexes which are only identical.
## After classification, there are 6056 reads/indexes which are strictly sequencer.
## After classification, there are 12445 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1336553 forward reads and 1515986 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: low_dna.
## Reading the file containing mutations: preprocessing/low_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/low_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 418833 reads.
## Mutation data: after min-position pruning, there are: 418830 reads: 3 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 418830 reads.
## Mutation data: after max-position pruning, there are: 418830 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 411131 reads: 7699 lost or 1.84%.
## Mutation data: all filters removed 7702 reads, or 1.84%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1950160 indexes in all the data.
## After reads/index pruning, there are: 177404 indexes: 1772756 lost or 90.90%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 411131 changed reads.
## All data: before reads/index pruning, there are: 2566212 identical reads.
## All data: after index pruning, there are: 85285 changed reads: 20.74%.
## All data: after index pruning, there are: 540399 identical reads: 21.06%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 540399 identical reads.
## Before classification, there are 85285 reads with mutations.
## After classification, there are 468326 reads/indexes which are only identical.
## After classification, there are 1201 reads/indexes which are strictly sequencer.
## After classification, there are 20730 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 771013 forward reads and 901147 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: high_dna.
## Reading the file containing mutations: preprocessing/high_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/high_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 548445 reads.
## Mutation data: after min-position pruning, there are: 548444 reads: 1 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 548444 reads.
## Mutation data: after max-position pruning, there are: 548444 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 533484 reads: 14960 lost or 2.73%.
## Mutation data: all filters removed 14961 reads, or 2.73%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1829434 indexes in all the data.
## After reads/index pruning, there are: 212532 indexes: 1616902 lost or 88.38%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 533484 changed reads.
## All data: before reads/index pruning, there are: 2421460 identical reads.
## All data: after index pruning, there are: 137380 changed reads: 25.75%.
## All data: after index pruning, there are: 629940 identical reads: 26.01%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 629940 identical reads.
## Before classification, there are 137380 reads with mutations.
## After classification, there are 542896 reads/indexes which are only identical.
## After classification, there are 1940 reads/indexes which are strictly sequencer.
## After classification, there are 60340 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 985552 forward reads and 1114299 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: control_rna.
## Reading the file containing mutations: preprocessing/control_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/control_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 267014 reads.
## Mutation data: after min-position pruning, there are: 267014 reads: 0 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 267014 reads.
## Mutation data: after max-position pruning, there are: 267014 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 261209 reads: 5805 lost or 2.17%.
## Mutation data: all filters removed 5805 reads, or 2.17%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1155581 indexes in all the data.
## After reads/index pruning, there are: 210352 indexes: 945229 lost or 81.80%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 261209 changed reads.
## All data: before reads/index pruning, there are: 1854668 identical reads.
## All data: after index pruning, there are: 88407 changed reads: 33.85%.
## All data: after index pruning, there are: 711770 identical reads: 38.38%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 711770 identical reads.
## Before classification, there are 88407 reads with mutations.
## After classification, there are 609535 reads/indexes which are only identical.
## After classification, there are 4778 reads/indexes which are strictly sequencer.
## After classification, there are 9929 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1101670 forward reads and 1250588 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: low_rna.
## Reading the file containing mutations: preprocessing/low_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/low_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 387777 reads.
## Mutation data: after min-position pruning, there are: 387776 reads: 1 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 387776 reads.
## Mutation data: after max-position pruning, there are: 387776 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 376092 reads: 11684 lost or 3.01%.
## Mutation data: all filters removed 11685 reads, or 3.01%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1306310 indexes in all the data.
## After reads/index pruning, there are: 255514 indexes: 1050796 lost or 80.44%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 376092 changed reads.
## All data: before reads/index pruning, there are: 2083008 identical reads.
## All data: after index pruning, there are: 136149 changed reads: 36.20%.
## All data: after index pruning, there are: 833439 identical reads: 40.01%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 833439 identical reads.
## Before classification, there are 136149 reads with mutations.
## After classification, there are 709121 reads/indexes which are only identical.
## After classification, there are 5313 reads/indexes which are strictly sequencer.
## After classification, there are 36973 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1343128 forward reads and 1445434 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: high_rna.
## Reading the file containing mutations: preprocessing/high_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/high_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 558061 reads.
## Mutation data: after min-position pruning, there are: 558059 reads: 2 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 558059 reads.
## Mutation data: after max-position pruning, there are: 558059 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 544407 reads: 13652 lost or 2.45%.
## Mutation data: all filters removed 13654 reads, or 2.45%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1224839 indexes in all the data.
## After reads/index pruning, there are: 254401 indexes: 970438 lost or 79.23%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 544407 changed reads.
## All data: before reads/index pruning, there are: 1855921 identical reads.
## All data: after index pruning, there are: 216201 changed reads: 39.71%.
## All data: after index pruning, there are: 771788 identical reads: 41.59%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 771788 identical reads.
## Before classification, there are 216201 reads with mutations.
## After classification, there are 644468 reads/indexes which are only identical.
## After classification, there are 5761 reads/indexes which are strictly sequencer.
## After classification, there are 108340 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1312147 forward reads and 1454177 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Deleting the file excel/20200908_new_triples-v20200314.xlsx before writing the tables.
## Writing a legend.
## Plotting Index density for mutant reads before filtering.
## Plotting Index density for identical reads before filtering.
## Plotting Index density for all reads before filtering.
## Plotting Index density for mutant reads after filtering.
## Plotting Index density for identical reads after filtering.
## Plotting Index density for all reads after filtering.
## Writing raw data.
## Writing cpm data.
## Writing data normalized by reads/indexes.
## Writing data normalized by reads/indexes and length.
## Writing data normalized by cpm(reads/indexes) and length.
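The reads/index pruning reported in the log above can be illustrated with a minimal sketch in base R. This is not the Rerrrt implementation: the data frame, the column names `read_id` and `index`, and the toy values are all assumptions made for illustration. The only thing taken from the run is the idea that indexes with fewer than `min_reads` reads are removed.

```r
# Toy data: one row per read, tagged with the index it carries.
# The column names `read_id` and `index` are hypothetical, not Rerrrt's.
reads <- data.frame(
  read_id = paste0("r", 1:9),
  index = c("AAAC", "AAAC", "AAAC", "AAAC",
            "CCGT", "CCGT",
            "GGTA", "GGTA", "GGTA"),
  stringsAsFactors = FALSE
)
min_reads <- 3

# Count reads per index, then keep only indexes seen at least min_reads times.
reads_per_index <- table(reads$index)
kept_indexes <- names(reads_per_index)[reads_per_index >= min_reads]
pruned <- reads[reads$index %in% kept_indexes, ]

# Summaries in the spirit of the log lines: indexes lost and reads retained.
lost_indexes <- length(reads_per_index) - length(kept_indexes)
pct_indexes_lost <- 100 * lost_indexes / length(reads_per_index)
pct_reads_kept <- 100 * nrow(pruned) / nrow(reads)
message(sprintf("Indexes: %d kept, %d lost (%.2f%%); reads kept: %.2f%%.",
                length(kept_indexes), lost_indexes,
                pct_indexes_lost, pct_reads_kept))
```

Note that the two kinds of percentage in the log point in opposite directions. For control_dna, the index line reports the fraction lost (1133714 / 1392352 = 81.42%), while the "after index pruning" read lines report the fraction retained (117053 / 337124 = 34.72% of changed reads, 870547 / 2240589 = 38.85% of identical reads).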
max_mutations_per_read <- 10
excel <- glue::glue("excel/{rundate}_triples_tenmpr-v{ver}.xlsx")
## Note: max_mutations_per_read is set above but is not passed to this call.
triples_tenmpr <- create_matrices(sample_sheet=sample_sheet,
    ident_column=ident_column, mut_column=mut_column,
    min_reads=min_reads, min_indexes=min_indexes,
    min_sequencer=min_sequencer,
    min_position=min_position, max_position=max_position,
    prune_n=prune_n, verbose=verbose, excel=excel)
## Dropped 6 rows from the sample metadata because they were blank.
## Starting sample: control_dna.
## Reading the file containing mutations: preprocessing/control_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/control_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 344009 reads.
## Mutation data: after min-position pruning, there are: 344007 reads: 2 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 344007 reads.
## Mutation data: after max-position pruning, there are: 344007 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 337124 reads: 6883 lost or 2.00%.
## Mutation data: all filters removed 6885 reads, or 2.00%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1392352 indexes in all the data.
## After reads/index pruning, there are: 258638 indexes: 1133714 lost or 81.42%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 337124 changed reads.
## All data: before reads/index pruning, there are: 2240589 identical reads.
## All data: after index pruning, there are: 117053 changed reads: 34.72%.
## All data: after index pruning, there are: 870547 identical reads: 38.85%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 870547 identical reads.
## Before classification, there are 117053 reads with mutations.
## After classification, there are 741108 reads/indexes which are only identical.
## After classification, there are 6056 reads/indexes which are strictly sequencer.
## After classification, there are 12445 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1336553 forward reads and 1515986 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: low_dna.
## Reading the file containing mutations: preprocessing/low_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/low_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 418833 reads.
## Mutation data: after min-position pruning, there are: 418830 reads: 3 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 418830 reads.
## Mutation data: after max-position pruning, there are: 418830 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 411131 reads: 7699 lost or 1.84%.
## Mutation data: all filters removed 7702 reads, or 1.84%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1950160 indexes in all the data.
## After reads/index pruning, there are: 177404 indexes: 1772756 lost or 90.90%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 411131 changed reads.
## All data: before reads/index pruning, there are: 2566212 identical reads.
## All data: after index pruning, there are: 85285 changed reads: 20.74%.
## All data: after index pruning, there are: 540399 identical reads: 21.06%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 540399 identical reads.
## Before classification, there are 85285 reads with mutations.
## After classification, there are 468326 reads/indexes which are only identical.
## After classification, there are 1201 reads/indexes which are strictly sequencer.
## After classification, there are 20730 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 771013 forward reads and 901147 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: high_dna.
## Reading the file containing mutations: preprocessing/high_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/high_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 548445 reads.
## Mutation data: after min-position pruning, there are: 548444 reads: 1 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 548444 reads.
## Mutation data: after max-position pruning, there are: 548444 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 533484 reads: 14960 lost or 2.73%.
## Mutation data: all filters removed 14961 reads, or 2.73%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1829434 indexes in all the data.
## After reads/index pruning, there are: 212532 indexes: 1616902 lost or 88.38%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 533484 changed reads.
## All data: before reads/index pruning, there are: 2421460 identical reads.
## All data: after index pruning, there are: 137380 changed reads: 25.75%.
## All data: after index pruning, there are: 629940 identical reads: 26.01%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 629940 identical reads.
## Before classification, there are 137380 reads with mutations.
## After classification, there are 542896 reads/indexes which are only identical.
## After classification, there are 1940 reads/indexes which are strictly sequencer.
## After classification, there are 60340 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 985552 forward reads and 1114299 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: control_rna.
## Reading the file containing mutations: preprocessing/control_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/control_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 267014 reads.
## Mutation data: after min-position pruning, there are: 267014 reads: 0 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 267014 reads.
## Mutation data: after max-position pruning, there are: 267014 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 261209 reads: 5805 lost or 2.17%.
## Mutation data: all filters removed 5805 reads, or 2.17%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1155581 indexes in all the data.
## After reads/index pruning, there are: 210352 indexes: 945229 lost or 81.80%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 261209 changed reads.
## All data: before reads/index pruning, there are: 1854668 identical reads.
## All data: after index pruning, there are: 88407 changed reads: 33.85%.
## All data: after index pruning, there are: 711770 identical reads: 38.38%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 711770 identical reads.
## Before classification, there are 88407 reads with mutations.
## After classification, there are 609535 reads/indexes which are only identical.
## After classification, there are 4778 reads/indexes which are strictly sequencer.
## After classification, there are 9929 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1101670 forward reads and 1250588 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: low_rna.
## Reading the file containing mutations: preprocessing/low_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/low_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 387777 reads.
## Mutation data: after min-position pruning, there are: 387776 reads: 1 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 387776 reads.
## Mutation data: after max-position pruning, there are: 387776 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 376092 reads: 11684 lost or 3.01%.
## Mutation data: all filters removed 11685 reads, or 3.01%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1306310 indexes in all the data.
## After reads/index pruning, there are: 255514 indexes: 1050796 lost or 80.44%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 376092 changed reads.
## All data: before reads/index pruning, there are: 2083008 identical reads.
## All data: after index pruning, there are: 136149 changed reads: 36.20%.
## All data: after index pruning, there are: 833439 identical reads: 40.01%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 833439 identical reads.
## Before classification, there are 136149 reads with mutations.
## After classification, there are 709121 reads/indexes which are only identical.
## After classification, there are 5313 reads/indexes which are strictly sequencer.
## After classification, there are 36973 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1343128 forward reads and 1445434 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: high_rna.
## Reading the file containing mutations: preprocessing/high_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/high_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 558061 reads.
## Mutation data: after min-position pruning, there are: 558059 reads: 2 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 558059 reads.
## Mutation data: after max-position pruning, there are: 558059 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 544407 reads: 13652 lost or 2.45%.
## Mutation data: all filters removed 13654 reads, or 2.45%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1224839 indexes in all the data.
## After reads/index pruning, there are: 254401 indexes: 970438 lost or 79.23%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 544407 changed reads.
## All data: before reads/index pruning, there are: 1855921 identical reads.
## All data: after index pruning, there are: 216201 changed reads: 39.71%.
## All data: after index pruning, there are: 771788 identical reads: 41.59%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 771788 identical reads.
## Before classification, there are 216201 reads with mutations.
## After classification, there are 644468 reads/indexes which are only identical.
## After classification, there are 5761 reads/indexes which are strictly sequencer.
## After classification, there are 108340 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1312147 forward reads and 1454177 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Writing a legend.
## Plotting Index density for mutant reads before filtering.
## Plotting Index density for identical reads before filtering.
## Plotting Index density for all reads before filtering.
## Plotting Index density for mutant reads after filtering.
## Plotting Index density for identical reads after filtering.
## Plotting Index density for all reads after filtering.
## Writing raw data.
## Writing cpm data.
## Writing data normalized by reads/indexes.
## Writing data normalized by reads/indexes and length.
## Writing data normalized by cpm(reads/indexes) and length.
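The variable `max_mutations_per_read` is set before this run. As a rough illustration of what a cap on mutations per read could mean, here is a minimal base R sketch. It is not the Rerrrt implementation: the table layout, the column names `read_id` and `position`, and the choice to drop reads with strictly more differences than the cap are all assumptions.

```r
# Toy mutation table: one row per observed difference, keyed by read.
# The column names are hypothetical, not Rerrrt's.
mutations <- data.frame(
  read_id = c("r1", "r1", "r1", "r2", "r3", "r3"),
  position = c(12, 40, 77, 15, 60, 61),
  stringsAsFactors = FALSE
)
max_mutations_per_read <- 2

# Count differences per read and keep reads at or under the cap
# (assumption: a read with more than the cap is discarded entirely).
per_read <- table(mutations$read_id)
ok_reads <- names(per_read)[per_read <= max_mutations_per_read]
filtered <- mutations[mutations$read_id %in% ok_reads, ]
```

Here read `r1` carries three differences and is removed, leaving the single difference from `r2` and the two from `r3`.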
max_mutations_per_read <- 5
excel <- glue::glue("excel/{rundate}_triples_fivempr-v{ver}.xlsx")
## Note: max_mutations_per_read is set above but is not passed to this call.
triples_fivempr <- create_matrices(sample_sheet=sample_sheet,
    ident_column=ident_column, mut_column=mut_column,
    min_reads=min_reads, min_indexes=min_indexes,
    min_sequencer=min_sequencer,
    min_position=min_position, max_position=max_position,
    prune_n=prune_n, verbose=verbose, excel=excel)
## Dropped 6 rows from the sample metadata because they were blank.
## Starting sample: control_dna.
## Reading the file containing mutations: preprocessing/control_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/control_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 344009 reads.
## Mutation data: after min-position pruning, there are: 344007 reads: 2 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 344007 reads.
## Mutation data: after max-position pruning, there are: 344007 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 337124 reads: 6883 lost or 2.00%.
## Mutation data: all filters removed 6885 reads, or 2.00%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1392352 indexes in all the data.
## After reads/index pruning, there are: 258638 indexes: 1133714 lost or 81.42%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 337124 changed reads.
## All data: before reads/index pruning, there are: 2240589 identical reads.
## All data: after index pruning, there are: 117053 changed reads: 34.72%.
## All data: after index pruning, there are: 870547 identical reads: 38.85%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 870547 identical reads.
## Before classification, there are 117053 reads with mutations.
## After classification, there are 741108 reads/indexes which are only identical.
## After classification, there are 6056 reads/indexes which are strictly sequencer.
## After classification, there are 12445 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1336553 forward reads and 1515986 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: low_dna.
## Reading the file containing mutations: preprocessing/low_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/low_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 418833 reads.
## Mutation data: after min-position pruning, there are: 418830 reads: 3 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 418830 reads.
## Mutation data: after max-position pruning, there are: 418830 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 411131 reads: 7699 lost or 1.84%.
## Mutation data: all filters removed 7702 reads, or 1.84%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1950160 indexes in all the data.
## After reads/index pruning, there are: 177404 indexes: 1772756 lost or 90.90%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 411131 changed reads.
## All data: before reads/index pruning, there are: 2566212 identical reads.
## All data: after index pruning, there are: 85285 changed reads: 20.74%.
## All data: after index pruning, there are: 540399 identical reads: 21.06%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 540399 identical reads.
## Before classification, there are 85285 reads with mutations.
## After classification, there are 468326 reads/indexes which are only identical.
## After classification, there are 1201 reads/indexes which are strictly sequencer.
## After classification, there are 20730 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 771013 forward reads and 901147 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: high_dna.
## Reading the file containing mutations: preprocessing/high_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/high_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 548445 reads.
## Mutation data: after min-position pruning, there are: 548444 reads: 1 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 548444 reads.
## Mutation data: after max-position pruning, there are: 548444 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 533484 reads: 14960 lost or 2.73%.
## Mutation data: all filters removed 14961 reads, or 2.73%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1829434 indexes in all the data.
## After reads/index pruning, there are: 212532 indexes: 1616902 lost or 88.38%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 533484 changed reads.
## All data: before reads/index pruning, there are: 2421460 identical reads.
## All data: after index pruning, there are: 137380 changed reads: 25.75%.
## All data: after index pruning, there are: 629940 identical reads: 26.01%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 629940 identical reads.
## Before classification, there are 137380 reads with mutations.
## After classification, there are 542896 reads/indexes which are only identical.
## After classification, there are 1940 reads/indexes which are strictly sequencer.
## After classification, there are 60340 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 985552 forward reads and 1114299 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: control_rna.
## Reading the file containing mutations: preprocessing/control_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/control_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 267014 reads.
## Mutation data: after min-position pruning, there are: 267014 reads: 0 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 267014 reads.
## Mutation data: after max-position pruning, there are: 267014 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 261209 reads: 5805 lost or 2.17%.
## Mutation data: all filters removed 5805 reads, or 2.17%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1155581 indexes in all the data.
## After reads/index pruning, there are: 210352 indexes: 945229 lost or 81.80%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 261209 changed reads.
## All data: before reads/index pruning, there are: 1854668 identical reads.
## All data: after index pruning, there are: 88407 changed reads: 33.85%.
## All data: after index pruning, there are: 711770 identical reads: 38.38%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 711770 identical reads.
## Before classification, there are 88407 reads with mutations.
## After classification, there are 609535 reads/indexes which are only identical.
## After classification, there are 4778 reads/indexes which are strictly sequencer.
## After classification, there are 9929 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1101670 forward reads and 1250588 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: low_rna.
## Reading the file containing mutations: preprocessing/low_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/low_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 387777 reads.
## Mutation data: after min-position pruning, there are: 387776 reads: 1 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 387776 reads.
## Mutation data: after max-position pruning, there are: 387776 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 376092 reads: 11684 lost or 3.01%.
## Mutation data: all filters removed 11685 reads, or 3.01%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1306310 indexes in all the data.
## After reads/index pruning, there are: 255514 indexes: 1050796 lost or 80.44%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 376092 changed reads.
## All data: before reads/index pruning, there are: 2083008 identical reads.
## All data: after index pruning, there are: 136149 changed reads: 36.20%.
## All data: after index pruning, there are: 833439 identical reads: 40.01%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 833439 identical reads.
## Before classification, there are 136149 reads with mutations.
## After classification, there are 709121 reads/indexes which are only identical.
## After classification, there are 5313 reads/indexes which are strictly sequencer.
## After classification, there are 36973 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1343128 forward reads and 1445434 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: high_rna.
## Reading the file containing mutations: preprocessing/high_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/high_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 558061 reads.
## Mutation data: after min-position pruning, there are: 558059 reads: 2 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 558059 reads.
## Mutation data: after max-position pruning, there are: 558059 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 544407 reads: 13652 lost or 2.45%.
## Mutation data: all filters removed 13654 reads, or 2.45%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1224839 indexes in all the data.
## After reads/index pruning, there are: 254401 indexes: 970438 lost or 79.23%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 544407 changed reads.
## All data: before reads/index pruning, there are: 1855921 identical reads.
## All data: after index pruning, there are: 216201 changed reads: 39.71%.
## All data: after index pruning, there are: 771788 identical reads: 41.59%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 771788 identical reads.
## Before classification, there are 216201 reads with mutations.
## After classification, there are 644468 reads/indexes which are only identical.
## After classification, there are 5761 reads/indexes which are strictly sequencer.
## After classification, there are 108340 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1312147 forward reads and 1454177 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Writing a legend.
## Plotting Index density for mutant reads before filtering.
## Plotting Index density for identical reads before filtering.
## Plotting Index density for all reads before filtering.
## Plotting Index density for mutant reads after filtering.
## Plotting Index density for identical reads after filtering.
## Plotting Index density for all reads after filtering.
## Writing raw data.
## Writing cpm data.
## Writing data normalized by reads/indexes.
## Writing data normalized by reads/indexes and length.
## Writing data normalized by cpm(reads/indexes) and length.
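The index-pruning lines in the log above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of Rerrrt: it copies the before/after index counts from the log by hand into a data frame (the name `index_counts` and its column names are made up for illustration) and recomputes the number and percentage of indexes lost when requiring at least 3 reads per index.

```r
# Index counts before and after reads/index pruning (min_reads = 3),
# copied by hand from the log output; this is not a Rerrrt data structure.
index_counts <- data.frame(
  sample = c("control_dna", "low_dna", "high_dna",
             "control_rna", "low_rna", "high_rna"),
  before = c(1392352, 1950160, 1829434, 1155581, 1306310, 1224839),
  after = c(258638, 177404, 212532, 210352, 255514, 254401),
  stringsAsFactors = FALSE)

# Indexes lost and the percentage lost, as reported in the log.
index_counts$lost <- index_counts$before - index_counts$after
index_counts$pct_lost <- 100 * index_counts$lost / index_counts$before

# Show the samples ordered from most to least indexes lost.
print(index_counts[order(index_counts$pct_lost, decreasing = TRUE), ])
```

Under this tabulation low_dna loses the largest share of its indexes (90.90%) and high_rna the smallest (79.23%), matching the percentages printed in the log.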
Categorize the data with at least 5 indexes per mutant
min_indexes <- 5
max_mutations_per_read <- NULL
excel <- glue::glue("excel/{rundate}_quints-v{ver}.xlsx")
quints <- create_matrices(sample_sheet=sample_sheet,
                          ident_column=ident_column, mut_column=mut_column,
                          min_reads=min_reads, min_indexes=min_indexes,
                          min_sequencer=min_sequencer,
                          min_position=min_position, max_position=max_position,
                          prune_n=prune_n, verbose=verbose, excel=excel)
## Dropped 6 rows from the sample metadata because they were blank.
## Starting sample: control_dna.
## Reading the file containing mutations: preprocessing/control_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/control_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 344009 reads.
## Mutation data: after min-position pruning, there are: 344007 reads: 2 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 344007 reads.
## Mutation data: after max-position pruning, there are: 344007 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 337124 reads: 6883 lost or 2.00%.
## Mutation data: all filters removed 6885 reads, or 2.00%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1392352 indexes in all the data.
## After reads/index pruning, there are: 258638 indexes: 1133714 lost or 81.42%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 337124 changed reads.
## All data: before reads/index pruning, there are: 2240589 identical reads.
## All data: after index pruning, there are: 117053 changed reads: 34.72%.
## All data: after index pruning, there are: 870547 identical reads: 38.85%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 870547 identical reads.
## Before classification, there are 117053 reads with mutations.
## After classification, there are 741108 reads/indexes which are only identical.
## After classification, there are 6056 reads/indexes which are strictly sequencer.
## After classification, there are 12445 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1336553 forward reads and 1515986 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: low_dna.
## Reading the file containing mutations: preprocessing/low_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/low_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 418833 reads.
## Mutation data: after min-position pruning, there are: 418830 reads: 3 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 418830 reads.
## Mutation data: after max-position pruning, there are: 418830 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 411131 reads: 7699 lost or 1.84%.
## Mutation data: all filters removed 7702 reads, or 1.84%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1950160 indexes in all the data.
## After reads/index pruning, there are: 177404 indexes: 1772756 lost or 90.90%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 411131 changed reads.
## All data: before reads/index pruning, there are: 2566212 identical reads.
## All data: after index pruning, there are: 85285 changed reads: 20.74%.
## All data: after index pruning, there are: 540399 identical reads: 21.06%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 540399 identical reads.
## Before classification, there are 85285 reads with mutations.
## After classification, there are 468326 reads/indexes which are only identical.
## After classification, there are 1201 reads/indexes which are strictly sequencer.
## After classification, there are 20730 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 771013 forward reads and 901147 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: high_dna.
## Reading the file containing mutations: preprocessing/high_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/high_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 548445 reads.
## Mutation data: after min-position pruning, there are: 548444 reads: 1 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 548444 reads.
## Mutation data: after max-position pruning, there are: 548444 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 533484 reads: 14960 lost or 2.73%.
## Mutation data: all filters removed 14961 reads, or 2.73%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1829434 indexes in all the data.
## After reads/index pruning, there are: 212532 indexes: 1616902 lost or 88.38%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 533484 changed reads.
## All data: before reads/index pruning, there are: 2421460 identical reads.
## All data: after index pruning, there are: 137380 changed reads: 25.75%.
## All data: after index pruning, there are: 629940 identical reads: 26.01%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 629940 identical reads.
## Before classification, there are 137380 reads with mutations.
## After classification, there are 542896 reads/indexes which are only identical.
## After classification, there are 1940 reads/indexes which are strictly sequencer.
## After classification, there are 60340 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 985552 forward reads and 1114299 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: control_rna.
## Reading the file containing mutations: preprocessing/control_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/control_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 267014 reads.
## Mutation data: after min-position pruning, there are: 267014 reads: 0 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 267014 reads.
## Mutation data: after max-position pruning, there are: 267014 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 261209 reads: 5805 lost or 2.17%.
## Mutation data: all filters removed 5805 reads, or 2.17%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1155581 indexes in all the data.
## After reads/index pruning, there are: 210352 indexes: 945229 lost or 81.80%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 261209 changed reads.
## All data: before reads/index pruning, there are: 1854668 identical reads.
## All data: after index pruning, there are: 88407 changed reads: 33.85%.
## All data: after index pruning, there are: 711770 identical reads: 38.38%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 711770 identical reads.
## Before classification, there are 88407 reads with mutations.
## After classification, there are 609535 reads/indexes which are only identical.
## After classification, there are 4778 reads/indexes which are strictly sequencer.
## After classification, there are 9929 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1101670 forward reads and 1250588 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: low_rna.
## Reading the file containing mutations: preprocessing/low_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/low_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 387777 reads.
## Mutation data: after min-position pruning, there are: 387776 reads: 1 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 387776 reads.
## Mutation data: after max-position pruning, there are: 387776 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 376092 reads: 11684 lost or 3.01%.
## Mutation data: all filters removed 11685 reads, or 3.01%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1306310 indexes in all the data.
## After reads/index pruning, there are: 255514 indexes: 1050796 lost or 80.44%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 376092 changed reads.
## All data: before reads/index pruning, there are: 2083008 identical reads.
## All data: after index pruning, there are: 136149 changed reads: 36.20%.
## All data: after index pruning, there are: 833439 identical reads: 40.01%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 833439 identical reads.
## Before classification, there are 136149 reads with mutations.
## After classification, there are 709121 reads/indexes which are only identical.
## After classification, there are 5313 reads/indexes which are strictly sequencer.
## After classification, there are 36973 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1343128 forward reads and 1445434 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: high_rna.
## Reading the file containing mutations: preprocessing/high_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/high_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 558061 reads.
## Mutation data: after min-position pruning, there are: 558059 reads: 2 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 558059 reads.
## Mutation data: after max-position pruning, there are: 558059 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 544407 reads: 13652 lost or 2.45%.
## Mutation data: all filters removed 13654 reads, or 2.45%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1224839 indexes in all the data.
## After reads/index pruning, there are: 254401 indexes: 970438 lost or 79.23%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 544407 changed reads.
## All data: before reads/index pruning, there are: 1855921 identical reads.
## All data: after index pruning, there are: 216201 changed reads: 39.71%.
## All data: after index pruning, there are: 771788 identical reads: 41.59%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 771788 identical reads.
## Before classification, there are 216201 reads with mutations.
## After classification, there are 644468 reads/indexes which are only identical.
## After classification, there are 5761 reads/indexes which are strictly sequencer.
## After classification, there are 108340 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1312147 forward reads and 1454177 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Writing a legend.
## Plotting Index density for mutant reads before filtering.
## Plotting Index density for identical reads before filtering.
## Plotting Index density for all reads before filtering.
## Plotting Index density for mutant reads after filtering.
## Plotting Index density for identical reads after filtering.
## Plotting Index density for all reads after filtering.
## Writing raw data.
## Writing cpm data.
## Writing data normalized by reads/indexes.
## Writing data normalized by reads/indexes and length.
## Writing data normalized by cpm(reads/indexes) and length.
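The only difference between the 3-index and 5-index runs is the `min_indexes` threshold reported in the "Subsetting based on mutations with at least N indexes" lines. The sketch below illustrates the idea of such a threshold on toy data; it is an assumption about the general approach, not the code inside `create_matrices()`. The data frame `toy`, its column names, the mutation strings, and the helper `keep_mutations()` are all hypothetical.

```r
# Toy data: each row is one index (molecular barcode) that supports one mutation.
# The mutation string format here is invented for illustration.
toy <- data.frame(
  index = c("AAAC", "AAAG", "AAAT", "AACA", "AACC", "AACG", "AACT", "AAGA"),
  mutation = c(rep("35_A_G", 5), rep("72_C_T", 3)),
  stringsAsFactors = FALSE)

# Count the distinct indexes that support each mutation.
indexes_per_mutation <- tapply(toy$index, toy$mutation,
                               function(x) length(unique(x)))

# Hypothetical helper: keep only the mutations seen in at least min_indexes indexes.
keep_mutations <- function(min_indexes) {
  names(indexes_per_mutation)[indexes_per_mutation >= min_indexes]
}

print(keep_mutations(3))  # both toy mutations have at least 3 indexes
print(keep_mutations(5))  # only the mutation supported by 5 indexes remains
```

Raising the threshold from 3 to 5 can only shrink the set of retained mutations, which is why the per-sample filtering and classification counts in the two logs are identical and only the final subsetting step differs.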
max_mutations_per_read <- 10
excel <- glue::glue("excel/{rundate}_quints_tenmpr-v{ver}.xlsx")
# Note: max_mutations_per_read is assigned above but is not among the arguments
# passed below, so it has no effect on this call as written.
quints_tenmpr <- create_matrices(sample_sheet=sample_sheet,
                                 ident_column=ident_column, mut_column=mut_column,
                                 min_reads=min_reads, min_indexes=min_indexes,
                                 min_sequencer=min_sequencer,
                                 min_position=min_position, max_position=max_position,
                                 prune_n=prune_n, verbose=verbose, excel=excel)
## Dropped 6 rows from the sample metadata because they were blank.
## Starting sample: control_dna.
## Reading the file containing mutations: preprocessing/control_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/control_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 344009 reads.
## Mutation data: after min-position pruning, there are: 344007 reads: 2 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 344007 reads.
## Mutation data: after max-position pruning, there are: 344007 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 337124 reads: 6883 lost or 2.00%.
## Mutation data: all filters removed 6885 reads, or 2.00%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1392352 indexes in all the data.
## After reads/index pruning, there are: 258638 indexes: 1133714 lost or 81.42%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 337124 changed reads.
## All data: before reads/index pruning, there are: 2240589 identical reads.
## All data: after index pruning, there are: 117053 changed reads: 34.72%.
## All data: after index pruning, there are: 870547 identical reads: 38.85%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 870547 identical reads.
## Before classification, there are 117053 reads with mutations.
## After classification, there are 741108 reads/indexes which are only identical.
## After classification, there are 6056 reads/indexes which are strictly sequencer.
## After classification, there are 12445 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1336553 forward reads and 1515986 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: low_dna.
## Reading the file containing mutations: preprocessing/low_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/low_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 418833 reads.
## Mutation data: after min-position pruning, there are: 418830 reads: 3 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 418830 reads.
## Mutation data: after max-position pruning, there are: 418830 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 411131 reads: 7699 lost or 1.84%.
## Mutation data: all filters removed 7702 reads, or 1.84%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1950160 indexes in all the data.
## After reads/index pruning, there are: 177404 indexes: 1772756 lost or 90.90%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 411131 changed reads.
## All data: before reads/index pruning, there are: 2566212 identical reads.
## All data: after index pruning, there are: 85285 changed reads: 20.74%.
## All data: after index pruning, there are: 540399 identical reads: 21.06%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 540399 identical reads.
## Before classification, there are 85285 reads with mutations.
## After classification, there are 468326 reads/indexes which are only identical.
## After classification, there are 1201 reads/indexes which are strictly sequencer.
## After classification, there are 20730 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 771013 forward reads and 901147 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: high_dna.
## Reading the file containing mutations: preprocessing/high_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/high_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 548445 reads.
## Mutation data: after min-position pruning, there are: 548444 reads: 1 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 548444 reads.
## Mutation data: after max-position pruning, there are: 548444 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 533484 reads: 14960 lost or 2.73%.
## Mutation data: all filters removed 14961 reads, or 2.73%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1829434 indexes in all the data.
## After reads/index pruning, there are: 212532 indexes: 1616902 lost or 88.38%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 533484 changed reads.
## All data: before reads/index pruning, there are: 2421460 identical reads.
## All data: after index pruning, there are: 137380 changed reads: 25.75%.
## All data: after index pruning, there are: 629940 identical reads: 26.01%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 629940 identical reads.
## Before classification, there are 137380 reads with mutations.
## After classification, there are 542896 reads/indexes which are only identical.
## After classification, there are 1940 reads/indexes which are strictly sequencer.
## After classification, there are 60340 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 985552 forward reads and 1114299 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: control_rna.
## Reading the file containing mutations: preprocessing/control_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/control_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 267014 reads.
## Mutation data: after min-position pruning, there are: 267014 reads: 0 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 267014 reads.
## Mutation data: after max-position pruning, there are: 267014 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 261209 reads: 5805 lost or 2.17%.
## Mutation data: all filters removed 5805 reads, or 2.17%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1155581 indexes in all the data.
## After reads/index pruning, there are: 210352 indexes: 945229 lost or 81.80%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 261209 changed reads.
## All data: before reads/index pruning, there are: 1854668 identical reads.
## All data: after index pruning, there are: 88407 changed reads: 33.85%.
## All data: after index pruning, there are: 711770 identical reads: 38.38%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 711770 identical reads.
## Before classification, there are 88407 reads with mutations.
## After classification, there are 609535 reads/indexes which are only identical.
## After classification, there are 4778 reads/indexes which are strictly sequencer.
## After classification, there are 9929 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1101670 forward reads and 1250588 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: low_rna.
## Reading the file containing mutations: preprocessing/low_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/low_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 387777 reads.
## Mutation data: after min-position pruning, there are: 387776 reads: 1 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 387776 reads.
## Mutation data: after max-position pruning, there are: 387776 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 376092 reads: 11684 lost or 3.01%.
## Mutation data: all filters removed 11685 reads, or 3.01%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1306310 indexes in all the data.
## After reads/index pruning, there are: 255514 indexes: 1050796 lost or 80.44%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 376092 changed reads.
## All data: before reads/index pruning, there are: 2083008 identical reads.
## All data: after index pruning, there are: 136149 changed reads: 36.20%.
## All data: after index pruning, there are: 833439 identical reads: 40.01%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 833439 identical reads.
## Before classification, there are 136149 reads with mutations.
## After classification, there are 709121 reads/indexes which are only identical.
## After classification, there are 5313 reads/indexes which are strictly sequencer.
## After classification, there are 36973 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1343128 forward reads and 1445434 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: high_rna.
## Reading the file containing mutations: preprocessing/high_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/high_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 558061 reads.
## Mutation data: after min-position pruning, there are: 558059 reads: 2 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 558059 reads.
## Mutation data: after max-position pruning, there are: 558059 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 544407 reads: 13652 lost or 2.45%.
## Mutation data: all filters removed 13654 reads, or 2.45%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1224839 indexes in all the data.
## After reads/index pruning, there are: 254401 indexes: 970438 lost or 79.23%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 544407 changed reads.
## All data: before reads/index pruning, there are: 1855921 identical reads.
## All data: after index pruning, there are: 216201 changed reads: 39.71%.
## All data: after index pruning, there are: 771788 identical reads: 41.59%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 771788 identical reads.
## Before classification, there are 216201 reads with mutations.
## After classification, there are 644468 reads/indexes which are only identical.
## After classification, there are 5761 reads/indexes which are strictly sequencer.
## After classification, there are 108340 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1312147 forward reads and 1454177 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Writing a legend.
## Plotting Index density for mutant reads before filtering.
## Plotting Index density for identical reads before filtering.
## Plotting Index density for all reads before filtering.
## Plotting Index density for mutant reads after filtering.
## Plotting Index density for identical reads after filtering.
## Plotting Index density for all reads after filtering.
## Writing raw data.
## Writing cpm data.
## Writing data normalized by reads/indexes.
## Writing data normalized by reads/indexes and length.
## Writing data normalized by cpm(reads/indexes) and length.
# Note: max_mutations_per_read is set here but is not among the arguments passed
# to create_matrices() below, so it does not appear to affect this run.
max_mutations_per_read <- 5
excel <- glue::glue("excel/{rundate}_quints_fivempr-v{ver}.xlsx")
quints_fivempr <- create_matrices(sample_sheet=sample_sheet,
ident_column=ident_column, mut_column=mut_column,
min_reads=min_reads, min_indexes=min_indexes,
min_sequencer=min_sequencer,
min_position=min_position, max_position=max_position,
prune_n=prune_n, verbose=verbose, excel=excel)
## Dropped 6 rows from the sample metadata because they were blank.
## Starting sample: control_dna.
## Reading the file containing mutations: preprocessing/control_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/control_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 344009 reads.
## Mutation data: after min-position pruning, there are: 344007 reads: 2 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 344007 reads.
## Mutation data: after max-position pruning, there are: 344007 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 337124 reads: 6883 lost or 2.00%.
## Mutation data: all filters removed 6885 reads, or 2.00%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1392352 indexes in all the data.
## After reads/index pruning, there are: 258638 indexes: 1133714 lost or 81.42%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 337124 changed reads.
## All data: before reads/index pruning, there are: 2240589 identical reads.
## All data: after index pruning, there are: 117053 changed reads: 34.72%.
## All data: after index pruning, there are: 870547 identical reads: 38.85%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 870547 identical reads.
## Before classification, there are 117053 reads with mutations.
## After classification, there are 741108 reads/indexes which are only identical.
## After classification, there are 6056 reads/indexes which are strictly sequencer.
## After classification, there are 12445 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1336553 forward reads and 1515986 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: low_dna.
## Reading the file containing mutations: preprocessing/low_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/low_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 418833 reads.
## Mutation data: after min-position pruning, there are: 418830 reads: 3 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 418830 reads.
## Mutation data: after max-position pruning, there are: 418830 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 411131 reads: 7699 lost or 1.84%.
## Mutation data: all filters removed 7702 reads, or 1.84%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1950160 indexes in all the data.
## After reads/index pruning, there are: 177404 indexes: 1772756 lost or 90.90%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 411131 changed reads.
## All data: before reads/index pruning, there are: 2566212 identical reads.
## All data: after index pruning, there are: 85285 changed reads: 20.74%.
## All data: after index pruning, there are: 540399 identical reads: 21.06%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 540399 identical reads.
## Before classification, there are 85285 reads with mutations.
## After classification, there are 468326 reads/indexes which are only identical.
## After classification, there are 1201 reads/indexes which are strictly sequencer.
## After classification, there are 20730 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 771013 forward reads and 901147 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: high_dna.
## Reading the file containing mutations: preprocessing/high_dna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/high_dna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 548445 reads.
## Mutation data: after min-position pruning, there are: 548444 reads: 1 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 548444 reads.
## Mutation data: after max-position pruning, there are: 548444 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 533484 reads: 14960 lost or 2.73%.
## Mutation data: all filters removed 14961 reads, or 2.73%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1829434 indexes in all the data.
## After reads/index pruning, there are: 212532 indexes: 1616902 lost or 88.38%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 533484 changed reads.
## All data: before reads/index pruning, there are: 2421460 identical reads.
## All data: after index pruning, there are: 137380 changed reads: 25.75%.
## All data: after index pruning, there are: 629940 identical reads: 26.01%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 629940 identical reads.
## Before classification, there are 137380 reads with mutations.
## After classification, there are 542896 reads/indexes which are only identical.
## After classification, there are 1940 reads/indexes which are strictly sequencer.
## After classification, there are 60340 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 985552 forward reads and 1114299 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: control_rna.
## Reading the file containing mutations: preprocessing/control_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/control_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 267014 reads.
## Mutation data: after min-position pruning, there are: 267014 reads: 0 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 267014 reads.
## Mutation data: after max-position pruning, there are: 267014 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 261209 reads: 5805 lost or 2.17%.
## Mutation data: all filters removed 5805 reads, or 2.17%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1155581 indexes in all the data.
## After reads/index pruning, there are: 210352 indexes: 945229 lost or 81.80%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 261209 changed reads.
## All data: before reads/index pruning, there are: 1854668 identical reads.
## All data: after index pruning, there are: 88407 changed reads: 33.85%.
## All data: after index pruning, there are: 711770 identical reads: 38.38%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 711770 identical reads.
## Before classification, there are 88407 reads with mutations.
## After classification, there are 609535 reads/indexes which are only identical.
## After classification, there are 4778 reads/indexes which are strictly sequencer.
## After classification, there are 9929 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1101670 forward reads and 1250588 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: low_rna.
## Reading the file containing mutations: preprocessing/low_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/low_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 387777 reads.
## Mutation data: after min-position pruning, there are: 387776 reads: 1 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 387776 reads.
## Mutation data: after max-position pruning, there are: 387776 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 376092 reads: 11684 lost or 3.01%.
## Mutation data: all filters removed 11685 reads, or 3.01%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1306310 indexes in all the data.
## After reads/index pruning, there are: 255514 indexes: 1050796 lost or 80.44%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 376092 changed reads.
## All data: before reads/index pruning, there are: 2083008 identical reads.
## All data: after index pruning, there are: 136149 changed reads: 36.20%.
## All data: after index pruning, there are: 833439 identical reads: 40.01%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 833439 identical reads.
## Before classification, there are 136149 reads with mutations.
## After classification, there are 709121 reads/indexes which are only identical.
## After classification, there are 5313 reads/indexes which are strictly sequencer.
## After classification, there are 36973 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1343128 forward reads and 1445434 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: high_rna.
## Reading the file containing mutations: preprocessing/high_rna/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/high_rna/step2_identical_reads.txt.xz
## Counting indexes before filtering.
## Mutation data: removing any differences before position: 5.
## Mutation data: before pruning, there are: 558061 reads.
## Mutation data: after min-position pruning, there are: 558059 reads: 2 lost or 0.00%.
## Mutation data: removing any differences after position: 200.
## Mutation data: before pruning, there are: 558059 reads.
## Mutation data: after max-position pruning, there are: 558059 reads: 0 lost or 0.00%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 544407 reads: 13652 lost or 2.45%.
## Mutation data: all filters removed 13654 reads, or 2.45%.
## Gathering information about the number of reads per index.
## Before reads/index pruning, there are: 1224839 indexes in all the data.
## After reads/index pruning, there are: 254401 indexes: 970438 lost or 79.23%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 544407 changed reads.
## All data: before reads/index pruning, there are: 1855921 identical reads.
## All data: after index pruning, there are: 216201 changed reads: 39.71%.
## All data: after index pruning, there are: 771788 identical reads: 41.59%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 771788 identical reads.
## Before classification, there are 216201 reads with mutations.
## After classification, there are 644468 reads/indexes which are only identical.
## After classification, there are 5761 reads/indexes which are strictly sequencer.
## After classification, there are 108340 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 1312147 forward reads and 1454177 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Writing a legend.
## Plotting Index density for mutant reads before filtering.
## Plotting Index density for identical reads before filtering.
## Plotting Index density for all reads before filtering.
## Plotting Index density for mutant reads after filtering.
## Plotting Index density for identical reads after filtering.
## Plotting Index density for all reads after filtering.
## Writing raw data.
## Writing cpm data.
## Writing data normalized by reads/indexes.
## Writing data normalized by reads/indexes and length.
## Writing data normalized by cpm(reads/indexes) and length.
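As a quick sanity check on the log above: the percentages on the "after index pruning" lines appear to be the fraction of reads retained (after / before), while the "reads/index pruning ... lost" lines report the fraction of indexes lost. A minimal base R sketch, with the changed-read counts copied by hand from the log; the data frame name `changed` and its column names are only for illustration and are not part of Rerrrt:

```r
# Changed-read counts before and after reads/index pruning,
# transcribed by hand from the create_matrices() log output.
changed <- data.frame(
  sample = c("control_dna", "low_dna", "high_dna",
             "control_rna", "low_rna", "high_rna"),
  before = c(337124, 411131, 533484, 261209, 376092, 544407),
  after  = c(117053,  85285, 137380,  88407, 136149, 216201)
)

# Percent of changed reads that survive the pruning step.
changed$pct_retained <- round(100 * changed$after / changed$before, 2)

# Percent of changed reads removed by the pruning step.
changed$pct_lost <- round(100 - changed$pct_retained, 2)

print(changed)
```

The `pct_retained` column should line up with the values printed in the log (for example 34.72 for control_dna and 20.74 for low_dna), which suggests the DNA samples with added error lose noticeably more changed reads at this step than the RNA samples.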