I wrote the function `create_matrices()` to collect mutation counts. In principle, its results should be able to address most questions about the counts of mutations observed in the data.
Categorize the data, requiring at least 3 indexes per mutant.
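The call that produced the log below was not echoed. Based on the parameters visible in the output (positions 24-176, N pruning, 3 reads/index, subsetting at 3 indexes/mutation), it was presumably of the following form; the object name and the `min_sequencer` value are assumptions, carried over from the echoed calls later in this section:

```r
## Hypothetical reconstruction of the unechoed call;
## min_sequencer=10 is assumed from the later invocations.
triples <- create_matrices(sample_sheet="sample_sheets/all_samples.xlsx",
                           ident_column="identtable", mut_column="mutationtable",
                           min_reads=3, min_indexes=3, min_sequencer=10,
                           min_position=24, max_position=176,
                           prune_n=TRUE, verbose=TRUE)
```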
## Loading errRt
## Loading required package: dplyr
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:hpgltools':
##
## combine
## The following object is masked from 'package:Biobase':
##
## combine
## The following objects are masked from 'package:BiocGenerics':
##
## combine, intersect, setdiff, union
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## Loading required package: tidyr
## Starting sample: 1.
## Reading the file containing mutations: preprocessing/s1/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s1/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 1156535 reads.
## Mutation data: after min-position pruning, there are: 1037310 reads: 119225 lost or 10.31%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1037310 reads.
## Mutation data: after max-position pruning, there are: 968161 reads: 69149 lost or 6.67%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 953181 reads: 14980 lost or 1.55%.
## Mutation data: all filters removed 203354 reads, or 17.58%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 1742165 indexes in all the data.
## After reads/index pruning, there are: 837608 indexes: 904557 lost or 51.92%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 953181 changed reads.
## All data: before reads/index pruning, there are: 4681501 identical reads.
## All data: after index pruning, there are: 491995 changed reads: 51.62%.
## All data: after index pruning, there are: 3663004 identical reads: 78.24%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 3663004 identical reads.
## Before classification, there are 491995 reads with mutations.
## After classification, there are 2738199 reads/indexes which are only identical.
## After classification, there are 11023 reads/indexes which are strictly sequencer.
## After classification, there are 26963 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 7018785 forward reads and 7148314 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: 2.
## Reading the file containing mutations: preprocessing/s2/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s2/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 3421203 reads.
## Mutation data: after min-position pruning, there are: 1758479 reads: 1662724 lost or 48.60%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1758479 reads.
## Mutation data: after max-position pruning, there are: 1667302 reads: 91177 lost or 5.18%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 1642969 reads: 24333 lost or 1.46%.
## Mutation data: all filters removed 1778234 reads, or 51.98%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 1261478 indexes in all the data.
## After reads/index pruning, there are: 693725 indexes: 567753 lost or 45.01%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 1642969 changed reads.
## All data: before reads/index pruning, there are: 5230976 identical reads.
## All data: after index pruning, there are: 814407 changed reads: 49.57%.
## All data: after index pruning, there are: 4834092 identical reads: 92.41%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 4834092 identical reads.
## Before classification, there are 814407 reads with mutations.
## After classification, there are 2802107 reads/indexes which are only identical.
## After classification, there are 111708 reads/indexes which are strictly sequencer.
## After classification, there are 126921 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 11803361 forward reads and 12275547 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: 3.
## Reading the file containing mutations: preprocessing/s3/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s3/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 4309681 reads.
## Mutation data: after min-position pruning, there are: 1564155 reads: 2745526 lost or 63.71%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1564155 reads.
## Mutation data: after max-position pruning, there are: 1482559 reads: 81596 lost or 5.22%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 1452047 reads: 30512 lost or 2.06%.
## Mutation data: all filters removed 2857634 reads, or 66.31%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 884042 indexes in all the data.
## After reads/index pruning, there are: 463445 indexes: 420597 lost or 47.58%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 1452047 changed reads.
## All data: before reads/index pruning, there are: 3583390 identical reads.
## All data: after index pruning, there are: 730397 changed reads: 50.30%.
## All data: after index pruning, there are: 3332136 identical reads: 92.99%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 3332136 identical reads.
## Before classification, there are 730397 reads with mutations.
## After classification, there are 1851177 reads/indexes which are only identical.
## After classification, there are 90341 reads/indexes which are strictly sequencer.
## After classification, there are 244494 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 9104237 forward reads and 9257103 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Making a matrix of miss_reads_by_position.
## Making a matrix of miss_indexes_by_position.
## Making a matrix of miss_sequencer_by_position.
## Making a matrix of miss_reads_by_string.
## Making a matrix of miss_indexes_by_string.
## Making a matrix of miss_sequencer_by_string.
## Making a matrix of miss_reads_by_ref_nt.
## Making a matrix of miss_indexes_by_ref_nt.
## Making a matrix of miss_sequencer_by_ref_nt.
## Making a matrix of miss_reads_by_hit_nt.
## Making a matrix of miss_indexes_by_hit_nt.
## Making a matrix of miss_sequencer_by_hit_nt.
## Making a matrix of miss_reads_by_type.
## Making a matrix of miss_indexes_by_type.
## Making a matrix of miss_sequencer_by_type.
## Making a matrix of miss_reads_by_trans.
## Making a matrix of miss_indexes_by_trans.
## Making a matrix of miss_sequencer_by_trans.
## Making a matrix of miss_reads_by_strength.
## Making a matrix of miss_indexes_by_strength.
## Making a matrix of miss_sequencer_by_strength.
## Making a matrix of insert_reads_by_position.
## Making a matrix of insert_indexes_by_position.
## Making a matrix of insert_sequencer_by_position.
## Making a matrix of insert_reads_by_nt.
## Making a matrix of insert_indexes_by_nt.
## Making a matrix of insert_sequencer_by_nt.
## Making a matrix of delete_reads_by_position.
## Making a matrix of delete_indexes_by_position.
## Making a matrix of delete_sequencer_by_position.
## Making a matrix of delete_reads_by_nt.
## Making a matrix of delete_indexes_by_nt.
## Making a matrix of delete_sequencer_by_nt.
## Skipping table: miss_reads_by_ref_nt
## Skipping table: miss_indexes_by_ref_nt
## Skipping table: miss_sequencer_by_ref_nt
## Skipping table: miss_reads_by_hit_nt
## Skipping table: miss_indexes_by_hit_nt
## Skipping table: miss_sequencer_by_hit_nt
## Skipping table: delete_reads_by_position
## Skipping table: delete_indexes_by_position
## Skipping table: delete_sequencer_by_position
## Skipping table: delete_reads_by_nt
## Skipping table: delete_indexes_by_nt
## Skipping table: delete_sequencer_by_nt
## Length Class Mode
## samples 3 -none- list
## reads_per_sample 3 -none- numeric
## indexes_per_sample 3 -none- numeric
## matrices 33 -none- list
## matrices_by_counts 33 -none- list
## normalized 33 -none- list
## normalized_by_counts 33 -none- list
triples_tenmpr <- create_matrices(sample_sheet="sample_sheets/all_samples.xlsx",
ident_column="identtable", mut_column="mutationtable",
min_reads=3, min_indexes=3, min_sequencer=10,
min_position=24, max_position=176,
max_mutations_per_read=10,
prune_n=TRUE, verbose=TRUE)
## Starting sample: 1.
## Reading the file containing mutations: preprocessing/s1/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s1/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 1156535 reads.
## Mutation data: after min-position pruning, there are: 1037310 reads: 119225 lost or 10.31%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1037310 reads.
## Mutation data: after max-position pruning, there are: 968161 reads: 69149 lost or 6.67%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 953181 reads: 14980 lost or 1.55%.
## Mutation data: removing reads with greater than 10 mutations.
## Mutation data: after max_mutation pruning, there are: 799403 reads: 153778 lost or 16.13%.
## Mutation data: all filters removed 357132 reads, or 30.88%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 1733789 indexes in all the data.
## After reads/index pruning, there are: 836838 indexes: 896951 lost or 51.73%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 799403 changed reads.
## All data: before reads/index pruning, there are: 4681501 identical reads.
## All data: after index pruning, there are: 441562 changed reads: 55.24%.
## All data: after index pruning, there are: 3661605 identical reads: 78.21%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 3661605 identical reads.
## Before classification, there are 441562 reads with mutations.
## After classification, there are 2748736 reads/indexes which are only identical.
## After classification, there are 9916 reads/indexes which are strictly sequencer.
## After classification, there are 26403 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 7049093 forward reads and 7175885 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: 2.
## Reading the file containing mutations: preprocessing/s2/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s2/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 3421203 reads.
## Mutation data: after min-position pruning, there are: 1758479 reads: 1662724 lost or 48.60%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1758479 reads.
## Mutation data: after max-position pruning, there are: 1667302 reads: 91177 lost or 5.18%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 1642969 reads: 24333 lost or 1.46%.
## Mutation data: removing reads with greater than 10 mutations.
## Mutation data: after max_mutation pruning, there are: 1232741 reads: 410228 lost or 24.97%.
## Mutation data: all filters removed 2188462 reads, or 63.97%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 1231605 indexes in all the data.
## After reads/index pruning, there are: 693381 indexes: 538224 lost or 43.70%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 1232741 changed reads.
## All data: before reads/index pruning, there are: 5230976 identical reads.
## All data: after index pruning, there are: 720963 changed reads: 58.48%.
## All data: after index pruning, there are: 4833605 identical reads: 92.40%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 4833605 identical reads.
## Before classification, there are 720963 reads with mutations.
## After classification, there are 2832509 reads/indexes which are only identical.
## After classification, there are 98387 reads/indexes which are strictly sequencer.
## After classification, there are 123178 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 11930745 forward reads and 12406826 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: 3.
## Reading the file containing mutations: preprocessing/s3/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s3/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 4309681 reads.
## Mutation data: after min-position pruning, there are: 1564155 reads: 2745526 lost or 63.71%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1564155 reads.
## Mutation data: after max-position pruning, there are: 1482559 reads: 81596 lost or 5.22%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 1452047 reads: 30512 lost or 2.06%.
## Mutation data: removing reads with greater than 10 mutations.
## Mutation data: after max_mutation pruning, there are: 1110089 reads: 341958 lost or 23.55%.
## Mutation data: all filters removed 3199592 reads, or 74.24%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 857851 indexes in all the data.
## After reads/index pruning, there are: 463161 indexes: 394690 lost or 46.01%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 1110089 changed reads.
## All data: before reads/index pruning, there are: 3583390 identical reads.
## All data: after index pruning, there are: 662025 changed reads: 59.64%.
## All data: after index pruning, there are: 3331914 identical reads: 92.98%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 3331914 identical reads.
## Before classification, there are 662025 reads with mutations.
## After classification, there are 1873630 reads/indexes which are only identical.
## After classification, there are 79142 reads/indexes which are strictly sequencer.
## After classification, there are 237111 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 9205882 forward reads and 9355117 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Making a matrix of miss_reads_by_position.
## Making a matrix of miss_indexes_by_position.
## Making a matrix of miss_sequencer_by_position.
## Making a matrix of miss_reads_by_string.
## Making a matrix of miss_indexes_by_string.
## Making a matrix of miss_sequencer_by_string.
## Making a matrix of miss_reads_by_ref_nt.
## Making a matrix of miss_indexes_by_ref_nt.
## Making a matrix of miss_sequencer_by_ref_nt.
## Making a matrix of miss_reads_by_hit_nt.
## Making a matrix of miss_indexes_by_hit_nt.
## Making a matrix of miss_sequencer_by_hit_nt.
## Making a matrix of miss_reads_by_type.
## Making a matrix of miss_indexes_by_type.
## Making a matrix of miss_sequencer_by_type.
## Making a matrix of miss_reads_by_trans.
## Making a matrix of miss_indexes_by_trans.
## Making a matrix of miss_sequencer_by_trans.
## Making a matrix of miss_reads_by_strength.
## Making a matrix of miss_indexes_by_strength.
## Making a matrix of miss_sequencer_by_strength.
## Making a matrix of insert_reads_by_position.
## Making a matrix of insert_indexes_by_position.
## Making a matrix of insert_sequencer_by_position.
## Making a matrix of insert_reads_by_nt.
## Making a matrix of insert_indexes_by_nt.
## Making a matrix of insert_sequencer_by_nt.
## Making a matrix of delete_reads_by_position.
## Making a matrix of delete_indexes_by_position.
## Making a matrix of delete_sequencer_by_position.
## Making a matrix of delete_reads_by_nt.
## Making a matrix of delete_indexes_by_nt.
## Making a matrix of delete_sequencer_by_nt.
## Skipping table: miss_reads_by_ref_nt
## Skipping table: miss_indexes_by_ref_nt
## Skipping table: miss_sequencer_by_ref_nt
## Skipping table: miss_reads_by_hit_nt
## Skipping table: miss_indexes_by_hit_nt
## Skipping table: miss_sequencer_by_hit_nt
## Skipping table: delete_reads_by_position
## Skipping table: delete_indexes_by_position
## Skipping table: delete_sequencer_by_position
## Skipping table: delete_reads_by_nt
## Skipping table: delete_indexes_by_nt
## Skipping table: delete_sequencer_by_nt
## Length Class Mode
## samples 3 -none- list
## reads_per_sample 3 -none- numeric
## indexes_per_sample 3 -none- numeric
## matrices 33 -none- list
## matrices_by_counts 33 -none- list
## normalized 33 -none- list
## normalized_by_counts 33 -none- list
triples_fivempr <- create_matrices(sample_sheet="sample_sheets/all_samples.xlsx",
ident_column="identtable", mut_column="mutationtable",
min_reads=3, min_indexes=3, min_sequencer=10,
min_position=24, max_position=176,
max_mutations_per_read=5,
prune_n=TRUE, verbose=TRUE)
## Starting sample: 1.
## Reading the file containing mutations: preprocessing/s1/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s1/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 1156535 reads.
## Mutation data: after min-position pruning, there are: 1037310 reads: 119225 lost or 10.31%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1037310 reads.
## Mutation data: after max-position pruning, there are: 968161 reads: 69149 lost or 6.67%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 953181 reads: 14980 lost or 1.55%.
## Mutation data: removing reads with greater than 5 mutations.
## Mutation data: after max_mutation pruning, there are: 608429 reads: 344752 lost or 36.17%.
## Mutation data: all filters removed 548106 reads, or 47.39%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 1713933 indexes in all the data.
## After reads/index pruning, there are: 834821 indexes: 879112 lost or 51.29%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 608429 changed reads.
## All data: before reads/index pruning, there are: 4681501 identical reads.
## All data: after index pruning, there are: 379603 changed reads: 62.39%.
## All data: after index pruning, there are: 3657910 identical reads: 78.14%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 3657910 identical reads.
## Before classification, there are 379603 reads with mutations.
## After classification, there are 2777271 reads/indexes which are only identical.
## After classification, there are 8544 reads/indexes which are strictly sequencer.
## After classification, there are 25485 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 7127863 forward reads and 7254038 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: 2.
## Reading the file containing mutations: preprocessing/s2/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s2/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 3421203 reads.
## Mutation data: after min-position pruning, there are: 1758479 reads: 1662724 lost or 48.60%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1758479 reads.
## Mutation data: after max-position pruning, there are: 1667302 reads: 91177 lost or 5.18%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 1642969 reads: 24333 lost or 1.46%.
## Mutation data: removing reads with greater than 5 mutations.
## Mutation data: after max_mutation pruning, there are: 807185 reads: 835784 lost or 50.87%.
## Mutation data: all filters removed 2614018 reads, or 76.41%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 1179116 indexes in all the data.
## After reads/index pruning, there are: 692307 indexes: 486809 lost or 41.29%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 807185 changed reads.
## All data: before reads/index pruning, there are: 5230976 identical reads.
## All data: after index pruning, there are: 585835 changed reads: 72.58%.
## All data: after index pruning, there are: 4832196 identical reads: 92.38%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 4832196 identical reads.
## Before classification, there are 585835 reads with mutations.
## After classification, there are 2934376 reads/indexes which are only identical.
## After classification, there are 79902 reads/indexes which are strictly sequencer.
## After classification, there are 116271 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 12365004 forward reads and 12844113 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Starting sample: 3.
## Reading the file containing mutations: preprocessing/s3/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s3/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 4309681 reads.
## Mutation data: after min-position pruning, there are: 1564155 reads: 2745526 lost or 63.71%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1564155 reads.
## Mutation data: after max-position pruning, there are: 1482559 reads: 81596 lost or 5.22%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 1452047 reads: 30512 lost or 2.06%.
## Mutation data: removing reads with greater than 5 mutations.
## Mutation data: after max_mutation pruning, there are: 746662 reads: 705385 lost or 48.58%.
## Mutation data: all filters removed 3563019 reads, or 82.67%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 808995 indexes in all the data.
## After reads/index pruning, there are: 461997 indexes: 346998 lost or 42.89%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 746662 changed reads.
## All data: before reads/index pruning, there are: 3583390 identical reads.
## All data: after index pruning, there are: 555226 changed reads: 74.36%.
## All data: after index pruning, there are: 3330970 identical reads: 92.96%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 3330970 identical reads.
## Before classification, there are 555226 reads with mutations.
## After classification, there are 1957637 reads/indexes which are only identical.
## After classification, there are 63014 reads/indexes which are strictly sequencer.
## After classification, there are 223250 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 9578873 forward reads and 9724531 reverse_reads.
## Subsetting based on mutations with at least 3 indexes.
## Classified mutation strings according to various queries.
## Making a matrix of miss_reads_by_position.
## Making a matrix of miss_indexes_by_position.
## Making a matrix of miss_sequencer_by_position.
## Making a matrix of miss_reads_by_string.
## Making a matrix of miss_indexes_by_string.
## Making a matrix of miss_sequencer_by_string.
## Making a matrix of miss_reads_by_ref_nt.
## Making a matrix of miss_indexes_by_ref_nt.
## Making a matrix of miss_sequencer_by_ref_nt.
## Making a matrix of miss_reads_by_hit_nt.
## Making a matrix of miss_indexes_by_hit_nt.
## Making a matrix of miss_sequencer_by_hit_nt.
## Making a matrix of miss_reads_by_type.
## Making a matrix of miss_indexes_by_type.
## Making a matrix of miss_sequencer_by_type.
## Making a matrix of miss_reads_by_trans.
## Making a matrix of miss_indexes_by_trans.
## Making a matrix of miss_sequencer_by_trans.
## Making a matrix of miss_reads_by_strength.
## Making a matrix of miss_indexes_by_strength.
## Making a matrix of miss_sequencer_by_strength.
## Making a matrix of insert_reads_by_position.
## Making a matrix of insert_indexes_by_position.
## Making a matrix of insert_sequencer_by_position.
## Making a matrix of insert_reads_by_nt.
## Making a matrix of insert_indexes_by_nt.
## Making a matrix of insert_sequencer_by_nt.
## Making a matrix of delete_reads_by_position.
## Making a matrix of delete_indexes_by_position.
## Making a matrix of delete_sequencer_by_position.
## Making a matrix of delete_reads_by_nt.
## Making a matrix of delete_indexes_by_nt.
## Making a matrix of delete_sequencer_by_nt.
## Skipping table: miss_reads_by_ref_nt
## Skipping table: miss_indexes_by_ref_nt
## Skipping table: miss_sequencer_by_ref_nt
## Skipping table: miss_reads_by_hit_nt
## Skipping table: miss_indexes_by_hit_nt
## Skipping table: miss_sequencer_by_hit_nt
## Skipping table: delete_reads_by_position
## Skipping table: delete_indexes_by_position
## Skipping table: delete_sequencer_by_position
## Skipping table: delete_reads_by_nt
## Skipping table: delete_indexes_by_nt
## Skipping table: delete_sequencer_by_nt
## Length Class Mode
## samples 3 -none- list
## reads_per_sample 3 -none- numeric
## indexes_per_sample 3 -none- numeric
## matrices 33 -none- list
## matrices_by_counts 33 -none- list
## normalized 33 -none- list
## normalized_by_counts 33 -none- list
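The summary above lists the slots of the returned object; individual count matrices can then be pulled out by name. A minimal sketch of that access pattern, using the `triples_fivempr` object from the call above and a matrix name taken from the log (the exact list structure is assumed from the summary output):

```r
## Each slot holds one matrix per query, with one column per sample.
miss_by_pos <- triples_fivempr[["matrices"]][["miss_indexes_by_position"]]
head(miss_by_pos)
## Normalized versions live alongside the raw counts.
norm_by_pos <- triples_fivempr[["normalized"]][["miss_indexes_by_position"]]
```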
Categorize the data, requiring at least 5 indexes per mutant.
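As with the first run, the invoking call was not echoed. Presumably it repeats the first invocation with `min_indexes` raised to 5 (the log below confirms the subsetting at "at least 5 indexes"; the object name and other parameter values are assumptions):

```r
## Hypothetical reconstruction of the unechoed call.
triples_fiveidx <- create_matrices(sample_sheet="sample_sheets/all_samples.xlsx",
                                   ident_column="identtable", mut_column="mutationtable",
                                   min_reads=3, min_indexes=5, min_sequencer=10,
                                   min_position=24, max_position=176,
                                   prune_n=TRUE, verbose=TRUE)
```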
## Starting sample: 1.
## Reading the file containing mutations: preprocessing/s1/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s1/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 1156535 reads.
## Mutation data: after min-position pruning, there are: 1037310 reads: 119225 lost or 10.31%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1037310 reads.
## Mutation data: after max-position pruning, there are: 968161 reads: 69149 lost or 6.67%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 953181 reads: 14980 lost or 1.55%.
## Mutation data: all filters removed 203354 reads, or 17.58%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 1742165 indexes in all the data.
## After reads/index pruning, there are: 837608 indexes: 904557 lost or 51.92%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 953181 changed reads.
## All data: before reads/index pruning, there are: 4681501 identical reads.
## All data: after index pruning, there are: 491995 changed reads: 51.62%.
## All data: after index pruning, there are: 3663004 identical reads: 78.24%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 3663004 identical reads.
## Before classification, there are 491995 reads with mutations.
## After classification, there are 2738199 reads/indexes which are only identical.
## After classification, there are 11023 reads/indexes which are strictly sequencer.
## After classification, there are 26963 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 7018785 forward reads and 7148314 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: 2.
## Reading the file containing mutations: preprocessing/s2/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s2/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 3421203 reads.
## Mutation data: after min-position pruning, there are: 1758479 reads: 1662724 lost or 48.60%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1758479 reads.
## Mutation data: after max-position pruning, there are: 1667302 reads: 91177 lost or 5.18%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 1642969 reads: 24333 lost or 1.46%.
## Mutation data: all filters removed 1778234 reads, or 51.98%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 1261478 indexes in all the data.
## After reads/index pruning, there are: 693725 indexes: 567753 lost or 45.01%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 1642969 changed reads.
## All data: before reads/index pruning, there are: 5230976 identical reads.
## All data: after index pruning, there are: 814407 changed reads: 49.57%.
## All data: after index pruning, there are: 4834092 identical reads: 92.41%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 4834092 identical reads.
## Before classification, there are 814407 reads with mutations.
## After classification, there are 2802107 reads/indexes which are only identical.
## After classification, there are 111708 reads/indexes which are strictly sequencer.
## After classification, there are 126921 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 11803361 forward reads and 12275547 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: 3.
## Reading the file containing mutations: preprocessing/s3/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s3/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 4309681 reads.
## Mutation data: after min-position pruning, there are: 1564155 reads: 2745526 lost or 63.71%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1564155 reads.
## Mutation data: after max-position pruning, there are: 1482559 reads: 81596 lost or 5.22%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 1452047 reads: 30512 lost or 2.06%.
## Mutation data: all filters removed 2857634 reads, or 66.31%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 884042 indexes in all the data.
## After reads/index pruning, there are: 463445 indexes: 420597 lost or 47.58%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 1452047 changed reads.
## All data: before reads/index pruning, there are: 3583390 identical reads.
## All data: after index pruning, there are: 730397 changed reads: 50.30%.
## All data: after index pruning, there are: 3332136 identical reads: 92.99%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 3332136 identical reads.
## Before classification, there are 730397 reads with mutations.
## After classification, there are 1851177 reads/indexes which are only identical.
## After classification, there are 90341 reads/indexes which are strictly sequencer.
## After classification, there are 244494 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 9104237 forward reads and 9257103 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Making a matrix of miss_reads_by_position.
## Making a matrix of miss_indexes_by_position.
## Making a matrix of miss_sequencer_by_position.
## Making a matrix of miss_reads_by_string.
## Making a matrix of miss_indexes_by_string.
## Making a matrix of miss_sequencer_by_string.
## Making a matrix of miss_reads_by_ref_nt.
## Making a matrix of miss_indexes_by_ref_nt.
## Making a matrix of miss_sequencer_by_ref_nt.
## Making a matrix of miss_reads_by_hit_nt.
## Making a matrix of miss_indexes_by_hit_nt.
## Making a matrix of miss_sequencer_by_hit_nt.
## Making a matrix of miss_reads_by_type.
## Making a matrix of miss_indexes_by_type.
## Making a matrix of miss_sequencer_by_type.
## Making a matrix of miss_reads_by_trans.
## Making a matrix of miss_indexes_by_trans.
## Making a matrix of miss_sequencer_by_trans.
## Making a matrix of miss_reads_by_strength.
## Making a matrix of miss_indexes_by_strength.
## Making a matrix of miss_sequencer_by_strength.
## Making a matrix of insert_reads_by_position.
## Making a matrix of insert_indexes_by_position.
## Making a matrix of insert_sequencer_by_position.
## Making a matrix of insert_reads_by_nt.
## Making a matrix of insert_indexes_by_nt.
## Making a matrix of insert_sequencer_by_nt.
## Making a matrix of delete_reads_by_position.
## Making a matrix of delete_indexes_by_position.
## Making a matrix of delete_sequencer_by_position.
## Making a matrix of delete_reads_by_nt.
## Making a matrix of delete_indexes_by_nt.
## Making a matrix of delete_sequencer_by_nt.
## Skipping table: miss_reads_by_ref_nt
## Skipping table: miss_indexes_by_ref_nt
## Skipping table: miss_sequencer_by_ref_nt
## Skipping table: miss_reads_by_hit_nt
## Skipping table: miss_indexes_by_hit_nt
## Skipping table: miss_sequencer_by_hit_nt
## Skipping table: delete_reads_by_position
## Skipping table: delete_indexes_by_position
## Skipping table: delete_sequencer_by_position
## Skipping table: delete_reads_by_nt
## Skipping table: delete_indexes_by_nt
## Skipping table: delete_sequencer_by_nt
##                      Length Class  Mode
## samples               3     -none- list
## reads_per_sample      3     -none- numeric
## indexes_per_sample    3     -none- numeric
## matrices             33     -none- list
## matrices_by_counts   33     -none- list
## normalized           33     -none- list
## normalized_by_counts 33     -none- list
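The summary above shows the shape of the returned list. As a sketch of how one might poke at that structure, the following builds a mock stand-in with the element names from the summary and the per-sample changed-read counts reported in the logs (the variable name `result` is hypothetical; only the list element names and counts come from the output above):

```r
## Mock stand-in mirroring the structure printed by summary() above.
result <- list(
  samples = vector("list", 3),
  ## Changed reads per sample after index pruning, from the logs.
  reads_per_sample = c(s1 = 441562, s2 = 720963, s3 = 662025),
  matrices = list(miss_indexes_by_position = matrix(0L, nrow = 4, ncol = 3))
)
## Individual count matrices are pulled out by name:
per_position <- result[["matrices"]][["miss_indexes_by_position"]]
## And the per-sample read totals are a plain named numeric vector:
result[["reads_per_sample"]][["s2"]]
```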
quints_tenmpr <- create_matrices(
  sample_sheet = "sample_sheets/all_samples.xlsx",
  ident_column = "identtable", mut_column = "mutationtable",
  min_reads = 3, min_indexes = 5, min_sequencer = 10,
  min_position = 24, max_position = 176,
  max_mutations_per_read = 10,
  prune_n = TRUE, verbose = TRUE)
## Starting sample: 1.
## Reading the file containing mutations: preprocessing/s1/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s1/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 1156535 reads.
## Mutation data: after min-position pruning, there are: 1037310 reads: 119225 lost or 10.31%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1037310 reads.
## Mutation data: after max-position pruning, there are: 968161 reads: 69149 lost or 6.67%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 953181 reads: 14980 lost or 1.55%.
## Mutation data: removing reads with greater than 10 mutations.
## Mutation data: after max_mutation pruning, there are: 799403 reads: 153778 lost or 16.13%.
## Mutation data: all filters removed 357132 reads, or 30.88%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 1733789 indexes in all the data.
## After reads/index pruning, there are: 836838 indexes: 896951 lost or 51.73%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 799403 changed reads.
## All data: before reads/index pruning, there are: 4681501 identical reads.
## All data: after index pruning, there are: 441562 changed reads: 55.24%.
## All data: after index pruning, there are: 3661605 identical reads: 78.21%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 3661605 identical reads.
## Before classification, there are 441562 reads with mutations.
## After classification, there are 2748736 reads/indexes which are only identical.
## After classification, there are 9916 reads/indexes which are strictly sequencer.
## After classification, there are 26403 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 7049093 forward reads and 7175885 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: 2.
## Reading the file containing mutations: preprocessing/s2/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s2/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 3421203 reads.
## Mutation data: after min-position pruning, there are: 1758479 reads: 1662724 lost or 48.60%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1758479 reads.
## Mutation data: after max-position pruning, there are: 1667302 reads: 91177 lost or 5.18%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 1642969 reads: 24333 lost or 1.46%.
## Mutation data: removing reads with greater than 10 mutations.
## Mutation data: after max_mutation pruning, there are: 1232741 reads: 410228 lost or 24.97%.
## Mutation data: all filters removed 2188462 reads, or 63.97%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 1231605 indexes in all the data.
## After reads/index pruning, there are: 693381 indexes: 538224 lost or 43.70%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 1232741 changed reads.
## All data: before reads/index pruning, there are: 5230976 identical reads.
## All data: after index pruning, there are: 720963 changed reads: 58.48%.
## All data: after index pruning, there are: 4833605 identical reads: 92.40%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 4833605 identical reads.
## Before classification, there are 720963 reads with mutations.
## After classification, there are 2832509 reads/indexes which are only identical.
## After classification, there are 98387 reads/indexes which are strictly sequencer.
## After classification, there are 123178 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 11930745 forward reads and 12406826 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: 3.
## Reading the file containing mutations: preprocessing/s3/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s3/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 4309681 reads.
## Mutation data: after min-position pruning, there are: 1564155 reads: 2745526 lost or 63.71%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1564155 reads.
## Mutation data: after max-position pruning, there are: 1482559 reads: 81596 lost or 5.22%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 1452047 reads: 30512 lost or 2.06%.
## Mutation data: removing reads with greater than 10 mutations.
## Mutation data: after max_mutation pruning, there are: 1110089 reads: 341958 lost or 23.55%.
## Mutation data: all filters removed 3199592 reads, or 74.24%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 857851 indexes in all the data.
## After reads/index pruning, there are: 463161 indexes: 394690 lost or 46.01%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 1110089 changed reads.
## All data: before reads/index pruning, there are: 3583390 identical reads.
## All data: after index pruning, there are: 662025 changed reads: 59.64%.
## All data: after index pruning, there are: 3331914 identical reads: 92.98%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 3331914 identical reads.
## Before classification, there are 662025 reads with mutations.
## After classification, there are 1873630 reads/indexes which are only identical.
## After classification, there are 79142 reads/indexes which are strictly sequencer.
## After classification, there are 237111 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 9205882 forward reads and 9355117 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Making a matrix of miss_reads_by_position.
## Making a matrix of miss_indexes_by_position.
## Making a matrix of miss_sequencer_by_position.
## Making a matrix of miss_reads_by_string.
## Making a matrix of miss_indexes_by_string.
## Making a matrix of miss_sequencer_by_string.
## Making a matrix of miss_reads_by_ref_nt.
## Making a matrix of miss_indexes_by_ref_nt.
## Making a matrix of miss_sequencer_by_ref_nt.
## Making a matrix of miss_reads_by_hit_nt.
## Making a matrix of miss_indexes_by_hit_nt.
## Making a matrix of miss_sequencer_by_hit_nt.
## Making a matrix of miss_reads_by_type.
## Making a matrix of miss_indexes_by_type.
## Making a matrix of miss_sequencer_by_type.
## Making a matrix of miss_reads_by_trans.
## Making a matrix of miss_indexes_by_trans.
## Making a matrix of miss_sequencer_by_trans.
## Making a matrix of miss_reads_by_strength.
## Making a matrix of miss_indexes_by_strength.
## Making a matrix of miss_sequencer_by_strength.
## Making a matrix of insert_reads_by_position.
## Making a matrix of insert_indexes_by_position.
## Making a matrix of insert_sequencer_by_position.
## Making a matrix of insert_reads_by_nt.
## Making a matrix of insert_indexes_by_nt.
## Making a matrix of insert_sequencer_by_nt.
## Making a matrix of delete_reads_by_position.
## Making a matrix of delete_indexes_by_position.
## Making a matrix of delete_sequencer_by_position.
## Making a matrix of delete_reads_by_nt.
## Making a matrix of delete_indexes_by_nt.
## Making a matrix of delete_sequencer_by_nt.
## Skipping table: miss_reads_by_ref_nt
## Skipping table: miss_indexes_by_ref_nt
## Skipping table: miss_sequencer_by_ref_nt
## Skipping table: miss_reads_by_hit_nt
## Skipping table: miss_indexes_by_hit_nt
## Skipping table: miss_sequencer_by_hit_nt
## Skipping table: delete_reads_by_position
## Skipping table: delete_indexes_by_position
## Skipping table: delete_sequencer_by_position
## Skipping table: delete_reads_by_nt
## Skipping table: delete_indexes_by_nt
## Skipping table: delete_sequencer_by_nt
##                      Length Class  Mode
## samples               3     -none- list
## reads_per_sample      3     -none- numeric
## indexes_per_sample    3     -none- numeric
## matrices             33     -none- list
## matrices_by_counts   33     -none- list
## normalized           33     -none- list
## normalized_by_counts 33     -none- list
quints_fivempr <- create_matrices(
  sample_sheet = "sample_sheets/all_samples.xlsx",
  ident_column = "identtable", mut_column = "mutationtable",
  min_reads = 3, min_indexes = 5, min_sequencer = 10,
  min_position = 24, max_position = 176,
  max_mutations_per_read = 5,
  prune_n = TRUE, verbose = TRUE)
## Starting sample: 1.
## Reading the file containing mutations: preprocessing/s1/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s1/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 1156535 reads.
## Mutation data: after min-position pruning, there are: 1037310 reads: 119225 lost or 10.31%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1037310 reads.
## Mutation data: after max-position pruning, there are: 968161 reads: 69149 lost or 6.67%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 953181 reads: 14980 lost or 1.55%.
## Mutation data: removing reads with greater than 5 mutations.
## Mutation data: after max_mutation pruning, there are: 608429 reads: 344752 lost or 36.17%.
## Mutation data: all filters removed 548106 reads, or 47.39%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 1713933 indexes in all the data.
## After reads/index pruning, there are: 834821 indexes: 879112 lost or 51.29%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 608429 changed reads.
## All data: before reads/index pruning, there are: 4681501 identical reads.
## All data: after index pruning, there are: 379603 changed reads: 62.39%.
## All data: after index pruning, there are: 3657910 identical reads: 78.14%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 3657910 identical reads.
## Before classification, there are 379603 reads with mutations.
## After classification, there are 2777271 reads/indexes which are only identical.
## After classification, there are 8544 reads/indexes which are strictly sequencer.
## After classification, there are 25485 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 7127863 forward reads and 7254038 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: 2.
## Reading the file containing mutations: preprocessing/s2/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s2/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 3421203 reads.
## Mutation data: after min-position pruning, there are: 1758479 reads: 1662724 lost or 48.60%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1758479 reads.
## Mutation data: after max-position pruning, there are: 1667302 reads: 91177 lost or 5.18%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 1642969 reads: 24333 lost or 1.46%.
## Mutation data: removing reads with greater than 5 mutations.
## Mutation data: after max_mutation pruning, there are: 807185 reads: 835784 lost or 50.87%.
## Mutation data: all filters removed 2614018 reads, or 76.41%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 1179116 indexes in all the data.
## After reads/index pruning, there are: 692307 indexes: 486809 lost or 41.29%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 807185 changed reads.
## All data: before reads/index pruning, there are: 5230976 identical reads.
## All data: after index pruning, there are: 585835 changed reads: 72.58%.
## All data: after index pruning, there are: 4832196 identical reads: 92.38%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 4832196 identical reads.
## Before classification, there are 585835 reads with mutations.
## After classification, there are 2934376 reads/indexes which are only identical.
## After classification, there are 79902 reads/indexes which are strictly sequencer.
## After classification, there are 116271 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 12365004 forward reads and 12844113 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Starting sample: 3.
## Reading the file containing mutations: preprocessing/s3/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s3/step2_identical_reads.txt.xz
## Mutation data: removing any differences before position: 24.
## Mutation data: before pruning, there are: 4309681 reads.
## Mutation data: after min-position pruning, there are: 1564155 reads: 2745526 lost or 63.71%.
## Mutation data: removing any differences after position: 176.
## Mutation data: before pruning, there are: 1564155 reads.
## Mutation data: after max-position pruning, there are: 1482559 reads: 81596 lost or 5.22%.
## Mutation data: removing any reads with 'N' as the hit.
## Mutation data: after N pruning, there are: 1452047 reads: 30512 lost or 2.06%.
## Mutation data: removing reads with greater than 5 mutations.
## Mutation data: after max_mutation pruning, there are: 746662 reads: 705385 lost or 48.58%.
## Mutation data: all filters removed 3563019 reads, or 82.67%.
## All data: gathering information about the indexes observed, this is slow.
## Before reads/index pruning, there are: 808995 indexes in all the data.
## After reads/index pruning, there are: 461997 indexes: 346998 lost or 42.89%.
## All data: removing indexes with fewer than 3 reads/index.
## All data: before reads/index pruning, there are: 746662 changed reads.
## All data: before reads/index pruning, there are: 3583390 identical reads.
## All data: after index pruning, there are: 555226 changed reads: 74.36%.
## All data: after index pruning, there are: 3330970 identical reads: 92.96%.
## Gathering identical, mutant, and sequencer reads/indexes.
## Before classification, there are 3330970 identical reads.
## Before classification, there are 555226 reads with mutations.
## After classification, there are 1957637 reads/indexes which are only identical.
## After classification, there are 63014 reads/indexes which are strictly sequencer.
## After classification, there are 223250 reads/indexes which are deemed from reverse transcriptase.
## Counted by direction: 9578873 forward reads and 9724531 reverse_reads.
## Subsetting based on mutations with at least 5 indexes.
## Classified mutation strings according to various queries.
## Making a matrix of miss_reads_by_position.
## Making a matrix of miss_indexes_by_position.
## Making a matrix of miss_sequencer_by_position.
## Making a matrix of miss_reads_by_string.
## Making a matrix of miss_indexes_by_string.
## Making a matrix of miss_sequencer_by_string.
## Making a matrix of miss_reads_by_ref_nt.
## Making a matrix of miss_indexes_by_ref_nt.
## Making a matrix of miss_sequencer_by_ref_nt.
## Making a matrix of miss_reads_by_hit_nt.
## Making a matrix of miss_indexes_by_hit_nt.
## Making a matrix of miss_sequencer_by_hit_nt.
## Making a matrix of miss_reads_by_type.
## Making a matrix of miss_indexes_by_type.
## Making a matrix of miss_sequencer_by_type.
## Making a matrix of miss_reads_by_trans.
## Making a matrix of miss_indexes_by_trans.
## Making a matrix of miss_sequencer_by_trans.
## Making a matrix of miss_reads_by_strength.
## Making a matrix of miss_indexes_by_strength.
## Making a matrix of miss_sequencer_by_strength.
## Making a matrix of insert_reads_by_position.
## Making a matrix of insert_indexes_by_position.
## Making a matrix of insert_sequencer_by_position.
## Making a matrix of insert_reads_by_nt.
## Making a matrix of insert_indexes_by_nt.
## Making a matrix of insert_sequencer_by_nt.
## Making a matrix of delete_reads_by_position.
## Making a matrix of delete_indexes_by_position.
## Making a matrix of delete_sequencer_by_position.
## Making a matrix of delete_reads_by_nt.
## Making a matrix of delete_indexes_by_nt.
## Making a matrix of delete_sequencer_by_nt.
## Skipping table: miss_reads_by_ref_nt
## Skipping table: miss_indexes_by_ref_nt
## Skipping table: miss_sequencer_by_ref_nt
## Skipping table: miss_reads_by_hit_nt
## Skipping table: miss_indexes_by_hit_nt
## Skipping table: miss_sequencer_by_hit_nt
## Skipping table: delete_reads_by_position
## Skipping table: delete_indexes_by_position
## Skipping table: delete_sequencer_by_position
## Skipping table: delete_reads_by_nt
## Skipping table: delete_indexes_by_nt
## Skipping table: delete_sequencer_by_nt
##                      Length Class  Mode
## samples               3     -none- list
## reads_per_sample      3     -none- numeric
## indexes_per_sample    3     -none- numeric
## matrices             33     -none- list
## matrices_by_counts   33     -none- list
## normalized           33     -none- list
## normalized_by_counts 33     -none- list
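The two runs above differ only in `max_mutations_per_read` (10 vs. 5), so a quick sanity check is that the stricter filter retains fewer changed reads in every sample. The following sketch does that comparison using the after-index-pruning counts reported in the logs (copied here as plain vectors so the check stands alone, rather than indexing into `quints_tenmpr` and `quints_fivempr` directly):

```r
## Changed reads per sample after index pruning, from the logs above.
tenmpr_reads  <- c(s1 = 441562, s2 = 720963, s3 = 662025)  # max 10 mutations/read
fivempr_reads <- c(s1 = 379603, s2 = 585835, s3 = 555226)  # max 5 mutations/read
## The stricter threshold should never keep more reads than the looser one.
stopifnot(all(fivempr_reads <= tenmpr_reads))
## Percent of the 10-mutation run's reads surviving the 5-mutation cutoff:
round(100 * fivempr_reads / tenmpr_reads, 1)
```

The same comparison could be made against the real objects via `quints_tenmpr[["reads_per_sample"]]`, since `reads_per_sample` is one of the list elements shown in the summaries.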