1 Messing around

Rawr!

## Loading errRt
## Loading required package: dplyr
## 
## Attaching package: 'dplyr'
## The following object is masked from 'package:hpgltools':
## 
##     combine
## The following object is masked from 'package:Biobase':
## 
##     combine
## The following objects are masked from 'package:BiocGenerics':
## 
##     combine, intersect, setdiff, union
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
## Loading required package: tidyr
## Skipping missing files: quantify.r
## Adding files missing in collate: quant.R
## Starting sample: 1.
## Reading the file containing mutations: preprocessing/s1/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s1/step2_identical_reads.txt.xz
## Removing any differences before position: 24.
## Before pruning, there are: 1156535 reads.
## After position pruning, there are: 1037310 reads.
## Removing any reads with 'N' as the hit.
## Before N pruning, there are: 1037310 reads.
## After N pruning, there are: 1033703 reads.
## Gathering information about the indexes observed, this is slow.
## Before read pruning, there are: 1742744 indexes.
## After read pruning, there are: 838158 indexes.
## Removing indexes with fewer than  indexes.
## Before index pruning, there are: 1033703 changed reads.
## Before index pruning, there are: 4681501 identical reads.
## After index pruning, there are: 532571 changed reads.
## After index pruning, there are: 3663814 identical reads.
## Gathering identical, mutant, and sequencer reads/indexes.
## Counting by direction.
## Counting by string.
## Counting by reference position.
## Counting by identity string.
## Counting by reference nucleotide.
## Counting by product nucleotide.
## Counting by mutation type.
## Counting by transitions/transversion.
## Counting strong/weak.
## Counting insertions by position.
## Counting insertions by nucleotide
## Counting deletions by position.
## Counting deletions by nucleotide.
## Starting sample: 2.
## Reading the file containing mutations: preprocessing/s2/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s2/step2_identical_reads.txt.xz
## Removing any differences before position: 24.
## Before pruning, there are: 3421203 reads.
## After position pruning, there are: 1758479 reads.
## Removing any reads with 'N' as the hit.
## Before N pruning, there are: 1758479 reads.
## After N pruning, there are: 1754936 reads.
## Gathering information about the indexes observed, this is slow.
## Before read pruning, there are: 1263564 indexes.
## After read pruning, there are: 694479 indexes.
## Removing indexes with fewer than  indexes.
## Before index pruning, there are: 1754936 changed reads.
## Before index pruning, there are: 5230976 identical reads.
## After index pruning, there are: 874781 changed reads.
## After index pruning, there are: 4834367 identical reads.
## Gathering identical, mutant, and sequencer reads/indexes.
## Counting by direction.
## Counting by string.
## Counting by reference position.
## Counting by identity string.
## Counting by reference nucleotide.
## Counting by product nucleotide.
## Counting by mutation type.
## Counting by transitions/transversion.
## Counting strong/weak.
## Counting insertions by position.
## Counting insertions by nucleotide
## Counting deletions by position.
## Counting deletions by nucleotide.
## Starting sample: 3.
## Reading the file containing mutations: preprocessing/s3/step4.txt.xz
## Reading the file containing the identical reads: preprocessing/s3/step2_identical_reads.txt.xz
## Removing any differences before position: 24.
## Before pruning, there are: 4309681 reads.
## After position pruning, there are: 1564155 reads.
## Removing any reads with 'N' as the hit.
## Before N pruning, there are: 1564155 reads.
## After N pruning, there are: 1561090 reads.
## Gathering information about the indexes observed, this is slow.
## Before read pruning, there are: 887983 indexes.
## After read pruning, there are: 465046 indexes.
## Removing indexes with fewer than  indexes.
## Before index pruning, there are: 1561090 changed reads.
## Before index pruning, there are: 3583390 identical reads.
## After index pruning, there are: 783358 changed reads.
## After index pruning, there are: 3332425 identical reads.
## Gathering identical, mutant, and sequencer reads/indexes.
## Counting by direction.
## Counting by string.
## Counting by reference position.
## Counting by identity string.
## Counting by reference nucleotide.
## Counting by product nucleotide.
## Counting by mutation type.
## Counting by transitions/transversion.
## Counting strong/weak.
## Counting insertions by position.
## Counting insertions by nucleotide
## Counting deletions by position.
## Counting deletions by nucleotide.
## Working on miss_reads_by_position.
## Working on miss_indexes_by_position.
## Working on miss_sequencer_by_position.
## Working on miss_reads_by_string.
## Warning in order(as.numeric(rownames(matrices[[t]]))): NAs introduced by
## coercion
## Working on miss_indexes_by_string.
## Warning in order(as.numeric(rownames(matrices[[t]]))): NAs introduced by
## coercion
## Working on miss_sequencer_by_string.
## Warning in order(as.numeric(rownames(matrices[[t]]))): NAs introduced by
## coercion
## Working on miss_reads_by_ref_nt.
## Working on miss_indexes_by_ref_nt.
## Working on miss_sequencer_by_ref_nt.
## Working on miss_reads_by_hit_nt.
## Working on miss_indexes_by_hit_nt.
## Working on miss_sequencer_by_hit_nt.
## Working on miss_reads_by_type.
## Working on miss_indexes_by_type.
## Working on miss_sequencer_by_type.
## Working on miss_reads_by_trans.
## Working on miss_indexes_by_trans.
## Working on miss_sequencer_by_trans.
## Working on miss_reads_by_strength.
## Working on miss_indexes_by_strength.
## Working on miss_sequencer_by_strength.
## Working on insert_reads_by_position.
## Working on insert_indexes_by_position.
## Working on insert_sequencer_by_position.
## Working on insert_reads_by_nt.
## Working on insert_indexes_by_nt.
## Working on insert_sequencer_by_nt.
## Working on delete_reads_by_position.
## Working on delete_indexes_by_position.
## Working on delete_sequencer_by_position.
## Working on delete_reads_by_nt.
## Working on delete_indexes_by_nt.
## Working on delete_sequencer_by_nt.
##            Length Class  Mode
## matrices   33     -none- list
## normalized 33     -none- list
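
The `NAs introduced by coercion` warnings in the log above are raised only while the three `*_by_string` tables are processed. This suggests that their row names are not numeric, so `as.numeric()` returns `NA` for them. The following is a minimal sketch of one way to order rows without triggering the warning. It assumes a hypothetical helper, `order_rownames()`, which is not part of errRt, and toy matrices whose row names (for example `T_to_C`) are invented for illustration.

```r
# Hypothetical helper, not part of errRt: order a matrix by its row names,
# numerically when every name parses as a number, alphabetically otherwise.
order_rownames <- function(mat) {
  rn <- rownames(mat)
  if (is.null(rn)) {
    # Nothing to sort by, so return the matrix unchanged.
    return(mat)
  }
  # Attempt the numeric conversion quietly; NA marks a non-numeric name.
  numeric_rn <- suppressWarnings(as.numeric(rn))
  if (anyNA(numeric_rn)) {
    idx <- order(rn)
  } else {
    idx <- order(numeric_rn)
  }
  mat[idx, , drop = FALSE]
}

# Toy data standing in for a by-position table and a by-string table.
by_position <- matrix(1:3, ncol = 1,
                      dimnames = list(c("30", "100", "24"), "s1"))
by_string <- matrix(1:3, ncol = 1,
                    dimnames = list(c("T_to_C", "A_to_G", "G_to_A"), "s1"))

rownames(order_rownames(by_position))
rownames(order_rownames(by_string))
```

Position-like names are sorted as numbers (so `100` follows `30`), while string-like names fall back to alphabetical order instead of being coerced to `NA`. Whether alphabetical order is the desired fallback for the real tables is an assumption.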