1 Introduction

I am coming into this project in a state of perfect ignorance. Carrie kindly sent me a review or two yesterday, but I have yet to open the email and start reading them.

The only things I know for certain:

1. There are ~ 24 samples with names prefixed with ‘A’, ‘B’, and ‘C’.
2. The most likely reference is the Ensembl house mouse mm39, though the real reference is actually the Charles River CD-1. I was reasonably certain yesterday that it is possible to download this mouse line’s reference, but I think that is untrue, or at least my attempts to find it failed.
3. The sequence libraries are likely in the reverse orientation. At least that was my assumption.

The document ‘preprocess.Rmd’ outlines the commands I ran. I used my pipeline’s Process_RNASeq function, which trims, runs fastqc, kraken, hisat, and htseq by default.

2 Metadata

I received a complete sample sheet from either Najib or Carrie and modified it slightly to match the places where I put the raw data. I then copied it to ‘sample_sheets/all_samples.xlsx’ so that I need not worry about messing up the original.

My function ‘gather_preprocessing_metadata()’ has defaults which should provide some helpful new columns in this metadata sheet. Upon completion, it should write a new copy of the same file with the suffix ‘_modified.xlsx’.
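As a rough illustration of that naming convention only, here is a base R sketch. The helper name `modified_path` is hypothetical and is not taken from gather_preprocessing_metadata() itself.

```r
# Hypothetical helper: derive the '_modified.xlsx' name from an input sheet.
# This only mirrors the naming convention described above; the real function
# may build the name differently.
modified_path <- function(path, suffix = "_modified") {
  ext <- tools::file_ext(path)             # file extension without the dot
  stem <- tools::file_path_sans_ext(path)  # path with the extension removed
  paste0(stem, suffix, ".", ext)
}

modified_path("sample_sheets/all_samples.xlsx")
```

The result should match the path reported in the "Writing new metadata to:" message.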

FIXME: modify the function to detect columns with dates and make sure to keep the encoding the same.

FIXME: Najib changed his template and added a new first row, add a check against that - or just delete the row manually.

FIXME: Set the default significant digits back to NULL, having all these darn .000’s is annoying.

modified <- gather_preprocessing_metadata("sample_sheets/all_samples.xlsx")

## Did not find the condition column in the sample sheet.

## Filling it in as undefined.

## Did not find the batch column in the sample sheet.

## Filling it in as undefined.

## Skipping for now

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/A1/outputs/*kraken_*/kraken_report_matrix.tsv.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/A2/outputs/*kraken_*/kraken_report_matrix.tsv.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/B1/outputs/*kraken_*/kraken_report_matrix.tsv.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/B3/outputs/*kraken_*/kraken_report_matrix.tsv.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/B4/outputs/*kraken_*/kraken_report_matrix.tsv.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/B6/outputs/*kraken_*/kraken_report_matrix.tsv.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/C4/outputs/*kraken_*/kraken_report_matrix.tsv.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/C6/outputs/*kraken_*/kraken_report_matrix.tsv.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/A1/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/A2/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/A3/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/A4/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/A5/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/B1/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/B2/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/B3/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/B4/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/B5/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/B6/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/C1/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/C2/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/C3/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/C4/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/C5/outputs/*salmon_*/quant.sf.

## Warning in dispatch_filename_search(meta, input_file_spec, verbose = verbose, : The input file is NA for:
## preprocessing/C6/outputs/*salmon_*/quant.sf.

## Writing new metadata to: sample_sheets/all_samples_modified.xlsx

## Deleting the file sample_sheets/all_samples_modified.xlsx before writing the tables.

I immediately learned that I somehow forgot to process the first sample!? It is processing now; I have no clue how that happened.
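One way to spot unprocessed samples before gathering the metadata is to test the same wildcard paths that appear in the warnings. This is a minimal sketch in base R; the helper name `missing_outputs` and its arguments are hypothetical, and the default path simply copies the salmon pattern from the warnings above.

```r
# Hypothetical helper: report which samples lack a file matching a wildcard
# path such as 'preprocessing/<sample>/outputs/*salmon_*/quant.sf'.
missing_outputs <- function(samples, basedir = "preprocessing",
                            subpath = file.path("outputs", "*salmon_*", "quant.sf")) {
  patterns <- file.path(basedir, samples, subpath)
  # Sys.glob() returns character(0) when nothing matches the wildcard.
  found <- vapply(patterns, function(p) length(Sys.glob(p)) > 0, logical(1))
  samples[!found]
}

# Example call; the sample names are just the first two from the sheet.
missing_outputs(c("A1", "A2"))
```

Passing a different `subpath` (for example the kraken pattern) checks another output type with the same logic.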

3 Annotations

load_biomart_annotations(), if not told anything else, will connect to Ensembl and attempt to download the most commonly requested annotations for homo_sapiens from the archive server 2 years before the current date. This is because, as a general rule, I use genomes which are ~ 2-3 years old. Here I ask for mmusculus from the July 2022 archive instead.
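To make the "2 years before the current date" idea concrete, here is a small sketch of how an archive hostname could be derived. The helper `archive_host` is hypothetical and is not the package's code; it assumes the usual Ensembl archive naming of month abbreviation plus year, and note that Ensembl does not publish an archive for every month.

```r
# Hypothetical helper: build an Ensembl archive hostname such as
# 'jul2022.archive.ensembl.org', defaulting to two years before today.
archive_host <- function(today = Sys.Date(), years_back = 2,
                         year = NULL, month = NULL) {
  if (is.null(year)) {
    year <- as.integer(format(today, "%Y")) - years_back
  }
  if (is.null(month)) {
    month <- as.integer(format(today, "%m"))
  }
  # month.abb is the built-in vector "Jan", "Feb", ...
  paste0(tolower(month.abb[month]), year, ".archive.ensembl.org")
}

archive_host(year = 2022, month = 7)
```

A real implementation would also need to fall back to a neighbouring month when the computed archive does not exist.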

I am also downloading the ontology data, though most tools are aware of Mus.

In addition, I will load the gff annotations from the gff file used to count the genes, just in case there are some mismatches between the ensembl and gff gene IDs.
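A quick way to look for such mismatches is a set comparison of the two ID vectors. The sketch below uses toy ID vectors as stand-ins; in practice they would presumably come from the rownames (or an ID column) of the two annotation tables, which is an assumption about their layout.

```r
# Toy ID vectors standing in for the IDs of the two annotation sources.
ensembl_ids <- c("ENSMUSG00000000001", "ENSMUSG00000000003", "ENSMUSG00000000028")
gff_ids     <- c("ENSMUSG00000000001", "ENSMUSG00000000028", "ENSMUSG00000000031")

only_in_ensembl <- setdiff(ensembl_ids, gff_ids)  # annotated but never counted
only_in_gff     <- setdiff(gff_ids, ensembl_ids)  # counted but lacking annotation
shared          <- intersect(ensembl_ids, gff_ids)

length(shared)
```

Anything in `only_in_gff` would end up in the count tables without annotations, so that is the set worth inspecting first.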

annot <- load_biomart_annotations(species = "mmusculus", year = 2022, month = 7)

## The biomart annotations file already exists, loading from it.

gene_annotations <- annot[["gene_annotations"]]

go_db <- load_biomart_go(species = "mmusculus", year = 2022, month = 7)

## The biomart annotations file already exists, loading from it.

gff_annot <- load_gff_annotations("~/libraries/genome/mm38_100.gff", id_col = "gene")

## Warning in load_gff_annotations("~/libraries/genome/mm38_100.gff", id_col = "gene"): Attempting to create a dataframe with gene and locus_tag both
## failed.

4 Initial expressionset

The experimental metadata now includes the count table filenames and I have a reasonable set of gene annotations. I should therefore be able to merge them all into an expressionset and/or SummarizedExperiment.

mm_expt <- create_expt(modified[["new_file"]],
                       gene_info = gene_annotations,
                       file_column = "hisatcounttable") %>%
  set_expt_conditions(fact = "abc") %>%
  set_expt_batches(fact = "number")

## Reading the sample metadata.

## The sample definitions comprises: 17 rows(samples) and 41 columns(metadata fields).

## Matched 25760 annotations and counts.

## Bringing together the count matrix and gene information.

## Saving the expressionset to 'expt.rda'.

## The final expressionset has 25760 features and 17 samples.

## The numbers of samples by condition are:

## 
## A B C 
## 5 6 6

## The number of samples by batch are:

## 
## b1 b2 b3 b4 b5 b6 
##  3  3  3  3  3  2

written <- write_expt(mm_expt, excel = glue("excel/all_samples-v{ver}.xlsx"))

5 Poke at it

plot_libsize(mm_expt)

## Library sizes of 17 samples, 
## ranging from 27,324,171 to 69,203,380.

plot_nonzero(mm_expt)

## The following samples have less than 16744 genes.

## [1] "A1" "A2" "A5" "B1" "B3" "B4"

## A non-zero genes plot of 17 samples.
## These samples have an average 37.06 CPM coverage and 16818 genes observed, ranging from 16399 to
## 17275.

norm <- normalize_expt(mm_expt, transform = "log2", convert = "cpm",
                       norm = "quant", filter = TRUE)

## Removing 12303 low-count genes (13457 remaining).

## transform_counts: Found 57 values equal to 0, adding 1 to the matrix.
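For reference, the CPM conversion and the log2 transform with its zero handling can be sketched in base R as follows. This uses a made-up count matrix, skips the low-count filter and the quantile normalization entirely, and is not the package's implementation.

```r
# Toy count matrix, filled column by column: rows are genes, columns samples.
counts <- matrix(c(10, 0, 90,     # sample s1
                   40, 20, 140),  # sample s2
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

# Counts per million: scale each column by its library size.
cpm <- t(t(counts) / colSums(counts)) * 1e6

# If any value is zero, add 1 to the whole matrix so log2 stays finite,
# mirroring the 'adding 1 to the matrix' message above.
if (any(cpm == 0)) {
  cpm <- cpm + 1
}
log_cpm <- log2(cpm)
log_cpm
```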

plot_corheat(norm)

## A heatmap of pairwise sample correlations ranging from: 
## 0.745893022508784 to 0.996487658158264.

plot_disheat(norm)

## A heatmap of pairwise sample distances ranging from: 
## 19.647203152876 to 167.113181354411.

plot_pca(norm)

## The result of performing a fast_svd dimension reduction.
## The x-axis is PC1 and the y-axis is PC2
## Colors are defined by A, B, C
## Shapes are defined by b1, b2, b3, b4, b5, b6.

Holy ass crackers! I am not sure I have ever had a dataset which split this coherently. I need better names than ‘A’ ‘B’ ‘C’.
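To put a number on how cleanly a dataset splits, one can look at the variance explained by each component. The sketch below does this with base R's svd() on simulated data; plot_pca reports using fast_svd, so svd() here is a swapped-in stand-in, and the matrix is invented purely for illustration.

```r
set.seed(1)
# Simulated log-scale matrix: 50 genes by 6 samples in two made-up groups.
mat <- matrix(rnorm(50 * 6), nrow = 50,
              dimnames = list(NULL, paste0("s", 1:6)))
mat[1:10, 4:6] <- mat[1:10, 4:6] + 5  # shift 10 genes in the second group

# Center each gene, then decompose; the columns of v describe the samples.
centered <- mat - rowMeans(mat)
decomp <- svd(centered)
pct_var <- 100 * decomp$d^2 / sum(decomp$d^2)  # percent variance per component
pc_scores <- decomp$v[, 1:2] %*% diag(decomp$d[1:2])
rownames(pc_scores) <- colnames(mat)

round(pct_var[1:2], 1)
pc_scores
```

In a cleanly separated dataset the first component carries most of the variance and the groups sit on opposite sides of it.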

Ok, I will just do a no-batch DE because I am not sure of the actual batches and/or surrogates; and who cares, the data split so well that I am worried (not really) it is simulated.

Oh, before I forget, April has been asking about rRNA content. I think I quantified that?

5.1 Check the rRNA content

No, it appears I didn’t submit rRNA queries. Let’s do that now before I forget.

cd preprocessing
start=$(pwd)
for i in A* B* C*; do
    cd "$i"
    # Join this sample's trimmed reads into a colon-separated list for --input.
    cyoa --method hisat --species mm38_100 --libtype rRNA --gff_type misc_feature --gff_tag ID \
         --input "$(/bin/ls *-trimmed.fastq.xz | tr '\n' ':' | sed 's/:$//g')"
    cd "$start"
done

6 varpart

varpart <- simple_varpart(mm_expt)

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be unreliable

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be unreliable

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be unreliable

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be unreliable

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be unreliable

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be unreliable

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be unreliable

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be unreliable

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be unreliable

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be unreliable

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be unreliable

## Warning in summary.lm(object, ...): essentially perfect fit: summary may be unreliable

## 
## Total:162 s

varpart

## The result of using variancePartition with the model:
## ~ condition + batch
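As a simplified picture of what a model like ~ condition + batch is partitioning, the fraction of variance attributable to each term can be computed for a single gene with lm() and anova(). This is a fixed-effects stand-in on simulated data, not the variancePartition algorithm, and every value below is made up.

```r
set.seed(2)
# Made-up balanced design: 3 conditions x 2 batches x 2 replicates.
design <- expand.grid(rep = 1:2, batch = c("b1", "b2"),
                      condition = c("A", "B", "C"))
design$condition <- factor(design$condition)
design$batch <- factor(design$batch)

# One simulated gene driven mostly by condition, slightly by batch.
y <- c(A = 0, B = 3, C = 6)[as.character(design$condition)] +
  c(b1 = 0, b2 = 0.5)[as.character(design$batch)] +
  rnorm(nrow(design), sd = 0.3)

fit <- lm(y ~ condition + batch, data = design)
tab <- anova(fit)
ss <- tab[["Sum Sq"]]
names(ss) <- rownames(tab)
fraction <- ss / sum(ss)  # share of total sum of squares per term
round(fraction, 3)
```

Repeating this per gene gives a genes-by-terms table of variance fractions, which is the general shape of result a variance partition reports.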

7 GSVA

Since I have not read the kindly-sent reviews, I will cheat a little and use GSVA to get some ideas about potential papers. I default to C2, which is likely not the right gene set list.

I just downloaded the new MSigDB; let us use that instead of the much less interesting GSVAdata set. Frustratingly, the new version of MSigDB provides invalid XML (there are apparently ‘<’ characters in the text fields of this file, which is explicitly forbidden in the XML standard), so I wrote a function to read the annotations from the SQLite database.

FIXME: I need to do some work to clean up the IDs with this new function.

mm_gsva <- simple_gsva(mm_expt, orgdb = "org.Mm.eg.db")

## Converting the rownames() of the expressionset to ENTREZID.

## 4032 ENSEMBL ID's didn't have a matching ENTEREZ ID. Dropping them now.

## Before conversion, the expressionset has 25760 entries.

## After conversion, the expressionset has 22019 entries.

#msig_meta <- get_msigdb_metadata(mm_gsva,
#                                 msig_db = "reference/msigdb_v2023.2.Mm/msigdb_v2023.2.Mm.db")


mm_gsva_sig <- get_sig_gsva_categories(mm_gsva)

## Starting limma pairwise comparison.

## libsize was not specified, this parameter has profound effects on limma's result.

## Using the libsize from expt$libsize.

## Limma step 1/6: choosing model.

## Assuming this data is similar to a micro array and not performign voom.

## Limma step 3/6: running lmFit with method: ls.

## Limma step 4/6: making and fitting contrasts with no intercept. (~ 0 + factors)

## Limma step 5/6: Running eBayes with robust = FALSE and trend = FALSE.

## Limma step 6/6: Writing limma outputs.

## Limma step 6/6: 1/3: Creating table: B_vs_A.  Adjust = BH

## Limma step 6/6: 2/3: Creating table: C_vs_A.  Adjust = BH

## Limma step 6/6: 3/3: Creating table: C_vs_B.  Adjust = BH

## Limma step 6/6: 1/3: Creating table: A.  Adjust = BH

## Limma step 6/6: 2/3: Creating table: B.  Adjust = BH

## Limma step 6/6: 3/3: Creating table: C.  Adjust = BH

## The factor A has 5 rows.

## The factor B has 6 rows.

## The factor C has 6 rows.

## Testing each factor against the others.

## Scoring A against everything else.

## Scoring B against everything else.

## Scoring C against everything else.

## Deleting the file excel/gsva_subset.xlsx before writing the tables.

8 Kraken matrix

Come back to this. Note to self: the previous iteration was explicitly looking for Pseudomonas contamination.

genus_expt <- create_expt(modified[["new_file"]],
                          file_column = "krakenmatrix", file_type = "table")
genus_norm <- normalize_expt(genus_expt, convert = "cpm")
plot_disheat(genus_norm)
genus_normv2 <- normalize_expt(genus_expt, convert = "cpm", transform = "log2")
plot_pca(genus_normv2)
plot_libsize(genus_expt)
head(exprs(genus_expt))
exprs(genus_expt)["Pseudomonas", ]

9 Differential Expression

Until I get more meaningful condition names, I will just do B/A, C/A, and C/B.

keepers <- list(
  "ba" = c("B", "A"),
  "ca" = c("C", "A"),
  "cb" = c("C", "B"))
de <- all_pairwise(mm_expt, filter = TRUE, model_batch = FALSE)

## 
## A B C 
## 5 6 6

tables <- combine_de_tables(
  de, keepers = keepers, excel = glue("excel/de_tables-v{ver}.xlsx"))

## Checking limma for name ba:B_vs_A

## Checking deseq for name ba:B_vs_A

## Checking edger for name ba:B_vs_A

## Checking ebseq for name ba:B_vs_A

## Checking noiseq for name ba:B_vs_A

## Checking basic for name ba:B_vs_A

## Checking limma for name ca:C_vs_A

## Checking deseq for name ca:C_vs_A

## Checking edger for name ca:C_vs_A

## Checking ebseq for name ca:C_vs_A

## Checking noiseq for name ca:C_vs_A

## Checking basic for name ca:C_vs_A

## Checking limma for name cb:C_vs_B

## Checking deseq for name cb:C_vs_B

## Checking edger for name cb:C_vs_B

## Checking ebseq for name cb:C_vs_B

## Checking noiseq for name cb:C_vs_B

## Checking basic for name cb:C_vs_B

## About to start combine_mapped_table()

## Finished combine_mapped_table()

## About to start combine_mapped_table()

## Finished combine_mapped_table()

## About to start combine_mapped_table()

## Finished combine_mapped_table()

sig <- extract_significant_genes(
  tables, according_to = "deseq", excel = glue("excel/de_sig-v{ver}.xlsx"))
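Judging from the "Checking limma for name ba:B_vs_A" messages above, each keepers entry appears to be read as numerator first, denominator second. A small base R sketch of that mapping, offered as an illustration rather than the package's own code:

```r
keepers <- list(
  "ba" = c("B", "A"),
  "ca" = c("C", "A"),
  "cb" = c("C", "B"))

# Numerator first, denominator second, joined as 'B_vs_A'.
contrast_names <- vapply(keepers,
                         function(pair) paste0(pair[1], "_vs_", pair[2]),
                         character(1))
contrast_names
```

Under that reading, a positive log fold change in the "ba" table means higher expression in B than in A.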

10 Ontology enrichment

mm38 is nicely supported in gProfiler/clusterProfiler.

all_gp <- all_gprofiler(sig, species = "mmusculus")
all_gp

## Running gProfiler on every set of significant genes found:

##           GO KEGG REAC WP  TF MIRNA HPA CORUM HP
## ba_up   1329    4   14  2 483     0   0     0  0
## ba_down 1296    9   58  1 407     0   0     0  0
## ca_up   1484    5    9  0 524     0   0     0  0
## ca_down 1374   11   80  2 489     0   0     0  0
## cb_up    656    3    3  0 146     0   0     0  1
## cb_down  168    0    0  0  36     0   0     0  0

all_gp[["ba_up"]]

## A set of ontologies produced by gprofiler using 2437
## genes against the mmusculus annotations and significance cutoff 0.05.
## There are 1329 GO hits, 4, KEGG hits, 14 reactome hits, 2 wikipathway hits, 483 transcription factor hits, 0 miRNA hits, 0 HPA hits, 0 HP hits, and 0 CORUM hits.

## Category MF is the most populated with 30 hits.

all_gp[["ba_down"]]

## A set of ontologies produced by gprofiler using 1497
## genes against the mmusculus annotations and significance cutoff 0.05.
## There are 1296 GO hits, 9, KEGG hits, 58 reactome hits, 1 wikipathway hits, 407 transcription factor hits, 0 miRNA hits, 0 HPA hits, 0 HP hits, and 0 CORUM hits.
## Category MF is the most populated with 30 hits.

all_gp[["ca_up"]]

## A set of ontologies produced by gprofiler using 2916
## genes against the mmusculus annotations and significance cutoff 0.05.
## There are 1484 GO hits, 5, KEGG hits, 9 reactome hits, 0 wikipathway hits, 524 transcription factor hits, 0 miRNA hits, 0 HPA hits, 0 HP hits, and 0 CORUM hits.
## Category MF is the most populated with 30 hits.

all_gp[["ca_down"]]

## A set of ontologies produced by gprofiler using 1840
## genes against the mmusculus annotations and significance cutoff 0.05.
## There are 1374 GO hits, 11, KEGG hits, 80 reactome hits, 2 wikipathway hits, 489 transcription factor hits, 0 miRNA hits, 0 HPA hits, 0 HP hits, and 0 CORUM hits.
## Category MF is the most populated with 30 hits.

all_gp[["cb_up"]]

## A set of ontologies produced by gprofiler using 753
## genes against the mmusculus annotations and significance cutoff 0.05.
## There are 656 GO hits, 3, KEGG hits, 3 reactome hits, 0 wikipathway hits, 146 transcription factor hits, 0 miRNA hits, 0 HPA hits, 1 HP hits, and 0 CORUM hits.
## Category MF is the most populated with 30 hits.

all_gp[["cb_down"]]

## A set of ontologies produced by gprofiler using 199
## genes against the mmusculus annotations and significance cutoff 0.05.
## There are 168 GO hits, 0, KEGG hits, 0 reactome hits, 0 wikipathway hits, 36 transcription factor hits, 0 miRNA hits, 0 HPA hits, 0 HP hits, and 0 CORUM hits.

## Category BP is the most populated with 30 hits.

## all_cp <- all_clusterprofiler(sig, species = "mmusculus")


Examining an experiment in mouse embryogenesis.

atb abelew@gmail.com

2024-02-01