I downloaded a new version of cellranger along with the various reference files provided by 10x for the VD(J) references etc. I got a bit distracted by the pipeline language implemented by 10x called ‘martian’. I have the feeling that it might prove a good thing to play with.
Here are the commands I ran to separate the samples and perform the alignments. There are 4 sample names and each was done with one run of the ‘normal’ GEX scRNASeq method and one of the (new to me) V(D)J library.
The sample names are below, but the types are GEX
module add cellranger
names="A_Control A_IM_Mex09 A_IN_Mex09 A_Mock_Mex09"
for name in names; do
echo $name
cellranger count --fastqs=preprocessing \
--transcriptome=/sw/local/cellranger/202212/reference/refdata-gex-mm10-2020-A \
--id="${name}_GEX" --sample="${name}_GEX"
cellranger vdj --fastqs preprocessing/ \
--reference /sw/local/cellranger/202212/reference/refdata-cellranger-vdj-GRCm38-alts-ensembl-7.0.0/ \
--id="${name}_VDJ" --sample="${name}_VDJ"
done
April kindly sent some information from 10x which shows that I should have used the multi pipline when preprocessing the data.
Intra-Muscular vs. Nasal
I wrote 4 separate configuration csv files using the templates I downloaded and following a little reading. It seemed to me that I should be able to process them all as a single csv file, but when I attempted that, cellranger did not react well. It also took a few tries before I got the various reference/library options correct.
Note that once cellranger successfully ran for the samples I moved them to the multi/ directory so that I can compare the outputs to when I simply did the ‘count’ operation.
The following invocations of cellranger all appear to work without any problems. Ideally I would like them to be done in a single run, though.
module add cellranger
cellranger multi --id control --csv sample_sheets/multi_config_try03_control.csv
cellranger multi --id mock --csv sample_sheets/multi_config_try03_mock.csv
cellranger multi --id m_vaccine --csv sample_sheets/multi_config_try03_m.csv
cellranger multi --id n_vaccine --csv sample_sheets/multi_config_try03_n.csv
mv control mock m_vaccine n_vaccine multi/
The following command, which in theory should do the preprocessing of all samples together, fails. It seems my understanding of the multi csv input is incorrect.
module add cellranger
rm -rf run01 && cellranger multi --id run01 --csv sample_sheets/multi_config_try04_all.csv
## Error in running command bash
read.csv("sample_sheets/multi_config_try04_all.csv", comment.char="#")
## X.gene.expression.
## 1 reference
## 2 no-target-umi-filter
## 3 chemistry
## 4 expect-cells
## 5 include-introns
## 6 no-secondary
## 7 no-bam
## 8 min-assignment-confidence
## 9
## 10
## 11
## 12
## 13
## 14
## 15
## 16 [feature]
## 17
## 18
## 19
## 20
## 21 [vdj]
## 22 reference
## 23
## 24
## 25
## 26
## 27
## 28
## 29
## 30
## 31 [libraries]
## 32 fastq_id
## 33 feature_types
## 34
## 35
## 36 A_Control_GEX
## 37 gene expression
## 38
## 39
## 40 A_Mock_Mex09_GEX
## 41 gene expression
## 42
## 43
## 44 A_IM_Mex09_GEX
## 45 gene expression
## 46
## 47
## 48 A_IN_Mex09_GEX
## 49 gene expression
## 50
## 51
## 52 A_Control_VDJ
## 53 vdj
## 54
## 55
## 56 A_Mock_Mex09_VDJ
## 57 vdj
## 58
## 59
## 60 A_IM_Mex09_VDJ
## 61 vdj
## 62
## 63
## 64 A_IN_Mex09_VDJ
## 65 vdj
## 66
## 67
## X
## 1 /sw/local/cellranger/202212/reference/refdata-gex-mm10-2020-A
## 2 false
## 3 auto
## 4 10000
## 5 false
## 6 false
## 7 false
## 8 0.9
## 9
## 10
## 11
## 12
## 13
## 14
## 15
## 16
## 17
## 18
## 19
## 20
## 21
## 22 /sw/local/cellranger/202212/reference/refdata-cellranger-vdj-GRCm38-alts-ensembl-7.0.0/
## 23
## 24
## 25
## 26
## 27
## 28
## 29
## 30
## 31
## 32 fastqs
## 33 subsample_rate
## 34
## 35
## 36 preprocessing
## 37 1
## 38
## 39
## 40 preprocessing
## 41 1
## 42
## 43
## 44 preprocessing
## 45 1
## 46
## 47
## 48 preprocessing
## 49 1
## 50
## 51
## 52 preprocessing
## 53 1
## 54
## 55
## 56 preprocessing
## 57 1
## 58
## 59
## 60 preprocessing
## 61 1
## 62
## 63
## 64 preprocessing
## 65 1
## 66
## 67
## X.1
## 1
## 2
## 3
## 4
## 5
## 6
## 7
## 8 # The minimum estimated likelihood to call a sample as tagged with a Cell Multiplexing Oligo instead of Unassigned". Optional. Default: 0.9. Introduced in Cell Ranger 6.0.2."
## 9
## 10
## 11
## 12
## 13
## 14
## 15
## 16
## 17
## 18
## 19
## 20
## 21
## 22
## 23
## 24
## 25
## 26
## 27
## 28
## 29
## 30
## 31
## 32 physical_library_id
## 33
## 34
## 35
## 36 gex
## 37
## 38
## 39
## 40 gex
## 41
## 42
## 43
## 44 gex
## 45
## 46
## 47
## 48 gex
## 49
## 50
## 51
## 52 vdj
## 53
## 54
## 55
## 56 vdj
## 57
## 58
## 59
## 60 vdj
## 61
## 62
## 63
## 64 vdj
## 65
## 66
## 67
Here is a snippet of code copy-pasted from:
which, if I read it correctly, will read in the vdj output and add it as a set of annotations to the seurat object. Note, I made some changes to it because I simply cannot help myself.
<- function(seurat_obj, name = "control", type = "t") {
add_clonotype <- glue::glue("multi/{name}/outs/per_sample_outs/{name}/vdj_t")
vdj_directory <- glue::glue("{vdj_directory}/filtered_contig_annotations.csv")
vdj_csv <- glue::glue("{vdj_directory}/clonotypes.csv")
reference_csv <- readr::read_csv(vdj_csv, show_col_types = FALSE)
tcr <- readr::read_csv(reference_csv, show_col_types = FALSE)
ref
<- duplicated(tcr[["barcode"]])
tcr_duplicate_barcode_idx <- tcr[!tcr_duplicate_barcode_idx, ]
tcr_nodup
<- as.data.frame(merge(tcr_nodup, ref, by.x = "raw_clonotype_id", by.y = "clonotype_id"))
both rownames(both) <- both[["barcode"]]
"barcode"]] <- NULL
both[[
<- Seurat::AddMetaData(object = seurat_obj, metadata = both)
new return(new)
}
The following block is mostly a cut/paste of itself where I set the (over)simplified name of each sample. This then becomes the template for the path and parameters used to read the data, create a seurat object, and add the clonotype data from the vdj run.
<- "control"
name <- glue::glue("multi/{name}/outs/per_sample_outs/{name}/count/sample_filtered_feature_bc_matrix")
path <- Seurat::Read10X(path) %>%
a_control CreateSeuratObject(project = name) %>%
add_clonotype(name = name)
<- "mock"
name <- glue::glue("multi/{name}/outs/per_sample_outs/{name}/count/sample_filtered_feature_bc_matrix")
path <- Seurat::Read10X(path) %>%
a_mock CreateSeuratObject(project = name) %>%
add_clonotype(name = name)
<- "m"
name <- glue::glue("multi/{name}/outs/per_sample_outs/{name}/count/sample_filtered_feature_bc_matrix")
path <- Seurat::Read10X(path) %>%
a_m CreateSeuratObject(project = name) %>%
add_clonotype(name = name)
<- "n"
name <- glue::glue("multi/{name}/outs/per_sample_outs/{name}/count/sample_filtered_feature_bc_matrix")
path <- Seurat::Read10X(path) %>%
a_n CreateSeuratObject(project = name) %>%
add_clonotype(name = name)
For the moment I want to be able to play with the individual samples as well as the aggregate so that I can better understand the data. So I guess it works out that I didn’t figure out how to run all the samples at the same time via ‘cellranger multi’.
<- merge(a_control, a_mock) %>%
all merge(a_m) %>%
merge(a_n)
## Warning in CheckDuplicateCellNames(object.list = objects): Some cell names are
## duplicated across objects provided. Renaming to enforce unique cell names.
## Warning in CheckDuplicateCellNames(object.list = objects): Some cell names are
## duplicated across objects provided. Renaming to enforce unique cell names.
<- !is.na(a_control[["raw_clonotype_id"]])
control_clono summary(control_clono)
## raw_clonotype_id
## Mode :logical
## FALSE:12601
## TRUE :1937
<- !is.na(a_mock[["raw_clonotype_id"]])
mock_clono summary(mock_clono)
## raw_clonotype_id
## Mode :logical
## FALSE:10480
## TRUE :3014
<- !is.na(a_m[["raw_clonotype_id"]])
m_clono summary(m_clono)
## raw_clonotype_id
## Mode :logical
## FALSE:9256
## TRUE :4113
<- !is.na(a_n[["raw_clonotype_id"]])
n_clono summary(n_clono)
## raw_clonotype_id
## Mode :logical
## FALSE:5412
## TRUE :3033
<- as.factor(LETTERS[Idents(object=all)])
cluster_letters names(cluster_letters) <- colnames(x=all)
<- as.character(cluster_letters)
control <- control
mock <- control
m_vaccine <- control
n_vaccine
<- control == "A"
control_idx <- "Control"
control[control_idx] !control_idx] <- "Stimulated"
control[<- mock == "B"
mock_idx <- "Mock"
mock[mock_idx] !mock_idx] <- "Not mock"
mock[<- m_vaccine == "C"
m_vaccine_idx <- "M"
m_vaccine[m_vaccine_idx] !m_vaccine_idx] <- "Not M"
m_vaccine[<- n_vaccine == "D"
n_vaccine_idx <- "N"
n_vaccine[n_vaccine_idx] !n_vaccine_idx] <- "Not N"
n_vaccine[
<- AddMetaData(
all object=all,
metadata=cluster_letters,
col.name="cluster_letters")
<- AddMetaData(
all object=all,
metadata=control,
col.name="control")
<- AddMetaData(
all object=all,
metadata=mock,
col.name="mock")
<- AddMetaData(
all object=all,
metadata=m_vaccine,
col.name="M")
<- AddMetaData(
all object=all,
metadata=n_vaccine,
col.name="N")
"percent_mt"]] <- PercentageFeatureSet(all, pattern="^mt-")
all[["percent_ribo"]] <- PercentageFeatureSet(all, pattern="^Rp[sl]")
all[[VlnPlot(all, features="nFeature_RNA", pt.size=0)
VlnPlot(all, features="percent_mt", pt.size=0)
VlnPlot(all, features="percent_ribo", pt.size=0)
VlnPlot(all, features="nCount_RNA", pt.size=0)
FeatureScatter(all, "percent_ribo", "percent_mt")
FeatureScatter(all, "nCount_RNA", "nFeature_RNA")
FeatureScatter(all, "nCount_RNA", "nFeature_RNA")
## How many cells have specific chains associated with them
sum(!is.na(all$chain))
## [1] 12097
ncol(all)
## [1] 49846
## Get a summary of the counts / cell
summary(colSums(all@assays$RNA@counts))
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 500 1301 2210 3421 4576 37440
dim(all)
## [1] 32285 49846
<- WhichCells(all, expression = nFeature_RNA > 200)
sufficient_rna_observed <- subset(all, cells = sufficient_rna_observed)
filt dim(filt)
## [1] 32285 49838
<- rowSums(all) > 3
sufficiently_observed_idx summary(sufficiently_observed_idx)
## Mode FALSE TRUE
## logical 12113 20172
<- subset(filt, features = rownames(all)[sufficiently_observed_idx])
filt dim(filt)
## [1] 20172 49838
## Keep cells with at least some ribosomal reads
## Note the Percent function above actually puts in a floating point
## number from 0-100, not (as I assumed from 0-1).
<- WhichCells(filt, expression = percent_ribo > 5)
high_ribosomal <- subset(filt, cells = high_ribosomal)
filt dim(filt)
## [1] 20172 37946
<- WhichCells(filt, expression = percent_mt < 20)
low_mitochondrial <- subset(filt, cells = low_mitochondrial)
filt dim(filt)
## [1] 20172 37929
## Now drop mitochondrial genes
<- grepl("^mt-", rownames(filt))
mt_idx summary(mt_idx)
## Mode FALSE TRUE
## logical 20159 13
<- filt[!mt_idx, ]
filt dim(filt)
## [1] 20159 37929
<- NormalizeData(object=all) %>%
all_norm FindVariableFeatures() %>%
ScaleData() %>%
RunPCA() %>%
FindNeighbors() %>%
FindClusters() %>%
RunTSNE() %>%
RunUMAP(reduction = "pca", dims = 1:10)
## Centering and scaling data matrix
## PC_ 1
## Positive: Ager, Sparc, Mettl7a1, Cldn18, Emp2, Cd63, Cystm1, Aqp5, Clic3, Krt7
## Gsn, Limch1, Cryab, Alcam, Selenbp1, Scnn1a, Selenbp2, Timp3, Krt18, Krt19
## Sec14l3, Gprc5a, Cyp2b10, Krt8, Igfbp6, Igfbp7, Mal2, Myh14, Npnt, Scd2
## Negative: Rac2, Coro1a, Arhgdib, Ms4a4b, Laptm5, Cd3e, Srgn, Selplg, Cd52, Cd3g
## Cd3d, Ptprc, Lat, Thy1, Ltb, Vps37b, Lck, Ms4a6b, Gimap3, Ptprcap
## Il7r, Sept1, Itgb7, Gramd3, Lsp1, Skap1, Cd53, Cytip, Cd2, Fyb
## PC_ 2
## Positive: Wfdc2, Atp1b1, Cbr2, Ctsh, Sftpd, Napsa, Cldn3, Sftpb, Cxcl15, Slc34a2
## Sftpa1, Chil1, Muc1, Lyz2, Sftpc, Lamp3, Chchd10, Ppp1r14c, Lgi3, Krt18
## Scd1, Cldn7, Ptprf, Car8, Irx1, Baiap2l1, Hc, Epcam, Krt8, Dram1
## Negative: Sod3, Serping1, Bgn, Prelp, Plpp3, Olfml3, Tcf21, Spon1, Hsd11b1, Fn1
## Rarres2, Loxl1, Mfap4, Plxdc2, Gpx3, Cdo1, Itga8, Pdgfra, Clec3b, Dpep1
## Col1a2, Pcolce2, Ptgis, Gyg, Col1a1, Inmt, Ogn, Fhl1, Fmo2, Lrp1
## PC_ 3
## Positive: Rtkn2, Spock2, Scnn1g, Igfbp2, Pdpn, Col4a3, Sema3e, Cyp2b10, Scnn1b, Col4a4
## Flrt3, Hopx, Cytl1, Lama3, Tacstd2, Ndnf, Tmem37, Itgb6, Krt7, Krt19
## Lamb3, Ptpre, Bdnf, Cyp4b1, Clic5, Crlf1, Cdkn2b, Mab21l4, Scnn1a, Hs2st1
## Negative: Cbr2, Cxcl15, Slc34a2, Sftpb, Sftpd, Chil1, Sftpa1, Cldn3, Napsa, Dram1
## Lyz2, Lamp3, Sftpc, Muc1, Ppp1r14c, Scd1, Mgst1, Hc, Lgi3, Car8
## Nupr1, Fabp5, S100g, Ctsc, Egfl6, Hp, Cpm, Atp1b1, Fasn, Chia1
## PC_ 4
## Positive: Cd74, H2-Ab1, H2-Aa, Slc34a2, Sftpc, H2-Eb1, Napsa, Sftpa1, Sftpb, Lamp3
## Cxcl15, Lyz2, Hc, Chil1, Lgi3, Scd1, Sftpd, Lpcat1, Cd93, Dram1
## S100g, Ly6c1, Egfl6, Tmem252, Sema3c, Pla2g1b, Etv5, Car4, Chia1, Ctsc
## Negative: Fam183b, Foxj1, Tmem212, Cyp2s1, 1110017D15Rik, Gm19935, 1700007K13Rik, Ccdc153, Cfap126, Tctex1d4
## Spaca9, 1700094D03Rik, AU040972, BC051019, Rsph1, Dnali1, Dynlrb2, Nme5, Mns1, Gm867
## 1700016K19Rik, Pifo, Arhgdig, 1700001C02Rik, Ccdc113, Tekt4, Mlf1, Odf3b, 2410004P03Rik, Sntn
## PC_ 5
## Positive: Coro1a, Rac2, Lgals1, Cd3e, Arhgdib, Laptm5, Cd52, Ms4a4b, Selplg, Cd3g
## Ifi27l2a, Ptprc, Thy1, Ptprcap, Lsp1, Cd3d, Lat, Sept1, Lck, Ltb
## Cytip, Ms4a6b, Itgb7, Gimap3, Skap1, Icos, Il2rb, Cd2, Vps37b, Fyb
## Negative: Ly6a, Tm4sf1, Ly6c1, Ly6e, Cd93, Sema3c, Car4, Tmem252, Kdr, Stmn2
## Plaur, Id1, Tmcc2, H2-Ab1, Lyve1, Gja4, Efnb2, Fam183b, Hspa1b, Hspb1
## Cyp4b1, Foxj1, Odf3b, Tbx3, Cd74, Tmem212, Rgs12, 1110017D15Rik, Ccdc153, 1700007K13Rik
## Computing nearest neighbor graph
## Computing SNN
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
##
## Number of nodes: 49846
## Number of edges: 1551538
##
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9142
## Number of communities: 27
## Elapsed time: 16 seconds
## Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
## To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
## This message will be shown once per session
## 17:57:42 UMAP embedding parameters a = 0.9922 b = 1.112
## 17:57:42 Read 49846 rows and found 10 numeric columns
## 17:57:42 Using Annoy for neighbor search, n_neighbors = 30
## 17:57:42 Building Annoy index with metric = cosine, n_trees = 50
## 0% 10 20 30 40 50 60 70 80 90 100%
## [----|----|----|----|----|----|----|----|----|----|
## **************************************************|
## 17:57:51 Writing NN index file to temp file /tmp/RtmptPz7ea/file1761b2551e40fc
## 17:57:51 Searching Annoy index using 1 thread, search_k = 3000
## 17:58:17 Annoy recall = 100%
## 17:58:20 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30
## 17:58:25 Initializing from normalized Laplacian + noise (using irlba)
## 17:58:35 Commencing optimization for 200 epochs, with 2065552 positive edges
## 17:59:02 Optimization finished
DimPlot(object=all_norm, reduction="tsne")
<- DimPlot(all_norm, reduction="umap", group.by="cluster_letters", label=TRUE)
plotted plotted
<- NormalizeData(object=filt) %>%
filt_norm FindVariableFeatures() %>%
ScaleData() %>%
RunPCA() %>%
FindNeighbors() %>%
FindClusters() %>%
RunTSNE() %>%
RunUMAP(reduction = "pca", dims = 1:10)
## Centering and scaling data matrix
## PC_ 1
## Positive: Rac2, Coro1a, Arhgdib, Cd3e, Ms4a4b, Laptm5, Selplg, Cd52, Cd3g, Cd3d
## Ptprc, Lat, Thy1, Ltb, Lck, Ms4a6b, Ptprcap, Il7r, Sept1, Gramd3
## Lsp1, Itgb7, Cd2, Fyb, Cd28, Icos, Cd69, Il2rb, Ccr7, AW112010
## Negative: Sparc, Emp2, Ager, Igfbp7, Cd63, Mettl7a1, Gsn, Cldn18, Tns1, Timp3
## Limch1, Gpx3, Cystm1, Cst3, Cryab, Bgn, Clic3, Aqp5, Bcam, Krt7
## Npnt, Selenbp1, Hspb1, Pmp22, Ifitm3, Selenbp2, Inmt, Apoe, Serping1, Ccn1
## PC_ 2
## Positive: Sftpd, Sftpb, Napsa, Cxcl15, Slc34a2, Sftpa1, Wfdc2, Cbr2, Chil1, Cldn3
## Atp1b1, Sftpc, Lyz2, Lamp3, Muc1, Ppp1r14c, Dram1, Ctsh, Scd1, Lgi3
## Hc, Lpcat1, Car8, S100g, Ctsc, Egfl6, Pi4k2b, Fabp5, Abca3, Irx1
## Negative: Bgn, Serping1, Sod3, Plpp3, Rarres2, Prelp, Hsd11b1, Tcf21, Sparcl1, Clec3b
## Olfml3, Gpx3, Col1a2, Mfap4, Vim, Spon1, Fhl1, Loxl1, Col1a1, Cdo1
## Fn1, Pcolce2, Ptgis, Mxra8, Plxdc2, Ppp1r14a, Col3a1, Ogn, Gsn, Lrp1
## PC_ 3
## Positive: Rtkn2, Spock2, Scnn1g, Igfbp2, Cyp2b10, Col4a3, Pdpn, Sema3e, Hopx, Scnn1b
## Krt7, Col4a4, Krt19, Sec14l3, Tacstd2, Flrt3, Cyp4b1, Lama3, Cytl1, Scnn1a
## Clic5, Tmem37, Clic3, Ndnf, Itgb6, Mab21l4, Lamb3, Cdkn2b, Tspan8, Lmo7
## Negative: Mgst1, Nupr1, Col6a1, Dram1, Slc34a2, Sod3, Cxcl15, Chil1, Sftpb, Loxl1
## Apoe, Ces1d, Col3a1, Sftpd, Sftpa1, Lyz2, Col1a1, Lamp3, Napsa, Col1a2
## Cbr2, C3, Sftpc, Scd1, Hc, Serping1, Muc1, Ppp1r14c, Lgi3, Prelp
## PC_ 4
## Positive: Cldn5, Plvap, Icam2, Hpgd, Cd93, Tmem252, Tm4sf1, H2-Ab1, Sema3c, Cd74
## Lyve1, Stmn2, Pcdh1, Hilpda, Gja4, Mcam, Ier3, Efnb2, Ly6c1, Cebpd
## Ly6a, Hspa1a, Kdr, Ifitm3, Sox17, Dll4, Car4, H2-Eb1, Plk2, H2-Aa
## Negative: Birc5, Lgals1, Mki67, Pclaf, Top2a, Ube2c, Ccna2, Rrm2, Cdk1, S100a6
## Ccnb2, Coro1a, Rac2, Cdca8, Spc24, Tk1, Knl1, Cks1b, Aurkb, Tpx2
## Arhgdib, Hist1h1b, Lgals3, Cdca3, Nusap1, Laptm5, Cenpf, Ifi27l2a, Hmgb2, Kif11
## PC_ 5
## Positive: Ager, Rtkn2, Aqp5, Scnn1g, Npnt, Col4a4, Col4a3, Rnase4, Cldn18, Gprc5a
## Ndnf, Spock2, Igfbp2, Scd2, Igfbp6, Slc39a8, Rgcc, Fam189a2, Limch1, Sftpc
## Slc34a2, Crlf1, Flrt3, Tmem37, Mal2, Scnn1b, Inmt, Napsa, Pdpn, Hc
## Negative: Fam183b, Foxj1, Tmem212, 1110017D15Rik, Ccdc153, 1700007K13Rik, Gm19935, Cfap126, 1700094D03Rik, Spaca9
## Cyp2s1, Tctex1d4, AU040972, Odf3b, Rsph1, Dnali1, Dynlrb2, BC051019, 1700016K19Rik, Nme5
## 1700001C02Rik, Sntn, Tekt4, Pifo, Ccdc113, Gm867, Mns1, 1700088E04Rik, Cdhr4, Ak7
## Computing nearest neighbor graph
## Computing SNN
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
##
## Number of nodes: 37929
## Number of edges: 1177655
##
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.9064
## Number of communities: 25
## Elapsed time: 9 seconds
## 18:03:57 UMAP embedding parameters a = 0.9922 b = 1.112
## 18:03:57 Read 37929 rows and found 10 numeric columns
## 18:03:57 Using Annoy for neighbor search, n_neighbors = 30
## 18:03:57 Building Annoy index with metric = cosine, n_trees = 50
## 0% 10 20 30 40 50 60 70 80 90 100%
## [----|----|----|----|----|----|----|----|----|----|
## **************************************************|
## 18:04:03 Writing NN index file to temp file /tmp/RtmptPz7ea/file1761b231f15123
## 18:04:03 Searching Annoy index using 1 thread, search_k = 3000
## 18:04:23 Annoy recall = 100%
## 18:04:25 Commencing smooth kNN distance calibration using 1 thread with target n_neighbors = 30
## 18:04:30 Initializing from normalized Laplacian + noise (using irlba)
## 18:04:36 Commencing optimization for 200 epochs, with 1557970 positive edges
## 18:04:57 Optimization finished
DimPlot(object=filt_norm, reduction="tsne")
<- DimPlot(filt_norm, reduction="umap", group.by="cluster_letters", label=TRUE)
plotted plotted
<- CellCycleScoring(
filt object = filt,
g2m.features = cc.genes$g2m.genes,
s.features = cc.genes$s.genes)
## Warning: The following features are not present in the object: MCM5, PCNA, TYMS,
## FEN1, MCM2, MCM4, RRM1, UNG, GINS2, MCM6, CDCA7, DTL, PRIM1, UHRF1, MLF1IP,
## HELLS, RFC2, RPA2, NASP, RAD51AP1, GMNN, WDR76, SLBP, CCNE2, UBR7, POLD3, MSH2,
## ATAD2, RAD51, RRM2, CDC45, CDC6, EXO1, TIPIN, DSCC1, BLM, CASP8AP2, USP1, CLSPN,
## POLA1, CHAF1B, BRIP1, E2F8, not searching for symbol synonyms
## Warning: The following features are not present in the object: HMGB2, CDK1,
## NUSAP1, UBE2C, BIRC5, TPX2, TOP2A, NDC80, CKS2, NUF2, CKS1B, MKI67, TMPO, CENPF,
## TACC3, FAM64A, SMC4, CCNB2, CKAP2L, CKAP2, AURKB, BUB1, KIF11, ANP32E, TUBB4B,
## GTSE1, KIF20B, HJURP, CDCA3, HN1, CDC20, TTK, CDC25C, KIF2C, RANGAP1, NCAPD2,
## DLGAP5, CDCA2, CDCA8, ECT2, KIF23, HMMR, AURKA, PSRC1, ANLN, LBR, CKAP5, CENPE,
## CTCF, NEK2, G2E3, GAS2L3, CBX5, CENPA, not searching for symbol synonyms
## Warning in AddModuleScore(object = object, features = features, name = name, :
## Could not find enough features in the object from the following feature lists:
## S.Score Attempting to match case...Could not find enough features in the object
## from the following feature lists: G2M.Score Attempting to match case...
VlnPlot(filt, features = c("S.Score", "G2M.Score"),
group.by = "orig.ident",
ncol = 4, pt.size = 0)
::pander(sessionInfo()) pander
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
locale: LC_CTYPE=en_US.UTF-8, LC_NUMERIC=C, LC_TIME=en_US.UTF-8, LC_COLLATE=en_US.UTF-8, LC_MONETARY=en_US.UTF-8, LC_MESSAGES=en_US.UTF-8, LC_PAPER=en_US.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.UTF-8 and LC_IDENTIFICATION=C
attached base packages: stats4, stats, graphics, grDevices, utils, datasets, methods and base
other attached packages: ggplot2(v.3.4.0), SeuratObject(v.4.1.3), Seurat(v.4.3.0), hpgltools(v.1.0), testthat(v.3.1.6), reticulate(v.1.26), SummarizedExperiment(v.1.28.0), GenomicRanges(v.1.50.2), GenomeInfoDb(v.1.34.6), IRanges(v.2.32.0), S4Vectors(v.0.36.1), MatrixGenerics(v.1.10.0), matrixStats(v.0.63.0), Biobase(v.2.58.0) and BiocGenerics(v.0.44.0)
loaded via a namespace (and not attached): ica(v.1.0-3), ps(v.1.7.2), Rsamtools(v.2.14.0), foreach(v.1.5.2), lmtest(v.0.9-40), rprojroot(v.2.0.3), crayon(v.1.5.2), rbibutils(v.2.2.11), MASS(v.7.3-58.1), nlme(v.3.1-161), backports(v.1.4.1), sva(v.3.46.0), GOSemSim(v.2.24.0), rlang(v.1.0.6), XVector(v.0.38.0), HDO.db(v.0.99.1), ROCR(v.1.0-11), irlba(v.2.3.5.1), nloptr(v.2.0.3), callr(v.3.7.3), limma(v.3.54.0), filelock(v.1.0.2), BiocParallel(v.1.32.5), rjson(v.0.2.21), bit64(v.4.0.5), glue(v.1.6.2), sctransform(v.0.3.5), pbkrtest(v.0.5.1), parallel(v.4.2.0), processx(v.3.8.0), vipor(v.0.4.5), spatstat.sparse(v.3.0-0), AnnotationDbi(v.1.60.0), DOSE(v.3.24.2), spatstat.geom(v.3.0-3), tidyselect(v.1.2.0), usethis(v.2.1.6), fitdistrplus(v.1.1-8), variancePartition(v.1.28.0), XML(v.3.99-0.13), tidyr(v.1.2.1), zoo(v.1.8-11), GenomicAlignments(v.1.34.0), xtable(v.1.8-4), magrittr(v.2.0.3), evaluate(v.0.19), Rdpack(v.2.4), cli(v.3.5.0), zlibbioc(v.1.44.0), rstudioapi(v.0.14), miniUI(v.0.1.1.1), sp(v.1.5-1), bslib(v.0.4.2), fastmatch(v.1.1-3), aod(v.1.3.2), treeio(v.1.22.0), shiny(v.1.7.4), xfun(v.0.36), pkgbuild(v.1.4.0), gson(v.0.0.9), cluster(v.2.1.4), caTools(v.1.18.2), tidygraph(v.1.2.2), KEGGREST(v.1.38.0), tibble(v.3.1.8), ggrepel(v.0.9.2), ape(v.5.6-2), listenv(v.0.9.0), Biostrings(v.2.66.0), png(v.0.1-8), future(v.1.30.0), withr(v.2.5.0), bitops(v.1.0-7), ggforce(v.0.4.1), plyr(v.1.8.8), GSEABase(v.1.60.0), pillar(v.1.8.1), gplots(v.3.1.3), cachem(v.1.0.6), GenomicFeatures(v.1.50.3), fs(v.1.5.2), clusterProfiler(v.4.6.0), vctrs(v.0.5.1), ellipsis(v.0.3.2), generics(v.0.1.3), devtools(v.2.4.5), tools(v.4.2.0), beeswarm(v.0.4.0), munsell(v.0.5.0), tweenr(v.2.0.2), fgsea(v.1.24.0), DelayedArray(v.0.24.0), fastmap(v.1.1.0), compiler(v.4.2.0), pkgload(v.1.3.2), abind(v.1.4-5), httpuv(v.1.6.7), rtracklayer(v.1.58.0), sessioninfo(v.1.2.2), plotly(v.4.10.1), GenomeInfoDbData(v.1.2.9), gridExtra(v.2.3), edgeR(v.3.40.1), lattice(v.0.20-45), deldir(v.1.0-6), utf8(v.1.2.2), later(v.1.3.0), dplyr(v.1.0.10), BiocFileCache(v.2.6.0), jsonlite(v.1.8.4), scales(v.1.2.1), graph(v.1.76.0), tidytree(v.0.4.2), pbapply(v.1.6-0), genefilter(v.1.80.2), lazyeval(v.0.2.2), promises(v.1.2.0.1), doParallel(v.1.0.17), R.utils(v.2.12.2), goftest(v.1.2-3), spatstat.utils(v.3.0-1), rmarkdown(v.2.19), cowplot(v.1.1.1), Rtsne(v.0.16), pander(v.0.6.5), downloader(v.0.4), uwot(v.0.1.14), igraph(v.1.3.5), survival(v.3.4-0), yaml(v.2.3.6), htmltools(v.0.5.4), memoise(v.2.0.1), profvis(v.0.3.7), BiocIO(v.1.8.0), locfit(v.1.5-9.7), graphlayouts(v.0.8.4), viridisLite(v.0.4.1), digest(v.0.6.31), assertthat(v.0.2.1), RhpcBLASctl(v.0.21-247.1), mime(v.0.12), rappdirs(v.0.3.3), RSQLite(v.2.2.20), yulab.utils(v.0.0.6), future.apply(v.1.10.0), remotes(v.2.4.2), data.table(v.1.14.6), urlchecker(v.1.0.1), blob(v.1.2.3), R.oo(v.1.25.0), splines(v.4.2.0), labeling(v.0.4.2), RCurl(v.1.98-1.9), broom(v.1.0.2), hms(v.1.1.2), colorspace(v.2.0-3), ggbeeswarm(v.0.7.1), aplot(v.0.1.9), ggrastr(v.1.0.1), sass(v.0.4.4), Rcpp(v.1.0.9), RANN(v.2.6.1), enrichplot(v.1.18.3), fansi(v.1.0.3), tzdb(v.0.3.0), brio(v.1.1.3), parallelly(v.1.33.0), R6(v.2.5.1), grid(v.4.2.0), ggridges(v.0.5.4), lifecycle(v.1.0.3), curl(v.4.3.3), minqa(v.1.2.5), leiden(v.0.4.3), jquerylib(v.0.1.4), PROPER(v.1.30.0), Matrix(v.1.5-3), qvalue(v.2.30.0), desc(v.1.4.2), RcppAnnoy(v.0.0.20), RColorBrewer(v.1.1-3), iterators(v.1.0.14), spatstat.explore(v.3.0-5), stringr(v.1.5.0), htmlwidgets(v.1.6.0), polyclip(v.1.10-4), biomaRt(v.2.54.0), purrr(v.1.0.0), shadowtext(v.0.1.2), gridGraphics(v.0.5-1), mgcv(v.1.8-41), globals(v.0.16.2), patchwork(v.1.1.2), spatstat.random(v.3.0-1), progressr(v.0.12.0), codetools(v.0.2-18), GO.db(v.3.16.0), gtools(v.3.9.4), prettyunits(v.1.1.1), dbplyr(v.2.2.1), R.methodsS3(v.1.8.2), gtable(v.0.3.1), DBI(v.1.1.3), highr(v.0.10), ggfun(v.0.0.9), tensor(v.1.5), httr(v.1.4.4), KernSmooth(v.2.23-20), stringi(v.1.7.8), vroom(v.1.6.0), progress(v.1.2.2), reshape2(v.1.4.4), farver(v.2.1.1), annotate(v.1.76.0), viridis(v.0.6.2), ggtree(v.3.6.2), xml2(v.1.3.3), boot(v.1.3-28.1), lme4(v.1.1-31), restfulr(v.0.0.15), readr(v.2.1.3), ggplotify(v.0.1.0), scattermore(v.0.8), bit(v.4.0.5), scatterpie(v.0.1.8), spatstat.data(v.3.0-0), ggraph(v.2.1.0), pkgconfig(v.2.0.3) and knitr(v.1.41)
message(paste0("This is hpgltools commit: ", get_git_commit()))
## If you wish to reproduce this exact build of hpgltools, invoke the following:
## > git clone http://github.com/abelew/hpgltools.git
## > git reset 911e7d4beebdc73267ec4be631a305574289efd3
## This is hpgltools commit: Tue Jan 17 10:36:44 2023 -0500: 911e7d4beebdc73267ec4be631a305574289efd3
<- paste0(gsub(pattern="\\.Rmd", replace="", x=rmd_file), "-v", ver, ".rda.xz")
this_save message(paste0("Saving to ", this_save))
## Saving to index-v202301.rda.xz
<- sm(saveme(filename=this_save)) tmp