---
title: "Notes on the computers in the lab."
author: "atb abelew@gmail.com"
date: "`r Sys.Date()`"
output:
  html_document:
    code_download: true
    code_folding: show
    fig_caption: true
    fig_height: 7
    fig_width: 7
    highlight: zenburn
    keep_md: false
    mode: selfcontained
    number_sections: true
    self_contained: true
    theme: readable
    toc: true
    toc_float:
      collapsed: false
      smooth_scroll: false
  rmdformats::readthedown:
    code_download: true
    code_folding: show
    df_print: paged
    fig_caption: true
    fig_height: 7
    fig_width: 7
    highlight: zenburn
    width: 300
    keep_md: false
    mode: selfcontained
    toc_float: true
  BiocStyle::html_document:
    code_download: true
    code_folding: show
    fig_caption: true
    fig_height: 7
    fig_width: 7
    highlight: zenburn
    keep_md: false
    mode: selfcontained
    toc_float: true
---

<style type="text/css">
body, td {
  font-size: 16px;
}
code.r{
  font-size: 16px;
}
pre {
 font-size: 16px
}
</style>

```{r options, include=FALSE}
library("hpgltools")
library("reticulate")
tt <- devtools::load_all("~/hpgltools")
knitr::opts_knit$set(
  width = 120, progress = TRUE, verbose = TRUE, echo = TRUE)
knitr::opts_chunk$set(error = TRUE, dpi = 96)
lua_filters <- rmarkdown::pandoc_lua_filter_args("pandoc-zotxt.lua")
old_options <- options(
  digits = 4, stringsAsFactors = FALSE, knitr.duplicate.label = "allow")
ggplot2::theme_set(ggplot2::theme_bw(base_size = 10))
rundate <- format(Sys.Date(), format = "%Y%m%d")
previous_file <- ""
ver <- format(Sys.Date(), "%Y%m%d")

##tmp <- sm(loadme(filename=paste0(gsub(pattern="\\.Rmd", replace="", x=previous_file), "-v", ver, ".rda.xz")))
rmd_file <- "template.Rmd"
```

# Slurm

I set up slurm in a fashion very similar to the cbcb cluster, in the
hope that it would provide a testing playground for getting
accustomed to working with a cluster before playing with the larger
umiacs system.

A few configuration files are important for this; they reside in
/etc/slurm, as one might guess.  The setup also makes heavy
assumptions about shared filesystems.  major serves these filesystems
to the other computers in the lab; the relevant configuration files
are /etc/exports as well as /etc/default/nfs*.  I also changed the
networking parameters (tcp window size, latency, etc) in
/etc/sysctl.conf on all hosts to hopefully improve the nfs
experience slightly.
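
A quick, read-only way to see what a host is currently using is to
read the values straight out of /proc/sys.  This is only a sketch: the
four keys below are common TCP buffer settings and are my assumption
about which parameters matter here, and `show_tcp_buffers` is a
made-up helper name.

```bash
# Hypothetical helper: print the current values of a few TCP buffer settings.
# The list of keys is an assumption, not a record of what was changed.
show_tcp_buffers() {
  local key path value
  for key in net.core.rmem_max net.core.wmem_max net.ipv4.tcp_rmem net.ipv4.tcp_wmem; do
    # sysctl keys map onto files under /proc/sys with dots replaced by slashes.
    path=/proc/sys/$(printf '%s' "$key" | tr . /)
    value=$(cat "$path" 2>/dev/null) || value=unavailable
    printf '%s = %s\n' "$key" "$value"
  done
}
```

Running `show_tcp_buffers` on major and on a client makes it easy to
spot a host where the sysctl.conf changes were never applied.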

# sshfs

I like to keep all the data at umiacs and play with it in a local
interactive session.  In order to do that I make heavy use of sshfs.
This is perhaps not the most appropriate way of working, but it does
work.  However, sshfs does not share locks over nfs, so if one wishes
to use this across all the computers in the lab, the sshfs connection
must be initialized separately on each node one wishes to use.
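
Since the mount has to be repeated on every node, a small wrapper
helps.  The following is a minimal sketch, not what is actually
installed: `mount_sshfs` is a hypothetical helper, and the remote
path and mount point in the usage line are placeholders.

```bash
# Hypothetical helper: mount a remote directory with sshfs unless it is
# already mounted. Set DRY_RUN=1 to print the command instead of running it.
mount_sshfs() {
  local remote=$1 target=$2
  if [ -z "${DRY_RUN:-}" ] && mountpoint -q "$target"; then
    echo "$target is already mounted"
    return 0
  fi
  # reconnect and the ServerAlive options help the mount survive network hiccups.
  ${DRY_RUN:+echo} sshfs "$remote" "$target" \
    -o reconnect,ServerAliveInterval=15,ServerAliveCountMax=3
}

# Usage with placeholder values:
# mount_sshfs user@host.example.org:/some/remote/dir /mnt/umiacs
```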

# zfs

We have two large local filesystems on major, currently named 'z1' and
'lab'.  'lab' uses eight 4TB ssd drives and 'z1' uses six 8TB ssd
drives.  I have been doing some reading about how to properly set up
zfs in linux and have come to the conclusion that I potentially did
some important things wrong, so I am going to recreate those
filesystems from scratch here and write down each step and my
reasoning.

## what I did wrong

The default sector size for writing data is 512 bytes.  It is set by
the parameter 'ashift', which is the base-2 logarithm of the sector
size in bytes (2^9 = 512) and is therefore 9.  This is autodetected by
zfs and is visible in a couple of ways:

```{bash find_ashift}
cat /sys/class/block/sdg/queue/physical_block_size
cat /sys/class/block/sdg/queue/logical_block_size
```

As you can see, 512 appears to be what the drives report.  This is
notable because many SSD drives report 512 when they actually use
4096, and if the filesystem assumes 512 it can potentially (and
definitely in the case of zfs) perform many more write operations per
chunk of data than it should (up to 8x, since 4096/512 = 8), which of
course is very bad when dealing with SSDs, which have an explicitly
finite number of available write operations before they die.

The reason they report 512 is to stay compatible with operating
systems which do not know how to address 4k.
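
The relationship between sector size and ashift is just a base-2
logarithm, which a little shell arithmetic can confirm.  This is only
a sketch; `ashift_for` is a made-up helper name and not part of zfs.

```bash
# Hypothetical helper: print the ashift (log2 of the sector size in bytes).
ashift_for() {
  local size=$1 shift_bits=0
  # Reject anything that is not a positive power of two.
  if [ "$size" -le 0 ] || [ $(( size & (size - 1) )) -ne 0 ]; then
    echo "not a power of two: $size" >&2
    return 1
  fi
  # Count how many times the size can be halved before reaching 1.
  while [ "$size" -gt 1 ]; do
    size=$(( size >> 1 ))
    shift_bits=$(( shift_bits + 1 ))
  done
  echo "$shift_bits"
}

ashift_for 512    # prints 9
ashift_for 4096   # prints 12
```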

Given this, I am going to recreate the filesystems on major to use 4k
sectors, for three reasons: we will need to replace drives sometime in
the reasonably near future, those replacement drives are increasingly
likely to be 4k, and many of our files are large and can therefore
benefit from larger sectors.  We just got a bunch of new disks, so I
can juggle the data in the interim.

The /z1 filesystem is much larger, so I am rsyncing all data from /lab
to it first.  Once complete, I will disconnect major from
cruzi/brucei, log in as the super-user and invoke:

```{bash destroy, eval=FALSE}
zpool status
umount -f -l /lab
zpool destroy lab
```
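
Before running the destroy step above, it seems prudent to confirm
that the copy really is complete.  One possibility is a checksum-based
rsync dry run, which lists anything that still differs.  This is a
sketch under assumptions: `copy_is_complete` is a hypothetical helper
and the destination directory in the usage line is a guess at where
the copy lives.

```bash
# Hypothetical helper: succeed only if a checksum-based dry run finds nothing
# left to copy from src to dest. Returns 2 if rsync itself fails.
copy_is_complete() {
  local src=$1 dest=$2 pending
  # -a archive, -n dry run, -c compare checksums, --itemize-changes lists differences.
  pending=$(rsync -anc --itemize-changes "$src"/ "$dest"/) || return 2
  [ -z "$pending" ]
}

# Usage with an assumed destination path:
# copy_is_complete /lab /z1/lab && echo "nothing left to copy"
```

Checksumming every file is slow on pools of this size, but it only
needs to happen once before the pool is gone for good.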

Ideally the zpool status command will remind me of the actual disks
used for the lab pool, but they are oddly all reported with the long
and annoying scsi names.  That is fine, but I am not 100% certain how
best to create the new pool with them.  The silly way, I guess, is to
note that z1 is currently composed of sdd, sdh, sdj, sdk, sdm, sdn;
root is sdo.  Therefore lab must be sda, sdb, sdc, sde, sdf, sdg, sdi,
sdl.
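
A less silly way may be to resolve the persistent names under
/dev/disk/by-id to their current sdX devices.  The sketch below
assumes the usual udev layout; `map_disk_ids` is a hypothetical helper
and its directory argument exists only so that it can be pointed
somewhere else for testing.

```bash
# Hypothetical helper: list "persistent-name -> device" for every symlink in a
# directory (default /dev/disk/by-id), skipping partition entries.
map_disk_ids() {
  local dir=${1:-/dev/disk/by-id} link
  for link in "$dir"/*; do
    [ -L "$link" ] || continue                      # only symlinks are of interest
    case "$link" in *-part[0-9]*) continue ;; esac  # skip partitions
    printf '%s -> %s\n' "$(basename "$link")" "$(readlink -f "$link")"
  done
}

# Usage on major (read-only):
# map_disk_ids | sort -k3
```

The by-id names can also be handed to zpool create in place of the sdX
names, which keeps the pool definition stable if the kernel enumerates
the disks in a different order after a reboot.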

If we assume those device names remain the same when I recreate the
fs, I will invoke:

```{bash new_zpool, eval=FALSE}
zpool create -o ashift=12 lab raidz sda sdb sdc sde sdf sdg sdi sdl
zfs set atime=off lab
zfs set sharenfs='rw=*' lab
zfs set compression=on lab
exportfs -ar
```
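
Once the pool exists it seems worth confirming that the properties
actually took effect.  The sketch below compares the scripted output
of `zfs get` or `zpool get` (the `-H -o property,value` form prints
tab-separated pairs) against expected values.  `check_props` is a
hypothetical helper, not a zfs command.

```bash
# Hypothetical helper: read "property<TAB>value" lines on stdin and compare
# them against "property=value" arguments. Returns non-zero on any mismatch.
check_props() {
  local input want prop value actual rc=0
  input=$(cat)
  for want in "$@"; do
    prop=${want%%=*}
    value=${want#*=}
    actual=$(printf '%s\n' "$input" | awk -F '\t' -v p="$prop" '$1 == p { print $2 }')
    if [ "$actual" != "$value" ]; then
      echo "mismatch: $prop is '${actual:-unset}', wanted '$value'" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage on major once the pool exists:
# zfs get -H -o property,value atime,compression lab | check_props atime=off compression=on
# zpool get -H -o property,value ashift lab | check_props ashift=12
```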

After the exportfs is complete I can reconnect brucei/cruzi.

I will then juggle the data back to /lab and repeat with the z1
filesystem which I think I will rename to 'scratch'.

```{bash z1, eval=FALSE}
zpool status
umount -f -l /z1
zpool destroy z1
zpool create -o ashift=12 scratch raidz sdd sdh sdj sdk sdm sdn
zfs set atime=off scratch
zfs set sharenfs='rw=*' scratch
zfs set compression=on scratch
```

Since I am likely to rename the filesystem, I will need to edit
/etc/exports and potentially /etc/fstab so that nfs will find the new
fs appropriately.  In addition I will need to edit the automount
configuration on brucei/cruzi.
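
To preview the /etc/exports change before touching the real file, the
old path can be rewritten on a copy of its contents and diffed.  This
is a sketch: `rename_export_path` is a hypothetical helper, and it
only handles lines which begin with the exported path, as exports
entries do.

```bash
# Hypothetical helper: rewrite the leading path of exports-style lines read
# from stdin (e.g. /z1 -> /scratch), leaving all other lines untouched.
rename_export_path() {
  local old=$1 new=$2
  # Only match the old path when followed by whitespace or a slash, so that
  # a different export such as /z10 is not rewritten by accident.
  sed -e "s#^${old}\([[:space:]/]\)#${new}\1#"
}

# Usage on major (prints a diff, changes nothing):
# rename_export_path /z1 /scratch < /etc/exports | diff /etc/exports -
```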

# Automounting

I use the default automounter on all hosts in the lab to acquire
filesystems.  Through some shenanigans, this can potentially include
the sshfs systems at umiacs, but that comes with some risks and
annoyances.  The relevant configuration files are in /etc/auto*

## auto.master.d

More explicitly, the main configurations reside in /etc/auto.master.d/,
with one config file for each kind of fs I set up: smb (for the
sequencer), nfs (for the synology), rclone (for fun), and sshfs (not
really used, but in theory useful).
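
For reference, each of these is a pair of files: a master map fragment
ending in .autofs, which names a mount root and a map file, and the
map itself, which lists one key per mountable filesystem.  The sketch
below writes such a pair for an nfs export.  Everything in it is
illustrative: `write_nfs_automount` is a hypothetical helper, and the
mount root, key, server name, export path and timeout are placeholders
rather than the values used in the lab.

```bash
# Hypothetical helper: write a master-map fragment and its indirect map for one
# nfs export. All names and paths passed in are placeholders.
write_nfs_automount() {
  local confdir=$1 mountroot=$2 key=$3 server=$4 export_path=$5
  mkdir -p "$confdir/auto.master.d"
  # Master map entry: <mount root> <map file> [options]
  printf '%s %s --timeout=300\n' "$mountroot" "$confdir/auto.nfs" \
    > "$confdir/auto.master.d/nfs.autofs"
  # Indirect map entry: <key> <mount options> <server>:<path>
  printf '%s -fstype=nfs,rw %s:%s\n' "$key" "$server" "$export_path" \
    > "$confdir/auto.nfs"
}

# Example with made-up values, written to a scratch directory rather than /etc:
# write_nfs_automount /tmp/autofs-demo /mnt/nfs synology nas.example /volume1/backup
```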

# The Synology

Ideally, this device should provide backups for all the raw data in
the lab as well as general backups for major.  We have had some minor
problems getting disks for it, so it only has the raw data currently.

It is connected to a little 4 port switch and uses the network
10.10.13.0/24.  As a result, major has a network configuration for it
in /etc/network/interfaces which looks like this:

<pre>
iface eno2np1 inet static
  address 10.10.13.10
  netmask 255.255.255.0
  post-up ip route add 10.10.13.0/24 dev eno2np1 src 10.10.13.10 table rt2
  post-up ip route add default via 10.10.13.1 dev eno2np1 table rt2
  post-up ip rule add from 10.10.13.10/32 table rt2
  post-up ip rule add to 10.10.13.10/32 table rt2
  post-up ip rule add to 10.10.13.13/32 table rt2
  post-up ip rule add to 10.10.13.100/32 table rt2

auto eno1np0
iface eno1np0 inet dhcp
  metric 2
</pre>

The ethernet cable associated with eno2np1 goes into the same 4 port
switch as the synology and the sequencer's datastore, and major
communicates with them via 10.10.13.0/24.  The eno1np0 cable gets dhcp
from the university.  The metric 2 is so that I can use the umiacs vpn.
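
The post-up lines refer to a routing table by the name rt2.  iproute2
resolves table names through /etc/iproute2/rt_tables, so the name
needs an entry there before the interface comes up.  The sketch below
checks for it; `has_rt_table` is a hypothetical helper, and the table
number 2 in the usage lines is an assumption.

```bash
# Hypothetical helper: succeed if a routing table name is defined in an
# rt_tables-style file (lines of "<number> <name>", '#' starts a comment).
has_rt_table() {
  local name=$1 file=${2:-/etc/iproute2/rt_tables}
  awk -v n="$name" '
    /^[[:space:]]*#/ { next }       # skip comment lines
    $2 == n { found = 1 }
    END { exit (found ? 0 : 1) }
  ' "$file"
}

# Usage on major (the table number is an assumption):
# has_rt_table rt2 || echo "2 rt2" | sudo tee -a /etc/iproute2/rt_tables
# ip rule show
# ip route show table rt2
```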

# The vpn

I use the following alias for communicating with the umiacs vpns:

<pre>
alias umiacsvpn='printf "umTd0mhp2tr.iacs\npush" | sudo openconnect \
  --passwd-on-stdin --user=abelew --protocol=nc --useragent \
  "Pulse-Secure/9.1.11.6725" --os=win vpn.umiacs.umd.edu'
</pre>
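
A variant which keeps the password out of the shell configuration
would read it from a file that only I can read.  This is a sketch
under assumptions: `vpn_stdin` is a hypothetical helper and the
default password file path is made up.

```bash
# Hypothetical helper: emit the two lines openconnect reads on stdin (the
# password, then the second-factor method) using a password stored in a file.
vpn_stdin() {
  local passfile=${1:-$HOME/.config/umiacsvpn.pass}
  printf '%s\npush' "$(cat "$passfile")"
}

# Usage, mirroring the alias above (the password file path is an assumption):
# vpn_stdin | sudo openconnect --passwd-on-stdin --user=abelew --protocol=nc \
#   --useragent "Pulse-Secure/9.1.11.6725" --os=win vpn.umiacs.umd.edu
```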

```{r saveme}
pander::pander(sessionInfo())
message(paste0("This is hpgltools commit: ", get_git_commit()))
this_save <- paste0(gsub(pattern = "\\.Rmd", replace = "", x = rmd_file), "-v", ver, ".rda.xz")
message("Saving to ", this_save)
tmp <- sm(saveme(filename = this_save))
```
