2024-03-15

Hit Identification

Hit identification can be accomplished using one of three different methods:

1. Select an algorithm during data import

Select an algorithm from the drop-down during data import. Hits will be identified, and the Hit List name and description text fields will be enabled so you can register the hit list.

2. In the scatterplot/replot view, view and generate a hit list

When plotting or replotting, the tool icon provides the option to view a hit list. The hit list view provides the option to register and save the hit list.

3. Export data and use external data analysis to identify hits. Import the hit list.

After data import, annotated data can be exported for visualization in other software. A hit list can be compiled outside LIMS*Nucleus and then imported.

Scroll down to the hit list and import under the tools icon:


hitlist

A list of samples of interest
Must have a header row named “name”
One sample per line, no separator
Primarily used to cherry pick samples from plate to plate
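
As a minimal sketch, a file that follows these rules could be written and checked from the shell. The file name hits.txt, the sample IDs, and the helper check_hitlist are hypothetical and are not part of LIMS*Nucleus; substitute identifiers from your own plates.

```shell
# Write a hypothetical hit list: header row "name", then one sample ID per line.
cat > hits.txt <<'EOF'
name
SPL-1
SPL-2
SPL-3
EOF

# check_hitlist FILE (hypothetical helper)
# Succeeds only if the first line is exactly "name", there is at least one
# sample line, and every sample line is a single token with no comma.
check_hitlist() {
  awk 'NR == 1 { if ($0 != "name") bad = 1; next }
       NF != 1 || index($0, ",") { bad = 1 }
       END { exit (bad || NR < 2) }' "$1"
}

check_hitlist hits.txt && echo "hits.txt looks valid"
```

A check like this only guards the file format; whether each name exists in the database is still decided at import time.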

Layouts

LIMS*Nucleus makes use of the following definitions:

Sample: Item of interest being tracked by LIMS*Nucleus, i.e. the item in wells. Examples would be compounds, antibodies, bacterial clones, DNA fragments, siRNAs.

Target: the item with which the sample interacts, usually coated on the bottom of the microwell plate, e.g. the antigen for an antibody or the enzyme (target) of a compound.

When creating layouts there are three attributes that need to be defined:

Entity	Attribute
Sample	type, replication
Target	replication

LIMS*Nucleus supports 5 sample types:

Type	ID
unknown	1
positive control	2
negative control	3
blank	4
edge	5

LIMS*Nucleus has twenty pre-defined layouts installed at the time of system installation. Custom sample layouts can be defined and imported by administrators. A sample layout import file that defines four control wells at the bottom of column 7 looks like:

When viewed in the layout viewer, the above file would provide the following sample layout:

For every sample layout imported, an additional 5 layouts are created that define sample and target replication. These layouts are discussed in detail on the replication page.

Here is a sample layout import file that defines 8 controls in a 384 well plate, randomly scattered, excluding edge wells:

When reformatted into 1536, the layout will look like:


LIMS*Nucleus - Multi-Well Plate Management Software

LIMS*Nucleus is a software program used to manage multi-well plates in an academic or industrial environment. Functionality includes:

Generate 96, 384 or 1536 well plates with or without samples
Collect plates into plate sets
Group or split plate sets
Reformat plates - four 96 well plates into a 384 well plate; four 384 well plates into a 1536 well plate
Associate assay data with plate sets
Identify hits scoring in assays using included algorithms - or write your own
Export annotated data
Generate worklists for liquid handling robots
Rearray hits into a smaller collection of plates
Prototype algorithms, visualization with R/Shiny
Evaluate an online instance
Video overviews of features and capabilities

LIMS*Nucleus has a restricted set of features - multi-well plate management, hit identification, rearraying - and serves as the core of a larger system. Source code is available for modification. The architecture is simple client/server with no middleware or ORM. The client utilizes Bootstrap/Datatables and the database is PostgreSQL. The software is packaged as a Guix pack for easy installation/configuration. R/Shiny dashboards can be used to extend functionality.



Shiny Utilities

Mutation Visualization

Compare parental and mutant sequences

After performing error-prone PCR (random) or oligonucleotide (directed) mutagenesis you will want to visualize your sequences and determine the rate of mutation incorporation. A typical visualization is the stacked bar chart as in this figure from Finlay et al. JMB (2009) 388, 541-558:

To decode this graphic you must:

estimate the percentage of each amino acid by comparison to the Y axis
compare relative amino acid abundance by comparing the area of boxes
correlate color with amino acid identity
compare to the reference sequence at the bottom of the graph

An easier-to-interpret graphic would be a scatter plot of sequence index (i.e. residue position in the alignment) on the X axis vs frequency on the Y. The data points are the single letter amino acid code. Highlight the reference sequence with a red letter.

The first step is to align all sequences. Start with a multi-fasta file of all sequences:


$cat ./myseqs.fasta

>ref
GLVQXGGSXRLSCAASGFTFSSYAMSWVRQAPGKGLEWVSAISGSGGSTYY
ADSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCAKDHRRPKGAFDIWGQGTMVTVSS
GGGGSGGGGSGGGGSGQSALTQPASVSGSPGQSITISCTGTSSDVGAYNYVSWYQQYPGK
APKLMIYEVTNRPSGVSDRFSGSKSGNTASLTISGLQTGDEADYYCGTWDSSLSAVV
>BSA130618a-A01
glvxxggxxrlscasgftfssyamswvrqapgklewvsaisgsggstyysdsvkgrftissdnskntlylqmnslraedt
avyycakdhrrpkgafdiwgqgtmvtvssggggsggggsggggsgqsaltqprsvsgtpgqsviisctgtssdvggskyv
swyqqhpgnapkliiydvserpsgvsnrfsgsksgtsaslaitglqaedeadyycqsydsslvvf
>BSA130618a-A02
glvqpggxxrlscasgftfssyamswvrqapgkglewvsaisgsggstyyadsvkgrftisrdnskntlylqmnslraed
tavyycakdhrrpngafdiwgqgtmvtvssggggsggggsggggsgqsvvtqppsmsaapgqkvtiscsgsssnignnyv
swyqqlpgtapklliydnnkrpsxipdrfsgsksgtsatlitglqtgdeadyycgtwdsslsagvf
>BSA130618a-A03
glvqxggxxrlscasgftfssyamswvrqapgkglewvsaisgsggstyyadsvkgrftisrdnskntlylqmnslraed
tavyycakdhrrpkgafdiwgqgtmvtvssggggsggggsggggsgsyeltqppsvsvspgqtasitcsgsssniginyv
swyqqvpgtapklliyddtnrpsgisdrfsgsksgtsatlgitglqtgdeadyycgtwdsslsvvvf

Above I have labeled my parental reference sequence “ref”. Use clustalo to perform the alignment and request the output in “clustal” format. The clustalo command can be run from within R using the system command. Read the alignment file into a matrix:

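  library(seqinr)  # provides read.alignment()

  # input is the multi-fasta file shown above; output is the clustal alignment
  input.file <- paste( getwd(), "/myseqs.fasta", sep="")
  output.file <-  paste( getwd(), "/out.aln", sep="")
  # clustalo takes -i for the input file and -o for the output file
  system( paste("c:/progra~1/clustalo/clustalo.exe -i ", input.file, " -o ", output.file, " --outfmt=clustal", sep=""))

  # read the same file that clustalo just wrote
  seqs.aln <- as.matrix(read.alignment(file = output.file, format="clustal"))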

At each position determine the frequency of all 20 amino acids. Set up a second matrix that has one dimension as the length of the sequence and the other as 20 for each amino acid. This is the matrix that will hold the amino acid frequencies.

The R package “seqinr” provides a constant containing all single character amino acids as well as asterisk for the stop codon. Use this to name the rows of the frequency matrix.

library(seqinr)
levels(SEQINR.UTIL$CODON.AA$L)

[1] "*" "A" "C" "D" "E" "F" "G" "H" "I" "K" "L" "M" "N" "P" "Q" "R" "S" "T" "V"
[20] "W" "Y"

aas <- c(levels(SEQINR.UTIL$CODON.AA$L), 'X')
freqs <- matrix(  ncol=dim(seqs.aln)[2], nrow=length(aas))
rownames(freqs) <- aas

#Process through the matrix, calculating the frequency for each amino acid.
for( col in 1:dim(seqs.aln)[2]){
     for( row in 1:length(aas)){
          freqs[row, col] <- length(which(toupper(seqs.aln[,col])==aas[row]))/dim(seqs.aln)[1]
      }
}

Set up an empty plot for Frequency (Y axis) vs sequence index (X axis). Y range is 0 to 1, X range is one to the length of the aligned sequence, i.e. the number of columns in the frequency matrix. Plot frequencies >0 in black, using the single letter amino acid code as the plot character.

plot(1, type="n", xlab="Sequence Index", ylab="Frequency", xlim=c(1, dim(freqs)[2]), ylim=c(0, 1))
for( i in 1:length(aas)){
     points( which(freqs[i,]>0), freqs[i, freqs[i,]>0], pch=rownames(freqs)[i], cex=0.5)
}

Overlay the reference sequence in red.

# reference row, upper-cased so it matches the row names of freqs
ref <- toupper(seqs.aln[rownames(seqs.aln)=="ref",])
for(i in seq_along(ref)){
     # skip gaps and any symbol that has no row in freqs
     if( ref[i] %in% rownames(freqs) && freqs[ref[i], i] > 0){
          points( i, freqs[ref[i], i], pch=ref[i], cex=0.5, col="red")
     }
}

This is what it looks like (open in a new tab to see detail):

It’s easy to see which amino acid is parental, and its relative abundance to other amino acids is clear.
Consider position 61: N is the parental amino acid but T is now more abundant in the panel of mutants. K and S are the next most abundant amino acids.

Should multiple amino acids have the same or close to the same frequency, the graph can get cluttered and difficult to interpret. Adjusting the Y axis can help clarify amino acid identity. At each position percentages may not add up to 100 depending on the number of gaps. Consider the sequence “RFSGS” at positions 69-73 which is in a region containing gaps for some of the clones:
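
The effect of gaps on a column total can be sketched outside R with a toy example. The helper column_freqs and the five-clone column "RRK-R" are hypothetical; the helper only mirrors the calculation used for the frequency matrix, where each residue count is divided by the total number of sequences, so every gap lowers the column sum.

```shell
# column_freqs COLUMN (hypothetical helper)
# COLUMN is one alignment column written as a string, one character per
# sequence, with "-" for a gap. Prints "residue frequency" for each
# residue present, then the sum of those frequencies.
column_freqs() {
  printf '%s\n' "$1" | awk '{
    n = length($0)                      # number of sequences in the column
    for (i = 1; i <= n; i++) {
      c = toupper(substr($0, i, 1))
      if (c != "-") count[c]++          # gaps are not counted as residues
    }
    for (c in count) {
      printf "%s %.2f\n", c, count[c] / n
      total += count[c] / n
    }
    printf "sum %.2f\n", total
  }' | LC_ALL=C sort
}

column_freqs "RRK-R"
```

With one gap among five clones the residue frequencies sum to 0.80 rather than 1, which is the behaviour described for the gapped region.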

Installation

Edit your channels.scm file to include the labsolns channel

Once edited:


$guix pull
$guix package -i mutvis
$source $HOME/.guix-profile/etc/profile

##run the bash script

$mutvis.sh


Overview

Monoliths

LIMS (Laboratory Information Management Systems) can be broadly divided into two groups, monoliths and systems. The difference is less about functionality and more about architecture. A monolith is a large, all-inclusive application that maximizes automation and minimizes user intervention. Monoliths are very efficient when a process is standardized and unchanging.

Advantages

Full automation, maximum reduction in FTE requirements
Consistent, reproducible processing
Enhancements, upgrades, and training outsourced to the vendor
User groups provide resources for problem solving (bug fixes, add on components, help with problems)

Disadvantages

Cost
Many moving parts (database, ORM, web server, interface)
Complex - requires extensive training
Feature creep
Brittle - difficult to change in response to a changing process
Dependent on vendor for bug fixes and upgrades
Off-the-shelf solutions may not satisfy all requirements
May depend on obscure components (old programming languages, object database, image)
Custom solutions may be obsolete on delivery
Resistance to use



Workflows

General navigation

LIMS*Nucleus works with a nested hierarchy of entities. The object hierarchy can be navigated by clicking hyperlinks in the data tables. The left-hand menu items allow for global navigation. Since users are often concerned with only one project at a time, LIMS*Nucleus tracks the current (default) project, which is visible in the menu area. The default project can be changed by listing all projects (first menu item) and clicking into a project. The tools icon presents workflows associated with the visible entity, which often require selection of row(s) in the data table.


Download

Install LIMS*Nucleus using a Guix pack

A Guix pack installation is a simple installation suitable for users not interested in the Guix package manager. The detailed instructions below follow the same process automated by the install script found on the evaluate page, but provide additional diagnostic information.

The first step in the install script is setting up the database. Manually install the database using instructions on the postgres page.

If you are using the install script and it is not executable, change permissions:

chmod +x install-limsn-pack.sh

Download and unzip the archive

wget https://github.com/labsolns/labsolns/releases/download/v0.1.0p/limsn-0.1.0-pack.tar.gz
tar xf ./limsn-0.1.0-pack.tar.gz

Look at the contents of the bin directory

$ls ./bin
art         dropuser  guile-config           initdb                 oid2name           pg_controldata  pg_receivewal   pg_standby       pg_waldump  reindexdb
clusterdb   ecpg      guile-snarf            install-pg-aws-ec2.sh  pg_archivecleanup  pg_ctl          pg_recvlogical  pg_test_fsync    pgbench     start-limsn.sh
createdb    gnuplot   guile-tools            install-pg-aws-rds.sh  pg_basebackup      pg_dump         pg_resetwal     pg_test_timing   postgres    vacuumdb
createuser  guild     init-limsn-channel.sh  lnpg.sh                pg_checksums       pg_dumpall      pg_restore      pg_upgrade       postmaster  vacuumlo
dropdb      guile     init-limsn-pack.sh     load-pg.sh             pg_config          pg_isready      pg_rewind       pg_verifybackup  psql

Various *limsn*.sh scripts are needed to configure and start up LIMS*Nucleus. You can also use psql to diagnose the database.

Place $HOME/bin on $PATH


$export PATH="$HOME/bin${PATH:+:}$PATH"
$echo $PATH
/home/admin/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games

Note that in the above example I am using the admin account on AWS so $HOME == /home/admin
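
The export shown above prepends unconditionally, so running it again adds a second copy of the same directory. As a minimal sketch, an idempotent variant could look like the following; the function name path_prepend is hypothetical and is not one of the LIMS*Nucleus scripts.

```shell
# path_prepend DIR (hypothetical helper)
# Prepend DIR to PATH only if it is not already one of the entries.
path_prepend() {
  case ":$PATH:" in
    *":$1:"*) ;;                        # already present, leave PATH alone
    *) PATH="$1${PATH:+:}$PATH" ;;      # same expansion as the export above
  esac
}

path_prepend "$HOME/bin"
export PATH
```

Wrapping PATH in colons before matching means a directory is only treated as present when it matches a whole entry, not a substring of a longer one.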

Initialize LIMS*Nucleus

Initialize by executing ./bin/init-limsn-pack.sh. The function of the various scripts is described on the scripts page. Check that the $HOME/.config/limsn directory has been created and artanis.conf has been copied into the directory:

$ ls $HOME/.config/limsn
artanis.conf

Check that $HOME/.bashrc has been modified to include exports for LC_ALL, PATH, GUILE_LOAD_PATH, and GUILE_DBD_PATH.

$HOME/.bashrc

$cat $HOME/.bashrc
...
# sources /etc/bash.bashrc).
if ! shopt -oq posix; then
  if [ -f /usr/share/bash-completion/bash_completion ]; then
    . /usr/share/bash-completion/bash_completion
  elif [ -f /etc/bash_completion ]; then
    . /etc/bash_completion
  fi
fi
export LC_ALL="C"
export PATH=/home/admin/bin:/gnu/store/2rl49lcanmqn26s660dd85lv7pfn0ykb-limsn-0.1.0/bin:/home/admin/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
export GUILE_LOAD_PATH=/home/admin/gnu/store/rj0pzbki1m5hpcshs614mhkrgs2b3i9d-artanis-0.5.2/share/guile/site/3.0:/home/admin/gnu/store/780bll8lp0xvj7rnazb2qdnrnb329lbw-guile-json-3.5.0/share/guile/site/3.0:/home/admin/gnu/store/jmn100gjcpqbfpxrhrna6gzab8hxkc86-guile-redis-2.1.1/share/guile/site/3.0:/home/admin/gnu/store/3f0lv3m4vlzqc86750025arbskfrq05p-guile-dbi-2.1.8/share/guile/site/2.2
export GUILE_DBD_PATH=/home/admin/gnu/store/z5kilafxayw2kdvn3anw1shkqij17dqb-guile-dbd-postgresql-2.1.8/lib

Source .bashrc to make certain that all environment variables have been properly set:

$source $HOME/.bashrc
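
After sourcing, a quick check that the four variables named above are actually set can save time later. This is a minimal sketch; the function name check_limsn_env is hypothetical, and it only tests that each variable is non-empty, not that the paths it holds exist.

```shell
# check_limsn_env (hypothetical helper)
# Prints "missing: NAME" for each expected variable that is unset or empty
# and returns non-zero if any is missing.
check_limsn_env() {
  missing=0
  for var in LC_ALL PATH GUILE_LOAD_PATH GUILE_DBD_PATH; do
    eval "val=\${$var:-}"               # indirect lookup of the variable named in $var
    if [ -z "$val" ]; then
      echo "missing: $var"
      missing=1
    fi
  done
  return "$missing"
}

check_limsn_env || echo "some variables are not set; check \$HOME/.bashrc"
```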

Modify the artanis.conf configuration file

Critical parameters are described on the configuration page. You must have in hand IP addresses for the database and client.

sudo nano $HOME/.config/limsn/artanis.conf

Make sure the database is available and loaded

The psql command below will work on a local database - modify accordingly. You should find 10 preloaded projects. Projects 4-9 are empty and can be used for experimentation:

$psql -U ln_admin -h 127.0.0.1 lndb
psql (13.4, server 11.14 (Debian 11.14-0+deb10u1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

lndb=>select * from project;

id | project_sys_name |                  descr                   |     project_name     | sessions_id |            updated            
----+------------------+------------------------------------------+----------------------+-------------+-------------------------------
  1 | PRJ-1            | 3 plate sets with 2 96 well plates each  | With AR, HL          | 9999999999  | 2022-04-06 11:07:05.854606+00
  2 | PRJ-2            | 1 plate set with 2 384 well plates each  | With AR              | 9999999999  | 2022-04-06 11:07:06.825439+00
  3 | PRJ-3            | 1 plate set with 1 1536 well plate       | With AR              | 9999999999  | 2022-04-06 11:07:08.32809+00
  4 | PRJ-4            | description 4                            | MyTestProj4          | 9999999999  | 2022-04-06 11:07:13.364018+00
  5 | PRJ-5            | description 5                            | MyTestProj5          | 9999999999  | 2022-04-06 11:07:13.365426+00
  6 | PRJ-6            | description 6                            | MyTestProj6          | 9999999999  | 2022-04-06 11:07:13.367082+00
  7 | PRJ-7            | description 7                            | MyTestProj7          | 9999999999  | 2022-04-06 11:07:13.368542+00
  8 | PRJ-8            | description 8                            | MyTestProj8          | 9999999999  | 2022-04-06 11:07:13.370008+00
  9 | PRJ-9            | description 9                            | MyTestProj9          | 9999999999  | 2022-04-06 11:07:13.371613+00
 10 | PRJ-10           | 2 plate sets with 10 96 well plates each | Plates only, no data | 9999999999  | 2022-04-06 11:07:13.372956+00
(10 rows)

lndb=>

Start LIMS*Nucleus

Start in detached mode so you can close the terminal. To kill the process you will need to look up the PID and kill it. Start in regular mode to monitor any error messages.

$nohup start-limsn.sh &

To stop the server, press Ctrl-C in interactive mode. In detached mode, find the PID and kill it:

$ ps aux | grep artanis
admin    12479  2.9  6.0 154628 60944 pts/0    Sl   13:22   0:00 /gnu/store/cnfsv9ywaacyafkqdqsv2ry8f01yr7a9-guile-3.0.7/bin/guile \ /gnu/store/dfa7p2zvk4xlhaq1y3hsqkzpqd73ggni-artanis-0.5.2/bin/.art-real work -h0.0.0.0 --config=/home/admin/.config/limsn/artanis.conf
admin    12494  0.0  0.0   3084   880 pts/0    S+   13:22   0:00 grep artanis

$ kill -15 12479
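
The manual lookup above can be scripted. As a minimal sketch, the hypothetical helper artanis_pids below reads `ps aux`-style text on standard input and prints the PID column of every line that mentions artanis, skipping the grep line itself. It assumes the PID is the second whitespace-separated field, as in the listing above.

```shell
# artanis_pids (hypothetical helper)
# Reads ps aux output on stdin and prints the PID (field 2) of each
# artanis process, ignoring a "grep artanis" helper line.
artanis_pids() {
  awk '/artanis/ && !/grep artanis/ { print $2 }'
}

# Typical use (left commented out so that nothing is killed by accident):
#   ps aux | artanis_pids
#   kill -15 $(ps aux | artanis_pids)
```

Where it is installed, `pgrep -f artanis` should give a similar list without the extra filtering.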


Entities

Plate

Plates are one of three formats - 96, 384, or 1536 well
The plate system name in the format PLT-NNN is automatically assigned at creation
All plates are part of plate sets

Plates can be assigned a variety of types. Depending on the type, a plate may not contain samples. For example, assay plates are transient and discarded after data collection, so could not serve as the source for rearraying or replica plating.

Installed types are:

Type	Description	Contain samples?
assay	contain associated data	no
rearray	created during a reformat operation	yes
archive	designated for storage	yes
master	original plate of samples	yes
daughter	result of replica plating or grouping operations	yes
replicate	result of replica plating	yes

Plates are of various types - assay, rearray, glycerol, etc.
Plate types are to provide clarity to the user - no convention is enforced


Entities

PlateSet

Composed of plates
Specific to a project
All plates within a plate set must be of the same format (e.g. 96 well)
Plate sets can be merged together (different plate types OK)
When created, all plates in a plate set will be of the same plate type