Worklist

When a rearray workflow is executed, a robot worklist suitable for performing the physical rearraying of samples is generated automatically. The worklist is stored, and its existence is indicated in the “Worklist” column of the Plate Set view. Select the plate set containing the worklist of interest, then select the menu item ‘worklist’ under the tools icon:

The worklist will be displayed. Use the export buttons to select the desired export format:

The worklist will be generated and transferred to your spreadsheet application:

Provided install scripts

A variety of installation/configuration scripts for both the client and the PostgreSQL database server are provided as links on this web site or packaged with the LIMS*Nucleus client. Various scripts are described below. Scripts without hyperlinks are included in the install package.

Supplied Scripts

Name                     Description
install-limsn-ec2.sh     Full installation on AWS including web server, database server, and application software.
install-limsn-pack.sh    Install the LIMS*Nucleus client and optionally the database using a Guix pack (easiest install).
lnpg.tar.xz              Archive of SQL scripts for database configuration.
install-pg-aws-ec2.sh    Installation of the PostgreSQL database with LIMS*Nucleus tables, methods and example data. It is called by install-limsn-ec2.sh; run it directly only to reinstall the database after manual deletion.
install-pg-aws-rds.sh    Install the database on an AWS Relational Database Service (RDS) PostgreSQL instance.
start-limsn.sh           Start the client application software. Runs in detached mode so the terminal can be shut down.
init-limsn-pack.sh       Place $HOME on $PATH; modify $HOME/.bashrc; for use with a Guix pack.
init-limsn-channel.sh    Place $HOME on $PATH; modify $HOME/.bashrc; for use with a channel installation.
load-pg.sh               Load the database by running all SQL scripts at the command line.
lnpg.sh                  Run lnpg.scm, passing the parameters necessary to initialize the database.

Sequence evaluation

When processing sequences obtained from a vendor, it is useful to have an idea of how well the sequencing reactions worked, both in an absolute sense and relative to other recently obtained sequences in the same project. What follows is a method, independent of primary sequence, for evaluating a collection of sequences (e.g. an order from an outside vendor).

The first step is to align sequences by nucleotide index (ignoring the actual sequence). Start by listing the sequence files that will be read into lists. I use the list s.b to hold forward (5’ to 3’) sequences, and the list s.f to hold the reverse sequences (in the 5’ to 3’ orientation, as sequenced):

rm(list=ls(all=TRUE))
library(seqinr)

working.dir <- "B:/<my-working-dir>/"


back.files <- list.files( paste(working.dir, "back/", sep="" ))
for.files <- list.files( paste(working.dir, "for/", sep="" ))

> back.files[1:20]
[1] "MBC20120428a-A1-PXMF1.seq" "MBC20120428a-A10-PXMF1.seq"
[3] "MBC20120428a-A11-PXMF1.seq" "MBC20120428a-A12-PXMF1.seq"
[5] "MBC20120428a-A2-PXMF1.seq" "MBC20120428a-A3-PXMF1.seq"
[7] "MBC20120428a-A4-PXMF1.seq" "MBC20120428a-A5-PXMF1.seq"
[9] "MBC20120428a-A6-PXMF1.seq" "MBC20120428a-A7-PXMF1.seq"
[11] "MBC20120428a-A8-PXMF1.seq" "MBC20120428a-A9-PXMF1.seq"
[13] "MBC20120428a-B1-PXMF1.seq" "MBC20120428a-B10-PXMF1.seq"
[15] "MBC20120428a-B11-PXMF1.seq" "MBC20120428a-B12-PXMF1.seq"
[17] "MBC20120428a-B2-PXMF1.seq" "MBC20120428a-B3-PXMF1.seq"
[19] "MBC20120428a-B4-PXMF1.seq" "MBC20120428a-B5-PXMF1.seq"
>

Next determine the number of files read and create a list of that length to hold the sequences. Then read them in and inspect a sequence:

s.b <- list()
length(s.b) <- length(back.files)
s.f <- list()
length(s.f) <- length(for.files)

for(i in 1:length(back.files)){
  s.b[[i]] <- read.fasta(paste( working.dir, "back/", back.files[[i]], sep=""))
}

for(i in 1:length(for.files)){
  s.f[[i]] <- read.fasta(paste( working.dir, "for/", for.files[[i]], sep=""))
}

> getSequence(s.b[[2]])[[1]][1:200]
[1] "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "n" "c" "n" "n" "n"
[19] "g" "t" "c" "c" "a" "c" "t" "g" "c" "g" "g" "c" "c" "g" "c" "c" "a" "t"
[37] "g" "g" "g" "a" "t" "g" "g" "a" "g" "c" "t" "g" "t" "a" "t" "c" "a" "t"
[55] "c" "c" "t" "c" "t" "t" "c" "t" "t" "g" "g" "t" "a" "g" "c" "a" "a" "c"
[73] "a" "g" "c" "t" "a" "c" "a" "g" "g" "c" "g" "c" "g" "c" "a" "c" "t" "c"
[91] "c" "g" "a" "t" "a" "t" "t" "g" "t" "g" "a" "t" "g" "a" "c" "t" "c" "a"
[109] "g" "t" "c" "t" "c" "c" "a" "c" "t" "c" "t" "c" "c" "c" "t" "g" "c" "c"
[127] "c" "g" "t" "c" "a" "c" "c" "c" "c" "t" "g" "g" "c" "g" "a" "g" "c" "c"
[145] "g" "g" "c" "c" "g" "c" "c" "a" "t" "c" "t" "c" "c" "t" "g" "c" "a" "g"
[163] "g" "t" "c" "t" "a" "g" "t" "c" "a" "g" "a" "g" "c" "c" "t" "c" "c" "t"
[181] "a" "c" "a" "t" "a" "a" "t" "g" "g" "a" "t" "a" "c" "a" "a" "c" "t" "a"
[199] "t" "a"


Note that ambiguities are indicated with an “n”. The sequence evaluation involves counting the number of ambiguities at each index position. The expectation is that the first 25 or so positions will have a large number of ambiguities, falling to near zero by position 50. This is the run length required for the primer to anneal and the polymerase to begin incorporating nucleotides. Next follow 800-1200 positions with a near-zero ambiguity count; exactly how many depends on the sequencing quality. Towards the end of the run the ambiguities begin to rise as the polymerase loses energy. Finally the ambiguity count falls as the reads terminate.

Create a vector nbsum that will tally the count of ambiguities at a given index. Then process through each sequence and count, at each index, the number of ambiguities. The total count of ambiguities is entered into nbsum at the corresponding index position.

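# one counter per index position (same length as nfsum below)
nbsum <- vector( mode="integer", length=2000)

for( i in 1:length(s.b)){
  for( j in 1:length( getSequence(s.b[[i]])[[1]])){
    if( getSequence(s.b[[i]])[[1]][j] == "n") nbsum[j] <- nbsum[j] + 1
  }
}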

> nbsum[1:100]
[1] 167 168 168 168 168 166 163 164 160 153 149 142 131 150 135 125 120 111
[19] 99 93 79 80 59 51 61 52 48 38 26 20 17 17 22 20 18 14
[37] 16 15 13 11 14 16 23 21 13 12 6 7 5 6 9 5 3 3
[55] 1 4 3 1 1 3 3 3 2 0 1 0 0 0 0 1 1 1
[73] 5 5 12 14 21 25 24 28 29 31 21 20 8 8 7 10 4 2
[91] 2 3 5 1 3 0 3 1 1 0
>


x <- 1:1200
plot(nbsum[x])

Overlay the reverse reads in red.

nfsum <- vector( mode="integer", length=2000)

for( i in 1:length(s.f)){
  for( j in 1:length( getSequence(s.f[[i]])[[1]])){
    if( getSequence(s.f[[i]])[[1]][j] == "n") nfsum[j] <- nfsum[j] + 1
  }
}

points(nfsum[x], col="red")
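
The nested loops above can also be written without indexing each base individually. What follows is a minimal sketch, not the code used by the Shiny app. It assumes each read is available as a plain character vector with one base per element. The helper name count_ambiguities, its max_len parameter, and the extra coverage column are hypothetical additions for illustration.

```r
# Tally ambiguous calls ("n") at each index position across a list of reads.
# `reads` is a list of character vectors, one base per element.
# `coverage` counts how many reads reach each position, which helps to
# distinguish "few ambiguities" from "few reads left" near the end of a run.
count_ambiguities <- function(reads, max_len = max(lengths(reads))) {
  n_count  <- integer(max_len)
  coverage <- integer(max_len)
  for (bases in reads) {
    idx <- seq_len(min(length(bases), max_len))
    n_count[idx]  <- n_count[idx] + (tolower(bases[idx]) == "n")
    coverage[idx] <- coverage[idx] + 1L
  }
  data.frame(position = seq_len(max_len),
             n_count  = n_count,
             coverage = coverage)
}

# Small made-up example: three short reads of different lengths.
reads <- list(c("n", "n", "a", "c", "g"),
              c("n", "a", "c", "n"),
              c("N", "t", "g"))
res <- count_ambiguities(reads)
```

With the seqinr objects used above, the input list could be built with lapply(s.b, function(x) getSequence(x)[[1]]), and res$n_count then plays the role of nbsum.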

I have created a Shiny app that implements the above code. Download it here.

Installation

Edit your channels.scm file to include the labsolns channel.

Once edited:

$ guix pull
$ guix package -i seqeval
$ source $HOME/.guix-profile/etc/profile

## run the bash script

$ seqeval.sh

R / Shiny

Use R-Shiny to prototype algorithms and visualizations and extend LIMS*Nucleus. Below is a list of assay runs from Project 1. The assay run hyperlink transfers you to a Shiny dashboard that allows you to manipulate and visualize your data and generate a hit list.

ID      Name          Description
AR-1    assay_run1    PS-1 LYT-1;96;4in12
AR-2    assay_run2    PS-2 LYT-1;96;4in12
AR-3    assay_run3    PS-3 LYT-1;96;4in12

Simplifying Assumptions

  • Always use the three-character well name, e.g. A01, not A1.

  • Default to tab-delimited text. In some cases comma- or tab-delimited will be offered as an option. Proprietary formats are avoided.

  • Plates are always filled by column. Well number is derived from the order of filling.

  • Reformatting is performed in the “Z” pattern. Quadrants are numbered in the Z pattern.

  • Plate sets contain plates of the same format and layout.

  • Always import a full plate of data, even if the plate isn’t full, e.g. a data file for three 384-well plates should have 3*384=1152 rows even if the third plate isn’t full. Only control wells and unknown wells with samples will be processed.
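
The well naming and numbering conventions above can be sketched in R. This is an illustration under stated assumptions, not code taken from LIMS*Nucleus: the function names and the n_rows parameter (defaulting to 8 rows, i.e. a 96-well plate) are hypothetical, and well_name only covers plates with up to 26 rows (96 and 384 wells). The quadrant mapping additionally assumes that the four source plates are interleaved within each 2 x 2 block of the destination plate, which the text above does not state; only the 1 2 / 3 4 “Z” numbering comes from it.

```r
# Three-character well name, e.g. row 1, column 1 -> "A01" (not "A1").
well_name <- function(row, col) {
  sprintf("%s%02d", LETTERS[row], as.integer(col))
}

# Well number when a plate is filled by column:
# A01 = 1, B01 = 2, ... down the first column, then on to the next column.
well_number <- function(row, col, n_rows = 8) {
  (col - 1) * n_rows + row
}

# Inverse: well name for a given by-column well number.
well_from_number <- function(n, n_rows = 8) {
  row <- (n - 1) %% n_rows + 1
  col <- (n - 1) %/% n_rows + 1
  well_name(row, col)
}

# ASSUMPTION: quadrants 1..4 interleave in each 2 x 2 block of the
# destination plate, numbered 1 2 on the top row and 3 4 below (a "Z").
quadrant_dest <- function(row, col, quadrant) {
  c(row = 2 * row - (quadrant <= 2),
    col = 2 * col - (quadrant %% 2 == 1))
}
```

For example, well_number(1, 2) gives 9 for well A02 of a 96-well plate, and under the interleaving assumption quadrant_dest(1, 1, 4) places well A01 of the fourth source plate at row 2, column 2 (B02) of the destination plate.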

Split plate sets

Only plate sets can be split. You can think of splitting a plate set as a regrouping of plates within a plate set. Navigate into the plate set of interest containing the plates to be grouped and highlight the plates. Select Utilities/Group from the menu bar:

A dialog will open. Fill in the name and description for the new plate set. The plates must be of the same format and layout, which will be indicated in the dialog box. Select a plate type and press OK.

Systems

A systems approach involves the integration of multiple independent commercial and custom software products to work in unison towards a common goal. A systems approach provides flexibility by allowing individual components to be upgraded, or discarded and replaced, as requirements change.

Advantages

  • Flexible; can evolve as process evolves
  • Best of breed components can be used
  • Portability of knowledge (Spotfire, R, SQL)
  • Adaptable to containerization

Disadvantages

  • Components on different upgrade cycles
  • Components use different technologies with scattered expertise
  • Configuration challenges: missing libraries, auxiliary software
  • May depend on external network connectivity
  • User training can be challenging
  • Integration can be challenging

References

Microservices as innovation enablers best practices == common practices
Split the monolith
Trulia switches to “Islands”
A contrarian’s (with vested interests) view

Case study of monolith implementation: Why Doctors hate their computers Discusses feature creep and the “Tar Pit”

Proprietary IT give big companies their edge.

Rob Brigham, Amazon AWS senior manager for product management: “Now, don’t get me wrong. It was architected in multiple tiers, and those tiers had many components in them. But they’re all very tightly coupled together, where they behaved like one big monolith. Now, a lot of startups, and even projects inside of big companies, start out this way. They take a monolith-first approach, because it’s very quick, to get moving quickly. But over time, as that project matures, as you add more developers on it, as it grows and the code base gets larger and the architecture gets more complex, that monolith is going to add overhead into your process, and that software development lifecycle is going to begin to slow down.”

When computational pipelines go ‘clank’


Target Layouts

Target layouts define the pattern of targets coated on assay plates. The available patterns are described on the replication page. Observe the patterns under the “Target Pattern” column and note that singlicates, duplicates, and quadruplicates are the only allowed options. Target duplicates are always in the same column, while sample duplicates are in the same row. Before setting up a layout pattern, targets must be imported as described on the targets page. Alternatively, you can use the built-in generic targets Target1, Target2, etc. Note that assigning targets is not required; it is available only to allow merging with target information held in other systems.

To set up a layout, navigate into the project of interest and select the menu item Utilities/Targets/Create Target Layout. Provide a name and description, and select the level of replication desired. The dropdowns will be enabled as needed:

Once the layout is saved, it is available for use during reformatting or plate set creation. Note that the layout will only appear as an option when appropriate selections have been made, e.g. when replication is singlicates:

Targets

For a definition of target see the layouts page. Targets are primarily used to annotate data and assist with merging LIMS*Nucleus data with data from other systems. Defining targets is optional and if not done, generic “Target1”, “Target2” labels will be used in output. Using targets requires three steps:

  1. Register targets individually or, as an administrator, import them in bulk.
  2. Define target layouts.
  3. Apply layouts to plate sets.

Defining layouts only makes sense when creating assay plate sets. Apply the target layout during the reformatting step.

There are two methods of importing targets:

Bulk import by an administrator

Under the admin menu item select “Bulk target import”. A file chooser dialog will appear. Choose an import file with the format described below:

project	target	description	accession
1	muCD71	Mouse transferrin receptor	FHD8SU29
1	huCD71	Human transferrin receptor	JDHSU789
1	cynoCD71	Monkey transferrin receptor	KSIOW8H3
1	BSA	Bovine serum albumin	KEUI87YH
2	Lysozyme	Lysozyme	KDJFG98D
2	GAPDH	Glyceraldehyde Phosphate Dehydrogenase	KFIIOD09
2	ICAM4	ICAM 4 integrin	KL0OIE7U
2	IL21R	IL21 receptor	KOI89IUY

Here is an example target import file: targets200.txt

Column header spelling, capitalization, and order are critical. Indicate the project with which the target should be associated in column one. Import will fail if the project id is not in the database. For targets that should be available to all projects, place “NULL” (no quotes) in the first column. Only administrators can designate the target project id as NULL during bulk import. Note that there is currently no opportunity to update an accession at a later time should it be blank upon import.
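
Because the header is checked strictly, it can help to inspect a file before handing it to an administrator for import. The following is a minimal sketch of such a pre-check and does not reproduce the checks performed by LIMS*Nucleus itself. The function name check_target_file is hypothetical, and the rule that a project id is either a whole number or NULL is an assumption based on the example file above.

```r
# Expected header: spelling, capitalization, and order all matter.
required_cols <- c("project", "target", "description", "accession")

# Return a character vector of problems found (empty if none).
check_target_file <- function(path) {
  targets <- read.delim(path, colClasses = "character",
                        check.names = FALSE, quote = "")
  problems <- character(0)
  if (!identical(names(targets), required_cols)) {
    problems <- c(problems,
                  "header must be exactly: project, target, description, accession")
  } else {
    # ASSUMPTION: project ids are whole numbers, or NULL for all projects.
    ok  <- targets$project == "NULL" | grepl("^[0-9]+$", targets$project)
    bad <- is.na(ok) | !ok
    if (any(bad)) {
      problems <- c(problems,
                    sprintf("row %d: project must be an integer id or NULL",
                            which(bad)))
    }
  }
  problems
}

# Write a small tab-delimited example to a temporary file and check it.
example_path <- tempfile(fileext = ".txt")
writeLines(c("project\ttarget\tdescription\taccession",
             "1\tmuCD71\tMouse transferrin receptor\tFHD8SU29",
             "NULL\tBSA\tBovine serum albumin\tKEUI87YH"),
           example_path)
problems <- check_target_file(example_path)
```

The check does not query the database, so a project id that passes here can still cause the import to fail if that project does not exist.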

One at a time import by users

From the menu bar, select Targets/Add New Target to show all targets. At the top, use the tool button to navigate to the add target page:

Fill in the form. Press Submit. The target is associated with the current project and is only available within that project. Once targets have been registered, they can be used in a target layout.

Terminology

Sample: Item in a well. Could be antibody, small molecule, virus, antisense oligo, expression construct, etc.
Target: Material coated on or in an assay plate. Substance of interest that will interact with samples.
Rearray: Select random samples (hits) across a plate and place them in a new plate.
Reformat: Combine plates of one format into a higher density plate, e.g. collapse four 96-well plates into a 384-well plate.
Group: Combine two or more plate sets into one plate set; combine a subset of plates from a plate set into a new plate set
Format: Number of wells in a plate e.g. 96, 384, 1536.
Hit: A sample that surpasses an assay threshold.
Source: Plates from which samples are drawn.
Destination: Plates into which samples are deposited.
Plate set order: The assigned order of plates within a plate set. Order is visible in the client.
Required data: Data required for LIMS*Nucleus to function, e.g. plate layouts, assay types, well types.
Example data: Fake data that can be used to test LIMS*Nucleus functionality.