Making lemonade from lemons. Our vehicle depreciates faster than anyone else's, so buy one for your business and get the tax break.
STIHIE
NY celebrates relaxing abortion laws
“I’ve noticed that everyone who is for abortion has already been born.” – Ronald Reagan
Nick Land, Neoreaction, the Dark Enlightenment.
Overstimulation: video games, porn, drugs, EDM, movies/tv
Immigration
Visualization of immigration over the years 1820-2013.
Realtalker Of The Week: Steve King
Nice list of immigration relevant talking points.
Ten Theses on Immigration - Ross Douthat
The hidden costs of Immigration - Christopher Caldwell with research and data from George Borjas.
How Would a Billion Immigrants Change the American Polity? - Nathan Smith - Prognostications on open borders and immigration.
Experiment Manager
There is a lot of repetitious work involved in running experiments, not only in the lab but also computationally. Much of the computational drudgery can be minimized using the following approach.
First establish an on-disk convention for data organization and management. I organize by year/project/experiment. I assume experiments are of two types: either part of a project, or one-offs. In a drug development environment, projects would be specific therapeutic targets, e.g. “HER2” or “IL2”, and would be added to the “data” directory. Under each project would be individual experiments, so you can think of a project as a collection of related experiments.
One-offs, as with technology development or assay optimization experiments, are collected under “misc” (miscellaneous). Experiments are directly under misc with no project container.
--2015
  |--data
  |   |--Project1
  |   |   |--P1Experiment1
  |   |   |--P1Experiment2
  |   |   |--P1Experiment3
  |   |   |--P1Experiment4
  |   |   |--P1Experiment5
  |   |
  |   |--Project2
  |   |   |--P2Experiment1
  |   |   |--P2Experiment2
  |   |--Project3
  |   |--Project4
  |   |--Project5
  |
  |--misc
      |--Experiment1
      |--Experiment2
      |--Experiment3
      |--Experiment4
      |--Experiment5
An experiment, whether in “data” or in “misc” is further subdivided into four directories as illustrated below:
--2015
  |--data
      |--Project1
          |--P1Experiment1
              |--code
              |--input
              |--output
              |--results
code: scripts used to process data
input: raw data off instruments. Could be numerical, images, sequences or other
output: results of running the code
results: formatted results suitable for notebooks, slides in presentations etc.
--2015
  |--data
      |--Project1
          |--P1Experiment1
              |--code
              |   |--process-data.R
              |
              |--input
              |   |--instrument-data.txt
              |   |--annotation.txt
              |   |--image1.jpg
              |   |--image2.jpg
              |   |--sequences.fasta
              |
              |--output
              |   |--Proj1Exp1-out.txt
              |
              |--results
                  |--Proj1Exp1Data.xlsx
                  |--Proj1Exp1Slides.ppt
A consistent layout makes it easy to find a specific piece of information. Scripts know to read data from “../input/” and write to “../output”. When searching for processed information, look in results.
Setting up the work environment outlined above can be handled easily by an EMACS-lisp script. Lisp can create the directory structure and then populate it with template files, which are renamed as they are copied into the destination directory. Lisp can also write code directly into the R script. For example, you can write code to set the working directory and to set a prefix variable used to name files. All this can be attached to a function name that is launched when it is time to set up a new experiment.
To begin, choose a location on disk as I outline above. Starting with a top level directory that is the year helps to further categorize experiments. Create the “data” and “misc” subdirectories. You don’t need to create project directories as that will be managed by the lisp method. I will name the method “create-project”.
(defun create-project ( project exp script )
create-project is an interactive method and so must be identified as such with the interactive form. The interactive form MUST be the first statement in the body of create-project (only a docstring may come before it). create-project takes three arguments, for which the user will be prompted; the first may be left blank:
- Project name: an optional argument. Enter an existing project to have the experiment added to an existing project directory. Enter a novel project to have a new directory created under the “data” directory. Leave blank to have the experiment directory created under the “misc” directory, i.e. this experiment is not part of a project.
- experiment name: enter the experiment name using your personally established naming convention. For example, I use my initials followed by the date followed by some descriptive text, e.g. PL20150216pcr2. This entry will also be written to the R script to be used as a prefix for all created files.
- script number: an integer indicating which script to copy into the “code” directory
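The prompting described above could be sketched as follows. This is only an illustration, not the author's create-project: the name my-create-project-skeleton, the prompt strings, and the "~/2015/" root are assumptions, and the function only computes and reports the target directory without creating anything.

```elisp
;; Sketch only: names, prompts, and the root path are illustrative assumptions.
(defun my-create-project-skeleton (project exp script)
  "Prompt for PROJECT, EXP and SCRIPT and return the experiment directory.
PROJECT may be an empty string, in which case the experiment goes under misc."
  ;; `interactive' comes first in the body (only a docstring may precede it).
  ;; "s" reads a string, "n" reads a number; "\n" separates the prompts.
  (interactive "sProject (blank for misc): \nsExperiment name: \nnScript number: ")
  (let* ((root "~/2015/")               ; assumed top-level directory
         (parent (if (string= project "")
                     (concat root "misc/")
                   (concat root "data/" project "/")))
         (exp-dir (concat parent exp "/")))
    (message "Experiment %s (script %s) would be created in %s"
             exp script exp-dir)
    exp-dir))
```

Because the function is interactive, M-x my-create-project-skeleton asks the three questions in order; calling it from Lisp with explicit arguments skips the prompts.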
create-project not only creates directories but will copy and rename template files inserted into those directories. Next step is to set up the templates. Create a “templates” directory somewhere accessible:
--2015 |--data |--misc |--templates |--htc | |--htc1.R | |--htc2.R | |--htc3.R | |--ngs | |--ngs1.R | |--ngs2.R | |--ngs3.R |--mut | |--mut1.R | |--mut2.R | |--proj |--template1.xlsx |--template2.xlsx |--template3.xlsx |--template.pptx
My templates include a PowerPoint and an Excel template that will be renamed with the value of the “exp” variable, which is populated by the user when create-project is invoked. I also have a series of R scripts, some useful for high throughput cloning (htc), next gen sequencing (ngs), mutagenesis (mut) etc. These will be copied into the “code” subdirectory of my experiment. I don’t change the names of the R scripts, because I like to know what a script will do just by looking at its name.
First create the directories and add the powerpoint file to the results directory. The file is renamed using the value of “exp” queried from the user:
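A minimal sketch of this step, under stated assumptions: my-make-experiment-dirs is a hypothetical helper (not the author's code) that takes the experiment directory, the path of the PowerPoint template, and the experiment name as arguments.

```elisp
;; Sketch only: function name and argument layout are assumptions.
(defun my-make-experiment-dirs (exp-dir template-pptx exp)
  "Create the four standard subdirectories under EXP-DIR.
Copy TEMPLATE-PPTX into results/, renamed after EXP.  Return the new file name."
  (dolist (sub '("code" "input" "output" "results"))
    ;; Second argument t creates missing parents and tolerates existing dirs.
    (make-directory (expand-file-name sub exp-dir) t))
  (let ((target (expand-file-name (concat exp ".pptx")
                                  (expand-file-name "results" exp-dir))))
    ;; Third argument nil: signal an error rather than overwrite a file.
    (copy-file template-pptx target nil)
    target))
```

Refusing to overwrite is a deliberate choice here: rerunning the setup for an existing experiment fails loudly instead of replacing slides that may already contain results.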
Next we want to copy over the Excel and R script files, depending on the value of the integer entered by the user. Since there are potentially many (changing) scripts and Excel templates suitable for many different experiments, I print out an association list that I can refer to when creating a new experiment. The list might look like:
1. cytotoxicity
2. mutagenesis
3. ELISA
4. sequencing
5. cloning
Depending on what I am doing I enter the appropriate number and the proper scripts/excel files are copied over. The code uses a conditional statement to select amongst the possibilities:
(In retrospect, an association list might have been a cleaner option.)
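For illustration, the association-list variant might look like the sketch below. The pairings of numbers to template files are hypothetical (they are not taken from the author's setup), as are the names my-experiment-types and my-templates-for.

```elisp
;; Sketch only: which script and Excel template belong to which number
;; is an invented example, not the author's actual mapping.
(defvar my-experiment-types
  '((2 . ("mut/mut1.R" "proj/template1.xlsx"))
    (4 . ("ngs/ngs1.R" "proj/template2.xlsx"))
    (5 . ("htc/htc1.R" "proj/template3.xlsx")))
  "Hypothetical map from script number to (R-SCRIPT EXCEL-TEMPLATE).")

(defun my-templates-for (script)
  "Return (R-SCRIPT EXCEL-TEMPLATE) for the integer SCRIPT, or signal an error."
  (or (cdr (assq script my-experiment-types))
      (error "No templates registered for script number %s" script)))
```

Adding a new experiment type then means adding one entry to the list rather than another branch to a conditional.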
The above code uses a couple of custom methods: modify-file-with-wkdir and save-modified-file. First modify-file-with-wkdir:
(defun modify-file-with-wkdir ( the-file working-dir subdir exp)
Let’s look at the first few lines of one of the template R scripts:
rm(list=ls(all=TRUE))
In the region between the hashtags I use modify-file-with-wkdir to write R code that sets the working directory as well as the experiment id prefix. That way when I open the R script in EMACS, I don’t have to set these manually, which would involve looking up the location of the file. After Lisp has modified the file it looks like this:
#######################################
##line 10; wrkdir() inserted here
setwd("~/2015/data/HER2/MBC150330composite/")
exp.id <-"MBC150330composite"
########################################
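One way such a modification could be written is sketched below. It is not the author's modify-file-with-wkdir: the name my-insert-wkdir and its arguments are assumptions, and it locates the insertion point by searching for the marker comment rather than by line number.

```elisp
;; Sketch only: finds the marker comment by text search (an assumption;
;; the original may well have used a fixed line number instead).
(defun my-insert-wkdir (template-file dest-file working-dir exp)
  "Copy TEMPLATE-FILE to DEST-FILE, adding setwd() and exp.id lines.
The lines go directly after the marker comment \"wrkdir() inserted here\"."
  (with-temp-buffer
    (insert-file-contents template-file)
    (goto-char (point-min))
    (unless (search-forward "wrkdir() inserted here" nil t)
      (error "Marker comment not found in %s" template-file))
    ;; Insert at the end of the marker line so the new code starts on
    ;; its own line even if the marker is the last line of the file.
    (end-of-line)
    (insert (format "\nsetwd(\"%s\")" working-dir)
            (format "\nexp.id <- \"%s\"" exp))
    (write-region (point-min) (point-max) dest-file)
    dest-file))
```

Working in a temporary buffer leaves the template itself untouched, so the same template can be reused for every new experiment.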
The next method just saves the modified file.
(defun save-modified-file (working-dir subdir script-file-name)
Because create-project was declared interactive, I can use M-x create-project to launch the method. I could even bind it to a function key if I have any to spare.
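A key binding of that kind might look like this; the choice of F9 is arbitrary and assumes that key is free in your configuration.

```elisp
;; Assumes `create-project' is defined as an interactive command elsewhere.
;; <f9> is an arbitrary choice; pick any key that is unused in your setup.
(global-set-key (kbd "<f9>") #'create-project)
```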
In a future post I will show how to extend this system to manage a multi-step process.
Design forms using EMACS Widget library
Using the function sqlite-query, we can now interact programmatically with SQLITE. For end users, we need to design a user friendly GUI. The EMACS widget library is about as friendly as EMACS will get. There is very limited information on how to use the EMACS widgets. You can read about the widget library within EMACS info, or on the web. Ye Wenbin also has a useful tutorial. Should you find other resources, please let me know. Essentially you create a form with a widget for each field, such as a text field, dropdown, radio button etc. Once the user has populated the fields, a button assembles and submits the SQL statement. Here is a simple example with widgets for some of the fields in our “ab” database table:
(require 'widget)
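As a rough sketch of the general shape of such a form (not the author's add-antibody), the following creates a buffer with one text field and a submit button. The names my-ab-form and my-ab-name-widget are hypothetical, and the button only echoes the field value instead of building SQL.

```elisp
(require 'widget)
(require 'wid-edit)

(defvar my-ab-name-widget nil
  "Hypothetical variable holding the antibody-name field widget.")

(defun my-ab-form ()
  "Open a tiny example form with one text field and a submit button."
  (interactive)
  (switch-to-buffer "*ab form*")
  (kill-all-local-variables)
  (let ((inhibit-read-only t))
    (erase-buffer))
  (remove-overlays)
  (widget-insert "Add antibody\n\n")
  ;; Keep a handle on the field so the button can read its value later.
  (setq my-ab-name-widget
        (widget-create 'editable-field :size 20 :format "Name: %v" ""))
  (widget-insert "\n")
  (widget-create 'push-button
                 :notify (lambda (&rest _ignore)
                           (message "Name entered: %s"
                                    (widget-value my-ab-name-widget)))
                 "Submit")
  (widget-insert "\n")
  ;; Activate the widgets: install the keymap and finalize the fields.
  (use-local-map widget-keymap)
  (widget-setup))
```

In a real form the :notify function of the button would collect the values of all fields and hand the assembled statement to the database layer.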
To use the form, issue the command (add-antibody), which is also connected to one of the menu items discussed in the sqlite-mode section. Once the data has been entered, pressing the “submit” button assembles the SQL statement:
(setq sql-command (concat "INSERT INTO ab VALUES(NULL, '" (car (cdr (split-string selected-gene-id "\t"))) "','"
Note that any input through a text field is processed with chomp to remove whitespace.
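A possible shape for this processing is sketched below. The author's chomp is not shown, so my-chomp is a stand-in built on string-trim; my-sql-quote and my-insert-ab-sql are hypothetical helpers, and the two-column INSERT is a simplified assumption about the table. Doubling single quotes is added here because text-field input pasted directly between quotes would otherwise break the statement.

```elisp
(require 'subr-x)   ; provides `string-trim' on older Emacs versions

(defun my-chomp (str)
  "Return STR without leading or trailing whitespace."
  (string-trim str))

(defun my-sql-quote (str)
  "Trim STR, double any single quotes, and wrap it in single quotes."
  (concat "'" (replace-regexp-in-string "'" "''" (my-chomp str) t t) "'"))

(defun my-insert-ab-sql (name host)
  "Build an INSERT statement for a hypothetical two-column version of ab."
  (format "INSERT INTO ab VALUES(NULL, %s, %s);"
          (my-sql-quote name) (my-sql-quote host)))
```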
Handling Checkbox user input
Here is a truncated example showing 2 of 6 checkboxes on one of my forms:
(widget-create 'checkbox
First set up variables to hold the status of each element in the checkbox list, as well as a variable to hold the final semicolon-delimited string. I am showing all six variables.
;;antibody application variables
Next is the function to be invoked when the user clicks the “OK” or similar button on the form. Items that are selected can be consed onto a list for further processing. Here I simply concatenate the list using semicolon as a separator for the purpose of storage in the database and reporting.
(defun create-ab-applications-list ()
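The collect-and-join step might be sketched as follows. Instead of six separate variables, this illustration takes a list of (NAME . CHECKED) pairs; that representation, the function name my-join-selected, and the application names in the usage example are assumptions.

```elisp
;; Sketch only: the (NAME . CHECKED) pair representation is an assumption.
(defun my-join-selected (selections)
  "Return the names of checked boxes in SELECTIONS joined by semicolons.
SELECTIONS is a list of (NAME . CHECKED) pairs."
  (let (chosen)
    (dolist (pair selections)
      (when (cdr pair)
        (push (car pair) chosen)))
    ;; `push' builds the list in reverse, so restore the form order.
    (mapconcat #'identity (nreverse chosen) ";")))
```

For example, (my-join-selected '(("ELISA" . t) ("western" . nil) ("IHC" . t))) yields the string "ELISA;IHC", ready to be stored in a single database field.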
Populating dropdown lists on-the-fly from database content
Quite often you wish to give the user a prepopulated dropdown where the content comes from a field in the database. Here is an example of form code for a dropdown:
(widget-create 'menu-choice
The dropdown content is:
'(item "mouse")
Suppose rather than hard coding I want to pull out all the species available in the ab.host field in my database? Using my sqlite-query command, it would be easy enough to get the list:
( "mouse" "rabbit" "goat" "donkey" )
(setq sql-command "SELECT host FROM ab;")
But how do I convert that to the format shown above for dropdown content? The dropdown content shown above is equivalent to:
'(item "mouse") '(item "rabbit") '(item "goat") '(item "donkey")
which is equivalent to:
(quote (item "mouse")) (quote (item "rabbit")) (quote (item "goat")) (quote (item "donkey"))
Note that this is not a list, as it is not enclosed in parentheses. Since we want to generate code, a macro is the way to go. This is what it looks like:
(defmacro multi-item ()
I mapcar over my host-species list, creating the statement (quote (item "element")) for each element of the list, then splice that into my backquoted widget-create code. With this code, it is now easy to create a dropdown whose contents depend on other selections in the form, e.g. an initial dropdown, checkbox, or radio button. The notify statement of the initial selection should set a variable and then refresh the form, invoking the query that will populate the dependent dropdown list. You could, for example, make the second list visible only after the initial selection.
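As an alternative sketch of the same idea, the item specs can also be built by an ordinary function and handed to widget-create with apply, since widget-create evaluates its arguments. This is a different technique from the macro described above, not a reconstruction of it; my-host-items and my-host-dropdown are hypothetical names, and duplicates are removed on the assumption that the query may return the same host more than once.

```elisp
(require 'wid-edit)

(defun my-host-items (hosts)
  "Turn HOSTS, a list of strings, into a list of (item \"...\") widget specs.
Duplicate entries are dropped, keeping the first occurrence of each."
  (mapcar (lambda (host) (list 'item host))
          (delete-dups (copy-sequence hosts))))

(defun my-host-dropdown (hosts)
  "Create a menu-choice widget at point offering each entry of HOSTS."
  ;; `apply' spreads the generated item list into separate arguments,
  ;; which is what splicing with ,@ achieves inside a backquoted form.
  (apply #'widget-create 'menu-choice
         :tag "Host species"
         :value (car hosts)
         (my-host-items hosts)))
```

Here (my-host-items '("mouse" "rabbit" "mouse")) gives ((item "mouse") (item "rabbit")), which is the list of child widget specs that menu-choice expects after its keyword arguments.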
Though the widget library is crude by current standards, it will meet most needs either directly or with a hack.