[ale] screen -d -m

Dow Hurst Dow.Hurst at mindspring.com
Wed Apr 12 22:57:22 EDT 2006


I found a nice trick.  I wanted to run a Python program called Modeller 
that generates protein structures from templates.  Not important to 
y'all, but the point is that somewhere inside it the program starts 
something that won't go into the background.  I was trying to start 
multiple instances across a cluster using a script.  Each node needed 
two ssh commands to start two instances of Modeller.  This is true 
embarrassing parallelism at its best!

My example:

#!/bin/bash
CWD=`pwd`
ssh node001a "cd $CWD/01;mod8v2 01_vary_loop.py &"
ssh node001a "cd $CWD/02;mod8v2 02_vary_loop.py &"
ssh node002a "cd $CWD/03;mod8v2 03_vary_loop.py &"

and so on up to 40.

Well, the first ssh statement would not go into the background with a 
"&" at the end, or a "2>&1 ./log &" at the end.  I tried escaping 
tricks like \& and such but finally realized that some part of the 
Python process was holding on to the terminal, so ssh would not 
complete and return.  I hope my explanation makes sense here!  Anyway, I 
found that using screen -d -m would start a detached virtual terminal 
for ssh to run in, so Modeller could run to completion while my script 
went on to start other instances.
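
For comparison, a common way to get a similar effect without screen is 
to detach all three standard streams of the remote command, since ssh 
generally waits for as long as the remote process keeps stdin, stdout, 
or stderr open.  The sketch below is only an illustration under that 
assumption and has not been tried with Modeller.  The DRY_RUN switch 
and the remote_cmd and start_instance helpers are made up for this 
example, and the log file name is only a placeholder.

```shell
#!/bin/bash
# Sketch of an alternative to screen: close off stdin, stdout and stderr on
# the remote side so ssh has nothing left to wait for.
# Assumption: the hang comes only from the remote process holding the ssh
# session's standard streams open. Not tested with Modeller.
CWD=$(pwd)
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands; set to 0 to really run ssh

remote_cmd() {
    # $1 = two-digit instance number, e.g. 01
    # Note the ";" after cd: with "&&" the whole list would be backgrounded in
    # a subshell that still holds the ssh streams open.
    echo "cd $CWD/${1}; nohup mod8v2 ${1}_vary_loop.py > log 2>&1 < /dev/null &"
}

start_instance() {
    # $1 = node name, $2 = two-digit instance number
    if [ "$DRY_RUN" = 1 ]; then
        echo "ssh $1 \"$(remote_cmd "$2")\""
    else
        ssh "$1" "$(remote_cmd "$2")"
    fi
}

start_instance node001a 01
start_instance node001a 02
start_instance node002a 03
```

With DRY_RUN left at 1 the script only prints the three ssh command 
lines, which makes it easy to check the quoting before running anything 
on the cluster.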

So I ended up with this, which works:

#!/bin/bash
CWD=`pwd`
screen -d -m ssh node001a "cd $CWD/01;mod8v2 01_vary_loop.py"
screen -d -m ssh node001a "cd $CWD/02;mod8v2 02_vary_loop.py"
screen -d -m ssh node002a "cd $CWD/03;mod8v2 03_vary_loop.py"
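
The same pattern can be generated in a loop rather than written out 40 
times.  This is a minimal sketch that assumes the node names continue as 
node003a, node004a, and so on, with two consecutive instances per node.  
Only node001a and node002a appear above, so the rest of the naming is a 
guess.  The DRY_RUN switch and the launch function are made up for this 
example.

```shell
#!/bin/bash
# Sketch: launch instances 01..40, two per node, each in a detached screen.
# Assumptions: nodes are named node001a, node002a, ... node020a, and
# instance N lives in directory N and runs on node (N+1)/2.
CWD=$(pwd)
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands; set to 0 to really launch

launch() {
    # $1 = instance number (1..40)
    local i=$1 inst node remote
    inst=$(printf '%02d' "$i")                       # 1 -> 01
    node=$(printf 'node%03da' $(( (i + 1) / 2 )))    # 1,2 -> node001a; 3,4 -> node002a
    remote="cd $CWD/$inst;mod8v2 ${inst}_vary_loop.py"
    if [ "$DRY_RUN" = 1 ]; then
        echo "screen -d -m ssh $node \"$remote\""
    else
        screen -d -m ssh "$node" "$remote"
    fi
}

for i in $(seq 1 40); do
    launch "$i"
done
```

Run as-is it prints the 40 command lines for inspection.  Setting 
DRY_RUN=0 starts the detached screen sessions for real.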

I thought this was a neat solution for my problem and wanted to share 
it.  By the way, the cluster here has ssh set up so that no password is 
required, and that is how the script can run without problems.  I'm sure 
there is a much better way to do this with the installed Torque and Maui 
but I haven't figured out the job templates to start using them yet.  
I'm still building my own stuff like this or using mpirun for the 
parallel-enabled programs.  Modeller is not parallelized, so most people 
use a single random number seed and run one instance of the program for 
a long time to generate lots of possible protein structure conformations 
for their target.  I wanted to turn the job around faster, so I use a 
different random number seed for each instance of Modeller and then use 
the whole cluster to run 40 instances at once.  I got 1000 structures in 
1 hour rather than taking a couple of days on 1 CPU.  I just hadn't 
realized screen could spawn a command in the background, pre-detached, 
and then exit when the command finishes.
Thanks,
Dow

 



