April 14, 2015

HOWTO install Grid Engine on a multi-core Linux box to run GET_HOMOLOGUES

I post this in English so that all users of GET_HOMOLOGUES can benefit. It is inspired by a previous tutorial posted in scidom and by the expertise of David Ramírez from our provider SIE.

The aim of this entry is to demonstrate how to set up a Grid Engine queue manager on your local multi-core Linux box or server so that you can get the most out of that computing power during your GET_HOMOLOGUES jobs. We will install Grid Engine on its default path, /opt/, which requires superuser permissions. As I'll do this on Ubuntu, I will do 'sudo su' to temporarily get superuser privileges; this should be replaced simply with 'su' in other Linux flavours. If you don't have admin rights you can install it elsewhere instead.

1) Visit http://gridscheduler.sourceforge.net, create a new user and download the latest 64-bit binary to /opt:

$ cd /opt
$ sudo su 
$ useradd sgeadmin
$ wget -c http://dl.dropbox.com/u/47200624/respin/ge2011.11.tar.gz
$ tar xvfz ge2011.11.tar.gz
$ chown -R sgeadmin ge2011.11/
$ chgrp -R sgeadmin ge2011.11/

$ ln -s ge2011.11 sge

2) Set relevant environment variables in /etc/bash.bashrc [system-wide, can also be named /etc/bashrc] or alternatively in ~/.bashrc for a given user:

export arch=x86_64
export SGE_ROOT=/opt/sge
export PATH=$PATH:"$SGE_ROOT/bin/linux-x64"

And make these changes live:

$ source /etc/bash.bashrc
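Before moving on it may be worth confirming that the variables really are visible in your current shell. The following is only a minimal sketch: check_sge_env is a hypothetical helper name (not part of Grid Engine), and it assumes the bin/linux-x64 directory used in the exports above.

```shell
# Sketch: confirm that SGE_ROOT is set and that its bin directory is on PATH.
# check_sge_env is a hypothetical helper, not a Grid Engine command.
check_sge_env() {
    # Use the first argument if given, otherwise fall back to $SGE_ROOT.
    local root="${1:-${SGE_ROOT:-}}"
    if [ -z "$root" ]; then
        echo "SGE_ROOT is not set"
        return 1
    fi
    # Wrap PATH in colons so the directory matches only as a whole entry.
    case ":$PATH:" in
        *":$root/bin/linux-x64:"*)
            echo "ok: $root/bin/linux-x64 is on PATH" ;;
        *)
            echo "missing: $root/bin/linux-x64 is not on PATH"
            return 1 ;;
    esac
}

check_sge_env || true
```

If it reports a problem, open a new terminal or source the file again before continuing.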

3) Set your host name to anything but localhost by editing /etc/hosts so that the first line is something like this (localhost or 127.0.x.x IP addresses are not valid):   yourhost
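A quick way to see which address your host name currently maps to is sketched below. It is an illustration only: is_loopback is a hypothetical helper, and the script simply reads the first matching entry in /etc/hosts. On Ubuntu the host name is often mapped to 127.0.1.1 by default, which is exactly the case this step asks you to change.

```shell
# Sketch: report whether this machine's host name maps to a loopback address.
# is_loopback is a hypothetical helper: it succeeds for 127.x.x.x and ::1.
is_loopback() {
    case "$1" in
        127.*|::1) return 0 ;;
        *)         return 1 ;;
    esac
}

name=$(hostname 2>/dev/null || uname -n)

# Take the address of the first non-comment /etc/hosts line naming this host.
ip=$(awk -v h="$name" '!/^[[:space:]]*#/ {
        for (i = 2; i <= NF; i++) if ($i == h) { print $1; exit }
     }' /etc/hosts 2>/dev/null || true)

if [ -z "$ip" ]; then
    echo "$name is not listed in /etc/hosts"
elif is_loopback "$ip"; then
    echo "$name maps to $ip (loopback): edit /etc/hosts"
else
    echo "$name maps to $ip (not a loopback address)"
fi
```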

4) Install Grid Engine server from within /opt/sge, with all defaults except cluster name, which usually will be 'yourhost':
$ ./install_qmaster

5) Install Grid Engine client with all defaults:
$ ./install_execd

6) Optionally configure default all.q queue:
$  qconf -mq all.q
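One setting you may want to look at in all.q is the number of slots, since it limits how many jobs run at once. As a rough sketch, the lines below print the number of cores the system reports next to the slots line of the queue configuration; the comparison is my suggestion, not something the installer requires. The qconf call is skipped if the command is not on your PATH.

```shell
# Sketch: compare the machine's core count with the slots configured in all.q.
# nproc comes with GNU coreutils; getconf is used as a fallback.
cores=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
echo "CPU cores available: $cores"

if command -v qconf >/dev/null 2>&1; then
    # 'qconf -sq' shows a queue configuration; keep only the slots line.
    qconf -sq all.q | grep -E '^slots'
else
    echo "qconf not found on PATH"
fi
```

If the slots value is lower than the core count, it can be raised in the editor opened by 'qconf -mq all.q'.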

7) Add your host to list of admitted hosts:
$ qconf -as yourhost

$ exit
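As an optional sanity check once you are back in your normal user shell, the sketch below lists what Grid Engine knows about your host and queue. missing_cmds is a hypothetical helper; the qconf, qhost and qstat calls only run if all three commands are found on your PATH.

```shell
# Sketch: sanity-check the Grid Engine setup before launching real jobs.
# missing_cmds is a hypothetical helper that prints every command not on PATH.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

missing=$(missing_cmds qconf qhost qstat)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not on PATH (check step 2):" $missing
else
    qconf -ss     # submit hosts; 'yourhost' should be listed
    qconf -sql    # queue list; all.q is the default queue
    qhost         # execution hosts known to the scheduler
    qstat -f      # full status of every queue
fi
```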

You should now be done. A test will confirm this. Please open a new terminal and move to your GET_HOMOLOGUES installation folder:

$ cd get_homologues-x86-20150306
$  ./get_homologues.pl -d sample_buch_fasta -m cluster

If the job finishes successfully then you are indeed done. I hope this helps,

2 comments:

  1. I'd like to add a tutorial I wrote some time ago on using Sun Grid Engine (SGE) to run tasks on a cluster (in this case the cluster will be our multi-core computer): Sun Grid Engine Tutorial.

    And an extended version in Spanish in the book Bioinformática con Ñ, chapter 6, page 190.

  2. This has been updated in: