#!/perl/bioinfo: TFmodeller

8 de febrero de 2018

Modelling transcription factor complexes in the terminal

Hi,
I just updated our good old server TFmodeller, available at http://www.ccg.unam.mx/tfmodeller,
so that it uses the current collection of 95% non-redundant protein-DNA complexes extracted from the Protein Data Bank. As of Feb 7, 2018, there are 977 such complexes, which can be downloaded.
In addition, I just wrote a Perl client so that predictions can be ordered from the terminal via a SOAP interface, producing XML output which should be easy to parse. The PDB format coordinates of the resulting model are marked-up with tags. The input is a peptide FASTA file. This is the code:

#!/usr/bin/perl -w
use strict;
use SOAP::Lite;

my $URL = 'http://maya.ccg.unam.mx:8080/axis';
my $WSDL = "$URL/TFmodellerService.jws?WSDL";

my $infile = $ARGV[0] || die "# usage: $0 \n";
my ($inFASTA,$result);
open(FASTA,'<',$infile) ||die "#cannot read $infile\n";
$/ = undef;
$inFASTA = ; # slurp
close(FASTA);

my $soap = SOAP::Lite->uri($URL)
                     ->proxy($URL, timeout => 300 )
                     ->service($WSDL);

eval { $result = $soap->TFmodeller($inFASTA) };
if($@){ die $@ }
else{ print $result }

The original Java client can still be found here. Note that the output includes a sequence alignment of query and template with residues contacting DNA nitrogen bases highlighted:

HEADER model 1zrf_A 203 DNACOMPLEX resol=2.10 21 8e-46
REMARK query    MILLLSKKNAEERLAAFIYNLSRRFAQRGFSPREFRLTMTRGDIGNYLGLTVETISRLLG
REMARK template KVGNLAFLDVTGRIAQTLLNLAKQ-PDAMTHPDGMQIKITRQEIGQIVGCSRETVGRILK
REMARK contacts ........................ ................*........***...*...

Bruno