Some of you might have heard of our footprintDB collection, which is unusual in that it annotates DNA motifs from different sources together with their cognate transcription factors (TFs) and their interface residues. It was published in 2014 and is regularly updated and queried by users around the world, who usually perform interactive searches.
There is also a web services interface, which is quite useful but slow if you have many sequences to scan (see examples in the manual). Things are even worse if you have a complete genome or proteome, and that's exactly what Teshome Mulugeta, who's visiting the lab from Norway, needed to do.
Figure: ACE2 DNA motif, taken from http://floresta.eead.csic.es/footprintdb/index.php?motif=cb6f6b343b895dfa1c3776c99fbedda7
>1:ACE2 [Saccharomyces cerevisiae] libs:JASPAR;CISBP; motif:vTGCTGGtym;mCCAGCa; url
MDNVVDPWYINPSGFAKDTQDEEYVQHHDNVNPTIPPPDNYILNNENDDGLDNLLGMDYYNIDDLLTQELRDLDIPLVPSPKTGDGS
SDKKNIDRTWNLGDENNKVSHYSKKSMSSHKRGLSGTAIFGFLGHNKTLSISSLQQSILNMSKDPQPMELINELGNHNTVKNNNDDF
DHIRENDGENSYLSQVLLKQQEELRIALEKQKEVNEKLEKQLRDNQIQQEKLRKVLEEQEEVAQKLVSGATNSNSKPGSPVILKTPA
MQNGRMKDNAIIVTTNSANGGYQFPPPTLISPRMSNTSINGSPSRKYHRQRYPNKSPESNGLNLFSSNSGYLRDSELLSFSPQNYNL
NLDGLTYNDHNNTSDKNNNDKKNSTGDNIFRLFEKTSPGGLSISPRINGNSLRSPFLVGTDKSRDDRYAAGTFTPRTQLSPIHKKRE
SVVSTVSTISQLQDDTEPIHMRNTQNPTLRNANALASSSVLPPIPGSSNNTPIKNSLPQKHVFQHTPVKAPPKNGSNLAPLLNAPDL
TDHQLEIKTPIRNNSHCEVESYPQVPPVTHDIHKSPTLHSTSPLPDEIIPRTTPMKITKKPTTLPPGTIDQYVKELPDKLFECLYPN
CNKVFKRRYNIRSHIQTHLQDRPYSCDFPGCTKAFVRNHDLIRHKISHNAKKYICPCGKRFNREDALMVHRSRMICTGGKKLEHSIN
KKLTSPKKSLLDSPHDTSPVKETIARDKDGSVLMKMEEQLRDDMRKHGLLDPPPSTAAHEQNSNRTLSNETDAL
The header contains the internal accession number, the main TF name, the organism name, the source libraries, the DNA motifs (from JASPAR and CISBP in the example) and a URL where the full annotation and references are available.
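If you want to pull those header fields apart in a script, something like the following might do. It is only a minimal sketch in Python: the field layout is inferred from the single ACE2 header shown above, and the function name `parse_header` and the dictionary keys are made up for illustration, not part of footprintDB. Headers that deviate from this layout will raise an error rather than be guessed at.

```python
import re

# Layout assumed from the one example header above (not an official spec):
# >ACCESSION:NAME [ORGANISM] libs:A;B; motif:M1;M2; URL
HEADER_RE = re.compile(
    r"^>(?P<accession>[^:\s]+):(?P<name>.+?)\s+"
    r"\[(?P<organism>[^\]]+)\]\s+"
    r"libs:(?P<libs>\S*)\s+"
    r"motif:(?P<motifs>\S*)"
    r"(?:\s+(?P<url>\S+))?\s*$"
)


def _split_list(field):
    """Split a ';'-separated field, dropping the empty item after a trailing ';'."""
    return [item for item in field.split(";") if item]


def parse_header(line):
    """Parse one FASTA header line into a dict (hypothetical helper)."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError("header does not match the assumed layout: " + line)
    return {
        "accession": match.group("accession"),  # kept as a string
        "name": match.group("name"),
        "organism": match.group("organism"),
        "libs": _split_list(match.group("libs")),
        "motifs": _split_list(match.group("motifs")),
        "url": match.group("url"),  # None if the last field is absent
    }


if __name__ == "__main__":
    example = (">1:ACE2 [Saccharomyces cerevisiae] libs:JASPAR;CISBP; "
               "motif:vTGCTGGtym;mCCAGCa; url")
    record = parse_header(example)
    for key, value in record.items():
        print(key, value, sep="\t")
```

Note that the last field is returned exactly as it appears in the header, so for the excerpt above you simply get the literal text that sits in that position.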