#!/perl/bioinfo: output

So about to use samtools pileup or mpileup? Would you like to take a look at the file, instead of going straightforward for the BCF and bcftools automation?

Ok! this is what an SNP looks like!

contig_100029 698 C 39 .$.....,.,,,,.,,,.,.,..,......,,,,,,.,,^S.
contig_100029 699 A 38 .....,.,,,,.,,,.,.,..,......,,,,,,.,,.
contig_100029 700 A 38 .....,.,,,,.,,,.,.,..,......,,,,,,.,,.
contig_100029 701 C 39 TTTTTtTttttTtttTtTtTTtTTTTTTttttttTttT^ST
contig_100029 702 A 40 .....,.,,,,.,,,.,.,..,......,,,,,,.,,..^S,
contig_100029 703 G 41 .....,.,,,,.,,,.,.,..,......,,,,,,.,,.,.,
contig_100029 704 G 42 .....,.,,,,.,,,.,.,..,......,,,,,,.,,.,.,^S.

I have ommited the last column (read base qualities).

And what about a reference skip? Maybe want to see an spliced alignment? Here you are!

contig_100029 516 T 43 ,,,.,,...,.,,.................,.,,,,.,,,.,^S.

contig_100029 517 A 43 ,,,.,,...,.,,.................,.,,,,.,,,.,.

contig_100029 518 G 43 ,,,.,,...,.,,.................,.,,,,.,,,.,.

contig_100029 519 A 43 ,,,.,,...,.,,.................,.,,,,.,,,.,.

contig_100029 520 G 43 ,,,.,,...,.,,.................,.,,,,.,,,.,.

contig_100029 521 G 43 <<<><<>>><><<>>>>>>>>>>>>>>>>><><<<<><<<><>

contig_100029 522 T 43 <<<><<>>><><<>>>>>>>>>>>>>>>>><><<<<><<<><>

contig_100029 523 G 43 <<<><<>>><><<>>>>>>>>>>>>>>>>><><<<<><<<><>

contig_100029 524 A 43 <<<><<>>><><<>>>>>>>>>>>>>>>>><><<<<><<<><>

. . . . . . . . . . . . . . .

contig_100029 651 C 43 <<<><<>>><><<>>>>>>>>>>>>>>>>><><<<<><<<><>

contig_100029 652 A 43 <<<><<>>><><<>>>>>>>>>>>>>>>>><><<<<><<<><>

contig_100029 653 G 43 <<<><<>>><><<>>>>>>>>>>>>>>>>><><<<<><<<><>

contig_100029 654 G 44 ,,,.,,...,.,,.................,.,,,,.,,,.,.^S,

contig_100029 655 G 44 ,,,.,,..$.,.,,............$.....,.,,,,.,,,.,.,

contig_100029 656 C 42 ,,,.,,..,.,,................,.,,,,.,,,.,.,

contig_100029 657 G 42 ,,,.,,..,.,,................,.,,,,.,,,.,.,

Who wants an IGV when you can scroll up and down seeing how things pile down and up?
An it is memory efficient, right?
You just have to grep or awk filter your region of interest!

Do you want me to explain about columns?

1st: reference_name

2nd: position in reference

3rd: base on reference

4th: depth, or number of bases pilling-up over this reference position

5th: each symbol comes from a read, with some exceptions. Lets look at it:

"." and ",": matches! one in fwd strand, "," in reverse.

"$": end of read. Followed by mapping symbol of that read base ("." or "," for example). So, 2 symbols.

"^": start of read. Followed by mapping quality of the read (the MAPQ field in SAM format) and the mapping symbol of that base ("." or "," for example).

">" "<": reference skip. That is, the read still maps, but no in these reference bases. So probably an spliced alignment with part of the read aligning before the ">" and part after that. ">" fwd strand, "<" reverse.

6th: (not shown, just working with HQ bases ;)

Looking for more information? Ask please!

Por mucho que nos guste programar en Perl, a veces hay que recurrir a otros lenguajes más adecuados para propósitos concretos. Este es el caso de PHP cuando se trata de realizar interfaces web para nuestros scripts de Perl.
El primer problema que nos surge al pasar a PHP es cómo enviar y recibir datos a través de un formulario web. Los datos enviados los podemos usar para ejecutar en el servidor nuestro script de Perl y generar resultados.
A continuación muestro el ejemplo de un formulario de búsqueda de secuencias DNA o Proteínas:

 <?php  
 // Shows a search input data formulary  
 function showSearchForm(){  
      ?>  
      <h3>Search form:</h3></br>  
      <form enctype="multipart/form-data" action="<?php print $_SERVER['PHP_SELF'] ?>" method="POST">  
      Search name: <input type="text" name="search_name" size="40" maxlength="127"></br></br>  
      Email: <input type="text" name="user_email" size="40" maxlength="127"></br></br>  
      Search input type:  
      <input type="radio" name="search_type" value="stamp"> DNA  
      <input type="radio" name="search_type" value="blastp"> Protein  
      </br></br>  
      Exclude query from results <input type="checkbox" name="search_exclude" value="yes"></br></br>  
      Limit number of results per query: <input type="text" name="search_limit" size="2" maxlength="2"></br></br>  
      Query data or file:</br></br>  
      <textarea name="query_data" rows="10" cols="67"></textarea>  
      </br></br>  
      <input type="file" name="query_file" size="20"></br></br>  
      Genomes:   
      <select name="specie" class="selectbutton">  
      <option value="human">Homo sapiens</option>  
      <option value="mouse">Mus musculus</option>  
      <option value="arabidopsis">Arabidopsis thaliana</option>  
      <option value="rice">Oryza sativa</option>  
      </select>  
      </br></br>  
      <input type="submit" name="search_submitted" value="Search">  
      </form>  
      <?php  
 }  
 ?>

El siguiente paso consistirá en validar los datos introducidos para volver a mostrar el formulario si los datos fueran erróneos. Usaremos la variable especial $_SESSION para guardar los datos enviados con el formulario (almacenados en la variable $_POST). La variable $_SESSION nos permitirá acceder a su contenido si ejecutamos 'session_start()' al comienzo del código PHP. Antes de guardar los datos enviados, borraremos los contenidos anteriores de $_SESSION. Si en vez de enviar un archivo, hemos enviado datos en formato texto, guardaremos estos datos en un archivo temporal ($_SESSION['query_file']). Si faltan datos en algún campo del formulario, crearemos un mensaje de error en $_SESSION['errors'].

 <?php  
 // Validates search form data  
 function validateSearchData(){  
      // Stores in $_SESSION variable all the post data and in $_SESSION['errors'] if some important data is missing  
      // Clear previous $_SESSION values  
      unset($_SESSION['errors']);  
      unset($_SESSION['search_name']);  
      unset($_SESSION['user_email']);  
      unset($_SESSION['search_type']);  
      unset($_SESSION['search_exclude']);  
      unset($_SESSION['search_limit']);  
      unset($_SESSION['specie']);  
      unset($_SESSION['output_dir']);  
      unset($_SESSION['query_file']);  
      unset($_SESSION['query_data']);  
      (!empty($_POST['search_name'])) ? $_SESSION['search_name'] = trim($_POST['search_name']) : $_SESSION['errors'][] = 'name for the search';  
      (!empty($_POST['user_email'])) ? $_SESSION['user_email'] = trim($_POST['user_email']) : $_SESSION['errors'][] = 'email';;  
      (!empty($_POST['search_type'])) ? $_SESSION['search_type'] = $_POST['search_type'] : $_SESSION['errors'][] = 'search input type';  
      if (!empty($_POST['search_exclude'])){  
           ($_POST['search_exclude'] == 'yes') ? $_SESSION['search_exclude'] = 1 : $_SESSION['search_exclude'] = 0;  
      }  
      (!empty($_POST['search_limit'])) ? $_SESSION['search_limit'] = trim($_POST['search_limit']) : '';  
      (!empty($_POST['specie'])) ? $_SESSION['specie'] = $_POST['specie'] : '';  
      // Store the query data into a file inside the /tmp folder  
      if (!empty($_FILES['query_file']['name']) && !empty($_FILES['query_file']['tmp_name']) && $_FILES['query_file']['size']>0){  
           $_SESSION['query_file'] = $_FILES['query_file']['tmp_name'];  
      }elseif (!empty($_POST['query_data'])){  
           $_SESSION['query_data'] = $_POST['query_data'];  
           $_SESSION['query_file'] = '/tmp/'.$_SESSION['search_name'].".data";  
           $file_handle = fopen($_SESSION['query_file'], "w");  
           fwrite($file_handle, $_POST['query_data']);  
           fclose($file_handle);  
      } else{  
           $_SESSION['errors'][] = 'data or file to process';  
      }  
      if (isset($_SESSION['errors'])){  
           return;  
      }  
 }  
 ?>

Finalmente, falta introducir las dos funciones anteriores en un archivo (por ejemplo 'index.php') para que sean leídas e interpretadas por el servidor web. Si los datos del formulario son válidos se ejecutará un pequeño código de Perl, si no se volverá a mostrar el formulario indicando los datos que faltan.

 <?php  
 // Example file of a PHP search formulary with input data validation and script execution  
 session_start();  
 ?>  
 <html>  
 <head>  
 <title>Search Form</title>  
 </head>  
 <body>  
 <?php  
 if(!isset($_POST['search_submitted'])){  
      ?><h3>Search form:</h3><?php  
      showSearchForm();  
 } else {  
      validateSearchData();  
      if (!isset($_SESSION['errors'])){  
           executeScript();  
      } else {  
           ?><h3>Search form:</h3><?php  
           print '<font color="red"><b>Please specify: '.join(', ',$_SESSION['errors']).'.</b></font></br></br>';  
           unset($_SESSION['errors']);  
           showSearchForm();  
      }  
 }  
 ?>  
 </body>  
 </html>  
 <?php  
 // Shows an input data formulary filled with previous values  
 function showSearchForm(){  
      ?>  
      <form enctype="multipart/form-data" action="<?php print $_SERVER['PHP_SELF'] ?>" method="POST">  
      Search name: <input type="text" name="search_name" size="40" maxlength="127" <?php ($_SESSION['search_name']) ? print 'value="'.$_SESSION['search_name'].'"' : '' ?> ></br></br>  
      Email: <input type="text" name="user_email" size="40" maxlength="127" <?php ($_SESSION['user_email']) ? print 'value="'.$_SESSION['user_email'].'"' : '' ?> ></br></br>  
      Search input type:  
      <input type="radio" name="search_type" value="stamp" <?php ($_SESSION['search_type'] == 'stamp') ? print 'checked' : '' ?>> DNA  
      <input type="radio" name="search_type" value="blastp" <?php ($_SESSION['search_type'] == 'blastp') ? print 'checked' : '' ?>> Protein  
      </br></br>  
      Exclude query from results <input type="checkbox" name="search_exclude" value="yes" <?php ($_SESSION['search_exclude'] == 'yes') ? print 'checked' : print 'checked' ?>></br></br>  
      Limit number of results per query: <input type="text" name="search_limit" size="2" maxlength="2" <?php ($_SESSION['search_limit']) ? print 'value="'.$_SESSION['search_limit'].'"' : print 'value="10"'; ?> ></br></br>  
      Query data or file:</br></br>  
      <textarea name="query_data" rows="10" cols="67"><?php ($_SESSION['query_data']) ? print $_SESSION['query_data'] : '' ?></textarea>  
      </br></br>  
      <input type="file" name="query_file" size="20"></br></br>  
      Genomes:   
      <select name="specie" class="selectbutton">  
      <option value="human" <?php (!isset($_SESSION['specie']) || $_SESSION['specie'] == 'human') ? print 'selected' : '' ?>>Homo sapiens</option>  
      <option value="mouse" <?php ($_SESSION['specie'] == 'mouse') ? print 'selected' : '' ?>>Mus musculus</option>  
      <option value="arabidopsis" <?php ($_SESSION['specie'] == 'arabidopsis') ? print 'selected' : '' ?>>Arabidopsis thaliana</option>  
      <option value="rice" <?php ($_SESSION['specie'] == 'rice') ? print 'selected' : '' ?>>Oryza sativa</option>  
      </select>  
      </br></br>  
      <input type="submit" name="search_submitted" value="Search">  
      </form>  
      <?php  
 }  
 // Validates search form data  
 function validateSearchData(){  
      // Stores in $_SESSION variable all the post data and in $_SESSION['errors'] if some important data is missing  
      // Clear previous $_SESSION values  
      unset($_SESSION['errors']);  
      unset($_SESSION['search_name']);  
      unset($_SESSION['user_email']);  
      unset($_SESSION['search_type']);  
      unset($_SESSION['search_exclude']);  
      unset($_SESSION['search_limit']);  
      unset($_SESSION['specie']);  
      unset($_SESSION['output_dir']);  
      unset($_SESSION['query_file']);  
      unset($_SESSION['query_data']);  
      (!empty($_POST['search_name'])) ? $_SESSION['search_name'] = trim($_POST['search_name']) : $_SESSION['errors'][] = 'name for the search';  
      (!empty($_POST['user_email'])) ? $_SESSION['user_email'] = trim($_POST['user_email']) : $_SESSION['errors'][] = 'email';;  
      (!empty($_POST['search_type'])) ? $_SESSION['search_type'] = $_POST['search_type'] : $_SESSION['errors'][] = 'search input type';  
      if (!empty($_POST['search_exclude'])){  
           ($_POST['search_exclude'] == 'yes') ? $_SESSION['search_exclude'] = 1 : $_SESSION['search_exclude'] = 0;  
      }  
      (!empty($_POST['search_limit'])) ? $_SESSION['search_limit'] = trim($_POST['search_limit']) : '';  
      (!empty($_POST['specie'])) ? $_SESSION['specie'] = $_POST['specie'] : '';  
      // Store the query data into a file inside the /tmp folder  
      if (!empty($_FILES['query_file']['name']) && !empty($_FILES['query_file']['tmp_name']) && $_FILES['query_file']['size']>0){  
           $_SESSION['query_file'] = $_FILES['query_file']['tmp_name'];  
      }elseif (!empty($_POST['query_data'])){  
           $_SESSION['query_data'] = $_POST['query_data'];  
           $_SESSION['query_file'] = '/tmp/'.$_SESSION['search_name'].".data";  
           $file_handle = fopen($_SESSION['query_file'], "w");  
           fwrite($file_handle, $_POST['query_data']);  
           fclose($file_handle);  
      } else{  
           $_SESSION['errors'][] = 'data or file to process';  
      }  
      if (isset($_SESSION['errors'])){  
           return;  
      }  
 }  
 // Executes a Perl script and prints the output  
 function executeScript(){  
      $output = `perl -e 'print "Hello world\n"'`;  
      print '<pre>'.$output.'</pre>';  
 }  
 ?>

#!/perl/bioinfo

8 de octubre de 2013

What does an SNP look like?

27 de diciembre de 2010

Enviar y recibir datos de un formulario con PHP

Archivo del blog