#!/perl/bioinfo: articles

February 24, 2020

Put your plant knowledge at the service of UniProt

Hi,
this morning I heard Michele Magrane explain how the literature curation process works at UniProt, the world's most important protein collection, developed jointly by the EBI, PIR and SIB. In case you don't know, curation is the process by which experts extract information and experimental evidence from articles and attach it to protein sequences in a traceable way.

After explaining that the core (SwissProt) barely exceeds half a million sequences and that the automated part is approaching 180 million, she told us they believe the curation process is scalable, as described in this article, given that only a very small percentage of published articles (3%) are useful for annotating their proteins.

As for plants, she mentioned that they mainly curate articles on Arabidopsis thaliana and Oryza sativa, in that order.

Happily, it is possible to suggest articles for UniProt proteins, and in that way contribute to their annotation and curation by experts. All you need is an ORCID identifier and a published article that you have read and that helps describe the protein in question. In the figure you will see the "Add a publication" link at the top right:


https://www.uniprot.org/uniprot/Q5Y386/protvista

Let's get to work,
Bruno

March 4, 2013

PubReader, PubMed's nod to tablets with HTML5 and CSS3

PubMed, the best online database of scientific articles, and specifically its free full-text subsection PubMed Central (PMC), has started the year with some technological novelties. One of the things I like most about PubMed, and about the NCBI databases in general, besides being free to query, is their sobriety, their reliability and their resistance to adding frills to the web interface. I think the maxim 'if it works well, don't try to change it' applies to PubMed.

However, the other day, while looking up our new article in PubMed, I was surprised to see on the right an image of an iPad, similar to the one Amazon typically shows with its Kindle. The image was titled 'Click here to read article using PubReader', and of course I tried it to see what it was. I was pleasantly surprised to find that it is PubReader, a new tool for reading articles comfortably on touch screens and tablets.

PubReader is built with HTML5 and CSS3. Its display scheme is very simple: it reads the PMC article in XML format and transforms it into HTML which, together with some JavaScript and CSS, simulates a book reader in the style of the Kindle or the new Windows 8 Reader. We can also download the PubReader source code to build our own online book reader.
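To give a rough idea of the XML-to-HTML step, here is a minimal Python sketch. It does not reproduce PubReader's code: the toy document only borrows a few tag names from the JATS format used by PMC (article-title, body, sec, title, p), and the function name article_to_html and the HTML layout are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Toy, heavily simplified JATS-like article (an assumption, not a real PMC record)
TOY_ARTICLE = """<article>
  <front><article-meta><title-group>
    <article-title>A toy article</article-title>
  </title-group></article-meta></front>
  <body>
    <sec><title>Introduction</title><p>First paragraph &amp; more.</p></sec>
    <sec><title>Results</title><p>Second paragraph.</p></sec>
  </body>
</article>"""


def article_to_html(xml_text):
    """Turn the toy XML article into a bare HTML fragment (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    title = root.findtext(".//article-title", default="Untitled")
    parts = ["<article>", f"<h1>{escape(title)}</h1>"]
    # One <section> per top-level <sec>, with its heading and paragraphs
    for sec in root.iterfind("./body/sec"):
        parts.append("<section>")
        heading = sec.findtext("title")
        if heading:
            parts.append(f"<h2>{escape(heading)}</h2>")
        for p in sec.iterfind("p"):
            # itertext() flattens any inline markup inside the paragraph
            parts.append(f"<p>{escape(''.join(p.itertext()))}</p>")
        parts.append("</section>")
    parts.append("</article>")
    return "\n".join(parts)


html_page = article_to_html(TOY_ARTICLE)
print(html_page)
```

In a real reader, the paging and book-like layout would then be handled by CSS and JavaScript applied on top of HTML like this.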

December 29, 2011

Jornadas Bioinformáticas JBI 2012 (11th Edition)

I am using this last post of the year to spread the word about the Jornadas de Bioinformática JBI 2012. As many of you know, this conference is the main annual meeting point for our community in the Iberian Peninsula, so our lab will also be in Barcelona from January 23 to 25. The full program can be downloaded from this link.

http://sgu.bioinfo.cipf.es/jbi2012

This year we will present part of our recent work:

"Genome-wide clustering of transcription factors by comparison of predicted protein-DNA interfaces"

where we explain and evaluate the annotation of DNA recognition interfaces in protein sequences using different approaches such as BLAST, TFmodeller, DP-Bind and DISIS.

The main theme of the symposium will be "Genome architecture, annotation and design", around which the latest advances in integrating biology, medicine and informatics within genomics will be discussed. The following topics will also be covered:
- Analysis of high-throughput sequencing data (NGS)
- Structural bioinformatics
- Algorithms for computational biology and high-performance computing
- Sequence analysis, phylogenetics and evolution
- Databases, tools and technologies in computational biology
- Bioinformatics in transcriptomics and proteomics
- Systems biology

ENGLISH:

The XIth Spanish Symposium on Bioinformatics (JBI2012) will take place on January 23-25, 2012 in Barcelona, Spain. It is co-organised by the Spanish Institute of Bioinformatics and the Portuguese Bioinformatics Network and hosted by the Barcelona Biomedical Research Park (PRBB). The full program can be downloaded from this link.

This year, the reference topic is “Genome Architecture, Annotation and Design” for which the conference will provide the opportunity to discuss the state of the art for the integration of the fields of biology, medicine and informatics. We invite you to submit your work and share your experiences in the following topics of interest including, but not limited to:

- Analysis of high throughput data (NGS)
- Structural Bioinformatics
- Algorithms for computational biology and HPC
- Sequence analysis, phylogenetics and evolution
- Databases, Tools and technologies for computational biology
- Bioinformatics in Transcriptomics and Proteomics
- Systems and Synthetic Biology

Our contribution to the congress:

Genome-wide clustering of transcription factors by comparison of predicted protein-DNA interfaces

Transcription Factors (TFs) play a central role in gene regulation by binding to DNA target sequences, mostly in promoter regions. However, even for the best annotated genomes, only a fraction of these critical proteins have been experimentally characterized and linked to some of their target sites. The dimension of this problem increases in multicellular organisms, which tend to have large collections of TFs, sometimes with redundant roles, that result from whole-genome duplication events and lineage-specific expansions. In this work we set out to study the repertoire of Arabidopsis thaliana TFs from the perspective of their predicted interfaces, to evaluate the degree of DNA-binding redundancy at a genome scale. First, we critically compare the performance of a variety of methods that predict the interface residues of DNA-binding proteins, those responsible for specific recognition, and measure their sensitivity and specificity. Second, we apply the best predictors to the complete A. thaliana repertoire and build clusters of transcription factors with similar interfaces. Finally, we use our in-house footprintDB to benchmark to what extent TFs in the same cluster specifically bind to similar DNA sites. Our results indicate that there is substantial overlap of DNA binding specificities in most TF families. This observation supports the use of interface predictions to construct reduced representations of TF sets with common DNA binding preferences.
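As a small illustration of how the sensitivity and specificity of an interface predictor could be measured, here is a Python sketch. It uses the standard textbook definitions and assumes a per-residue evaluation; the abstract does not state the exact protocol, and the function name and the example positions are invented for illustration.

```python
def sensitivity_specificity(predicted, observed, sequence_length):
    """Compare predicted interface positions with observed ones.

    predicted, observed: sets of 0-based residue positions (hypothetical data).
    sequence_length: total number of residues in the protein.
    """
    all_positions = set(range(sequence_length))
    tp = len(predicted & observed)                  # interface residues found
    fn = len(observed - predicted)                  # interface residues missed
    fp = len(predicted - observed)                  # wrongly predicted residues
    tn = len(all_positions - predicted - observed)  # correctly ignored residues
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Invented example: a 20-residue protein with 4 true interface residues
observed = {3, 4, 7, 11}
predicted = {3, 4, 8, 11, 15}
sens, spec = sensitivity_specificity(predicted, observed, 20)
print(sens, spec)
```

In this toy case three of the four true interface residues are recovered and two of the sixteen non-interface residues are wrongly flagged.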

September 22, 2011

Regulation by exogenous dietary microRNAs?

Hi,
one of the main topics we study in our lab is the regulation of gene expression, which as you know can take place at several levels.
Not long ago our colleague Lorenz Bülow told us, in a seminar at the EEAD, about the integration of transcriptional and post-transcriptional regulation in the model plant Arabidopsis thaliana, organized in the relational database AthaMap. Today I came across a recent article in Cell Research in which the authors report evidence for the presence of about 30 rice microRNAs in blood samples from human populations and livestock in China. The article, which starts from massive sequence sampling and then confirms the results by PCR, convincingly suggests that some microRNAs expressed in rice grain, a staple of the diet of the populations studied, can regulate gene expression in those who eat it. The article will surely be put to the test in further analyses and studies to validate it unequivocally, because these observations reveal two facts that will no doubt have a great impact:
1) nucleic acids can be transferred from plants to the mammals that eat them, despite digestion
2) gene regulation through the diet may occur in nature without hormones as mediators, directly through microRNAs, molecules of around 22 ribonucleotides in length, capable of crossing the epithelia of the digestive tract

source: http://mcb.berkeley.edu/labs/he/Research.htm

I recommend reading the original source, and if you have anything to add please use the comments. Best regards,
Bruno

September 15, 2011

A field guide to sequencing technologies

Hi,
yesterday I came across a review on the Web, published in May 2011 by TC Glenn, that contains the following table, very useful for comparing at a glance the second-generation sequencing platforms currently available:

Original table published in Molecular Ecology Resources

This table is complemented by others available in the 'NGS Field Guide', which are updated regularly and include, for example, the costs and the most frequent error types of each platform. In fact, we will have to wait for empirical data on the typical errors of the IonTorrent platform, which for now are based on data provided by the company (prior to its publication in Nature last July).
Until next time, Bruno

September 13, 2010

Jornadas Bioinformáticas JBI 2010 (10th Edition), our lab will be there...

The Jornadas Bioinformáticas are the must-attend annual event for Spanish bioinformaticians. This year the tenth edition will be held from October 27 to 29 in Torremolinos (Málaga). It is organized by the Universidad de Málaga, the Instituto Nacional de Bioinformática and the Portuguese Bioinformatics Network. This year's central theme is "Bioinformatics applied to personalized medicine", around which the integration of biology, medicine and informatics for the development of more specific and effective therapies will be discussed. However, this will not be the only topic: results and experiences in other fields will also be shared:
- Data analysis for high-throughput techniques such as next-generation sequencing
- Structural bioinformatics
- Algorithms for computational biology and high-performance computing techniques
- Sequence analysis, phylogenetics and evolution
- Databases, tools and technologies for computational biology
- Bioinformatics in transcriptomics and proteomics
- Synthetic and systems biology

IN ENGLISH:

The Xth Spanish Symposium on Bioinformatics (JBI2010) will take place on October 27-29, 2010 in Torremolinos-Málaga, Spain. It is co-organised by the National Institute of Bioinformatics-Spain and the Portuguese Bioinformatics Network and hosted by the University of Malaga (Spain).

This year, the reference topic is “Bioinformatics for personalized medicine” for which the conference will provide the opportunity to discuss the state of the art for the integration of the fields of biology, medicine and informatics. We invite you to submit your work and share your experiences in the following topics of interest including, but not limited to:
- Analysis of high throughput data (NGS)
- Structural Bioinformatics
- Algorithms for computational biology and HPC
- Sequence analysis, phylogenetics and evolution
- Databases, Tools and technologies for computational biology
- Bioinformatics in Transcriptomics and Proteomics
- Systems and Synthetic Biology

Our contributions

Our lab will take part in the Jornadas Bioinformáticas with three contributions, which I present below:

3D-footprint: a database for the structural analysis of protein–DNA complexes (paper)
The relation between amino-acid substitutions in the interface of transcription factors and their recognized DNA motifs
101DNA: a set of tools for Protein-DNA interface analysis

3D-footprint: a database for the structural analysis of protein–DNA complexes
3D-footprint is a living database, updated and curated on a weekly basis, which provides estimates of binding specificity for all protein–DNA complexes available at the Protein Data Bank. The web interface allows the user to: (i) browse DNA-binding proteins by keyword; (ii) find proteins that recognize a similar DNA motif and (iii) BLAST similar DNA-binding proteins, highlighting interface residues in the resulting alignments. Each complex in the database is dissected to draw interface graphs and footprint logos, and two complementary algorithms are employed to characterize binding specificity. Moreover, oligonucleotide sequences extracted from literature abstracts are reported in order to show the range of variant sites bound by each protein and other related proteins. Benchmark experiments, including comparisons with expert-curated databases RegulonDB and TRANSFAC, support the quality of structure-based estimates of specificity. The relevant content of the database is available for download as flat files and it is also possible to use the 3D-footprint pipeline to analyze protein coordinates input by the user. 3D-footprint is available at http://floresta.eead.csic.es/3dfootprint with demo buttons and a comprehensive tutorial that illustrates the main uses of this resource.

The relation between amino-acid substitutions in the interface of transcription factors and their recognized DNA motifs

Transcription Factors (TFs) play a key role in gene regulation by binding to DNA target sequences. While there is a vast literature describing computational methods to define patterns and match DNA regulatory motifs within genomic sequences, the prediction of DNA binding motifs (DBMs) that might be recognized by a particular TF is a relatively unexplored field. Numerous DNA-binding proteins are annotated as TFs in databases; however, for many of these orphan TFs the corresponding DBMs remain uncharacterized. Standard annotation practice transfers DBMs of well known TFs to those orphan protein sequences which can be confidently aligned to them, usually by means of local alignment tools such as BLAST, but these predictions are known to be error-prone. With the aim of improving these predictions, we test whether the knowledge of protein-DNA interface architectures and existing TF-DNA binding experimental data can be used to generate family-wise interface substitution matrices (ISUMs). An experiment with 85 Drosophila melanogaster homeobox proteins demonstrates that ISUMs: i) capture information about the correlation between the substitution of a TF interface residue and the conservation of the DBM; ii) are valuable to evaluate TF alignments and iii) are better classifiers than generic amino-acid substitution matrices and than BLAST E-values when deciding whether two aligned homeobox proteins bind to the same DNA motif.

101DNA: a set of tools for Protein-DNA interface analysis

Analysis of protein-DNA interfaces has shown a strong structural dependency. Despite the observation that related proteins tend to use the same pattern of amino acid and base contacting positions, no simple recognition code has been found. While protein contacts with the sugar-phosphate backbone of DNA provide stability and yield very little specificity information, contacts between amino acid side-chains and DNA bases (direct readout) apparently define specificity, in addition to some constraints defined by DNA sequence-dependent features, namely indirect readout.
Recent approaches have proposed bipartite graphs as a structural way of analysing interfaces from a protein-DNA-centric viewpoint. With this perspective in mind, we have developed a set of tools for the dissection and comparison of protein-DNA interfaces. Taking a protein-DNA complex file in PDB format as input, the software generates a 2D matrix that represents a bipartite graph of residue contacts obtained after applying a simple distance threshold that captures all non-covalent interactions. The generated 2D matrices allow a fast and simple visual inspection of the interface and have been successfully produced for the current non-redundant set of protein-DNA complexes in the 3D-footprint database.
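The idea of a residue-by-base contact matrix can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the 101DNA code: PDB parsing is left out, atoms are given as hand-made (label, coordinates) pairs, and the 4.5 angstrom cutoff is a placeholder because the abstract does not give the actual threshold.

```python
import math


def contact_matrix(protein_atoms, dna_atoms, threshold=4.5):
    """Build a binary residue-by-base contact matrix (bipartite graph).

    protein_atoms, dna_atoms: lists of (label, (x, y, z)) tuples.
    threshold: distance cutoff in angstroms (assumed value, for illustration).
    matrix[i][j] is 1 when any atom of residue i lies within the cutoff
    of any atom of base j.
    """
    residues = sorted({label for label, _ in protein_atoms})
    bases = sorted({label for label, _ in dna_atoms})
    matrix = [[0] * len(bases) for _ in residues]
    for res, p_xyz in protein_atoms:
        for base, d_xyz in dna_atoms:
            if math.dist(p_xyz, d_xyz) <= threshold:
                matrix[residues.index(res)][bases.index(base)] = 1
    return residues, bases, matrix


# Invented coordinates: two residues and three bases placed along the x axis
protein_atoms = [("ARG10", (0.0, 0.0, 0.0)), ("ARG10", (1.0, 0.0, 0.0)),
                 ("LYS25", (10.0, 0.0, 0.0))]
dna_atoms = [("G3", (3.0, 0.0, 0.0)), ("C4", (12.0, 0.0, 0.0)),
             ("T5", (30.0, 0.0, 0.0))]
residues, bases, matrix = contact_matrix(protein_atoms, dna_atoms)
for res, row in zip(residues, matrix):
    print(res, row)
```

Rows are residues and columns are bases, both in sorted label order, so each 1 in the matrix is an edge of the bipartite graph.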
As a second utility to compare two interfaces, the 101DNA software includes an alignment tool where a dynamic programming matrix is created with the Local Affine Gap algorithm and traced back as a finite state automaton. The scores between pairs of interface amino acid residues are calculated as a function of the observed contacts with DNA nitrogen bases. This tool produces local interface alignments which are independent of the underlying protein sequence, but that faithfully represent the binding architecture. Preliminary tests show that these local alignments successfully identify binding interfaces that share striking similarity despite belonging to different protein superfamilies, and these observations support this graph-theory approach.
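For readers unfamiliar with local alignment under affine gap penalties, the following Python sketch shows the general technique with three dynamic programming states (match, gap in one interface, gap in the other) and a state-by-state traceback. It is not the 101DNA implementation: each interface residue is reduced to the set of base types it contacts, and the scoring function and gap penalties are assumptions chosen only to keep the example small.

```python
NEG = float("-inf")


def contact_score(a, b):
    """Hypothetical score: reward shared contacted bases, penalise none."""
    shared = len(a & b)
    return 2 * shared if shared else -1


def local_affine_align(xs, ys, gap_open=-2, gap_extend=-1):
    """Local alignment with affine gaps over two lists of contact sets."""
    n, m = len(xs), len(ys)
    # M: xs[i-1] paired with ys[j-1]; X: xs[i-1] against a gap; Y: ys[j-1] against a gap
    M = [[0] * (m + 1) for _ in range(n + 1)]
    X = [[NEG] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG] * (m + 1) for _ in range(n + 1)]
    best, best_cell = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = contact_score(xs[i - 1], ys[j - 1])
            M[i][j] = max(0, s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]))
            X[i][j] = max(M[i - 1][j] + gap_open, X[i - 1][j] + gap_extend)
            Y[i][j] = max(M[i][j - 1] + gap_open, Y[i][j - 1] + gap_extend)
            if M[i][j] > best:
                best, best_cell = M[i][j], (i, j)
    # Traceback as a small state machine over the states M, X and Y
    pairs, state = [], "M"
    i, j = best_cell
    while i > 0 and j > 0:
        if state == "M":
            if M[i][j] == 0:
                break  # a local alignment starts where the score drops to zero
            pairs.append((i - 1, j - 1))
            prev = M[i][j] - contact_score(xs[i - 1], ys[j - 1])
            i, j = i - 1, j - 1
            if prev == M[i][j]:
                state = "M"
            elif prev == X[i][j]:
                state = "X"
            else:
                state = "Y"
        elif state == "X":
            opened = X[i][j] == M[i - 1][j] + gap_open
            i -= 1
            state = "M" if opened else "X"
        else:
            opened = Y[i][j] == M[i][j - 1] + gap_open
            j -= 1
            state = "M" if opened else "Y"
    pairs.reverse()
    return best, pairs


# Invented interfaces: each residue is the set of base types it contacts
xs = [{"G"}, {"A"}, {"C"}, {"T"}, {"G"}]
ys = [{"G"}, {"A"}, {"T"}, {"G"}]
best, pairs = local_affine_align(xs, ys)
print(best, pairs)
```

The returned pairs are 0-based indices into the two interfaces; here the third residue of xs is left unpaired, which corresponds to a single opened gap.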