Hi! Today I just want to share a tool for preparing nice pictures from
genomic data files: pyGenomeTracks. It can be used straightforwardly with
a few commands and arguments, but an interesting point is that it
actually relies on an intermediate configuration file that can be
easily edited. We will go deeper into that later. Remarkably, it is
easy to use and stays manageable despite its huge number of
parameters, and it is also very well documented, intuitive, and almost
overwhelming in detail.
First, here you can find the documentation with all the usage guides, as
well as the reference paper.
As mentioned, it takes different output files from several omics, from
widely used BED or FASTA types, MAFs, GTFs… up to some others
(which I have not tested yet) such as Hi-C matrices or Epilogos
epigenetic annotations.
It is written in Python and can be easily installed using pip:
$ pip install pyGenomeTracks
All dependencies should be installed automatically. I personally
recommend keeping pyGenomeTracks in its own conda environment.
To use it, pyGenomeTracks needs a configuration file describing the
tracks to be included in the final image. Something like the
“instructions” or “cooking recipe”.
This file can be built from the command line. The basic arguments are
the input files, from which the plot will take its data, and the output
tracks.ini file. This is the configuration file, and it will be shaped
according to the types of file that you provide as input.
$ make_tracks_file --trackFiles <bigwig file> <bed file> etc. -o tracks.ini
Once you have the configuration file, run the command that generates
the image.
$ pyGenomeTracks --tracks tracks.ini -o image.png
Depending on the plot, you may have to provide some other arguments.
For instance, you will probably want to capture a specific region, which
is given to the plotting command:
$ pyGenomeTracks --tracks tracks.ini --region chr1:1000000-4000000 -o image.png
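Since the region is just a string like chr1:1000000-4000000, it is easy to get it wrong when typing by hand. Here is a minimal Python sketch of how I would check it before plotting. The helper names parse_region and format_region are my own invention for this post, not part of pyGenomeTracks, and the sketch only deals with the plain chrom:start-end form (with optional commas in the numbers).

```python
import re

# Pattern for "chrom:start-end"; commas inside the numbers are tolerated.
_REGION_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>[\d,]+)-(?P<end>[\d,]+)$")


def parse_region(region):
    """Split 'chr1:1000000-4000000' into (chrom, start, end) and sanity-check it.

    Hypothetical helper for illustration; it is not a pyGenomeTracks function.
    """
    match = _REGION_RE.match(region.strip())
    if match is None:
        raise ValueError(f"not a chrom:start-end region: {region!r}")
    # int() raises ValueError too if a coordinate contains only commas.
    start = int(match["start"].replace(",", ""))
    end = int(match["end"].replace(",", ""))
    if start >= end:
        raise ValueError(f"start must be smaller than end: {region!r}")
    return match["chrom"], start, end


def format_region(chrom, start, end):
    """Build the plain 'chrom:start-end' string used on the command line."""
    return f"{chrom}:{start}-{end}"
```

With this, parse_region catches swapped coordinates or a missing colon before any image is generated, and format_region gives back a clean string to paste into the command.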
There are some other options too, such as title, font size, width/height
or resolution. In summary, you can reuse a single configuration file to
plot your data as many times as you like, without re-editing the
parameters you probably spent some time optimizing.
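To illustrate this reuse, here is a small Python sketch that builds one plotting command per region from the same tracks.ini. It assumes that the region and the resolution are passed to pyGenomeTracks as --region and --dpi (that is how I remember them from the help message, so please check your installed version). The function name plot_commands, the output file naming and the second region are invented for the example.

```python
import shlex


def plot_commands(tracks_ini, regions, dpi=None):
    """Return one pyGenomeTracks command (as an argument list) per region.

    Hypothetical helper; the flags used are assumed from the tool's help.
    """
    commands = []
    for region in regions:
        # Derive a file name such as image_chr1_1000000-4000000.png
        out_name = "image_" + region.replace(":", "_") + ".png"
        cmd = ["pyGenomeTracks", "--tracks", tracks_ini,
               "--region", region, "-o", out_name]
        if dpi is not None:
            # Assumed flag for the image resolution.
            cmd += ["--dpi", str(dpi)]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # The second region is a made-up example.
    regions = ["chr1:1000000-4000000", "chr2:500000-900000"]
    for cmd in plot_commands("tracks.ini", regions, dpi=300):
        # Only print the commands; hand each list to subprocess.run(cmd, check=True)
        # once pyGenomeTracks and the input files are in place.
        print(shlex.join(cmd))
```

The configuration file is never touched in the loop, so every image shares exactly the same colours, heights and titles.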
And speaking of the time spent preparing the plot parameters, here comes
the most interesting part. You can generate the instruction file from the
command line but, since it is in the end a text file that stores one
parameter per line, you can also build it manually. With the guide to all
the parameters of each plot type at hand, and keeping an intuitive structure,
you are able to control a wide set of variables, colours, styles… It is even
possible to stack plots, increasing the possibilities up to the limit of your
imagination.
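To give an idea of what such a hand-made file can look like, this is a minimal Python sketch that writes one with the standard configparser module. Take it as an illustration under assumptions: the file names are placeholders, the section titles are mine, and the keys (file, title, height, color, file_type, overlay_previous) are the ones I remember from the documentation, so check them against the guide for your version.

```python
import configparser
import io


def build_tracks_config():
    """Assemble a small tracks configuration; all file names are placeholders."""
    config = configparser.ConfigParser()
    # Keep key spelling exactly as written (the default would lowercase it).
    config.optionxform = str

    config["x-axis"] = {}  # genomic coordinates ruler
    config["coverage"] = {
        "file": "coverage.bw",          # placeholder bigwig file
        "title": "coverage",
        "height": "3",
        "color": "darkblue",
    }
    config["coverage overlay"] = {
        "file": "coverage_b.bw",        # placeholder bigwig file
        "color": "orange",
        # Assumed key: draws this track on top of the previous one.
        "overlay_previous": "share-y",
    }
    config["spacer"] = {"height": "0.5"}  # empty space between tracks
    config["haplotypes"] = {
        "file": "haplotypes.bed",       # placeholder BED file
        "title": "haplotypes",
        "height": "4",
        "file_type": "bed",
    }
    return config


def render_tracks_config():
    """Return the configuration as the text that would go into the .ini file."""
    buffer = io.StringIO()
    build_tracks_config().write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    with open("tracks_manual.ini", "w") as handle:
        handle.write(render_tracks_config())
```

Each section becomes one track in the image, from top to bottom, and changing a colour or a height is a matter of editing one line and plotting again.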
I don’t want to make this post any longer. I almost forgot: I’m Joan, a
researcher in training. While working on some tasks, I wasn’t able to find an
adequate tool to quickly plot some haplotypes I am working on, and that is how
I came across pyGenomeTracks.
Have a look at one of my beautiful and colourful images, as an example.
I hope to write here often!