#!/perl/bioinfo: Plant Genomes in a Changing Environment 2019 (I)

Hi, these are my notes on the talks I attended on the first day of the Plant Genomes in a Changing Environment 2019 conference.

Jump to day 2 or day 3.

Unlocking the polyploidy potential of wheat through genomics (Cristobal Uauy, John Innes Centre, UK)

Phenotypes of agricultural importance are complex, with continuous gradients instead of discrete states, and differences are often difficult to tell from noise. This only gets worse with polyploids, where QTL effects are subtler than in diploids. He explains that Arabidopsis thaliana is as far from wheat as platypus is from human. He then talks about the rich wheat genomic resources, which are bringing people to work on this species, perhaps for the first time. All these resources are documented at http://www.wheat-training.com.

He then moves on to describe a particular example of combining these tools, using https://bioconductor.org/packages/release/bioc/html/GENIE3.html to predict target genes of wheat TFs from around 900 RNAseq experiments. They see no evidence of TFs preferring targets from the same (A, B, D) subgenome, even though the D subgenome has evolved independently from A & B for thousands of years. He’s asked how difficult it is to map & genotype specific subgenomes; he says that with Polymarker you can enrich on subgenome-specific regions. They use https://pachterlab.github.io/kallisto to quantify transcripts and have validated in chr-deletion lines that the transcripts from those missing chromosomes are not expressed.
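The subgenome question above can be checked with a simple tally. This is only a minimal sketch, not the speaker's analysis: it assumes predicted TF-target edges are available as pairs of gene IDs in the style of TraesCS3D02G273600, where the letter after the chromosome number is the subgenome. The edge list and most gene IDs below are made up for illustration.

```python
import re
from collections import Counter

# Assumption: IDs look like TraesCS3D02G273600 (chromosome 3, subgenome D).
GENE_ID = re.compile(r"^TraesCS([1-7])([ABD])")


def subgenome(gene_id):
    """Return 'A', 'B' or 'D' for a wheat gene ID, or None if it does not match."""
    match = GENE_ID.match(gene_id)
    return match.group(2) if match else None


def subgenome_pairs(edges):
    """Count (TF subgenome, target subgenome) pairs over predicted edges."""
    counts = Counter()
    for tf, target in edges:
        pair = (subgenome(tf), subgenome(target))
        if None not in pair:  # skip IDs that cannot be assigned
            counts[pair] += 1
    return counts


def same_subgenome_fraction(counts):
    """Fraction of edges whose TF and target share a subgenome."""
    total = sum(counts.values())
    same = sum(n for (tf_sg, target_sg), n in counts.items() if tf_sg == target_sg)
    return same / total if total else 0.0


# Hypothetical edges, for illustration only
edges = [
    ("TraesCS3D02G273600", "TraesCS1A02G000100"),
    ("TraesCS3D02G273600", "TraesCS5D02G000200"),
    ("TraesCS2B02G000300", "TraesCS2B02G000400"),
    ("TraesCS2B02G000300", "TraesCS7A02G000500"),
]
counts = subgenome_pairs(edges)
print(same_subgenome_fraction(counts))
```

With real predictions, the observed fraction would need to be compared against what is expected from the number of genes in each subgenome before calling it a preference.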

ENSEMBL plants – Visualizing the Wheat Genome in Ensembl Plants (Guy Naamati, EMBL-EBI, UK)

He starts with a quick tour of http://plants.ensembl.org/Triticum_aestivum. He then explains the gene trees for wheat produced with https://www.ensembl.org/info/genome/compara . Then he moves on to the wheat variant collections, the TILLING mutants and the KASP markers. Finally he also mentions our preparatory work with a dozen wheat cultivar assemblies from http://www.10wheatgenomes.com . He concludes by showing off the Ensembl Outreach team, who do Ensembl training around the world. He is asked why gene models keep changing across releases and whether it is possible to know the most abundant isoform (canonical?). He’s also asked how the 10+ varieties are going to be loaded. Another question is how orthologues with low sequence identity are handled.

Expression atlas - Submission, archival and visualisation of plant sequencing data (Nancy George, EMBL-EBI, UK)

She guides us through a submission from start to end: i) annotate metadata with https://www.ebi.ac.uk/fg/annotare, ii) import expression data with https://www.ebi.ac.uk/arrayexpress and https://www.ebi.ac.uk/gxa (min 3 replicates, biological question, reference in Ensembl). These steps eventually result in baseline expression reports such as http://plants.ensembl.org/Triticum_aestivum/Gene/ExpressionAtlas?g=TraesCS3D02G273600;r=3D:379535906-379539827

She then moves on to say that the plant community is still not embracing single-cell sequencing, due to technical challenges.

Benchmarking and development of an ensemble motif mapping approach to improve gene regulatory network inference (Marc Jones, VIB / Ghent University, Belgium)

He introduces TF binding motifs and how they can be used to scan genome sequences to predict binding sites. They compared different motif aligners, including MOODS, cluster-buster, FIMO and matrix-scan. They observe that FIMO sites tend to match more often with those from other tools. They then compared site predictions to ChIPseq read depth, in order to compute precision and recall. FIMO comes out best in terms of precision and worst on recall. When they look at the first 7000 sites, the 4 tested aligners behave similarly. Eventually they combined FIMO and cluster-buster, as they report many sites missed by the others. The full set of results is described at http://www.plantphysiol.org/content/181/2/412
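To make the precision and recall comparison concrete, here is a minimal sketch. It simplifies the benchmark in the talk: instead of ChIPseq read depth it assumes a list of already-called peaks, and it scores predicted motif sites by interval overlap. The function names and coordinates are hypothetical.

```python
def overlaps(site, peak):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return site[0] == peak[0] and site[1] < peak[2] and peak[1] < site[2]


def precision_recall(sites, peaks):
    """Precision: fraction of predicted sites inside a peak.
    Recall: fraction of peaks containing at least one predicted site."""
    sites_in_peaks = sum(any(overlaps(s, p) for p in peaks) for s in sites)
    peaks_with_site = sum(any(overlaps(s, p) for s in sites) for p in peaks)
    precision = sites_in_peaks / len(sites) if sites else 0.0
    recall = peaks_with_site / len(peaks) if peaks else 0.0
    return precision, recall


# Hypothetical coordinates, for illustration only
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150), ("chr2", 300, 400)]
sites = [("chr1", 120, 130), ("chr1", 700, 710), ("chr2", 60, 70)]

precision, recall = precision_recall(sites, peaks)
print(precision, recall)
```

A tool that reports few, high-confidence sites will tend to score like FIMO in the notes above: high precision, low recall. The nested loops are quadratic, which is fine for a toy example but would need an interval index for genome-scale data.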

No genome required: Finding genetic variants associated with plant phenotypes without complete genome information (Yoav Voichek, Max Planck Institute for Developmental Biology, Germany)

He talks about doing GWAS analysis with k-mer distributions instead of mapping to a reference genome. They start with a PAV (presence/absence variation) table of 31-mers across genotypes. That table can be used to characterize a pan-genome after removing low-depth k-mers, as they did with 1000 A. thaliana genome sequence sets. From that they have developed a GWAS pipeline for k-mers which accounts for population structure. They assign genomic context to k-mers by i) mapping to the reference genome, ii) LD and iii) assembling reads containing the k-mers and then mapping. The code will be released soon in https://github.com/voichek/kmers-gwas
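The k-mer presence/absence table can be sketched in a few lines. This is an illustration of the idea only, not the code from the repository above: it uses k=4 and toy reads instead of 31-mers, does not merge reverse-complement k-mers, and the `min_count` threshold standing in for the low-depth filter is an assumption.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer occurring in a list of read sequences."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def pav_table(samples, k, min_count=2):
    """Build a presence/absence table of k-mers across samples.

    samples: dict mapping sample name -> list of reads.
    k-mers seen fewer than min_count times in a sample are treated as
    absent from it (a crude filter for sequencing errors).
    Returns (sample names, dict k-mer -> list of 0/1 per sample).
    """
    present = {
        name: {kmer for kmer, n in kmer_counts(reads, k).items() if n >= min_count}
        for name, reads in samples.items()
    }
    names = sorted(samples)
    all_kmers = sorted(set().union(*present.values()))
    table = {kmer: [int(kmer in present[n]) for n in names] for kmer in all_kmers}
    return names, table


# Toy reads, for illustration only
samples = {
    "acc1": ["ACGTAC", "ACGTAC"],
    "acc2": ["ACGTTT", "ACGTTT", "TTGCA"],
}
names, table = pav_table(samples, k=4)
for kmer, row in table.items():
    print(kmer, row)
```

Only the rows that vary between samples (here everything except ACGT) carry information for an association test; the k-mers from the single TTGCA read are dropped by the depth filter.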

The 4th dimension of Gene Regulatory Networks: TIME (Gloria M Coruzzi, New York University, USA)

She talks about the time dimension in regulatory networks, using a diagram from https://europepmc.org/articles/PMC4558309. She proposes that we should treat TFs binding to DNA just like enzymes, with enzymatic kinetics. She tells 3 stories on A. thaliana.

The Just-in-TIME approach allowed them to study genes expressed in response to N as a function of time, with enriched cis elements and GO terms that you would have missed if analyzed in bulk https://www.pnas.org/content/115/25/6494.short. They apply ML to identify the TFs binding to those cis elements using time series gene expression, and they validated the predictions with 7 TF perturbations that affect 2K targets.

Hit-and-Run is another approach to study transient TF binding, controlled by adding dexamethasone (developed by José Álvarez et al, soon in Nat Comms). She stresses that binding is a poor predictor of regulation, as most binding does not affect expression; conversely, in many cases ChIPseq fails to catch binding events that they know happen. She also shows results of TFs binding to the 3’UTR. In order to catch those transient-binding TFs they used a new protocol called DamID. It turns out that most transient events occur very early in the N response, while the stable binders tend to respond late. She does not know whether transient sites are bound with less affinity, but she notes that they are enriched in neighboring sites of other TFs.

Finally, they performed network walking to connect transient TFs to their targets in A. thaliana, which they published at https://www.nature.com/articles/s41467-019-09522-1. It is called net walking because they walk from primary TFs, then to secondary regulated TFs and finally to indirect targets. They are now developing a method called OutPredict to introduce priors in their network inference.
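The walking idea can be sketched as a two-step traversal of a TF-to-target map. This is only a toy illustration of the description above, not the published method: the function name, the data structure and the gene names are all hypothetical.

```python
def network_walk(primary_tf, direct_targets, tfs):
    """Walk from a primary TF to its indirect targets.

    direct_targets: dict mapping TF -> set of genes it regulates directly.
    tfs: set of gene names known to be TFs.
    Returns (primary targets, secondary TFs, indirect targets).
    """
    # Step 1: genes regulated directly by the primary TF
    primary = set(direct_targets.get(primary_tf, ()))
    # Step 2: those primary targets that are themselves TFs
    secondary_tfs = {gene for gene in primary if gene in tfs and gene != primary_tf}
    # Step 3: genes reached only through a secondary TF
    indirect = set()
    for tf in secondary_tfs:
        indirect |= set(direct_targets.get(tf, ()))
    indirect -= primary
    indirect.discard(primary_tf)
    return primary, secondary_tfs, indirect


# Hypothetical network, for illustration only
direct_targets = {
    "TF1": {"TF2", "geneA"},
    "TF2": {"geneA", "geneB"},
}
tfs = {"TF1", "TF2"}

primary, secondary_tfs, indirect = network_walk("TF1", direct_targets, tfs)
print(primary, secondary_tfs, indirect)
```

Here geneB is reachable from TF1 only via TF2, so it is reported as an indirect target, while geneA stays a primary target even though TF2 also regulates it.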

Genetic and genomic studies of climate adaptation and genotype-by-environment interaction in switchgrass (Panicum virgatum, Tom Juenger, University of Texas at Austin, USA)

He talks about the evolutionary genetics of plant adaptation, citing https://www.ncbi.nlm.nih.gov/pubmed/21550682 . His system is the C4, perennial, polyploid, wind-pollinated P. virgatum, related to http://plants.ensembl.org/Panicum_hallii_fil2.

They have resequenced 950 individuals at 45x depth to map against a V5 PacBio assembly, yielding 46M SNPs. The individuals belong to 4 populations. Their experiment sites span 24.3 degrees of latitude across 16 locations. They have published several articles, such as https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5100855 . They have been able to assign % of genetic variance to climate (such as mean temperature of the driest quarter) and geography, and to find SNPs associated with them. They conclude that climate has been a stronger driver of adaptation than genetic isolation, and they observe widespread QTL x E interactions for local adaptation.

16 October 2019