Abstract
Plants produce a wide variety of specialized metabolites: chemical compounds that play key roles in defence against pathogens and insects, in recruitment of microbiota and as scents and pigments to attract pollinators. Many of these metabolites are used in human society as pharmaceuticals, food additives, cosmetics or insecticides. Collectively, the plant kingdom possesses the greatest chemical diversity on earth, but most of this potential remains untapped. Specialized metabolites are synthesized by dedicated enzymatic pathways. Identifying the genes encoding such pathways can enable targeted discovery of new metabolites and is crucial for industrial production. In microbes, automated ‘genome mining’ technologies have emerged to predict pathways based on the chromosomal clustering of their genes, to compare these pathways across species, and to link them to metabolites. In plants, however, the situation is considerably more complicated: pathway genes are often not chromosomally clustered, frequent gain and loss of similar enzymes during evolution obscures pathway homology, and methods to link predicted to observed chemistry are underdeveloped. Here, I propose to develop an automated computational workflow for plant genome mining that addresses these challenges. To reliably identify plant specialized metabolic pathways, I will integrate and extend existing approaches based on coexpression, genomic clustering and coregulation. Subsequently, I will develop a new algorithm to establish cross-species pathway orthology. Finally, I will develop new methods to link the resulting ‘pathway families’ to molecules observed in metabolomics data. To demonstrate the discovery potential of the approach, I will generate a prototypical multi-omics dataset for 10 closely related plant species and apply these methods to the data, both to analyse the dynamics of pathway evolution across these species and to perform collaborative structural characterization of new metabolites.
Taken together, this will pave the way for streamlined discovery of plant specialized metabolic pathways and analysis of their diversity and evolution.