Using the LipidXplorer software for top-down shotgun dereplication of phenolic natural products from medicinal plants

Ricardo M Borges and Rahil Taujale

Federal University of Rio de Janeiro, Brazil
University of Georgia, USA

Dereplication analysis, now generally accepted as the first stage of novel discovery in natural product research, is an approach that sidesteps the effort involved in isolating known compounds. The need for large-scale classification of potential drugs in plants threatened by extinction calls for accurate and rapid approaches to sample characterization. One way to obtain high-quality information is to analyze complex mixtures by HR-MS/MS in direct infusion mode. A full-scan HRMS profile recorded in this way captures all the molecular features of a complex sample, such as a plant extract, and yields molecular formula information. In addition, MS/MS fragmentograms provide valuable structural information. Such metabolite profiles are, however, very complex to analyze and contain tens of thousands of features. Consequently, it is impractical to fully map different samples, since it is virtually impossible to query all existing databases together, particularly those covering natural products from plants, food, marine organisms, microorganisms and fungi. LipidXplorer is an open-source software kit written in Python for analyzing MS-based lipidomics data. It is built on the declarative molecular fragmentation query language (MFQL), in which users specify search parameters for their specific goals. This paper presents the application of LipidXplorer to the automatic dereplication of common phenolic compounds in complex natural product mixtures obtained from plants, based on data acquired by direct infusion electrospray high-resolution mass spectrometry (DIMS). We wrote MFQL files with molecular structure restrictions for benzoic acid derivatives (phenolic C6C1), coumaroyl acid derivatives (phenolic C6C3), flavonoid derivatives (phenolic C6C3C6) and coumarins (phenolic C6C3 lactones).
To test the specificity of our results, we wrote additional MFQL files to filter for ascarosides, characteristic signaling compounds found in nematodes, and tested all of these MFQL files on all samples. We validated our approach using well-known plant extracts, with MZmine-based and manual annotation used to test its efficiency. Finally, we propose that this approach provides a complementary source of information for dereplication of natural products using MS/MS-based molecular networks.
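
The idea of restricting a search to the elemental composition of a phenolic class can be illustrated with a minimal Python sketch. It is not MFQL and not LipidXplorer code; it only mimics the kind of constraint such a query expresses. The element ranges, the double bond equivalent (DBE) window, the 5 ppm tolerance, the choice of the deprotonated ion and all function names are assumptions made for illustration.

```python
from itertools import product

# Monoisotopic masses (u); PROTON is the mass of a proton.
MONO = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
PROTON = 1.007276466


def neutral_mass(c, h, o):
    """Monoisotopic mass of a neutral CcHhOo composition."""
    return c * MONO["C"] + h * MONO["H"] + o * MONO["O"]


def deprotonated_mz(c, h, o):
    """Theoretical m/z of the singly charged [M-H]- ion."""
    return neutral_mass(c, h, o) - PROTON


def dbe(c, h):
    """Double bond equivalents for a composition containing only C, H, O."""
    return c - h / 2 + 1


def match(mz, c_range, h_range, o_range, dbe_range, ppm=5.0):
    """Return (C, H, O) compositions whose [M-H]- m/z lies within ppm of mz."""
    hits = []
    for c, h, o in product(c_range, h_range, o_range):
        if not dbe_range[0] <= dbe(c, h) <= dbe_range[1]:
            continue  # composition falls outside the assumed class definition
        theo = deprotonated_mz(c, h, o)
        if abs(mz - theo) / theo * 1e6 <= ppm:
            hits.append((c, h, o))
    return hits


# Hypothetical C6C1 window: C7, H4-8, O2-5, DBE fixed at 5.
hits = match(169.0142, range(7, 8), range(4, 9), range(2, 6), (5, 5))
print(hits)
```

With these assumed ranges, a feature at m/z 169.0142 is assigned to C7H6O5, the composition of gallic acid, while features outside the window return no candidates. A real query would additionally require class-specific fragments in the MS/MS spectrum.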