Phase I Small Business Innovation Research (SBIR)


Metabolomics plays an indispensable role in the growing use of systems biology approaches to identify reliable cancer biomarkers. Liquid chromatography coupled to mass spectrometry (LC-MS) and gas chromatography coupled to mass spectrometry (GC-MS) have been used extensively for high-throughput comparison of the levels of thousands of metabolites among biological samples. However, the potential value of many disease-associated analytes discovered by these platforms has been inadequately explored in systems biology research because of a lack of computational tools. Partly as a result, previously identified metabolite biomarker candidates have shown poor reproducibility, especially when evaluated on independent platforms and validation sets. This project aims to address this challenge with a new software tool (SysMet) that uses a network-based approach to uncover relationships between disease and metabolites by investigating rewired and conserved interactions among metabolites during disease progression. In addition, we propose to extend the network-based approach to integrative analysis of multi-omics data to identify disease-associated metabolites. The tool will improve researchers' ability to discover biomarkers by enhancing the role of metabolomics in systems biology research.
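
As a rough illustration of what "rewired" and "conserved" interactions among metabolites could mean computationally, the sketch below builds a simple correlation network for each of two sample groups and compares their edges. This is not the SysMet algorithm, which the abstract does not specify; the correlation threshold, the function names, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def correlation_edges(X, threshold=0.7):
    """Boolean adjacency matrix: an edge where |Pearson r| >= threshold.

    X has one row per sample and one column per metabolite.
    The threshold value is an arbitrary assumption for this sketch.
    """
    r = np.corrcoef(X, rowvar=False)
    adj = np.abs(r) >= threshold
    np.fill_diagonal(adj, False)  # ignore self-correlation
    return adj

def compare_networks(X_control, X_disease, threshold=0.7):
    """Return (conserved, rewired) edge matrices for two sample groups."""
    a = correlation_edges(X_control, threshold)
    b = correlation_edges(X_disease, threshold)
    conserved = a & b  # edge present in both groups
    rewired = a ^ b    # edge present in exactly one group
    return conserved, rewired

# Simulated data (hypothetical): 4 metabolites, 200 samples per group.
rng = np.random.default_rng(0)
n = 200

def make_group(link_m2_m3):
    m0 = rng.normal(size=n)
    m1 = m0 + 0.1 * rng.normal(size=n)      # m0-m1 linked in every group
    m2 = rng.normal(size=n)
    if link_m2_m3:
        m3 = m2 + 0.1 * rng.normal(size=n)  # m2-m3 linked only when requested
    else:
        m3 = rng.normal(size=n)
    return np.column_stack([m0, m1, m2, m3])

control = make_group(link_m2_m3=False)
disease = make_group(link_m2_m3=True)
conserved, rewired = compare_networks(control, disease)
```

In this toy setup the m0-m1 edge appears in both groups and is reported as conserved, while the m2-m3 edge appears only in the disease group and is reported as rewired. A real analysis would need a statistically justified edge definition rather than a fixed cutoff.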


Phase I Small Business Innovation Research (SBIR)

In a typical untargeted metabolomics analysis by liquid chromatography-mass spectrometry (LC-MS), about 70% of the detected ions represent unknown analytes. While identifying unknowns that have no putative IDs remains a significant challenge, there is an opportunity to identify more metabolites by improving the ability to prioritize the multiple putative IDs assigned to the known-unknowns. This will help greatly in selecting promising metabolites for subsequent experimental verification of the IDs. This project seeks to develop a probabilistic framework that assigns a priority score to each putative metabolite ID by combining information from multiple resources, including compound databases, pathways, biochemical networks, and spectral libraries. The proposed probabilistic model will exploit the interdependent relationships between metabolites in biological organisms, based on knowledge derived from pathways and biochemical networks. If MS/MS data are available, the score for a putative ID will take into account how well the measured MS/MS spectrum matches those in spectral libraries or the fragmentation patterns predicted by in-silico spectral interpretation. Successful implementation and validation of the model will enable users to assign priority scores to putative metabolite IDs and identify metabolites more accurately by taking advantage of publicly available compound databases, pathways, biochemical networks, and spectral libraries, as well as various tools designed for isotope/adduct recognition, decomposition of isotopic patterns, and in-silico spectral interpretation.
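
To make the idea of a combined priority score concrete, the following minimal sketch ranks the putative IDs for a single ion by multiplying three evidence terms (mass accuracy, pathway support, and MS/MS spectral similarity when available) and normalizing across candidates. It is not the probabilistic model proposed here, whose form the abstract does not give. The independence assumption, the Gaussian mass-error term, the `ppm_sigma` and `eps` parameters, and all field names are hypothetical choices made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two spectra given as {m/z bin: intensity}."""
    shared = set(a) & set(b)
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def priority_scores(candidates, measured_msms=None, ppm_sigma=5.0, eps=0.01):
    """Return {candidate id: normalized score} for one detected ion.

    Each candidate is a dict with (hypothetical) keys:
      'id'                 putative metabolite identifier
      'mass_error_ppm'     mass error of the database match
      'pathway_neighbors'  number of pathway neighbors also putatively detected
      'library_spectrum'   optional reference MS/MS spectrum
    """
    raw = {}
    for c in candidates:
        # Mass accuracy: Gaussian-shaped weight on the ppm error (assumed form).
        score = math.exp(-0.5 * (c["mass_error_ppm"] / ppm_sigma) ** 2)
        # Pathway support: more detected neighbors gives a higher weight.
        score *= 1 + c.get("pathway_neighbors", 0)
        # MS/MS evidence is used only when both spectra are available.
        library = c.get("library_spectrum")
        if measured_msms is not None and library is not None:
            score *= eps + cosine_similarity(measured_msms, library)
        raw[c["id"]] = score
    total = sum(raw.values())
    return {k: v / total for k, v in raw.items()} if total else raw

# Hypothetical example: two putative IDs competing for the same ion.
measured = {100: 1.0, 150: 0.5}
candidates = [
    {"id": "candidate_A", "mass_error_ppm": 2.0, "pathway_neighbors": 3,
     "library_spectrum": {100: 1.0, 150: 0.4}},
    {"id": "candidate_B", "mass_error_ppm": 2.0, "pathway_neighbors": 0,
     "library_spectrum": {80: 1.0, 150: 0.2}},
]
scores = priority_scores(candidates, measured_msms=measured)
```

Here both candidates have the same mass error, so candidate_A ranks first because it has more detected pathway neighbors and a closer spectral match. Normalizing over the candidates for one ion keeps the scores comparable within that ion, but they are not calibrated probabilities.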