Basic Information
Abstract Number: 1270-3
Author Name: Oliver Fiehn
Affiliation: Genome Center
Session Title: ACS Subdivision of Chromatography and Separation Chemistry Young Investigator Award
Event Type: Award
Event Title: Towards a Standardized Metabolomics Repository
Presider(s): Olesik, Susan
Start Time: 09:20 AM (Slot # 5)
Date: Wednesday, March 16th, 2011
Location: 312B
Keywords: Bioinformatics, Data Analysis, Gas Chromatography/Mass Spectrometry, Liquid Chromatography/Mass Spectroscopy

Kind, Tobias (Genome Center)
Wohlgemuth, Gert (Genome Center)

Abstract Content
Combined metabolomic platforms yield 500-3,000 metabolites per biological study. When publishing results in peer-reviewed journals, scientists detail only a very small fraction of these in graphs or tables, leaving the majority of metabolomic data uninterpreted. How can metabolomic data be organized in open-access, systematic databases that allow the scientific community to interrogate published data sets?

Today, we take for granted the wealth of genomic information held in resources such as NCBI's Gene Expression Omnibus or JGI's repositories. A standardized metabolomic repository will have to be platform-independent. Compound identifications have to be supported by at least two independent parameters, such as retention index and mass spectrum; accurate mass data alone are not enough. Identified compounds must be annotated not by their names but by their International Chemical Identifier (InChI) structure codes, since these are what define chemicals. Database identifiers (e.g. PubChem, ChEBI, KEGG or HMDB) can be used as surrogates.
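The annotation rules above can be illustrated with a minimal Python sketch. This is not the repository schema of the Fiehn laboratory: the class name, field names and the simple counting rule are hypothetical, and the sketch assumes that each supplied parameter counts once, leaving the judgment of true independence to a curator. The ethanol InChI and its PubChem CID are used purely as an illustration; the retention index and spectrum values are toy numbers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class CompoundAnnotation:
    """Hypothetical record for one identified compound (illustration only)."""
    inchi: str                                   # structure code, not a name
    retention_index: Optional[float] = None
    mass_spectrum: Optional[List[Tuple[float, float]]] = None  # (m/z, intensity)
    accurate_mass: Optional[float] = None
    database_ids: Dict[str, str] = field(default_factory=dict)  # surrogate IDs

    def evidence(self) -> List[str]:
        """Names of the identification parameters that were actually supplied."""
        supplied = []
        if self.retention_index is not None:
            supplied.append("retention_index")
        if self.mass_spectrum:
            supplied.append("mass_spectrum")
        if self.accurate_mass is not None:
            supplied.append("accurate_mass")
        return supplied

    def is_acceptable(self) -> bool:
        """Assumed rule: an InChI string plus at least two supplied parameters."""
        return self.inchi.startswith("InChI=") and len(self.evidence()) >= 2


# Accurate mass alone: rejected under the assumed rule.
mass_only = CompoundAnnotation(
    inchi="InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
    accurate_mass=46.0419,
)

# Retention index plus mass spectrum (toy values): accepted.
ri_and_spectrum = CompoundAnnotation(
    inchi="InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
    retention_index=500.0,
    mass_spectrum=[(31.0, 100.0), (45.0, 50.0)],
    database_ids={"PubChem": "702"},
)

print(mass_only.is_acceptable(), ri_and_spectrum.is_acceptable())
```

Keying the record on the InChI string rather than on a compound name keeps it independent of naming conventions, while the database identifiers remain optional surrogates.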

The Fiehn laboratory has compiled a range of tools that will help develop standardized repositories. The Chemical Translation Service easily converts database identifiers (including CAS numbers) to structures and to identifiers of other databases. Three mass spectral libraries were constructed: the FiehnLib libraries of over 2,000 retention-index-based GC-TOF and GC-quad-MS spectra for over 1,000 different metabolites, the LipidBLAST library of 120,000 MS/MS spectra of complex lipids, and the BinBase DB for metabolomic studies. BinBase stores data for over 25,000 samples in more than 360 studies covering microbes, plants and animals. Data are available for download.

Results are presented for breast cancer metabolomics as a three-tiered approach combining biochemical, chemical and mass spectral similarity distances to provide network graphs for both identified and novel, structurally unknown metabolites.
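How three similarity tiers might be combined into one network can be sketched as follows. The abstract does not describe the actual procedure, so everything here is an assumption: the function names, the tier order (biochemical first, then chemical, then mass spectral), the cutoffs of 0.7, and the use of cosine similarity on nominal-mass spectra. All spectra and similarity values are toy data.

```python
import math
from itertools import combinations


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {nominal m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_edges(spectra, biochemical_pairs, chemical_similarity,
                chem_cutoff=0.7, spec_cutoff=0.7):
    """Return (node, node, tier) edges, trying the three tiers in order."""
    edges = []
    for a, b in combinations(sorted(spectra), 2):
        pair = frozenset((a, b))
        if pair in biochemical_pairs:                 # tier 1: known reaction pair
            edges.append((a, b, "biochemical"))
        elif chemical_similarity.get(pair, 0.0) >= chem_cutoff:   # tier 2
            edges.append((a, b, "chemical"))
        elif cosine_similarity(spectra[a], spectra[b]) >= spec_cutoff:  # tier 3
            edges.append((a, b, "mass_spectral"))
    return edges


# Toy data: three identified metabolites and one structurally unknown feature.
spectra = {
    "A": {73: 100, 147: 50},
    "B": {73: 100, 147: 50, 205: 10},
    "C": {57: 100},
    "unknown_1": {73: 90, 147: 60},
}
biochemical_pairs = {frozenset(("A", "B"))}
chemical_similarity = {frozenset(("B", "C")): 0.8}

edges = build_edges(spectra, biochemical_pairs, chemical_similarity)
for edge in edges:
    print(edge)
```

In this sketch an unknown feature has no structure, so it can only enter the network through the mass spectral tier, which is one plausible way for identified and unknown metabolites to share a graph.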