NMC DSP data Processing Tool Chain

From BioAssist
  • Methods of Metabolite identification

Metabolite identification generally requires two types of methods: an analytical approach, followed by data processing.

The analytical approach involves the production, separation and detection of ions.

In the analytical approach, the samples are first run through LC-MS or GC-MS, where the compounds are separated based on mobility or vaporization factors, respectively.

The outcome of LC-MS/GC-MS is a “chromatogram” that contains spectra with mass signals (m/z) and retention times (RT).

A single chromatogram corresponds to a specific sample that went through the machine, and multiple chromatograms are obtained with the sequential introduction of the compounds into the mass spectrometer. However, the mass spectrometer detects the mass signals in a discrete fashion, e.g., once every 3 seconds. The peaks of the mass signal for each 3-second scan are therefore stored in a file. This is the raw data from the machine.
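The discrete scans described above can be pictured as a list of records, each holding a retention time and the (m/z, intensity) pairs seen in that scan. The following is a minimal sketch under that assumption. The `Scan` class, the function names and the sample values are hypothetical and do not correspond to any vendor file format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Scan:
    """One discrete detector reading (hypothetical structure)."""
    retention_time: float              # seconds since injection
    peaks: List[Tuple[float, float]]   # (m/z, intensity) pairs


def total_ion_chromatogram(scans):
    """Sum all intensities in each scan: one (RT, total intensity) point per scan."""
    return [(s.retention_time, sum(i for _, i in s.peaks)) for s in scans]


def extracted_ion_chromatogram(scans, target_mz, tol=0.5):
    """Keep only intensities whose m/z lies within `tol` of `target_mz`."""
    return [
        (s.retention_time,
         sum(i for mz, i in s.peaks if abs(mz - target_mz) <= tol))
        for s in scans
    ]


# Made-up raw data: one scan every 3 seconds, two mass signals per scan.
scans = [
    Scan(0.0, [(73.0, 120.0), (147.0, 30.0)]),
    Scan(3.0, [(73.0, 900.0), (147.0, 250.0)]),
    Scan(6.0, [(73.0, 150.0), (147.0, 40.0)]),
]

tic = total_ion_chromatogram(scans)
eic_147 = extracted_ion_chromatogram(scans, target_mz=147.0)
```

Plotting `tic` against retention time would give the familiar chromatogram trace, while `eic_147` follows a single mass signal over time.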

An example below shows a chromatogram (Source: http://www.wcaslab.com/tech/aldehyde.htm )

A number of peaks in each chromatogram are derived from compounds that are produced during sample derivatization through GC-MS/LC-MS.


The table below shows the RT of analytes found. (Source: http://www.wcaslab.com/tech/aldehyde.htm )


In the second step, various data analysis tools are used to process the raw data to produce the mass spectrum. Metabolite identification therefore often involves the sequential use of a diverse array of analysis tools and data resources in order to answer the biological question. The processing may include signal amplification, noise reduction, chromatographic alignment and/or peak detection.
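To make the processing steps above concrete, the sketch below chains noise reduction, a crude baseline subtraction and peak detection on a single intensity trace. It is an illustration only: the moving-average filter, the minimum-based baseline and the local-maximum rule are simple stand-ins chosen here, not the methods used by the tools named in this page, and the intensity values are invented.

```python
def moving_average(signal, window=3):
    """Reduce noise by averaging each point with its neighbours."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def subtract_baseline(signal):
    """Crude baseline correction: shift the trace so its minimum is zero."""
    base = min(signal)
    return [x - base for x in signal]


def detect_peaks(signal, threshold):
    """Return indices of local maxima that rise above `threshold`."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if (signal[i] > threshold
                and signal[i] > signal[i - 1]
                and signal[i] >= signal[i + 1]):
            peaks.append(i)
    return peaks


# Invented intensity trace with two peaks on a noisy baseline.
intensities = [10, 12, 11, 50, 120, 55, 12, 10, 11, 40, 90, 45, 11, 10]

processed = subtract_baseline(moving_average(intensities))
peak_indices = detect_peaks(processed, threshold=20)
```

With a 3-second scan interval, multiplying each index in `peak_indices` by 3 would give the approximate retention time of each peak in seconds.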

Deconvolution tools such as AMDIS and LECO are available to produce the mass spectrum from the raw data in the case of a single chromatogram. Metabolite identification in the case of multiple chromatograms, by contrast, involves a number of steps using a number of tools such as MetAlign/XCMS and Metot. MetAlign is used for (a) baseline correction, (b) noise detection and (c) ion-wise alignment. Metot pre-processes the MetAlign outputs by removing irrelevant noise information and formatting the data for the next step. MSClust is an intermediate data processing tool in the metabolomics data processing pipeline that works with the output generated by MetAlign/Metot to remove redundancy and to tentatively identify compounds.
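The redundancy-removal idea can be illustrated with a toy example: ions that elute at almost the same retention time are likely fragments of the same compound, so grouping them turns many ions into a few candidate compounds. The sketch below is not the MSClust algorithm. It is a simplified grouping by retention time only, and the function name, the tolerance and the ion list are hypothetical.

```python
def cluster_ions_by_rt(ions, rt_tolerance=0.05):
    """Group (m/z, RT) ions whose RT is within `rt_tolerance` of the
    previous ion in RT-sorted order. Each group approximates one compound."""
    clusters = []
    for mz, rt in sorted(ions, key=lambda ion: ion[1]):
        if clusters and rt - clusters[-1][-1][1] <= rt_tolerance:
            clusters[-1].append((mz, rt))   # same elution window
        else:
            clusters.append([(mz, rt)])     # start a new candidate compound
    return clusters


# Invented aligned ions: (m/z, retention time in minutes).
aligned_ions = [
    (73.0, 5.01), (147.0, 5.02), (217.0, 5.03),
    (91.0, 8.40), (119.0, 8.41),
]

clusters = cluster_ions_by_rt(aligned_ions)
cluster_sizes = [len(c) for c in clusters]
```

Here five ions collapse into two groups, each of which could then be treated as one tentative compound mass spectrum.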

Tools for data processing and analysis

Processing tool    Major function

MetAlign    Analysis, alignment and comparison of mass spectrometry datasets.
XCMS        Removal of experimental artifacts, clean-up of the data for further analysis in the data processing, and peak alignment.
Metot       Pre-processing of MetAlign outputs: removing irrelevant noisy information and formatting data for the next step.
MSClust     Deriving compound mass spectra by ion clustering; data reduction from a few thousand ions to a few hundred metabolites.

The flowchart below illustrates the workflow in metabolomics data processing. (Figure: Workflow-PRI-tools.PNG)
