Metabolomics Data structures/entities

From BioAssist
Revision as of 14:22, 4 November 2010 by Rob Hooft (Talk | contribs)


Identifying the main entities that store relevant information leads to a better understanding of the complexity of the data structure. Below, two systems are presented to show what is involved; the platform that BioAssist will eventually have to support is a superset of what is shown here.

System A

The main entity would be the Project/Study Information table. This table would hold, for example, information about the samples being measured and the apparatus used (e.g. which settings are used). It should also record the statistical and mathematical processing steps that data at different levels are subjected to, so that results can always be reproduced. A unique project identifier links all study information.

In the picture below, an example is shown in which integrated peak areas are stored. Specifics of the peaks are stored in the 'peak table'. Furthermore, the 'study information table' combines all information from the different tables. The class and longitudinal information makes it easy to take cross-sections of the data for use at later stages (data analysis).

Dbdesigner schema 1.0.5 FS.png
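The table layout described above can be sketched as a small relational schema. This is only an illustration under stated assumptions, not the schema from the picture: every table name, column name and example value (including the 'LC-MS' apparatus entry) is hypothetical, and SQLite is used purely because it ships with Python.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative assumptions.
SCHEMA = """
CREATE TABLE study (
    project_id TEXT PRIMARY KEY,   -- unique project identifier
    apparatus  TEXT,
    settings   TEXT
);
CREATE TABLE sample (
    sample_id   TEXT PRIMARY KEY,  -- unique per sample throughout the study
    project_id  TEXT NOT NULL REFERENCES study(project_id),
    class_label TEXT,              -- class information
    time_point  INTEGER            -- longitudinal information
);
CREATE TABLE peak (
    peak_id        INTEGER PRIMARY KEY,
    project_id     TEXT NOT NULL REFERENCES study(project_id),
    mz             REAL,
    retention_time REAL
);
CREATE TABLE peak_area (
    sample_id TEXT    NOT NULL REFERENCES sample(sample_id),
    peak_id   INTEGER NOT NULL REFERENCES peak(peak_id),
    area      REAL    NOT NULL,    -- integrated peak area
    PRIMARY KEY (sample_id, peak_id)
);
"""


def build_example_db():
    """Create an in-memory database filled with made-up example rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO study VALUES ('P1', 'LC-MS', 'positive mode')")
    conn.executemany(
        "INSERT INTO sample VALUES (?, 'P1', ?, ?)",
        [("S1", "control", 0), ("S2", "control", 1),
         ("S3", "treated", 0), ("S4", "treated", 1)],
    )
    conn.executemany(
        "INSERT INTO peak VALUES (?, 'P1', ?, ?)",
        [(1, 180.06, 5.2), (2, 146.05, 3.1)],
    )
    conn.executemany(
        "INSERT INTO peak_area VALUES (?, ?, ?)",
        [("S1", 1, 10.0), ("S1", 2, 1.0), ("S2", 1, 20.0), ("S2", 2, 2.0),
         ("S3", 1, 30.0), ("S3", 2, 3.0), ("S4", 1, 40.0), ("S4", 2, 4.0)],
    )
    conn.commit()
    return conn


def cross_section(conn, project_id, class_label, time_point):
    """Return (sample_id, peak_id, area) rows for one class at one time point."""
    return conn.execute(
        """
        SELECT s.sample_id, a.peak_id, a.area
        FROM sample AS s
        JOIN peak_area AS a ON a.sample_id = s.sample_id
        WHERE s.project_id = ? AND s.class_label = ? AND s.time_point = ?
        ORDER BY s.sample_id, a.peak_id
        """,
        (project_id, class_label, time_point),
    ).fetchall()
```

Calling `cross_section(build_example_db(), "P1", "treated", 1)` selects only the peak areas of treated samples at the second time point, which is the kind of cross-sectioning the class and longitudinal columns are meant to support.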

Storing data in a manner similar to the example above allows for a flexible, scalable solution in which data from any kind of apparatus can be stored. Results can be reproduced using the data steps listed in the 'processing table'. Object and/or peak removal can be viewed as a (pre-)processing step. A unique identifier for each sample, valid throughout the study, is crucial. Different levels of meta-information can be achieved by annotating data in different ways. The appropriate level for cross-study analysis has to be determined.
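As a rough illustration of how a 'processing table' could make results reproducible, the sketch below replays recorded steps in order on raw peak areas. It assumes a hypothetical row format (`step_order`, `step`, `params`) and hypothetical step names; none of these come from the original design.

```python
import math

# Raw data layout assumed here: {sample_id: {peak_id: integrated_area}}.


def remove_peaks(data, peak_ids):
    """Pre-processing step: drop the listed peaks from every sample."""
    drop = set(peak_ids)
    return {s: {p: a for p, a in peaks.items() if p not in drop}
            for s, peaks in data.items()}


def remove_samples(data, sample_ids):
    """Pre-processing step: drop the listed samples (objects)."""
    drop = set(sample_ids)
    return {s: dict(peaks) for s, peaks in data.items() if s not in drop}


def log2_transform(data):
    """Processing step: take the base-2 logarithm of every area."""
    return {s: {p: math.log2(a) for p, a in peaks.items()}
            for s, peaks in data.items()}


# Hypothetical mapping from step names stored in the table to functions.
STEP_FUNCTIONS = {
    "remove_peaks": remove_peaks,
    "remove_samples": remove_samples,
    "log2_transform": log2_transform,
}


def replay(raw_data, processing_table):
    """Apply the recorded steps in step_order; the raw data is left untouched."""
    data = raw_data
    for row in sorted(processing_table, key=lambda r: r["step_order"]):
        data = STEP_FUNCTIONS[row["step"]](data, **row["params"])
    return data


# Example rows as they might be stored for one project (made-up values).
example_raw = {"S1": {1: 8.0, 2: 16.0}, "S2": {1: 4.0, 2: 2.0}}
example_steps = [
    {"step_order": 2, "step": "remove_peaks", "params": {"peak_ids": [2]}},
    {"step_order": 1, "step": "remove_samples", "params": {"sample_ids": ["S2"]}},
    {"step_order": 3, "step": "log2_transform", "params": {}},
]
example_result = replay(example_raw, example_steps)
```

Because every step returns a new structure instead of modifying its input, the raw measurements stay intact and the same table of steps can be replayed at any time to regenerate the processed data.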

The proposed design should lead to:

System B

The following diagram shows the classes involved in a second system for capturing metabolomics data, modeled after minimal reporting standards published in the literature.


Reference documentation/standards

Design software