Metabolomics Functional requirements

From BioAssist

Architecture overview

NMC DSP.png

Goals for 1 September 2010

DSP Study capture

  • Save study designs (Subjects, Events, Samples), input via import wizard
  • Authentication, Authorization and Accounting (AAA)

There must be some basic functionality regarding AAA. (More later)

  • Communication with DSP Core and DSP Visualisation

DSP Core

  • Save peak tables from the DCL Lipidomics platform and link them to samples (including QCs)
  • (Process MS raw data via the mzMatch software and link the peak data to individual samples)
  • AAA
  • Communication with DSP Study capture and DSP Query

DSP Query

  • For a given study, retrieve an Excel file with three sheets: the first contains the study capture metadata up to the samples, the second the measurement design, and the third the ‘clean’ peak table (columns: mass, retention time, label, sample 1..n; rows: peak data)
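
The layout of the third sheet can be sketched as follows. This is a minimal illustration, not the DSP implementation: the function name `build_clean_peak_table`, the in-memory input format, and all values in the usage example are assumptions made up for this sketch. It only shows the column order named above (mass, retention time, label, then one column per sample).

```python
from typing import Dict, List, Tuple

# Hypothetical key for one row of the peak table: (mass, retention time, label).
PeakKey = Tuple[float, float, str]


def build_clean_peak_table(
    sample_names: List[str],
    intensities: Dict[PeakKey, Dict[str, float]],
) -> List[List[object]]:
    """Return the rows of the 'clean' peak table: a header row, then one row per peak."""
    header: List[object] = ["mass", "retention time", "label", *sample_names]
    rows: List[List[object]] = [header]
    for key in sorted(intensities):
        mass, rt, label = key
        per_sample = intensities[key]
        # A sample without a value for this peak gets an empty cell (None).
        rows.append([mass, rt, label, *(per_sample.get(s) for s in sample_names)])
    return rows


# Illustrative values only; they do not come from a real study.
table = build_clean_peak_table(
    ["S1", "S2"],
    {
        (760.585, 12.4, "PC 34:1"): {"S1": 1500.0, "S2": 1720.5},
        (703.575, 10.1, "SM 34:1"): {"S1": 310.0},
    },
)
for row in table:
    print(row)
```

Writing these rows to an actual Excel sheet would need a spreadsheet library, which is left out here.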

Uploading peak table information

The main requirement for September 1st 2010 is that peak table information should be associated with individual samples. As an example of how this works in practice, we will look at real data from a "lipidomics" project in Leiden. Peak information is captured in the form of an Excel sheet, in which the following information is available for each row:

  • Sample Name (String)
  • Data Path (Path)
  • Type (Sample, QC, blank etc)
  • Level (unknown, possibly empty)
  • Acq. Date-Time

The peak data is listed for each sample as RT and Area, both real numbers. These may occur multiple times for each sample, each under the heading of a compound name.

Two scenarios

There seem to be two possible scenarios.

  • 1) The user has already created a measurement design and now wants to upload peak table information for the samples. Note that the samples, including blanks and QCs, were created in the measurement design.
  • 2) There is no measurement design and all information in the Excel sheet should be uploaded. In this case the samples are either already present in the system or absent.

The measurement design creates a batch - a list of samples which are to be measured in a certain order.

We will first concentrate on the first scenario. We assume the user first starts with a measurement design and later adds the peak information. The following steps are necessary:

  • Upload the Excel sheet
  • Parse the Excel sheet
  • Check the sample names against the samples in the DSP (report any mismatch to the user)
  • Add peak table info and other info to the samples
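
The check-and-attach steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the sheet is assumed to be already parsed into one dict per (sample, compound) pair with the hypothetical keys "Sample Name", "Compound", "RT" and "Area"; the classes `Peak` and `Sample` and the function `attach_peaks` are invented for illustration. Rejecting the whole upload on any mismatch is also an assumption, since the requirement only says that a mismatch is reported to the user.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class Peak:
    compound: str
    rt: float    # retention time
    area: float


@dataclass
class Sample:
    name: str
    sample_type: str = "Sample"   # e.g. Sample, QC, blank
    peaks: List[Peak] = field(default_factory=list)


def attach_peaks(known: Dict[str, Sample], rows: Iterable[dict]) -> List[str]:
    """Attach parsed rows to known samples; return the sample names that did not match."""
    rows = list(rows)
    unmatched = sorted({r["Sample Name"] for r in rows if r["Sample Name"] not in known})
    if unmatched:
        # Assumption: store nothing and let the caller report the mismatch.
        return unmatched
    for r in rows:
        peak = Peak(r["Compound"], float(r["RT"]), float(r["Area"]))
        known[r["Sample Name"]].peaks.append(peak)
    return []


# Illustrative data only.
samples = {"S1": Sample("S1"), "QC1": Sample("QC1", "QC")}
parsed = [
    {"Sample Name": "S1", "Compound": "PC 34:1", "RT": "12.4", "Area": "1500"},
    {"Sample Name": "QC1", "Compound": "PC 34:1", "RT": "12.5", "Area": "1480"},
]
print(attach_peaks(samples, parsed))
```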

The second scenario will involve a procedure similar to the measurement design upload. First, a list of samples is downloaded as an Excel sheet template. The user then adds the peak table information and uploads the file.

  • Generate a list of samples
  • Put the Excel template file together
  • The user downloads the Excel file and adds the relevant data
  • Upload, parse and add the info to the DSP
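
The template round trip can be sketched as follows. To keep the sketch self-contained, CSV text stands in for the Excel file; the column names in `TEMPLATE_COLUMNS` and the functions `make_template` and `read_filled_template` are assumptions for illustration, not part of the DSP.

```python
import csv
import io
from typing import List

# Hypothetical template columns; the real template may differ.
TEMPLATE_COLUMNS = ["Sample Name", "Compound", "RT", "Area"]


def make_template(sample_names: List[str]) -> str:
    """Build a template with one pre-filled sample name per row and empty data cells."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(TEMPLATE_COLUMNS)
    for name in sample_names:
        writer.writerow([name, "", "", ""])
    return buf.getvalue()


def read_filled_template(text: str) -> List[dict]:
    """Parse an uploaded template, skipping rows the user left empty."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader if row["Compound"]]


template = make_template(["S1", "S2"])
# Simulate the user filling in one row (illustrative values only).
filled = template.replace("S1,,,", "S1,PC 34:1,12.4,1500")
print(read_filled_template(filled))
```

Because the sample names come from the system itself, the upload step only needs to verify that the user did not alter them.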

Authentication, Authorization, Accounting NMC (draft)

  • Authentication on study level. The NMC will set up a procedure which describes the steps to take when a potential user requests access to the system. Authentication will be based on username and password (over HTTPS).
  • Authorization. Three roles are distinguished: User, Project Leader and SysAdmin. Only a SysAdmin can create new users. Studies are owned by a Project Leader, who can give read/write rights to Users with regard to the projects he or she owns. Read rights for a study include read access to all data linked to that study. Write rights include write access to the entire study. A SysAdmin can read all studies.
  • Accounting. All write and download actions should be logged. Also logins, and opening of studies (reading) should be logged. Logs should be accessible to the Project Leader and SysAdmin only.
  • Authentication for all parts should be the same
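
The role rules above can be sketched as a small permission check. This is an illustration under assumptions, not the NMC design: the names `Role`, `Study`, `can_read`, `can_write`, `can_view_logs` and `open_study` are hypothetical, and the sketch assumes that the owning Project Leader has full access to his or her own study and that write rights imply read rights, neither of which the draft states explicitly.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class Role(Enum):
    USER = "User"
    PROJECT_LEADER = "Project Leader"
    SYSADMIN = "SysAdmin"


@dataclass
class Study:
    owner: str                                    # username of the Project Leader
    readers: Set[str] = field(default_factory=set)
    writers: Set[str] = field(default_factory=set)


audit_log: List[str] = []


def can_write(user: str, study: Study) -> bool:
    # Assumption: the owner can always write; a SysAdmin gets no implicit write access.
    return user == study.owner or user in study.writers


def can_read(user: str, role: Role, study: Study) -> bool:
    # A SysAdmin can read all studies; assumption: write rights imply read rights.
    return role is Role.SYSADMIN or user in study.readers or can_write(user, study)


def can_view_logs(role: Role) -> bool:
    # Logs are accessible to the Project Leader and SysAdmin only.
    return role in (Role.PROJECT_LEADER, Role.SYSADMIN)


def open_study(user: str, role: Role, name: str, study: Study) -> bool:
    """Check read access and log the attempt, since opening a study must be logged."""
    allowed = can_read(user, role, study)
    audit_log.append(f"open study={name} user={user} allowed={allowed}")
    return allowed


study = Study(owner="lead", readers={"ann"}, writers={"bob"})
print(open_study("ann", Role.USER, "lipidomics-1", study))
```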

REMARK: We suspect (May 2010) that more is needed. Probably not all users are allowed to see intermediate data of a study. This is most likely the case in Leiden.

Goals for 1 September 2011

DSP Study capture

  • No additional goals

DSP Core

  • Process MS raw data with MetAlign
  • Process MS raw data with TNO-DECO
  • Process NMR data? (Peak picking, peak identification)
  • Run metabolite identification methods (spectral trees?)

DSP Visualisation

  • Retrieve data (metadata and peak tables) over several studies:
  • Give me all data with samples from Copenhagen
  • Give me all QC samples from a set of studies
  • Give me all QC samples from a certain Batch within a study
  • Give me all peaks associated with a certain compound
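
Some of the example queries above can be sketched as filters over in-memory records. The `SampleRecord` fields, in particular `location`, and the function names are hypothetical; a real implementation would query the DSP Core store rather than a Python list.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class SampleRecord:
    study: str
    batch: str
    name: str
    sample_type: str   # e.g. "Sample", "QC", "blank"
    location: str      # hypothetical metadata field


def samples_from(records: Iterable[SampleRecord], location: str) -> List[SampleRecord]:
    """All samples, across studies, with the given location."""
    return [r for r in records if r.location == location]


def qc_samples(
    records: Iterable[SampleRecord],
    studies: Set[str],
    batch: Optional[str] = None,
) -> List[SampleRecord]:
    """All QC samples from a set of studies, optionally limited to one batch."""
    return [
        r for r in records
        if r.sample_type == "QC"
        and r.study in studies
        and (batch is None or r.batch == batch)
    ]


# Illustrative records only.
records = [
    SampleRecord("A", "b1", "S1", "Sample", "Copenhagen"),
    SampleRecord("A", "b1", "QC1", "QC", "Leiden"),
    SampleRecord("A", "b2", "QC2", "QC", "Leiden"),
    SampleRecord("B", "b1", "QC3", "QC", "Copenhagen"),
]
print([r.name for r in qc_samples(records, {"A"}, batch="b1")])
```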