Dynamic Lightpath

Connected sites

  • DLSG cluster at University of Groningen. Contact persons: Isthiaq Ahmad, Peter Horvatovich, Rainer Bischoff.
  • DLSG cluster at Academic Medical Center. Contact persons: Mark Santcroos, Silvia Olabarriaga, Perry Moerland, Antoine van Kampen.
  • DLSG cluster at Delft University. Contact persons: Jan Bot.
  • DLSG cluster at SARA. Contact persons: Tom Visser, Coen Schrijvers.

SURFnet support contacts: Peter Hinrich, Nicole Gregoire and Migiel de Vos.

Use Cases

TAPP pipeline

Involved laboratories and contact persons: Isthiaq Ahmad and Peter Horvatovich (Analytical Biochemistry, University of Groningen), and Frank Suits (IBM Watson Center).
Short description of the pipeline: TAPP stands for Threshold Avoiding Proteomics Pipeline, developed by Frank Suits, a scientist working at the IBM Watson Center. TAPP is a preprocessing pipeline for label-free LC-MS proteomics data from multiple chromatograms; it produces a quantitative peak matrix amenable to statistical analysis. It consists of four tools:

  1. Meshing reconstructs the raw data on a grid with constant resolution.
  2. Centroid performs peak detection and quantification and produces quantitative peak lists.
  3. Warp2D, a time alignment tool, corrects for nonlinear retention time shifts between peaks of different chromatograms.
  4. Metamatch clusters the same peaks across multiple chromatograms, yielding the quantitative peak matrix.
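
The order of the four tools can be sketched as a simple data flow. The outline below is illustrative only: the function name and the four stage callables are hypothetical placeholders, not the real TAPP tools or their interfaces. It also assumes, based on the descriptions above, that meshing and centroiding run on one file at a time while alignment and matching operate across all chromatograms.

```python
from typing import Callable, Dict, Iterable

# Hypothetical placeholder types; the real TAPP file formats are not modelled here.
Grid = object
PeakList = object
PeakMatrix = object


def run_tapp_like_pipeline(
    raw_files: Iterable[str],
    mesh: Callable[[str], Grid],
    centroid: Callable[[Grid], PeakList],
    warp: Callable[[Dict[str, PeakList]], Dict[str, PeakList]],
    match: Callable[[Dict[str, PeakList]], PeakMatrix],
) -> PeakMatrix:
    """Chain four stage callables in the order listed above."""
    peak_lists: Dict[str, PeakList] = {}
    for path in raw_files:
        grid = mesh(path)                  # 1. resample raw data onto a regular grid
        peak_lists[path] = centroid(grid)  # 2. detect and quantify peaks
    aligned = warp(peak_lists)             # 3. correct retention time shifts across files
    return match(aligned)                  # 4. cluster matching peaks into one matrix
```

Passing the stages in as callables keeps the sketch independent of how each tool is actually invoked.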

Datasets:

  • 7*5=35 files of qTOF single-stage MS data in line mode (~30 MB per file and ~1.05 GB in total), obtained from porcine CSF spiked with various concentrations of horse heart cytochrome c. The expected size of all output files is ~1 GB.
  • 7*5=35 files of qTOF single-stage MS data in profile mode (25 GB per file and 1.05 TB in total), obtained from porcine CSF spiked with various concentrations of horse heart cytochrome c. The expected size of all output files is ~1 GB.

Testing methods: Testing will be performed with the TAPP pipeline on the two datasets, using the following two methods:

  • Raw starting files at the RUG DLSG cluster, with processing at the XXX cluster using the complete TAPP pipeline.
  • Raw starting files at the RUG DLSG cluster, with processing at other locations that have a dynamic lightpath connection and the workload distributed between the different sites.
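
For the second method, one simple way to spread files over the connected sites is round-robin assignment. The sketch below is only an assumption about how the work might be split: the scheduling policy is not specified for this use case, and the site labels and file names are placeholders.

```python
from typing import Dict, List, Sequence


def assign_round_robin(files: Sequence[str], sites: Sequence[str]) -> Dict[str, List[str]]:
    """Assign files to sites in turn so that site loads differ by at most one file."""
    if not sites:
        raise ValueError("at least one site is required")
    plan: Dict[str, List[str]] = {site: [] for site in sites}
    for index, name in enumerate(files):
        plan[sites[index % len(sites)]].append(name)
    return plan


# Placeholder names: 35 files, as in each dataset, over three hypothetical sites.
files = [f"run_{i:02d}.raw" for i in range(35)]
plan = assign_round_robin(files, ["site_a", "site_b", "site_c"])
```

Only steps that work on one file at a time could be split this way. The alignment and matching steps compare chromatograms with each other, so the per-file results would presumably have to be gathered in one place first.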

Implementation: This use case will use the TAPP pipeline implemented in DAF and will use the DRAC API to switch the dynamic lightpath on and off during file transfers between DLSG clusters. Main programmer: Isthiaq Ahmad.
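
Switching the lightpath on only for the duration of a transfer can be sketched with a context manager. The `LightpathClient` interface, its `activate` and `deactivate` methods, and the `transfer` callable are hypothetical stand-ins introduced for illustration; they are not the DRAC API or the DAF code.

```python
from contextlib import contextmanager
from typing import Callable, Iterable, Iterator, Protocol


class LightpathClient(Protocol):
    """Hypothetical interface; the real DRAC API calls are not shown here."""

    def activate(self, src: str, dst: str) -> None: ...

    def deactivate(self, src: str, dst: str) -> None: ...


@contextmanager
def lightpath(client: LightpathClient, src: str, dst: str) -> Iterator[None]:
    """Keep the lightpath up only while the enclosed block runs."""
    client.activate(src, dst)
    try:
        yield
    finally:
        # Tear the path down even if the transfer fails.
        client.deactivate(src, dst)


def transfer_files(
    client: LightpathClient,
    src: str,
    dst: str,
    files: Iterable[str],
    transfer: Callable[[str, str, str], None],
) -> None:
    """Copy each file from src to dst while the lightpath is active."""
    with lightpath(client, src, dst):
        for name in files:
            transfer(name, src, dst)
```

The `try`/`finally` is the point of the pattern: the path is released whether the transfer succeeds or raises, so a failed copy does not leave the connection reserved.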

Meetings

Skype meeting on 10 May 2011

Action points:

  • I will describe our use case, including the data, in the NBIC Wiki (done).
  • Peter Hinrich will suggest sites for testing DLP in DLSG.
  • Isthiaq will start implementing DLP in DAF and will use the TAPP pipeline as the use (test) case.
  • Jan Bot will submit an application for EYR 3.
  • Isthiaq Ahmad will start a Trac project for the DLP implementation (or incorporate it into the DAF project).