Proteomics:Programmers Meeting on 2010-01-26


Location and time

Antonius Deusinglaan 1, 9713 AV Groningen, Building 3211, room 0258

11:00 - 18:00



  • Investigate the Data Analysis Framework (hereafter DAF) source code and its general organization:
    • module separation
    • package naming
    • extensibility requirements
    • security & authentication
  • Learning how to compile and deploy Data Analysis Framework.
  • Configuring DAF, running jobs.
  • Exploring workflows typical for RUG.
  • Write documentation for DAF and provide a tutorial
  • Set development priorities
  • Discuss the TavernaGrid plugin issue

Minutes

  • How to make a tool inventory: ask users to fill in spreadsheets or to type directly into a wiki page. Conclusion: send a spreadsheet.
    • Open questions: is the tool usage data compatible with the proposed XSD model, and is it complete? The data will eventually have to be converted manually to the data models that the Data Analysis Framework (DAF) uses to integrate tools. Further information can be requested later if future extensions of the DAF data models require it.
  • Peter explained the typical workflow:
    1. Noise filtering & smoothing
    2. Peak detection & quantification
    3. Time alignment
    4. Peak matching
  • The Warp2D application note, covering the first service of DAF and the Platform, should be published ASAP (1st May). Hosting is a problem: NBIC is delayed in providing any services, and no estimates have been given so far.
    • DAF should be initially hosted at RUG.
  • Concerns about how to migrate from TavernaGrid to DAF.
  • Brief introduction to the source code
    • DAF uses the Job Description Language (JDL) to submit tasks to the GRID, with support from the Condor library
    • DAF uses the Topos layer to execute jobs more reliably
  • Priority of issues
  • Next programmers meeting
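
The four-stage workflow Peter described could be chained roughly as follows. This is only an illustrative sketch: the function names, the moving-average filter, the constant-shift alignment and the nearest-neighbour matching are all assumptions made for the example, not the algorithms used by the RUG tools or by DAF.

```python
from typing import List, Sequence, Tuple

Peak = Tuple[float, float]  # (retention time, intensity); hypothetical representation


def smooth(signal: Sequence[float], window: int = 3) -> List[float]:
    """Stage 1: noise filtering and smoothing with a simple moving average."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def detect_peaks(times: Sequence[float], signal: Sequence[float],
                 threshold: float) -> List[Peak]:
    """Stage 2: local maxima above a threshold, quantified by their height."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]:
            peaks.append((times[i], signal[i]))
    return peaks


def align(peaks: Sequence[Peak], reference: Sequence[Peak]) -> List[Peak]:
    """Stage 3: shift all times so that the most intense peaks coincide."""
    if not peaks or not reference:
        return list(peaks)
    shift = max(reference, key=lambda p: p[1])[0] - max(peaks, key=lambda p: p[1])[0]
    return [(t + shift, h) for t, h in peaks]


def match_peaks(a: Sequence[Peak], b: Sequence[Peak],
                tolerance: float) -> List[Tuple[Peak, Peak]]:
    """Stage 4: pair each peak in `a` with the nearest unused peak in `b`."""
    matches, used = [], set()
    for pa in a:
        best = None  # (distance, index into b)
        for j, pb in enumerate(b):
            if j in used:
                continue
            d = abs(pa[0] - pb[0])
            if d <= tolerance and (best is None or d < best[0]):
                best = (d, j)
        if best is not None:
            used.add(best[1])
            matches.append((pa, b[best[1]]))
    return matches


def run_workflow(times, sample, reference, threshold=1.0, tolerance=0.5):
    """Run the four stages in order on a sample and a reference trace."""
    ref_peaks = detect_peaks(times, smooth(reference), threshold)
    sample_peaks = detect_peaks(times, smooth(sample), threshold)
    return match_peaks(align(sample_peaks, ref_peaks), ref_peaks, tolerance)
```

Keeping each stage as a separate function with plain inputs and outputs mirrors the idea of wrapping each step as an independent tool that a workflow engine can schedule.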

Action points


  1. Tool inventory (done on 28 January 2010)
    • Finalize the spreadsheet (remove workflow-related tabs, call the section "use case" or "example")
    • Set up a publicly available FTP server with read/write access for a temporary user.
    • Upload the spreadsheet to the root FTP folder
    • Create initial directory structure
    • Upload the two sample open-access tools to show what is required for tool description
    • Upload the input/output example data files and describe example parameters
    • Send an email to the mailing list with complete instructions on how to proceed, for example:
      • Download the form from FTP server ftp://NNN using the credentials abc:bde
      • Fill in the form template
      • Create a folder for each tool
      • Upload the tool and all dependent libraries to the folder
      • For each tool use case, create a subfolder and upload the test data files to that subfolder (see the provided file structure as a reference)
  2. For each reported tool, check that you can execute it and that it produces the same output file for the given input file.
  3. Register the domain for the platform
  4. Organize temporary hosting in Groningen
  5. Deploy Warp2D client application to that hosting
  6. Do other preparations for the publication
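
The check in item 2 (run each reported tool and compare its output with the uploaded example) could be automated along these lines. This is a minimal sketch under stated assumptions: the helper names `sha256_of` and `tool_reproduces` are hypothetical, the tool is assumed to be runnable as a command line that writes a single output file, and "same output" is taken to mean byte-identical.

```python
import hashlib
import subprocess
from pathlib import Path
from typing import Sequence


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tool_reproduces(command: Sequence[str], produced: Path, expected: Path) -> bool:
    """Run `command` and report whether `produced` matches `expected`.

    `command` is the full command line of the tool (hypothetical layout:
    the tool itself decides where it writes `produced`).
    """
    # Remove a stale output so an old file cannot mask a failed run.
    if produced.exists():
        produced.unlink()
    result = subprocess.run(list(command), capture_output=True)
    if result.returncode != 0 or not produced.is_file():
        return False
    return sha256_of(produced) == sha256_of(expected)
```

A byte-for-byte comparison is the strictest option. Tools that embed timestamps or whose floating-point output varies between platforms would need a more tolerant, format-aware comparison.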


  1. Put two example tool descriptions from RUG on the wiki
  2. Implement one workflow for DAF using these tools
  3. Improve the proteomics page (done):
    • Rename it to "Proteomics platform"
    • Add work packages and describe the goals of the Platform with the help of Bas van Breukelen
    • List the PIs and programmers for each work package and who is doing what
  4. Prepare a slide showing the differences between the DAF and Taverna approaches. Show this slide to Andrew Stubbs and get his feedback on how to migrate (done, see TavernaGrid workflow processing and DAF workflow processing).
  5. Authentication module: investigate the possible solutions for simple HTTP authentication (done).
  6. Investigate the possible ways to create workflows (JOpera, P-GRADE, Taverna, Galaxy, others?). Gather requirements (ask Andrew Stubbs to provide a complicated workflow). Make a comparison matrix.
  7. Organize the programmers meeting before next proteomics meeting (done).
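
For item 5, one candidate for "simple HTTP authentication" is the standard HTTP Basic scheme, in which the client sends `Authorization: Basic <base64(user:password)>`. The sketch below only illustrates that scheme; the minutes do not say which solution was chosen for DAF, and the function names and the in-memory user table are hypothetical.

```python
import base64
import hmac
from typing import Dict, Optional, Tuple


def build_basic_header(user: str, password: str) -> str:
    """Client side: build the value of the Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def parse_basic_header(header: str) -> Optional[Tuple[str, str]]:
    """Server side: extract (user, password), or None if the header is malformed."""
    scheme, _, token = header.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(token.strip(), validate=True).decode("utf-8")
    except ValueError:  # covers invalid base64 and invalid UTF-8
        return None
    user, sep, password = decoded.partition(":")
    return (user, password) if sep else None


def is_authorized(header: str, users: Dict[str, str]) -> bool:
    """Check credentials against a user table (hypothetical in-memory store)."""
    parsed = parse_basic_header(header)
    if parsed is None:
        return False
    user, password = parsed
    stored = users.get(user)
    if stored is None:
        return False
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(stored.encode("utf-8"), password.encode("utf-8"))
```

Basic authentication sends the password in an easily reversible encoding, so it is only reasonable over an encrypted (HTTPS) connection, and a real deployment would store password hashes rather than plain text.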


  1. Provide the list of modules into which DAF can be split
  2. Split the project into modules.
  3. Provide the list of potential extensions of DAF. The list should contain improvements that go beyond the obligatory TODO list for DAF.


  1. Confirm that the information provided in the spreadsheet is sufficient to translate into the proposed XML tool model

TODO list:

  1. Start Data Analysis Framework (DAF) as a Trac project ASAP.
  2. Commit the new project to the Data Analysis Framework repository in NBIC gForge.
  3. Create a Data Analysis Framework (DAF) project in gForge and redirect it to Trac.
  4. Create a Data Analysis Framework (DAF) wiki page.