Proteomics:Gaining Momentum Meeting on 2010-01-15
Biomolecular Mass Spectrometry and Proteomics Group, Hugo R. Kruyt Building, Padualaan 8, 3584CA Utrecht, NL, room Z612
10:00 – 14:00, lunch included
- Rob Hooft, NBIC
- Bas van Breukelen, UU
- Perry Moerland
- Péter Horvatovich, RUG, DAF
- Silvia Olabarriaga
- Antoine van Kampen
- Huub Hoefsloot
- Twan America, WUR
- Carsten Byrman
- Morris Swertz, Introduction to MolGenis
- George Byelas
- Henk van den Toorn
- Andrew Stubbs, EMC
- We really need to come up with a list of short-term and long-term deliverables.
- An inventory of what we currently have: what we can all use already, what needs some adaptation, and what needs to be done to make it usable for everyone.
- Which programming technologies do we use, and which are we going to use?
- Who is responsible for each deliverable?
- Commitment to start using each other's tools.
- Come up with a good working showcase application, preferably something that has contributions from the participants, and commit to it.
- Introduction of Dmitry (in name only, as he cannot attend in person today), who will lead ("trekken") this bioinformatics theme together with me. He will focus mainly on the technical part (programming, programmers, deliverables), whereas I will start focusing more on the management of the whole project.
- Rob starts by introducing Dmitry.
Dmitry is going to work together with Bas as technical leader of the proteomics taskforce. The idea is that Dmitry will take care of the technical part of the project: where additional programming expertise is needed, whether the documentation is OK, how to get things to production level, and so on. He will visit the groups and try to get a clear overview of the deliverables, standards and frameworks used.
- Bas points out that this year (2010) more collaboration and interaction between groups is needed to make the platform a success. It is therefore desirable that the taskforce meets more regularly, roughly once every 4 to 8 weeks.
- Morris and Péter describe and show their implementation of DAF/Molgenis with a time-alignment algorithm. It is suggested to use this as the framework for the proteomics platform, as it is versatile and can run both on the GRID and on clusters. Péter and Morris propose that each group write up its deliverables and workflows, to create an overview of the "deliverables" that need to be implemented in DAF.
- TODO: Morris and Péter will create a template for describing tools and workflows and provide an example.
- Andrew (Rotterdam) points to a chicken-and-egg problem: tools are either ready but need a framework, or are written in a local framework and need to be ported to the DAF framework. Either way, you need DAF installed, preferably locally on a test machine.
- TODO: Andrew and Morris will work together to get DAF and the Rotterdam workflow implemented. (Andrew, you still need to fill out the forms.)
- Andrew also points out that he is working on the end of the "pipeline" and cannot always wait until the beginning is completed. After some debate it was concluded that this should not be a problem. What is agreed upon is that some data standards need to be set. For one, raw MS data (straight from the machine) needs to be converted to mzML, the current HUPO-PSI standard for MS(/MS) data. This is the entry point for the complete pipeline. Next, it was agreed that a standard for peak table / quantitative data is needed. A sub-taskforce was created (Twan, Péter, Andrew) to think of a good data format.
- TODO: within six weeks Péter will create a draft version of such a peak/feature table document that can accommodate data from search engines, labeled/unlabeled quantitative data, peaks, etc.
- Finally, it was agreed that one sample pipeline should be implemented first (a drawing of it will be sent around soon). Also, three datasets are going to be made available. Bas, Péter and ....? are going to provide these.
- Mailing list: Rob will make one (done).
- Website: the templates will be filled in 3 weeks from now (2 weeks to fill in the forms). We'll use the NBIC wiki for now. (Done; Dmitry will take the lead here.)
- It would be handy to have a test box, possibly hosted by NBIC, so that people can test tools developed by others. Rob will investigate whether NBIC can provide the hosting (Corra, DAF (is on SARA), Galaxy (also for Proteomics)).
- Example data (mzML) -> (mzML) peak detection (peak list) -> (peak list) alignment (skipped for labeled data) (alignment table / EMRT / APML?) -> (peaks or identified peptides) Stats / differential expression () -> () Biological Networks.
- Example data (mzML) -> (mzML) search engine (OMSSA will be easiest to use, as it is already implemented by SARA on the GRID) and/or de novo identification (pepXML) -> (pepXML) hooks into peak detection / alignment.
- Twan will arrange a draft for the data format that is the input for the Stats (deadline 6 weeks from now)
- Andrew/Perry will arrange a draft for the data format for the features that come out of the Stats and will be the input for Biological Network analysis.
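As an illustration only (nothing like this was shown at the meeting), the first sample pipeline above can be written down as a list of stages with the format each one consumes and produces. The `Stage` class and the `open_handoffs` helper are hypothetical names invented for this sketch. Formats that the minutes leave open (the alignment output, the Stats input and output, the Biological Networks input) are set to `None`, so the check simply lists the hand-offs that still wait for a draft.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Stage:
    """One pipeline step; a format of None means 'not yet agreed'."""
    name: str
    consumes: Optional[str]
    produces: Optional[str]


# Stage names and formats are taken from the sample pipeline in the minutes.
# None marks formats that are still to be drafted (assumption of this sketch).
PIPELINE = [
    Stage("peak detection", consumes="mzML", produces="peak list"),
    # Output still open: alignment table / EMRT / APML?
    Stage("alignment", consumes="peak list", produces=None),
    Stage("stats / differential expression", consumes=None, produces=None),
    Stage("biological networks", consumes=None, produces=None),
]


def open_handoffs(stages: List[Stage]) -> List[Tuple[str, str]]:
    """Return (producer, consumer) pairs without an agreed, matching format."""
    still_open = []
    for producer, consumer in zip(stages, stages[1:]):
        agreed = (
            producer.produces is not None
            and producer.produces == consumer.consumes
        )
        if not agreed:
            still_open.append((producer.name, consumer.name))
    return still_open


if __name__ == "__main__":
    for producer, consumer in open_handoffs(PIPELINE):
        print(f"format still to be drafted: {producer} -> {consumer}")
```

Filling in a format once a draft is agreed removes the corresponding hand-off from the list, which gives a quick view of which action points are still outstanding.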
Bas: reminder NPC meeting 16 Feb. There will be a dedicated bioinformatics corner. Please register a poster and if possible do a demo. Bas will arrange internet.
Rob: same for NBIC meeting 29-30 March.
Next meeting: Friday, March 5, 10:00, Kruyt building, Utrecht.