Difference between revisions of "Proteomics:Gaining Momentum Meeting on 2010-01-15"

* [[User:Bvanbreukelen|Bas van Breukelen]], [[UU]]
* [[User:Perry|Perry Moerland]]
* [[User:Horvatovich|Péter Horvatovich]], [[RUG]], [http://bars.rug.nl/download/385aa58813216b91 DAF]
* [[User:SilviaOlabarriaga|Silvia Olabarriaga]]
* [[User:AntoineVanKampen|Antoine van Kampen]]

Revision as of 22:13, 8 March 2010

Location

Biomolecular Mass Spectrometry and Proteomics Group, Hugo R. Kruyt Building, Padualaan 8, 3584CA Utrecht, NL, room Z612

Time

10:00 – 14:00, lunch included

Presenters

Agenda

  1. We really need to come up with a list of short- and long-term deliverables.
  2. An inventory of what we currently have: what we can all use already, what needs some adaptation, and what needs to be done to make it usable for everyone.
  3. Which programming technologies do we use now, and which are we going to use?
  4. Who is responsible for each deliverable?
  5. Commitment to start using each other's work.
  6. Come up with a good, working showcase application, preferably one with contributions from the participants, and commit to it.
  7. Introduction of Dmitry (in name only, as he cannot attend in person that day), who will lead ("trekken") this bioinformatics theme together with me. He will focus mainly on the technical part (programming, programmers, deliverables), whereas I will start focusing more on the management of the whole project.

Minutes notes

  • Rob starts by introducing Dmitry.
    Dmitry is going to work together with Bas as technical leader of the proteomics taskforce. The idea is that Dmitry will take care of the technical part of the project: where additional programming expertise is needed, whether the documentation is OK, how to get things to production level, and so on. He will visit the groups and try to get a clear overview of the deliverables, standards and frameworks used.
  • Bas points out that this year (2010) more collaboration and interaction between groups is needed to make the platform a success. It is therefore desirable that the taskforce meets more regularly, about once every 4-8 weeks.
  • Morris and Péter describe and show their implementation of DAF/Molgenis with a time-alignment algorithm. It is suggested to use this as the framework for the proteomics platform, as it is versatile and can run both on the GRID and on clusters. Péter and Morris propose that each group write up its deliverables and workflows to create an overview of the "deliverables" that need to be implemented on/in DAF.
  • TODO: Morris and Péter will create a template for describing tools and workflows and provide an example.
  • Andrew (Rotterdam) points to a chicken-and-egg problem: tools are either ready but need a framework, or are written in a local framework and need to be ported to the DAF framework. Either way, you need DAF installed, preferably locally on a test machine.
  • TODO: Andrew and Morris will work together on getting DAF and the Rotterdam workflow implemented. (Andrew, you still need to fill out the forms.)
  • Andrew also points out that he is working on the end of the "pipeline" and cannot always wait until the beginning is completed. After some debate, it was concluded that this should not be a problem. What is agreed upon is that some data standards need to be set. For one, raw MS data (straight from the machine) needs to be converted to mzML, the current HUPO-PSI standard for MS(MS) data. This is the entry point for the complete pipeline. Next, it was agreed that a standard for peak table/quantitative data is needed. A sub-taskforce was created (Twan, Péter, Andrew) to think of a good data format.
  • TODO: within six weeks Péter will create a draft version of such a peak/feature table document that can accommodate data from search engines, labeled/unlabeled quantitative data, peaks, etc.
  • Finally, it was agreed that one sample pipeline should be implemented first (this drawing will be sent around soon). Also, three datasets are going to be made available. Bas, Péter and ....? are going to provide these.

Action points

  1. Mailing list: Rob will make one (done)
  2. Website: templates will be filled in within 3 weeks from now (2 weeks to fill in the forms). We'll use the NBIC wiki for now. (Done; Dmitry will take the lead here.)
  3. It would be handy to have a test box, possibly hosted by NBIC, so people can test tools developed by others. Rob will investigate whether NBIC can provide the hosting (Corra, DAF (is on SARA), Galaxy (also for Proteomics)).
  4. Example data (mzML) -> (mzML) peak detection (peak list) -> (peak list) alignment (skipped for labeled data) (alignment table / EMRT / APML?) -> (peaks or identified peptides) Stats / differential expression () -> () Biological Networks.
  5. Example data (mzML) -> (mzML) search engine (OMSSA will be easiest to use, as it is already implemented by SARA on the GRID) and/or de novo identification (pepXML) -> (pepXML) hooks into peak detection / alignment.
  6. Twan will arrange a draft for the data format that is the input for the Stats (deadline 6 weeks from now)
  7. Andrew/Perry will arrange a draft for the data format for the features that come out of the Stats and will be the input for Biological Network analysis.
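
The quantification pipeline in action point 4 can be written down as a list of stages with the format each one reads and writes, which makes it easy to see which hand-offs still lack an agreed format. The following is only a minimal Python sketch under stated assumptions: the names `Stage`, `QUANT_PIPELINE` and `undecided_handoffs` are hypothetical and not part of DAF, Molgenis or any other tool mentioned here, and the format strings are copied from the minutes rather than taken from a specification.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Stage:
    """One pipeline step; a format of None means "not agreed yet"."""
    name: str
    input_format: Optional[str]
    output_format: Optional[str]


# Stages and formats as listed in action point 4 (hypothetical encoding).
# None stands for the empty "()" placeholders in the minutes.
QUANT_PIPELINE: List[Stage] = [
    Stage("peak detection", "mzML", "peak list"),
    Stage("alignment", "peak list", "alignment table / EMRT / APML?"),
    Stage("stats / differential expression", "peaks or identified peptides", None),
    Stage("biological networks", None, None),
]


def undecided_handoffs(pipeline: List[Stage]) -> List[Tuple[str, str]]:
    """Return (producer, consumer) pairs whose exchange format is still open."""
    open_pairs = []
    for producer, consumer in zip(pipeline, pipeline[1:]):
        if producer.output_format is None or consumer.input_format is None:
            open_pairs.append((producer.name, consumer.name))
    return open_pairs


if __name__ == "__main__":
    for producer, consumer in undecided_handoffs(QUANT_PIPELINE):
        print(f"format still to be drafted: {producer} -> {consumer}")
```

With the formats as recorded above, the only hand-off reported as open is the one between Stats and Biological Networks, which corresponds to the draft assigned in action point 7.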

Important dates:

Bas: reminder NPC meeting 16 Feb. There will be a dedicated bioinformatics corner. Please register a poster and if possible do a demo. Bas will arrange internet.

Rob: same for NBIC meeting 29-30 March.

Next meeting: Friday, March 5, 10:00, Kruyt building, Utrecht.