From BioAssist
Revision as of 12:47, 27 May 2009 by Mswertz

Managing life science information

PhD course Managing life science information

Target audience: Bioinformatics PhD students
Lecturers: Ammar Benabdelkader, Peter Boncz, Andrew Gibson, Frank van Harmelen, Iwan Herman, M. Scott Marshall, Barend Mons, Marco Roos, Morris Swertz
Guest lectures: Carole Goble, Katy Wolstencroft
Contact: M. Scott Marshall, Marco Roos
Date: 25-29 May 2009
Location: Informatics Institute, F0.09, Science Park Amsterdam, the Netherlands
Hands-on facilities are limited for participants who do not bring their own laptop with Wi-Fi.
Lecture points: 3 EC (3 weeks)


Considering the complexity of biological systems, it is not surprising that the management of life science information is one of the most challenging aspects of bioinformatics. For example, (medical) biologists have compiled over 17 million papers, and well over a thousand databases are known. However, a large number of information resources end up in an already formidable 'data graveyard'. Following this course can help you prevent your data or your information management system from suffering the same fate.

Target audience

This course is for you if you would like to learn how to manage data and information powerfully and flexibly for your bioinformatics application, how to work with data in distributed databases or through Web Services and workflows, or how a Web 2.0 approach can help you reach out to users and leverage their contributions. We assume a basic understanding of (relational) databases and programming.

Course description

This course introduces modern techniques for the management of life science data and knowledge for bioinformatics applications. Students will gain insight into Semantic Web languages and tools, federated databases, (Taverna) workflows, and Web 2.0. After following this course, students should be able to start creating their first applications based on these technologies or make more informed design decisions for their current application.



  • Days 1 and 2 - Knowledge-based information management
    • Where you will
      • learn how Semantic Web languages and tools can be used to manage biological data 'intelligently'
      • acquire some hands-on experience with these languages and tools
      • learn what OWL and RDF mean and why they exist
      • learn about community-based science
  • Day 3 - Database workhorses
    • Where you will
      • learn how to use relational databases for managing heterogeneous and distributed data
      • learn how laboratory information can be realistically managed, using MOLGENIS as an example
      • get hands-on experience with PostgreSQL and MOLGENIS
  • Day 4 - Taverna and web services for collaborative data integration
    • Where you will
      • get a full tutorial on applying Taverna to implement data integration pipelines
      • get hands-on experience with Taverna
  • Day 5 - Hands-on Semantic Data integration
    • Where you will deploy what you have learned on your own application or on an example case, with experts present
  • Day 5 + two weeks - Practical assignment
    • Where you will work on a case to put your new information management skills into practice (more details follow)
    • We will celebrate your skills during a mini-symposium and drinks at the very end of the course


  • Thursday (e-BioLab)
    • 9.00 'Introduction to Taverna' by Katy Wolstencroft
    • 10.30 Coffee and tea
    • 11.00 'Taverna tutorial' by Katy Wolstencroft
    • 12.30 Lunch
    • 13.30 Hands-on with Taverna
    • 15.00 Coffee and tea
    • 15.30 Hands-on with Taverna, continued
  • Friday (e-BioLab/F0.09)
    • 9.00 'Health Care and Life Science interest group of W3C: Applications and Practice' by M. Scott Marshall (e-BioLab)
    • Hands-on with experts (e-BioLab)
    • 12.30 Lunch + Hands-on continued (e-BioLab)
    • 13.30 'The Seven Sins of Bioinformatics' by Carole Goble (F0.09)
    • 15.00 Discussion and wrap-up (F0.09)
    • 16.00 Two-week practical assignments, closure, and drinks (F0.09)
  • Friday 12 June - Mini-symposium Information management in Life Science (F0.09 - tentative)

Digital support

To collect and share the results of the hands-on sessions, students are requested to sign up to and join the BioWiseInformationManagement2009 group. Students and lecturers are also members of the Google group BioWiseInformationManagement2009 for sharing documents, in particular for the two-week assignment that follows the week of lectures. (Follow the links to sign up and view the groups.)

Recommended Software

We would appreciate it if you could bring your own laptop. The software that we will use in the course includes:

Protégé 4
Taverna 1.7.1 (NB: look out for Taverna 2.1 beta on the same site; it is close to being released)
SWObjects (enables federation of SPARQL queries, also to SQL)

Please consider installing these software tools. If applicable, also bring an example of a database you are working on so that you can use it in the hands-on sessions (e.g. wrap your MySQL database using MOLGENIS). Don't worry if you have trouble installing anything; we will help, especially with software that is critical for the course.

Example Data, Applications, and Lab Practicals

Practice with OWL:

Protégé guide (.doc)
Ontology assignment (.doc)

Semantic Web Data Integration (UCSC ENCODE Application):

Original web page description of SWEDI
Web page accompanying journal article about SWEDI
Relevant Data for SWEDI

Practice with MOLGENIS:

MOLGENIS practical guide (.pdf)

Recommended Reading

The Semantic Web for the Working Ontologist (book)
The Semantic Web Primer (book)
A Journey to Semantic Web Query Federation in Life Sciences (accepted, BMC Bioinformatics)
Towards a cyberinfrastructure for the biological sciences: progress, visions and challenges, by Lincoln Stein (Nature Reviews)
Automation of in-silico data analysis processes through workflow management systems, by Paolo Romano (Briefings in Bioinformatics)
Beyond standardization: dynamic software infrastructures for systems biology, by Morris Swertz and Ritsert Jansen (Nature Reviews)
Calling on a million minds for community annotation in WikiProteins, by Barend Mons et al. (BMC Bioinformatics)
Pharmas Nudge Semantic Web Technology Toward Practical Drug Discovery Applications (GenomeWeb article)
Concept Web Alliance Hits Ground Running in Bid to Harness Semantic Web for Life Sciences (GenomeWeb article)

Related Links

The W3C Semantic Web Health Care and Life Sciences Interest Group

More information and registration

Please visit or contact M. Scott Marshall or Marco Roos.