BioWiseInformationManagement2009
Managing life science information
PhD course Managing life science information
- Target audience
- Bioinformatics PhD students
- Lecturers
- Ammar Benabdelkader, Peter Boncz, Andrew Gibson, Frank van Harmelen, Ivan Herman, M. Scott Marshall, Barend Mons, Marco Roos, Morris Swertz
- Guest lectures
- Carole Goble, Katy Wolstencroft
- Coordinators
- M. Scott Marshall, Marco Roos
- Date
- 25-29 May 2009
- Location
- Informatics Institute, F0.09, Science Park Amsterdam, the Netherlands
- Limitations
- For participants without their own laptop with wifi we have limited hands-on facilities.
- Lecture points
- 3EC (3 weeks)
Motivation
Considering the complexity of biological systems, it is not surprising that the management of life science information is one of the most challenging aspects of bioinformatics. For example, (medical) biologists have compiled over 17 million papers, and well over a thousand databases are known. However, a large number of information resources end up in an already formidable 'data graveyard'. Following this course can help you prevent your data or your information management system from suffering the same fate.
Target audience
If you would like to learn how to perform powerful and flexible data and information management for your bioinformatics application, how to work with data in distributed databases or through Web Services and workflows, or how a Web 2.0 approach can help you reach out to users and leverage their contributions, then this course is for you. We assume a basic understanding of (relational) databases and programming.
Course description
This course introduces modern techniques for the management of life science data and knowledge for bioinformatics applications. Students will gain insight into Semantic Web languages and tools, federated databases, (Taverna) workflows, and Web 2.0. After following this course, students should be able to start creating their first applications based on these technologies or make more informed design decisions for their current application.
Programme
Overview
- Day 1 and 2 - Knowledge-based information management
- Where you will
- learn how Semantic Web languages and tools can be used to manage biological data 'intelligently'
- acquire some hands-on experience with these languages and tools
- know what OWL and RDF mean and why they exist
- learn about community-based science
- Day 3 - Database workhorses
- Where you will
- learn how to use relational databases for managing heterogeneous and distributed data
- learn how laboratory information can be realistically managed, using MolGenis as an example
- get hands-on experience with PostgreSQL and MolGenis
- Day 4 - Taverna and web services for collaborative data integration
- Where you will
- get a full tutorial on applying Taverna to implement data integration pipelines
- get hands-on experience with Taverna
- Day 5 - Hands-on Semantic Data integration
- Where you will deploy what you have learned on your own application or on an example case, with experts present
- Day 5 + two weeks - Practical assignment
- Where you will work on a case to apply your new information management skills (more details follow)
- We will celebrate your skills during a mini-symposium and drinks at the very end of the course
Agenda
- Monday (F0.09)
- 9.00 Reception
- 9.30 Welcome by Scott Marshall and Marco Roos
- 10.30 Break
- 11.00 'Why Semantic Web for e-bioScience?' by Frank van Harmelen
- 12.30 Lunch
- 13.30 'Managing distributed data' by Ammar Benabdelkader
- 15.00 Break
- 15.30 'A million minds to manage life science information' by Barend Mons
- 17.00 Welcome Reception (Drinks!)
- Tuesday (F0.09)
- 9.00 'Introduction to Semantic Web languages and tools' by Ivan Herman
- 10.30 Break
- 11.00 'Introduction to Semantic Web languages and tools (cont.)' by Ivan Herman
- 12.30 Lunch
- 13.30 'Creating biological ontologies for applications' by Andrew Gibson
- 15.00 Coffee & Tea
- 15.30 Semantic Web hands-on session
- Wednesday
- 9.00 'Introduction to the activities of the Health Care and Life Science interest group of W3C' by M. Scott Marshall (F0.09)
- 9.30 'Managing distributed data (cont.)' by Ammar Benabdelkader (F0.09)
- 10.30 Break
- 11.00 'Querying XML-based resources' by Peter Boncz (F0.09)
- 12.30 Lunch
- 13.30 'Practical laboratory information management' by Morris Swertz (e-BioLab)
- 15.00 Coffee & Tea
- 15.30 Hands on with MolGenis by Morris Swertz (e-BioLab)
- Thursday (e-BioLab)
- 9.00 'Introduction to Taverna' by Katy Wolstencroft
- 10.30 Coffee and tea
- 11.00 'Taverna tutorial' by Katy Wolstencroft
- 12.30 Lunch
- 13.30 Hands on with Taverna
- 15.00 Coffee and Tea
- 15.30 Hands on with Taverna cont.
- Friday
- Hands-on with experts (e-BioLab)
- 12.30 Lunch + Hands-on continued (e-BioLab)
- 13.30 'The Seven Sins of Bioinformatics' by Carole Goble (F0.09)
- 15.00 Discussion and wrap-up (F0.09)
- 16.00 Two-week practical assignments, closure, and drinks (F0.09)
- Friday June 12 - Minisymposium Information management in Life Science (F0.09 - tentative)
Recommended Software
We would appreciate it if you could bring your own laptop. The software that we will use in the course includes:
- Protégé 4
- PostgreSQL
- MolGenis
- Taverna 1.7.1 (NB: look out for Taverna 2.1 beta on the same site; it is close to being released)
Please consider installing these software tools. If applicable, also bring an example of a database you are working on so that you can use it in the hands-on sessions (e.g. wrap your MySQL database using MolGenis). Don't worry if you have trouble installing them; we will help, especially with software that is critical for the course.
Recommended Reading
- The Semantic Web for the Working Ontologist
- A Journey to Semantic Web Query Federation in Life Sciences
- Towards a cyberinfrastructure for the biological sciences: progress, visions and challenges, by Lincoln Stein in Nature Reviews
- Automation of in-silico data analysis processes through workflow management systems, Paolo Romano in Briefings in Bioinformatics
- Beyond standardization: dynamic software infrastructures for systems biology, Morris Swertz and Ritsert Jansen, Nature Reviews
- Calling on a million minds for community annotation in WikiProteins, Barend Mons et al., BMC Bioinformatics
- Pharmas Nudge Semantic Web Technology Toward Practical Drug Discovery Applications
More information and registration
Please visit http://www.nbic.nl/biowise/school/EduProg/InfoMan09/ or contact M. Scott Marshall or Marco Roos