BioWiseInformationManagement2009
Managing life science information
PhD course Managing life science information
- Target audience
- Bioinformatics PhD students
- Lecturers
- Ammar Benabdelkader, Peter Boncz, Andrew Gibson, Frank van Harmelen, Ivan Herman, M. Scott Marshall, Barend Mons, Marco Roos, Morris Swertz
- Guest lectures
- Carole Goble, Katy Wolstencroft
- Coordinators
- M. Scott Marshall, Marco Roos
- Date
- 25-29 May 2009
- Location
- Informatics Institute, F0.09, Science Park Amsterdam, the Netherlands
- Limitations
- Hands-on facilities are limited for participants who do not bring their own laptop with wifi.
- Lecture points
- 3EC (3 weeks)
Motivation
Considering the complexity of biological systems, it is not surprising that the management of life science information is one of the most challenging aspects of bioinformatics. For example, (medical) biologists have compiled over 17 million papers, and well over a thousand databases are known. However, a large number of information resources end up in an already formidable 'data graveyard'. Following this course can help you prevent your data or your information management system from suffering the same fate.
Target audience
If you would like to learn how to perform powerful and flexible data and information management for your bioinformatics application, how to work with data in distributed databases or through Web Services and workflows, or how a Web2.0 approach can help you reach out to users and leverage their contributions, then this course is for you. We assume a basic understanding of (relational) databases and programming.
Course description
This course introduces modern techniques for the management of life science data and knowledge for bioinformatics applications. Students will gain insight into Semantic Web languages and tools, federated databases, (Taverna) workflows, and Web2.0. After following this course students should be able to start creating their first applications based on these technologies or make more informed design decisions for their current application.
Programme
Overview
- Day 1 and 2 - Knowledge-based information management
- Where you will
- learn how Semantic Web languages and tools can be used to manage biological data 'intelligently'
- acquire some hands-on experience with these languages and tools
- know what OWL and RDF mean and why they exist
- learn about community-based science
- Day 3 - Database workhorses
- Where you will
- learn how to use relational databases for managing heterogeneous and distributed data
- learn how laboratory information can be realistically managed, example: MolGenis
- get hands-on experience with PostgreSQL and MolGenis
- Day 4 - Taverna and web services for collaborative data integration
- Where you will
- get a full tutorial on applying Taverna to implement data integration pipelines
- get hands-on experience with Taverna
- Day 5 - Hands-on Semantic Data integration
- Where you will deploy what you have learned on your own application or on an example case, with experts present
- Day 5 + two weeks - Practical assignment
- Where you will work on a case to apply your new skills in information management (more details follow)
- We will celebrate your skills during a mini-symposium and drinks at the very end of the course
Agenda
- Monday (F0.09)
- 9.00 Reception
- 9.30 Welcome by Scott Marshall and Marco Roos
- 10.30 Break
- 11.00 Why Semantic Web for e-bioScience? by Frank van Harmelen
- 12.30 Lunch
- 13.30 Managing distributed data by Ammar Benabdelkader
- 15.00 Break
- 15.30 A million minds to manage life science information (.pptx) by Barend Mons
- 17.00 Welcome Reception (Drinks!)
- Tuesday (F0.09)
- 9.00 Introduction to Semantic Web languages and tools by Ivan Herman
- 10.30 Break
- 11.00 Introduction to Semantic Web languages and tools (cont.) by Ivan Herman
- 12.30 Lunch
- 13.30 Creating biological ontologies for applications by Andrew Gibson
- 15.00 Coffee & Tea
- 15.30 Semantic Web hands-on session
- Wednesday (F0.09/e-BioLab)
- 9.30 'Practical laboratory information management' by Morris Swertz (F0.09)
- 10.30 Break
- 11.00 Querying XML-based resources by Peter Boncz (F0.09)
- 12.30 Lunch
- 13.30 'Practical laboratory information management' by Morris Swertz (e-BioLab)
- 15.00 Coffee & Tea
- 15.30 Hands on with MolGenis by Morris Swertz (e-BioLab)
- XX.XX 'Managing distributed data (cont.)' by Ammar Benabdelkader (e-BioLab, tentative)
- Thursday (e-BioLab)
- 9.00 'Introduction to Taverna' by Katy Wolstencroft
- 10.30 Coffee and tea
- 11.00 'Taverna tutorial' by Katy Wolstencroft
- 12.30 Lunch
- 13.30 Hands on with Taverna
- 15.00 Coffee and Tea
- 15.30 Hands on with Taverna cont.
- Friday (e-BioLab/F0.09)
- 9.00 'Health Care and Life Science interest group of W3C: Applications and Practice' by M. Scott Marshall (e-BioLab)
- Hands-on with experts (e-BioLab)
- 12.30 Lunch + Hands-on continued (e-BioLab)
- 13.30 'The Seven Sins of Bioinformatics' by Carole Goble (F0.09)
- 15.00 Discussion and wrap-up (F0.09)
- 16.00 Two-week practical assignments, closure, and drinks (F0.09)
- Friday June 12 - Minisymposium Information management in Life Science (F0.09 - tentative)
Digital support
To collect and share the results of the hands-on sessions, students are requested to sign up to myExperiment.org and join the BioWiseInformationManagement2009 group. Students and lecturers are members of the Google group BioWiseInformationManagement2009 for sharing documents, in particular for the two-week assignment following the week of lectures. (Follow the links to sign up and view the groups.)
Recommended Software
We would appreciate it if you could bring your own laptop. The software that we will use in the course includes:
- Protégé 4
- PostgreSQL
- MolGenis
- Taverna 1.7.1 (NB: look out for Taverna 2.1 beta on the same site; it is close to being released)
- e-DBI
- SWObjects, which enables federation of SPARQL queries (also to SQL)
Please consider installing these software tools. If applicable, also bring an example of a database you are working on so you can use it in the hands-on sessions (e.g. wrap your MySQL database using MOLGENIS). Don't worry if you have trouble installing anything; we will help, especially with software that is critical for the course.
Example Data, Applications, and Lab Practicals
Practice with OWL:
- Protege Guide (.doc)
- Ontology assignment (.doc)
Semantic Web Data Integration (UCSC ENCODE Application):
Practice with MOLGENIS:
- MOLGENIS practical guide (.pdf)
- Code for Address Book exercises (optional, it is also in the pdf)
- Code for Biomaterial exercises (optional, it is also in the pdf)
Recommended Reading
- The Semantic Web for the Working Ontologist (book)
- The Semantic Web Primer (book)
- A Journey to Semantic Web Query Federation in Life Sciences (accepted, BMC Bioinformatics)
- Towards a cyberinfrastructure for the biological sciences: progress, visions and challenges, by Lincoln Stein (Nature Reviews)
- Automation of in-silico data analysis processes through workflow management systems, by Paolo Romano (Briefings in Bioinformatics)
- Beyond standardization: dynamic software infrastructures for systems biology, by Morris Swertz and Ritsert Jansen (Nature Reviews)
- Calling on a million minds for community annotation in WikiProteins, by Barend Mons et al. (BMC Bioinformatics)
- Pharmas Nudge Semantic Web Technology Toward Practical Drug Discovery Applications (GenomeWeb article)
- Concept Web Alliance Hits Ground Running in Bid to Harness Semantic Web for Life Sciences (GenomeWeb article)
Related Links
The W3C Semantic Web Health Care and Life Sciences Interest Group
More information and registration
Please visit http://www.nbic.nl/biowise/school/EduProg/InfoMan09/ or contact M. Scott Marshall or Marco Roos