NGS Member Overview

This is a list of the people involved in the Next Generation Sequencing task force of NBIC BioAssist.

Scientific leader: Prof. Johan den Dunnen (LUMC/CMSB)

Scientific advisors: Dr. Victor Guryev (Hubrecht), Dr. Kai Ye (LUMC), Dr. Jeroen Laros (LUMC)

Project leader: Dr. Hailiang (Leon) Mei (NBIC)

Key scientists:

Associated Groups

  • Amsterdam Medical Center, Clinical Epidemiology, Biostatistics and Bioinformatics group
  • University of Amsterdam, MicroArray Department & Integrative BioInformatics Unit
  • VUMC, Clinical Genetics department
  • Erasmus Medical Center, Biomics group
  • Erasmus Medical Center, Bioinformatics group
  • Erasmus Medical Center, Complex Genetics group
  • Hubrecht Institute, Genome Biology group
  • Leiden University Medical Center, Center for Human and Clinical Genetics
  • Leiden University Medical Center, Molecular Epidemiology group
  • Nijmegen Centre for Molecular Life Sciences, Bacterial Genomics group
  • Nijmegen Centre for Molecular Life Sciences, Department of Human Genetics
  • SARA, High Performance Computing and Visualization
  • Technical University Delft, Bioinformatics group
  • University Medical Center Groningen, Genomics Coordination Centre
  • Wageningen University, Bioinformatics group
  • Wageningen University, Central Veterinary Institute
  • Wageningen University, Animal Breeding and Genomics Centre
  • The Netherlands Institute of Ecology
  • NKI, core sequencing facility
  • Maastricht University, Genome Center
  • KeyGene N.V., Bioinformatics group
  • BaseClear, Bioinformatics group
  • DSM, Bioinformatics group
  • Rijk Zwaan, Bioinformatics group
  • GeneTwister, Bioinformatics group

@Wageningen

Jan van Haarst (WUR-PRI)

Profile on networks: LinkedIn

We are currently busy with de novo assembly of

  • Tomato: Mostly 454 Ti data, plus Sanger reads and a couple of SOLiD runs (in total about 90 Gbase)
  • Potato: Mostly Illumina reads, ranging from 75-125 bp, plus Sanger reads and a few 454 runs.

We have experience with

  • Newbler
  • CABOG
  • MIRA
  • SOAPdenovo

At the moment I'm trying to get ABySS, Velvet and EULER-SR to work on the potato set.

  • Programming skills: Bash, Perl and Python.
  • OS : Windows, Linux, OSX

We mostly run the software on our own cluster(s), one of which contains a machine with 256GB.

Alex Bossers (WUR-CVI)

Profile on networks: LinkedIn

Our group at CVI-WUR mainly focuses on:

  • Reverse vaccinology (NGS + other omics)
  • Pathogen genome plasticity (CGH high-density arrays and NGS)
  • Host-pathogen interactions (mostly micro-array based but "switching/adding NGS")
  • other PathoGenOmics including protein microarrays

Experience:

  • Programming skills: Perl/CGI, PHP, XHTML/CSS, Bash, Cheetah, Java, JavaScript, Matlab, Assembler (68k), (Visual) Basic for Applications.
  • Data management: MySQL, SQLite, MS SQL/Access
  • OS: Windows 3.1->7 x64, Linux (Ubuntu x64), Bio-Linux
  • Favorite apps: Artemis/ACT, MUMmer, PHPMyAdmin, DNAstar, Axon Acuity, putty, Xming, Notepad++, EasyEclipse for LAMP.

NGS:

  • mostly pathogen genome sequencing (bacteria and viruses)
  • starting RNA-seq for transcriptomics and starting meta-genomics for virus discovery
  • mostly Roche 454 (FLX, Ti (paired-end)) and some Illumina mate-pair (all outsourced/collaborative)

At WUR we have just set up a central production and development server dedicated to Galaxy workflow management (together with PRI).

Freddy de Bree (WUR-CVI)

I used to work on the folding and molecular characteristics of secretory proteins. This brought me into the realm of structural biology. An interest in wide-scale physiological responses, however, brought me into gene expression studies and in particular microarray data analysis. This is where I developed most of my programming and statistical skills. Currently I still work on microarray data on the side, but mainly on NGS data for pathogen discovery.

Current Interests: Pathogen Discovery (NGS or microarray), Assembly, Mapping, Probe Design, Databases

Programming: mainly Perl, R and shell, but also MySQL and a bit of Java and Python

OS: mainly Linux-variants and Windows, used to work on a Mac

Martin Elferink (WUR-ABG)

LinkedIn profile: [1]

My research aims to identify the causative variants involved in pulmonary hypertension syndrome (PHS) in chickens. For this purpose, we re-sequenced the genomes of 16 individuals that were selected for extreme PHS phenotypes. Re-sequencing was performed on the Illumina HiSeq 2000 platform. The project involves the detection of SNPs and structural variants within the 16 individuals. The main focus will be on genetic variants within QTL regions identified by a previous GWA study.

Research interest:

  • Detection of genetic variation, in particular structural variants.
  • Genetic variation - phenotype relation.

Software:

  • Mosaik Aligner
  • BWA
  • Stampy
  • Samtools
  • GATK
  • Picard
  • Annovar
  • Haploview

Programming languages:

  • Python
  • Shell scripting


Hendrik-Jan Megens (WUR-ABGC)

Profile on LinkedIn

Senior Researcher and lecturer at Wageningen University, Animal Breeding and Genomics Centre.

Scientific interests:

Coming from a wet-lab background I discovered I had more talent for programming than for pipetting. I have moved into applied bioinformatics in the past 7 years, while retaining focus on my research interests:

  • evolutionary genomics (generation and maintenance of, and selection on, structural and single nucleotide variation; speciation and outbreeding depression; inbreeding depression and heterosis)
  • population genetics (genetic consequences of population management, domestication, selection)

Main current projects

  • Genome sequencing of pig and turkey (genetic map construction, repetitive element evolution)
  • Re-sequencing projects on various livestock species

We are currently sequencing >300 pigs, wild boar, and outgroup species. The project aims to elucidate major patterns in biogeography and domestication of the pig, resulting from selection and demography.

Procedures

  • Main sequencing platform is Illumina (we started in 2008 on the Solexa GA and currently use the Illumina HiSeq).
  • Depending on the research question, various short-read mapping programs are used (Mosaik, BWA, BWA/Stampy, MrsFAST)
  • Variant calling is done mostly with SAMtools, but we are currently investigating other software (GATK).
  • Functional analysis of variants (Annovar, custom-scripted tools)
  • Various population- and phylogenomic approaches to tackle specific questions (RAxML, Beagle, coalhmm, etc., and custom-scripted tools)

Infrastructure

  • Two 48-core (Opteron) HP ProLiant machines, one with 192 GB and one with 512 GB RAM.
  • Storage capacity is 110 TB.

Main programming languages:

  • shell scripting
  • Perl
  • Python
  • R
  • SQL

Favorite distros: Fedora/CentOS

Mattias de Hollander (NIOO-KNAW)

Profile on networks: LinkedIn[2]

Mattias de Hollander works as an 'embedded' bioinformatician at the Netherlands Institute of Ecology (NIOO) in Wageningen[3]. The NIOO is a research institute of the Royal Netherlands Academy of Arts and Sciences (KNAW) and conducts marine, terrestrial and freshwater ecological research, with the aim of elucidating how living organisms interact with each other and with their surroundings.

We analyse 454 Titanium sequencing data of 16S/18S rRNA genes and metatranscriptomes to discover the presence and function of bacteria and fungi living in different types of soils under various conditions. Analysis is mainly done using the tools incorporated in QIIME: Quantitative Insights Into Microbial Ecology[4].

Currently I am working on porting the Cloud version of Galaxy to the HPC Cloud @ SARA.

Programming language / IT platforms:

  • Mostly python
  • Linux (Ubuntu), local multicore server, Lisa cluster (SARA), HPC Cloud (SARA)

@Amsterdam

Barbera van Schaik (AMC)

Profile on networks: LinkedIn[5], myExperiment[6], Bioinformatics Laboratory at AMC[7]

Sequence platforms: Roche FLX/Titanium, ABI SOLiD

Software we use: bwa, samtools, varscan, annovar, blat, blast, roche package, celera assembler (cabog), R, Solid RNApipeline, IGV, and many more

Sequence applications we (have) work(ed) on:

  • Basic stuff: group sequences per MID, count things, quality control, run existing analysis software
  • Splice variant detection
  • Metagenomics (virus discovery)
  • Sequence assembly and comparison of bacteria strains
  • Small RNAs
  • Re-sequencing (genome, exome)
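
As an illustration of the "group sequences per MID" step listed above, here is a minimal Python sketch. It is not the AMC pipeline: the function name group_by_mid, the short barcode tags and the read tuples are all hypothetical, and it assumes each read starts directly with its MID tag (any sequencing key already trimmed) and that tags must match exactly.

```python
from collections import defaultdict


def group_by_mid(reads, mids):
    """Group (name, sequence) reads by the MID tag at the start of each sequence.

    reads: iterable of (read_name, sequence) tuples (hypothetical input format).
    mids:  dict mapping a MID label to its tag sequence (hypothetical tags).
    Returns a dict: MID label -> list of (read_name, sequence without the tag).
    Reads that match no tag are collected under "unassigned".
    """
    groups = defaultdict(list)
    for name, seq in reads:
        for label, tag in mids.items():
            if seq.startswith(tag):
                # Strip the tag so downstream tools see only the insert.
                groups[label].append((name, seq[len(tag):]))
                break
        else:
            # No tag matched exactly; keep the read untouched.
            groups["unassigned"].append((name, seq))
    return dict(groups)


def count_per_mid(groups):
    """Return the number of reads per MID label."""
    return {label: len(members) for label, members in groups.items()}
```

Calling group_by_mid with a list of reads and a label-to-tag dictionary returns the reads split per sample; count_per_mid then gives the "count things" part. Real data would usually need tolerance for sequencing errors in the tag, which this sketch leaves out.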

Programming language / IT platforms:

  • Perl and shell scripting
  • Linux, Dutch grid and cloud[8], EBioInfra[9]

Aldo Jongejan (AMC)

Profile on networks: LinkedIn [10]

I started as a wet-lab chemist (growing bacteria in 100 L fermentors, isolating enzymes and measuring kinetics, synthesis of (modified) organic cofactors). Because we wanted a model of our enzyme, I moved into computer modelling by building homology models and doing ab initio quantum mechanical calculations. I continued modelling and hacking programs/scripting as a post-doc on drug design and virtual screening for G-protein coupled receptors at the VU Amsterdam. After 3 years as a teacher in bioinformatics at the Hogeschool Leiden I joined the bioinformatics dept. of the AMC [11] to work on exome sequencing.

Programming 'skills' (I don't consider myself a programmer, more a hacker of codes :-) ):

  • unix scripting languages (awk, sed,..)
  • python
  • perl, java, C, R

OS-es:

  • MacOSX, (l)unix, windows

Experience:

  • enzyme kinetics, biochemistry, organic chemistry
  • homology modelling
  • ab initio calculations
  • drug design
  • teaching
  • next generation sequencing (mostly SOLiD)
  • ...

Mateusz Kuzak (UvA)

I am a scientific programmer in the MicroArray Department at the Swammerdam Institute for Life Sciences, University of Amsterdam.

I am currently working on allelic variation visualization in the VLPB (Virtual Lab for Plant Breeding) project. The aim of the work is to develop a web-based application for exploration and visualization of naturally occurring single nucleotide polymorphisms (SNPs) from next-generation sequencing data. I am building the application mostly in JavaScript, with some Perl and Python code.

I have experience in microscopy image analysis, Python, MATLAB and a little bit of Java programming. I am currently learning a lot about web programming on both the client and server side, mostly in JavaScript (jQuery, Node.js), and about data visualization with the excellent d3 library.

@Nijmegen

Victor de Jager (CMBI/UMCN)

Sequence applications:

  • I am currently working on a metagenomics project involving (1) gut microbiota and (2) a complex starter culture. Focus and expertise on functional diversity, strain and variation detection.
  • SNP/indel detection and annotation
  • Sequence assembly (de novo and resequencing) of bacteria
  • Taxonomy determination

Sequence platforms: Roche FLX/Titanium, Illumina Solexa

Software that we use: blat (local+grid), blast (Local+grid), Roche newbler, Celera WGS Assembler, Soapsnp, Maq, Megan, RDP(Arb), FGweb, Django, GMOD Tools

Software that we develop: robust SNP/indel detection & annotation tool (Perl), metagenomics annotation pipeline using a grid version of InterProScan, metagenomics mining tool based on HMMs

Programming language / IT platforms:

  • Programming languages: (Bio)Perl, (Bio)Python, R, Shell scripting, Taverna
  • Databases: PostgreSQL, MySQL
  • Linux, Windows, Mac

Christian Gilissen (UMCN)

We're working with the Roche/454 Titanium sequencer, mainly in combination with enrichment strategies. We use this for gene identification by finding causative mutations in the DNA of (human) patients. We've set up a Java pipeline for automated analysis of this kind of data and the prioritization of variants. A second application is the use of these techniques in a diagnostic setting in combination with barcoding.

We also work with a SOLiD sequencer for doing exome enrichment. We're setting up an analysis pipeline similar to the one for the Roche, in combination with the lifetech Bioscope package.

Programming languages etc.: Java, Python, Bash, Torque, MySQL, Linux and Windows.

Experience with: Roche software (newbler), lifetech Bioscope

@Groningen

Freerk van Dijk (UMCG)

I'm currently working at the University Medical Center in Groningen in the group of Morris Swertz. We mostly do sequencing data analysis from human genetics (exome) experiments. I'm involved in developing and maintaining the analysis pipeline of the "Genoom van Nederland" project. Incorporated in this pipeline are GATK, BWA and Picard. The purpose of this analysis pipeline is to perform alignment and to detect SNPs and indels.

Sequencing platforms:

  • Illumina HiSeq 2000

Programming languages & Expertise:

  • Perl, Java, Python, R, MySQL, HTML/CSS
  • Galaxy
  • Linux, Mac OSX

Pieter Neerincx (UMCG)

I'm working in the group of Morris Swertz @ UMCG

Research interests

  • Natural (sequence) variation and anything related to evolution.
  • Workflow management systems and related technology that allows wet-lab biologists to explore NGS data.

Favorite tools

  • Good ol' SRS.
  • Taverna, Galaxy and, more recently, Molgenis.
  • Ensembl
  • Perl, AppleScript, Python.
  • Apache and MySQL

Wil Bruins-Koetsier (UMCG)

Sequencing:

  • I work at the UMC Groningen in the group of Morris Swertz. We have a HiSeq at our disposal and previously used a Genome Analyzer IIx. We're developing/maintaining an analysis pipeline for alignment and SNP calling (GATK, BWA, Picard); indels will follow. I mainly work on our in-house human genetics projects (whole exome; other applications such as small RNAs, mapping translocations and ChIP-seq may follow).

IT:

  • Perl, Java, bash/shell scripting, R, MySQL, HTML/CSS
  • Taverna
  • Linux (Fedora is kind of my thing), Mac

I do not attend every meeting, but you may see me occasionally.

@Leiden

Jeroen Laros (LUMC)

Head of bioinformatics at the Leiden Genome Technology Center (LGTC), which is part of Human Genetics at the Leiden University Medical Center. Studied computer science and did his PhD in Leiden.

Sequencing platforms:

  • Illumina GAII
  • Illumina HiSeq 2000
  • Helicos
  • Roche 454

Expertise:

  • Algorithms / Programming (various high level languages, scripting languages, assemblers)
  • Data mining
  • Computer networks / Operating systems
  • Human Genome Variation nomenclature

Zuotian Tatum (LUMC)

I am a scientific programmer at the Department of Human Genetics, Leiden University Medical Center.

Programming languages:

  • PHP
  • Python
  • C#
  • MySQL/T-SQL/PostgreSQL
  • JavaScript (jQuery)

Martijn Vermaat (LUMC)

I am a scientific programmer at the Department of Human Genetics, Leiden University Medical Center. Studied computer science at VU University Amsterdam.

Main projects:

Expertise:

  • NGS analysis pipelines
  • Programming languages
  • Databases

Marten Boetzer (BaseClear)

Leon Mei (NBIC)

Leon Mei is the primary contact person in the BioAssist Engineering Team for the sequencing platform.

Profile on networks: LinkedIn[12], myExperiment[13]

I have work experience in software architecture, bio-signal processing, and task assignment and scheduling algorithms.

Programming language / IT platforms:

  • Programming languages: Java, C, SQL, Python, Matlab, PHP

@Rotterdam

Rutger Brouwer (EMC)

Rutger Brouwer is a BioAssist programmer who specializes in data visualization solutions for next-generation sequencing data.

Sequencing platforms:

  • Illumina Genome Analyzer IIx
  • Illumina HiSeq 2000

Expertise:

  • Currently I am working at the Erasmus Medical Center, at the Center for Biomics. Here we deal with a wide range of (mostly eukaryotic) organisms. Our group primarily performs functional sequencing experiments, such as ChIP-seq and RNA-seq. Visualization is very important in these experiments as the data derived from NGS can be very complex.
  • My previous expertise lies in the field of DNA microarray analysis and prokaryote genetics. There I did, among other things, visualization of complex data.

Programming languages: Perl, C++, Python, BASH, R

Mirjam van den Hout (EMC)

Mirjam van den Hout works at the Erasmus MC Centre for Biomics as a bioinformatician, performing data management, setting up automated data analysis pipelines, and performing specific data analysis for all our NGS experiments. We currently work on ChIP-seq, RNA-seq, exome sequencing (SNP and variant detection), methylation, and 3C/4C.

Slavik Koval (EMC)

NGS data analysis. Alignment pipelines, variant calling (SNPs, SVs). GWAS. Genetic networks.

@Maastricht

Bart de Koning (Maastricht University)

@Delft

Jurgen Nijkamp (TU Delft)

Profile on network: LinkedIn[14]

I currently work on a de novo assembly of a yeast strain.

Sequence platforms: Illumina and 454

Software: Maq, Velvet, Nucmer, Lookseq and ABySS

Programming language / IT platforms: Mac OS X / Perl / Matlab / Shell scripting

Former Members

  • Frans Paul Ruzius (Hubrecht)
  • Matthew Hestand (LUMC)
  • Joris Lops (UMCG)