NGS Generic Pipelines
Overview of NGS Pipeline
In general, any NGS task consists of five main steps. The NBIC NGS platform's collaborative effort focuses on steps 3, 4, and 5.
1, Sample preparation
Lab workers need to select and culture the biological samples that will be used as the NGS study materials.
2, Machine sequencing
Run the prepared samples on the sequencing machines.
3, Quality control
After the reads are produced, we run a quick check to determine whether the sequencing run was successful and remove as many faulty reads as possible. Typical tasks in this step include:
- Unique: (To be extended)
- Barcode: (To be extended)
- Linker edit: (To be extended)
- Check contamination: If some reads match an organism (e.g., a bacterium) that is not part of the study, the sample may be contaminated by that organism.
- QC bases, e.g., read and trim: (To be extended)
- Deal with "N" reads: E.g., change all "N" (unknown) bases to "A".
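The "N" handling in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is taken to be plain four-line FASTQ records, and the helper names (parse_fastq, replace_unknown_bases, n_fraction) are hypothetical and not part of any NBIC pipeline. Replacing "N" with "A" follows the example given in the list; whether that substitution is appropriate depends on the downstream tool.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical record type: (header, sequence, quality string)
FastqRecord = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from four-line FASTQ input (assumed format).

    For simplicity this sketch loads all lines into memory first.
    """
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    if len(rows) % 4 != 0:
        raise ValueError("FASTQ input must have four lines per record")
    for i in range(0, len(rows), 4):
        header, seq, _separator, qual = rows[i:i + 4]
        yield header, seq, qual


def replace_unknown_bases(seq: str, replacement: str = "A") -> str:
    """Replace every unknown base (N or n) with the given base."""
    return seq.replace("N", replacement).replace("n", replacement.lower())


def n_fraction(seq: str) -> float:
    """Return the fraction of bases in the read that are unknown."""
    if not seq:
        return 0.0
    return (seq.count("N") + seq.count("n")) / len(seq)


# Small in-memory example instead of a real FASTQ file.
example = ["@read1\n", "ACGNNTAN\n", "+\n", "IIIIIIII\n"]
for header, seq, qual in parse_fastq(example):
    print(header, replace_unknown_bases(seq), n_fraction(seq))
```

Reporting the fraction of unknown bases alongside the substitution makes it possible to discard reads that consist mostly of "N" rather than silently converting them.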
4, Assembly or alignment
Depending on the purpose, we either perform a de-novo assembly task or an alignment/mapping task. Refer to the generic pipelines presented below.
5, Analysis and reporting
The resulting genome, gene sequences, SNPs, indels, or other statistical reports are provided.
NBIC NGS Alignment Pipeline
Possible tools and formats to be included:
- The EMC Bioinformatics group (Stephan Nouwens) uses an Illumina infrastructure: GApipeline + CASAVA for SNP calling.
- GApipeline: Input as FastQ; Output as scarf
- CASAVA: Input as scarf; Output as an Illumina build (viewable only in GenomeStudio, commercial software from Illumina)
NBIC NGS De-novo Assembly Pipeline
Reporting and visualization
The requirements for result analysis and reporting are very diverse and end-user specific. We will look into them in more detail.
We will look into the possibility of hosting a local UCSC browser, which could be integrated with the Galaxy server. As Wilfred pointed out in the meeting, a good user interface is crucial, and we could learn useful tips from CLC Bio. We will also look into customizing the Galaxy and UCSC browser interfaces.
Common platform
All NGS pipelines are shared via the NBIC Galaxy Server.