NGS Generic Pipelines



Overview of NGS Pipeline

In general, any NGS task consists of five main steps. The NBIC NGS platform's collaborative effort focuses on steps 3, 4, and 5.

5 major steps in NGS generic pipeline

1, Sample preparation

Lab workers need to select and culture the biological samples that will be used as the NGS study materials.

2, Machine sequencing

The prepared samples are run on the sequencing machines to produce reads.

3, Quality control

After the reads are produced, we do a quick check to determine whether the run was successful and remove as many faulty reads as possible. Typical tasks in this step include:

  • Unique: (To be extended)
  • Barcode: (To be extended)
  • Linker edit: (To be extended)
  • Check contamination: If some reads match an organism (e.g., a bacterium) that is not part of the study, the sample may be contaminated by that organism.
  • QC bases, e.g., read and trim: (To be extended)
  • Deal with "N" bases in reads: e.g., change all "N" (unknown) bases to "A".
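
As a rough illustration of the last two tasks, the following Python sketch trims low-quality bases from the 3' end of each read and then replaces the remaining "N" bases with "A". It is not part of the NBIC pipeline: the function names, the Phred+33 quality offset, the threshold of 20, and the minimum read length are all assumptions chosen for the example (older Illumina data may use an offset of 64 instead).

```python
from typing import Iterable, Iterator, List, Tuple

# (header, sequence, quality string) for one read
FastqRecord = Tuple[str, str, str]


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield (header, sequence, quality) from four-line FASTQ records."""
    stripped = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(stripped) - 3, 4):
        header, seq, _plus, qual = stripped[i:i + 4]
        yield header, seq, qual


def trim_low_quality_tail(seq: str, qual: str,
                          min_phred: int = 20,
                          offset: int = 33) -> Tuple[str, str]:
    """Cut bases off the 3' end while their quality is below min_phred.

    offset=33 assumes Phred+33 encoding (an assumption of this sketch).
    """
    keep = len(seq)
    while keep > 0 and ord(qual[keep - 1]) - offset < min_phred:
        keep -= 1
    return seq[:keep], qual[:keep]


def replace_unknown_bases(seq: str, substitute: str = "A") -> str:
    """Change every unknown base "N" to the substitute base."""
    return seq.replace("N", substitute)


def quality_control(records: Iterable[FastqRecord],
                    min_length: int = 4) -> List[FastqRecord]:
    """Trim each read, drop reads that become too short, then fix "N" bases."""
    cleaned = []
    for header, seq, qual in records:
        seq, qual = trim_low_quality_tail(seq, qual)
        if len(seq) < min_length:
            continue  # read is too short to be useful after trimming
        cleaned.append((header, replace_unknown_bases(seq), qual))
    return cleaned
```

For example, passing the records parsed from a FASTQ file to `quality_control` returns only the reads that survive trimming, with their "N" bases replaced. Whether replacing "N" with "A" is appropriate depends on the downstream aligner or assembler; some tools handle "N" themselves.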

4, Assembly or alignment

Depending on the purpose, we perform either a de-novo assembly or an alignment/mapping task. Refer to the generic pipelines presented below.

5, Analysis and reporting

The resulting genome, gene sequences, SNPs, indels, or other statistical reports are provided.

NBIC NGS Alignment Pipeline

NGS Alignment generic pipeline

Possible tools and formats to be included:

  1. The EMC Bioinformatics group (Stephan Nouwens) uses an Illumina infrastructure: GApipeline + CASAVA for SNP calling.
    • GApipeline: input is FastQ; output is scarf
    • CASAVA: input is scarf; output is an Illumina build (viewable only in GenomeStudio, commercial software from Illumina)

NBIC NGS De-novo Assembly Pipeline

NGS de-novo assembly generic pipeline

Reporting and visualization

The requirements for result analysis and reporting are very diverse and end-user specific. We will look into them in more detail.

We will look into the possibility of hosting a local UCSC browser, which could be integrated with the Galaxy server. As Wilfred pointed out in the meeting, a good user interface is crucial, and we could learn useful tips from CLC Bio. We will look into customizing the Galaxy and UCSC browser interfaces.

Common platform

All NGS pipelines are shared via the NBIC Galaxy Server.