NBIC and BITS (the bioinformatics facility of the VIB, http://www.bits.vib.be) both maintain Galaxy servers that focus on NGS data analysis. To ease administration of these servers, they will establish a Bioinformatics RPM Repository of bioinformatics tools, with NGS tools as the primary focus. The aim is a stable repository of easily installable packages for common bioinformatics tools that can be used in, for example, Galaxy.

You are welcome to join this effort, which is just starting. If you are interested, please email hailiang dot mei at nbic dot nl or bits at vib.be.


Proposed structure

We will use a Trac repository to host our spec files. The spec files will be stored in trunk under the basename of the application. Each released version of a spec file will be tagged with the basename plus the version of the program.
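The naming scheme above could be sketched as follows. This is only an illustration: the directory layout (trunk/<basename>/ and tags/<basename>-<version>/), the "-" separator, and the example tool and version are assumptions, not decisions recorded on this page.

```python
from pathlib import PurePosixPath


def spec_path(basename: str) -> PurePosixPath:
    """Working copy of a spec file in trunk (assumed layout)."""
    return PurePosixPath("trunk") / basename / f"{basename}.spec"


def tag_path(basename: str, version: str) -> PurePosixPath:
    """Tag for a released spec file: basename + version (assumed '-' separator)."""
    return PurePosixPath("tags") / f"{basename}-{version}" / f"{basename}.spec"


# Hypothetical example: a spec file for BWA version 0.5.9.
print(spec_path("bwa"))          # trunk/bwa/bwa.spec
print(tag_path("bwa", "0.5.9"))  # tags/bwa-0.5.9/bwa.spec
```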

Based on the spec files we will create two repositories. The first has the latest version of each tool and adheres to the LSB file hierarchy (i.e. it uses the directories /usr/bin, /usr/lib, etc.). The second has versioned RPMs of the packages, installed in the directory structure proposed by the Galaxy Team: $GALAXY_APP/basename/version.
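To make the difference between the two layouts concrete, here is a minimal sketch of where a tool would end up in each repository. The function names, the example value for $GALAXY_APP, and the example tool and version are hypothetical; only the /usr prefix and the $GALAXY_APP/basename/version pattern come from the description above.

```python
from pathlib import PurePosixPath


def lsb_prefix() -> PurePosixPath:
    """Prefix for the 'latest version' repository: standard system directories."""
    return PurePosixPath("/usr")


def galaxy_prefix(galaxy_app: str, basename: str, version: str) -> PurePosixPath:
    """Prefix for the versioned repository: $GALAXY_APP/basename/version."""
    return PurePosixPath(galaxy_app) / basename / version


# Hypothetical values: GALAXY_APP=/opt/galaxy/apps, samtools 0.1.12.
print(lsb_prefix() / "bin" / "samtools")
# /usr/bin/samtools
print(galaxy_prefix("/opt/galaxy/apps", "samtools", "0.1.12") / "bin" / "samtools")
# /opt/galaxy/apps/samtools/0.1.12/bin/samtools
```

Because the version is part of the path in the second layout, several versions of the same tool can be installed side by side, which the first layout does not allow.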

The packages will be signed by NBIC. We will publish the public key for the packages in the repository.

Build tools

To build the RPM packages in a controlled manner we will use Mock. The build environment will be a CentOS 5 virtual machine.
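As an illustration of a Mock build, the sketch below assembles the command line that rebuilds a source RPM in a clean chroot. The options -r, --resultdir and --rebuild are standard Mock options; the configuration name epel-5-x86_64, the result directory, and the source RPM file name are assumptions that depend on the local setup.

```python
import shlex


def mock_rebuild_command(srpm, config="epel-5-x86_64", resultdir="results"):
    """Return the argument list for rebuilding one source RPM with Mock.

    config and resultdir are assumed defaults, not values fixed by this project.
    """
    return ["mock", "-r", config, "--resultdir", resultdir, "--rebuild", srpm]


# Hypothetical source RPM name.
cmd = mock_rebuild_command("bwa-0.5.9-1.src.rpm")
print(" ".join(shlex.quote(arg) for arg in cmd))
# mock -r epel-5-x86_64 --resultdir results --rebuild bwa-0.5.9-1.src.rpm
```

On the build machine the list can be passed directly to subprocess.run(cmd, check=True); it is only printed here so the example has no side effects.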

Some more information on building RPMs:

Overview of tools

Tool | NGS pipeline part | RPM? | Link to source | Priority | Maintainer
bfast | Aligner | Yes [1] | http://sourceforge.net/apps/mediawiki/bfast/index.php?title=Main_Page | |
Bowtie | Aligner | U.D. [2] | http://bowtie-bio.sourceforge.net/index.shtml | | BITS
BWA | Aligner | Yes [3] | http://bio-bwa.sourceforge.net/ | |
Cgatools | Format conversion - Analysis: SNP detection | No | http://cgatools.sourceforge.net/ | | BITS
FastQC | Quality control | No | http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/ | |
Fastxtoolkit | Quality control | Yes [4] and dep. [5] | http://hannonlab.cshl.edu/fastx_toolkit/ | |
Filo | Format conversion | No | https://github.com/arq5x/filo/ | |
HiTec | Quality control | No | http://www.csd.uwo.ca/~ilie/HiTEC/ | |
Maq | Aligner | Yes [6] | http://maq.sourceforge.net | |
Mosaik-aligner | Aligner | No | http://code.google.com/p/mosaik-aligner/ | |
PASS | Aligner | No | http://pass.cribi.unipd.it/cgi-bin/pass.pl?action=Download | |
Picard-tools | Format conversion | No | http://picard.sourceforge.net/command-line-overview.shtml | |
Pindel | Analysis: SV | No | https://trac.nbic.nl/pindel/ | | David van Enckevort
Samstat | Quality control | No | http://samstat.sourceforge.net/ | | BITS
Samtools | Format conversion | Yes (rpmsearch) | http://samtools.sourceforge.net/ | |
Bamtools | Format conversion | No | http://sourceforge.net/projects/bamtools/ | |
Bamview | Visualisation | No | http://bamview.sourceforge.net/ | |
Shrimp | Aligner | No | http://compbio.cs.toronto.edu/shrimp/ | |
SOAP | Aligner | No | http://soap.genomics.org.cn/soapaligner.html | |
SOCS | Aligner - Solid | No | http://solidsoftwaretools.com/gf/project/socs/ | |
Tabix | Format conversion | U.D. [7] | http://sourceforge.net/projects/samtools/files/ | |
Vcftools | Format conversion - Data mining | U.D. [8] | http://vcftools.sourceforge.net/ | |
Genomeanalysis TK | Quality control - Analysis | No | http://www.broadinstitute.org/gsa/wiki/index.php/Main_Page#The_Genome_Analysis_Toolkit_.28GATK.29 | |
pe-asm | Assembling | No | http://code.google.com/p/pe-asm/ | |
perl-Bio-SamTools | Format conversion | Yes [9] | http://code.google.com/p/pe-asm/ | |
PerM | Aligner | No | http://code.google.com/p/perm/ | |
prinseq | Quality control | No | http://prinseq.sourceforge.net | |
Kent Tools | Format conversion | No | http://hgwdev.cse.ucsc.edu/~kent/ | |
Bedtools | Data mining | Yes [10] | | |
Repeatmasker | Cleaning | No | http://www.repeatmasker.org/ | |

Additional links to NGS tools

Additional Linux package repositories for bioinformatics tools