NBIC & VIB Bioinformatics RPM Repository



NBIC and BITS (the bioinformatics facility of the VIB, http://www.bits.vib.be) both maintain Galaxy servers that focus on NGS data analysis. To ease administration of these servers, they will establish a Bioinformatics RPM Repository of bioinformatics tools, with NGS tools as the primary focus. The aim is a stable repository of easily installable packages for common bioinformatics tools that can be used in, for example, Galaxy.

You are welcome to join our effort, which is just starting. If you are interested or want to get involved, please email hailiang dot mei at nbic dot nl or bits at vib.be for more information.

Proposed structure

We will use a Trac repository to host our spec files. The spec files will be stored in trunk under the basename of the application. Each released version of a spec file will be tagged with the basename plus the version of the program.
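The naming scheme described above can be sketched in a few lines of Python. This is only an illustration, not the project's tooling: the helper names are hypothetical, and the hyphen joining basename and version is an assumption, since the text only specifies "basename + version".

```python
def trunk_location(basename: str) -> str:
    """Location of the working spec file, stored in trunk under the basename."""
    return f"trunk/{basename}"


def tag_name(basename: str, version: str) -> str:
    """Tag for a released spec file: basename plus program version.

    The hyphen separator is an assumption, not something the plan prescribes.
    """
    return f"{basename}-{version}"


if __name__ == "__main__":
    print(trunk_location("bwa"))
    print(tag_name("bwa", "0.5.9"))
```

The version string "0.5.9" is used purely as sample input.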

Based on the spec files we will create two repositories. The first repository holds the latest version of each tool and adheres to the LSB file hierarchy (i.e. it uses the directories /usr/bin, /usr/lib, etc.). The second repository holds versioned RPMs of the different packages, which can be installed in the directory structure proposed by the Galaxy Team: $GALAXY_APP/basename/version.
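To make the difference between the two layouts concrete, here is a minimal Python sketch that computes the install prefix for a package in each repository. The function name, the layout labels "lsb" and "galaxy", and the default value used for $GALAXY_APP are all assumptions made for illustration.

```python
import posixpath


def install_prefix(basename: str, version: str, layout: str,
                   galaxy_app: str = "/opt/galaxy/apps") -> str:
    """Return the install prefix for a package under one of the two layouts.

    layout="lsb"    -> system-wide prefix (/usr/bin, /usr/lib, ...)
    layout="galaxy" -> $GALAXY_APP/basename/version
    The galaxy_app default is a hypothetical value for $GALAXY_APP.
    """
    if layout == "lsb":
        return "/usr"
    if layout == "galaxy":
        return posixpath.join(galaxy_app, basename, version)
    raise ValueError(f"unknown layout: {layout!r}")


if __name__ == "__main__":
    print(install_prefix("samtools", "0.1.18", "lsb"))
    print(install_prefix("samtools", "0.1.18", "galaxy"))
```

Because the versioned layout includes the version in the path, several versions of the same tool can be installed side by side, whereas the LSB layout holds only one.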

The packages will be signed by NBIC. We will publish the public key for the packages in the repository.
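Once the public key is published, a client could enable signature checking in its yum configuration. The following Python sketch renders such a repository stanza. The repository id, name, and both URLs are placeholders (no actual locations have been announced); the keys `baseurl`, `enabled`, `gpgcheck`, and `gpgkey` are standard yum repository options.

```python
def repo_stanza(repo_id: str, name: str, baseurl: str, gpgkey_url: str) -> str:
    """Render a yum .repo stanza that enforces GPG signature checking."""
    lines = [
        f"[{repo_id}]",
        f"name={name}",
        f"baseurl={baseurl}",
        "enabled=1",
        "gpgcheck=1",            # refuse packages that are not signed with the key
        f"gpgkey={gpgkey_url}",  # where yum fetches the public key from
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    print(repo_stanza(
        "bioinformatics",
        "Bioinformatics RPM Repository",
        "http://example.org/rpm/el5/$basearch/",
        "http://example.org/rpm/RPM-GPG-KEY-example",
    ))
```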

Build tools

To build the RPM packages in a controlled manner we will use Mock. The build environment will be a CentOS 5 virtual machine.
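As a rough sketch of what a controlled build could look like, the Python snippet below assembles a Mock command line for rebuilding a source RPM. It only builds and prints the command; it does not run it. The configuration name `epel-5-x86_64`, the source RPM file name, and the result directory are assumptions chosen to match a CentOS 5 target, not settings taken from the actual build machine.

```python
import shlex
from typing import List, Optional


def mock_rebuild_command(srpm: str, config: str = "epel-5-x86_64",
                         resultdir: Optional[str] = None) -> List[str]:
    """Build the argv for rebuilding a source RPM in a clean Mock chroot."""
    cmd = ["mock", "-r", config]
    if resultdir is not None:
        # Directory where Mock writes the built RPMs and build logs.
        cmd += ["--resultdir", resultdir]
    cmd += ["--rebuild", srpm]
    return cmd


if __name__ == "__main__":
    # Hypothetical package file name and result directory.
    cmd = mock_rebuild_command("bwa-0.5.9-1.src.rpm", resultdir="/tmp/results")
    print(shlex.join(cmd))
```

To actually run the build, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where Mock is installed and the user is in the `mock` group.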

Some more information on building RPMs:

Overview tools

Tools highlighted in orange are the ones we are currently working on.

| Tool | NGS pipeline part | RPM? | Link to source | License issues | Priority | Maintainer |
|---|---|---|---|---|---|---|
| agile | Aligner | No | http://users.eecs.northwestern.edu/~smi539/agile.html | | | |
| bfast | Aligner | Yes [1] | http://sourceforge.net/apps/mediawiki/bfast/index.php?title=Main_Page | | | Adam Huffman |
| Bowtie | Aligner | U.D. [2] | http://bowtie-bio.sourceforge.net/index.shtml | | | BITS Adam Huffman |
| BWA | Aligner | Yes [3] | http://bio-bwa.sourceforge.net/ | | | Adam Huffman |
| drFast | Aligner (only solid) | No | http://drfast.sourceforge.net/ | | | |
| FR-HIT | Aligner | No | http://weizhong-lab.ucsd.edu/frhit/ | | | |
| GASSST | Aligner | No | http://www.irisa.fr/symbiose/projects/gassst/ | | | |
| gnumap | Aligner | No | http://dna.cs.byu.edu/gnumap/ | | | |
| Maq | Aligner | Yes [4] | http://maq.sourceforge.net | | | |
| MIRA | Aligner (in mapping mode) | No | http://sourceforge.net/projects/mira-assembler/ | | | |
| MOM | Aligner | No | http://mom.csbc.vcu.edu/ | | | |
| Mosaik-aligner | Aligner | No | http://code.google.com/p/mosaik-aligner/ | | | Luc Ducazu |
| NovoAlign | Aligner | No | http://www.novocraft.com/main/page.php?id=968 | | | |
| PASS | Aligner | No | http://pass.cribi.unipd.it/cgi-bin/pass.pl?action=Download | | | |
| SSAHA2 | Aligner | No | http://www.sanger.ac.uk/resources/software/ssaha2/ | | | |
| Shrimp | Aligner | No | http://compbio.cs.toronto.edu/shrimp/ | | | |
| SOAP | Aligner | No | http://soap.genomics.org.cn/soapaligner.html | | | |
| SOCS | Aligner - Solid | No | http://solidsoftwaretools.com/gf/project/socs/ | | | |
| PerM | Aligner | No | http://code.google.com/p/perm/ | | | |
| RaserS | Aligner | No | http://www.seqan.de/downloads/projects.html#c13 | | | |
| Zoom | Aligner | No | http://www.bioinformaticssolutions.com/all-products/zoom/index.php | | | |
| Cgatools | Format conversion - Analysis: SNP detection | No | http://cgatools.sourceforge.net/ | | | BITS Joachim Jacob |
| FastQC | Quality control | No | http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/ | | | |
| Fastxtoolkit | Quality control | Yes [5] and dep. [6] | http://hannonlab.cshl.edu/fastx_toolkit/ | | | Adam Huffman |
| Filo | Format conversion | U.D. [7] | https://github.com/arq5x/filo/ | | | Adam Huffman |
| HiTec | Quality control | No | http://www.csd.uwo.ca/~ilie/HiTEC/ | | | |
| Picard-tools | Format conversion | U.D. | http://picard.sourceforge.net/command-line-overview.shtml | | | Adam Huffman |
| Pindel | Analysis: SV | No | https://trac.nbic.nl/pindel/ | | | NBIC David van Enckevort |
| Samstat | Quality control | No | http://samstat.sourceforge.net/ | | | BITS Luc Ducazu |
| Samtools | Format conversion | Yes (rpmsearch) | http://samtools.sourceforge.net/ | | | Adam Huffman (EPEL branches) |
| Bamtools | Format conversion | No | https://github.com/pezmaster31/bamtools | | | |
| Bamview | Visualisation | No | http://bamview.sourceforge.net/ | | | |
| Tabix | Format conversion | U.D. [8] | http://sourceforge.net/projects/samtools/files/ | | | Adam Huffman |
| Vcftools | Format conversion - Data mining | U.D. [9] | http://vcftools.sourceforge.net/ | | | Adam Huffman |
| Genomeanalysis TK | Quality control - Analysis | No | http://www.broadinstitute.org/gsa/wiki/index.php/Main_Page#The_Genome_Analysis_Toolkit_.28GATK.29 | | | |
| pe-asm | Assembling | No | http://code.google.com/p/pe-asm/ | | | |
| perl-Bio-SamTools | Format conversion | Yes [10] | http://code.google.com/p/pe-asm/ | | | Adam Huffman |
| prinseq | Quality control | No | http://prinseq.sourceforge.net | | | |
| Kent Tools | Format conversion | No | http://hgwdev.cse.ucsc.edu/~kent/ | | | Adam Huffman |
| Bedtools | Data mining | Yes [11] | | | | Adam Huffman |
| Repeatmasker | Cleaning | No | http://www.repeatmasker.org/ | | | |

Additional links to NGS tools


Additional Linux package repositories for bioinformatics tools