Galaxy VM

 
Latest revision as of 13:02, 24 January 2012

Introduction

In collaboration with the people running the NBIC Galaxy Server, the BioAssist Engineering Team has written scripts that extract a Galaxy virtual machine with all the NBIC Galaxy tools available. Users can download the virtual machine and run it on their local hardware.

Our goals are:

  • Easy installation and configuration of a local Galaxy server
  • Run a local Galaxy so that you don't need to transfer tens of GB of data over the Internet or worry about data security and privacy
  • Share all NGS/Proteomics pipelines developed by NBIC developers via the NBIC Galaxy VM

Minimal system requirements

The virtual machine can run with as little as 1 GB of RAM assigned to it. It has a 100 GB virtual hard disk and requires a single network interface that is connected to your network (no NAT or local machine network). The actual amount of memory required depends on the number of users and the kind of jobs you run. When working with large datasets, assign more memory than the minimum.

Installation

  • Step 1: Install VM software

To run the VM you will need a version of the VMware hypervisor. For evaluation or small deployments the free versions (VMware Player, VMware Server or VMware vSphere Hypervisor (ESXi)) can be used. For a production environment we recommend VMware vSphere. Other hypervisors are currently not directly supported by NBIC, but if you disable the VMware Tools in the virtual machine you should be able to run it in KVM, VirtualBox, Xen or VirtualPC as well. We are always interested to hear about your experiences with other hypervisors.

  • Step 2: Download NBIC Galaxy VM

The latest virtual machine image is available at our VM repository (http://downloads.nbiceng.net/galaxy/). It is a VMware virtual machine image.

  • Step 3: Load Galaxy VM
    • The virtual machine image is based on CentOS 5.5 64-bit (Red Hat Enterprise Linux compatible). Create a new virtual machine for this distribution or the closest match. In the virtual machine, configure memory and networking, and point the virtual disk to the Galaxy (.vmdk) disk image.
    • For VMware Player users, the following steps load the .vmdk disk:
      1. Open VMware Player -> "Create a New Virtual Machine"
      2. Choose "I will install the operating system later" -> "Next"
      3. Choose "Linux" with the version of "Red Hat Enterprise Linux 5 64-bit" -> "Next"
      4. Give a name, e.g. "galaxyVM" -> "Next"
      5. Keep the default settings on the "Specify Disk Capacity" page, since this disk will be removed anyway -> "Next"
      6. "Ready to Create Virtual Machine" -> "Finish"
      7. You are now back in the "VMware Player" window, and a VM called "galaxyVM" should be listed in the left panel. Select "galaxyVM" and click "Edit virtual machine settings".
      8. In "Virtual Machine Settings", remove Hard Disk from the Hardware list.
      9. Click "Add" and choose "Hard Disk" -> "Next"
      10. On the "Select a Disk" page, choose "Use an existing virtual disk" -> "Next"
      11. At "Select an Existing Disk", point to the .vmdk disk image.
      12. You are now done. Save the settings and start the Galaxy VM by clicking "Play virtual machine".
    • For Oracle VirtualBox (version 4.1.8 at the time of writing):
      1. Open VirtualBox Manager -> "New"
      2. A wizard appears with some introductory text -> "Next"
      3. Choose a name, e.g. "galaxyVM", and choose "Linux" with the version "Red Hat 64-bit" in the Operating System section -> "Next"
      4. Choose the memory size, e.g. 2045 MB (this can be adjusted later) -> "Next"
      5. Choose "Use existing hard disk", then browse to and select the .vmdk file.
      6. "Ready to Create Virtual Machine" -> "Finish"
      7. You now get back to the window of "VMware Player" and you should have a VM called "galaxyVM" listed in the left panel. Select "galaxyVM" and click "Edit virtual machine settings".
      8. "Virtual Machine Settings", from the Hardware list remove Hard Disk.
      9. Click "Add" and choose "Hard Disk" -> "Next"
      10. You get "Select a Disk" page, choose "Use an existing virtual disk" -> "Next"
      11. At "Select an Existing Disk", point to the .vmdk disk image. This will need to be changed later, but it saves removing a useless drive afterwards. -> "Next"
      12. Check your settings and click "Finish". This creates the virtual machine. NOTE: it will not boot yet, since the OS expects a SCSI controller. The following steps fix that:
      13. Select the Galaxy VM and select "Settings"
      14. Click on "Storage" to see the controller tree with drives
      15. Remove the SATA controller together with the virtual drive
      16. Add a SCSI controller by clicking on the gray square with the '+' sign (beneath the controller tree)
      17. Add a drive to the SCSI controller by clicking on the + drive icon next to the controller in the tree
      18. Choose "Use an existing disk" and navigate to the Galaxy .vmdk file.
      19. The setup is now finished and the VM can be started.
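
The VirtualBox steps above can also be approximated from the command line with VBoxManage. The following is a minimal sketch, not a tested NBIC procedure: the helper names run and create_galaxy_vm, the controller name "SCSI" and the 2048 MB default are assumptions made for illustration. By default it only prints the commands (dry run) so you can review them first.

```shell
#!/bin/sh
# Sketch: create a VirtualBox VM that boots the Galaxy .vmdk from a SCSI controller.
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper. Arguments: VM name, path to the .vmdk, memory in MB.
create_galaxy_vm() {
    vm_name=$1
    vmdk=$2
    mem_mb=${3:-2048}   # assumed default; adjust to your workload

    # Register a new 64-bit Red Hat style VM.
    run VBoxManage createvm --name "$vm_name" --ostype RedHat_64 --register &&
    # Assign memory to the VM.
    run VBoxManage modifyvm "$vm_name" --memory "$mem_mb" &&
    # The guest OS expects a SCSI controller, so add one.
    run VBoxManage storagectl "$vm_name" --name SCSI --add scsi &&
    # Attach the existing Galaxy disk image to that controller.
    run VBoxManage storageattach "$vm_name" --storagectl SCSI \
        --port 0 --device 0 --type hdd --medium "$vmdk"
}
```

For example, create_galaxy_vm galaxyVM /path/to/galaxy.vmdk prints the four commands. Setting DRY_RUN=0 before the call executes them. Networking still has to be configured separately so that the VM is connected to your network.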
  • Step 4: Configure Galaxy VM
    • The image has an admin login account (username: admin, password: install123) that supports sudo for administrative tasks. Change the password of this account before making your VM available on the network.
    • The Galaxy VM is configured to request an IP address via DHCP. If DHCP is available but the network interface fails to come up, the MAC address of the virtual machine does not match the MAC address configured in the image. To resolve this, edit the file /etc/sysconfig/network-scripts/ifcfg-eth0 and set the HWADDR line to the correct MAC address. If DHCP is not available, reconfigure the network with the command sudo /usr/sbin/system-config-network-tui; consult your network administrator for the correct settings. After a reboot, the machine should display the URL for accessing Galaxy on the console.
    • Sometimes you need to restart both the host machine and the VM before the DHCP configuration takes effect.
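
The HWADDR fix described above can be scripted. This is a minimal sketch under stated assumptions: set_hwaddr is a hypothetical helper, not something shipped with the VM, and it only rewrites an existing HWADDR line (it does not add one if the line is missing).

```shell
#!/bin/sh
# Sketch: replace the HWADDR line in an ifcfg file with a new MAC address.
# set_hwaddr is a hypothetical helper, not part of the Galaxy VM.
set_hwaddr() {
    cfg_file=$1
    mac=$2

    # Refuse anything that is not six colon-separated hex pairs.
    if ! printf '%s\n' "$mac" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'; then
        echo "invalid MAC address: $mac" >&2
        return 1
    fi

    # Keep a backup copy, then rewrite the HWADDR line from the backup.
    cp "$cfg_file" "$cfg_file.bak" || return 1
    sed "s/^HWADDR=.*/HWADDR=$mac/" "$cfg_file.bak" > "$cfg_file"
}
```

Run as root inside the VM, a call would look like set_hwaddr /etc/sysconfig/network-scripts/ifcfg-eth0 00:0C:29:AB:CD:EF, where the MAC address shown is a made-up example. Use the address your hypervisor reports for the VM's network adapter, then reboot the VM.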
  • Step 5: Access Galaxy VM

To access Galaxy, go to http://<ip-address>/galaxy/. The "/" at the end of the URL is currently required (this will be fixed in a later release of the VM).
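
As a quick check from another machine, the URL can be built and probed with curl. This is only a sketch: galaxy_url and check_galaxy are hypothetical helper names, and the IP address in the usage note is a placeholder.

```shell
#!/bin/sh
# Sketch: build the Galaxy URL (with the required trailing slash) and probe it.
# Both functions are hypothetical helpers for illustration.
galaxy_url() {
    host=$1
    printf 'http://%s/galaxy/\n' "$host"
}

check_galaxy() {
    # -s: silent, -f: return non-zero on HTTP errors, -o: discard the page body
    curl -sf -o /dev/null "$(galaxy_url "$1")"
}
```

For example, check_galaxy 192.168.1.50 && echo "Galaxy is reachable" reports success only if the server answers at http://192.168.1.50/galaxy/.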

  • We strongly suggest that you make a snapshot or a copy of your virtual machine before going into production. This allows you to reset it to a clean state easily if required.
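
If your hypervisor has no snapshot feature, a plain copy of the VM directory serves the same purpose. The sketch below assumes the VM is powered off and that all its files live in one directory. backup_vm_dir is a hypothetical helper and the paths in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch: copy a powered-off VM directory to a timestamped backup location.
# backup_vm_dir is a hypothetical helper; it prints the backup path on success.
backup_vm_dir() {
    vm_dir=$1
    backup_root=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    dest="$backup_root/$(basename "$vm_dir")-$stamp"

    mkdir -p "$backup_root" || return 1
    # -a preserves permissions and timestamps and copies recursively.
    cp -a "$vm_dir" "$dest" || return 1
    printf '%s\n' "$dest"
}
```

For example, backup_vm_dir "$HOME/vms/galaxyVM" /backups copies the directory and prints the name of the new copy. On VirtualBox, VBoxManage snapshot galaxyVM take clean-install is an alternative that creates a snapshot instead of a full copy. The snapshot name here is arbitrary.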

FAQs

  • Why do I receive errors during the download of the NBIC Galaxy VM?

First, check whether your Internet connection works. A download manager helps with large files. Also make sure you have a 64-bit system, as some 32-bit systems cannot handle files larger than 2 GB.
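
Both checks can be done from a shell. The sketch below is an illustration, not an NBIC tool: is_64bit_arch and download_resumable are hypothetical helpers, and the list of 64-bit names only covers x86 machines.

```shell
#!/bin/sh
# Sketch: check for a 64-bit x86 system and download with resume support.
# Both functions are hypothetical helpers for illustration.
is_64bit_arch() {
    # Expects the output of `uname -m` as its argument.
    case "$1" in
        x86_64|amd64) return 0 ;;
        *) return 1 ;;
    esac
}

download_resumable() {
    # wget -c continues a partially downloaded file instead of starting over.
    wget -c "$1"
}
```

For example, is_64bit_arch "$(uname -m)" || echo "warning: not a 64-bit x86 system" flags an unsuitable machine. Pass the URL of the image file from the VM repository to download_resumable, and rerun the same command after an interrupted transfer to continue where it stopped.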

  • What should I do if I want more disk space for the VM?

Currently the virtual machine has 100 GB of local storage. If necessary you can extend the size of the image, but this is not trivial. If disk space is a big issue you can also relocate the /nbic directory to a network shared disk. Consult your system administrators to see if this is possible and how to set it up. Fast network access is recommended if you choose this option.

The virtual machine does not run any cleanup jobs yet. To prevent it from running out of disk space, configure the cleanup jobs yourself. See the Galaxy documentation (http://bitbucket.org/galaxy/galaxy-central/wiki/Config/PurgeHistoriesAndDatasets) on how to set up these jobs.
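
Until the cleanup jobs are in place, a small check can warn you before the disk fills up. This is a minimal sketch: disk_usage_percent and warn_if_full are hypothetical helpers, and the 90 percent threshold in the usage note is an arbitrary example.

```shell
#!/bin/sh
# Sketch: warn when the filesystem holding a path is nearly full.
# Both functions are hypothetical helpers for illustration.
disk_usage_percent() {
    # df -P prints one line per filesystem in POSIX format;
    # the fifth column of the second line is the usage, e.g. "42%".
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

warn_if_full() {
    path=$1
    limit=$2
    used=$(disk_usage_percent "$path") || return 1
    if [ "$used" -ge "$limit" ]; then
        echo "WARNING: filesystem for $path is ${used}% full"
    fi
}
```

For example, warn_if_full /nbic 90 prints a warning once the filesystem that holds /nbic is at least 90% full. It could be run regularly from cron.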

  • How do I update a VM to the latest Galaxy version?

You can update the Galaxy VM like a normal Galaxy installation. The following procedure should work:

# sudo su nbic
# cd /nbic/prog/galaxy
# hg pull
# hg update
# sh manage_db.sh upgrade
# sudo su admin
# sudo /etc/init.d/galaxy restart

The database upgrade step may fail due to missing sqlalchemy modules. In that case, download and install them manually.

Contacts

Please contact nbicgalaxy-admin@trac.nbic.nl with any questions you have regarding the Galaxy VM.