Difference between revisions of "NBIC Galaxy Server: How to update NBIC Galaxy"

From BioAssist
(Saving the image)
Line 30: Line 30:
 
# Restart the server using the "Play" button, and the services (see above)
 
  
=The information below is obsolete and applies to the older HPC cloud. It is kept for reference.=
+
See also: [[Obsolete instructions for Galaxy server on old HPC cloud]]
== Cloud template ==
+
* Change BUCKET_CLUSTER, BUCKET_DEFAULT and CLUSTER_NAME if a different bucket name is used at the cloudman script download site, for example https://cloudman-on.dtls.nl/cloudman-on
+
* If the hosting site of the bucket is moved, you also need to check and, if necessary, modify the following two variables in /opt/cloudman/pkg/ec2autorun.py
+
SERVICE_ROOT = 'http://cloudman-on.dtls.nl'
+
DEFAULT_BUCKET_NAME = 'cloudman-on' # Ensure this bucket is accessible to anyone!
+
 
+
* PASSWORD sets the password required to log in to the cloudman UI. The default username is "admin".
+
 
+
CONTEXT=[
+
  BOOT_SCRIPT_NAME=cm_boot.py,
+
  BOOT_SCRIPT_PATH=/tmp/cm,
+
  BUCKET_CLUSTER=cloudman-on,
+
  BUCKET_DEFAULT=cloudman-on,
+
  CLOUDMAN_HOME=/mnt/cm,
+
  CLOUD_TYPE=opennebula,
+
  CLUSTER_NAME=cloudman-on,
+
  ON_HOST=http://one-xmlrpc.calligo.sara.nl:2633/RPC2,
+
  ON_PASSWORD=plain:row8inking,
+
  ON_PROXY=10.0.137.3:3128,
+
  ON_USERNAME=<username>,
+
  PASSWORD=<password>,
+
  ROLE=master,
+
  TESTFLAG=false ]
+
CPU=1
+
DISK=[
+
  BUS=virtio,
+
  IMAGE="GalaxyMaster_2012_10_18",
+
  IMAGE_UNAME=hailiang.mei@nbic.nl ]
+
GRAPHICS=[
+
  TYPE=vnc ]
+
MEMORY=8192
+
NAME="NBIC Galaxy"
+
NIC=[
+
  MODEL=e1000,
+
  NETWORK=internet,
+
  NETWORK_FILTER=226,
+
  NETWORK_UNAME=oneadmin ]
+
NIC=[
+
  MODEL=e1000,
+
  NETWORK=sombrerogalaxy-nbic,
+
  NETWORK_UNAME=hailiang.mei@nbic.nl ]
+
OS=[
+
  ARCH=x86_64,
+
  BOOT=hd ]
+
RAW=[
+
  TYPE=kvm ]
+
TEMPLATE_ID=2514
+
VCPU=1
+
 
+
== VM Shutdown and Start Instructions ==
+
To stop the running VM at the SARA cloud and save it to an image file, we need to shut down the VM and start a new instance. The steps are:
+
# Perform an update of the linux packages
+
#: sudo nbic-dist-upgrade
+
# Save the logs printed by the update in the Server Logs
+
# Log into cloud management console: https://ui.hpccloud.sara.nl/
+
# Activate "save as" image of the current running VM, named with the current date e.g. "GalaxyMaster_2012_10_18"
+
# Shutdown the VM
+
# Go to the images section, and make the new image readable and writable for the group
+
# Go to the templates section, update "IMAGE=GalaxyMaster_2012_10_18" in the template "NBIC Galaxy"
+
# Unpack cm.tar.gz in the shared dropbox folder.
+
# Update "IMAGE=GalaxyMaster_2012_10_18" in cm/clouds/opennebula.py
+
#: tar czf cm.tar.gz cm
+
# In the cloud management console, create a new VM (in the "Virtual Machines" section, click "+New" in the upper right corner): name it as "nbicgalaxy" and select the "NBIC Galaxy" template.
+
#* Note, make sure "nbicgalaxy" is spelled correctly as it is crucial to have the correct DNS name for our VM: nbicgalaxy.sombrerogalaxy-nbic.cloudlet.sara.nl
+
# In the cloud management console, select the running VM, click button "Update Properties" and give the "group" access to "use" and "manage" the VM
+
# Once the new VM is running, you can log into it and start the launch of the Cloudman service (including Galaxy)
+
 
+
== Cloudman Startup Instructions ==
+
 
+
(Note: normally nginx should start automatically when the VM restarts. If it does not, you can start it manually by executing: "sudo /opt/galaxy/pkg/nginx/sbin/nginx".)
+
 
+
You need to restart cloudman whenever the VM is restarted.
+
 
+
* stop the running cloudman process
+
sudo sh /mnt/cm/run.sh --stop-daemon
+
 
+
* kill the hanging Galaxy process that still occupies the Galaxy port if necessary
+
sudo kill -9 `sudo fuser -n tcp 42284`
+
 
+
* kill all running SGE processes (let cloudman handle the start of SGE)
+
sudo pkill -U sgeadmin
+
 
+
* update /etc/hosts with master node's internal IP address
+
ifconfig eth1
+
sudo vim /etc/hosts and replace the IP address on the following line with the address reported for eth1:
+
10.0.137.xxx ubuntu
+
 
+
* if there are lines specifying worker nodes in /etc/hosts, remove them
+
 
+
* if you updated the IP address in the previous step: restart rabbitmq server
+
sudo pkill -U rabbitmq
+
sudo /etc/init.d/rabbitmq-server start
+
+
* start cloudman script
+
sudo sh /mnt/cm/run.sh --daemon
+
 
+
When cloudman is not working for the cloud middleware, some things need to be started by hand instead:
+
 
+
  sudo su postgres
+
  /usr/lib/postgresql/9.1/bin/postgres -D /mnt/galaxyData/pgsql/data -p 5840
+
 
+
  sudo /opt/galaxy/pkg/nginx/sbin/nginx
+
 
+
  sudo /bin/su - galaxy -c "export SGE_ROOT=/opt/sge; sh /mnt/galaxyTools/galaxy-central/run.sh --daemon"
+
 
+
 
+
* go to http://galaxy.nbic.nl/cloud and log in with admin/{default password} to initiate the launch of cloud Galaxy. When it asks for the 'initial storage', ignore this and click the 'cloudman' button at the bottom of the pop-up.
+
 
+
* Check the log file /mnt/cm/paster.log to see if cloudman services (including postgres, SGE, Galaxy) start successfully.
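
The /etc/hosts update in the steps above could also be scripted rather than done by hand in vim. The following is only a sketch under assumptions: `set_host_ip` is a hypothetical helper name, and the address 10.0.137.42 in the usage comment is a made-up example, not the real master address. The helper prints the modified contents and never writes to the file it reads, so the result can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print the contents of hosts file $1 with the address
# on the line whose host name is $3 replaced by $2.
# The input file itself is not modified.
set_host_ip() {
    file=$1
    ip=$2
    name=$3
    awk -v ip="$ip" -v name="$name" '
        $2 == name { $1 = ip }   # rewrite only the matching host line
        { print }                # print every line, changed or not
    ' "$file"
}

# Intended use (example address is an assumption, take the real one from "ifconfig eth1"):
#   set_host_ip /etc/hosts 10.0.137.42 ubuntu > /tmp/hosts.new
#   diff /etc/hosts /tmp/hosts.new      # review the change
#   sudo cp /tmp/hosts.new /etc/hosts
```

Writing to a temporary file and diffing before copying keeps a typo from breaking name resolution on the master node.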
+
 
+
== Galaxy Startup Instructions ==
+
 
+
Start and stop Galaxy only.
+
sudo /bin/su - galaxy -c "export SGE_ROOT=/opt/sge; sh /mnt/galaxyTools/galaxy-central/run.sh --stop-daemon"
+
sudo /bin/su - galaxy -c "export SGE_ROOT=/opt/sge; sh /mnt/galaxyTools/galaxy-central/run.sh --daemon"
+
 
+
== Galaxy Update Instructions ==
+
When the Penn State Galaxy team releases a new version, we need to update our Galaxy codebase.
+
 
+
==== Stop the service ====
+
# Lock job submission on the NBIC Galaxy server via the admin UI.
+
# Stop the Galaxy daemon.
+
sudo /mnt/galaxyTools/galaxy-central/run.sh --stop-daemon
+
 
+
==== Download new release of Galaxy ====
+
The Galaxy codebase is located at /mnt/galaxyTools, which is an NFSv4 mount of the cloud VM. All mount points are owned by user "postgres" as a workaround to support the cloudman and postgres services.
+
* go to the directory of /mnt/galaxyTools/sources/ and check out the latest code (central version) from Penn State:
+
hg clone https://bitbucket.org/galaxy/galaxy-central galaxy-<year>-<month>-<date>
+
Note: this is just a copy for reference. Do not modify this copy!
+
* Copy this folder to /mnt/galaxyTools/. This will become the active installation with local mods.
+
cp -r /mnt/galaxyTools/sources/galaxy-<year>-<month>-<date> /mnt/galaxyTools/
+
* Update the Galaxy symbolic link
+
cd /mnt/galaxyTools
+
rm galaxy-central
+
ln -s galaxy-<year>-<month>-<date>/ galaxy-central
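
The rm followed by ln -s above leaves a short window in which galaxy-central does not exist. A possible alternative, sketched here under assumptions, replaces the link in one command with `ln -sfn` (GNU coreutils options). `switch_link` is a hypothetical helper name, and the guard against a non-symlink target is an added precaution, not part of the original procedure.

```shell
#!/bin/sh
# Hypothetical helper: point the symlink $2 at the directory $1.
# Refuses to touch $2 if it exists and is not a symlink, so a real
# installation directory is never overwritten by accident.
switch_link() {
    target=$1
    link=$2
    if [ -e "$link" ] && [ ! -L "$link" ]; then
        echo "refusing: $link is not a symlink" >&2
        return 1
    fi
    # -f replaces an existing link, -n treats the link itself as the
    # destination instead of descending into the directory it points to
    ln -sfn "$target" "$link"
}

# Intended use:
#   cd /mnt/galaxyTools
#   switch_link galaxy-<year>-<month>-<date>/ galaxy-central
```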
+
 
+
==== Backup the Galaxy DB ====
+
 
+
Start the database service if it is not running (e.g., stopped by the cloudman script.)
+
sudo su postgres
+
/usr/lib/postgresql/9.1/bin/postgres -D /mnt/galaxyData/pgsql/data -p 5840
+
 
+
Backup the current galaxy DB.
+
sudo su postgres
+
pg_dump -p 5840 galaxy > galaxyDB_backup.out
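
Because every update overwrites galaxyDB_backup.out, it may be worth putting a date in the file name so earlier dumps are kept. This is only a sketch: `backup_name` is a hypothetical helper and the dated naming scheme is an assumption, not the convention used on the server.

```shell
#!/bin/sh
# Hypothetical helper: build a backup file name that includes a date.
# $1 is an optional date string; it defaults to today in YYYY-MM-DD form.
backup_name() {
    stamp=${1:-$(date +%Y-%m-%d)}
    echo "galaxyDB_backup_${stamp}.out"
}

# Intended use as user postgres:
#   pg_dump -p 5840 galaxy > "$(backup_name)"
```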
+
 
+
==== Update database ====
+
* Configure the new Galaxy: we keep all the files (datasets) associated with histories stored in the Postgres database in a separate database folder outside the Galaxy root folder.
+
cd /mnt/galaxyTools/galaxy-central
+
rmdir database
+
# Note: make sure this is the EMPTY database folder of the new install and not
+
# a symlink pointing to our data in /opt/galaxy/data/database/
+
ln -s /mnt/galaxyData/database database
+
* Test starting up the cloudman/Galaxy server (and fetch/check eggs if necessary)
+
sudo /mnt/cm/run.sh --daemon
+
 
+
==== Copy NBIC tools and data ====
+
* Customize the new Galaxy: We keep copies of modified (config) files in the [https://trac.nbic.nl/conerice conerice] repository. You should diff our customized versions in the conerice repo with the new versions in Galaxy. When necessary merge updates from the Galaxy developers with our local customizations and commit a new version to the conerice repo. If there are no changes you can simply copy the files below:
+
cd /mnt/galaxyTools/sources/
+
svn update conerice
+
cp conerice/masterfiles/galaxy/sombrero/static/welcome.html              /mnt/galaxyTools/galaxy-central/static/
+
cp conerice/masterfiles/galaxy/sombrero/config/universe_wsgi.ini          /mnt/galaxyTools/galaxy-central/
+
cp conerice/masterfiles/galaxy/sombrero/config/tool_data_table_conf.xml          /mnt/galaxyTools/galaxy-central/
+
cp conerice/masterfiles/galaxy/sombrero/config/tool_conf.xml              /mnt/galaxyTools/galaxy-central/
+
cp conerice/masterfiles/galaxy/sombrero/config/shed_tool_conf.xml          /mnt/galaxyTools/galaxy-central/
+
cp conerice/masterfiles/galaxy/sombrero/config/datatypes_conf.xml          /mnt/galaxyTools/galaxy-central/
+
cp conerice/masterfiles/galaxy/sombrero/lib/galaxy/datatypes/registry.py  /mnt/galaxyTools/galaxy-central/lib/galaxy/datatypes/
+
cp conerice/masterfiles/galaxy/sombrero/tools/data_source/upload.py      /mnt/galaxyTools/galaxy-central/tools/data_source/
+
* Create symlinks for our tools and datatypes from the NBIC Galaxy Module Repository
+
cd /mnt/galaxyTools/sources/galaxytools
+
svn update
+
svn export trunk /mnt/galaxyTools/nbic_gmr
+
cd /mnt/galaxyTools/galaxy-central/tools
+
ln -s /mnt/galaxyTools/nbic_gmr/ nbic_gmr
+
cd /mnt/galaxyTools/galaxy-central/static/images
+
ln -s /mnt/galaxyTools/nbic_gmr/data/image-links/ nbic_gmr
+
cd /mnt/galaxyTools/galaxy-central/lib/galaxy/datatypes/
+
for item in /mnt/galaxyTools/nbic_gmr/data/datatypes/*; do ln -s "$item" .; done
+
* Create symlinks for custom tool-data files in /mnt/galaxyTools/galaxy-central/tool-data/
+
cd /mnt/galaxyTools/galaxy-central/tool-data/
+
for item in /mnt/galaxyData/tool-data/*; do ln -s "$item" .; done
+
 
+
==== Restart the service ====
+
sudo /mnt/cm/run.sh --daemon
+
* Check the log file /mnt/galaxyTools/galaxy-central/paster.log to see if Galaxy started (without errors).
+
 
+
== Useful tips ==
+
 
+
==== Creating new user accounts ====
+
To keep the server safe it is important to have a policy on the user accounts.
+
* Create a new user with the following command (replace the group after -G, the full name and the username with the correct values; if no special group memberships are necessary, you can omit the -G option)
+
sudo useradd -G sudo -m -U -b /home -s /bin/bash -c 'Full name of the user' username
+
* Always use a strong password on user accounts (8 or more characters, mixing letters, numbers and special characters). Set the initial password with
+
sudo passwd user
+
* When you create an account and have to e-mail the password to the user enforce a password change on first use with the following command:
+
sudo chage -d 0 user
+
* When you are giving temporary access to someone set an expiration date for the account with the following command:
+
sudo chage -E YYYY-MM-DD user
+
* Make sure the user is a member of the right groups. Users who need access to Galaxy must be in the devs group, but only grant group membership if it is really necessary. Anyone who is a member of the devs group can potentially break things.
+
* To change the group memberships later on you can use the following command:
+
sudo usermod -a -G devs user
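
For temporary accounts, the YYYY-MM-DD value passed to chage -E above can be computed instead of typed by hand. A minimal sketch, assuming GNU date (its -d option is not available in every date implementation); `expiry_date` is a hypothetical helper name and the 30-day period in the usage comment is only an example.

```shell
#!/bin/sh
# Hypothetical helper: print the date $1 days from today as YYYY-MM-DD.
# Relies on the -d option of GNU date.
expiry_date() {
    date -d "+$1 days" +%Y-%m-%d
}

# Intended use:
#   sudo chage -E "$(expiry_date 30)" user
```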
+
 
+
==== HPC cloud status ====
+
 
+
Sometimes it is handy to check the system status in the cloud.
+
 
+
The file-server that serves the virdirs (NFS):
+
https://ganglia.surfsara.nl/?c=Calligo-Services&h=m-fileserver01.calligo.sara.nl&m=load_one&r=hour&s=by%20name&hc=4&mc=2
+
 
+
The file-server that serves the VM images:
+
https://ganglia.surfsara.nl/?c=Calligo-Services&h=m-fileserver02.calligo.sara.nl&m=load_one&r=hour&s=by%20name&hc=4&mc=2
+
 
+
 
[[Category:E-science / Conventions]]
 
 
[[Category:Galaxy]]
 

Revision as of 16:07, 5 August 2016

Galaxy start procedure at new HPC cloud

  1. Go to https://ui.hpccloud.surfsara.nl/
  2. Use the template NBICGalaxy to instantiate a new VM with the name "galaxy". Both disk images used for Galaxy are persistent, so all changes should be saved automatically.
  3. Due to a problem with migration, you need to access the VM via VNC and restart eth0 and eth1 to get the correct IP addresses.
  4. Now you can start the Galaxy system using the following commands.
ssh galaxy.nbic.nl

Check whether /var/run/postgresql exists and has the right owners. Otherwise create it:

sudo mkdir /var/run/postgresql/
sudo chown postgres:postgres /var/run/postgresql/
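
The check above can be written so that it is safe to run on every start. A minimal sketch: `ensure_dir` is a hypothetical helper name, and it only creates the directory; the ownership change still has to be run as root, as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper: create directory $1 only if it is missing.
# Prints "created" or "exists" so the caller can see what happened.
ensure_dir() {
    if [ -d "$1" ]; then
        echo "exists"
    else
        mkdir -p "$1" && echo "created"
    fi
}

# Intended use on the server (run as root):
#   ensure_dir /var/run/postgresql
#   chown postgres:postgres /var/run/postgresql
```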

Then start the database, web server and Galaxy:

sudo su postgres
/usr/lib/postgresql/9.1/bin/pg_ctl start -o "-p 5840" -D /mnt/galaxyData/pgsql/data
exit
sudo /opt/galaxy/pkg/nginx/sbin/nginx
sudo /bin/su - galaxy -c "export SGE_ROOT=/opt/sge; sh /mnt/galaxyTools/galaxy-central/run.sh --daemon"

Saving the image

The server runs on a persistent image. That means that when it is properly stopped and deployed, a restart will have all changes saved. However, if the server is stopped because of a hardware or power failure of the cloud node, it will NOT be saved. It is therefore a good idea to save a copy every once in a while after big updates or configuration changes.

  1. Log in to the SARA cloud.
  2. Identify the running server.
  3. Using the "Pause" button, select "power off" and wait for the POWEROFF state.
  4. Click on the name of the server, and the "Storage" tab
  5. In the line for the "vda" drive, click on the "Save as" icon/label
  6. Type "GalaxyMaster_yyyymmdd" as the name of the backup
  7. Wait for the save to complete (up to 15 minutes?)
  8. Restart the server using the "Play" button, and the services (see above)

See also: Obsolete instructions for Galaxy server on old HPC cloud