BigGrid virtualisatie
Members and assignment
- Sander Klous - Nikhef (Chair)
- Ronald Starink - Nikhef
- Marc van Driel - NBIC
- Pieter van Beek - SARA
- Ron Trompert - SARA
Project Proposal, approved by ET
Meetings
Kick-off - Monday July 6, 2009
Tuesday August 18, 2009
Presentations
- Slides - Sander Klous (Monday July 6, 2009), a summary of the CERN virtual machines workshop (see other information) and an introduction for the kick-off meeting of the BIG grid virtual machines working group.
- Slides - Sander Klous (Tuesday August 18, 2009), Class 2 VM scenario.
- Slides - Pieter van Beek (Tuesday August 18, 2009), Virtual Machines on Big Grid Hardware.
- Slides - Ronald Starink (Tuesday August 18, 2009), Policy & Security issues for Class 2/3 Virtual Machines.
Open Issues
- Network Address Translation - What is the load?
- Virtual Machine Isolation - Prohibit internal network connectivity with IPTables (see the sketch after this list).
- Image repository - Storage Area Network or distributed over worker nodes.
- Policy document
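As a starting point for the isolation issue, VM-to-VM traffic could be blocked in dom0 with an iptables physdev match. The sketch below is an assumption, not a tested configuration: it presumes Xen's vifX.Y interface naming and that bridged traffic passes through iptables (bridge-nf-call-iptables enabled).
  # Drop bridged traffic that both enters and leaves through guest vif interfaces,
  # i.e. direct VM-to-VM connectivity on the same worker node
  iptables -A FORWARD -m physdev --physdev-in vif+ --physdev-out vif+ -j DROP
  # Traffic between a VM and the physical uplink is not affected by this rule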
Infrastructure
We are setting up a testbed to investigate technical issues related to virtual machine management.
Hardware and Operating Systems
- Two Dell 1950 machines, dual CPU, 4 cores per CPU
- One machine has a CentOS-5 installation
- One machine has a Debian-squeeze installation
Software
- CentOS-5 comes with Xen 3.0
- Debian-squeeze comes with Xen 3.3
- Debian-squeeze Xen packages have a problem with tap:aio.
Fix:
  ln -s /usr/lib/xen-3.2-1/bin/tapdisk /usr/sbin
  echo xenblktap >> /etc/modules
- OpenNebula has been installed (stand-alone) on CentOS-5 following this guide
- A few additional steps were needed:
- Install rubygems and rubygem-sqlite3
- The opennebula user has to be added to the sudoers file for xm and xentop
- Sudoers should not require a tty
  wget ftp://fr.rpmfind.net/linux/EPEL/5/x86_64/rubygem-sqlite3-ruby-1.2.4-1.el5.x86_64.rpm
  wget ftp://fr.rpmfind.net/linux/EPEL/5/x86_64/rubygems-1.3.1-1.el5.noarch.rpm
  sudo rpm -Uvh rubygems-1.3.1-1.el5.noarch.rpm rubygem-sqlite3-ruby-1.2.4-1.el5.x86_64.rpm
In /etc/sudoers (on all machines):
  opennebula ALL = NOPASSWD: /usr/sbin/xm
  opennebula ALL = NOPASSWD: /usr/sbin/xentop
  #Defaults requiretty
- Installed iSCSI target and client software for shared image repository
- Howtos: Debian client/server, CentOS client, CentOS server
- Maybe test later with encrypted iSCSI
- Two new machines have been ordered with the required iSCSI offload support
- Image repository consists of LVM volume groups
- LVM performance is better than that of file-based images
- Each logical volume contains an image
- This allows easy creation/deletion of new images
- VMs can run from cloned (Copy-On-Write) images (see the sketch below)
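A minimal sketch of this image handling, assuming a volume group called vg_images and illustrative names and sizes (the actual names on the testbed may differ):
  # Logical volume holding a base image
  sudo lvcreate -L 10G -n centos5-base vg_images
  # Copy-On-Write clone from which a VM can run; only changed blocks consume space
  sudo lvcreate -s -L 2G -n vm42-disk /dev/vg_images/centos5-base
  # Discarding a VM's disk is a single command
  sudo lvremove -f /dev/vg_images/vm42-disk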
Implementation issues
Implemented iSCSI image management for OpenNebula following the storage guide
- Changed oned configuration as shown below
- Added tm_iscsi configuration
- Implemented transfer manager commands (a sketch follows the listings below)
- Modified XenDriver.cc and TransferManager.cc to support LVM images
- Added LVM management commands to the /etc/sudoers file
In /opt/opennebula/etc/oned.conf:
  TM_MAD = [
      name       = "tm_iscsi",
      executable = "one_tm",
      arguments  = "tm_iscsi/tm_iscsi.conf",
      default    = "tm_iscsi/tm_iscsi.conf" ]
  /opt/opennebula/etc/tm_iscsi/tm_iscsi.conf
  /opt/opennebula/etc/tm_iscsi/tm_iscsirc
  /opt/opennebula/lib/tm_commands/iscsi/tm_clone.sh
  /opt/opennebula/lib/tm_commands/iscsi/tm_delete.sh
  /opt/opennebula/lib/tm_commands/iscsi/tm_ln.sh
  /opt/opennebula/lib/tm_commands/iscsi/tm_mkimage.sh
  /opt/opennebula/lib/tm_commands/iscsi/tm_mkswap.sh
  /opt/opennebula/lib/tm_commands/iscsi/tm_mv.sh
  .../one-1.2.0/src/vmm/XenDriver.cc
  .../one-1.2.0/src/tm/TransferManager.cc
In /etc/sudoers (on all machines):
  opennebula ALL = NOPASSWD: /usr/sbin/lvcreate
  opennebula ALL = NOPASSWD: /usr/sbin/lvremove
  opennebula ALL = NOPASSWD: /usr/sbin/lvchange
  opennebula ALL = NOPASSWD: /usr/sbin/lvrename
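For illustration, a stripped-down version of what an LVM-based tm_clone.sh could look like; the argument parsing, clone naming and paths are assumptions, not the actual working-group script:
  #!/bin/bash
  # OpenNebula calls the clone script as: tm_clone.sh <SRC> <DST>, both of the form host:path
  SRC_LV=${1#*:}                 # e.g. /dev/vg_images/centos5-base (shared iSCSI volume group)
  DST_HOST=${2%%:*}              # worker node that will run the VM
  DST_PATH=${2#*:}               # path where the Xen driver expects the disk
  CLONE=one-clone-$$             # illustrative per-deployment clone name

  # Create a Copy-On-Write clone of the base image in the shared volume group
  sudo /usr/sbin/lvcreate -s -L 2G -n $CLONE $SRC_LV

  # Make the VM's expected disk path point at the clone's block device
  ssh $DST_HOST "ln -sf /dev/vg_images/$CLONE $DST_PATH"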
Scalability
The above implementation works fine for a minimal scenario. However, when more than one virtual machine has to boot concurrently from the same image, some changes are needed. First of all, the cluster extensions for LVM (clvm) have to be enabled. Note that the locking mechanism of the cluster LVM daemon relies on the Red Hat cluster management tools (cman). To get cluster LVM running, the following packages and minimal configuration files were installed (a sketch of the configuration follows the list below):
On CentOS: cman and lvm2-cluster
On Debian: cman, libcman and clvm
Configuration files (both systems): /etc/cluster/cluster.conf, /etc/cluster/lvm.conf
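As an illustration, a minimal two-node configuration could look like the sketch below; the cluster and node names are placeholders and fencing is left out, so treat it as a starting point rather than the testbed's actual files:
  # Hypothetical minimal cluster.conf for cman (names are placeholders, no fencing configured)
  cat > /etc/cluster/cluster.conf <<'EOF'
  <?xml version="1.0"?>
  <cluster name="biggridvm" config_version="1">
    <cman two_node="1" expected_votes="1"/>
    <clusternodes>
      <clusternode name="node1" nodeid="1" votes="1"/>
      <clusternode name="node2" nodeid="2" votes="1"/>
    </clusternodes>
  </cluster>
  EOF
  # In the LVM configuration, enable cluster locking: locking_type = 3
  # Then start the cluster manager and the cluster LVM daemon
  /etc/init.d/cman start
  /etc/init.d/clvmd start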
The second issue is that snapshots and clones on LVM are not (yet) cluster aware. Fortunately, for our purpose they do not need to be, because the base virtual machine image does not change during usage. The workaround is to create a cluster-aware Copy-On-Write partition in LVM. This partition is activated exclusively on the worker node and mapped, together with the base (cluster-aware) virtual machine image, to a local snapshot with 'dmsetup'. When modifications have to be stored after a shutdown of the virtual machine, it is sufficient to remove the mapping and synchronize the cluster-aware Copy-On-Write partition by deactivating it on the worker node. The Copy-On-Write clone is then enabled in the repository for the entire cluster, which makes it accessible from all worker nodes. It can now be used by virtual machines like any other base image, so Copy-On-Write clones are fully recursive. A few modifications were needed to implement this (a sketch of the dmsetup mapping follows the list below):
- Small update of the TransferManager.cc patch.
- Updated transfer management commands
- The sudoers file has to allow lvs, dmsetup and blockdev to set up the local mapping
In /etc/sudoers (on all machines):
  opennebula ALL = NOPASSWD: /usr/sbin/lvs
  opennebula ALL = NOPASSWD: /sbin/dmsetup
  opennebula ALL = NOPASSWD: /sbin/blockdev
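A sketch of the mapping described above, assuming a base image and a Copy-On-Write LV in a shared volume group vg_images (all names are placeholders):
  # Activate the Copy-On-Write LV exclusively on this worker node
  sudo /usr/sbin/lvchange -aey /dev/vg_images/vm42-cow
  # Combine base image and COW partition into a local snapshot device
  SIZE=$(sudo /sbin/blockdev --getsz /dev/vg_images/centos5-base)
  echo "0 $SIZE snapshot /dev/vg_images/centos5-base /dev/vg_images/vm42-cow P 64" | \
    sudo /sbin/dmsetup create vm42-disk
  # The VM boots from /dev/mapper/vm42-disk; after shutdown, remove the mapping and
  # deactivate the COW LV so it can be re-enabled cluster wide in the repository
  sudo /sbin/dmsetup remove vm42-disk
  sudo /usr/sbin/lvchange -an /dev/vg_images/vm42-cow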
Local caching
Network traffic for Virtual Machine management can be optimized significantly with two caches on each worker node:
- A read cache for the original Virtual Machine image to facilitate reuse on the same worker node.
- A write-back cache for the copy-on-write clone to allow local writes when the virtual machine is active.
If requested by the user, the copy-on-write clone can be synchronized with the image repository when the virtual machine is done. After this synchronization, the write-back cache becomes obsolete and can be removed. We implemented both the read and the write-back cache at block device level (i.e. iSCSI/LVM level) with dm-cache. One LVM partition on the worker node serves as persistent local read cache for the virtual machine image. Another LVM partition on the worker node serves as transient local write-back cache for the copy-on-write clone. The transient cache is created and removed on demand by OpenNebula.
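A rough sketch of how the two caches could be prepared on a worker node; the volume group name and sizes are assumptions, and the exact dm-cache table arguments depend on the dm-cache version:
  # Persistent local read cache for the base image and a transient write-back
  # cache for the Copy-On-Write clone (vg_local, names and sizes are illustrative)
  sudo lvcreate -L 10G -n readcache  vg_local
  sudo lvcreate -L 2G  -n writecache vg_local
  # Each cache is then bound to its backing device with 'dmsetup create' and the
  # dm-cache 'cache' target; the block size, cache size and write policy arguments
  # in the dmsetup table come from the dm-cache documentation.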
Unfortunately, no CentOS or Debian packages are available for dm-cache. Here is the recipe to build the kernel module from source.
On Debian:
  apt-get install linux-source linux-patch-debian
  cd /usr/src
  tar jxf linux-source-2.6.26.tar.bz2
  /usr/src/kernel-patches/all/2.6.26/apply/debian -a x86_64 -f xen
  cd linux-source-2.6.26
  cp /boot/config-2.6.26-2-xen-amd64 .config
  <In the Makefile: EXTRAVERSION = -2-xen-amd64>
  make prepare
  cp /usr/src/linux-headers-2.6.26-2-xen-amd64/Module.symvers .
  cp -r /usr/src/linux-kbuild-2.6.26/scripts/* scripts
  cd
  wget http://github.com/mingzhao/dm-cache/tarball/master
  tar zxvf dm-cache.tar.gz
  cd dm-cache/2.6.29
  ln -s /usr/src/linux-source-2.6.26/drivers/md/dm.h .
  ln -s /usr/src/linux-source-2.6.26/drivers/md/dm-bio-list.h .
  <In dm-cache.c: change BIO_RW_SYNCIO to BIO_RW_SYNC (line 172)>
  <Create Makefile>
  make
  insmod dm-cache.ko
  cp dm-cache.ko /lib/modules/2.6.26-2-xen-amd64/kernel/drivers/md/dm-cache.ko
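To verify the freshly built module, something along these lines should work (the /etc/modules entry is only needed if the module has to be loaded at boot):
  depmod -a                        # let modprobe find the copied module
  modprobe dm-cache
  dmsetup targets | grep cache     # the 'cache' target should now be listed
  echo dm-cache >> /etc/modules    # optional: load the module at boot on Debian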
Links
- CERN June 2009 workshop on virtual machines