Author Topic: Cached (SSD) storage infrastructure for VM's  (Read 692 times)


Offline Mad Penguin

  • #Mad_Penguin_UK
  • Administrator
  • Hero Member
  • *****
  • Posts: 1320
  • Karma: 10017
  • Gender: Male
    • View Profile
    • Linux in the UK
    • Awards
Cached (SSD) storage infrastructure for VM's
« on: November 14, 2013, 11:08:01 am »
Currently there seem to be three choices when it comes to where and how to store your virtual machine images:
  • Local storage, either raw images or a cooked format (e.g. QCOW2)
  • Remote storage, typically a shared and/or replicated system like NFS or Gluster
  • Shared storage over dedicated hardware
There are many issues with each of these options in terms of latency, performance, cost and resilience – there is no 'ideal' solution. After facing this problem over and over again, we've come up with a fourth option.

Cache your storage on a local SSD, but hold your working copy on a remote server, or indeed servers. Using such a mechanism, we've managed to eradicate all of the negatives we experienced historically with the other options.

Features
  • Virtual machines run against SSD image caches local to the hypervisor
  • Images are stored remotely and accessed via TCP/IP
  • The cache uses LFU (*not* LRU) eviction, which makes it relatively 'intelligent'
  • Bandwidth related operations are typically 'shaped' to reduce spikes
  • Cache analysis (1 command) will give you an optimal cache size for your VM usage
  • The storage server supports sparse storage, inline compression and snapshots
  • The system supports TRIM end-to-end, VM deletes are reflected in backend usage
  • All reads/writes are checksummed
  • The database is log-structured and takes sequential writes [which is very robust and very quick]
  • Database writing is “near” wire-speed in terms of storage hardware performance
  • Live migration is supported
  • The cache handles replicas and will parallel write and stripe read (RAID 10)
  • Snapshot operations are hot and “instant” with almost zero performance overhead
  • Snapshots can be mounted RO on temporary caches
  • Cache presents as a standard Linux block device
  • Raw images are supported to make importing pre-existing VMs easier
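To illustrate why LFU rather than LRU matters for a VM cache: a one-off sequential scan (e.g. a backup job) touches every block once, which would flush an LRU cache, whereas LFU keeps the genuinely hot blocks resident. A minimal sketch of the idea (this is purely illustrative; none of these names come from the actual software):

```python
from collections import defaultdict

class LFUCache:
    """Toy LFU cache: evicts the entry with the fewest accesses,
    so a one-off scan can't push out frequently used blocks."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}                 # key -> value
        self.freq = defaultdict(int)   # key -> access count

    def get(self, key):
        if key not in self.data:
            return None
        self.freq[key] += 1
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            # Evict the least-frequently-used entry, not the least-recent one
            victim = min(self.data, key=lambda k: self.freq[k])
            del self.data[victim]
            del self.freq[victim]
        self.data[key] = value
        self.freq[key] += 1

cache = LFUCache(2)
cache.put("hot", 1)
cache.get("hot"); cache.get("hot")   # "hot" now has 3 accesses
cache.put("scan1", 2)                # cache is now full
cache.put("scan2", 3)                # evicts "scan1" (1 access), keeps "hot"
assert cache.get("hot") == 1
```

An LRU cache in the same scenario would have evicted "hot" (least recently touched) instead of "scan1".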
Which means ...

In terms of how these features compare to traditional mechanisms, network bottlenecks are greatly reduced because the vast majority of read operations are serviced locally; indeed, if you aim for a cache hit rate of 90%, you should be able to run 10x the number of VMs of an NFS-based solution on the same hardware (from an IO perspective). Write operations are buffered, and you can set an average and peak write rate (per instance), so write peaks are levelled out with the local SSD acting as a huge [persistent] write buffer. (This write buffer survives shutdowns and will continue to flush on reboot.)
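The average/peak write shaping described above is the classic token-bucket pattern: writes drain to the backend at a sustained rate but may burst up to a limit, with anything over that staying in the SSD buffer. A hypothetical sketch of the mechanism (the class and parameter names are mine, not from the vdc tooling):

```python
import time

class TokenBucket:
    """Token-bucket shaper: flushes drain at avg_rate bytes/sec long-term,
    with bursts of up to `burst` bytes; excess stays in the write buffer."""

    def __init__(self, avg_rate, burst):
        self.avg_rate = avg_rate       # sustained flush rate, bytes/sec
        self.burst = burst             # peak allowance, bytes
        self.tokens = burst
        self.last = time.monotonic()

    def try_send(self, nbytes):
        # Refill tokens for the elapsed time, capped at the burst size
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.avg_rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True                # flush to the backend now
        return False                   # leave it in the SSD write buffer

bucket = TokenBucket(avg_rate=10_000_000, burst=1_000_000)  # 10 MB/s avg, 1 MB burst
assert bucket.try_send(800_000)        # fits inside the burst allowance
assert not bucket.try_send(800_000)    # over budget; stays buffered for now
```

Because the buffer is persistent, a shaped flush that is interrupted by a shutdown simply resumes on the next boot.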

If you assume a 90% hit rate, then 90% of your requests will be subject to a latency of 0.1ms (SSD) rather than 10ms (HD), so the responsiveness of instances running on cache when compared (for example) to NFS is fairly staggering. Take a VM running Ubuntu Server 12.04, type “shutdown -r now”, and time from hitting the return key to when it comes back with a login prompt: my test kit takes under 4 seconds, as opposed to 30-60 seconds on traditional NFS-based kit.
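The back-of-envelope arithmetic behind that hit-rate claim, using the post's own figures:

```python
# Blend SSD and HD latencies by cache hit rate (figures from the post:
# 0.1 ms for an SSD hit, 10 ms for an HD-backed miss).
hit_rate = 0.90
ssd_ms, hd_ms = 0.1, 10.0

effective_ms = hit_rate * ssd_ms + (1 - hit_rate) * hd_ms
print(f"effective latency: {effective_ms:.2f} ms")   # 1.09 ms
print(f"speedup vs pure HD: {hd_ms / effective_ms:.1f}x")
```

So even though 10% of requests still go to spinning disk, the blended latency is roughly 1.1ms, around 9x better than a pure HD backend.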

And when it comes to cost, this software has been designed to run on commodity hardware, meaning desktop motherboards / SSDs on 1G NICs – although I'm sure it'll be more than happy to see server hardware should anyone feel that way inclined.

The software is still at the Beta stage, but we now have a working interface for OpenNebula. Although it's not complete, it can be used to create, run and maintain both persistent and non-persistent images. Note that although this should run with any Linux-based hypervisor, every system has its quirks – for now we're working with KVM only and using Ubuntu 13.10 as a host. (13.04 should also be OK, but there are issues with older kernels so 12.xx doesn't currently fly [as a host].)

As of today we have a public rack-based testbed so we should be able to provide a demonstration within the next few weeks, so if you're interested in helping / testing, please do get in touch.


Online Mark Greaves (PCNetSpec)

  • Administrator
  • Hero Member
  • *****
  • Posts: 13878
  • Karma: 344
  • Gender: Male
  • "-rw-rw-rw-" .. The Number Of The Beast
    • View Profile
    • PCNetSpec
    • Awards
Re: Cached (SSD) storage infrastructure for VM's
« Reply #1 on: November 21, 2013, 02:26:45 am »
Anything I can do to help ?
WARNING: You are logged into reality as 'root'

logging in as 'insane' is the only safe option.

Offline Gareth Bult

  • Jr. Member
  • **
  • Posts: 1
  • Karma: 0
  • Gender: Male
    • View Profile
    • Awards
Re: Cached (SSD) storage infrastructure for VM's
« Reply #2 on: November 21, 2013, 12:15:35 pm »
I guess now would be a good time to post some instructions, be aware however that this ain't easy, not least if you don't have a spare OpenNebula cloud knocking around .. :) 
  • Obtain the software;
    apt-get install git uuid lvm2 liblvm2app2.2 liblvm2cmd2.02
    cd /usr/src/
    git clone git@github.com:garethbult/vdc-nebula.git
  • Generate a config file
    mkdir -p /etc/vdc
    mkdir -p /home/datastores
    vi /etc/vdc/config
Code: [Select]
[global]
  host = <your machine name>
  proto = lsfs
  path = /home/datastores
  size = 10G

[instance]
  path = /home/datastores
  size = 10G
  proto = lsfs
  • Make sure NBD is loaded
    modprobe nbd
  • Check you have the right libraries loaded
    ldd ./vdc-config/remotes/binaries/vdc-server
    ldd ./vdc-config/remotes/binaries/vdc-store
    ("not found" means something is missing)

  • Start the Server
    ./vdc-config/remotes/binaries/vdc-server
  • Create a client cache
    ./vdc-config/remotes/binaries/vdc-tool --create -n instance -c1G lvm:<vg>/instance vdc:localhost/instance

    <vg> needs to be a local volume group which has some space; obviously the intention is that this will be
    SSD-based space, but any space will do for testing.

  • Start the cache
    ./vdc-config/remotes/binaries/vdc-store -n instance
At this point, report any problems with the instructions and give yourself a medal.
{ something is bound to go wrong before you reach this point }

Moving on, you can check /dev/vdc/mapper/instance and make sure it exists.
If it does, mkfs -t ext4 -m 0 /dev/vdc/mapper/instance ...
« Last Edit: November 21, 2013, 12:28:54 pm by Gareth Bult »
Fear is not real. It is a product of thoughts you create.

Online Mark Greaves (PCNetSpec)

  • Administrator
  • Hero Member
  • *****
  • Posts: 13878
  • Karma: 344
  • Gender: Male
  • "-rw-rw-rw-" .. The Number Of The Beast
    • View Profile
    • PCNetSpec
    • Awards
Re: Cached (SSD) storage infrastructure for VM's
« Reply #3 on: November 22, 2013, 11:17:13 pm »
Hmm .. haven't even got a spare SSD, so I can't really help with testing,

I was wondering if there was any other way I could help ?

I dunno .. documentation (hard if I don't understand it) .. maybe moving a copy of my VPS onto it and seeing if I can break it somehow .. dunno, but if you can think of *anything* gimme a shout.

--
« Last Edit: November 22, 2013, 11:22:03 pm by Mark Greaves (PCNetSpec) »
