Auto Restart KVM VMs while forcing KVM snapshot reversion

(this is mostly notes for my later self - feel free to drop me a line to ask questions)

TLDR My problem:

I have some stateless throwaway vm instances which discard all changes at every power-off. The closed-source software in the vm also has driver issues that occasionally require a reboot, which is detected from inside the vm and initiated as an OS reboot. When rebooting, I want to simulate power-off/power-on to trigger the KVM snapshot discard, and I want to make sure that any vm that shuts down gets restarted. How? Obscure qemu-kvm options and libvirt hooks to the rescue: a special use for -snapshot and -no-reboot.

More detail:

I use libvirt (virsh) to manually control some transient, sort of stateless KVM virtual machines. I need to periodically stop/start those vms automatically (from inside so I can do a clean service shutdown). Also I want to auto-restart particular vms any time they shut down. The VMs run qemu-kvm with the -snapshot option, which throws away all system changes at every power-off.

I have a closed-source application that occasionally gets into an error state that cannot be resolved by restarting the software (driver issues). It requires a full (virtual) system reboot.

A small monitor was written prior to my involvement in this project that can detect the unfixable error state, and initiate a nice clean service shutdown and reboot to resolve the problem. In KVM this results in a warm boot which doesn't throw away the -snapshot saved changes.

So, qemu-kvm has an option -no-reboot that forces process exit when the vm tries to do a warm boot. This shuts off the VM but does not restart it. I need to auto-restart vms that shut themselves down by trying to reboot.

So, I really have three requirements:

  • qemu-kvm needs to be invoked with -snapshot (fresh image every power-off).
  • Reboots should really be a libvirt stop/start, to get a 'cold boot' effect and throw away the snapshot.
  • Servers can reboot themselves. These warm boots are turned into a VM stop, and the VMs need to be auto-restarted asap.

While qemu-kvm has the support I need, the version of libvirt on CentOS 7 that I use doesn't have direct support for either option. So I wrap qemu-kvm in a small script and specify that script as a custom emulator for these vms. The wrapper passes both -snapshot and -no-reboot like this:

#!/bin/sh
exec /usr/libexec/qemu-kvm "$@" -snapshot -no-reboot

And then we replace the <emulator> block in the vm definition so that it points at this wrapper. This meets my first two requirements, but any vm that reboots or is periodically shut down will stay off.
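One way to point a vm at the wrapper is to rewrite the emulator path in its domain XML. The sketch below assumes the wrapper was saved as /usr/local/bin/qemu-kvm-snapshot (a made-up path) and that the vm is named foo. set_emulator is a hypothetical helper for illustration, not part of libvirt.

```shell
#!/bin/sh
# Hypothetical location of the wrapper script shown above.
WRAPPER=/usr/local/bin/qemu-kvm-snapshot

# set_emulator PATH: read domain XML on stdin and write it to stdout
# with the contents of the <emulator> element replaced by PATH.
set_emulator() {
    sed "s|<emulator>[^<]*</emulator>|<emulator>$1</emulator>|"
}

# Example (not run here): rewrite the stored definition of a vm named foo.
#   virsh dumpxml --inactive foo | set_emulator "$WRAPPER" > /tmp/foo.xml
#   virsh define /tmp/foo.xml
```

Editing the same element by hand with 'virsh edit foo' should have the same effect; the helper only saves typing when several vms need the change.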

What can I do to make sure they are always running? First mark them to start on host boot:

virsh autostart foo

which makes them start when the host boots but I still need to make sure they get restarted if they stop.

There are a few options I considered to solve this problem.

First, I could add some kind of 'forever' loop to the emulator script above that will just run the emulator again once it exits. e.g.

#!/bin/sh
# no exec here: exec would replace the shell and the loop would never repeat
while true; do
    /usr/libexec/qemu-kvm "$@" -snapshot -no-reboot
done

Second, I could write some kind of supervisor that tries to start any stopped vms, such as a standalone daemon or a cronjob that runs every minute.
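A sketch of what the cron variant might look like, under the assumption that the vms to keep alive are exactly the ones marked with 'virsh autostart'. The VIRSH variable and the --run flag are conventions made up for this sketch so the logic can be dry-run without touching a real host.

```shell
#!/bin/sh
# Command used to talk to libvirt; overridable so the logic can be dry-run.
VIRSH=${VIRSH:-virsh}

# Start every defined-but-stopped domain that is marked autostart.
start_stopped_vms() {
    "$VIRSH" list --inactive --autostart --name | while read -r dom; do
        [ -n "$dom" ] || continue            # skip the trailing blank line
        "$VIRSH" start "$dom" < /dev/null    # keep virsh off the loop's stdin
    done
}

# Only act when asked to, so the file can be sourced without side effects.
# The --run flag is a convention made up for this sketch.
if [ "${1:-}" = "--run" ]; then
    start_stopped_vms
fi
```

Saved under a hypothetical name such as /usr/local/sbin/restart-stopped-vms, it could be run from a cron entry like '* * * * * root /usr/local/sbin/restart-stopped-vms --run'. The cost of this approach is that a vm can stay down for up to a minute.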

Third, libvirt supports event hooks. I can run my own code when certain events happen. The hook would watch for shutdown events and then 'virsh start' a vm that just shut down. This is the path I started with (because I use hooks for some autoscaling functionality too). The problem is a deadlock: a "virsh start foo" run from inside the hook executes before the VM has actually stopped, and the process hangs forever waiting for the shutdown to finish.

I confess to taking the lazy way out. I have a hook that gets shutdown events, forks a child in the background and returns. The child sleeps for 3 seconds, presumably enough for the vm to stop, and then does a 'virsh start' like this:

#!/bin/sh
# install into /etc/libvirt/hooks/qemu
# any vm that shuts down will be restarted

if [ "$#" -lt 2 ]; then
    echo "usage: $0 <domname> <event> end -" >&2
    exit 1
fi

# release event signifies shutdown is finished
if [ "$2" = 'release' ]; then
    # background child so the hook itself returns right away;
    # the domain name is handed to the child as its first argument
    sh -c 'sleep 3; virsh start "$1"' restart "$1" < /dev/null > /dev/null 2>&1 &
fi

Dirty, but it works. Note that CentOS 7 does not have /etc/libvirt/hooks by default. You have to make that directory and restart libvirtd to pick up any hooks you add.
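The install step could be sketched like this. install_hook is a hypothetical helper, and ./qemu-hook is a placeholder for wherever the hook script was saved; the directory argument exists only so the function can be tried against a scratch directory first.

```shell
#!/bin/sh
# install_hook SRC [DIR]: copy a hook script into place as DIR/qemu.
# DIR defaults to the libvirt hooks directory named in the text.
install_hook() {
    src=$1
    dir=${2:-/etc/libvirt/hooks}
    mkdir -p "$dir" && install -m 0755 "$src" "$dir/qemu"
}

# Example (as root, not run here); ./qemu-hook is a placeholder name:
#   install_hook ./qemu-hook && systemctl restart libvirtd
```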

If I didn't have other hooks code (for autoscaling), I probably would have gone with the first option above as it is a trivial few more lines of bash.
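The fixed 3 second sleep in the hook is a guess at how long the shutdown takes. A possible refinement, sketched here under the assumption that the vm is a persistent (defined) domain so that 'virsh domstate' still works after it stops, is to poll the state and start the vm as soon as it reports "shut off". restart_when_stopped and the VIRSH variable are made up for this sketch.

```shell
#!/bin/sh
# Command used to talk to libvirt; overridable so the logic can be dry-run.
VIRSH=${VIRSH:-virsh}

# restart_when_stopped DOMAIN [TRIES]: poll once a second until the domain
# reports "shut off", then start it. Gives up after TRIES polls (default 10).
restart_when_stopped() {
    dom=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        if [ "$("$VIRSH" domstate "$dom" 2>/dev/null)" = "shut off" ]; then
            "$VIRSH" start "$dom"
            return
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1    # never saw the domain stop
}
```

Inside the hook it would still have to run in the background, for example 'restart_when_stopped "$1" 10 < /dev/null > /dev/null 2>&1 &', so that the hook returns before virsh talks to libvirtd.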

Is there some other way I should have done this? Drop me a line if you see something obvious I missed.


Bashcpio - pure bash (almost) cpio archive extraction

Sun 05 June 2016 by Fred Clift

Ok - let's get this out of the way. Is this important? No. Is it groundbreaking? No. Can I even explain how cool it is to non-technical friends? No. Could it ever possibly be useful? Maybe! see below.

I just spent a few evenings writing a pure-bash cpio extraction implementation ...

Apple mail app TLS deficiencies revisited

Fri 02 October 2015 by Fred Clift

A quick update:

Mac OS X El Capitan has hit the streets so I retested. The mail app still cannot make any TLS connections using TLSv1.1 or 1.2.

Also, /u/Tulsagrammer on Reddit pointed out to me that the mystery of why this is so ...

Frustration with Apple mail app on IOS and Yosemite

Thu 10 September 2015 by Fred Clift

At work I recently have been irritated by a problem that was exposed with Apple's mail app, both on iOS and Yosemite. Among other things, I maintain an imap server (using dovecot) for our office email.

First some background

Dovecot makes it easy to enable TLS, and disallow unencrypted ...

What do you get from a $200 multimeter?

Thu 02 October 2014 by Fred Clift

I was recently discussing with friends how I like Apple computers (e.g. the MacBook Pro I use for work) but have a hard time justifying the extra cost when in many cases I can get something 'good enough' for half the price.

And sometime around that conversation, the fact ...

Not Sharing python command history between python2 and python3

Tue 16 September 2014 by Fred Clift

I'm finally getting around to switching most-of-the-time to python3.

One minor annoyance was that I had to fix my simple .pythonrc.py file that I use to turn on command history for python2.

In python 3, it appears that readline, tab-completion, and command history between sessions are automatically enabled ...

Chromecast - This is why we can't have nice things

Thu 11 September 2014 by Fred Clift

There was a recent chromecast firmware update from google. This wasn't the long awaited new functionality promised at IO. It ostensibly has better tab-casting performance and some bug fixes (e.g. change in some apis related to subtitles, among others).

There has been some speculation that the reason for ...

Chromecast - steps closer to a python native api

Fri 05 September 2014 by Fred Clift

So, after seeing this: https://gist.github.com/TheCrazyT/11263599 I got more interested in being able to speak the native chromecast api from python.

That led me to this presentation by Sebastian Mauer: http://www.slideshare.net/mauimauer/chrome-cast-and-android-tv-add14 especially slide 20, which led to a bit more google ...

Chromecast - Displaying arbitrary URLs using pychromecast

Mon 25 August 2014 by Fred Clift

Continuing on with my Chromecast experiments... I have been playing with a python library on github by Paulus Schoutsen called pychromecast. At various points in its life it has been able to interact with Chromecast devices to do a variety of things. With the official SDK release and firmware ...

Chromecast - both cool and frustrating

Thu 21 August 2014 by Fred Clift

I recently purchased a Chromecast device. For the price it's a great media streamer. You can control it from your Android or iOS smartphone from many different apps. For the Chrome browser, you need to install a browser plugin. There are many websites that, when viewed in the Chrome browser, give ...

Notes on chrome remote debugging

Thu 07 August 2014 by Fred Clift

These are mostly notes for me, but you might find them useful also.

On my laptop, I run chrome and usually have many, many tabs open across a few windows. Google searching for me usually ends up with me open-in-new-tabbing the first 5 or 10 links concurrently... It bugs me ...

Why the Nook won my dollars over Kindle

Fri 18 July 2014 by Fred Clift

Summer Vacation Time.

So I vacation on the beach every summer. I read a lot. This year I decided I would try out an ebook reader. Because the plan is to be outdoors on the beach, E-Ink readers seem to be the way to go. There are two big names ...

Managing Lots Of Pregenerated HTML And Other Files With Pelican

Sat 14 June 2014 by Fred Clift

For one of the Pelican-managed websites I maintain, I have a lot of files that I don't really want to manage with Pelican, and in some cases, I can't easily, without lots of gymnastics.

In this case, I have about 30k html files that are a dump of ...

Why I wont fix your computer, part 3

Thu 12 June 2014 by Fred Clift

Not quite a computer, but...

I have an Embryon Pinball machine made by Bally in the early 80s - one of their early solid-state machines.

See either the Internet Pinball Database or Pinside for pictures and more info.

When I bought this machine, I got a great deal on it because ...

Why I wont fix your computer, part 2

Wed 11 June 2014 by Fred Clift

On being stupid with my own hardware

I built a mame machine into a dead arcade cabinet for the breakroom of a former employer. Well, it was for me, and for a few others that had regularly been playing an original upright Vs Super Mario Brothers, and then later an ...

Embedding PHP in Pelican-generated Static pages

Tue 10 June 2014 by Fred Clift

So, I wanted to make a simple website with pelican. But I had a little legacy php code that I still wanted to function.

It sure would be nice, I thought, if I could take my php app and with only small tweaks make it work as PART of pelican ...

Trying out Pelican static site generator.

Tue 10 June 2014 by Fred Clift

TLDR


I'm running a couple of websites with the pelican static website generator because it's easy to maintain, lightweight, and kind of future-proof.

History


So I've tried a bunch of tools over the years to make personal web pages. I hand-rolled html (for a class) in the ...

Why I wont fix your computer, part 1

Tue 10 June 2014 by Fred Clift

How not to upgrade a server

I was working as a unix admin at a private university. A research lab wanted an OS upgrade on their lab NFS and web server, which was indirectly my responsibility. User data, webserver, web content, all on an external (scsi) drive. I showed up ...
