If you want to do Xen development you should be working with upstream sources, and you should be sending your patches upstream ASAP, that is, before they are even in production. There should be no ifs or doubts about this. Doing it any other way is detrimental in the long run.
I'm new to virtualization, but from an architectural point of view I consider kvm a good reaction to the evolution of virtualization, with a focus on a clean new architecture that pairs up best with only the latest hardware enhancements. The decision not to implement in software the bells and whistles that can instead be designed with hardware support eliminates tons of support work on the software side, but it obviously relies on the assumption that folks will upgrade their hardware and that the hardware was designed properly. Xen, however, has a rich history, experience, and flexibility, so it's important to realize that there is no easy way to claim which solution is better right now.
One thing I'm sure of: both solutions at this point have a rich set of expertise and design goals to learn from. The one thing I see kvm doing right is pushing Upstream First (TM) as a motto. Xen should learn from that strategy, as there are markets and innovative groups who appreciate this tremendously. With the rapid pace of evolution of the Linux kernel there is simply no other way, and because of this Xen development should change to a strictly upstream-only model and join the Upstream First (TM) bandwagon.
In this post I will dive into the recipes required to get the latest Xen and vanilla Linux sources and get you started on the Upstream First (TM) bandwagon with Xen. I provide instructions for getting both Xen and the upstream Linux kernel configured properly. I will ignore anything not upstream on the Linux kernel, as what we need to do with that delta is just get it upstream.
Additionally, since even Debian has cast votes on supporting systemd as a Linux init replacement, I'll also provide instructions on how to get systemd support on Xen with active socket support, as it seems that's the way of the future for all Linux distributions. Both Fedora 20 and OpenSUSE 13.1 have already jumped on systemd, so you'll want proper systemd support for these. As it stands right now Xen does not have service unit files as part of its upstream sources, although patches are in the works. This post also illustrates some corner cases found while implementing support, some general systemd autotools library helpers defined to make it easier for others to integrate support for systemd, and an example code base which makes elaborate use of these helpers.
Please note that compiling Xen with systemd support, using the v5 series of integration patches documented here, produces binaries that can be used on systems running either legacy init or systemd. The systemd support patches are not yet merged upstream, but to help provide wider test coverage you should enable systemd support as per the instructions below and report any issues you find to me. Since I wish for as many folks as possible to jump on the upstream bandwagon, I'll cover instructions only for getting the latest Xen to run on the latest stable vanilla kernel over a slew of Linux distributions; this includes the Linux kernel as well as Xen, and resolving all your dependencies. I'll recommend building and embracing oxenstored for reasons I've stated before. After all, if you run into issues with the latest systemd series of patches you can easily revert back to cxenstored with a simple flip in the configuration file, either /etc/sysconfig/xencommons (rpm based distributions) or /etc/default/xencommons (Debian based distributions). (Note: this last part still needs to be worked on; right now this requires a bit more work for systemd.)
I have build tested the instructions below on OpenSUSE Tumbleweed, Debian testing, and Fedora 20. I have only run time tested them on OpenSUSE Tumbleweed and Debian testing. Reports of any run time issues on Fedora 20 and Ubuntu are appreciated. Instructions for other Linux distributions are welcome so I can extend the documentation here while the systemd support patches get baked upstream; after that I will move all documentation to the Xen wiki.
[Image]
Getting an updated /sbin/installkernel
Linux distributions shipping with grub2 will need to ensure that their /sbin/installkernel script, which has to be provided by each Linux distribution, copies the kernel configuration at custom kernel install time. The requirement for the config file comes from upstream grub2's /etc/grub.d/20_linux_xen, which will add Xen as an entry to your grub.cfg if and only if it finds either of the following in your config file:
CONFIG_XEN_DOM0=y
CONFIG_XEN_PRIVILEGED_GUEST=y
Without this, a user compiling and installing their own kernel with proper support for Xen, and with the Xen hypervisor present, will not get their grub2 update script to pick up the Xen hypervisor. Debian testing has proper support for this. OpenSUSE required a change upstream on mkinitrd, so OpenSUSE folks will want to get the latest /sbin/installkernel hosted on the OpenSUSE mkinitrd repository on github.
# If on OpenSUSE update your /sbin/installkernel
git clone https://github.com/openSUSE/mkinitrd.git
cd mkinitrd
sudo cp sbin/installkernel /sbin/installkernel
Fedora might need a similar update. I welcome feedback on confirming this.
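If you want to confirm that the config file your installkernel copied would satisfy the grub2 check described above, the test is easy to mimic. The following is only a Python sketch of that check under the assumption that an exact line match is what matters; it is not the grub2 script itself, and the function name is made up for illustration.

```python
def config_enables_xen_dom0(config_text):
    """Return True if the kernel config text has either option grub2 looks for."""
    wanted = {"CONFIG_XEN_DOM0=y", "CONFIG_XEN_PRIVILEGED_GUEST=y"}
    # Commented-out entries such as "# CONFIG_XEN_DOM0 is not set" do not match.
    return any(line.strip() in wanted for line in config_text.splitlines())
```

You would feed it the contents of the config file installed next to your kernel, which on many distributions lives under /boot (an assumption, check where your installkernel puts it).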
[Image]
Xen systemd build dependencies on OpenSUSE
# If you're now on the latest OpenSUSE you'll note it's now a
# rolling distribution base (and is also called Factory).
# The default instructions do not actually encourage you to
# install the source repositories, and even if you did
# install them the instructions disable them by default, so
# be sure to install them and enable them, otherwise
# the command zypper source-install -d won't work.
# To enable the required repository if you already had it
# installed:
sudo zypper mr -e repo-src-oss
# Get the build dependencies for Xen
sudo zypper source-install -d xen
# Things not picked up by the build dependencies
sudo zypper install systemd-devel gettext-tools \
    ocaml ocaml-compiler-libs ocaml-runtime \
    ocaml-ocamldoc ocaml-findlib glibc-devel-32bit make patch
# Get build dependencies for Linux
sudo zypper source-install -d kernel-desktop
[Image]
Xen systemd build dependencies on Debian testing and maybe Ubuntu
Note that these instructions are not to enable systemd as the init process on Debian, although there are some instructions here to help you with that if you wish to venture into that.
[Image]
Xen systemd build dependencies on Fedora 20
Fedora may need an update to /sbin/installkernel as OpenSUSE did for grub2 support; see the notes above for more details on that. Verification of this is appreciated.
# Get build dependencies for xen
sudo yum-builddep xen
# Things not picked up by the build dependencies
sudo yum install glibc-devel.x86_64 systemd-devel.x86_64
# Get build dependencies for Linux
sudo yum-builddep kernel
# If on systemd, that is, if you have /run/systemd/system/
sudo systemctl daemon-reload
The next step is to enable the systemd unit services you want. If you want to test the active socket stuff, just enable xenstored.socket, and after reboot you can use netcat as root to tickle the socket as described below. If you just want to have the xenstored service already running, enable xenstored.service, which will also enable xenstored.socket as it's a dependency.
The last step is to ensure the grub config is updated to pick up the Xen hypervisor. This varies depending on the Linux distribution. Below we cover the distributions that I have tested booting on.
Updating grub for Xen on OpenSUSE
sudo update-bootloader --refresh
Updating grub for Xen on Debian and maybe Ubuntu
sudo update-grub
Reboot and test
That's all, reboot and make sure you pick the right grub entry. Typically grub2 will list regular kernel entries and hypervisor entries separately, with the option to go into advanced settings for each one. Entering the advanced settings for the hypervisor will enable you to pick the exact kernel you want to boot. If you have hardware with virtualization capabilities you'll want to enable them; this is done through the BIOS / UEFI menu. Below are some pictures of enabling the features on a Thinkpad T440p, and then the flow through grub2.
Get into the virtualization menu on the system BIOS / UEFI menu.
[Image]
On Intel hardware this will be labeled as Intel Virtualization Technology and Intel VT-d Feature. For AMD the names are similarly flashy, typically AMD-V (often shown as SVM) and AMD-Vi (IOMMU).
[Image]
Boot into grub and you should now see an option for your distribution with the Xen hypervisor. Pick that if you want to go with the defaults, but if instead you want to browse each hypervisor available pick the advanced options.
[Image]
If you picked the default hypervisor option you should be booting into the Xen hypervisor, and that in turn will boot your kernel / distribution. If you picked the advanced option you'll see the options for the hypervisor as below. In my case I have only the bleeding edge unstable version from git of the Xen hypervisor.
[Image]
Next it will let you pick the kernel you want to boot your hypervisor with. All of the kernels with support for Xen will be displayed.
[Image]
After this you should be booting into the Xen hypervisor and this in turn will boot Linux as dom0.
After bootup
Starting xen with old init
First verify that you booted into the Xen hypervisor as follows:
mcgrof@garbanzo ~ $ cat /sys/hypervisor/type
xen
You're all set, the next step is to start Xen. On Linux distributions still on old init, like Debian right now, you just have to run the old init script. This is done as follows:
mcgrof@garbanzo ~ $ sudo /etc/init.d/xencommons start
Starting /usr/local/sbin/oxenstored...
Setting domain 0 name and domid...
Starting xenconsoled...
Starting QEMU as disk backend for dom0
mcgrof@garbanzo ~ $ echo $?
0
You are ready to start creating guests!
Starting xen with systemd
First thing is to ensure your dom0 is now booted on the xen hypervisor. If you have systemd you can do this easily with:
mcgrof@ergon ~ $ sudo systemd-detect-virt
xen
Under the hood this is the same as the following:
mcgrof@garbanzo ~ $ cat /sys/hypervisor/type
xen
If you only enabled xenstored.socket you can verify the sockets by:
Next, to see the active socket magic trigger, you can use netcat to tickle any of the sockets. Since the permissions only grant access to the root user, you'll need root to tickle the socket.
mcgrof@ergon ~ $ sudo systemctl status xenstored.service
xenstored.service - Xenstored - daemon managing xenstore file system
Loaded: loaded (/usr/local/lib/systemd/system/xenstored.service; disabled)
Active: active (running) since Tue 2014-05-20 04:33:09 PDT; 1 day 16h ago
Main PID: 1621 (oxenstored)
CGroup: /system.slice/xenstored.service
└─1621 /usr/local/sbin/oxenstored --no-fork
May 21 21:24:24 ergon oxenstored[1621]: xenstored is ready
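If you don't have netcat handy, tickling a socket is nothing more than connecting to it. Here is a minimal Python sketch of that, assuming the socket is a listening AF_UNIX stream socket; the helper name is made up, and the path you pass in depends on where your build placed the xenstored sockets.

```python
import socket


def tickle(path, timeout=2.0):
    """Connect to a listening AF_UNIX stream socket and close right away.

    With socket activation a successful connect is what makes systemd
    start the service behind the socket. Returns True if the connect worked.
    """
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(path)
        return True
    except OSError:
        # Missing socket file, permission denied, nobody listening, ...
        return False
    finally:
        s.close()
```

Run it as root against the socket path, then check the service status again to see whether the daemon got spawned.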
Why you want active sockets
Systemd has support for "active sockets" or "socket based activation", but this concept is not new: socket based activation was pioneered by Apple's Launchd, software that was released under the Apache 2.0 license. That project got its first release in 2005, while systemd's initial release dates to 2010. Go and watch Dave Zarzycki's talk at Google about Launchd. There are tons of talks about systemd; here's an old introduction talk about it by Lennart Poettering, and Lennart does give Apple proper kudos here. Systemd is simply über optimized for Linux; it takes advantage of tons of special Linux kernel enhancements. Socket based activation is ideal for local services using AF_UNIX sockets, although support does exist for inet sockets as well.
There are two reasons why you want active sockets:
On demand auto-spawning
Help with bootup parallelization
The on demand auto-spawning can be taken advantage of by Xen if and only if its tools are converted to try to open the unix socket when they run, but they currently don't do this, and some communication uses the kernel ring interface, not the unix domain sockets. If you use stubdoms you also never end up using the unix domain sockets. The gains from parallelization however are always welcome: you essentially let systemd figure out how to bring things up by associating dependencies rather than trying to pile things up in a specific strict numbered order. This is all controlled by the service unit files and the requirements specified. Udev lends a hand here as well, which is now merged as part of systemd, but I'll have to cover udev in another post. If you had an ecosystem that you were sure did not require the service to be running all the time, and you didn't need the kernel ring interface up immediately, you could either enable only xenstored.socket or remove this section from xenstored.service:
WantedBy=multi-user.target
A few things worth noting for daemons and systemd that I do not see clearly covered in documentation are the exact expectations for the different service types. Systemd supports different types of daemons. For those that don't fork you should declare in your service unit file a type of:
Type=simple
For daemons that do call fork() you should use the following:
Type=forking
In the legacy init world this covers most of the daemons out there. There's a bit of a caveat here though: systemd expects you to behave in a certain way if you use Type=forking. Your first parent process should be the one to call sd_listen_fds(); you should not let child processes make the sd_listen_fds() call. What daemons do varies, and systemd's assumption that daemons set up their sockets in the parent rather than in children means daemons will need a bit of a change in order to work with systemd properly, as there is no way to tell systemd a child is going to be the main process, even if you try sd_notifyf() with the process ID of the child. Arguably there's a good reason for this though: you should consider using Type=notify instead. When you use this type of service you don't fork as part of your daemonizing effort, instead you just tell systemd when your service is ready with sd_notify().
There are some curious architectural design principles worth elaborating on that come with this, and they highlight a mistake typically in place on some daemons that do fork. When daemonizing and forking, killing the parent immediately is the easiest and fastest way from a programmer's perspective, but it should typically not be done: a regular legacy init that spawns daemons in order will let later processes make use of the daemon under the impression that the daemon is ready, leaving a small window for a race condition to trigger. Typically this is addressed with nasty undocumented workarounds, for example retrying connections to the unix domain sockets that daemons are expected to create after initialization. Mind you, the race window is small yet very possible to hit, especially if we want to boot up fast. This is one of the races that systemd services using sd_notify() avoid by design. This is pretty cool.
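To make the readiness notification less magical: what sd_notify() sends is a small datagram to the AF_UNIX socket named in the NOTIFY_SOCKET environment variable. Below is a minimal Python sketch of that protocol, written from the documented behaviour rather than lifted from systemd or Xen; the function name and the env parameter are assumptions added so it can be exercised without systemd.

```python
import os
import socket


def notify_ready(state=b"READY=1", env=os.environ):
    """Send a state datagram the way sd_notify() does; True if it was sent."""
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False  # not started by systemd with notification enabled
    if path.startswith("@"):
        path = "\0" + path[1:]  # "@" marks a Linux abstract namespace socket
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as s:
        s.sendto(state, path)
    return True
```

The daemon calls this once its sockets are set up and it can actually serve requests, which is exactly the point at which dependent services are safe to start.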
funk-systemd - example complex systemd daemon
Apart from corner cases there are also the complexities introduced by the different types of build systems and target systems, especially for projects which really want to support multiple operating systems and init systems, such as Xen. To address different build environments and targets a lot of projects use autotools. Xen follows this practice, so integrating support for systemd on Xen required proper autotools support. Autotools support with systemd can get complicated fast. You see, systemd does not allow variable substitution on ExecStart settings for the binary you wish to run, which means that if your project uses configure to dynamically place the path of the binary you will also need proper replacement of the paths at configure time. With autotools this is accomplished with the AC_CONFIG_FILES() helper, but in order to make use of some paths with AC_CONFIG_FILES() you'll want to eval them and call AC_SUBST() on them. This is not only useful for ExecStart; also consider the different placements of the socket files. If using ${prefix} for any of the paths you will need to work with a not-so-well documented $ac_default_prefix. You also have to consider the different types of build environments and the different types of target systems that a project wishes to support with a single produced daemon binary.
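The reason the paths need an eval before substitution is that autoconf's directory variables refer to each other, so a plain @VAR@ replacement would leave a literal ${prefix} in the unit file. The following Python sketch imitates the idea; it is not the m4 helper itself, and the variable table simply uses autoconf-style defaults with /usr/local as the prefix.

```python
import re


def expand(value, variables):
    """Resolve nested ${name} references, like repeated shell eval would."""
    pattern = re.compile(r"\$\{(\w+)\}")
    for _ in range(10):  # bounded so that reference loops cannot spin forever
        expanded = pattern.sub(lambda m: variables[m.group(1)], value)
        if expanded == value:
            return expanded
        value = expanded
    raise ValueError("too many nested references in " + value)


def render(template, variables):
    """Replace @NAME@ placeholders with fully expanded values."""
    return re.sub(r"@(\w+)@",
                  lambda m: expand(variables[m.group(1)], variables),
                  template)


# Directory variables point at each other until the prefix is reached.
variables = {
    "prefix": "/usr/local",
    "exec_prefix": "${prefix}",
    "SBINDIR": "${exec_prefix}/sbin",
}
unit_line = render("ExecStart=@SBINDIR@/oxenstored --no-fork", variables)
```

Here unit_line ends up with an absolute path, which is what systemd insists on for the binary in ExecStart.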
The build environments vary. A project may wish to force systemd to be present, some may wish to only use systemd if the development libraries are present, and others may wish to require you to specify explicitly that you want systemd. The target systems vary as well. In the worst case scenario a project may wish to support legacy init with and without systemd libraries present, and also the case where systemd is the init process. In this situation, if it's desirable to support a single binary for all types of init systems, the dynamic link loader (using dlopen() and dlsym()) can be used, or an in-place replacement for sd_booted() can be implemented instead of relying on and calling the systemd helper sd_booted(). A project such as Xen that supports two daemons for the same type of service also needs to consider which route to take for supporting and maintaining service unit files for the different possible daemons. There are different strategies for this.
A lot of this is not well documented, and good examples for projects as complex as Xen's build system are not readily available, let alone ones that cover all the cases I've described. Because of all this, and since I ended up doing the work for systemd Xen integration, I made sure to try to generalize a solution and address all types of environments as described above. I have also stuffed in a sample daemon which covers and documents the legacy init corner case that sd_notify() explicitly addresses. You can find the sample code here; the autoconf helpers defined and documented here are also being submitted as part of the Xen systemd integration patches.
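To give a feel for the two single-binary routes, here is a hedged Python sketch: it tries to load the systemd library at run time (the equivalent of dlopen() and dlsym()), and otherwise falls back to an in-place check for the /run/systemd/system directory. The library name libsystemd.so.0 is an assumption about your system (it has gone by other names), and none of this is the funk-systemd or Xen code.

```python
import ctypes
import os


def booted_fallback(run_dir="/run/systemd/system"):
    """In-place stand-in for sd_booted(): systemd creates this directory at boot."""
    return os.path.isdir(run_dir)


def systemd_booted():
    """Prefer the library's sd_booted() when it can be loaded at run time."""
    try:
        lib = ctypes.CDLL("libsystemd.so.0")  # the dlopen() step
        fn = lib.sd_booted                    # the dlsym() step
    except (OSError, AttributeError):
        return booted_fallback()              # library or symbol not available
    return fn() > 0                           # positive means systemd is init
```

The fallback on its own is the simpler of the two, since it needs no linking tricks at all.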
To look at an example solution for the legacy init race condition, look at the usage of funk_wait_ready(), which is called on the parent process that forks. As for Xen, the legacy init script has a retry counter as part of it; we should be able to remove that code with a similar solution for the legacy socket implementation. In this tree you will also find a few helpers, which Xen's systemd integration patches make use of, if you want to get ramped up with systemd and autoconf:
src/m4/systemd.m4 - systemd autoconf library which enables easy build integration support for systemd. There are four build options supported:
AX_ENABLE_SYSTEMD() - enables systemd by default and requires an
explicit --disable-systemd option flag to configure if you want to
disable systemd support.
AX_ALLOW_SYSTEMD() - systemd will be disabled by default and
requires you to run configure with --enable-systemd to look for and
enable systemd
AX_AVAILABLE_SYSTEMD() - systemd will be disabled by default but if
your build system is detected to have systemd build libraries it will be
enabled. You can always force disable with --disable-systemd. This is
the option we have decided to use for Xen.
If you want to use the dynamic link loader you should use AX_AVAILABLE_SYSTEMD() but must then ensure you use -rdynamic -ldl when linking. If using automake, autotools will deal with this for you, otherwise you must ensure this is in place in your Makefile.
src/m4/paths.m4 - Implements AX_LOCAL_EXPAND_CONFIG() which you can use to replace meta @VAR@ variables on files defined with AC_CONFIG_FILES(). You might want to make use of this for example on the systemd service unit file ExecStart, on the socket definition file, and/or the code that connects to the sockets.
src/funk_dynamic_helpers.c - example systemd integration implementation using the dynamic link loader, that is dlopen() and dlsym(), which can be used for one-binary-fits-all solutions. Although a solution with this strategy was tested for systemd, this is not the option we are going to support on Xen.
funk daemon with-autoconf implementation - example implementation with the above helpers with autoconf support alone
funk daemon with-automake implementation - example implementation with the above helpers with automake support
README and INSTALL - read these for more details on this example
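The general shape of the fix for the legacy init race, where the forking parent waits until the child says it is ready before going away, can be sketched in a few lines. This is not the funk_wait_ready() code; it is a Python illustration that assumes a pipe is used for the handshake, with made-up names.

```python
import os


def daemonize_and_wait(child_main):
    """Fork and return (pid, ready) in the parent.

    ready is True only if the child reported readiness before exiting, so a
    parent that exits after this call never advertises a half-started daemon.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: run the daemon body, handing it a callback to signal readiness.
        os.close(read_fd)
        try:
            child_main(lambda: os.write(write_fd, b"1"))
        finally:
            os._exit(0)
    # Parent: block until the child writes or all write ends are closed.
    os.close(write_fd)
    ready = os.read(read_fd, 1) == b"1"
    os.close(read_fd)
    return pid, ready
```

A real daemon would call the readiness callback right after its sockets are listening, and the parent would exit with an error instead of success when ready comes back False.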
Systemd support for projects with multiple daemon replacements
Xen is a good example of a project that requires support for multiple alternative binaries that can run as the daemon. For such situations there are a few possible solutions; this has been discussed only briefly on the systemd-devel list. You can end up implementing one of the following:
Define a service unit file for each daemon, and define one target which defines the overall service. Service unit files that require the service will require the target, not the actual service unit file. The service unit files are then mutually exclusive with each other, and the system administrator has to manually select which service unit to enable. The downside to this strategy is that you end up with multiple service unit files which in the worst case are identical and only differ in the ExecStart path.
Define a service unit file for each daemon and define an
Alias=foo.service for the general service. Services that need to depend
on this service would then Require the alias, not the specific service
file for each binary. The same downside is present with this solution.
One service file and environment variables to be used by a binary launcher which will use getenv() and execve() to launch the respective preferred daemon. This option gives the flexibility to be easily compatible with legacy init daemons that typically require /etc/sysconfig/ or /etc/default/ configuration files. Although Lennart has clarified that ideally the systemd way would be to ignore /etc/sysconfig and /etc/default altogether, this solution would still let you ignore /etc/sysconfig/ and /etc/default/ by requiring the default variable to be set via Environment=FOO_DEFAULT_DAEMON=/usr/local/sbin/bar. For support with legacy init systems, EnvironmentFile=-/etc/sysconfig/foodaemon and EnvironmentFile=-/etc/default/foodaemon can be used.
No example code or service unit files for Xen are provided at this point; what we end up doing for Xen remains to be decided.
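Purely to illustrate the third option, and not as a preview of what Xen will do, a launcher can be tiny. The Python sketch below reuses the FOO_DEFAULT_DAEMON variable and the /usr/local/sbin/bar path from the example above; both are placeholders, as are the function names.

```python
import os

# Placeholder default, same path as in the Environment= example above.
DEFAULT_DAEMON = "/usr/local/sbin/bar"


def pick_daemon(env):
    """Return the daemon binary selected through FOO_DEFAULT_DAEMON."""
    return env.get("FOO_DEFAULT_DAEMON") or DEFAULT_DAEMON


def launch(argv, env=None):
    """Replace the launcher process with the chosen daemon.

    os.execve() does not return on success, so the daemon keeps the
    launcher's PID, which is the PID the service manager tracks.
    """
    env = dict(os.environ if env is None else env)
    daemon = pick_daemon(env)
    os.execve(daemon, [daemon] + list(argv), env)
```

The service unit would then point ExecStart at the launcher, and the variable would come from either Environment= or one of the EnvironmentFile= entries.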
Ocaml and systemd support
Xen has an OCaml implementation of the xenstore, so as you can imagine we also had to add some support for systemd with OCaml. I won't provide examples here, but just note that support has been provided using a C interface wrapper. For details please review the posted patches.