Thursday, July 13, 2023

Zones, way back when

The original big-ticket feature in Solaris 10 was Zones, a simple virtualization technology that allowed a set of processes to be put aside in a separate namespace and be under the illusion that this was a separate computer system, all under a single shared kernel.

As a result of this sleight of hand, you could connect to a zone using ssh (or, remember this was way back, telnet or rsh), and from the application level you really were in a separate system - with your own file system and network namespaces. It was like magic.

Of the features in Solaris 10, Zones and DTrace were present early in the beta cycle, while SMF just made it into the last couple of beta builds, and ZFS wasn't actually available to customers until well after the first Solaris 10 release.

I ended up using zones in production quite accidentally. In the Solaris 10 Platinum Beta, we were testing the new features, just giving them a good beating, when one of our webservers (it was something like a Netra X1) died. Sure, we could have got it repaired, or reconfigured another server. But as an experiment, I simply fired up a zone on one of my beta systems, gave it the IP address of the failed server, installed apache, copied over the website, and we were back in service in about 5 minutes.

The Zones framework turns out to be incredibly flexible and powerful. I suspect most don't realize just what it's actually capable of, as Sun only gave you a canned product in two variations - whole-root and sparse-root zones. Later you saw glimpses of the power available with the first incarnation of LX zones (or SCLA - Solaris Containers for Linux Applications) and then the Solaris 8 and Solaris 9 containers, which allowed a different set of applications to run inside a zone.

Things actually became more limited in OpenSolaris and its derivatives such as Solaris 11; not only was LX removed, but so were sparse-root zones, and the diversity of potential zone types dwindled.

In illumos, some of the distributions have pushed Zones a bit further. Tribblix brought back sparse-root zones, and introduced the alien brand - essentially a way to run any illumos OS or application in a zone. OmniOS has brought back LX, and it's reasonably current (in terms of keeping up with changes in the Linux world). SmartOS ran KVM in Zones, allowing double-hulled virtualization. And we now have bhyve as a fully supported offering for any illumos distribution, usually embedded in a Zone.

Using a sparse-root zone is incredibly efficient. By sharing the main operating system files (mostly /lib and /usr, but there can be others) you can save huge amounts of disk space - you only have to have one copy, so that's a saving of anything from a couple of hundred megabytes to a couple of gigabytes of storage per zone. It gets better, because the read-only segments of any binaries and shared libraries are shared between zones, which dramatically reduces the additional memory footprint of each zone. Further on from that, because Solaris has this trick whereby any shared object used more than 8 times (or something like that) is kept resident in memory, all the common applications are always in memory and start incredibly quickly.

One of the things I did was use sparse-root zones and shared filesystems for a development -> test -> production setup. Basically, you create 3 zones (sparse-root ensures they're identical) and 3 filesystems - one each for development, test, and production. You share the development filesystem read-only into the test zone, so deployment from development to test is a straight copy. Likewise, the test filesystem is shared read-only into the production zone, so test to production is a straight copy too.
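
As a rough illustration of that layout, here is a small Python sketch that writes out zonecfg command files for the three zones. It's not what I actually ran back then: the zone names, the /zones and /export/deploy paths, and the /deploy mount points are all made up for the example. It also assumes Solaris 10 style zonecfg, where a plain create gives you the default sparse-root template, and uses lofs mounts with the ro option for the read-only shares.

```python
# Sketch only: zone names, zonepaths, and mount points are hypothetical.
# Assumes Solaris 10 style zonecfg, where a plain "create" uses the
# default (sparse-root) template.

STAGES = ["development", "test", "production"]


def fs_block(directory, special, read_only):
    """Return zonecfg lines that loopback-mount `special` at `directory`."""
    lines = [
        "add fs",
        f"set dir={directory}",
        f"set special={special}",
        "set type=lofs",
    ]
    if read_only:
        lines.append("add options [ro]")
    lines.append("end")
    return lines


def zone_commands(stage, upstream=None):
    """Build the zonecfg command file text for one stage.

    Each zone gets its own filesystem read-write and, if there is an
    upstream stage, that stage's filesystem read-only.
    """
    lines = ["create", f"set zonepath=/zones/{stage}"]
    lines += fs_block(f"/deploy/{stage}", f"/export/deploy/{stage}", False)
    if upstream is not None:
        lines += fs_block(
            f"/deploy/{upstream}", f"/export/deploy/{upstream}", True
        )
    lines += ["verify", "commit"]
    return "\n".join(lines) + "\n"


def all_zone_commands():
    """Map each stage name to its zonecfg command file text."""
    configs = {}
    upstream = None
    for stage in STAGES:
        configs[stage] = zone_commands(stage, upstream)
        upstream = stage
    return configs


if __name__ == "__main__":
    for stage, text in all_zone_commands().items():
        print(f"# save as {stage}.cfg, then: zonecfg -z {stage} -f {stage}.cfg")
        print(text)
```

With something like this in place, promoting a release inside the test zone is just a copy from /deploy/development (read-only) to /deploy/test, and the production zone can never write back into test.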

One of the weaknesses of the way that zones were managed (distinct from the underlying technology framework) is that it was based around packaging. In Solaris 10, packaging and packages knew about zones, and the details about what files and packages ended up in a zone were embedded in the package metadata. Not only is this complex, it's also very rigid - you can't evolve the system without changing the packaging system and modifying all the packages. Sadly, IPS carried forward the same mistake. (In Tribblix, packaging knows nothing about zones whatsoever, but my zones understand packaging and can do the right thing with it - not only with much more flexibility but many times quicker.)

Later on in the Solaris 10 timeframe we got ZFS, which allowed you to do interesting things around sharing data and quickly creating copies of data for zones, allowing you to extend the virtual capabilities of zones from CPU and memory to storage. And the key missing piece, virtualized networking, never made it to Solaris 10 at all, but had to wait for Crossbow to arrive in OpenSolaris.