tag:blogger.com,1999:blog-97268332024-03-11T03:23:24.287+00:00The Trouble with Tribbles...Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.comBlogger571125tag:blogger.com,1999:blog-9726833.post-83313023721787281702024-02-19T20:19:00.000+00:002024-02-19T20:19:24.915+00:00The SunOS JDK builder<p>I've been building <a href="https://ptribble.blogspot.com/2021/12/keeping-java-alive-on-illumos.html">OpenJDK on Solaris and illumos</a> for a while.</p><p>This has been moderately successful; illumos distributions now have access to up to date LTS releases, most of which work well. (At least 11 and 17 are fine; 21 isn't quite right.)</p><p>There are even some third-party collections of my patches, primarily for Solaris (as opposed to illumos) builds.</p><p>I've added another tool. The <a href="https://github.com/ptribble/jdk-sunos-builder">SunOS jdk builder</a>.</p><p>The aim here is to be able to build every single jdk tag, rather than going to one of the existing repos which only have the current builds. And, yes, you could grope through the git history to get to older builds, but one problem with that is that you can't actually fix problems with past builds.</p><p>Most of the content is in the <a href="https://github.com/ptribble/jdk-sunos-patches">jdk-sunos-patches</a> repository. Here there are patches for both illumos and Solaris (they're ever so slightly different) for every tag I've built.</p><p>(That's almost every jdk tag since the Solaris/SPARC/Studio removal, and a few before that. Every so often I find I missed one. And there's been the odd bad patch along the way.)</p><p>The idea here is to make it easy to build every tag, and to do so on a current system. I've had to add new patches to get some of the older builds to work. The world has changed, we have newer compilers and other tools, and the OS we're building on has evolved. 
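The tag-by-tag approach can be sketched as a simple loop. This is an illustrative sketch only — the tag names, the patch directory layout, and the build step shown are assumptions for the example, not the builder's actual interface:

```shell
# Illustrative only: walk a sequence of jdk tags, pairing each tag with
# its patch directory. Names and layout are invented for the sketch;
# the echo stands in for the actual patch-and-build step.
built=""
for tag in jdk-17.0.1-ga jdk-17.0.2-ga jdk-17.0.3-ga; do
  patchdir="jdk-sunos-patches/illumos/${tag}"
  echo "would apply patches from ${patchdir} and build ${tag}"
  built="$built $tag"
done
```

The point of keeping a patch set per tag is exactly that a loop like this can rebuild (or fix) any historical tag, not just the current one.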
So if someone wanted to start building the jdk from scratch (and remember that you have to build all the versions in sequence) then this would be useful.</p><p>I'm using it for a couple of other things.</p><p>One is to put back SPARC support on illumos and Solaris. The initial port I did was on x86 only, so I'm walking through older builds and getting them to work on SPARC. We'll almost certainly not get to jdk21, but 17 seems a reasonable target.</p><p>The other thing is to enable the test suites, and then run them, and hopefully get them clean. At the moment they aren't, but a lot of that is because many tests are OS-specific and they don't know what Solaris is so get confused. With all the tags, I can bisect on failures and (hopefully) fix them.<br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-11610366693160404282023-11-22T14:31:00.000+00:002023-11-22T14:31:19.049+00:00Building up networks of zones on Tribblix<p>With OpenSolaris and derivatives such as illumos, we gained the ability to build a whole IT infrastructure in a single box, using virtualized networking (crossbow) to build the underlying network and then attaching virtualized systems (zones) atop virtualized storage (zfs).<br /><br />Some of this was present in Solaris 10, but it didn't have crossbow so the networking piece was a bit tricky (although I did manage to get surprisingly far by abusing the loopback interface).<br /><br />In <a href="http://www.tribblix.org/">Tribblix</a>, I've long had the notion of a router or proxy zone, which acts as a bridge between the outside world and a local virtual subnet. For the next release I've been expanding that into something much more flexible and capable.<br /><br />What did I need to put this together?<br /><br />The first thing is a virtual network. You use <span style="font-family: courier;">dladm</span> to create an <span style="font-family: courier;">etherstub</span>. 
Think of that as a virtual switch you can connect network links to.<br /><br />To connect that to the world, a zone is created with 2 network interfaces (<span style="font-family: courier;">vnics</span>). One over the system interface so it can connect to the outside world, and one over the <span style="font-family: courier;">etherstub</span>.<br /><br />That special router zone is a little bit more than that. It runs NAT to allow any traffic on the internal subnet - simple NAT, nothing complicated here. In order to do that the zone has to have IPFilter installed, and the zone creation script creates the right ipnat configuration file and ensures that IPFilter is started.<br /><br />You also need to have IPFilter installed in the global zone. It doesn't have to be running there, but the installation is required to create the IPFilter devices. Those IPFilter devices are then exposed to the zone, and for that to work the zone needs to use exclusive-ip networking rather than shared-ip (and would need to do so anyway for packet forwarding to work).<br /><br />One thing I learnt was that you can't lock the router zone's networking down with allowed-address. The anti-spoofing protection that allowed-address gives you prevents forwarding and breaks NAT.<br /><br />The router zone also has a couple of extra pieces of software installed. The first is <a href="http://www.haproxy.org/">haproxy</a>, which is intended as an ingress controller. That's not currently used, and could be replaced by something else. 
The second is <a href="https://dnsmasq.org/">dnsmasq</a>, which is used as a dhcp server to configure any zones that get connected to the subnet.<br /><br />With a network segment in place, and a router zone for management, you can then create extra zones.<br /><br />The way this works in Tribblix is that if you tell zap to create a zone with an IP address that is part of a private subnet, it will attach its network to the corresponding <span style="font-family: courier;">etherstub</span>. That works fine for an exclusive-ip zone, where the <span style="font-family: courier;">vnic</span> can be created directly over the <span style="font-family: courier;">etherstub</span>.<br /><br />For shared-ip zones it's a bit trickier. The <span style="font-family: courier;">etherstub</span> isn't a real network device, although for some purposes (like creating a <span style="font-family: courier;">vnic</span>) it looks like one. To allow shared-ip, I create a dedicated shared <span style="font-family: courier;">vnic</span> over the <span style="font-family: courier;">etherstub</span>, and the virtual addresses for shared-ip zones are associated with that <span style="font-family: courier;">vnic</span>. For this to work, it has to be plumbed in the global zone, but doesn't need an address there. The downside to the shared-ip setup (or it might be an upside, depending on what the zone's going to be used for) is that in this configuration it doesn't get a network route; normally this would be inherited off the parent interface, but there isn't an IP configuration associated with the <span style="font-family: courier;">vnic</span> in the global zone.<br /><br />The shared-ip zone is handed its IP address. For exclusive-ip zones, the right configuration fragment is poked into dnsmasq on the router zone, so that if the zone asks via dhcp it will get the answer you configured. Generally, though, if I can directly configure the zone I will. 
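Pulling the pieces above together, here is a hedged sketch of the underlying link plumbing. The link names (net0, stub0, rvnic0, rvnic1, shared0) are invented, and the commands are printed rather than executed so the sketch is safe to run anywhere; on an illumos box you would drop the wrapper and run dladm directly:

```shell
# Print, rather than execute, the dladm plumbing described above.
# All link names here are invented for illustration.
dladm_cmd() { echo "dladm $*"; }

# the etherstub acting as a virtual switch
dladm_cmd create-etherstub stub0
# router zone: one vnic over the real interface, one over the etherstub
dladm_cmd create-vnic -l net0 rvnic0
dladm_cmd create-vnic -l stub0 rvnic1
# a dedicated shared vnic over the etherstub for shared-ip zones
dladm_cmd create-vnic -l stub0 shared0
```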
And that's either by putting the right configuration into the files in a zone so it implements the right networking at boot, or via <span style="font-family: courier;">cloud-init</span>. (Or, in the case of a solaris10 zone, I populate <span style="font-family: courier;">sysidcfg</span>.)<br /><br />There's actually a lot of steps here, and doing it by hand would be rather (ahem, very) tedious. So it's all automated by zap, the package and system administration tool in Tribblix. The user asks for a router zone, and all it needs to be given is the zone's name, the public IP address, and the subnet address, and all the work will be done automatically. It saves all the required details so that they can be picked up later. Likewise for a regular zone, it will do all the configuration based on the IP address you specify, with no extra input required from the user.<br /><br />The whole aim here is to make building zones, and whole systems of zones, much easier and more reliable. And there's still a lot more capability to add.<br /><br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-8008898916062290422023-11-04T22:34:00.000+00:002023-11-04T22:34:31.844+00:00Keeping python modules in check<p>Any operating system distribution - and <a href="http://www.tribblix.org/">Tribblix</a> is no different - will have a bunch of packages for <a href="https://www.python.org/">python</a> modules.</p><p>And one thing about python modules is that they tend to depend on other python modules. Sometimes a lot of python modules. Not only that, the dependency will be on a specific version - or range of versions - of particular modules.</p><p>Which opens up the possibility that two different modules might require incompatible versions of a module they both depend on.</p><p>For a long time, I was a bit lax about this. 
Most of the time you can get away with it (often because module writers are excessively cautious about newer versions of their dependencies). But occasionally I got bitten by upgrading a module and breaking something that used it, or breaking it because a dependency hadn't been updated to match.</p><p>So now I always check that I've got all the dependencies listed in packaging with</p><p></p><blockquote><span style="font-family: courier;">pip3 show modulename</span></blockquote><p></p><p>and every time I update a module I check the dependencies aren't broken with</p><p></p><blockquote><span style="font-family: courier;">pip3 check</span></blockquote><p></p><p>Of course, this relies on the machine having all the (interesting) modules installed, but on my main build machine that is generally true.</p><p>If an incompatibility is picked up by <span style="font-family: courier;">pip3 check</span> then I'll either not do the update, or update any other modules to keep in sync. If an update is impossible, I'll take a note of which modules are blockers, and wait until they get an update to unjam the process.</p><p>A case in point was that <span style="font-family: courier;">urllib3</span> went to version 2.x recently. At first, nothing would allow that, so I couldn't update <span style="font-family: courier;">urllib3</span> at all. Now we're in a situation where I have one module I use that won't allow me to update <span style="font-family: courier;">urllib3</span>, and am starting to see a few modules requiring <span style="font-family: courier;">urllib3</span> to be updated, so those are held downrev for the time being.</p><p>The package dependencies I declare tend to be the explicit module dependencies (as shown by pip3 show). Occasionally I'll declare some or all of the optional dependencies in packaging, if the standard use case suggests it. And there's no obvious easy way to emulate the notion of extras in package dependencies. 
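The dependency check lends itself to a little scripting. Here is a self-contained sketch that pulls the Requires: line out of `pip3 show`-style output; canned text is used instead of a live pip3 so the module name and versions are purely illustrative:

```shell
# Extract the dependency list from `pip3 show` style output.
# The sample text below stands in for a real `pip3 show modulename`.
extract_requires() {
  awk -F': ' '/^Requires:/ { print $2 }'
}

sample='Name: requests
Version: 2.31.0
Requires: certifi, charset-normalizer, idna, urllib3'

deps=$(printf '%s\n' "$sample" | extract_requires)
echo "$deps"
```

On a real system you would pipe `pip3 show modulename` straight into the function instead of the heredoc-style sample.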
But that can be handled in package overlays, which is the safest way in any case.</p><p>Something else the checking can pick up is when a dependency is removed, which is something that can be easily missed.</p><p>Doing all the checking adds a little extra work up front, but should help remove one class of package breakage.<br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-13284798112466360822023-10-27T11:49:00.000+01:002023-10-27T11:49:14.089+01:00It seemed like a simple problem to fix<p>While a bit under the weather last week, I decided to try and fix what at first glance appears to be a simple problem:</p><p></p><blockquote>need to ship the manpage with exa</blockquote><p></p><p>Now, <a href="https://the.exa.website/">exa</a> is a modern file lister, and the package on Tribblix doesn't ship a man page. The reason for that, it turns out, is that there isn't a man page in the source, but you can generate one.</p><p>To build the man page requires pandoc. OK, so how to get pandoc, which wasn't available on Tribblix? It's written in <a href="https://www.haskell.org/">Haskell</a>, and I did have a Haskell package.</p><p>Only my version of Haskell was a bit old, and wouldn't build pandoc. The build complains that it's too old and unsupported. You can't even build an old version of pandoc, which is a little peculiar.</p><p>Off to upgrade Haskell then. You need Haskell to build Haskell, and it has some specific requirements about precisely which versions of Haskell work. I wanted to get to 9.4, which is the last version of Haskell that builds using make (and I'll leave Hadrian for another day). 
You can't build Haskell 9.4 with 9.2, which it claims is too new; you have to go back to 9.0.</p><p>Fortunately we do have some <a href="https://us-central.manta.mnx.io/pkgsrc/public/pkg-bootstraps/index.html">bootstrap kit</a>s for illumos available, so I pulled 9.0 from there, successfully built Haskell, then cabal, and finally pandoc.</p><p>Back to exa. At which point you notice that it's been deprecated and replaced by <a href="https://eza.rocks/">eza</a>. (This is a snag with modern point tools. They can disappear on a whim.)</p><p>So let's build eza. At which point I find that the MSRV (Minimum Supported Rust Version) has been bumped to 1.70, and I only had 1.69. Another update required. Rust is actually quite simple to package: you can just <a href="https://forge.rust-lang.org/infra/other-installation-methods.html">download the stable version</a> and package it.</p><p>After all this, exa still doesn't have a man page, because it's deprecated (if you run <span style="font-family: courier;">man exa</span> you get something completely different from X.Org). But I did manage to upgrade Haskell and Cabal, I managed to package pandoc, I updated rust, and I added a replacement utility - eza - which does now come with a man page.<br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-85033391201473676932023-10-09T20:34:00.001+01:002023-10-09T20:34:37.409+01:00When zfs was young<p>On the Solaris 10 Platinum Beta program, one of the most exciting promised features was ZFS, the new file system.<br /><br />I was especially interested, given that I was in a data-heavy position at the time. The limits of UFS were painful, we had datasets into several terabytes already - and even the multiterabyte file system support that got added was actually pretty useless because the inode density was so low. 
We tried QFS and SAM-QFS, and they were pretty appalling too.<br /><br />ZFS was promised, and didn't arrive. In fact, there were about 4 of us on the beta program who saw the original zfs implementation, and it was quite different from what we have now. What eventually landed as zfs in Solaris was a complete rewrite. The beta itself was interesting - we were sent the driver, 3 binaries, and a 3-line cheatsheet, and that was it. There was a fundamental philosophy here that the whole thing was supposed to be so easy to use and sufficiently obvious that it didn't need a manual, and that was actually true. (It's gotten rather more complex since, to be fair.)<br /><br />The original version was a bit different in terms of implementation than what you're used to, but not that much. The most obvious change was that originally there wasn't a top-level file system for a pool. You created a pool, and then created your file systems. I'm still not sure which is the correct choice. And there was a separate zacl program to handle the ACLs, which were rather different.<br /><br />In fact, ACLs have been a nightmare of bad implementations throughout their history on Solaris. I already had previous here, having got the POSIX draft ACL implementation reworked for UFS. The original zfs implementation had default aka inheritable ACLs applied to existing objects in a directory. (If you don't immediately realise how bad that is, think of what this allows you to do with hard links to files.) The ACL implementations have continued to be problematic - consider that zfs allows 5 settings for the aclinherit property as evidence that we're glittering a turd at this point.<br /><br />Eventually we did get zfs shipped in a Solaris 10 update, and it's been continually developed since then. 
The openzfs project has given the file system an independent existence, it's now in FreeBSD, you can run it (and it runs well) on Linux, and in other OS variations too.<br /><br />One of the original claims was that zfs was infinitely scalable. I remember it being suggested that you could create a separate zfs file system for each user. I had to try this, so got together a test system (an Ultra 2 with an A1000 disk array) and started creating file systems. Sure, it got into several thousand without any difficulty, but that's not infinite - think universities or research labs and you can easily have 10,000 or 100,000 users, we had well over 20,000. And it fell apart at that scale. That's before each is an NFS share, too. So that idea didn't fly.<br /><br />Overall, though, zfs was a step change. The fact that you had a file system that was flexible and easily managed was totally new. The fact that a file system actually returned correct data rather than randomly hoping for the best was years ahead of anything else. Having snapshots that allowed users to recover from accidentally deleted files without waiting days for a backup to be restored dramatically improved productivity. It's win after win, and I can't imagine using anything else for storing data.<br /><br />Is zfs perfect? Of course not, and to my mind one of the most shocking things is that nothing else has even bothered to try and come close.<br /><br />There are a couple of weaknesses with zfs (or related to zfs, if I put it more accurately). One is that it's still a single-node file system. While we have distributed storage, we still haven't really matured that into a distributed file system. 
The second is that while zfs has dragged storage into the 21st century, allowing much more sophisticated and scalable management of data, there hasn't been a corresponding improvement in backup, which is still stuck firmly in the 1980s.<br /><br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com2tag:blogger.com,1999:blog-9726833.post-79207881282926224382023-10-04T19:52:00.000+01:002023-10-04T19:52:38.094+01:00SMF - part of the Solaris 10 legacy<p>The Service Management Facility, or SMF, integrated extremely late in the Solaris 10 release cycle. We only got one or two beta builds to test, which seemed highly risky for such a key feature.<br /><br />So there was very little time to gather feedback from users. And something that central really can't be modified once it's released. It had to work first time.<br /><br />That said, we did manage some improvements. The current implementation of `svcs -x` is largely due to me struggling to work out why a service was broken.<br /><br />One of the obvious things about SMF is that it relies on manifests written in XML. Yes, that's of its time - there's a lot of software you can date by the file format it uses.<br /><br />I don't have a particular problem with the use of XML here, to be honest. What's more of a real problem is that the manifest files were presented as a user interface rather than an internal implementation detail, so that users were forced to write XML from scratch with little to no guidance.<br /><br />There are a lot of good features around SMF.<br /><br />Just the very basic restart of an application that dies is something that's so blindingly obvious as a requirement in an operating system. So much so that once it existed I refused to support anything that didn't have SMF when I was on call - after all, most of the 3am phone calls were to simply restart a crashed application. 
And yes, when we upgraded our systems to Solaris 10 with SMF our availability went way up and the on-call load plummeted. <br /><br />Being able to grant privileges to a service, and just within the context of that service, without having to give privileges to an application (eg set*id) or a user, makes things so much safer. Although in practice it's letting applications bind to privileged ports while running as a regular user, as that's far and away the most common use case.<br /><br />Dependencies have been a bit of a mixed bag. Partly because working out what the dependencies should be in the first place is just hard to get right, but also because dependency declaration is bidirectional - you can inject a dependency on yourself into another service, and that other service may not respond well, or you can create a circular dependency if the two services are developed independently.<br /><br />One part of dependency management in services is deciding whether a given service should start or not given the state of other services (such as its dependencies). Ideally, you want strict dependency management. In the real world, systems are messy and complicated, the dependency tree isn't terribly well understood, and some failure modes don't matter. And in many cases you want the system to try and boot as far as possible so you can get in and fix it.<br /></p><p>A related problem is that we've ended up with a complex mesh of services because someone had to take the old mess of rc scripts and translate them into something that would work on day 1. And nobody - either at the time or since - has gone through the services and studied whether the granularity is correct. 
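The privilege grant described above can be sketched as follows, with the commands printed rather than executed. The service name network/myapp is invented for the example; on a live system you would drop the wrappers and use svccfg/svcadm directly to set the start method's user and privileges:

```shell
# Printed, not executed: let a service bind a privileged port as a
# non-root user by granting it net_privaddr in its start method
# context. The service name is invented for illustration.
svccfg_cmd() { echo "svccfg $*"; }
svcadm_cmd() { echo "svcadm $*"; }

svccfg_cmd -s network/myapp setprop start/user = astring: webservd
svccfg_cmd -s network/myapp setprop start/privileges = astring: basic,net_privaddr
svcadm_cmd refresh network/myapp
svcadm_cmd restart network/myapp
```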
One other thing - that again has never happened - would be, once we got a good handle on what services there are, to look at whether the services we have are sensible, or whether there's an opportunity to rearchitect the system to do things better. And because all these services are now baked into SMF, it's actually quite difficult to do any major reworking of the system.<br /></p><p>Not only that, but because people write SMF manifests, they simply copy something that looks similar to the problem at hand, so bad practices and inappropriate dependency declarations multiply.<br /><br />This is one example of what I see as the big problem with SMF - we haven't got supporting tools that present the administrator with useful abstractions, so that everything is raw.<br /><br />In terms of configuration management, SMF is very much a mixed bag. Yes, it guarantees a consistent and reproducible state of the system. The snag is that there isn't really an automated way to capture the essential state of a system and generate something that will reproduce it (either later or elsewhere) - it can be done, but it's essentially manual. (Backing up the state is a subset of this problem.)<br /><br />It's clear that there were plans to extend the scope of SMF. Essentially, to be the Solaris version of the Windows registry. Thankfully (see also systemd for where this goes wrong) that hasn't happened much.<br /><br />In fact, SMF hasn't really evolved in any material sense since the day it was introduced. It's very much stuck in time.<br /><br />There were other features that were left open. For example, there's the notion of the scope of SMF, and the only one available right now is the "localhost" scope - see the smf(7) manual in illumos - so in theory there could be other, non-localhost, scopes. 
And there was the notion of monitor methods, which never appeared but I can imagine solving a range of niggling application issues I've seen over the years.<br /><br /><br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-6437020662409778412023-09-11T15:56:00.005+01:002023-09-11T15:56:43.517+01:00Retiring isaexec in Tribblix<p>One of the slightly unusual features in illumos, and Solaris because that's where it came from, is <span style="font-family: courier;">isaexec</span>.</p><p>This facility allows you to have multiple implementations of a binary, and then <span style="font-family: courier;">isaexec</span> will select the best one (for some definition of best).</p><p>The full implementation allows you to select from a wide range of architectures. On my machine it'll allow the following list:</p><p><span style="font-family: courier;"></span></p><blockquote><p><span style="font-family: courier;"></span></p><blockquote><p><span style="font-family: courier;">amd64 pentium_pro+mmx pentium_pro<br />pentium+mmx pentium i486 i386 i86</span></p></blockquote></blockquote><p>If you wanted, you could ship a highly tuned pentium_pro binary, and eke out a bit more performance.<br /></p><p>The common case, though, and it's actually the only way <span style="font-family: courier;">isaexec</span> is used in illumos, is to simply choose between a 32-bit and 64-bit binary. This goes back to when Solaris and illumos supported 32-bit and 64-bit hardware in the same system (and you could actually choose whether to boot 32-bit or 64-bit under certain circumstances). In this case, if you're running a 32-bit kernel you get a 32-bit application; if you're running 64-bit then you can get the 64-bit version of that application.</p><p>Not all applications got this treatment. Anything that needed to interface directly with the kernel did (eg the <span style="font-family: courier;">ps</span> utility). 
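The selection isaexec performs can be sketched in a few lines of shell. The ISA list is hardcoded here so the sketch runs anywhere; on illumos the list would come from isalist(1), and the real isaexec is a small C wrapper rather than a script:

```shell
# Sketch of isaexec's search: try each ISA subdirectory in order and
# fall back to the plain path. ISALIST is hardcoded for the sketch;
# a real implementation would read the system's isalist.
ISALIST="amd64 i386"

pick_binary() {
  base="$1"
  for isa in $ISALIST; do
    if [ -x "/usr/bin/${isa}/${base}" ]; then
      echo "/usr/bin/${isa}/${base}"
      return 0
    fi
  done
  # nothing ISA-specific found: fall back to the plain path
  echo "/usr/bin/${base}"
}

pick_binary ls
```

With only 64-bit kernels left, the loop always resolves to the amd64 entry, which is why the redirection can be retired and the 64-bit binary placed directly in the PATH.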
And for others it was largely about performance or scalability. But most userland applications were 32-bit, and still are in illumos. (Solaris has migrated most to 64-bit now, we ought to do the same.)</p><p>It's been 5 years or more since illumos removed the 32-bit kernel, so the only option is to run in 64-bit mode. So now, <span style="font-family: courier;">isaexec</span> will only ever select the 64-bit binary.<br /></p><p>A while ago, Tribblix simply removed the remaining 32-bit binaries that <span style="font-family: courier;">isaexec</span> would have executed on a 32-bit system. This saved a bit of space.</p><p>The upcoming m32 release goes further. In almost all cases <span style="font-family: courier;">isaexec</span> is no longer involved, and the 64-bit binary sits directly in the PATH (eg, in <span style="font-family: courier;">/usr/bin</span>). There's none of the wasted redirection. I have put symbolic links in, just in case somebody explicitly referenced the 64-bit path.</p><p>This is all done by manipulating packaging - Tribblix runs the IPS package repo through a transformation step to produce the SVR4 packages that the distro uses, and this is just another filter in that process.</p><p>(There are a handful of exceptions where I still have 32-bit and 64-bit. Debuggers, for example, might need to match the bitness of the application being debugged. 
And the way that <span style="font-family: courier;">sh</span>/<span style="font-family: courier;">ksh</span>/<span style="font-family: courier;">ksh93</span> is installed needs a slightly less trivial transformation to get it right.)<br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-84591639360672549612023-09-04T18:56:00.001+01:002023-09-04T18:56:43.802+01:00Modernizing scripts in Tribblix<p>It's something I've been putting off for far too long, but it's about time to modernize all the shell scripts that <a href="http://www.tribblix.org/">Tribblix</a> is built on.</p><p>Part of the reason it's taken this long is the simple notion of, if it ain't broke, don't fix it.</p><p>But some of the scripting was starting to look a bit ... old. Antiquated. Prehistoric, even.</p><p>And there's a reason for that. Much of the scripting involved in Tribblix is directly derived from the system administration scripts I've been using since the mid-1990s. That involved managing Solaris systems with SVR4 packages, and when I built a distribution derived from OpenSolaris, using SVR4 packages, I just lifted many of my old scripts verbatim. And even new functionality was copied or slightly modified.<br /></p><p>Coming from Solaris 2.3 through 10, this meant that they were very strictly Bourne Shell. A lot of the capabilities you might expect in a modern shell simply didn't exist. And much of the work was to be done in the context of installation (i.e. Jumpstart) where the environment was a little sparse.</p><p>The most obvious code smell is extensive use of backticks rather than $(). 
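The modernization is largely mechanical; for instance:

```shell
# The same command substitution in both styles; $() nests cleanly
# where backticks need awkward escaping.
old=`echo hello | tr a-z A-Z`
new=$(echo hello | tr a-z A-Z)
# nesting, which is painful with backticks:
nested=$(basename "$(echo /tmp/somefile)")
echo "$old $new $nested"
```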
Some of this I've refactored over time, but looking at the code now, not all that much.</p><p>One push for this was adding <a href="https://www.shellcheck.net/">ShellCheck</a> to Tribblix (it was a little bit of a game getting Haskell and Cabal to play nice, but I digress).</p><p>Running ShellCheck across all my scripts gave it a lot to complain about. Some of the complaints are justified, although many aren't (it's very enthusiastic about quoting everything in sight, even when that would be completely wrong).</p><p>But generally it's encouraged me to clean the scripts up. It's even managed to find a bug, and inspecting the code it flagged as simply rubbish has turned up a few more.</p><p>The other push here is to speed things up. Tribblix is often fairly quick in comparison to other systems, but it's not quick enough for me. But more of that story later.<br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-43469480112818973032023-08-24T11:07:00.000+01:002023-08-24T11:07:21.583+01:00Speed up zone installation with this one weird trick<p>Sadly, the trick described below won't work in current releases of Solaris, or any of the <a href="https://illumos.org/">illumos</a> distributions. But back in the day, it was pretty helpful.</p><p>In Solaris 10, we had sparse root zones - which shared <span style="font-family: courier;">/usr</span> with the global zone, which not only saved space because you didn't need a copy of all the files, but creating them was much quicker because you didn't need to take the time to copy all the files.</p><p>Zone installation for sparse root zones was typically about 3 minutes for us - this was 15 years ago, so mostly spinning rust and machines a bit slower than we're used to today.</p><p>That 3 minutes sounds quick, but I'm an impatient soul, and so were my users. Could I do better?</p><p>Actually, yes, quite a bit. What's contributing to that 3 minutes? 
There's a bit of adding files (the <span style="font-family: courier;">/etc</span> and <span style="font-family: courier;">/var</span> filesystems are not shared, for reasons that should be fairly obvious). And you need to copy the packaging metadata. But that's just a few files.</p><p>Most of the time was taken up by building the contents file, which simply lists all the installed files and what package they're in. It loops over all the packages, merging all the files in that package into the contents file, which thus grows every time you process a package.</p><p>The trick was to persuade it to process the packages in an optimal order. You want to do all the little packages first, so that the contents file stays small as long as possible.</p><p>And the way to do that was to recreate the <span style="font-family: courier;">/var/sadm/pkg</span> directory. It was obvious that it was simply reading the directory and processing packages in the order that it found them. And, on ufs, this is the order that the packages were added to the directory. So what I did was move the packages to one side, create an empty <span style="font-family: courier;">/var/sadm/pkg</span>, and move the package directories back in size order (which you can get fairly easily by looking at the size of the spooled pkgmap files).</p><p>This doesn't quite mean that the packages get processed in size order, as it does the install in dependency order, but as long as dependencies are specified it otherwise does them in size order.</p><p>The results were quite dramatic - with no other changes, this took zone install times from the original 3 minutes to 1 minute. 
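The size-ordering idea can be sketched self-containedly with mock packages (the names and sizes are invented); the listing produced is the ascending-size order in which the package directories would be moved back into a recreated /var/sadm/pkg:

```shell
# Build three mock package directories with pkgmap files of different
# sizes, then list them smallest-first - the order the trick relies on.
tmp=$(mktemp -d)
mkdir -p "$tmp/TRIVbig" "$tmp/TRIVmid" "$tmp/TRIVsmall"
head -c 300 /dev/zero > "$tmp/TRIVbig/pkgmap"
head -c 200 /dev/zero > "$tmp/TRIVmid/pkgmap"
head -c 100 /dev/zero > "$tmp/TRIVsmall/pkgmap"

# pair each package with its pkgmap size, sort numerically, keep names
order=$(for f in "$tmp"/*/pkgmap; do
  printf '%s %s\n' "$(wc -c < "$f")" "$(basename "$(dirname "$f")")"
done | sort -n | awk '{print $2}')

echo $order
rm -rf "$tmp"
```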
Much happier administrators and users.</p><p>This trick doesn't work at all on zfs, sadly, because zfs doesn't simply create a linear list of directory entries and put new ones on the end.</p><p>And all this is irrelevant for anything using IPS packaging, which doesn't do sparse-root zones anyway, and is a completely different implementation.</p><p>And even in <a href="http://www.tribblix.org/">Tribblix</a>, which does have sparse-root zones like Solaris 10 did, and uses SVR4 packaging, the implementation is orders of magnitude quicker because I just create the contents file in a single pass, so a sparse zone in Tribblix can install in a second or so.<br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-16752445269910215512023-08-23T16:36:00.001+01:002023-08-23T16:36:02.881+01:00Remnants of closed code in illumos<p>One of the annoying issues with <a href="https://illumos.org/">illumos</a> has been the presence of a body of closed binaries - things that, for some reason or other, were never able to be open sourced as part of OpenSolaris.</p><p>Generally the illumos project has had some success in replacing the closed pieces, but what's left isn't entirely zero. It took me a little while to work out what's still left, but as of today the list is:</p><blockquote><blockquote><p>etc/security/tsol/label_encodings.gfi.single<br />etc/security/tsol/label_encodings.example<br />etc/security/tsol/label_encodings.gfi.multi<br />etc/security/tsol/label_encodings<br />etc/security/tsol/label_encodings.multi<br />etc/security/tsol/label_encodings.single<br />usr/sbin/chk_encodings<br />usr/xpg4/bin/more<br />usr/lib/raidcfg/mpt.so.1<br />usr/lib/raidcfg/amd64/mpt.so.1<br />usr/lib/iconv/646da.8859.t<br />usr/lib/iconv/8859.646it.t<br />usr/lib/iconv/8859.646es.t<br />usr/lib/iconv/8859.646fr.t<br />usr/lib/iconv/646en.8859.t<br />usr/lib/iconv/646de.8859.t<br />usr/lib/iconv/646it.8859.t<br
/>usr/lib/iconv/8859.646en.t<br />usr/lib/iconv/8859.646de.t<br />usr/lib/iconv/iconv_data<br />usr/lib/iconv/646fr.8859.t<br />usr/lib/iconv/8859.646da.t<br />usr/lib/iconv/646sv.8859.t<br />usr/lib/iconv/8859.646.t<br />usr/lib/iconv/646es.8859.t<br />usr/lib/iconv/8859.646sv.t<br />usr/lib/fwflash/verify/ses-SUN.so<br />usr/lib/fwflash/verify/sgen-SUN.so<br />usr/lib/fwflash/verify/sgen-LSILOGIC.so<br />usr/lib/fwflash/verify/ses-LSILOGIC.so<br />usr/lib/labeld<br />usr/lib/locale/POSIX<br />usr/lib/inet/certlocal<br />usr/lib/inet/certrldb<br />usr/lib/inet/amd64/in.iked<br />usr/lib/inet/certdb<br />usr/lib/mdb/kvm/amd64/mpt.so<br />usr/lib/libike.so.1<br />usr/lib/amd64/libike.so.1<br />usr/bin/pax<br />platform/i86pc/kernel/cpu/amd64/cpu_ms.GenuineIntel.6.46<br />platform/i86pc/kernel/cpu/amd64/cpu_ms.GenuineIntel.6.47<br />lib/svc/manifest/network/ipsec/ike.xml<br />kernel/kmdb/amd64/mpt<br />kernel/misc/scsi_vhci/amd64/scsi_vhci_f_asym_lsi<br />kernel/misc/scsi_vhci/amd64/scsi_vhci_f_asym_emc<br />kernel/misc/scsi_vhci/amd64/scsi_vhci_f_sym_emc<br />kernel/strmod/amd64/sdpib<br />kernel/drv/amd64/adpu320<br />kernel/drv/amd64/atiatom<br />kernel/drv/amd64/usbser_edge<br />kernel/drv/amd64/sdpib<br />kernel/drv/amd64/bcm_sata<br />kernel/drv/amd64/glm<br />kernel/drv/amd64/intel_nhmex<br />kernel/drv/amd64/lsimega<br />kernel/drv/amd64/marvell88sx<br />kernel/drv/amd64/ixgb<br />kernel/drv/amd64/acpi_toshiba<br />kernel/drv/amd64/mpt<br />kernel/drv/adpu320.conf<br />kernel/drv/usbser_edge.conf<br />kernel/drv/mpt.conf<br />kernel/drv/intel_nhmex.conf<br />kernel/drv/sdpib.conf<br />kernel/drv/lsimega.conf<br />kernel/drv/glm.conf<br /><br /></p></blockquote></blockquote><p>Actually, this isn't much. In terms of categories:</p><p>Trusted, which includes those label_encodings, and labeld. Seriously, nobody can realistically run trusted on illumos (I have, it's ... interesting). 
So these don't really matter.</p><p>The iconv files actually go with the closed iconv binary, which we replaced ages ago, and our copy doesn't and can't use those files. We should simply drop those (they will be removed in Tribblix next time around).</p><p>There's a set of files connected to IKE and IPSec. We should replace those, although I suspect that modern alternatives for remote access will start to obsolete all this over time.</p><p>The scsi_vhci files are to get multipathing correctly set up on some legacy SAN systems. If you have to use such a SAN, then you need them. If not, then you're in the clear.</p><p>There are a number of drivers. These are mostly somewhat aged. The sdp stuff is being removed anyway as part of <a href="https://github.com/illumos/ipd/blob/master/ipd/0029/README.md">IPD29</a>, so that'll soon be gone. Chances are that very few people will need most of these drivers, although mpt was fairly widely used (there was an open mpt replacement <a href="https://www.illumos.org/issues/3">in the works</a>). Eventually the need for the drivers will dwindle to zero as systems with them in no longer exist (and, by the same token, we wouldn't need them for something like an aarch64 port).</p><p>Which just leaves 2 commands.</p><p>Realistically, the XPG4 more could be replaced by less. The standard was based on the behaviour of less, after all. I'm tempted to simply delete /usr/xpg4/bin/more and make it a link to less and have done with it.</p><p>As for pax, it's required by POSIX, but to be honest I've never used it, haven't seen anywhere that uses it, and read support is already present in things like libarchive and gtar. 
The <a href="https://sourceforge.net/projects/heirloom/">heirloom</a> pax is probably more than good enough.</p><p>In summary, illumos isn't quite fully open source, but it's pretty close and for almost all cases we could put together a fully functional open subset that'll work just fine.</p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com1tag:blogger.com,1999:blog-9726833.post-12425352189555495312023-08-09T12:46:00.001+01:002023-08-09T12:46:42.437+01:00Static Site Generators<p>The current <a href="http://www.tribblix.org/">Tribblix</a> website is a bit of a hack. Technically it's using a static site generator - a simple home-grown script that constructs pages from a bit of content and boilerplate - but I wanted to be able to go a bit further.</p><p>I looked at a few options - and there are really a huge number of them - such as <a href="https://gohugo.io/">Hugo</a> and <a href="https://www.getzola.org/">Zola</a>. (Both are packaged for Tribblix now, by the way.)</p><p>In the end I settled on <a href="https://nanoc.app/">nanoc</a>. That's packaged too (and I finally got around to having a very simple - rather naive - way of packaging gems).</p><p>Why nanoc, though? 
In this case it was really because it could take the html page fragments I already had and create the site from those, and after tweaking it slightly I end up with exactly the same html output as before.</p><p>Other options might be better if I was starting from scratch, but it would have been much harder to retain the fidelity of the existing site.</p><p>One advantage of the new system is that I can put the site under proper source control, so the repo is <a href="https://github.com/tribblix/tribblix-website">here</a>.<br /></p><p>There's still a lot of work to be done on filling out the content, but it should be easier to evolve the Tribblix website in future.<br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-8728710205870635752023-07-13T14:34:00.000+01:002023-07-13T14:34:30.754+01:00Zones, way back when<p>The original big ticket feature in Solaris 10 was Zones, a simple virtualization technology that allowed a set of processes to be put aside in a separate namespace and be under the illusion that this was a separate computer system, all under a single shared kernel.<br /><br />As a result of this sleight of hand, you could connect to a zone using ssh (or, remember this was way back, telnet or rsh), and from the application level you really were in a separate system - with your own file system and network namespaces. It was like magic.<br /><br />Of the features in Solaris 10, Zones and DTrace were present early in the beta cycle, while SMF just made it into the last couple of beta builds, and ZFS wasn't actually available to customers until well after the first Solaris 10 release.<br /><br />I ended up using zones in production quite accidentally. In the Solaris 10 Platinum Beta, we were testing the new features, just giving them a good beating, when one of our webservers (it was something like a Netra X1) died. Sure, we could have got it repaired, or reconfigured another server. 
But as an experiment, I simply fired up a zone on one of my beta systems, gave it the IP address of the failed server, installed apache, copied over the website, and we were back in service in about 5 minutes.<br /><br />The Zones framework turns out to be incredibly flexible and powerful. I suspect most don't realize just what it's actually capable of, as Sun only gave you a canned product in two variations - whole-root and sparse-root zones. Later you saw glimpses of the power available with the first incarnation of LX zones (or SCLA - Solaris Containers for Linux Applications) and then the Solaris 8 and Solaris 9 containers, which allowed a different set of applications to run inside a zone.<br /><br />Things actually became more limited in OpenSolaris and its derivatives such as Solaris 11; not only was LX removed, but so were sparse-root zones, and the diversity of potential zone types dwindled.<br /><br />In illumos, some of the distributions have pushed Zones a bit further. Tribblix brought back sparse root zones, and introduced the alien brand - essentially a way to run any illumos OS or application in a zone. OmniOS has brought back LX, and it's reasonably current (in terms of keeping up with changes in the Linux world). SmartOS ran KVM in Zones, allowing double-hulled virtualization. And we now have bhyve as a fully supported offering for any illumos distribution, usually embedded in a Zone.<br /><br />Using a sparse-root zone is incredibly efficient. By sharing the main operating system files (mostly /lib and /usr, but can be others) you can save huge amounts of disk space - you only have to have one copy so that's a saving of anything from a couple of hundred megabytes to a couple of gigabytes of storage per zone. It gets better, because the read-only segments of any binaries and shared libraries are shared between zones, which dramatically reduces the additional memory footprint of each zone. 
Further on from that, because Solaris has this trick whereby any shared object used more than 8 times (or something like that) is kept resident in memory, all the common applications are always in memory and start incredibly quickly.<br /><br />One of the things I did was use sparse-root zones and shared filesystems for a development -> test -> production setup. Basically, you create 3 zones, sparse-root ensures they're identical, and 3 filesystems - one each for development, test, and production. You share the development filesystem read-only into the test zone, so deployment from development to test is a straight copy. Likewise test to production.<br /><br />One of the weaknesses of the way that zones were managed (distinct from the underlying technology framework) is that it was based around packaging. In Solaris 10, packaging and packages knew about zones, and the details about what files and packages ended up in a zone were embedded in the package metadata. Not only is this complex, it's also very rigid - you can't evolve the system without changing the packaging system and modifying all the packages. Sadly, IPS carried forward the same mistake. (In Tribblix, packaging knows nothing about zones whatsoever, but my zones understand packaging and can do the right thing with it - not only with much more flexibility but many times quicker.)<br /><br />Later on in the Solaris 10 timeframe we got ZFS, which allowed you to do interesting things around sharing data and quickly creating copies of data for zones, allowing you to extend the virtual capabilities of zones from cpu and memory to storage. 
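As a rough illustration of that development -> test -> production setup (the zone and dataset names here are invented, and this is a sketch of generic zonecfg/zfs usage rather than the exact original configuration):

```shell
# Illustrative only: create the three filesystems, then loopback-mount
# the development area read-only inside the (sparse-root) test zone,
# so deployment to test is a straight copy on the global zone side.
zfs create rpool/export/dev
zfs create rpool/export/test
zfs create rpool/export/prod

zonecfg -z testzone <<EOF
add fs
set dir=/export/dev
set special=/export/dev
set type=lofs
add options ro
end
commit
EOF
```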
And the key missing piece, virtualized networking, never made it to Solaris 10 at all, but had to wait for crossbow to arrive in OpenSolaris.<br /><br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com1tag:blogger.com,1999:blog-9726833.post-77568163772731948362023-05-08T15:15:00.002+01:002023-05-08T15:15:54.685+01:00Maintaining old software with no sign of retirement<p>There's a lot of really old software out there. Some of it has simply been abandoned; others have been replaced by new versions. But old software never really goes away, and we end up maintaining it.<br /><br />This is especially tricky when old software depends on other old software, and we have to support the entire dependency tree.<br /><br />There's always python 2 and python 3. Some old software may never be fixed; some current software has consciously decided to stick to python 2. Distributions will be shipping python 2 for a long time yet.<br /><br />Then there's PCRE and PCRE2. Some things have been updated; others haven't. Generally for this I'll keep updating, and eventually upstream might get around to migrating. But again I'll have to ship both for a while.<br /><br />And then there's gtk2 and gtk3. (I find it ironic that the gimp itself is still using gtk2.) There's no end in sight of the need to ship both.<br /><br />Some libraries have been deprecated entirely. The old libXp (the X printing library) is long gone. There were a couple of things built against it in Tribblix. I've just rebuilt chimera (a really old Xaw web browser, if your memory doesn't go that far back) which was one consumer and now isn't; the other one was Motif (there's a convenient build flag --disable-printing to disable libXp support, which entertainingly breaks the build someplace else, which I ended up having to fix).<br /><br />Another example: libpng has gone through several different revisions. 
Each slightly incompatible, and you have to be sure to run with the same version you built against. At least you can ship all the different versions, as they have the version in the names. Mind you, linking against 2 different versions of libpng at the same time (for example, if a dependency pulls in a different version of libpng) is a bad thing, so I did have to rebuild a number of applications to avoid that. I ship the old libpng versions in a separate compat package; I think chimera was the only consumer, but I updated that to use a more current libpng.<br /><br />A slightly different problem is the use of newer toolchains. Compilers are getting stricter over time, so old unmaintained software needs patches to even compile.<br /><br />Don't even get me started on openssl.<br /><br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-89485290718120350202023-05-07T10:01:00.000+01:002023-05-07T10:01:07.163+01:00Upgrading MATE on Tribblix<p>I spent a little time yesterday updating <a href="https://mate-desktop.org/">MATE</a> on <a href="http://www.tribblix.org/">Tribblix</a>, to version 1.26.<br /><br />This was supposed to be part of the "next" release, but we had to make an out of sequence release for an illumos security issue, so everything gets pushed back a bit.<br /><br />Updating MATE is actually fairly easy, though. The components in MATE are largely decoupled, so can be updated independently of each other. (And there isn't really a MATE framework everything has to subscribe to, so the applications can be used outside MATE without any issues.)<br /><br />There's a bit of tidying up and polish that helps. For example, I delete static archives and the harmful libtool archive files. Not only does this save space, it helps maintainability down the line.<br /><br />Builds have a habit of picking up dependencies from the build system. 
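One way to catch the duplicate-libpng situation described above is to audit what a built binary actually links against. A hypothetical helper (not part of Tribblix), using ldd:

```shell
# Hypothetical helper: report a binary whose dependency tree pulls in
# more than one libpng version at the same time.
multi_libpng() {
    n=$(ldd "$1" 2>/dev/null | awk '/libpng/ {print $1}' | sort -u | wc -l)
    if [ "$((n + 0))" -gt 1 ]
    then
        echo "$1: linked against $n libpng versions"
    fi
    return 0
}
```

Running something like that over a freshly built tree is also a quick way to see exactly which dependencies a build has picked up.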
Sometimes you can control this with judicious --enable-foo or --disable-foo flags; sometimes you just have to make sure that the package you don't want pulled in isn't installed. The reverse is true - if you want a feature to be enabled, you have to make sure the dependencies are installed first and the feature will usually get enabled automatically.<br /><br />That's not always true. For example, you have to explicitly tell it you have OSS for audio; it doesn't work this out on its own.<br /><br />I took the opportunity to make everything 64-bit. Ultimately I want to get to 64-bit only. This involves a bit of working backwards - you have to make all consumers of a library 64-bit only first.<br /><br />A couple of components are held downrev. The calculator now wants to pull in mpc and mpfr, which I don't package. (They're used by gcc, but I drop a copy of mpc and mpfr into the build for gcc to find rather than packaging them separately the way that most of the other illumos distributions do.) And pluma wants gtksourceview-4, which I don't have yet. This is related to the lack of tight coupling I mentioned earlier - there really isn't any problem having the different pieces that make up MATE at different revisions.<br /><br />You stumble across bugs along the way. For example, mate-control-center actually needs GLib 2.66 or later, which I don't have yet (there's another whole set of issues behind that), but it doesn't actually check for the right version. 
Fortunately the requirement is fairly localized and easy to patch out.<br /><br />That done, on to another set of updates...<br /><br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-26979060532461405492023-03-22T19:22:00.000+00:002023-03-22T19:22:14.949+00:00SPARC Tribblix m26 - what's in a number?<p>I've just released <a href="http://www.tribblix.org/download.html#sparc">Tribblix m26</a> for SPARC.</p><p>The release history on SPARC looks a little odd - m20, m20.6, m22, m25.1, and now m26. Do these release versions mean anything?</p><p>Up to and including m25.1, the illumos commit that the SPARC version was built from matched the corresponding x86 release. This is one reason there might be a gap in the release train - that commit might not build or work on SPARC.</p><p>As of m26, the version numbers start to diverge between SPARC and x86. In terms of illumos-gate, this release is closer to m25.2, but the added packages are generally fairly current, closer to m29. So it's a bit of a hybrid.</p><p>But the real reason this is a full release rather than an m25 update is to establish a new baseline, which allows me to establish compatibility guarantees and roll over versions of key components, in this case it allows me to upgrade perl.</p><p>In the future, the x86 and SPARC releases are likely to diverge further. Clearly SPARC can't track the x86 releases perfectly, as SPARC support is being removed from the mainline source following <a href="https://github.com/illumos/ipd/blob/master/ipd/0019/README.md">IPD 19</a>, and many of the recent changes in illumos simply aren't relevant to SPARC anyway. 
So future SPARC releases are likely to simply increment independently.<br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-57623975951768269962023-03-12T16:27:00.000+00:002023-03-12T16:27:11.582+00:00How I build the Tribblix AMIs<p>I run <a href="http://www.tribblix.org/">Tribblix</a> on AWS, and <a href="http://www.tribblix.org/aws.html">make some AMIs available</a>. They're only available in London (eu-west-2) by default, because that's the only place where I use them, and it costs money to have them available in other regions. If you want to run them elsewhere, you can copy the AMI.</p><p>It's not actually that difficult to create the AMIs, once you've got the hang of it. Certainly some of the instructions you might find can seem a little daunting. So here's how I do it. Some of the details here are very specific to my own workflow, but the overall principles are fairly generic. The same method would work for any of the illumos distributions, and you could customize the install however you wish.<br /></p><p>The procedure below assumes you're running Tribblix m29 and have bhyve installed.</p><p>The general process is to boot and install an instance into bhyve, then boot that and clean it up, save that disk as an image, upload to S3, and register an AMI from that image.<br /></p><p>You need to use the minimal ISO (I actually use a custom, even more minimal ISO, but that's just a convenience for myself). 
Just launch that as root:</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">zap create-zone -t bhyve -z bhyve1 \<br />-x 192.168.0.236 \<br />-I /var/tmp/tribblix-0m29-minimal.iso \<br />-V 8G</span><br /></p><p>Note that this creates an 8G zvol, which is the starting size of the AMI.</p><p>Then run socat as root to give you a VNC socket to talk to</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">socat TCP-LISTEN:5905,reuseaddr,fork UNIX-CONNECT:/export/zones/bhyve1/root/tmp/vm.vnc<br /></span></p><p>and as yourself, run the vnc viewer</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">vncviewer :5<br /></span></p><p>Once it's finished booting, log in as root and install with the <span style="font-family: courier;">ec2-baseline</span> overlay which is what makes sure it's got the pieces necessary to work on EC2.</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">./live_install.sh -G c1t0d0 ec2-baseline</span><br /></p><p>Back as root on the host, ^C to get out of socat, remove the ISO image and reboot, so it will boot from the newly installed image.</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">zap remove-cd -z bhyve1 -r</span><br /></p><p>Restart socat and vncviewer, and log in to the guest again.</p><p>What I then do is to remove any configuration or other data from the guest that we don't want in the final system. 
(This is similar to the old <span style="font-family: courier;">sys-unconfig</span> that many of us who used Solaris will be familiar with.)<br /></p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">zap unconfigure -a</span><br /></p><p>I usually also ensure that a functional <span style="font-family: courier;">resolv.conf</span> exists, just in case dhcp doesn't create it correctly.</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">echo "nameserver 8.8.8.8" > /etc/resolv.conf</span></p><p>Back on the host, shut the instance down by shutting down the bhyve zone it's running in:</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">zoneadm -z bhyve1 halt</span><br /></p><p>Now the zfs volume you created contains a suitable image. All you have to do is get it to AWS. First copy the image into a plain file:</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">dd if=/dev/zvol/rdsk/rpool/bhyve1_bhvol0 of=/var/tmp/tribblix-m29.img bs=1048576<br /></span></p><p>At this point you don't need the zone any more so you can get rid of it:</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">zap destroy-zone -z bhyve1</span><br /></p><p>The raw image isn't in a form you can use, and needs converting. 
There's a useful tool - the <a href="https://github.com/imcleod/VMDK-stream-converter">VMDK stream converter</a> (there's also a download <a href="https://mirrors.omnios.org/vmdk/VMDK-stream-converter-0.2.tar.gz">here</a>) - just untar it and run it on the image:</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">python2 ./VMDK-stream-converter-0.2/VMDKstream.py /var/tmp/tribblix-m29.img /var/tmp/tribblix-m29.vmdk</span><br /></p><p>Now copy that vmdk file (and it's also a lot smaller than the raw img file) up to S3, in the following you need to adjust the bucket name from <span style="font-family: courier;">mybucket</span> to something of yours:</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">aws s3 cp --cli-connect-timeout 0 --cli-read-timeout 0 \<br />/var/tmp/tribblix-m29.vmdk s3://mybucket/tribblix-m29.vmdk<br /></span></p><p>Now you can import that image into a snapshot:</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">aws ec2 import-snapshot --description "Tribblix m29" \<br />--disk-container file://m29-import.json</span><br /></p><p>where the file <span style="font-family: courier;">m29-import.json</span> looks like this:<br /></p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">{<br /> "Description": "Tribblix m29 VMDK",<br /> "Format": "vmdk",<br /> "UserBucket": {<br /> "S3Bucket": "mybucket",<br /> "S3Key": "tribblix-m29.vmdk"<br /> }<br />}</span><br /></p><p>The command will give you a snapshot id, that looks like <span style="font-family: courier;">import-snap-081c7e42756d7456b</span>, which you can follow the progress of with</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">aws ec2 describe-import-snapshot-tasks --import-task-ids import-snap-081c7e42756d7456b</span></p><p>When that's finished it will give you the snapshot id itself, such as <span style="font-family: 
courier;">snap-0e0a87acc60de5394</span>. From that you can register an AMI, with</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">aws ec2 register-image --cli-input-json file://m29-ami.json</span><br /></p><p>where the <span style="font-family: courier;">m29-ami.json</span> file looks like:</p><p style="margin-left: 40px; text-align: left;"><span style="font-family: courier;">{<br /> "Architecture": "x86_64",<br /> "Description": "Tribblix, the retro illumos distribution, version m29",<br /> "EnaSupport": false,<br /> "Name": "Tribblix-m29",<br /> "RootDeviceName": "/dev/xvda",<br /> "BlockDeviceMappings": [<br /> {<br /> "DeviceName": "/dev/xvda",<br /> "Ebs": {<br /> "SnapshotId": "snap-0e0a87acc60de5394"<br /> }<br /> }<br /> ],<br /> "VirtualizationType": "hvm",<br /> "BootMode": "legacy-bios"<br />}<br /></span></p><p>If you want to create a Nitro-enabled AMI, change "<span style="font-family: courier;">EnaSupport</span>" from "<span style="font-family: courier;">false</span>" to "<span style="font-family: courier;">true</span>", and "<span style="font-family: courier;">BootMode</span>" from "<span style="font-family: courier;">legacy-bios</span>" to "<span style="font-family: courier;">uefi</span>".</p><p><br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-42005714572303692362023-03-11T10:27:00.000+00:002023-03-11T10:27:21.938+00:00What, no fsck?<p>There was a huge amount of resistance early on to the fact that zfs didn't have an fsck. 
Or, rather, a <i>separate</i> fsck.<br /><br />I recall being in Sun presentations introducing zfs and question after question was about how to repair zfs when it got corrupted.<br /><br />People were so used to shoddy file systems - ones so badly implemented that a separate utility was needed to repair errors caused by fundamental design and implementation mistakes in the file system itself - that the idea that the file system driver itself ought to take responsibility for managing the state of the file system was totally alien.<br /><br />If you think about ufs, for example, there were a number of known failure modes, and what you did was take the file system offline, run the checker against it, and it would detect the known errors and modify the bits on disk in a way that would hopefully correct the problem. (In reality, if you needed it, there was a decent chance it wouldn't work.) Doing it this way was simple laziness - it would be far better to just fix ufs so it wouldn't corrupt the data in the first place (ufs logging went a long way towards this, eventually). And you were only really protecting against known errors, where you understood exactly the sequence of events that would cause the file system to end up in a corrupted state, so that random corruption was either undetectable or unfixable, or both.<br /><br />The way zfs thought about this was very different. To start with, eliminate all known behaviour that can cause corruption. The underlying copy on write design goes a long way, and updates are transactional so either complete or not. If you find a new failure mode, fix that in the file system proper. 
And then, correction is built in rather than separate, which means that it doesn't need manual intervention by an administrator, and all repairs can be done without taking the system offline.<br /><br />Thankfully we've moved on, and I haven't heard this particular criticism of zfs for a while.<br /><br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-81131861866679960092022-11-21T14:43:00.000+00:002022-11-21T14:43:04.738+00:00A decade of Tribblix<p>I seem to have just missed the anniversary, but it turns out that <a href="http://www.tribblix.org/">Tribblix</a> has existed for slightly over a decade.</p><p>The initial blog post on <a href="https://ptribble.blogspot.com/2012/10/building-tribblix.html">Building Tribblix </a>was published on October 24th, 2012. But the ISO image (milestone 0) was October 21st, and it looks like the packages were built on October 4th. So there's a bit of uncertainty about the actual date, and I had been playing around with some of the bits and pieces for a while before that.</p><p>There have been a lot of releases. We're now on Milestone 28, but there have been several update releases along the way, so I make it 42 distinct releases in total. That doesn't include the LX-enabled OmniTribblix variant (there have been 20 of those by the way).</p><p>The focus (given hardware availability) has been x86, naturally. But the SPARC version has seen occasional bursts of life. Now I have a decent build system, it's catching up. Will there be an ARM version? Who knows...</p><p>Over the years there have been some notable highlights. 
It took a few releases to become fully self-hosting; package management had to be rebuilt; LibreOffice was ported; Xfce and MATE added as fully functional desktop offerings (with a host of others); a whole family of zones, including reimplementing the traditional sparse root; made available on clouds like AWS and Digital Ocean; network install via iPXE; huge numbers of packages (it's never-ending churn); and maintaining Java by default.</p><p>And it's been (mostly) fun. Here's to the next 10 years!</p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com6tag:blogger.com,1999:blog-9726833.post-61450374717511466282022-11-20T21:27:00.002+00:002022-11-20T21:27:46.863+00:00TREASURE - The Remote Execution and Access Service Users Really Enjoy<p>Many, many years ago I worked on a prototype of a software ecosystem I called TREASURE - The Remote Execution and Access Service Users Really Enjoy.</p><p>At the time, I was running the infrastructure and application behind an international genomics service. The idea was that we could centrally manage all the software and data for genomic analysis, provide high-end compute and storage capability, and amortize the cost across 20,000 academics so that individual researchers didn't have to maintain it all individually.</p><p>Originally, access was via telnet (I did say it was a long time ago). After a while we enabled X11, so that graphical applications would work (running X11 directly across the internet was fun).</p><p>Then along came the web. One of my interesting projects was to write a web server that would run with the privileges of the authenticated user. (This was before apache came along, by the way!) And clearly a web browser might be able to provide a more user-friendly and universal interface than a telnet prompt.</p><p>We added VNC as well (it came out of Cambridge and we were aware of it well before it became public), so that users could view graphical applications more easily. 
This had a couple of advantages - all the hard work and complexity was at our end, where we had control, and X11 is quite latency-sensitive so performance improved.</p><p>But ultimately what I wanted to do was to run the GUI on the user's machine, with access to the user's files. Remember that the GUI is then not running where the software, genome databases, and all the compute power are located.</p><p>Hence the Remote Execution part of TREASURE - what we wanted was a system that would call across to a remote service to do the work, and return the result to the user. And the Access part was about making it accessible and transparent, which would lead to a working environment that people would enjoy using.</p><p>At its core, TREASURE was originally a local GUI that knew how to run applications. Written in Java, it would therefore run on pretty much any client (and we had users with all sorts of Unix workstations in addition to Windows making inroads). The clever bit was to replace the Java Runtime.getRuntime().exec() calls that ran applications locally with some form of remote procedure call. Being of its time, this might involve CORBA, RMI, SOAP, or JAX-WS with data marshalled as XML. In fact, I implemented pretty much every remote call mechanism available (and this did in fact come in useful as other places did make available some services using pretty random protocols). And then of course there's the server side which was effectively a CGI script.</p><p>The other key part was to work out which files needed to be sent across. Sometimes it was obvious (it's a GUI, the user has selected a file to analyse), but sometimes we needed to send across auxiliary files as well. And on the server side it ran in a little sandbox so you knew which output files had been generated and could return those.</p><p>Effectively, this was a production form of serverless computing running over 20 years ago. 
Only we called it GRID computing back then.</p><p>Another interesting feature of the architecture was the TREASURE CHEST, which was a source of applications. There were literally hundreds of possible applications you could run, and many more if you included interfaces to other providers. So rather than write all those into the app, there was a plugin system where you could download a jar file and run it as a plugin, and the TREASURE CHEST was where you could find these applications. Effectively an app store, in modern terminology.</p><p>Sadly the department got closed down due to political incompetence, so the project never really got beyond the prototype stage. And while I still have bits and pieces of code, I don't appear to have a copy of the whole thing. A lot of the components would need to be replaced, but the overall concept is still sound.<br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-67173387532991255152022-11-15T12:07:00.000+00:002022-11-15T12:07:07.393+00:00Tribblix for SPARC m25.1<p>Following hot on the heels of the <a href="https://ptribble.blogspot.com/2022/11/tribblix-for-sparc-m22-iso-now-available.html">Tribblix Milestone 22 ISO for SPARC</a>, it's possible to upgrade that to a newer version. The new version that's available is m25.1.</p><p>(If the available versions look a bit random, that's because they are. Not every release on x86 was built for SPARC, and not all of the ones that were actually worked properly. So we have what we have.)</p><p>The major jump, aside from the underlying OS, in m25.1 for SPARC is that it brings in gcc7 (for applications, illumos itself is still built with gcc4), and generally there's a bunch of more modern applications available.</p><p>To upgrade m22 to m25.1 is a manual process. 
This is because certain steps are necessary, and if you don't follow them exactly the system won't boot.</p><p>The underlying cause of the various problems in this process is that it's a big jump from m22 to m25.1 and you will hit bugs in the upgrade process that have been fixed in intermediate releases.</p><p>First, take a note of the current BE, e.g. tribblix. You might need it later if things go bad and you need to reboot into the current (hopefully working) release.</p><p>You can manually add available versions for upgrade with the following trick (this is just one line, despite how it might be formatted):</p><p></p><blockquote><span style="font-family: courier;">echo "m25.1|http://pkgs.tribblix.org/release-m25.1.sparc/TRIBzap.0.0.25.1.zap|Tribblix m25.1" >> /etc/zap/version.list</span></blockquote><p></p><p>and check that's visible with</p><p></p><blockquote><span style="font-family: courier;">zap upgrade list</span></blockquote><p></p><p>and then start the upgrade with</p><p></p><blockquote><span style="font-family: courier;">zap upgrade m25.1</span></blockquote><p></p><p><b>Do not activate or reboot yet!</b></p><p>You <b>MUST</b> do the following:</p><p></p><blockquote><span style="font-family: courier;">beadm mount m25.1 /a<br />zap install -C /a TRIBshell-ksh93<br />pkgadm sync -q -R /a<br />beadm umount m25.1</span></blockquote><p></p><p>and then you should be safe to reboot:</p><p></p><blockquote><span style="font-family: courier;">beadm activate m25.1<br />init 6</span></blockquote><p></p><p>If it doesn't come back, you can boot into the previous release (that you took the name of earlier, remember) from the ok prompt</p><p></p><blockquote><span style="font-family: courier;">boot -Z rpool/ROOT/tribblix</span></blockquote><p></p><p>Once you're up and running on m25.1, it's time to clean up.</p><p></p><blockquote><span style="font-family: courier;">zap refresh</span></blockquote><p></p><p>and then remove some of the old opensxce 
packages</p><p></p><blockquote><span style="font-family: courier;">zap uninstall \<br />SUNWfont-xorg-core \<br />SUNWfont-xorg-iso8859-1 \<br />SUNWttf-dejavu \<br />SUNWxorg-clientlibs \<br />SUNWxorg-xkb \<br />SUNWxvnc \<br />SUNWxwcft \<br />SUNWxwfsw \<br />SUNWxwice \<br />SUNWxwinc \<br />SUNWxwopt \<br />SUNWxwxft \<br />SUNWxwrtl \<br />SUNWxwplr \<br />SUNWxwplt</span></blockquote><p></p><p>and then bring packages up to current</p><p></p><blockquote><span style="font-family: courier;">zap update-overlay -a</span></blockquote><p></p><p>and this should give you a system that's in a workable state, roughly matching my active SPARC environment.<br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com2tag:blogger.com,1999:blog-9726833.post-14543546537118939322022-11-14T10:08:00.000+00:002022-11-14T10:08:45.364+00:00Tribblix for SPARC m22 ISO now available<p>I've made available a newer ISO image for Tribblix on SPARC.</p><p>This is an m22 ISO. So it's actually relatively old compared to the mainstream x86 release.</p><p>I actually had a number of random SPARC ISO images, but for a while I've had no way of testing any of them. (And many of the problems with the SPARC ISOs in general are because I had no real way of testing them properly.)</p><p>Enter a newish T4-1 (thanks Andy!), and I can now easily create an LDOM, assign it a zvol for a root disk and an ISO image to boot from, and testing is trivial again. And while some of the ISO images I have are clearly so broken as to not be worth considering, the m22 version looks pretty reasonable.</p><p>In terms of available application packages, it exactly matches the old m20 release. I do have newer packages on some of my test systems, but they are built with a newer gcc and so need a proper upgrade path. But that's going to be easier now too.</p><p>There is a minor error on the m22 ISO, in that the xz package shipped appears to be wrong. 
To fix, simply</p><p><span style="font-family: courier;">zap install TRIBcompress-xz</span></p><p>and to update to the latest available applications (the ISO is from early 2020, the repo from mid-2021)</p><p><span style="font-family: courier;">zap refresh<br />zap update TRIBlib-security-openssl<br />zap update-overlay -a</span><br /></p><p>The reason for updating openssl on its own is that a number of applications are compiled against openssl 1.1.1, so you need to be sure that gets updated first.</p><p>The next step is to push on to something newer.<br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com1tag:blogger.com,1999:blog-9726833.post-84103405547347185782022-10-11T17:20:00.000+01:002022-10-11T17:20:17.967+01:00DevOps as a HR problem<p>I wrote about one way in which <a href="https://ptribble.blogspot.com/2022/10/on-intersection-between-it-and-hr.html">HR and IT can operate more closely</a>, but there's another interaction between IT and HR that might not be so benign.<br /></p><p>DevOps is ultimately about breaking down silos in IT (indeed, my definition of DevOps is a cultural structure where teams work together to meet the needs of the business rather than competing against each other to meet the needs of the team).<br /></p><p>However, in a business, individuals and teams are actually playing a game in which the rules and criteria for success are set by HR in the shape of the (often annual) review cycle. And all too often promotions, pay rises, even restructuring, are based around individual and team performance in isolation. And who can blame individuals and teams for optimising their behaviour around the performance targets they've been set?<br /></p><p>It's similar to Conway's Law, in which the outputs of an organisation mirror its organisational structure - here, the outputs of an organisation will mirror the performance targets that have been set. 
If you want to improve collaboration and remove silos, then make sure that HR are on board and get them to explicitly put those goals into the annual performance targets.<br /><br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-45125297306788401752022-10-04T10:49:00.001+01:002022-10-04T10:49:31.429+01:00On the intersection between IT and HR<p>A while ago I mentioned <a href="https://ptribble.blogspot.com/2021/12/the-three-strands-of-information.html">The three strands of Information Technology</a>, and how this was split into an internal-facing component (IT for the business, IT for the employee) and an external-facing one (IT for the customer).</p><p>In a pure technology company, there's quite a mismatch, with the customer-facing component being dominant and the internal-facing parts being minimised. In this case, do you actually need an IT department, in the traditional sense?</p><p>You need a (small) team to do the work, of course. But one possibility is to assign them not to a separate IT organisation but to the HR department.</p><p>Why would you do this? Well, the primary role of internal IT in a technology company is simply to make sure that new starters get the equipment and capabilities they need on day one, and that they hand stuff back and get their access removed when they leave. And if there's one part of an organisation that knows when staff are arriving and leaving, it's the HR department. Integrating internal IT directly into the rest of the onboarding and offboarding process dramatically simplifies communication.</p><p>It helps security and compliance too. 
One of the problems you often see with the traditional setup where IT is completely separate from HR is that it can take forever to revoke a staff member's access when they leave; integrating the two functions massively shortens that cycle.<br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-41926909920851846772022-08-19T21:16:00.000+01:002022-08-19T21:16:30.537+01:00Tribblix m28, Digital Ocean, and vioscsi<p>A while ago, I wrote about <a href="https://ptribble.blogspot.com/2021/04/running-tribblix-on-digital-ocean.html">deploying Tribblix on Digital Ocean</a>.</p><p>The good news is that the same process works flawlessly with the recently released m28 version of Tribblix.</p><p>If you recall from the original article, adding an additional storage volume didn't work. But we now have a vioscsi driver, so has the situation improved?</p><p>Yes!</p><p>All I did was select an additional volume when creating the droplet. Then if I run format, I see:</p><p><span style="font-family: courier;"></span></p><blockquote><p><span style="font-family: courier;">AVAILABLE DISK SELECTIONS:<br /> 0. c3t0d1 <DO-Volume-2.5+ cyl 13052 alt 2 hd 255 sec 63><br /> /pci@0,0/pci1af4,8@5/iport@iport0/disk@0,1<br /> 1. c4t0d0 <Virtio-Block Device-0000-25.00GB><br /> /pci@0,0/pci1af4,2@6/blkdev@0,0</span></p></blockquote><p>That 25GB Virtio-Block device is the root device used for rpool; the other one is the 100GB additional volume. 
It's also visible in diskinfo:</p><p></p><p><span style="font-family: courier;"></span></p><blockquote><p><span style="font-family: courier;">TYPE DISK VID PID SIZE RMV SSD<br />SCSI c3t0d1 DO Volume 100.00 GiB no no <br />- c4t0d0 Virtio Block Device 25.00 GiB no no <br />- c5t0d0 Virtio Block Device 0.00 GiB no no <br /></span></p><p></p></blockquote><p>(That empty c5t0d0 is the metadata service, by the way.)</p><p></p><p>Let's create a pool:</p><p><span style="font-family: courier;"></span></p><blockquote><span style="font-family: courier;">zpool create store c3t0d1</span></blockquote><p></p><p>It just works. And performance isn't too shabby - I can read and write at 300MB/s.</p><p>There you go. More options for running illumos.</p><p> <br /></p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0tag:blogger.com,1999:blog-9726833.post-72790749764733986912022-07-09T14:54:00.000+01:002022-07-09T14:54:05.946+01:00Tribblix and static networking on AWS<p>I've just made available the m27 <a href="http://www.tribblix.org/aws.html">AMIs for Tribblix</a>. As usual, these are just available in London (eu-west-2).</p><p>One thing I've noticed repeatedly while running illumos on AWS is that network stability isn't great. The instance will occasionally drop off the network and stubbornly refuse to reclaim its IP address even if you reboot it. It's not just Tribblix; I run a whole lot of OmniOS on AWS and that does the same thing.</p><p>The problem appears to be related to DHCP not being able to pick up the address (even though I can see it sending out the correct requests and getting what look like legitimate responses).<br /></p><p>So what I do is convert the running instance from using NWAM and being a DHCP client to having statically configured networking. On first boot it needs to use DHCP, because it cannot know what its IP address and network configuration should be until it's booted up once and used DHCP to get the details. 
But it's extremely rare to take an AWS instance and change its networking - you would simply build new instances rather than modifying existing ones - so changing it to static is fine, and eliminates any possibility of DHCP failures messing you up in future.<br /></p><p>In the past I've always done this manually, but now there's a much easier way if you're using m27 or later:</p><p><span style="font-family: courier;">zap staticnet</span><br /></p><p>will show you what the system will do, just as a sanity check, and then</p><p><span style="font-family: courier;">zap staticnet -y</span></p><p>will implement the change.</p>Peter Tribblehttp://www.blogger.com/profile/09363446984245451854noreply@blogger.com0
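For readers who are on an older release, or who just want to see what's involved, the manual NWAM-to-static conversion that the post describes can be sketched with standard illumos commands. This is a sketch only: the interface name (xnf0), the addresses, and the AWS resolver address are illustrative assumptions, not values from the post - substitute what your own instance reports.

```shell
# Sketch of a manual NWAM-to-static conversion on an illumos AWS instance.
# Interface name (xnf0) and all addresses below are examples - note down
# what your instance actually has before switching.
ipadm show-addr          # record the DHCP-assigned address
netstat -rn              # record the default gateway

# Switch from the NWAM profile to the default (static) networking profile.
svcadm disable svc:/network/physical:nwam
svcadm enable svc:/network/physical:default

# Recreate the interface and pin the address recorded above.
ipadm create-if xnf0
ipadm create-addr -T static -a 172.31.10.5/20 xnf0/v4

# Persist the default route and point at the VPC resolver.
route -p add default 172.31.0.1
echo "nameserver 169.254.169.253" > /etc/resolv.conf
```

This is the kind of sequence `zap staticnet` wraps up for you, with the added benefit that it reads the live configuration itself rather than relying on you to transcribe it correctly.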