Sunday, June 18, 2017

Tweaking binaries with elfedit

On Solaris and illumos, you can inspect shared objects (binaries and libraries) with elfdump. In the most common case, you're simply looking for what shared libraries you're linked against, in which case it's elfdump -d (or, for those of us who were doing this years before elfdump came into existence, dump -Lv). For example:

% elfdump -d /bin/true

Dynamic Section:  .dynamic
     index  tag                value
       [0]  NEEDED            0x1d6               libc.so.1
       [1]  INIT              0x8050d20          

and it goes on a bit. But basically you're looking at the NEEDED lines to see which shared libraries you need. (The other field that's generally of interest for a shared library is the SONAME field.)

However, you can go beyond this, and use elfedit to manipulate what's present here. You can essentially replicate the above with:

elfedit -r -e dyn:dump /bin/true

Here the -r flag says read-only (we're just looking), and -e says execute the command that follows, which is dyn:dump - or just show the dynamic section.

If you look around, you'll see that the classic example is to set the runpath (which you might see as RPATH or RUNPATH in the dump output). This is used to fix up binaries that were built incorrectly, or where you've moved the libraries somewhere other than where the binary normally looks for them. It might look like:

elfedit -e 'dyn:runpath /my/local/lib' prog

This is the first example in the man page, and the standard example wherever you look. (Note the quotes - that's a single command input to elfedit.)

However, another common case I come across is where libtool has completely mangled the link so the full pathname of the library (at build time, no less) has been embedded in the binary (either in absolute or relative form). In other words, rather than the NEEDED entry being

libfoo.so.1

it ends up being

/home/ptribble/build/bar/.libs/libfoo.so.1

With this sort of error, no amount of tinkering with RPATH is going to help the binary find the library. Fortunately, elfedit can help us here too.

First you need to work out which element you want to modify. Back to elfedit again to dump out the structure:

% elfedit -r -e dyn:dump /bin/baz
     index  tag                value
       [0]  POSFLAG_1         0x1                 [ LAZY ]
       [1]  NEEDED            0x8e2               /home/.../libfoo.so.1

It might be further down, of course. But the entry we want to edit is index number 1. We can narrow down the output just to this element by using the -dynndx flag to the dyn:dump command, for example

elfedit -r -e 'dyn:dump -dynndx 1' /bin/baz

or, equivalently, using dyn:value

elfedit -r -e 'dyn:value -dynndx 1' /bin/baz

And we can actually set the value as well. This requires the -s flag to set a string, but you end up with:

elfedit -e 'dyn:value -dynndx -s 1 libfoo.so.1' /bin/baz

and then if you use elfdump or elfedit or ldd to look at the binary, it should pick up the library correctly.

This is really very simple (the hardest part is having to work out what the index of the right entry is). I didn't find anything when searching that actually describes how simple it is, so I thought it worth documenting for the next time I need it.
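
Since working out the index is the fiddly part, it can be scripted. What follows is only a rough sketch: the function names needed_with_path and fix_needed are made up for illustration, and it assumes the dyn:dump output has the column layout shown above (index, tag, value, string) and that rewriting one entry doesn't renumber the others.

```shell
#!/bin/sh
# Read 'elfedit -r -e dyn:dump' output on stdin and print "index path"
# for every NEEDED entry whose string contains a slash, i.e. one that
# has a pathname embedded rather than a bare library name.
needed_with_path() {
    awk '$2 == "NEEDED" && $4 ~ /\// {
        idx = $1
        gsub(/\[|\]/, "", idx)   # "[1]" becomes "1"
        print idx, $4
    }'
}

# Hypothetical helper: rewrite each such entry in the given binary to
# the bare library name, using the same dyn:value command as above.
fix_needed() {
    elfedit -r -e dyn:dump "$1" | needed_with_path |
    while read -r idx path; do
        elfedit -e "dyn:value -dynndx -s $idx ${path##*/}" "$1"
    done
}

# Usage (not run here): fix_needed /bin/baz
```

It's worth running just the first function on its own to check what it would change before letting fix_needed loose on a binary you care about.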


Friday, June 09, 2017

On Tribblix Milestone 20

Having released a new update for Tribblix, I thought I would add a little commentary on the progress that's being made and the direction things are going in.

This goes beyond the rather dry release notes and list of what's changed.

The big structural change is that the ISO has been built as a single root archive, rather than the old way with a split-off /usr that's lofi-mounted from a compressed image.

The original reason for doing this (and I experimented with it a while ago) was to allow installation on systems without drivers for the device that you're booting from. This might be a system with only USB3 ports, or I've had problems with laptops where illumos doesn't recognize the CD drive. The boot loader (and BIOS) load the initial boot archive, so if you don't need to ever talk to the media device again you're in much better shape.

While we now have USB3 support, this simplified boot is a good thing in any case, and it allows some neat tricks like iPXE boot.

Another logical change is in the release mechanism itself. I've discussed the Tribblix package repositories before. The snag with the traditional repository layout was that the packages that defined a release were in the main Tribblix repository. So, every time I make a new release I end up having to create a whole new Tribblix repository. Every time I update the illumos packages, I needed a new Tribblix repository. Creating a new one isn't too bad; ongoing support for multiple repositories is a lot of unnecessary work.

The way to fix this is to split out the packages (there are 3 of them) that define the properties of a release into their own separate repo. This allows at least 2 new possibilities:

  1. I can release updated illumos packages without spinning a whole new Tribblix release. It would still use the same upgrade mechanism, but the main Tribblix repo is shared and it's a much lighter release process.
  2. I could create variants or spins. For example, I could create a variant that has LX (see omnitribblix). This would just have a different set of illumos packages but shares everything else. Or I could build a 32-bit or 64-bit only distro.
I haven't yet done either of those things, but it's going to happen.

Behind the scenes I've been gradually working to get more packages - especially those that deliver libraries - built as both 32-bit and 64-bit.

Tribblix is fairly clear that it will continue to support 32-bit and 64-bit hardware, at least for a while. (Whereas both OmniOS and OpenIndiana have effectively dropped 32-bit compatibility, mostly by neglect rather than design.) Of course, there is a reasonable amount of software now that's only 64-bit (anything built with go, for example, or OpenJDK 8), but there's a reasonable chance the people using 32-bit hardware aren't necessarily going to want the latest and greatest applications. (This isn't 100% true, by the way - sometimes you have to interoperate with other facilities in the environment.) But eventually we're going to have to make a full 64-bit transition, and it would be good to be ready.

That gives a rough idea of the work that's currently underway. Looking ahead, there is a whole long list of packages that need adding or updating (such is a maintainer's life). The one significant place I have been falling behind is that I haven't updated gcc, so that needs work. And, of course, I'm trying to get SPARC into some sort of reasonable shape. But, overall, Tribblix is now pretty solid and a bit more polish and attention to detail would benefit it greatly.

Wednesday, June 07, 2017

Installing Tribblix on Vultr using iPXE

One of the new features in Tribblix 0m20 is that booting and installing using iPXE now works.

Here's an example of using this functionality to install a server running Tribblix in the Vultr cloud. A similar mechanism ought to work for any other provider that allows iPXE boot.

I'm assuming you have signed up and logged in, then go to deploy a server.

First choose where you want to deploy the server. I'm in the UK, so London is a good choice.


Then the critical bit, selecting the Server Type. The bit you want here is in a slightly confusing location, under the "Upload ISO" tab. But then select the "iPXE" radio button and put in the value http://pkgs.tribblix.org/m20/ipxe.txt


The other key option is Server Size. As with many providers, there's a simple scale. For testing, an instance with 1G of memory is more than adequate.


Then deploy it. After a few seconds of installing you can then click the link to manage the server, and then view the console, which uses VNC.

If you're reasonably quick you get to see the initial iPXE screen, and can see it downloading the images:


What you can see here is that it's downloaded the original ipxe script we specified. This looks like:

#!ipxe
dhcp
kernel /m20/platform/i86pc/kernel/amd64/unix
initrd /m20/platform/i86pc/boot_archive
boot
 
Which just says to set up the network using dhcp (this might have already been done, but if you're booting off an ipxe iso it may not have been, so we do it anyway), then download the kernel and the boot archive, then boot from what you've just downloaded.

The kernel and the boot archive are on the iso; I've just unpacked them on the server (so the URL given above for the ipxe script will be reasonably permanent for anybody to use). The only slight tweak I've had to make is that the original boot archive is actually gzip compressed and iPXE can't handle that, so it's been uncompressed. The boot archive also now contains the /usr file system as well, rather than it being split off as before. While I'm sure you could mangle the system to download it and sort things out, it's so much easier to put it inside the boot archive.
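
The uncompress step could be scripted along these lines. This is only a sketch: the function names are hypothetical, and the check relies on the standard gzip magic bytes (1f 8b) at the start of the file rather than on anything specific to the boot archive.

```shell
#!/bin/sh
# Succeed if the file starts with the gzip magic bytes 1f 8b.
is_gzip() {
    [ "$(od -An -tx1 -N2 "$1" | tr -d ' \n')" = "1f8b" ]
}

# Copy a boot archive from src to dest, uncompressing it on the way
# if it turns out to be gzip compressed.
prepare_boot_archive() {
    if is_gzip "$1"; then
        gzip -dc "$1" > "$2"
    else
        cp "$1" "$2"
    fi
}

# Usage (not run here; both directories are placeholders):
#   prepare_boot_archive /mnt/iso/platform/i86pc/boot_archive \
#       /var/www/m20/platform/i86pc/boot_archive
```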

Then you get into the normal installer, so log in as jack, su to root, and see what disk(s) are available using the new diskinfo tool. Then you can install Tribblix to that disk:



Don't bother adding additional overlays at this point. It won't work - and you'll get an error about not being able to install overlays (you'll get the error anyway because the installer always tries to add some packages that aren't needed in the live environment). This will be fixed in a future update, but it's relatively harmless.

The other thing you should do before the installation is to change the passwords for root and jack. If you change them before running the installer then the change will propagate to the installed system (because all it's doing is a copy). You really don't want the system to boot up wide open to the internet with the default (and well known) passwords.

Once the (pretty quick) install finishes, it'll look like this:


That's just like a normal install, other than the missing overlays. Then just reboot and you'll soon see the new loader, followed by the system booting.

Due to the missing overlays, you'll get an error about the intrd service failing. You'll have to log in (ssh will work at this point) and then add at least the base overlay:

zap install-overlay base

Plus whatever other overlays you might want. Then you can clear the intrd service and you're good to go.
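
Put together, the post-install fixup might look something like the sketch below. The function name add_overlays is made up, and it assumes that the short name intrd is enough for svcadm to identify the service; use the full FMRI if it isn't. Run it as root.

```shell
#!/bin/sh
# Install the base overlay plus any extra overlays given as arguments,
# then clear the intrd service so it gets another chance to start.
add_overlays() {
    for overlay in base "$@"; do
        zap install-overlay "$overlay" || return 1
    done
    svcadm clear intrd
}

# Usage (not run here): add_overlays
```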

Friday, June 02, 2017

Tribblix memory requirements

Compared to the other illumos distributions, Tribblix has lower memory requirements.

I'm not talking about crazy stunts like running in 48M; here I'm talking about running a fully fledged system.

I've been doing a bit of testing of the upcoming release, which includes running the install under a range of configurations. The test here is to boot the ISO image in VirtualBox with a range of memory sizes and then install the kitchen sink.
  • The live image won't boot at all on a 256M system
  • The live image will boot on a 512M system, but installing to zfs will fail
  • However, installing to ufs works on a 512M system
  • With 768M, installation to zfs is rather slow
  • With 1G or more, you're fine
The upcoming release is going to be built slightly differently, in that it's no longer a split-off /usr configuration. (I discussed how that worked and those strange zlib files some time ago.) The latest OmniOS is a single image; SmartOS likewise. It's just so much easier to construct, and far more reliable.

That change explains the 256M failure - the ramdisk is about 300M, so it simply won't fit. It's likely to have an impact on the 512M case too - in the old scenario you only paged in the bits of the /usr filesystem as and when you needed them; now it's all locked into memory.

On a limited memory system there's a way to make things a bit easier. Simply install the base (no additional overlays) from the installer, then add the rest of the overlays and packages later. The point here is that running from disk doesn't lock up anywhere near as much memory as the full OS being resident in RAM does. And some of the packages in the kitchen sink are rather large, which causes problems.

Once you've got Tribblix installed, how well does it cope? Surprisingly well, to be honest. The Xfce desktop runs quite well in either 512M or 768M of memory. I can run firefox on the 768M system without too many problems (given the way it consumes memory, probably not for a long intensive browsing session), while firefox on a 512M system does run, but it's clearly starting to grind. Java applications work, some smaller ones at least. You need to be realistic in your expectations, but the point is that smaller systems do work.

The most limited systems would tend to be older, possibly 32-bit hardware. I could build a 32-bit only image which would be quite a bit smaller - maybe only two-thirds the size. (And if you really wanted to you could get it even smaller - but then you're in the realms of building custom images using mvi or the like.)

However, the aim of keeping Tribblix viable on smallish systems isn't just to allow the use of old hardware, beneficial though that is. If you're running a service on a cloud or hosting provider then being able to use a 1G server instead of a 2G server will halve your costs, and that's a very good thing to be able to do.

Monday, May 29, 2017

Tribblix SPARC progress

Tribblix is one of the relatively few illumos distributions that runs on both SPARC and x86 hardware.

There are valid reasons for the lack of SPARC support in other distributions. For those backed by commercial entities, it makes no sense to support SPARC as they don't have paying customers to foot the bill. Which leaves SPARC support firmly in the hobbyist realm.

Even in Tribblix, SPARC support has lagged the x86 version somewhat. Again, for entirely predictable reasons. While I do have SPARC hardware, it's relatively slow, noisy, power hungry, and heat-producing compared to my regular x86 boxes. And my day to day use is my x86 workstation, so that drives a lot of the desktop work.

But SPARC development of Tribblix hasn't stopped. Far from it, it's just naturally slower.

The current download ISO image at this time is still Milestone 16. Just to clarify the versioning here - that means it was built from exactly the same illumos commit as the corresponding x86 release. Because it took a little longer to get ready, the userland packages (such as they were) tended to be a bit newer.

There have been 3 more Tribblix releases on x86 since then. Over the winter (when it was cold and the heat output from the T5140 I use as a build server was a good thing) I tried building updated illumos versions. The T5140 I'm using to do the builds is running a cobbled-together frankendistro of bits of Tribblix, bits of OpenSXCE, some random bits from other people working on SPARC, and a whole lot of elbow grease. I managed to build illumos at the m17 and m18 release points, but m19 was a step too far (some of the native stuff assumes that the host OS isn't terribly antiquated). What this means is that I need to replace that with a current system, and get a properly self-hosting illumos build.

That modernizes the underlying illumos components a bit. What about the rest of the system? The primary effort there was to replace the old core components that had been borrowed from OpenSXCE while bootstrapping the distribution in the first place with native packages (and that are then up to date and match the x86 build). Some of the components here are pretty crucial - zlib and libxml2, for instance. At one point I messed up libxml2 slightly - not enough to kill SMF (which would be a big worry) but enough to stop zones working (which, apart from indicating that I had broken it, also left me without an important test mechanism). Rebuild everything enough times and the problem eventually cleared.

I also had a go at getting my SunBlade 1500 workstation working. It's not terribly quick, but it's quiet enough and sufficiently low power that I can have it running without negatively impacting the home office. That was a bit of a struggle: the bge network driver currently in illumos doesn't work - I assume I'm seeing bug 7746 here - but the solution, to use an older version of the driver, works well enough. With that box available I not only have more testing available but also a lightweight machine that I can use to keep the package backlog under control.

Graphics on SPARC is an interesting problem. OK, so I don't expect this to be a priority, but it would be nice to have something that worked. The first problem I found (a while ago) was that some of the binary graphics drivers wouldn't work at all. For example, the m64 driver (which is what might drive the graphics in my SunBlade 2000) uses hat_getkpfnum which was removed from illumos courtesy of bug 536. Graphics drivers that load often simply don't work, and getting an X server to start is a bit of a nightmare. After far too much manual fiddling I did manage to get a twm desktop running on the aforementioned SunBlade 1500, but don't expect native graphics support to improve any time soon.

Applications are another matter: there's no reason you couldn't run at least some applications on a SPARC system and display them back on your desktop machine. After all, X11 is a network display protocol (despite all the effort to eradicate that and turn it into a local-only display protocol). Or run a VNC server and access that remotely. So I've started (but not finished) building up the components for useful applications.

I haven't yet got an ISO image. That's likely to be a while, but if you have an existing SPARC system running Tribblix m16 then the upgrade to m18 ought to work. Although I would recommend a couple of changes to the procedure if you're going to try this:
  • Refresh and update everything: 'zap refresh ; zap update-overlay -a'
  • Download the current upgrade script from github and run that script in place of 'zap upgrade'
  • After booting into the newly updated BE, refresh and update everything again, just to make sure you're up to date

Saturday, April 29, 2017

OmniTribblix

In Tribblix, it's a basic principle that I ship upstream software unmodified. I don't impose my own views on installation layout, nor do I customize it. Generally, I apply patches only to make stuff compile.

This means that what you see in Tribblix is exactly what the upstream author intended, and not some distro-specific bastardization of it.

It also makes my life easier: I don't have to maintain patches, and updating software is much easier if it's unmodified.

In particular, I use an absolutely vanilla illumos-gate. (For a long time it differed only in that I had the fix for 5188 applied, relevant because Tribblix actually uses SVR4 packaging, but now that's integrated I don't even need to do that.)

Again, this makes my life easier. (When you're maintaining a distro on your own in your spare time, making decisions that simplify your job is essential.)

But it also has another benefit: because I have no "special" features that I've added, I'm not tied to one particular version or variant or commit of illumos. Any version of illumos-gate will do just fine. When it comes time to make a release, I just clone the gate, build, and go.

What I could do, then, is build an instance of Tribblix atop some other fork of the gate. For example, illumos-omnios.

I did just that: built the gate (it needed a couple of changes to Makefiles because of the way that perl and snmp are slightly different in OmniOS than they are in Tribblix), created packages, built an ISO, booted and installed it in VirtualBox.

As expected, it just works.

But just demonstrating that it works isn't really the reason I wanted to do this. What I'm really after is the LX brand, which has been integrated into current OmniOS.

Installing an LX zone requires a Linux image. The original (Joyent) work was for their own deployment mechanism, using ZFS images. As soon as it was available in OmniOS the first thing I did was use tarballs, which OmniOS now supports. The easiest way to create a Linux image is to create a Docker container the way you like it, and then export it to a tarball. I did that for Alpine and installed a zone based on that.
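
For the Docker step, something like the following sketch would do. It assumes a host with Docker available; the function name make_lx_tarball is hypothetical. As written it exports the stock image untouched, so if you want the container set up the way you like it, make your changes before the export.

```shell
#!/bin/sh
# Create a (stopped) container from an image, export its filesystem
# as a gzip-compressed tarball, then remove the container again.
make_lx_tarball() {
    image=$1
    out=$2
    cid=$(docker create "$image") || return 1
    docker export "$cid" | gzip > "$out"
    docker rm "$cid" > /dev/null
}

# Usage (not run here): make_lx_tarball alpine alpine.tar.gz
```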

Then you can do very simple things like:

# zlogin lx1 /bin/uname -a 
Linux lx1 4.4 BrandZ virtual linux x86_64 Linux

It's an attractive idea to simply use this as the base for the next Tribblix release. However, that requires illumos-omnios to be supported in the long term, which is currently at risk.

Wednesday, April 12, 2017

Noisy Tribblix

I've had a couple of Tribblix users ask me why audio doesn't work.

This was something I had noticed myself, and the reason was not that audio was in some way broken, but that the permissions on the audio devices were wrong - owned and only writeable by root.

Now I only wanted to actually get any audio out on fairly rare occasions, so a quick chown wasn't that much of an imposition. But it obviously needed fixing properly.

My assumption here is that most desktop users will be logging in through the SLiM login manager. So all I need to do is fix the permissions just before it calls setuid() to the logged in user. And then reset them back once the user is done.

Now, I could have made up a bunch of chowns myself, or written a helper. There's actually code in SLiM to call ConsoleKit - but I don't have ConsoleKit, and don't really see the need to maintain a port of it just for this.

But illumos already has the capability to do this, and the normal login mechanisms use it. There's code in libdevinfo that sets the permissions according to the rules laid out in the /etc/logindevperm file. So the code is really just a call to di_devperm_login() and di_devperm_logout(), and all is well.

This also fixed another irritating bug - I can now eject memory sticks as myself, without needing to be root.

The next thing that happens, of course, is that it doesn't take very long to realise that Twitter has a lot of videos that play automatically. So I'm sitting there and I can hear either the internal loudspeaker or my headphones warbling away.

So the next thing I need is a way to shut the thing up. Historically, I used the old CDE sdtaudiocontrol, which was pretty good. (In general, I detested CDE as a desktop; the mailer and calendar were decent enough for their time, and the audio control was the only other thing I used much.) I use Xfce as my desktop. It used to have xfce4-mixer, but that's now unmaintained and deprecated (and I removed it as part of the migration from gstreamer-0.10 to gstreamer1). Which pretty much leaves the command line audio utilities in illumos, specifically audioctl. I've added the package so users who update will automatically get that as well.

The command

audioctl set-control volume 0

silences things, while

audioctl set-control volume 75

puts the volume back to normal. I've created aliases mute and unmute for those. A more sophisticated approach would be to save the volume and restore it afterwards, but this is enough for now.
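
The save-and-restore variant could be sketched as a pair of shell functions. Treat this as a guess at the shape rather than a tested recipe: it assumes that 'audioctl show-control volume' prints a line starting with the control name followed by its current value, and the state file location is made up.

```shell
#!/bin/sh
# Where the previous volume is remembered (hypothetical location).
VOLFILE="${VOLFILE:-${HOME:-/tmp}/.audio_volume}"

# Remember the current volume, then silence things.
mute() {
    # Assumed output format: a line of the form "volume <value>".
    vol=$(audioctl show-control volume | awk '$1 == "volume" { print $2 }')
    # Only save a real, non-zero level, so muting twice keeps the old value.
    if [ -n "$vol" ] && [ "$vol" != "0" ]; then
        printf '%s\n' "$vol" > "$VOLFILE"
    fi
    audioctl set-control volume 0
}

# Restore the saved volume, falling back to 75 if nothing was saved.
unmute() {
    if [ -s "$VOLFILE" ]; then
        vol=$(cat "$VOLFILE")
    else
        vol=75
    fi
    audioctl set-control volume "$vol"
}
```

Checking for a non-zero level before saving means a second mute doesn't overwrite the remembered volume with 0.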