Saturday, August 29, 2015

Tribblix meets MATE

One of the things I've been working on in Tribblix is to ensure that there's a good choice of desktop options. This varies from traditional window managers (all the way back to the old awm) to modern, lightweight desktop environments.

The primary desktop environment (because it's the one I use myself most of the time) is Xfce, but I've had Enlightenment available as well. Recently, I've added MATE as an additional option.

OK, here's the obligatory screenshot:

[screenshot: the MATE desktop running on Tribblix]

While it's not quite as retro as some of the other desktop options, MATE has a similar philosophy to Tribblix - maintaining a traditional environment in a modern context. As a continuation of GNOME 2, I find it to have a familiar look and feel, but I also find it to be much less demanding both at build and run time. In addition, it's quite happy with older hardware or with VNC.

Building MATE on Tribblix was very simple. The dependencies it has are fairly straightforward - there aren't that many, and most of them you would tend to have anyway as part of a modern system.

To give a few hints, I needed to add dconf, a modern intltool, itstool, iso-codes, libcanberra, zenity, and libxklavier. Having downloaded the source tarballs, I built packages in this order (this isn't necessarily the strict dependency order, it was simply the most convenient; a sketch of the per-package build follows the list):
  • mate-desktop
  • mate-icon-theme
  • eom (the image viewer)
  • caja (the file manager)
  • atril (the document viewer, disable libsecret)
  • engrampa (the archive manager)
  • pluma (the text editor)
  • mate-menus
  • libmateweather (which is pretty massive)
  • mate-panel
  • mate-session
  • marco (the window manager)
  • mate-backgrounds
  • mate-themes (from 1.8)
  • libmatekbd
  • mate-settings-daemon
  • mate-control-center
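
Each of these is a conventional autotools build. As a minimal sketch (the prefix, flags, and staging directory here are illustrative assumptions, not the exact invocation I used for every package):

./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
gmake
gmake install DESTDIR=/tmp/mate-staging
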
The code is pretty clean; I needed a couple of fixes, but overall very little had to be changed for illumos.

The other thing I added was the murrine gtk2 theme engine. For a while I had been getting odd warnings mentioning murrine from various applications, but MATE was competent enough to give me a meaningful warning about it.

I've been pretty impressed with MATE. It's a worthy addition to the available desktop environments, with a good philosophy and a clean implementation.

Monday, August 10, 2015

Whither open source?

According to the Free Software Definition, there are 4 essential freedoms:

  • The freedom to run the program as you wish, for any purpose (freedom 0).
  • The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.
  • The freedom to redistribute copies so you can help your neighbor (freedom 2).
  • The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.

Access to the source code and an open-source license are necessary preconditions for software freedom, but not sufficient.

And, unfortunately, we are living in an era where it is becoming ever more difficult to exercise the freedoms listed above.

Consider freedom 0. In the past, essentially all free software ran perfectly well on essentially every hardware platform and operating system. At the present time, much of what claims to be open-source software is horrendously platform-specific - sometimes out of ignorance (I don't expect every developer to be able to test on all platforms), but there's a disturbing trend of deliberately excluding non-preferred platforms.

There is increasing use of new languages and runtimes, which are often very restricted in terms of platform support. If you look at languages like Node.js, Go, and Rust, you'll see that they explicitly target the common hardware architectures (x86 and ARM), deliberately and consciously excluding other platforms. Add to that the trend for self-referential bootstrapping (where you need X to build X) and you can see other platforms frozen out entirely.

So, much of freedom 0 has been emasculated. What of freedom 1?

Yes, I might be able to look at the source code. (Although, in many cases, it is opaque and undocumented.) And I might be able to crack open an editor and type in a modification. But actually being able to use that modification is a whole different ball game.

Actually building software from source often enters you into a world of pain and frustration: fighting your way through Dependency Hell, struggling with arcane and opaque build systems, and becoming frustrated with the vagaries of the autotools (remember how the configure script works - it makes a bunch of random and unsubstantiated guesses about the state of your system, then ignores half the results, and often needs explicit overriding, making a mockery of the "auto" part) - only to discover that "works on my system" is almost a religion.

Current trends like Docker make this problem worse. Rather than having to pay lip-service to portability by having to deal with the vagaries of multiple distributions, authors can now restrict the target environment even more narrowly - "works in my docker image" is the new normal. (I've had some developers come out and say this explicitly.)

The conclusion: open-source software is becoming increasingly narrow and proprietary, denying users the freedoms they deserve.

Sunday, August 02, 2015

Blank Zones

I've been playing around with various zone configurations on Tribblix. This is going beyond the normal sparse-root, whole-root, partial-root, and various other installation types, into thinking about other ways you can actually use zones to run software.

One possibility is what I'm tentatively calling a Blank zone. That is, a zone that has nothing running. Or, more precisely, just has an init process but not the normal array of miscellaneous processes that get started up by SMF in a normal boot.

You might be tempted to use 'zoneadm ready' rather than 'zoneadm boot'. This doesn't work, as you can't get into the zone:

zlogin: login allowed only to running zones (test1 is 'ready').

So you do actually need to boot the zone.

Why not simply disable the SMF services you don't need? This is fine if you still want SMF and most of the services, but SMF itself is quite a beast, and the minimal set of service dependencies is both large and extremely complex. In practice, you end up running most things just to keep the SMF dependencies happy.
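
You can get a feel for the scale of the dependency web with svcs (a quick sketch; the milestone and service picked here are just examples):

# services that the multi-user milestone depends on
svcs -d svc:/milestone/multi-user:default
# services that depend on minimal filesystems being mounted
svcs -D svc:/system/filesystem/minimal:default
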

Now, SMF is started by init using the following line (I've trimmed the redirections) from /etc/inittab

smf::sysinit:/lib/svc/bin/svc.startd

OK, so all we have to do is delete this entry, and we just get init. Right? Wrong! It's not quite that simple. If you try this then you get a boot failure:

INIT: Absent svc.startd entry or bad contract template.  Not starting svc.startd.
Requesting maintenance mode

In practice, this isn't fatal - the zone is still running, but apart from wondering why it's behaving like this it would be nice to have the zone boot without errors.

Looking at the source for init, it soon becomes clear what's happening. The init process is now intimately aware of SMF: essentially it knows that its only job is to get startd running, and startd will do all the work. However, it's clear from the code that it only looks for the 'smf' id in the first field. So my solution here is to replace startd with an infinite sleep.

smf::sysinit:/usr/bin/sleep Inf

(As an aside, this led to illumos bug 6019, as the manpage for sleep(1) isn't correct. Using 'sleep infinite' as the manpage suggests led to other failures.)

Then, the zone boots up, and the process tree looks like this:

# ptree -z test1
10210 zsched
  10338 /sbin/init
    10343 /usr/bin/sleep Inf

To get into the zone, you just need to use zlogin. With nothing running, none of the normal daemons (like sshd) are available for you to connect to. It's somewhat disconcerting to type 'netstat -a' and get nothing back.

For permanent services, you could run them from inittab (in the traditional way), or have an external system that creates the zones and uses zlogin to start the application. Of course, this means that you're responsible for any required system configuration and for getting any prerequisite services running.
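
For example, a minimal sketch of the second approach, driven from the global zone (the zone name test1 matches the example above; the daemon path is invented for illustration):

zlogin test1 /opt/myapp/bin/myappd
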

In particular, this sort of trick works better with shared-IP zones, in which the network is configured from the global zone. With an exclusive-IP zone, all the networking would need to be set up inside the zone, and there's nothing running to do that for you.
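
For a shared-IP zone, that network configuration lives in a zonecfg net resource managed from the global zone. As a minimal sketch (the address and interface name are assumptions for illustration):

# zonecfg -z test1
zonecfg:test1> add net
zonecfg:test1:net> set address=192.168.0.50/24
zonecfg:test1:net> set physical=e1000g0
zonecfg:test1:net> end
zonecfg:test1> commit
zonecfg:test1> exit
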

Another thought I had was to use a replacement init. The downside to this is that the name of the init process is baked into the brand definition, so I would have to create a duplicate of each brand to run it like this. Just tweaking the inittab inside a zone is far more flexible.

It would be nice to have more flexibility. At the present time, I either have just init, or the whole of SMF. There's a whole range of potentially useful configurations between these extremes.

The other thing is to come up with a better name. Blank zone. Null zone. Something else?

Saturday, August 01, 2015

The lunacy of -Werror

First, a little history for those of you young enough not to have lived through perl. In the perl man page, there's a comment in the BUGS section that says:

The -w switch is not mandatory.

(The -w switch enables warnings about grotty code.) Unfortunately, many developers misunderstood this. They wrote their perl script, and then added the -w switch as though it were a magic bullet that fixed all the errors in their code, without ever bothering to look at the output it generated or - heaven forbid - actually fix the problems. The result was that, with a CGI script, your apache error log filled up with output that nobody ever read.

The correct approach, of course, is to develop with the -w switch, fix all the warnings it reports as part of development, and then turn it off. (Genuine errors will still be reported anyway, and you won't have to sift through garbage to find them, or worry about your service going down because the disk filled up.)

Move on a decade or two, and I'm starting to see a disturbing number of software packages being shipped that have -Werror in the default compilation flags. This almost always results in the build failing.

If you think about this for a moment, it should be obvious that enabling -Werror by default is a really dumb idea. There are two basic reasons:

  1. Warnings are horribly context sensitive. It's difficult enough to remove all the warnings given a single fully constrained environment. As soon as you start to vary the compiler version, the platform you're building on, or the versions of the (potentially many) prerequisites you're building against, getting accidental warnings is almost inevitable. (And you can't test against all possibilities, because some of those variations might not even exist at the point of software release.)
  2. The warnings are only meaningful to the original developer. The person who has downloaded the code and is trying to build it has no reason to be burdened by all the warnings, let alone be inconvenienced by unnecessary build failures.
To be clear, I'm not saying - at all - that the original developer shouldn't be using -Werror and fixing all the warnings (and you might want to enable it for your CI builds to be sure you catch regressions), but distributing code with it enabled is simply being rude to your users.
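
One way to square the circle, sketched here for an autotools-style project (the exact flag sets are illustrative assumptions):

# what you ship to users: warnings visible, but not fatal
./configure CFLAGS="-O2 -Wall"
# what the developer (and CI) builds with: warnings are fatal
./configure CFLAGS="-O2 -Wall -Werror"
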

(Having a build target that generates a warning report that you can send back to the developer would be useful, though.)

Friday, July 24, 2015

boot2docker on Tribblix

Containers are the new hype, and Docker is the Poster Child. OK, I've been running containerized workloads on Solaris with zones for over a decade, so some of the ideas behind all this are good; I'm not so sure about the implementation.

The fact that there's a lot of buzz is unmistakeable, though. So being familiar with the technology can't be a bad idea.

I'm running Tribblix, so running Docker natively is just a little tricky. (Although if you actually wanted to do that, then Triton from Joyent is how to do it.)

But there's boot2docker, which allows you to run Docker on a machine - by spinning up a copy of VirtualBox for you and getting that to actually do the work. The next thought is obvious - if you can make that work on Mac OS X or Windows, why not on any other OS that also supports VirtualBox?

So, off we go. First port of call is to get VirtualBox installed on Tribblix. It's an SVR4 package, so it should be easy enough. Ah, but it has special-case handling for various Solaris releases that causes it to derail quite badly on illumos.

Turns out that Jim Klimov has a patchset to fix this. It doesn't handle Tribblix (yet), but you can take the same idea - and the same instructions - to fix it here. Unpack the SUNWvbox package from datastream to filesystem format, edit the file SUNWvbox/root/opt/VirtualBox/vboxconfig.sh, replacing the lines

             # S11 without 'pkg'?? Something's wrong... bail.
             errorprint "Solaris $HOST_OS_MAJORVERSION detected without executable $BIN_PKG !? I are confused."
             exit 1

with

         # S11 without 'pkg'?? Likely an illumos variant
         HOST_OS_MINORVERSION="152"

and follow Jim's instructions for updating the pkgmap, then just pkgadd from the filesystem image.
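
In outline, the procedure is something like this (the package filename is illustrative, and remember that editing vboxconfig.sh means updating its size and checksum in the pkgmap, per Jim's instructions):

# unpack the datastream package into filesystem format
pkgtrans VirtualBox-SunOS.pkg /tmp/vbox SUNWvbox
# edit /tmp/vbox/SUNWvbox/root/opt/VirtualBox/vboxconfig.sh as above,
# update the pkgmap, then install from the filesystem image
pkgadd -d /tmp/vbox SUNWvbox
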

Next, the boot2docker cli. I'm assuming you have go installed already - on Tribblix, "zap install go" will do the trick. Then, in a convenient new directory,

env GOPATH=`pwd` go get github.com/boot2docker/boot2docker-cli

That won't quite work as is. There are a couple of patches. The first is to the file src/github.com/boot2docker/boot2docker-cli/virtualbox/hostonlynet.go. Look for the CreateHostonlyNet() function, and replace

    out, err := vbmOut("hostonlyif", "create")
    if err != nil {
        return nil, err
    }


with

    out, err := vbmOut("hostonlyif", "create")
    if err != nil {
        // default to vboxnet0
        return &HostonlyNet{Name: "vboxnet0"}, nil
    }


The point here is that, on a Solaris platform, you always get a hostonly network - that's what vboxnet0 is - so you don't need to create one; in fact the create option doesn't even exist, so it errors out.

The second little patch is that the arguments to SSH don't quite match the SunSSH that comes with illumos, so we need to remove one of the arguments. In the file src/github.com/boot2docker/boot2docker-cli/util.go, look for DefaultSSHArgs and delete the line containing IdentitiesOnly=yes (which is the option SunSSH doesn't recognize).
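
(A quick way to make that edit, assuming GNU sed is available - it's often installed as gsed on illumos distributions - and the GOPATH layout from above:)

gsed -i '/IdentitiesOnly=yes/d' src/github.com/boot2docker/boot2docker-cli/util.go
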

Then you need to rebuild the project.

env GOPATH=`pwd` go clean github.com/boot2docker/boot2docker-cli
env GOPATH=`pwd` go build github.com/boot2docker/boot2docker-cli

Then you should be able to play around. First, download the base VM image it'll run:

./boot2docker-cli download

Configure VirtualBox:

./boot2docker-cli init

Start the VM:

./boot2docker-cli up

Log into it:

./boot2docker-cli ssh

Once in the VM you can run docker commands (I'm doing it this way at the moment, rather than running a docker client on the host). For example:

docker run hello-world

Or,

docker run -d -P --name web nginx
 
Shut the VM down:

./boot2docker-cli down

While this is interesting, and reasonably functional - certainly to the level of being useful for testing - a sign of the churn in the current container world is that the boot2docker cli is deprecated in favour of Docker Machine; building that looks to be rather more involved.

Wednesday, July 15, 2015

How to build a server

So, you have a project and you need a server. What to do?
  1. Submit a ticket requesting the server
  2. Have it bounced back saying your manager needs to fill in a server build request form
  3. Manager submits a server build request form
  4. Server build manager assigns the build request to a subordinate
  5. Server builder creates a server build workflow in the workflow tool
  6. A ticket is raised with the network team to assign an IP address
  7. A ticket is raised with the DNS team to enter the server into DNS
  8. A ticket is raised with the virtual team to assign resources on the VMware infrastructure
  9. Take part in a 1000-message, 100-participant email thread of doom arguing whether you really need 16G of memory in your server
  10. A ticket is raised with the storage team to allocate storage resources
  11. Server builder manually configures the Windows DHCP server to hand out the IP address
  12. Virtual Machine is built
  13. You're notified that the server is "ready"
  14. Take part in a 1000-message, 100-participant email thread of doom arguing that when you asked for Ubuntu, that's what you actually wanted, rather than the corporate standard of RHEL5
  15. A ticket is raised with the Database team to install the Oracle client
  16. Database team raise a ticket with the unix team to do the step of the Oracle install that requires root privileges
  17. A ticket is raised with the ops team to add the server to monitoring
  18. A ticket is raised with your outsourced backup provider to enable backups on the server
  19. Take part in a 1000-message, 100-participant email thread of doom over whether the system has been placed on the correct VLAN
  20. Submit another ticket to get the packages you need installed
  21. Move server to another VLAN, redoing steps 6, 7, and 11
  22. Submit another ticket to the storage team because they set up the NFS exports on their filers for the old IP address
There are actually a few more steps in many cases, but I think you get the idea.

This is why devops is a thing, streamlining (eradicating) processes like the above.

And this is (one reason) why developers spin up machines in the cloud. It's not that the cloud is better or cheaper (because often it isn't); it's simply to avoid dealing with the dinosaurs of legacy corporate IT departments, which only exist to prevent their users from getting work done.

My approach to this was rather different.

User: Can I have a server?

Me: Sure. What do you want to call it?

[User, stunned at not immediately being told to get lost, thinks for a moment.]

Me: That's fine. Here you go. [Types a command to create a Solaris Zone.]

Me: [Engages in a few pleasantries, to delay the user for a minute or two, so that the new system will be ready and booted when they get back to their desk.]
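
(For the curious, the command in question boils down to something like the following sketch - the zone name and path are invented for illustration, and on Tribblix the zap utility wraps the whole sequence up into a single command:)

zonecfg -z fred "create; set zonepath=/export/zones/fred; commit"
zoneadm -z fred install
zoneadm -z fred boot
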

Thursday, June 11, 2015

Badly targeted advertising

The web today is essentially one big advertising stream. Everywhere you go you're bombarded by adverts.

OK, I get that it's necessary. Sites do cost money to run, people who work on them have to get paid. It might be evil, but (in the absence of an alternative funding model) it's a necessary evil.

There's a range of implementations. Some subtle, others less so. Personally, I take note of the unsubtle and brash ones, the sort that actively interfere with what I'm trying to achieve, and mark them as companies I'm less likely to do business with. The more subtle ones I tolerate as the price for using the modern web.

What is abundantly clear, though, is how much tracking of your activities goes on. For example, I needed to do some research on email suppliers yesterday - and am being bombarded with adverts for email services today. If I go away, I get bombarded with adverts for hotels at my destination. Near Christmas I get all sorts of advertising popping up based on the presents I've just purchased.

The thing is, though, that most of these adverts are wrong and pointless. The idea that, because I searched for something or visited a website on a certain subject, I must be interested in the same things in the future is simply plain wrong.

Essentially, if I'm doing something on the web, then I have either (a) succeeded in the task at hand (bought an item, booked a hotel), or (b) failed completely. In either case, basing subsequent advertising on past activities is counterproductive.

If I've booked a hotel, then the last thing I'm going to do next is book another hotel for the same dates at the same location. More sensible behaviour for advertisers would be to prime the system to stop advertising hotels, and then advertise activities and events (for which they even know the dates) at my destination. It's likely to be more useful for me, and more likely to get a successful response for the advertiser. Likewise, once I've bought an item, stop advertising that and instead move on to advertising accessories.

And if I've failed in my objectives, ramming more of the same down my throat is going to frustrate me and remind me of the failure.

In fact, I wonder if a better targeting strategy would be to turn things around completely, and advertise random items excluding the currently targeted items. That opens up the possibility of serendipity - triggering a response that I wasn't even aware of, rather than trying to persuade me to do something I already actively wanted to do.