Monday, February 29, 2016

Updating gcc for Tribblix

One of the items on the Tribblix Roadmap was improving the system compiler.

I'm currently shipping a copy of gcc 4.8.3. When Tribblix was originally created, I inherited the old gcc 3.4.3 build from OpenIndiana. It took a modest amount of effort to get a copy of gcc v4 that worked (at one point I bailed out completely and rolled back to gcc3). Originally I had 4.7.2, but I settled on 4.8.3. And for many purposes that's worked out quite well.

It's starting to show problems, though. In no particular order, some are:

  • It's not fully compatible with modern C++ code. In particular, there are instances of modern code that simply keel over and complain that 'to_string isn't a member of std', and likewise for various other functions that ought to be supported.
  • There are places that have the path to the gcc install encoded. Given that each compiler version is installed to a different path, publishing packages that have a hardcoded dependency on the install location of this particular compiler build doesn't work so well. In many cases I can fix up any .la files that have errant paths in them, and do so as part of my build process.
  • For (bad) reasons associated with the way I transitioned from gcc3 to gcc4, the current gcc build has libssp, libgomp, and libquadmath disabled. I've recently come across builds that need the first two at least.
  • I've decided that it was a mistake to put the full version in the install prefix. It would have been better to use 4.8 rather than 4.8.3. (Yes, I know that gcc has an internal directory hierarchy that uses the full version number.)
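
As a rough illustration of the .la fix-up mentioned in the list above, here is a minimal sketch. The prefix in GCC_PREFIX is a made-up path and fix_la is a hypothetical helper name; the real build scripts aren't shown in this post and may well do it differently.

```shell
#!/bin/sh
# Hypothetical compiler prefix; the real install location isn't given here.
# Note it is used as a sed pattern, so dots in it match any character.
GCC_PREFIX="${GCC_PREFIX:-/opt/example/gcc-4.8.3}"

# Rewrite one libtool .la file so it no longer names the compiler's
# install directory: point at -lstdc++ instead of the compiler's own
# libstdc++.la, and drop the -L entry for the compiler's lib directory.
fix_la() {
  la="$1"
  tmp="$la.tmp.$$"
  sed -e "s|${GCC_PREFIX}/lib/libstdc++\.la|-lstdc++|g" \
      -e "s| -L${GCC_PREFIX}/lib||g" "$la" > "$tmp" && mv "$tmp" "$la"
}
```

Something like `find . -name '*.la'` could feed each file to fix_la at the end of a package build. Whether replacing the libstdc++.la reference with -lstdc++ is the right fix depends on the package, so treat that substitution as an assumption.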

Now, gcc has moved on in the meantime. I can see some basic options:

  • Update to gcc 4.8.5, fixing the build process to add the missing features and fix up any errant paths
  • Update to gcc 4.9.3, fixing the build process to add the missing features and fix up any errant paths
  • Update to whatever the latest gcc5 is.
  • Wait for gcc6 to come out and switch to that

I've been doing a little experimentation.

While I would like to keep as current as possible, waiting for gcc6 doesn't solve the problems I have now, so maybe not.

I've tried building gcc5 and had some components fail on me. With 5.1.0, the fortran build didn't work; with 5.3.0 it was gccgo that failed. Not having fortran is probably a showstopper. Losing gccgo less so, as we have golang proper. On x64, at any rate.

Another issue arguing against gcc5 is the C++ ABI changes. Not that they're a problem in themselves, but they make rolling back a real pain. I've already had to do a compiler rollback once, and it was painful. And given that I've seen problems with gcc5 builds (that I haven't been able to satisfactorily resolve), I'm wary of other problems cropping up.

Building 4.9.3 has been uneventful so far. I've enabled (or, rather, not disabled) libssp, gomp, and quadmath. I added obj-c++ as it didn't seem to hurt. I've disabled libitm. Generally, this brings me much closer to the OpenIndiana and OmniOS build settings.

Testing this compiler, the 'to_string isn't a member of std' family of errors is gone, which opens up more code (LibreOffice 5 in particular). And a handful of test builds seem to work just fine.

I have one issue to resolve before I can get this rolled out. It looks like the way that libgcc is linked has changed (looking at the dumpspecs output I can see it's changed) so that every binary ends up depending on libgcc_s. That's something I really don't want. And manually forcing -static-libgcc on every build seems wrong (and besides, there are builds where it's the wrong thing to do).
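
To see whether a given binary has picked up that dependency, something along these lines would do. It's only a sketch: links_libgcc_s is a made-up helper name, and all it does is look for libgcc_s in ldd-style output fed to it on standard input.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the ldd-style output on stdin lists
# libgcc_s as a dependency, and fails otherwise. Prints nothing.
links_libgcc_s() {
  grep -q 'libgcc_s\.so'
}

# Example use (the binary name is a placeholder):
#   ldd ./some-binary | links_libgcc_s && echo "depends on libgcc_s"
```

Running that over the output of a few test builds, before and after the compiler change, would show how widespread the new dependency is.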

Tribblix Roadmap

It seems like only yesterday, but I put together a Tribblix scorecard about a year ago.

Tribblix keeps improving, of course. There have been a number of enhancements to the available software, and this continues with a steady stream of package updates.

Actual releases have been relatively thin on the ground. I managed to get Milestone 15 out in April, and Milestone 16 in September. It's about time for another one, looking at that cadence, and I don't really want to go much slower.

What constitutes a release anyway? For Tribblix, there are 2 things that can only happen at a release.

The first is updating the underlying illumos build - this is slightly artificial, but you can't really do on the fly updates of illumos, whereas most of the applications can be updated just fine. So that leads to a distinction between updating packages and upgrading the release.

The second is any structural change to the distro itself, including the way it's packaged and the way packages are managed (so if zap changes, that's a new release).

Note that adding packages, or replacing them with new versions, doesn't generally involve a release. Even if it's a fairly major package, it will simply be available when it's built.

Looking forward, then, what do I have in the pipeline in terms of releases and changes aligned with them?

Overall, the general stability of the Tribblix pipeline, and the fact that I've managed to achieve many of the objectives I set myself initially, means that I'm reasonably close to calling it ready, and putting an actual 1.0 release together. What needs doing to make that a reality?

  • Fully fledged and aligned SPARC support would be good. Not that it's necessary to align the release dates or support every piece of software, but what I need to be sure of is that a SPARC release isn't going to need breaking changes to the rest of Tribblix.
  • Fixing the compiler baseline. I currently have a slightly tweaked gcc 4.8.3. I need to settle on a version (possibly a newer version) and untweak the configuration.
  • Making sure that upgrades are solid. At the present time, they mostly work, but are a little clumsy.

There's a lot of devil in those details, but there's nothing massive there. And the result will be an illumos distro with regular updates and lots of the functionality I expect.

Beyond that, I'm already making plans for Tribblix2. This is the opportunity for me to play with different ideas for how illumos could be used. And this ability to experiment with new ideas is one of the reasons I built my own distribution in the first place.

Think of Tribblix2 as more of a series of concepts rather than a single release vehicle. I can try things like removing 32-bit (and do it wholesale) to see what the benefits would be (and identify any drawbacks along the way). I can see how much of the legacy we currently ship in /usr can be ditched entirely. I can look further at projects like minimal viable illumos and minimal memory systems. It would be fun to see what happens if you apply some of the ideas behind Docker and Unikernels to illumos. Of course, successful experiments could lead to improvements trickling down to regular Tribblix.

OK, so it's not really a roadmap, but it should give an idea of where I'm heading.

Wednesday, February 17, 2016

Do suppliers want to go out of business?

The onrushing cloud behemoth seems destined to sweep many legacy IT suppliers aside, at least if you believe the pundits.

However, it's not just down to the (often imagined) inherent superiority of cloud computing. In many cases, IT suppliers have only themselves to blame.

Now, I have no business training, but even I can understand that making it easy for customers to buy your stuff is probably a good idea.

It's clear, though, that many companies obviously don't want to sell me stuff.

Starting with a website. That's what you do now. In the old days you might have gone to a trade show or looked in a magazine, and ended up with a brochure. No more - if I want product details, I'll look at your website. If your product details are sketchy, non-existent, out of date, inconsistent, vaporware, and generally devoid of technical content, then I'll look elsewhere. If you want me to register to view your technical documentation - even something as simple as the prerequisite system requirements - then I'll likely look elsewhere. If it's impossible to even guess what ballpark your prices are in, then I'll assume I can't afford it.

Then, please tell me how to buy the stuff. If you have resellers, have a list. Make sure that your resellers have actually heard of you. And keep that list up to date and accurate. If you sell direct, say so.

And last, but certainly not least, if a potential customer emails you - either direct or via that stupid form on your website that has a little postage stamp sized box to put the query in - showing an interest in buying your stuff, actually show some interest in selling your product. Answer the email, at the very least. Do it promptly. Do it accurately.

Looking for products recently, many supplier websites simply fail, completely. In some cases, there's a confusingly jumbled list of products that you have to visit individually to work out if they meet your requirements. Other times, they've decided to randomly segment their offerings into neat customer buckets, so I have to trawl through all their sections to find the product I want. One site had a useful handy product chart - with no hyperlinks, and listing a number of models that didn't exist as far as I can tell. Being forced to use a search engine to navigate a supplier's site is not without its problems - often you land on old pages with no indication whether the product is even current.

I've been trying to get a number of quotes recently. Maybe a third of suppliers and/or manufacturers are pretty good (thank you to all of you, by the way). Another third have such a dismal web presence that they're clearly not fit for the 21st century, so why would I even think of using them? The remaining third have simply ignored all attempts to contact them. Given how difficult it is for IT companies to survive in the current rapidly changing and highly competitive landscape, it staggers me what a poor job so many companies are doing.

Monday, February 15, 2016

On FOSDEM

I recently travelled to FOSDEM. Elsewhere I've talked about getting to Brussels and Back; here are some of my notes on the event itself.

I stayed in a hotel that was out of the centre, about a third of the way to the campus. This meant I could walk there and back, which made up for missing my regular morning swim. Shame it was so damp and drizzly.

On Friday evening, a bunch of illumos aficionados met up and went to Manhattns for dinner, before heading off to the beer event. The Delirium Cafe wasn't quite as packed as I expected, although the queues at the bars were pretty long. There was a qualifying question to gain entry as a FOSDEM attendee - what's your favourite distribution? What, you've never heard of Tribblix?

I went to quite a few talks. One thing that was managed extremely well was that the talks ran to time. There are a lot of rooms and tracks, but they do a very good job of sticking to the timetable. What this means is that if you leg it across the campus for a talk you want to see, you can be pretty confident that it'll be on when it says it will.

I went to Mark Reinhold's talk on The State of OpenJDK. Interesting to see what the current focus is and where they're heading. Of particular interest to me was project Panama, aiming to supplant the user-hostile JNI as a bridge to native code.

Then Dalibor and Rory on Preparing for JDK9. Apart from all the changes coming up, the one thing that I noted was the version string changes. I also learnt about the jdeps tool, which could be very useful.

Changing tack webwards, I learnt about telemetry in Firefox, and telemetry.mozilla.org. Following that, more on HTTP/2 - 30% adoption is pretty good after less than a year, but I guess that's a reflection of how web traffic is dominated by a relatively small number of sites. There's a huge long tail of small sites that are going to take much longer to migrate, if ever. And one thing I didn't know is that client certificates aren't yet supported in HTTP/2, which is a bit of a pain.

In between, I spent time going round the various project stands. We had an illumos booth, and it would have been nice to spend more time at that.

I spent a lot of Sunday in the main Janson lecture theatre.

First up, Re-thinking Linux Distributions. Or, as I interpreted it, moving on from package management as the defining characteristic of a distribution. This is a subject I'm deeply interested in, as it forms part of my thoughts about severing the link between applications and the OS, and thinking about software stacks as a useful unit.

Then, Reproducible Builds. Although it's not really about reproducible builds so much as reproducible package archives. The two aren't necessarily the same. For example, IPS doesn't even have package archives, and is only interested in changes to the binary content rather than to irrelevant metadata. And we have tooling like wsdiff to identify changes in illumos builds. Still, knowing that your build is completely reproducible is a goal we ought to work towards, although we may end up with a slightly different slant on the subject.

Next, Dan talked about illumos at 5, even mentioning yours truly. And it was great to talk to Thomas from opencsw after the talk.

The State of Go was looking forward to the forthcoming release of Go 1.6. I was motivated to test the release candidate on illumos, and was pleased to see that rc2 builds and runs just fine.

One of the most interesting talks was on LibreOffice Online. How it works (tiling like online map viewers) and some of the waste they have managed to eliminate. Something else I picked up on was the potential for LibreOfficeKit to expose a simple API for other tools to talk to.

It was a busy weekend, and quite focussed. There wasn't as much casual chat as I might have liked. To take advantage of FOSDEM, you have to be organised - plan your schedule out in advance. If I go next year I'll try and give a Tribblix lightning talk, just to get a bit of exposure.


Tuesday, February 09, 2016

Building influxdb and grafana on Tribblix

For a while now I've been looking at alternative ways to visualize kstat data. Something beyond JKstat and KAR, at least.

An obvious thought is: there are many time-series databases being used for monitoring now, with a variety of user-configurable dashboards that can be used to query and display the data. Why not use one of those?

First the database. For this test I'm using InfluxDB, which is written, like many applications are these days, in Go. Fortunately, Go works fine on illumos and I package it for Tribblix, so it's fairly easy to follow the instructions. Make a working directory, cd there, and:

export GOPATH=`pwd`
go get github.com/influxdb/influxdb 
cd $GOPATH/src/github.com/influxdb/influxdb
go get -u -f -t ./...
go clean ./...
go install ./...
 
(Note that it's influxdb/influxdb, not influxdata/influxdb. The name was changed, but the source and the build still use the old name.)

That should just work, leaving you with binaries in $GOPATH/bin.

So then you'll want a visualization front end. Now, there is Chronograf. Unfortunately it's closed source (that's fine, companies can make whatever choices they like) which means I can't build it for Tribblix. The other obvious path is Grafana.

Building Grafana requires Go, which we've already got, and Node.js. Again, Tribblix has Node.js, so we're (almost) good to go.

Again, it's mostly a case of following the build instructions. For Grafana, this comes in 2 parts. The back-end is Go, so make a working directory, cd there, and:

export GOPATH=`pwd`
go get github.com/grafana/grafana
cd $GOPATH/src/github.com/grafana/grafana
go run build.go setup
$GOPATH/bin/godep restore
go run build.go build
 
You'll find the Grafana server in $GOPATH/src/github.com/grafana/grafana/bin/grafana-server

The front-end involves a little variation to get it to work properly. The problem here is that a basic 'npm install' will install both production and development dependencies. We don't actually want to do development of Grafana, which ultimately requires webkit and won't work anyway. So we really just want the production pieces, and we don't want to install anything globally. But we still need to run 'npm install' to start with, as otherwise the dependencies get messed up. Just ignore the errors and warnings around PhantomJS.

npm install
npm install --production
npm install grunt-cli
./node_modules/.bin/grunt --force

With that, you can fire up influxd and grafana-server, and get them to talk to each other.

For the general aspects of getting Grafana and InfluxDB to talk to each other, here's a tutorial I found useful.

Now, with all this in place, I can go back to playing with kstats.
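
As a starting point for that, here is a minimal sketch of turning kstat data into something InfluxDB will accept. It assumes input in the parseable form printed by kstat -p (module:instance:name:statistic, a tab, then the value) and emits InfluxDB line protocol. The measurement name kstat, the tag names and the helper name kstat_to_line are all my own choices for illustration, and non-numeric statistics are simply skipped.

```shell
#!/bin/sh
# Convert "module:instance:name:statistic<TAB>value" lines on stdin into
# InfluxDB line protocol, one point per numeric statistic.
kstat_to_line() {
  awk -F'\t' '
    # Backslash-escape spaces, commas and equals signs, which are
    # special in line protocol tag values and field keys.
    function esc(s,   out, i, c) {
      out = ""
      for (i = 1; i <= length(s); i++) {
        c = substr(s, i, 1)
        if (c == " " || c == "," || c == "=") out = out "\\"
        out = out c
      }
      return out
    }
    {
      # Skip anything that is not a four-part key with a numeric value.
      if (split($1, k, ":") != 4) next
      if ($2 !~ /^-?[0-9]+(\.[0-9]+)?$/) next
      printf "kstat,module=%s,instance=%s,name=%s %s=%s\n",
             esc(k[1]), esc(k[2]), esc(k[3]), esc(k[4]), $2
    }'
}
```

The output of something like `kstat -p unix:0:system_misc | kstat_to_line` could then be sent to the InfluxDB HTTP write endpoint, for example with curl and --data-binary against /write?db=NAME on the influxd port, where the database name is whatever you created beforehand.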