Thursday, July 28, 2011

A bigger hammer than svcadm

One advantage of the service management facility (SMF) in Solaris is that you can treat services as single units: you can use svcadm to control all the processes associated with a service in one command. This makes dealing with problems a lot easier than groping through ps output trying to pick out the right processes to kill.
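
In the normal case that's a single command; restarting everything belonging to a service (using the same service as the example later in this post) is just:

svcadm restart svc:/network/server:rogue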

Sometimes, though, that's not enough. If your system is under real stress (think swapping itself to death with thousands of rogue processes) then you can find that svcadm disable or svcadm restart simply don't take.

So, what to do? It's actually quite easy, once you know what you're looking for.

The first step is to get details of the service using svcs with the -v switch. For example:

# svcs -v server:rogue
STATE NSTATE STIME CTID FMRI
online - 10:06:05 7226124svc:/network/server:rogue
(you'll notice a presentation bug here). The important thing is the number in the CTID column. This is the contract ID. You can then use pkill with the -c switch to send a signal to every process in that process contract, which defines the boundaries of the service. So:

pkill -9 -c 7226124
and they will all go. And then, of course, SMF will neatly restart your service for you automatically.

(Why use -9 rather than a friendlier signal? In this case, it was because I just wanted the processes to die. Asking them nicely involves swapping them back in, which would take forever.)
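
As an aside, if you want to see exactly what you're about to kill before pulling the trigger, pgrep takes the same contract switch, so something like

pgrep -l -c 7226124

should list the processes in the contract first.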

Tuesday, July 26, 2011

The Oracle Hardware Management Pack

I've recently acquired some new servers - some SPARC T3-1s and some x86 based X4170M2s.

One of the interesting things about these is that the internal drives are multipathed by default - so you get device names like c0t6006016021B02C00F22A3EED6CADE011d0s2 rather than the more traditional c0t0d0s2.

This makes building a jumpstart profile a bit more tedious than normal: because the device names are different on each box, you need a separate disk configuration section for every one.
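
As a rough sketch, the per-host disk section of a profile ends up looking something like this (reusing the example device name from above; the slice layout and sizes are just placeholders):

partitioning    explicit
filesys         c0t6006016021B02C00F22A3EED6CADE011d0s1 8192 swap
filesys         c0t6006016021B02C00F22A3EED6CADE011d0s0 free /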

However, there's another minor problem. How do you easily map from the WWN-based device names to physical positions in the chassis? You really need this so you're sure you're swapping the right drive. And while a SPARC system really doesn't mind which disk it's booting from, for an x86 system it helps if you install the OS on the first disk in the BIOS boot order.

The answer is to install the Oracle Hardware Management Pack. (Why this isn't even on the preinstalled image I can't explain.) This seems to work on most current and recent Sun server models.

Now, actually getting the Hardware Management Pack isn't entirely trivial. So prepare to do battle with the monstrosity called My Oracle Support.

So, you're logged in to My Oracle Support. Click the Patches & Updates tab. In the Patch Search area, click the link marked 'Product or Family (Advanced)'. Then scroll down the dropdown list and select the item that says 'Oracle Hardware Management Pack'. Then choose some of the most recent releases (highest version numbers - note that different hardware platforms match different version numbers of the software) and select your desired platform (essentially, SPARC or X86 or both) from the dropdown to the right of where it says 'Platform is'. Then hit the Search button.

Assuming the flash gizmo hasn't crashed out on you (again), you should get a list of patches. No, I have no idea why they're called patches when they're not. You can then click on the one you want and download it.

What you get is a zip file, so you can unzip that, cd into it and then into the SOFTWARE directory inside it, and then run the install.bin file you find there. (You may have to chmod the install.bin file to make it executable.) I just accept all the defaults and let it get on with it.
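
As a rough outline (the zip and directory names here are just placeholders for whatever the download is actually called):

unzip ohmp-download.zip
cd ohmp-download/SOFTWARE
chmod +x install.bin
./install.bin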

On a preinstalled system it may claim it's already installed. It probably isn't - just 'pkgrm ipmitool' first. And if you're using your own jumpstart profile, make sure the SUNWCsma cluster is installed. It may be necessary to wait a while and then 'svcadm restart sma' to get things to take the first time.
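
In command terms, that's something like the following on a preinstalled box:

pkgrm ipmitool
svcadm restart sma

and, for jumpstart, a profile line along the lines of (SUNWCsma being the cluster mentioned above):

cluster SUNWCsma add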

So, once it's installed, what can you do?

The first thing is that there's a Storage tab in the ILOM web interface. Go there once you've got the hardware management pack installed and you should be able to see the controllers and disks enumerated.

On the system itself, the raidconfig command is very useful. Something like

raidconfig list all
will give you a device summary, and

raidconfig list disk -c c0 -v

will give a verbose listing of the disks on controller c0. (And, just to remind you, the c0 in c0t6006016021B02C00F22A3EED6CADE011d0s2 doesn't refer to physical controller 0.)

The hardware management pack is really useful - if you're running current generation Sun T-series or X-series hardware, you ought to get it and use it.

Friday, July 08, 2011

CPU visualization

Over the years, simple tools like cpustate have allowed you to get a quick visualization of cpu utilization on Solaris. It's sufficiently simple and useful that emulating it was one of the first demos I put together using JKstat.

Its presentation starts to suffer as core and thread counts continue to rise. So recently I added vertical bars as an option. This allows a more compact representation, and also works better given that most screens are wider than they are tall.

Still, even that doesn't work very well on a 128-way machine. And, by treating all cpus equally, you can't see how the load is distributed in terms of the processor topology.

However, as of the new release of SolView there's now a 'vertical' option for the enhanced cpustate demo included there.

So, what's here? This is a 2-chip system (actually an X4170M2); each chip has 6 cores, each of which has 2 threads. The chips are stacked above each other, and within each chip is a smaller display for each core, and within each core are its threads. Everything is grouped to show how the threads and cores are related, and each core and chip has its own aggregate display.

Above is a T5140 - 2 chips, each with 8 cores, each core with 8 threads.

I find it interesting to watch these for a while; you can see how work is scheduled. What you normally see is an even spread across the cores: if there's an idle core, work gets sent there rather than running on a thread of a busy core and competing for shared resources. (Not always: you can see on the first graph that there's one idle core and another core with both threads busy, which is unusual.) The scheduler is clearly aware of the processor topology and generally distributes the work pretty well to make best use of the available cores.