Monday, February 18, 2008

If only!

Tried this on a test server:

# psrinfo -v
Status of virtual processor 0 as of: 02/18/2008 16:37:34
on-line since 01/21/2008 10:00:38.
The sparcv9 processor operates at 2793 MHz,
and has a sparcv9 floating point processor.

We can dream, can't we?

What's actually happening here is that I'm trying out the SPARC emulator from Transitive (which now runs on Solaris x86), and it's reporting the clock speed of the Opteron processors in the box.

Python considered harmful

I used to think that using Java was like being in league with the memory suppliers. Then I looked at what Python was doing:


  PID USERNAME LWP PRI NICE  SIZE   RES STATE    TIME    CPU COMMAND
10946 fred       1  60    0 3692M  270M sleep   25:34  0.00% python
19559 joe        1  59    0 1950M  836M sleep   17:21  0.00% python
20738 bill       1  59    0 1787M 1608M sleep    3:28  0.00% python

That's on a machine with 4G of physical memory. Given that Python is being used more widely, and that data volumes are increasing, I need to do a couple of things. First, order more memory; second, work out how to build a 64-bit copy of Python with all the modules we use.
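As an aside, the 3692M process above is close to the 4 GiB ceiling of a 32-bit address space, which is presumably why a 64-bit build matters. A minimal sketch of how to check which kind of interpreter you actually have, assuming a current Python 3 (the function names are mine, purely for illustration):

```python
import struct
import sys


def pointer_bits() -> int:
    """Return the width of a C pointer in this interpreter, in bits."""
    # "P" is the struct format code for a native void* pointer.
    return struct.calcsize("P") * 8


def is_64bit() -> bool:
    """Return True when this interpreter is a 64-bit build."""
    # sys.maxsize is 2**31 - 1 on 32-bit builds and 2**63 - 1 on 64-bit ones.
    return sys.maxsize > 2**32


if __name__ == "__main__":
    print(f"pointer width: {pointer_bits()} bits, 64-bit build: {is_64bit()}")
```

Any compiled extension modules have to be built at the same width as the interpreter, so a check like this is only the first step.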

All-in-one servers

One recurring theme as I build servers is that I often want a configuration that's not available.

Sun are the worst culprit: while they generally have excellent products, the actual range of configuration options is rather limited.

And one of the reasons I was interested in the X4150 in the first place was the ability to have 8 internal disk drives. For many things I would prefer internal storage, if I can get it.

So what is wrong with external storage? Well, it can be very expensive, because you need to get a chassis, maybe RAID controllers, and HBAs, not to mention the extra rack space, power cords, cables, and having separate boxes to manage and monitor. And if all you want is a few hundred gig of space, then it's just not worth it.

So consolidate on a SAN, you say. Maybe, but SAN storage itself is rather expensive. For small amounts, the cost of HBAs and the fibre infrastructure can be prohibitive.

I've been attracted to iSCSI, but while it does work well, it's limited to low-bandwidth, light-use scenarios. (I just have regular gigabit ethernet.)

So a solution where the storage can fit neatly in the server is very attractive. At the moment I have something that takes about 500G, but is likely to grow slightly. So I think I need about a terabyte, and it's got to be reasonably quick, faster than iSCSI anyway.
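To put a rough number on "faster than iSCSI", here is a back-of-the-envelope sketch of the ceiling that a single gigabit link imposes. It uses the raw line rate only and ignores TCP/IP and iSCSI protocol overhead, so real throughput will be lower; the function names are illustrative.

```python
def line_rate_mb_per_s(gigabits_per_s: float) -> float:
    """Raw link rate in megabytes per second (decimal units, no overhead)."""
    return gigabits_per_s * 1e9 / 8 / 1e6


def transfer_hours(data_gb: float, gigabits_per_s: float) -> float:
    """Hours needed to move data_gb gigabytes at the raw link rate."""
    bytes_per_second = gigabits_per_s * 1e9 / 8
    return data_gb * 1e9 / bytes_per_second / 3600


if __name__ == "__main__":
    # Gigabit ethernet tops out at 125 MB/s before any protocol overhead.
    print(f"{line_rate_mb_per_s(1.0):.0f} MB/s")
    # Best case for a single pass over the current 500G data set.
    print(f"{transfer_hours(500, 1.0):.2f} hours")
```

So even in the best case, one pass over the current data takes over an hour across the wire, which a handful of local spindles should be able to beat.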

So looking at the X4150, I would probably use the first 2 drives for the OS and use the upmarket RAID card to create a RAID-5 array across the other 6 drives. So that's 5 drives' worth of data, or about 700G.
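The arithmetic behind that figure can be sketched as follows. RAID-5 gives up one drive's worth of capacity to parity, so usable space is (drives - 1) times the drive size. The 146 GB drive size is an assumption on my part (it is not stated above, but it is consistent with the "about 700G" result), and the function name is illustrative.

```python
def raid5_usable_gb(drives: int, drive_gb: float) -> float:
    """Usable capacity of a RAID-5 array: one drive's worth goes to parity."""
    if drives < 3:
        raise ValueError("RAID-5 needs at least 3 drives")
    return (drives - 1) * drive_gb


# Assumption: 146 GB 2.5" drives; the size is not given in the text.
ASSUMED_DRIVE_GB = 146

if __name__ == "__main__":
    usable = raid5_usable_gb(6, ASSUMED_DRIVE_GB)  # 5 data drives x 146 GB = 730 GB
    # Drive vendors count in decimal gigabytes; the OS usually reports binary ones.
    usable_gib = usable * 1e9 / 2**30
    print(f"{usable:.0f} GB decimal, about {usable_gib:.0f} GiB as the OS sees it")
```

Under that assumption the array lands a little under 700G once the decimal-versus-binary difference is taken out, before any filesystem overhead.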

Close, but it's not quite close enough. It's just a little bit tight. It would be nice to have larger drives, but as Ben has discovered, larger capacity drives in the small (2.5") form factor just can't be had.

So to go beyond that, you either move up to regular 3.5" drives (which get you 15K rpm and 300G or 450G capacities), or you need more than 8 internal drives. In either case that implies a larger chassis. (Sun have an X4450, which is the big brother of the X4150, but that's identical as far as supported drives are concerned.)

Looking at what Sun have available, there really isn't anything. And no, I don't want a thumper, not for this application anyway. (It's a shame that there aren't more variations on the thumper theme.)

Ho hum. Off to see if I can find something different. A Dell 2900 is a lumpy tower. What about an HP ProLiant DL580 G5? (The DL 320s might work, but the max memory is a little too small at a mere 8G.)

Anyone care to suggest alternative options? (Must run Solaris!)

Sunday, February 17, 2008

X4150 experiences.

It's clear from the comments I've had that I'm not the only one who's had issues with X4150s.

Those of us who have been playing this game for many years manage to surmount these minor obstacles, so we can take advantage of the good things that these servers have to offer. But it does worry me that customers without the experience (or confidence and contacts) to get past the irritating issues lose out on what could be a good solution, and that a supplier loses out on a potential sale.

Saturday, February 09, 2008

X4150: lit up

So I've had the X4150 booted and running for about a day now, and it's managed getting on for 8 CPU-days' worth of work so far.

It's looking good. On the workload we've tested so far, it seems to be, core for core, about as fast as - if not marginally faster than - a comparable Opteron box, such as the X4200. Overall it can chew through twice the workload because it has twice the cores.

It was a bit of a drag getting there, but I think it was worth the effort.

Friday, February 08, 2008

X4150: where do I boot from?

Seems my earlier optimism was a tad misplaced. I come back a short time later and the X4150 is sitting there at the interactive prompt you get when doing a network boot. So the install finished just fine but then it booted off the network again.

(See: I knew it was a good thing not to default to install. It could have installed itself over and over in a loop.)

The cause is reasonably obvious - either the drive isn't listed as bootable or the boot order in the BIOS is wrong.

This is where Ben's tip saved me either a bit of fiddling or a walk down to the machine itself.

I get into the BIOS, and the boot order shows me the DVD, the 4 network ports, and the disk I installed to. In that order. OK, I move the disk up and reboot.

Success! We boot off the hard disk.

(There's still an open question here: I want to have all four disks bootable just in case I lose one, and I'll set them up as two sets of mirrors - one for running, and one for live upgrade. I still need to work out how to add the rest.)

X4150: No disks found

So I've managed to get to the point where I can control my new X4150 using the SP, and can get to the system console.

I have the address set in DHCP and my jumpstart server configured, so I let it boot up to see what happens.

(OK. So I goofed when typing the MAC address into the DHCP server the first time. So it took me an extra attempt.)

Just as an aside, the X4150 is like the X2200M2 and requires the serial port set to 115200 baud, so I've rebuilt the boot image it gets from the jumpstart server just like it says in the documentation.

Off it goes, boots up, I hit 2 for jumpstart (I don't default this so that I don't accidentally overwrite a system if it boots off the network by mistake or for maintenance), and it's going well. What normally happens is that it complains the disk in my jumpstart profile isn't valid on my system.

Not this time. "No disks found."

That's not good. One nice thing about the X4150 is the 8 disk bays on the front, and I know that 4 of them are occupied. So why can't it see them?

I was shipped an HBA and a cable kit, but hadn't found any documentation on why I would need them or how to install them. So maybe you actually do need an additional HBA to make it work. That makes me wonder why on earth this requirement isn't prominently documented, why there's no functioning on-board disk controller, and why the disk bays have been carefully cabled up to the on-board connectors when that's not going to work.

OK, so I pull out the old cables and put in the replacement ones. Looks like the HBA has to go in the middle slot, otherwise it fouls the memory slots. I hope I have the cables in right and routed the correct way.

Wonder of wonders, I boot up and all the disks are visible. Jumpstart tells me the disk isn't valid on this system, but that's a trivial fix to the jumpstart profile to get the controller numbering correct and off the installation goes.

Pretty quickly too!

Thursday, February 07, 2008

LOM: Consistency would be nice

Most of Sun's x64 servers have some sort of LOM, but while they're generally superficially the same (they call themselves the SP, they have a similar /SP and /SYS layout), all the different models differ in the details.

And we all know that that's where the devil is....

Why on earth are the steps to set the IP address subtly different between systems, for example?

X4150: talk to me!

OK, so the first step is to connect up the cables, power the server on, and configure the SP.

So I do that - system and management networks, connect up the serial port, tip in, and apply power.

Silence. Not a peep. Not a sausage.

Come on, now. Talk to me!

Now I think this is OK, because the same tip session from the same host works flawlessly on every other Sun x64 server I have. But still I go through the motions - different host, different cable, various types of cable.

It's still sulking.

I give Sun Service a call. I don't think I'm doing anything wrong, so we'll see what they have to say - maybe I've got a faulty unit.

Turns out there's a problem with some units shipped with the wrong settings. If this happens to you, connect up a real keyboard and monitor, and power up the box (the real box, not just the standby power to the SP). Hit F2 to go into setup, and look through the settings. The "external serial port" should be set to SP. If it's set to system, hit F9 to restore defaults, and F10 to save (I think it's those function keys). I get back to my desk and there's the regular SP login prompt.

Thanks to James of Sun Support for tracking that one down for me.

Server deja vu

Over a year ago, I had a great deal of fun and games when we got some Sun X2100M2 systems:

On to the X2100 M2
Me versus the M2
The M2 comes alive
Server Wars: The M2 strikes back

Well, it looks like I have another battle on my hands. After a stack of flawless installs of X4200 and X4500 boxes, we decided it was worth getting an X4150 to see what the Xeon offered. Our cpu-hungry applications love the idea of quad-core; our data storage likes the idea of 8 disks in a 1U system.

Installing this thing isn't going well. Watch this space...