Tuesday, December 23, 2014

Setting up Logical Domains, part 1

I've recently been playing with Logical Domains (aka LDOMs, aka Oracle VM Server for SPARC). For those unfamiliar with the technology, it's a virtualization framework built into the hardware of pretty well all current SPARC systems, more akin to VMware than Solaris zones.

For more information, see here, here, or here.

First, why use it? Especially when Solaris has zones. The answer is that it addresses a different set of problems. Individual LDOMs are more independent and much more isolated than zones. You can partition resources more cleanly, and different LDOMs don't have to be at the same patch level (to my mind, it's not that you can have a different level of patches in each LDOM that matters, but that you can do maintenance of each LDOM to different schedules that matters). One key advantage I find is that the virtual switch you set up with LDOMs is much better at dealing with complex network configuration (I have hosts scattered across maybe dozens of VLANs, trying to fake that up on Solaris 10 is a bit of a bind). And some applications don't really get on with zones - I would build new systems around zones, but ill-understood and poorly documented legacy systems might be easier to drop inside an LDOM.

That dealt with, here's how I setup up one of my machines (a T5140, as practice for live deployment on a bunch of replacement T4-1 systems) as an LDOM host. I'll cover setting up the guest side in a second post.

Make sure the firmware is current - these are the minimum revs:

T2 - 7.4.5
T3 - 8.3
T4 - 8.4.2c

Then install the LDOM software.

cd /var/tmp
unzip p17291713_31_SOLARIS64.zip
cd OVM_Server_SPARC-3_1/Install

You'll get asked if you want to launch the configuration assistant after installation. I chose n, and you can run ldmconfig at any later time. (If you want - it's best not to.)

Now need to apply the LDOM patch

svcadm disable -s ldmd
patchadd 150817-02
svcadm enable ldmd

Verify things are working as expected:

# ldm list
primary          active     -n-c--  SP      128   32544M   0.1%  16m

You should see the primary domain.

The next step is to establish default services and configure the control domain.

The necessary services are:

Virtual console
Virtual disk server
Virtual switch service

ldm add-vds primary-vds0 primary
ldm add-vcc port-range=5000-5100 primary-vcc0 primary
ldm add-vsw net-dev=nxge0 primary-vsw0 primary

verify with

ldm list-services primary

And we need to limit the control domain to a limited set of resources. The way I do this (this is just a personal view) is, on a system with N cores, to define N units each with 1 core with 1/N of the total memory. Assign one of those units to the primary domain, and then build the guest domains with 1 or more of those units (sizing as necessary - note that you can resize them on the fly so you don't have to get it perfect first time around). You can get very detailed and start allocating down to individual threads and specific amounts of memory, but it's so much better to just keep it simple.

For a T2/T2+/T3 you need to futz with crypto MAUs. This is unnecessary on later systems.

To show:

ldm list -o crypto primary

To add

ldm set-mau 1 primary

To set the CPU resources of the control domain

ldm set-core 1 primary

(I want to just set 1 core. Allocation best done by cores, not threads.)

Start the reconfig

ldm start-reconf primary

Fix the memory

ldm set-memory 4G primary

Save the config to the SP

ldm add-config initial

verify the config has been saved

ldm list-config

Reboot to activate

shutdown -y -g0 -i6

OK, so that creates a 1-core (8-thread) 4G control domain. And that all seems to work.

Next steps are to configure networking and enable terminal services. From the console (as you're reconfiguring the primary network):

ifconfig vsw0 plumb
ifconfig nxge0 down unplumb
ifconfig nxge0 inet6 down unplumb
ifconfig vsw0 inet netmask broadcast up
ifconfig vsw0 inet6 plumb up
mv /etc/hostname.nxge0 /etc/hostname.vsw0
mv /etc/hostname6.nxge0 /etc/hostname6.vsw0

For a T4, replace nxge with igb.

At this point, you have a machine with minimal resources assigned to the primary domain, which looks for all the world like a regular Solaris box, ready to create guest domains using the remaining resources.

1 comment:

Anonymous said...

Solaris 10..... c'mon!
Go Solaris 11 and then Kernel Zones and/or LDOMS