But if you do want to place some limits on a zone, then the zone configuration offers a couple of options.
First, you can simply allocate some CPUs to the zone:
add dedicated-cpu
set ncpus=4
end
Or, you can cap the cpu utilization of the zone:
add capped-cpu
set ncpus=4
end
I normally put all the configuration commands for a zone into a file, and use
zonecfg -f
to build the zone; if modifying a zone then I create a fragment like the above and load that the same way.
In terms of stopping a zone monopolizing a machine, the two are fairly similar. Depending on the need, I've used both.
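As a rough sketch of the fragment-file workflow, the file might look like the following. The zone name (webzone) and the file name (webzone-cpu.cfg) are made up for illustration, and the full zonecfg invocation is shown in the leading comment.

```
# webzone-cpu.cfg (hypothetical file name)
# Load with: zonecfg -z webzone -f webzone-cpu.cfg
# where webzone is a placeholder zone name.
add capped-cpu
set ncpus=4
end
```

If the zone already has a capped-cpu resource, you would select it and change ncpus rather than add it again. Changes made this way go into the stored configuration, so they typically take effect the next time the zone boots.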
When using dedicated-cpu, it's not just a limit but a guarantee. Those cpus aren't available to other zones. Sometimes that's exactly what you want, but it does mean that those cpus will be idle if the zone they're allocated to doesn't use them.
Also, with dedicated-cpu, the zone thinks it's only got the specified number of cpus (just run psrinfo to see). Sometimes this is necessary for licensing, but there was one case where I needed this for something else: consolidating some really old systems running a version of the old Netscape Enterprise Server, and it would crash at startup. I worked out that this was because it collected performance statistics on all the cpus, and someone had decided that hard coding the array size at 100 (or something) would cover all future possibilities. That was, until I ran it on a T5140 with 128 cpus and it segfaulted. Just giving the zone 4 cpus allowed it to run just fine.
I use capped-cpu when I just want to stop a zone wiping out the machine. For example, I have a machine that runs application servers and a data build process. The data build process runs only rarely, but launches many parallel processes. When it had its own hardware that was fine: the machine would have occasional overload spikes but was otherwise OK. When shared with other workloads, we didn't want to change the process, but we now have the build zone capped at 30 or 40 cpus (on a 64-way system) so there's plenty of cpu left over for other workloads.
One advantage of stopping runaways with capped-cpu is that you can limit each zone to, say, 80% of the system, and you can do that for all zones. It looks like you're overcommitting, but that's not really the case - uncapped is the same as a cap of all the cpus, so you're lower than that. This means that any one zone can't take the system out, but each zone still has most of the machine if it needs it (and the system has the available capacity).
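To make the 80% idea concrete, here is a minimal sketch of the fragment you might load into every zone. It assumes a hypothetical 64-cpu machine, so the cap works out to 0.8 * 64 = 51.2; capped-cpu accepts a decimal value for ncpus, so the fraction is fine.

```
# 80% cap on an assumed 64-cpu machine: 0.8 * 64 = 51.2
# (the machine size is illustrative, not a recommendation)
add capped-cpu
set ncpus=51.2
end
```

Loading the same fragment into each zone means the caps add up to far more than the machine has, but any single zone still leaves roughly 13 cpus for everything else.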
The capability to limit memory also exists. I haven't yet had a case where that's been necessary, so have no practical experience to share.
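For reference only, the zonecfg syntax for a memory cap looks roughly like this. The sizes are arbitrary placeholders rather than tested recommendations, and as far as I know the physical cap is enforced by the resource capping daemon (rcapd), so treat this as a sketch of the syntax and not as advice.

```
# Illustrative values only; not taken from a real deployment.
add capped-memory
set physical=8g
set swap=16g
end
```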
2 comments:
Personally I always tended to use "cpu-shares". Give the global zone (say) 100 shares, and each non-global zone 20, and this gives reasonable assurances that a zone won't take out the system.
This also allows 'burstable' performance, so that if most of the system is idle, but one zone needs more CPU it can get it. But, if other zones come alive and need CPU as well, they all balance out to proportionately shared resources.
Using "capped-cpu" on top of this is also possible of course.
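A minimal sketch of the shares approach for one non-global zone might look like the following, loaded with zonecfg -f in the same way as the fragments in the post. Shares only have an effect under the fair share scheduler (FSS), so the fragment assumes you also want the zone's processes in that scheduling class; the 100 shares for the global zone would be set separately and are not shown here.

```
# Per-zone fragment: 20 shares, matching the example figures above.
# Shares are relative weights and only matter under FSS.
set scheduling-class=FSS
set cpu-shares=20
```

With the global zone at 100 shares and, say, five such zones at 20 each, a zone that is the only busy one can still use the whole machine; the weights only come into play when there is contention for CPU.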
The other resource to always set is "max-lwps". I've had cases where bugs / accidental fork bombs have taken out entire physical machines because one zone gobbled up all the processes (even if the CPU/s was mostly idle).
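As a sketch, the limit is a single zone-wide property in the zone configuration. The value below is a placeholder; a sensible number depends on what the zone actually runs.

```
# Cap the total number of LWPs (threads) in the zone.
# 2000 is an arbitrary illustrative value.
set max-lwps=2000
```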
There are some good examples of how to limit cpu usage in a zone in Oracle's white papers.
See 'Resource Partitioning with Pools'
in the white paper “Effective Resource Management Using Oracle Solaris Resource Manager”
This is part 2 of a 4 part series
Part 1: “Introduction to Resource Management in Oracle Solaris and Oracle Database”
http://www.oracle.com/technetwork/articles/servers-storage-admin/o11-054-intro-rm-419298.pdf
Part 2: “Effective Resource Management Using Oracle Solaris Resource Manager”
http://www.oracle.com/technetwork/articles/servers-storage-admin/o11-054-intro-rm-419298.pdf
Part 3: “Effective Resource Management Using Oracle Database Resource Manager”
http://www.oracle.com/technetwork/articles/servers-storage-admin/o11-056-oracledb-rm-419380.pdf
Part 4: “Resource Management Case Study for Mixed Workloads and Server Sharing”
http://www.oracle.com/technetwork/articles/servers-storage-admin/o11-054-intro-rm-419298.pdf