Wednesday, July 08, 2020

Customizing EC2 instance storage and networking with the AWS CLI

I use AWS to run illumos quite a bit, either with Tribblix or OmniOS.

Creating EC2 instances with the console is fine for one-offs, but gets a bit tedious. So using the AWS CLI offers a better route, with the ec2 run-instances command.

Yes, there are things like templates and terraform and all sorts of other options. For whatever reason, they don't work in all cases.

In particular, the reasons you might want to customize an instance when running illumos can be slightly different from those in a more traditional usage model.

For storage, there are a couple of customizations we might want. The first is that the AMI has a fairly small root disk, which we might want to make larger. We may be adding zones, with their root filesystems installed on the system pool. We may be adding swap (while anonymous reservation means applications like Java don't need to write to swap, the space backing the swap still has to be available). The second is that we might actually want to use EBS to provide local storage (so we can use ZFS, for example, which has data integrity and manageability benefits).

To automate the enlargement of the root pool, I create a mapping file that looks like this:

[
  {
    "DeviceName": "/dev/xvda",
    "Ebs": {
      "VolumeSize": 12,
      "Encrypted": true
    }
  }
]

The size is in GiB. /dev/xvda is the normal device name as EC2 sees it (illumos, of course, names the device differently). If that's in a file called storage.json, then the argument to the ec2 run-instances command is:

--block-device-mappings file://storage.json

Once the instance is running, the root disk will normally (on my instances) show up as c2t0d0, and the rpool can be expanded to use all the available space with the following command:

zpool online -e rpool c2t0d0

To add a second device as well as enlarging the root disk, so that application storage is kept separate, the json file would look like:

[
  {
    "DeviceName": "/dev/xvda",
    "Ebs": {
      "VolumeSize": 12,
      "Encrypted": true
    }
  },
  {
    "DeviceName": "/dev/sdf",
    "Ebs": {
      "VolumeSize": 256,
      "DeleteOnTermination": false,
      "Encrypted": true
    }
  }
]

On my instances, I always use /dev/sdf, which comes out as c2t5d0.
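As a rough sketch, both variants of the mapping file could be generated by one small shell function. The function name storage_json and its arguments are made up for illustration; the device names and settings are the ones from the examples above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the AWS CLI): print a block device mapping.
# Usage: storage_json ROOT_GIB [DATA_GIB]
# With one argument only the root volume is sized; with two, a second
# volume is added on /dev/sdf that survives instance termination.
storage_json() {
  root_gib="$1"
  data_gib="${2:-}"
  printf '[\n'
  printf '  {\n    "DeviceName": "/dev/xvda",\n'
  printf '    "Ebs": { "VolumeSize": %d, "Encrypted": true }\n  }' "$root_gib"
  if [ -n "$data_gib" ]; then
    printf ',\n  {\n    "DeviceName": "/dev/sdf",\n'
    printf '    "Ebs": { "VolumeSize": %d, "DeleteOnTermination": false, "Encrypted": true }\n  }' "$data_gib"
  fi
  printf '\n]\n'
}

# Example: storage_json 12 256 > storage.json
```

Redirecting the output to storage.json gives a file that can be passed with --block-device-mappings as before.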

For networking, I often end up with multiple IP addresses. This is because we have zones - rather than create multiple EC2 instances, it's far more efficient to run applications in zones on a single system, but then you want to assign each zone its own IP address.

You would think - supported by the documentation - that the --secondary-private-ip-addresses flag to ec2 run-instances would do the job. You would be wrong. That flag, actually, is supposed to just be a convenient shortcut for what I'm about to describe, but it doesn't actually work. (And terraform doesn't support this customization either - it can handle additional IP addresses, but not on the same interface as the primary.)

To configure multiple IP addresses we again turn to a json file. This looks like:

[
  {
    "DeviceIndex": 0,
    "DeleteOnTermination": true,
    "SubnetId": "subnet-0abcdef1234567890",
    "Groups": ["sg-01234567890abcdef"],
    "PrivateIpAddresses": [
      {
        "Primary": true,
        "PrivateIpAddress": "10.15.32.12"
      },
      {
        "Primary": false,
        "PrivateIpAddress": "10.15.32.101"
      }
    ]
  }
]

You have to define the subnet you're going to use (SubnetId) and the security group that will be applied (Groups) - these belong to the network interface, not to the instance (in the trivial case there's no difference). So you don't specify the security group(s) or the subnet as regular arguments. Then I define two IP addresses (you can have as many as the instance type allows): one is set as the primary ("Primary": true), and all the others are secondary ("Primary": false). Again, if this is in a file called network.json, you feed that to the command like:

--network-interfaces file://network.json
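Since the number of secondary addresses varies with the number of zones, the file lends itself to being generated. A minimal sketch, assuming a hypothetical function network_json that takes the subnet, one security group, the primary address, and then any number of secondary addresses:

```shell
#!/bin/sh
# Hypothetical helper (not part of the AWS CLI): print a network interface
# specification with one primary and zero or more secondary addresses.
# Usage: network_json SUBNET_ID GROUP_ID PRIMARY_IP [SECONDARY_IP...]
network_json() {
  subnet="$1"; group="$2"; primary="$3"
  shift 3
  printf '[\n  {\n'
  printf '    "DeviceIndex": 0,\n'
  printf '    "DeleteOnTermination": true,\n'
  printf '    "SubnetId": "%s",\n' "$subnet"
  printf '    "Groups": ["%s"],\n' "$group"
  printf '    "PrivateIpAddresses": [\n'
  printf '      { "Primary": true, "PrivateIpAddress": "%s" }' "$primary"
  # Each remaining argument becomes a secondary address on the same interface.
  for ip in "$@"; do
    printf ',\n      { "Primary": false, "PrivateIpAddress": "%s" }' "$ip"
  done
  printf '\n    ]\n  }\n]\n'
}

# Example:
# network_json subnet-0abcdef1234567890 sg-01234567890abcdef \
#   10.15.32.12 10.15.32.101 > network.json
```

The sketch handles a single security group only; several groups would need the Groups list built in the same way as the address list.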

One other thing I found is that you can add tags to the instance (and to EBS volumes) at creation, saving you the effort of having to go through and tag things later. It's slightly annoying that it doesn't seem to allow you to apply different tags to different volumes; you can just say "apply these tags to the instance" and "apply these tags to the volumes". The trick is that the example in the documentation is wrong (it has single quotes, which you don't need and which don't work).

So the tag specification looks like:

--tag-specifications \
ResourceType=instance,Tags=[{Key=Name,Value=aws123a}] \
ResourceType=volume,Tags=[{Key=Name,Value=aws123a}]

In the square brackets, you can have multiple comma-separated key-value pairs. We have tags marking projects and roles so you have a vague idea of what's what.
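With several tags the string gets fiddly to type, so it can be assembled instead. The sketch below uses a hypothetical function tag_spec, and the Project=demo tag is a made-up example. One caveat: in shells that perform brace expansion (bash, for one) an unquoted word containing {a,b} can be split into several words before the CLI sees it, so this sketch expects the result to be passed inside double quotes, which the shell removes before the argument reaches the command.

```shell
#!/bin/sh
# Hypothetical helper (not part of the AWS CLI): build one argument for
# --tag-specifications in the CLI's shorthand syntax.
# Usage: tag_spec RESOURCE_TYPE KEY=VALUE [KEY=VALUE...]
# Assumes keys and values contain no commas, spaces or braces.
tag_spec() {
  rtype="$1"
  shift
  list=""
  for pair in "$@"; do
    key="${pair%%=*}"      # text before the first '='
    value="${pair#*=}"     # text after the first '='
    list="${list:+${list},}{Key=${key},Value=${value}}"
  done
  printf 'ResourceType=%s,Tags=[%s]\n' "$rtype" "$list"
}

# Example:
# --tag-specifications \
#   "$(tag_spec instance Name=aws123a Project=demo)" \
#   "$(tag_spec volume Name=aws123a Project=demo)"
```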

Putting this all together you end up with a command like:

aws ec2 run-instances \
--region eu-west-2 \
--image-id ami-01a1a1a1a1a1a1a1a \
--instance-type t2.micro \
--key-name peter-key \
--network-interfaces file://network.json \
--count 1 \
--block-device-mappings file://storage.json \
--disable-api-termination \
--tag-specifications \
ResourceType=instance,Tags=[{Key=Name,Value=aws123a}] \
ResourceType=volume,Tags=[{Key=Name,Value=aws123a}]

Of course, I don't write either the json files or the command invocation by hand. I have a script that knows what all my AMIs and availability zones and subnets and security groups are and does the right thing for each instance I want to build.
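The general shape of such a wrapper might look like the following. This is only a sketch, not the actual script: the function names, the single-entry lookup table, and the AWS variable used for dry runs are all assumptions, and the AMI ID is the placeholder from the command above.

```shell
#!/bin/sh
# Sketch only: names and the lookup table are hypothetical placeholders.

# Set AWS=echo to print the command instead of running it (dry run).
AWS="${AWS:-aws}"

# Map a region to an AMI ID; a real table would have one entry per region.
ami_for_region() {
  case "$1" in
    eu-west-2) echo "ami-01a1a1a1a1a1a1a1a" ;;
    *) echo "no AMI known for region $1" >&2; return 1 ;;
  esac
}

# Usage: launch NAME REGION INSTANCE_TYPE KEY_NAME
# Expects network.json and storage.json in the current directory.
launch() {
  name="$1"; region="$2"; itype="$3"; key="$4"
  ami=$(ami_for_region "$region") || return 1
  tags="Tags=[{Key=Name,Value=${name}}]"
  $AWS ec2 run-instances \
    --region "$region" \
    --image-id "$ami" \
    --instance-type "$itype" \
    --key-name "$key" \
    --network-interfaces file://network.json \
    --count 1 \
    --block-device-mappings file://storage.json \
    --disable-api-termination \
    --tag-specifications \
      "ResourceType=instance,${tags}" \
      "ResourceType=volume,${tags}"
}

# Example dry run: AWS=echo; launch aws123a eu-west-2 t2.micro peter-key
```

Routing the call through a variable makes it easy to check the assembled command before anything is created.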
