SSM Parameter Store

As part of my disaster recovery process, I take a daily AMI snapshot of the servers and copy it to my disaster recovery target region. Because those AMI IDs change every day, I need a way to get the current day's IDs into my CloudFormation template so the right AMI is used when creating a copy of our server in the new region. Short of using Python and Lambda to discover the AMI ID and regenerate the template each day, there had to be a better way.

Enter Parameter Store.

This is an AWS service that acts like a region-bound scratchpad: you store data in it, and a handful of other services can read it back, one of which is CloudFormation.
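
The basic round trip is tiny. Here is a minimal sketch with boto3 (the parameter name and value are just placeholders):

import boto3

ssm = boto3.client('ssm', region_name='us-east-1')

# store a value under a hierarchical name
ssm.put_parameter(Name='/org/env/example', Value='ami-0123456789abcdef0',
                  Type='String', Overwrite=True)

# read it back from anything that can call ssm:GetParameter
response = ssm.get_parameter(Name='/org/env/example')
print(response['Parameter']['Value'])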

My first step is to create the AMIs and store their IDs in SSM:

# snippet: local_ami_list holds each local instance's ID and Name tag; the current date gets prepended to the name below.
# this section creates the AMIs, tags them, and adds to a list for copying to DR
for line in local_ami_list:
    image_data_combined_list = line.split(',')
    #pprint(image_data_combined_list)
    local_instance_id = image_data_combined_list[0]
    local_instance_name = current_date_tag + '-' + image_data_combined_list[1]
    image = ec2_local.create_image(InstanceId=local_instance_id, Description=local_instance_name, DryRun=False,
                                   Name=local_instance_name, NoReboot=True)
    tag_image = ec2_local.create_tags(Resources=[image['ImageId']], Tags=[{'Key': 'Name', 'Value': local_instance_name},])
    entry = local_instance_name + ',' + image['ImageId']
    ami_list_to_copy.append(entry)

sleep(90)

# this snippet copies the AMIs to the DR region

for line in ami_list_to_copy:
    ami_list_combined_data = line.split(',')
    local_ami_name = ami_list_combined_data[0]
    local_ami_id = ami_list_combined_data[1]
    try:
        image_copy = ec2_dr.copy_image(Description=local_ami_name, Name=local_ami_name, SourceImageId=local_ami_id,
                                       SourceRegion=local_region, DryRun=False)
        entry = local_ami_name + ',' + image_copy['ImageId']
        dr_ami_list.append(entry)
    except botocore.exceptions.ClientError as error:
        print('error: {0}'.format(error))

# this next section is a bit of a kludgy hack, but it gets me in the ballpark. Anonymized for my protection.

# 5. lists AMIs in the DR region and writes the current day's IDs to SSM parameters for later use in CloudFormation templates

sv1_ami_parameter = '/org/env/ec2/ServerName1/ami'
sv2_ami_parameter = '/org/env/ec2/ServerName2/ami'
sv3_ami_parameter = '/org/env/ec2/ServerName3/ami'
sv4_ami_parameter = '/org/env/ec2/ServerName4/ami'
sv5_ami_parameter = '/org/env/ec2/ServerName5/ami'
current_ami_list = []

ssm_dr = boto3.client('ssm',region_name=dr_region)
dr_amis = ec2_dr.describe_images(Owners=['self'])
for ami in dr_amis['Images']:
    match = re.search(current_date_tag, str(ami['Name']))
    if match:
        entry = str(ami['Name']) + ',' + str(ami['ImageId'])
        current_ami_list.append(entry)

for ami in current_ami_list:
    line = ami.split(',')

    match1 = re.search('ServerName1', line[0])
    match2 = re.search('ServerName2', line[0])
    match3 = re.search('ServerName3', line[0])
    match4 = re.search('ServerName4', line[0])
    match5 = re.search('ServerName5', line[0])
    if match1:
        set_parameter = ssm_dr.put_parameter(Name=sv1_ami_parameter,
                                            Value=line[1],
                                            Type='String',
                                            Overwrite=True)
    elif match2:
        set_parameter = ssm_dr.put_parameter(Name=sv2_ami_parameter,
                                            Value=line[1],
                                            Type='String',
                                            Overwrite=True)
    elif match3:
        set_parameter = ssm_dr.put_parameter(Name=sv3_ami_parameter,
                                            Value=line[1],
                                            Type='String',
                                            Overwrite=True)

    elif match4:
        set_parameter = ssm_dr.put_parameter(Name=sv4_ami_parameter,
                                            Value=line[1],
                                            Type='String',
                                            Overwrite=True)

    elif match5:
        set_parameter = ssm_dr.put_parameter(Name=sv5_ami_parameter,
                                            Value=line[1],
                                            Type='String',
                                            Overwrite=True)
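
The five near-identical if/elif branches above are the kludgy part; they could collapse into one loop over the server names. A sketch of that tighter version, keeping the same parameter paths and behavior:

server_names = ['ServerName1', 'ServerName2', 'ServerName3', 'ServerName4', 'ServerName5']

for ami in current_ami_list:
    name, image_id = ami.split(',')
    for server in server_names:
        if server in name:
            # write this server's AMI ID to its parameter path
            ssm_dr.put_parameter(Name='/org/env/ec2/{0}/ami'.format(server),
                                 Value=image_id,
                                 Type='String',
                                 Overwrite=True)
            break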

As you can see, the put_parameter method of the boto3 SSM client stores the data as a plain-text value. To retrieve it in a CloudFormation template, you reference the parameter's path:

Parameters:

  sv1:
    Description:  'pre-baked AMI copied from ops region, ID retrieved from SSM Parameter Store'
    Type: 'AWS::SSM::Parameter::Value<String>'
    Default: '/org/env/ec2/ServerName1/ami'

# then reference it in the Resources ec2 instance code block as the ImageId.
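# A sketch of that Resources entry, continuing the template above (the logical
# name and instance type below are placeholders):

Resources:
  Server1Instance:
    Type: 'AWS::EC2::Instance'
    Properties:
      ImageId: !Ref sv1
      InstanceType: t3.medium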

No more hard-coded values, and no need to dynamically regenerate the template every day.

You can also encrypt values and store them as SecureStrings. However, retrieving those requires an understanding of the parameter version number, and I've yet to figure that out. Once I do, I can store usernames and passwords more securely and avoid hard-coding them. Very cool indeed! (And now I know the answer to an interview question that I bombed.)
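
On the boto3 side, at least, the SecureString round trip looks straightforward; a sketch, with a made-up parameter name and the account's default SSM key:

# store an encrypted value (uses the account's default SSM KMS key unless KeyId is supplied)
ssm_dr.put_parameter(Name='/org/env/example/password',
                     Value='not-a-real-password',
                     Type='SecureString',
                     Overwrite=True)

# read it back decrypted
secret = ssm_dr.get_parameter(Name='/org/env/example/password',
                              WithDecryption=True)
print(secret['Parameter']['Value'])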

python : amiCreateCopyRotate.py

This script creates an AMI of each running instance, tags them with a datestamp, copies them to a disaster recovery region, and prunes images that are seven days old. It needs some refactoring, but it works at the moment. I would like to decouple it a bit so that the sleep timers can be removed (see the waiter sketch after the script).

#!/usr/bin/python

import boto3
import botocore
from datetime import datetime, timedelta
from time import sleep
import re

# variables

local_region = 'us-west-2'
dr_region = 'us-east-1'

current_date = datetime.now()
last_week = current_date - timedelta(days=7)
current_date_tag = str(current_date.strftime("%Y-%m-%d"))
last_week_date_tag = str(last_week.strftime("%Y-%m-%d"))

local_ami_list = []
dr_ami_list = []
ami_list_to_copy = []

ec2_local = boto3.client('ec2', region_name=local_region)
ec2_dr = boto3.client('ec2', region_name=dr_region)


# 1. pull list of running instances in us-west-2

try:
    local_instances = ec2_local.describe_instances()
    for key in local_instances['Reservations']:
        for instance in key['Instances']:
            if instance['State']['Name'] == 'running':
                local_instance_id = instance['InstanceId']
                # find the Name tag explicitly; tag order is not guaranteed
                local_instance_tags = instance.get('Tags', [])
                local_instance_name = next((tag['Value'] for tag in local_instance_tags
                                            if tag['Key'] == 'Name'), local_instance_id)
                entry = local_instance_id + ',' + local_instance_name
                local_ami_list.append(entry)
            else:
                pass

except botocore.exceptions.ClientError as error:
    print('error: {0}'.format(error))


# 2. Creates an AMI of each instance and tags it with the current date and name

for line in local_ami_list:
    image_data_combined_list = line.split(',')
    #pprint(image_data_combined_list)
    local_instance_id = image_data_combined_list[0]
    local_instance_name = current_date_tag + '-' + image_data_combined_list[1]
    image = ec2_local.create_image(InstanceId=local_instance_id, Description=local_instance_name, DryRun=False,
                                   Name=local_instance_name, NoReboot=True)
    tag_image = ec2_local.create_tags(Resources=[image['ImageId']], Tags=[{'Key': 'Name', 'Value': local_instance_name},])

    entry = local_instance_name + ',' + image['ImageId']
    ami_list_to_copy.append(entry)

sleep(90)


# 3. Copies the AMIs to the DR region us-east-1

for line in ami_list_to_copy:
    ami_list_combined_data = line.split(',')
    local_ami_name = ami_list_combined_data[0]
    local_ami_id = ami_list_combined_data[1]
    try:
        image_copy = ec2_dr.copy_image(Description=local_ami_name, Name=local_ami_name, SourceImageId=local_ami_id,
                                       SourceRegion=local_region, DryRun=False)
        entry = local_ami_name + ',' + image_copy['ImageId']
        dr_ami_list.append(entry)

    except botocore.exceptions.ClientError as error:
        print('error: {0}'.format(error))

sleep(90)


# 4. Pulls a list of current private AMIs in us-west-2 and deregisters the ones tagged with last week's date

local_amis_to_prune = ec2_local.describe_images(Owners=['self'])
local_amis = local_amis_to_prune['Images']
for ami in local_amis:
    entry = str(ami['Name']) + ',' + str(ami['ImageId'])
    match = re.search(last_week_date_tag,entry)
    if match:
        ec2_local.deregister_image(ImageId=ami['ImageId'])
        #print('deleting: ', ami['Name'])
    else:
        pass
        #print('not deleting', ami['Name'])

# 5. same for dr region

remote_amis_to_prune = ec2_dr.describe_images(Owners=['self'])
remote_amis = remote_amis_to_prune['Images']
for ami in remote_amis:
    entry = str(ami['Name']) + ',' + str(ami['ImageId'])
    match = re.search(last_week_date_tag,entry)
    if match:
        ec2_dr.deregister_image(ImageId=ami['ImageId'])
        #print('deleting: ', ami['Name'])
    else:
        pass
        #print('not deleting', ami['Name'])
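
As for decoupling the fixed sleep() calls mentioned above, boto3's EC2 waiters look like the right tool: rather than guessing at 90 seconds, the script could block until the new AMIs are actually available. A sketch, not yet wired into the script:

# wait until every newly created AMI reaches the 'available' state before copying
new_ami_ids = [line.split(',')[1] for line in ami_list_to_copy]

waiter = ec2_local.get_waiter('image_available')
waiter.wait(ImageIds=new_ami_ids,
            WaiterConfig={'Delay': 15, 'MaxAttempts': 80})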

Learning Python with boto3: Client vs Resource

I've only recently spent any serious time learning Python, and usually just to solve a specific problem. I am not a programmer by training; I'm used to hacking out a script one line at a time. I frequently feel my lack of formal programming training, so I'm using the bang-head-on-desk method: figuring things out with the language reference close at hand, and buying video tutorials and books that eat up all of my sleep time.

Previously, I built a script that automated the backup of AMIs and copied them to the disaster recovery region, but that was really just substituting Python for bash. Now I have a problem to solve at the new gig that requires me to learn Boto3, the AWS SDK for Python.

The issue is that I need to:

  1. interrogate each AWS account by ID,
  2. list the resources created,
  3. write those resources into a DynamoDB table so we can see what is in each account,
  4. maintain state info when something changes.

I have figured out how to get ec2 instance data the hard way:

#!/usr/bin/python
import boto3
ec2 = boto3.client('ec2')
response = ec2.describe_instances()
print(response)

This gives you a JSON dump of all of your EC2 instances and all of their information. Useful, but more than a little messy. However, this script uses the default credentials from the ~/.aws/credentials file, and I am working with many profiles. So I need to repeat this for each account, which introduced me to the idea of sessions: add in my list of accounts, then go through each one. Creating the session was not difficult:

profiles = ['default','foo1','foo2']
for profile in profiles:
    session = boto3.Session(region_name='us-east-1', profile_name=profile)
# need to set the region_name, or will err

However, if you keep using the ec2 = boto3.client('ec2') call here, the results never change from profile to profile, because that client is still built from the default session rather than the one just created. This introduced me to the idea of Resources and Clients.

I had trouble finding a web page that explained the difference clearly, and at this point I'm not all that clear on it myself, but by using the session variable and asking it for a resource, I was able to get the info I needed:

ec2 = session.resource('ec2')
for instance in ec2.instances.all():
    print(instance.id)

This gives me specific information, rather than the json object, and it changes with each new session, as expected. 
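
Putting the two pieces together, the per-profile loop ends up looking something like this (the profile names are placeholders):

import boto3

profiles = ['default', 'foo1', 'foo2']

for profile in profiles:
    session = boto3.Session(region_name='us-east-1', profile_name=profile)
    ec2 = session.resource('ec2')   # resource built from this profile's session, not the default one
    print('--- profile: {0}'.format(profile))
    for instance in ec2.instances.all():
        print(instance.id, instance.instance_type, instance.state['Name'])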

So, I need to get really familiar with the methods and attributes available to each service, and while it is a really thick read, it is all there in the docs. My next step is to start writing to the database.
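
For that next step, the DynamoDB resource in boto3 is probably where I'll start. A rough sketch, assuming a hypothetical table named account-inventory keyed on account_id and instance_id:

import boto3

# hypothetical table: partition key 'account_id', sort key 'instance_id'
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
table = dynamodb.Table('account-inventory')

table.put_item(Item={
    'account_id': '123456789012',            # placeholder account ID
    'instance_id': 'i-0123456789abcdef0',    # placeholder instance ID
    'instance_type': 't3.medium',
    'state': 'running',
})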