How to Identify Orphaned EBS Snapshots to Optimize AWS Costs

So a while back I got an email from our finance team. I was asked to help tag resources in our AWS infrastructure and investigate which items were contributing to certain costs. I don’t know about other engineers, but these kinds of tasks are in the same realm of fun as … wiping bird poop off your windshield at a gas station. So I did the sanest thing I could think of.

I deleted the email…

However, when LogicMonitor’s AWS Billing Monitoring came out, it was pretty easy to see our costs broken down by item, all out of the box. I wasn’t jumping headfirst onto the cost optimization train by any stretch of the imagination. But I was intrigued.

Apparently, tracking down all your AWS resources is like moving to a new home. You pack everything and, for the first time in years, dive deep into the crevices of the garage to categorize it all. What is this instance? Are we using it? Is it running an app? Which one? That’s when I stumbled onto our mountain of orphaned EBS snapshots.

Orphaned Snapshots Costing You $$$

In order to save data without paying crazy fees for EBS volumes, you can take snapshots as backups of a volume. Snapshots are much cheaper, and you have the option of restoring the volume from one when needed. However, when an EC2 instance is terminated, even if its attached EBS volumes are deleted along with it, the leftover snapshots stay in S3 and you keep getting charged for them every month. These are easily forgotten and pile up over time… and that is how you end up with a few 500 GB snapshots from 2011 labeled “TESTING”.

Finding Orphaned Snapshots

There had to be a programmatic way to identify orphaned snapshots, since Amazon does not let you filter for them natively. One of my team members is amazing with CLI one-liners. Check it out below:

comm -23 <(aws ec2 describe-snapshots --owner-ids AWS-ACCOUNT-ID --query 'Snapshots[*].SnapshotId' --output text | tr '\t' '\n' | sort) <(aws ec2 describe-volumes --query 'Volumes[*].SnapshotId' --output text | tr '\t' '\n' | sort | uniq) 
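
One thing worth noting about the one-liner: it flags snapshots that no existing volume was created from. If what you are after is snapshots whose source volume is gone, a variant along these lines might work. Treat it as a rough sketch rather than a tested tool: the function names `filter_orphans` and `list_orphans` are made up for illustration, and it assumes the AWS CLI is already configured with credentials and a default region.

```shell
# Sketch: list snapshots whose source volume no longer exists.
# filter_orphans and list_orphans are hypothetical names.

# stdin: "VolumeId<TAB>SnapshotId" lines, one per snapshot.
# $1:    file of existing volume IDs, one per line.
# Prints the snapshot lines whose VolumeId is not in that file.
filter_orphans() {
  awk -F'\t' '
    FILENAME == ARGV[1] { live[$1] = 1; next }  # first input: existing volumes
    !($1 in live)                               # stdin: keep unknown volumes
  ' "$1" -
}

# Wrapper around the AWS CLI. $1 is your AWS account ID.
list_orphans() {
  live_volumes=$(mktemp)
  aws ec2 describe-volumes --query 'Volumes[*].VolumeId' --output text |
    tr '\t' '\n' > "$live_volumes"
  aws ec2 describe-snapshots --owner-ids "$1" \
    --query 'Snapshots[*].[VolumeId,SnapshotId]' --output text |
    filter_orphans "$live_volumes"
  rm -f "$live_volumes"
}
```

Calling `list_orphans AWS-ACCOUNT-ID` would print one line per snapshot, with the missing volume's ID in the first column and the snapshot ID in the second.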

However, if you want to extract and manipulate that data (say, to tag those snapshots), you will need to build on that. What if you wanted to tag those snapshots every month? Would you copy and paste commands manually? Of course not, we aren’t animals. You can use a Lambda function triggered by CloudWatch Events rules to do all that for you. Here is a sample script that tags all your orphaned snapshots.
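
For a feel of what the tagging step involves, here is a minimal shell sketch. To be clear, this is not the Lambda function mentioned above; it is a plain script you could run from cron or any other scheduler. The `Orphaned=true` tag, the `tag_snapshots` name and the `AWS_CMD` switch are all assumptions for the sake of the example.

```shell
# Sketch only: the tag key/value and the AWS_CMD switch are illustrative choices.
TAG_KEY="${TAG_KEY:-Orphaned}"
TAG_VALUE="${TAG_VALUE:-true}"
AWS_CMD="${AWS_CMD:-aws}"   # set AWS_CMD="echo aws" to preview without tagging

# Reads snapshot IDs (one per line) on stdin and tags them in batches of 50.
tag_snapshots() {
  ids=$(cat)
  # Nothing to tag: skip the API call entirely.
  [ -n "$ids" ] || return 0
  # AWS_CMD is left unquoted on purpose so "echo aws" splits into two words.
  printf '%s\n' "$ids" |
    xargs -n 50 $AWS_CMD ec2 create-tags \
      --tags "Key=$TAG_KEY,Value=$TAG_VALUE" --resources
}
```

You would pipe a list of snapshot IDs into it, for example the output of the one-liner above. Running it once with `AWS_CMD="echo aws"` prints the commands instead of executing them, which is a cheap way to check what is about to be tagged.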

Using LogicMonitor

If you have already tagged your snapshots, you can use LogicMonitor’s Cost By Tag DataSources to get immediate insight into your month-to-date spending.

I also created a DataSource specifically for orphaned snapshots so that I can measure datapoints like the aggregate size of all snapshots for a specific deleted volume or the date of the oldest orphaned snapshot.
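
If you are curious how numbers like those could be derived outside LogicMonitor, the sketch below shows one way. It is not the DataSource itself, just an illustration: `summarize_orphans` is a made-up name, and it expects lines of `VolumeId`, `VolumeSize` and `StartTime` separated by tabs, which `aws ec2 describe-snapshots` can emit with `--query 'Snapshots[*].[VolumeId,VolumeSize,StartTime]' --output text`. It also assumes you have already narrowed the input down to orphaned snapshots.

```shell
# Sketch: per deleted volume, total snapshot size and oldest snapshot date.
# stdin: "VolumeId<TAB>VolumeSize<TAB>StartTime" lines for orphaned snapshots.
# Note: VolumeSize is the size of the source volume in GiB, not the amount
# of data actually stored (and billed) for the snapshot.
summarize_orphans() {
  awk -F'\t' '
    {
      size[$1] += $2
      # Timestamps in one consistent ISO 8601 format sort as plain strings.
      if (!($1 in oldest) || $3 < oldest[$1]) oldest[$1] = $3
    }
    END {
      for (v in size) printf "%s\t%d\t%s\n", v, size[v], oldest[v]
    }
  ' | sort
}
```

Each output line has the volume ID, the summed size in GiB and the start time of that volume's oldest snapshot.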

Give it a try. Create a few dashboards, clear out some cruft, save some money and earn yourself a hearty pat on the back and a resounding “Thank You” from your boss. Thank yous are nice…

But so are gift cards.