Save Money with AWS ECR Lifecycle Policies

Erik A. Ekberg
Apr 5, 2023

You can save money on your monthly AWS ECR bill by adding a Lifecycle Policy to your repositories.

A policy is made up of rules which tell AWS which images in your repository are safe to delete, referred to as expiring an image in the AWS documentation.

AWS only charges you for images you are currently storing in ECR.

So, letting AWS automatically delete images you do not need saves money on your monthly bill.

At Good Dog we had been using ECR for over five years without any policies, which was costing us over $400 a month in storage.

After adding a policy to delete non-production images after 30 days we dropped our bill from $400 to around $15 a month.

While the cost savings from Lifecycle Policies are straightforward, writing policies unfortunately is not.

Parts of a Lifecycle Policy

Each ECR repository can have a single Lifecycle Policy, which is a JSON object with a single rules property.

These rules tell AWS when an image is safe to be deleted.

As the name rules suggests, a policy may have many different rules which target different sets of images.

For example, you may choose to delete images tagged with beta once they are 30 days old, whereas you want to keep your three newest prod builds in case you need to roll back.
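
As a rough illustration of that example, the sketch below builds such a policy as a Python dictionary and prints it as JSON. The tag prefixes, thresholds, and descriptions are assumptions chosen for illustration; the field names follow the rule syntax in the AWS documentation.

```python
import json

# Hypothetical policy: prefixes, counts, and descriptions are illustrative only.
policy = {
    "rules": [
        {
            "rulePriority": 1,
            "description": "Keep the three newest prod images",
            "selection": {
                "tagStatus": "tagged",
                "tagPrefixList": ["prod"],
                "countType": "imageCountMoreThan",
                "countNumber": 3,
            },
            "action": {"type": "expire"},
        },
        {
            "rulePriority": 2,
            "description": "Expire beta images pushed more than 30 days ago",
            "selection": {
                "tagStatus": "tagged",
                "tagPrefixList": ["beta"],
                "countType": "sinceImagePushed",
                "countUnit": "days",
                "countNumber": 30,
            },
            "action": {"type": "expire"},
        },
    ]
}

# The JSON text is what you would paste into the repository's Lifecycle Policy.
policy_text = json.dumps(policy, indent=2)
print(policy_text)
```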

Whatever your case may be, I recommend reading the AWS documentation to learn how to write rules.

But after you are familiar with the syntax of a single rule, there are a few common pitfalls to watch out for when starting to write multi-rule policies.

How are Multi-Rule Lifecycle Policies Applied?

When a Lifecycle Policy rule is applied, AWS will either (1) skip the image, (2) mark the image as KEEP, or (3) mark the image as EXPIRED.

When an image is marked as EXPIRED, AWS deletes that image. According to the AWS documentation, this happens within 24 hours of the image meeting the expiration criteria.

To know which action to perform, that is to skip, KEEP, or EXPIRE an image, AWS performs three checks per image per rule in the following order:

  1. Has this image already been marked as KEEP or EXPIRED by a higher priority rule? If yes, then skip this image.
  2. Does this rule apply to this image (e.g. tagged with prod, beta, etc.)? If no, then skip this image.
  3. Does this image meet the criteria to be EXPIRED? If yes, mark the image as EXPIRED, otherwise, mark the image as KEEP.
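
The three checks above can be sketched as a small decision function. This is only a model of the description in this article, not AWS's implementation; the function name `apply_rule` and its parameters are hypothetical.

```python
from typing import Optional


def apply_rule(current_mark: Optional[str], rule_applies: bool, meets_expiry: bool) -> Optional[str]:
    """Return an image's mark after one rule, following the three checks in order.

    current_mark is None when no higher priority rule has marked the image yet.
    """
    # Check 1: a higher priority rule already marked this image, so skip it.
    if current_mark is not None:
        return current_mark
    # Check 2: the rule does not select this image, so skip it.
    if not rule_applies:
        return None
    # Check 3: the rule selects the image, so it is marked either way.
    return "EXPIRED" if meets_expiry else "KEEP"


print(apply_rule(None, True, True))    # EXPIRED
print(apply_rule("KEEP", True, True))  # KEEP (already marked, rule skipped)
print(apply_rule(None, False, True))   # None (rule does not apply)
```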

Common Pitfalls of Multi-Rule Lifecycle Policies

If an image is marked as EXPIRED or KEEP by a higher priority rule then all lower priority rules are skipped for that image.

A common pitfall is having a high priority rule that applies to "any" image. Because every image will be marked as EXPIRED or KEEP by the "any" rule, every lower priority rule is skipped.

For this reason, you may only ever have one "any" or "untagged" rule in your policy, and it must be your lowest priority rule.
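
A quick check for this pitfall could look like the sketch below. It assumes a policy loaded as a Python dictionary and that a smaller rulePriority number is applied first; the helper name `shadowing_any_rules` is hypothetical, and it only covers the "any" case.

```python
from typing import List


def shadowing_any_rules(policy: dict) -> List[int]:
    """Return the priorities of "any" rules that are applied before another rule."""
    # The rule with the largest rulePriority number is applied last.
    last_priority = max(rule["rulePriority"] for rule in policy["rules"])
    return [
        rule["rulePriority"]
        for rule in policy["rules"]
        if rule["selection"]["tagStatus"] == "any"
        and rule["rulePriority"] < last_priority
    ]


# Hypothetical policies, trimmed to the fields the check reads.
bad_policy = {
    "rules": [
        {"rulePriority": 1, "selection": {"tagStatus": "any"}},
        {"rulePriority": 2, "selection": {"tagStatus": "tagged"}},
    ]
}
good_policy = {
    "rules": [
        {"rulePriority": 1, "selection": {"tagStatus": "tagged"}},
        {"rulePriority": 2, "selection": {"tagStatus": "any"}},
    ]
}

print(shadowing_any_rules(bad_policy))   # [1]
print(shadowing_any_rules(good_policy))  # []
```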

Another common pitfall is deleting your production images.

Lifting from the AWS ECR Lifecycle Policy examples, say you have the following images

| Label | Pushed at | Image Tags |
|:-----:|:-----------:|:--------------:|
| C | 8 days ago | beta-3 |
| B | 9 days ago | beta-2, prod-2 |
| A | 10 days ago | beta-1, prod-1 |

where image A is your oldest image and image C is your newest image.

If we wrote a policy where we intended to only keep the newest beta and the newest prod images like

{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Only keep the newest beta image",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["beta"],
        "countType": "imageCountMoreThan",
        "countNumber": 1
      },
      "action": {
        "type": "expire"
      }
    },
    {
      "rulePriority": 2,
      "description": "Only keep the newest prod image",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["prod"],
        "countType": "imageCountMoreThan",
        "countNumber": 1
      },
      "action": {
        "type": "expire"
      }
    }
  ]
}

we would instead accidentally be deleting all our prod images.

| Label | Pushed at | Image Tags | Rule 1 | Rule 2 | Result |
|:-----:|:-----------:|:--------------:|:------:|:------:|:------:|
| C | 8 days ago | beta-3 | Keep | - | Keep |
| B | 9 days ago | beta-2, prod-2 | Expire | - | Expire |
| A | 10 days ago | beta-1, prod-1 | Expire | - | Expire |

This is because the "Only keep the newest beta image" rule is applied first and marks image C as KEEP and both B and A as EXPIRED.

So, when the second rule, "Only keep the newest prod image", is applied, every prod image has already been marked as EXPIRED and the rule is effectively skipped, which deletes all our prod images.

Learning from these types of accidents, a rule of thumb is to write your highest priority rules for your highest priority images and then your lowest priority rules for your lowest priority images.

Using this guideline, rewriting your policy with rules around prod first, we get a policy like

{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Only keep the newest prod image",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["prod"],
        "countType": "imageCountMoreThan",
        "countNumber": 1
      },
      "action": {
        "type": "expire"
      }
    },
    {
      "rulePriority": 2,
      "description": "Only keep the newest beta image",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["beta"],
        "countType": "imageCountMoreThan",
        "countNumber": 1
      },
      "action": {
        "type": "expire"
      }
    }
  ]
}

which works as intended to keep only the newest beta and prod images.

| Label | Pushed at | Image Tags | Rule 1 | Rule 2 | Result |
|:-----:|:-----------:|:--------------:|:------:|:------:|:------:|
| C | 8 days ago | beta-3 | - | Keep | Keep |
| B | 9 days ago | beta-2, prod-2 | Keep | - | Keep |
| A | 10 days ago | beta-1, prod-1 | Expire | - | Expire |
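
To see why rule order matters, the sketch below replays both policies against images A, B, and C. It is a simplified model of the three checks described earlier, not AWS's implementation, and only handles imageCountMoreThan rules on tag prefixes; the `simulate` helper and its parameters are hypothetical.

```python
from typing import Dict, List

# The three example images, oldest first (count-based rules expire the oldest first).
IMAGES = [
    {"label": "A", "tags": ["beta-1", "prod-1"]},
    {"label": "B", "tags": ["beta-2", "prod-2"]},
    {"label": "C", "tags": ["beta-3"]},
]


def simulate(rule_prefixes: List[str], keep: int = 1) -> Dict[str, str]:
    """Apply one imageCountMoreThan rule per prefix, highest priority first."""
    marks: Dict[str, str] = {}
    for prefix in rule_prefixes:
        # Checks 1 and 2: only unmarked images with a matching tag prefix are considered.
        selected = [
            image
            for image in IMAGES
            if image["label"] not in marks
            and any(tag.startswith(prefix) for tag in image["tags"])
        ]
        # Check 3: everything except the `keep` newest selected images is expired.
        cutoff = len(selected) - keep
        for index, image in enumerate(selected):
            marks[image["label"]] = "EXPIRED" if index < cutoff else "KEEP"
    return marks


print(simulate(["beta", "prod"]))  # {'A': 'EXPIRED', 'B': 'EXPIRED', 'C': 'KEEP'}
print(simulate(["prod", "beta"]))  # {'A': 'EXPIRED', 'B': 'KEEP', 'C': 'KEEP'}
```

With beta first, both prod images are expired; with prod first, the newest prod image B survives, matching the two tables above.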

In the end, writing ECR Lifecycle Policies can save you money like we did at Good Dog, but be sure to write your policies carefully and verify they are being applied as you expect.
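
One way to verify a policy before it deletes anything is ECR's lifecycle policy preview. The sketch below assumes you pass in a boto3 ECR client; the function name `preview_policy` and the polling interval are hypothetical, and it reads only the first page of results.

```python
import json
import time


def preview_policy(ecr_client, repository_name: str, policy: dict, poll_seconds: float = 2.0) -> list:
    """Dry-run a lifecycle policy and list the images it would act on."""
    ecr_client.start_lifecycle_policy_preview(
        repositoryName=repository_name,
        lifecyclePolicyText=json.dumps(policy),
    )
    # Poll until the preview leaves the IN_PROGRESS state.
    while True:
        response = ecr_client.get_lifecycle_policy_preview(repositoryName=repository_name)
        if response["status"] != "IN_PROGRESS":
            break
        time.sleep(poll_seconds)
    if response["status"] != "COMPLETE":
        raise RuntimeError(f"Preview ended with status {response['status']}")
    # Assumption: one page is enough; larger repositories paginate via nextToken.
    return [
        (result.get("imageTags", []), result["action"]["type"], result["appliedRulePriority"])
        for result in response["previewResults"]
    ]
```

Calling it as `preview_policy(boto3.client("ecr"), "my-repo", policy)`, where `my-repo` is a placeholder repository name, returns one tuple per affected image with its tags, the action, and the priority of the rule that matched it.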

Erik A. Ekberg

Software engineer with a background in human psychology and data analytics who affords both customer and engineer delight through Agile software architectures.