A Primer on Commitment Discount Management (AWS)


Note: This post will focus on AWS specifically, but the concepts and strategies can be applied to other cloud service providers.

Whether you are a small startup or a large enterprise, you need to be able to manage commitment discounts and scale the management practice successfully. Tools to automate this part of the Rate Optimization capability exist, and many organizations benefit from them, but the tools can be viewed as pricey or potentially untrusted because they can spend a lot of money on your behalf.

There are many strategic decisions to discuss that affect your ability to scale, to act, and to be successful.

First things first: laying out a foundation for success in managing Commitment Discounts. Reserved Instances and Compute Savings Plans are types of Commitment Discounts: you are making a commitment (contract) to spend money on specific Services or Resources for a period of time (1 or 3 years). I will use the term Commitment Discount when generically referring to both of these throughout.

Having strong cost allocation in place before starting to manage Commitment Discounts will help significantly, especially when figuring out which Engineering teams to engage with.

To assist with the analysis, I strongly recommend following the Pareto Principle (the 80/20 rule): roughly 80% of the costs, and thus Commitment Discounts, are generally driven by 20% of the teams. So, target the largest costs/workloads first.
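As a rough illustration of the 80/20 targeting described above, the sketch below ranks teams by cost and returns the smallest group that reaches a chosen share of total spend. The team names, dollar figures, and the `top_cost_drivers` helper are hypothetical; real input would come from your own allocation data.

```python
def top_cost_drivers(cost_by_team, share=0.80):
    """Return the largest-spending teams that together reach `share` of total cost."""
    total = sum(cost_by_team.values())
    selected, running = [], 0.0
    # Walk teams from largest to smallest spend until the target share is reached.
    ranked = sorted(cost_by_team.items(), key=lambda kv: kv[1], reverse=True)
    for team, cost in ranked:
        if running >= share * total:
            break
        selected.append(team)
        running += cost
    return selected


# Hypothetical monthly costs per team (USD).
costs = {"search": 52000, "payments": 30000, "ml": 9000, "web": 5000, "ops": 4000}
print(top_cost_drivers(costs))
```

In this made-up data set, two of the five teams account for more than 80% of spend, so those are the teams to talk to first.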

But before getting to it, you need to first think about tracking Commitment Discounts through an Inventory, followed by several Strategies.

Inventory

Whenever you make a purchase, the details of the purchase should be logged in an Inventory. A simple spreadsheet is more than adequate, even at scale (with it you could easily create formulas to help script Commitment Discount purchases).

Always keep track of the details of the purchase (your future self will thank you); minimally:

  • purchase date
  • expiration date
  • term (1 or 3 year)
  • purchasing account
  • reservation ARN
  • service
  • region
  • no/partial/all upfront
  • instance type
  • technology type
  • count
  • upfront cost
  • WHO and WHY you bought the Commitment Discount
  • other or specific details

It is crucial that you keep track of who made the purchase and why, so that you can explain what it is/was for and easily follow up in the future when it expires. Also, capture screenshots of coverage rates before and after the purchases to ensure proper use.
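If you prefer to keep the Inventory as a file you can open in a spreadsheet, the fields listed above map naturally onto a small record type. This is only a sketch: the `CommitmentRecord` class, its field names, and every value in the example (including the ARN and account ID) are placeholders, not real data.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass
class CommitmentRecord:
    purchase_date: date
    expiration_date: date
    term_years: int          # 1 or 3
    purchasing_account: str
    reservation_arn: str
    service: str
    region: str
    payment_option: str      # no / partial / all upfront
    instance_type: str
    technology_type: str
    count: int
    upfront_cost: float
    purchased_by: str        # WHO
    reason: str              # WHY
    notes: str = ""


def to_csv(records):
    """Serialize records to CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    names = [f.name for f in fields(CommitmentRecord)]
    writer = csv.DictWriter(buffer, fieldnames=names)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Placeholder values for illustration only.
example = CommitmentRecord(
    purchase_date=date(2025, 1, 15),
    expiration_date=date(2026, 1, 15),
    term_years=1,
    purchasing_account="111122223333",
    reservation_arn="arn:aws:rds:us-east-1:111122223333:ri:example",
    service="RDS",
    region="us-east-1",
    payment_option="no upfront",
    instance_type="db.r7g.large",
    technology_type="PostgreSQL",
    count=2,
    upfront_cost=0.0,
    purchased_by="FinOps team",
    reason="Cover steady production database",
)
print(to_csv([example]))
```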

All of this data will be very useful when performing the FinOps Operations Review.

Helpful Tools

These tools are within the Billing and Cost Management section on the AWS console.

Cost Explorer: This is the main tool you will use to visualize costs.

The following sections are useful for managing Commitment Discounts within the Billing and Cost Management module.

Savings Plans

  • Inventory: shows inventory of savings plans.
  • Recommendations: provides recommendations for savings plans based on term, payment option, and time period.
  • Purchase Analyzer: detailed report on cost, coverage, and utilization recommendations for Savings Plans.
  • Utilization Report: shows utilization rates (whether they are being used or not) of Savings Plans.
  • Coverage Report: shows coverage rates (how much is being covered by Savings Plans vs. on-demand) of Savings Plans.

Reservations

  • Recommendations: provides recommendations for reserved instances based on term, payment option, and time period.
  • Utilization Report: shows utilization rates (whether they are being used or not) of Reserved Instances.
  • Coverage Report: shows coverage rates (how much is being covered by Reserved Instances vs. on-demand) of Reserved Instances.

Note: To view the Inventory of Reserved Instances you must go to each service in the console and check within each region, or gather the data via an API (EC2, RDS, ElastiCache, OpenSearch)

Strategy: Buying Location

Think about: On which account do you make the purchase? Payer (AWS Organization) or Linked account?

Most recommend the Payer account because a Commitment Discount bought on the Payer account can be applied to any Linked account, and keeping purchases in one place makes it easy to find all of your Commitment Discounts. In this scenario, if the Commitment Discount goes unused on one Linked account, it will automatically start to be used on another Linked account if the Commitment Discount details (instance type, region, etc.) match, thus enabling flexibility.

If you purchase a Commitment Discount on a Linked account, that Commitment Discount is locked into that Linked account unless sharing is enabled.

Strategy: Cashflow

Think about: How much to buy and when? Buying all of your Commitment Discounts at one time can cause a significant impact to cashflow at the time of purchase and again when they expire (if you renew).

Questions to consider:

  • Does your company have significant coffers of cash and/or should you consider spreading out the purchases over a time period (across multiple months or quarters)?
  • Do you have the budget to make the purchase? Allocating budget each month or quarter to purchase Commitment Discounts is recommended. Be proactive.
  • Do you have approval to make the purchase? Whether you have the budget or not, are you authorized to make the purchase by Finance, CFO, your leadership, etc.? Often you will need to confirm with Finance due to cashflow implications before purchasing.

Strategy: Target Coverage %

Think about: What percentage of workloads do you want to target to cover with Commitment Discounts?

[Figure: coverage]

100% coverage can lead to being overcommitted, which results in wasted money. This happens if services are optimized, reduced, or shut off.

90% coverage is fairly aggressive; it works for mature organizations and still leaves room for optimizations.

However, consider aiming for 60-70% coverage to start.

BUT all of this depends as well: if you are spending $50/month on a Service, does it make sense to spend your time and Engineering’s time to review, discuss, and purchase a Commitment Discount? Spending dollars chasing cents is not recommended.
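To make the coverage target concrete, the following sketch estimates how much more spend would need to be covered to reach a target percentage, and skips services that are too small to be worth the effort. The `additional_commitment` helper and the `min_spend` threshold of $500/month are assumptions made for illustration; pick a threshold that fits your organization.

```python
def additional_commitment(eligible_spend, covered_spend, target_coverage, min_spend=500.0):
    """Return the monthly spend still to be covered to reach the target coverage.

    All amounts are monthly, in on-demand-equivalent dollars.
    Returns 0.0 when the service is below `min_spend` or the target is already met.
    """
    if not 0.0 <= target_coverage <= 1.0:
        raise ValueError("target_coverage must be between 0 and 1")
    if eligible_spend < min_spend:
        return 0.0  # not worth spending dollars chasing cents
    gap = target_coverage * eligible_spend - covered_spend
    return max(gap, 0.0)


# $10,000/month eligible, $3,000 already covered, aiming for 65% coverage.
print(additional_commitment(10_000, 3_000, 0.65))
# A $50/month service falls below the threshold and is skipped.
print(additional_commitment(50, 0, 0.65))
```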

Strategy: Reserved Instances and/or Compute Savings Plans

Educate yourself on the differences between Reserved Instances and Compute Savings Plans and know that Reserved Instances are expected to go away in favor of Compute Savings Plans.

Compute Savings Plans offer flexibility and can be significantly better for Engineering even though Reserved Instances have better discounts.

Which is more important: Flexibility or saving more money?

Strategy: Instance Class

Educate yourself on Standard and Convertible RIs.

Most companies leverage Standard, but there are use cases for Convertible (for example, seeding a small number of very cheap Reserved Instances throughout the year and converting them when it makes sense to cover temporary workloads). Be aware that this adds complexity to managing Commitment Discounts in your Inventory.

Does the flexibility of Convertibles outweigh the higher discount of Standard?

Strategy: Instance Family

A strategy of recommending a specific instance family can be promising because within EC2, RDS (see Size-flexible reserved DB instances), and ElastiCache (NOT OpenSearch) an RI can automatically apply to different sizes within the same instance family (r7g.large <> r7g.medium).
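Size flexibility works through normalized units: each size within a family has a normalization factor, and a reservation covers running instances up to its unit total. The sketch below uses a subset of the factors published in the AWS EC2 documentation (verify them against the current documentation before relying on them); the `units` and `coverage_fraction` helpers are hypothetical names introduced for this example.

```python
# Normalization factors per instance size (subset of the AWS-published table).
NORMALIZATION = {
    "nano": 0.25, "micro": 0.5, "small": 1, "medium": 2,
    "large": 4, "xlarge": 8, "2xlarge": 16, "4xlarge": 32,
}


def units(instance_type, count=1):
    """Split 'r7g.large' (or 'db.r7g.large') into family and total normalized units."""
    family, size = instance_type.rsplit(".", 1)
    return family, NORMALIZATION[size] * count


def coverage_fraction(reserved, running):
    """Fraction of the running instances covered by the reservation.

    `reserved` and `running` are (instance_type, count) tuples.
    Reservations only apply within the same instance family.
    """
    reserved_family, reserved_units = units(*reserved)
    running_family, running_units = units(*running)
    if reserved_family != running_family:
        return 0.0
    return min(reserved_units / running_units, 1.0)


# One r7g.large reservation (4 units) fully covers two r7g.medium (2 units each).
print(coverage_fraction(("r7g.large", 1), ("r7g.medium", 2)))
# One r7g.medium reservation covers half of one r7g.large.
print(coverage_fraction(("r7g.medium", 1), ("r7g.large", 1)))
```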

Recommending instance families like M8g or R7g provides additional benefits in that these are Graviton-based instances, which are cheaper and more performant, but they require running your code on ARM rather than x86 (Intel/AMD). On services like RDS and ElastiCache, where you are not running compiled code but instead leveraging a platform service, it is strongly recommended to discuss (with Engineering) setting an Instance Family strategy.

Ultimately: Does it make sense to restrict Engineering to specific instance families? In many circumstances it does.

Strategy: Buy Based on History & Forecast Usage

Do you have hundreds of Engineering teams and/or tens of thousands of resources? Working through all of these and confirming how long each will exist can be challenging and time consuming.

You have an option to make a trade-off between positively confirming workload requirements with Engineering OR purchasing Commitment Discounts based on historical usage and forecasts (with little or no involvement from Engineering).

Note: With this approach you risk buying Commitment Discounts for workloads you have not confirmed with Engineering.

Note: This strategy has less risk when purchasing Compute Savings Plans due to their flexibility BUT you should have decent high-level alignment with Engineering leadership, architects, or teams that workloads will remain stable.

  • Review workload cost and usage history and gain additional details on workloads by reaching out to the Engineering teams running the largest workloads and seek to understand their historical and future usage plans.
  • Discuss options with the Engineering teams that would align well with their forecasts.
  • Consider buying Commitment Discounts to cover some percentage of the workloads: buy more if Engineering gives you a more confident forecast that workloads will remain the same, and buy less if the forecast is less certain or Engineering expects to optimize and reduce usage.

If workloads are unpredictable or you lack sufficient Unit Economic details, do not buy large numbers of Commitment Discounts. Instead, consider buying a small number of Commitment Discounts such that they are 100% utilized.
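One simple way to size a purchase that stays fully utilized is to commit only to the lowest usage seen in the history you trust. The sketch below assumes usage is available as a list of hourly values in whatever unit you commit in (instances, normalized units, or dollars per hour); the helper names and sample numbers are hypothetical.

```python
def baseline_commitment(hourly_usage):
    """Largest commitment that would have been 100% utilized: the minimum hourly usage."""
    if not hourly_usage:
        raise ValueError("hourly_usage must not be empty")
    return min(hourly_usage)


def utilization(commitment, hourly_usage):
    """Fraction of the committed capacity that was actually used over the period."""
    if commitment <= 0 or not hourly_usage:
        return 0.0
    used = sum(min(commitment, usage) for usage in hourly_usage)
    return used / (commitment * len(hourly_usage))


# Hypothetical hourly instance counts for an unpredictable workload.
usage = [8, 10, 6, 12, 7, 9]
floor = baseline_commitment(usage)
print(floor, utilization(floor, usage))   # committing at the floor is fully used
print(utilization(10, usage))             # committing higher leaves some unused
```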

Return on Investment (ROI)

With each Commitment Discount there is a point in time where the on-demand cost you would otherwise have paid for the workload equals the total cost of the Commitment Discount: this is called the break even point (approximately month 7 for 1 year RIs and month 13 for 3 year RIs, but this changes depending on the Service, cost, and other details of the potential RI).

The below graph showcases a simple example of the break even point concept using a $50/month service. The break even point is where the red (1 YR RI) or orange (3 YR RI) cross $0.

[Figure: break even point for a $50/month service with a 1 YR RI and a 3 YR RI]

The remaining cost you would have paid between the break even point and the expiration of the Commitment Discount is effectively free, so running workloads on a Commitment Discount to the expiration date will maximize financial discounts.

The break even point is also the point where you have hit your ROI and you could make a change to a new Commitment Discount without negative financial implications, except if your forecast includes the free period after the break even point.
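The break even arithmetic can be sketched in a few lines. This assumes constant usage for the whole term and compares the total cost of the commitment against what on-demand would have cost month by month. The discount rates used here (40% for 1 year, 62% for 3 years) are illustrative assumptions, not quoted AWS prices; actual discounts vary by Service, instance type, and payment option.

```python
def break_even_month(on_demand_monthly, discount, term_months):
    """Months of on-demand spend needed to equal the total cost of the commitment."""
    if not 0.0 < discount < 1.0:
        raise ValueError("discount must be between 0 and 1 (exclusive)")
    commitment_total = on_demand_monthly * (1.0 - discount) * term_months
    return commitment_total / on_demand_monthly


def savings_at_expiration(on_demand_monthly, discount, term_months):
    """Total saved versus on-demand if the workload runs for the full term."""
    return on_demand_monthly * discount * term_months


# A $50/month service, using the assumed discount rates.
print(break_even_month(50, 0.40, 12))       # roughly 7.2 months
print(break_even_month(50, 0.62, 36))       # roughly 13.7 months
print(savings_at_expiration(50, 0.40, 12))  # roughly $240 over the year
```

Note that under these assumptions the break even month depends only on the discount and the term, not on the size of the workload.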

Expectations

When you purchase a commitment discount you are entering into a 1 year or 3 year contract. This plus the other restrictions (account, instance type, region, number of instances, etc.) are key bits of information to bring to Engineering to discuss and align.

If you buy a Commitment Discount by mistake, you can submit a billing case and request that it be canceled.

Gotchas

DynamoDB has some gotchas: Reservations require Provisioned mode and do not apply to global tables.

Unit Economics

Do you have data on unit economics to support workload history and forecasts (e.g. the number of units that causes the workloads to scale up and down)? These units can help with buying based on historical detail.

Are these units trending up or down? Are they seasonal? Do they actually impact workload costs?

Data Analysis

Compare costs that are on-demand vs. reserved (covered by Commitment Discounts). Do this for each Service you are looking to buy Commitment Discounts for.

Visualize this within Cost Explorer using the Purchase Option dimension.

You should routinely review on-demand rates as part of your FinOps Operational Reviews to determine if on-demand is growing (new workloads may lead you to needing more commitment discounts or a commitment discount could have expired).
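If you export cost data grouped by Service and purchase option, the on-demand vs. reserved comparison can be reduced to one number per Service. The sketch below is a rough view only: the row layout and the purchase option labels are hypothetical and may not match the exact values in your export, and because covered costs are already discounted, a cost-based share will somewhat understate coverage.

```python
from collections import defaultdict

# Labels treated as "covered by a Commitment Discount" (illustrative values).
COMMITTED_OPTIONS = {"Reserved", "Savings Plans"}


def covered_share_by_service(rows):
    """rows: iterable of (service, purchase_option, cost). Returns {service: covered share}."""
    totals = defaultdict(float)
    covered = defaultdict(float)
    for service, option, cost in rows:
        totals[service] += cost
        if option in COMMITTED_OPTIONS:
            covered[service] += cost
    return {service: covered[service] / total
            for service, total in totals.items() if total > 0}


# Hypothetical monthly costs (USD).
rows = [
    ("RDS", "On Demand", 400.0),
    ("RDS", "Reserved", 600.0),
    ("EC2", "On Demand", 900.0),
    ("EC2", "Savings Plans", 100.0),
]
print(covered_share_by_service(rows))
```

A Service with a low covered share and stable usage is a candidate to raise with Engineering.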

Alignment with Engineering

Engaging with Engineering before you buy will help drive alignment and ownership across Engineering!

Before engaging with Engineering, verify with Finance or the budget owner that you have firm or potential budget and approval to purchase the recommended Commitment Discounts.

Prepare the data and details for the meeting:

  • Expectations and limitations on Reserved Instances and Compute Savings Plans: term, instance family, region, costs, flexibility, ROI, etc.
  • Ensure you are prepared to explain that these are commitments (contracts) and that if infrastructure changes are made prior to the break even point in the ROI calculation, it will potentially waste the company’s money.
  • Depending on your company’s culture, you may have to present a business case to the Finance/budget owner and it may make sense to have these folks at the same meeting with Engineering to drive alignment.

[Figure: alignment with Engineering]

Engage in an open conversation with Engineering and discuss plans for the workload:

  • Review the cost and usage data together
  • Ask questions to gain an understanding of the workload’s purpose. Stress that you are here to learn and gain alignment with them, before diving deeper:
    • Is the workload static or stable? Why or why not?
    • For how long will they keep this workload like this?
    • Do any external factors raise or lower costs, like number of customers?
    • What other helpful or interesting details can be provided?
    • Ask this again: “Realistically, how long will this workload be configured like this?” If the time period is greater than the break even point then it makes sense to buy the Commitment Discount.
  • Gain alignment on purchasing the Commitment Discount(s)
    • Be clear that if they are planning to change workloads before the break even point then they need to engage with you to discuss options (will this be used by another linked account in the organization, will they need more because they are scaling up, etc.)
  • Always leave this meeting with a reminder that they should proactively engage with you to purchase Commitment Discounts to cover long term workloads so that you both can be good stewards of the company’s money.
  • Always log the details in the Inventory and confirm budget and approval before making the purchase
  • Communicate when the purchase has been made and show them the cost impact after you have the updated data

FinOps Operational Reviews

Define a routine and use a consistent process for all Commitment Discounts:

  • How often are you performing a FinOps Operational Review? Quarterly? Monthly? Biweekly? This will depend on how much workloads change. Monthly is a good place to start.
  • Review the Inventory for expiring Commitment Discounts
    • If your Engineering teams are well organized, you could be proactive and add JIRA tickets to their future roadmap, 30 or 60 days prior to expiration, such that it would trigger a conversation with you.
    • Configure emailed alerts within AWS to notify you of upcoming expiring Reserved Instances and Compute Savings Plans.
  • Schedule weekly reports to catch anomalies and changes in coverage
    • Identify new on-demand workloads in this reporting
    • Engage with Engineering on new workloads
    • Purchase as needed
  • Review Commitment Discount usage to ensure they are being used 100%, and reach out to the team(s) from the original purchase if they are not. View unused commitments via the Billing and Cost Management module or CUDOS (especially if you have multiple AWS organizations). CUDOS (QuickSight) can be configured to email you a report.
  • Consider building a KPI like Effective Savings Rate to gauge how well you are doing with managing Commitment Discounts
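The expiration check in the review above is easy to automate against the Inventory. A minimal sketch, assuming each Inventory entry can be reduced to a label and an expiration date; the `expiring_within` helper, the labels, and the dates are hypothetical.

```python
from datetime import date, timedelta


def expiring_within(inventory, today, days=60):
    """Return (label, expiration_date) entries expiring in the next `days`, soonest first."""
    cutoff = today + timedelta(days=days)
    due = [(label, expires) for label, expires in inventory
           if today <= expires <= cutoff]
    return sorted(due, key=lambda item: item[1])


# Hypothetical Inventory entries.
inventory = [
    ("rds-prod-ri", date(2025, 7, 20)),
    ("compute-sp-1", date(2026, 3, 1)),
    ("cache-ri", date(2025, 6, 25)),
]
for label, expires in expiring_within(inventory, today=date(2025, 6, 1)):
    print(label, expires.isoformat())
```

The 60-day window matches the lead time suggested above for getting a conversation onto a team's roadmap; adjust it to your own review cadence.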

Also consider other operational reviews: Are there Engineering Operations Reviews, Architecture Review Boards, Change Management Boards, or other calls where you can be a fly on the wall to understand upcoming workloads or changes in workloads?

Gauging Success

While coverage rate is a helpful indicator of success, a better key performance indicator (KPI) is Effective Savings Rate (ESR).

Effective Savings Rate is the Return on Investment (ROI) for cloud discount instruments and the one output metric with which you can measure true savings performance.

Calculating ESR, understanding it, and reviewing it at least monthly is recommended.
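The formula is not spelled out here, so the sketch below assumes the commonly used definition: ESR = 1 - (actual spend after discounts / on-demand equivalent spend), where actual spend includes the cost of the commitments themselves. The helper name and the dollar figures are hypothetical.

```python
def effective_savings_rate(on_demand_equivalent, actual_spend):
    """ESR as a fraction: share of the on-demand-equivalent bill that was saved.

    `on_demand_equivalent` is what the same usage would have cost at on-demand rates.
    `actual_spend` is what was really paid, including commitment costs (used or not).
    """
    if on_demand_equivalent <= 0:
        raise ValueError("on_demand_equivalent must be positive")
    return 1.0 - actual_spend / on_demand_equivalent


# Hypothetical month: $100,000 on-demand equivalent, $78,000 actually paid.
print(f"{effective_savings_rate(100_000, 78_000):.1%}")
```

Because unused commitments still count toward actual spend, overcommitting pulls ESR down, which is why it is a more honest indicator than coverage rate alone.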

