Amazon Web Services (AWS) recently introduced significant changes to how users request and operate Amazon EC2 spot instances, which can provide considerable cost savings. Users can now request spot instances without specifying a bid price, spot prices adjust more gradually, and spot instances can be stopped or hibernated and later resumed to further optimize interruptible workloads.
Amazon EC2 Spot Instances are spare compute capacity that AWS makes available at discounts of up to 90% compared to On-Demand prices, but which can be reclaimed when AWS needs the capacity back (forcing the spot instance to terminate or, optionally, to stop or hibernate). They are well suited for workloads that are fault-tolerant or can be interrupted, such as test and development environments, CI/CD pipelines, stateless web services, batch processing, analytics, and machine learning.
While AWS has evolved the spot instance concept over time, resilient and efficient use of this low-cost capacity pool has so far required sufficient engineering maturity to deal with sudden instance terminations, as well as an understanding of bidding strategies for the spot market. Depending on factors such as instance type, time of day, and region, prices could fluctuate widely (reaching up to ten times the On-Demand price), causing frequent interruptions and potentially complex capacity and cost calculations.
AWS has now moved to a pricing model "where prices adjust more gradually, based on longer-term trends in supply and demand". Notably, the default maximum spot price is now the On-Demand price, so many users will not need to specify a bid price at all unless they want to limit their instance budget further. These benefits also apply when using a Spot Fleet to automate the management of potentially diversified instance type pools, for example through AWS services that can use spot instances to provide compute capacity, such as AWS Batch, Amazon ECS, and Amazon EMR.
While the impact on the average cost savings potential varies between instance types and regions, the new spot pricing model yields considerably reduced fluctuations and smoother pricing trends. Accordingly, the spot instance advisor now reports a 'low' frequency of interruptions for most instance types, and the spot pricing history clearly visualizes these changes as well:
Image: Amazon EC2 spot pricing model change for instance type m4.2xlarge in eu-west-1
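The same pricing history can also be inspected from the command line. The following is a minimal sketch, assuming a configured AWS CLI; `describe-spot-price-history` is an existing EC2 subcommand, while the function names and the prices used in the usage note are hypothetical and chosen only for illustration.

```shell
# Sketch: look up recent spot prices and estimate savings versus On-Demand.
# Function names are hypothetical; the first one needs a configured AWS CLI.

# Print timestamp, availability zone, and spot price for one instance type.
# Arguments: instance type, region, start time (ISO 8601, UTC).
spot_price_history() {
  aws ec2 describe-spot-price-history \
    --instance-types "$1" \
    --region "$2" \
    --product-descriptions "Linux/UNIX" \
    --start-time "$3" \
    --query 'SpotPriceHistory[].[Timestamp,AvailabilityZone,SpotPrice]' \
    --output text
}

# Print the percentage saved by paying the spot price instead of On-Demand.
# Arguments: spot price, On-Demand price (same currency and unit).
savings_percent() {
  awk -v spot="$1" -v od="$2" 'BEGIN { printf "%.1f\n", (1 - spot / od) * 100 }'
}
```

For example, `spot_price_history m4.2xlarge eu-west-1 "$(date -u +%Y-%m-%dT00:00:00Z)"` would list today's prices for the instance type shown in the chart, and `savings_percent 0.10 0.40` prints `75.0` for a hypothetical spot price of $0.10 against a hypothetical On-Demand price of $0.40.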
AWS has also introduced a simplified spot instance request model via the run-instances API, which returns an instance ID immediately when capacity is available, removing the need to poll for the state of an asynchronous spot request via the old request-spot-instances API. Requesting a spot instance requires only a single additional parameter, which can simply be added to existing scripts and services to realize the cost savings:
$ aws ec2 run-instances --instance-market-options '{"MarketType":"spot"}' \
    --image-id ami-1a2b3c4d --count 1 --instance-type c3.large
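Users who still want to cap what they pay can pass a maximum price alongside the market type. The sketch below assumes the `SpotOptions.MaxPrice` field of the `--instance-market-options` parameter as described in the EC2 API reference, and uses the lowercase `spot` market type listed there; the helper name and the price value are hypothetical.

```shell
# Sketch: build the JSON value for --instance-market-options.
# The helper name is hypothetical. MaxPrice is optional; when omitted,
# the request is capped at the On-Demand price.

# Argument (optional): maximum hourly price as a string, e.g. 0.05
spot_market_options() {
  if [ -n "${1:-}" ]; then
    printf '{"MarketType":"spot","SpotOptions":{"MaxPrice":"%s"}}\n' "$1"
  else
    printf '{"MarketType":"spot"}\n'
  fi
}

# Example call (not executed here; the AMI ID is the article's placeholder):
# aws ec2 run-instances \
#   --instance-market-options "$(spot_market_options 0.05)" \
#   --image-id ami-1a2b3c4d --count 1 --instance-type c3.large
```

Keeping the JSON in a small helper means existing launch scripts only need to swap in one argument to opt in to, or out of, a price cap.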
Furthermore, instead of being terminated when interrupted by EC2, spot instances can now optionally be stopped or hibernated and then resumed when capacity for the same instance type is available again. Note that this is a service-level optimization: users cannot manually stop and start spot instances themselves. Jeff Barr (Chief Evangelist, AWS) summarizes the motivating use cases as follows:
When capacity becomes available, the instances are started and can keep on going without having to spend time provisioning applications, setting up EBS volumes, downloading data, joining network domains, and so forth.
The main difference between stopping and hibernating an instance is that the latter also persists data from RAM to the EBS root volume, thereby providing additional benefits for workloads that "keep a lot of state in memory". Stopping a spot instance has very few requirements besides a root EBS volume. Hibernation additionally requires an agent on a supported operating system and is so far available only for the most commonly used EC2 instance types. In addition, AWS strongly recommends that users "use an encrypted EBS volume as the root volume" for hibernation so that "contents of memory (RAM) are encrypted when the data is at rest on the volume".
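As a rough illustration of how the interruption behavior might be selected at request time, the sketch below builds the market options JSON with the `SpotInstanceType` and `InstanceInterruptionBehavior` fields from the EC2 API reference. It assumes that stop and hibernate require a persistent spot request, as the EC2 documentation describes; the helper name is hypothetical, and the additional hibernation prerequisites (agent, supported instance type, root volume size) are not covered.

```shell
# Sketch: build --instance-market-options JSON with an interruption behavior.
# The helper name is hypothetical.
# Argument (optional): terminate (default), stop, or hibernate.
spot_options_with_behavior() {
  behavior="${1:-terminate}"
  case "$behavior" in
    stop|hibernate)
      # Stopped or hibernated instances resume under a persistent request.
      request_type="persistent" ;;
    terminate)
      request_type="one-time" ;;
    *)
      echo "unsupported interruption behavior: $behavior" >&2
      return 1 ;;
  esac
  printf '{"MarketType":"spot","SpotOptions":{"SpotInstanceType":"%s","InstanceInterruptionBehavior":"%s"}}\n' \
    "$request_type" "$behavior"
}
```

The output of `spot_options_with_behavior stop` can be passed to `aws ec2 run-instances --instance-market-options` in the same way as the plain spot request shown earlier.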
In related news, AWS has recently addressed a long-standing feature request by making its two-minute warning for spot instance termination available via Amazon CloudWatch Events (previous coverage). Based on the new spot instance interruption notices, users can now trigger push notifications and automate dependent activities via the same event bus and action targets already available for most other AWS resource changes.
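A rule that forwards these notices could look roughly like the following sketch. It assumes a configured AWS CLI and an existing target such as an SNS topic that CloudWatch Events is permitted to publish to; `put-rule` and `put-targets` are existing `aws events` subcommands, while the function names, the rule name, and the target ID are hypothetical.

```shell
# Sketch: route spot interruption notices to a notification target.
# Function names and the target ID are hypothetical.

# Event pattern matching the two-minute interruption warning emitted by EC2.
interruption_event_pattern() {
  printf '%s\n' '{"source":["aws.ec2"],"detail-type":["EC2 Spot Instance Interruption Warning"]}'
}

# Create the rule and attach one target (for example an SNS topic ARN).
# Arguments: rule name, target ARN. Requires a configured AWS CLI.
create_interruption_rule() {
  aws events put-rule --name "$1" \
    --event-pattern "$(interruption_event_pattern)" &&
  aws events put-targets --rule "$1" \
    --targets "Id=spot-interruption-target,Arn=$2"
}
```

Calling `create_interruption_rule` with a rule name and a topic ARN of the user's choosing would then deliver each interruption warning to that target.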
Google Cloud Platform (GCP) and, more recently, Microsoft Azure also offer lower-priced virtual machines for applicable workloads, but the pricing models and operational constraints of GCP's preemptible VM instances and Azure's low-priority VMs differ considerably from those of EC2 spot instances.
The Amazon EC2 documentation features a user guide for Linux and Windows instances, including sections on getting started with EC2 spot instances and spot instance interruptions, the AWS CLI reference, and the API reference. The Amazon EC2 Spot Instances pricing page lists the current compute time charges. Interrupted instances do not incur compute charges while stopped or hibernated, but their preserved volumes are subject to regular usage-based EBS pricing. Support is provided via the Amazon EC2 forum.