To guarantee higher availability and better performance, S3 has for years relied on an eventual consistency model. During the first week of re:Invent, AWS announced that S3 now supports strong read-after-write consistency.
One of the key technology aspects of S3 and other large-scale distributed systems has been the eventual consistency model: after a call to an S3 API that stores or modifies data, there has been a small time window in which the data has been durably stored but not yet visible to all GET requests. Since the launch fourteen years ago, eventual consistency has been seen as a trade-off required by S3's distributed design.
Source: https://aws.amazon.com/blogs/aws/amazon-s3-update-strong-read-after-write-consistency
Jeff Barr, chief evangelist at AWS, describes the change on the AWS blog:
Effective immediately, all S3 GET, PUT, and LIST operations, as well as operations that change object tags, ACLs, or metadata, are now strongly consistent. What you write is what you will read, and the results of a LIST will be an accurate reflection of what's in the bucket. This applies to all existing and new S3 objects, works in all regions, and is available to you at no extra charge! There’s no impact on performance, you can update an object hundreds of times per second if you'd like, and there are no global dependencies.
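The new guarantee can be illustrated with a minimal boto3 sketch (the bucket and key names below are placeholders): after a PUT completes, an immediate GET of the same key returns the data just written, whereas under the previous model the read could briefly return a stale or missing object.

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "example-bucket", "reports/latest.json"  # placeholder names

# Write (or overwrite) an object...
s3.put_object(Bucket=bucket, Key=key, Body=b'{"status": "updated"}')

# ...and read it back immediately: with strong read-after-write consistency
# the GET is guaranteed to return the body written above. Previously, a GET
# issued right after an overwrite could still return the older version.
body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
assert body == b'{"status": "updated"}'
```

The same now applies to LIST: a key written by a completed PUT will show up in a subsequent listing of the bucket.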
In other services, such as DynamoDB, AWS already offered strong read consistency, but only as a non-default option and at a price premium (see the sketch after the quote below). The strong consistency change for S3 has nevertheless surprised many developers and triggered interesting debates. Forrest Brazeal, senior manager at A Cloud Guru and AWS Serverless Hero, tweets:
S3 is now strongly consistent. No config changes, no caveats, it just is. This is just an unreal flex by the Greatest Cloud Service of All Time.
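For comparison, this is roughly what the opt-in model looks like in DynamoDB (table and key names are placeholders): strong reads must be requested per call via the ConsistentRead flag and consume twice the read capacity of the default eventually consistent reads.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Default read: eventually consistent, may return slightly stale data.
item = dynamodb.get_item(
    TableName="user-profiles",          # placeholder table name
    Key={"user_id": {"S": "42"}},
)

# Opt-in strong read: guaranteed to reflect all prior successful writes,
# but billed at twice the read capacity of an eventually consistent read.
item = dynamodb.get_item(
    TableName="user-profiles",
    Key={"user_id": {"S": "42"}},
    ConsistentRead=True,
)
```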
He wrote a detailed article to explain why he believes that "S3 is an engineering marvel". Other users have instead raised questions about the S3 achievement. Colin Percival, computer scientist and FreeBSD security officer emeritus, asks:
Having strong consistency from S3 is awesome, but I don't understand the claim of "no impact to availability". If S3 gets partitioned, it can't keep this improved consistency guarantee without sacrificing availability...
Michael Shapiro agrees:
I'm also confused as to how they claim there's no performance impact. How is there no tradeoff between consistency and latency?
Alex Chan, software developer at the Wellcome Trust, questions the benefits:
A lot of discussion of S3’s strong consistency has marvelled at the technical skill and smarts required to pull this off. Is that why we’re all so impressed (and it’s certainly impressive), or does it also explain some new use cases that I’m missing?
Luc van Donkersgoed, head of AWS technology at Sentia Group, wrote an article explaining how it helps when working with data lakes:
In many use cases eventual consistency is fine. Let’s say you store profile pictures for users on social media. When someone updates their image and somebody else views their profile in the same second, it doesn’t matter if the old image is still there. It will be updated the next time they visit. However, S3 is used for data lakes more and more. In these use cases, your S3 buckets might contain reports, analytics data, clickstreams, and many other types of time-sensitive data. With the release of strong read-after-write consistency, these data processing applications are now guaranteed to have the latest data available.
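A minimal sketch of the data lake scenario he describes, with placeholder bucket and prefix names: a producer writes a new partition file, and a consumer listing the prefix immediately afterwards is now guaranteed to see it.

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-data-lake"                 # placeholder bucket name
prefix = "clickstream/date=2020-12-01/"      # placeholder partition prefix

# Producer: write a new partition file.
s3.put_object(Bucket=bucket, Key=prefix + "part-0007.parquet", Body=b"...")

# Consumer: list the partition right away. With strongly consistent LIST,
# the freshly written object is guaranteed to appear, so downstream jobs
# no longer risk processing an incomplete view of the partition.
response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])
```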
Dropbox, a long-term AWS customer, shows in a video how strong consistency on S3 simplified working with their 34 PB data lake.
At re:Invent, AWS announced other improvements and new features for S3, including replication to multiple destination buckets, two-way replication across regions, and bucket keys.
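As a rough sketch of replication to multiple destination buckets (the bucket names, account ID, and IAM role ARN below are placeholders, and the source bucket is assumed to have versioning enabled), a source bucket can define one replication rule per destination:

```python
import boto3

s3 = boto3.client("s3")

# Sketch only: each rule targets a different destination bucket, which is
# what the new multi-destination replication feature enables.
s3.put_bucket_replication(
    Bucket="example-source-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::123456789012:role/example-replication-role",
        "Rules": [
            {
                "ID": "to-eu-west-1",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {},
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": "arn:aws:s3:::example-dest-eu"},
            },
            {
                "ID": "to-us-west-2",
                "Priority": 2,
                "Status": "Enabled",
                "Filter": {},
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": "arn:aws:s3:::example-dest-us"},
            },
        ],
    },
)
```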