Amazon recently announced the general availability of Redshift Serverless, an elastic option to scale data warehouse capacity. The new service allows data analysts, developers and data scientists to run and scale analytics without provisioning and managing data warehouse clusters.
Announced in preview at the latest re:Invent, Redshift Serverless is designed for variable workloads, unpredictable spikes and development environments. With the serverless option, the compute capacity scales vertically and automatically based on the workload and shuts down during periods of inactivity. Danilo Poccia, chief evangelist EMEA at AWS, explains the main benefits:
This allows more companies to build a modern data strategy, especially for use cases where analytics workloads are not running 24-7 and the data warehouse is not active all the time. It is also applicable to companies where the use of data expands within the organization and users in new departments want to run analytics without having to take ownership of data warehouse infrastructure.
Developers can connect to a Redshift endpoint using a client tool via JDBC/ODBC or with the Redshift Query Editor v2, a web-based SQL application available on the AWS console. It is also possible to access the database and perform queries using the Redshift Data API, integrating with Lambda functions and SageMaker notebooks.
Source: https://aws.amazon.com/blogs/aws/amazon-redshift-serverless-now-generally-available-with-new-capabilities/
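As a rough illustration of the Data API path mentioned above, the following Python sketch submits a statement to a serverless workgroup and polls for the result. It assumes a boto3 `redshift-data` client is passed in; the helper name `run_query`, the workgroup name and the database name are placeholders for illustration, not taken from the announcement, and only the first page of results is read.

```python
import time


def run_query(client, workgroup, database, sql, poll_seconds=0.5):
    """Submit SQL through the Redshift Data API and return the first page of rows."""
    # Passing WorkgroupName (instead of ClusterIdentifier) targets a serverless workgroup.
    statement_id = client.execute_statement(
        WorkgroupName=workgroup, Database=database, Sql=sql
    )["Id"]

    # The Data API is asynchronous, so poll until the statement reaches a final state.
    while True:
        status = client.describe_statement(Id=statement_id)
        if status["Status"] in ("FINISHED", "FAILED", "ABORTED"):
            break
        time.sleep(poll_seconds)

    if status["Status"] != "FINISHED":
        raise RuntimeError(status.get("Error", status["Status"]))
    if not status.get("HasResultSet"):
        return []

    result = client.get_statement_result(Id=statement_id)
    # Each field is a one-key dict such as {"longValue": 1} or {"stringValue": "a"};
    # keep only the value. Pagination via NextToken is left out of this sketch.
    return [[next(iter(field.values())) for field in record]
            for record in result["Records"]]


# Example wiring (needs boto3 and AWS credentials; names are placeholders):
#   import boto3
#   rows = run_query(boto3.client("redshift-data"), "my-workgroup", "dev", "SELECT 1")
```

Taking the client as a parameter keeps the helper usable from a Lambda function or a notebook alike, and makes it easy to exercise with a stub instead of a live endpoint.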
Before the availability of Redshift Serverless, developers built open source tools to automatically pause and resume Redshift clusters using AWS Lambda, CloudWatch and Step Functions. Unlike in the preview, it is now possible to create multiple serverless endpoints per region using namespaces and workgroups, and to define query monitoring rules to help manage costs.
Jeevan Dongre, CEO and co-founder of Antstack, comments on the trend towards serverless analytics:
Another milestone in the serverless journey! AWS Redshift Serverless goes live. Serverless plays a major role in the coming days, especially in the data play.
The serverless option is not the only improvement announced for Redshift. The cloud provider introduced Row-Level Security, the ability to restrict access to a subset of rows within a table based on a user's job role or permissions and on the sensitivity of the data. Offering the same performance as user-created materialized views, the new Automated Materialized Views are now generally available and help lower query latency for repeated workloads. Finally, Redshift recently improved cluster resize performance and the flexibility of cluster restore.
Addressing a tweet about the pricing model of the serverless option, Poccia writes:
Redshift measures data warehouse capacity in Redshift Processing Units (RPUs). You pay for the workloads you run in RPU-hours on a per-second basis with a 60-second minimum charge. You can specify the base capacity in RPUs between 32 and 512 RPUs. You can also configure daily, weekly, or monthly usage limits in RPU-hours to control your costs.
Redshift Serverless is currently available in a subset of AWS regions, including Northern Virginia, Frankfurt and Ireland. AWS lowered the RPU price by 25% compared to the preview period, starting at 0.375 USD per RPU-hour in the Northern Virginia region.
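To make the pricing model concrete, here is a small back-of-the-envelope calculation in Python. It is only a sketch: it assumes the workload runs at a constant RPU level for the whole burst, uses the quoted Northern Virginia price of 0.375 USD per RPU-hour, and applies the 60-second minimum charge. The function name and parameters are illustrative, and actual bills depend on how capacity scales during the workload.

```python
def estimate_cost_usd(rpus, runtime_seconds, price_per_rpu_hour=0.375, minimum_seconds=60):
    """Estimate the charge for one burst of activity at a constant RPU level."""
    # Per-second billing, but never less than the minimum charge window.
    billed_seconds = max(runtime_seconds, minimum_seconds)
    rpu_hours = rpus * billed_seconds / 3600
    return rpu_hours * price_per_rpu_hour


# A 30-second query at 32 RPUs is billed as 60 seconds.
short_burst = estimate_cost_usd(32, 30)
# A 10-minute workload at 32 RPUs is billed for its full 600 seconds.
longer_run = estimate_cost_usd(32, 600)
print(f"{short_burst:.2f} USD, {longer_run:.2f} USD")  # prints "0.20 USD, 2.00 USD"
```

Under these assumptions, a short query on the minimum base capacity costs about 0.20 USD, which shows why the 60-second minimum matters most for workloads made of many brief, widely spaced queries.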