
Amazon Introduces AWS Batch Preview


At the recent AWS re:Invent conference, Amazon announced a new service in preview called AWS Batch. AWS Batch allows organizations to optimize job scheduling and workload execution across cloud-based infrastructure. Amazon built the service in response to many AWS customers building their own batch platforms using EC2 instances, containers and CloudWatch.

Running batch schedulers and jobs is not a new paradigm. Batch processing has traditionally been managed on-premises using fixed infrastructure in the form of clusters. Inevitably, customers either overbuild and are left with underutilized infrastructure, or underbuild and miss opportunities. Jeff Barr, chief evangelist at AWS, sees an opportunity to modernize traditional approaches by using the cloud:

We believe that cloud computing has the potential to change the batch computing model for the better, with fast access to many different types of EC2 instances, the ability to scale up and down in response to changing needs, and a pricing model that allows you to bid for capacity and to obtain it as economically as possible.

Image Source: (screenshot) https://www.youtube.com/watch?v=ZDScBNahsL4

AWS Batch does not require any installation on servers and can dynamically provision compute resources, including Amazon EC2 Spot Instances, which allow customers to bid on spare Amazon EC2 computing capacity. Job priority and dependency management are also built into the service. Amazon wants customers to focus on describing their business requirements to the service and let AWS Batch take care of the rest.

Image Source: (screenshot) https://www.youtube.com/watch?v=ZDScBNahsL4

In the session Introducing AWS Batch: Easy and Efficient Batch Computing on AWS at AWS re:Invent, Jamie Kinney, principal product manager at AWS, introduced the following AWS Batch concepts:

  • Jobs represent units of work. They are submitted to Job Queues, where they reside and are prioritized until they can be attached to a compute resource.
  • Job Definitions specify how Jobs are to be run. While each Job must reference a Job Definition, many parameters can be overridden, including vCPU, memory, mount points and container properties.
  • Job Queues store Jobs until they are ready to run. Jobs may wait in the queue while dependent Jobs are being executed or waiting for system resources to be provisioned.
  • Compute Environments include both Managed and Unmanaged environments. Managed compute environments enable you to describe your business requirements (instance types, minimum, maximum and desired vCPUs, and so on) and AWS will launch and scale resources on your behalf. Unmanaged environments allow you to launch and manage your own resources, such as containers.
  • The Scheduler evaluates when, where and how to run Jobs that have been submitted to a Job Queue. Jobs run in approximately the order in which they are submitted, as long as all dependencies on other Jobs have been met.
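
The interplay between submission order and dependencies described in the last bullet can be illustrated with a toy model. This is only a sketch: it is not the AWS Batch API, nor Amazon's actual scheduling algorithm. The names Job and run_order are hypothetical, and the model assumes a single queue that runs one job at a time.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Job:
    """A unit of work; depends_on names the jobs that must finish first."""
    name: str
    depends_on: list[str] = field(default_factory=list)


def run_order(queue: list[Job]) -> list[str]:
    """Return the order in which a toy scheduler would run the queued jobs.

    Jobs are considered in submission order, and a job becomes runnable
    only once every job it depends on has finished.
    """
    pending = list(queue)
    finished: list[str] = []
    while pending:
        # Pick the earliest-submitted job whose dependencies are all done.
        runnable = next(
            (job for job in pending
             if all(dep in finished for dep in job.depends_on)),
            None,
        )
        if runnable is None:
            raise ValueError("remaining jobs have unmet or circular dependencies")
        pending.remove(runnable)
        finished.append(runnable.name)
    return finished


# Jobs listed in submission order (hypothetical example workload).
queue = [
    Job("report", depends_on=["transform"]),
    Job("extract"),
    Job("transform", depends_on=["extract"]),
    Job("cleanup"),
]
print(run_order(queue))
```

In this model, "report" was submitted first but runs third, because it has to wait for "transform", which in turn waits for "extract". Jobs without dependencies keep their submission order relative to each other.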

Amazon has provided some guidance on when AWS Batch should and should not be used. For use cases that move large amounts of data, such as ETL or Big Data processing, Amazon encourages customers to consider EMR, Data Pipeline, Redshift or other related data processing tools. In scenarios where customers have many small cron jobs, AWS Batch could be used to execute these jobs, but Kinney suggests that customers “will likely need a workflow or job-scheduling system to manage schedules and orchestrate job submissions.”

However, Kinney feels that AWS Batch is an ideal tool “for customers who run a lot of big and small compute jobs on heterogeneous compute resources.”

AWS Batch is currently in preview in the US East (Northern Virginia) Region. Once the service reaches General Availability, it will be made available in other regions. Support for Array jobs and for jobs executed as AWS Lambda functions is on the near-term AWS Batch roadmap.

 
