AWS recently announced Amazon SageMaker Feature Store, a new capability of Amazon SageMaker that provides a fully managed, purpose-built repository for machine learning (ML) features. It makes it easier for customers to store, update, retrieve, and share features for training and inference.
Amazon SageMaker Feature Store is one of nine significant updates to the cloud-based machine learning platform Amazon SageMaker that were announced during the annual re:Invent conference. With the Feature Store, the company aims to solve the problem of storing features that are mapped to multiple models. The new capability makes it much easier to name, organize, find, and share sets of features among teams of developers and data scientists. Furthermore, the Feature Store resides in SageMaker Studio, close to where ML models are run; according to an AWS press release on the new Amazon SageMaker capabilities, it therefore provides single-digit millisecond latency for inference.
Julien Simon, an artificial intelligence & machine learning evangelist at Amazon, wrote in a blog post on Amazon SageMaker Feature Store:
Amazon SageMaker Feature Store is designed for fast and efficient access for real time inference, with a P95 latency lower than 10ms for a 15-kilobyte payload. This makes it possible to query for engineered features at prediction time, and to replace raw features sent by the upstream application with the exact same features used to train the model.
Source: https://docs.aws.amazon.com/sagemaker/latest/dg/feature-store.html
Users can organize and store their engineered features in Amazon SageMaker Feature Store by grouping them in feature groups. A feature group is a collection of records, similar to rows in a table. Each record in the feature group has a unique identifier and holds the engineered feature values for one of the data instances from the original data source. Furthermore, users can choose to encrypt the data at rest using their own AWS Key Management Service (KMS) key, which can be unique for each feature group.
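To make the record structure more concrete, the following is a minimal Python sketch of how one record might be prepared for a feature group. It assumes the boto3 `sagemaker-featurestore-runtime` client and its `put_record` call, which takes a record as a list of `FeatureName` / `ValueAsString` pairs. The feature names, the sample values, and the helper functions `to_feature_store_record` and `write_record` are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List


def to_feature_store_record(features: Dict[str, Any]) -> List[Dict[str, str]]:
    """Convert a plain dict of feature values into a list of
    FeatureName / ValueAsString pairs, skipping missing values."""
    return [
        {"FeatureName": name, "ValueAsString": str(value)}
        for name, value in features.items()
        if value is not None
    ]


# Hypothetical record for one data instance (a single customer).
record = to_feature_store_record(
    {
        "customer_id": "C-1001",               # assumed record identifier feature
        "event_time": "2020-12-01T00:00:00Z",  # assumed event-time feature
        "avg_basket_value": 42.5,
        "orders_last_30d": 3,
    }
)


def write_record(feature_group_name: str, record: List[Dict[str, str]]) -> None:
    """Send one record to a feature group.

    Needs boto3, AWS credentials, and an existing feature group,
    so it is defined here but not called.
    """
    import boto3  # imported lazily so the rest of the sketch runs without it

    client = boto3.client("sagemaker-featurestore-runtime")
    client.put_record(FeatureGroupName=feature_group_name, Record=record)
```

The conversion step is kept separate from the network call so that the record layout can be inspected locally; only `write_record` touches AWS, and the feature group it writes to would have to be created beforehand.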
Mammad Zadeh, VP of engineering, Data Platform at Intuit, said in the AWS Press release on the new Amazon SageMaker capabilities:
We have worked closely with AWS in the lead up to the release of Amazon SageMaker Feature Store, and we are excited by the prospect of a fully-managed feature store so that we no longer have to maintain multiple feature repositories across our organization. Our data scientists will be able to use existing features from a central store and drive both standardisation and reuse of features across teams and models.
Also, Holger Mueller, principal analyst and vice president at Constellation Research Inc., told InfoQ:
Building AI models is a challenging process. One of the many challenges is that specific re-usable algorithms may not be identical during numerous data runs and over time. Amazon addresses this with SageMaker Feature Store - creating a repository of re-usable algorithms (the "features") across different AI model runs in the AI model life cycle.
Currently, Amazon SageMaker Feature Store is generally available in all AWS regions in the Americas and Europe and in some regions in Asia Pacific, with additional regions coming soon. Details of the new capability and sample notebooks are available on the documentation pages. Pricing is based on the number of feature reads and writes and on the total amount of data stored.