Recently, Google announced a partnership with Databricks to bring Databricks' fully managed Apache Spark offering and data lake capabilities to Google Cloud. The offering will be available as Databricks on Google Cloud.
Databricks on Google Cloud will be tightly integrated with Google Cloud’s infrastructure and analytics capabilities. By integrating Databricks with Google Kubernetes Engine (GKE), customers will get a fully container-based Databricks runtime in the cloud. Furthermore, Databricks has an optimized connector with Google BigQuery that allows easy access to data in BigQuery directly via its Storage API for high-performance queries.
The integration of Databricks with Looker and support for SQL Analytics, along with an open API environment on Google Cloud, will let customers query the data lake directly, providing an entirely new visualization experience. Finally, once the offering becomes available, customers can deploy Databricks through the Google Cloud Marketplace with unified billing and one-click setup inside the Google Cloud console.
According to a Google blog post by Kevin Ichhpurani, corporate vice president, Global Ecosystem at Google Cloud, customers can reduce overhead by using the managed services in Google Cloud AI Platform when deploying models built on Databricks. They can also pair Databricks with Google Cloud services to make the most of their data management and analytics investments. Finally, they can save on infrastructure by having Databricks, Google Cloud, and additional analytics applications work side by side on one shared infrastructure.
Google follows Microsoft and AWS in offering Databricks on its cloud platform. Microsoft released Azure Databricks in 2018; it is available in 30 regions, including the recent addition of Azure China. Azure Databricks is also a first-party Microsoft Azure service that is sold and supported directly by Microsoft. AWS likewise offers Databricks on AWS in various regions, alongside its own managed Spark offering, EMR. By partnering with Google, Databricks is now the only unified data platform available on all three major cloud providers.
In a Google press release, Thomas Kurian, CEO at Google Cloud, said:
We’re delighted to deliver Databricks’ lakehouse for AI and ML-driven analytics on Google Cloud. By combining Databricks’ capabilities in data engineering and analytics with Google Cloud’s global, secure network—and our expertise in analytics and delivering containerized applications—we can help companies transform their businesses through the power of data.
In addition, Ali Ghodsi, CEO and co-founder of Databricks, said in the same press release:
This is a pivotal milestone that underscores our commitment to enable customer flexibility and choice with a seamless experience across cloud platforms. We are thrilled to partner with Google Cloud and deliver on our shared vision of a simplified, open, and unified data platform that supports all analytics and AI use-cases that will empower our customers to innovate even faster.
Customers can now sign up for the public preview of Databricks on Google Cloud, which is scheduled for March.