The Azure team recently announced a set of new capabilities for Azure Synapse Link for Azure Cosmos DB: support for existing MongoDB collections, integration with continuous backup, and custom partitioning. Together, these make it easier to run analytics workloads on top of Azure Cosmos DB data.
Among these capabilities, the most prominent is the general availability of Synapse Link for existing MongoDB collections. Users can now apply Synapse Link to collections that were created without the feature enabled.
Users can use the command-line interface (CLI) or PowerShell to enable Synapse Link on their pre-existing MongoDB collections. This removes the need for time-consuming data export processes and gives users a smoother workflow.
To demonstrate how to activate Synapse Link for an existing MongoDB collection, an official announcement blog post provided a CLI example:
az cosmosdb mongodb collection update -g <your_resource_group> -a <your_database_account> -d <your_database> -n <your-collection> --analytical-storage-ttl -1
It's important to note that the time required for the initial synchronization of a collection with the analytical store can vary significantly depending on data volume and document complexity. This process may range from a matter of seconds to multiple days.
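The activation flow described above can be sketched as a small Bash script. This is an illustration rather than the announcement's own tooling: the resource names are hypothetical placeholders, the run helper and the DRY_RUN switch are invented for this example, and the first step assumes that analytical storage has not yet been enabled at the account level. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: enable Synapse Link on an existing MongoDB collection.
# All resource names are hypothetical placeholders; replace them with your own.
set -euo pipefail

RESOURCE_GROUP="my-resource-group"
ACCOUNT="my-cosmos-account"
DATABASE="my-database"
COLLECTION="my-collection"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

enable_synapse_link() {
  # Step 1 (assumption: not already done): turn on analytical storage
  # for the database account.
  run az cosmosdb update -g "$RESOURCE_GROUP" -n "$ACCOUNT" \
    --enable-analytical-storage true

  # Step 2: enable the analytical store on the existing collection.
  # A TTL of -1 retains analytical data indefinitely.
  run az cosmosdb mongodb collection update -g "$RESOURCE_GROUP" \
    -a "$ACCOUNT" -d "$DATABASE" -n "$COLLECTION" \
    --analytical-storage-ttl -1
}

enable_synapse_link
```

Running the script with DRY_RUN=0 executes the commands against the signed-in Azure subscription. The initial synchronization then proceeds in the background, so the script returning does not mean the analytical store is fully populated.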
According to the announcement, Azure Synapse Link is available for Azure Cosmos DB SQL API and Azure Cosmos DB API for MongoDB accounts. It is in preview for Gremlin API, where it is activated via CLI commands.
The next significant news is that the integration of Synapse Link with continuous backup is now generally available. Users can activate Synapse Link on Azure Cosmos DB database accounts that have continuous backup enabled, extending the analytical capabilities of their databases. As reported, the activation process is flexible: users can use the portal, CLI, PowerShell, or the Cosmos DB SDKs.
Moreover, Synapse Link can be used in database accounts that already have point-in-time restore (PITR) enabled, preserving data integrity and accessibility for a wide range of scenarios. The reverse direction, enabling continuous backup on accounts that already have Synapse Link enabled, is currently in private preview, and users can register to participate.
Another significant enhancement is the general availability of custom partitioning for the analytical store, using keys that are fields in the documents. As mentioned in the documentation, custom partitioning lets you partition analytical store data on fields that are commonly used as filters in analytical queries, resulting in improved query performance.
Users can create separate partitioned stores for different groups of keys, each with its own schedule and filters. These partitioned stores are securely stored in the Azure Data Lake Store, ensuring seamless data visualization and access management within the Synapse Workspace.
(Source: Microsoft DevBlogs, Synapse Link custom partitioning architecture, article: New features for Azure Synapse Link for Azure Cosmos DB)
Lastly, more details on Azure Synapse Link are available on the documentation page and in the Azure Cosmos DB GitHub repo, and details on Synapse Link pricing can be found on the Azure pricing page.