Database Content on InfoQ
-
Building End-to-End Field Level Lineage for Modern Data Systems
In this article, the authors discuss data lineage as a critical component of data pipeline root cause and impact analysis workflows, and how automating lineage creation and abstracting metadata to the field level helps with root cause analysis.
-
An Introduction and Tutorial for Azure Cosmos DB
Azure Cosmos DB is a globally distributed, JSON-based database delivered as a ‘Platform as a Service’ (PaaS) in Microsoft Azure. Learn about the benefits and disadvantages of Azure Cosmos DB. Find out more about this database and discover how to interact with it using tools, SDKs, and APIs.
-
The Next Evolution of the Database Sharding Architecture
In this article, author Juan Pan discusses data sharding architecture patterns in a distributed database system. She explains how the Apache ShardingSphere project solves data sharding challenges, and walks through two practical examples of using DistSQL: creating a distributed database and creating an encrypted table.
-
Developing Deep Learning Systems Using Institutional Incremental Learning
Institutional incremental learning promises to achieve collaborative learning. This form of learning can address data sharing and security issues without bringing in the complexities of federated learning. This article covers practical approaches that help in building an object detection system.
-
You’re Doing it Wrong: it’s Not about Data and Applications – It’s about Processes
Classic developer thinking tends to approach application design from a data-centric point of view. When the domain is process management, that often leads to excess complexity and work; it also (wrongly) over-reduces proactive processes to quick bursts of automation triggered by data changes. There’s a better way to do this: start with the process.
-
Turning Microservices Inside-Out
Turning microservices inside-out means moving past a single request/response API to designing microservices with an inbound API for queries and commands, an outbound API to emit events, and a meta API to describe them both. A database can be supplemented with Apache Kafka via connecting tissue such as Debezium.
-
Building Latency Sensitive User Facing Analytics via Apache Pinot
At QCon, a virtual conference for senior software engineers and architects covering current trends, Chinmay Soman talked about how to use Apache Pinot as part of your data pipelines to build rich, external, or site-facing analytics.
-
Accelerating Deep Learning on the JVM with Apache Spark and NVIDIA GPUs
In this article, the authors discuss how to combine Deep Java Library (DJL), Apache Spark v3, and NVIDIA GPU computing to simplify deep learning pipelines while improving performance and reducing costs. They also compare the performance of this solution on GPU versus CPU hardware, using Amazon EMR and the NVIDIA RAPIDS Accelerator.
-
Evolution of Azure Synapse: Apache Spark 3.0, GPU Acceleration, Delta Lake, Dataverse Support
At Microsoft Build 2021, Azure Synapse announced significant improvements to its Apache Spark pool, performance, and data querying and integration capabilities. This article outlines the improvements and provides context.
-
Case Study: a Decade of Microservices at a Financial Firm
Microservices are the hot new architectural pattern, but the problem with “hot” and “new” is that it can take years for the real costs of an architectural pattern to be revealed. Fortunately, the pattern isn’t new; only the name is. So we can learn from companies that have been doing this for a decade or more.
-
Deep Diving into EF Core: Q&A with Jeremy Likness
Entity Framework (EF) Core is a cross-platform, extensible, open-source object-database mapper for .NET. Since its first release in 2016, EF Core has evolved into its current form: a powerful and lightweight .NET ORM. InfoQ interviewed Jeremy Likness, program manager for .NET Data at Microsoft, to understand more about EF Core and what to expect from its next release later this year.
-
Why a Serverless Data API Might Be Your Next Database
In this article, author Pieter Humphrey discusses database as a service (DBaaS) and serverless data APIs for cloud-based data management.