DevOps Content on InfoQ
-
How CNAME Ordering in RFC Specs Caused Cloudflare 1.1.1.1 Outage
In a recent article titled "What came first- the CNAME or the A record?" Cloudflare explains how an ambiguous RFC specification caused its popular 1.1.1.1 resolver service to break. After identifying the breakage and tracing it to ambiguity in older DNS standards about the order of records in a response, Cloudflare proposes a clarified specification.
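The ordering question can be illustrated with a minimal Python sketch. It assumes a simplified answer section represented as (owner, type, value) tuples; the function name `resolve_answer` and the sample records are hypothetical and are not taken from Cloudflare's article. A client that follows the CNAME chain by name, rather than by position, returns the same addresses whichever record comes first.

```python
def resolve_answer(qname, records):
    """Return the A-record values for qname from a simplified answer section.

    records is a list of (owner, rtype, value) tuples in ANY order.
    This is an illustrative sketch, not a real DNS client.
    """
    # Index CNAME records by owner name so position in the list is irrelevant.
    cnames = {
        owner.lower(): value.lower()
        for owner, rtype, value in records
        if rtype == "CNAME"
    }

    # Follow the alias chain from the queried name to the canonical name.
    name = qname.lower()
    seen = set()
    while name in cnames:
        if name in seen:
            raise ValueError("CNAME loop detected")
        seen.add(name)
        name = cnames[name]

    # Collect the addresses owned by the canonical name.
    return [
        value
        for owner, rtype, value in records
        if rtype == "A" and owner.lower() == name
    ]


# Hypothetical sample data: the same two records in both possible orders.
cname_first = [
    ("www.example.com", "CNAME", "cdn.example.net"),
    ("cdn.example.net", "A", "192.0.2.10"),
]
a_first = list(reversed(cname_first))

print(resolve_answer("www.example.com", cname_first))
print(resolve_answer("www.example.com", a_first))
```

A parser that instead walks the answer section once from top to bottom, expecting each alias before its target, could miss the address in the second ordering, which is the kind of fragility an ambiguous ordering rule invites.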
-
Datadog Integrates Google Agent Development Kit into LLM Observability Tools
Datadog recently announced that its LLM Observability platform now provides automatic instrumentation for applications built with Google's Agent Development Kit (ADK), offering deeper visibility into the behavior, performance, cost, and safety of AI-driven agentic systems.
-
GitHub Reworks Layered Defenses After Legacy Protections Block Legitimate Traffic
GitHub engineers recently traced user reports of unexpected “Too Many Requests” errors to abuse-mitigation rules that had accidentally remained active long after the incidents that prompted them.
-
Etleap Launches Iceberg Pipeline Platform to Simplify Enterprise Adoption of Apache Iceberg
Etleap has recently launched the Iceberg pipeline platform, a new managed data pipeline layer designed to let enterprises adopt Apache Iceberg without building or maintaining a complex custom stack.
-
Cloudflare's Matrix Homeserver Demo Sparks Debate over AI-Generated Code Claims
A Cloudflare blog post claiming a "production-grade" Matrix homeserver on Workers didn't survive community scrutiny. Missing federation, incomplete encryption, and TODO comments in authentication logic pointed to unreviewed AI output. Matrix's Matthew Hodgson welcomed the effort but noted the implementation "doesn't yet constitute a functional Matrix server."
-
Chainguard Finds 98% of Container CVEs Lurking outside the Top 20 Images
The latest State of Trusted Open Source report from Chainguard gives details on current industry thinking about vulnerabilities in container images and the long tail of open-source dependencies. The report offers a data-driven view of production environments based on more than 1,800 container image projects and 10,100 vulnerability instances observed between September and November 2025.
-
OpenEverest: Open Source Platform for Database Automation
Percona recently announced OpenEverest, an open-source platform for automated database provisioning and management that supports multiple database technologies. Launched initially as Percona Everest, OpenEverest can be hosted on any Kubernetes infrastructure, in the cloud, or on-premises.
-
Google Introduces Managed Connection Pooling for AlloyDB
Google Cloud has launched managed connection pooling for AlloyDB for PostgreSQL, boosting client connections by 3x and transactional throughput by up to 5x. This feature simplifies database management by automating connection management and reducing latency.
-
NVIDIA Dynamo Planner Brings SLO-Driven Automation to Multi-Node LLM Inference
Microsoft and NVIDIA have released Part 2 of their collaboration on running NVIDIA Dynamo for large language model inference on Azure Kubernetes Service (AKS). The first announcement aimed for a raw throughput of 1.2 million tokens per second on distributed GPU systems.
-
Uber Gets Ready for AI in Network Observability with Cloud Native Overhaul
Transportation company Uber has published a detailed account of its new observability platform on its blog, highlighting that, for the company, network visibility is now a strategic capability rather than a set of discrete monitoring tools.
-
Railway Highlights the Importance of Logs, Metrics, Traces, and Alerts for Diagnosing System Failure
Railway’s engineering team published a comprehensive guide to observability, explaining how developers and SRE teams can use logs, metrics, traces, and alerts together to understand and diagnose production system failures.
-
Google BigQuery Adds SQL-Native Managed Inference for Hugging Face Models
Google has launched SQL-native managed inference for 180,000+ Hugging Face models in BigQuery. The preview release collapses the ML lifecycle into a unified SQL interface, eliminating the need for separate Kubernetes or Vertex AI management. Key features include automated resource governance via endpoint_idle_ttl and secure identity-based execution using existing data warehouse permissions.
-
Cedar Joins CNCF as a Sandbox Project
Cedar, an open-source policy language architected by AWS, has joined the CNCF as a Sandbox project. Designed for fine-grained application permissions, it decouples access control from code using a verifiable, high-performance policy engine. Cedar supports RBAC, ABAC, and ReBAC, offering a secure, analyzable alternative to general-purpose tools like OPA.
-
Two Missing Characters: How a Regex Flaw Exposed AWS GitHub Repos to Supply-Chain Risk
AWS recently published a security bulletin acknowledging a configuration issue affecting some popular AWS-managed open-source GitHub repositories. Dubbed CodeBreach, the critical vulnerability could have allowed attackers to introduce malicious code and hijack repositories that use AWS CodeBuild.
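The summary above does not spell out the flaw itself. As an illustration only, the Python sketch below assumes that the two missing characters were the `^` and `$` anchors of an allow-list pattern; the ID value, variable names, and `is_allowed` helper are hypothetical and do not come from the AWS bulletin. It shows how an unanchored pattern accepts any string that merely contains an approved value.

```python
import re

# Hypothetical approved identifier (an assumption for this sketch).
APPROVED_ID = "123456"

# Without anchors, the pattern matches anywhere inside the input.
unanchored = re.compile(APPROVED_ID)

# With ^ and $, the whole input has to be the approved identifier.
anchored = re.compile("^" + APPROVED_ID + "$")


def is_allowed(pattern, actor_id):
    """Return True if the pattern accepts actor_id."""
    return pattern.search(actor_id) is not None


# An identifier that merely contains the approved value as a substring.
lookalike = "99123456"

print(is_allowed(unanchored, APPROVED_ID))  # approved ID, loose pattern
print(is_allowed(unanchored, lookalike))    # lookalike, loose pattern
print(is_allowed(anchored, APPROVED_ID))    # approved ID, strict pattern
print(is_allowed(anchored, lookalike))      # lookalike, strict pattern
```

In Python, `re.fullmatch` is a stricter alternative to explicit anchors, since `$` also matches just before a trailing newline.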
-
OpenCost Looks Back on 2025 Milestones and Charts a Roadmap for 2026
The OpenCost project, an open-source cost and resource management tool hosted by the Cloud Native Computing Foundation (CNCF), has published a year-in-review reflecting on its progress in 2025 and outlining priorities for 2026.