AIOps Content on InfoQ
-
CNCF Launches Certified Kubernetes AI Conformance Program to Standardise Workloads
The CNCF has launched the Certified Kubernetes AI Conformance program to standardise artificial intelligence workloads. By establishing a technical baseline for GPU management, networking, and gang scheduling, the initiative ensures portability across cloud providers. It aims to reduce technical debt and prevent vendor lock-in as enterprises move generative AI models into production.
-
SIMA 2 Uses Gemini and Self-Improvement to Generalize across Unseen 3D and Photorealistic Worlds
Google DeepMind researchers introduced SIMA 2 (Scalable Instructable Multiworld Agent), a generalist agent built on the Gemini foundation model that can understand and act across multiple 3D virtual game environments. The SIMA 2 architecture uses a Gemini Flash-Lite model trained on a mixture of gameplay and Gemini pretraining data.
-
Michelin Drives Pragmatic Path to AIOps without a Grand Vision
Michelin's China operations group has written about how it implemented an AIOps platform. The write-up details the missteps and organisational resistance that were overcome on the way to eventual alignment with the company's global IT governance, and explains how enterprises can move past vendor pitches to reach a practical deployment.
-
Meta Details GEM Ads Model Using LLM-Scale Training, Hybrid Parallelism, and Knowledge Transfer
Meta released details about its Generative Ads Model (GEM), a foundation model designed to improve ads recommendation across its platforms. The model addresses core challenges in recommendation systems (RecSys) by processing billions of daily user-ad interactions where meaningful signals such as clicks and conversions are very sparse.
-
Private AI Compute Enables Google Inference with Hardware Isolation and Ephemeral Data Design
Google announced Private AI Compute, a system designed to process AI requests using Gemini cloud models while aiming to keep user data private. The announcement positions Private AI Compute as Google's approach to addressing privacy concerns while providing cloud-based AI capabilities, building on what the company calls privacy-enhancing technologies it has developed for AI use cases.
-
Introducing Evalite: the TypeScript Testing Tool for AI-Powered Apps
Evalite is a TypeScript-native eval runner designed for AI applications, enabling developers to create reproducible evals with rich outputs. Featuring first-class trace capture, scoring, and a user-friendly web UI, Evalite enhances testing ergonomics and iteration speed. Open-source under MIT, it seamlessly integrates with any LLM, ensuring complete data control and fostering rapid development.
-
Amazon Adds A2A Protocol to Bedrock AgentCore for Interoperable Multi-Agent Workflows
Amazon announced support for the Agent-to-Agent (A2A) protocol in Amazon Bedrock AgentCore Runtime, enabling communication between agents built on different frameworks. The protocol allows agents developed with Strands Agents, OpenAI Agents SDK, LangGraph, Google ADK, or Claude Agents SDK to "share context, capabilities, and reasoning in a common, verifiable format."
-
Nexla Launches Express: a Conversational Platform for AI Data Engineering
Nexla recently introduced Express, a conversational data engineering platform designed to dramatically lower the barrier for building data pipelines for AI applications.
-
Kimi's K2 Open-Source Language Model Supports Dynamic Resource Availability and New Optimizer
Kimi released K2, a Mixture-of-Experts large language model with 32 billion activated parameters and 1.04 trillion total parameters, trained on 15.5 trillion tokens. The release introduces MuonClip, a new optimizer that builds on the Muon optimizer by adding a QK-clip technique designed to address training instability, which the team reports resulted in "zero loss spike" during pre-training.
-
Anthropic Adds Sandboxing and Web Access to Claude Code for Safer AI-Powered Coding
Anthropic released sandboxing capabilities for Claude Code and launched a web-based version of the tool that runs in isolated cloud environments. The company introduced these features to address security risks that arise when Claude Code writes, tests, and debugs code with broad access to developer codebases and files.
-
KubeCon NA 2025 - Salesforce’s Approach to Self-Healing Using AIOps and Agentic AI
AIOps and Agentic AI technologies can help teams build solutions that intelligently analyze Kubernetes cluster health, automatically diagnose problems, and orchestrate issue resolutions with minimal human intervention. Vikram Venkataraman and Srikanth Rajan spoke at the KubeCon + CloudNativeCon NA 2025 conference about Salesforce’s approach to self-healing systems using AIOps and AI agents.
-
New Claude Haiku 4.5 Model Promises Faster Performance at One-Third the Cost
Anthropic released Claude Haiku 4.5, making the model available to all users as its latest entry in the small, fast model category. The company positions the new model as delivering performance levels comparable to Claude Sonnet 4, which launched five months ago as a state-of-the-art model, but at "one-third the cost and more than twice the speed."
-
Claude Sonnet 4.5 Ranked Safest LLM from Open-Source Audit Tool Petri
Claude Sonnet 4.5 has emerged as the best-performing model in ‘risky tasks’, narrowly edging out GPT-5 in early evaluations by Petri, Anthropic’s new open-source AI auditing tool.
-
DORA Report Finds AI Is an Amplifier in Software Development, But Trust Remains Low
Nearly 90% of technology professionals now use artificial intelligence in their work. But according to the 2025 DORA State of AI-assisted Software Development report, there's still a significant gap in trust between developers and the tools they increasingly rely upon. The report found that while AI adoption has become "nearly universal," organisational challenges remain.
-
System Initiative Launches “AI Native” Platform to Simplify Infrastructure Automation
System Initiative recently released its AI Native Infrastructure Automation platform, aiming to offer DevOps teams a new way to manage infrastructure through natural language.