AMD’s Gaia Framework Brings Local LLM Inference to Consumer Hardware
AMD has released Gaia, an open-source project allowing developers to run large language models (LLMs) locally on Windows machines with AMD hardware acceleration. The framework supports retrieval-augmented generation (RAG) and includes tools for indexing local data sources. Gaia is designed to offer an alternative to LLMs hosted on a cloud service provider (CSP).
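The RAG pattern Gaia supports can be illustrated with a minimal sketch: index local documents, retrieve the passages most relevant to a query, and prepend them to the LLM prompt. All names below are illustrative, not Gaia's actual API, and the toy keyword-overlap scorer stands in for a real embedding-based index.

```python
# Minimal RAG sketch: keyword-overlap retrieval + prompt assembly.
# Illustrative only -- a real indexer like Gaia's uses embeddings,
# not word overlap.

def score(query: str, doc: str) -> int:
    """Count query words that appear in the document (toy relevance score)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents with the highest overlap score."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Prepend retrieved context to the user query before calling the local LLM."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"

docs = [
    "Gaia runs LLM inference locally on AMD hardware.",
    "RAG augments prompts with retrieved local documents.",
    "Unrelated note about the weather.",
]
print(build_prompt("How does Gaia run local LLM inference?", docs))
```

The assembled prompt would then be sent to the locally running model instead of a cloud endpoint.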
-
Meta AI Releases Llama 4: Early Impressions and Community Feedback
Meta has officially released the first models in its new Llama 4 family—Scout and Maverick—marking a step forward in its open-weight large language model ecosystem. Designed with a native multimodal architecture and a mixture-of-experts (MoE) framework, these models aim to support a broader range of applications, from image understanding to long-context reasoning.
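The mixture-of-experts idea behind these models can be sketched in a few lines: a router scores each token against every expert, only the top-k experts run, and their outputs are blended, so per-token compute stays well below the model's total parameter count. The scalar "token" and expert functions below are a toy illustration, not Llama 4's actual architecture.

```python
import math

# Toy mixture-of-experts routing: a softmax router picks the top-k experts
# per token; only those experts' outputs are computed and blended.
# Illustrative only -- not Meta's implementation.

def softmax(xs: list[float]) -> list[float]:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(token: float, router_weights: list[float], experts, k: int = 2) -> float:
    """Route one scalar 'token' to the top-k experts and mix their outputs."""
    scores = softmax([w * token for w in router_weights])
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    total = sum(scores[i] for i in top)  # renormalize over selected experts
    return sum((scores[i] / total) * experts[i](token) for i in top)

experts = [lambda x: 2 * x, lambda x: x + 1, lambda x: -x]
out = moe_forward(1.0, [0.5, 1.0, -1.0], experts, k=2)
print(out)
```

With k=2 of 3 experts active, only two expert computations run per token, which is the source of MoE's efficiency at scale.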
-
Announcing QCon AI: Focusing on Practical, Scalable AI Implementation for Engineering Teams
QCon AI focuses on practical, real-world AI for senior developers, architects, and engineering leaders. Join us Dec 16-17, 2025, in NYC to learn how teams are building and scaling AI in production—covering MLOps, system reliability, cost optimization, and more. No hype, just actionable insights from those doing the work.
-
How SREs and GenAI Work Together to Decrease eBay's Downtime: an Architect's Insights at KubeCon EU
During his KubeCon EU keynote, Vijay Samuel, Principal MTS Architect at eBay, shared his team’s experience of enhancing incident response capabilities by incorporating ML and LLM building blocks. They realised that GenAI is not a silver bullet, but that it can guide engineers through complex incident investigations by summarizing logs and traces and explaining dashboards.
-
How Observability Can Improve the UX of LLM Based Systems: Insights of Honeycomb's CEO at KubeCon EU
During her KubeCon Europe keynote, Christine Yen, CEO and co-founder of Honeycomb, shared insights on how observability can help teams cope with the rapid shifts introduced by integrating LLMs into software systems, changes that transform not only how software is developed but also how it is released. She explained how to adapt the development feedback loop based on production observations.
-
OpenAI Introduces New Speech Models for Transcription and Voice Generation
OpenAI has introduced new speech-to-text and text-to-speech models in its API, focusing on improving transcription accuracy and offering more control over AI-generated voices. These updates aim to enhance automated speech applications, making them more adaptable to different environments and use cases.
-
Google DeepMind Launches TxGemma: Advancing AI-Driven Drug Discovery and Development
Google DeepMind has launched TxGemma, a collection of open models designed to enhance the efficiency of drug discovery and clinical trial predictions. Built on the Gemma model family, TxGemma aims to streamline the drug development process and accelerate the discovery of new treatments.
-
How Airbnb Used LLMs to Accelerate Test Migration
Thanks to the right mix of workflow automation and large language models, Airbnb significantly accelerated the migration of its codebase to React Testing Library (RTL), converting nearly 3,500 React test files that originally used Enzyme.
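The approach described can be sketched as a validation-and-retry loop: ask the model to rewrite a test, run a check, and feed any failure back into the prompt for another attempt. Every function below is a hypothetical stand-in, not Airbnb's actual pipeline; the stub LLM simply "succeeds" on its second try.

```python
# Sketch of an LLM-assisted migration loop: rewrite, validate, retry.
# All names are illustrative stand-ins, not Airbnb's real tooling.

def migrate_with_retries(source: str, llm, validate, max_attempts: int = 3):
    """Ask the LLM to rewrite `source`; loop on validation errors."""
    prompt = f"Convert this Enzyme test to React Testing Library:\n{source}"
    for attempt in range(1, max_attempts + 1):
        candidate = llm(prompt)
        error = validate(candidate)
        if error is None:
            return candidate, attempt
        prompt += f"\nThe previous attempt failed with: {error}\nPlease fix it."
    return None, max_attempts

# Stub LLM that returns a valid conversion on the second attempt.
calls = {"n": 0}
def fake_llm(prompt: str) -> str:
    calls["n"] += 1
    return "render(<App />)" if calls["n"] >= 2 else "mount(<App />)"

def fake_validate(code: str):
    """Return None on success, or an error message to feed back to the LLM."""
    return None if "render" in code else "mount() is not available in RTL"

result, tries = migrate_with_retries("mount(<App />)", fake_llm, fake_validate)
print(result, tries)
```

The key design choice is that the loop is cheap to run at scale: files that fail after the retry budget can be routed to engineers, while the rest migrate automatically.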
-
Nvidia Unveils AI, GPU, and Quantum Computing Innovations at GTC 2025
Nvidia presented several new technologies at its GTC 2025 event, covering advancements in GPUs, AI, robotics, and quantum computing.
-
Roblox Releases Cube 3D, an AI Open-Source Model for 3D Model Generation
Roblox has introduced Cube 3D, a generative AI system designed for creating 3D and 4D objects and environments.
-
Dapr Agents: Scalable AI Workflows with LLMs, Kubernetes & Multi-Agent Coordination
Dapr Agents is a framework for creating scalable AI agents using large language models (LLMs). With durable workflows, multi-agent coordination, and a cloud-neutral architecture, it lets enterprises deploy thousands of resilient agents. Built on Dapr’s proven infrastructure, Dapr Agents provides reliability and observability for AI-driven applications.
-
Google Launches Gemma 3 1B for Mobile and Web Apps
Requiring a "mere" 529MB, Gemma 3 1B is a small language model (SLM) specifically meant for distribution across mobile and Web apps, where models must download quickly and be responsive to keep user engagement high.
-
Google Report Reveals How Threat Actors Are Currently Using Generative AI
Google's Threat Intelligence Group (GTIG) recently released a report on the adversarial misuse of generative AI. The team investigated prompts used by advanced persistent threat (APT) and coordinated information operations (IO) actors, finding that they have so far achieved productivity gains but have not yet developed novel capabilities.
-
Google Introduces AI Co-Scientist System to Aid Scientific Research
Google has announced the development of an AI co-scientist system designed to assist scientists in generating hypotheses and research proposals. Built using Gemini 2.0, the system aims to accelerate scientific and biomedical discoveries by emulating the scientific method and fostering collaboration between humans and AI.
-
OpenAI Introduces Software Engineering Benchmark
OpenAI has introduced the SWE-Lancer benchmark, designed to evaluate the capabilities of advanced AI language models on real-world freelance software engineering tasks.