Machine Learning Content on InfoQ
-
Unleashing Llama's Potential: CPU-Based Fine-Tuning
Anil Rajput and Rema Hariharan detail CPU-based LLM (Llama) optimization strategies for performance and TCO reduction.
-
Navigating LLM Deployment: Tips, Tricks, and Techniques
Meryem Arik shares best practices for self-hosting LLMs in corporate environments, highlighting the importance of cost efficiency and performance optimization.
-
The Harsh Reality of Building a Real-Time ML Feature Platform
Ivan Burmistrov shares how ShareChat built their own Real-Time Feature Platform serving more than 1 billion features per second, and how they managed to make it cost-efficient.
-
Recommender and Search Ranking Systems in Large Scale Real World Applications
Moumita Bhattacharya gives an overview of industry search and recommendation systems, covering modeling choices, data requirements, and infrastructure requirements, and highlighting the challenges involved.
-
Flawed ML Security: Mitigating Security Vulnerabilities in Data & Machine Learning Infrastructure with MLSecOps
Adrian Gonzalez-Martin introduces the motivations and the importance of security in data & ML infrastructure through a set of practical examples showcasing "Flawed Machine Learning Security".
-
Leveraging Open-source LLMs for Production
Andrey Cheptsov discusses the practical use of open-source LLMs for real-world applications, weighing their pros and cons, highlighting advantages like privacy and cost-efficiency.
-
Scale out Batch Inference with Ray
Cody Yu discusses how to build a scalable and efficient batch inference stack using Ray.
-
Why Most Machine Learning Projects Fail to Reach Production and How to Beat the Odds
Wenjie Zi discusses common pitfalls that cause these failures, such as the inherent uncertainty of machine learning, misaligned optimization objectives, and skill gaps among practitioners.
-
Navigating LLM Deployment: Tips, Tricks, and Techniques
Meryem Arik discusses some of the best practices in model optimization, serving and monitoring - with practical tips and real case-studies.
-
Rethinking Connectivity at the Edge: Scaling Fleets of Low-Powered Devices Using NATS.io
Jeremy Saenz discusses NATS, an open-source project for services communication, and how to leverage NATS to streamline communication and fleet management for devices at the edge.
-
Generative Search: Practical Advice for Retrieval Augmented Generation (RAG)
Sam Partee discusses vector embeddings in LLMs, a tool capable of capturing the essence of unstructured data used by LLMs to gain access to a wealth of contextually relevant knowledge.
-
Being a Responsible Developer in the Age of AI Hype
Justin Sheehy discusses the dramatic developments in some areas of artificial intelligence and the need for the responsible use of AI systems.