Operations Management Content on InfoQ
-
When AIOps Meets MLOps: What it Takes to Deploy ML Models at Scale
Ghida Ibrahim introduces the concept of AIOps, which refers to using AI and data-driven tooling to provision, manage, and scale distributed IT infrastructure.
-
Strategy & Principles to Scale and Evolve MLOps @DoorDash
Hien Luu shares DoorDash's approach to MLOps, and the strategy and principles that have helped the team scale and evolve its platform to support hundreds of models and billions of predictions per day.
-
MLOps: the Most Important Piece in the Enterprise AI Puzzle
Francesca Lazzeri overviews the latest MLOps technologies and principles that data scientists and ML engineers can apply to their machine learning processes.
-
Developing and Deploying ML across Teams with MLOps Automation Tool
Fabio Grätz and Thomas Wollmann discuss the MLOps Automation tool, and how it can be used to perform DevOps tasks on ML across teams.
-
Iterating on Models on Operating ML
Monte Zweben and Roland Meertens discuss the challenges in building, maintaining, and operating machine learning models.
-
Production & Debugging in a Serverless World
Tal Weiss covers the main things to watch out for and the advanced techniques teams can put in place to be prepared to debug even the nastiest serverless production issues.
-
Top Five Things You Can Do to Reduce Operational Load
Rachel Obstler discusses the steps that make the biggest difference in reducing operational work from incidents: cutting duplicate effort, surfacing issues, and improving response times.
-
Managing Systems in an Age of Dynamic Complexity
Laura Nolan looks at the common architectural shapes of dynamic control planes, and some examples of how they fail. Why are dynamic control planes so hard to run, and what can be done about it?
-
Evolution of Edge @Netflix
Vasily Vlasov reviews Netflix’s edge gateway ecosystem: multiple traffic gateways performing different functions, deployed around the world.
-
Observability to Better Serverless Apps
Erica Windisch dives into how serverless development with observability tooling can help bridge the gap between operations and business intelligence, so teams can learn better and iterate faster.
-
Operational Considerations for Containers
Chris Swan discusses operational considerations for containers, including image management, security, audit, logging, and orchestration, and how they relate back to developer experience.
-
Incident Management at the Edge
Lisa Phillips discusses the typical struggles a company runs into when building around-the-clock incident operations and the things Fastly has put in place to make dealing with incidents easier.