AI, ML & Data Engineering Content on InfoQ
-
Massive Scale Anomaly Detection Framework
Guy Gerson introduces the anomaly detection framework used at PayPal, which is designed to flexibly support different types of statistical and ML models and is inspired by scikit-learn and Spark MLlib.
-
Modern NLP for Pre-Modern Practitioners
Joel Grus discusses the latest NLP research breakthroughs and how to incorporate NLP concepts and models into a project.
-
Docker Data Science Pipeline
Lennard Cornelis explains why they chose OpenShift and Docker to connect to the Hadoop environment, and how to set up a Docker container running a data science model using Hive, Python, and Spark.
-
H2O's Driverless AI: An AI That Creates AI
Marios Michailidis shares their approach to automating machine learning using H2O’s Driverless AI.
-
Understanding Deep Learning
Jessica Yung talks about the foundational concepts of neural networks and highlights key things to pay attention to: learning rates, how to initialize a network, and more.
-
Intuition & Use-Cases of Embeddings in NLP & beyond
Jay Alammar talks about the concept of word embeddings, how they're created, and looks at examples of how these concepts can be carried over to solve problems.
-
How to Prevent Catastrophic Failure in Production ML Systems
Martin Goodson describes the unpredictable nature of artificial intelligence systems and how mastering a handful of engineering principles can mitigate the risk of failure.
-
Productionizing H2O Models with Apache Spark
Jakub Hava demonstrates the creation of pipelines integrating H2O machine learning models and their deployments using Scala or Python.
-
YugaByte DB - A Planet-Scale Database for Low Latency Transactional Apps
Amey Banarse and Karthik Ranganathan introduce and demo YugaByte DB, a large-scale database, highlighting distributed transactions with global consistency.
-
Test-Driven Machine Learning
Detlef Nauck explains why testing data is essential: it not only drives the machine learning phase itself, but is also paramount for producing reliable predictions after deployment.
-
Data Science for Lazy People, Automated Machine Learning
Diego Hueltes discusses using Automated Machine Learning as a personal assistant in Data Science.
-
Winning Ways for Your Visualization Plays
Mark Grundland explores practical techniques for information visualization design to take better account of the fundamental limitations of visual perception.