Data Analysis Content on InfoQ
-
Forecasting Using Data
Troy Magennis discusses the top three reasons forecasts fail to match reality, and challenges the assumption that work complexity and effort correlate with delivery time.
-
A Cloud-centric Ecosystem Approach to Ease IoT Development
Yujing Wu discusses two use cases of a cloud-based IoT ecosystem that enables IoT device communication across silos and interoperability across different vendors.
-
Enabling High Performance Real-time Analytics for IoT Environments
Mahish Singh discusses how to use methodologies during design, development, deployment and operation for delivery of analytics platforms which offer real-time SLAs.
-
Scaling up Near Real-Time Analytics @Uber & LinkedIn
Chinmay Soman and Yi Pan discuss how Uber and LinkedIn use Apache Samza, Calcite and Pinot along with the analytics platform AthenaX to transform data to make it available for querying in minutes.
-
Stream Processing & Analytics with Flink @Uber
Danny Yuan discusses how Uber builds its next generation of stream processing system to support real-time analytics as well as complex event processing.
-
Data Cleansing and Understanding Best Practices
Casey Stella talks about discovering missing values, values with skewed distributions and likely errors within data, as well as a novel approach to finding data interconnectedness.
-
Elastic Data Analytics Platform @Datadog
Doug Daniels discusses the cloud-based platform built at Datadog, how it differs from a traditional datacenter-based analytics stack, its pros and cons, and the tooling built for it.
-
Streaming Live Data and the Hadoop Ecosystem
Oleg Zhurakousky discusses the Hadoop ecosystem (Hadoop, HDFS, YARN) and how projects such as Hive, Atlas, and NiFi interact and integrate to support the variety of data used for analytics.
-
Scaling Counting Infrastructure @Quora
Chun-Ho Hung and Nikhil Garg discuss Quanta, Quora's counting system powering their high-volume near-real-time analytics, describing the architecture, design goals, constraints, and choices made.
-
Forecasting Using Data - Quickly Answering How Big, How Long and How Likely
In this workshop, Troy Magennis explains how to capture data and use it for reliable project forecasting, taking a simple, practical approach that does not require item effort estimation.
-
Validation Methodology of Large Unstructured Unsupervised Learning Systems
Lawrence Chernin describes best practices and validation methods used to deal with large unstructured data, including a suite of unit tests covering the implementations of algorithmic equations.
-
Developing a Machine Learning Based Predictive Analytics Engine for Big Data Analytics
Ali Jalali presents how to develop a machine learning predictive analytics engine for big data analytics.