Big Data Content on InfoQ

News

AI, ML & Data Engineering

Netflix Demonstrates Big Data Analytics Infrastructure

At QCon San Francisco, engineers at Netflix discussed their big data strategy and analytics infrastructure. This included a summary of the scale of their data, their S3 data warehouse, and Genie, their big data federated orchestration system.

Andrew Morgan
on Mar 21, 2017
AI, ML & Data Engineering

Apache Ranger Graduates to Top-Level Project

Apache Ranger, a security management framework for the Apache Hadoop ecosystem, has graduated to a top-level Apache project. Ranger is used as a centralized component to define and administer security policies that are enforced across supported Hadoop components such as Apache HBase, Hadoop (HDFS and YARN), Apache Hive, Apache Kafka, and Apache Solr, among others.

Alexandre Rodrigues
on Mar 14, 2017
Java

Lightbend Speaks to InfoQ on Their Acquisition of OpsClarity

Nine months after acquiring BoldRadius, Lightbend announced its acquisition of OpsClarity, a company specializing in monitoring reactive applications. InfoQ interviewed Mark Brewer, president and CEO at Lightbend, and Alan Ngai, co-founder of OpsClarity and now VP of cloud services at Lightbend, to learn more about this new partnership.

Michael Redlich
on Feb 24, 2017
AI, ML & Data Engineering

Beam Graduates to Top-Level Apache Project

Beam has exited its incubation period and graduated to a top-level Apache project, with Google's support and contribution to open source integration for various data processing backends, and more.

Dylan Raithel
on Feb 21, 2017
AI, ML & Data Engineering

Deep Learning at Gilt

Deep Learning is a rapidly evolving subfield of Machine Learning originating from Neural Networks. Recent algorithmic advances and the use of GPU parallelization have resulted in Deep Learning based algorithms mastering the game of Go, as well as several practical applications. The fashion industry is one of the target sectors for Deep Learning. Gilt is using Deep Learning for real-world apps.

Alex Giamas
on Feb 19, 2017
Development

Microsoft AirSim, a Simulator for Drones and Robots

Microsoft has developed and open sourced AirSim, a tool that can be used to simulate the flight of drones around the world. The simulator is built on the Unreal Engine and Microsoft will soon add support for robots and other types of vehicles.

Abel Avram
on Feb 16, 2017
AI, ML & Data Engineering

Apache Flink 1.2 Released with Dynamic Rescaling, Security and Queryable State

Apache Flink 1.2 was announced and features dynamic rescaling, security, queryable state, and more. The release resolved 650 issues, maintains compatibility with all public APIs and ships with Apache Kafka 0.10 and Apache Mesos support. Flink’s dynamic rescaling allows one to change the parallelism of a streaming job or of an operator within the job.

Alexandre Rodrigues
on Feb 15, 2017
AI, ML & Data Engineering

MindMeld’s Guide to Building Conversational Apps

MindMeld, a conversational AI company, has published The Conversational AI Playbook, a guide outlining the challenges of creating conversational applications and the steps needed to build them.

Abel Avram
on Feb 03, 2017
AI, ML & Data Engineering

Apache HBase 1.3 Ships with Multiple Performance Improvements

Apache HBase 1.3.0 was released in mid-January 2017 and ships with support for date-based tiered compaction and improvements in multiple areas, such as the write-ahead log (WAL) and a new RPC scheduler, among others. The release includes almost 1,700 resolved issues in total.

Alexandre Rodrigues
on Jan 30, 2017
AI, ML & Data Engineering

Apache Eagle, Originally from eBay, Graduates to Top-Level Project

Apache Eagle, an open-source solution for identifying security and performance issues on big data platforms, graduated to an Apache top-level project on January 10, 2017. First open-sourced by eBay in October 2015, Eagle was created to instantly detect access to sensitive data or malicious activities and to take action in a timely fashion.

Alexandre Rodrigues
on Jan 24, 2017
Cloud

Improving Azure SQL Database Performance Using In-Memory Technologies

In late 2016, Microsoft announced the general availability of Azure SQL Database In-Memory technologies. In-Memory processing is only available in Azure Premium database tiers and provides performance improvements for Online Transaction Processing (OLTP), Clustered Columnstore Indexes, and Non-clustered Columnstore Indexes for Hybrid Transactional and Analytical Processing (HTAP) scenarios.

Kent Weare
on Jan 21, 2017
AI, ML & Data Engineering

Mathieu Ripert on Instacart's Machine Learning Optimizations

Instacart is an online service that delivers groceries in under one hour. Customers order items on the website or through the mobile app, and a group of Instacart’s shoppers go to local stores, purchase the items, and deliver them to the customer. InfoQ interviewed Mathieu Ripert, data scientist at Instacart, to find out how machine learning is leveraged to guarantee a better customer experience.

Alexandre Rodrigues
on Jan 05, 2017
AI, ML & Data Engineering

Google BigQuery Adds New Public Datasets

Stack Overflow recently announced making its dataset available through Google’s BigQuery. Using regular SQL statements, developers can query the full set of Stack Overflow data including posts, votes, tags, and badges. In this article we explore datasets that are available through Google's BigQuery platform.

Alex Giamas
on Jan 05, 2017
AI, ML & Data Engineering

Julien Nioche on StormCrawler, Open-Source Crawler Pipelines Backed by Apache Storm

Julien Nioche, director of DigitalPebble, PMC member and committer of the Apache Nutch web crawler project, talks about StormCrawler, a collection of reusable components to build distributed web crawlers based on the streaming framework Apache Storm. InfoQ interviewed Nioche, main contributor of the project, to find out more about StormCrawler and how it compares to other similar technologies.

Alexandre Rodrigues
on Dec 15, 2016
AI, ML & Data Engineering

Facebook's Comparison of Apache Giraph and Spark GraphX for Graph Data Processing

A Facebook team has recently published a comparison of the performance of their existing Giraph-based graph processing system with the newer GraphX, which is part of the popular Spark framework. Their conclusion is that GraphX is neither sufficiently scalable nor sufficiently performant to support their graph processing workloads.

Srini Penchikala
on Dec 09, 2016
