Big Data Content on InfoQ

Articles

Architecture & Design

FPGAs Supercharge Computational Performance

Originally used in the development of new hardware, FPGAs are becoming more accessible through new cloud-based offerings. Their dramatic speed improvements and lower costs compared with traditional CPUs mean more companies can start benefiting from the technology. FPGAs are fundamentally concurrent, which makes them an ideal tool for data-intensive, parallel processing problems.

Rob Taylor
on Nov 17, 2017
AI, ML & Data Engineering

Big Data and Big Money: The Role of Data in the Financial Sector

When we consider the 3Vs of big data (volume, velocity, and variety), it is hard to think of many sectors whose requirements fit the guidelines as neatly as finance.

Jennifer Q. Trelewicz
on Oct 17, 2017
AI, ML & Data Engineering

Video Stream Analytics Using OpenCV, Kafka and Spark Technologies

What is the role of video streaming data analytics in the data science space? Learn how to implement a motion detection use case using a sample application based on OpenCV, Kafka and Spark technologies.

Amit Baghel
on Sep 02, 2017
AI, ML & Data Engineering

Apache Beam Interview with Frances Perry

InfoQ interviews Apache Beam's Frances Perry about the impetus for using Beam and the future of the top-level open source project. The interview covers the thinking behind the programming model as well as some of the touch-points for integration with other data engineering tools such as Apache Spark and Flink.

Dylan Raithel
on Jun 20, 2017
Java

Introducing Reladomo - Enterprise Open Source Java ORM, Batteries Included! (Part 2)

Goldman Sachs is widely known as a leader in investment banking, but it is very much a leading technology firm as well. Continuing our exploration of Reladomo, the primary Java ORM used at GS and now open source, GS Technology Fellow Mohammad Rezaei looks at advanced features such as sharding, caching, bitemporal access, performance, and testing.

Mohammad Rezaei
on Jun 13, 2017
AI, ML & Data Engineering

Machine Learning Techniques for Predictive Maintenance

In this article, the authors explore how to build a machine learning model for predictive maintenance of systems. They discuss a sample application that uses a NASA engine failure dataset to predict the Remaining Useful Life (RUL) with regression models.

Roshan Alwis, Srinath Perera
on May 21, 2017
AI, ML & Data Engineering

Predicting Movie Ratings: NLP Tools is What Film Studios Need

In this article, the author discusses how to use Natural Language Processing (NLP) techniques to predict movie ratings from data shared on social media platforms. Sentiment analysis of movie reviews can also be used to classify movies into different genres and to improve movie recommendation systems.

Tatsiana Levdikova
on May 13, 2017
AI, ML & Data Engineering

From Alibaba to Apache: RocketMQ’s Past, Present, and Future

Feng Jia and Wang Xiaorui share the core distributed systems principles behind RocketMQ, Alibaba's distributed messaging and data streaming platform, now open sourced through the Apache Software Foundation.

Wang Xiaorui, Feng Jia
on Apr 21, 2017
AI, ML & Data Engineering

Building Pipelines for Heterogeneous Execution Environments for Big Data Processing

The Pipeline61 framework supports the building of data pipelines involving heterogeneous execution environments. It reuses the existing code of the deployed jobs in different environments and provides version control and dependency management that deals with typical software engineering issues. A real-world case study shows its effectiveness.

Liming Zhu, Qinghua Lu, Sherif Sakr, Xiwei Xu, Daniel Sun, Dongyao Wu
on Mar 31, 2017
Java

Introducing Reladomo - Enterprise Open Source Java ORM, Batteries Included!

Goldman Sachs is widely known as a leader in investment banking, but it is very much a leading technology firm as well. Reladomo is the primary Java ORM used at GS, and it is now open source. In this article, GS Technology Fellow Mohammad Rezaei takes us on a deep dive into Reladomo.

Mohammad Rezaei
on Mar 28, 2017
AI, ML & Data Engineering

Big Data Processing Using Apache Spark - Part 6: Graph Data Analytics with Spark GraphX

In this article, author Srini Penchikala discusses the Apache Spark GraphX library, which is used for graph data processing and analytics. The article includes sample code for graph algorithms such as PageRank, Connected Components and Triangle Counting.

Srini Penchikala
on Mar 14, 2017
AI, ML & Data Engineering

Three Experts on Big Data Engineering

Clemens Szyperski (Microsoft), Martin Petitclerc (IBM), and Roger Barga (Amazon Web Services) answer three questions: What major challenges do you face when building scalable, big data systems? How do you address these challenges? Where should the research community focus its efforts to create tools and approaches for building highly reliable, scalable, big data systems?

Clemens Szyperski, Roger Barga, Martin Petitclerc
on Mar 12, 2017
