Big Data Content on InfoQ
-
Getting Ready for IoT’s Big Data Challenges with Couchbase Mobile
Our physical world is about to become digitally enabled. According to various predictions, for example by Gartner or Cisco, many billions of IoT devices will go online and constantly gather data in the coming years. We got in touch with Wayne Carter and Ali LeClerc of Couchbase to discuss how Couchbase Mobile is ready for the upcoming era of the Internet of Things.
-
Big Data Processing with Apache Spark - Part 3: Spark Streaming
In this article, the third installment of the Apache Spark series, author Srini Penchikala discusses the Apache Spark Streaming framework for processing real-time streaming data, using a log analytics sample application.
-
Health Informatics and Survival Prediction of Cancer with Apache Spark Machine Learning Library
In this article, the author discusses survival prediction for colorectal cancer as a multi-class classification problem and how to solve that problem using Apache Spark's MLlib Java API.
-
Data Lake-as-a-Service: Big Data Processing and Analytics in the Cloud
Data Lake-as-a-Service solutions provide big data processing in the cloud for faster business outcomes in a cost-effective way. InfoQ spoke with Lovan Chetty and Hannah Smalltree from the Cazena team about how Data Lake-as-a-Service works.
-
Real-time Data Processing in AWS Cloud
In this article, author Oleksii Tymchenko discusses a bioinformatics software-as-a-service (SaaS) product called Chorus, which was built as a public data warehousing and analytical platform for mass spectrometry data. Other features of the product include real-time visualization of raw mass-spec data.
-
Oozie Plugin for Eclipse
The Oozie Eclipse plugin is a new tool for editing Apache Oozie workflows graphically inside Eclipse. Using the plugin avoids writing process definitions in hPDL, which are hard to develop and maintain. Instead, a process graph is defined by placing process actions from the palette and connecting them. This article introduces the Oozie Eclipse plugin and provides an example of its usage.
-
Big Data Solutions with MS SQL ColumnStore Index
Columnar data storage can offer significant performance improvements over the way database tables are traditionally stored, but it isn't always faster. Aleksandr Shavlyuga explores the power and limitations of SQL Server's ColumnStore Indexes.
-
The Estimation Game - Techniques for Informed Guessing
In this article, author Carlos Bueno discusses strategies for estimating server capacity for big data projects and initiatives, with the help of two case studies.
-
Machine Learning and Cognitive Computing
Based on a webinar on analytics, this article covers machine learning and cognitive computing, and how these fields relate to artificial intelligence (AI). Panelists discuss how this technology is being applied in the digital marketing space and what concerns organizations have in providing machine learning services.
-
Garage Door Openers: An Internet of Things Case Study
In this article, the author discusses how to design a secure Internet-connected garage door opener ("IoT opener"). He talks about cloud service authentication and the security improvements offered by networked openers, like two-factor authentication (2FA). He also discusses security infrastructure for IoT devices, which includes user authentication and access policy creation and enforcement.
-
Big Data as a Service, an Interview with Google's William Vambenepe
Many of the Big Data technologies in common use originated at Google and have become popular open source platforms. Now Google is bringing an increasing range of big data services to market as part of its Google Cloud Platform. InfoQ caught up with Google's William Vambenepe, lead product manager for Big Data services, to ask him about the shift towards service-based consumption.
-
The Promise of Healthcare Analytics
Data analytics play a central role in the healthcare system by improving outcomes and quality of life while helping to control costs. In this article, the author describes the role analytics can play alongside emerging wearable technologies with biophysical interfaces, physiological sensors, and embedded diagnostic tools.