Database Content on InfoQ
-
COVID-19 and Mining Social Media - Enabling Machine Learning Workloads with Big Data
In this article, author Adi Pollock discusses how to enable machine learning workloads with big data to query and analyze COVID-19 tweets to understand social sentiment towards COVID-19.
-
From Cloud to Cloudlets: a New Approach to Data Processing?
The growing popularity of small, distributed clouds, or “cloudlets,” is an implicit recognition of the limitations of the “traditional” cloud model, and could signal a major shift in the way that data is collected, stored, and processed.
-
Scalable Cloud Environment for Distributed Data Pipelines with Apache Airflow
In this article, author Lena Hall discusses how to use Apache Airflow to define and execute distributed data pipelines, with an example of the workflow framework running on Kubernetes on the Azure cloud platform.
-
Interview with RavenDB Founder Oren Eini
RavenDB is a NoSQL document database with multi-document ACID transactions and smart document compression. To learn more about the recent RavenDB 5 release and RavenDB in general, we’ve invited Oren Eini, creator of RavenDB and CEO of Hibernating Rhinos, to join us.
-
Easy Interpretation of a Logistic Regression Model with Delta-p Statistics
Delta-p statistics is an easier means of communicating results to a non-technical audience than the plain coefficients of a logistic regression model. In this article, authors Maarit Widmann and Alfredo Roccato discuss how to predict credit eligibility using a solution based on Delta-p statistics.
-
Combining DataOps and DevOps: Scale at Speed
DataOps is an extension of DevOps standards and processes into the data analytics world. It's about streamlining the processes involved in processing, analyzing and deriving value from big data.
-
Data Leadership Book Review and Interview
The book Data Leadership, by Anthony Algmin, covers how data leaders should manage and govern the data management programs in their organizations. Data leadership is how organizations choose to apply their energy and resources toward creating data capabilities to influence their business.
-
Innovation Startups Modeling Agile Culture
Innovation is not only about the most advanced technology; management and processes define the new era of startup innovation. Combining the power of data with the importance of people to deliver business intelligence is key nowadays. The result is not the only thing that matters; how you achieve it matters more. To be agile is to adapt to today's market.
-
Applied Probability - Counting Large Set of Unstructured Events with Theta Sketches
In this article, author Ronen Cohen discusses a solution for processing event data using Theta Sketches and technologies like HBase and Kafka.
-
State at the Edge: an Interview with Peter Bourgon
Building upon topics in his talk at QCon London, Peter Bourgon answers questions about edge computing, distributed data, and the complexity of synchronization.
-
Apache Arrow and Java: Lightning Speed Big Data Transfer
Apache Arrow puts forward a cross-language, cross-platform, columnar in-memory data format. It is designed to eliminate the need for data serialization and reduce the overhead of copying.
-
2020 State of Testing Report
The 2020 State of Testing report provides insights into the adoption of test techniques, practices, and test automation, and the challenges that testers are facing. It shares results from the 2020 testing survey organized by Joel Montvelisky from PractiTest, and Lalit Bhamare from Tea-Time with Testers.