Data Analysis Content on InfoQ
-
AWS Announces Clean Rooms for Secure Collaboration with Analytics Data
During the recent re:Invent conference, AWS announced the preview of Clean Rooms for analytics data. The new service provides safe environments where multiple customers can securely share and analyze data with control of how the data is used, reducing the risk of sharing personal data.
-
Using Data to Predict Future Usage and Increase User Insights
By identifying usage trends, you can proactively adjust load, scaling, and routing to better serve particular parts of the globe when you know demand will peak there. Data about how users interact with your application can also inform the design of future features, so that they follow those interaction patterns and have a better chance of solving real user problems and being adopted.
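As a rough illustration of identifying usage trends, the sketch below finds the hour of day at which each region's average request volume peaks. The `(region, hour, request_count)` record shape, the region names, and the `peak_hours_by_region` function are assumptions made for this example, not something the article specifies.

```python
from collections import defaultdict


def peak_hours_by_region(events):
    """Return, per region, the hour of day (0-23) with the highest average load.

    `events` is an iterable of (region, hour_of_day, request_count) tuples,
    a hypothetical shape for historical usage data.
    """
    totals = defaultdict(lambda: defaultdict(int))
    samples = defaultdict(lambda: defaultdict(int))
    for region, hour, count in events:
        totals[region][hour] += count
        samples[region][hour] += 1

    peaks = {}
    for region, by_hour in totals.items():
        # Average per hour so that hours observed on more days are not over-weighted.
        averages = {h: by_hour[h] / samples[region][h] for h in by_hour}
        peaks[region] = max(averages, key=averages.get)
    return peaks


# Made-up sample history: two observations for some hours, one for others.
history = [
    ("eu-west", 9, 1200), ("eu-west", 9, 1400), ("eu-west", 20, 800),
    ("ap-south", 3, 500), ("ap-south", 14, 2100), ("ap-south", 14, 1900),
]
peaks = peak_hours_by_region(history)
```

A scheduler could use a result like `peaks` to scale capacity in a region shortly before its expected peak hour. A real system would likely also account for weekday and seasonal effects.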
-
AWS Glue Now Supports Crawler History
AWS recently launched support for histories of AWS Glue Crawlers, which allows the interrogation of Crawler executions and associated schema changes for the last 12 months.
-
A New Microsoft Platform in Town: the Microsoft Intelligent Data Platform
Recently Microsoft introduced a new platform called the Microsoft Intelligent Data Platform that fully integrates its database, analytics, and governance offerings. The new platform encompasses everything already available in the Azure data space, from Azure Data Factory and Azure Data Explorer to the Synapse Analytics products, Power BI, and the newly rebranded Purview data governance service.
-
Austrian DPA Ruling against Google Analytics Paves the Way to EU-based Cloud Services
In a recent ruling, the Austrian data regulator declared the use of Google Analytics unlawful under the EU's General Data Protection Regulation (GDPR). While the ruling is argued and worded very specifically, its implications go well beyond this particular case.
-
Data Collection, Standardization and Usage at Scale in the Uber Rider App
Uber Engineering recently published how it collects, standardises and uses data from the Uber Rider app. Rider data comprises all the rider's interactions with the Uber app. This data accounts for billions of events from Uber's online systems every day. Uber uses this data to deal with top problem areas such as increasing funnel conversion, user engagement, etc.
-
Microsoft Renames Its Azure for FHIR API to Azure Healthcare APIs
Recently Microsoft announced the renaming of its Cloud for Healthcare's Azure API for Fast Healthcare Interoperability Resources (FHIR) to "Azure Healthcare APIs." In addition to the renaming, the company is expanding support for healthcare data to include patient health data via FHIR, medical imaging data via DICOM, and medical device data via the Azure IoT Connector for FHIR.
-
Amazon SNS Gains Message Archiving and Analytics via Amazon Kinesis Data Firehose
Amazon Web Services (AWS) recently announced that Amazon SNS supports Amazon Kinesis Data Firehose subscriptions to send messages to "data lakes, data stores, and analytics services [...] without writing custom code". The new event destination also simplifies the integration of third-party service providers.
-
AWS Announces a Data Management and Analytics Solution Called Amazon FinSpace
Recently, AWS announced a data management and analytics solution purpose-built for the Financial Services Industry (FSI) called Amazon FinSpace. The service aims to reduce the time it takes for financial analysts to find and access all types of financial data for analysis.
-
Using Machine Learning in Testing and Maintenance
With machine learning, we can reduce maintenance efforts and improve the quality of products. It can be used in various stages of the software testing life-cycle, including bug management, which is an important part of the chain. We can analyze large amounts of data for classifying, triaging, and prioritizing bugs in a more efficient way by means of machine learning algorithms.
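To make the idea of classifying bug reports concrete, here is a minimal sketch of a multinomial naive Bayes text classifier written with the standard library only. The training reports, the `critical` and `minor` labels, and the `train` and `classify` functions are made up for illustration. The article does not say which algorithm or data it has in mind.

```python
import math
from collections import Counter, defaultdict


def train(reports):
    """Build word and label counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in reports:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability for `text`."""
    word_counts, label_counts, vocab = model
    total_reports = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Log prior: share of training reports carrying this label.
        score = math.log(label_counts[label] / total_reports)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            # Laplace (add-one) smoothing keeps unseen words from zeroing the score.
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, hand-written training data.
reports = [
    ("app crashes on login", "critical"),
    ("crash when saving file", "critical"),
    ("data loss after crash", "critical"),
    ("typo in settings page", "minor"),
    ("button color is wrong", "minor"),
    ("misaligned icon on settings page", "minor"),
]
model = train(reports)
severity = classify("crash on login", model)
```

In practice a team would train on thousands of historical reports and likely use an established library. The point of the sketch is only that past labeled bugs can drive automatic triage of new ones.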
-
Designing for Failure in the BBC's Analytics Platform
Last week at InfoQ Live, Blanca Garcia-Gil, principal systems engineer at the BBC, gave a session on Evolving Analytics in the Data Platform. During this session, Garcia-Gil focused on how her team prepared and designed for two types of failure: "known unknowns" and "unknown unknowns."
-
Google Brings Databricks to Its Cloud Platform
Recently Google announced a partnership with Databricks to bring their fully-managed Apache Spark offering and data lake capabilities to Google Cloud. The offering will become available as Databricks on Google Cloud.
-
Amazon Announces the General Availability of AWS Glue 2.0
AWS Glue is a fully-managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics. With AWS Glue, customers don’t have to provision or manage any resources, and only pay for resources when the service is running.
-
Accelerating Machine Learning Lifecycle with a Feature Store
A feature store is a core part of next-generation ML platforms, empowering data scientists to accelerate the delivery of ML applications. Mike Del Balso and Geoff Sims recently spoke at the Spark AI Summit 2020 conference about feature-store-driven ML development.
-
Amazon Introduces the New Streaming ETL Feature on AWS Glue
Recently, Amazon announced that AWS Glue now supports streaming ETL. With this new feature, customers can easily set up continuous ingestion pipelines that prepare streaming data on the fly and make it available for analysis in seconds.