Data Content on InfoQ
-
Setting up a Data Mesh Organization
A data mesh organization consists of producers, consumers, and the platform. According to Matthias Patzak, the mission of the platform team is to make the lives of producers and consumers simple, efficient, and stress-free. Data must be discoverable and understandable, trustworthy, and shared securely and easily across the organization.
-
Data Teams Survey: Lag in DataOps and Value Delivered
We report on Jesse Anderson's 2024 Data Teams Survey, which showed a lag in DataOps capabilities, slow LLM adoption, and a concerning decline in perceived value creation by data teams. It called out the importance of teams that combine data science, engineering, and operations capabilities. We also cover Petr Janda's recent podcast on the need for more engineering rigour for parity with other teams.
-
Anthropic Unveils Contextual Retrieval for Enhanced AI Data Handling
Anthropic has announced Contextual Retrieval, a significant advancement in AI systems' interaction with extensive knowledge bases. This technique addresses the challenge of context loss in Retrieval-Augmented Generation (RAG) systems by enriching text chunks with contextual information before embedding or indexing.
-
How to Develop a Culture of Quality in Software Organizations
According to Erika Chestnut, software organizations can develop a culture of quality with a clear commitment from leadership, not only to endorse quality efforts in software teams, but also to actively champion them. This commitment and advocacy should manifest in data-driven decision-making that strikes a balance between innovation and quality, ensuring that the highest quality is maintained.
-
Cloudflare One Data Protection Suite for Data Security across Web, Private, and SaaS Applications
Cloudflare recently announced its One Data Protection Suite, a unified set of advanced security solutions designed to protect data across every environment – web, private, and SaaS applications. The company states the suite is powered by Cloudflare’s Security Service Edge (SSE), allowing customers to streamline compliance in the cloud and to mitigate data exposure and loss of source code.
-
6 Tracks Not to Miss at QCon San Francisco, October 2-6, 2023: ML, Architecture, Resilience & More!
At InfoQ’s international software development conference, QCon San Francisco (October 2-6, 2023), senior software practitioners driving innovation and change in software development will explore real-world architectures, technology, and techniques to help you solve your own challenges.
-
QCon New York: Five Tracks to Level-up on the Latest Software Development Practices
The 2023 edition of the QCon New York (June 13-15) software development conference, hosted by InfoQ, is set to bring together over 800 senior software developers. The three-day conference will feature over 80 innovative senior software practitioners from early adopter companies sharing how they are solving current challenges, providing new ideas and perspectives across multiple domains.
-
Zero-Copy In-Memory Sharing of Large Distributed Data: V6d
Zero-copy and in-memory data manager Vineyard (v6d) is maintained as a CNCF sandbox project and provides distributed operators that can be utilized to share immutable data within or across cluster nodes. V6d is of interest particularly for deep network training on big (sharded) datasets such as large language and graph models.
-
Unsupervised Object Detection and Semantic Segmentation Using Deep Learning
Meta AI released CutLER, a state-of-the-art zero-shot unsupervised object detector which improves detection performance by over 2.7 times on 11 benchmark datasets for different domains like video frames, paintings, sketches, etc. This model’s simplicity allows compatibility with different object-detection architectures across different domains.
-
AWS Announces DataZone, a New Data Management Service to Govern Data
At AWS re:Invent, Amazon Web Services announced Amazon DataZone, a new data management service that makes it faster and easier for customers to catalog, discover, share, and govern data stored across AWS, on-premises, and third-party sources.
-
Amazon SageMaker Clarify Now Supports Online Explainability for ML Predictions
Amazon is announcing that Amazon SageMaker Clarify now supports online explainability by providing explanations for a machine learning model’s individual predictions in near real-time on live endpoints.
-
AWS DataSync Discovery Preview Edition Supports Automated Data Collection and Storage Recommendation
Amazon is announcing the public preview of AWS DataSync Discovery. This new feature of AWS DataSync enables users to better understand on-premises storage usage through automated data collection and analysis, quickly identify data to migrate, and evaluate recommended AWS Storage services for data.
-
Amazon Announces the Improvement of ML Models to Better Identify Sensitive Data on Amazon Macie
Amazon is announcing a new capability to create allow lists in Amazon Macie. Text or text patterns that users do not want Macie to report as sensitive data can now be specified in allow lists. Amazon Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect sensitive data in AWS.
-
Amazon Launches What-If Analyses for Machine Learning Forecasting Service Amazon Forecast
Amazon is announcing that Amazon Forecast, its time-series forecasting service based on machine learning, can now run what-if analyses to determine how different business scenarios can affect demand estimates. What-if analysis is an effective business technique for simulating hypothetical scenarios and stress-testing planning assumptions by recording potential outcomes.
-
Amazon Comprehend Announces the Reduction of the Minimum Requirements for Entity Recognition
Amazon is announcing that it has lowered the minimum requirements for training a recognizer with plain-text CSV annotation files as a result of recent advances in the models powering Amazon Comprehend. Now, you need just three documents and 25 annotations for each entity type to create a custom entity recognition model.