
Google Enhances Data Privacy with Confidential Federated Analytics


Google has announced Confidential Federated Analytics (CFA), a technique designed to increase transparency in data processing while maintaining privacy. Building on federated analytics, CFA leverages confidential computing so that only predefined, publicly inspectable computations run on user data, without exposing raw data to servers or engineers.

Federated analytics allows for distributed data analysis while keeping raw data on user devices. Traditionally, devices respond to queries by sending aggregated statistics rather than individual data points. However, users have had no way to verify how their data is being processed, which presents trust and security challenges.
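The aggregation idea above can be illustrated with a minimal Python sketch. This is not Google's implementation; the function names and the word-count query are illustrative assumptions, showing only that each device reports aggregate counts rather than raw data.

```python
# Hypothetical sketch of federated analytics: each device reports only an
# aggregate (here, counts of locally typed words for a queried vocabulary),
# never the raw keystroke history itself.
from collections import Counter

def device_report(local_words, query_vocab):
    """Runs on-device: return counts only for the queried vocabulary."""
    counts = Counter(w for w in local_words if w in query_vocab)
    return {w: counts.get(w, 0) for w in query_vocab}

def server_aggregate(reports):
    """Runs server-side: sees only per-word counts, summed across devices."""
    total = Counter()
    for report in reports:
        total.update(report)
    return dict(total)

reports = [
    device_report(["hello", "gotong", "royong"], ["gotong", "royong"]),
    device_report(["royong", "selamat"], ["gotong", "royong"]),
]
aggregated = server_aggregate(reports)
```

The server learns that "royong" appeared twice across the fleet, but never which device typed it or what else was typed.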

CFA addresses this limitation by using Trusted Execution Environments (TEEs). These restrict computations to predefined analyses and prevent unauthorized access to raw data. CFA also makes all privacy-relevant server-side software publicly inspectable, allowing external verification of the data-handling process.
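The access-policy mechanism can be sketched as a key-release check: the decryption key is handed out only if the requesting computation's hash appears in a public allowlist. This is a simplified assumption-laden illustration, not Google's protocol; in a real TEE deployment the check is backed by remote attestation, and the allowlist entries and names here are hypothetical.

```python
# Hypothetical sketch of CFA-style access policy enforcement: a key-release
# service releases the data decryption key only when the requesting
# computation's binary hash is on a signed, publicly logged allowlist.
import hashlib

# Hashes of pre-approved, externally inspectable computations (illustrative).
PUBLIC_ALLOWLIST = {
    hashlib.sha256(b"dp_word_histogram_v1").hexdigest(),
}

def release_key(computation_binary: bytes, key: bytes) -> bytes:
    """Release the key only for allowlisted computations."""
    digest = hashlib.sha256(computation_binary).hexdigest()
    if digest not in PUBLIC_ALLOWLIST:
        raise PermissionError("computation not in the transparency allowlist")
    # In a real system, the key is released only inside an attested TEE.
    return key
```

Because the allowlist is public, outside auditors can verify exactly which computations could ever have decrypted user uploads.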

[Diagram: Confidential Federated Analytics scheme. Source: Google Blog]

Richard Seroter, a director of developer relations at Google Cloud, noted the importance of this advancement, stating:

This feels like a real step forward. Federated learning and computation using lots of real devices is very cool but can make privacy-oriented folks nervous.

Google has deployed CFA in Gboard, its Android keyboard, to improve new word detection across over 900 languages. Language models require updates to recognize emerging words while filtering out rare, private, or non-standard entries.

Previously, Google used LDP-TrieHH, an approach based on local differential privacy. However, this method scaled poorly and required weeks to process updates, particularly for languages with fewer users.

With CFA, the system processed 3,600 missing Indonesian words in two days, reaching more devices and languages while maintaining stronger differential privacy guarantees.

CFA operates through a structured, multi-step process that ensures data remains private while enabling meaningful analysis. The workflow consists of the following key stages:

  • Data Collection and Encryption: Devices store relevant data locally and encrypt it before upload.
  • Access Policy Enforcement: Data can only be decrypted for pre-approved computations, defined by structured policies.
  • TEE Execution: Data processing occurs within a TEE, ensuring confidentiality and preventing unauthorized modifications.
  • Differential Privacy Algorithm: The system applies a stability-based histogram approach, adding noise before identifying frequently typed words.
  • External Verifiability: The processing pipeline, software, and cryptographic proofs are logged in a public transparency ledger for external auditing.
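The differential-privacy step above can be sketched as a noisy thresholded histogram: noise is added to each aggregated count, and only words whose noisy count clears a threshold are released. This is a minimal illustration of the general technique, not Google's stability-based algorithm; the noise scale, threshold, and parameter values are assumptions.

```python
# Hypothetical sketch of a DP histogram release: add Laplace noise to each
# aggregated word count, then release only words whose noisy count exceeds a
# threshold, filtering out rare (potentially private) strings.
import math
import random

def laplace(rng, scale):
    """Sample Laplace(0, scale) noise via the inverse CDF."""
    u = rng.random() - 0.5
    sign = 1 if u >= 0 else -1
    return -scale * sign * math.log(1 - 2 * abs(u))

def dp_release(counts, epsilon=1.0, threshold=50.0, seed=0):
    """Release noisy counts above the threshold (sensitivity 1 assumed,
    i.e. each device contributes at most one count per word)."""
    rng = random.Random(seed)
    scale = 1.0 / epsilon
    released = {}
    for word, count in counts.items():
        noisy = count + laplace(rng, scale)
        if noisy > threshold:
            released[word] = round(noisy)
    return released
```

A word typed by thousands of devices survives the threshold despite the noise, while a string typed by only a handful of users is suppressed.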

Google plans to apply Confidential Federated Computations to broader federated learning tasks, enabling AI model training with strict privacy guarantees. The technique is expected to be integrated into Android Private Compute Core and other privacy-focused systems.
