Automatic generation of aggregated analytics for de-anonymization

Bautista Marelli

Abstract

We propose a systematic methodology for producing vulnerable aggregated analytics: summary statistics that, once released, enable partial reconstruction of the underlying dataset even when it has been k‑anonymized.

Our goal is to raise awareness of the latent disclosure risks accompanying the publication of seemingly innocuous aggregates. We formalize the conditions under which an attacker can combine the released analytics with limited auxiliary information to infer an individual‑level sensitive numerical attribute (e.g. salary). We derive tight bounds that relate the dimensionality of the table, the choice of k, and the class of aggregates to the amount of recoverable data.

The presentation will detail the mathematical foundations of the reconstruction attack and illustrate them with a running example. It will conclude with an outline of future research directions.
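
To give a rough intuition for how a released aggregate and auxiliary information can combine, here is a minimal sketch of a simple differencing attack. It is not the methodology presented in the talk, only an illustration under stated assumptions: the mean salary of a k‑anonymized group is published, and the attacker already knows the salaries of all but one member. The function name `recover_salary` and the sample figures are hypothetical.

```python
from statistics import mean


def recover_salary(group_mean, group_size, known_salaries):
    """Recover the one unknown salary in a group from its published mean.

    Assumption (illustrative only): the attacker knows the exact salaries
    of every other member of the group, i.e. group_size - 1 values.
    """
    if len(known_salaries) != group_size - 1:
        raise ValueError("exactly group_size - 1 salaries must be known")
    # The published mean times the group size gives the group total;
    # subtracting the known salaries leaves the remaining one.
    return group_mean * group_size - sum(known_salaries)


# Hypothetical equivalence class with k = 3 (made-up figures).
salaries = [52_000, 61_000, 70_000]
published_mean = mean(salaries)  # the seemingly innocuous aggregate

# Auxiliary information: the attacker knows the first two salaries.
recovered = recover_salary(published_mean, len(salaries), salaries[:2])
print(recovered)
```

In this toy setting the unknown value is recovered exactly. With less auxiliary information, one would expect an attacker to obtain only bounds or partial information, which is the kind of relationship the abstract describes.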

Date
Jun 17, 2025 10:30 AM — 11:30 AM

Bautista is a Computer Science Master’s student at the National University of Rosario, Argentina. This research is part of his Master’s thesis.

Bautista Marelli’s webpage