[author]Ninghui Li, Tiancheng Li, Suresh Venkatasubramanian.[/author]
2007 IEEE 23rd International Conference on Data Engineering
The $k$-anonymity privacy requirement for publishing microdata requires that each equivalence class (i.e., a set of records that are indistinguishable from each other with respect to certain “identifying” attributes) contains at least $k$ records. Recently, several authors have recognized that $k$-anonymity cannot prevent attribute disclosure. The notion of $\ell$-diversity has been proposed to address this; $\ell$-diversity requires that each equivalence class has at least $\ell$ well-represented values for each sensitive attribute.
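To make the two definitions concrete, here is a minimal sketch of how they might be checked on a small table. It is an illustration, not code from the paper: the helper names, the record layout (a list of dicts), and the toy data are all assumptions. "Well-represented" is taken in its simplest reading, namely at least $\ell$ distinct sensitive values per equivalence class; stricter readings exist.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records (dicts) that share the same quasi-identifier values."""
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        classes[key].append(rec)
    return dict(classes)


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every equivalence class contains at least k records."""
    classes = equivalence_classes(records, quasi_identifiers)
    return all(len(group) >= k for group in classes.values())


def is_distinct_l_diverse(records, quasi_identifiers, sensitive, l):
    """True if every class has at least l distinct sensitive values.

    Assumption: 'well-represented' is read as 'distinct'.
    """
    classes = equivalence_classes(records, quasi_identifiers)
    return all(
        len({rec[sensitive] for rec in group}) >= l
        for group in classes.values()
    )


# Hypothetical, already-generalized toy table.
table = [
    {"zip": "476**", "age": "2*", "disease": "flu"},
    {"zip": "476**", "age": "2*", "disease": "flu"},
    {"zip": "479**", "age": "4*", "disease": "cancer"},
    {"zip": "479**", "age": "4*", "disease": "flu"},
]

k_ok = is_k_anonymous(table, ["zip", "age"], k=2)
l_ok = is_distinct_l_diverse(table, ["zip", "age"], "disease", l=2)
```

In this toy table both classes have two records, so the table is 2-anonymous, yet everyone in the first class has the same disease. That is the attribute disclosure that $k$-anonymity alone does not prevent.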
In this paper we show that $\ell$-diversity has a number of limitations. In particular, it is neither necessary nor sufficient to prevent attribute disclosure. We propose a novel privacy notion called $t$-closeness, which requires that the distribution of a sensitive attribute in any equivalence class is close to the distribution of the attribute in the overall table (i.e., the distance between the two distributions should be no more than a threshold $t$). We choose to use the Earth Mover Distance measure for our $t$-closeness requirement. We discuss the rationale for $t$-closeness and illustrate its advantages through examples and experiments.
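The $t$-closeness requirement can be sketched in a few lines under one simplifying assumption: the sensitive attribute is categorical and any two distinct values are at ground distance 1. In that special case the Earth Mover Distance reduces to half the sum of absolute differences between the two distributions (the total variation distance). The function names, record layout, and toy data below are hypothetical, and other ground distances (for example, for ordered numerical attributes) would need a different distance function.

```python
from collections import Counter, defaultdict


def distribution(values):
    """Empirical distribution of a list of values, as {value: probability}."""
    counts = Counter(values)
    total = len(values)
    return {v: c / total for v, c in counts.items()}


def emd_equal_distance(p, q):
    """EMD between two distributions, assuming ground distance 1
    between any two distinct values (equals total variation distance)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)


def max_class_distance(records, quasi_identifiers, sensitive):
    """Largest distance between any class distribution and the overall one.

    Assumes records is non-empty.
    """
    overall = distribution([rec[sensitive] for rec in records])
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        classes[key].append(rec[sensitive])
    return max(
        emd_equal_distance(distribution(vals), overall)
        for vals in classes.values()
    )


def satisfies_t_closeness(records, quasi_identifiers, sensitive, t):
    """True if every equivalence class is within distance t of the table."""
    return max_class_distance(records, quasi_identifiers, sensitive) <= t


# Hypothetical, already-generalized toy table.
table = [
    {"zip": "476**", "age": "2*", "disease": "flu"},
    {"zip": "476**", "age": "2*", "disease": "flu"},
    {"zip": "479**", "age": "4*", "disease": "cancer"},
    {"zip": "479**", "age": "4*", "disease": "flu"},
]

worst = max_class_distance(table, ["zip", "age"], "disease")
```

Here the overall distribution is 75% flu and 25% cancer, and each of the two classes is at distance 0.25 from it, so this table would satisfy $t$-closeness for $t = 0.3$ but not for $t = 0.2$.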
Notes:
An extended version of this paper has been submitted to a journal. Email me if you’d like a copy.