Basser Seminar Series

Data Transformation for Privacy-Preserving Data Mining

Osmar R. Zaiane
University of Alberta, Canada

Wednesday 23 November 2005, 4-5pm

Basser Conference Room (Madsen Building, Room G92)

Abstract

The sharing of data is often beneficial in data mining applications. It has proven useful both in supporting decision-making processes and in promoting social goals. However, the sharing of data has also raised a number of ethical issues, including privacy, data security, and intellectual property rights.

In this talk, we focus primarily on privacy issues in data mining, notably when data are shared before mining. Specifically, we consider some scenarios in which applications of association rule mining and data clustering require privacy safeguards. Addressing privacy preservation in such scenarios is complex. One must not only meet privacy requirements but also guarantee valid data mining results. This situation indicates a pressing need to rethink the mechanisms that enforce privacy safeguards without losing the benefit of mining. Such mechanisms can lead to new privacy control methods that convert a database into a new one while preserving the main features of the original database for mining.

In particular, we address the problem of transforming a database to be shared into a new one that conceals private information while preserving the general patterns and trends from the original database. To address this challenging problem, we propose a unified framework for privacy-preserving data mining that ensures, up to a certain degree of security, that the mining process will not violate privacy. The framework encompasses a family of privacy-preserving data transformation methods, a library of algorithms, retrieval facilities to speed up the transformation process, and a set of metrics to evaluate the effectiveness of the proposed algorithms in terms of information loss and to quantify how much private information has been disclosed.

Our investigation concludes that privacy-preserving data mining is to some extent possible. Our experiments demonstrate that our framework is effective: it meets privacy requirements and guarantees valid results for the disclosure of the discovered patterns.

Speaker's biography

Osmar R. Zaïane is an Associate Professor in Computing Science at the University of Alberta, Canada. Dr. Zaïane joined the University of Alberta in July 1999. He obtained a Master's degree in Electronics at the University of Paris, France, in 1989 and a Master's degree in Computer Science at Laval University, Canada, in 1992. He obtained his Ph.D. from Simon Fraser University, Canada, in 1999 under the supervision of Dr. Jiawei Han. His Ph.D. thesis work focused on web mining and multimedia data mining. He has research interests in novel data mining algorithms, web mining, text mining, image mining, and information retrieval. The main projects he is leading are DIVE-ON, an immersed virtual environment for data warehouse and data mining results visualization; MetaWeb, a web-content mining and web-warehousing system; ExaQuest, a data mining toolbox; and a web-usage mining project on web mining for web activity evaluation and intelligent restructuring of web-based learning environments. More details can be found here: http://www.cs.ualberta.ca/~zaiane/htmldocs/bio.html