Basser Seminar Series

Feature Selection and Caching: Extensions of the Relevant-Set Correlation Model

Speaker: Professor Michael E. Houle
Visiting Professor, National Institute of Informatics, Tokyo, Japan

Time: Friday 14 May 2010, 4:00-5:00pm
Refreshments will be available from 3:30pm

Location: The University of Sydney, School of IT Building, Lecture Theatre (Room 123), Level 1


In recent years, a number of methods have been proposed for data clustering that make use of so-called "shared-neighbor" information. The rationale behind all such approaches is that dense, interrelated data clusters can be revealed by the degree to which the neighborhoods of their members overlap. In this talk, we look at two extensions of the Relevant-Set Correlation (RSC) model for data clustering. The first extends the model to the case of multimodal information, in which each object is associated with several ranked relevant sets (neighborhoods), each with its own collection of data features and similarity measures. The second extension applies the modeling methodology of RSC to the problem of active caching of query-by-example ranked result lists. Here, the goal is to avoid disk access latency by estimating a query result from cached information whenever the desired result is missing from the cache. The extension to caching is joint work with Vincent Oria and Umar Qasim of NJIT.

Keywords: clustering, feature selection, active caching, correlation.

Speaker's biography

Michael Houle obtained his PhD degree from McGill University in 1989, in the area of computational geometry. Since then, he has developed research interests in algorithmics, data structures, and relational visualization, first as a research associate at Kyushu University and the University of Tokyo in Japan, and from 1992 at the University of Newcastle and the University of Sydney in Australia. From 2001 to 2004, he was a Visiting Scientist at IBM Japan's Tokyo Research Laboratory, where he first began working on approximate similarity search and shared-neighbor clustering methods for data mining applications. Currently, he is a Visiting Professor at the National Institute of Informatics (NII), Japan.