RESEARCH INTERESTS

My research interests are in the area of Machine Learning, Neural Networks and Data Mining, and their applications to Pattern Recognition and to Text, Image and Video Processing. This area is concerned with developing algorithms that learn and improve from previous experience or find hidden and interesting patterns in data. In particular, I am interested in algorithms for classification (supervised learning) and clustering (unsupervised learning) and their applications. In a classification task, given a set of examples with their correct class (label), the goal is to induce a classifier that can be used to predict the class of new, unseen examples. Some of the classification tasks I have worked on are learning to automatically classify sleep stages in babies, decode motor commands from brain activity, recommend movies, recognise fingerprints, and separate spam from non-spam e-mail. In clustering, given a set of unlabelled examples, the goal is to group the data into several clusters according to their similarity. I have worked on clustering for keyframe extraction in video, video summarization, and fast, non-linear access to the relevant material, as well as for finding similar genes in microarray data and recommending movies.

Current focus
  • Learning from labelled and unlabelled examples - Supervised machine learning algorithms learn from examples that are labelled with the correct category. To build an accurate classifier, a large number of labelled examples is needed. Obtaining labelled examples requires human effort and is a time-consuming and tedious process. Semi-supervised learning tries to overcome this problem by learning from a small set of labelled examples and a large set of unlabelled examples. For example, we can build a weak classifier using the small set of labelled examples, predict the class of all unlabelled examples, select the most confidently labelled ones, add them to the training set and re-learn. Learning from both labelled and unlabelled examples can also be beneficial for unsupervised algorithms. For example, we can use the set of labelled examples to find the initial seeds for a k-means clustering algorithm.
  • Machine Learning in image and video retrieval - Large video collections already exist (e.g. archives of digital TV broadcast, film production art, geographic and surveillance information and archiving systems) and the growth of content is tremendous. This has created a great opportunity for users to search and explore them. However, the technology for efficient search and browsing is still very limited and is essentially text-based. Many applications do not come with associated text, and even when they do, the object of interest may not be mentioned in the text (manually created, closed captions or text from the audio stream). There is a need to develop approaches for image and video retrieval for collections without associated text, with limited associated text, or requiring visual queries to support the text-based retrieval. There are many learning tasks in this area, e.g. mapping low-level features (color, texture) to semantic concepts (girl playing on the beach) or organising the search results of a web-based image search engine using clustering.
  • Algorithm selection - There are many classification algorithms; which is the best one for a given problem? Given a set of algorithms and their performance on previous tasks, we can use metalearning to select an appropriate algorithm for the new problem. I am particularly interested in algorithm selection via landmarking. Daren Ler has developed a regression-based landmarking approach that evaluates the performance of a subset of algorithms, estimates the performance of the remaining ones, and chooses an appropriate algorithm.
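
The self-training loop described under "Learning from labelled and unlabelled examples" can be sketched as follows. This is only a minimal illustration, not the method used in any of the projects above: the nearest-centroid classifier, the margin-based confidence score, and the names centroids, predict, self_train, per_round and rounds are all assumptions made for the example. It also assumes 2-D points and at least two classes in the labelled set.

```python
import math


def centroids(points, labels):
    """Return the mean point of each class (the 'weak classifier')."""
    sums, counts = {}, {}
    for (x, y), c in zip(points, labels):
        sx, sy = sums.get(c, (0.0, 0.0))
        sums[c] = (sx + x, sy + y)
        counts[c] = counts.get(c, 0) + 1
    return {c: (sx / counts[c], sy / counts[c]) for c, (sx, sy) in sums.items()}


def predict(cents, p):
    """Return (label, confidence) for point p.

    Confidence is an assumed heuristic: the distance gap between the
    nearest and the second-nearest class centroid.
    """
    dists = sorted((math.dist(p, m), c) for c, m in cents.items())
    (d1, c1), (d2, _) = dists[0], dists[1]
    return c1, d2 - d1


def self_train(labelled, labels, unlabelled, per_round=2, rounds=10):
    """Repeatedly add the most confidently predicted unlabelled points."""
    points, labels, pool = list(labelled), list(labels), list(unlabelled)
    for _ in range(rounds):
        if not pool:
            break
        cents = centroids(points, labels)          # learn from the current training set
        scored = sorted(
            ((predict(cents, p), p) for p in pool),
            key=lambda item: item[0][1],            # sort by confidence only
            reverse=True,
        )
        for (label, _), p in scored[:per_round]:    # keep the most confident predictions
            points.append(p)
            labels.append(label)
            pool.remove(p)
    return centroids(points, labels)                # re-learn on the enlarged set


# Toy data: one labelled example per class, six unlabelled examples.
model = self_train(
    labelled=[(0, 0), (10, 10)],
    labels=["a", "b"],
    unlabelled=[(1, 1), (0, 2), (9, 9), (10, 8), (2, 0), (8, 10)],
)
label, confidence = predict(model, (1, 2))
```

Adding only a few confident examples per round is a design choice in this sketch: early mistakes made by the weak classifier are less likely to be reinforced than if all unlabelled examples were added at once.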
Grants
  • Smart Services CRC grant: Personalisation, J. Kay, J. Curran, I. Koprinska, K. Yacef. $253K, 2008-2009.
  • ARC Discovery near miss: Sequential pattern analysis in learning traces, K. Yacef, J. Kay, I. Koprinska, $15K, 2007.
  • Bridging grant from the University of Sydney (ARC Discovery near miss): Data Mining of Learner Models, K. Yacef, J. Kay, I. Koprinska, $36K, 2006.
  • Smart Internet Technology CRC grant: Bridging the Gap: Smart Support for the Intergenerational Distributed Family, B. Kummerfeld, J. Kay, K. Yacef, I. Koprinska, J. Poon, $270K, 2005-2007.
  • Smart Internet Technology CRC grant: Machine Learning for the Smart Personal Assistant, I. Koprinska and J. Poon, $19K, 2003-2004; collaboration with ANU and Griffith University.
  • SESQUI Early Career grant: Video Segmentation and Summarization Using Neural Networks, I. Koprinska, $16K, 2002.
  • Smart Internet Technology CRC: Knowledge Acquisition and Machine Learning, J. Davis, J. Kay, I. Koprinska, J. Poon, M. Takatsuka and K. Yacef, $46K, 2002.

Research students

current

past
    PhD

    Honours/engineering final year/MIT research projects/internships

Third year software development projects

Projects offered
Honours projects offered in 2008
Honours projects offered in 2006

Links

Machine Learning, Data Mining, Neural Networks, Pattern Recognition
MLnet Online Information Service
ML Database Repository at UC Irvine
WEKA - ML Algorithms for Data Mining at Waikato Univ.
UTCS Machine Learning Research Group
Microsoft Belief Networks Tools
RuleQuest
MLC++, A Machine Learning Library in C++
KD Nuggets Directory
The Data Mine
WinMine Toolkit Home Page
Neural Networks at PNNL
Pattern Recognition page at Delft Univ. of Technology
 
Artificial Intelligence

CMU Artificial Intelligence Repository
Russell and Norvig, Artificial Intelligence: A Modern Approach

Image and Video Database Systems, Digital Libraries
Informedia Digital Library at CMU
QBIC at IBM
PhotoBook at MIT Media Lab
VisualSEEk at Columbia University
ViBE at Purdue University
Digital Library at UC Berkeley
NZ Digital Library at the Univ. of Waikato

MPEG

MPEG Pointers and Resources
MPEG-2 Video Codec

Other
Research office