RESEARCH INTERESTS
My research interests are in machine learning, neural networks and data
mining, and their applications to pattern recognition and to text, image
and video processing. This area is concerned with developing algorithms
that learn and improve from previous experience, or that find hidden and
interesting patterns in data. In particular, I am interested in
algorithms for classification (supervised learning) and clustering
(unsupervised learning) and their applications. In a classification
task, given a set of examples with their correct class (label), the goal
is to induce a classifier that can be used to predict the class of new,
unseen examples. Classification tasks I have worked on include learning
to automatically classify sleep stages in babies, decode motor commands
from brain activity, recommend movies, recognise fingerprints and filter
spam from non-spam e-mail.
In clustering, given a set of unlabelled examples, the goal is to group
the data into several clusters according to their similarity. I have
worked on clustering for keyframe extraction in video, video
summarization and fast, non-linear access to the relevant material; for
finding similar genes in microarray data; and for recommending movies.
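To make the distinction between the two tasks concrete, here is a minimal sketch in Python. It is only an illustration: the function names (predict_1nn, kmeans), the toy data and the choice of nearest-neighbour classification and plain k-means are assumptions made for this example, not the methods used in the projects listed above.

```python
from math import dist  # Euclidean distance, Python 3.8+


def predict_1nn(train, query):
    """Classification: return the label of the labelled example closest to `query`."""
    _, label = min(train, key=lambda example: dist(example[0], query))
    return label


def kmeans(points, centroids, iterations=10):
    """Clustering: group unlabelled points around the given initial centroids."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters, centroids


# Hypothetical toy data: two features per example.
labelled = [((0.0, 0.0), "non-spam"), ((5.0, 5.0), "spam")]
print(predict_1nn(labelled, (4.0, 4.5)))

unlabelled = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
print(kmeans(unlabelled, [(0.0, 0.0), (5.0, 5.0)]))
```

The classifier needs the labels to make a prediction, while the clustering routine only sees the feature vectors and a starting guess for the cluster centres.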
Current focus
- Learning from labelled and unlabelled examples - Supervised machine
learning algorithms learn from examples that are labelled with the
correct category. To build an accurate classifier, a large number of
labelled examples is needed. Obtaining labelled examples requires human
effort and is a time-consuming and tedious process. Semi-supervised
learning tries to overcome this problem by learning from a small set of
labelled and a large set of unlabelled examples. For example, we can
build a weak classifier using the small set of labelled examples,
predict the class of all unlabelled examples, select the most
confidently labelled ones, add them to the training set and re-learn.
Learning from both labelled and unlabelled examples can also be
beneficial for unsupervised algorithms. For example, we can use the set
of labelled examples to find the initial seeds for a k-means clustering
algorithm.
- Machine Learning in image and video retrieval - Large video
collections already exist (e.g. archives of digital TV broadcast, film
production art, geographic and surveillance information and archiving
systems) and the growth of content is tremendous. This has created a
great opportunity for users to search and explore them. However, the
technology for efficient search and browsing is still very limited and
is essentially text-based. Many applications do not come with associated
text, and even when they do, the object of interest may not be mentioned
in the text (manually created, closed captions or text from the audio
stream). There is a need to develop approaches for image and video
retrieval for collections without associated text, with limited
associated text, or requiring visual queries to support the text-based
retrieval. There are many learning tasks in this area, e.g. mapping
low-level features (color, texture) to semantic concepts (girl playing
on the beach) or organising the search results of a web-based image
search engine using clustering.
- Algorithm selection - There are many classification algorithms; which
is the best one for a given problem? Given a set of algorithms and their
performance on previous tasks, we can use metalearning to select an
appropriate algorithm for the new problem. I am particularly interested
in algorithm selection via landmarking. Daren Ler has developed a
regression-based landmarking approach that evaluates the performance of
a subset of algorithms, estimates the performance of the remaining
algorithms from it, and chooses an appropriate algorithm.
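The semi-supervised loop described under "Learning from labelled and unlabelled examples" (build a weak classifier, label the unlabelled pool, keep the most confident predictions, re-learn) can be sketched as follows. This is a minimal illustration under stated assumptions: the base classifier is a nearest-centroid model, confidence is taken to be the distance margin between the two closest class centroids, and all names (centroids_of, predict_with_confidence, self_train) and the toy data are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroids_of(labelled):
    """Weak classifier (assumed for this sketch): one mean vector per class."""
    by_class = {}
    for x, y in labelled:
        by_class.setdefault(y, []).append(x)
    return {y: tuple(sum(c) / len(xs) for c in zip(*xs)) for y, xs in by_class.items()}


def predict_with_confidence(model, x):
    """Return (label, confidence); confidence is the margin between
    the second-closest and the closest class centroid."""
    ranked = sorted(model, key=lambda y: dist(x, model[y]))
    best = ranked[0]
    if len(ranked) == 1:
        return best, float("inf")
    return best, dist(x, model[ranked[1]]) - dist(x, model[best])


def self_train(labelled, unlabelled, per_round=1, rounds=10):
    """Repeatedly move the most confidently labelled examples into the training set."""
    labelled = list(labelled)
    pool = list(unlabelled)
    for _ in range(rounds):
        if not pool:
            break
        model = centroids_of(labelled)                      # (re-)learn
        scored = [(predict_with_confidence(model, x), x) for x in pool]
        scored.sort(key=lambda item: item[0][1], reverse=True)
        for (label, _), x in scored[:per_round]:            # most confident first
            labelled.append((x, label))
            pool.remove(x)
    return centroids_of(labelled), labelled


# Hypothetical toy data: two labelled examples and three unlabelled ones.
model, final = self_train(
    [((0.0, 0.0), "a"), ((10.0, 10.0), "b")],
    [(1.0, 1.0), (9.0, 9.0), (4.0, 4.0)],
)
print(final)
```

Under the same assumptions, the per-class means returned by centroids_of on the labelled set alone are one possible way to obtain the initial seeds for k-means mentioned in that bullet.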
Grants
- Smart Services CRC grant: Personalisation, J. Kay, J. Curran, I.
Koprinska, K. Yacef. $253K, 2008-2009.
- ARC Discovery near miss: Sequential pattern analysis in learning
traces, K. Yacef, J. Kay, I. Koprinska, $15K, 2007.
- Bridging grant from the University of Sydney (ARC Discovery near
miss): Data Mining of Learner Models, K. Yacef, J. Kay, I. Koprinska,
$36K, 2006.
- Smart Internet Technology CRC grant: Bridging the Gap: Smart Support
for the Intergenerational Distributed Family, B. Kummerfeld, J. Kay,
K. Yacef, I. Koprinska, J. Poon, $270K, 2005-2007.
- Smart Internet Technology CRC grant: Machine Learning for the Smart
Personal Assistant, I. Koprinska and J. Poon, $19K, 2003-2004;
collaboration with ANU and Griffith University.
- SESQUI Early Career grant: Video Segmentation and Summarization Using
Neural Networks, I. Koprinska, $16K, 2002.
- Smart Internet Technology CRC: Knowledge Acquisition and Machine
Learning, J. Davis, J. Kay, I. Koprinska, J. Poon, M. Takatsuka and
K. Yacef, $46K, 2002.
Research students
current
- Tim O'Keefe, Honours student (engineering thesis), Sentiment
analysis [ trac ]
- Rohen Sood, Honours student (engineering thesis), Electricity load
forecasting (with Prof. Vassilios Agelidis)
- Kate Graves, internship with Qantas, Fuel prediction, associate
supervisor
past
PhD
- Daren Ler, PhD student, topic: Algorithm selection via landmarking
[ icml05 ] [ flairs05 ] [ ajcai04_extendedTR ] [ icml04 ]
[ his04_extendedTR ] [ TR ]
- Jason Chan, PhD student (with Dr Josiah Poon), topic: Background
knowledge in semi-supervised learning [ ijait08 ] [ is07 ] [ flairs07 ]
[ adcs04 ] [ wi04 ]
Honours/engineering final year/MIT research projects/internships
- 2008, Anthony Setiawan, engineering final year, Electricity load
prediction (with Prof. Vassilios Agelidis); Research Conversazione 2008
award [ ijcnn09 ]
- 2008, Daniel Howell, Painting with the Wii; Research Conversazione
2008 award
- 2008, Sergey Mainich, Story segmentation (with Daniel Lloyd Jones from
Visionbytes)
- 2008, Omar Al Zoubi, MIT, Classification of Brain-Computer Interface
Data [ ausDM'08 ]
- 2007, Martin Schwank, MIT, Using Wii for painting
- 2006, Andrew Hutchings, Honours, Content-based image retrieval using
Google image search
- 2006, Dean Cummins, Honours, Personalised learning using resources
from the open web (with Dr Kalina Yacef) [ ajiips06 ]
- 2006, Adam Hotz, engineering final year, E-mail worm detection using
machine learning
- 2005, Felix Feger, MIT; research thesis mark: 89 [ eusipco06 ]
[ ijcnn06 ]
- Jason Chan, Honours, Co-training with a random split of a single
natural feature set (with Dr Josiah Poon)
- 2003, Olivier Fayau from Ecole Superieure d’Ingenierie Leonard De
Vinci, Paris; Combining neural networks and genetic algorithms for
e-mail classification (with Dr J. Poon)
- 2003, Félix Trieu from Ecole Superieure d’Ingenierie Leonard De Vinci,
Paris; Decision forest (with J. Poon) [ adcs03 ]
- 2002, Elisabeth Crawford, currently a PhD student at CMU, Honours
mark: 95.5, Automated Text Categorization with Applications to E-mail
Management (with Jon Patrick) [ adcs02 ] [ adcs04 ]
- Christine Remediakis, Text Mining in DNA Microarrays (with Dr Rohan
Williams from Centenary Institute and Dr Josiah Poon)
- 2002, James Clark, Honours, currently RA at the University of Sydney,
E-mail Classification: A Hybrid Approach Combining Genetic Algorithms
with Neural Networks (with Josiah Poon in sem. 2) [ is07 ] [ kes05 ]
[ ijcnnt04 ] [ cost04 ] [ icann03 ] [ wi03 ] [ adcs03 ]
- 2002, Harry Mak, Honours, Content-Based Movie Recommendation Using
Learning for Text Categorization (with Josiah Poon in sem. 2) [ wi03 ]
- 2002, Ehremin Avila, Honours, Mining of Microarray Gene Expression
Data
- 2001, Kim Jackson, Honours, Clustering of Gene Expression Data
(collaboration with Dr Janette Burgess, Dept. of Pharmacology)
[ iconip02 ]
- 2001, Anna Ceguerra, currently at IBM; Honours mark: 95; Fingerprint
Verification, collaboration with Andrea Aizza and Piero Calucci from
Tender S.p.A., and Fabio Vitali and Sergio Carrato from the Image
Processing Laboratory, University of Trieste, Italy [ icann02 ]
[ icpr02 ]
Third year software development projects
- Anindha Party, High Speed Dynamic Content Filtering (with Teewoon
Tan, Darren Williams and Matt Barrie from Sensory Networks), Software
Development Project Advanced (won Departmental prize for best project -
3rd place)
- Alistair Reid, Processing of MPEG-2 video, Talented Student Program
project
- Christine Remediakis, Text Mining in DNA Microarrays (with Rohan
Williams from Centenary Institute and Josiah Poon)
- Michael Dam, Kire Tosevski, Kathlene Belista and Mohammad El-Ali,
Classification of Brain-Computer Interface Data, AI project, entered the
international BCI competition and obtained 4th place
- Clinton Freeman, Michael Crawford, David Storey and Haresh Wadiwel,
Classification of Brain-Computer Interface Data, AI project
- EliteAI: Damien McMonigal, James Clark, Feifei Huang, Helen Lee, Kin
Nguyen, Intelligent Movie Recommender System, Product Development
Project; best project award - 1st place
- Jyot Boparai, Ehremin Avila, Harry Mak, Kien Luu, Intelligent Movie
Recommender System, Product Development Project
- Keith Player, Clustering of gene expression data, Software
Development Project (Advanced)
- Damien McMonigal, James Clark, Lawrence Suen, Tommy Suen, Movie
Recommender, AI project
- Jyot Boparai, Ehremin Avila, Sai Thurein, Harry Mak, Movie
Recommender, AI project
- Seung Yang, Roman Belousov, Yoonhee Choi, Jason Kim, Movie
Recommender, AI project
- Travis Wells, Natasa Jetvic, Interactive Web-Based Tutor on
Artificial Neural Networks, AI project
Projects offered
Honours projects offered in 2008
Honours projects offered in 2006
Links