Project offerings for Semester 2 2015

Click on the supervisor's name for a description of the projects they are offering.

Projects will be added over the coming weeks.


Supervisor Project Credit points
Weidong (Tom) Cai Intelligent 3D Single Neuron Reconstruction 18
Neuroimaging Computing for Early Detection of Dementia 18
Context Modeling for Medical Image Retrieval 18
Structural Feature Representation for Image Pattern Classification 18
Software Development for Analysis of Spatial Contexts in Lung Images 12
Vera Chung A new feature selection technique for Multimedia Information Retrieval 18
Context-based Video Retrieval System for the Life-Log Applications 18
A new data mining algorithm for Network Intrusion Detection System 18
Video Object tracking for video surveillance 18
Vincent Gramoli USyd Malware Traffic Analysis 18
Benchmarking Concurrent Data Structures on Multi-core Machines 18
Evaluating Consensus Protocols in Distributed Systems 18
Joachim Gudmundsson Role-assignment of football players 12
Dominant regions of football players 12
Seokhee Hong Scalable Visual Analytics of Big Data 12 or 18
Visualization and Analysis of Large and Complex Biological Networks and Social Networks 12 or 18
2.5D Graph Navigation and Interaction Techniques 12 or 18
Drawing Algorithms for Almost Planar Graphs 12 or 18
MultiPlane Graph Embedding (2.5D Graph Embeddability) 12 or 18
Bryn Jeffries Deep Learning for Classifying Sleep Stages in EEG 18
Advanced Sandboxing 12
Front-end enhancements 12
Advanced Reporting 12
Gamification with Active Databases 12
Jinman Kim Medical Image Visualisation 12 or 18
Ray Profile Analysis for Volume Rendering Visualisation 12 or 18
MDT Visualisation 12 or 18
Exploratory Visualisation with Virtual and Augmented Reality Devices 12 or 18
Medical Avatar 12 or 18
Medical Image Visual Analytics 12 or 18
Kevin Kuan Emotional Contagion in Online Social Networks 18
The Role of Emotion in Mobile Personalisation 18
The Role of Emotion in Group-Buying 18
Online Social Influence in Group-Buying 18
Temporal Features and Consumer Evaluations of Group-Buying 18
Wisdom of the Crowd Effect in Electronic Commerce 18
Online Consumer Reviews in Electronic Commerce 18
Information Overload and Online Decision Making in Electronic Commerce 18
Josiah Poon TOCA: Toolbox for Complexity Analysis 12
Simon Poon Analytical models for Causal Complexities 12 or 18
Mobile Guardian Angel 12 or 18
Data visualisation: examining the healthiness of the food supply and the global burden of nutrition-related disease 12 or 18
A framework to model organizational Complexities in IT business value study 12 or 18
Uwe Roehm Data Privacy Analysis of Health Tracking Services 12 or 18
A Touch Interface for SQL Databases 18
Database Cluster Management Tool 12
PowerDB: Freshness-aware Replication in a Database Cluster 12 or 18 MIT
Bio-Data Processing using Map/Reduce 12 or 18 MIT
Zhiyong Wang Mobile and Internet Multimedia Computing 12 or 18
Remote-sensing Image Analysis 18
Human Motion Analysis (with multiple sub-tasks) 12 or 18
Medication Monitoring App (in collaboration with Jesse Jansen from the School of Public Health) 12

Project supervised by Weidong (Tom) Cai

Intelligent 3D Single Neuron Reconstruction (18cp)
Single neuron reconstruction is one of the major domains in computational neuroscience, a frontier research area at the intersection of signal processing, computer vision, artificial intelligence and learning theory, applied mathematics, fundamental neuroscience and quantum physics. The 3D morphology of a neuron determines its connectivity, integration of synaptic inputs and cellular firing properties, and also changes dynamically with its activity and the state of the organism. Analyzing the three-dimensional shape of neurons in an unbiased way is critical to understanding how neurons function and to developing applications that model neural circuitry. Such analysis can be enabled by reconstructing tree models from microscopic image stacks by manual tracing. However, such a manual process is tedious and hard to scale. This project aims to develop novel computational approaches for automatic 3D reconstruction of neuron models from noisy microscopic image stacks. Such methods would enable faster and more accurate neuron modeling, furthering our knowledge of single neuron functionality and the neural network connectome.

Neuroimaging Computing for Early Detection of Dementia (18cp)
Dementia is one of the leading causes of disability in Australia, and the socioeconomic burden of dementia will be aggravated over the forthcoming decades as people live longer. So far there is no cure for dementia, and current medical interventions may only halt or slow down the progression of the disease. Therefore, early detection of dementia symptoms is the most important step in the management of the disease. Multi-modal neuroimaging has been increasingly used in the evaluation of patients with early dementia in the research setting, and shows great potential in mental health and clinical applications. The objective of this project is to design and develop novel neuroimaging computing models and methods to investigate patterns of dementia pathology, with a focus on early detection of the disease.

Context Modeling for Medical Image Retrieval (18cp)
Content-based medical image retrieval is a valuable mechanism to assist patient diagnosis. Unlike text-based search engines, it evaluates the similarity of images by comparing visual features. Consequently, how best to encode complex visual features in a comparable mathematical form is crucial. Unlike the image retrieval techniques proposed for general imaging, in the medical domain disease-specific contexts need to be modeled as the retrieval target. This project aims to study the various techniques of visual feature extraction and context modeling in medical imaging, and to develop new methodologies for content-based image retrieval in various medical applications.

Structural Feature Representation for Image Pattern Classification (18cp)
Image pattern classification has a wide variety of applications, such as differentiation of disease patterns and detection of objects of interest. Classification performance depends largely on the descriptiveness and discriminative power of the feature representation. Consequently, how best to model complex visual features, especially complex structural interactions, is crucial. Many different image feature extraction methods have been proposed in the literature, yet their performance is still unsatisfactory and feature extraction remains a hot topic in computer vision. This project aims to study the various techniques of structural feature representation, and to develop new methodologies for various applications in the medical imaging domain.

Software Development for Analysis of Spatial Contexts in Lung Images (12cp)
Spatial contexts are essential for diagnosing and treating many types of lung diseases. The aim of this project is to develop image analysis software to effectively extract and describe the spatial contexts in the lung with user interaction. The software will be useful in assisting physicians to identify a proper treatment plan. The source images are stored in DICOM format, the industry standard in diagnostic medical imaging. The spatial contexts will be analysed by adaptively dividing the lung into multiple regions and computing a set of features for each region. A reference design is available to guide the development.
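The region-based analysis described above can be sketched in a few lines. This is only an illustration (the project itself targets Objective-C and DICOM data, and would divide the lung adaptively rather than on a fixed grid): split an image into a grid of regions and compute one simple feature, the mean intensity, per region.

```python
# Sketch (not the project's actual pipeline): divide a 2D image into a
# fixed grid of regions and compute a simple per-region feature (mean
# intensity). A real implementation would segment the lung first and
# adapt the region boundaries to its anatomy.

def region_features(image, rows, cols):
    """Split `image` (a list of equal-length rows) into rows x cols
    blocks and return the mean intensity of each block, row-major."""
    h, w = len(image), len(image[0])
    features = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * h // rows, (r + 1) * h // rows
            x0, x1 = c * w // cols, (c + 1) * w // cols
            block = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            features.append(sum(block) / len(block))
    return features

# A 4x4 toy "image": left half dark (0), right half bright (100).
img = [[0, 0, 100, 100] for _ in range(4)]
print(region_features(img, 2, 2))  # [0.0, 100.0, 0.0, 100.0]
```

Each region's feature vector (here a single mean; in practice texture or shape descriptors) is what the spatial-context analysis would then compare across regions.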

Requirement: Objective-C

Project supervised by Vera Chung

A new feature selection technique for Multimedia Information Retrieval (18cp)
With the rapidly increasing use of multimedia data such as audio, image and video, there is a strong demand for efficient techniques for storage, browsing, indexing, and retrieval to exploit the full benefit of the explosive growth and application of multimedia data. This project will study and design new feature selection techniques for applications such as digital museums, entertainment, internet shopping, and medical image retrieval systems.

Requirements: Good programming skills in Java

Context-based Video Retrieval System for the Life-Log Applications (18cp)
The custom of writing a diary is common all over the world. This fact shows that many people like to log their everyday lives. However, to write a complete diary, a person must recollect and note everything that was experienced without missing anything. For an ordinary person, this is impossible. It would be nice to have a secretary who observed your everyday life and wrote your diary for you. In the future, a wearable computer may become such a secretary-agent. This project will study and develop a “life-log agent” that logs our everyday life on storage devices instead of paper, using multimedia devices such as a small camera instead of a pencil.

Requirements: Good programming skills in Java

A new data mining algorithm for Network Intrusion Detection System (18cp)
Network Intrusion Detection Systems (NIDS) are increasingly in demand as networked machines and Internet technologies spread rapidly. As a result, organisations increasingly need to detect unauthorized activities by external and internal attackers so that the integrity of organisational information can be protected. This project will develop a new data mining technique for NIDS.
Requirements: Good programming skills in Java

Video Object tracking for video surveillance (18cp)
This project is to study object tracking algorithms, including Particle Swarm Optimization (PSO), and apply them to video tracking applications. We have real-world animal video data captured from a zoo. You will implement the object tracking algorithms on these videos for animal tracking, compare their performance, and customise or improve the tracking results for this kind of application. It is also possible to improve the tracking results by combining the video data with the simultaneous RFID data available (offline).
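The core of PSO is a small update loop, which the sketch below illustrates. This is not the project's tracker: the cost function here is just squared distance to a hidden target position, standing in for the appearance-matching cost a real video tracker would compute against each frame; all parameter values are conventional defaults, not taken from the project.

```python
import random

# Minimal particle swarm optimisation sketch: particles search a 2D
# frame for the position minimising a matching cost. Here the cost is
# squared distance to a hidden target; in a tracker it would compare an
# appearance model (e.g. a colour histogram) against the frame.

def pso(cost, bounds, n_particles=30, iters=60, w=0.7, c1=1.5, c2=1.5):
    rnd = random.Random(42)                      # fixed seed for repeatability
    pos = [[rnd.uniform(*b) for b in bounds] for _ in range(n_particles)]
    vel = [[0.0, 0.0] for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # each particle's best position
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(2):
                # inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * rnd.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rnd.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest

target = (120.0, 80.0)  # hidden object position in a 640x480 frame
found = pso(lambda p: (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2,
            bounds=[(0, 640), (0, 480)])
```

In a tracking loop the same optimisation would be rerun per frame, seeding the swarm near the previous frame's estimate.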

Requirements: Good programming skills in Java

Project supervised by Vincent Gramoli

USyd Malware Traffic Analysis (18cp)
In 2014, McAfee estimated the annual cost to the global economy from cybercrime as more than $400 billion. This includes the cost related to the stolen personal information of hundreds of millions of people.
In the 2012 cybercrime and security survey report commissioned by Australia's National CERT, 20% of the surveyed companies identified cybersecurity incidents in the previous year, with 21% of these incidents involving trojan or rootkit malware.

The goal of this project is to analyze the traffic at the University of Sydney using powerful multi/many-core servers connected through a 10Gbps network running Intrusion Detection and Prevention Systems to learn about the usage of malware by the machines accessing the Internet from the University campus.

The first phase of the project consists of deploying software components at the servers located between the university network and the Internet to help identify malware threats. The second phase consists of gathering accesses to infected websites. The third phase consists of quantifying the threats by analyzing the collected data and drawing conclusions.

The project requires knowledge of some network technologies like tcpdump, wireshark, pcap files or suricata.
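As a taste of what working with capture data involves, the snippet below parses the 24-byte global header that starts every classic pcap file, the format tcpdump and wireshark write. This is purely illustrative; the project itself would process live traffic with the tools named above rather than hand-parse files.

```python
import struct

# The classic pcap global header: magic number, format version, timezone
# offset, timestamp accuracy, snapshot length, and link-layer type.
PCAP_GLOBAL_HDR = struct.Struct('<IHHiIII')

def read_pcap_header(path):
    """Read and sanity-check the global header of a pcap capture file."""
    with open(path, 'rb') as f:
        magic, major, minor, _tz, _sigfigs, snaplen, linktype = \
            PCAP_GLOBAL_HDR.unpack(f.read(PCAP_GLOBAL_HDR.size))
    if magic != 0xa1b2c3d4:
        raise ValueError('not a little-endian pcap file')
    return {'version': (major, minor), 'snaplen': snaplen, 'linktype': linktype}

# Write a minimal empty capture (header only, link type 1 = Ethernet)
# so the reader has something to parse.
with open('empty.pcap', 'wb') as f:
    f.write(PCAP_GLOBAL_HDR.pack(0xa1b2c3d4, 2, 4, 0, 0, 65535, 1))

print(read_pcap_header('empty.pcap'))
```

Packet records follow this header, each with its own 16-byte per-packet header; libraries such as dpkt or scapy handle that layer in practice.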

Benchmarking Concurrent Data Structures on Multi-core Machines (18cp)
For the last decade, manufacturers have increased the number of processors, or cores, in most computational devices rather than their frequency. This trend led to the advent of chip multiprocessors that nowadays offer from tens to a thousand cores on the same chip. Concurrent programming, the art of dividing a program into subroutines that cores execute simultaneously, is the only way for developers to increase the performance of their software.

These multicore machines adopt a concurrent execution model where, typically, multiple threads synchronize with each other to exploit cores while accessing in-memory shared data. To continue the pace of increasing software efficiency, performance has to scale with the amount of concurrent threads accessing shared data structures. The key is for new synchronization paradigms to not only leverage concurrent resources to achieve scalable performance but also to simplify concurrent programming so that most programmers can develop efficient software.

We have developed Synchrobench in C/C++ and Java [1], the most comprehensive set of synchronization tools and concurrent data structure algorithms. The implementations of concurrent data structures and synchronization techniques keep multiplying. The goal of this project is to extend the Synchrobench benchmark suite with new concurrent data structures and to compare their performance against the 30+ existing algorithms on our multi-core machines to conclude on the algorithm design choices to adopt to maximize performance in upcoming concurrent data structures.
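Synchrobench itself is written in C/C++ and Java; the toy Python analogue below only illustrates the shape of the measurement loop such a benchmark runs: several threads hammer a shared set under one coarse-grained lock for a fixed duration, and aggregate throughput is reported. None of the names here come from Synchrobench.

```python
import threading
import time

# Toy throughput benchmark: n threads repeatedly add/remove keys in a
# shared set, synchronised by a single lock, for a fixed duration.
# Swapping the synchronisation strategy (fine-grained locks, lock-free
# structures, ...) and comparing ops/sec is the essence of this kind
# of benchmarking.

def benchmark(n_threads=4, duration=0.2):
    shared, lock = set(), threading.Lock()
    counts = [0] * n_threads
    stop = time.monotonic() + duration

    def worker(tid):
        x = tid
        while time.monotonic() < stop:
            with lock:                      # coarse-grained synchronisation
                if x in shared:
                    shared.discard(x)
                else:
                    shared.add(x)
            counts[tid] += 1
            x = (x + n_threads) % 1000      # vary the key per thread

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts) / duration           # aggregate ops/sec

print(f"{benchmark():.0f} ops/sec")
```

Note that CPython's global interpreter lock prevents true parallelism here; the real comparisons must run in C/C++ or Java, which is exactly why Synchrobench targets those languages.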

[1] Synchrobench
[2] Vincent Gramoli. More Than You Ever Wanted to Know about Synchronization: Synchrobench, Measuring the Impact of the Synchronization on Concurrent Algorithms. PPoPP 2015.

Evaluating Consensus Protocols in Distributed Systems (18cp)
Distributed system solutions, like CoreOS used by Facebook, Google and Twitter, exploit a key-value store abstraction to replicate the state and a consensus protocol to totally order the state machine configurations. Unfortunately, there is no way to reconfigure this key-value store service, to include new servers or exclude failed ones, without disruption.

The Paxos consensus algorithm, in which candidate leaders exchange messages with majorities of acceptors, could be used to reconfigure a key-value store as well [4]. To circumvent the impossibility of implementing consensus over asynchronous communication, Paxos guarantees termination under partial synchrony while always guaranteeing validity and agreement, even with competing candidate leaders proposing configurations.

Due to the intricacy of the protocol [1], the tendency has been to switch to an alternative algorithm where requests are centralized at a primary. Zab, a primary-based atomic broadcast protocol, is used in ZooKeeper [2], a distributed coordination service. Raft [1] reused the centralization concept of ZooKeeper to solve consensus. The resulting simplification led to the development of various implementations of Raft in many programming languages.
The goal of this project is to compare a Raft-based implementation to Paxos-based implementations [3] to confirm that Paxos can be better suited than Raft in the case of leader failures, and to explore cases where Raft could be preferable.
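To make the competing-leaders behaviour concrete, here is a sketch of the acceptor-side rules of single-decree Paxos. It is deliberately partial (a full protocol also needs proposers, majority counting, and networking), and the class and method names are ours, not from any particular implementation.

```python
# Single-decree Paxos, acceptor side only. Two rules give agreement:
# an acceptor promises to ignore ballots lower than the highest it has
# seen (phase 1), and reports any value it already accepted so a newer
# proposer must re-propose that value rather than its own.

class Acceptor:
    def __init__(self):
        self.promised = -1        # highest ballot number promised
        self.accepted = None      # (ballot, value) last accepted, if any

    def prepare(self, ballot):
        """Phase 1: promise to ignore ballots lower than `ballot`."""
        if ballot > self.promised:
            self.promised = ballot
            return ('promise', self.accepted)
        return ('nack', None)

    def accept(self, ballot, value):
        """Phase 2: accept unless a higher ballot was promised since."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return 'accepted'
        return 'rejected'

a = Acceptor()
assert a.prepare(1) == ('promise', None)
assert a.accept(1, 'config-A') == 'accepted'
# A competing proposer with a higher ballot learns the accepted value
# and must re-propose it, which is what preserves agreement:
kind, prev = a.prepare(2)
assert kind == 'promise' and prev == (1, 'config-A')
assert a.accept(1, 'config-B') == 'rejected'   # stale ballot loses
```

Raft centralizes the same safety argument in a single elected leader and its log, which simplifies the protocol but, as the project hypothesises, may cost more during leader failures.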

[1] Diego Ongaro and John Ousterhout. In search of an understandable consensus algorithm. In ATC, pages 305–319, Philadelphia, PA, 2014. USENIX.
[2] Flavio Junqueira and Benjamin Reed. ZooKeeper: Distributed Process Coordination. O’Reilly Media, Nov. 2013.
[3] Vincent Gramoli, Len Bass, Alan Fekete, Daniel Sun. Rollup: Non-Disruptive Rolling Upgrade. USyd Technical Report 699.

Project supervised by Joachim Gudmundsson

Role-assignment of football players (12cp)
Supervisors: Joachim Gudmundsson and Michael Horton
There is currently considerable research interest in developing objective methods of analysing sports, including football (soccer). Football analysis has practical applications in player evaluation for coaching and scouting; in the development of game and competition strategies; and in enhancing the viewing experience of televised matches. Until recently, the analysis of football matches was typically done manually or by using simple frequency analysis. However, such analysis was primarily concerned with what happened, and did not consider where, by whom or why. Recent innovations in image processing and object detection allow accurate spatio-temporal data on the players and ball to be captured.

In this project the aim is to implement a recent method to assign roles to players during games. That is, as players constantly change roles during a match, we want to employ a “role-based” representation instead of one based on player “identity”. This facilitates a deeper analysis of the process of playing football matches.

Good programming skills and good algorithmic background are required.

Knowledge of basic aspects of machine learning and linear algebra would be very useful but not essential.

This is a 12cp project but can be converted to 18cp for students who can demonstrate a good background in algorithms and machine learning.

Dominant regions of football players (12 cp)
Supervisors: Joachim Gudmundsson and Michael Horton
There is currently considerable research interest in developing objective methods of analysing sports, including football (soccer). Football analysis has practical applications in player evaluation for coaching and scouting; in the development of game and competition strategies; and in enhancing the viewing experience of televised matches. The aim of this project is to develop a fast algorithm to approximate the “dominant regions” of football players. The dominant region of a player is the region the player can reach before any other player. The general approach is to use a sampling method and a realistic model of a player's movement.
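The sampling approach can be sketched under a deliberately crude movement model: if each player moves at a constant top speed, the time to reach a point is distance divided by speed, and each sampled pitch position belongs to the first-arriving player. The players, speeds, and grid step below are invented for illustration; a realistic model would also account for current velocity and acceleration limits.

```python
import math

def dominant_regions(players, width, height, step):
    """players: {name: ((x, y), speed)}. Sample the pitch on a grid and
    count, for each player, the samples they reach before anyone else."""
    counts = {name: 0 for name in players}
    for gx in range(0, width, step):
        for gy in range(0, height, step):
            # arrival time under the constant-speed model: distance / speed
            winner = min(players,
                         key=lambda n: math.dist(players[n][0], (gx, gy))
                                       / players[n][1])
            counts[winner] += 1
    return counts

# Two players on a 100x60 pitch; the faster one dominates more area.
players = {'A': ((30.0, 30.0), 8.0), 'B': ((70.0, 30.0), 6.0)}
print(dominant_regions(players, 100, 60, 5))
```

The approximation quality and running time both scale with the grid step, which is exactly the speed/accuracy trade-off a fast algorithm for this problem must manage.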

Good programming skills and good algorithmic background are required.

This is a 12cp project but can be converted to 18cp for students who can demonstrate a good background in algorithms.

Project supervised by Seokhee Hong

Scalable Visual Analytics of Big Data

Visualization and Analysis of Large and Complex Biological Networks and Social Networks

2.5D Graph Navigation and Interaction Techniques

Drawing Algorithms for Almost Planar Graphs


MultiPlane Graph Embedding (2.5D Graph Embeddability)

Project supervised by Bryn Jeffries

The CRC for Alertness, Safety and Productivity is a collaboration between industrial and university partners to develop products that improve alertness in the workplace and during commutes. The Alertness Database project, led by Dr Bryn Jeffries in the School of Information Technologies, is developing a cloud-based database suitable for collecting and sharing the research data collected by clinical research partners, and will be the foundation for data analysis and data mining activities.

Deep Learning for Classifying Sleep Stages in EEG (18 cp)
Sleep researchers often monitor the sleep of a patient with EEG (electrical activity of the brain), and manually classify the type of sleep based upon common characteristics. This process is called Sleep Staging. Various automated forms of Sleep Staging have been implemented with varying degrees of success. In this project a student would develop an unsupervised sleep staging tool by applying Deep Learning (neural networks), and compare its results to those of trained experts.

PASTA is an open-source system for students to submit attempts at programming exercises, providing automated feedback based upon unit tests. It is used in many units at the School of Information Technologies and elsewhere in the University of Sydney. A number of improvements have been added with the aid of a Large Education Infrastructure grant in 2015. Several further extensions are now available as student projects. The following are listed as software design projects of 12cp, but could be extended to research projects of 18cp after discussion of a suitable research component.

Advanced Sandboxing (12cp)
Large assignments often involve significant bodies of code that are not readily assessed with a priori unit tests. To support such situations, submissions need to be compiled and run in a sandboxed environment such as a small virtual machine. In this project a well-motivated student would survey the available technologies, choose a suitable route to implementation, and demonstrate a working extension.

Front-end enhancements (12cp)
PASTA is one amongst many similar systems that are functionally almost identical, but are often differentiated by their front-end interfaces. In this project a student would engage with users of PASTA (students and educators) to understand their needs of the graphical interface. A consistent and comprehensive new interface would then be designed and implemented.

Advanced Reporting (12cp)
Automated grading could enhance the educational experience by providing tutors and lecturers with enough information to see where students need the most support. In this project a student would familiarise themselves with the database back-end of PASTA to devise a set of report queries, along with front-end extensions to visualise this information. Query costs and scalability would be assessed, with appropriate optimisations added to the database.
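A report query of the kind described might look like the following. PASTA's real schema is not shown here, so the table and column names are invented for illustration, but the shape of the report is typical: per-exercise pass rates, weakest topics first, so tutors can see where students struggle.

```python
import sqlite3

# Invented miniature schema standing in for PASTA's back-end:
# one row per submission, with a pass/fail flag from the unit tests.
db = sqlite3.connect(':memory:')
db.executescript("""
    CREATE TABLE submission (student TEXT, exercise TEXT, passed INTEGER);
    INSERT INTO submission VALUES
        ('s1', 'lists', 1), ('s2', 'lists', 1), ('s3', 'lists', 0),
        ('s1', 'recursion', 0), ('s2', 'recursion', 0), ('s3', 'recursion', 1);
""")

report = db.execute("""
    SELECT exercise,
           ROUND(AVG(passed) * 100, 1) AS pass_rate_pct,
           COUNT(*) AS attempts
    FROM submission
    GROUP BY exercise
    ORDER BY pass_rate_pct ASC      -- weakest topics first
""").fetchall()
print(report)
```

Assessing such queries at scale would then mean indexing the grouped columns and checking the query plan as the submission table grows.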

Gamification with Active Databases (12cp)
Student engagement may be improved by incorporating “gamification” elements. In this project, a student would devise a set of badges that could be awarded to participants for various events. These would be implemented at the database level through the use of triggers and stored procedures, as a demonstration of an active database. Additional front-end elements to support the badge system would also be implemented.
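The active-database idea can be illustrated with a SQLite trigger (the schema below is invented; PASTA's real tables will differ): a “first submission” badge is awarded automatically by the database itself when a qualifying row is inserted, with no application code involved.

```python
import sqlite3

db = sqlite3.connect(':memory:')
db.executescript("""
    CREATE TABLE submission (student TEXT, exercise TEXT);
    CREATE TABLE badge (student TEXT, name TEXT);

    -- Active-database rule: when a student's very first submission
    -- arrives, award the badge. The trigger fires AFTER INSERT, so the
    -- count includes the new row and equals 1 only the first time.
    CREATE TRIGGER first_submission
    AFTER INSERT ON submission
    WHEN (SELECT COUNT(*) FROM submission WHERE student = NEW.student) = 1
    BEGIN
        INSERT INTO badge VALUES (NEW.student, 'First Steps');
    END;
""")

db.execute("INSERT INTO submission VALUES ('s1', 'lists')")
db.execute("INSERT INTO submission VALUES ('s1', 'recursion')")  # no new badge
print(db.execute("SELECT * FROM badge").fetchall())
```

More elaborate badges (streaks, top marks) would follow the same pattern with richer `WHEN` conditions or auxiliary counter tables.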

Projects supervised by Jinman Kim

Medical Image Visualisation (12 or 18cp)
With the next generation of medical imaging scanners, new diagnostic capabilities are resulting in improved patient care. These medical images are multi-dimensional (3D), multi-modality (fusion of e.g. PET and MRI) and also time-varying (e.g. 3D volumes taken over multiple time points, and functional MRI). Rapid advances in GPU hardware coupled with smart image processing and rendering algorithms are enabling visualisation techniques that can render realistic and detailed 3D volumes of the human body. Despite these developments, visualisation still relies on operator interaction to manipulate complex parameters for optimal rendering.

We have several ongoing visualisation projects at the Biomedical and Multimedia Information Technology (BMIT) research group. They involve innovative visualisation algorithms using emerging hardware devices (e.g. Oculus Rift and HoloLens, coupled with Kinect/Leap Motion). We seek students who can bring their own experience and interests in visualisation to take on the new projects listed below. Students will join a team of researchers and will have the opportunity to work in a clinical environment with clinical staff and students, as well as to join the Institute of Biomedical Engineering and Technology (BMET) (Level 5 West, School of IT Building).

Ray Profile Analysis for Volume Rendering Visualisation (12 or 18cp)
A key requirement for medical image volume rendering is to identify regions of interest (ROIs), such as a tumour, within a volume so that these ROIs can be prioritised in the 3D rendering. State-of-the-art approaches rely on manual or semi-automated image segmentation algorithms to identify the ROIs. However, such approaches are time consuming and difficult to use, thus limiting their application in the clinical setting. In this project, we will introduce a new approach to automatically identifying ROIs by using the information within the ‘ray’ in volume rendering. We propose a ‘ray profile classifier’ such that a large image collection database can be used to identify patterns in the ray profiles; these ROIs can then be used for improved visualisation.
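The notion of a ray profile can be illustrated in miniature (names, the toy volume, and the threshold below are ours, not the project's classifier): cast a ray through a volume, record the intensity values it passes through, and flag rays whose profile contains a bright anomaly as candidate ROIs.

```python
# Toy ray-profile sketch: axis-aligned integer stepping through a small
# volume stored as nested lists, indexed volume[z][y][x]. A real volume
# renderer would interpolate along arbitrary ray directions on the GPU.

def ray_profile(volume, start, direction, steps):
    x, y, z = start
    dx, dy, dz = direction
    profile = []
    for _ in range(steps):
        profile.append(volume[int(z)][int(y)][int(x)])
        x, y, z = x + dx, y + dy, z + dz
    return profile

def hits_roi(profile, threshold=200):
    """Crude stand-in for a learned classifier: flag bright bumps."""
    return max(profile) >= threshold

# 8x8x8 volume of soft tissue (~50) with a bright 2-voxel lesion.
vol = [[[50] * 8 for _ in range(8)] for _ in range(8)]
vol[4][4][5] = vol[4][4][6] = 250

assert hits_roi(ray_profile(vol, (0, 4, 4), (1, 0, 0), 8))       # crosses lesion
assert not hits_roi(ray_profile(vol, (0, 0, 0), (1, 0, 0), 8))   # misses it
```

The proposed classifier would replace the threshold with patterns learned from a large image collection, but the input it consumes is exactly this kind of per-ray profile.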

MDT Visualisation (12 or 18cp)
Multidisciplinary team meetings (MDTs) are the standard of care in modern clinical practice. MDTs typically comprise members from a variety of clinical disciplines involved in a patient's care. In MDTs, imaging is critical to decision-making, and it is therefore important to be able to communicate the image data to other members. However, the concept of changing the image visualisation for different members, to aid interpretation, is currently not available. In this project, we will design and develop new MDT visualisations, where we propose the use of a novel ‘optimal view selection’ algorithm to transform the image visualisation to suit the needs of individual team members. In this approach, a set of visual rules (via qualitative and quantitative modelling) will be defined to ensure the selection of the view that best suits the needs of the different users. Our new MDT visualisation will facilitate better communication between all the clinicians involved in a patient's care and potentially improve patient outcomes.

Exploratory Visualisation with Virtual and Augmented Reality Devices (12 or 18cp)
Statistical analysis of medical images is providing new scientific and clinical insights into the data, with capabilities such as characterising traits of schizophrenia with functional MRI. Although these data, which include images and statistics, are multi-dimensional and complex, their analysis currently relies on traditional 2D ‘flat’ displays with mouse-and-keyboard input. Due to the constrained screen space and an implied concept of depth, such displays are limited in presenting a meaningful, uncluttered view of the data without compromising semantic (human anatomy) context. In this project, we will explore emerging visualisation hardware for medical image visualisation, including virtual reality (VR) and augmented reality (AR), coupled with gesture-based inputs to create an immersive environment for visualising these data. We suggest that VR/AR can reduce visual clutter and allow users to navigate the data in a ‘natural’ way that lets them keep their focus on the exploratory visualisations.

Medical Avatar (12 or 18cp)
With the continuing digital revolution in the healthcare industry, patients are being presented with more health data than ever before, which now include wellness and fitness data from various sensors. Current personal health record (PHR) systems do a good job of storing and consolidating these data, but are limited in facilitating patient understanding through data visualisation. One reason for this stems from the lack of semantic context (human anatomy) that can be used to present the spatial and temporal data in the PHR. Further, PHR interfaces lack meaningful visualisation techniques. In this project, we will design and develop a data processing and consolidation framework to visualise a wide range of health data in a visual format. This will rely on a patient-specific anatomical atlas of the human body, which we refer to as the ‘avatar’, constructed from the patient's health data. The framework will also include a navigable timeline of health events. This project will build on our existing research on a web-based PHR visualisation system.

Medical Image Visual Analytics (12 or 18cp)
Biomedical imaging has revolutionised modern healthcare. There now exists a wealth of valuable knowledge derived from the accumulation of massive and heterogeneous image databases of patient populations (knowledgebases), such as image atlases, disease models and classifications, statistical shape models of e.g. major organs, and related or similar images (inter- and intra-patient). Unfortunately, the potential of these knowledgebases to support image analysis is currently very limited due to difficulties in interacting with and using the information. In this project, we propose ‘medical image visual analytics’: enabling intuitive and efficient access to knowledgebases to support image analysis in an interactive human-computer loop. We will develop an interactive query formulation algorithm that automatically formulates multiple queries to access the various types of databases and visualises the results concurrently; current approaches to searching the databases require complex interactions that are manually driven and time-consuming.

Projects supervised by Kevin Kuan

Emotional Contagion in Online Social Networks (18cp)
Description: The project is motivated by the recent controversial Facebook experiment on emotional contagion, in which Facebook manipulated the news feeds of nearly 700,000 users to examine whether the emotion they expressed through messages in their news feeds influenced the emotion of other users as expressed in their subsequent posts (Kramer et al. 2014). To probe further into this Facebook experiment, the current project will provide an explanatory mechanism to predict and explain how emotional contagion influences users' behavior in online social networks. The project will investigate other message variables and how they affect user behavior in online social networks using a controlled experiment. As an option, students may choose to experiment with electroencephalography (EEG) technology in the project.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Optional requirements: Interest in experimenting with electroencephalography (EEG) technology; experience with Matlab for data analysis.


The Role of Emotion in Mobile Personalisation (18cp)
Description: Unlike web personalisation, mobile personalisation is highly time and location sensitive. It also involves more complicated considerations among different imperfect recommendations. This project examines the mobile personalisation factors that affect preferences toward different imperfect recommendations. In particular, the role of emotion in these factors will be investigated using a controlled experiment. As an option, students may choose to experiment with electroencephalography (EEG) technology in the project.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Optional requirements: Interest in experimenting with electroencephalography (EEG) technology; experience with Matlab for data analysis.


The Role of Emotion in Group-Buying (18cp)
Description: Group-buying sites, such as Groupon, Yahoo! Deals and LivingSocial, have emerged as popular platforms in social commerce that have received tremendous interest from both researchers and practitioners. Among the attractive features of group-buying is the deep discount offered on deals, typically 50% or more. Another feature is that the deeply discounted deals are only available for a limited time, ranging from days to weeks, with definite end times. The project investigates the emotional effects triggered by these features on purchase decisions using a controlled experiment. As an option, students may choose to experiment with electroencephalography (EEG) technology in the project.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Optional requirements: Interest in experimenting with electroencephalography (EEG) technology; experience with Matlab for data analysis.

Online Social Influence in Group-Buying (18cp)
Description: Group-buying sites, such as Groupon, Yahoo! Deals and LivingSocial, have emerged as popular platforms in social commerce that have received tremendous interest from both researchers and practitioners. Besides offering products at deep discounts for a limited time, a prominent characteristic of group-buying is the heavy reliance on social influence to promote deals. This project aims to study the effects of social influence on consumer purchase decisions in group-buying. Data will be collected from external sources (e.g. Groupon, Spreets, Facebook, etc.) for testing the hypotheses. The findings will provide insights into the types of information provided by group-buying sites and how they affect purchase decisions.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Optional requirements: Interest in text mining.

Temporal Features and Consumer Evaluations of Group-Buying (18cp)
Description: Group-buying sites such as Groupon and LivingSocial offer daily deals at large discounts, typically 50% or more. Such deals are also characterized by various temporal features. For example, these deals involve a lead time before they are available for redemption. Also, such deals are available for a limited time only. This study examines how two temporal features of group-buying deals (lead time and deal time) affect consumer evaluations.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).


Wisdom of the Crowd Effect in Electronic Commerce (18cp)
Description: The wisdom of the crowd (WoC) effect refers to the phenomenon that the aggregate of many people’s opinions tends to be more accurate than separate individual or even expert opinions. The effect has been demonstrated with examples from stock markets, political elections, quiz shows, etc. (Surowiecki, 2004). However, there is also evidence that the WoC effect can be severely undermined by factors such as task characteristics and social situations. This project aims to study the WoC effect in an e-commerce-related context. Data will be collected from Twitter and external sources to test the hypotheses. The findings will provide insights into when and how to take advantage of the WoC effect to facilitate online decision making.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Optional requirements: Experience with API tools (e.g. Twitter Streaming REST APIs); basic understanding in text mining.
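The core WoC claim can be illustrated in a few lines: if many people estimate the same quantity with independent errors, the crowd's mean is usually far closer to the truth than a typical individual. The sketch below is a minimal simulation with invented parameters (true value, noise level), not an analysis of any real dataset.

```python
import random

def crowd_vs_individual(true_value=100.0, n=500, noise=30.0, seed=42):
    """Simulate n independent noisy estimates of a quantity and compare
    the crowd aggregate (mean) error with the average individual error."""
    rng = random.Random(seed)
    estimates = [true_value + rng.gauss(0, noise) for _ in range(n)]
    crowd_error = abs(sum(estimates) / n - true_value)
    avg_individual_error = sum(abs(e - true_value) for e in estimates) / n
    return crowd_error, avg_individual_error

crowd_err, indiv_err = crowd_vs_individual()
```

Because independent errors cancel in the mean, the crowd error shrinks roughly as 1/sqrt(n); correlated errors (the "social situations" mentioned above) are exactly what breaks this cancellation.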


Online Consumer Reviews in Electronic Commerce (18cp)
Description: Consumers increasingly rely on online product reviews to guide purchases. Given the large number of reviews available for a particular product, people are unable and unlikely to go through every review manually, so the ability of online review sites (e.g. Yelp.com) to help users identify useful reviews becomes important. This project aims to study the review characteristics (length, tone, style, reviewer credibility, etc.) that make a review helpful. Review data will be collected from external sources (e.g. Yelp.com) and analysed for these characteristics. The findings will provide insights to online review sites on how to automate the process of identifying useful reviews for their users.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Optional requirements: Basic understanding in text mining.
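The review characteristics named above (length, tone, style) can be turned into numeric features for a statistical or text-mining model. The sketch below is an illustrative starting point only: the word lists are invented, and a real study would use a validated sentiment lexicon and richer style features.

```python
import re

# Illustrative-only tone word lists; a real study would use a validated lexicon.
POSITIVE = {"great", "excellent", "good", "love", "friendly"}
NEGATIVE = {"bad", "terrible", "slow", "rude", "awful"}

def review_features(text):
    """Extract simple review characteristics often linked to perceived
    helpfulness: length, average word length, and tone ratios."""
    words = re.findall(r"[a-z']+", text.lower())
    n = len(words)
    return {
        "length": n,
        "avg_word_len": sum(len(w) for w in words) / n if n else 0.0,
        "pos_ratio": sum(w in POSITIVE for w in words) / n if n else 0.0,
        "neg_ratio": sum(w in NEGATIVE for w in words) / n if n else 0.0,
    }

feats = review_features("Great food, excellent service.")
```

Features like these would then be regressed against helpfulness votes collected from the review site.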


Information Overload and Online Decision Making in Electronic Commerce (18cp)
(Co-supervised with Prof. Joseph Davis)
Description: The negative consequences of information overload have been studied in a range of disciplines. Davis and Ganeshan (2009) showed that humans acquire and process relatively more information under the threat of information unavailability. However, they are less satisfied with their decisions than those who acquire and process less information under no such threat. This project extends the work of Davis and Ganeshan (2009) and investigates the impacts of information overload in the context of online decision making. A controlled experiment will be conducted to test the hypotheses. The findings will provide insights into the types of decision aids that online sellers can provide to their customers.

Minimum requirements: Basic web programming skills; basic understanding in statistical analysis (e.g. ANOVA, regression, etc.).

Projects supervised by Josiah Poon

TOCA: Toolbox for Complexity Analysis (12cp)
Over the years, our research group has successfully developed several techniques for performing causal analysis on multi-dimensional datasets. The algorithms are not only effective, but also scalable and deterministic. These algorithms are to be bundled together and made available online for the benefit of other researchers. The aim of this project is to develop a web-based system where users can upload and/or store their data and have it analysed by the appropriate algorithm with their chosen settings. Online visualisation of the results as network graphs is desirable.

Relevant skills: Web design skills (good at both frontend and backend programming), visualization knowledge

Projects supervised by Simon Poon

Analytical models for Causal Complexities (12 or 18cp)
Understanding complex interactions among multiple study factors is central to a range of academic disciplines. Changing one factor may have little effect on the study outcome if other factors remain unchanged. Confounding may arise through interactions of study factors, and such interactions may extend to different configurations with complex interplay among many seemingly unrelated factors. For example, in gene expression analysis the number of possible factors is relatively large and the number of cases relatively small, i.e. the data are high-dimensional. Data mining techniques are often employed to extract patterns from clinical herbal prescription records, but this approach faces unique challenges in handling confounding. Conventional dimension reduction techniques used in data mining may not be adequate when the relationships in the data are highly interactive: removing important herbs may lead to severe confounding in the analysis.

The aim of this project is to develop appropriate heuristics to derive suitable causal models for assessing the strength of interactions on a study outcome. The application can be used to determine how the level of an effect changes under certain conditions (or contingencies) of other study factors, i.e. to facilitate a good understanding of the diverse interactions and interrelatedness among the study factors and how they impact certain diseases. The project will include the development and implementation of a data mining algorithm as well as the discovery of interaction patterns from given high-dimensional datasets.

Relevant skills: Statistics/machine learning/data mining, Programming skills, R & Matlab.
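The statement that "changing one factor may have little effect if other factors remain unchanged" is the textbook signature of an interaction effect. The toy model below (entirely invented for illustration) makes this concrete: the outcome responds only when factors A and B co-occur, so the effect of A is contingent on the level of B.

```python
def outcome(a, b):
    """Toy causal model: the outcome improves only when A and B co-occur."""
    return 1.0 if (a and b) else 0.0

# The conditional effect of A depends entirely on the level of B:
eff_a_when_b0 = outcome(1, 0) - outcome(0, 0)  # A alone does nothing
eff_a_when_b1 = outcome(1, 1) - outcome(0, 1)  # A matters only alongside B
interaction = eff_a_when_b1 - eff_a_when_b0    # difference = pure interaction
```

A marginal (one-factor-at-a-time) analysis would understate the role of A here, which is why conventional dimension reduction can mislead when relationships are highly interactive.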

Mobile Guardian Angel (12 or 18cp)
The Guardian Angel concept proposes that a large, decentralised social support network can motivate individuals in a connected community to collectively improve a population’s health and lifestyle habits more effectively than a traditional centralised system of a few localised hubs (e.g. health care professionals such as clinicians) monitoring a large number of spokes (e.g. patients). The idea is to organise patients so that each individual has a guardian angel (or more than one) from among the other patients and is in turn assigned to be a guardian angel for someone else. The angel engages with the other person's activities and then encourages and motivates that person to continue on a positive trajectory towards their health goals. Every person tries to motivate someone and is motivated by someone else, so that the health and lifestyle of the whole community improves. The question is how the dynamics of the system depend on features such as the size of the cohort, the structure of the social network, randomness of events, and variation between individuals.

This project consists of four sub-projects:

  • Simulation Modelling: This part of the project involves using simulation methods to model various dynamic aspects of the social system, in particular the structure of the social networks;
  • Mobile application development: This part of the project would involve developing a mobile application prototype to enable the Guardian Angel Network. One aspect is to identify functionalities that encourage and motivate; the other concerns user interface design;
  • Mathematical Modelling: This part of the project would involve simulating possible Guardian Angel systems using differential equations or agent-based models (in collaboration with Dr. Peter Kim in the School of Mathematics and Statistics);
  • Evaluation: This part of the project would involve designing and implementing a suitable clinical trial to test and evaluate the impacts of the mobile Guardian Angel system (in collaboration with Associate Professor Clement Loy in the School of Public Health).

Relevant skills: Subject to the nature of the sub-projects
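As a flavour of the agent-based sub-project, the sketch below simulates the simplest possible Guardian Angel topology: a ring in which each person is the angel of the next. All parameters (boost, decay, cohort size) are invented for illustration; the research questions above concern exactly how sensitive such dynamics are to these choices and to the network structure.

```python
import random

def simulate_ring(n=20, steps=50, boost=0.1, decay=0.05, seed=1):
    """Minimal agent-based sketch: person i-1 is the guardian angel of
    person i in a ring.  Each habit score decays on its own but receives
    a boost proportional to its guardian's current score (capped at 1)."""
    rng = random.Random(seed)
    score = [rng.random() for _ in range(n)]
    for _ in range(steps):
        nxt = score[:]
        for i in range(n):
            guardian = score[(i - 1) % n]
            nxt[i] = min(1.0, score[i] * (1 - decay) + boost * guardian)
        score = nxt
    return score

final = simulate_ring()
```

Varying `n`, the network topology, and per-individual parameters would probe the cohort-size and structure questions posed above.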

Data visualisation: examining the healthiness of the food supply and the global burden of nutrition-related disease (12 or 18cp)
The George Institute’s Food Policy Division works in Australia and internationally to reduce rates of death and disease caused by diets high in salt, saturated fat and sugar or excess energy, by undertaking research and advocating for a healthier food environment. The Group’s main focuses are food reformulation, monitoring changes in the food supply, and developing and testing innovative approaches to encourage consumers towards better food choices. A key example of innovation and collaboration is FoodSwitch, an award-winning smartphone app that helps consumers to make healthier food choices.

The FoodSwitch database contains around 80,000 food and beverage products assigned to more than 600 food categories. The database contains a large amount of information about each food and beverage item, with up to 100 data points per individual product. Currently the data sit behind a front-end CMS (Content Management System) in which values can be viewed by the user. Information for each product, such as nutrient information, brand name, manufacturer name, ingredient lists, allergen information, serving sizes and many other pieces of data, will be utilised in this project to better visualise what our food supply looks like and where policymakers should target improvements.

This project has been established to explore innovative approaches to viewing information about the healthiness of the food supply that can then be used by consumers, researchers and policy makers alike.

Relevant skills: Statistics/machine learning/data mining, business intelligence and Dashboard Development.

A framework to model organizational Complexities in IT business value study (12 or 18cp)
In order to make effective use of IT, managers require a body of knowledge and a set of evidence-based practices that enable them to make correct IT resource allocation decisions about the appropriate mix of IT inputs, their interactions, and the complementary investments needed in organisational processes. This research aims to provide such finely-grained and focused guidelines for deciding on candidate technologies and for developing effective organisational strategies. The analytical approach developed in this project can provide deeper insights into the interrelatedness of multiple factors beyond the extant IT business value research, and is potentially applicable to a wide range of disciplines in explaining complex interactions among multiple organisational factors. In this project, we will use various computational approaches to study complex organisational configurations in relation to the business value impacts of IT.

Skills Requirements: Good knowledge in Information Systems Concepts, Organizational Theories, Statistics and Econometrics, some knowledge in Probabilities and Set Theory

Projects supervised by Uwe Roehm

Data Privacy Analysis of Health Tracking Services (12 or 18cp)
In recent years, personal health tracking services have become very popular. These systems collect data from personal health sensors, such as health bands, step counters or smart watches, and provide health data analysis via graphical user interfaces. Some services additionally integrate social networking functionality, for example to share experiences or to provide extra motivation by comparing one's own health habits with those of peers. The underlying data processing infrastructure is typically cloud-based: data is collected locally and then sent to a central service hosted in a cloud data centre, where the processing, sharing and visualisation are done.
The goal of this project is to compare popular health tracking services with regard to their processing infrastructure from the point of view of data privacy: How is the data collected, where is it processed, and is any data disclosed to other people or even organisations?
For students interested in taking on this project as a research project, this task can be extended to include the design of a distributed health tracking service with guaranteed data privacy and anonymization functionalities.

A Touch Interface for SQL Databases (18cp)
More and more computing systems are produced with touch interfaces, from smartphones via tablets to the latest versions of desktop operating systems (Windows 8 and Mac OS X). At the same time, the basic interface to database systems is still SQL, a text-based query language that requires keyboard input and is hard for novice users to learn.
In our TouchQL project, we aim to develop a query 'language' that is purely based on a graphical schema representation and input gestures, and that allows a relational database to be queried from a tablet computer.
An initial prototype of TouchQL already exists for Android devices, supporting basic selections, projections and natural joins over local databases.
The goal of this Honours project is to extend this system with a mechanism for grouping and aggregation, and to support querying remote databases. The challenge in the latter part is to provide timely feedback to the user on the intended operations: in TouchQL there is no separation between query formulation and query execution, so users get immediate feedback on their intended actions on the actual data set. It would be an additional benefit if the student could port TouchQL from the Java-based Android platform to the Objective-C-based iOS.
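At its core, the grouping/aggregation extension must translate a user's gesture selections into a SQL GROUP BY statement. The sketch below shows that translation step only; the function, its parameters, and the gesture representation are hypothetical and are not taken from TouchQL's actual code base.

```python
def build_group_query(table, group_cols, agg_func, agg_col):
    """Compose a SQL GROUP BY query from the columns a user has tapped
    on the schema diagram and the aggregate picked from a gesture menu."""
    cols = ", ".join(group_cols)
    return (f"SELECT {cols}, {agg_func.upper()}({agg_col}) "
            f"FROM {table} GROUP BY {cols}")

sql = build_group_query("orders", ["customer_id"], "sum", "amount")
```

For immediate feedback, such a query could be re-executed against the live data set after every gesture, which is what makes timeliness the stated challenge.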

Database Cluster Management Tool (12cp)
We have a small database cluster of 8 nodes which we use for several research projects. It is a multi-boot cluster (Linux and Windows 2003 Server) that can run different database engines, both commercial and open source, such as Oracle, Microsoft SQL Server, and PostgreSQL. We need a platform-independent monitoring tool with a GUI that helps us (a) keep track of the current cluster state and (b) reboot cluster nodes into different configurations. Ideally, it would also include a cluster allocation component to manage our research projects and allow us to use subsets of the cluster concurrently in different projects. This project shall conduct a study and review of existing database cluster management tools and set up a suitable solution, possibly enriched with self-developed software components.

PowerDB: Freshness-aware Replication in a Database Cluster (12 or 18cp MIT)
This project aims to set up a freshness-aware replication engine for a cluster of databases. It will be based on an existing cluster coordinator called PowerDB, which is written in C++ and optimised for SQL Server. The student shall install, configure and optimise this version on our new database cluster running PostgreSQL. The 18cp version of the project will additionally run performance and scalability tests on the new system.
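The basic idea of freshness-aware replication is that a coordinator routes each query to a replica whose data is "fresh enough" for that query, trading staleness for load balancing. The sketch below illustrates that routing decision only; the node names, staleness figures and selection policy are invented for illustration and are not PowerDB's actual implementation.

```python
# Hypothetical replica catalogue: staleness is how far each node lags
# behind the master (seconds); load is a utilisation estimate in [0, 1].
replicas = [
    {"name": "node1", "staleness_s": 0.0,  "load": 0.9},  # master, always fresh
    {"name": "node2", "staleness_s": 5.0,  "load": 0.2},
    {"name": "node3", "staleness_s": 60.0, "load": 0.1},
]

def route(query_max_staleness_s):
    """Pick the least-loaded replica whose data is fresh enough
    for the query's declared staleness tolerance."""
    fresh_enough = [r for r in replicas
                    if r["staleness_s"] <= query_max_staleness_s]
    return min(fresh_enough, key=lambda r: r["load"])["name"]
```

A query that tolerates a minute of staleness can thus be served by the idlest node, while a strict query falls back to the master.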

Bio-Data Processing using Map/Reduce (12 or 18cp MIT)
This project will investigate the suitability of a map/reduce framework for the parallel processing of DNA fragment data (so-called 'short reads'). The student shall implement a short-read comparison algorithm on the database research cluster of the DBRG using the open-source Hadoop system.
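To show the map/reduce decomposition the project relies on, the sketch below groups identical short reads, the simplest possible comparison. It is a plain-Python stand-in for a Hadoop streaming job (mapper emits key/value pairs, reducer aggregates per key); the real project would implement a proper comparison algorithm on the cluster.

```python
from collections import defaultdict

def mapper(lines):
    """Map phase: emit (read, 1) for each DNA short read, one per line."""
    for line in lines:
        read = line.strip().upper()
        if read:
            yield read, 1

def reducer(pairs):
    """Reduce phase: sum counts per distinct read (Hadoop would deliver
    the pairs grouped and sorted by key; a dict stands in for that here)."""
    counts = defaultdict(int)
    for read, n in pairs:
        counts[read] += n
    return dict(counts)

counts = reducer(mapper(["ACGT", "acgt", "TTGA", "ACGT"]))
```

In a real Hadoop streaming deployment, the mapper and reducer would each read stdin and write tab-separated key/value lines, but the dataflow is the same.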

Projects supervised by Zhiyong Wang

Mobile and Internet Multimedia Computing (12 or 18cp) (with multiple sub-tasks)
The Internet has become a platform for delivering a wide range of information. In particular, multimedia information now dominates network traffic, and it is anticipated that the mobile Internet will be the next emerging focus as more and more portable devices get connected. This wealth of information has enabled many research tasks to be conducted at an ever larger scale and in a more convenient way, such as multimedia information retrieval, object recognition, security, multimedia forensics, and knowledge discovery. For example, a large number of diverse training samples can easily be fetched for more robust object recognition, compared with previous research on small datasets. This project is to harvest the vast multimedia resources of the Internet for multimedia information retrieval, computer vision, and multimedia data mining at Internet scale, to investigate secure multimedia information delivery over the Internet, and to develop novel multimedia applications for Internet users. Students will enrich their knowledge of the Internet, information retrieval, multimedia data processing, machine learning techniques, and data mining techniques in this project.

Remote-sensing Image Analysis (18cp)
Remote sensing images have played a key role in many fields such as monitoring and protecting our natural environment, improving agriculture, and assessing water quality. Due to the limitations of current imaging technology, advanced image analysis techniques such as unmixing and classification are needed to better utilize remote sensing images. Meanwhile, the growing volume of massive remote sensing imagery demands efficient algorithms to support timely decision-making. This project is to investigate efficient and effective approaches to address the emerging issues in remote sensing image analysis. Students will develop strong skills in image analysis, machine learning, and data mining.

Human Motion Analysis (12 or 18cp) (with multiple sub-tasks)
People are the focus of most activities, and the investigation of human motion such as body and facial movements has been driven by a wide range of applications including visual surveillance, 3D animation, affective computing, advanced human-computer interaction, sports, and medical diagnosis and treatment (e.g. autism). Under the umbrella of human-centered multimedia computing, this project addresses a number of challenging issues in this area in realistic scenarios, including human tracking, motion and facial expression detection, recognition, modeling, animation, and synthesis. Students will gain comprehensive knowledge in computer vision (e.g. object segmentation and tracking, and action/event detection and recognition), 3D modeling, computer graphics, and machine learning.

Medication Monitoring App (12cp)
In collaboration with Jesse Jansen (School of Public Health)
Monitoring the medication of ageing people is very important for their health. With the advancement of mobile technologies, this project is to develop an iPad app to facilitate the monitoring process and the communication between patients and GPs. This project requires experience in iOS app development and knowledge of databases.