Parallel & Distributed Computing

1. Load Balancing and Task Scheduling in Parallel, Distributed, and Cluster Computing Environments (in collaboration with Dr. Javid Taheri, Sydney University)

Scheduling and load balancing are two important problems in the area of parallel computing. Efficient solutions to these problems will have profound theoretical and practical implications that will affect other parallel computing problems of a similar nature. Little research has attempted a generalized approach to these problems. The major difficulties arise from interprocessor communication and from delays caused by interdependencies between the subtasks of a given application. The mapping problem arises when the dependency structure of a parallel algorithm differs from the processor interconnection of the parallel computer, or when the number of processes generated by the algorithm exceeds the number of processors available. The problem is further complicated when the parallel computer system contains heterogeneous components (e.g., different processor and link speeds, as in cluster and grid architectures). This project intends to investigate the development of new classes of algorithms for solving a variety of scheduling and load-balancing problems in both static and dynamic scenarios.
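As a minimal illustration of the static variant of this problem (ignoring, for the moment, the interprocessor communication and subtask dependencies that the project targets), the classical longest-processing-time-first heuristic assigns independent tasks to identical processors greedily; the function and variable names below are illustrative, not part of the project itself:

```python
import heapq

def lpt_schedule(task_times, num_procs):
    """Longest-processing-time-first list scheduling: assign each task,
    largest first, to the currently least-loaded processor."""
    loads = [(0.0, p) for p in range(num_procs)]  # (current load, processor id)
    heapq.heapify(loads)
    assignment = {}
    for task, t in sorted(enumerate(task_times), key=lambda x: -x[1]):
        load, p = heapq.heappop(loads)   # least-loaded processor
        assignment[task] = p
        heapq.heappush(loads, (load + t, p))
    makespan = max(load for load, _ in loads)
    return assignment, makespan
```

For instance, tasks of length [4, 3, 3, 2] on two processors yield a makespan of 6, which happens to be optimal here; in general LPT only guarantees a makespan within a constant factor of optimal.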
2. Scheduling Communications in Cluster Computing Systems (in collaboration with Dr. Javid Taheri, Sydney University)

Clusters of commodity computer systems have become the fastest-growing choice for building cost-effective, high-performance parallel computing platforms. The rapid advancement of computer architectures and high-speed interconnects has facilitated many successful deployments of such clusters. Previous studies have reported that the cluster interconnect significantly impacts the performance of parallel applications. High-speed interconnects not only unveil the potential performance of the cluster, but also allow clusters to achieve a better performance/cost ratio than clusters built on traditional local area networks. Towards this end, this project aims to study how computations and communications influence the performance of such systems. Applications range from the compute-intensive to the communication-intensive, and an understanding of such applications and how they map efficiently onto clusters is important.
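A first-order, back-of-the-envelope model of this compute/communication interplay can be sketched as follows; this is a toy model under stated assumptions (the parameter names are mine, and real clusters add latency, contention, and partial-overlap effects):

```python
def app_time(flops, bytes_moved, flops_per_sec, bytes_per_sec, overlap=False):
    """Estimate the run time of a task on one cluster node from its compute
    and communication volume. With overlap, the slower phase dominates;
    without overlap, the two phases are serialized."""
    t_comp = flops / flops_per_sec
    t_comm = bytes_moved / bytes_per_sec
    return max(t_comp, t_comm) if overlap else t_comp + t_comm

def bound_by(flops, bytes_moved, flops_per_sec, bytes_per_sec):
    """Classify a task as compute- or communication-bound on this node."""
    if flops / flops_per_sec >= bytes_moved / bytes_per_sec:
        return "compute"
    return "communication"
```

Even this crude model shows why interconnect speed matters: raising `bytes_per_sec` shortens the run time of any task that is communication-bound, but leaves compute-bound tasks untouched.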
3. Parallel Machine Learning and Stochastic Optimization Algorithms (in collaboration with Dr. Javid Taheri, Sydney University)

Optimization algorithms can be used to solve a wide range of problems that arise in the design and operation of parallel computing environments (e.g., data mining, scheduling, routing). However, many classical optimization techniques (e.g., linear programming) are not suited to parallel processing problems because of their restricted nature. This project is investigating the application of new and unorthodox optimization techniques such as fuzzy logic, genetic algorithms, neural networks, simulated annealing, ant colony optimization, tabu search, and others. These techniques, however, are computationally intensive and require enormous computing time. Parallel processing has the potential to reduce this computational load and enable the efficient use of these techniques to solve a wide variety of problems.
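As one sketch of how such a technique looks in practice, here is a generic simulated-annealing loop (one of the methods listed above), which accepts worsening moves with probability exp(-delta/T) and cools T geometrically. The function names and parameter values are illustrative choices, not a prescribed design:

```python
import math
import random

def anneal(cost, neighbor, state, t0=1.0, cooling=0.995, steps=5000, seed=0):
    """Generic simulated annealing: repeatedly propose a neighboring state,
    always accept improvements, and accept worse states with probability
    exp(-delta / T), where T decreases geometrically each step."""
    rng = random.Random(seed)
    best = cur = state
    best_c = cur_c = cost(state)
    t = t0
    for _ in range(steps):
        cand = neighbor(cur, rng)
        c = cost(cand)
        if c < cur_c or rng.random() < math.exp(-(c - cur_c) / t):
            cur, cur_c = cand, c
            if c < best_c:
                best, best_c = cand, c
        t *= cooling
    return best, best_c
```

Plugging in a makespan cost function and a "move one task to the other processor" neighbor turns this directly into a stochastic solver for the two-processor load-balancing problem from project 1.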
4. Autonomic Communications in Parallel and Distributed Computing Systems (in collaboration with Dr. Javid Taheri, Sydney University)

The rapid advancement of computer architectures and high-speed interconnects has facilitated many successful deployments of many types of parallel and distributed systems. Previous studies have reported that the design of the interconnect significantly impacts the performance of parallel applications. High-speed interconnects not only unveil the potential performance of the computing system, but also allow such systems to achieve a better performance/cost ratio. Towards this end, this project aims to study how computations and communications influence the performance of such parallel and distributed computing systems.
5. Quality of Service in Distributed Computing Systems

There is a need to develop a comprehensive framework that determines what QoS means in the context of distributed systems and of the services provided through such infrastructure. What complicates the scenario is the fact that distributed systems provide a whole range of services, not only high-performance computing. There is a great need for the development of QoS metrics for distributed systems that can capture all of this complexity and provide meaningful measures for a wide range of applications. This will likely mean that new classes of algorithms and simulation models need to be developed. These should be able to characterize the variety of workloads and applications, and so help us better understand the behaviour of distributed computing systems under different operating conditions.
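As a small illustration of what even a basic QoS metric might look like, the sketch below summarizes a trace of request records into a success rate and latency figures. The record format and all names are assumptions made purely for illustration; the project's point is precisely that richer metrics than these are needed:

```python
import math

def percentile(values, q):
    """Nearest-rank percentile (no interpolation)."""
    s = sorted(values)
    k = max(0, math.ceil(q / 100 * len(s)) - 1)
    return s[k]

def qos_report(samples):
    """samples: list of (latency_ms, succeeded) request records.
    Returns a few simple QoS figures: success rate over all requests,
    and mean / 95th-percentile latency over the successful ones."""
    ok = [lat for lat, good in samples if good]
    return {
        "success_rate": len(ok) / len(samples),
        "mean_ms": sum(ok) / len(ok),
        "p95_ms": percentile(ok, 95),
    }
```

Tail percentiles rather than means are reported here deliberately: in a distributed system a healthy average can hide a long, user-visible latency tail.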
6. Healing and Self-Repair in Large-Scale Distributed Computing Systems

As the complexity of distributed systems increases over time, there will be a need to endow such systems with capabilities that allow them to keep operating in disaster scenarios. What makes this problem very complex is the heterogeneous nature of today's distributed computing environments, which can be made up of hundreds or thousands of components (computers, databases, etc.). In addition, a user in one location might not have control over other parts of the system. It is therefore logical that "smart" algorithms (protocols) are needed that can achieve an acceptable level of fault tolerance and account for a variety of disaster recovery scenarios.
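One common building block for such "smart" self-repair protocols is a heartbeat-based failure detector: nodes periodically announce that they are alive, and a node whose heartbeat has gone stale is suspected failed and becomes a candidate for recovery actions. The minimal sketch below (class and method names are my own, purely illustrative) captures just that detection step; timestamps are passed in explicitly to keep it deterministic:

```python
class HeartbeatMonitor:
    """Minimal failure detector: a node whose last heartbeat is older than
    `timeout` seconds at query time is reported as suspected failed."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # node id -> time of last heartbeat

    def heartbeat(self, node, now):
        """Record that `node` reported in at time `now` (seconds)."""
        self.last_seen[node] = now

    def suspected(self, now):
        """Return the sorted list of nodes whose heartbeat has gone stale."""
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)
```

A real protocol would layer repair on top of this (restarting services, re-replicating data, electing replacements), and would have to tolerate the detector's inherent imprecision: in an asynchronous network a slow node is indistinguishable from a dead one.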
