ACCEPTED PAPERS:
(the paper ID will be required in the registration process)
HeteroPar1: Brett A. Becker and Alexey Lastovetsky,
Matrix Multiplication on Two Interconnected Processors
Abstract:
This paper presents a new partitioning algorithm to perform matrix multiplication on two
interconnected heterogeneous processors. Data is partitioned in a way which minimizes
the total volume of communication between the processors compared to more general
partitionings, resulting in a lower total execution time whenever the power ratio between
the processors is greater than 3:1. The algorithm has interesting and important
applicability, particularly as the top-level partitioning in a hierarchical algorithm that is to
perform matrix multiplication on two interconnected clusters of computers.
HeteroPar2: Bahman Javadi, Jemal H. Abawajy, Saeid Nahavandi and Mohammad K. Akbari,
Analytical Network Modeling of Heterogeneous Large-Scale Cluster Systems
Abstract:
The study of communication networks for distributed systems is very important,
since the overall performance of these systems often depends on the effectiveness
of their communication networks. In this paper, we address the problem of network
modeling for heterogeneous large-scale cluster systems. We consider large-scale
cluster systems as typical cluster-of-clusters systems. Since heterogeneity is
becoming common in such systems, we take into account network as well as cluster
size heterogeneity to propose the model. To this end, we present an analytical network
model and validate the model through comprehensive simulation. The results of the
simulation demonstrate that the proposed model exhibits a good degree of accuracy
for various system organizations and under different working conditions.
HeteroPar3: Antonio J. Plaza,
Heterogeneous Computing in Remote Sensing Applications: Current Trends and Future Perspectives
Abstract
Heterogeneous networks of computers (HNOCs) have rapidly become a very
promising commodity computing solution, expected to play a major role in the design of high
performance computing systems for many on-going and planned remote sensing missions.
Currently, only a few parallel processing strategies for remotely sensed image analysis are
available in the open literature, and most of them assume homogeneity in the underlying
computing platform. This paper develops several highly innovative heterogeneous parallel
algorithms for information extraction from high-dimensional remotely sensed images, with
particular emphasis on target detection and land-cover mapping applications. Analytical and
experimental results are presented in the context of a realistic application, using real data
collected by NASA’s Jet Propulsion Laboratory over the World Trade Center area in New York
after September 11th, 2001. Parallel performance of the proposed heterogeneous algorithms is
discussed using several (fully and partially) heterogeneous networks at the University of Maryland,
and a massively parallel Beowulf cluster at NASA’s Goddard Space Flight Center. Combined,
these parts offer a thoughtful perspective on the potential and emerging challenges of applying
heterogeneous computing practices to remote sensing problems.
HeteroPar4: Richard L. Graham, Galen M. Shipman, Brian W. Barrett, Ralph H. Castain and George Bosilca,
Open MPI: A High-Performance, Heterogeneous MPI
Abstract
The growth in the number of generally available, distributed, heterogeneous computing
systems places increasing importance on the development of user-friendly tools
that enable application developers to use them efficiently. Open MPI provides support
for several aspects of heterogeneity within a single, open-source MPI implementation.
Through careful abstractions, heterogeneous support maintains efficient
use of uniform computational platforms. We describe Open MPI’s architecture for
heterogeneous network and processor support. A key design feature of this implementation
is its transparency to the application developer while maintaining very
high levels of performance. This is demonstrated with the results of several numerical
experiments.
HeteroPar5: Noriyuki Fujimoto and Kenichi Hagihara,
A 2-Approximation Algorithm for Scheduling Independent Tasks onto a Uniform Parallel Machine and its Extension to a Computational Grid
Abstract
First, this paper gives a very simple 2-approximation algorithm for scheduling n
independent tasks onto a uniform parallel machine with m processors. The best known results so far
are a (1+ε)-approximation algorithm (0 < ε ≤ 1) that runs in time exponential in 1/ε and a
2-approximation algorithm based on an LP-rounding technique that runs in O((m+n)^3.5) time.
In contrast, the proposed algorithm runs in O(n log n + mn) time. Next, this paper
proves that, if the criterion for a schedule is the total computational power consumed by the schedule,
the proposed algorithm is also a 2-approximation algorithm for a uniform parallel machine
whose processor speeds vary over time. Such a parallel machine corresponds to a
so-called desktop grid.
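The abstract does not describe the proposed algorithm itself, so the following is only a minimal sketch of the problem setting: independent tasks placed on processors of different fixed speeds. It uses a common greedy rule (longest task first, each task on the processor that would finish it earliest), which is an assumption chosen for illustration and is not claimed to be the authors' method. The function and parameter names are hypothetical.

```python
def greedy_uniform_schedule(task_lengths, speeds):
    """Assign independent tasks to processors with fixed speeds.

    Illustrative greedy rule (an assumption, not the paper's algorithm):
    take tasks longest-first and place each one on the processor that
    would complete it earliest. Requires at least one processor.
    """
    finish = [0.0] * len(speeds)        # current finish time per processor
    assignment = [[] for _ in speeds]   # task indices placed on each processor

    # Sort task indices by decreasing length: O(n log n).
    order = sorted(range(len(task_lengths)),
                   key=lambda i: task_lengths[i], reverse=True)

    for i in order:
        # Linear scan over the m processors: O(m) per task, O(mn) overall.
        best = min(range(len(speeds)),
                   key=lambda p: finish[p] + task_lengths[i] / speeds[p])
        finish[best] += task_lengths[i] / speeds[best]
        assignment[best].append(i)

    makespan = max(finish)
    return assignment, makespan
```

For example, `greedy_uniform_schedule([4, 2, 2], [2, 1])` places tasks 0 and 2 on the faster processor and task 1 on the slower one.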
HeteroPar6: Jacques M. Bahi, Raphaël Couturier and Philippe Vuillemin,
JaceP2P: an Environment for Asynchronous Computations on Peer-to-Peer Networks
Abstract
Using Peer-to-Peer (P2P) networks is a way to federate a large number of processors in order to
solve large scale scientific problems. Those networks are decentralized, highly dynamic and composed
of heterogeneous machines. The goal of our work is to compute large scale scientific iterative
applications on P2P networks. We propose JaceP2P, a multi-threaded Java-based library designed to build
asynchronous parallel iterative applications. Using this library, it is possible to run such applications on
a set of dynamic and heterogeneous machines organized in a decentralized and P2P fashion.
HeteroPar7: Euloge Edi, Tahar Kechadi and Ronan McNulty,
Virtual Structured P2P Network Topology for Distributed Computing
Abstract
P2P and Grid computing are two paradigms increasingly used in today's computing
environments; their potential to provide better quality of service to users is very promising compared
to the cost they involve. This paper presents a hierarchical virtual network topology, built on top
of the real existing one, which is used to manage distributed resources in a Grid environment.
The distributed resources are found on the Web following P2P techniques, and so they are very
volatile. The virtual topology constructs an efficient and robust virtual machine which will serve as
a distributed computing platform. This topology is called TreeP and it is exploited in DGET [7],
a data-grid middleware environment. Here, we study this virtual topology both theoretically and
experimentally. We show that this topology is very scalable, robust, load-balanced, and easy to
construct and maintain.
HeteroPar8: Pedro Alonso, Alexey Lastovetsky and Antonio M. Vidal,
A Parallel Algorithm for Solution of the Deconvolution Problem on Heterogeneous Networks
Abstract
In this work we present a parallel algorithm for the solution of a given least squares problem with structured
matrices. This problem arises in many applications, mainly related to digital signal processing. The parallel
algorithm is designed to speed up the sequential one on heterogeneous networks of computers. The parallel algorithm
follows the HeHo strategy and is implemented with the recently developed HeteroMPI programming environment.
The results obtained validate HeteroMPI as a very useful tool for programming heterogeneous parallel algorithms.
HeteroPar9: Clovis Dongmo Jiogo, Pierre Kuonen and Pierre Manneback,
Well balanced sparse matrix-vector multiplication on a parallel heterogeneous system
Abstract
This paper discusses well-balanced implementations of sparse
matrix-vector multiplication on heterogeneous environments. A new heuristic
is proposed for balancing the computing load over the processors
proportionally to their power. This is done by defining a redistribution
model which splits the sparse matrix into k-way partitions, in order to
minimize the total execution time. An implementation of sparse matrix-vector
multiplication in a heterogeneous environment, using the parallel object-oriented
programming model POP-C++, shows that this 1D-partitioning
heuristic greatly improves the performance of the product in comparison
with block row decomposition.
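The abstract does not give the heuristic's details, so the following is only a minimal sketch of the general idea of 1D partitioning with load proportional to processor power. It assumes that the work per row is measured by its nonzero count and that each processor receives one contiguous block of rows; both are assumptions made for illustration, and the function and parameter names are hypothetical.

```python
def partition_rows_by_power(nnz_per_row, powers):
    """Split rows into contiguous blocks, one block per processor, so that
    each block's nonzero count is roughly proportional to that processor's
    relative power. Illustrative only; not the heuristic from the paper.

    Returns a list of (start_row, end_row) half-open ranges.
    """
    total_nnz = sum(nnz_per_row)
    total_power = sum(powers)

    bounds = [0]
    row, acc, cum_power = 0, 0, 0.0
    for p in powers[:-1]:
        cum_power += p
        # Cumulative nonzero budget for all processors up to this one.
        target = total_nnz * cum_power / total_power
        # Extend the block while the next row still fits in the budget.
        while row < len(nnz_per_row) and acc + nnz_per_row[row] <= target:
            acc += nnz_per_row[row]
            row += 1
        bounds.append(row)
    bounds.append(len(nnz_per_row))  # the last processor takes the remainder

    return [(bounds[k], bounds[k + 1]) for k in range(len(powers))]
```

With row nonzero counts `[1, 1, 1, 1, 2, 2]` and powers `[1, 2]`, the first processor gets rows 0 to 1 (2 nonzeros) and the second gets rows 2 to 5 (6 nonzeros), whereas an even block row split would give each processor three rows regardless of power.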
HeteroPar10: Sascha Hunold, Thomas Rauber and Gudula Rünger,
TGrid – Grid runtime support for hierarchically structured task-parallel programs
Abstract
In this article we introduce a grid runtime
system called TGrid, which is designed to run hierarchically
structured task-parallel programs on heterogeneous
environments and can also be used for common
component-based grid programming. TGrid is a location-aware
runtime system, which means that the system keeps track
of the placement of tasks on the grid. This enables better
scheduling strategies and heuristics, since the system is able
to determine each processor’s position in the grid, and so
the spatial locality leads to better performance due to less
network overhead.
HeteroPar11: J. Díaz, S. Reyes, A. Niño and C. Muñoz-Caro,
A Quadratic Self-Scheduling Algorithm for Heterogeneous Distributed Computing Systems
Abstract
Scheduling algorithms play an important role in heterogeneous computing systems. Development of
new scheduling strategies is an active research field. In this context, we present a general formulation
of the self-scheduling problem, deriving a new, quadratic, self-scheduling algorithm. Initial tests
comparing the performance of the new algorithm against well-established ones are carried out. Specifically,
working at the application level, we allocate sets of several thousand tasks in an Internet-based Grid
of computers that involves a transatlantic connection. In all the tests, the new algorithm performs
better than the previous ones.
HeteroPar12: I. Riakotakis, F. M. Ciorba, T. Andronikos and G. Papakonstantinou,
Self-Adapting Scheduling for Tasks with Dependencies in Stochastic Environments
Abstract
This paper addresses dynamic load balancing algorithms for non-dedicated heterogeneous
clusters of workstations. We propose an algorithm called Self-Adapting Scheduling
(SAS), targeted at nested loops with dependencies in a stochastic environment. This means
that the load entering the system, not belonging to the parallel application under execution,
follows an unpredictable pattern which can be modeled by a stochastic process. SAS takes
into account the history of previous timing results and the load patterns in order to make
accurate load balancing decisions. We study the performance of SAS in comparison with
DTSS. We established in previous work that DTSS is the most efficient self-scheduling
algorithm for loops with dependencies on heterogeneous clusters. We test our algorithm
under the assumption that the interarrival times and lifetimes of incoming jobs are exponentially
distributed. The experimental results show that SAS significantly outperforms
DTSS, especially with rapidly varying loads.
HeteroPar13: Wahid Nasri, Hajer Hamad and Hadhemi Fejjari,
A Framework for Adaptive Communication Modeling on Heterogeneous Hierarchical Clusters
Abstract
Today, due to the wide variety of existing parallel systems consisting of collections of
heterogeneous machines, it is very difficult for a user to solve a target problem by using a
single algorithm or to write portable programs that perform well on multiple
computational platforms. The inherent heterogeneity and the diversity of networks in such
environments make it a great challenge to model the communications of high
performance computing applications. Our objective in this work is to propose a
generic framework based on communication models and adaptive techniques for
predicting communication performance on hierarchical cluster-based
platforms. Toward this goal, we introduce the concept of a poly-model of communications,
which corresponds to techniques to better model the communications in terms of the
characteristics of the hardware resources of the target parallel system. We apply this
methodology to collective communication operations and show that the framework
achieves good performance while determining the best model-algorithm
combination depending on the problem and architecture parameters.