Workshop on Parallel Functional Programming

University College London, September 8, 1998

Abstracts

Looking for Eden in the land of parallel-functional languages

Yolanda Ortega and Ricardo Peña
Universidad Complutense de Madrid

1  A classification of parallel-functional languages

The exploitation of parallelism is a long-pursued --and not yet convincingly met-- goal in programming. There appears to be a contradiction between the efficient exploitation of parallelism and the simplicity of the corresponding programs: the more control the language gives over process management, communication and synchronization, the more complex and longer, and the less amenable to reasoning, the resulting programs become. Imperative parallel programming is a good example of this assertion.

It is well known that functional languages offer, in principle, good opportunities for exploiting parallelism, thanks to the freedom they allow in the evaluation order of subexpressions. In some sense, there is too much implicit parallelism: if we try to exploit all of it, we get a huge number of very low-granularity parallel activities, and the benefits of parallelism are lost in creating and communicating processes. For this reason, most approaches rely on the programmer to decide which expressions deserve the effort of creating a parallel process for their evaluation. These approaches differ mainly in the degree of explicitness they consider appropriate for conveying this information. In order of increasing explicitness, we classify the proposals into the two following groups:

Transformational languages
In a parallel transformational system, some inputs are transformed into outputs that depend functionally on them. The whole purpose of parallelism is to speed up the computation. The programmer supplements a purely functional program with special expressions, either written as annotations interspersed in the text or provided as specialized wiring functions, that direct the compiler as to where and when processes should be created. The semantics of the program with these specialized expressions is (almost) the same as the semantics of the program without them; the only difference is the order of evaluation of the subexpressions.

We can further refine this group into two subgroups: annotated languages --here we classify languages such as Concurrent Clean [NSEP91] and Glasgow Parallel Haskell [THJP96]--, and skeleton languages --here we include most of the work on skeletons [Bra93, DFH+93] and the language Caliban [Kel89].

Reactive languages
Reactive systems are the opposite of transformational ones: they usually have no clear notion of inputs and outputs, or even of termination, and the whole purpose of parallelism is to maintain a set of separate tasks interacting with an external environment. Of course, reactive constructs can also be used for programming parallel transformational systems, but the set of possible systems is wider than in the previous group. Nondeterminism unavoidably appears in these systems, and the referential transparency of functional languages is lost.

Typically, languages in this group offer constructs not only for creating processes but also for communicating with and synchronizing them. In some sense, the resulting languages are a more or less successful combination of two languages: a functional one and a coordination one. We consider languages such as FACILE [GMP90], Concurrent ML [Rep91], Erlang [Wik94], and Concurrent Haskell [PJGF96] to belong to this group.


2  Eden, a compromise

Eden extends the functional language Haskell with constructs for explicit process management. Detailed information about the language and its implementation can be found in the corresponding presentation in these proceedings.

The first construct offered by Eden is the type Process a b of process abstractions mapping values of type a to values of type b. Process abstractions are semantically equivalent to lambda abstractions, the only difference being that, when instantiated with some inputs, a process abstraction gives rise to a parallel activity, or actual process. Process abstractions are first-class values and, as such, they can be communicated in messages, used as arguments or results, and stored in data structures. This facilitates the programming of generic process topologies, or skeletons. With process abstractions alone, Eden could be classified in the group of transformational languages: interpreting the type Process a b as a synonym for the type a -> b, we get a semantically equivalent sequential program written in pure Haskell. But it does not clearly fit in either of the two subgroups of this group. Perhaps a third subgroup, ``languages with processes as first-class values'', should be created for it.
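
As a flavour of the construct, here is a minimal sketch; process and # follow the notation of the Eden papers, the Trans class contexts of the real language are omitted, and the definitions below are a sequential simulation (which is exactly the reading of Process a b as a -> b just described):

    -- Sequential simulation of the Eden constructs: in real Eden,
    -- `process` and `#` are primitives, and instantiation with `#`
    -- creates an actual parallel process.
    newtype Process a b = Process (a -> b)

    process :: (a -> b) -> Process a b
    process = Process

    ( # ) :: Process a b -> a -> b
    (Process f) # x = f x

    psquare :: Process Int Int
    psquare = process (\x -> x * x)

    -- Process abstractions are first-class, so generic topologies
    -- (skeletons) are ordinary higher-order functions:
    parMap :: Process a b -> [a] -> [b]
    parMap p xs = [ p # x | x <- xs ]   -- one child process per element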

The next construct is the predefined process abstraction merge :: Process [[a]] [a], whose instances merge, in a nondeterministic and fair way, a list of lists into a single list. This is the only source of nondeterminism in the language, but it gives enough expressive power to program reactive systems. With merge, Eden could better be classified in the group of reactive languages. Note, however, that many features of coordination languages, such as forking processes, sending and receiving messages, or establishing channels between processes, are missing in Eden. Process creation is explicit, but communication is implicit: it suffices to apply a process abstraction to the output of an instantiated process to get both processes communicating.
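
Building on the sketch above, a hypothetical reactive server (Request, Response, and handle are invented names, not part of Eden):

    -- merge (predefined in Eden) interleaves the request streams of
    -- many clients, nondeterministically and fairly, into one stream,
    -- which a single handler then serves.
    server :: [[Request]] -> [Response]
    server clients = map handle (merge # clients)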

The evaluation model of Eden programs is a compromise between lazy and eager strategies. Most of the language is lazy. Eager evaluation is used in the main expressions of processes --i.e. processes produce their outputs without a previous demand for them-- and in process instantiations --i.e. processes are instantiated as soon as possible, without demand for their outputs. List communication is stream-based: every element of the list is eagerly evaluated to normal form, but the whole list is transmitted element by element. These decisions speed up the distribution of the computation and allow parallelism to be exploited better.

Eden's design decisions aim to give the programmer a reasonable degree of control for exploiting parallelism efficiently, while at the same time avoiding very low-level coordination constructs that would make programs longer and more difficult to understand. We can say that Eden is a compromise between several extreme alternatives in the design space: implicitness/explicitness, lazy/eager evaluation, determinism/nondeterminism, and so on.


References

Bra93
T. A. Bratvold. A Skeleton-Based Parallelising Compiler for ML. In Proceedings of the 5th International Workshop on the Implementation of Functional Languages, pages 23-33, September 1993.

DFH+93
J. Darlington, A. J. Field, P. J. Harrison, P. H. J. Kelly, D. W. N. Sharp, and Q. Wu. Parallel Programming Using Skeleton Functions. In PARLE'93: Parallel Architectures and Languages Europe. Springer-Verlag, June 1993.

GMP90
A. Giacalone, P. Mishra, and S. Prasad. Operational and Algebraic Semantics for Facile: A Symmetric Integration of Concurrent and Functional Programming. In ICALP'90, LNCS 443. Springer-Verlag, 1990.

Kel89
P. Kelly. Functional Programming for Loosely-Coupled Multiprocessors. Pitman, 1989.

NSEP91
E. Nöcker, J. Smetsers, M. van Eekelen, and M. Plasmeijer. Concurrent Clean. In TAPSOFT'91, LNCS 494, pages 202-219. Springer-Verlag, 1991.

PJGF96
S. Peyton Jones, A. Gordon, and S. Finne. Concurrent Haskell. In ACM Symposium on Principles of Programming Languages (POPL'96). ACM Press, 1996.

Rep91
J. H. Reppy. CML: A Higher-order Concurrent Language. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'91), pages 293-305. ACM Press, 1991.

THJP96
P. W. Trinder, K. Hammond, J. S. Mattson Jr., and A. S. Partridge. GUM: A Portable Parallel Implementation of Haskell. In ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'96), Philadelphia, USA. ACM Press, May 1996.

Wik94
C. Wikström. Distributed Programming in Erlang. In Hoon Hong, editor, PASCO'94: First International Symposium on Parallel Symbolic Computation, pages 412-421. World Scientific Publishing Company, September 1994.


Abstract, declarative control over partitioning in parallel functional programs: experiences with Caliban

Paul Kelly
Imperial College, London

For the general case, where good automatic algorithms don't apply, partitioning a parallel computation is itself a programming problem. To be portable and reusable, the parallel program itself must incorporate partitioning rules, which may take into account the size and performance characteristics of the machine and the structure of the problem.

Explicit mechanisms (spark, par, future, ParAlfl) essentially partition the program as a side-effect of its execution. How can we partition declaratively? How can we express how the executing program will be partitioned, separately from the program's execution?

This was the motivation behind the Caliban language design.

In Caliban, programs are annotated with an assertion in the form of a graph, indicating which expressions are executed together, which on separate processors, and where they communicate.

The interesting thing about Caliban is that these assertions can be represented in the program as expressions, using the functions and types of the application itself. We can use application-specific higher-order operators to build both the computation and the process network.
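
The flavour of this can be conveyed by a small, hypothetical Haskell-style sketch; this is not actual Caliban syntax (Caliban attaches its network assertions in `moreover' clauses), and Net and the stage names are invented for illustration:

    -- One application-specific higher-order operator builds both the
    -- computation and a description of the process network. Net is an
    -- invented type standing in for Caliban's network assertions.
    data Net = Node String | Arc Net Net

    pipeline :: [a -> a] -> (a -> a, Net)
    pipeline fs = (foldr (.) id fs, net)   -- assumes a non-empty pipeline
      where net = foldr1 Arc [ Node ("stage" ++ show i)
                             | i <- [1 .. length fs] ]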

The Caliban compiler, reported in Frank Taylor's PhD thesis, symbolically evaluates the assertions at compile time. The process network cannot, therefore, change during execution but it can be derived automatically for different machines and problem instances by recompilation.

Caliban has proven a difficult language to implement (it took ten years!). The experiment has led to some interesting conclusions, not all of them positive.

In particular, Caliban has an intriguing relationship with more recent approaches: HPF (Caliban generalises data partitioning, so partitioning can be user-defined over user-defined data types), P3L (the allocation of resources to skeletons can be programmed in the annotation to ensure, for example, that pipelines are balanced), and GUM (partial evaluation to uncover evaluation strategies, and to assign the resulting tasks to processors).


More or Less Explicit Parallelism in Concurrent Clean

Pascal Serrarens
University of Nijmegen

Recently, a distinction has been made in Concurrent Clean between parallel programs and concurrent programs. Parallel programs have the same semantics as their sequential counterparts, so there is no explicit message passing and no nondeterminism. Concurrent programs have a different semantics, because explicit message passing and nondeterminism are used. Both programming models are now provided in Concurrent Clean.

Process placement, which is an issue in both models, is becoming more explicit. Instead of single processors, we will work with processor structures, which are dynamic and thus may change during the lifetime of a program. They give better control over the processors chosen to solve a certain problem, while it remains possible to write programs independently of the machine configuration used to execute them.


A Parallel Functional Modular Language

Cyrus F. Nourani

Techniques and a language for modular software design are presented, applying software agents. The conceptual designs are domain-independent and make use of specific domain aspects through multiagent AI. The stages of conceptualization, design and implementation are defined by new techniques coordinated by objects. Software systems are designed by knowledge acquisition, specification, and multiagent implementations. Multiagent implementations are defined for the modular designs, applying our recent projects, which have led to fault-tolerant AI systems. A new high-level concurrent-syntax language is applied to the designs. A novel multi-kernel design technique is presented. Communicating pairs of kernels, each defining a part of the system, are specified by object-coobject super-modules. Models and language constructs are defined for coordination and implemented by software agents. Interaction among problem-solving paradigms is via objects, multiagent AI, and design-language coordination.

Keywords: Programming Design with Software Agents, Multiagent AI Techniques, Multi Kernel Designs, Object Level Design, Abstract Multiagent Implementations, Coordination Languages, Modular Programming, Stage-Composable Modules, Explicit/Implicit Parallelism, Nondeterministic Agent Computing, Abstract Communication and Synchronization Paradigms


Parallel and Distributed Computing with Haskell

Manuel M. T. Chakravarty
Institute of Information Sciences and Electronics
University of Tsukuba

High-performance SMP servers, clustered workstations, and modern supercomputers increasingly share the same architectures; at the same time, we see that distributed computing and parallel computing are converging. It is clear that this development has a strong impact on programming models and language design for networked and high-performance machines. This talk shows that the language Goffin, an extension of Haskell for parallel programming, can be further extended such that many parallel and distributed applications can be implemented in the same framework. Furthermore, we compare this approach to more conservative approaches to parallel and distributed programming in Haskell that are based on exploiting implicit parallelism and monads, respectively.


Strategic SPMD

Kevin Hammond
University of St. Andrews

While SPMD parallelism is an important emerging paradigm, it has hitherto been unclear how it can be incorporated into the evaluation strategy model, a new model of parallelism which provides a clean separation of behaviour and control for a wide variety of parallel paradigms. In this talk I will discuss how SPMD can be integrated with evaluation strategies, supporting the paradigm in a purely functional setting without introducing new control constructs into the functional language.
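
As a flavour of the idea, here is a minimal sketch in the evaluation-strategy style of GPH (Strategy, rwhnf, parList and using follow the published strategies formulation; spmdMap is a hypothetical combinator, and the sketch assumes par from GHC's parallel library):

    import Control.Parallel (par)

    -- Strategies separate what is computed from how it is evaluated.
    type Strategy a = a -> ()

    rwhnf :: Strategy a                 -- reduce to weak head normal form
    rwhnf x = x `seq` ()

    parList :: Strategy a -> Strategy [a]
    parList _     []     = ()
    parList strat (x:xs) = strat x `par` parList strat xs

    using :: a -> Strategy a -> a       -- attach a strategy to a value
    using x s = s x `seq` x

    -- SPMD sketch: the Same Program (f) applied, in parallel, to
    -- Multiple Data (the chunks); no new control constructs needed.
    spmdMap :: (a -> b) -> [a] -> [b]
    spmdMap f chunks = map f chunks `using` parList rwhnf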


Parallel Programming with Data Distribution Algebras

Thomas Nitsche and Mario Südholt
Technische Universität Berlin

This article presents a method that facilitates parallel programming by enabling abstract descriptions of the distribution of data structures by means of overlapping covers, which form data distribution algebras. Algorithms are formulated and derived by transformation in a functional environment using skeletons, i.e. higher-order functions with specific parallel implementations. Communication is specified implicitly, through access to the overlapping parts of covers.
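
A minimal sketch of the overlap idea, under invented names (cover and mapCover are not the paper's operators):

    -- `cover n k xs` splits a list into blocks of n elements, each
    -- overlapping its successor by k; a skeleton then maps over the
    -- blocks, and the border exchange implied by the overlap remains
    -- implicit in the access to shared elements.
    cover :: Int -> Int -> [a] -> [[a]]
    cover _ _ [] = []
    cover n k xs = take (n + k) xs : rest
      where xs' = drop n xs
            rest | null xs'  = []
                 | otherwise = cover n k xs'

    mapCover :: ([a] -> b) -> Int -> Int -> [a] -> [b]
    mapCover f n k = map f . cover n k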


DIGRESS - an object-oriented design for "plug-and-play" distributed-memory graph reduction

Chris Clack
University College London

DIGRESS (Distributed Graph Reduction Experiment Support System) is an experimental PFP system designed to support research into distributed-memory graph reduction (DMGR) implementations. An object-oriented class hierarchy has been designed which can be used to produce a number of different DMGR implementations using different runtime-system components. For example, a new task or graph representation can be installed whilst still utilising the rest of the runtime system (the communications infrastructure, the garbage collector, the task distribution system, etc.). The development of the class hierarchy has forced attention to be focussed on the degree of cohesion required between the different components of the runtime system. Obviously, performance is important, but it is not the primary design goal for DIGRESS: rather, DIGRESS is a tool for exploring the runtime-system design space.
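
To convey the plug-and-play flavour in this document's notation, here is a hedged Haskell analogue of the component interfaces (the real design is an object-oriented class hierarchy; every name below is an invented placeholder, not DIGRESS's actual API):

    -- Each runtime-system component sits behind an interface; a DMGR
    -- configuration assembles one implementation of each, so a new
    -- component can be installed without touching the rest.
    data Heap = Heap                     -- placeholder types
    data Task = Task
    type ProcessorId = Int

    class GarbageCollector g where
      collect :: g -> Heap -> Heap

    class TaskDistributor d where
      place :: d -> Task -> ProcessorId

    data CopyingGC = CopyingGC           -- one interchangeable component
    instance GarbageCollector CopyingGC where
      collect _ heap = heap              -- placeholder behaviour

    data RoundRobin = RoundRobin         -- another
    instance TaskDistributor RoundRobin where
      place _ _ = 0                      -- placeholder behaviour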


Architecture-independence of Glasgow Parallel Haskell (GpH)

Phil Trinder
Heriot-Watt University, Edinburgh

In principle, pure functional languages promise straightforward architecture-independent parallelism. This talk reports on an investigation into the validity of this claim in the context of our highly-portable implementation of an implicitly-parallel functional language: the GUM implementation of Glasgow Parallel Haskell (GPH). We discuss architecture independence at two levels: low-level (i.e. the implementation) and high-level (i.e. the programmer).

Low-level architecture independence is achieved by choosing a message-passing model for GUM, and implementing it using portable C and a widely-supported message-passing library such as PVM. In fact, GUM is largely independent of the message-passing library, and has been adapted to use MPI and the CM-5 CMMD libraries as well as PVM. As a result, GUM is easily ported, and is currently available on seven platforms, including shared-memory machines, distributed-memory machines, and networks of workstations. We provide indicative measurements of how efficient and effective our architecture-independent runtime system is across a range of architectures.

The GPH programming model provides higher-level architecture independence. The parallelism in GPH is mainly implicit, and hence relatively small parts of the program need to be changed for a new architecture. The coordination that is required is expressed with a new high-level construct, evaluation strategies. Evaluation strategies provide a clean separation between algorithm and coordination, easing the task of changing either facet to suit new parallel environments. Moreover, GPH programs can systematically be developed for multiple target architectures, using a suite of simulation, profiling and visualisation tools. Much of the development is architecture-independent but, once a particular target architecture has been selected, the tools are parameterised to support tuning for that architecture. We demonstrate the systematic development of two real programs to the point of achieving good speedups on four architectures.


From (sequential) Haskell to (parallel) Eden:
An Implementation Point of View

Silvia Breitinger, Rita Loogen, and Ulrike Klusik
Philipps-Universität Marburg

The explicitly parallel programming language Eden adds a coordination level to the lazy functional language Haskell. In this talk we show how to extend the Glasgow Haskell Compiler (GHC), a fast, publicly available implementation of Haskell, to obtain a parallel implementation of Eden. We describe the compilation of Eden and present a lean parallel runtime system which implements DREAM, the DistRibuted Eden Abstract Machine. In the Eden parallel runtime system, the process structures underlying parallel algorithms can be modelled exactly: processes communicate directly, using the communication channels defined in the Eden program, rather than through a (virtual) shared memory or global address space. This simplifies the whole memory management, including garbage collection. Several parts of the GUM (Graph reduction for a Unified Machine model) runtime system for Glasgow Parallel Haskell programs have been reused for our parallel system.