Parallel Functional Programming – Lecture content (DAT280 / DIT261, LP4 2018)

This page describes each lecture and contains links to related materials.

Course Introduction

This is the lecture that sets the scene for the course. It explains why functional languages are well suited to parallel programming, even though they failed to deliver on the promise of solving parallel programming in the past. So why are they interesting now?

Slides:

Reading:

Simon Marlow's book Parallel and Concurrent Programming in Haskell gives a good explanation of why the topics of this course are interesting. It also makes the same distinction between concurrency and parallelism as this course does. We cover only Part I, on parallelism, in the Haskell part of the course. We will simply call the book PCPH.


From par and pseq to Strategies

This lecture considers par and pseq more critically, and concludes that it might be a good idea to separate the control of parallel evaluation from the description of the algorithm itself. The idea of Strategies is described in a well-known paper, Algorithms + Strategies = Parallelism, by Trinder, Hammond, Loidl and Peyton Jones. More recently, Marlow and some of the original authors have updated the idea in Seq no more: Better Strategies for Parallel Haskell. We expect you to read both of these papers. The lecture is based on the newer paper. See also PCPH chapters 2 and 3.
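To make the separation concrete, here is a minimal sketch in the style of the newer Strategies paper. It assumes the parallel package (Control.Parallel.Strategies); the function names expensive and results are made up for illustration.

```haskell
import Control.Parallel.Strategies (parList, rdeepseq, using)

-- A perfectly ordinary sequential algorithm: map an expensive
-- function over a list.
expensive :: Integer -> Integer
expensive n = sum [1..n]

-- The algorithm is unchanged; `using` attaches a Strategy that
-- evaluates the list elements in parallel, each to normal form.
-- All the parallelism lives on the right of `using`.
results :: [Integer]
results = map expensive [10000, 20000, 30000] `using` parList rdeepseq

main :: IO ()
main = print results
```

Swapping parList rdeepseq for r0 turns the parallelism off without touching the algorithm, which is exactly the separation the lecture argues for. (Compile with -threaded and run with +RTS -N to actually get parallel execution.)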

Slides:

Other Material:

Exercise session on parallelising Haskell

Code:


The Par Monad

This lecture is about a programming model for deterministic parallelism, introduced by Simon Marlow and colleagues. It introduces the Par Monad, a monad for deterministic parallelism, and shows how I-structures are used to exchange information between parallel tasks (or "blobs"); see Marlow's Haskell '11 paper with Ryan Newton and Simon PJ. You should read this paper.
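The shape of the model can be sketched in a few lines. This assumes the monad-par package (Control.Monad.Par); parSumProd is an illustrative name of mine, not from the paper.

```haskell
import Control.Monad.Par  -- from the monad-par package

-- Two tasks run in parallel; an IVar (an I-structure) carries each
-- result back to the parent. `get` blocks until the IVar is full,
-- and an IVar may be written at most once, which is what keeps the
-- whole computation deterministic.
parSumProd :: [Int] -> (Int, Int)
parSumProd xs = runPar $ do
  s <- spawnP (sum xs)      -- fork a task; get back an IVar for its result
  p <- spawnP (product xs)
  (,) <$> get s <*> get p

main :: IO ()
main = print (parSumProd [1..10])
```

Note that runPar is a pure function: whatever the scheduler does, the answer is the same, unlike programs built directly from par.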

Take a look at the I-Structures paper referred to in the lecture (not obligatory but interesting). See PCPH chapter 4.

Also, Phil Wadler's "Essence of Functional Programming" is a very interesting read, and it covers monads and continuation passing style.

The lecture starts with a presentation by Koen Claessen on his Poor Man's Concurrency Monad (see his JFP Pearl).

Slides:

The version of the Par monad that Max made, which allows you to draw pictures of your programs, is at

Data Parallel Programming I

This lecture is all about Guy Blelloch's seminal work on the NESL programming language, and on parallel functional algorithms and associated cost models.
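The cost model from this lecture can be illustrated on a tiny example. The code below is my own sketch (reduceTree is not from NESL); the comments carry the work/span analysis.

```haskell
-- A divide-and-conquer reduction, annotated with NESL-style costs.
-- Work  = total number of operations          = O(n)
-- Span  = longest chain of data dependencies  = O(log n),
-- because the two recursive calls are independent of each other and
-- could, in principle, run in parallel at every level of the tree.
reduceTree :: [Int] -> Int
reduceTree []  = 0
reduceTree [x] = x
reduceTree xs  = reduceTree left + reduceTree right
  where (left, right) = splitAt (length xs `div` 2) xs

main :: IO ()
main = print (reduceTree [1..100])
```

A left-to-right fold computes the same answer with the same O(n) work but O(n) span, which is why work alone says nothing about available parallelism.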

Material:

Reading:

Slides:

Andrzej Filinski's NESL interpreter, which calculates work and span (or depth) for you:

Many thanks to Andrzej for letting us use his tool.

Data Parallel Programming II

We continue to discuss work and span as a means to analyse parallel algorithms (as used in NESL).

We briefly present some details of (and some non-idiomatic programming in) Repa (a library for data parallel programming in Haskell).
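For a flavour of what Repa code looks like, here is a small sketch. It assumes the repa package; the array xs is my own example, and parallel speed-up again requires the threaded runtime.

```haskell
{-# LANGUAGE TypeOperators #-}
import Data.Array.Repa as R   -- from the repa package

-- A one-dimensional unboxed array; (Z :. 5) is its shape (extent).
xs :: Array U DIM1 Double
xs = fromListUnboxed (Z :. 5) [1, 2, 3, 4, 5]

main :: IO ()
main = do
  -- R.map builds a delayed array; computeP forces it in parallel,
  -- splitting the index space across the available capabilities.
  ys <- computeP (R.map (* 2) xs) :: IO (Array U DIM1 Double)
  print (toList ys)
```

The delayed-versus-manifest distinction (type indices D and U) is what lets Repa fuse chains of maps and zips before computing anything.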

Finally, we consider some open research questions (in Mary's biased opinion).

Material:

Slides:

Parallel Functional Programming in Java (Peter Sestoft)

It has long been assumed in academic circles that functional programming, and declarative processing of streams of immutable data, are convenient and effective tools for parallel programming. Evidence for this is now provided, paradoxically, by the object-imperative Java language, whose version 8 (from 2014) supports functional programming, parallelizable stream processing, and parallel array prefix operations. We illustrate some of these features and use them to solve computational problems that are usually handled by (hard to parallelize) for-loops, and also combinatorial problems such as the n-queens problem, using only streams, higher-order functions and recursion. We show that this declarative approach leads to very good performance on shared-memory multicore machines with a near-trivial parallelization effort on this widely used programming platform. We also highlight a few of the warts caused by the embedding in Java. Some of the examples presented are from Sestoft: Java Precisely, 3rd edition, MIT Press 2016.

Slides:

Data Parallel Programming in Futhark (Troels Henriksen, DIKU, Copenhagen University)

Functional programming is intuitively a great fit for massively parallel programming, and Blelloch's seminal work on NESL appeared to show exactly how to transform nested data parallelism into efficient flat parallelism. So how come we are not all using functional languages to program our GPUs? In this lecture we will discuss the gap between simply expressing a lot of parallelism, and actually obtaining good performance on real hardware. I will present the design and implementation of the Futhark programming language, which has been carefully restricted (compared to NESL) to permit the construction of an aggressively optimising compiler. We will look at how close Futhark gets to the dream of automatically transforming high-level hardware-agnostic functional parallel code into efficient low-level code, and how to tailor one's parallel programming style to the restrictions imposed by Futhark.

Recommended reading (in decreasing order of relevance):

Parallel Programming in Erlang

This lecture introduces Erlang for Haskell programmers, taking parallelising quicksort as an example, both within one Erlang VM and distributed across a network. The latest version of the Erlang system can be downloaded from here. There is a Windows installer. Many Linux distributions have an Erlang package available, but not necessarily one suitable for development of Erlang code, and not necessarily the latest version. On Ubuntu, try

If that doesn't work or you can't find an appropriate package, build the VM from source.

Slides:

Robust Erlang

This lecture focusses on the fault tolerance constructs in Erlang (links and system processes) and the motivation for the "Let It Crash" philosophy. It introduces supervision trees and the Open Telecom Platform, and develops a simple generic server.

Slides:

Exercise session on parallel programming in Erlang

An introduction to programming in Erlang.

Single Assignment C — Functional Programming for HP^3 (Sven-Bodo Scholz)

SaC is designed to combine High-Productivity with High-Performance and High-Portability. The key to achieving this goal is a purely functional core of the language combined with several advanced compilation and runtime techniques. This lecture gives an overview of the key design choices that SaC is based upon, and sketches how these can be leveraged to produce code for various heterogeneous many-core systems that often outperforms hand-written low-level counterparts.

Slides:

Parallelising QuickCheck

QuickCheck finds faults in software by testing properties in a large number of random test cases. Since test cases are independent of each other, there is scope to speed up fault-finding dramatically using parallelism. But realising those speed-ups in practice requires a careful choice of architecture and some surprising trade-offs. In this lecture I'll tell this story.

Slides:

Map Reduce

Google's Map-Reduce framework has become a popular approach for processing very large datasets in distributed clusters. Although the original implementation was in C++, its connections with functional programming are close: the original inspiration came from the map and reduce functions in LISP; MapReduce is a higher-order function for distributed computing; the purely functional behaviour of mappers and reducers is exploited for fault tolerance; and it is ideal for implementation in Erlang. This lecture explains what Map-Reduce is, shows a sequential and a simple parallel implementation in Erlang, and discusses the refinements needed to make it work in reality.
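The higher-order function at the heart of the framework fits in a dozen lines. Here is a sequential sketch in Haskell rather than the Erlang of the lecture; mapReduce and wordCount are my own illustrative names.

```haskell
import qualified Data.Map as Map

-- A sequential skeleton of MapReduce: a mapper turns each input into
-- key/value pairs, the framework groups the values by key, and a
-- reducer folds each group. Distributed versions farm out the mapper
-- calls and the per-key reductions, which is safe precisely because
-- both functions are pure.
mapReduce :: Ord k
          => (a -> [(k, v)])    -- mapper
          -> (k -> [v] -> r)    -- reducer
          -> [a] -> Map.Map k r
mapReduce mapper reducer inputs = Map.mapWithKey reducer grouped
  where
    grouped = Map.fromListWith (++)
                [ (k, [v]) | x <- inputs, (k, v) <- mapper x ]

-- Word count, the classic example: each word maps to (word, 1),
-- and the reducer sums the ones.
wordCount :: [String] -> Map.Map String Int
wordCount = mapReduce (\doc -> [ (w, 1) | w <- words doc ])
                      (\_ vs -> sum vs)

main :: IO ()
main = print (wordCount ["a b a", "b c"])
```

Everything the real framework adds (partitioning, shuffling, re-running failed mappers) refines this skeleton without changing its type.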

Reading:

Yes, both papers have the same title (and the same authors). What can you do?

Slides:

Databases in the New World

NoSQL databases have become very popular for the kind of scalable applications that Erlang is used for. In this lecture, we introduce the mother of them all, Amazon's Dynamo, and one of its descendants, Riak, implemented in Erlang by Basho Technologies. We discuss scalability, the CAP theorem, eventual consistency, consistent hashing and the ring, and the mechanisms used to detect, tolerate, and repair inconsistency.
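Consistent hashing and the ring can be sketched very compactly. This is my own toy model, not Riak's implementation: the hash is a simple FNV-1a stand-in (Dynamo and Riak use stronger hashes over a much larger ring, with several virtual nodes per physical node), and mkRing/nodeFor are hypothetical names.

```haskell
import qualified Data.Map as Map
import Data.Bits (xor)
import Data.Char (ord)
import Data.Word (Word64)

-- FNV-1a: a cheap stand-in for the ring's hash function.
hash :: String -> Word64
hash = foldl step 14695981039346656037
  where step h c = (h `xor` fromIntegral (ord c)) * 1099511628211

-- The ring maps each node's hash to the node itself.
type Ring = Map.Map Word64 String

mkRing :: [String] -> Ring
mkRing nodes = Map.fromList [ (hash n, n) | n <- nodes ]

-- A key lives on the first node clockwise from the key's hash:
-- lookupGE finds it, wrapping to the smallest hash at the end of
-- the ring. Adding or removing a node only moves the keys in that
-- node's arc, which is the point of consistent hashing.
nodeFor :: Ring -> String -> String
nodeFor ring key =
  case Map.lookupGE (hash key) ring of
    Just (_, node) -> node
    Nothing        -> snd (Map.findMin ring)

main :: IO ()
main = print (nodeFor (mkRing ["node1", "node2", "node3"]) "some-key")
```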

Reading:

Slides:

Parallel Functional Programming in Erlang at Klarna (Richard Carlsson)

Slides:

The Erlang Virtual Machine (Erik Stenman)

Slides: