This is the lecture that sets the scene for the course. It introduces what we mean by parallelism and distinguishes it from concurrency. It explains why functional languages are well suited to parallel programming, despite the fact that they failed to deliver on that promise in the past. So why are they interesting now?
randomInts = take 200000 (randoms (mkStdGen 211570155)) :: [Integer]
(This definition is hidden by a callout in the pdf of the relevant slide.)
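As a taster of the par and pseq combinators covered in the opening lectures, here is a minimal sketch, assuming GHC with the parallel package (parSum is our own illustrative name, not something from the slides):

```haskell
import Control.Parallel (par, pseq)

-- Sum a list in two halves, sparking one half in parallel.
-- A sketch only: real speed-ups need the -threaded runtime,
-- +RTS -N, and enough work per spark to pay for the overhead.
parSum :: [Integer] -> Integer
parSum xs = left `par` (right `pseq` (left + right))
  where
    (as, bs) = splitAt (length xs `div` 2) xs
    left     = sum as
    right    = sum bs

main :: IO ()
main = print (parSum [1 .. 100000])
```

Note that par only creates a spark; whether it is actually evaluated in parallel is up to the runtime system.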
You should submit your code using the Fire system. This is for fun, so you may form any group you like. We suggest that a single person submits the code (with comments about who has worked on it).
The submitted entries and John's slides giving the results are available in this zipped file.
On Simon Marlow's home page, you can find notes and slides from his CEFP course on Parallel and Concurrent Programming in Haskell. The notes give a good explanation of why the topics of this course are interesting. They also make the same distinction between concurrency and parallelism as that made in this course. Later in the course, Simon himself will come and tell us about his work on the Par Monad (http://community.haskell.org/~simonmar/par-tutorial.pdf).
This lecture considers par and pseq more critically, and concludes that it might be a good idea to separate the control of behaviour relating to parallelisation from the description of the algorithm itself. The idea of Strategies is described in a well-known paper called Algorithms + Strategies = Parallelism by Trinder, Hammond, Loidl and Peyton Jones. More recently, Marlow and some of the original authors have updated the idea, in Seq no more: Better Strategies for Parallel Haskell. We expect you to read both of these papers. The lecture is based on the newer paper.
See above for papers
The documentation of the Strategies Library is very helpful.
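To make the separation between algorithm and parallelisation concrete, here is a minimal sketch in the style of the newer paper, assuming the parallel package (the names expensive and results are our own illustration):

```haskell
import Control.Parallel.Strategies (using, parList, rdeepseq)

-- The algorithm: a plain map, with no mention of parallelism.
expensive :: Integer -> Integer
expensive n = sum [1 .. n]

-- The strategy: evaluate the list elements in parallel, each to
-- normal form. Swapping in a different strategy (or none at all)
-- never changes the value, only the dynamic behaviour.
results :: [Integer]
results = map expensive [1000, 2000 .. 20000] `using` parList rdeepseq

main :: IO ()
main = print (sum results)
```

The point of `using` is exactly the separation the lecture argues for: the left-hand side says what to compute, the strategy on the right says how to evaluate it.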
Andres expanded on some of the topics from the first two lectures, emphasising the need to think about lazy evaluation (and degrees of evaluation) when parallelising programs. He showed some pitfalls of parallel programming through small examples, based on his experience of parallelising larger programs gained through his work on the Parallel GHC project. He went more deeply into how to use Threadscope to debug parallel Haskell programs, and into the associated GHC event system. (Well-Typed have made significant improvements to Threadscope and the event system recently.) He ran his examples on an 8-core machine in Leipzig while we looked on enviously.
One thing we hope to do at the end of the course is to compile a list of suggestions for useful improvements to Threadscope. So keep a note of your ideas along these lines.
You would be well advised to study the code and to try some of the exercises.
The lecture was an example of what we call "hearing it from the horse's mouth". It was about a new programming model for deterministic parallelism. Simon introduced the Par Monad, a monad for deterministic parallelism, and showed how I-structures are used to exchange information between parallel tasks (or "blobs"), see his Haskell'11 paper with Ryan Newton and Simon PJ. Simon showed some nice examples of how to program in this style. He showed an example of pipeline parallelism (using streams) where the dynamic behaviour is simply not possible to express using Strategies. (This example is very new and is not in the lecture notes mentioned below.) At the end of the lecture, he briefly showed the kmeans problem that is described in his lecture notes. Students began work on parallelising it, but someone else had booked the room for 17.00, so we all had to leave. Students on the course are advised to complete the parallelisation of the kmeans example (and Simon has provided a solution, which we may well return to in the next lecture).
The stream example is in code/euler35
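A minimal Par Monad sketch, assuming the monad-par package (parFib is our own illustrative example, not one of Simon's):

```haskell
import Control.Monad.Par (runPar, spawn, get)

-- Plain sequential Fibonacci, used as the work to parallelise.
fib :: Int -> Int
fib n | n < 2     = n
      | otherwise = fib (n - 1) + fib (n - 2)

-- Fork the two recursive calls as tasks whose results arrive
-- in IVars; get blocks until the corresponding task has finished.
-- runPar guarantees the whole computation is deterministic.
parFib :: Int -> Int
parFib n = runPar $ do
  a <- spawn (return (fib (n - 1)))
  b <- spawn (return (fib (n - 2)))
  x <- get a
  y <- get b
  return (x + y)

main :: IO ()
main = print (parFib 20)
```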
Relating par, strategies and the Par Monad. Shows the k-means example in detail. Also proposes some possible topics for student presentations. For reading and code, see the previous lecture. The lecture also contains another programming challenge.
Jost discussed skeletons as a means to structure parallel computations -- viewing skeletons as higher-order functions. He distinguished three types of skeletons: small-scale skeletons (like parMap), process communication topology skeletons, and proper algorithmic skeletons (like divide and conquer). He briefly discussed the Eden language. The standard way of using Eden requires a modified GHC runtime. Jost promises that it will soon be possible to download the necessary Eden modules and skeleton library from the Eden home page. It is also possible to use the Eden skeletons on top of a simulation of the modified runtime, built on Concurrent Haskell -- thus using Eden as a library with an unmodified GHC runtime. This also provides decent speed-ups. More details on this shortly.
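The idea of a skeleton as a higher-order function can be sketched in plain Haskell. This sequential divide-and-conquer skeleton is our own illustration (not Eden's actual API); a parallel version would replace the recursive calls with sparks or Eden processes:

```haskell
-- A divide-and-conquer skeleton: the control structure is captured
-- once, and each algorithm supplies only its problem-specific parts.
divConq :: (p -> Bool)       -- is the problem trivial?
        -> (p -> s)          -- solve a trivial problem directly
        -> (p -> [p])        -- divide into subproblems
        -> (p -> [s] -> s)   -- combine the sub-solutions
        -> p -> s
divConq trivial solve divide combine = go
  where
    go p
      | trivial p = solve p
      | otherwise = combine p (map go (divide p))

-- Instantiating the skeleton: mergesort.
msort :: [Int] -> [Int]
msort = divConq (\xs -> length xs <= 1) id halves (const merge2)
  where
    halves xs = let (a, b) = splitAt (length xs `div` 2) xs in [a, b]
    merge2 [a, b] = merge a b
    merge2 ss     = concat ss
    merge [] ys = ys
    merge xs [] = xs
    merge (x:xs) (y:ys)
      | x <= y    = x : merge xs (y:ys)
      | otherwise = y : merge (x:xs) ys

main :: IO ()
main = print (msort [5, 3, 8, 1, 9, 2])
```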
This lecture is all about Guy Blelloch's seminal work on the NESL programming language and on parallel functional algorithms and associated cost models. The best introduction is to watch the video of his marvellous invited talk at ICFP 2010, which John and Mary had the pleasure to attend. There are pages about NESL and about his publications in general. For the notions of work and depth, see this part of the 1996 CACM paper, and also this page, which considers work and depth for three algorithms.
Data parallel programming using the Repa library, which gives flat data parallelism. The main source is the Repa paper from ICFP 2010. Thanks to Ben Lippmeier for letting me borrow some slides.
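A small sketch of the flat data-parallel style, assuming the Repa 3 API (the example itself is ours, not from the paper): a regular unboxed array, a whole-array operation, and a parallel computation of the result.

```haskell
import Data.Array.Repa as R

main :: IO ()
main = do
  -- A one-dimensional unboxed array of ten Doubles.
  let xs = fromListUnboxed (Z :. (10 :: Int)) [0 .. 9 :: Double]
  -- R.map builds a delayed array; computeP evaluates it in
  -- parallel across the elements (flat data parallelism).
  ys <- computeP (R.map (* 2) xs) :: IO (Array U DIM1 Double)
  print (toList ys)
```

The shape type (here DIM1) is part of the array's type, which is how Repa keeps operations over regular arrays both safe and efficient.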
This lecture introduced Erlang for Haskell programmers, taking parallelising quicksort as an example, both within one Erlang VM and distributed across a network. The latest version of the Erlang system can be downloaded from here. There is a Windows installer. Many Linux distributions have an Erlang package available, but not necessarily one suitable for development of Erlang code, and not necessarily the latest version. On Ubuntu, try
sudo apt-get install erlang-dev
If that doesn't work or you can't find an appropriate package, build the VM from source.
This lecture focussed on the fault-tolerance constructs in Erlang (links and system processes) and the motivation for the "Let It Crash" philosophy. It introduced supervision trees and the Open Telecom Platform, and developed a simple generic server.
Scalable parallel programming in Erlang demands dividing tasks into sufficiently many processes, which can run in parallel, while avoiding heavy sequential parts, such as the last step of a divide-and-conquer algorithm, which is often the most expensive and runs in parallel with nothing. But even then, congestion for shared resources can spoil performance. The lecture discussed ways of reducing congestion, both at the Erlang source level and in the virtual machine, for example by replacing one resource shared by n processes with n^2 resources each shared by just two. "Invisible" shared resources, such as the scheduler queue(s) and the L3 cache, can hit performance badly, so even Erlang programmers need to be aware of architectural limitations such as cache sizes.
Google's Map-Reduce framework has become a popular approach for processing very large datasets in distributed clusters. Although originally implemented in C++, its connections with functional programming are close: the original inspiration came from the map and reduce functions in LISP; MapReduce is a higher-order function for distributed computing; purely functional behaviour of mappers and reducers is exploited for fault tolerance; it is ideal for implementation in Erlang. This lecture explains what Map-Reduce is, shows a sequential and a simple parallel implementation in Erlang, and discusses the refinements needed to make it work in reality.
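The sequential implementation can be sketched in Haskell rather than Erlang, to keep one language for the examples on this page; mapReduce and wordCount are our own illustrative names, not the lecture's code:

```haskell
import qualified Data.Map as Map

-- A sequential Map-Reduce skeleton: the mapper turns each input
-- record into key/value pairs, the pairs are grouped by key, and
-- the reducer folds each group into a final value.
mapReduce :: Ord k2
          => (k1 -> v1 -> [(k2, v2)])  -- mapper
          -> (k2 -> [v2] -> v3)        -- reducer
          -> [(k1, v1)] -> Map.Map k2 v3
mapReduce mapper reducer input = Map.mapWithKey reducer grouped
  where
    pairs   = concatMap (uncurry mapper) input
    grouped = Map.fromListWith (++) [ (k, [v]) | (k, v) <- pairs ]

-- Word count, the canonical Map-Reduce example.
wordCount :: [(FilePath, String)] -> Map.Map String Int
wordCount = mapReduce (\_ text -> [ (w, 1) | w <- words text ])
                      (\_ ones -> sum ones)

main :: IO ()
main = print (Map.toList (wordCount [("doc1", "a b a"), ("doc2", "b")]))
```

Because the mapper and reducer are pure, a failed worker can simply be re-run on the same input, which is exactly the property the framework exploits for fault tolerance.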
Lennart told us about how functional programming is used in the investment banking part of Standard Chartered. He explained how many of the pricing and risk analysis problems that demand heavy computation are embarrassingly parallel, so that a form of pmap is just about the only way that is used to express parallelism. A strategy parameter determines whether the resulting computation is run on multiple threads or processes on a local machine, or is sent off to a grid. The grid computations must be pure, and Lennart stressed the usefulness of the type system of either Mu (a strict version of Haskell) or Haskell in enforcing this. He emphasised that putting the Quant library, Context (based on the financial contracts paper by Peyton Jones, Eber and Seward), into practical use at many sites around the world involved a lot of hard engineering work, both to serialise data and functions, and to cope with the fact that different sites may run different versions of the library, on different architectures. Along the way, Lennart mentioned that it is well known that some programmers are ten times more productive than others, and pointed out that such programmers can, in fact, get paid ten times as much if they choose the right employer :)
Richard told us why Erlang is a good fit for Klarna, emphasizing that though Erlang's performance can, of course, be beaten, it lets you get close enough to the best possible performance in a very short time. He talked about designing for parallelism, for example splitting shared resources to reduce contention. Databases in parallel distributed systems bring consistency problems, and Richard explained the famous CAP-theorem. Finally he mentioned that Klarna are always hiring!
The lecture presented some of our thoughts on the course so far, what we learned, our intentions with the exam, and pointers to ways to keeping up with developments. We stressed that we can help students who want to consider a masters thesis project related to this course, either at a company or in the Functional Programming group. (We have just appointed four assistant professors in the group, so we have plenty of supervision capacity!) We had a long and productive discussion about the course. Many thanks to all students who contributed. Note also that you can still give suggestions for improvement, or any ideas you come upon, to your course reps, Emil Falk, Dan Rosen and Michal Palka. (We have a meeting planned for May 25.)
The winners of the prize for best Repa tutorial were Marcus Lönnberg and Karl Schmidt. Congratulations to them! Here is their tutorial. Nikita picked out nine candidates for the prize, and Mary was struck by the creativity of our students! (Next year, we will have more tutorial topics to choose from, and probably larger groups to work on them.)
Cloud Haskell is a framework for writing distributed programs in Haskell. It's divided into two layers:
The process layer - This layer, as we will see, has much in common with Erlang. Here, we can spawn processes, send and receive messages, and so on.
The task layer - A more abstract, higher-level interface where we need not bother about the details. The framework takes care of failures and communication for us.
We will start by looking at the process layer whilst comparing its constructs to their Erlang counterparts. I will show some demos along the way (hopefully I can do this in a distributed environment). I will then move on to show the interface to the task layer and a small demo of it.
Parsing of the input data is a component of most computer programs. If one wishes to obtain sublinear parallel running time, the parsing phase must also be parallelised. We will construct a parallel parsing algorithm, and we will see that sublinear running time is achievable only if the input satisfies certain conditions, which we will formulate. These conditions help us understand fundamental limits of parallel computation.
- Beginner: CYK Algorithm