Aspects
We want applications to run faster when we have multicore CPUs
Is the code going to run faster just by having more cores?
No!
It might, if the program has many independent (non-interfering) processes and no sequential bottlenecks
Erlang uses all the cores on an SMP machine by default
Symmetric MultiProcessing
erl -smp +S N
The number N indicates how many schedulers (the threads that run the Erlang virtual machine) to start
Useful to test how the program behaves with different numbers of cores
General guidelines
Use lots of processes
Avoid side effects
Avoid sequential bottlenecks
Write “small messages, big computations” code.
map
map(_, []) -> [];
map(F, [H|T]) -> [F(H) | map(F, T)].
pmap(F, Xs) ->
    S = self(),
    Ref = make_ref(),
    Pids = map(fun(X) ->
                   spawn(fun() -> S ! {self(), Ref, F(X)} end)
               end, Xs),
    gather(Pids, Ref).

gather([Pid|T], Ref) ->
    receive
        {Pid, Ref, Ret} -> [Ret | gather(T, Ref)]
    end;
gather([], _) ->
    [].
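Packaged as a self-contained module (the module name pmap_demo is mine), the definitions above can be run directly:

```erlang
-module(pmap_demo).
-export([pmap/2]).

map(_, []) -> [];
map(F, [H|T]) -> [F(H) | map(F, T)].

%% Spawn one process per element; each sends its result back
%% tagged with its own pid and a unique reference.
pmap(F, Xs) ->
    S = self(),
    Ref = make_ref(),
    Pids = map(fun(X) ->
                   spawn(fun() -> S ! {self(), Ref, F(X)} end)
               end, Xs),
    gather(Pids, Ref).

%% Receive the results in the order the processes were spawned,
%% so the result list keeps the order of the input list.
gather([Pid|T], Ref) ->
    receive
        {Pid, Ref, Ret} -> [Ret | gather(T, Ref)]
    end;
gather([], _) ->
    [].
```

For example, pmap_demo:pmap(fun(X) -> X*X end, [1,2,3]) returns [1,4,9]: gather waits for the pids in spawn order, so the output keeps the order of the input even though the processes may finish in any order.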
map(F,L) =:= pmap(F,L)
Order of elements?
Side-effects?
What if the list is very big and the computation F very small?
What would happen if you used pmap with a list of 1,000,000,000 elements? (It would try to spawn one process per element.)
Sometimes, it is necessary to impose some resource limits
Parallel processes, open files, etc.
Gives stability to the system
The workers model is designed for that goal
We have tasks (computations) divided among a number of workers
Workers can be active or passive.
There is a server, called a pool, that keeps track of the tasks to be performed and has a fixed number of workers willing to take those tasks.
Server behavior
The initial state of the server is a queue of tasks and a list of passive workers
A worker can take more than a single task
An active worker becomes passive after finishing with the assigned task(s)
A passive worker becomes active when it is assigned one or more tasks.
The server finishes execution when the queue of tasks is empty and there are no active workers
The server waits for a worker to return a result when the queue is empty or there are no more passive workers
When the task queue is not empty, the server takes a passive worker and assigns it a chunk of tasks to perform, i.e., the worker becomes active
worker(Compute) ->
    spawn(fun() -> worker_body(Compute) end).

worker_body(Compute) ->
    receive
        {Pid, Tasks} ->
            Result = Compute(Tasks),
            Pid ! {self(), Result},
            worker_body(Compute)
    end.
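A minimal sketch of interacting with such a worker (the module name worker_demo and the squaring computation are my own illustration):

```erlang
-module(worker_demo).
-export([demo/0]).

worker(Compute) ->
    spawn(fun() -> worker_body(Compute) end).

worker_body(Compute) ->
    receive
        {Pid, Tasks} ->
            Result = Compute(Tasks),
            Pid ! {self(), Result},
            worker_body(Compute)
    end.

%% Hand a chunk of tasks to a worker and wait for its answer.
%% The worker tags its reply with its own pid, so the caller can
%% match on the pid of the worker it is waiting for.
demo() ->
    W = worker(fun(Xs) -> [X * X || X <- Xs] end),
    W ! {self(), [1, 2, 3]},
    receive
        {W, Result} -> Result
    end.
```

Here worker_demo:demo() returns [1,4,9]; after replying, the worker loops and is ready for the next chunk of tasks.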
Implementation of the server
-record(st, {tasks, aworkers, pworkers, get, combine}).

%% Done
work_load(#st{tasks = [], aworkers = []}, Results) ->
    Results;
%% There are tasks to give to a passive worker
work_load(St = #st{tasks = [Task | Tasks],
                   pworkers = [PWorker | PWorkers],
                   aworkers = AWorkers,
                   get = Get}, Results) ->
    {Chunk, TTasks} = Get([Task | Tasks]),
    PWorker ! {self(), Chunk},
    work_load(St#st{tasks = TTasks,
                    pworkers = PWorkers,
                    aworkers = [PWorker | AWorkers]}, Results);
%% No more passive workers or empty tasks, then
%% wait for results
work_load(St = #st{pworkers = PWorkers,
                   aworkers = AWorkers,
                   combine = Combine}, Results) ->
    receive
        {Worker, Result} ->
            work_load(St#st{pworkers = [Worker | PWorkers],
                            aworkers = lists:delete(Worker, AWorkers)},
                      Combine(Result, Results))
    end.
pmap using workers
Limit the resources to two workers only
Tasks are the elements in the list, i.e., [X1,X2,X3]
has three tasks.
The computation of the worker is just to apply F to its tasks, e.g., F(X).
Getting a task is simply taking the first task
get_pmap([X | Xs]) -> {X, Xs}.
Combining a new result simply adds it to the list of computed results
combine_pmap(R, Rs) -> [R | Rs].
The initial result is the empty list
pmap with two workers
pmap(F, Xs) ->
    W1 = worker(F),
    W2 = worker(F),
    start(Xs, [W1, W2], fun get_pmap/1, fun combine_pmap/2, []).
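The start/5 function used here is not defined in the slides; the sketch below (the module name pool_demo and the start/5 wiring are my own) assembles the record, the server loop, and the workers into one runnable module:

```erlang
-module(pool_demo).
-export([pmap/2]).

-record(st, {tasks, aworkers, pworkers, get, combine}).

worker(Compute) ->
    spawn(fun() -> worker_body(Compute) end).

worker_body(Compute) ->
    receive
        {Pid, Tasks} ->
            Pid ! {self(), Compute(Tasks)},
            worker_body(Compute)
    end.

%% A chunk is just the first task in the queue.
get_pmap([X | Xs]) -> {X, Xs}.

%% Combining a result just prepends it to the accumulator.
combine_pmap(R, Rs) -> [R | Rs].

%% Done: no pending tasks and no active workers.
work_load(#st{tasks = [], aworkers = []}, Results) ->
    Results;
%% A task and a passive worker are available: hand over a chunk.
work_load(St = #st{tasks = [Task | Tasks],
                   pworkers = [PWorker | PWorkers],
                   aworkers = AWorkers,
                   get = Get}, Results) ->
    {Chunk, TTasks} = Get([Task | Tasks]),
    PWorker ! {self(), Chunk},
    work_load(St#st{tasks = TTasks,
                    pworkers = PWorkers,
                    aworkers = [PWorker | AWorkers]}, Results);
%% Otherwise wait for an active worker to return a result.
work_load(St = #st{pworkers = PWorkers,
                   aworkers = AWorkers,
                   combine = Combine}, Results) ->
    receive
        {Worker, Result} ->
            work_load(St#st{pworkers = [Worker | PWorkers],
                            aworkers = lists:delete(Worker, AWorkers)},
                      Combine(Result, Results))
    end.

%% Sketch of start/5: build the initial state and run the server loop.
start(Tasks, Workers, Get, Combine, Init) ->
    work_load(#st{tasks = Tasks, pworkers = Workers, aworkers = [],
                  get = Get, combine = Combine}, Init).

pmap(F, Xs) ->
    W1 = worker(F),
    W2 = worker(F),
    start(Xs, [W1, W2], fun get_pmap/1, fun combine_pmap/2, []).
```

Unlike the earlier pmap, results are combined in completion order, so the output may be a permutation of map(F, Xs). Note also that the workers are never stopped here; a real implementation would shut them down when the pool finishes.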
Lock-based programming is difficult
There are many potential problems:
Deadlock
Starvation
Non-compositionality
Is there some way to eliminate at least some of these problems?
Lock-based programming does not compose
Suppose you have two thread-safe buffers and you want to atomically take an element from one of them and put it in the other
class Buffer<Elem> {
    Elem get() { ... }
    void put(Elem e) { ... }
}
A not so nice solution
Expose the locks of the buffers
class Buffer<Elem> {
    void acquireLock();
    void releaseLock();
    Elem get();
    void put(Elem e);
}
Lock both buffers before moving the element
class TwoBuffer<Elem> {
    private Buffer<Elem> b1;
    private Buffer<Elem> b2;

    void copy_elem() {
        b1.acquireLock();
        b2.acquireLock();
        b2.put(b1.get());
        b2.releaseLock();
        b1.releaseLock();
    }
}
It reduces opportunities for concurrency
It breaks abstraction!
What if you need to involve 3 buffers?
The number of locks grows as we compose algorithms
Increases the risk of programming errors
Lock-based synchronization can be seen as pessimistic concurrency: ”We always assume that we need mutual exclusion”
Another option would be optimistic concurrency
Software transactional memory (STM): a concept that allows easy lock-free programming and optimistic concurrency
Although the programming model is lock-free, implementations may use locks internally
Transactions: standard database concept
A group of operations should execute atomically, or not at all
One possible implementation of transactions
When writing to variables, do not actually modify them; instead, the system keeps a log of all the reads and writes that are made
When the transaction is done the system checks that the read variables still have the same value as in the beginning of the transaction
If that is the case, make the changes permanent (known as commit)
Otherwise, rerun the transaction (known as rollback or retry)
To detect whether a variable has changed, each variable in the transaction carries a version number, incremented on every commit
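This versioning scheme can be sketched in Erlang itself (all names here, such as tvar_demo and atomically/2, are my own illustration, not a real STM implementation): a process holds a value together with its version, and a commit succeeds only if the version seen at read time is still current.

```erlang
-module(tvar_demo).
-export([demo/0]).

%% A "transactional variable": a process holding {Value, Version}.
new_tvar(Value) ->
    spawn(fun() -> tvar_loop(Value, 0) end).

tvar_loop(Value, Version) ->
    receive
        {read, From} ->
            From ! {ok, Value, Version},
            tvar_loop(Value, Version);
        {commit, From, NewValue, Seen} when Seen =:= Version ->
            From ! commit_ok,          %% no conflicting write: commit
            tvar_loop(NewValue, Version + 1);
        {commit, From, _NewValue, _Stale} ->
            From ! commit_failed,      %% variable changed: caller retries
            tvar_loop(Value, Version)
    end.

read(TVar) ->
    TVar ! {read, self()},
    receive {ok, Value, Version} -> {Value, Version} end.

%% Atomically apply F to the variable, rerunning on conflict
%% (the "rollback or retry" case above).
atomically(TVar, F) ->
    {Value, Version} = read(TVar),
    TVar ! {commit, self(), F(Value), Version},
    receive
        commit_ok     -> ok;
        commit_failed -> atomically(TVar, F)
    end.

demo() ->
    X = new_tvar(0),
    ok = atomically(X, fun(V) -> V + 1 end),
    ok = atomically(X, fun(V) -> V + 1 end),
    {V, _} = read(X),
    V.
```

Here tvar_demo:demo() returns 2: both increments commit because their version numbers are still current; a concurrent writer would bump the version and force the later transaction to rerun.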
Example
We have two processes with two different transactions.
Now, both transactions read their corresponding variables. Each transaction recalls the version number of the read variables.
The transaction on the left writes into variable x first; the transaction on the right follows, but it fails (Why?) and retries.
The transaction on the right retries.
At the time of writing, it succeeds (Why?).
Benefits of transactions:
Many processes can be in the critical section at the same time
More parallelism
They only need to rerun if there is an actual runtime conflict
Deadlocks cannot occur
Easy to compose
Drawbacks of transactions:
Cannot guarantee fairness
All the bookkeeping can be expensive
STM is still a subject of research!