Critical section
Summary: Last time
Introduction to concurrency
Threads in Java
The shared update problem
Today
Introduction to semaphores
Peterson's algorithm to solve the shared update problem
Java locks
Lab 1
Questions/comments
What is 'atomic' and the opposite of 'atomic'?
How could the example give less than 100k as result?
Should 'concurrent programming' be called 'samtida programmering'?
A bit slow, but clear and understandable.
The shared update problem
Review the counter example
Semaphore
A semaphore has two operations: wait() (also: P) and signal() (also: V)
A semaphore has a number of permits (binary semaphore: one permit)
The wait() operation removes a permit from the semaphore. If the semaphore has 0 permits, the operation waits until another thread returns a permit. (Operation names in standard Java semaphores are different.)
The signal() operation puts back a permit.
Invented in the mid-60s
- Edsger Wybe Dijkstra [1930–2002]
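As noted above, the standard Java semaphore uses different operation names: acquire() corresponds to wait()/P and release() to signal()/V. A minimal sketch (class name is ours):

```java
import java.util.concurrent.Semaphore;

public class SemaphoreDemo {
    public static void main(String[] args) throws InterruptedException {
        // A binary semaphore: one permit.
        Semaphore s = new Semaphore(1);
        s.acquire();                              // wait()/P: take the permit
        System.out.println(s.availablePermits()); // 0 while the permit is held
        s.release();                              // signal()/V: put it back
        System.out.println(s.availablePermits()); // 1 again
    }
}
```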
Using a semaphore
Semaphore s;

public void run() {
    for (int i = 0; i < rounds; i++) {
        s.wait();
        int tmp = counter;
        counter = tmp + 1;
        s.signal();
    }
}
An example execution, alternating thread1 and thread2:

counter = 0
thread1: 5. s.wait()  6. tmp = counter (reads 0)  7. counter = 1  8. s.signal()
counter = 1
thread2: 5. s.wait()  6. tmp = counter (reads 1)  7. counter = 2  8. s.signal()
counter = 2
thread1: 5. s.wait()  6. tmp = counter (reads 2)  7. counter = 3  8. s.signal()
counter = 3
thread2: 5. s.wait()  6. tmp = counter (reads 3)  7. counter = 4  8. s.signal()
counter = 4
How to convince yourself that the counter is now correct?
Things to come
We are going to show how to implement a semaphore using shared variables (reads and writes)
It is an academic exercise, and a classic result
It demonstrates a few important things
- What important properties we need to care about
- How to show these properties
- Showing how to build one thing from another is a common technique in Computer Science for gaining understanding
- A lot of practical results are shown in similar ways, so they will look familiar once you know this
Preliminaries
This part follows chapter 3 of the textbook
Instead of using the pseudocode, we will use Java, but as pseudocode: the code will have the same meaning as in the textbook, only with Java syntax
Limited Critical Reference
- A statement has a Critical Reference if it (a) writes to a shared variable that may be read by another thread, or (b) it reads from a shared variable that may be written to by another thread.
Limited Critical Reference is satisfied if every statement has at most one Critical Reference (see Def. 2.7 in textbook).
Examples:
- c = c + 1 has two critical references if c can be read and written to by other threads
- a = a + b has 3 critical references if a and b can be read and written to by other threads
- c = tmp + 1 has 1 critical reference if c can be read by other threads, while tmp is local

LCR is a convenient trick that lets us use larger statements instead of listing individual instructions
It also lets us abstract from the exact code the compiler will generate for a given statement
await statement
await(E) is a statement that waits until expression E becomes true
await(E) can be implemented using busy waiting: while(!E);
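In Java, a busy-waiting await over a shared condition only works if the condition variable is volatile, so that the spinning thread actually observes the writer's update; a small sketch (names are ours):

```java
public class BusyWaitAwait {
    // volatile so the spinning thread observes the writer's update
    private static volatile boolean ready = false;

    public static void main(String[] args) throws InterruptedException {
        Thread waiter = new Thread(() -> {
            while (!ready) ;              // await(ready) as a busy-wait loop
            System.out.println("proceeding");
        });
        waiter.start();
        Thread.sleep(50);                 // let the waiter spin for a moment
        ready = true;                     // makes the condition true
        waiter.join();
    }
}
```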
Implementing semaphores using await
We will show how to solve a model problem for semaphores: the mutual exclusion problem (also known as the critical section problem)
It requires implementing a simplified variant of a semaphore and allows for easier analysis
The problem: there are N threads (N=2 for us), and each one is executing an infinite loop of the following form:
// thread code
while (true) {
    Non-critical section
    Entry protocol
    Critical section
    Exit protocol
    Non-critical section
}
Non-critical section is code that can be executed independently by the threads
Critical section is code that should be executed only by one thread at a time
Critical section never hangs (always terminates)
Protocols use a separate set of variables from the rest of the code
Informally: the entry and exit protocols ensure that the critical section is executed by at most one thread at a time, that there are no deadlocks, and that no thread is starved (the last point requires an additional assumption)
Formal requirements
Properties required from the implementation
Mutual exclusion (Mutex)
- At most one thread at a time is in its critical section (Critical section)

No deadlock/livelock
- If both processes attempt to enter their critical section, one will succeed (Entry protocol)

No starvation
- A process attempting to enter its critical section will eventually succeed (Entry protocol)
Additional assumption: the scheduler is weakly fair
- A process waiting to execute await(B), where B is continually (repeatedly) true, will eventually get the processor
Safety/liveness properties
Safety: the program is always in a good state (mutual exclusion)
Liveness: starting at any time, the program eventually progresses (no deadlock/livelock, no starvation)
Attempt 1
- Use a variable turn to indicate who may enter next

int turn = 0; // Start with turn = 0

// thread 1
while (true) {
    // Non-critical section
    await (turn == 0);
    // Critical section
    turn = 1;
}

// thread 2
while (true) {
    // Non-critical section
    await (turn == 1);
    // Critical section
    turn = 0;
}
Mutex
- Ok
No deadlock
- Ok
No starvation
- What if the non-critical section does not terminate? Then turn never changes, and the other thread can wait forever.
Showing the mutual exclusion property
Create the execution–state graph (enumerate all states)
Too big? Try to limit the states
Too big? Try using an invariant
Invariants
Invariant: a condition that holds for all reachable states
Can be shown to hold for the next state if it holds for the current state
Gives us information about the reachable states
Mathematical proof
- See course Software engineering using Formal methods
Attempt 2
Use a flag to indicate who has entered

boolean flag[] = {false, false};

// thread 1
while (true) {
    // Non-critical section
    await (!flag[1]);
    flag[0] = true;
    // Critical section
    flag[0] = false;
}

// thread 2
while (true) {
    // Non-critical section
    await (!flag[0]);
    flag[1] = true;
    // Critical section
    flag[1] = false;
}
Mutex
- No! Initially, both threads can simultaneously execute await(!flag[x]), and since flag[x] is false, both threads execute flag[i] = true;

No deadlock
- Ok

No starvation
- Ok
Attempt 3
Use a flag to indicate who wants to enter

boolean flag[] = {false, false};

// thread 1
while (true) {
    // Non-critical section
    flag[0] = true;
    await (!flag[1]);
    // Critical section
    flag[0] = false;
}

// thread 2
while (true) {
    // Non-critical section
    flag[1] = true;
    await (!flag[0]);
    // Critical section
    flag[1] = false;
}
Mutex
- Ok

No deadlock
- It can happen! At the beginning, both threads can execute flag[i] = true followed by await (!flag[x]), which makes them stuck in the loop.

No starvation
- Ok
Peterson's algorithm
flag + turn: I want to enter, after you

int turn = 0;
boolean flag[] = {false, false};

// thread 1
while (true) {
    flag[0] = true;
    turn = 1;
    await (!flag[1] || turn == 0);
    // Critical section
    flag[0] = false;
}

// thread 2
while (true) {
    flag[1] = true;
    turn = 0;
    await (!flag[0] || turn == 1);
    // Critical section
    flag[1] = false;
}
Mutex
- Ok
No deadlock
- Ok
No starvation
- Ok
Showing properties
Create the execution–state graph (enumerate all states)
Safety properties: invariants
Liveness properties: more complex
Remarks
Version for more threads is more complicated
These solutions are already interesting
Testing is difficult
Other low-level primitives
test-and-set (note: int ref denotes a reference to int)

static void test-and-set(int ref common, int local) {
    < local = common;
      common = 1; >
}

compare-and-swap (CAS): used on x86

static int compare-and-swap(int ref common, int old, int new) {
    < int temp = common;
      if (common == old) common = new;
      return temp; >
}
Load-link/store-conditional (LL/SC): used on ARM, MIPS, PowerPC
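In Java, these primitives are exposed through the atomic classes; a test-and-set style spinlock can be sketched with AtomicBoolean.compareAndSet (the SpinLock class name is ours):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // compare-and-swap: atomically flip false -> true, spin otherwise
        while (!locked.compareAndSet(false, true)) ;
    }

    public void unlock() {
        locked.set(false);
    }

    public static void main(String[] args) throws InterruptedException {
        SpinLock lock = new SpinLock();
        int[] counter = {0};
        Runnable body = () -> {
            for (int i = 0; i < 100_000; i++) {
                lock.lock();
                counter[0]++;            // critical section
                lock.unlock();
            }
        };
        Thread t1 = new Thread(body), t2 = new Thread(body);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter[0]);  // 200000: no lost updates
    }
}
```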
Right for the job?
As a pure software solution to the problem
These algorithms are not practical
They all contain a busy-wait loop
while (B) ;
Consumes a great deal of processor resources and is very inefficient
As it is, it does not work in Java! (more on this later)
But often useful in low-level programming
OS
Embedded devices
Beyond busy waiting
A more suitable solution would be as follows:
Entry protocol: if Critical section is busy then sleep, otherwise enter
Exit protocol: if there are sleeping processes, wake one, otherwise mark the critical section as not busy
Semaphores typically support this solution
- and more
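This sleep/wake idea can be sketched with Java's built-in monitor methods wait() and notify(); the class name and structure here are ours, not the textbook's:

```java
public class SleepingLock {
    private boolean busy = false;

    // Entry protocol: sleep while the critical section is busy
    public synchronized void enter() throws InterruptedException {
        while (busy) wait();   // wait() releases the monitor while sleeping
        busy = true;
    }

    // Exit protocol: mark not busy and wake one sleeping thread, if any
    public synchronized void exit() {
        busy = false;
        notify();
    }

    public static void main(String[] args) throws InterruptedException {
        SleepingLock lock = new SleepingLock();
        int[] counter = {0};
        Runnable body = () -> {
            try {
                for (int i = 0; i < 100_000; i++) {
                    lock.enter();
                    counter[0]++;        // critical section
                    lock.exit();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread t1 = new Thread(body), t2 = new Thread(body);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter[0]);  // 200000
    }
}
```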
Peterson's algorithm in Java
The specification of the JVM does not guarantee that Peterson's algorithm will work (Check this post explaining why)
Java memory model
Happens-before relationship
int x, y, z, w;
x = 0;
y = 5;
z = x;
w = y;
There are no guarantees regarding updates on shared-variables
Local variables fulfill the happens-before relationship
A stronger relationship is needed for concurrency: synchronizes-with
- Requires using the volatile keyword

Making an array volatile means that updates to the array reference are visible to other threads, not updates to the elements themselves!
In Java, you cannot make arrays with volatile elements (Check this post)
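A common workaround (our sketch, not from the textbook) is AtomicIntegerArray, whose element reads and writes have volatile semantics; combined with a volatile turn, this gives a Peterson variant for two threads that the Java memory model does allow:

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

public class Peterson {
    // Elements of AtomicIntegerArray have volatile read/write semantics,
    // unlike the elements of a volatile boolean[].
    private final AtomicIntegerArray flag = new AtomicIntegerArray(2); // 0=false, 1=true
    private volatile int turn = 0;

    public void enter(int me) {
        int other = 1 - me;
        flag.set(me, 1);                                 // I want to enter
        turn = other;                                    // after you
        while (flag.get(other) == 1 && turn == other) ;  // await(...)
    }

    public void exit(int me) {
        flag.set(me, 0);
    }

    public static void main(String[] args) throws InterruptedException {
        Peterson p = new Peterson();
        int[] counter = {0};
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) { p.enter(0); counter[0]++; p.exit(0); }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < 100_000; i++) { p.enter(1); counter[0]++; p.exit(1); }
        });
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter[0]);  // 200000
    }
}
```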
Semaphores in Java
Specification
class Semaphore {
    private int value;

    Semaphore(int init) { < value = init; > }

    acquire() { < await (value > 0); value = value - 1; > }

    release() { < value = value + 1; > }
}
Java has library support in java.util.concurrent
Semaphore mutex = new Semaphore(1);

public void run() {
    try {
        while (true) {
            // Non-critical section
            mutex.acquire();
            // Critical Section
            mutex.release();
            // Non-critical section
        }
    } catch (InterruptedException e) {}
}
Java built-in locks
A lock (binary semaphore) is created for every object in Java
To use this lock we employ the keyword
synchronized
class MutexCounter {
    private int counter = 0;

    public synchronized void increment() {
        counter++;
    }
}
Alternative to a synchronized method is a synchronized block
- Allows for more fine-grained locking
class MutexCounter {
    private int counter = 0;

    public void increment() {
        // lock this object
        synchronized (this) {
            counter++;
        }
    }
}
Locks
Each object has a lock
Each lock has a queue of waiting threads
The order of the queue is not specified (FIFO, LIFO, etc.)
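If a FIFO order is required, java.util.concurrent.locks.ReentrantLock can be constructed as a fair lock, which grants the lock to waiting threads in arrival order; a small sketch (the FairCounter class name is ours):

```java
import java.util.concurrent.locks.ReentrantLock;

public class FairCounter {
    // true requests a fair lock: waiting threads acquire in FIFO order
    private final ReentrantLock lock = new ReentrantLock(true);
    private int counter = 0;

    public void increment() {
        lock.lock();
        try {
            counter++;           // critical section
        } finally {
            lock.unlock();       // always release, even on exception
        }
    }

    public int get() { return counter; }

    public static void main(String[] args) throws InterruptedException {
        FairCounter c = new FairCounter();
        Runnable body = () -> { for (int i = 0; i < 100_000; i++) c.increment(); };
        Thread t1 = new Thread(body), t2 = new Thread(body);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(c.get());  // 200000
    }
}
```

Note that fairness costs throughput; the default (unfair) mode usually performs better when strict ordering is not needed.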
Question
If the Java compiler can rearrange the order of statements, can two calls to acquire()
be swapped?
Semaphore s1 = new Semaphore(1);
Semaphore s2 = new Semaphore(1);
...
s1.acquire();
s2.acquire();
Answer: No
From Chapter 17 of the Java Language Specification (17.4.3 and 17.4.4):
Among all the inter-thread actions performed by each thread t, the program order of t is a total order that reflects the order in which these actions would be performed according to the intra-thread semantics of t.
...
Every execution has a synchronization order. A synchronization order is a total order over all of the synchronization actions of an execution. For each thread t, the synchronization order of the synchronization actions (§17.4.2) in t is consistent with the program order (§17.4.3) of t.
Synchronization actions are volatile reads and writes, and locking and unlocking of
monitors associated with objects (synchronized
).
The documentation for Semaphore
does not state explicitly that acquire()
and release()
are synchronization actions. However, the intention clearly is that they are synchronization actions.
Documentation of the Lock interface gives more explicit assurances:
All Lock implementations must enforce the same memory synchronization semantics as provided by the built-in monitor lock, as described in section 17.4 of The Java™ Language Specification:
- A successful lock operation has the same memory synchronization effects as a successful Lock action.
- A successful unlock operation has the same memory synchronization effects as a successful Unlock action. Unsuccessful locking and unlocking operations, and reentrant locking/unlocking operations, do not require any memory synchronization effects.
However, Semaphore does not implement the Lock interface.
What prevents the compiler from reordering the acquire()
and
release()
calls? Let's take
a look at the code.
The operation that is used under the hood is compareAndSetState
(link),
which itself executes compareAndSwapInt
. The comment for compareAndSetState
states 'This operation has memory semantics of a volatile read and write.', which makes it a synchronization action that cannot be reordered.
Counter using Java semaphores
import java.util.concurrent.Semaphore;

class CounterS implements Runnable {
    private Semaphore s = new Semaphore(1);
    private int counter = 0;
    private final int rounds = 100000;

    public void run() {
        try {
            for (int i = 0; i < rounds; i++) {
                try {
                    s.acquire();
                    counter++;
                } finally {
                    s.release();
                }
            }
        } catch (InterruptedException e) {
            System.err.println("Thread interrupted");
            System.exit(-1);
        }
    }

    public static void main(String[] args) {
        try {
            CounterS c = new CounterS();
            // Create two threads that run our run() method.
            Thread t1 = new Thread(c, "thread1");
            Thread t2 = new Thread(c, "thread2");
            t1.start();
            t2.start();
            // Wait for the threads to finish.
            t1.join();
            t2.join();
            // Print the counter
            System.out.println(c.counter);
        } catch (InterruptedException e) {
            System.out.println("Main thread interrupted!");
            System.exit(-1);
        }
    }
}
Summary
Introduction to semaphores
Peterson's algorithm to solve the shared update problem
Locks