Formal Methods for Software Development TDA294/DIT271, LP1, HT2017

Lab 1 -- Verifying properties with Spin

The first lab consists of two parts.

Part 1 -- The Dining Philosophers

This assignment is about modeling the dining philosophers problem in Promela. A number of philosophers have gathered around a table to eat spaghetti. There are N philosophers around the table and N forks on the table. Philosophers are a very special breed: they need two forks to be able to eat, which means that they have to share forks with each other. When philosophers are not eating, they like to busy themselves with some thinking. The picture to the right illustrates five philosophers sitting around a table with five forks.

Our model will contain philosophers modeled as Promela processes (proctypes):

proctype phil(int id) {
  do
  :: /* ... */
  od
}

Notice how the process contains a do loop and thus runs indefinitely: philosophers really do nothing but eat and think. As a start, eating and thinking can be modeled by simply printing something, like this:

proctype phil(int id) {
  do
    :: printf("thinking\n");
       /* ... */
       printf("eating\n");
       /* ... */
  od
}

There are still a few /* ... */ placeholders to fill in with useful code, but at this stage it is already possible to instantiate some of the processes and let them run, producing some output.

#define NUM_PHIL 4

proctype phil(int id) {
  do
    :: printf("thinking\n");
       /* ... */
       printf("eating\n");
       /* ... */
  od
}

init  {
  int i = 0;

  do
  :: i >= NUM_PHIL -> break
  :: else -> run phil(i);
	     i++
  od
}

In the online web interface you find this Promela file under the course material link. Simulate it using Spin by clicking the Run button (specifying a maximum number of steps is recommended). If you are using the command line, type spin Philosophers.pml. The output will look something like the following (note that a different indentation signifies a different process):

thinking
     eating
     thinking
	 thinking
	 eating
	     thinking
     eating
	     eating
	 thinking
	 eating
	     thinking
	     eating
     thinking
	 thinking
     eating
	 eating
		 thinking
     thinking
		 eating
	     thinking
	 thinking
	     eating
     eating
	 eating
     thinking
	     thinking
		 thinking
     eating
	 thinking
		     thinking

While we can see the philosophers being active, it is still hard to tell whether they actually use any utensils.

Your first task is to implement the philosopher process. The process needs to acquire its two forks one by one -- that is, it may not pick up both forks at the same time, for example within the same atomic block. When both forks have been acquired, the philosopher eats and then puts the forks down again. The forks are a shared resource and should be modeled as such. The sample output after this step should look as follows:

		Philosopher 3 is thinking
    Philosopher 0 is thinking
	Philosopher 1 is thinking
	    Philosopher 2 is thinking
	Philosopher 1 is eating with forks 1 and 0
	    Philosopher 2 is eating with forks 2 and 1
	Philosopher 1 is thinking
	    Philosopher 2 is thinking
		Philosopher 3 is eating with forks 3 and 2
	Philosopher 1 is eating with forks 1 and 0
		Philosopher 3 is thinking
	Philosopher 1 is thinking
	    Philosopher 2 is eating with forks 2 and 1
		Philosopher 3 is eating with forks 3 and 2

Examining such an output by itself does not provide sufficient guarantees that the philosophers never grab the same fork at the same time. The next step of the assignment is to rewrite your program in such a way that a single fork cannot be grabbed by more than one philosopher at the same time.
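One possible way to model the forks, shown here only as a minimal sketch, is a shared array of booleans together with a guarded assignment that blocks until the fork is free. The names fork, left and right and the left-then-right acquisition order are choices made for this illustration, not requirements.

bool fork[NUM_PHIL];        /* fork[i] is true while fork i is held */

proctype phil(int id) {
  byte left  = id;
  byte right = (id + 1) % NUM_PHIL;
  do
    :: printf("thinking\n");
       atomic { !fork[left]  -> fork[left]  = true };  /* take one fork ...  */
       atomic { !fork[right] -> fork[right] = true };  /* ... then the other */
       printf("eating\n");
       fork[left]  = false;                            /* put both forks back */
       fork[right] = false
  od
}

Each atomic block tests and takes a single fork in one indivisible step, so no fork can be grabbed by two philosophers at once, while the two forks are still acquired one by one. Note that this naive scheme, where every philosopher first reaches for the fork with its own index, is precisely the classic situation in which all philosophers can end up holding one fork and waiting for the other -- something to keep in mind for problem (2) below.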

Consider the following problems:

  1. Use assertions in your code to verify this property about forks.
  2. Verify the absence of deadlocks in your model using Spin.
  3. What fairness assumptions do you need for starvation freedom to hold? That is, can you do without any additional assumptions, or do you need weak or strong fairness? An informal, but still rigorous, argument is sufficient here.
  4. Verify the property from (1) with an LTL formula instead of assertions (the syntax sketch after this list illustrates both mechanisms). Can your LTL-based solution catch bugs that are not covered by your assertion-based solution, or are the two equivalent in this respect?
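For reference, the fragment below illustrates only the Promela syntax of the two mechanisms from (1) and (4). The identifiers mutex_ok and forks_exclusive are hypothetical placeholders; which property you formalize, and where you check it, is part of the assignment.

bool mutex_ok = true;   /* hypothetical global flag maintained by the model */

/* inside a proctype: an assertion is evaluated exactly in the state in
   which it is executed, so place it where the property must hold */
assert(mutex_ok);

/* at the top level of the file: an inline LTL property is checked over
   all executions when the verifier is run */
ltl forks_exclusive { [] mutex_ok }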

It is important to remember that Promela code is just as much code as "usual" code written in mainstream languages such as Haskell or Java. Therefore it is as important as ever to follow sound software development principles when developing and structuring Promela code. In particular, meaningful variable names and proper indentation matter.

When building mathematical models of systems it is important that the end result is easy to follow and understand, because the model is a critical part of a program's correctness argument (which must be coherent). The same holds for properties formalized as assertions and LTL formulas. Unnecessarily complex models and formalized properties often hide bugs and other unexpected behaviors. Therefore a clean and simple system model and succinctly formalized properties are required to pass the lab.

In particular, you are not allowed to work around problems induced by overly complex models by giving Spin "help" flags that influence the compilation of the generated search program or its search procedure. Instead, your model should be simple enough that Spin can handle it without additional flags not discussed in the course, or other "hacks". (Such fine-tuning might be needed, and even essential, in other, more complex verification projects -- but for simple problems we should have simple models! In fact, avoiding accidental complexity is equally, if not more, important when approaching such verification initiatives, where the state space might become intractably large without succinctly expressed models. So this simple-models requirement is not an arbitrary constraint imposed for course-administrative reasons; rather, it reflects the necessities and expectations one faces in real-world verification projects.)

To clarify further: the requirements for formalizing properties are similar to those for constructing models. The correspondence between the informal property and its formalization should be as direct and natural as possible. As with models, this is to make sure that we understand what we are verifying.

Reporting

For Part 1 of the assignment we expect:

  1. A file Philosophers.pml which is your Promela model for the Dining Philosophers problem.
  2. A text report named Report_Philosophers.txt describing your solution to (1)-(4) including all relevant information on how to use your model.

Part 2 -- Needham-Schroeder protocol verification

Background -- security protocols

Most communication on the Internet and other networks is, as you may know, publicly visible to anyone listening in. In some situations a secret communication channel is preferable, for example when visiting the website of your bank or when sending your password to log in to Facebook.

Public Key Cryptography -- One way to do this is to use public key cryptography. Here each principal X on the network has published a public key pX, while keeping the corresponding secret key sX hidden from everybody else. Communication to X can now be encrypted under the public key pX and can only be decrypted using sX, i.e., the key that only X possesses. You can compare it to principal X handing everyone an identical padlock (pX) while keeping the single key (sX) that opens it: a message can now be placed in a box and locked (encryption), and only X can open the box (decryption).

Symmetric Key Encryption -- This is not a complete solution to the problem, though. Performing encryption and decryption with public key cryptography is computationally rather expensive. To establish a faster communication channel one can use symmetric key cryptography, where a pair of principals shares the same key; this gives faster encryption and decryption (somehow; for details you have to take a course in cryptography). Symmetric cryptography, however, has the problem that a network with n principals needs n(n-1)/2 unique shared keys, one for each pair of principals (for 100 principals that is already 4950 keys), so it is not perfect either.

One goal of security protocols is to get the best of both worlds. Using public key cryptography, two principals communicate two random numbers. If all steps in the protocol are executed correctly, it should be the case that

  1. Each principal has proven to be the person they claim to be (authenticity)
  2. The involved principals, and only they, know the communicated random numbers (confidentiality)

The communicated random numbers can now somehow be used to construct a symmetric (shared) key and, since no one else knows these numbers, further communication can be encrypted under this shared key.

The Needham-Schroeder protocol

In this assignment we have a look at a famous security protocol, the Needham-Schroeder protocol. It is well-known for being vulnerable to an attack on both confidentiality and authenticity, and in this assignment we aim to find this attack using model checking. The protocol is as follows:

1) X → Y: { X, NX }pY
2) Y → X: { NX, NY }pX
3) X → Y: { NY }pY

where

X → Y: M - Principal X sends message M to principal Y
{ M }pX - The encryption of message M under the public key of principal X
NP - Random number (nonce -- number used once) generated by principal P

That is, X sends Y a message consisting of X's identity and a nonce, encrypted under Y's public key. Y's reply consists of X's nonce, proving that Y owns the corresponding secret key, and Y's nonce, encrypted under X's public key. The nonces should now be known only to the two principals, and can be used to construct a shared symmetric key (confidentiality). Finally, X sends back Y's nonce, to prove that X has the right secret key (authenticity).

Modelling the network

Agents. We start with two agents, Alice and Bob, and consider a run of this protocol between them in isolation (with no other principals). We assume the existence of public keys keyA for Alice and keyB for Bob. The code below is a Promela model for Alice; it also contains the supporting declarations for modelling encryption and the network.

Nonces. In a real system the nonces used in the protocol would be random numbers, generated by their respective agents and used only once, for the purpose of establishing a common secret between two agents. Here, to keep the model simple and the state space small, nonces are modelled as the precomputed names nonceA and nonceB. Still, initially agent A knows only nonceA and agent B knows only nonceB. Any other agent, in particular the intruder, does not know any nonce a priori. Nonces can only be learned by receiving/intercepting messages that contain them and being able to decrypt these messages. Keep this in mind when modelling the intruder.

Messages. To represent encrypted messages we use a record Crypt that contains three entries: a key and two data entries (dubbed content1 and content2). With this representation, encryption is modelled as creating a value of type Crypt whose key entry is the public key of the receiver. Decryption, in turn, is only possible by matching the key entry against the key in the agent's possession.

Network. The network is modelled as a single message channel shared by all agents on the network. For simplicity, we use synchronous communication on the network, which is indicated by a buffer length of 0. While this does not affect the possible communication patterns, it helps to reduce the size of the model. A message on the network is a triple consisting of an identification tag (the message number), the intended receiver, and an "encrypted" message body.

Verification aids. For verification purposes we also introduce global variables partnerA, partnerB, statusA, statusB (ghost variables). The variable partnerA specifies who is Alice's intended partner. Similarly, partnerB indicates who is Bob's partner. In the beginning we explicitly set partnerA = agentB.

mtype = {ok, err, msg1, msg2, msg3, keyA, keyB, agentA, agentB,
	 nonceA, nonceB };

typedef Crypt { mtype key, content1, content2 };

chan network = [0] of {mtype, /* msg# */
		       mtype, /* receiver */
		       Crypt
};

/* global variables for verification*/
mtype partnerA, partnerB;
mtype statusA = err;
mtype statusB = err;

/* Agent (A)lice */
active proctype Alice() {
  /* local variables */

  mtype pkey;      /* the other agent's public key                 */
  mtype pnonce;    /* nonce that we receive from the other agent   */
  Crypt messageAB; /* our encrypted message to the other party     */
  Crypt data;      /* received encrypted message                   */


  /* Initialization  */

  partnerA = agentB;
  pkey     = keyB;

  /* Prepare the first message */

  messageAB.key = pkey;
  messageAB.content1 = agentA;
  messageAB.content2 = nonceA;

  /* Send the first message to the other party */

  network ! msg1 (partnerA, messageAB);

  /* Wait for an answer. Observe that we are pattern-matching on the
     messages that start with msg2 and agentA, that is, we block until 
     we see a message with values msg2 and agentA as the first and second  
     components. The third component is copied to the variable data. */

  network ? (msg2, agentA, data);

  /* We proceed only if the key field of the data matches keyA and the
     received nonce is the one that we have sent earlier; block
     otherwise.  */

  (data.key == keyA) && (data.content1 == nonceA);

  /* Obtain Bob's nonce */

  pnonce = data.content2;

  /* Prepare the last message */
  messageAB.key = pkey;
  messageAB.content1 = pnonce;
  messageAB.content2 = 0;  /* content2 is not used in the last message,
                              just set it to 0 */


  /* Send the prepared message */
  network ! msg3 (partnerA, messageAB);


  /* and last, update the auxiliary status variable */
  statusA = ok;
}

active proctype Bob() {
   printf("placeholder for Bob\n")
}

Task 1: Write a model for Bob

Write a model for Bob that communicates with Alice. In the official protocol, the principal identifier in the first message (X) is used by Bob to determine with whom he is communicating, but we simplify this: Bob always assumes that it is Alice who initiates the communication. Hence we always have:

partnerB = agentA;
pkey     = keyA;

Run the simulation with your implementation of Bob. As the last statement in your model of Bob, add

statusB = ok;

Task 2: Spin

In this simple two-agent world everything should run smoothly, and both Alice and Bob should always be able to reach the last statements of their respective models. In fact, this can be verified with a temporal logic formula saying that eventually we reach a state where both statusA and statusB are ok.

Verify this property with a temporal logic formula in Spin.

ltl task2 { ... }

Report:

  • A file named NS2.pml containing the implementation for Bob and an inlined LTL formula named task2.
  • A file named Report_NS.txt explaining which mode you used in Spin to verify the formula (safety or acceptance) and why. How does weak fairness affect your answer?

Task 3: Introducing the attacker

To study the security guarantees of our protocol, we investigate how the communication between Alice and Bob is affected by the presence of a third agent: the Intruder. The listing below presents the intruder (also referred to as the attacker) process.

First, we extend the declarations of agents, keys and nonces to include those for the attacker.

mtype = {... , agentI, keyI, nonceI } 

The code for the intruder then looks as follows:

active proctype Intruder() {
  mtype msg, recpt;
  Crypt data, intercepted;
  do
    :: network ? (msg, _, data) ->
       if /* perhaps store the message */
	 :: intercepted.key   = data.key;
	    intercepted.content1 = data.content1;
	    intercepted.content2 = data.content2;
	 :: skip;
       fi ;

    :: /* Replay or send a message */
       if /* choose message type */
	 :: msg = msg1;
	 :: msg = msg2;
	 :: msg = msg3;
       fi ;
       if /* choose a recipient */
	 :: recpt = agentA;
	 :: recpt = agentB;
       fi ;
       if /* replay intercepted message or assemble it */
	 :: data.key    = intercepted.key;
	    data.content1  = intercepted.content1;
	    data.content2  = intercepted.content2;
	 :: if /* assemble content1 */
	      :: data.content1 = agentA;
	      :: data.content1 = agentB;
	      :: data.content1 = agentI;
	      :: data.content1 = nonceI;
	    fi ;
	    if /* assemble key */
	      :: data.key = keyA;
	      :: data.key = keyB;
	      :: data.key = keyI;
	    fi ;
	    data.content2 = nonceI;
       fi ;
      network ! msg (recpt, data);
  od
}

This process is highly nondeterministic in the choice of its actions and their order. Still, the current attacker is quite limited: it picks up messages from the network, replays some of them, or initiates a new run of the protocol with one of the agents.

Disclaimer: This approach might not, in general, be the best way to verify security protocols. Modelling the attacker as a process allows for mistakes in the intruder's capabilities, making the attacker either too powerful or too weak. In this assignment we focus on using Spin and modelling properties with Promela and LTL, but if you are interested in verifying protocols in particular you should look into, e.g., this paper.

Add the above code to your model and simulate it. You may notice that with the introduction of the attacker principal the property from Task 2 becomes too strong and is no longer true. Find a counterexample with Spin and explain what exactly causes it to fail.

Report:

  • A file named NS3.pml of the new model.
  • Extend Report_NS.txt and explain which mode you used in Spin to verify the formula (safety or acceptance) and why. How does weak fairness affect your answer? In addition, include the (summarized) trail found by Spin and explain its cause.

Task 4: New assumptions on the attacker

In the above model the intruder is a complete outsider. In this task we strengthen assumptions about the intruder. Not only can our attacker intercept and replay messages, but now they can also be a valid recipient of messages from Alice, that is, Alice may decide to start a communication with the intruder. Effectively, this corresponds to the intruder being a legitimate user of a system -- an insider with malicious intentions.

Start by modifying the code for Alice so that she nondeterministically chooses either Bob or the Intruder as the communication partner for her run of the protocol. The result of this choice should be reflected in the variable partnerA. Once the choice is made, the subsequent protocol messages should be encrypted with the key of the chosen principal, that is, with keyB when partnerA is agentB, and with keyI when partnerA is agentI. (Bob still assumes he is always contacted by Alice.)
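As a hint on the Promela idiom only, such a nondeterministic choice is typically written as an if statement whose options are all executable. A minimal sketch of Alice's changed initialization (the rest of her process stays as in the original model) could look like this:

/* nondeterministically pick a partner and the matching public key */
if
:: partnerA = agentB; pkey = keyB
:: partnerA = agentI; pkey = keyI
fi;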

Task 5: Extending the Intruder

Now that the intruder may receive messages encrypted with their key, they should be able to decrypt and access the content of those (and only those) messages. Extend the intruder process to reflect this. Add two more ghost global variables that represent the attacker's knowledge of the nonces:

/* global variables for verification */
...
bool knows_nonceA, knows_nonceB;

Initialize these variables to false. In your modified code for the intruder, set these variables to true only when the successful decryption of an intercepted message yields nonceA or nonceB. To add the possibility of decryption, extend the intruder with another if block that checks the key of the intercepted data: if the key matches the intruder's own key, proceed to check the content; skip otherwise.
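The following is one possible shape for that check, shown purely as a sketch; where exactly it fits into your intruder process, and whether you inspect data or your stored copy of it, is up to you.

if /* attempt to decrypt the intercepted message */
:: (data.key == keyI) ->                 /* encrypted with the intruder's key */
   if
   :: (data.content1 == nonceA) || (data.content2 == nonceA) ->
        knows_nonceA = true
   :: else -> skip
   fi;
   if
   :: (data.content1 == nonceB) || (data.content2 == nonceB) ->
        knows_nonceB = true
   :: else -> skip
   fi
:: else -> skip                          /* wrong key: nothing is learned */
fi;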

When assembling the content of a message that the intruder sends, use the variables knows_nonceA and knows_nonceB so that the attacker may only use a nonce whose corresponding variable is true. Moreover, to slightly reduce the nondeterminism in your model you may introduce "typing" for the intruder's messages. That is, if the intruder is composing msg3, then content2 (which is not used for messages of type msg3) of the assembled message could simply be set to 0. If you do this, consider also formally verifying that your modification actually decreases the amount of nondeterminism in your program (as a sanity check).

Task 6: Attack

At this point your implementation should be ready for analyzing the security guarantees of our protocol. The protocol is intended to satisfy the following three properties:

  1. If both Alice and Bob reach the end of their runs (i.e. both statusA and statusB are ok) then Alice's communication partner is Bob, and Bob's communication partner is Alice.
  2. If agent A reaches the end of its run (statusA is ok) and A believes it is talking to B (partnerA is agentB) then the intruder does not know A's nonce (!knows_nonceA).
  3. Similarly for B's nonce. That is: if agent B reaches the end of its run (statusB is ok) and it believes it is talking to A (partnerB is agentA) then the intruder does not know B's nonce (!knows_nonceB).

Encode the above properties as temporal logic formulas and use Spin to verify them. Since the protocol that we have presented is buggy, you should be able to find a counterexample to these properties.
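To illustrate the encoding pattern only, property (1) reads quite directly as an inline LTL formula of the shape below; properties (2) and (3) follow the same implication pattern, and your final formulas may of course differ in detail.

ltl propAB {
  [] ((statusA == ok && statusB == ok) ->
      (partnerA == agentB && partnerB == agentA))
}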

Which of the properties (1)-(3) is violated? Can you explain in your own words what happens in your counterexample?

Report:

  • A file named NS6.pml of the new model, containing three inlined LTL formulas named propAB, propA and propB for the three requested properties respectively.
  • Extend Report_NS.txt and explain which mode you used in Spin to verify the formulas (safety or acceptance) and why. How does weak fairness affect your answers? In addition, include the (summarized) trails found by Spin and explain their cause for the properties that are violated.

Task 7: Patching the protocol

There is a well-known patch to the protocol: in the patched version the second message of the protocol

2) Y → X: { NX, NY }pX

is replaced with a message that contains the identity of principal Y:

2) Y → X: { Y, NX, NY }pX

Modify your model to implement this patch. Re-verify that your new model satisfies properties (1)-(3) from Task 6.

Remember that model simplicity is important, as discussed in Part 1 of the lab. In particular, to pass the lab it is important that your model is clear enough that it is immediately obvious that all intended attacks are modeled.

Report:

  • A file named NS7.pml of the new model, containing three inlined LTL formulas named propAB, propA and propB for the three requested properties respectively.
  • Extend Report_NS.txt and explain what the effect of the patch was.

Reporting part 2

For Part 2 of the assignment we expect the collection of files from each task:

  1. A series of files NS2.pml, NS3.pml, NS6.pml and NS7.pml with inlined LTL formulas.
  2. A text report named Report_NS.txt describing your solution to Tasks 1-7 and information on how you did the verification.
  3. Other supporting files, such as the Spin-generated trail files (.trail) for the counterexamples that you discuss in your report.

Reporting

  • See respective sections of Part 1 and Part 2.
  • Please submit the report in plain text format (.txt). Avoid other formats as this may cause delays in grading your submission.
  • Submit via Fire

References

  • Spin reference card (PDF)

Attributions

The dining philosophers image is created by Benjamin D. Esham and is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.




W. Ahrendt, Sep 29, 2017