Formal Methods for Software Development TDA294/DIT271, LP1, HT2017

Exercise session 4

Model Checking with Spin

Proxy servers

A proxy server is a server that forwards requests from clients to other servers. In the setting that we study in this exercise all computers in the local network have to connect to a shared proxy in order to request data on the global network. There are several reasons why a group of computers might be organised this way.

  • The local network is an office network and the office manager would like to monitor (log) what websites are visited, for example to check that employees don't spend too much time on social networking sites.
  • Many machines in the local network often request the same data from the global network, and the proxy can be used to cache this data to speed up access to these.
  • The servers on the global network do not know which of the clients in the local network made the request, thus the proxy provides some anonymity.

If one proxy server makes all the requests to the global network coming from the local network, there is a risk that the proxy gets overloaded. One way to prevent this is to limit the maximum number of requests the proxy sends out to the global network at the same time. Suppose that the proxy has N sockets on which it receives responses from the global network. If all sockets are currently waiting for a response, any new request from the local network has to wait until one of the sockets becomes available again.

Modelling a proxy

We now present an abstract model of a proxy server. We first define some global information, such as the numbers of clients in the local network, servers in the global network, the channels used for communication etc:

/***
 * GLOBAL SETUP
 ***/

#define M 3 /* number of clients */
#define S 4 /* number of servers */
#define N 2 /* number of proxy sockets */

mtype {data0, data1, data2 , data3}; /* Possible server payloads */
mtype payload [4];                   /* Which server hosts which payload */

chan local_network = [0] of {   /* Client-Proxy communication */
  byte, /* client_id  */
  byte, /* server_id  */
  chan  /* reply chan */
};

chan global_network = [1] of {  /* Proxy-Server communication */
  byte, /* server_id  */
  chan  /* reply chan */
};

We already see quite some abstractions here. To keep the search space small we have a local network of only 3 client, and a global network of only 4 servers. The data on the different servers is simply represented by mtypes; payload[i] represents the data stored on server i. When making a request the client will give the proxy a reply channel on which the proxy should send the responde of the server.

We continue with the modelling of one client in the local network. The client makes only a single request.

/***
 * CLIENT
 ***/
proctype client(byte id) {
  mtype data_received;              /* data received */
  chan callback = [0] of { mtype }; /* local callback channel */

  /* Randomly select server */
  byte server_asked;                /* ID of the server to be asked */
  ... TODO ...

  printf("Client%d requesting Server%d\n", id, server_asked);
  local_network ! id , server_asked, callback ;
  callback ? data_received;

  printf("Client%d got the answer %e\n", id, data_received);
}

Q: There is a gap in this code: the client randomly chooses a server to request its data from. Implement this.

Answer:

  if
  :: server_asked = 0;
  :: server_asked = 1;
  :: server_asked = 2;
  :: server_asked = 3;
  fi;

We then continue with a server in the global network. The server listens to the global network to see if there is request made to it, by checking if the message on the channel asks for this server's ID. If that is the case, it sends its payload on the provided reply channel.

/***
 * SERVER
 ***/
proctype server(byte  id) {
  chan reply;
end:
  do
  :: global_network ?? eval(id), reply;
       printf ("Server%d working\n", id );
       reply ! (payload[id]);
  od;
}

Q: What would be the consequence of replacing eval(id) with just id?

Answer: In that case instead of pattern matching on the value of id it would replace the value stored in variable id with the value received on the channel.

Q: Explain why we use ?? to read from the global network, and not ?.

Answer: With ? we only match on the first message and block otherwise; with ?? we walk through the whole buffer of the channel until we find a message that matches (or reach the end of the buffer).

Finally we come to the proxy itself. Its main loop does one of the following. It can receive a request from a client in the local network, or it can receive a response on one of its two sockets from the global network.

(Here we 'abuse' the knowledge that the number of sockets (N) is 2; but it is less obvious how to write a proxy with this behaviour that is completely parametric in N, without starting numerous sub-processes.)

If the proxy gets a request from a client, it loops over the sockets to see if there is one available (i.e. not already waiting for a response from the global network). If this is the case, that socket gets marked as unavailable and the request is forwarded, providing the selected socket as the reply channel for the requested server. The reply channel to the client in the local network is stored in the replies array.

If there is no socket available we simply wait until one of them gets a response and becomes available; after which we restart the search (now knowing that there has to be an available socket).

If a response arrives on a socket this responses is forwarded to the client channel that we previously associated with this socket.

(We have some unfortunate duplication of modelling code here, but again this is not easily avoided without introducing more sub processes.)

Q: Why do we prefer the code duplication?

Answer: You best see this if you try to come up with a different solution: you could start N subprocesses to read from the sockets, but this would result in a much larger state space for the model checker to explore. That is, all possible interleavings of these subprocesses and the proxy itself would need to be considered. In addition, if done improperly you might even end up with wrong interleavings, such as one process labelling a socket as available while the other just send out a request on it.

/***
 * PROXY
 ***/
active proctype proxy() {

  mtype client_id; /* ID of the client making request */
  mtype server_id; /* ID of server requested by client */
  chan  client_ch; /* Callback channel for client */
  mtype data;      /* Payload received from server */

  chan socket   [N] = [0] of {mtype}; /* Socket receiving server response */
  bool available[N] = true;           /* Not waiting for server response? */
  chan replies  [N];                  /* Relating sockets to client callbacks */

  byte i; /* For looping over sockets */

end:
  do

  /* Receive a request from a client */
  :: local_network ? (client_id, server_id, client_ch) ->

     printf("Proxy a request from Client%d to Server%d\n", client_id, server_id);

     i = 0; /* Loop to find available socket */
     do
     :: i < N ->
          if
          :: available[i] ->
               available[i] = false;
               /* make a request */
               replies[i] = client_ch;
               printf ("Proxy asking Server%d \n", server_id);
               global_network ! server_id, socket[i];
               break;
          :: else ->
               i++;
          fi
     :: else -> /* Wait for a socket to become available */
          if    /* .. 'abusing' that we know number of sockets */
          :: socket[0] ? data ->
               replies  [0] ! data;
               available[0] = true;
          :: socket[1] ? data ->
               replies  [1] ! data;
               available[1] = true;
          fi;
          i = 0; /* Restart search */
     od;

  /* Receive a response from a server */
  :: socket[0] ? (data) ->
       replies  [0] ! data;
       available[0] = true;
  :: socket[1] ? (data) ->
       replies  [1] ! data;
       available[1] = true;
  od;

}

We use the init method to ensure that the payloads are correctly assigned before the servers are run. Since we made some assumptions on the number of servers (depending on how you answered the first question) and on the number of sockets, we also insert checks that our parameters match this.

/***
 * INIT
 ***/
init {
  printf ("init\n");
  /* check params */
  S >= 4;
  N >= 2;

  /* Assign payloads */
  payload[0] = data0;
  payload[1] = data1;
  payload[2] = data2;
  payload[3] = data3;

  byte i = 0; /* initialize the servers */
  do
  :: i >= S -> break;
  :: else   -> run server(i); i++;
  od;

  i = 0;      /* initialize the clients */
  do
  :: i >= M -> break;
  :: else   -> run client (i); i++;
  od;
}

You can launch the program in the Spin web interface or download Proxy.pml to work on locally.

Model Checking the proxy

In this exercise we want to verify that the implementation of proxy is indeed correct. Your task is to come up with correctness properties for the proxy server, write them in linear temporal logic and verify them using Spin. If needed you may change the model, for example to introduce ghost variables/arrays.

Here are some example properties, you may extend it with many others:

P: Each client eventually gets an answer (i.e. reaches the end of its code).

Answer: One option is to introduce a ghost variable int count_clients end of its code. You can then check the property \(\Diamond (\mathit{count\_clients} == M)\). which is initialized to 0. Each client increases this counter when it reaches the

Another option is to create an array of booleans bool client_done[M] = false; and each clients sets it's (client_done[id]) boolean to true when it reaches the end of its code. You can then check the property \(\Diamond \mathit{client\_done}[0] \land \Diamond \mathit{client\_done}[1] \land \Diamond \mathit{client\_done}[2]\).

This is a liveness property: something good will eventually happen. (this is also called acceptance because Spin will try to look for an acceptance cycle: a cycle where in all states the negation of this property holds. I.e. it tries to find a cycle where at least one of the client's does not eventually reach the end of its code)

P: The data that a client receives via the proxy matches what should be served by the requested server.

This is a safety property: it is always the case that when the client receives data it matches what the client requested. The model checker has to find a (not necessarily cyclic) path where this property does not hold in some state.

Because it is a safety property you should be able to express it with assertions. In this case we can add the following line to the end of the client's code: assert(data_received == payload[server_asked])

It is also possible to use an LTL formula instead. In that case we need to use the client_done array from the previous property to check that we are at the end of the code. We also need an array data_received[M] and an array server_asked[M] and replace data_received in the client code with data_received[id], similar for server_asked. We can then specify the safety property as

$$ \Box((\mathit{client\_done}[0] \rightarrow \mathit{data\_received}[0] == \mathit{payload}[\mathit{server\_asked}[0]]) \land $$ $$ \;\;\;\;(\mathit{client\_done}[1] \rightarrow \mathit{data\_received}[1] == \mathit{payload}[\mathit{server\_asked}[1]]) \land $$ $$ \;\;\;(\mathit{client\_done}[2] \rightarrow \mathit{data\_received}[2] == \mathit{payload}[\mathit{server\_asked}[2]]) ) $$

Do these properties still hold if we make the following modifications / extensions to the model:

E: Add a nondeterministic guard in the server process that blocks the process indefinitely if that guard is chosen. This models the failure of a server. Modify the correctness properties above to take server failure into account.

We can model failure by adding the possibility to come to a statement that does not allow the server to make progress:

  do
  :: global_network ?? eval(id), reply;
       printf ("Server%d working\n", id );
       reply ! (payload[id]);
  :: true ->
     count_failed++;
     failed[id] = true;
     false;
  od;

We use the global array bool failed[S] = false; to be able to refer to the fact that a server has failed or not. The only property that is changed is the first one, which now becomes either

$$ \Box ( \Box \mathit{count\_failed} == 0 \rightarrow \Diamond \mathit{count\_clients} == M)$$ or: $$ \Box ( ( \Box \neg \mathit{failed}[{\mathit{server\_asked}}[0]] \rightarrow \Diamond \mathit{client\_done}[0] ) \land $$ $$ \quad( \Box \neg \mathit{failed}[{\mathit{server\_asked}}[1]] \rightarrow \Diamond \mathit{client\_done}[1] ) \land $$ $$ \quad( \Box \neg \mathit{failed}[{\mathit{server\_asked}}[2]] \rightarrow \Diamond \mathit{client\_done}[2] ) ) $$ Notice the inner \(\Box\) operator: without them we would still allow servers to fail in a later state. When running this in Spin, also note how much the running time is influenced by the number of clients.

E: How do the above properties change if clients make their requests continuously in a loop instead of just once?

If we place the client's requests in an infinite loop, the counter-approach for the first property does no longer work. Also, the first property needs to be lifted to something like "A client will get infinitely many responses", which is the same as the second way to encode the property (using \(\mathit{client\_done}[\mathit{id}]\)) with the \(\Box\) in front.

Notice how the running time / capability of Spin changes if you place the random selection of server_asked inside or outside the loop. Also think about the role of fairness here -- in fact this property becomes rather difficult to prove because of it.

You can add your own extensions, for example to model the different usages of proxies mentioned in the introduction.

References

  • Spin reference card (PDF)



Home | Course | Schedule | Exam | Exercises | Labs | Eval | Tools W. Ahrendt, Oct 10, 2017