
Seminars 2004

 
PhD seminars 2004


December 20, 2004, 1:15 pm

Public defence of PhD thesis by Magnus Ekman, Computer Engineering, Chalmers University of Technology:

Strategies to Reduce Energy and Resources in Chip Multiprocessor Systems

Room: HC2, Hörsalsvägen 14, Chalmers tekniska högskola, Göteborg

Faculty opponent: Professor Margaret Martonosi, Department of Electrical Engineering, Princeton University, USA


ABSTRACT:

A new architectural style known as the chip multiprocessor (CMP) has recently emerged, in which two or more processor cores are manufactured on the same die. This architectural style comes with many promises, such as high performance for applications with abundant thread-level parallelism (TLP) and shorter design times due to its modularized design. Nevertheless, a new architectural paradigm also introduces new design challenges.

This thesis addresses the technical problem of how to design CMP systems that are more efficient in terms of energy and memory utilization. It contributes design strategies in three categories: design principles that reduce energy dissipation and main-memory resources without reducing performance, design recommendations for balancing the exploited instruction-level parallelism (ILP) and TLP in a CMP, and a methodology that reduces simulation time when evaluating future designs.

Two of the proposed design principles together reduce the energy dissipation in the L1 caches and translation-lookaside buffers by 30%. It is further shown that ten times longer access latency can be tolerated for 70% of the main memory, which compression techniques can exploit to reduce the amount of memory by 30%. A novel compression scheme applied to the entire main memory is proposed, evaluated, and shown to reduce the needed memory resources by 30%. These reductions have no significant negative effect on performance.
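The abstract does not describe the compression scheme itself. As a rough illustration of the narrow-value idea that main-memory compression schemes of this kind build on (the tag widths and value classes below are assumptions for the example, not the thesis's actual encoding), consider this Python sketch:

```python
import struct

def compressed_size(block: bytes) -> int:
    """Compressed size in bytes of a memory block: a 2-bit tag per 32-bit
    word plus 0, 8, 16 or 32 data bits depending on the word's value."""
    bits = 0
    for (word,) in struct.iter_unpack("<i", block):
        bits += 2                       # per-word tag
        if word == 0:
            pass                        # zero word: tag only
        elif -128 <= word <= 127:
            bits += 8                   # fits in a sign-extended byte
        elif -32768 <= word <= 32767:
            bits += 16                  # fits in a sign-extended halfword
        else:
            bits += 32                  # incompressible word
    return (bits + 7) // 8

block = struct.pack("<16i", *([0] * 12 + [42, -3, 70000, 2**30]))
print(compressed_size(block), "of", len(block), "bytes")  # -> 14 of 64 bytes
```

Blocks dominated by zeros and small values, which are common in practice, shrink substantially, which is what makes trading a little extra access latency for less physical memory attractive.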

Further, the trade-off between TLP and ILP is studied for applications with an abundance of TLP under a fixed area constraint. Four different design points, ranging from 16 single-issue cores to two eight-issue cores, are evaluated. By choosing the design point with four four-issue cores, it is possible to achieve close-to-optimal performance while still enabling single-threaded applications to run well.

Finally, a sampling technique for single-processor simulation is applied to multiprocessors. For one class of applications the technique becomes more efficient on multiprocessors, reducing the number of simulation points linearly with the number of processors. Another statistical technique, applicable to both single- and multiprocessor systems, is then proposed and reduces the required number of simulation points by an order of magnitude.
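The specific statistical technique is not given in the abstract. The following sketch only illustrates the standard sample-size arithmetic that sampled simulation rests on: given the variance observed in a pilot set of measurements, it estimates how many simulation points are needed for a desired confidence interval. The 3% error target and pilot numbers are invented for the example:

```python
import statistics

def required_samples(pilot, rel_err=0.03, z=1.96):
    """Sample-size estimate from a pilot run: n >= (z*s / (err*mean))**2,
    for a ~95% confidence interval of half-width rel_err * mean."""
    m = statistics.fmean(pilot)
    s = statistics.stdev(pilot)
    return int((z * s / (rel_err * m)) ** 2) + 1

pilot_cpi = [1.10, 1.25, 0.95, 1.40, 1.05, 1.18, 0.99, 1.31]
print(required_samples(pilot_cpi))  # simulation points for a 3% error bound
```

Lower variance per sample, or samples spread across processors, directly reduces the number of points a simulation must execute in detail.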

Keywords: chip multiprocessor, energy efficiency, memory compression, statistical simulation techniques


December 16, 2004, 1:15 pm

Public defence of PhD thesis by Joakim Aidemark, Computer Engineering, Chalmers University of Technology:

Node-Level Fault Tolerance for Embedded Real-Time Systems

Room: HC2, Hörsalsvägen 14, Chalmers tekniska högskola, Göteborg.

Faculty opponent: Professor Henrique Madeira, Department of Informatics Engineering, University of Coimbra, Portugal


ABSTRACT:

This thesis deals with cost-effective design and validation of fault-tolerant distributed real-time systems. Such systems play an increasingly important role in embedded applications such as automotive and aerospace systems. The cost of fault tolerance is of primary concern in these systems, particularly for emerging applications like micro-satellites, unmanned air vehicles and active safety systems for road vehicles.

We address the cost of fault tolerance from both a design and a validation perspective. From a design perspective, we investigate cost-effective techniques that can make systems more resilient to transient hardware faults. We propose a two-level approach that combines system-level and node-level fault tolerance. Our approach relies on nodes that mask the effects of most transient faults and exhibit omission or fail-silent failures for permanent faults and for transient faults that cannot be masked by the node itself. As only a subset of the faults is tolerated at the node level, we call this approach light-weight node-level fault tolerance, or light-weight NLFT. Tolerating transient faults at the node level is important in systems that rely on duplicated nodes for fault tolerance, as it allows the system to survive transient faults even when one of the nodes has failed permanently. It also improves the robustness of the system when both nodes are affected by correlated or near-coincident transient faults. We have implemented a real-time kernel that supports light-weight NLFT through time-redundant execution of tasks and the use of software-implemented error detection. The effectiveness of light-weight NLFT is evaluated both analytically and by extensive fault injection experiments.
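As a minimal sketch of the core mechanism, time-redundant execution with comparison, here is a simplified illustration in Python. The real kernel schedules tasks and uses software-implemented error detection; the deadline_allows_retry flag below is a hypothetical stand-in for the kernel's decision on whether a recovery execution fits before the deadline:

```python
def run_time_redundant(task, args, deadline_allows_retry=True):
    """Execute `task` twice and compare; on mismatch, a third execution
    breaks the tie. If no majority emerges, omit the output (fail-silent)."""
    r1, r2 = task(*args), task(*args)
    if r1 == r2:
        return r1                 # identical runs: any transient fault masked
    if deadline_allows_retry:
        r3 = task(*args)          # recovery execution
        if r3 in (r1, r2):
            return r3             # two-out-of-three majority vote
    raise RuntimeError("no majority: omit output (fail-silent behaviour)")
```

A node built this way masks most transient faults locally and degrades to an omission failure otherwise, which is exactly the behaviour the duplicated-node system level relies on.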

The thesis also deals with the cost of fault tolerance from a validation perspective. Fault-injection-based validation of error-handling mechanisms is a time-consuming and costly activity. We present a fault injection tool that can easily be extended and adapted to different target systems, making fault injection less time-consuming. We also propose an analytical technique for investigating how error coverage varies with different input sequences to a system. The analysis helps us identify interesting activation patterns, e.g., those that give extremely low or high error coverage.

Keywords: Distributed real-time systems, error detection, error recovery, time redundancy, real-time kernels, fault injection, reliability calculations, analytical coverage estimation.

 



December 7, 2004, 10:15 am

Public defence of PhD thesis by Peter Ljunglöf, Computing Science, Göteborg University:

Expressivity and Complexity of the Grammatical Framework

Room: HA2, Hörsalsvägen 4, Chalmers tekniska högskola, Göteborg

Faculty opponent: Bernard Lang, INRIA Rocquencourt, France


ABSTRACT:

This thesis investigates the expressive power and parsing complexity of the Grammatical Framework (GF), a formalism originally designed for displaying formal propositions and proofs in natural language. This is done by relating GF to two better-known grammar formalisms: generalized context-free grammar (GCFG), best seen as a framework for describing various grammar formalisms, and parallel multiple context-free grammar (PMCFG), an instance of GCFG.

Since GF is a fairly new theory, some questions about its expressivity and parsing complexity have until now not been answered; these questions are the main focus of this thesis. The main result is that the important subclass of context-free GF is equivalent to PMCFG, which has polynomial parsing complexity and whose expressive power is fairly well known.

Furthermore, we give a number of tabular parsing algorithms for PMCFG with polynomial complexity, by extending existing algorithms for context-free grammars. We suggest three possible extensions of GF/PMCFG, and discuss how the expressive power and parsing complexity are influenced. Finally, we discuss the parsing problem for unrestricted GF grammars, which is undecidable in general. We nevertheless describe a procedure for parsing grammars containing higher-order functions and dependent types.

Keywords: Grammatical Framework, generalized context-free grammar, multiple context-free grammar, context-free rewriting systems, type theory, expressive power, abstract syntax, linearization, parsing



November 25, 2004, 10:00 am

Public defence of PhD thesis by Per Johannessen, Computer Engineering, Chalmers University of Technology:

On the Design of Electrical Architectures for Safety Critical Automotive Systems

Room: EF, EDIT-building, Hörsalsvägen 11, Chalmers tekniska högskola, Göteborg

Faculty opponent: Professor John A. McDermid, Department of Computer Science, University of York


ABSTRACT:

Increasing demands in the automotive industry in areas such as safety, driving pleasure and environmental care require new, complex functionality to be implemented in cars. This evolution will to a large extent depend on the introduction of drive-by-wire systems, which can replace traditional mechanical linkages, such as steering rods, with sensors, actuators, electronics, and software. Many of these functions will be realized as distributed real-time systems with demands for high dependability and low cost. Designing systems with such demands requires a structured design process with a system focus, in which hazard analysis, design methods, and design principles are important factors. This thesis presents several approaches in the area suitable for the design of distributed drive-by-wire systems.

The main contribution of this thesis is a system perspective on the design of safety-critical drive-by-wire systems. Related research tends to focus on the design of components in a system rather than on the whole system. However, safety is a system property: it has to be considered at the system level and cannot be guaranteed at the component level. Even if a complete design process is not presented here, several parts of a design process for these systems are included.

The contribution of this work can be divided into several areas. First, two novel approaches for hazard analysis are developed, one based on functionality described with a user focus and one based on actuators; the two approaches complement each other. Second, three different electrical system architectures are presented, representing different concepts for safety-critical drive-by-wire systems. Based on these electrical architectures, a design method for safety-critical control-by-wire architectures was developed. Given certain design requirements, the method yields one electrical architecture with minimal communication and hardware. As an integrated part of the design method, two design principles are presented: one concerns the implementation and allocation of redundancy to meet dependability requirements, and the other concerns time-triggered communication and arbitrary fault coverage. Further, the design methods and design principles have been successfully validated in three case studies.

Keywords: dependable systems, distributed systems, drive-by-wire, electrical architecture, hazard analysis, model-based development, system safety, time-triggered communication

 



November 19, 2004, 10:15 am


Public defence of PhD thesis by Nicholas Wickström, Computer Engineering, Chalmers University of Technology:

Virtual Sensing of Combustion Quality in SI Engines using the Ion Current

Room: Wigforssalen, Visionen, Halmstad University

Faculty opponent: Dr. Gerard Malaczynski, Delphi Corporation


ABSTRACT:

Several virtual sensors for combustion quality in spark-ignited internal combustion engines are proposed. The virtual sensors estimate combustion quality from the ion current signal measured in the center of the combustion chamber. Important properties of these sensors are their cost-effective implementation and real-time operation.

A combustion-variability virtual sensor for exhaust gas recycling is proposed. It estimates the coefficient of variation of the indicated mean effective pressure from the ion current integral while the combustion process is diluted with excess air or exhaust gases. This makes it possible to control, in a feedback loop, the amount of residual gases without experiencing driveability problems, producing large amounts of hydrocarbons, or stalling the engine.

Further, a virtual sensor is proposed that estimates the location of peak pressure to a precision of 2 degrees, even under disturbances from additives or high air humidity. The estimates, made on-line while driving an experimental vehicle on the highway, were used for feedback control of the spark advance.

Finally, an air-fuel ratio virtual sensor is proposed that estimates the air-fuel ratio to within 1.2% from the ion current. A second methodology is also proposed that keeps the air-fuel ratios in a multi-cylinder engine in balance, such that a typical imbalance of around 5-7% between the cylinders' air-fuel ratios is reduced.

Keywords: Internal Combustion Engines, Spark Ignition, Estimation, Control, Neural Networks, Ion Current

 


November 5, 2004, 10:00 am

Public defence of PhD thesis by Håkan Sundell, Computing Science, Chalmers University of Technology:

Efficient and Practical Non-Blocking Data Structures

Room: Gustaf Dalén-salen, Origovägen 1, Chalmers tekniska högskola, Göteborg

Faculty opponent: Professor Friedhelm Meyer auf der Heide, Heinz Nixdorf Institute and Computer Science Department, University of Paderborn, Germany


ABSTRACT:

This thesis deals with how to design and implement efficient, practical and reliable concurrent data structures. Designs based on mutual exclusion incur serious drawbacks, whereas the alternative non-blocking techniques avoid those problems and also admit improved parallelism. However, designing non-blocking algorithms is a very complex task, and a majority of the algorithms in the literature are either inefficient, impractical or both.

We have studied how information available in real-time systems can improve and simplify non-blocking algorithms. We have designed new methods for recycling buffers as well as time-stamps, and have applied them to known non-blocking algorithms for registers, snapshots and priority queues.

We have designed, to the best of our knowledge, the first practical lock-free algorithm for a skip list data structure. Using our skip list construction we have designed a lock-free algorithm for the priority queue abstract data type, as well as a lock-free algorithm for the dictionary abstract data type.

We have designed, to the best of our knowledge, the first practical lock-free algorithm for a doubly linked list data structure. The algorithm supports well-defined traversals in both directions, including through deleted nodes. Using our doubly linked list construction we have designed a lock-free algorithm for the deque abstract data type. For the lock-free algorithms presented in this thesis, we have given correctness proofs of the strong consistency property called linearizability and of the non-blocking properties.
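The skip lists and doubly linked lists above are too involved to reproduce here, but the underlying lock-free pattern can be shown on the simplest possible structure, a Treiber-style stack. The cas method below emulates the hardware compare-and-swap instruction that real implementations rely on:

```python
import threading

class _CasRef:
    """A reference cell with an (emulated) atomic compare-and-swap."""
    def __init__(self, value=None):
        self._value, self._lock = value, threading.Lock()
    def get(self):
        return self._value
    def cas(self, expected, new):
        with self._lock:              # stands in for one atomic instruction
            if self._value is expected:
                self._value = new
                return True
            return False

class Node:
    def __init__(self, item, nxt):
        self.item, self.next = item, nxt

class LockFreeStack:
    def __init__(self):
        self.top = _CasRef(None)
    def push(self, item):
        while True:                   # retry loop: no lock is held between
            old = self.top.get()      # reading and updating, so a stalled
            if self.top.cas(old, Node(item, old)):   # thread never blocks others
                return
    def pop(self):
        while True:
            old = self.top.get()
            if old is None:
                return None           # empty stack
            if self.top.cas(old, old.next):
                return old.item
```

Every operation either succeeds or retries after observing interference, which is what gives the non-blocking progress guarantee; the thesis's structures apply the same pattern to far more intricate pointer updates.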

We have implemented the algorithms presented in this thesis, together with a representative set of related non-blocking as well as lock-based algorithms from the literature, on actual systems. We have built a framework that combines the implementations into a software library offering a unified and efficient interface in combination with a portable design.

We have performed empirical performance studies of the data structures presented in this thesis in comparison with related alternative solutions. The experiments performed on an extensive set of multi-processor systems show significant improvements for non-blocking alternatives in general, and for the implementations presented in this thesis in particular.

Keywords: synchronization, non-blocking, shared data structure, skip list, doubly linked list, priority queue, dictionary, deque, snapshot, shared register, real-time, shared memory, lock-free, wait-free, abstract data type.



October 22, 2004, 1:15 pm

Public defence of PhD thesis by Elisabeth Uhlemann, Computer Engineering, Chalmers University of Technology:

Adaptive Concatenated Coding for Wireless Real-Time Communications

Room: HC2, Hörsalsvägen 14, Chalmers tekniska högskola, Göteborg

Faculty opponent: Professor Stephen B. Wicker, Cornell University, New York, USA


ABSTRACT:

The objective of this thesis is to improve the performance of real-time communication over a wireless channel by means of specifically tailored channel coding. The deadline-dependent coding (DDC) communication protocol presented here lets the timeliness and the reliability of the delivered information constitute quality of service (QoS) parameters requested by the application. The values of these QoS parameters are transformed into actions taken by the link layer protocol in terms of adaptive coding strategies.

Incremental redundancy hybrid automatic repeat request (IR-HARQ) schemes using rate-compatible punctured codes are appealing since no previously transmitted bits are repeated. Typically, IR-HARQ schemes treat the packet lengths as fixed and maximize the throughput by optimizing the puncturing pattern, i.e., the order in which the coded bits are transmitted. In contrast, we define an IR strategy as the maximum number of allowed transmissions together with the number of code bits to include in each transmission. An approach is then suggested to find the optimal IR strategy that maximizes the average code rate, i.e., the optimal partitioning of n-k parity bits over at most M transmissions, assuming a given puncturing pattern.

Concatenated coding used in IR-HARQ schemes provides a new array of possibilities for adaptability in terms of decoding complexity and communication time versus reliability. Hence, critical reliability and timing constraints can be readily evaluated as a function of available system resources. This in turn enables quantifiable, and thus negotiable, QoS. Multiple concatenated single parity check codes are chosen as example codes due to their very low decoding complexity. Specific puncturing patterns for these component codes are obtained using union bounds based on uniform interleavers. The puncturing pattern with the best performance in terms of frame error rate (FER) at a low signal-to-noise ratio (SNR) is chosen. Further, using extrinsic information transfer (EXIT) analysis, rate-compatible puncturing ratios for the constituent component codes are found. The puncturing ratios are chosen to minimize the SNR required for convergence.
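As a toy illustration of what optimizing an IR strategy means, the sketch below enumerates all partitions of the n-k parity bits over at most M transmissions and picks the one maximizing the expected code rate. The decoding-success model p_ok is a made-up placeholder; in the thesis the corresponding probabilities follow from the code and the channel:

```python
from itertools import combinations

K, N, M = 32, 64, 3      # information bits, codeword bits, max transmissions

def p_ok(parity_sent):   # placeholder decoding-success model (assumption)
    return min(1.0, 0.2 + 0.025 * parity_sent)

def compositions(total, parts):
    """All ways to write `total` as an ordered sum of `parts` positive ints."""
    for cuts in combinations(range(1, total), parts - 1):
        yield tuple(b - a for a, b in zip((0,) + cuts, cuts + (total,)))

def avg_rate(split):
    """Sum over i of Pr(first success at i) * K/(K + parity sent so far);
    decoding failure after the last transmission contributes rate 0."""
    rate, p_fail, sent = 0.0, 1.0, 0
    for extra in split:
        sent += extra
        p = p_ok(sent)
        rate += p_fail * p * K / (K + sent)
        p_fail *= 1.0 - p
    return rate

best = max((avg_rate(s), s) for m in range(1, M + 1)
           for s in compositions(N - K, m))
print("best average rate %.3f with parity split %s" % best)
```

Sending few parity bits first and more only on failure keeps the effective rate high when the channel is good, while still bounding the number of transmissions, which is what makes the strategy attractive under deadlines.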

The applications targeted in this thesis are not necessarily replacements for cables in existing wired systems; rather, the motivation lies in the new services that wireless real-time communication enables. Hence, the focus is typically communication within and between cooperating embedded systems. The resulting IR-HARQ-DDC protocol presented here is an efficient and fault-tolerant link-layer protocol foundation using adaptive concatenated coding, intended specifically for wireless real-time communications.

Keywords: Incremental redundancy hybrid ARQ, multiple concatenated codes, iterative decoding, rate compatible punctured codes, union bounds, EXIT charts, convergence analysis, wireless real-time communication, quality of service.

 


September 22, 2004, 1:15 pm

Public defence of PhD thesis by Emilie Lundin-Barse, Computer Engineering, Chalmers University of Technology:

Logging for Intrusion and Fraud Detection

Room: HA2, Hörsalsvägen 4, Chalmers University of Technology, Göteborg

Faculty Opponent: Professor Roy Maxion, Carnegie Mellon University, Pittsburgh, USA


ABSTRACT:

Computer security is an area of ever-increasing importance. Our society relies on computerised services, which gives computer criminals, whether we call them attackers, terrorists, hackers, crackers or fraudsters, many reasons to break into these systems. To deal with security problems, many types of mechanisms have been developed.

One mechanism is the intrusion detection system (IDS), designed to detect ongoing attacks, detect attacks after the fact, or even detect preparations for an attack. The IDS complements preventive security mechanisms, such as firewalls and authentication systems, which can never make a system 100% secure. A similar type of system is the fraud detection system (FDS), specialised to detect frauds (or "attacks") on commercial services in different business areas, such as telecom, insurance and banking. Fraud detection can be considered a special case of intrusion detection.

A crucial part of intrusion or fraud detection is having good-quality input data for the analysis, as well as for training and testing the systems. However, it is difficult to acquire training and test data, and it is not known what kind of log data are most suitable to use for detection.

The contribution of this thesis is to offer guidance in acquiring more suitable log data for intrusion and fraud detection. The first part is general: it surveys research done in intrusion detection and shows that intrusion and fraud detection reflect different aspects of the same problem.

The second part is devoted to improving the availability and quality of log data used in intrusion and fraud detection. The availability of log data for training and testing detection systems can be improved by solving the privacy issues that prevent computer system owners from releasing their log data. Therefore, a method is suggested for anonymising the log data in a way that does not significantly affect their usefulness for detection. Though authentic data are convenient to use for training and testing, they do not always have the desirable properties, which include flexibility and control of content. Another contribution to improving the availability, and also the quality, of log data is thus a method for creating synthetic training and test data with suitable properties. This part also includes a methodology for determining exactly which log data can be used for detecting specific attacks. In the ideal situation, we collect exactly the data needed for detection, and this methodology can help us develop more efficient and better-adapted log sources. These new log sources will improve the quality of log data used for intrusion and fraud detection.
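As an illustration of one generic way to anonymise logs without destroying their value for detection (not necessarily the method proposed in the thesis), keyed pseudonyms keep equal values equal, so repeated activity from one source remains visible while identities are hidden:

```python
import hmac, hashlib

SECRET = b"site-specific key, never released with the logs"  # hypothetical

def pseudonym(field: str) -> str:
    """Deterministic keyed pseudonym: same input always maps to same output,
    but the mapping cannot be inverted without the key."""
    return hmac.new(SECRET, field.encode(), hashlib.sha256).hexdigest()[:12]

log = [("10.0.0.7", "alice", "login failed"),
       ("10.0.0.7", "alice", "login failed"),
       ("10.0.0.9", "bob",   "login ok")]

for ip, user, event in log:
    # the repeated failing source is still linkable across entries
    print(pseudonym(ip), pseudonym(user), event)
```

Because equality is preserved, frequency- and sequence-based detectors still work on the released data, while the site keeps the key and hence the real identities.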




June 4, 2004, 1:15 pm

Public defence of PhD thesis by Håkan Kvarnström, Computer Engineering, Chalmers University of Technology:

On the Implementation and Protection of Fraud Detection Systems

Room: HC1, Hörsalsvägen 14, Chalmers tekniska högskola, Göteborg

Faculty Opponent: Professor Richard Kemmerer, Computer Science, University of California, Santa Barbara, USA

ABSTRACT:

Today's society is becoming increasingly dependent on the availability and correctness of IT systems. A growing number of industries base their business almost exclusively on creating growth and economic value using IT systems. Banks, credit card providers, online gambling brokers and providers of telecommunication services are a few examples where secure and reliable communication and computing environments are of the utmost importance for survival. These new services can create increased customer value and provide new business opportunities. However, history shows that fraudsters constantly seek economic gain by exploiting weaknesses in technical systems or in the business models of service offerings. Thus, minimizing economic loss due to fraud is an important goal in many enterprises. An important tool in that process is an efficient fraud detection system (FDS) that can give early indications of misuse of the products and services offered to customers.

This thesis addresses the subject of implementing and protecting an FDS. Specifically, we give arguments for the importance of thorough testing, training and sufficient protection when deploying a successful fraud detection system. We show how emerging computing environments put new demands on the security mechanisms and security extensions that protect the information and resources of their target environment. Fraud detection systems are no exception, and we argue that detection mechanisms will play an important role in such environments. We propose and implement a methodology for generating synthetic data sets that can be used for testing and training fraud detection systems prior to real-life deployment. A comparison of fraud and intrusion detection systems is made, where we highlight differences and similarities. Our study shows that the similarities clearly outweigh the differences, and research in one area is most often applicable to both types of systems. Finally, we discuss the topic of protecting fraud detection systems in malicious environments. Specifically, we study the importance of protecting such systems from reverse engineering and propose mechanisms for achieving secrecy of the inherent information, even when deployed in hostile environments.

Keywords: computer security, fraud detection, intrusion detection, fraud, security architecture, computer misuse


June 2, 2004, 10:15 am

Public defence of PhD thesis by Sven Eklund, Computer Engineering, Chalmers University of Technology:

Exploring Massively Parallel Models and Architectures for Efficient Computation of Evolutionary Algorithms

Room: Wigforssalen, Halmstad University

Faculty Opponent: Professor Wolfgang Banzhaf, Department of Computer Science, Memorial University of Newfoundland, Canada


ABSTRACT:

Evolutionary algorithms is the collective name for a group of relatively new stochastic search algorithms that have several unique and interesting properties. However, as with all search algorithms, evolutionary algorithms also have disadvantages. One of the most troublesome is that they require very time-consuming computations to reach a solution. Even though the calculations of the algorithms are simple, they have to be repeated so many times that the algorithms often require days or even months of computation when run in software on a modern computer.

In this thesis we present a way to speed up these calculations by combining several improvements. First, we choose a distributed computation model, the Diffusion Model, that makes it possible to utilize massively parallel computation resources and thereby drastically reduce the computation time. Second, we choose linear machine code as the representation for the evolutionary algorithm; these segments of code are executed in parallel by specially designed processing elements, embedded as independent, parallel resources in the system. Third, on the basis of extensive simulations using a custom-made simulator that allowed us to investigate the design space of the Diffusion Model in great detail, we determine a recommended set-up for the distributed Diffusion Model. This set-up includes the topology, the size and shape of the neighbourhoods and the local selection algorithm, as well as the parameters for the distributed evolutionary algorithm.

As a final factor for speeding up the calculations, we demonstrate how a system based on the massively parallel Diffusion Model can be implemented in hardware using VLSI technology. Besides being efficient, the approach is flexible enough to be used in different application areas and scalable enough to address larger problems by using more computational resources. By implementing the system in VLSI technology, it becomes both compact and power efficient, making it possible to integrate evolutionary algorithms in many applications.
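As a software illustration of the Diffusion Model's core idea, each individual evolving only against a local neighbourhood on a grid, here is a minimal cellular evolutionary algorithm in Python. The toy fitness function, neighbourhood and operators are placeholder choices, not the set-up recommended in the thesis:

```python
import random

SIZE, BITS, GENS = 8, 16, 60
TARGET = [1] * BITS                      # toy problem: maximise the ones count

def fitness(g):
    return sum(a == b for a, b in zip(g, TARGET))

grid = [[[random.randint(0, 1) for _ in range(BITS)]
         for _ in range(SIZE)] for _ in range(SIZE)]

def neighbours(x, y):                    # von Neumann neighbourhood on a torus
    return [grid[(x + dx) % SIZE][(y + dy) % SIZE]
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]

for _ in range(GENS):
    new = [[None] * SIZE for _ in range(SIZE)]
    for x in range(SIZE):
        for y in range(SIZE):            # every cell could update in parallel
            mate = max(neighbours(x, y), key=fitness)   # local selection only
            cut = random.randrange(BITS)
            child = grid[x][y][:cut] + mate[cut:]       # one-point crossover
            if random.random() < 0.05:                  # mutation
                i = random.randrange(BITS)
                child[i] ^= 1
            new[x][y] = max(child, grid[x][y], key=fitness)  # elitist replace
    grid = new

print("best fitness:", max(fitness(g) for row in grid for g in row))
```

Because each cell reads only its immediate neighbours, every cell can be realised as an independent hardware processing element, which is the property the VLSI implementation exploits.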

Keywords: Evolutionary algorithms, genetic programming, linear machine code, distributed population models, Diffusion Model, symbolic function regression, time series forecasting, VHDL, FPGA.

 


May 28, 2004, 1:15 pm

Public defence of PhD thesis by Cecilia Ekelin, Computer Engineering, Chalmers University of Technology:

An Optimization Framework for Scheduling of Embedded Real-Time Systems

Room: HC1, Hörsalsvägen 14, Chalmers, Göteborg

Faculty Opponent: Professor Gerhard Fohler, Department of Computer Science, Mälardalen University, Västerås

ABSTRACT:

Embedded real-time systems, appearing in products such as cars and mobile phones, are nowadays common in our everyday lives. Despite this, the design process for such systems is still cumbersome due to the large variety of design constraints that must be considered to ensure safe operation. In particular, present scheduling techniques, which analyze the timing behavior of the system, typically assume a model too limited to truly represent the system. In addition, to make the system cost-effective its design should be optimized with regard to performance measures such as resource utilization, energy consumption and robustness. Unfortunately, optimization is in general a very time-consuming process, often without any guarantee that the best solution will be found.

This thesis addresses these problems by proposing a scheduling framework that not only enables arbitrary design constraints to be modelled but also allows for design optimization. The framework is based on constraint programming, and this thesis presents how the problem of scheduling embedded real-time systems can be modelled and solved using this technique. In addition, a number of novel techniques for reducing the runtime of the optimization algorithm are presented, including the identification and exclusion of symmetries in the solution space as well as fast and tight estimates of how good a solution can get. Finally, this thesis contains a performance comparison between the proposed framework and other state-of-the-art scheduling algorithms. The evaluation shows that both the quality of the solutions and the optimization time are improved over previous approaches; in many cases the solution time is reduced from minutes to seconds.
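The two runtime-reduction ideas named above, symmetry exclusion and lower-bound estimates, can be illustrated on a toy scheduling problem. The sketch below uses plain backtracking rather than a constraint-programming engine, and the task set is invented:

```python
TASKS = [4, 7, 2, 5, 6, 3]      # task execution times (toy data)
PROCS = 3                       # identical processors
best = [sum(TASKS)]             # best makespan found so far

def lower_bound(loads, i):
    """Cheapest conceivable makespan reachable from this partial schedule."""
    return max(max(loads), (sum(loads) + sum(TASKS[i:])) / PROCS)

def search(i, loads):
    if lower_bound(loads, i) >= best[0]:
        return                              # bound estimate: cannot improve
    if i == len(TASKS):
        best[0] = max(loads)
        return
    for p in range(PROCS):
        # Symmetry exclusion: identical processors are interchangeable,
        # so never try a second (third, ...) empty processor.
        if p > 0 and loads[p] == 0 and loads[p - 1] == 0:
            break
        loads[p] += TASKS[i]
        search(i + 1, loads)
        loads[p] -= TASKS[i]

search(0, [0] * PROCS)
print("optimal makespan:", best[0])   # -> 9
```

Both pruning rules preserve the optimum while cutting the search space by large factors, which is the same effect the framework obtains inside a constraint solver on far richer constraint models.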

Keywords: real-time systems, constraint programming, combinatorial optimization, scheduling, symmetry exclusion, lower-bound estimates, embedded system, design, energy awareness

 



April 16, 2004, 1:15 pm

Public defence of PhD thesis by Fredrik Brännström, Computer Engineering, Chalmers University of Technology:

Convergence Analysis and Design of Multiple Concatenated Codes

Room: HC1, Hörsalsvägen 14, Chalmers, Göteborg

Faculty Opponent: Dr. Gerhard Kramer, Mathematics of Communications Research Department, Bell Laboratories, Lucent Technologies, Murray Hill, New Jersey, USA.

ABSTRACT:

The objective of this thesis is to develop a methodology for designing efficient multiple concatenated coding schemes, consisting of two or more constituent codes concatenated in serial or in parallel. Extrinsic information transfer (EXIT) analysis is found to be a suitable and versatile analysis tool for such codes, and the design methodology is therefore based almost entirely on this technique.

The performance of multiple concatenated codes (MCCs) in terms of convergence thresholds, decoding complexity required for convergence, and resulting bit-error rate (BER) can be determined from their multi-dimensional EXIT charts. However, multi-dimensional EXIT charts are difficult to visualize, and hence an equivalent two-dimensional EXIT chart projection is introduced, yielding a powerful tool for the convergence analysis of MCCs. An exhaustive search for rate-one convolutional codes with memory four or less is performed, resulting in 98 classes of codes with unique EXIT functions. In addition, a search for 8PSK mappings with different performance in terms of BER and mutual information for the individual bits is conducted. Further, optimal puncturing ratios for MCCs, in terms of minimizing the convergence threshold, are found utilizing the EXIT functions of the constituent codes. Following this approach, additional degrees of freedom for constructing MCCs with low convergence thresholds over a range of code rates are obtained.

For an MCC with two constituent codes, decoding is performed by alternately activating the two decoders. With more than two components, the order of activation is no longer obvious. An algorithm is proposed that finds the optimal decoding schedule, in terms of minimizing the total decoding complexity required to reach convergence. Finally, a method for obtaining unbiased estimates of the symbol energy and the noise variance without knowledge of the transmitted bits is presented.



February 25, 2004, 1:15 pm

Public defence of PhD thesis by Håkan Sivencrona, Computer Engineering, Chalmers University of Technology:

On the Design and Validation of Fault Containment Regions in Distributed Communication Systems

Room: HA2, Hörsalsvägen 4, Chalmers, Göteborg

Faculty Opponent: Professor Geert Deconinck, Katholieke Universiteit Leuven, Belgium


ABSTRACT:

This thesis has a twofold focus. The first is an evaluation of a time-triggered communication protocol implementation, TTP-C1, which was stressed by means of heavy-ion fault injection. The results revealed a type of failure previously known only from theory, so-called slightly-out-of-specification faults, which manifested as Byzantine failures with resulting system inconsistency. The second focus is the design and simulation of algorithms for RedCAN switches to dynamically reconfigure a CAN bus. The thesis is divided into three parts, which deal with the design, validation and analysis of dependable communication with respect to the above-mentioned focus.

Part I deals with the design of mechanisms for fault containment. We propose one algorithm to handle slightly-out-of-specification faults in the time domain as they manifested during heavy-ion fault injection in the TTP-C1 implementation, and present a novel simulation tool to test and execute the scenarios that lead to serious communication degradation using two different algorithms. A second algorithm that we propose handles a distributed recovery approach after permanent bus and node failures in a CAN communication system using RedCAN switches.

Part II presents results from heavy-ion fault injection experiments in a TTP-C1 cluster consisting of four to nine synchronized nodes. This was done to evaluate the performance of the time-triggered protocol and to assess the efficiency of the implemented dependability-increasing mechanisms in the presence of faults. Part II furthermore presents performance results for different RedCAN recovery algorithms, collected through a novel simulation tool, the RedCAN simulation manager, that we have designed. Part II additionally includes validation results for a proposed solution for isolating asymmetric faults in a time-triggered system through an active star coupler.

Part III contains analyses of the presented validation results and of real-world experiences; in particular, Byzantine faults in distributed communication systems are discussed.

Keywords: Time-triggered protocol, fault containment regions, fault injection, validation, slightly-out-of-specification faults, Byzantine faults, asymmetric faults, fault handling and membership agreement.

 



January 27, 2004, 1:15 pm

Public defence of PhD thesis by Jim Nilsson, Computer Engineering, Chalmers University of Technology:

Towards Accurate and Resource-Efficient Coherence Prediction

Room: HC1, Hörsalsvägen 14, Chalmers, Göteborg

Faculty Opponent: Professor Stefanos Kaxiras, University of Patras, Greece


ABSTRACT:

The increasing speed gap between processor microarchitectures and memory technologies can potentially slow down the historical performance growth of computer systems. Parallel applications on shared-memory multiprocessors that experience cache misses due to communication are especially susceptible to this speed difference.

Prediction has been proposed as a general technique in computer architecture to anticipate and speculate about events in the processor and in the memory system. Specific prediction techniques have been successfully used in different parts of the processor, e.g. branch outcome prediction, branch target prediction, and prefetching.

Coherence message prediction has recently been proposed as a means to reduce the impact of long-latency memory operations in multiprocessors. However, such prediction can be very resource-consuming and ineffective.

This thesis addresses the resource inefficiency as well as the inaccuracy of proposed coherence predictors. It proposes novel prediction mechanisms with improved accuracy and/or with lower resource requirements.

One important finding of the thesis is that specialized coherence prediction techniques for reducing write-related overhead are only moderately effective for a transaction processing workload. I show that improvements are possible through the notion of load-store sequences, a new data-sharing model contributed in the thesis.

In the area of generalized coherence message prediction I contribute a novel caching scheme for history tables that cuts the static overhead by an order of magnitude. In-depth analyses of the dynamic memory requirements of generalized coherence predictors, as well as of their learning-time effects, are also presented, with suggestions on how to reduce the dynamic overhead while increasing the accuracy of coherence predictors.
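As a much-simplified illustration of what a coherence predictor's history table does (the thesis's predictors and its caching of history tables are considerably more elaborate), here is a last-sharer predictor evaluated on a synthetic migratory-sharing trace:

```python
def last_sharer_accuracy(trace):
    """Predict, per block, that the previous requester will request next."""
    table, hits, total = {}, 0, 0       # block address -> last requesting node
    for addr, node in trace:
        if addr in table:
            total += 1
            hits += table[addr] == node
        table[addr] = node
    return hits / total

# Synthetic migratory-sharing trace: node 0 uses a block a few times,
# then node 1 takes over, and so on.
trace = ([(0x40, 0)] * 3 + [(0x40, 1)] * 3) * 20
print("prediction accuracy: %.2f" % last_sharer_accuracy(trace))
```

The per-block table is exactly the kind of static storage the caching scheme above attacks: most blocks are touched rarely, so keeping history only for the active ones preserves accuracy at a fraction of the cost.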


Keywords: Computer architecture, shared memory multiprocessors, cache coherence protocols, prediction, memory systems, performance evaluation.


Licentiate seminars 2004


December 20, 2004, 10:15 am

Licentiate seminar by Ming Xiao, Computer Engineering, Chalmers University of Technology:

Some Concatenated and Iterative Decoding Approaches for Continuous Phase Modulation

Room: EL41, Linsen, Rännvägen 6, Chalmers tekniska högskola, Göteborg

Discussion leader: Ingmar Land, Department of Communication Technology, Aalborg University.


ABSTRACT:

Several concatenated and iterative decoding approaches are used to improve the power efficiency of continuous phase modulation (CPM). First, a symbol interleaver is used for serially concatenated CPM (SCCPM). Iterative decoding for this scheme can converge earlier (at a lower signal-to-noise ratio, SNR) than for bit-interleaved SCCPM. Then a convolutional code (CC) over a ring is used as the outer code in this symbol-interleaved SCCPM. By matching the ring size to the alphabet size of the system (CPM, interleaver and ring CC), a natural combination of them is achieved. Further improvements in convergence or error floors are achieved with this scheme. Finally, irregular repeat accumulate (IRA) codes combined with CPM (IRCPM) are considered, where CPM takes the role of the accumulator. This scheme shows even earlier convergence than SCCPM.

To analyze the error-floor performance, we derive the union bound for symbol-interleaved SCCPM. In this bound analysis, we have to consider the order of nonzero permuted difference symbols that are not identical. We also derive the union bound for IRCPM from an equivalent structure that introduces a virtual interleaver before CPM; the union bound analysis for SCCPM then becomes applicable.


December 3, 2004, 10:15 am

Licentiate seminar by Markus Forsberg, Computing Science, Chalmers University of Technology:

Applications of Functional Programming in Formal and Natural Languages

Room: ES52, Rännvägen 6B, Chalmers tekniska högskola, Göteborg

Discussion leader: Professor Viggo Kann, Department of Numerical Analysis and Computer Science, KTH, Stockholm


ABSTRACT:

This thesis describes two applications of functional programming to the processing of formal and natural languages. The techniques described in this thesis are closely connected to compiler construction, which is most obvious in the work on the BNF Converter.

The first part of the thesis describes the BNFC (BNF Converter) application, a multi-lingual compiler tool. BNFC takes as its input a grammar written in Labelled BNF (LBNF) notation and generates a compiler front-end (an abstract syntax, a lexer, and a parser). Furthermore, it generates a case skeleton usable as the starting point of back-end construction, a pretty printer, a test bench, and a LaTeX document usable as a language specification. The program components can be generated in Haskell, Java, C and C++, together with their standard parser and lexer tools. BNFC itself was written in Haskell.

The methodology used for the generated front-end is based on Appel's books on compiler construction. BNFC has been used as a teaching tool in compiler construction courses at Chalmers. It has also been applied to research-related programming language development, and in an industrial application producing a compiler for a telecommunications protocol description language.

The second part of the thesis describes Functional Morphology, a toolkit for implementing natural language morphology in the functional language Haskell. The main idea behind it is simple: instead of working with untyped regular expressions, the state of the art of morphology in computational linguistics, we use finite functions over hereditarily finite algebraic data types. The definitions of these data types and functions are the language-dependent part of the morphology. The language-independent part consists of an untyped dictionary format which is used for translation to other morphology formats and synthesis of word forms, and for generating a decorated trie, which is used for analysis.
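As a rough transcription of the idea into Python (the toolkit itself is written in Haskell and is more general), an inflection paradigm becomes a finite function over small algebraic data types rather than a regular expression; the miniature Swedish noun paradigm below covers one declension only:

```python
from enum import Enum
from itertools import product

class Number(Enum): SG = "sg"; PL = "pl"
class Species(Enum): INDEF = "indef"; DEF = "def"

def first_declension(stem):
    """Paradigm for Swedish first-declension nouns, e.g. stem 'flick'
    -> flicka, flickan, flickor, flickorna."""
    def form(n, s):
        ending = {(Number.SG, Species.INDEF): "a",
                  (Number.SG, Species.DEF):   "an",
                  (Number.PL, Species.INDEF): "or",
                  (Number.PL, Species.DEF):   "orna"}[(n, s)]
        return stem + ending
    return form          # a finite function over the two data types above

flicka = first_declension("flick")
for n, s in product(Number, Species):
    print(n.value, s.value, flicka(n, s))
```

Because the inflection parameters are typed, an incomplete paradigm is a compile-time error in the Haskell original, which is the practical advantage over untyped regular expressions.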

Functional Morphology builds on ideas introduced by Huet in his computational linguistics toolkit Zen, which he has used to implement the morphology of Sanskrit. The goal has been to make it easy for linguists who are not trained as functional programmers, to apply the ideas to new languages. As a proof of the productivity of the method, morphologies for Swedish, Italian, Russian, Spanish, and Latin have already been implemented.


November 19, 2004, 10:00 am

Licentiate seminar by Niklas Elmqvist, Computing Science, Chalmers University of Technology:

Visualization of Causal Relations

Room: ED, Hörsalsvägen 11, Chalmers tekniska högskola, Göteborg

Discussion leader: Professor Stephan Diehl, Katholische Universität Eichstätt-Ingolstadt, Germany

ABSTRACT:

The notion of cause and effect is pervasive in human thinking and plays a significant role in our perception of time. Software systems, in particular parallel and distributed ones, are permeated by this causality, and the human mind is especially well-suited to detect instances of this concept. Unfortunately, real-world systems of causally related events are often too large and complex to be comprehended unaided. In this thesis, we explore ways of using information visualization to help humans perceive these complex systems of causal relations, not only for software systems, but also for more general application areas.

The Growing Squares visualization technique uses a combination of color, texture, and animation to present a sequence of related events in a distributed system. User studies show that this technique is significantly more effective for solving problems related to causality in distributed systems than traditional Hasse diagrams for small data sets, and more effective (though not significantly so) for large data sets.

The Growing Polygons visualization technique was designed to address some of the weaknesses of the Growing Squares technique, and presents the interacting processes in a system as color-coded polygons with sectors indicating the influences and information propagation in the system. User studies show that this technique is significantly more effective than Hasse diagrams for all data sets, regardless of size.

Finally, we have conducted a case study of causality visualization in the context of scientific citation networks, creating a bibliographic visualization tool called CiteWiz. The tool contains a modified Growing Polygons visualization, suitably adapted to citation networks with linear time windows and process hierarchies, as well as a new static timeline visualization that maps the citation count and publication date of an article or author to its size and position on the timeline, respectively.

Keywords: causality visualization, causal relations, citation network visualization, information visualization.

 



November 17, 2004, 2:15 pm

Licentiate seminar by Xing Fan, Computer Engineering, Chalmers University of Technology:

Real-Time Communication Services for Distributed Computing over Switched Ethernet

Room: Wigforssalen, Visionen, Halmstad University

Discussion leader: Professor Anna Brunström, Department of Information Technology, Karlstad University.

ABSTRACT:

In modern and future parallel and distributed processing, a large part of the computation overhead comes from communication. This overhead can be minimized if the network protocol offers the user services aimed at the specific types of communication used in these applications. Other important properties of distributed processing applications are time-deterministic latency and guarantees of meeting deadlines. Moreover, an important trend is to implement distributed real-time applications on top of standard Ethernet-based networks. In this thesis, we therefore focus on developing and analyzing efficient support for real-time communication services for distributed computing applications over switched Ethernet. The network architecture assumed is a switched Ethernet network with a single switch.

The work has resulted in proposals for switched Ethernet networks that offer additional features for parallel and distributed real-time processing. An active Ethernet switch concept is proposed to provide efficient support for different user services, including many-to-many communication and other group communication services with high traffic volumes of short messages. The real-time support for these special communication patterns is addressed by incorporating deadline-based scheduling in the switch and in the end nodes.

Moreover, this thesis addresses real-time services through an alternative solution in which the Earliest Deadline First (EDF) algorithm is used only in the source nodes to support real-time traffic with a guaranteed bit rate and an end-to-end worst-case delay bound. The thesis also reports a feasibility analysis for hard real-time traffic, which in addition produces figures on the minimum buffer sizes in the switch needed to guarantee the real-time demands. Differentiation of heterogeneous traffic is considered in the proposed system by placing traffic into several priority classes with distinctly different QoS levels.
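A minimal sketch of the EDF behaviour at a source node: among queued frames, always transmit the one with the earliest absolute deadline. Admission control and the worst-case delay analysis from the thesis are outside the scope of this illustration:

```python
import heapq

class EdfQueue:
    """Outgoing frame queue ordered by absolute deadline."""
    def __init__(self):
        self._heap, self._seq = [], 0
    def enqueue(self, deadline, frame):
        self._seq += 1                       # tie-breaker keeps FIFO order
        heapq.heappush(self._heap, (deadline, self._seq, frame))
    def dequeue(self):
        return heapq.heappop(self._heap)[2]  # earliest deadline first

q = EdfQueue()
q.enqueue(deadline=30, frame="sensor update")
q.enqueue(deadline=12, frame="control command")
q.enqueue(deadline=25, frame="status")
print(q.dequeue())   # -> control command
```

EDF is optimal among dynamic-priority policies on a single link, which is why confining it to the source nodes can still yield analysable end-to-end bounds when combined with a feasibility test.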

The performance of the proposed methods is evaluated in simulations and calculations. It is shown that the different Ethernet extensions, in many cases, are efficient choices for distributed computing systems.

Keywords: Switched Ethernet, Real-time, Parallel and distributed processing, Scheduling.

 


November 9, 2004, 10:15 am

Licentiate seminar by Ulf Norell, Computing Science, Chalmers University of Technology:

Implementing Functional Generic Programming

Room: VD, EDIT-building, Hörsalsvägen 11, Chalmers tekniska högskola, Göteborg

Discussion leader: Jeremy Gibbons, Oxford University Computing Laboratory, England


ABSTRACT:

Functional generic programming extends functional programming with the ability to parameterize functions on the structure of a datatype. This allows a programmer to implement certain algorithms once and for all, instead of re-implementing them for each datatype they apply to. Examples of such algorithms include simple traversals and pretty printing as well as more complex XML processing tools.

The topic of this dissertation is the implementation of functional generic programming. More precisely we address two particular questions: how can we reduce the amount of work required to implement generic programming languages, and how can we embed generic programming in an existing functional language.

To answer the first question we show how meta-programming can be used to quickly prototype generic programming languages. In particular we describe prototype implementations of two generic programming languages: PolyP and Generic Haskell. The prototypes are extremely light-weight while still retaining most of the functionality of the original languages. One thing that is missing, though, is a way of adding type systems to the prototypes.

In answer to the second question we show how generic programming can be embedded in Haskell by exploiting the class system. We describe a new version of PolyP (version 2) together with a compiler that compiles PolyP 2 code into Haskell. By compiling the polytypic library, PolyLib, with our compiler we get a library of generic functions for Haskell.

 


October 29, 2004, 10:15 am

Licentiate seminar by David Wahlstedt, Computing Science, Chalmers University of Technology:

Type Theory with First-Order Data Types and Size-Change Termination

Room: ES52, Rännvägen 6B, Chalmers tekniska högskola, Göteborg

Discussion leader: Erik Palmgren, Department of Mathematics, Uppsala University.

ABSTRACT:

We prove normalization for a dependently typed lambda-calculus extended with first-order data types and computation schemata for first-order size-change terminating recursive functions. Size-change termination, introduced by C.S. Lee, N.D. Jones and A.M. Ben-Amram, can be seen as a generalized form of structural induction, which allows inductive computations and proofs to be defined in a straightforward manner. The language can be used as a proof system, an extension of Martin-Löf's Logical Framework.
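The size-change termination criterion itself is easy to state algorithmically: close the set of size-change graphs under composition and require every idempotent self-call graph to have a strictly decreasing arc from some parameter to itself. The standalone checker below illustrates the criterion of Lee, Jones and Ben-Amram; the thesis's contribution, embedding it in a dependently typed language, is not reflected here:

```python
DOWN, EQ = 0, 1    # strict decrease / non-increase between argument sizes

def compose(g1, g2):
    """Compose size-change graphs for calls f->g and g->h into one for f->h."""
    return frozenset((x, DOWN if DOWN in (r1, r2) else EQ, z)
                     for (x, r1, y1) in g1
                     for (y2, r2, z) in g2 if y1 == y2)

def sct_terminates(calls):
    """calls maps (caller, callee) to a size-change graph, i.e. a frozenset
    of arcs (param, relation, param). True if size-change terminating."""
    closure = {(f, g, graph) for (f, g), graph in calls.items()}
    while True:
        new = {(f, h, compose(ga, gb))
               for (f, g, ga) in closure
               for (g2, h, gb) in closure if g == g2}
        if new <= closure:
            break
        closure |= new
    # Criterion: every idempotent self-call graph has a strict arc x -> x.
    return all(any(x == z and r == DOWN for (x, r, z) in g)
               for (f, h, g) in closure
               if f == h and compose(g, g) == g)

# f(x, y) calls f(y-1, x): the sizes swap and one strictly decreases.
g = frozenset({("y", DOWN, "x"), ("x", EQ, "y")})
print(sct_terminates({("f", "f"): g}))   # -> True
```

Note that the example is not structurally recursive in any single argument, yet the criterion accepts it, which is the sense in which size-change termination generalizes structural induction.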

Keywords: Type Theory, Dependent types, Lambda-calculus, Normalization, Size-Change Termination, Type system, Logical Framework, Reducibility, Pattern-matching, Term rewriting.


 



October 18, 2004, 10:00 am

Licentiate seminar by Ha Hoai Phuong, Computing Science, Chalmers University of Technology:

Reactive Shared Objects for Interprocess Synchronization

Room: ES51, Rännvägen 6B, Chalmers tekniska högskola, Göteborg

Discussion leader: Professor Roger Wattenhofer, Department of Computer Science, ETH, Zürich.


ABSTRACT:

In parallel processing environments such as multiprocessor systems, processes are synchronized using concurrent objects, which allow many concurrent processes to access them at the same time. The performance of these concurrent objects relies heavily on the load conditions of the surrounding environment (e.g. the operating system, other applications, the interconnection network), which vary and are unpredictable due to nondeterministic interference among processes.

Therefore, in order to minimize synchronization cost, a major overhead in parallel applications, we need to construct efficient concurrent objects that react to environment variations so as to achieve good performance under all possible system conditions. This thesis presents two new reactive shared objects: the "reactive multi-word compare-and-swap object" and the "self-tuning reactive tree".

The "multi-word" compare-and-swap objects are powerful constructs, which make the design of concurrent data objects much more effective and easier. The "reactive" multi-word compare-and-swap objects are advanced objects that dynamically measure the level of contention as well as the memory conflicts of the multi-word compare-and-swap operations, and in response, they react accordingly in order to guarantee good performance in a wide range of system conditions. We present two algorithms that are non-blocking (lock-free), allowing in this way a fast and dynamical behavior.

The self-tuning reactive trees distribute a set of processes to smaller groups accessing different parts of the memory in a coordinated manner, and moreover reactively adjust their size in order to attain efficient performance across different levels of contention. We present a data structure that in an on-line manner balances the trade-off between the tree traversal latency and the latency due to contention at the tree leaves.

The experiments on the SGI Origin2000, a well-known commercial ccNUMA multiprocessor, showed that the new reactive objects achieve significant improvements compared to previous results.

 


May 10, 2004, 1:15 pm

Licentiate seminar by Karin Hardell, Computing Science, Chalmers University of Technology:

Sequence-Based Structural and Functional Modelling of the Halobacterial Transducer of Rhodopsin II from Natronobacterium pharaonis

Room: ES53, Hörsalsvägen 11, Chalmers, Göteborg

Discussion leader: Lennart Sjölin, Department of Inorganic Chemistry, Göteborg University

ABSTRACT:

The aim of this work has been to construct a functional and structural model of one of the proteins involved in archaebacterial swimming behavior, or taxis, namely the halobacterial transducer of sensory rhodopsin II from Natronobacterium pharaonis. The recently published crystal structure of this transducer, complexed with its receptor sensory rhodopsin, lacks information on functionally and structurally important parts of the transducer. Therefore, we aimed at constructing a model that could compensate for the lack of structural data and thus give us further clues about the function and structure of the transducer as well as of other proteins involved in taxis.

A multiple sequence alignment of related proteins involved in bacterial and archaeal taxis enabled us to relate available structural data to conserved features among the sequences. In particular, we emphasize the role and structure of a common sequence motif involved in signaling, the HAMP domain, in the process of taxis. We present a novel model for binding between the transducer and its sensor, in which we propose inter-dimer contacts in addition to the already known intra-dimer contacts between the transducer and sensory rhodopsin. Preliminary experimental studies have been conducted in an effort to elucidate the structure of a HAMP domain within the transducer, which plays an important role in signal transduction. These experimental studies have strengthened our proposed structural and functional models.

 


April 20, 2004, 1:30 pm

Licentiate seminar by Mattias Weckstén, Computing Science, Chalmers University of Technology:

Resource Budgeting as a Tool for Reduced Development Cost for Embedded Real-Time Computer Systems

Room: Wigforssalen, Kristian IV:s väg, Halmstad University.

Discussion leader: Dr Jakob Axelsson, Volvo Car Corporation, Göteborg


ABSTRACT:

Wouldn't it be great if there were a systematic method for deriving non-functional constraints at design time, one that made it possible to verify the design and turn implementation into a much clearer task? Such methods are needed because systems of ever-increasing complexity have to be developed, and the cost of failure has proven too high. The problem is how to refine design-time constraints into implementation-time constraints while maintaining traceability for the individual constraints, and how to get early indications of whether a project is about to fail.

A method for deriving implementation-time constraints has been developed and is presented in this thesis. Along with the basic method, several extensions are proposed. The practical usefulness of the method and its scalability have been evaluated. To establish the method's value in real development projects, a method for evaluating the usability of this kind of method has also been developed. The evaluation of practicality shows that it is possible to find close-to-optimal solutions (within a few percent) in a short time (within minutes). The evaluation of scalability shows that the run time for finding implementable solutions scales polynomially with the size of the task graph. The evaluation of usability shows that the proposed method always leads to lower development cost than an ad hoc method in cases where the implementation is about to fail.

Keywords: Real-time systems, embedded, resource budgeting, design tool, tightness optimization, guarantees.

 


March 24, 2004, 1:15 pm

Licentiate seminar by Mindaugas Drazdziulis, Computer Engineering, Chalmers University of Technology:

Sleep-Mode Circuit Techniques in the Presence of Gate Leakage


Room: EL43, Rännvägen 6B, Chalmers

Discussion leader: Anders Edman, Acreo AB, Norrköping

ABSTRACT:

In recent years the market for portable applications, which place increasing demands on battery lifetime, has grown dramatically. There is also demand for more functionality per chip, which requires more transistors to be integrated on a single chip. Due to these trends, the total power dissipation of a chip is growing, dictating a continuous need for solutions that improve the development of power-aware electronics.

In low-power applications, control over the power dissipated by leakage currents is as important as control over the switched power dissipation. In the near future, leakage power will increase dramatically, partly due to emerging leakage mechanisms such as gate direct tunneling, which grows exponentially as the gate insulator thickness shrinks. Gate direct tunneling currents, also referred to as gate leakage, will soon reach the same magnitude as subthreshold leakage and will continue to grow unless advances are made at the process technology level, the circuit level, or both.

The appearance of gate leakage increases the need to re-evaluate previously introduced leakage-power-saving circuit techniques, since most of them initially targeted only subthreshold leakage. In this thesis, the efficiency of established techniques is evaluated in terms of total (subthreshold and gate) leakage suppression, and a new gate-leakage suppression technique is proposed that efficiently reduces total leakage when a logic circuit is in sleep mode.

 


February 5, 2004, 10:15 am

Licentiate seminar by Janna Khegai, Computing Science, Chalmers University of Technology:

Language Engineering in Grammatical Framework (GF)

Room: EC, Hörsalsvägen 11, Chalmers, Göteborg

Discussion leader: Mats Wirén, Telia Research, Stockholm

ABSTRACT:

This thesis describes a number of practical experiments, rather than theoretical investigations, in the area of natural language processing. The basis for the work presented is the Grammatical Framework (GF), a complex system comprising, among other things, a grammar formalism based on type theory and its implementation, written in Haskell. GF is intended for high-quality machine translation (of the interlingua type) in restricted language domains.

The primary concern of this thesis is, however, limited to the usage of GF as a piece of software. The main results are:

• Implementing a syntax editor, which provides a graphical user interface (GUI) for the command-line GF core.

• Writing part of the code for automatic generation of gramlets: pure Java programs with limited (compared to GF) functionality that can run on PDAs (Personal Digital Assistants) and as applets in a browser.

• Writing the Russian resource grammar, which takes care of the most basic morphological and syntactic rules and serves as a standard library for building application grammars (describing restricted language domains) in Russian.

These results contribute to language engineering in GF on two different levels:

• Author level (end-user): constructing sentences in natural languages.

• Grammarian level: building a grammar description, which is later used on the author level.

The last part of the thesis deals with a non-linguistic domain: an experiment in which we apply functional parsing techniques to the well-known problem of protein secondary structure prediction in bioinformatics.

Keywords: syntax editing, multilingual authoring, graphical user interface (GUI), interlingua, natural language processing (NLP), computational linguistics, machine translation (MT)

 


January 19, 2004, 1:15 pm

Licentiate seminar by Carl-Johan Lillieroth, Computing Science, Chalmers University of Technology:

Formal Verification of FPGA Cores

Room: HA2, Hörsalsvägen 4, Chalmers, Göteborg

Discussion leader: Lars Philipson, Lund University

ABSTRACT:

Field Programmable Gate Arrays (FPGAs) offer a flexible way of designing hardware-based systems by configuring a programmable hardware device. For FPGAs to be used effectively in electronic design, it has become necessary to use ready-made blocks, Intellectual Property cores (IP-cores). The use of IP-cores moves the verification effort from the user of the core, who uses it to build a system on the FPGA, to the core vendor.

This thesis describes one way of using formal verification when verifying IP-cores for a specific FPGA, the Xilinx Virtex. Xilinx provides a wide range of cores for Virtex, from simple arithmetic circuits, counters and registers to interface circuits and entire microprocessors. For the formal verification, the core to be verified is represented in a standard netlist format (EDIF). It is compared to one of three possible netlists: a reliable netlist with the same functionality, a netlist synthesised from a behavioural (VHDL/Verilog) simulation model of the core, or a netlist synthesised from a behavioural specification. Standard synthesis tools (Synopsys, Synplicity, Leonardo) are used for the synthesis. The netlists are translated into LUSTRE programs to allow for equivalence checking using an automatic verification tool (Lucifer). This tool uses a propositional satisfiability (SAT) solver for verification of combinational circuits and SAT-based induction for sequential circuits. The SAT procedure is built on Stålmarck's method. If the circuits are found to differ, a simulation script for a common simulator (ModelSim) is generated. This script can be used to point out a difference in behaviour between the circuits when run with HDL models of the circuits. If such models are not readily available, they can be generated directly from the netlists using a tool in Xilinx's design flow.
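As a toy illustration of equivalence checking with a miter (two circuits are equivalent exactly when the pairwise XOR of their outputs can never be 1), the sketch below compares two 2-bit adders by exhaustive enumeration; Lucifer instead uses a SAT procedure, which is what makes the approach scale:

```python
from itertools import product

def ripple_adder2(a0, a1, b0, b1):
    """Reference 2-bit ripple-carry adder 'netlist'."""
    s0 = a0 ^ b0
    c = a0 & b0
    s1 = a1 ^ b1 ^ c
    c2 = (a1 & b1) | (c & (a1 ^ b1))
    return s0, s1, c2

def adder2_alt(a0, a1, b0, b1):
    """A differently 'synthesised' version to check against the reference."""
    total = (a0 + 2 * a1) + (b0 + 2 * b1)
    return total & 1, (total >> 1) & 1, (total >> 2) & 1

def equivalent(c1, c2, n_inputs):
    """Miter check: outputs must agree on every input assignment."""
    return all(c1(*bits) == c2(*bits)
               for bits in product((0, 1), repeat=n_inputs))

print(equivalent(ripple_adder2, adder2_alt, 4))  # -> True
```

When the check fails, any disagreeing input assignment is exactly the counterexample that the generated ModelSim script replays against the HDL models.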

This work was partly carried out at Xilinx and resulted in the verification tool ARGUS, which automates the verification of cores, from netlists to error simulation scripts. ARGUS was integrated into the design flow for cores, and experiments with verification of industrial cores were carried out. The verification efforts at Xilinx resulted in the first published results on verification of real circuits using SAT-based induction. Also, bugs in real circuits were revealed and corrected.

To allow for translation of netlists into LUSTRE, we present a technique for creating formal models of FPGA components using the languages LUSTRE and Haskell. This technique is incorporated into ARGUS. It handles components that contain both state behaviour controlled by the clock and behaviour not controlled by the clock. Also integrated into ARGUS is a technique for describing properties that allows for equivalence checking of sequential pipelined circuits. Thus, circuits with different pipeline lengths (latencies) can be compared directly and with ease.

Keywords: formal verification, FPGAs, cores, intellectual property, SAT-based equivalence checking, SAT-based induction, pipelined circuits.



 
