CLT, Centre for Language Technology, a focus area of research of the University of Gothenburg.
GF Resource Grammar Summer School, a collaborative effort to extend the GF Resource Grammar Library, 17-28 August 2009.
GoTAL, an international conference on Natural Language Processing held 25-27 August 2008, was organized by the group. The proceedings appeared as Springer LNCS/LNAI vol. 5221.
The Language Technology Group at the Department of Computer Science and Engineering was founded in 2001. It built on the Department's earlier competence areas:
With this background, the group initially focused on precision-oriented tasks rather than on wide coverage.
In more recent years, the efforts have been extended to the creation of tools and resources usable in all kinds of language technology tasks:
The main characteristics identifying our group are
Currently (February 2008), the group has 8 members with a PhD and 4 PhD students. Some of the senior members have their principal affiliations at the Departments of Linguistics and Swedish Language.
Grammatical Framework (GF),
a multilingual grammar formalism based on the idea of a shared
abstract syntax and mappings between the abstract syntax and
concrete languages. GF has hundreds of users all over the world.
The GF Resource Grammar Library
implements the morphology (inflection) and basic syntax (phrase structure)
of 16 languages:
Bulgarian, Catalan, Danish, Dutch, English,
Finnish, French, German, Italian, Norwegian,
Polish, Romanian, Russian, Spanish, Swedish,
These resources are
freely available as open-source software. More languages are under construction,
in both in-house and external projects.
The Numeral Translator
is a demo of embedded grammars in
Java. It translates number words between 88 languages.
The Letter Editor
is another demo of embedded grammars in
Java. It allows the user to write a letter in languages she doesn't know
while viewing it in a language she knows.
The Pizza ordering system
is a demo of an integrated speech language model.
In browsers supporting the W3C standards used (e.g. Opera on Windows), the user
can construct an order by using spoken language.
Lexicon Extraction from Raw Text Data,
a tool for collecting a morphological lexicon by the use of inflection paradigms.
a Haskell library for developing inflection engines and morphological lexica.
Unsupervised Learning of Morphology,
a technique usable for languages with scarce resources.
A Fine-Grained Model for Language Identification,
a technique usable for short passages and language switching.
The BNF Converter (BNFC) is a high-level, multi-backend compiler construction tool, inspired by GF. It has thousands of users and is included in Linux distributions such as Debian and Ubuntu.
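The idea behind GF described above, a shared abstract syntax with mappings to several concrete languages, can be illustrated with a small sketch. This is plain Python, not GF code and not the GF API. All names in it (Pred, lin_eng, lin_ger, parse, translate) and the four-word vocabulary are hypothetical and chosen only for illustration. Translation is parsing into an abstract tree followed by linearization in another language. Parsing is done here by naive generate-and-test, which only works because the toy grammar is finite.

```python
from dataclasses import dataclass
from itertools import product

# Abstract syntax: language-independent trees (hypothetical toy grammar).
@dataclass(frozen=True)
class Pred:
    kind: str     # e.g. "Pizza"
    quality: str  # e.g. "Delicious"

KINDS = ("Pizza", "Wine")
QUALITIES = ("Delicious", "Expensive")

# Concrete syntax for English: one string per abstract constant.
ENG = {"Pizza": "pizza", "Wine": "wine",
       "Delicious": "delicious", "Expensive": "expensive"}

# Concrete syntax for German: nouns carry a gender that selects the determiner.
GER_NOUN = {"Pizza": ("Pizza", "f"), "Wine": ("Wein", "m")}
GER_ADJ = {"Delicious": "lecker", "Expensive": "teuer"}
GER_THIS = {"f": "diese", "m": "dieser"}

def lin_eng(tree: Pred) -> str:
    """Linearize an abstract tree as an English sentence."""
    return f"this {ENG[tree.kind]} is {ENG[tree.quality]}"

def lin_ger(tree: Pred) -> str:
    """Linearize an abstract tree as a German sentence with gender agreement."""
    noun, gender = GER_NOUN[tree.kind]
    return f"{GER_THIS[gender]} {noun} ist {GER_ADJ[tree.quality]}"

LINEARIZERS = {"Eng": lin_eng, "Ger": lin_ger}

def parse(sentence: str, lang: str) -> list:
    """Naive parsing: enumerate all trees and keep those that linearize to the input."""
    lin = LINEARIZERS[lang]
    trees = (Pred(k, q) for k, q in product(KINDS, QUALITIES))
    return [t for t in trees if lin(t) == sentence]

def translate(sentence: str, src: str, tgt: str) -> list:
    """Translate by parsing in the source language and linearizing in the target."""
    return [LINEARIZERS[tgt](t) for t in parse(sentence, src)]

if __name__ == "__main__":
    print(translate("this wine is expensive", "Eng", "Ger"))
```

Adding a language in this sketch means adding one more linearization function over the same trees. No pairwise translation rules between languages are needed.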