GF Resource Grammar Library v. 1.2

Author: Aarne Ranta <aarne (at) cs.chalmers.se>
Last update: Fri Dec 21 18:15:24 2007

The GF Resource Grammar Library defines the basic grammar of ten languages: Danish, English, Finnish, French, German, Italian, Norwegian, Russian, Spanish, and Swedish. Incomplete implementations of Arabic and Catalan are also included.

New in December 2007: browsing the library with a syntax editor directly on the web.

Authors

Inger Andersson and Therese Soderberg (Spanish morphology), Nicolas Barth and Sylvain Pogodalla (French verb list), Ali El Dada (Arabic modules), Magda Gerritsen and Ulrich Real (Russian paradigms and lexicon), Janna Khegai (Russian modules), Bjorn Bringert (many Swadesh lexica), Carlos Gonzalía (Spanish cardinals), Harald Hammarström (German morphology), Patrik Jansson (Swedish cardinals), Andreas Priesnitz (German lexicon), Aarne Ranta, Jordi Saludes (Catalan modules), Henning Thielemann (German lexicon).

We are grateful to several other people who have used this and previous versions of the resource library for their contributions and comments, including Ludmilla Bogavac, Ana Bove, David Burke, Lauri Carlson, Gloria Casanellas, Karin Cavallin, Robin Cooper, Hans-Joachim Daniels, Elisabet Engdahl, Markus Forsberg, Kristofer Johannisson, Anni Laine, Hans Leiß, Peter Ljunglöf, Saara Myllyntausta, Wanjiku Ng'ang'a, Nadine Perera, Jordi Saludes.

License

The GF Resource Grammar Library is open-source software licensed under the GNU Lesser General Public License (LGPL). See the file LICENSE for more details.

Scope

Coverage, for each language:

Organization:

Presentation:

Location

Assuming you have installed the libraries, you will find the precompiled gfc and gfr files directly under $GF_LIB_PATH, whose default value is /usr/local/share/GF/. The precompiled subdirectories are

    alltenses
    mathematical
    multimodal
    present

For instance, run

    cd $GF_LIB_PATH
    gf alltenses/langs.gfcm
      
    > p -cat=S -lang=LangEng "this grammar is too big" | tb

For more details, see the Synopsis.

Compilation

If you want to compile the library from scratch, use make in the root of the source directory:

    cd GF/lib/resource-1.0
    make

The make procedure does not build Arabic and Catalan by default, but you can uncomment the relevant lines in the Makefile to compile them.

Encoding

Finnish, German, Romance, and Scandinavian languages are in ISO Latin-1 (ISO-8859-1).

Arabic and Russian are in UTF-8.

English is in pure ASCII.

Unfortunately, the different encodings make it hard to get a good view of all languages simultaneously. The easiest way to achieve this is to use gfeditor, which automatically converts grammars to UTF-8.
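
Outside gfeditor, text saved in ISO Latin-1 (for example, linearizations written to a file) can be converted to UTF-8 with the standard iconv tool. The following is a minimal sketch, assuming iconv is installed; the helper name latin1_to_utf8 and the sample file are invented for this example, and this is not a description of how gfeditor itself does the conversion.

```shell
# Convert a file from ISO Latin-1 (ISO-8859-1) to UTF-8.
# Usage: latin1_to_utf8 INPUT OUTPUT
# (latin1_to_utf8 is a hypothetical helper name, not a GF command.)
latin1_to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8 "$1" > "$2"
}

# Demo on a throwaway file: the Finnish word "päivä" written as
# Latin-1 bytes (octal 344 is the Latin-1 code for "ä").
sample=$(mktemp)
printf 'p\344iv\344\n' > "$sample"
latin1_to_utf8 "$sample" "$sample.utf8"

# Show the byte counts. The UTF-8 copy is two bytes longer, because
# each "ä" takes two bytes in UTF-8 instead of one.
wc -c < "$sample"
wc -c < "$sample.utf8"
```

Files that are already in UTF-8 (Arabic, Russian) or pure ASCII (English) should not be passed through this conversion.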

Using the resource as library

This API is accessible from both present and alltenses. The modules you most often need are

The Synopsis gives examples of the typical usage of these modules.

Using the resource as top level grammar

The following modules can be used for parsing and linearization. They are accessible from both present and alltenses.

In addition, there is in both present and alltenses the file

A way to test and view the resource grammar is to load langs.gfcm either into gfeditor or into the gf shell and perform actions such as syntax editing and treebank generation. For instance, the command

    > p -lang=LangEng -cat=S "this grammar is too big" | tb

creates a treebank entry with translations of this sentence.

For parsing, currently only English and the Scandinavian languages are within the limits of reasonable resources. For any other language L, parsing with LangL will probably exhaust the computer's resources before parser generation finishes.

Accessing the lower level ground API

The Syntax API is implemented in terms of a set of abstract modules, which as of version 1.2 are mainly interesting for implementors of the resource. See the documentation for version 1.1 for more details.

Known bugs and missing components

Danish

English

Finnish

French

German

Italian

Norwegian

Russian

Spanish

Swedish

More reading

Synopsis. The concise guide to API v. 1.2.

Grammars as Software Libraries. Slides with background and motivation for the resource grammar library.

GF Resource Grammar Library Version 1.0. Slides giving an overview of the library and practical hints on its use.

How to write resource grammars. Helps you start if you want to add another language to the library.

Parametrized modules for Romance languages. Slides explaining some ideas in the implementation of French, Italian, and Spanish.

Grammar writing by examples. Slides showing how linearization rules are written as strings parsable by the resource grammar.

Multimodal Resource Grammars. Slides showing how to use the multimodal resource library. N.B. the library examples are from multimodal/old, which is a reduced-size API.

GF Resource Grammar Library (pdf). Printable user manual with API documentation, for version 1.0.