Grammatical Framework Version 2.2

Highlights of GF version 2.2.

9/5/2005

Aarne Ranta

Summary of novelties in Version 2.2 in comparison to 2.1

  • New optimizations to reduce the size of GFC files
  • Improved parsing algorithms
  • Lots of bug fixes
  • Separate reuse modules no longer needed
  • Several new command options
  • New documentation
  • New resource libraries
  • New example grammars
  • Visualization of module dependency graph
  • In the editor GUI, text corresponding to subtrees with constraints is marked in red
  • Hierarchical modules used in the source code
  • Haddock documentation available for the source code
  • Optimizations to reduce GF's memory footprint when using large grammars
  • The pm command can now convert identifiers in the grammar to UTF-8

    Compiler optimizations

    The sometimes exploding size of generated gfc and gfr files has made it urgent to find optimizations that reduce the size of the code. Five combinations of optimizations can be chosen as the value of the optimize flag: share, parametrize, values, all, and none. The share and parametrize optimizations are always beneficial, whereas the values optimization may slow down the use of tables. However, values is very effective for grammars that consist mostly of the inflection tables of lexical items: it can reduce the file size by a factor of 4.

    An optimization can be selected individually for each resource and concrete module by including the judgement

      flags optimize=(share|parametrize|values|all|none) ;
    
    in the module body. These flags can be overridden by a flag given in the i command, e.g.
      i -src -optimize=none Foo.gf
    
    Note that the option -src is needed if generated files created with other optimization flags already exist.
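
    As an illustration, a module that consists mostly of inflection tables could select the values optimization as sketched below. The modules Foo and FooEng and their contents are hypothetical and only show where the flags judgement goes; the sketch has not been checked against a GF 2.2 installation.

```gf
-- Foo.gf (hypothetical abstract syntax)
abstract Foo = {
  cat N ;
  fun dog_N : N ;
}

-- FooEng.gf (hypothetical concrete syntax)
concrete FooEng of Foo = {
  flags optimize=values ;  -- applies to this module only

  param Number = Sg | Pl ;
  lincat N = {s : Number => Str} ;
  lin dog_N = {s = table {Sg => "dog" ; Pl => "dogs"}} ;
}
```

    Importing this grammar with i -src -optimize=none FooEng.gf would then override the flag given in the module body.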

    Important notice: If you use the Embedded GF Interpreter or the improved parsing algorithms described below, only the optimize values none, share and values can be used; the stronger optimizations are not yet supported. Currently, GF aborts and reports an error if a stronger optimization is used when creating the grammar for the Embedded GF Interpreter or when trying to parse.

    Improved parsing algorithms

    We have implemented some of the parsing algorithms described in Peter Ljunglöf's PhD thesis. There are now three parsers: the default parser, and the parsers selected with the options -cfg and -mcfg. The option -parser=X selects the parsing strategy. The default parser has the strategies chart, bottomup, topdown and old, with chart being the default. The -cfg and -mcfg parsers only recognize the bottomup and topdown strategies.

    Note that the -cfg and -mcfg parsers can take a very long time on their first call, since they have to convert the GF grammar. This will only happen once in a GF run, provided the GF files are not changed.
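
    The strategy option can be combined with each parser. The following session is a sketch assembled from the option descriptions in this section, using the grammar and sentence of the resource library example; the exact spelling of the combinations is an assumption and has not been verified against a running GF 2.2.

```gf
> i -src -optimize=share lib/resource/english/LangEng.gf

> p -cat=S -parser=topdown "you will be running"
{Comment: default parser with the topdown strategy instead of chart}

> p -cat=S -cfg -parser=bottomup "you will be running"
{Comment: -cfg parser; only bottomup and topdown are recognized here}

> p -cat=S -mcfg -parser=topdown "you will be running"
{Comment: -mcfg parser with the topdown strategy}
```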

    Tips for choosing the best parser for your grammar: try the default parser first; if it is too slow, try the other two. Remember that they will be very slow the first time you parse, since they have to build parsing information. The -cfg parser is best on grammars with many parameters and inflection tables, and the -mcfg parser is even better when the grammar also has discontinuous constituents.

    Here is a small example from the resource library:

    > i -src -optimize=share lib/resource/english/LangEng.gf
    > p -cat=S ""
    > p -cat=S -cfg ""
    > p -cat=S -mcfg ""
    {Comment: Just some dummy parsing calls to calculate the parsing information}
    
    > p -cat=S -rawtrees=200000 "you will be running"
    {Comment: Nr of unfiltered trees: 169296 -- 99,996% of the trees are ill-typed}
    
    UseCl (PosTP TFuture ASimul) (SPredProgVP thou_NP (IPredV AAnter run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP thou_NP (IPredV ASimul run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP ye_NP (IPredV AAnter run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP ye_NP (IPredV ASimul run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP you_NP (IPredV AAnter run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP you_NP (IPredV ASimul run_V))
    
    17730 msec
    
    > p -cat=S -cfg "you will be running"
    {Comment: Nr of unfiltered trees: 246 -- 97,5% of the trees are ill-typed}
    
    UseCl (PosTP TFuture ASimul) (SPredProgVP thou_NP (IPredV AAnter run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP thou_NP (IPredV ASimul run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP ye_NP (IPredV AAnter run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP ye_NP (IPredV ASimul run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP you_NP (IPredV AAnter run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP you_NP (IPredV ASimul run_V))
    
    1580 msec
    
    > p -cat=S -mcfg "you will be running"
    {Comment: Nr of unfiltered trees: 6 -- all trees are type-correct}
    
    UseCl (PosTP TFuture ASimul) (SPredProgVP thou_NP (IPredV AAnter run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP thou_NP (IPredV ASimul run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP ye_NP (IPredV AAnter run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP ye_NP (IPredV ASimul run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP you_NP (IPredV AAnter run_V))
    UseCl (PosTP TFuture ASimul) (SPredProgVP you_NP (IPredV ASimul run_V))
    
    470 msec
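
    The figures in this session can be checked with a little arithmetic. All three parsers return the same 6 trees, so the share of unfiltered trees that are ill-typed (default, -cfg, -mcfg) and the speedups of -cfg and -mcfg relative to the default parser come out at roughly:

```latex
\[
\frac{169296 - 6}{169296} \approx 99.996\%, \qquad
\frac{246 - 6}{246} \approx 97.6\%, \qquad
\frac{6 - 6}{6} = 0\%
\]
\[
\frac{17730}{1580} \approx 11.2, \qquad
\frac{17730}{470} \approx 37.7
\]
```

    These timings come from one sentence in one session, so the ratios are indicative only.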