CLT retreat 2019

Bohusgården, 7–8 May

The Centre for Language Technology (CLT) in Gothenburg is an organization for collaboration between LT researchers at the Faculty of Arts and the IT Faculty at the University of Gothenburg and Chalmers University of Technology.

In recent years our activities have been limited to the irregular CLT retreat, where we have gathered in a pleasant location to inform each other about all language technology activities and research that are going on here in Gothenburg. The most recent such gathering was in November 2017, more than one year ago. Then we decided that the CLT retreat should be a more regular event, so here it is...


There will be time for everyone to give a presentation. A few (7) of them will be normal 15-minute presentations, but the majority (23) will be short 3–5 minute presentations followed by group discussions. There will also be a tutorial.

Tuesday 7 May

8.30 Bus leaves from Olof Wijksgatan
9.30 Bus arrives at Bohusgården
9.45 Fika
10.15 Session 1A
  Welcome (10 min): Peter Ljunglöf
  Long talk (15 min): Johannes Graën
  Short talks (4×5 min) + group discussions (2×15 min):
Benjamin Lyngfelt [slides] (Tjörn),
Staffan Larsson (Orust),
Dana Dannélls, Richard Johansson (Nordens sal)
11.45 Lunch
13.30 Session 1B
  Long talk (15 min): Sasha Berdicevskis
  Short talks (6×5 min) + group discussions (2×15 min):
David Alfter, Vidya Somashekarappa (Tjörn),
Kyoko Ohara [slides], Bill Noble (Orust),
Dimitrios Kokkinakis, Jacobo Rouces (Nordens sal)
15.00 Fika
15.30 Session 1C
  Long talk (15 min): Vladislav Maraev
  Short talks (6×5 min) + group discussions (2×15 min):
Anna Lindahl, Herbert Lange (Tjörn),
Inari Listenmaa, Mehdi Ghanimifard (Orust),
Prasanth Kolachina, Stian Rødven Eide (Nordens sal)
17.00 (all work and no play makes Jack a dull boy)
18.30 Dinner
(z z z)

Wednesday 8 May

7.30 Breakfast
9.00 Session 2A
  Tutorial (50 min): Distributional semantics and generalized event knowledge, by Asad Sayeed
10.00 Fika
10.30 Session 2B
  Short talks (7×5 min) + group discussions (2×15 min):
Adam Ek, Anne Schumacher [slides], Dan Rosén (Nordens sal),
Kathrein Abu Kwaik, David Bamutura (Orust),
Luis Nieto Piña, Felix Morger (Tjörn)
11.45 Lunch
13.30 Session 2C
  Long talks (4×15 min):
Ann Lillieström,
Shafqat Mumtaz Virk,
Simon Dobnik,
Lars Borin
  Final discussion (10 min)
15.00 Fika
15.45 Bus leaves from Bohusgården
16.45 Bus arrives at Olof Wijksgatan

Instructions for short talks, group discussions, and their chairs

There will be 23 short talks, with following group discussions. This is a new concept, so you are guinea pigs! Here's the idea:

Short talk

The short talks should be 3–5 minutes, and introduce a research problem, interesting phenomenon, or anything that the speaker thinks is interesting enough. (E.g., something in the spirit of "this is the cool problem I'm trying to solve, and these are my results so far".)

If you don't want to use a computer presentation, that's perfectly fine. But if you have prepared a presentation, then please make sure that your computer works with the projector. Or put the PDF/PPT on a USB stick and give it to someone else.

Group discussion

Then we will divide the audience into 2–3 groups (consisting of 8–15 people each). Every group will be assigned two of the speakers, and will discuss the problems/phenomena/results that the speakers introduced, for 15 minutes per speaker (i.e., a total of 30 minutes).

It's the speaker who decides what to do with their time: E.g., they can continue the presentation in more detail (like a regular talk), or they can show a demo, engage in a discussion, try to find common interests, etc.

This means that every (short talk) speaker will discuss their subject with a smaller group, and that every audience member will listen to 6–8 short talks and get to know two of them in more detail.

Discussion chairs

For this to work, every discussion will need a chairperson, who keeps track of the time and makes sure that the discussion stays on topic, is respectful, and keeps good manners.

Try to prepare some questions to bring up if the discussion stalls. They don't have to be intricate questions; things like "What results do you have?", "What's your next project?", "What collaboration would be interesting?", …

Instructions for normal talks, and their chairs

Normal talk

Note that the normal talks have to be slightly shorter than the usual conference talk. You have 15 minutes for your presentation, plus 5 minutes for questions.

Session chairs

This is more-or-less standard procedure, but please make sure that the speaker keeps within their 15 minutes. Also, try to prepare a fall-back question to ask afterwards if the audience is silent.

Participants (in alphabetical order)

Aarne Ranta, CSE and Digital Grammars AB

I was born in Finland, took my Master's degree at the University of Helsinki, and did my PhD studies in Stockholm but defended my PhD in Helsinki in 1990. I worked in Helsinki till 1997, after that at Xerox Research Centre Europe in Grenoble, and moved to Gothenburg in 1999. I am Professor of Computer Science at the CSE Department. I am also a founder and currently the CEO of the start-up company Digital Grammars AB.

I was the first developer of Grammatical Framework (GF), which started at Xerox in 1998. I am particularly interested in grammatical abstractions over the world's languages. My most recent interests have been practical and commercial applications of grammars (in the company) and the division of labour between grammars and data-driven methods (at the university).

Adam Ek, CLASP

I am from Sweden, and moved to Gothenburg to write a PhD in computational linguistics at CLASP in 2019. I have a bachelor's and magister degree in computational linguistics from Stockholm University.

My research is mainly focused on entailment, specifically on combining deep learning and symbolic approaches. My main interests are in multi-task learning and combining different tasks to create representations of sentences. Currently, my main focus is on using constituent trees and dependency graphs to represent sentences.

Aleksandrs (Sasha) Berdicevskis, Språkbanken

I am from Latvia. I studied linguistics in Moscow and did my PhD in Bergen, Norway, in 2013. After that, I worked as a postdoc in computational historical linguistics in Tromsø, then as a postdoc in evolutionary linguistics / typology in Uppsala. On April 1, I moved to Gothenburg to become a researcher/infrastructure expert at Språkbanken.

I want to know how and why languages change. I am particularly interested in investigating which extra-linguistic factors affect language change, if any; in building and using corpora for research on language change and variation; and in quantitative approaches to typology.

Ann Lillieström, CSE

I grew up in Gothenburg. After spending a few years working in a cinema in Dublin in the late 90s and early 2000s, I returned to Sweden to take courses in mathematics, statistics, logic and computer science at the University of Gothenburg. I took my Master's degree in 2008, after which I started working as a project assistant in the computer science department at Chalmers. In 2009, I started a PhD in automated reasoning. Ten years and two children later, I am now about to finish my PhD and thinking about what to do next.

Recently, I have been working on applying constraint solving to morphological segmentation. Before that, I developed tools to help with automated reasoning in first-order logic: disproving the existence of finite models, efficient translation between sorted and unsorted logic, and efficient handling of transitive relations. I am on the lookout for problems in linguistics where automated reasoning or constraint solving could help.

Anna Lindahl, Språkbanken

I'm from Gothenburg. I studied linguistics at the University of Gothenburg, with some detours in other subjects. I took the Master's in Language Technology in 2017, and this year I started as a PhD student at Språkbanken.

The subject of my PhD studies is argumentation mining, where the goal is to identify arguments in Swedish texts. I'm currently evaluating a previous argument annotation project, and researching new annotation guidelines.

Anne Schumacher, Språkbanken

I am from Germany and have been living in Sweden since 2012. I have studied linguistics and language technology in Lund and Gothenburg and have been working as a software developer for Språkbanken since 2014.

As a software developer I don't do any research but I am generally interested in phonetics and speech technology. At Språkbanken I mostly work with improving our annotation pipeline Sparv and I prepare corpora for import into our corpus search tool Korp.

Asad Sayeed, CLASP

I am originally from Canada. After computer science Bachelor's and Master's degrees at Carleton University and the University of Ottawa, I did a computer science Ph.D. (2011) with the computational linguistics lab at the University of Maryland, College Park. I then moved to Germany as a researcher at Saarland University until 2017, when I took up a position at the University of Gothenburg with FLoV/CLASP as Associate Senior Lecturer.

My main research area is computational psycholinguistics, particularly sentence processing from a syntactic, semantic, and pragmatic point of view, using machine learning and crowdsourcing techniques to model human linguistic behaviours and judgements at the sentence level. I cover a wide range of interests such as semantic roles/thematic fit and the psycholinguistics of quantifier ambiguity.

Benjamin Lyngfelt, Dept. of Swedish

I'm a professor of Swedish linguistics. I did a PhD on Swedish syntax in 2002 at the University of Gothenburg and have been at the Dept. of Swedish, GU, ever since (at times blended with work for the Swedish Language Council and Karlstad University).

I'm a grammarian, with quite broad research interests although not a computational linguist. The main focus of my research for the last few years has been the development of a Swedish constructicon, which is part of Karp, the lexical platform of Språkbanken. Future research interests include further advances towards a multilingual constructicon infrastructure and, to a somewhat lesser extent, the related development of a multilingual FrameNet.

Bill Noble, CLASP

I'm originally from the US. I finished a Masters of Logic at the University of Amsterdam in 2015. After that, I worked in industry in New York City for a couple of years before joining CLASP as a PhD student.

I'm using computational methods to study social aspects of language use. My interests include dialogue modeling, semantic change, and linguistic variation.

Dan Rosén, Språkbanken

I am almost from Gothenburg and my degrees are from GU and Chalmers: Computer Science M.Sc. 2012, Licentiate in program verification 2016. I have worked as a systems developer at Språkbanken since 2016.

Machine learning is now my main focus. I want to work on multi-task learning, semi-supervised learning and probe what neural networks learn and how they represent knowledge. I am currently working on dependency parsing.

Dana Dannélls, Språkbanken

I was born in Tel-Aviv, Israel and moved to Lerum, Sweden after my military service in 1997. I completed my bachelor's degree in Electronics and Computer Engineering in 2002, and my master's degree in Computational Linguistics at the University of Gothenburg in 2006. In 2013 I successfully defended my PhD that was funded by the University of Gothenburg and Graduate School of Language Technology (GSLT). The following year I was a postdoctoral researcher at the department of Computer Science and Engineering at Chalmers. Today I work as a researcher and a research project coordinator at Språkbanken.

During my doctoral and postdoctoral studies I explored methods for generating multilingual natural language from structured formal representations including Semantic Web ontologies. Another part of my research includes corpus-based approaches to developing lexical resources for natural language technology applications. Recently I have been involved in several OCR research projects, where I am particularly interested in reducing OCR errors with the help of electronic lexicons and word lists. An ongoing OCR project is in collaboration with the National Library of Sweden (KB).

David Alfter, Språkbanken

I am from Luxembourg and moved to Gothenburg to pursue a PhD in 2016. I took my master's degree in 2015 at the University of Trier, Germany. I work as a PhD student at Språkbanken.

My topic concerns lexical complexity of Swedish single- and multi-word expressions from a language learner's perspective, characterized through NLP methods. Broader interests include (Intelligent) Computer-Assisted Language Learning ([I]CALL), language learning, machine learning.

David Bamutura, CSE and Mbarara University

I am from Uganda, and have been on a PhD sandwich program between Makerere University and Chalmers University of Technology since August 2016. I am supervised by Dr. Peter Ljunglöf and Dr. Peter Nabende. I work as a Lecturer at Mbarara University of Science and Technology.

My doctoral research focusses on the development of computational resources and tools for under-resourced Bantu Languages using both symbolic and data-driven approaches. Currently, I am working on formalising the grammar of Runyankore and Rukiga.

Dimitrios Kokkinakis, Språkbanken

I am from Greece, and moved to Gothenburg to study computational linguistics in 1990. I took my master's degree in 1994, and became a PhD in 2002, both at the University of Gothenburg. I work as a researcher at the Department of Swedish, section SpråkbankenText.

My main research interests are in the areas of text mining, sometimes referred to as text analytics (e.g. information extraction); corpus linguistics; computational semantics; and building language resource infrastructures. A particular focus of my research has been lexical acquisition and processing from large corpora and, in recent years, research on (bio)medical language processing within medical, clinical and health informatics, as well as research on cultural heritage data and large literature collections. Currently, I am very much interested in using language technology / natural language processing methods and tools to develop means to identify early signs of cognitive impairment.

Felix Morger, Språkbanken

I began studying linguistics with a focus on computational linguistics at Stockholm University, where I finished a bachelor's and master's degree. I've also worked as a software engineer, mostly doing web development. I am now a PhD student at Språkbanken.

My research interest is in machine learning interpretability, specifically neural networks. I'm interested in how you can use interpretability techniques to better understand what and how neural networks learn and to what extent this corresponds to our linguistic understanding of languages.

Herbert Lange, CSE

I am from Germany and moved to Gothenburg in 2015 to begin my PhD. I took my master's degree in computational linguistics in 2014 from Ludwig-Maximilians University Munich. Now I work as a PhD student at the Computer Science department.

My main research interest is in formal approaches to syntax and semantics. For my PhD project I combine this interest with my interest in (historic) languages to build a language learning application which is suitable for less-resourced languages, so that it is also usable with historic languages.

Inari Listenmaa, Digital Grammars AB

I am from Finland, and did both my BA and MA in computational linguistics at the University of Helsinki. I moved to Gothenburg to start my PhD in 2013. I defended in March 2019, and now work at Digital Grammars.

I have been active in the GF (Grammatical Framework) community since 2010. My PhD topic was applying software testing methods to computational grammars, both GF and Constraint Grammar.

Jacobo Rouces, Språkbanken

I am from Spain. I did my PhD at Aalborg University in Denmark, working on integration of linked data using semantic frames. I moved to Gothenburg for a postdoc position. Now I work as a researcher at Språkbanken.

My main interests are opinion mining, both from textual sources and from other sources with textual metadata. I have also worked on linked data.

Johannes Graën, Språkbanken and Pompeu Fabra University

I studied computational linguistics in Stuttgart and Barcelona and – after a short round trip to industry – did my PhD in the computational linguistics group in Zürich. Following that, I continued as a Postdoc in Zürich, contributing to several smaller projects and writing a grant proposal on corpus-based language learning applications. As said proposal was accepted, I started my project at Språkbanken in April 2019 (the other partner is a group with focus on applied linguistics, language learning and didactics at Pompeu Fabra University).

My main research interest has been parallel corpora, in particular alignment and the use of parallel data for different applications, from contrastive linguistics to language exploration for learners. My current project aims at integrating techniques from corpus linguistics, crowdsourcing and lexical resources for language learning. I'm interested in learning about learning from the learners (and teachers), or rather from the data they generate.

Kathrein Abu Kwaik, CLASP

I am originally from Palestine, where I took my master's degree in information technology. I moved to Gothenburg to get my PhD degree in Computational Linguistics at the University of Gothenburg. My main research interests are NLP and machine learning as a tool to solve NLP problems.

Nowadays I am working on Arabic dialects, on tasks such as dialect identification and sentiment analysis of dialects. We built a dialectal resource and measured the similarity and distance between dialects, so that we may be able to transfer some successful tools from one dialect to the others.

Kyoko Ohara, Keio University and guest at Språkbanken

I am from Japan. I graduated from the University of Tokyo and joined Tokyo Research Laboratory at IBM, Ltd. I received my master's degree in linguistics at the University of California at Berkeley under a scholarship from IBM Japan. I became a PhD in 1996 at the University of California at Berkeley. I am a professor at the Faculty of Science and Technology at Keio University in Japan. I am currently a guest researcher at Språkbanken until the 20th of June.

I became interested in cognitive and computational linguistics through participating in a Japanese-English machine translation project at IBM Research. I am currently working on building Japanese FrameNet and Japanese Constructicon, applying the theories of cognitive linguistics. I am interested in linking Japanese FrameNet and Japanese Constructicon to other language resources through crowdsourcing etc. I am also interested in applying Japanese FrameNet and Japanese Constructicon to NLP systems and to language teaching.

Lars Borin, Språkbanken

I have a PhD in computational linguistics from Uppsala University (1992) and have worked at the universities in Uppsala, Stockholm, and, since 2002, Gothenburg, where I am professor of natural language processing in the Department of Swedish. I am the director of the national research infrastructure Nationella språkbanken, and also the national coordinator of Swe-Clarin, the Swedish node of CLARIN ERIC, a European research infrastructure currently involving .

My primary research interests are broadly linguistic: large-scale comparative linguistics (aka language typology), historical-comparative linguistics and lexicology, in particular lexical semantics. On the computational side, I like to apply language technology as a research tool in disciplines where text is an important form of primary research data, and I am also interested in the relation between increasingly accurate computational simulation of linguistic behavior and explanation of human linguistic ability.

Luis Nieto Piña, Språkbanken

I am from Spain and moved to Gothenburg in 2014 to do my PhD in Natural Language Processing. My background is Computer Science Engineering and Mathematics at the Autonomous University of Barcelona.

The topic of my PhD is meaning representation: I devise neural models that can learn the meaning of words from combining corpora and lexica. In particular, I have applied this approach to distinguish and represent the different senses of polysemic words.

Mehdi Ghanimifard, CLASP

PhD student at CLASP since 2015. Before that Master in Language Technology at GU, and before that Amirkabir University of Technology – Tehran Polytechnic.

My research area is in grounded language understanding and generation with neural language models. I am interested in examining models which can combine linguistic representations and uncertain perceptual representations in a single framework.

Peter Ljunglöf, CSE and Språkbanken

I am from Sweden, and moved to Gothenburg to study computational linguistics in 1995. I took my master's degree in 1999, and became a PhD in 2004, both at the University of Gothenburg. I work as universitetslektor at the Department of Computer Science and Engineering, and at Språkbanken.

My main research area and interest is in using grammar formalisms for solving any kind of linguistic (and non-linguistic) problem, i.e., to specify and encode more information than just linguistic syntax. Currently my main focus is on language learning and multimodal text editing, but I am also interested in dialogue systems, communication aids, text analytics, annotation tools, parsing, etc., etc.

Prasanth Kolachina, CSE

I am from India and moved to Gothenburg to do my PhD in NLP in 2014. I took my master's degree in 2012 from IIIT-Hyderabad, India, and will be defending my thesis at Chalmers in June.

My main research area and interest is in working with multilingual representations; for my PhD I worked primarily with interlingual grammars like GF and Universal Dependencies. My original interests have been in the problem of machine translation, though I currently work with other problems like dependency parsing and parsing/transducers in general.

Richard Johansson, CSE

I have been a senior lecturer at the CSE Department since 2016. I earned my PhD in 2008 at Lund University and the Swedish National Graduate School of Language Technology (GSLT), where I was advised by Pierre Nugues. After my time in Lund, I spent a few years as a postdoc at the University of Trento with Alessandro Moschitti. Between 2011 and 2016, I had a position as a researcher and lecturer at Språkbanken.

My research field is natural language processing: I develop machine learning models and algorithms for extracting structure from text. My early research focused on structured prediction problems, in particular semantic role labeling. More recently, I have been interested in using linguistic resources (for instance, semantic networks) as a form of light supervision for machine learning models.

Shafqat Mumtaz Virk, Språkbanken

I am from Pakistan, and moved to Gothenburg to study Computer Science in 2005. I took my master's degree in 2007 from Chalmers, and became a PhD in 2013 at the University of Gothenburg. I work as a researcher at Språkbanken.

My main research interests are in the areas of language engineering, computational linguistic resources development, semantic parsing, and information extraction. At Språkbanken, I am involved in a couple of projects where we are exploring the use of language technology for automatic extraction of typological linguistic information from descriptive grammars. Largely relying on semantic parsing and open information techniques, we are developing methodologies and/or tools which we expect to be useful for a wider audience.

Simon Dobnik, CLASP

I am from Slovenia. I took my bachelor's, master's and PhD in computational linguistics at the University of Oxford. I came to Gothenburg in 2011 as a postdoc, then worked as a researcher and now as a senior lecturer/docent at CLASP and FLoV.

My research interests include spatial cognition, computational models of language and perception/vision, human-robot interaction, situated spoken dialogue systems, and computational representations of meaning (semantics). I have also developed an interest in Arabic NLP.

Staffan Larsson, CLASP and Talkamatic AB

I'm from Gothenburg and did my PhD here in 2002. I work as a professor of Computational Linguistics at CLASP and the Department of Philosophy, Linguistics and Theory of Science. I am also co-founder and research leader at Talkamatic AB.

My areas of interest include dialogue, dialogue systems, language and cognition, pragmatics, formal semantics, semantic coordination, in-vehicle dialogue systems, philosophy of language.

Stian Rødven Eide, Språkbanken

Norwegian alien, residing in the greener grass of Sweden since 2007 and currently doing my PhD at Språkbanken. Deeply entrenched in the Free Software movement since even longer ago and seemingly very fond of the definite articles. On the indefinite side, I have a master's in language technology and some bachelors' in linguistics and film studies.

My research interests are semantics, logic, music, democracy, world peace. It's all connected. In my PhD, I'm working with Riksdagens öppna data and hoping I'll some day be able to make a proper lie detector.

Vidya Somashekarappa, CLASP

I completed my Bachelor's in Audiology and Speech-Language Pathology in 2016, then moved to Moscow for a Master's in Cognitive Neuroscience. I have been a PhD student at the University of Gothenburg since February.

During my Bachelor's, my focus was on investigating the influence of online plasticity on perception and cognitive function accompanied by the neural changes in auditory processing using brainstem responses. I have worked on non-invasive brain stimulation and constraint-induced aphasia therapy to elicit the neural mechanisms of post-stroke aphasia recovery and their induction. Currently, I'm working on developing a model to understand the events (semantics) situated in space, a framework to understand language, events and dialogue by adding an additional eye gaze system. I'm also interested in understanding how prior probability helps shape color naming under the influence of reinforcement learning.

Vladislav Maraev, CLASP

I am from St. Petersburg, Russia. I moved to Gothenburg in 2017 to work towards my PhD. My first degree is in Telecommunications Engineering and my MA is in Cognitive Science from the University of Lisbon. I also used to work as a programmer and project manager developing and integrating dialogue systems. Now I am a second-year PhD student at CLASP.

My main research interest is non-verbal social signals in dialogue (first and foremost laughter), how to model them using dialogue formalisms and how to integrate them in spoken dialogue systems. Currently I am working on laughter syntax: where laughter can be positioned with respect to the laughable (what the laughter relates to) and what can act as a good predictor of laughter.