First steps of Gramateca: a corpus-based grammar initiative for Portuguese, driven by Linguateca

Diana Santos

Thursday 20 February 2014, 14.15; room 489 PAM

In this presentation, I will describe the goals and what we intend to accomplish with the Gramateca initiative, which you can read about at

I will describe the team and what its members want to do, the funding and publication policies, and the most important criterion: that the analyses remain publicly available so that others can improve on them and offer different interpretations if they feel the need to.

I will also discuss some of the problems, which are a substantial concern for anyone doing corpus-based work that is intended to be replicable, and the way we intend to solve them through the ongoing activities of semantic annotation, human revision, and parser improvement.

Some specific examples of the studies that have been started will also be presented if time permits.

To spice up the presentation, I will try to contrast Gramateca with the corpus-based grammar by Stig Johansson and others, published by Longman in 1999.

