The Sofie Treebank - Sofie-trebanken

The Sofie Treebank - A Parallel Treebank of North European languages


The Sofie Treebank is a parallel treebank that will, on completion, consist of material from ten North European languages: Danish, Dutch, English, Estonian, Faroese, Finnish, German, Icelandic, Norwegian and Swedish. The text material of the treebank is taken from the Norwegian original and the translations of the first two chapters of Jostein Gaarder's novel Sofies verden. (As of 2011, there are tree representations of all languages except Dutch, English and Finnish.)

Search the Corpus
Get access to the corpus

The treebank is being developed by the participants of the Nordic Treebank Network, in which academic institutions from Denmark, Estonia, Finland, Iceland, Norway, and Sweden take part. Some of the languages in the treebank are represented by more than one set of analyses, because more than one institution has worked on that language. The analyses follow different grammatical frameworks, such as Dependency Grammar (Swedish, Växjö University) and a Phrase Structure Grammar of syntactic and function categories (University of Oslo).
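
To illustrate how a dependency analysis and a phrase structure analysis of the same sentence differ as data, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the sentence is invented and not taken from the treebank, the class names `DepToken` and `PhraseNode` are hypothetical, and the category and function labels are generic placeholders, not the annotation schemes actually used by Växjö University or the University of Oslo.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DepToken:
    """One word in a dependency analysis (hypothetical layout)."""
    index: int      # 1-based position in the sentence
    form: str       # the word itself
    head: int       # index of the governing word, 0 for the root
    relation: str   # placeholder relation label


@dataclass
class PhraseNode:
    """One node in a phrase structure analysis (hypothetical layout)."""
    category: str                 # placeholder syntactic category
    function: str                 # placeholder function label
    word: Optional[str] = None    # set only on leaf nodes
    children: List["PhraseNode"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Return the words under this node, left to right."""
        if self.word is not None:
            return [self.word]
        return [w for child in self.children for w in child.leaves()]


# Invented example sentence: "Sofie reads books".
dependency = [
    DepToken(1, "Sofie", 2, "subj"),
    DepToken(2, "reads", 0, "root"),
    DepToken(3, "books", 2, "obj"),
]

phrase = PhraseNode("S", "ROOT", children=[
    PhraseNode("NP", "SUBJ", word="Sofie"),
    PhraseNode("VP", "PRED", children=[
        PhraseNode("V", "HEAD", word="reads"),
        PhraseNode("NP", "OBJ", word="books"),
    ]),
])


def dep_words(tokens: List[DepToken]) -> List[str]:
    """Return the words of a dependency analysis in sentence order."""
    return [t.form for t in sorted(tokens, key=lambda t: t.index)]


def dep_root(tokens: List[DepToken]) -> DepToken:
    """Return the single token whose head is 0."""
    roots = [t for t in tokens if t.head == 0]
    if len(roots) != 1:
        raise ValueError("expected exactly one root token")
    return roots[0]
```

Both structures cover the same words, so `phrase.leaves()` and `dep_words(dependency)` return the same list. The dependency version records only word-to-word links, while the phrase structure version adds intermediate nodes that each carry a category and a function label.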

An example of some of the clickable sentences in the corpus:





This is an example of one of the Danish analyses:



Published June 13, 2005 1:26 PM - Last modified June 14, 2020 4:36 PM