The Ruija Corpus

The Ruija Corpus is a speech corpus from Kven- and Finnish-speaking areas in Northern Norway. A total of 76 hours and 18 minutes of speech have been transcribed, and the corpus contains 428 971 words.

Photo: Vilfred Ingilæ

In April 2005, Norway gained a new official minority language when Kven was recognised as a language. The Ruija Corpus consists of recordings of interviews, and it is the first online corpus of the Kven language. The recordings were made in Kven- and Finnish-speaking areas between 1960 and 2009; old recordings have been transcribed and new data have been gathered. The corpus is searched through Glossa, a search interface designed by the Text Laboratory.

The data come from two research projects, both led by Dr. Pia Lane at the Department of Linguistics and Scandinavian Studies, University of Oslo.

LICHEN – The Linguistic and Cultural Heritage Electronic Network
LICHEN was one of the Norwegian research projects financed by the Research Council of Norway as part of the International Polar Year. The project is part of a larger network focusing on minority languages in the circumpolar region, coordinated by the University of Oulu. The partners of the Norwegian LICHEN project are the University of Oulu, the University of Tromsø and Kvensk Institutt/Kainun Institutti. The interviews from 2007 were carried out as part of the LICHEN project.

Link to the LICHEN project's webpage

Link to the Research Council of Norway’s presentation of the project

Identities in Transition: A Longitudinal Study of Language Shift in a Kven Community, a postdoctoral research project financed by the Research Council of Norway
In 1975, 69 people in Bugøynes were interviewed by Finnish ethnologists. In 2009, 11 of them were re-interviewed by the project’s Finnish field assistant, so the corpus has data from the same individuals interviewed 35 years apart. In addition, younger people who do not speak Kven were interviewed by a Norwegian-speaking assistant.

Access to data
The Ruija Corpus is password-protected and can only be used for research purposes. Parts of the material from Bugøynes/Pykejä will be made available without password protection once the interviewees have listened to the recordings and granted permission.

Apply for access to the Ruija Corpus

Search corpus


Published Mar. 2, 2011 9:33 AM - Last modified Jan. 31, 2021 9:27 AM