For the next three years, Prof. Janne Bondi Johannessen and the Text Laboratory will take part in the project HaBiT (Harvesting big text data for under-resourced languages), together with project partners Masarykova Univerzita in Brno and NTNU in Trondheim.
On Friday, August 22, Talko, a corpus of spoken Swedish in Finland, was launched at the Tenth Nordic Dialectology Conference in Åland.
Talko currently consists of more than 360,000 words from 151 informants in Finland. The Society of Swedish Literature in Finland created the corpus. The search interface has been implemented in the corpus system Glossa by the Text Laboratory. The project used the Text Laboratory's semiautomatic Transliterator to convert phonetic transcriptions into orthographic transcriptions.
Read more about Talko (in Swedish)