The Norwegian component of LINDSEI

Susan Nacey and Hege Larsson Aas

Hedmark University College

Thursday 28 February 2013, 14.15-16.00; room 489 PAM

This talk explores the Norwegian component of LINDSEI, the Louvain International Database of Spoken English Interlanguage (Gilquin et al. 2010), a subcorpus that has been under compilation at Hedmark University College since 2010.

The LINDSEI corpus was created under the direction of the Centre for English Corpus Linguistics (CECL) in Belgium, which specializes in the collection and use of learner and multilingual corpora. The first edition of LINDSEI contains 554 interviews of learners from 11 different L1 backgrounds (usually 50 interviews per subcorpus). The interviews were conducted in English to capture the learners’ ‘interlanguage’, an idiosyncratic dialect that shares characteristics of their L1 and L2, and sometimes of other languages as well (see Corder 1981: 17, 85).

Plans for a second edition began even before the publication of the first edition, with the ambition of including the audio files and linking the spoken and transcribed text. Moreover, data from learners with other language backgrounds will be added, thereby increasing the size and linguistic range of the corpus. One of the additional subcorpora in the expanded LINDSEI will comprise interviews of Norwegian speakers of L2 English. The creation of this Norwegian component is nearing completion: a team of researchers at Hedmark University College has now recorded 50 interviews and has almost finished transcribing them.

The main aims of this talk are fourfold:

1) To raise awareness among corpus linguists in Norway of the existence of this corpus. The corpus is described in detail, and group members will be given a preview of the material, long before its ultimate publication.

2) To address some of the challenges inherent in the production of spoken corpora whose compilation depends upon widespread international collaboration. The discussion centers on issues that arose while retracing the footsteps of the first group of LINDSEI researchers, that is, while applying the already-established guidelines for corpus compilation and transcription. Primary focus is given to general issues concerning compilation and transcription (e.g. the role/identity of the interviewer), as well as more specific ones relating to the nature of learners’ interlanguage (e.g. ‘standard’ transcriptions of filled pauses).

3) To briefly discuss the research that has already been carried out using this corpus, as well as research that is underway. This includes two smaller studies, one on compensation strategies employed by Norwegian learners (Nacey & Graedler forthcoming) and one on preposition use in spoken English (Nacey & Graedler), as well as a doctoral project concerning highly recurrent word combinations in the spoken text of Norwegian learners (Aas).

4) To solicit ideas from members of the Corpus Linguistics Group for future research using LINDSEI. To this end, the talk will conclude with a brainstorming session involving all participants.


Corder, S. P. (1981). Error analysis and interlanguage. Oxford: Oxford University Press.
Gilquin, G., S. De Cock, & S. Granger. (2010). Louvain International Database of Spoken English Interlanguage (LINDSEI). Louvain-la-Neuve: Presses universitaires de Louvain.
Nacey, S., & A.-L. Graedler. (forthcoming). Communication strategies in a corpus of advanced learner English. In L. Degrand & S. Granger (Eds.), Corpora and Language in Use: Proceedings. Louvain-la-Neuve: Presses universitaires de Louvain.

