ENPC: Dialogue marking


The English-Norwegian Parallel Corpus


Extensions of the project

Dialogue Marking

All the original fiction texts in the corpus, English and Norwegian, have been marked for dialogue. The entity &qb; has been used for 'quote begin' and &qe; for 'quote end'. This marking is intended to facilitate and encourage research on direct speech as opposed to straightforward narrative.

Not all texts have clearly marked boundaries between direct speech and the rest of the text. At one extreme we find texts with quotation marks at the beginning and the end of each utterance:


(1) "Are you bored with the election, my darling?" asked the Queen, stroking Harris's back. (ST1.1.1.s4)

At the other extreme, however, we find no overt marking at all:

(2) Jeg vet noe, sier Rut, noe fælt noe. (BV2.1.1.s1)

Lit.: I know something, says Rut, something terrible.


The dialogue marking has been done partly automatically and partly manually. Where the text has no overt marking, as in (2), the person responsible for the marking has had to interpret where direct speech begins and ends.
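
As an illustration of how the &qb; and &qe; entities might be used in research, the following Python sketch separates direct speech from narrative in a marked sentence. The placement of the entities in the sample string is an assumption based on example (2), not a copy of the actual corpus encoding, and the function name split_dialogue is hypothetical.

```python
import re

# Matches one stretch of direct speech between the entities &qb; and &qe;.
QUOTE = re.compile(r"&qb;(.*?)&qe;", re.DOTALL)


def split_dialogue(text):
    """Return (speech, narrative) for a string marked with &qb; / &qe;.

    speech is a list of the quoted stretches; narrative is the remaining
    text with the quoted stretches removed and whitespace normalised.
    """
    speech = [m.strip() for m in QUOTE.findall(text)]
    narrative = " ".join(QUOTE.sub(" ", text).split())
    return speech, narrative


# Assumed marking of example (2); the real corpus files may differ.
sample = "&qb;Jeg vet noe,&qe; sier Rut, &qb;noe fælt noe.&qe;"
speech, narrative = split_dialogue(sample)
print(speech)
print(narrative)
```

With the entities in place, a text without quotation marks, such as (2), can be handled in the same way as one with quotation marks, such as (1).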



Published July 6, 2010 10:39 AM - Last modified July 12, 2010 3:42 PM