A corpus-driven approach to discourse markers in spoken data

Gisle Andersen, NHH

Thursday 10 November 2011, 14.15-16, Room 489, PAM

This paper asks whether bottom-up, data-driven approaches can be useful for the corpus-based study of discourse markers. Corpus-based studies of discourse markers have almost exclusively applied the method of one-to-one searching (Walsh et al. 2008): the researcher searches the corpus for specific linguistic forms, chosen on the basis of a priori knowledge of their existence and relevance, and uses the resulting hits as the basis for study. I wish to explore the idea that the use of discourse markers may also be observed via bottom-up approaches, such as investigating systematic differences between datasets in terms of their collocational patterns, relative frequencies, lexical inventory, etc. Discourse markers are often the result of innovative language use and processes of grammaticalisation, and many are phrasal units. It is assumed that such emergent processes, and the occurrence of new, phrasal discourse markers, may be revealed via bottom-up approaches to the data. The discussion will be illustrated with findings from two corpora of spoken English.
