Fakespeak – the language of fake news
Linguistic cues in fake news may be the key to its detection.
Photo: Silje Susanne Alvestad.
About the project
Fake news is defined here as "news" items whose author knows that they are false and intends to deceive. The lion's share of the research on the detection of fake news is conducted by computer scientists alone.
However, linguists have shown that the linguistic features of a text vary according to its purpose. Thus, the language of fake news may be the key to its detection. This is the background of the linguistics-driven project "Fakespeak - the language of fake news. Fake news detection based on linguistic cues". The project involves a core team of linguists and computer scientists based in Norway and the UK.
The linguists will seek to reveal the grammatical and stylistic features of the language of fake news, referred to as "Fakespeak", in English, Norwegian and Russian.
To achieve this goal, they will first build new corpora of fake and real news from various online media outlets in all three languages, drawing also on existing corpora, and then subject the datasets to thorough linguistic analyses.
The fake and genuine articles to be compared will be written by one and the same author. This is to control for several potential sources of error. The linguists will apply methods and draw on insights from corpus linguistics, computational linguistics, and applied linguistics (including forensic linguistics), as well as pragmatics and rhetoric.
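The same-author comparison described above can be illustrated with a minimal sketch. It is not the project's method: the three surface features, the function names `features` and `compare_by_author`, and the `(author, label, text)` input format are all assumptions made for illustration, and the cues the project actually identifies may look quite different. The sketch only shows the design idea of comparing fake and genuine texts within each author rather than across authors.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical surface cues chosen for illustration only;
# they are NOT findings of the Fakespeak project.
FIRST_PERSON = {"i", "me", "my", "we", "us", "our"}


def features(text):
    """Compute a few simple stylistic rates for one text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_tokens = max(len(tokens), 1)        # avoid division by zero
    n_sentences = max(len(sentences), 1)
    return {
        "first_person_rate": sum(t in FIRST_PERSON for t in tokens) / n_tokens,
        "exclamations_per_sentence": text.count("!") / n_sentences,
        "mean_sentence_length": len(tokens) / n_sentences,
    }


def compare_by_author(articles):
    """Return per-author differences (mean fake minus mean real) per feature.

    `articles` is an iterable of (author, label, text) tuples, where label
    is "fake" or "real" (an assumed input format). Authors lacking either
    label are skipped, because the comparison is within one author.
    """
    grouped = defaultdict(lambda: {"fake": [], "real": []})
    for author, label, text in articles:
        grouped[author][label].append(features(text))

    diffs = {}
    for author, by_label in grouped.items():
        if not by_label["fake"] or not by_label["real"]:
            continue
        diffs[author] = {
            name: mean(f[name] for f in by_label["fake"])
            - mean(f[name] for f in by_label["real"])
            for name in by_label["fake"][0]
        }
    return diffs


# Toy input, invented purely to show the call shape.
toy = [
    ("author_a", "fake", "We won! We won!"),
    ("author_a", "real", "The council met. It voted."),
    ("author_b", "real", "Only genuine texts here."),
]
print(compare_by_author(toy))
```

Because each difference is computed within a single author, stable habits of that author's individual style cancel out, which is one way to read the motivation for the same-author design.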
Taking the linguists' findings and existing fake news detection systems as their point of departure, the computer scientists will seek to improve these systems by automating the detection of the defining features of Fakespeak.
The overall aim of the project is to enable fake news detection systems to discover and flag potentially harmful fake news items more accurately, efficiently and quickly than current state-of-the-art systems do.
By automating the detection of all and only the features of Fakespeak, the project team will enable the systems to detect and flag only deliberate disinformation, excluding, for example, inadvertent misinformation, satire, parody, and texts that merely reflect a certain set of opinions. Thus, the project will take societal safety and security into consideration while safeguarding freedom of speech.
Funded by the Research Council of Norway, project ID 302573.
Start-up workshop, February 16-17
Tuesday, February 16:
- 9:15-9:45 Edson C. Tandoc Jr.: Scholarly definitions of fake news
- 9:50-10:20 Sharon Levy: On "Fakeddit: A New Multimodal Benchmark Dataset for Fine-grained Fake News Detection", and related works from William Wang’s lab
- 10:30-11:00 Geir Hågen Karlsen: Influence operations in social and other kinds of media, political communication, one-sided history writing etc. The case of Russia
- 12:00-12:45 Industrial collaboration partners
  - 12:00-12:20 NTB (Geir Terje Ruud/Sarah Sørheim)
  - 12:20-12:45 Faktisk.no (Kristoffer Egeberg)
Wednesday, February 17:
- 9:00-9:30 Maite Taboada: The language of fake news and misinformation
- 9:30-10:00 Helena Woodfield and Jack Grieve: The language of fake news. Corpus studies
- 10:15- Potentially collaborating projects
  - 10:15-10:40 PAR-TS (Tor Olav Grøtan)
  - 10:40-11:00 Threat-defuser (Laura Janda)
  - 11:00-11:20 SCAM (Bente Kalsnes)