Seminar: Taraka Rama on Bayesian methods for historical linguistics
We are delighted to announce the second Language Change Seminar of this semester. Taraka Rama Kasicheyanula (Postdoctoral Fellow at IFI) will give a talk entitled "Bayesian methods for historical linguistics". All are welcome!
Rama Kasicheyanula’s abstract:
“I will talk about the application of Bayesian methods to three closely interrelated tasks: identifying the right data size for making phylogenetic inferences, automated cognate detection, and phylogenetic inference itself. These are three crucial steps in the pipeline of computational historical linguistics, a vibrant and relatively recent field of research that promises to automate part, if not all, of the age-old tradition of manually constructing family trees through the laborious identification of shared innovations.
The first problem is to determine the right size of the meaning list required for inferring high-quality trees. Recently published results show that Bayesian phylogenetic inference does not necessarily benefit from being given all available data as input and that there is a surprisingly solid linear correlation between the number of languages to be classified and the size of the most adequate dataset. This finding is particularly important when designing meaning lists for language families which do not have a long tradition of classical comparative linguistic research.
Next, given word lists for a set of languages to be classified, the researcher must determine which words are cognate, i.e., share a common origin, and which are not. This should be achieved in an automated way, and much current research is devoted to devising suitable algorithms for this task. My contribution to this highly competitive field of research has been to design two variants of the Chinese Restaurant Process (CRP). I show that the CRP variants work as well as InfoMap (the currently best known clustering algorithm) under two different cluster evaluation measures.
Finally, when it comes to producing the actual phylogenetic trees, it is necessary to evaluate the utility of automated cognate identification techniques for inferring such phylogenies. The results of these techniques should be evaluated against cognate judgments made by experts. As it turns out, cognates inferred by automated cognate identification methods can, indeed, be used to infer high-quality phylogenies.”
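For readers unfamiliar with the Chinese Restaurant Process mentioned in the abstract, the following is a minimal sketch of the standard, textbook CRP as a way of partitioning items into an open-ended number of clusters. It is not the speaker's method: the two variants presented in the talk are not described in the abstract, and the function name `crp_partition` and the parameters `alpha` and `seed` are illustrative assumptions. In a cognate-detection setting the items would presumably be words for one meaning and the seating probabilities would also depend on word similarity, which this sketch leaves out.

```python
import random


def crp_partition(n_items, alpha, seed=0):
    """Sample a partition of n_items from a standard Chinese Restaurant Process.

    alpha is the concentration parameter (hypothetical name); larger values
    tend to produce more, smaller clusters. Must be positive.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    tables = []  # tables[k] holds the indices of the items seated at table k
    for i in range(n_items):
        # Item i joins existing table k with probability len(tables[k]) / (i + alpha)
        # and opens a new table with probability alpha / (i + alpha).
        weights = [len(t) for t in tables] + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(tables):
            tables.append([i])
        else:
            tables[k].append(i)
    return tables


if __name__ == "__main__":
    # Each inner list is one cluster of item indices.
    print(crp_partition(10, alpha=1.0))
```

The number of clusters is not fixed in advance, which is what makes this family of models attractive when the number of cognate sets per meaning is unknown.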