Colloquium
Frank Wood
Gatsby Computational Neuroscience Unit
University College London
Friday, March 27, 2009
3147 MEB
Refreshments 3:20 p.m.
Lecture 3:40 p.m.
Title: Natural Language Model Domain Adaptation
Subtitle: A Doubly Hierarchical Pitman-Yor Process Language Model
Abstract
There are many real-world language modelling domains for which there
is not a sufficient quantity of training data to reliably estimate a
good model. Obtaining enough training data for such "specific" domains
can be both costly and a significant logistical challenge. In some
cases there may already exist a large quantity of training data from a
related or more general domain. The phrase domain adaptation is used
to describe ways to adapt models trained on copious, general data to
fit some specific domain well.
In this talk I will show how to do domain adaptation of hierarchical
nonparametric Bayesian language models. Specifically, I will present a
doubly hierarchical Pitman-Yor process language model and explain how
such a model accomplishes domain adaptation. I will show results that
suggest that this new domain-adapting language model performs well in
comparison to prior art.