Colloquium
Kamesh Munagala
Computer Science
Duke University
Monday, October 6, 2008
3147 MEB
Refreshments 3:20 p.m.
Lecture 3:40 p.m.
Host: Suresh Venkatasubramanian
Title: LP-duality Based Algorithms for Restless Bandit Problems
Abstract
In numerous areas of operations research and artificial intelligence,
we face a problem common to gamblers choosing slot machines (or
bandits) in a casino: whether to try new machines or keep playing the
one we know and hope for the best. More generally, we face the
trade-off between exploration and exploitation: between learning from
and adapting to a stochastic system and exploiting our current
best-knowledge. A fundamental decision-theoretic model that captures
this trade-off is the celebrated Multi-arm Bandit Problem. The basic
multi-armed bandit problem admits to an elegant polynomial time
solution. However, many stochastic scheduling applications can only be
modeled using a generalization termed the Restless Bandit Problem,
which in its ultimate generality, is PSPACE-Hard to approximate to any
non-trivial factor.
In this talk, we derive a 2-approximation algorithm for a general
subclass of the Restless Bandit Problem, in which the state-transition
probability for each bandit is "separable" into a constant
probability matrix and a vector of arbitrary monotone functions of
time. The monotonicity models increasing levels of uncertainty as
time-since-last-observation increases. We mention applications of this
model to wireless scheduling and unmanned aircraft navigation. The
resulting algorithm is simple and intuitive, and in addition,
applicable to related stochastic scheduling problems.
Joint work with Sudipto Guha, UPenn, and Peng Shi, Duke.
Bio
Kamesh Munagala is an Assistant Professor of Computer Science at Duke
University. He obtained his Ph.D. in 2003 and M.S. in 2002 from
Stanford University, and B.Tech. in 1998 from IIT Bombay. He is
interested in discrete optimization, particularly approximation
algorithms for NP-Hard problems, online algorithms, stochastic
optimization, and query optimization in database systems and sensor
networks. He received the NSF CAREER award in 2008.