[author]Piyush Rai, Hal Daume III, and Suresh Venkatasubramanian[/author]
Twenty-First International Joint Conference on Artificial Intelligence (IJCAI-09)
We present a streaming model for large-scale classification (in the context of the $\ell_2$-SVM) by leveraging connections between learning and computational geometry. The streaming model imposes the constraint that only a single pass over the data is allowed. The $\ell_2$-SVM is known to have an equivalent formulation in terms of minimum enclosing balls (MEB), and an efficient algorithm based on the idea of core sets exists (CVM) (Tsang et al., 2005), which learns a $(1+\epsilon)$-approximate MEB for a set of points and yields an approximate solution to the corresponding SVM instance. However, CVM works in batch mode and requires multiple passes over the data. We present a single-pass SVM based on the minimum enclosing ball of streaming data. We show that the MEB updates for the streaming case can be easily adapted to learn the SVM weight vector using simple Perceptron-like update equations. Our algorithm performs polylogarithmic computation at each example, requires very small and constant storage, and finds simpler solutions (measured in terms of the number of support vectors). Experimental results show that, even in such a restrictive setting, we can learn efficiently in just one pass and obtain accuracies comparable to other state-of-the-art SVM solvers. We also discuss some open issues and possible extensions.
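To make the single-pass MEB idea concrete, here is a minimal sketch of one well-known streaming update: whenever a new point falls outside the current ball, the ball is replaced by the smallest ball that contains both the old ball and the new point. This is an illustration under stated assumptions, not the authors' code: the function names `update_ball` and `streaming_meb` are hypothetical, the points are plain Euclidean vectors, and the mapping from labeled examples to the space in which the $\ell_2$-SVM becomes an MEB problem is left out.

```python
import math


def update_ball(center, radius, point):
    """Return the smallest ball containing the old ball and `point`.

    Hypothetical helper for illustration; it stores only the current
    center and radius, so memory use does not grow with the stream.
    """
    diff = [p - c for p, c in zip(point, center)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist <= radius:
        # The point is already enclosed, so the ball is unchanged.
        return center, radius
    # Move the center toward the point by half the gap and grow the
    # radius by the same amount (a Perceptron-like additive update).
    shift = (dist - radius) / 2.0
    new_center = [c + (shift / dist) * d for c, d in zip(center, diff)]
    return new_center, radius + shift


def streaming_meb(points):
    """Make one pass over `points`, keeping a single enclosing ball."""
    stream = iter(points)
    center = list(next(stream))  # start with a zero-radius ball
    radius = 0.0
    for point in stream:
        center, radius = update_ball(center, radius, point)
    return center, radius


if __name__ == "__main__":
    center, radius = streaming_meb([(0.0, 0.0), (2.0, 0.0), (1.0, 0.5)])
    print(center, radius)
```

Each point is inspected once and then discarded, which is what the single-pass constraint requires. The resulting ball always encloses every point seen so far, but it is in general only an approximation of the true minimum enclosing ball.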